
Towards a US (and LHC) Grid Environment for HENP Experiments
CHEP 2000 Grid Workshop
Harvey B. Newman, Caltech
Padova, February 12, 2000

Data Grid Hierarchy: Integration, Collaboration, Marshal Resources
(Tiered-architecture diagram; 1 TIPS = 25,000 SpecInt95; a PC today = 10-15 SpecInt95)
• Online System: one bunch crossing per 25 nsec; ~100 triggers per second; each event is ~1 MByte in size; ~PBytes/sec off the detector, ~100 MBytes/sec into the offline farm
• Tier 0: CERN Computer Center and Offline Farm, ~20 TIPS
• Tier 1: Regional Centers (France, Germany, Italy, Fermilab), ~4 TIPS each; links of ~622 Mbits/sec (or air freight) up to ~2.4 Gbits/sec
• Tier 2 Centers: ~1 TIPS each; ~622 Mbits/sec links
• Tier 3: Institutes, ~0.25 TIPS, with a physics data cache; 100-1000 Mbits/sec to the desktop
• Tier 4: physicists' workstations
• Physicists work on analysis “channels”. Each institute has ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
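The arithmetic behind the rate labels above is worth making explicit. A short sketch (Python, illustrative only; the 10^7 live seconds per year is an assumed nominal figure, not taken from the slide):

    # 100 triggers/sec x ~1 MByte/event gives the ~100 MBytes/sec flow into the
    # offline farm; an assumed ~1e7 live seconds/year then implies of order a
    # petabyte of raw data per year per experiment.
    trigger_rate_hz   = 100        # from the slide
    event_size_mbytes = 1.0        # from the slide
    live_seconds      = 1.0e7      # assumption: nominal operating seconds per year

    rate_mbytes_s = trigger_rate_hz * event_size_mbytes
    yearly_pbytes = rate_mbytes_s * live_seconds / 1.0e9    # MBytes -> PBytes

    print(rate_mbytes_s, "MBytes/sec into the offline farm")  # 100.0
    print(yearly_pbytes, "PBytes/year of raw data")           # 1.0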

To Solve: the LHC “Data Problem”
• The proposed LHC computing and data handling will not support FREE access, transport or processing for more than a small part of the data
• Balance between proximity to large computational and data handling facilities, and proximity to end users and more local resources for frequently accessed datasets
• Strategies must be studied and prototyped, to ensure both acceptable turnaround times and efficient resource utilisation
• Problems to be explored:
  - How to meet the demands of hundreds of users who need transparent access to local and remote data, in disk caches and tape stores
  - Prioritise hundreds of requests from local and remote communities, consistent with local and regional policies
  - Ensure that the system is dimensioned, used and managed optimally for the mixed workload

Regional Center Architecture (example by I. Gaines, MONARC)
(Block diagram: tape mass storage & disk servers and database servers, fed by the network from CERN and the network from Tier 2 and simulation centers; links to Tier 2 centers, local institutes and CERN)
• Production Reconstruction: Raw/Sim data on tape -> ESD; scheduled, predictable; experiment/physics groups
• Production Analysis: ESD -> AOD; scheduled; physics groups
• Individual Analysis: AOD -> DPD and plots; chaotic; physicists on desktops
• Support services: Physics Software Development; R&D Systems and Testbeds; Info servers, Code servers, Web Servers, Telepresence Servers; Training, Consulting, Help Desk

Grid Services Architecture [*]
• Applications: HEP data-analysis related applications
• Application Toolkits: remote data toolkit, remote computation toolkit, remote visualization toolkit, remote collaboration toolkit, remote sensors toolkit, . . .
• Grid Services: protocols, authentication, policy, resource management, instrumentation, data discovery, etc.
• Grid Fabric: networks, data stores, computers, display devices, etc.; associated local services (local implementations)
[*] Adapted from Ian Foster

Grid Hierarchy Goals: Better Resource Use and Faster Turnaround
• “Grid” integration and (de facto standard) common services to ease development, operation, management and security
• Efficient resource use and improved responsiveness through:
  - Treatment of the ensemble of site and network resources as an integrated (loosely coupled) system
  - Resource discovery, query estimation (redirection), co-scheduling, prioritization, local and global allocations
  - Network and site “instrumentation”: performance tracking, monitoring, forward prediction, problem trapping and handling

GriPhyN: First Production-Scale “Grid Physics Network”
• Develop a New Integrated Distributed System, while Meeting Primary Goals of the US LIGO, SDSS and LHC Programs
• Unified GRID System Concept; Hierarchical Structure
• ~Twenty Centers, with Three Sub-Implementations: 5-6 each in the US for LIGO, CMS, ATLAS; 2-3 for SDSS
• Emphasis on Training, Mentoring and Remote Collaboration
• Focus on LIGO, SDSS (+ BaBar and Run 2) handling of real data, and LHC Mock Data Challenges with simulated data
• Making the Process of Discovery Accessible to Students Worldwide
• GriPhyN Web Site: http://www.phys.ufl.edu/~avery/mre/
• White Paper: http://www.phys.ufl.edu/~avery/mre/white_paper.html

Grid Development Issues
• Integration of applications with Grid Middleware
  - Performance-oriented user application software architecture is required, to deal with the realities of data access and delivery
  - Application frameworks must work with system state and policy information (“instructions”) from the Grid
• O(R)DBMSs must be extended to work across networks
  - E.g. “invisible” (to the DBMS) data transport, and catalog update
• Interfacility cooperation at a new level, across world regions
  - Agreement on choice and implementation of standard Grid components, services, security and authentication
  - Interface the common services locally to match heterogeneous resources, performance levels, and local operational requirements
  - Accounting and “exchange of value” software to enable cooperation

Roles of Projects for HENP Distributed Analysis
• RD45, GIOD: Networked Object Databases
• Clipper/GC: High-speed access to Object or File data; FNAL/SAM for processing and analysis
• SLAC/OOFS: Distributed File System + Objectivity Interface
• NILE, Condor: Fault-Tolerant Distributed Computing with Heterogeneous CPU Resources
• MONARC: LHC Computing Models: Architecture, Simulation, Strategy, Politics
• PPDG: First Distributed Data Services and Data Grid System Prototype
• ALDAP: OO Database Structures and Access Methods for Astrophysics and HENP Data
• GriPhyN: Production-Scale Data Grid
• APOGEE: Simulation/Modeling, Application + Network Instrumentation, System Optimization/Evaluation

Other ODBMS Tests
• Tests with Versant (fallback ODBMS):
  - DRO WAN tests with CERN
  - Production on CERN’s PCSF and file movement to Caltech
• Objectivity/DB: creation of a 32,000-database federation

The China Clipper Project: A Data-Intensive Grid (ANL-SLAC-Berkeley)
• China Clipper Goal: develop and demonstrate middleware allowing applications transparent, high-speed access to large data sets distributed over wide-area networks
• Builds on expertise and assets at ANL, LBNL & SLAC; NERSC, ESnet
• Builds on Globus middleware and a high-performance distributed storage system (DPSS from LBNL)
• Initial focus on large DOE HENP applications: RHIC/STAR, BaBar
• Demonstrated data rates to 57 MBytes/sec

Grand Challenge Architecture
• An order-optimized prefetch architecture for data retrieval from multilevel storage in a multiuser environment
• Queries select events and specific event components based upon tag attribute ranges
  - Query estimates are provided prior to execution
  - Queries are monitored for progress and multi-use
• Because event components are distributed over several files, processing an event requires delivery of a “bundle” of files
• Events are delivered in an order that takes advantage of what is already on disk, and of multiuser policy-based prefetching of further data from tertiary storage
• GCA inter-component communication is CORBA-based, but physicists are shielded from this layer
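The ordering idea can be illustrated in a few lines of Python (a sketch of the concept only, not the Grand Challenge code): bundles whose files are already disk-resident are delivered first, while the rest are left for policy-based prefetch from tertiary storage.

    def order_bundles(bundles, disk_resident):
        """bundles: bundle id -> set of files it needs; disk_resident: files on disk."""
        def cached_fraction(files):
            return len(files & disk_resident) / len(files)
        # Highest cached fraction first: events already on disk flow out
        # immediately, while missing files are staged in the background.
        return sorted(bundles, key=lambda b: cached_fraction(bundles[b]), reverse=True)

    bundles = {"b1": {"f1", "f2"}, "b2": {"f3", "f4"}, "b3": {"f1", "f5"}}
    print(order_bundles(bundles, disk_resident={"f1", "f2"}))   # ['b1', 'b3', 'b2']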

GCA System Overview
(Block diagram: clients talk to the GCA/STACS layer, which consults the index, the event tags and the file catalog; event files are staged from HPSS via pftp onto disk, alongside other disk-resident event data.)

STorage Access Coordination System (STACS)
(Block diagram: Query Estimator, Query Monitor, Policy Module and Cache Manager, backed by a Bit-Sliced Index and a File Catalog)
• Query Estimator: returns a Query Estimate before execution, using the Bit-Sliced Index
• Query Monitor: produces the list of file bundles and events; reports Query Status and the Cache Map
• Policy Module: informs caching and purging decisions
• Cache Manager: turns requests for file caching and purging into pftp and file-purge commands, using the File Catalog
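A bit-sliced (bitmap) index is what makes cheap query estimation possible: one bit vector per attribute bin, so a tag-attribute range cut reduces to bitwise ORs and ANDs plus a bit count, without touching the event data. The sketch below is a toy illustration, not the STACS implementation; the attribute name and binning are invented.

    def build_bitmaps(values, bin_edges):
        """One Python int per bin, used as a bit vector indexed by event number."""
        bitmaps = [0] * (len(bin_edges) - 1)
        for evt, v in enumerate(values):
            for b in range(len(bin_edges) - 1):
                if bin_edges[b] <= v < bin_edges[b + 1]:
                    bitmaps[b] |= 1 << evt
                    break
        return bitmaps

    def estimate(bitmaps_by_attr, selection):
        """selection: attribute -> list of allowed bin numbers (the range cut)."""
        result = None
        for attr, bins in selection.items():
            mask = 0
            for b in bins:
                mask |= bitmaps_by_attr[attr][b]                  # OR within one attribute
            result = mask if result is None else result & mask    # AND across attributes
        return bin(result).count("1")                             # number of qualifying events

    energies = [5, 42, 77, 12, 63, 91, 30]                        # toy tag attribute values
    index = {"energy": build_bitmaps(energies, [0, 25, 50, 75, 100])}
    print(estimate(index, {"energy": [2, 3]}))                    # events with 50 <= energy < 100 -> 3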

The Particle Physics Data Grid (PPDG)
• ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U. Wisc/CS
• First Year Goal: optimized cached read access to 1-10 GBytes, drawn from a total data set of order one Petabyte
• Site-to-Site Data Replication Service at 100 MBytes/sec: PRIMARY SITE (Data Acquisition, CPU, Disk, Tape Robot) to SECONDARY SITE (CPU, Disk, Tape Robot)
• Multi-Site Cached File Access Service: PRIMARY SITE (DAQ, Tape, CPU, Disk, Robot); Satellite Sites (Tape, CPU, Disk, Robot); Universities (CPU, Disk, Users)

The Particle Physics Data Grid (PPDG)
• To provide the ability to query and partially retrieve hundreds of terabytes across Wide Area Networks within seconds, PPDG uses advanced services in three areas:
  - Distributed caching: to allow for rapid data delivery in response to multiple requests
  - Matchmaking and Request/Resource co-scheduling: to manage workflow and use computing and network resources efficiently; to achieve high throughput
  - Differentiated Services: to allow particle-physics bulk data transport to coexist with interactive and real-time remote collaboration sessions, and other network traffic

PPDG: Architecture for Reliable High-Speed Data Delivery
(Block diagram, with a site boundary and security domain)
• Object-based and File-based Application Services
• File Access Service; Cache Manager
• Resource Management; Matchmaking Service
• File Replication Index; Cost Estimation
• File Fetching Service; File Mover(s); Mass Storage Manager
• End-to-End Network Services
• Future: File and Object Export; Cache & State Tracking; Forward Prediction

First Year PPDG “System” Components
Middleware Components (Initial Choice; see the PPDG Proposal):
• Object- and File-Based Application Services: Objectivity/DB (SLAC-enhanced); GC Query Object, Event Iterator, Query Monitor; FNAL SAM System
• Resource Management: start with human intervention (but begin to deploy resource discovery & management tools: Condor, SRB)
• File Access Service: components of OOFS (SLAC)
• Cache Manager: GC Cache Manager (LBNL)
• Mass Storage Manager: HPSS, Enstore, OSM (site-dependent)
• Matchmaking Service: Condor (U. Wisconsin)
• File Replication Index: MCAT (SDSC)
• Transfer Cost Estimation Service: Globus (ANL)
• File Fetching Service: components of OOFS
• File Mover(s): SRB (SDSC); site-specific
• End-to-End Network Services: Globus tools for QoS reservation
• Security and Authentication: Globus (ANL)

CONDOR Matchmaking: A Resource Allocation Paradigm
• Parties use ClassAds to advertise properties, requirements and ranking to a matchmaker
• ClassAds are self-describing (no separate schema)
• ClassAds combine query and data
(Diagram: Application Agent, Customer Agent, Environment Agent, Owner Agent, Local Resource Management, Resource)
• http://www.cs.wisc.edu/condor
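The matchmaking idea is simple to sketch: each ad carries both data (attributes) and a query (a requirements predicate plus a rank expression), and the matchmaker pairs ads whose requirements are mutually satisfied, preferring the highest rank. The toy below illustrates that idea in Python; it is not the Condor ClassAd language or its full semantics.

    job_ad = {
        "attrs": {"ImageSize": 512, "Owner": "physicist1"},
        "requirements": lambda m: m["Memory"] >= 512 and m["Arch"] == "x86",
        "rank": lambda m: m["Mips"],                     # prefer faster machines
    }

    machine_ads = [
        {"attrs": {"Name": "node01", "Memory": 1024, "Arch": "x86", "Mips": 400},
         "requirements": lambda j: j["ImageSize"] <= 1024},
        {"attrs": {"Name": "node02", "Memory": 256, "Arch": "x86", "Mips": 800},
         "requirements": lambda j: True},
    ]

    def match(job, machines):
        # Both sides' requirements must hold (ads combine query and data).
        candidates = [m for m in machines
                      if job["requirements"](m["attrs"]) and m["requirements"](job["attrs"])]
        # Pick the candidate the job ranks highest (the machine's own rank,
        # priorities and preemption are omitted in this sketch).
        return max(candidates, key=lambda m: job["rank"](m["attrs"]), default=None)

    best = match(job_ad, machine_ads)
    print(best["attrs"]["Name"] if best else "no match")   # node01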

Agents for Remote Execution in CONDOR
(Block diagram: on the submission side, the Request Queue, Customer Agent and Application Agent manage data & object files and checkpoint files; on the execution side, the Owner Agent and Execution Agent run the Application Process, with Remote I/O and checkpointing back to the submission site.)

Beyond Traditional Architectures: Mobile Agents (Java Aglets)
• “Agents are objects with rules and legs” -- D. Taylor
• Mobile Agents:
  - Execute asynchronously
  - Reduce network load: local conversations
  - Overcome network latency; some outages
  - Adaptive; robust, fault tolerant
  - Naturally heterogeneous
• Extensible concept: Agent Hierarchies
(Diagram: an Application Service backed by a hierarchy of Agents)
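The pattern is easy to caricature in a few lines (a single-process Python sketch, not Java aglets): the agent's code and a little state travel to each data host and run against the locally stored events, so only a small summary moves over the network.

    class SelectionAgent:
        def __init__(self, itinerary, threshold):
            self.itinerary = list(itinerary)   # hosts still to visit
            self.threshold = threshold
            self.selected = 0                  # summary state carried along

        def run_at(self, host):
            # "Local conversation": iterate over the host's data in place.
            self.selected += sum(1 for e in host.events if e["et"] > self.threshold)

    class DataHost:
        def __init__(self, name, events):
            self.name, self.events = name, events

        def accept(self, agent):
            agent.run_at(self)                 # execute the visiting agent locally,
            if agent.itinerary:                # then forward it to the next host
                agent.itinerary.pop(0).accept(agent)

    tier2 = DataHost("tier2", [{"et": 55.0}, {"et": 12.0}])
    tier1 = DataHost("tier1", [{"et": 80.0}, {"et": 5.0}, {"et": 41.0}])
    agent = SelectionAgent(itinerary=[tier2], threshold=40.0)
    tier1.accept(agent)
    print(agent.selected)                      # 3 events pass the cut across both hosts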

Using the Globus Tools
• Tests with “gsiftp”, a modified FTP server/client that allows control of the TCP buffer size
• Transfers of Objectivity database files from the Exemplar to:
  - Itself: ~25 MBytes/sec on the HiPPI loop-back
  - An O2K at Argonne (via CalREN-2 and Abilene): ~4 MBytes/sec by tuning the TCP window size
  - A Linux machine at INFN (via the US-CERN transatlantic link)
• Target /dev/null in multiple streams (1 to 16 parallel gsiftp sessions)
• Aggregate throughput measured as a function of the number of streams and the send/receive buffer sizes; saturating the available bandwidth to Argonne
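Why the buffer size matters: a single TCP stream delivers at most roughly (window size) / (round-trip time), so either the buffers must be raised toward the bandwidth-delay product or several parallel streams used. A back-of-envelope helper (the RTT value below is an invented example, not a measured number from these tests):

    def required_window_bytes(target_rate_mbytes_s, rtt_ms):
        """Bandwidth-delay product: window needed to sustain the target rate."""
        return target_rate_mbytes_s * 1e6 * (rtt_ms / 1e3)

    def max_rate_mbytes_s(window_bytes, rtt_ms, n_streams=1):
        """Approximate ceiling for n parallel streams with a given window each."""
        return n_streams * window_bytes / (rtt_ms / 1e3) / 1e6

    rtt = 60.0                                        # ms, assumed WAN round-trip time
    print(required_window_bytes(4.0, rtt))            # ~240 KB window for 4 MBytes/sec
    print(max_rate_mbytes_s(64 * 1024, rtt))          # ~1.1 MBytes/sec with a default 64 KB buffer
    print(max_rate_mbytes_s(64 * 1024, rtt, 8))       # ~8.7 MBytes/sec with 8 parallel streams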

Distributed Data Delivery and LHC Software Architecture
Software Architectural Choices:
• Traditional, single-threaded applications
  - Wait for data location, arrival and reassembly
OR
• Performance-Oriented (Complex)
  - I/O requests up-front; multi-threaded; data driven; respond to an ensemble of (changing) cost estimates
  - Possible code movement as well as data movement
  - Loosely coupled, dynamic
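A schematic of the performance-oriented style, in Python with a thread pool (the helper names and cost numbers are invented; this is a sketch of the pattern, not an LHC framework): all reads are posted up-front, each request goes to whichever replica currently has the lowest cost estimate, and events are processed in completion order rather than request order.

    from concurrent.futures import ThreadPoolExecutor, as_completed
    import random, time

    replica_cost = {"cern": 1.2, "fnal": 0.4, "caltech": 0.7}   # changing cost estimates (s/object)

    def fetch(obj_id):
        site = min(replica_cost, key=replica_cost.get)  # consult the current estimates
        time.sleep(random.uniform(0.01, 0.05))          # stand-in for the real transfer
        return obj_id, site

    def process(obj_id, site):
        pass                                            # per-event analysis would go here

    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fetch, i) for i in range(100)]   # I/O requests up-front
        for f in as_completed(futures):                 # data-driven: handle whatever arrives first
            process(*f.result())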

GriPhyN Foundation
• Build on the distributed-system results of the GIOD, MONARC, NILE, Clipper/GC and PPDG projects
• Long-Term Vision in Three Phases:
  1. Read/write access to high-volume data and processing power: Condor/Globus/SRB + NetLogger components to manage jobs and resources
  2. WAN-distributed data-intensive Grid computing system: tasks move automatically to the “most effective” node in the Grid; scalable implementation using mobile agent technology
  3. “Virtual Data” concept for multi-PB distributed data management, with large-scale Agent Hierarchies: transparently match data to sites, manage data replication or transport, co-schedule data & compute resources
• Build on VRVS developments for remote collaboration

GriPhyN/APOGEE: Production Design of a Data Analysis Grid
Instrumentation, Simulation, Optimization, Coordination:
• SIMULATION of a Production-Scale Grid Hierarchy
  - Provide a toolset for HENP experiments to test and optimize their data analysis and resource usage strategies
• INSTRUMENTATION of Grid Prototypes
  - Characterize the Grid components’ performance under load
  - Validate the simulation
  - Monitor, track and report system state, trends and “events”
• OPTIMIZATION of the Data Grid
  - Genetic algorithms, or other evolutionary methods
  - Deliver an optimization package for HENP distributed systems
  - Applications to other experiments; accelerator and other control systems; other fields
• COORDINATE with Experiment-Specific Projects: CMS, ATLAS, BaBar, Run 2
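As a flavor of the "evolutionary methods" bullet, the sketch below uses a minimal genetic algorithm to choose which datasets to replicate at which sites under a toy cost that trades WAN reads against disk usage. It is an illustration only; the cost model and parameters are invented, and it is not an APOGEE deliverable.

    import random
    random.seed(0)

    N_DATASETS, N_SITES, POP, GENS = 20, 4, 30, 40
    demand = [[random.random() for _ in range(N_SITES)] for _ in range(N_DATASETS)]

    def cost(plan):                       # plan[d][s] == 1 if dataset d is replicated at site s
        wan = sum(demand[d][s] for d in range(N_DATASETS) for s in range(N_SITES)
                  if not plan[d][s])      # remote reads where data is not held locally
        disk = sum(sum(row) for row in plan) * 0.3   # storage penalty per replica
        return wan + disk

    def random_plan():
        return [[random.randint(0, 1) for _ in range(N_SITES)] for _ in range(N_DATASETS)]

    def crossover(a, b):
        return [list(random.choice(rows)) for rows in zip(a, b)]   # mix per-dataset rows

    def mutate(plan):
        d, s = random.randrange(N_DATASETS), random.randrange(N_SITES)
        plan[d][s] ^= 1                   # flip one replication decision
        return plan

    population = [random_plan() for _ in range(POP)]
    for _ in range(GENS):
        population.sort(key=cost)
        parents = population[:POP // 2]   # keep the cheaper half, breed the rest
        population = parents + [mutate(crossover(random.choice(parents), random.choice(parents)))
                                for _ in range(POP - len(parents))]
    print("best replication cost:", round(cost(min(population, key=cost)), 2))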

Grid (IT) Issues to be Addressed
• Dataset compaction; data caching and mirroring strategies
• Using large time-quanta or very high bandwidth bursts for large data transactions
• Query estimators, query monitors (cf. GCA work)
• Enable flexible, resilient prioritisation schemes (marginal utility)
• Query redirection, fragmentation, priority alteration, etc.
• Pre-emptive and realtime data/resource matchmaking
• Resource discovery: data and CPU Location Brokers
• Co-scheduling and queueing processes
• State, workflow, and performance-monitoring instrumentation; tracking and forward prediction
• Security: authentication (for resource allocation/usage and priority); running a certificate authority

CMS Example: Data Grid Program of Work (I)
FY 2000
• Build basic services; “1 Million event” samples on proto-Tier 2s, for HLT milestones and detector/physics studies with ORCA
• MONARC Phase 3 simulations for study/optimization
FY 2001
• Set up the initial Grid system based on PPDG deliverables at the first Tier 2 centers and Tier 1 prototype centers
  - High-speed site-to-site file replication service
  - Multi-site cached file access
• CMS Data Challenges in support of the DAQ TDR
• Shakedown of preliminary PPDG (+ MONARC and GIOD) system strategies and tools
FY 2002
• Deploy the Grid system at the second set of Tier 2 centers
• CMS Data Challenges for the Software and Computing TDR and the Physics TDR

Data Analysis Grid Program of Work (II)
FY 2003
• Deploy Tier 2 centers at the last set of sites
• 5%-Scale Data Challenge in support of the Physics TDR
• Production-prototype test of the Grid Hierarchy System, with the first elements of the production Tier 1 Center
FY 2004
• 20% Production (Online and Offline) CMS Mock Data Challenge, with all Tier 2 Centers and the partly completed Tier 1 Center
• Build the production-quality Grid system
FY 2005 (Q1-Q2)
• Final Production CMS (Online and Offline) Shakedown
• Full distributed system software and instrumentation
• Using the full capabilities of the Tier 2 and Tier 1 Centers

Summary
• The HENP/LHC data handling problem:
  - Multi-Petabyte scale, binary pre-filtered data, resources distributed worldwide
  - Has no analog now, but will be increasingly prevalent in research and industry by ~2005
  - Development of a robust PB-scale networked data access and analysis system is mission-critical
• An effective partnership exists, HENP-wide, through many R&D projects: RD45, GIOD, MONARC, Clipper, GLOBUS, CONDOR, ALDAP, PPDG, . . .
• An aggressive R&D program is required to develop:
  - Resilient “self-aware” systems, for data access, processing and analysis across a hierarchy of networks
  - Solutions that could be widely applicable to data problems in other scientific fields and industry, by LHC startup
• Focus on Data Grids for Next Generation Physics

LHC Data Models: 1994-2000
• HEP data models are complex!
  - Rich hierarchy of hundreds of complex data types (classes)
  - Many relations between them
  - Different access patterns (multiple viewpoints)
• OO technology:
  - OO applications deal with networks of objects (and containers)
  - Pointers (or references) are used to describe relations
  - Example hierarchy: Event -> Tracker, Calorimeter; TrackList -> Track -> Hit; HitList
• Existing solutions do not scale
• Solution suggested by RD45: an ODBMS coupled to a Mass Storage System
• Construction of “compact” datasets for analysis: rapid access/navigation/transport
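The class names in the example hierarchy can be sketched directly (in Python here purely for brevity; the real data models are C++ classes persisted in the ODBMS), to show how relations between event components are expressed as object references that analysis code navigates.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Hit:
        channel: int
        energy: float

    @dataclass
    class Track:
        hits: List[Hit] = field(default_factory=list)        # Track -> Hit relations

    @dataclass
    class TrackList:
        tracks: List[Track] = field(default_factory=list)

    @dataclass
    class Tracker:
        track_list: TrackList = field(default_factory=TrackList)

    @dataclass
    class Calorimeter:
        hit_list: List[Hit] = field(default_factory=list)     # the HitList container

    @dataclass
    class Event:                                              # top of the navigation hierarchy
        run: int
        number: int
        tracker: Tracker = field(default_factory=Tracker)
        calorimeter: Calorimeter = field(default_factory=Calorimeter)

    # Analysis navigates references, e.g.:
    ev = Event(run=1, number=42)
    ev.tracker.track_list.tracks.append(Track(hits=[Hit(channel=7, energy=1.3)]))
    print(ev.tracker.track_list.tracks[0].hits[0].energy)     # 1.3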

Content Delivery Networks (CDN)
• Web-based server-farm networks circa 2000: dynamic (Grid-like) content delivery engines: Akamai, Adero, Sandpiper
• ~1200 to thousands of network-resident servers; 25-60 ISP networks; 25-30 countries; 40+ corporate customers; ~$25 B capitalization
• Resource discovery: build a “weathermap” of the server network (state tracking)
• Query estimation; matchmaking/optimization; request rerouting
• Virtual IP addressing: one address per server farm
• Mirroring, caching
• Autonomous-agent implementation

Strawman Tier 2 Evolution

                              2000                      2005
  Linux Farm:                 1,200 SI95                20,000 SI95 [*]
  Disks on CPUs:              4 TB                      50 TB
  RAID Array:                 1 TB                      30 TB
  Tape Library:               1-2 TB                    50-100 TB
  LAN Speed:                  0.1-1 Gbps                10-100 Gbps
  WAN Speed:                  155-622 Mbps              2.5-10 Gbps
  Collaborative
  Infrastructure:             MPEG2 VGA (1.5-3 Mbps)    Realtime HDTV (10-20 Mbps)

[*] Reflects lower Tier 2 component costs due to less demanding usage. Some of the CPU will be used for simulation.

USCMS S&C Spending Profile
(Chart of the spending profile by fiscal year.) 2006 is a model year for the operations phase of CMS.

GriPhyN Cost

  System support       $  8.0 M
  R&D                  $ 15.0 M
  Software             $  2.0 M
  Tier 2 networking    $ 10.0 M
  Tier 2 hardware      $ 50.0 M
  Total                $ 85.0 M

Grid Hierarchy Concept: Broader Advantages
• Partitioning of users into “proximate” communities for support, troubleshooting, mentoring
• Partitioning of facility tasks, to manage and focus resources
• Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
  - Lower tiers of the hierarchy mean more local control

Storage Request Brokers (SRB)
• Name Transparency: access to data by attributes stored in an RDBMS (MCAT)
• Location Transparency: logical collections (by attributes) spanning multiple physical resources
• Combined Location and Name Transparency means that datasets can be replicated across multiple caches and data archives (PPDG)
• Data Management Protocol Transparency: SRB with custom-built drivers in front of each storage system
  - The user does not need to know how the data is accessed; the SRB deals with local file system managers
• SRBs (agents) authenticate themselves and users, using the Grid Security Infrastructure (GSI)
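The transparency layers can be sketched as a small broker facade (illustrative only; this is not the SDSC SRB API, and all catalog entries, URLs and driver names are invented): a catalog maps attribute queries to logical datasets and their physical replicas, and a per-storage-system driver hides the access protocol.

    CATALOG = [   # stand-in for the MCAT relational catalog
        {"logical": "jets-sample-v1", "run": 4021,
         "replicas": [("hpss://site-a/ds123", "hpss"), ("file://site-b/ds123", "posix")]},
    ]

    DRIVERS = {   # one access driver per storage-system type
        "hpss":  lambda url: "staged " + url + " via pftp",
        "posix": lambda url: "opened " + url + " directly",
    }

    def open_by_attributes(**attrs):
        for entry in CATALOG:                         # attribute-based lookup (name transparency)
            if all(entry.get(k) == v for k, v in attrs.items()):
                url, kind = entry["replicas"][0]      # pick a replica (location transparency)
                return DRIVERS[kind](url)             # protocol hidden behind the driver
        raise LookupError("no dataset matches the given attributes")

    print(open_by_attributes(run=4021))               # staged hpss://site-a/ds123 via pftp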

Role of Simulation for Distributed Systems
• Simulations are widely recognized and used as essential tools for the design, performance evaluation and optimisation of complex distributed systems
  - From battlefields to agriculture; from the factory floor to telecommunications systems
• Discrete event simulations with an appropriate and high level of abstraction
• Just beginning to be part of the HEP culture:
  - Some experience in trigger, DAQ and tightly coupled computing systems: CERN CS2 models (event-oriented)
  - MONARC (process-oriented; Java 2 threads + class library)
• These simulations are very different from HEP “Monte Carlos”: “time” intervals and interrupts are the essentials
• Simulation is a vital part of the study of site architectures, network behavior, and data access/processing/delivery strategies, for HENP Grid design and optimization
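What "discrete event" means here is easy to show with a toy (event-oriented, unlike MONARC's process-oriented Java design, and with invented arrival and service rates): simulated time jumps from one scheduled event to the next, so long stretches of farm operation can be explored in seconds.

    import heapq, random
    from itertools import count
    random.seed(2)

    events, tie = [], count()                     # priority queue of (time, tiebreak, kind, payload)
    def schedule(t, kind, payload=None):
        heapq.heappush(events, (t, next(tie), kind, payload))

    FARM_CPUS, busy, queue, done = 4, 0, [], 0

    t = 0.0
    for _ in range(50):                           # 50 analysis jobs with random arrivals
        t += random.expovariate(1 / 5.0)          # mean inter-arrival time: 5 time units
        schedule(t, "arrival", random.uniform(10, 30))   # payload = CPU time needed

    now = 0.0
    while events:
        now, _, kind, payload = heapq.heappop(events)
        if kind == "arrival":
            if busy < FARM_CPUS:
                busy += 1
                schedule(now + payload, "finish")
            else:
                queue.append(payload)             # farm full: job waits
        else:                                     # "finish"
            done += 1
            if queue:
                schedule(now + queue.pop(0), "finish")
            else:
                busy -= 1
    print(done, "jobs completed by simulated time", round(now, 1))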

Monitoring Architecture: Use of NetLogger in CLIPPER
• End-to-end monitoring of grid assets is necessary to:
  - Resolve network throughput problems
  - Dynamically schedule resources
• Add precision-timed event monitor agents to:
  - ATM switches
  - Storage servers
  - Testbed computational resources
• Produce trend analysis modules for monitor agents
• Make results available to applications
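The monitoring idea reduces to two pieces: instrumented points that emit precision-timestamped events, and trend modules that reduce the event stream to quantities a scheduler or application can consult. The sketch below is a simplified illustration, not the NetLogger library or its wire format; all field and event names are invented.

    import time
    from collections import deque

    def emit(event_name, **fields):
        record = {"ts": time.time(), "event": event_name, **fields}
        print(record)                             # a real agent would ship this to a collector
        return record

    class TrendWindow:
        """Moving average over the last N values of one monitored quantity."""
        def __init__(self, n=20):
            self.values = deque(maxlen=n)
        def add(self, record, field):
            self.values.append(record[field])
        def mean(self):
            return sum(self.values) / len(self.values) if self.values else None

    throughput = TrendWindow()
    for chunk in range(5):                        # e.g. instrumenting a file-transfer loop
        rec = emit("disk_server.read_done", host="srv01", mbytes_per_s=40 + chunk)
        throughput.add(rec, "mbytes_per_s")
    print("recent throughput estimate:", throughput.mean(), "MBytes/sec")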