U.S. Grid Projects: Grid3 and Open Science Grid
Paul Avery, University of Florida (avery@phys.ufl.edu)
International ICFA Workshop on HEP, Networking & Digital Divide Issues for Global e-Science
Daegu, Korea, May 23, 2005
U.S. "Trillium" Grid Partnership
- Trillium = PPDG + GriPhyN + iVDGL
  - Particle Physics Data Grid: $12M (DOE) (1999 - 2006)
  - GriPhyN: $12M (NSF) (2000 - 2005)
  - iVDGL: $14M (NSF) (2001 - 2006)
- Basic composition (~150 people)
  - PPDG: 4 universities, 6 labs
  - GriPhyN: 12 universities, SDSC, 3 labs
  - iVDGL: 18 universities, SDSC, 4 labs, foreign partners
  - Experiments: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
- Coordinated internally to meet broad goals
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - iVDGL: Grid laboratory deployment using VDT, applications
  - PPDG: "end-to-end" Grid services, monitoring, analysis
  - Common use of VDT for underlying Grid middleware
  - Unified entity when collaborating internationally
Goal: Peta-scale Data Grids for Global Science
[Architecture diagram: production teams, workgroups, and single researchers use interactive user tools; virtual data tools, request planning & scheduling tools, and request execution & management tools sit on resource management, security and policy, and other Grid services; underneath are distributed resources (code, storage, CPUs, networks), transforms, and raw data sources, at the scale of PetaOps and Petabytes.]
Grid Middleware: Virtual Data Toolkit
[Build-pipeline diagram: sources (CVS) from NMI and many contributors feed a build & test Condor pool spanning 22+ operating systems; builds are packaged (GPT source bundles, binaries, RPMs), patched, and published through a Pacman cache.]
A unique laboratory for testing, supporting, deploying, packaging, upgrading, and troubleshooting complex sets of software!
VDT Growth Over 3 Years (www.griphyn.org/vdt/)
[Chart: number of VDT components vs. time, with milestones VDT 1.0 (Globus 2.0b, Condor 6.3.1), VDT 1.1.7 (switch to Globus 2.2), VDT 1.1.8 (first real use by LCG), and VDT 1.1.11 (Grid3).]
Trillium Science Drivers
- Experiments at the Large Hadron Collider
  - New fundamental particles and forces
  - 100s of Petabytes, 2007 - ?
- High Energy & Nuclear Physics experiments
  - Top quark, nuclear matter at extreme density
  - ~1 Petabyte (1000 TB), 1997 - present
- LIGO (gravity wave search)
  - Search for gravitational waves
  - 100s of Terabytes, 2002 - present
- Sloan Digital Sky Survey
  - Systematic survey of astronomical objects
  - 10s of Terabytes, 2001 - present
[Inset chart: data growth and community growth vs. time, 2001 - 2009.]
LHC: Petascale Global Science
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), 100s of Petabytes (data)
- Distribution: global distribution of people & resources
BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries
LHC Global Data Grid (2007+)
- 5000 physicists, 60 countries
- 10s of Petabytes/yr by 2008
- 1000 Petabytes in < 10 yrs?
[Tiered-architecture diagram for the CMS experiment: the online system feeds the Tier 0 CERN computer center at 200 - 1500 MB/s; Tier 1 national centers (Korea, Russia, UK, USA), Tier 2 university centers (U Florida, Caltech, UCSD), Tier 3 physics caches (FIU, Iowa, Maryland), and Tier 4 PCs are linked at 2.5 - 10, >10, and 10 - 40 Gb/s; a rough transfer-time estimate follows below.]
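To put the bandwidth figures above in perspective, here is a rough, illustrative transfer-time estimate; the dataset size, link speed, and efficiency are assumptions rather than slide content, and the point is simply why petabyte-scale data is partitioned and replicated across tiers rather than moved wholesale:

    # Rough transfer-time estimate for LHC-scale datasets (illustrative only).
    dataset_bytes = 1e15      # 1 Petabyte (assumed example size)
    link_gbps = 10            # a well-provisioned tier link, 10 Gb/s (assumed)
    efficiency = 0.5          # assume ~50% usable throughput

    seconds = dataset_bytes * 8 / (link_gbps * 1e9 * efficiency)
    print(f"~{seconds / 86400:.1f} days to move 1 PB over a {link_gbps} Gb/s link")
    # -> roughly 18.5 days, hence the tiered distribution and replication of data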
University LHC Tier 2 Centers
- Tier 2 facility
  - Essential university role in extended computing infrastructure
  - 20 - 25% of a Tier 1 national laboratory, supported by NSF
  - Validated by 3 years of experience (CMS, ATLAS)
- Functions
  - Perform physics analysis, simulations
  - Support experiment software, smaller institutions
- Official role in Grid hierarchy (U.S.)
  - Sanctioned by MOU with parent organization (ATLAS, CMS)
  - Selection by collaboration via careful process
Grids and Globally Distributed Teams
- Non-hierarchical: chaotic analyses + productions
- Superimpose significant random data flows
CMS: Grid Enabled Analysis Architecture
- Clients talk standard protocols (HTTP, SOAP, XML-RPC) to a Clarens "Grid Services Web Server" (a client sketch follows below)
- Simple Web service API allows simple or complex analysis clients
- Typical clients: ROOT, Web browser, ...
- Key features: global scheduler, catalogs, monitoring, Grid-wide execution service
- Clarens portal hides complexity
[Architecture diagram: analysis clients (ROOT, Python, Cojac detector visualization, IGUANA CMS visualization) with discovery, ACL management, and certificate-based access connect to the Grid Services Web Server, which hosts schedulers (Sphinx, MCRunjob), catalogs (metadata, RefDB, Chimera virtual data, replica, MOPDB), fully-abstract / partially-abstract / fully-concrete planners, data management, monitoring (MonALISA, BOSS), applications (ORCA, ROOT, FAMOS, POOL), an execution priority manager, and a VDT server providing Grid-wide execution.]
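Since the slide emphasizes that analysis clients speak standard protocols such as XML-RPC to the Clarens web server, a minimal client might look like the sketch below; the endpoint URL and the catalog.query method are hypothetical placeholders, not the actual Clarens API:

    # Minimal sketch of an analysis client calling a Grid service over XML-RPC.
    # The endpoint URL and method name are illustrative assumptions only.
    import xmlrpc.client

    server = xmlrpc.client.ServerProxy("https://grid-services.example.edu:8443/clarens")

    # e.g. ask a (hypothetical) catalog service for datasets matching a pattern
    for dataset in server.catalog.query("dataset like 'cms_dc04_%'"):
        print(dataset)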
Grid3: A National Grid Infrastructure
- 32 sites, 4000 CPUs: universities + 4 national labs
- Part of the LHC Grid; running since October 2003
- Sites in the US, Korea, Brazil, Taiwan
- Applications in HEP, LIGO, SDSS, genomics, fMRI, CS
www.ivdgl.org/grid3
Grid3 Components
- Computers & storage at ~30 sites: 4000 CPUs
- Uniform service environment at each site
  - Globus 3.2: authentication, execution management, data movement
  - Pacman: installation of numerous VDT and application services
- Global & virtual organization services
  - Certification & registration authorities, VO membership & monitoring services
- Client-side tools for data access & analysis
  - Virtual data, execution planning, DAG management, execution management, monitoring (a DAG sketch follows below)
- IGOC: iVDGL Grid Operations Center
- Grid testbed: Grid3 dev
  - Middleware development and testing, new VDT versions, etc.
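Grid3's client-side tools include DAG management for multi-step jobs (referenced above). As a minimal sketch of the underlying idea, the snippet below writes a two-node workflow in HTCondor DAGMan syntax; the job names and submit files are made up for illustration, and the real Grid3 tooling generated such workflows from virtual-data plans rather than by hand:

    # Sketch: a two-step workflow expressed as an HTCondor DAGMan file.
    # Job and submit-file names are hypothetical.
    dag_lines = [
        "JOB generate generate.sub",
        "JOB simulate simulate.sub",
        "PARENT generate CHILD simulate",
    ]
    with open("analysis.dag", "w") as f:
        f.write("\n".join(dag_lines) + "\n")

    # The DAG would then be submitted with:  condor_submit_dag analysis.dag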
Grid3 Applications
- CMS experiment: p-p collision simulations & analysis
- ATLAS experiment: p-p collision simulations & analysis
- BTeV experiment: p-p collision simulations & analysis
- LIGO: search for gravitational wave sources
- SDSS: galaxy cluster finding
- Bio-molecular analysis: Shake-and-Bake (SnB) (Buffalo)
- Genome analysis: GADU/Gnare
- fMRI: functional MRI (Dartmouth)
- CS demonstrators: Job Exerciser, GridFTP, NetLogger
www.ivdgl.org/grid3/applications
Grid3 Shared Use Over 6 Months
[Chart: CPU usage by virtual organization through Sep 10, 2004, showing ATLAS DC2 and CMS DC04 activity.]
Grid3 Production Over 13 Months
[Production chart.]
U.S. CMS 2003 Production
- 10M p-p collisions; largest ever
  - 2x the simulation sample
  - ½ the manpower
- Multi-VO sharing
Grid3 Lessons Learned
- How to operate a Grid as a facility
  - Tools, services, error recovery, procedures, docs, organization
  - Delegation of responsibilities (project, VO, service, site, ...)
  - Crucial role of Grid Operations Center (GOC)
- How to support people relations
  - Face-to-face meetings, phone cons, 1-on-1 interactions, mail lists, etc.
- How to test and validate Grid tools and applications
  - Vital role of testbeds
- How to scale algorithms, software, process
  - Some successes, but "interesting" failure modes still occur
- How to apply distributed cyberinfrastructure
  - Successful production runs for several applications
Grid3 to Open Science Grid
- Iteratively build & extend Grid3: Grid3 -> OSG-0 -> OSG-1 -> OSG-2 -> ...
  - Shared resources, benefiting broad set of disciplines
  - Grid middleware based on Virtual Data Toolkit (VDT)
- Consolidate elements of OSG collaboration
  - Computer and application scientists
  - Facility, technology and resource providers (labs, universities)
- Further develop OSG
  - Partnerships with other sciences, universities
  - Incorporation of advanced networking
  - Focus on general services, operations, end-to-end performance
- Aim for July 2005 deployment
http://www.opensciencegrid.org
OSG Organization
[Organization chart: an Advisory Committee and the OSG Council (all members above a certain threshold; Chair, officers) oversee an Executive Board (8 - 15 representatives; Chair, officers) and a core OSG staff (few FTEs, manager); participants include universities, labs, service providers, sites, researchers, VOs, research Grid projects, and enterprise partners; work is organized through Technical Groups and Activities.]
OSG Technical Groups & Activities
- Technical Groups address and coordinate technical areas
  - Propose and carry out activities related to their given areas
  - Liaise & collaborate with other peer projects (U.S. & international)
  - Participate in relevant standards organizations
  - Chairs participate in Blueprint, Integration and Deployment activities
- Activities are well-defined, scoped tasks contributing to OSG
  - Each Activity has deliverables and a plan
  - ... is self-organized and operated
  - ... is overseen & sponsored by one or more Technical Groups
TGs and Activities are where the real work gets done
OSG Technical Groups
- Governance: charter, organization, by-laws, agreements, formal processes
- Policy: VO & site policy, authorization, priorities, privilege & access rights
- Security: common security principles, security infrastructure
- Monitoring and Information Services: resource monitoring, information services, auditing, troubleshooting
- Storage: storage services at remote sites, interfaces, interoperability
- Support Centers: infrastructure and services for user support, helpdesk, trouble tickets
- Education / Outreach: training, interface with various E/O projects
- Networks (new): including interfacing with various networking projects
OSG Activities
- Blueprint: defining principles and best practices for OSG
- Deployment: deployment of resources & services
- Provisioning: connected to deployment
- Incident response: plans and procedures for responding to security incidents
- Integration: testing, validating & integrating new services and technologies
- Data Resource Management (DRM): deployment of specific Storage Resource Management technology
- Documentation: organizing the documentation infrastructure
- Accounting: accounting and auditing use of OSG resources
- Interoperability: primarily interoperability between ...
- Operations: operating Grid-wide services
The Path to the OSG Operating Grid
[Process diagram: software & packaging, service deployment, middleware interoperability, and functionality & scalability tests pass through the OSG Integration Activity (readiness plan adopted, VO application software installation, application validation, metrics & certification, release candidate, release description), then through the OSG Deployment and Operations-Provisioning Activities to the operating Grid, with effort, resources, and feedback flowing between stages.]
OSG Integration Testbed
- >20 sites and rising (including Brazil)
Status of OSG Deployment
- OSG infrastructure release "accepted" for deployment
  - US CMS application "flood testing" successful
  - D0 simulation & reprocessing jobs running on selected OSG sites
  - Others in various stages of readying applications & infrastructure (ATLAS, CMS, STAR, CDF, BaBar, fMRI)
- Deployment process underway: end of July?
  - Open OSG and transition resources from Grid3
  - Applications will use growing ITB & OSG resources during transition
http://osg.ivdgl.org/twiki/bin/view/Integration/WebHome
Connections to LCG and EGEE
- Many LCG-OSG interactions
Interoperability & Federation
- Transparent use of federated Grid infrastructures is a goal
  - LCG, EGEE
  - TeraGrid
  - State-wide Grids
  - Campus Grids (Wisconsin, Florida, etc.)
- Some early activities with LCG
  - Some OSG/Grid3 sites appear in the LCG map
  - D0 bringing reprocessing to LCG sites through an adaptor node
  - CMS and ATLAS can run their jobs on both LCG and OSG
- Increasing interaction with TeraGrid
  - CMS and ATLAS sample simulation jobs are running on TeraGrid
  - Plans for TeraGrid allocation for jobs running in the Grid3 model (group accounts, binary distributions, external data management, etc.)
UltraLight: Advanced Networking in Applications
- Funded by ITR 2004
- 10 Gb/s+ network
  - Caltech, UF, FIU, UM, MIT
  - SLAC, FNAL
  - Int'l partners
  - Level(3), Cisco, NLR
UltraLight: New Information System
- A new class of integrated information systems
  - Includes networking as a managed resource for the first time
  - Uses "hybrid" packet-switched and circuit-switched optical network infrastructure
  - Monitor, manage & optimize network and Grid systems in real time
- Flagship applications: HEP, eVLBI, "burst" imaging
  - "Terabyte-scale" data transactions in minutes
  - Extend real-time eVLBI to the 10 - 100 Gb/s range
- Powerful testbed
  - Significant storage, optical networks for testing new Grid services
- Strong vendor partnerships
  - Cisco, Calient, NLR, CENIC, Internet2/Abilene
iVDGL, GriPhyN Education/Outreach
Basics:
- $200K/yr
- Led by UT Brownsville
- Workshops, portals, tutorials
- New partnerships with QuarkNet, CHEPREO, LIGO E/O, ...
U.S. Grid Summer School
- First of its kind in the U.S. (June 2004, South Padre Island)
  - 36 students, diverse origins and types (M, F, MSIs, etc.)
- Marks new direction for U.S. Grid efforts
  - First attempt to systematically train people in Grid technologies
  - First attempt to gather relevant materials in one place
  - Today: students in CS and physics
  - Next: students, postdocs, junior & senior scientists
- Reaching a wider audience
  - Put lectures, exercises, video on the web
  - More tutorials, perhaps 2-3/year
  - Dedicated resources for remote tutorials
  - Create "Grid Cookbook", e.g. Georgia Tech
- Second workshop: July 11 - 15, 2005
  - South Padre Island
QuarkNet/GriPhyN e-Lab Project
http://quarknet.uchicago.edu/elab/cosmic/home.jsp
Student Muon Lifetime Analysis in GriPhyN/QuarkNet
CHEPREO: Center for High Energy Physics Research and Educational Outreach
Florida International University
- Physics Learning Center
- CMS research
- iVDGL Grid activities
- AMPATH network (S. America)
- Funded September 2003; $4M initially (3 years)
- MPS, CISE, EHR, INT
Science Grid Communications
Broad set of activities:
- News releases, PR, etc.
- Science Grid This Week
- Katie Yurkewicz talk
Grids and the Digital Divide: Rio de Janeiro + Daegu
Background:
- World Summit on the Information Society
- HEP Standing Committee on Interregional Connectivity (SCIC)
Themes:
- Global collaborations, Grids and addressing the Digital Divide
- Focus on poorly connected regions
- Brazil (2004), Korea (2005)
New Campus Research Grids (e.g., Florida)
[Layer diagram: applications (HEP, QTP, HCS, CISE, Bio, Astro, Geo, MBI, Nano, Chem, ...) sit on Grid infrastructure and middleware (Grid operations, certificate authority, service providers, user support, database operations) over campus facilities (ACIS, CNS, DWI, MBI, QTP, HCS, HPC I/II/III, CMS Tier 2).]
US HEP Data Grid Timeline
[Timeline, 2000 - 2006: PPDG approved ($9.5M); GriPhyN approved ($11.9M + $1.6M); iVDGL approved ($13.7M + $2M); VDT 1.0; first US-LHC Grid testbeds; start of Grid3 operations; CHEPREO approved ($4M); UltraLight approved ($2M); GLORIAD funded; Digital Divide Workshops; Grid Communications; Grid Summer Schools I and II; LIGO Grid; DISUN approved ($10M); start of Open Science Grid operations.]
Summary
- Grids enable 21st century collaborative science
  - Linking research communities and resources for scientific discovery
  - Needed by global collaborations pursuing "petascale" science
- Grid3 was an important first step in developing US Grids
  - Value of planning, coordination, testbeds, rapid feedback
  - Value of learning how to operate a Grid as a facility
  - Value of building & sustaining community relationships
- Grids drive need for advanced optical networks
- Grids impact education and outreach
  - Providing technologies & resources for training, education, outreach
  - Addressing the Digital Divide
- OSG: a scalable computing infrastructure for science?
  - Strategies needed to cope with increasingly large scale
Grid Project References
- Open Science Grid: www.opensciencegrid.org
- Grid3: www.ivdgl.org/grid3
- Virtual Data Toolkit: www.griphyn.org/vdt
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org
- PPDG: www.ppdg.net
- UltraLight: ultralight.cacr.caltech.edu
- Globus: www.globus.org
- Condor: www.cs.wisc.edu/condor
- LCG: www.cern.ch/lcg
- EU DataGrid: www.eu-datagrid.org
- EGEE: www.eu-egee.org
- CHEPREO: www.chepreo.org
Extra Slides
GriPhyN Goals
- Conduct CS research to achieve vision
  - Virtual Data as unifying principle
  - Planning, execution, performance monitoring
- Disseminate through Virtual Data Toolkit
  - A "concrete" deliverable
- Integrate into GriPhyN science experiments
  - Common Grid tools, services
- Educate, involve, train students in IT research
  - Undergrads, postdocs
  - Underrepresented groups
iVDGL Goals
- Deploy a Grid laboratory
  - Support research mission of data intensive experiments
  - Provide computing and personnel resources at university sites
  - Provide platform for computer science technology development
  - Prototype and deploy a Grid Operations Center (iGOC)
- Integrate Grid software tools
  - Into computing infrastructures of the experiments
- Support delivery of Grid technologies
  - Hardening of the Virtual Data Toolkit (VDT) and other middleware technologies developed by GriPhyN and other Grid projects
- Education and Outreach
  - Lead and collaborate with Education and Outreach efforts
  - Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects
"Virtual Data": Derivation & Provenance
- Most scientific data are not simple "measurements"
  - They are computationally corrected/reconstructed
  - They can be produced by numerical simulation
- Science & engineering projects are more CPU and data intensive
  - Programs are significant community resources (transformations)
  - So are the executions of those programs (derivations)
- Management of dataset dependencies critical!
  - Derivation: instantiation of a potential data product
  - Provenance: complete history of any existing data product
- Previously: manual methods
- GriPhyN: automated, robust tools (a bookkeeping sketch follows below)
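To make the transformation / derivation / provenance vocabulary concrete, here is a minimal, illustrative model of virtual-data bookkeeping; it is not GriPhyN's actual Chimera/VDL data model, just a sketch of how dataset dependencies could be recorded so that any product's history can be traced:

    # Illustrative-only model of virtual-data bookkeeping; the real GriPhyN
    # system (Chimera + VDL) has its own catalog schema and language.
    from dataclasses import dataclass, field

    @dataclass
    class Transformation:        # a registered program, a community resource
        name: str
        version: str

    @dataclass
    class Derivation:            # one recorded execution of a transformation
        transform: Transformation
        inputs: list             # logical file names consumed
        outputs: list            # logical file names produced
        parameters: dict = field(default_factory=dict)

    def provenance(product, derivations):
        # Walk derivations backwards to collect the complete history of `product`.
        history = []
        for d in derivations:
            if product in d.outputs:
                history.append(d)
                for parent in d.inputs:
                    history.extend(provenance(parent, derivations))
        return history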
Virtual Data Example: HEP Analysis
[Derivation-tree diagram: starting from mass = 160, branches are derived for decay = bb, decay = WW, and decay = ZZ; the decay = WW branch is refined further (WW -> leptons, WW -> e) with Pt > 20 and other cuts. A scientist adds a new derived data branch and continues the analysis.]
Packaging of Grid Software: Pacman
- Language: define software environments
- Interpreter: create, install, configure, update, verify environments
- Version 3.0.2 released Jan. 2005
- Combines and manages software from arbitrary sources: LCG/Scram, ATLAS/CMT, CMS DPE/tar/make, LIGO/tar/make, OpenSource/tar/make, Globus/GPT, NPACI/TeraGrid/tar/make, D0/UPS-UPD, Commercial/tar/make
- "1 button install" reduces the burden on administrators:
    % pacman -get iVDGL:Grid3
- Remote experts (iVDGL, LIGO, VDT, UCHEP, D-Zero, ATLAS, CMS/DPE, NPACI caches) define installation/config/updating for everyone at once
Virtual Data Motivations
[Four use cases arranged around a Virtual Data Catalog (VDC), with the verbs Describe, Discover, Reuse, Validate:]
- "I've found some interesting data, but I need to know exactly what corrections were applied before I can trust it."
- "I've detected a muon calibration error and want to know which derived data products need to be recomputed."
- "I want to search a database for 3 muon events. If a program that does this analysis exists, I won't have to write one from scratch."
- "I want to apply a forward jet analysis to 100M events. If the results already exist, I'll save weeks of computation."
(A catalog-lookup sketch follows below.)
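The reuse scenarios above boil down to a catalog lookup before computing. A minimal sketch of that logic follows; the dictionary standing in for the catalog and the function names are hypothetical, not the GriPhyN VDC interface:

    # Hedged sketch: consult a catalog of already-derived results before running
    # an expensive analysis, reusing existing output when the same transformation
    # and parameters have been executed before.
    def get_or_derive(catalog, transform, params, run):
        key = (transform, tuple(sorted(params.items())))
        if key in catalog:
            print("Result already derived; reusing", catalog[key])
            return catalog[key]
        result = run(**params)   # the expensive step, skipped when cached
        catalog[key] = result
        return result

    # Example use with a plain dict standing in for the catalog service:
    catalog = {}
    output = get_or_derive(catalog, "forward_jet_analysis", {"pt_min": 20},
                           run=lambda pt_min: f"jets_ptmin{pt_min}.root")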
Background: Data Grid Projects
Driven primarily by HEP applications
- U.S. funded projects
  - GriPhyN (NSF)
  - iVDGL (NSF)
  - Particle Physics Data Grid (DOE)
  - UltraLight
  - TeraGrid (NSF)
  - DOE Science Grid (DOE)
  - NEESgrid (NSF)
  - NSF Middleware Initiative (NSF)
- EU, Asia projects
  - EGEE (EU)
  - LCG (CERN)
  - DataGrid and EU national projects
  - DataTAG (EU)
  - CrossGrid (EU)
  - GridLab (EU)
  - Japanese, Korean projects
- Many projects driven/led by HEP + CS
- Many 10s x $M brought into the field
- Large impact on other sciences, education
Muon Lifetime Analysis Workflow
(Early) Virtual Data Language
[CMS "pipeline": pythia_input -> pythia.exe -> cmsim_input -> cmsim.exe -> writeHits -> writeDigis]

    begin v /usr/local/demo/scripts/cmkin_input.csh
      file i ntpl_file_path
      file i template_file
      file i num_events
      stdout cmkin_param_file
    end
    begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
      pre cms_env_var
      stdin cmkin_param_file
      stdout cmkin_log
      file o ntpl_file
    end
    begin v /usr/local/demo/scripts/cmsim_input.csh
      file i ntpl_file
      file i fz_file_path
      file i hbook_file_path
      file i num_trigs
      stdout cmsim_param_file
    end
    begin v /usr/local/demo/binaries/cms121.exe
      condor copy_to_spool=false
      condor getenv=true
      stdin cmsim_param_file
      stdout cmsim_log
      file o fz_file
      file o hbook_file
    end
    begin v /usr/local/demo/binaries/writeHits.sh
      condor getenv=true
      pre orca_hits
      file i fz_file
      file i detinput
      file i condor_writeHits_log
      file i oo_fd_boot
      file i datasetname
      stdout writeHits_log
      file o hits_db
    end
    begin v /usr/local/demo/binaries/writeDigis.sh
      pre orca_digis
      file i hits_db
      file i oo_fd_boot
      file i carf_input_dataset_name
      file i carf_output_dataset_name
      file i carf_input_owner
      file i carf_output_owner
      file i condor_writeDigis_log
      stdout writeDigis_log
      file o digis_db
    end
QuarkNet Portal Architecture
- Simpler interface for non-experts
- Builds on Chiron portal
Integration of GriPhyN and iVDGL
- Both funded by NSF large ITRs, overlapping periods
  - GriPhyN: CS research, Virtual Data Toolkit (9/2000 - 9/2005)
  - iVDGL: Grid laboratory, applications (9/2001 - 9/2006)
- Basic composition
  - GriPhyN: 12 universities, SDSC, 4 labs (~80 people)
  - iVDGL: 18 institutions, SDSC, 4 labs (~100 people)
  - Experiments: CMS, ATLAS, LIGO, SDSS/NVO
- GriPhyN (Grid research) vs iVDGL (Grid deployment)
  - GriPhyN: 2/3 "CS" + 1/3 "physics" (0% H/W)
  - iVDGL: 1/3 "CS" + 2/3 "physics" (20% H/W)
- Many common elements
  - Common Directors, Advisory Committee, linked management
  - Virtual Data Toolkit (VDT)
  - Grid testbeds
  - Outreach effort
GriPhyN Overview
[Architecture diagram: researchers and production managers (science review, sharing, analysis, composition, planning, instrument and data parameters) interact with applications and virtual data services; the Chimera virtual data system, Pegasus planner, and DAGMan handle planning and execution on the Grid fabric (Globus Toolkit, Condor, Ganglia, storage elements, etc.), packaged together as the Virtual Data Toolkit.]
Chiron/QuarkNet Architecture
Cyberinfrastructure
"A new age has dawned in scientific & engineering research, pushed by continuing progress in computing, information, and communication technology, & pulled by the expanding complexity, scope, and scale of today's challenges. The capacity of this technology has crossed thresholds that now make possible a comprehensive 'cyberinfrastructure' on which to build new types of scientific & engineering knowledge environments & organizations and to pursue research in new ways & with increased efficacy." [NSF Blue Ribbon Panel report, 2003]
Fulfilling the Promise of Next Generation Science
Our multidisciplinary partnership of physicists, computer scientists, engineers, networking specialists and education experts, from universities and laboratories, has achieved tremendous success in creating and maintaining general purpose cyberinfrastructure supporting leading-edge science. But these achievements have occurred in the context of overlapping short-term projects. How can we ensure the survival of valuable existing cyberinfrastructure while continuing to address new challenges posed by frontier scientific and engineering endeavors?
Production Simulations on Grid3
US-CMS Monte Carlo simulation
[Chart comparing US-CMS and non-US-CMS resources; the simulation used about 1.5x the dedicated US-CMS resources.]
Components of VDT 1.3.5
- Globus 3.2.1
- Condor 6.7.6
- RLS 3.0
- ClassAds 0.9.7
- Replica 2.2.4
- DOE/EDG CA certs
- ftsh 2.0.5
- EDG mkgridmap
- EDG CRL Update
- GLUE Schema 1.0
- VDS 1.3.5b
- Java
- Netlogger 3.2.4
- Gatekeeper-Authz
- MyProxy 1.11
- KX509
- System Profiler
- GSI OpenSSH 3.4
- MonALISA 1.2.32
- PyGlobus 1.0.6
- MySQL
- UberFTP 1.11
- DRM 1.2.6a
- VOMS 1.4.0
- VOMS Admin 0.7.5
- Tomcat
- PRIMA 0.2
- Certificate Scripts
- Apache
- jClarens 0.5.3
- New GridFTP Server
- GUMS 1.0.1
Collaborative Relationships: A CS + VDT Perspective
[Flow diagram: requirements, prototyping & experiments from partner science, networking, and outreach projects feed computer science research, which produces techniques & software captured in the Virtual Data Toolkit; technology transfer carries this to the larger science community (U.S. Grids, international partners, outreach, production deployment); other linkages include the work force, CS researchers, and industry. Partners include Globus, Condor, NMI, iVDGL, PPDG, EU DataGrid, LHC experiments, QuarkNet, CHEPREO, and Digital Divide efforts.]


