CERN Review of the EU DataGrid Project


  • Number of slides: 31

CERN Review of the EU DataGrid Project and other EU Grid initiatives
Fabrizio GAGLIARDI, CERN, Geneva, Switzerland
EU DataGrid Project Head
November 2001
F. [email protected] ch
Krakow 2001

Talk summary
§ Introduction
§ EU DataGrid background
§ Project status
§ Future plans
§ Other EU initiatives
§ Conclusions

CERN: The European Organisation for Nuclear Research
§ 20 European countries, 2,500 staff, 6,000 users

CERN: 27 km of tunnel stuffed with magnets and klystrons

One of the four LHC detectors
§ 40 MHz (40 TB/sec) online system; multi-level trigger to filter out background and reduce data volume:
  § level 1 - special hardware: 75 KHz (75 GB/sec)
  § level 2 - embedded processors: 5 KHz (5 GB/sec)
  § level 3 - PCs: 100 Hz (100 MB/sec)
§ data recording & offline analysis
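
A quick back-of-the-envelope check of these trigger numbers, written as a small Python sketch: assuming the event size implied by the slide's own figures (40 MHz and 40 TB/sec give roughly 1 MB per event), each trigger level's accepted rate times the event size gives the bandwidth the next stage must absorb. The 1 MB/event constant is an inference from the slide, not an official detector parameter.

```python
# Rough data-rate arithmetic for the multi-level trigger chain described above.
# The ~1 MB/event figure is inferred from the slide (40 MHz -> 40 TB/s); treat it
# as an illustrative assumption rather than a quoted specification.

EVENT_SIZE_BYTES = 1_000_000  # ~1 MB per event, implied by 40 MHz -> 40 TB/s

trigger_levels = [
    ("collision rate",          40_000_000),  # 40 MHz at the detector
    ("level 1 (hardware)",          75_000),  # 75 kHz accepted
    ("level 2 (embedded CPUs)",      5_000),  # 5 kHz accepted
    ("level 3 (PC farm)",              100),  # 100 Hz written to storage
]

for name, rate_hz in trigger_levels:
    bandwidth = rate_hz * EVENT_SIZE_BYTES  # bytes per second
    print(f"{name:25s} {rate_hz:>12,d} Hz -> {bandwidth / 1e9:10.1f} GB/s")
```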

The LHC Detectors: CMS, ATLAS, LHCb
§ ~6-8 PetaBytes / year
§ ~10^8 events / year

Funding
§ Requirements growing faster than Moore's law
§ CERN's overall budget is fixed
§ Estimated cost of facility at CERN ~30% of offline requirements*
§ Budget level in 2000 for all physics data handling
[Chart: cost breakdown covering R&D testbed, physics WAN, systems administration, mass storage, disks and processors]
*assumes physics in July 2005, rapid ramp-up of luminosity

World Wide Collaboration: distributed computing & storage capacity
§ CMS: 1800 physicists, 150 institutes, 32 countries

LHC Computing Model
[Diagram: multi-tier model with CERN at the centre; Tier 1 centres in France, Italy, the UK, Germany, NL and the USA (FermiLab, Brookhaven); Tier 2 centres, laboratories and university groups down to physics-department desktops]

Five Emerging Models of Networked Computing From The Grid
§ Distributed Computing: synchronous processing
§ High-Throughput Computing: asynchronous processing
§ On-Demand Computing: dynamic resources
§ Data-Intensive Computing: databases
§ Collaborative Computing: scientists
Ian Foster and Carl Kesselman, editors, "The Grid: Blueprint for a New Computing Infrastructure", Morgan Kaufmann, 1999, http://www.mkp.com/grids

EU DataGrid background
§ Motivated by the challenge of LHC computing
§ Excellent match between the Grid computing model and HEP requirements (Foster's quote: HEP is Grid computing "par excellence")
§ Large amount of data (~10 PBytes/year starting in 2006)
§ Distributed computing resources and skills
§ Geographically worldwide distributed community (VO)
§ Transition from supercomputers to commodity computing done
§ Distributed job-level parallelism (no strong need for MPI)
§ High-throughput computing rather than supercomputing
§ VO tradition already long established
§ Prototype Grid activity in some CERN member states

Main project goals and characteristics
§ To build a significant prototype of the LHC computing model
§ To collaborate with and complement other European and US projects
§ To develop a sustainable computing model applicable to other sciences and industry: biology, earth observation etc.
§ Specific project objectives:
  § Middleware for fabric & Grid management (mostly funded by the EU): evaluation, test, and integration of existing M/W S/W, plus research and development of new S/W as appropriate
  § Large scale testbed (mostly funded by the partners)
  § Production quality demonstrations (partially funded by the EU)
§ Open source and communication:
  § Global GRID Forum
  § Industry and Research Forum

Main Partners
§ CERN – International (Switzerland/France)
§ CNRS – France
§ ESA/ESRIN – International (Italy)
§ INFN – Italy
§ NIKHEF – The Netherlands
§ PPARC – UK

Associated Partners
Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) – France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics – Finland
• Institut de Fisica d'Altes Energies (IFAE) – Spain
• Istituto Trentino di Cultura (IRST) – Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg – Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
• Swedish Natural Science Research Council (NFR) – Sweden
Industrial Partners
• Datamat (Italy)
• IBM (UK)
• CS-SI (France)

Project scope
§ 9.8 M Euros of EU funding over 3 years
§ 90% for middleware and applications (HEP, EO and biology)
§ Three-year phased developments & demos (2001-2003)
§ Possible extensions (time and funds) on the basis of first successful results:
  § DataTAG (2002-2003)
  § CrossGrid (2002-2004)
  § GridStart (2002-2004)
  § …

DataGrid status
§ Preliminary architecture defined
  § Enough to deploy first Testbed 1
§ First M/W delivery (GDMP, first workload management system, fabric management tools, Globus installation including certification and authorization, Condor tools)
§ First application test cases ready, long-term cases defined
§ Integration team actively deploying Testbed 1

EU-DataGrid Architecture
§ Local Application, Local Database, Local Computing
§ Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
§ Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
§ Underlying Grid Services: SQL Database Services, Computing Element Services, Storage Element Services, Replica Catalog, Authorization, Authentication and Accounting, Service Index
§ Grid Fabric Services: Resource Management, Configuration Management, Monitoring and Fault Tolerance, Node Installation & Management, Fabric Storage Management

Test Bed Schedule
§ TestBed 0 (early 2001)
  § International test bed 0 infrastructure deployed
  § Globus 1 only - no EDG middleware
§ TestBed 1 (now)
  § First release of EU DataGrid software to defined users within the project:
    § HEP experiments (WP8)
    § Biology applications (WP9)
    § Earth Observation (WP10)
§ TestBed 2 (Sept. 2002)
  § Builds on TestBed 1 to extend the facilities of DataGrid
§ TestBed 3 (March 2003) & 4 (Sept. 2003)

DataGrid status
§ Preliminary architecture defined
  § Enough to deploy Testbed 1
§ First M/W delivery (GDMP, first workload management system, fabric management tools, Globus installation including certification and authorization, Condor tools)
§ First application test cases ready, long-term cases defined
§ Integration team first release of Testbed 1:
  § WP1 Elisabetta Ronchieri; WP2 Shahzad Muzaffar; WP3 Alex Martin; WP4 Maite Barroso Lopez; WP5 Jean Philippe Baud; WP7 Frank Bonnassieux
  § WP6: Brian Coghlan, Flavia Donno, Eric Fede, Fabio Hernandez, Nadia Lajili, Charles Loomis, Pietro Paolo Martucci, Andrew McNab, Sophie Nicoud, Yannik Patois, Anders Waananen
  § WP8-WP10: PierGiorgio Cerello, Eric Van Herwijnen, Julian Lindford, Andrea Parrini, Yannick Legre

Testbed 1 Approach
§ Software integration
  § combines software from each middleware work package and underlying external toolkits (e.g. Globus)
  § performed by the integration team at CERN on a cluster of 10 Linux PCs
§ Basic integration tests
  § performed by the integration team to verify basic functionality
§ Validation tests
  § application groups use Testbed 1 to exercise their application software
  § e.g. LHC experiments run jobs using their offline software suites on Testbed 1 sites
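
As an illustration of what a "basic integration test" of this kind might look like, here is a minimal, hypothetical smoke-test sketch in Python: it only checks that a couple of Globus client tools are installed and then tries to run a trivial remote command through globus-job-run against a configurable gatekeeper host. The host name and the tool list are placeholders, not the project's actual test suite.

```python
"""Hypothetical smoke test for a testbed node: not the EDG test suite,
just an illustration of the 'basic integration tests' idea."""
import shutil
import subprocess
import sys

GATEKEEPER = "testbed-ce.example.org"                   # placeholder gatekeeper host
REQUIRED_TOOLS = ["grid-proxy-init", "globus-job-run"]  # illustrative subset of client tools

def check_tools() -> bool:
    """Verify that the expected Grid client commands are installed."""
    missing = [t for t in REQUIRED_TOOLS if shutil.which(t) is None]
    if missing:
        print("missing tools:", ", ".join(missing))
    return not missing

def run_trivial_job() -> bool:
    """Run /bin/hostname on the remote gatekeeper as a basic functionality check."""
    result = subprocess.run(
        ["globus-job-run", GATEKEEPER, "/bin/hostname"],
        capture_output=True, text=True,
    )
    print("remote hostname:", result.stdout.strip() or result.stderr.strip())
    return result.returncode == 0

if __name__ == "__main__":
    ok = check_tools() and run_trivial_job()
    sys.exit(0 if ok else 1)
```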

Detailed TestBed 1 Schedule
§ October 1: Intensive integration starts; based on Globus 2
§ November 1: First beta release of DataGrid (CERN & Lyon) (depends on changes needed for Globus 1 -> 2)
§ November 15: Initial limited application testing finished; DataGrid ready for deployment on partner sites (~5 sites)
§ November 30: Widespread deployment; code and machines split for development; Testbed 1 open to all applications (~40 sites)
§ December 29: WE ARE DONE!

TestBed 1 Sites
§ First round (15 Nov.): CERN, Lyon, RAL, Bologna
§ Second round (30 Nov.):
  · Netherlands: NIKHEF
  · UK: 6 sites: Bristol, Edinburgh, Glasgow, Lancaster, Liverpool, Oxford
  · Italy: 6-7 sites: Catania, Legnaro/Padova, Milan, Pisa, Rome, Turin, Cagliari?
  · France: Ecole Polytechnique
  · Russia: Moscow
  · Spain: Barcelona?
  · Scandinavia: Lund?
  · WP9 (GOME): ESA, KNMI, IPSL, ENEA

Licenses & Copyrights
§ Package Repository and web site
  § Provides access to the packaged Globus, DataGrid and required external software
  § All software is packaged as source and binary RPMs
§ Copyright Statement
  § Copyright (c) 2001 EU DataGrid – see http://www.edg.org/license.html
§ License
  § Will be the same (or very similar) to the Globus license
  § A BSD-style license which puts few restrictions on use
§ Condor-G (used by WP1)
  § Not open source or redistributable
  § Through special agreement, can redistribute within DataGrid
§ LCFG (used by WP4)
  § Uses GPL

Security
§ The EDG software supports many Certification Authorities from the various partners involved in the project
  § http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html
  § but not the Globus CA
§ For a machine to participate as a Testbed 1 resource all the CAs must be enabled
  § all CA certificates can be installed without compromising local site security
§ Each host running a Grid service needs to be able to authenticate users and other hosts
  § the site manager has full control over security for local nodes
§ A Virtual Organisation represents a community of users
  § 6 VOs for Testbed 1: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 EO, 1 Biology
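
To make the authentication/authorization split above more concrete, here is a small illustrative Python sketch of grid-mapfile-style authorization, in which a certificate subject (DN) presented by an authenticated user is mapped to a local account. The example DN and entries are hypothetical, and real Globus/EDG sites manage this mapping with their own tools; this is only a sketch of the idea.

```python
"""Illustrative grid-mapfile lookup: maps an authenticated certificate
subject (DN) to a local account. The example DN is hypothetical."""
from typing import Dict, Optional

def parse_gridmap(path: str) -> Dict[str, str]:
    """Parse lines of the form: "<certificate DN>" local_account"""
    mapping: Dict[str, str] = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # The DN is quoted and may contain spaces; the account follows it.
            dn, _, account = line.rpartition(" ")
            mapping[dn.strip('"')] = account
    return mapping

def authorize(dn: str, gridmap: Dict[str, str]) -> Optional[str]:
    """Return the local account for an authenticated DN, or None if not authorized."""
    return gridmap.get(dn)

if __name__ == "__main__":
    gridmap = parse_gridmap("/etc/grid-security/grid-mapfile")  # conventional location
    user_dn = "/O=Grid/O=CERN/OU=cern.ch/CN=Some Physicist"     # hypothetical example DN
    account = authorize(user_dn, gridmap)
    print(f"{user_dn} -> {account or 'NOT AUTHORIZED'}")
```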

Node configuration tools
§ Node configuration and installation tools
  • For the reference platform (Linux RedHat 6.2)
  • Initial installation tool using system image cloning
  • LCFG (Edinburgh University) for software updates and maintenance
[Diagram: LCFG components: on the server node, mkxprof compiles the LCFG configuration files into an XML profile (one per client node) published via an HTTP web server; client nodes fetch the profile with rdxprof/ldxprof into a DBM file that drives the generic LCFG components]
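
The diagram boils down to: the server compiles per-node configuration into an XML profile served over HTTP, and each client pulls its profile and hands the relevant sections to local components. The Python sketch below mimics that flow under stated assumptions (a made-up profile URL and a made-up XML layout); the real LCFG uses its own tools (mkxprof/rdxprof) and profile schema.

```python
"""Sketch of an LCFG-like client: fetch this node's XML profile over HTTP
and dispatch each component's settings to a local handler. The URL and
XML structure here are invented for illustration."""
import urllib.request
import xml.etree.ElementTree as ET

PROFILE_URL = "http://config-server.example.org/profiles/node42.xml"  # hypothetical

def fetch_profile(url: str) -> ET.Element:
    """Download and parse the XML profile published by the server."""
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())

def apply_component(name: str, settings: dict) -> None:
    """Stand-in for a component: would rewrite config files and restart services."""
    print(f"[{name}] applying {settings}")

def main() -> None:
    profile = fetch_profile(PROFILE_URL)
    # Assumed layout: <profile><component name="..."><param name="..." value="..."/></component></profile>
    for comp in profile.findall("component"):
        settings = {p.get("name"): p.get("value") for p in comp.findall("param")}
        apply_component(comp.get("name"), settings)

if __name__ == "__main__":
    main()
```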

Middleware components
§ Job Description Language (JDL): script to describe the job parameters
§ User Interface (UI): sends the job to the RB and receives the results
§ Resource Broker (RB): locates and selects the target Computing Element (CE)
§ Job Submission Service (JSS): submits the job to the target CE
§ Logging and Book-keeping (L&B): records job status information
§ Grid Information Service (GIS): Information Index about the state of the Grid fabric
§ Replica Catalog: list of data sets and their duplicates held on Storage Elements (SE)
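
For orientation, a minimal job description in the ClassAd-style JDL looks roughly like the text produced below; it is assembled in Python so it could be written to a file and handed to the submission tools. The executable and sandbox file names are placeholders, and while attributes such as Executable, InputSandbox and OutputSandbox are the commonly used ones, the exact set accepted by a given release may differ, so treat this as a sketch.

```python
"""Build a minimal ClassAd-style JDL job description as a string.
File names are placeholders; the attribute set accepted by a given
release may differ, so this is an illustrative sketch only."""

def make_jdl(executable: str, inputs: list, outputs: list) -> str:
    """Return a small JDL description for a batch job."""
    quoted_in = ", ".join(f'"{f}"' for f in inputs)
    quoted_out = ", ".join(f'"{f}"' for f in outputs)
    return "\n".join([
        f'Executable    = "{executable}";',
        'StdOutput     = "job.out";',
        'StdError      = "job.err";',
        f"InputSandbox  = {{{quoted_in}}};",
        f"OutputSandbox = {{{quoted_out}}};",
    ])

if __name__ == "__main__":
    jdl = make_jdl("simulate.sh",
                   ["simulate.sh", "geometry.dat"],
                   ["job.out", "job.err", "hits.root"])
    with open("example.jdl", "w") as fh:  # file name is arbitrary
        fh.write(jdl + "\n")
    print(jdl)
```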

A Job Submission Example
[Diagram: the User Interface sends a JDL description and the Input Sandbox to the Resource Broker; the Resource Broker queries the Information Service and the Replica Catalogue, then hands the job (with Brokerinfo) to the Job Submission Service, which runs it on a Compute Element with access to a Storage Element; job status events go to Logging & Book-keeping, and the Output Sandbox is returned to the User Interface]
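
The sequence in the diagram can be summarised in pseudocode. The sketch below is a deliberately simplified Python rendering of that flow (UI to Resource Broker to Job Submission Service to Compute Element, with Logging & Book-keeping recording status along the way); every class and method name here is invented for illustration and does not correspond to the actual EDG APIs.

```python
"""Simplified, invented rendering of the job submission sequence shown in the
diagram above; these classes do NOT correspond to real EDG interfaces."""

class LoggingAndBookkeeping:
    """Records job status events, as the L&B service does in the diagram."""
    def record(self, job_id, status):
        print(f"L&B: {job_id} -> {status}")

class InformationService:
    """Stand-in for the GIS: knows which Compute Elements exist."""
    def matching_ces(self, requirements):
        return ["ce.cern.example", "ce.lyon.example"]  # hypothetical CEs

class ReplicaCatalogue:
    """Stand-in for the Replica Catalogue: knows where data sets are stored."""
    def sites_with(self, datasets):
        return {"ce.lyon.example"}                     # hypothetical data placement

class ResourceBroker:
    """Picks a Compute Element, preferring one close to the input data."""
    def __init__(self, gis, rc, lb):
        self.gis, self.rc, self.lb = gis, rc, lb

    def submit(self, job_id, jdl, input_sandbox):
        self.lb.record(job_id, "submitted")
        candidates = self.gis.matching_ces(jdl.get("Requirements"))
        near_data = self.rc.sites_with(jdl.get("InputData", []))
        chosen = next((ce for ce in candidates if ce in near_data), candidates[0])
        self.lb.record(job_id, f"scheduled on {chosen}")
        # A Job Submission Service would now ship the input sandbox to the CE,
        # run the job, and return the output sandbox plus status to the user.
        return chosen

if __name__ == "__main__":
    rb = ResourceBroker(InformationService(), ReplicaCatalogue(), LoggingAndBookkeeping())
    rb.submit("job-001",
              {"Requirements": 'OpSys == "Linux"', "InputData": ["lfn:run42"]},
              input_sandbox=["simulate.sh"])
```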

Iterative Releases
§ Planned intermediate release schedule:
  § TestBed 1: October 2001
  § Release 1.1: January 2002
  § Release 1.2: March 2002
  § Release 1.3: May 2002
  § Release 1.4: July 2002
  § TestBed 2: September 2002
§ A similar schedule will be organised for 2003
§ Each release includes:
  § feedback from use of the previous release by application groups
  § planned improvements/extensions by middleware WPs
  § use of the software infrastructure
  § feeds into the architecture group

Software Infrastructure
§ Toolset for aiding the development & integration of middleware
  § code repositories (CVS)
  § browsing tools (CVSweb)
  § build tools (autoconf, make etc.)
  § document builders (doxygen)
  § coding standards and check tools (e.g. CodeChecker)
  § nightly builds
§ Guidelines, examples and documentation
  § show the software developers how to use the toolset
§ Development facility
  § test environment for software (small set of PCs in a few partner sites)
§ Provided and managed by WP6
  § setting up the toolset and organising the development facility
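
As a concrete but hypothetical illustration of the "nightly builds" item, a cron-driven script could check each module out of CVS and run the build, recording whether it compiled. The CVS root, module names and build command below are placeholders, not the project's actual infrastructure.

```python
"""Hypothetical nightly-build driver: CVS checkout plus build per module.
CVSROOT, module names and the build command are placeholders."""
import subprocess
from datetime import datetime

CVSROOT = ":pserver:anonymous@cvs.example.org:/cvsroot/edg"  # placeholder
MODULES = ["workload", "datamgmt", "fabric"]                 # placeholder module names

def run(cmd, cwd=None):
    """Run a command, returning True on success."""
    return subprocess.run(cmd, cwd=cwd).returncode == 0

def nightly():
    results = {}
    for module in MODULES:
        checked_out = run(["cvs", "-d", CVSROOT, "checkout", module])
        built = checked_out and run(["make"], cwd=module)    # assumes a top-level Makefile
        results[module] = "OK" if built else "FAILED"
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    print(f"Nightly build {stamp}:")
    for module, status in results.items():
        print(f"  {module:10s} {status}")

if __name__ == "__main__":
    nightly()
```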

Future Plans
§ Tighter connection to the applications' principal architects
§ Closer integration of the software components
§ Improve the software infrastructure toolset and test suites
§ Evolve the architecture on the basis of TestBed results
§ Enhance synergy with the US via DataTAG-iVDGL and InterGrid
§ Promote early standards adoption through participation in GGF WGs
§ First project EU review end of February 2002
§ Final software release by end of 2003

Conclusions
§ EU DataGrid is well on its way, but coordination with other Grid initiatives is essential, both in the EU and elsewhere
§ The EU is very supportive of a worldwide collaboration
§ Important to extend the project test beds to the entire HEP community (and other application communities: EO, Bio, etc.)
§ Grid activity is well started in Poland; we need to make sure it is well coordinated with the rest of the world