
  • Number of slides: 25

Introduction to GRID computing and overview of the European Data Grid Project
The European DataGrid Project
http://www.edg.org

Overview
• What is GRID computing?
• What is a GRID?
• Why GRIDs?
• GRID projects world wide
• The European Data Grid
  - Overview of EDG goals and organization
  - Overview of the EDG middleware components
Introduction to GRID Computing and the EDG

The Grid Vision
• Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data.
• Scientific instruments and experiments provide huge amounts of data.
• The GRID: networked data processing centres, with "middleware" software as the "glue" of resources.
Source: Federico.Carminati@cern.ch

What is GRID computing:
• Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations (VOs). [I. Foster]
  - A VO is a collection of users sharing similar needs and requirements in their access to processing, data and distributed resources, and pursuing similar goals.
• Key concept:
  - The ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose. [I. Foster]

The GRID distributed computing idea 1/2
Once upon a time...
[Figure by Christophe Jacquet. Labels: mainframe, microcomputer, mini computer, cluster]

The GRID distributed computing idea 2/2
...and today
[Figure by Christophe Jacquet]

Differences between grids and distributed applications
• Distributed applications already exist, but they tend to be specialised systems intended for a single purpose or user group.
• Grids go further and take into account:
  - Different kinds of resources: not always the same hardware, data and applications
  - Different kinds of interaction: user groups or applications want to interact with grids in different ways
  - Dynamic nature: resources and user groups are added, removed and changed frequently

Main characteristics of a grid architecture
• Service providers
  - Publish the availability of their services via information systems
  - Such services may come and go or change dynamically
  - E.g. a testbed site that offers x CPUs and y GB of storage
• Service brokers
  - Register and categorize published services and provide search capabilities
  - E.g. 1) the EDG Resource Broker selects the best site for a "job"; 2) catalogues of data held at each testbed site
• Service requesters
  - Single sign-on: log into the grid once
  - Use brokering services to find a needed service and employ it
  - E.g. CMS physicists submit a simulation job that needs 12 CPUs for 6 hours and 15 GB, which gets scheduled, via the Resource Broker, on the CERN testbed site
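
The provider/broker/requester roles above can be illustrated with a minimal Python sketch. It is not the EDG Resource Broker: the class names, the site names, the published attributes and the "most free CPUs wins" ranking are all assumptions made for illustration. Only the job size (12 CPUs, 6 hours, 15 GB) comes from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteOffer:
    """What a service provider publishes to the information system (hypothetical fields)."""
    name: str
    free_cpus: int
    free_storage_gb: int
    max_wallclock_hours: int


@dataclass(frozen=True)
class JobRequest:
    """What a service requester asks the broker for (hypothetical fields)."""
    cpus: int
    hours: int
    storage_gb: int


def select_site(job: JobRequest, offers: list[SiteOffer]) -> Optional[SiteOffer]:
    """Return the matching site with the most free CPUs, or None if no site fits."""
    candidates = [
        site for site in offers
        if site.free_cpus >= job.cpus
        and site.free_storage_gb >= job.storage_gb
        and site.max_wallclock_hours >= job.hours
    ]
    # Ranking by free CPUs is an illustrative policy, not the real broker's rule.
    return max(candidates, key=lambda site: site.free_cpus, default=None)


# Made-up published offers from three made-up sites.
offers = [
    SiteOffer("site-a", free_cpus=8, free_storage_gb=500, max_wallclock_hours=48),
    SiteOffer("site-b", free_cpus=40, free_storage_gb=200, max_wallclock_hours=24),
    SiteOffer("site-c", free_cpus=16, free_storage_gb=10, max_wallclock_hours=72),
]

# The job from the slide: 12 CPUs for 6 hours and 15 GB.
job = JobRequest(cpus=12, hours=6, storage_gb=15)
chosen = select_site(job, offers)
print(chosen.name if chosen else "no matching site")
```

The requester never names a site. It only states requirements, and the broker resolves them against whatever the providers currently publish, which is what lets sites come and go dynamically.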

GRID security
• Resource providers are essentially "opening themselves up" to itinerant users
• Secure access to resources is required
  - X.509 Public Key Infrastructure
• A user's identity has to be certified by (mutually recognized) national Certification Authorities (CAs)
• Resources (node machines) have to be certified by CAs
• Temporary delegation from users to processes to be executed "in the user's name" (proxy certificates)
• Commonly agreed policies for accessing resources and handling users' rights across different domains within Virtual Organizations
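
The delegation idea above (a CA certifies the user, and the user issues a short-lived proxy that acts in the user's name) can be sketched as a toy chain check in Python. This is only a model of the bookkeeping: it performs no cryptography and is not how X.509 validation is implemented. The `Credential` fields, the subject strings and the CA name are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Credential:
    """Toy stand-in for a certificate: no keys, no signatures."""
    subject: str
    issuer: str
    not_after: datetime


def chain_is_valid(chain: list[Credential], trusted_cas: set[str], now: datetime) -> bool:
    """chain[0] is the user certificate; each later entry is a proxy issued by the previous one."""
    if not chain or chain[0].issuer not in trusted_cas:
        return False
    for parent, child in zip(chain, chain[1:]):
        # A proxy must be issued by the credential just before it in the chain...
        if child.issuer != parent.subject:
            return False
        # ...and must not outlive that credential.
        if child.not_after > parent.not_after:
            return False
    # Every credential in the chain must still be unexpired.
    return all(cred.not_after > now for cred in chain)


now = datetime(2003, 2, 1, 12, 0)
trusted = {"Example National CA"}  # hypothetical CA name
user = Credential("/O=Grid/CN=Alice", "Example National CA", now + timedelta(days=365))
proxy = Credential("/O=Grid/CN=Alice/CN=proxy", "/O=Grid/CN=Alice", now + timedelta(hours=12))

valid_now = chain_is_valid([user, proxy], trusted, now)
valid_tomorrow = chain_is_valid([user, proxy], trusted, now + timedelta(days=1))
print(valid_now, valid_tomorrow)
```

The short proxy lifetime is the point of the design: a process running on a remote site holds only the temporary credential, so a compromise exposes the user for hours rather than for the lifetime of the long-term certificate.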

Why GRIDs
• Scale of the problems
  - Frontier research in many different fields today requires world-wide collaborations (i.e. multi-domain access to distributed resources)
• GRIDs provide access to large data processing power and huge data storage possibilities
  - As the grid grows, its usefulness increases (more resources available)
• Large communities of possible GRID users:
  - High Energy Physics
  - Environmental studies: earthquake forecasting, geologic and climate changes, ozone monitoring
  - Biology, Genetics, Earth Observation
  - Astrophysics
  - New composite materials research
  - Astronautics, etc.

High Energy Physics
The LHC Detectors: CMS, ATLAS, LHCb
• ~6-8 PetaBytes / year
• ~10^8 events / year
• ~10^3 batch and interactive users
Source: Federico.Carminati, EU review presentation

Earth Observation
ESA missions:
• about 100 Gbytes of data per day (ERS 1/2)
• 500 Gbytes for the next ENVISAT mission (2002)
DataGrid contribution to EO:
• enhance the ability to access high-level products
• allow reprocessing of large historical archives
• improve complex Earth science applications (data fusion, data mining, modelling ...)
Source: L. Fusco, June 2001; Federico.Carminati, EU review presentation, 1 March 2002

Biology – BioInformatics
• Bio-informatics
  - Phylogenetics
  - Search for primers
  - Statistical genetics
  - Bio-informatics web portal
  - Parasitology
  - Data-mining on DNA chips
  - Geometrical protein comparison
• Medical imaging
  - MR image simulation
  - Medical data and metadata management
  - Mammographies analysis
  - Simulation platform for PET/SPECT
Legend: applications deployed; applications tested on EDG; applications under preparation
Example workflow (medical image search):
1. Query the medical image database and retrieve a patient image
2. Compute similarity measures over the database images (submit 1 job per image)
3. Retrieve the most similar cases
[Figure labels: exam image, patient key, ACL, ...; medical images; metadata; similar images; low score images]
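
The three-step medical image workflow above can be sketched in Python as follows. This is a toy model: images are assumed to be flat lists of floats of equal length, the similarity measure (negative mean absolute difference) is a stand-in for whatever measure the real application used, and a local thread pool stands in for "one grid job per image". All names and data are hypothetical.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def similarity(a: list[float], b: list[float]) -> float:
    """Toy similarity: negative mean absolute difference (higher means more similar)."""
    return -sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def most_similar(
    query: list[float],
    database: dict[str, list[float]],
    top_k: int,
) -> list[tuple[str, float]]:
    # Step 2: one independent task per database image, standing in for one grid job each.
    with ThreadPoolExecutor() as pool:
        scores = list(
            pool.map(lambda item: (item[0], similarity(query, item[1])), database.items())
        )
    # Step 3: keep only the best-scoring cases.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Made-up image database keyed by case identifier.
database = {
    "case-001": [0.1, 0.2, 0.3, 0.4],
    "case-002": [0.9, 0.8, 0.7, 0.6],
    "case-003": [0.1, 0.25, 0.3, 0.35],
}

# Step 1: the patient image retrieved from the database query (made-up values).
query = [0.1, 0.2, 0.3, 0.45]

ranked = most_similar(query, database, top_k=2)
print([name for name, _ in ranked])
```

Because each comparison depends only on the query and one database image, the jobs are independent and can be spread across as many sites as are available; only the final ranking needs all the scores in one place.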

GRID projects world wide
• EU
  - EDG (EU-IST) – R&D EU GRID project [www.edg.org]
  - CrossGRID – QoS – interactive apps [www.crossgrid.org]
  - DataTAG – inter-operability (EU-USA) [www.datatag.org]
  - LCG – The LHC Computing GRID – Deployment [cern.ch/lcg]
  - The new 16.2 B Euro EU VI Framework Programme: GEANT-based GRID projects
• USA
  - GriPhyN [www.griphyn.org]
  - iVDGL-VDTv1 [www.idvgl.org]
  - PPDG (NSF, DoE) [www.ppdg.org]
• Asia
  - ApGrid, Pragma (USA-Asia) [www.apgrid.org]
And many more...

The European Data Grid
• To build on the emerging Grid technology to develop a sustainable computing model for effective sharing of computing resources and data
• Start: Jan 1, 2001; End: Dec 31, 2003
• Specific project objectives:
  - Middleware for fabric & Grid management (mostly funded by the EU)
  - Large scale testbed (mostly funded by the partners)
  - Production quality demonstrations (partially funded by the EU)
• To collaborate with and complement other European and US projects
• Contribute to Open Standards and international bodies:
  - Co-founder of the Global GRID Forum and host of GGF1 and GGF3
  - Industry and Research Forum for dissemination of project results

The EDG Main Partners
• CERN – International (Switzerland/France)
• CNRS – France
• ESA/ESRIN – International (Italy)
• INFN – Italy
• NIKHEF – The Netherlands
• PPARC – UK

EDG Assistant Partners
Industrial Partners
• Datamat (Italy)
• IBM-UK (UK)
• CS-SI (France)
Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) – France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics – Finland
• Institut de Fisica d'Altes Energies (IFAE) – Spain
• Istituto Trentino di Cultura (IRST) – Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg – Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
• Swedish Research Council – Sweden

EDG overview: Middleware release schedule
• Release schedule
  - testbed 1: 2001
  - testbed 2: 2002
  - testbed 3: 2003
  - Incremental releases between these major dates
• Each release includes
  - feedback on use of the previous release by application groups
  - planned improvements/extensions by middleware groups
• Application groups (HEP, EO, Bio-Info) are using existing software and testbeds to explore how they can best exploit grids

Current Project Status
• EDG currently provides a set of middleware services
  - Job & Data Management
  - GRID & Network monitoring
  - Security, Authentication & Authorization tools
  - Fabric Management
• EDG release 1.4 currently deployed to the EDG testbeds
  - Linux RedHat 6.2 on Intel PCs
  - ~15 sites in the application testbed, actively used by application groups
  - Core sites: CERN (CH), RAL (UK), NIKHEF (NL), CNAF (I), CC-Lyon (F)
• EDG software also deployed at a total of ~40 sites via CrossGrid, DataTAG and national grid projects
• Many applications ported to EDG testbeds and actively being used
• Intense middleware development continuously going on
• 2nd annual project review passed – 4th & 5th February 2003

DataGrid in Numbers
People
• >350 registered users
• 12 Virtual Organisations
• 16 Certificate Authorities
• >200 people trained
• 278 man-years of effort
• 100 years funded
Testbeds
• >15 regular sites
• >10,000s of jobs submitted
• >1000 CPUs
• 3 Mass Storage Systems
• >5 TeraBytes disk
Software
• 50 use cases
• 18 software releases
• >300K lines of code
Scientific applications
• 5 Earth Obs institutes
• 9 bio-informatics apps
• 6 HEP experiments

EDG structure: work packages
• The EDG collaboration is structured in 12 Work Packages:
  - WP 1: Work Load Management System
  - WP 2: Data Management
  - WP 3: Grid Monitoring / Grid Information Systems
  - WP 4: Fabric Management
  - WP 5: Storage Element
  - WP 6: Testbed and demonstrators
  - WP 7: Network Monitoring
  - WP 8: High Energy Physics Applications (Applications)
  - WP 9: Earth Observation (Applications)
  - WP 10: Biology (Applications)
  - WP 11: Dissemination
  - WP 12: Management

EDG Globus-based middleware architecture
• Current EDG architectural functional blocks:
  - Basic Services (authentication, authorization, Replica Catalog, secure file transfer, Info Providers) rely on Globus 2.0
  - Higher level EDG middleware (developed within EDG)
  - Applications (HEP, BIO, EO)
[Layer diagram, top to bottom]
• Specific application layer: ALICE, ATLAS, CMS, LHCb, other apps
• VOs common application layer: LHC, other apps
• GRID middleware: High level GRID middleware
• GLOBUS 2.0: Basic Services
• OS & Net services

EDG middleware GRID architecture
[Layer diagram, top to bottom; side labels: APPLICATIONS, M/W, GLOBUS]
• Local Computing: Local Application, Local Database
• Grid Application Layer: Data Management, Job Management, Metadata Management
• Collective Services: Grid Scheduler, Information & Monitoring, Replica Manager
• Underlying Grid Services: SQL Database Services, Computing Element Services, Storage Element Services, Replica Catalog, Authorization Authentication and Accounting, Service Index
• Grid Fabric services: Resource Management, Configuration Management, Monitoring and Fault Tolerance, Node Installation & Management, Fabric Storage Management

EDG interfaces
[Same layer diagram, surrounded by the external actors and systems it interfaces with]
• People: Application Developers, System Managers, Scientists, Certificate Authorities
• Local side: Local Application, Local Database, File Systems, User Accounts
• Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
• Underlying Grid Services: SQL Database Services, Computing Element Services, Storage Element Services, Replica Catalog, Authorization Authentication and Accounting, Service Index
• Fabric services: Resource Management, Configuration Management, Monitoring and Fault Tolerance, Node Installation & Management, Fabric Storage Management
• Underlying systems: Operating Systems; Storage Elements; Mass Storage Systems (HPSS, Castor); Batch Systems (PBS, LSF, etc.); Computing Elements

EDG: reference web sites
• EDG web site
  - http://www.edg.org
• Source for all required software:
  - http://datagrid.in2p3.fr
• EDG testbed web site
  - http://marianne.in2p3.fr
• EDG users guide
  - http://marianne.in2p3.fr/datagrid/documentation/EDG-Users-Guide.html
• EDG tutorials web site
  - http://cern.ch/edg-tutorials
• EDG production testbed: current set-up, updated in real time
  - http://testbed007.cern.ch/tbstatus-bin/infoindexcern.pl