The Grid: Meeting the LHC computing challenge
Gavin McCance, University of Glasgow
RSE, 6th February 2002

Outline
  • Scale of the LHC computing challenge
  • Grid ‘Middleware’
    • Data Replication
  • Experimental testbed

LHC computing challenge
  • Typical experiment: 2 MB per event
    • 2.7 × 10^9 event sample: 5.4 PB/year
    • Up to 9 PB/year Monte Carlo samples
    • Very large storage and computational requirements
  • CERN can handle max of 1/3 of this!
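The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and assumes that "1/3 of this" refers to the combined raw and Monte Carlo volume; the slide does not say which total is meant.

```python
# Back-of-envelope check of the storage figures quoted on the slide.
# Assumption: decimal units, 1 MB = 10**6 bytes and 1 PB = 10**15 bytes.
MB = 10**6
PB = 10**15

event_size_bytes = 2 * MB        # 2 MB per event (from the slide)
events_per_year = 27 * 10**8     # 2.7 x 10^9 events (from the slide)

raw_pb = event_size_bytes * events_per_year / PB
monte_carlo_pb = 9               # upper bound quoted on the slide
total_pb = raw_pb + monte_carlo_pb

# Assumption: "1/3 of this" means one third of raw + Monte Carlo combined.
cern_share_pb = total_pb / 3

print(f"raw data:      {raw_pb:.1f} PB/year")
print(f"with MC:       {total_pb:.1f} PB/year")
print(f"CERN (<= 1/3): {cern_share_pb:.1f} PB/year")
```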

…computing challenge
  • Distribute data store and compute resources
  • Take advantage of existing local clusters and local infrastructure
    • Easier to get funding for local clusters, particularly cross-experiment or cross-disciplinary compute resources

Tiered model
  • Tier 0: CERN Computer
  • Tier 1: Italian Regional Centre, US Regional Centre, French Regional Centre, RAL Regional Centre
  • Tier 2: ScotGRID, Tier 2 Centre
  • Tier 3: Institute
  • Data labels on the diagram, top to bottom: basic reconstructed data; higher level analysis data and Monte Carlo; tag data and Monte Carlo

UK GridPP collaboration
  • Tier 1: RAL Regional Centre
  • Tier 2: ScotGRID, Tier 2 Centre
  • Tier 3: Institute
  • Data labels on the diagram, top to bottom: basic reconstructed data; higher level analysis data and Monte Carlo; tag data and Monte Carlo

GridPP

…GridPP
  • £17M three-year project
  • Working in collaboration with the EU DataGrid project
  • Middleware production
  • Integration of middleware technologies into HEP experiments
  • Validation of Grid software

Middleware
  • What is middleware?
  • Layers, top to bottom:
    • Application programs – gridopen() call
    • Grid middleware
    • Data access specifics – HPSS, Castor; job submission specifics – PBS, LSF; specific security procedures
  • Layered APIs. Transparent security. Transparent data access. Intelligent use of distributed resources.

Middleware Activities
GridPP ~mirrors EU DataGrid:
  • Workload Management
    • What jobs go where?
  • Data Management (*)
    • Where’s the (best) data?
  • Information Services
    • What’s the state of everything?

…Middleware Activities
  • Fabric and Mass Storage Management
    • Interfaces to underlying systems
  • Network Monitoring
    • What’s the bandwidth from here to there?
  • Security
    • Crops up everywhere … transparent to applications

Data Management
  • Data Replication
  • Meta Data Catalogues
  • Replica Optimisation
    • Which replica should I use?

Data Replication
  • Problems if data exist only in one place
    • No one site can afford to store all data!
    • Multiple accesses to the same data overload the network! Petabytes!
    • What if the site / network is down?
  • Make replicas! But need to keep track of all the files and their various replicas!
    • Need a replica catalogue!

Replica Catalogues
  • Distributed catalogue in database: have a globally unique Logical File Name (LFN) mapping to multiple physical instances of the file (PFNs).
  • Database must be globally accessible and secure
    • Key is to leverage industry standard technologies
  • [Diagram: LFN File-1 maps to copies of File-1 in Paris, Glasgow and Chicago]
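The LFN-to-PFN mapping described above can be pictured as a dictionary from one logical name to a set of physical locations. The following is only an in-memory sketch: the class name `ReplicaCatalogue`, its methods and the `example.org` PFN strings are hypothetical, and the catalogue in the talk lives in a distributed SQL database rather than in a Python object.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue: one logical file name (LFN) -> many physical names (PFNs)."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, lfn, pfn):
        """Record that a physical copy of `lfn` exists at `pfn`."""
        self._replicas[lfn].add(pfn)

    def unregister(self, lfn, pfn):
        """Forget one physical copy; drop the LFN when no copies remain."""
        self._replicas[lfn].discard(pfn)
        if not self._replicas[lfn]:
            del self._replicas[lfn]

    def lookup(self, lfn):
        """Return all known PFNs for `lfn`, sorted; empty list if unknown."""
        return sorted(self._replicas.get(lfn, ()))


# Hypothetical PFN strings, mirroring the three sites in the slide's diagram.
catalogue = ReplicaCatalogue()
catalogue.register("File-1", "paris.example.org:/store/File-1")
catalogue.register("File-1", "glasgow.example.org:/store/File-1")
catalogue.register("File-1", "chicago.example.org:/store/File-1")

print(catalogue.lookup("File-1"))
```

Storing the PFNs in a set makes registering the same copy twice harmless, which is a convenience of this sketch rather than anything stated in the talk.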

Metadata Catalogue
  • Oracle, MySQL, PostgreSQL
  • + PKI Security
  • + Standard communication protocols (XML/SOAP over HTTPS)
  • = SQL Metadata Service
  • Allows a client to access securely any remote SQL database on the Grid over HTTP(S)

Distribution
  • Don’t want a single point of failure or bottleneck
  • Must distribute SQL database
    • Designing scalable architectures
    • e.g. a RC may exist on each storage site, responsible for its own files
  • [Diagram: CERN Root RC at the top, with INFN RC, UK RC and CERN RC below it]
  • Queries will propagate down until replica information is found…
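One way to read "queries will propagate down until replica information is found" is as a depth-first search through a tree of catalogues. The sketch below assumes exactly that; the `CatalogueNode` class, the search order and the PFN strings are illustrative and not taken from the talk.

```python
class CatalogueNode:
    """One replica catalogue (RC) in a tree of catalogues."""

    def __init__(self, name, entries=None, children=None):
        self.name = name
        self.entries = dict(entries or {})     # LFN -> list of PFNs known here
        self.children = list(children or [])   # lower-level catalogues

    def find(self, lfn):
        """Return (catalogue name, PFNs) from the first RC that knows `lfn`.

        The query is answered locally when possible; otherwise it is passed
        down to each child in turn. Returns None if no catalogue knows it.
        """
        if lfn in self.entries:
            return self.name, self.entries[lfn]
        for child in self.children:
            hit = child.find(lfn)
            if hit is not None:
                return hit
        return None


# Tree shaped like the slide's diagram; file names and PFNs are hypothetical.
root = CatalogueNode("CERN Root RC", children=[
    CatalogueNode("INFN RC", {"file-2": ["infn.example.org:/store/file-2"]}),
    CatalogueNode("UK RC", {"file-1": ["glasgow.example.org:/store/file-1"]}),
    CatalogueNode("CERN RC", {"file-3": ["cern.example.org:/store/file-3"]}),
])

print(root.find("file-1"))
```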

Choosing the ‘best’
  • What does the ‘best’ replica mean?
    • Nearest? Fastest? Real cost?
  • For multiple files, the ‘best’ run location is some minimisation
    • Network cost – network monitoring
    • Monetary cost – EU-US link
  • A reasonable decision must be made on the basis of limited information!
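The "some minimisation" for multiple files can be sketched as follows: for each candidate run site, sum the cheapest way of obtaining every input file, then pick the site with the lowest total. This assumes a single number per link that already folds network and monetary cost together; the function names, sites and cost values are invented for illustration.

```python
def symmetric(costs):
    """Expand {(a, b): c} so that the cost is available in both directions."""
    full = {}
    for (a, b), c in costs.items():
        full[(a, b)] = c
        full[(b, a)] = c
    return full


def best_run_location(sites, replicas, link_cost):
    """Return (site, total_cost) minimising the cost of fetching all files.

    replicas:  LFN -> list of sites holding a copy
    link_cost: (source, destination) -> cost of moving one file
    A file already present at the run site costs nothing.
    """
    def total_cost(site):
        return sum(
            min(0 if holder == site else link_cost[(holder, site)]
                for holder in holders)
            for holders in replicas.values()
        )

    best = min(sites, key=total_cost)
    return best, total_cost(best)


# Hypothetical inputs: three sites, three files, made-up link costs.
sites = ["Glasgow", "Paris", "Chicago"]
replicas = {
    "file-1": ["Paris", "Chicago"],
    "file-2": ["Glasgow"],
    "file-3": ["Chicago"],
}
link_cost = symmetric({
    ("Glasgow", "Paris"): 2,
    ("Glasgow", "Chicago"): 10,
    ("Paris", "Chicago"): 9,
})

print(best_run_location(sites, replicas, link_cost))
```

With these numbers the totals are 12 for Glasgow, 11 for Paris and 10 for Chicago, so the job would run in Chicago even though one file has to cross the most expensive link.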

Economic models
  • Data files viewed as ‘commodities’ to be bought and sold by storage sites
  • The ‘buyer’ is a job requesting a file
  • The (virtual) ‘cost’ is: [formula not preserved in this transcript]
  • Reverse auction, buy from ‘cheapest’
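A reverse auction of this kind reduces to collecting one price per selling site and taking the lowest. The sketch below does not reproduce the talk's cost formula, which is not preserved here; the bid values are placeholders, and breaking ties by site name is an assumption made only so the result is deterministic.

```python
def reverse_auction(bids):
    """Pick the cheapest offer for a requested file.

    bids: storage site name -> price that site asks for the file.
    Returns (site, price). Ties are broken alphabetically by site name,
    which is an arbitrary choice for this sketch.
    """
    if not bids:
        raise ValueError("no site offered the file")
    return min(bids.items(), key=lambda item: (item[1], item[0]))


# Placeholder prices; in the talk these would come from the (virtual) cost.
bids = {"Paris": 12, "Glasgow": 7, "Chicago": 9}
print(reverse_auction(bids))
```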

Economic replication
  • If a storage site believes it can ‘make money’ on a popular file (based on its observation of access patterns) it can buy it from another site (replication)
  • [Diagram: sites A and B and File 1, with the values 14, 15 and 20]
  • Selfish local optimisation should lead to a reasonable global optimisation for file distribution
    • Inherently distributed optimisation. No distributed planning overhead!
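The local decision described above can be sketched as: count the requests a site has seen for a file, estimate what it would earn by serving that file itself, and buy a replica only if the estimate exceeds the purchase price. The predictor used here (future demand equals demand seen so far) and every name and number are assumptions for illustration, not the model from the talk.

```python
from collections import Counter


class StorageSite:
    """Toy storage site that decides locally whether to buy a replica."""

    def __init__(self, name, price_per_access):
        self.name = name
        self.price_per_access = price_per_access  # what it charges per request
        self.files = set()                        # LFNs it already holds
        self.seen = Counter()                     # observed requests per LFN

    def observe_request(self, lfn):
        """Record one request for `lfn` passing through this site."""
        self.seen[lfn] += 1

    def expected_income(self, lfn):
        # Naive assumption: future demand equals the demand observed so far.
        return self.seen[lfn] * self.price_per_access

    def should_replicate(self, lfn, purchase_price):
        """Buy only files not yet held whose expected income beats the price."""
        return lfn not in self.files and self.expected_income(lfn) > purchase_price


site = StorageSite("Glasgow", price_per_access=3)
for _ in range(6):
    site.observe_request("file-1")   # popular file
site.observe_request("file-2")       # rarely requested file

print(site.should_replicate("file-1", purchase_price=15))  # 18 > 15
print(site.should_replicate("file-2", purchase_price=15))  # 3 > 15 fails
```

Each site runs this test using only its own observations, which is the sense in which the optimisation needs no central planning step.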

Will it work???
  • Developing simulation tool: real Grid … simulated Grid
  • Provides testing arena for these ideas!

Testbed Software
  • UK HEP is providing testbed
    • EU experiments: CERN LHC
    • US experiments: Fermilab / SLAC
  • First EU DataGrid software release! Currently being tested.

Experiment integration
  • Taking the kit and trying to integrate it into the experiments’ software frameworks
  • Make Grid Services transparently available to ATLAS and LHCb programs
  • Layers, top to bottom:
    • ATLAS/LHCb software framework (GAUDI)
    • GANGA framework
    • Grid middleware

UK and EU Testbed
  • Some successful tests so far…
    • e.g. large file transfers between UK, Italy, US and CERN
  • Increasing Monte Carlo challenges planned
  • [Figure: current UK testbed]

…finally
  • Basic Grid software has been delivered
    • More developments to come
  • Integration with experiments and testing
    • Already successful tests
  • An excellent base to build on! Plenty still to do!