013c441c6fd112ee0045f892bc73e40c.ppt
- Number of slides: 32
Enabling Grids for E-sciencE
gLite Overview
Gang Chen, CC-IHEP, Chinese Academy of Sciences
Gang.chen@ihep.ac.cn
The 6th Joint Training of OMII-Europe & CNGrid, Hong Kong, 10-11 January, 2008
www.eu-egee.org www.glite.org
EGEE-II INFSO-RI-031688
Outline
• LCG & EGEE Projects
• The gLite Middleware
– Security
– Information System
– Workload management
– Data management
– …
• Summary
LHC: Large Hadron Collider
LHC gets ready!
The LHC Computing Challenge
• Data volume
– High rate x large number of channels x 4 experiments
– 15 PetaBytes of new data each year
• Compute power
– Event complexity x number of events x thousands of users
– 100k of today's fastest CPUs
• Worldwide analysis & funding
– Computing funding locally in major regions & countries
– Efficient analysis everywhere
– GRID technology (WLCG: Worldwide LHC Computing Grid)
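For a rough sense of scale, the 15 PetaBytes per year figure can be converted into a sustained average rate. This is only a back-of-envelope sketch: it assumes decimal petabytes (10^15 bytes), a 365-day year, and data spread evenly over the year, none of which the slide states.

```python
# Back-of-envelope conversion of a yearly data volume to an average rate.
# Assumptions (not from the slide): 1 PB = 10**15 bytes, a 365-day year,
# and data spread evenly over the whole year.
BYTES_PER_PB = 10**15
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Return the sustained average rate in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second / 10**6


if __name__ == "__main__":
    print(f"{average_rate_mb_per_s(15):.0f} MB/s on average")
```

Under these assumptions the average is a little under 500 MB/s; actual data taking is bursty, so peak rates would be higher.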
WLCG Tier Structure
Centers around the world form a Supercomputer
• The EGEE and OSG projects are the basis of WLCG
The EGEE project
• EGEE
– 1 April 2004 – 31 March 2006
– 71 partners in 27 countries, federated in regional Grids
• EGEE-II
– 1 April 2006 – 31 March 2008
– 91 partners in 32 countries
– 13 Federations
• Objectives
– Large-scale, production-quality infrastructure for e-Science
– Attracting new resources and users from industry as well as science
– Improving and maintaining “gLite” Grid middleware
[Figure: timeline with labels LCG-1, LCG-2, EGEE-1, EGEE-2, “Globus 2 based”, “Web services based”]
Applications on EGEE
• Applications from an increasing number of domains
– Astrophysics
– Computational Chemistry
– Earth Sciences
– Financial Simulation
– Fusion
– Geophysics
– High Energy Physics
– Life Sciences
– Multimedia
– Material Sciences
– …
– Book of abstracts: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-005.pdf
Related EU projects
[Figure: EU GRID]
EGEE Middleware: gLite
• gLite
– Exploit experience and existing components from VDT (CondorG, Globus), EDG/LCG, AliEn, and others
– Develop a lightweight stack of generic middleware useful to EGEE applications (HEP and Biomedics are pilot applications).
§ Should eventually deploy dynamically (e.g. as a globus job)
§ Pluggable components: cater for different implementations
– Focus is on re-engineering and hardening
– Early prototype and fast feedback turnaround envisaged
The release of gLite 3.0
• Convergence of LCG 2.7.0 and gLite 1.5.0 in spring 2006
– Continuity on the production infrastructure ensured usability by applications
– Initial focus on the new Job Management
– Start of the EGEE SA3 Activity for integration and certification
– “Continuous release process”
§ No big-bang releases!
§ Thorough testing and optimization together with the applications
• Migration to the ETICS build system
– ETICS project started in January
• Reorganization of the work according to the new process
– EGEE Technical Coordination Group and Task Forces
[Figure: timeline 2004 to 2007 with labels LCG-2, gLite prototyping, product 2005, product 2006, gLite 3.0, gLite 3.1]
Middleware Structure
[Figure: layered stack. Applications; Higher-Level Grid Services (Workload Management, Replica Management, Visualization, Workflow, Grid Economies, ...); Foundation Grid Middleware (Security model and infrastructure, Computing (CE) and Storage Elements (SE), Accounting, Information and Monitoring)]
• Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware
• Higher-Level Grid Services are meant to help users build their computing infrastructure but should not be mandatory
• Foundation Grid Middleware will be deployed on the EGEE infrastructure
– Must be complete and robust
– Should allow interoperation with other major grid infrastructures
– Should not assume the use of Higher-Level Grid Services
gLite Services Decomposition
[Figure: service groups]
• Access: CLI, API
• Security Services: Authorization, Auditing, Authentication
• Information & Monitoring Services: Information & Monitoring, Service Discovering, Network Monitoring
• Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
• Job Management Services: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
Main components
• User Interface (UI): the place where users log on to the Grid
• Resource Broker (RB): matches the user requirements with the available resources on the Grid
• Information System: characteristics and status of CE and SE (uses “GLUE schema”)
• Computing Element (CE): a batch queue on a site’s computers where the user’s job is executed
• Storage Element (SE): provides (large-scale) storage for files
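The Resource Broker's role of matching user requirements against available resources can be illustrated with a small matchmaking sketch. Everything here is hypothetical: the attribute names, the CE names, and the ranking rule (most free CPUs first) are assumptions for illustration and are not the GLUE schema or the actual broker logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputingElement:
    """Simplified view of what an information system might publish for a CE."""
    name: str                 # hypothetical CE identifier
    free_cpus: int
    max_wall_minutes: int
    supported_vos: frozenset


@dataclass(frozen=True)
class JobRequirements:
    """Simplified user requirements for one job."""
    vo: str
    min_free_cpus: int
    wall_minutes: int


def match(job, ces):
    """Return the CEs that satisfy the job, best candidate first.

    Ranking by free CPUs is an assumption made for this sketch.
    """
    candidates = [
        ce for ce in ces
        if job.vo in ce.supported_vos
        and ce.free_cpus >= job.min_free_cpus
        and ce.max_wall_minutes >= job.wall_minutes
    ]
    return sorted(candidates, key=lambda ce: ce.free_cpus, reverse=True)


if __name__ == "__main__":
    ces = [
        ComputingElement("ce-a", 4, 600, frozenset({"gilda"})),
        ComputingElement("ce-b", 40, 120, frozenset({"gilda", "atlas"})),
        ComputingElement("ce-c", 12, 1440, frozenset({"atlas"})),
    ]
    job = JobRequirements(vo="gilda", min_free_cpus=1, wall_minutes=90)
    print([ce.name for ce in match(job, ces)])
```

Filtering first and ranking second keeps the two concerns separate, so a different ranking rule could be swapped in without touching the requirement check.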
Current production middleware
[Figure: job flow between the “User interface”, Resource Broker, Replica Catalogue, Information Service, Logging & Book-keeping, Authorization & Authentication, Storage Element and Computing Element. Flow labels: Job Submit Event, Job Query, Job Status, Input “sandbox”, Input “sandbox” + Broker Info, Output “sandbox”, DataSets info, SE & CE info, Publish]
Grid Foundation: Security
• Authentication based on an X.509 PKI
– Certificate Authorities (CA) issue (long lived) certificates identifying individuals (much like a passport)
§ Commonly used in web browsers to authenticate to sites
– Trust between CAs and sites is established (offline)
– In order to reduce vulnerability, on the Grid user identification is done by using (short lived) proxies of their certificates
• Proxies can
– Be delegated to a service such that it can act on the user’s behalf
– Include additional attributes (like VO information via the VO Membership Service VOMS)
– Be stored in an external proxy store (MyProxy)
– Be renewed (in case they are about to expire)
AuthN and AuthZ: pre-VOMS
• Authentication
– User receives certificate signed by CA
– Connects to “UI” by ssh
– Downloads certificate
– Single logon to Grid: create proxy, then Grid Security Infrastructure identifies user to other machines
• Authorisation
– User joins Virtual Organisation
– VO negotiates access to Grid nodes and resources
– Authorisation tested by CE
– grid-mapfile maps user to local account
[Figure: numbered steps 1 to 3 (personal/once) between the user, CA, VO mgr (AUP), VO service and UI; GSI; VO database with daily update of grid-mapfiles on Grid services]
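The last authorisation step, mapping a user to a local account, can be sketched as a lookup from certificate subject (DN) to account name. The sketch assumes the common grid-mapfile layout of one quoted DN followed by a local account per line; the sample DN reuses the one shown on the VOMS slide and the account name `gilda001` is made up.

```python
import shlex


def parse_grid_mapfile(text):
    """Parse grid-mapfile text into {certificate DN: local account}.

    Assumes one entry per line: a double-quoted DN, whitespace, then the
    local account name. Blank lines and '#' comment lines are skipped.
    """
    mapping = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '\"DN\" account'")
        dn, account = parts
        mapping[dn] = account
    return mapping


def local_account(mapping, dn):
    """Return the local account for a DN, or None if the user is not mapped."""
    return mapping.get(dn)


if __name__ == "__main__":
    sample = '''
    # hypothetical entry, for illustration only
    "/C=IT/O=INFN/L=CNAF/CN=Pinco Palla" gilda001
    '''
    table = parse_grid_mapfile(sample)
    print(local_account(table, "/C=IT/O=INFN/L=CNAF/CN=Pinco Palla"))
```

A DN that is absent from the file yields `None`, which corresponds to the CE refusing the job in the flow described above.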
VOMS: concepts
• Virtual Organization Membership Service:
– Extends the proxy with info on VO membership, group, roles
– Fully compatible with GSI
– Each VO has a database containing group membership, roles and capabilities information for each user
– User contacts VOMS server requesting their authorization info
– Server sends authorization info to the client, which includes it in a proxy certificate
[Figure: the client sends a query (authentication request) to the VOMS server, which is backed by the Auth DB; the returned VOMS AC is included in the proxy /C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy]
[glite-tutor] /home/giorgio > voms-proxy-init --voms gilda
Cannot find file or dir: /home/giorgio/.glite/vomses
Your identity: /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Emidio Giorgio/Email=emidio.giorgio@ct.infn.it
Enter GRID pass phrase:
Your proxy is valid until Mon Jan 30 23:35:51 2006
Creating temporary proxy.... Done
Contacting voms.ct.infn.it:15001 [/C=IT/O=GILDA/OU=Host/L=INFN Catania/CN=voms.ct.infn.it/Email=emidio.giorgio@ct.infn.it] "gilda"
Creating proxy..... Done
Your proxy is valid until Mon Jan 30 23:35:51 2006
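The VO, group, role and capability information that VOMS adds to a proxy is commonly written as a single path-like string (an FQAN). The sketch below splits such a string into its parts. It assumes the usual `/vo/group/Role=.../Capability=...` layout with `NULL` meaning "not set"; the group `tutors` and role `admin` in the example are invented.

```python
def parse_fqan(fqan):
    """Split a VOMS-style attribute string into VO, group, role, capability.

    Assumed layout: '/<vo>[/<subgroup>...][/Role=<r>][/Capability=<c>]',
    where the literal value NULL means the attribute is not set.
    """
    if not fqan.startswith("/"):
        raise ValueError("an FQAN is expected to start with '/'")
    groups = []
    attrs = {"Role": None, "Capability": None}
    for part in fqan.strip("/").split("/"):
        if "=" in part:
            key, value = part.split("=", 1)
            if key in attrs:
                attrs[key] = None if value == "NULL" else value
        elif part:
            groups.append(part)
    if not groups:
        raise ValueError("no VO name found")
    return {
        "vo": groups[0],                    # first path element is the VO
        "group": "/" + "/".join(groups),    # full group path, VO included
        "role": attrs["Role"],
        "capability": attrs["Capability"],
    }


if __name__ == "__main__":
    # 'tutors' and 'admin' are hypothetical values for illustration.
    print(parse_fqan("/gilda/tutors/Role=admin/Capability=NULL"))
```

A service receiving the proxy could then base an authorization decision on the group and role rather than on the individual user.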
Grid foundation: Information Systems
• Generic Information Provider (GIP)
– Provides LDIF information about a grid service in accordance with the GLUE Schema
[Figure: GIP with Provider, Cache, Plugin, LDIF File, Config File]
• BDII: Information system in gLite 3.0 (by LCG)
– LDAP database that is updated by a process
– More than one DB is used, to separate reads from writes
– A port forwarder is used internally to select the correct DB
[Figure: LDAP databases on ports 2171, 2172 and 2173; “Update DB & Modify DB”, “Swap DBs”; port forwarder on port 2170]
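The BDII read/write separation can be pictured as a small rotation: queries go through one public port to whichever database is currently live, while updates are written to a different database that is then swapped in. The class below is a toy in-memory model only. The port numbers come from the slide, but the round-robin rotation order and the full-rebuild update are assumptions.

```python
class SwappingDirectory:
    """Toy model of serving reads from one DB while another is rebuilt.

    No LDAP is involved: each "database" is a plain dict keyed by backend
    port, and the forwarder is just an index into the list of ports.
    """

    def __init__(self, backend_ports=(2171, 2172, 2173), public_port=2170):
        self.public_port = public_port
        self._ports = list(backend_ports)
        self._data = {port: {} for port in self._ports}
        self._read_index = 0  # which backend the forwarder points at

    @property
    def read_port(self):
        """Backend currently answering queries sent to the public port."""
        return self._ports[self._read_index]

    @property
    def write_port(self):
        """Backend that the next update will rebuild (assumed round-robin)."""
        return self._ports[(self._read_index + 1) % len(self._ports)]

    def query(self, key):
        return self._data[self.read_port].get(key)

    def update(self, entries):
        """Rebuild the offline DB, then repoint the forwarder at it."""
        self._data[self.write_port] = dict(entries)
        self._read_index = (self._read_index + 1) % len(self._ports)


if __name__ == "__main__":
    bdii = SwappingDirectory()
    bdii.update({"ce-a": "Production"})
    print(bdii.read_port, bdii.query("ce-a"))
```

Readers never see a half-written database because the forwarder only moves after the rebuild has finished.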
Grid foundation: Information Systems
• R-GMA: provides a uniform method to access and publish distributed information and monitoring data
– Used for job and infrastructure monitoring in gLite 3.0
– Working to add authorization
• Service Discovery:
– Provides a standard set of methods for locating Grid services
– Currently supports R-GMA, BDII and XML files as backends
– Will add local cache of information
– Used by some DM and WMS components in gLite 3.0
Grid foundation: Computing Element
• LCG-CE (GT2 GRAM)
– in production now but will be phased out later
• GLITE-CE (GSI-enabled Condor-C) New!
– already deployed but still needs thorough testing and tuning. Being done now
• CREAM (WS-I based interface) New!
– being deployed on the JRA1 preview test-bed now. After a first testing phase will be certified and deployed together with the gLite-CE
– Our contribution to the OGF-BES group for a standard WS-I based CE interface
• BLAH is the interface to the local resource manager (via plug-ins)
– CREAM and gLite-CE
– Information pass-through: pass parameters to the LRMS to help job scheduling
[Figure: Grid and Site. WMS and Clients; Information System (bdII, R-GMA, CEMon); Computing Element (glexec + LCAS/LCMAPS, BLAH); LRMS; WN]
Grid foundation: Storage Element
• gLite 3.0 data access protocols:
– File Transfer: GSIFTP (GridFTP)
– File I/O (Remote File access)
§ Posix-like file access
§ Grid File Access Layer (GFAL)
§ Support for ACL in the SRM layer
§ gsidcap, insecure RFIO, secured RFIO (gsirfio)
SE types
• Classic SE:
– GridFTP server
– Insecure RFIO daemon (rfiod): file access limited to the LAN
– Single disk or disk array
– No quota management
– Does not support the SRM interface
• Mass Storage Systems (Castor, dCache)
– Files migrated between front-end disk and back-end tape storage hierarchies
– GridFTP server
– Insecure RFIO (Castor), secure gsidcap (dCache)
– Provide an SRM interface with all the benefits
• Disk pool managers (dCache, DPM, StoRM)
– Manage distributed storage servers in a centralized way
– Physical disks or arrays are combined into a common (virtual) file system
– Disks can be dynamically added to the pool
– GridFTP server
– Secure remote access protocols (gsidcap for dCache, gsirfio for DPM)
– SRM interface
Highlights: Disk Pool Manager
• Light-weight disk-based Storage Element
– Easy to install, configure, manage and to join or remove resources
– Integrated security (authentication/authorization) based on VOMS groups and roles
§ All control and I/O services have security built-in: GSI or Kerberos 5
§ Problem of ACLs propagation during replication between SEs will be addressed in the first half of 2007
– SRMv1, SRMv2.1 and SRMv2.2
[Figure: Grid Client (RFIO Client, Gridftp Client, SRM Client); Data Server (RFIO Daemon, Gridftp Server, RFIO Client, Disk System); SRM Server (SRM Daemon); Name Server (NS Daemon, NS Database); Disk Pool Manager (Request Daemon, DPM Database)]
Grid foundation: Accounting
• APEL: Uses R-GMA to propagate and display job accounting information for infrastructure monitoring
– Reads LRMS log files provided by gLite-CE and BLAH
– Preparing an update for gLite 3.0 to use the files from BLAH
• DGAS: Collects, stores and transfers accounting data. Compliant with privacy requirements New!
– Reads LRMS log files provided by LCG-CE and BLAH.
– Stores information in a site database (HLR) and optionally in a central HLR. Access granted to user, site and VO administrators
– Not yet certified in gLite 3.0. Deployment plan:
§ certify and activate local sensors and site HLR in parallel with APEL
§ replace APEL sensors with DGAS (DGAS2APEL)
§ certify and activate central HLR; perform scalability tests
High Level Services: Workload mgmt.
• WMS helps the user access computing resources
– Resource brokering, management of job input/output, ...
• lcg-RB: GT2 + Condor-G
– To be replaced when the gLite WMS proves to be reliable
• gLite WMS: Web service (WMProxy) + Condor-G New!
– Management of complex workflows (DAGs) and compound jobs
§ bulk submission and shared input sandboxes
§ support for input files on different servers (scattered sandboxes)
– Support for shallow resubmission of jobs
– Job File Perusal: file peeking during job execution
– Supports collection of information from CEMon, BDII, R-GMA and from DLI and StorageIndex data management interfaces
– Support for parallel jobs (MPI) when the home dir is not shared
– Deployed for the first time in gLite 3.0
High Level Services: Workflows
• A Directed Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs
[Figure: example DAG with nodes nodeA, nodeB, nodeC, nodeD, nodeE]
• A Collection is a group of jobs with no dependencies
– basically a collection of JDLs
• A Parametric job is a job having one or more attributes in the JDL that vary their values according to parameters
• Using compound jobs it is possible to have one-shot submission of a (possibly very large, up to thousands) group of jobs
– Submission time reduction
§ Single call to WMProxy server
§ Single Authentication and Authorization process
§ Sharing of files between jobs
– Availability of both a single Job ID to manage the group as a whole and an ID for each single job in the group
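The dependency idea behind a DAG job can be sketched with a topological sort: a job becomes runnable only once every job it depends on has finished. The node names below come from the slide's figure, but the edges between them are not recoverable from the text, so the dependency table is purely hypothetical. The sketch uses Python's standard `graphlib` module (Python 3.9 or later) and is not how the WMS itself schedules jobs.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each key runs only after all jobs in its set.
DEPENDENCIES = {
    "nodeB": {"nodeA"},
    "nodeC": {"nodeA"},
    "nodeD": {"nodeB", "nodeC"},
    "nodeE": {"nodeD"},
}


def execution_waves(dependencies):
    """Group jobs into waves; jobs in one wave have no mutual dependencies.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(number, wave)
```

A Collection, in these terms, is the degenerate case with an empty dependency set for every job, so all jobs land in the first wave.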
Highlights: FTS
• Reliable and manageable File Transfer System for VOs
• Transfers are treated as jobs
– May be split onto multiple “channels”
– Channels are point-to-point or “catch-all” (only one end fixed). More flexible channel definitions on the way...
• New features that will be available in production soon:
– Cleaner error reporting and service monitoring interfaces
– Proxy renewal and delegation
– SRMv2.2 support
• Longer term development:
– Optimized SRM interaction
§ split preparation from transfer
– Better service management controls
– Notification of finished jobs
– Pre-staging tape support
– Catalog & VO plug-ins framework
§ Allow catalog registration as part of transfer workflow
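The difference between point-to-point and “catch-all” channels can be shown with a small lookup that picks a channel for a given transfer. This is a sketch under assumptions: the `*` wildcard, the site names, and the precedence rule (an exact channel wins over a catch-all one) are all invented for illustration and do not reflect FTS configuration syntax.

```python
from typing import NamedTuple, Optional, Sequence

WILDCARD = "*"  # hypothetical marker for the open end of a catch-all channel


class Channel(NamedTuple):
    source: str
    destination: str


def select_channel(source: str, destination: str,
                   channels: Sequence[Channel]) -> Optional[Channel]:
    """Pick the channel that should carry a transfer, or None if none fits.

    Assumed precedence: an exact point-to-point channel first, then the
    first catch-all channel with one matching fixed end.
    """
    exact = Channel(source, destination)
    if exact in channels:
        return exact
    for channel in channels:
        fixed_source = (channel.source == source
                        and channel.destination == WILDCARD)
        fixed_destination = (channel.source == WILDCARD
                             and channel.destination == destination)
        if fixed_source or fixed_destination:
            return channel
    return None


if __name__ == "__main__":
    configured = [
        Channel("SITE-A", "SITE-B"),   # point-to-point
        Channel(WILDCARD, "SITE-B"),   # catch-all: anything into SITE-B
    ]
    print(select_channel("SITE-A", "SITE-B", configured))
    print(select_channel("SITE-C", "SITE-B", configured))
```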
File management in gLite
• Files are write-once, read-many
– If users edit files then they manage the consequences!
• Middleware supporting
– Replica files
– Logical filenames
– Catalogue: maps logical name to physical storage device/file
– Virtual filesystems, POSIX-like I/O: GFAL
• Services provided:
– Storage: SE
– Transfer: FTS
– Catalogue that maps logical filenames to replicas: LFC
Name conventions
• Users primarily access and manage files through “logical filenames”
• LFC has a directory tree structure: /grid/<VO_name>/<you create it>
– /grid/<VO_name>/ is the LFC namespace
– <you create it> is defined by the user
• Mapping by the “LFC” catalogue server
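The mapping from a logical filename to its replicas can be modelled as a dictionary keyed by names under `/grid/<VO_name>/`. The class below is an in-memory illustration only, not the LFC API; the class name, method names and the `srm://...example.org` replica URLs are all hypothetical.

```python
class ReplicaCatalogue:
    """Toy catalogue mapping logical filenames (LFNs) to replica locations."""

    def __init__(self, vo_name):
        # LFNs for a VO live under /grid/<VO_name>/ as on the slide.
        self.prefix = f"/grid/{vo_name}/"
        self._replicas = {}

    def register(self, lfn, replica_url):
        """Record one more physical copy of a logical file."""
        if not lfn.startswith(self.prefix) or lfn == self.prefix:
            raise ValueError(f"LFN must be a path under {self.prefix}")
        locations = self._replicas.setdefault(lfn, [])
        if replica_url not in locations:
            locations.append(replica_url)

    def list_replicas(self, lfn):
        """Return all known replicas for an LFN (empty list if unknown)."""
        return list(self._replicas.get(lfn, []))


if __name__ == "__main__":
    catalogue = ReplicaCatalogue("gilda")
    lfn = "/grid/gilda/tutorial/run01.dat"  # user-defined part after the VO
    # Hypothetical replica URLs on two storage elements.
    catalogue.register(lfn, "srm://se1.example.org/data/run01.dat")
    catalogue.register(lfn, "srm://se2.example.org/data/run01.dat")
    print(catalogue.list_replicas(lfn))
```

Because files are write-once, a catalogue like this only ever adds or removes whole replicas and never has to reconcile diverging copies.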
Summary
• gLite 3.0 is an important milestone in the EGEE program
– New components from gLite 1.X being deployed for the first time on the Production Infrastructure
§ Address requirements in terms of functionality and scalability
§ Components deployed for the first time need extensive testing!
– New organization in EGEE II
§ New build and integration environment from ETICS
§ More controlled software process and certification
§ Development is client driven (TCG)
• Development is continuing to provide increased robustness, usability and functionality
• Collaboration with other projects for interoperability and definition/adoption of international standards
Questions?
www.glite.org