bb7918ca1089b016f1a9d61417c4a443.ppt
- Number of slides: 48
Grid Computing & Web Services: A Natural Partnership
Dave Angulo, Department of Computer Science, The University of Chicago, and Mathematics and Computer Science Division, Argonne National Laboratory
Ian Foster, Mathematics and Computer Science Division, Argonne National Laboratory, and Department of Computer Science, The University of Chicago
Address of Poznan Supercomputing & Networking Center, Poznan, Poland, February 7, 2002
Partial Acknowledgements
- Open Grid Services Architecture work is performed by
  – Ian Foster, Globus Co-PI @ Argonne/UofC
  – Carl Kesselman, Globus Co-PI @ USC/ISI
  – Steve Tuecke, Globus Toolkit Architect @ ANL
  – Jeff Nick, Steve Graham, Jeff Frey @ IBM
- Globus Toolkit R&D involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
- Strong collaborations with many outstanding EU, UK, US Grid projects
- Support from DOE, NASA, NSF, Microsoft
dangulo@cs.uchicago.edu University of Chicago
Partial Acknowledgements
- Globus Toolkit™
  – R&D involves
    > many fine scientists & engineers at ANL/UofC, USC/ISI, and elsewhere (see www.globus.org)
  – Led by
    > Ian Foster @ Argonne/UofC
    > Carl Kesselman @ USC/ISI
- Open Grid Services Architecture work performed by
  – Ian Foster, Globus Co-PI @ Argonne/UofC
  – Carl Kesselman, Globus Co-PI @ USC/ISI
  – Steve Tuecke, Globus Toolkit Architect @ ANL
  – Jeff Nick, Steve Graham, Jeff Frey @ IBM
- Strong collaborations with many outstanding EU, UK, US Grid projects
- Support from DOE, NASA, NSF, Microsoft, IBM
Grid Computing
The Grid Problem
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Why Grids?
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for petaflop analyses of petabytes of data
- Civil engineers collaborate to design, execute, & analyze shake table experiments
- Climate scientists visualize, annotate, & analyze terabyte simulation datasets
- A home user invokes architectural design functions at an application service provider
  – An application service provider purchases cycles from compute cycle providers
Elements of the Problem
- Resource sharing
  – Computers, storage, sensors, networks, …
  – Sharing always conditional: issues of trust, policy, payment, …
- Coordinated problem solving
  – Beyond client-server: distributed data analysis, computation, …
- Dynamic, multi-institutional virtual orgs
  – Community overlays on classic org structures
  – Large or small, static or dynamic
Grids: Why Now?
- Moore’s law improvements in computing produce highly functional end systems
- The Internet and burgeoning wired and wireless provide universal connectivity
- Network exponentials produce dramatic changes in geometry and geography
Grids: Why Now?
- Moore’s law improvements in computing produce highly functional end systems
- The Internet and burgeoning wired and wireless provide universal connectivity
- Network exponentials produce dramatic changes in geometry and geography
  – 9-month doubling: double Moore’s law!
  – 1986-2001: x340,000; 2001-2010: x4000?
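The growth figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 18-month Moore's law doubling period is an assumption inferred from the slide's "double Moore's law" remark.

```python
import math


def growth_factor(months: float, doubling_months: float) -> float:
    """Growth over `months` for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


def implied_doubling_months(months: float, factor: float) -> float:
    """Doubling period that would produce `factor` growth over `months`."""
    return months / math.log2(factor)


# 2001-2010 is 9 years; a 9-month doubling gives 12 doublings.
network_2001_2010 = growth_factor(9 * 12, 9.0)   # 2**12 = 4096, the slide's "x4000?"

# Assumed 18-month Moore's law doubling over the same period, for comparison.
moore_2001_2010 = growth_factor(9 * 12, 18.0)    # 2**6 = 64

# 1986-2001 is 15 years; x340,000 implies a doubling period of just under 10 months.
historic_doubling = implied_doubling_months(15 * 12, 340_000)

print(network_2001_2010, moore_2001_2010, round(historic_doubling, 1))
```

Under these assumptions the projected x4000 is simply twelve doublings, and the historical x340,000 is roughly consistent with the 9-month figure.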
The Grid World: Current Status
- Dozens of major Grid projects in scientific & technical computing/research & education
  – Deployment, application, technology
- Considerable consensus on key concepts and technologies
  – Globus Toolkit™ has emerged as de facto standard for major protocols & services
- Global Grid Forum has emerged as a significant force
  – And first “Grid” proposals at IETF
Selected Major Grid Projects
Name | URL & Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid; DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom; DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org; DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org; DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org; European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics
Selected Major Grid Projects
Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
Selected Major Grid Projects
Name | URL/Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads; NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org; NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov; NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org; NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org; NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net; DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments
Selected Major Grid Projects
Name | URL/Sponsor | Focus
TeraGrid | teragrid.org; NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid | grid-support.ac.uk; U.K. eScience | Support center for Grid projects within the U.K.
Unicore | BMBFT | Technologies for remote access to supercomputers
Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS
See also www.gridforum.org
Grid Communities & Applications: Data Grids for High Energy Physics
[Figure: tiered data Grid for high energy physics. Labels recovered from the diagram:]
- Tiers: Online System; Offline Processor Farm (~20 TIPS); Tier 0: CERN Computer Centre; Tier 1: France, Germany, and Italy Regional Centres, FermiLab (~4 TIPS); Tier 2: Tier 2 Centres (~1 TIPS), Caltech (~1 TIPS); Institutes (~0.25 TIPS) with physics data cache; Tier 4: physicist workstations
- Link rates shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec or Air Freight (deprecated), ~1 MBytes/sec
- There is a “bunch crossing” every 25 nsecs
- There are 100 “triggers” per second; each triggered event is ~1 MByte in size
- 1 TIPS is approximately 25,000 SpecInt95 equivalents
- Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
www.griphyn.org www.ppdg.net www.eu-datagrid.org
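The rates quoted on this slide fit together arithmetically. The following sketch only recombines the slide's own numbers; the variable names are hypothetical, and the daily volume assumes decimal units (1 TByte = 1,000,000 MBytes).

```python
# Figures taken from the slide.
TRIGGERS_PER_SECOND = 100
EVENT_SIZE_MBYTES = 1
BUNCH_CROSSING_NS = 25
SPECINT95_PER_TIPS = 25_000
OFFLINE_FARM_TIPS = 20

# 100 triggers/s at ~1 MByte each gives the ~100 MBytes/sec recorded stream.
recorded_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# One bunch crossing every 25 ns is 40 million crossings per second,
# so only one crossing in 400,000 is kept by the trigger.
crossings_per_second = 1_000_000_000 // BUNCH_CROSSING_NS
crossings_per_trigger = crossings_per_second // TRIGGERS_PER_SECOND

# The ~20 TIPS offline farm expressed in SpecInt95 equivalents.
farm_specint95 = OFFLINE_FARM_TIPS * SPECINT95_PER_TIPS

# Daily recorded volume, assuming 1 TByte = 1,000,000 MBytes.
tbytes_per_day = recorded_mbytes_per_second * 86_400 / 1_000_000

print(recorded_mbytes_per_second, crossings_per_second,
      crossings_per_trigger, farm_specint95, tbytes_per_day)
```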
Grid Communities and Applications: Mathematicians Solve NUG30
- Community = an informal collaboration of mathematicians and computer scientists
- Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
- Solves NUG30 quadratic assignment problem
14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23
www.mcs.anl.gov/metaneos: Argonne, Iowa, NWU, Wisconsin
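Two quick sanity checks on the numbers above: the listed solution should be a permutation of 1 to 30 (an assignment of 30 facilities to 30 locations), and the CPU time implies an average processor count well below the peak. This is a small sketch with hypothetical variable names, not part of the original computation.

```python
# The assignment printed on the slide.
NUG30_SOLUTION = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
                  22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]

# A valid assignment uses each of the 30 indices exactly once.
is_permutation = sorted(NUG30_SOLUTION) == list(range(1, 31))

# 3.46E8 CPU seconds delivered over 7 days of wall-clock time.
CPU_SECONDS = 3.46e8
WALL_SECONDS = 7 * 24 * 3600
PEAK_PROCESSORS = 1009

average_processors = CPU_SECONDS / WALL_SECONDS
average_vs_peak = average_processors / PEAK_PROCESSORS

print(is_permutation, round(average_processors), round(average_vs_peak, 2))
```

On these figures the run averaged roughly 572 busy processors, a bit over half of the 1009-processor peak.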
Grid Communities and Applications: Network for Earthquake Eng. Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: four interconnected sites, each with site resources, archival storage (HPSS or UniTree), and external networks: Caltech; SDSC (4.1 TF, 225 TB); Argonne; NCSA/PACI (8 TF, 240 TB)]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
Intl. Virtual Data Grid Lab.
[Figure: map of facilities and links. Legend: Tier 0/1 facility, Tier 2 facility, Tier 3 facility, 10+ Gbps link, 2.5 Gbps link, 622 Mbps link, other link]
www.ivdgl.org
Access Grid
- Collaborative work among large groups
- ~50 sites worldwide
- Use Grid services for discovery, security
- www.scglobal.org
[Figure: Access Grid room. Labels: presenter mic, presenter camera, ambient mic (tabletop), audience camera]
Access Grid: Argonne, others
www.accessgrid.org
Grid Architecture & Globus Toolkit™
- The question:
  – What is needed for resource sharing & coordinated problem solving in dynamic virtual organizations (VOs)?
- The answer:
  – Major issues identified: membership, resource discovery & access, …, …
  – Grid architecture captures core elements, emphasizing pre-eminent role of protocols
  – Globus Toolkit™ has emerged as de facto standard for major protocols & services
The Critical Role of Protocols
- Need for interoperability when different groups want to share resources
  – E.g., IP lets me talk to your computer, but how do we establish & maintain sharing?
  – How do I discover, authenticate, authorize, describe what I want to do, etc.?
- Need for shared infrastructure services to avoid repeated development, installation, e.g.
  – One port/service for remote access to computing, not one per tool/application
  – X.509 enables sharing of Certificate Authorities
Grid Architecture
[Figure: Grid layers shown beside the Internet protocol architecture]
Grid layers, top to bottom:
- Application
- Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
- Resource: “Sharing single resources”: negotiating access, controlling use
- Connectivity: “Talking to things”: communication (Internet protocols) & security
- Fabric: “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture, top to bottom: Application, Transport, Internet, Link
For more info: www.globus.org/research/papers/anatomy.pdf
Globus Project and Toolkit
- Globus Project™
  – R&D project at ANL, U. Chicago, USC/ISI
  – Emphasis on identifying and defining core protocols and services
  – O(40) researchers & developers
- Globus Toolkit™
  – A major product of the Globus Project
  – Open source software: reference implementation of core protocols & services
  – Growing open source developer community
Globus Toolkit: Evaluation (1)
- Good technical solutions for key problems, e.g.
  – Authentication and authorization
  – Resource discovery and monitoring
  – Reliable remote service invocation
  – High-performance remote data access
- This + good engineering is enabling progress
  – Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
  – Growing community code base built on tools
Globus Toolkit: Evaluation (2)
- Protocol deficiencies, e.g.
  – Heterogeneous basis: HTTP, LDAP, FTP
  – No standard means of error propagation
- Significant missing functionality, e.g.
  – Databases, sensors, instruments
  – Programming tools: workflow, …
  – Virtualization of end systems (hosting envs.)
- Little work on total system properties, e.g.
  – Dependability, end-to-end QoS, …
  – Reasoning about system properties
“Web Services”
- Increasingly popular standards-based framework for accessing network applications
  – W3C standardization; Microsoft, IBM, Sun, others
- WSDL: Web Services Description Language
  – Interface Definition Language for Web services
- SOAP: Simple Object Access Protocol
  – XML-based RPC protocol; common WSDL target
- WS-Inspection (WSIL)
  – Conventions for locating service descriptions
- UDDI: Universal Desc., Discovery, & Integration
  – Directory for Web services
Transient Service Instances
- “Web services” address discovery & invocation of persistent services
- In Grids, must also support transient service instances, created/destroyed dynamically
  – E.g., to manage eBusiness workflow, video conference, or distributed data analysis
- Significant implications for how services are managed, named, discovered, and used
  – In fact, much of our work is concerned with the management of service instances
Open Grid Services Architecture
- Service orientation to virtualize resources
- From Web services:
  – Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
- Building on Globus Toolkit:
  – The Grid service defines standard semantics for service interactions
  – Factory, registry, and mapper services
  – Reliable and secure transport
- Multiple hosting targets: J2EE, .NET, “C”, etc.
OGSA Service Model
- System comprises (typically a few) persistent services & (potentially many) transient services
- All services adhere to specified Grid service interfaces and behaviors
  – Reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, manageability
- Interfaces for managing Grid service instances
  – Factory, registry, mapper
- Heavily leverage Globus Toolkit technology
=> Reliable secure mgmt of distributed state
The Grid Service
- A (potentially transient) Web service with specified interfaces & behaviors, including
  – Creation (Factory)
  – Global naming (GSH) & references (GSR)
  – Lifetime management
  – Registration & Discovery
  – Authorization
  – Notification
  – Concurrency
  – Manageability
Factory
- A Grid service with Factory interface can be requested to create a new Grid service instance
  – Reliable creation (once-and-only-once)
- Create operation can be extended to accept Grid-service-specific creation parameters
- Returns a Grid Service Handle (GSH)
  – A globally unique URL
  – Uniquely identifies the instance for all time
  – Based on name of a home mapper service
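The Factory behavior described above can be sketched as a small in-memory model. Everything here is hypothetical: the class, the `request_id` idempotency key used to approximate once-and-only-once creation, the URL layout, and the `mapper.example.org` address are illustrative assumptions, not the OGSA interface definitions.

```python
import itertools


class Factory:
    """Toy Factory: creates service instances and hands back their GSHs."""

    def __init__(self, home_mapper_url: str) -> None:
        # Assumption: handles are minted underneath the home mapper's URL.
        self.home_mapper_url = home_mapper_url.rstrip("/")
        self._ids = itertools.count(1)
        self._by_request = {}   # request_id -> GSH, so retries are idempotent
        self.instances = {}     # GSH -> service-specific creation parameters

    def create_service(self, request_id: str, **creation_params) -> str:
        # A retried request must not create a second instance.
        if request_id in self._by_request:
            return self._by_request[request_id]
        gsh = f"{self.home_mapper_url}/instance/{next(self._ids)}"
        self._by_request[request_id] = gsh
        self.instances[gsh] = dict(creation_params)
        return gsh


factory = Factory("http://mapper.example.org/ogsa")
first = factory.create_service("req-42", dataset="run7")
retry = factory.create_service("req-42", dataset="run7")   # same handle as `first`
other = factory.create_service("req-43", dataset="run8")   # a new instance
```

Keying on a client-chosen request identifier is one common way to get once-and-only-once semantics over an unreliable transport; the slides do not say which mechanism OGSA uses.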
Mapper
- A GSH is a stable name for a Grid service, but does not allow client to actually communicate with the Grid service
- A Grid Service Reference (GSR) is a WSDL document that describes how to communicate with the Grid service
  – Contains protocol binding, network address, …
  – May expire (i.e., GSR information may change)
- The Mapper interface allows a client to map from a GSH to a GSR
  – HTTP GET on GSH also returns a GSR
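The split between a stable handle and an expiring reference suggests a client-side pattern: cache the GSR and go back to the Mapper only when it has expired. The sketch below assumes hypothetical class and method names (`Mapper.find_by_handle`, `Client.resolve`) and represents a GSR as a plain record, not a WSDL document; time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GSR:
    """Stand-in for a Grid Service Reference (really a WSDL document)."""
    gsh: str
    binding: str
    address: str
    expires_at: float


class Mapper:
    """Toy Mapper: holds the current GSR for each GSH."""

    def __init__(self) -> None:
        self._current = {}

    def publish(self, gsr: GSR) -> None:
        self._current[gsr.gsh] = gsr

    def find_by_handle(self, gsh: str) -> GSR:
        return self._current[gsh]


class Client:
    """Caches GSRs and refreshes them from the Mapper once they expire."""

    def __init__(self, mapper: Mapper) -> None:
        self._mapper = mapper
        self._cache = {}
        self.lookups = 0

    def resolve(self, gsh: str, now: float) -> GSR:
        cached = self._cache.get(gsh)
        if cached is None or cached.expires_at <= now:
            cached = self._mapper.find_by_handle(gsh)
            self._cache[gsh] = cached
            self.lookups += 1
        return cached


GSH = "http://mapper.example.org/ogsa/instance/1"
mapper = Mapper()
mapper.publish(GSR(GSH, "SOAP/HTTP", "host-a.example.org:8080", expires_at=100.0))
client = Client(mapper)

r1 = client.resolve(GSH, now=10.0)    # first use: asks the Mapper
r2 = client.resolve(GSH, now=50.0)    # still valid: served from the cache
# The service moves; the handle stays the same but the reference changes.
mapper.publish(GSR(GSH, "SOAP/HTTP", "host-b.example.org:8080", expires_at=300.0))
r3 = client.resolve(GSH, now=150.0)   # cached GSR expired: asks the Mapper again
```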
Lifetime Management
- GS instances created by factory or manually; destroyed explicitly or via soft state
  – Negotiation of initial lifetime with Factory
- SoftStateDestruction interface supports
  – GetTerminationTime message for inquiry
    > Notification interface also allows for lifetime notification
  – SetTerminationTime message for keepalive
- Soft state lifetime management avoids
  – Explicit client teardown of complex state
  – Resource “leaks” in hosting environments
- ExplicitDestruction interface also available
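A minimal sketch of soft-state lifetime management follows, assuming a hypothetical `HostingEnvironment` class whose methods loosely mirror the GetTerminationTime and SetTerminationTime messages named above. The sweep policy and the explicit `now` arguments are illustrative choices, not part of the specification.

```python
class HostingEnvironment:
    """Toy hosting environment that reclaims instances whose lifetime lapsed."""

    def __init__(self) -> None:
        self._termination_time = {}   # GSH -> absolute termination time

    def create(self, gsh: str, now: float, initial_lifetime: float) -> None:
        # The initial lifetime would be negotiated with the Factory.
        self._termination_time[gsh] = now + initial_lifetime

    def get_termination_time(self, gsh: str) -> float:
        return self._termination_time[gsh]

    def set_termination_time(self, gsh: str, new_time: float) -> None:
        # Keepalive: a client that still cares pushes the deadline out.
        self._termination_time[gsh] = new_time

    def destroy(self, gsh: str) -> None:
        # Explicit destruction remains available alongside soft state.
        self._termination_time.pop(gsh, None)

    def sweep(self, now: float) -> list:
        """Reclaim every instance whose termination time has passed."""
        expired = sorted(g for g, t in self._termination_time.items() if t <= now)
        for gsh in expired:
            del self._termination_time[gsh]
        return expired


env = HostingEnvironment()
env.create("gsh-a", now=0.0, initial_lifetime=30.0)
env.create("gsh-b", now=0.0, initial_lifetime=30.0)

# At t=20 the client of gsh-a sends a keepalive; the client of gsh-b has vanished.
env.set_termination_time("gsh-a", 20.0 + 30.0)

reclaimed_at_40 = env.sweep(now=40.0)   # gsh-b lapsed without any client teardown
reclaimed_at_60 = env.sweep(now=60.0)   # gsh-a lapses once keepalives stop
```

The abandoned instance is cleaned up without any client-side teardown call, which is the resource-leak argument made on the slide.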
Discovery
- A Grid service instance may maintain a set of service information
  – XML fragments encapsulated in standard <name, type, TTL-info> containers
- Discovery interface allows clients to query the Grid service instance for this information
  – Query operation, plus supporting operations
    > Extensible query language support
- See also Notification interfaces
  – Allows notification of service existence and about service information
Registry
- The Registry interface may be used to discover a set of Grid service instances
  – Returns a WS-Inspection document containing the GSHs of a set of Grid services
  – Also returns policy associated with the set
  – Also available through Discovery interface
- The RegistryManagement interface allows for soft-state registration of a Grid service
  – A set of Grid services can periodically register their GSHs into a registry service, to allow for discovery of services in that set
Authorization
- Protocol binding handles authentication during invocation of Grid service operation
  – Gives service URI for authenticated subject
- Grid service instance should apply authorization policy on all operations
  – May be site-, service-, instance-, etc., specific
- OGSA defines standard interfaces for remote management of access control policy
  – OperationAuthorizationManagement
  – SubjectEquivalency
Notification Interfaces
- NotificationSource for client subscription
  – One or more notification generators
    > Generates notification message of a specific type
    > Typed interest statements: e.g., filters, topics, …
    > Supports messaging services, 3rd party filter services, …
  – Soft state subscription to a generator
- NotificationSink for asynchronous delivery of notification messages
- A wide variety of uses are possible
  – E.g., dynamic discovery/registry services, monitoring, application error notification, …
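The source/sink split with soft-state subscriptions can be illustrated as below. The class names echo the interface names on the slide, but the method names, the topic-equality filter, and the explicit `now` argument are assumptions made for this sketch; delivery is a direct call here, whereas the slide describes asynchronous delivery.

```python
class NotificationSink:
    """Toy sink: records every notification delivered to it."""

    def __init__(self) -> None:
        self.received = []

    def deliver_notification(self, topic: str, message: str) -> None:
        self.received.append((topic, message))


class NotificationSource:
    """Toy source: keeps soft-state subscriptions, each filtered by topic."""

    def __init__(self) -> None:
        self._subscriptions = []   # dicts with sink, topic, expires_at

    def subscribe(self, sink: NotificationSink, topic: str, expires_at: float) -> None:
        self._subscriptions.append(
            {"sink": sink, "topic": topic, "expires_at": expires_at})

    def notify(self, topic: str, message: str, now: float) -> int:
        # Soft state: subscriptions that were not renewed silently drop out.
        self._subscriptions = [s for s in self._subscriptions
                               if s["expires_at"] > now]
        delivered = 0
        for sub in self._subscriptions:
            if sub["topic"] == topic:   # the simplest possible interest statement
                sub["sink"].deliver_notification(topic, message)
                delivered += 1
        return delivered


monitor = NotificationSink()
logger = NotificationSink()
source = NotificationSource()
source.subscribe(monitor, topic="error", expires_at=100.0)
source.subscribe(logger, topic="error", expires_at=20.0)
source.subscribe(logger, topic="lifetime", expires_at=100.0)

sent_early = source.notify("error", "disk full", now=10.0)    # both error subscribers
sent_late = source.notify("error", "job failed", now=50.0)    # logger's error subscription lapsed
```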
Use of Web Services (1)
- A Grid service interface is a WSDL portType
- A Grid service definition is a WSDL extension (serviceType) containing:
  – A set of one or more portTypes supported by the service
  – portType & serviceType compatibility statements, to support upgradability
    > For discovery of compatible services when interfaces are upgraded
  – Implementation version information
Use of Web Services (2)
- A GSR is a WSDL document with extensions:
  – Extension to service element to reference serviceType
  – Service element extensions to carry the GSH, and the expiration time of the GSR
- A GSH is a URL, with the following properties:
  – Globally unique for all time
  – HTTP GET on GSH + “.wsdl” returns GSR
  – Can derive GSH to Mapper from it
- Registry returns WS-Inspection documents
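The naming convention above (GSH plus ".wsdl" yields the GSR) amounts to a one-line URL transformation. The helper below is a hypothetical sketch with a made-up example handle; it performs no network access and only shows where a client would send its HTTP GET.

```python
from urllib.parse import urlsplit


def gsr_url_for(gsh: str) -> str:
    """Return the URL whose HTTP GET should yield the GSR for `gsh`."""
    parts = urlsplit(gsh)
    # Assumption for this sketch: handles are absolute http(s) URLs.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {gsh!r}")
    return gsh + ".wsdl"


example_gsh = "http://mapper.example.org/ogsa/instance/1"
example_gsr_url = gsr_url_for(example_gsh)
print(example_gsr_url)   # http://mapper.example.org/ogsa/instance/1.wsdl
```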
Using OGSA to Construct Grid Environments
[Figure: three panels, each built from Factory, Registry, H2R Mapper, and Service components: (a) Simple Hosting Environment; (b) Virtual Hosting Environment, with E2E Factory, E2E Registry, and E2E Mapper spanning several hosting environments; (c) Compound Services]
In each case, Registry handle is effectively the unique name for the virtual organization.
OGSA and the Globus Toolkit
- Technically, OGSA enables
  – Refactoring of protocols (GRAM, MDS-2, etc.), while preserving all GT concepts/features!
  – Integration with hosting environments: simplifying components, distribution, etc.
  – Greatly expanded standard service set
- Pragmatically, we are proceeding as follows
  – Develop open source OGSA implementation
    > Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
  – Partnerships for service development
  – Also expect commercial value-adds
Globus Toolkit Refactoring
- Grid Security Infrastructure (GSI)
  – Used in Grid service network protocol bindings
- Meta Directory Service 2 (MDS-2)
  – Native part of each Grid service:
    > Discovery, RegistryManagement, Notification
- Grid Resource Allocation & Mngt (GRAM)
  – Gatekeeper -> Factory for job mgr instances
- GridFTP
  – Refactor control channel protocol
- Other services refactored to use Grid services
Timeline
- Summer 2002: Alpha releases of high-level Grid Services
- Late 2002, Early 2003: Alpha release of new core Grid Services (MDS, GRAM, GridFTP)
Migration Paths
- Globus Toolkit™ evolutionary in nature
  – Toolkit implementation may change
  – Underlying model of Grid Computing remains the same
  – Capabilities of future Toolkits will be superset of today’s Toolkit
- New implementations integrate better with existing commodity technologies
- In cases of radical departure from current implementations, migration paths will be provided
  – possibly maintain compatible APIs
  – possibly create gateways to today’s protocols
Summary: Evolution of Grid Technologies
- Initial exploration (1996-1999; Globus 1.0)
  – Extensive application experiments; core protocols
- Data Grids (1999-??; Globus 2.0+)
  – Large-scale data management and analysis
- Open Grid Services Architecture (2001-??, Globus 3.0)
  – Integration w/ Web services, hosting environments, resource virtualization
  – Databases, higher-level services
- Radically scalable systems (2003-??)
  – Sensors, wireless, ubiquitous computing
Summary
- The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
- Grid architecture: Protocol, service definition for interoperability & resource sharing
- Globus Toolkit a source of protocol and API definitions, and reference implementations
  – And many projects applying Grid concepts (& Globus technologies) to important problems
- Open Grid Services Architecture represents (we hope!) next step in evolution
For More Information
- The Globus Project™
  – www.globus.org
- Grid architecture
  – www.globus.org/research/papers/anatomy.pdf
- Open Grid Services Architecture
  – www.globus.org/research/papers/ogsa.pdf
  – www.globus.org/research/papers/gsspec.pdf