An Overview of Grid Computing Jonathan Schisler Advanced DBMS 2/10/2005

Topics • Grid Computing • Example Grids • Grid History • Grid Services • GT3 Example • Challenges in Grid Computing • The UofA and Grid

What is “Grid Computing” • Grid computing is a way of organizing computing resources • So that they can be flexibly and dynamically allocated and accessed – Processors, storage, network bandwidth, databases, applications, sensors and so on

What is Grid (cont) • The objective of grid computing is to share information and processing capacity so that it can be more efficiently exploited – Offer QoS guarantees (security, workflow and resource management, fail-over, problem determination, …)

Elements of Grid Computing • Resource sharing – Computers, storage, sensors, networks, … – Sharing always conditional: issues of trust, policy, negotiation, payment, … • Coordinated problem solving – Beyond client-server: distributed data analysis, computation, collaboration, … • Dynamic, multi-institutional virtual organizations – Community overlays on classic org structures – Large or small, static or dynamic

Types of Grids • Computational grids – reducing execution time • Data grids – large scale data management problems
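The "reducing execution time" idea behind a computational grid can be sketched in plain Java. This is only an illustration under stated assumptions: local threads stand in for grid nodes, and the class name `SplitWorkSketch` and method `parallelSum` are hypothetical, not part of any grid toolkit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: threads play the role of grid nodes.
class SplitWorkSketch {
    // Sums 1..n by cutting the range into one contiguous chunk per worker.
    static long parallelSum(final long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final long from = w * chunk + 1;
                final long to = Math.min(n, from + chunk - 1);
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (long i = from; i <= to; i++) s += i;
                    return s;
                }));
            }
            // Combine the partial results from every worker.
            long total = 0;
            for (Future<Long> p : parts) total += p.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100, 4));
    }
}
```

On a real grid the chunks would go to separate machines, so data movement and scheduling overhead would also affect the run time.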

Oversimplified Comparison of SMP, MPP SC04: HLRS

Why Use Grid • Commodity Parts - Cheap • Custom Supercomputer - Expensive • Reduce Application run-time • Increased Availability • Dynamic Allocation of Resources • For Large Datasets

www.top500.org

Online Access to Scientific Instruments • Advanced Photon Source • Real-time collection • Wide-area dissemination • Archival storage • Desktop clients with shared controls DOE X-ray grand challenge: ANL, USC/ISI, NIST, U. Chicago

Network for Earthquake Engineering Simulation • NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other • On-demand access to experiments, data streams, computing, archives, collaboration NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

Collaborative Engineering: NEESgrid U. Nevada Reno, www.neesgrid.org

USA TeraGrid Computing, B. Wilkinson

Broader Context • “Grid Computing” has much in common with major industrial thrusts – Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing, Web Services, … • Sharing issues not adequately addressed by existing technologies – Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q” – High performance: unique demands of advanced & high-performance systems

Grid Evolution • First Generation (mid 80’s to 1990’s) - “Grid” coined in 1989 - Objective: provide computational resources to a range of high performance apps - Example: FAFNER (Factoring via Network-Enabled Recursion) - Basic services such as distributed file systems, site-wide single sign-on - Gigabit test beds extended Grid distance

Grid Evolution • Second Generation (late 1990’s to now) - Condor, I-WAY (origin of Globus) and Legion - Heterogeneity - Scalability - Adaptability - Use of middleware to integrate applications - Few standards, no interoperability - Deployment requires significant customization

Grid Evolution (cont) • Third Generation (recent past and the present) - Global Grid Forum standards (1999) - OGSA published (June, 2002) - OGSI, Version 1.0, published (July, 2003) - Globus Toolkit 3 (GT3) available (June, 2003)

Grid Evolution (cont) – Administrative Hierarchy – Communication and Information Services – Naming Services – Distributed File Systems – Security and Authorization – System Status and Fault Tolerance – Resource Management and Scheduling

Popular Systems • Condor – Specialized workload management – Job queuing mechanism – Scheduling policy, priority scheme – Resource monitoring and management – Transparent job migration – Checkpointing
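A job queue with a priority scheme, as listed above, can be sketched in a few lines of Java. This is not Condor's actual scheduling algorithm or API; `Job` and `JobQueueSketch` are hypothetical names, and the policy shown (higher priority first, submission order among equals) is an assumption chosen for illustration.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical job record: a name, a priority, and a submission sequence number.
class Job {
    final String name;
    final int priority;
    final long submitOrder;

    Job(String name, int priority, long submitOrder) {
        this.name = name;
        this.priority = priority;
        this.submitOrder = submitOrder;
    }
}

// Hypothetical queue: higher priority runs first, ties run in submission order.
class JobQueueSketch {
    private final PriorityQueue<Job> queue = new PriorityQueue<>(11, new Comparator<Job>() {
        public int compare(Job a, Job b) {
            if (a.priority != b.priority) {
                return Integer.compare(b.priority, a.priority);
            }
            return Long.compare(a.submitOrder, b.submitOrder);
        }
    });
    private long counter = 0;

    void submit(String name, int priority) {
        queue.add(new Job(name, priority, counter++));
    }

    // Returns the name of the next job to run, or null when the queue is empty.
    String next() {
        Job job = queue.poll();
        return job == null ? null : job.name;
    }

    public static void main(String[] args) {
        JobQueueSketch q = new JobQueueSketch();
        q.submit("nightly-report", 1);
        q.submit("simulation", 5);
        System.out.println(q.next());
    }
}
```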

Popular Systems (cont) • Globus (GT3) – Uses a service-oriented approach – GridFTP – GRAM – GSI – Provides Services to execute code on authorized machines

The Global Grid Forum • GGF developed standard interfaces, behaviors, core semantics, etc. for grid applications based upon web services. • GGF introduced the term Grid Service as an extended web service that conforms to the GGF OGSI standard. Grid Computing, B. Wilkinson

Grid Services • Common interface specification supports the interoperability of discrete, independently developed services • Concept similar to Remote Procedure Call (RPC), Remote Method Invocation (RMI), only applied over HTTP • Based on extensions of Web Services
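To make the RPC/RMI comparison concrete, here is a minimal sketch in plain Java. It uses no Grid or Web Services library: `Counter`, `CounterImpl`, and `RpcSketch.stubFor` are hypothetical names, and the "wire" is just a string recorded in-process where a real system would send a message over HTTP.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical service interface shared by client and server.
interface Counter {
    int addTo(int amount);
}

// Hypothetical server-side implementation.
class CounterImpl implements Counter {
    private int total = 0;

    public int addTo(int amount) {
        total += amount;
        return total;
    }
}

class RpcSketch {
    // Records every "message" so the caller can see what would cross the wire.
    static final List<String> wireLog = new ArrayList<>();

    // Builds a client-side stub: each method call is turned into a text
    // request, logged, and then dispatched to the target object.
    static Counter stubFor(final Counter target) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                wireLog.add(method.getName() + Arrays.toString(args));
                return method.invoke(target, args);
            }
        };
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(),
                new Class<?>[] { Counter.class },
                handler);
    }

    public static void main(String[] args) {
        Counter remote = stubFor(new CounterImpl());
        System.out.println(remote.addTo(4));
        System.out.println(wireLog);
    }
}
```

The client only sees the `Counter` interface. Swapping the in-process dispatch for an HTTP request would not change the calling code, which is the property a common interface specification relies on.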

Web Services Architecture The Web Services Architecture is specified and standardized by the World Wide Web Consortium (W3C), the same organization responsible for XML, HTML, CSS, etc.

Web Services

GGF Standards • Open Grid Services Architecture (OGSA) – Defines standard mechanisms for creating, naming, and discovering Grid service instances. – Addresses architectural issues relating to interoperable Grid services. • Open Grid Services Infrastructure (OGSI) – Based upon the Grid Service specification and specifies the way clients interact with a grid service (service invocation, management data interface, security interface, ...). Grid Computing, B. Wilkinson

Grid and Web Services Convergence The definition of WSRF means that the Grid and Web services communities can move forward on a common base. SC04: www.globus.org

Differences between Web Services and Grid Services Grid services can be: – Stateful or Stateless – Transient or Non-Transient. Web services are usually thought of as non-transient and stateless.

Web Services missing features • At the time the OGSI V1.0 spec was published there was a gap between the need to define stateful Web Services and what was provided by the latest version of Web Services in WSDL 1.1 – Web Services were stateless and non-transient • The result was the definition in OGSI of Service Data – a common mechanism to expose a service instance’s state data for query, update, and change notification • Also, Grid Services use a Factory to manage instances – to allow transient and private instances

Grid Services Factory
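The Service Data and Factory ideas can be sketched in plain Java without any toolkit. The names below (`ServiceDataListener`, `MathInstance`, `MathFactory`) are hypothetical and are not the OGSI or GT3 API; the sketch only shows the shape of query, update, change notification, and per-client transient instances.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener for change notification on service data.
interface ServiceDataListener {
    void changed(String name, Object newValue);
}

// Hypothetical stateful service instance exposing named service data.
class MathInstance {
    private final Map<String, Object> serviceData = new HashMap<>();
    private final List<ServiceDataListener> listeners = new ArrayList<>();

    MathInstance() {
        serviceData.put("value", 0);
    }

    void subscribe(ServiceDataListener listener) {
        listeners.add(listener);
    }

    // Query: read one service data element by name.
    Object query(String name) {
        return serviceData.get(name);
    }

    // Update: state changes only through a service operation,
    // and every subscriber is notified of the new value.
    void add(int a) {
        int updated = (Integer) serviceData.get("value") + a;
        serviceData.put("value", updated);
        for (ServiceDataListener l : listeners) {
            l.changed("value", updated);
        }
    }
}

// Hypothetical factory: creates private instances by name and can destroy them.
class MathFactory {
    private final Map<String, MathInstance> instances = new HashMap<>();

    MathInstance create(String name) {
        MathInstance instance = new MathInstance();
        instances.put(name, instance);
        return instance;
    }

    MathInstance lookup(String name) {
        return instances.get(name);
    }

    // Being able to destroy an instance is what makes it transient.
    void destroy(String name) {
        instances.remove(name);
    }
}

class FactorySketch {
    public static void main(String[] args) {
        MathFactory factory = new MathFactory();
        MathInstance mine = factory.create("myprog");
        MathInstance other = factory.create("other");
        mine.subscribe((name, v) -> System.out.println(name + " changed to " + v));
        mine.add(4);
        System.out.println(mine.query("value") + " " + other.query("value"));
        factory.destroy("myprog");
    }
}
```

Each instance keeps its own state, so an update to one is invisible to the other; that separation is what "private instances" refers to.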

Grid Services • The declared state of a service is accessed only through service operations that are defined as a part of the service interface (For those who know JavaBeans, Service Data is similar to JavaBean properties) • I will show an example using GT3. Since GT3 uses Java, the whole example is in Java.

Grid Services Example Using GT3 Step 1: Define the Service interface using Java

    public interface Math {
        public void add(int a);
        public void subtract(int a);
        public int getValue();
    }

In this example there is a value and it can be modified via add or subtract, and can be accessed via getValue. GT3 provides tools for converting the Java to WSDL

Step 2: Implement the Service

    import java.rmi.RemoteException;

    // GridServiceImpl is the GT3 base class; MathPortType is the
    // port type interface produced by the GT3 tooling.
    public class MathImpl extends GridServiceImpl implements MathPortType {
        // The state held by this service instance
        private int value = 0;

        public MathImpl() {
            super("Math Factory Service");
        }

        public void add(int a) throws RemoteException {
            value = value + a;
        }

        public void subtract(int a) throws RemoteException {
            value = value - a;
        }

        public int getValue() throws RemoteException {
            return value;
        }
    }

Step 3: Write the Deployment Descriptor using Web Service Deployment Descriptor (WSDD) format

(Continued)  <parameter name= (Continued)

Step 4: Compile and deploy the Service using ant

    [aapon@kite tutorial]$ ./tutorial_build.sh gt3tutorial/core/factory/impl/Math.java

You can see gar and jar files that ant creates from the source files.

    [aapon@kite] newgrp globus
    [aapon@kite] cd $GLOBUS_LOCATION
    [aapon@kite] ant deploy -Dgar.name=/home/aapon/tutorial/build/lib/gt3tutorial.core.factory.Math.gar

Step 5: Write and compile the client

    import java.net.URL;

    public class MathClient {
        public static void main(String[] args) {
            try {
                // Get command-line arguments
                URL GSH = new java.net.URL(args[0]);
                int a = Integer.parseInt(args[1]);

                // Get a reference to the MathService instance
                MathServiceGridLocator myServiceLocator = new MathServiceGridLocator();
                MathPortType myprog = myServiceLocator.getMathService(GSH);

                // Call remote method 'add'
                myprog.add(a);
                System.out.println("Added " + a);

                // Get current value through remote method 'getValue'
                int value = myprog.getValue();
                System.out.println("Current value: " + value);
            } catch (Exception e) … }

Step 6: Start the Service and execute the client

Start the Service:

    [aapon@kite] globus-start-container -p 8081

Create the service instance: This client does not create a new instance when it runs; thus, the instance needs to be created the first time.

    [aapon@kite] ogsi-create-service http://localhost:8081/ogsa/services/tutorial/core/factory/MathFactoryService myprog

This ogsi-create-service has two arguments: the service handle GSH and the name of the instance we want to create.

Execute the client:

    [aapon@kite tutorial] java gt3tutorial.core.factory.client.MathClient http://localhost:8081/ogsa/services/tutorial/core/factory/MathFactoryService/myprog 4

You will see the following result:

    Added 4
    Current value: 4

Problems with GT3 and OGSI • I didn’t tell you the whole story – there are a lot of environment variables, and a lot of setup is required! • You have to be very proficient at Java to use GT3. • Not only that, it is quite slow. • Oops, OGSI is not completely interoperable with Web Services!

Changes to Grid Standards • Introduction of Web Services Resource Framework (WSRF), January, 2004 – Web services vendors recognized the importance of OGSI concept but would not adopt OGSI as it was defined (Summer 2003) – Globus Alliance teamed up with Web services architects and came up with WSRF – Add the ability to create, address, inspect, discover, and manage stateful resources

WSRF changes the terms slightly: WS-Resource (instead of Grid service). The concepts are the same: • Grid service has an identity, service data, and a lifetime management mechanism • WS-Resource has a name, resource properties, and a lifetime management mechanism So, the GT3 tutorial is still relevant!
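The three parts of a WS-Resource named above (a name, resource properties, and lifetime management) can be modelled in a small Java class. This is a hypothetical sketch, not the WSRF or Globus API: `ResourceSketch` and its methods are invented for illustration, and lifetime is modelled as a termination time that a client can push forward.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a WS-Resource: a name, resource properties,
// and a termination time used for lifetime management.
class ResourceSketch {
    private final String name;
    private final Map<String, String> properties = new HashMap<>();
    private Instant terminationTime;

    ResourceSketch(String name, Instant terminationTime) {
        this.name = name;
        this.terminationTime = terminationTime;
    }

    String getName() {
        return name;
    }

    void setProperty(String key, String value) {
        properties.put(key, value);
    }

    String getProperty(String key) {
        return properties.get(key);
    }

    // A client keeps the resource alive by moving the termination time later.
    void extendLifetime(Duration extra) {
        terminationTime = terminationTime.plus(extra);
    }

    // The resource counts as expired once "now" reaches the termination time.
    boolean isExpired(Instant now) {
        return !now.isBefore(terminationTime);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2005-02-10T12:00:00Z");
        ResourceSketch r = new ResourceSketch("myprog", start.plusSeconds(60));
        r.setProperty("value", "4");
        r.extendLifetime(Duration.ofSeconds(120));
        System.out.println(r.getName() + " expired after 60 s? " + r.isExpired(start.plusSeconds(60)));
    }
}
```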

WS-Resource Guaranteed to have these four characteristics (the ACID properties): • Atomicity - Stateful resource updates within a transactional unit are made in an all-or-nothing fashion. • Consistency - Stateful resources should always be in a consistent state even after failures. • Isolation - Updates to stateful resources should be isolated within a given transactional work unit. • Durability - Provides for the permanence of stateful resource updates made under the transactional unit of work.
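The all-or-nothing behaviour described under Atomicity can be illustrated with a copy-then-swap update in Java. This is a sketch under assumptions, not how any WSRF implementation works: `AtomicUpdateSketch` is a hypothetical class, the "no negative values" rule is an invented consistency check, and isolation and durability are not addressed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resource state with an all-or-nothing update operation.
class AtomicUpdateSketch {
    private Map<String, Integer> state = new HashMap<>();

    // Applies every change or none: work on a copy, validate each entry,
    // and replace the live state only if the whole batch is valid.
    boolean update(Map<String, Integer> changes) {
        Map<String, Integer> draft = new HashMap<>(state);
        for (Map.Entry<String, Integer> e : changes.entrySet()) {
            if (e.getValue() < 0) {
                return false; // invented consistency rule: reject negative values
            }
            draft.put(e.getKey(), e.getValue());
        }
        state = draft;
        return true;
    }

    Integer get(String key) {
        return state.get(key);
    }

    public static void main(String[] args) {
        AtomicUpdateSketch resource = new AtomicUpdateSketch();
        Map<String, Integer> batch = new HashMap<>();
        batch.put("a", 1);
        batch.put("b", 2);
        System.out.println(resource.update(batch));
    }
}
```

A rejected batch leaves the previous state untouched because the live map is never modified in place.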

Planned Components in GT 4.0 SC04: www.globus.org

Distributed computing is complex • There are many advantages to working within a standard framework – Single sign-on – Remote deployment of executables – Computation management, data movement – Benefits of working with an international community of developers and users – A framework enables the definition of higher-level services

UofA Grid Computing Possibilities • Acxiom work: Self-Regulation of the Acxiom Grid Environment • Computational chemistry: exploit 10,000 computers to screen 100,000 compounds in an hour • DNA computational scientists visualize, annotate, & analyze terabyte simulation datasets • Environmental scientists share volcanic activity sensing data that has been collected from a widely dispersed sensor grid
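The computational chemistry figure implies a per-compound time budget, which a few lines of Java can work out. The sketch assumes the compounds are split evenly and ignores scheduling and data-transfer overhead; `ScreeningBudget` and `secondsPerCompound` are hypothetical names.

```java
// Hypothetical helper for the time budget implied by the screening example.
class ScreeningBudget {
    // Seconds each machine may spend on one compound, assuming an even
    // split of compounds across machines and no overhead.
    static double secondsPerCompound(int compounds, int machines, double wallClockSeconds) {
        double compoundsPerMachine = (double) compounds / machines;
        return wallClockSeconds / compoundsPerMachine;
    }

    public static void main(String[] args) {
        // 100,000 compounds on 10,000 machines in one hour (3600 s)
        System.out.println(secondsPerCompound(100_000, 10_000, 3600.0)); // prints 360.0
    }
}
```

Under these assumptions each machine screens 10 compounds, so it has about 6 minutes per compound.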

UofA “Grid” for Sharing Digital Map Data • GeoStor digital map data delivery system • http://www.cast.uark.edu/cast/geostor/ • Contains all publicly available geographic data for the state of Arkansas • Oracle database is used for access to metadata and some maps

UofA “Grid” for Sharing Digital Map Data • GeoSurf • A Java based product • User queries and downloads data from GeoStor • User specifies geographic clip boundaries, projection, data format • Could be a Grid service

Red Diamond • 128-node (256 CPUs) Cluster • Funded by NSF Major Research Instrumentation (MRI) • 3.2 GHz Xeon 64 processors, each with 4 GB memory, 72 GB hard drives • High-performance InfiniBand system area network • 10 Terabytes of external storage • 1 Teraflop/s (more than 1 trillion floating point operations every second) • Justification included research with Acxiom • http://archie.csce.uark.edu/

Research Areas: • Initial Partitioning • Dynamic Re-partitioning • Scalability • Load Balancing • High Throughput and Overall Performance • Failover

Questions