- Number of slides: 23
Schematic Description of Grid Exceptions Arun Jagatheesan San Diego Supercomputer Center (SDSC) & High Energy Physics Group, University of Florida National Partnership for Advanced Computational Infrastructure University of Florida San Diego Supercomputer Center
Outline
• Why describe?
• How to describe?
• What to describe?
• Who / MIB (Men Involved Behind the scene)?
• RFS (Request for Suggestions)
• NOTE: Still in learning phase…
Data Grid Management System (DGMS)
• Before we start, why we were interested: data and knowledge management in grids
• Data Grid Management System as a set of services
• Credit (DGMS):
  • Reagan Moore (SDSC) – Knowledge Grids
  • Jim Gray (Microsoft) – VLDB suggestions
  • Arcot Rajasekar (SDSC) – Data Grids
  • You (TIA for your suggestions in the workshop)
• A question session will follow. Take notes.
What: Data Grid and DGMS
• Data Grid: a logical view of a collection of heterogeneous data spread across one or more virtual organizations, providing transparent access irrespective of data location, storage media, storage format and data identifier (name)
• DGMS: a system that manages the relationships between data and the events associated with the data grid workflow, to help automate data and knowledge discovery
Descriptions Required
• DGMS required Grid Data Flow: distributed operations on the grid, specified using XML documents
Also required …
• Description of exceptions in inter-grid and intra-grid environments
• Description of handling cases, which might change dynamically
Why descriptions?
What's ahead for the Grid
• "Grid" as it was meant to be. Grid in a Cell?
Exception Handling in Grid
[Diagram: Service Requestor, Grid / Executor, Service Providers. Request: "Store file1.xyz in data grid"]
Exception Handling in Grid – User
[Diagram: Service Requestor, Grid / Executor, Service Providers. User-specified handling: "Just send me an e-mail of failure"]
Exception Handling in Grid – Provider
[Diagram: Service Requestor, Grid / Executor, Service Providers. Provider-specified handling: "Archival System: Maintenance required for tape robot or try after 1 working day"]
Exception Handling in Grid – System
[Diagram: Service Requestor, Grid / Executor, Service Providers. System-specified handling: "Mission Critical Grid: Need to find another"]
Exception Handling
• Grid User Specified
• Service Provider Specified
• Grid/System Specified
2 Questions:
• How do we dynamically specify the different exception handling associated with the same exception?
• Who does the real exception handling?
How to describe?
Customized Exception Handling
Try (store file1.xyz in data grid) catch any exception and handle
• Using Arun_Out_of_Office_Handler (User)
• Using Vegas_HPSS_Handler (Service Provider)
• Using GriPhyN_Level1_Handler (System)
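
The try/catch idea on this slide could be sketched as follows. This is only an illustration, not the DGMS implementation: `GridException`, `run_with_handlers` and `store_file` are hypothetical names, and the action each handler returns is assumed from the user, provider and system slides earlier in the deck.

```python
from typing import Callable, List


class GridException(Exception):
    """Hypothetical base class for a failure raised by a grid operation."""


# A handler inspects the exception and returns a description of its action.
Handler = Callable[[GridException], str]


def arun_out_of_office_handler(exc: GridException) -> str:
    # User-specified handling: just report the failure by e-mail.
    return f"user: e-mail failure report ({exc})"


def vegas_hpss_handler(exc: GridException) -> str:
    # Provider-specified handling: the archival system asks for a later retry.
    return "provider: retry after 1 working day"


def griphyn_level1_handler(exc: GridException) -> str:
    # System-specified handling (assumed to mean another service provider).
    return "system: find another service provider"


def run_with_handlers(operation: Callable[[], None],
                      handlers: List[Handler]) -> List[str]:
    """Run the operation; on failure, let every registered handler respond."""
    try:
        operation()
    except GridException as exc:
        return [handler(exc) for handler in handlers]
    return []  # nothing to handle when the operation succeeds


def store_file() -> None:
    # Stand-in for "store file1.xyz in data grid"; it always fails here.
    raise GridException("tape robot under maintenance")


actions = run_with_handlers(
    store_file,
    [arun_out_of_office_handler, vegas_hpss_handler, griphyn_level1_handler],
)
for action in actions:
    print(action)
```

Because the handlers are passed in as a plain list, the set of handlers attached to the same exception can be swapped at run time without touching the operation itself.
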
Exception Name spaces
xmlns:condor="http://www.cs.wisc.edu/condor/exceptions"
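
As a rough illustration of how such a name space lets a handler tell whose exception it is looking at, the sketch below parses a small exception document with Python's standard `xml.etree.ElementTree`. Only the namespace URI comes from the slide; the `exception` wrapper and the `JobHeld` element are made-up names for the example.

```python
import xml.etree.ElementTree as ET

CONDOR_NS = "http://www.cs.wisc.edu/condor/exceptions"  # URI from the slide

# Hypothetical exception document; the element names are assumptions.
DOC = f"""<exception xmlns:condor="{CONDOR_NS}">
  <condor:JobHeld>job placed on hold</condor:JobHeld>
</exception>"""

root = ET.fromstring(DOC)
child = root[0]

# ElementTree expands a prefixed tag to the form "{namespace-uri}local-name".
namespace, local_name = child.tag[1:].split("}")

print(namespace)
print(local_name)
```
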
What to describe?
The Grid Scenario Again…
[Diagram: Service Requestor, Grid / Executor, Service Providers (go.org, gator.edu). Request: "Do this on the grid as mentioned in this VDL document"]
The Grid Scenario Again…
[Diagram: Service Requestor, Grid / Executor, Service Providers (go.org, gator.edu). Service Requestor: "Now where the hell did the grid service crash?" Grid / Executor: "If I knew it, I could handle it appropriately"]
So what we need…
• Name spaces describing the taxonomies of error handling with respect to the service provider, the type of the error (generic/specific), and the policies in the grid
• A description of what exactly happened. A bare message such as "Something bad happened during the service invocation" is not enough to recover from the failure.
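
To make the contrast with a bare error message concrete, here is a minimal sketch of a structured exception description and a recovery decision based on it. Every name in it (`ExceptionDescription`, its fields, `choose_recovery`, the `TapeRobotMaintenance` error type, the `urn:example:` namespace) is hypothetical; the recovery actions are borrowed from the earlier exception-handling slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionDescription:
    namespace: str    # taxonomy the error belongs to
    provider: str     # service provider that reported the error
    error_type: str   # error name within the namespace
    generic: bool     # True for a generic error, False for a provider-specific one
    detail: str       # what exactly happened


def choose_recovery(desc: ExceptionDescription) -> str:
    """Pick a recovery action from the structured fields (illustrative policy)."""
    if desc.generic:
        # Nothing specific is known, so all we can do is report the failure.
        return "notify user"
    if desc.error_type == "TapeRobotMaintenance":
        return "retry after 1 working day"
    return "find another service provider"


vague = ExceptionDescription(
    namespace="urn:example:generic",
    provider="unknown",
    error_type="Unknown",
    generic=True,
    detail="Something bad happened during the service invocation",
)
precise = ExceptionDescription(
    namespace="urn:example:archive",
    provider="gator.edu",
    error_type="TapeRobotMaintenance",
    generic=False,
    detail="Maintenance required for tape robot",
)

print(choose_recovery(vague))
print(choose_recovery(precise))
```
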
Sum up
• VDL/DAG submission (for each operation on the grid)
• The user application has its own customized error handler based on the name spaces / categories of error types
• The service provider has its own error handler based on the name spaces / categories of its own domain, which is shared with the grid
• The grid (system) has its own error handler based on the grid policies and efficiency concerns
All these error handlers could change dynamically.
Advantages
• Structured handling based on profiles
• Handled by the respective providers
• Profiles can be dynamically changed
• Suitable for inter-grid and intra-grid scenarios
Disadvantages
• Extra processing
• Definition and categorization of grid errors required
• New mechanisms to parse and handle these exception documents (probably in XML) are required
Summary
• Grid exceptions involve the Service Requestor, the Service Provider and the Grid Policy, which change dynamically
• A generic classification of grid errors, which could be extended later, is required
• Error types and error handling descriptions based on the service provider are required to handle them more efficiently


