
e18b72b31081cf041047be92ecb111b5.ppt
- Number of slides: 21
DGL: The Assembly Language for Grid Computing
Arun Swaran Jagatheesan (arun@sdsc.edu)
San Diego Supercomputer Center (SDSC), University of California, San Diego
GriPhyN All Hands Meeting, May 17, 2004, University of Chicago
National Partnership for Advanced Computational Infrastructure
University of Florida
San Diego Supercomputer Center

Acknowledgement
• Participants
  • Jonathan Weinberg
  • Allen Ding
  • Dipti Borkar
  • Erik Vandekieft
  • Reena Mathew
  • Marcio Faerman (SCEC)
  • Lucas Gilbert (BIRN)
  • Also an outsourced resource from the Gators' Physics department (thanks to Paul Avery for this important resource)
• Well-wishers
  • Reagan Moore and the SDSC SRB Team
  • Kim Baldridge
  • You!
Grid Physics Network (GriPhyN), University of Florida, San Diego Supercomputer Center

Talk Outline
• Problem: Gridflow Description and Querying
• Gridflow Description
• Gridflow Language Requirements
• Options
• Path we took
• Our success
• Our future
• Summary

SRB Data Grid Management Systems
• Southern California Earthquake Center
• NASA Data Grids
• NIH Biomedical Informatics Research Network
• National Science Digital Library
• Scripps Institution of Oceanography
This work is generic and not restricted to SRB alone.

Gridflow in SCEC (data information pipeline)
• Metadata derivation
• Ingest data
• Ingest metadata
• Determine analysis pipeline
• Initiate automated analysis
• Use the optimal set of resources based on the task, on demand
• Organize result data into distributed data grid collections
The pipeline could be triggered by input at the data source or by a data request from a user.
All gridflow activities are stored for data flow provenance.

Data Discovery
• New data updates relationships among data in collections
• Services are invoked to analyze new relationships
• DGMS applications get notified of state updates
(Diagram labels: Digital entities, Meta-data, Services, State)

What do they want?
• "We know the business (scientific) process."
• "Cyberinfrastructure is all we care about (why bother about colliding atoms?)."

What do they want?
• Use DGL to describe your process logic with abstract references to datagrid infrastructure dependencies
• Describe resource, site, VO or grid policy dependencies independently (UPL, CVF??)

Gridflows
• A Grid Workflow (Gridflow) is the automation of an execution pipeline in which data or tasks are processed through multiple autonomous grid resources according to a set of procedural rules
• Gridflows are executed on resources that are dynamically obtained through the confluence of one or more autonomous administrative domains (peers)

Gridflow Language and CS Domains
• Compiler Design
  • Variable scope definition, recursive grammar, execution stack management
• Data Modeling
  • Schema definitions for gridflow patterns
• Grid Computing
  • Data Grid data types, Virtual Organization, basic operations, …
• Other concepts and standards
  • Rules, W3C XQuery, GGF JSDL?

Gridflow Language Requirements
• High-level abstract descriptions
  • Abstract description of cyberinfrastructure dependencies
• Simple yet flexible
  • Flexible enough to describe complex requirements (no brute force)
• Gridflow dependency patterns
  • Based on execution structure and data semantics
  • (Parallel, sequential, fork-new), (milestones, for-each, switch-case), …
• Asynchronous execution
  • For long-running requests
• Querying using an existing standard
  • XQuery

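The dependency patterns listed above (sequential, parallel, for-each, switch-case) can be illustrated with a minimal Python sketch. This is not DGL syntax and not taken from the talk; the names `Task`, `sequential`, `parallel`, `for_each`, `switch_case` and `ingest` are hypothetical and only show how such patterns differ in execution structure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

# Hypothetical: a task is any zero-argument callable returning a result string.
Task = Callable[[], str]


def sequential(tasks: List[Task]) -> List[str]:
    """Run tasks one after another, in list order."""
    return [task() for task in tasks]


def parallel(tasks: List[Task]) -> List[str]:
    """Run tasks concurrently; results are returned in list order."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def for_each(items: Iterable[str], make_task: Callable[[str], Task]) -> List[str]:
    """Build one task per item and run the tasks in order."""
    return sequential([make_task(item) for item in items])


def switch_case(key: str, cases: Dict[str, Task], default: Task) -> str:
    """Run the task registered for `key`, or the default task."""
    return cases.get(key, default)()


def ingest(name: str) -> Task:
    """Hypothetical stand-in for a data grid operation."""
    return lambda: f"ingested {name}"


results = for_each(["a.dat", "b.dat"], ingest)
```

A real gridflow engine would dispatch these tasks to remote grid resources instead of local threads; the sketch only contrasts the control structures.
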
Gridflow Language Requirements (continued)
• Process meta-data and annotations
  • Runtime definition, update and querying of meta-data
• Runtime management of gridflows
  • Stop a gridflow at run time
• Partitioning
  • Facility in the language to divide a gridflow request into multiple requests
• Import descriptions
  • Refer to other gridflows in execution

Data Grid Language (DGL)
• DGL is just a language specification
  • Can be used in any commercial or academic data grid software
• DGL describes gridflows and their dependencies

Gridflow Process I
End user using DGBuilder -> Gridflow description (Data Grid Language)

Gridflow Process II
Abstract gridflow (Data Grid Language) -> Planner -> Concrete gridflow (Data Grid Language)

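The planner step, turning an abstract gridflow into a concrete one, might look roughly like the following Python sketch. It assumes a planner that simply looks up each abstract resource name in a catalog; the `Step` class, the `plan` function and the `example.org` host names are hypothetical, and a real planner would also weigh policies, load and data location.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Step:
    operation: str  # e.g. "ingest", "execute"
    resource: str   # abstract (logical) or concrete resource name


def plan(abstract_steps: List[Step], catalog: Dict[str, str]) -> List[Step]:
    """Bind each abstract resource name to a concrete one from the catalog."""
    concrete_steps = []
    for step in abstract_steps:
        if step.resource not in catalog:
            raise KeyError(f"no concrete resource for {step.resource!r}")
        # Copy the step, changing only the resource binding.
        concrete_steps.append(replace(step, resource=catalog[step.resource]))
    return concrete_steps


# Hypothetical catalog mapping logical names to concrete hosts.
catalog = {
    "any-archive": "archive.example.org",
    "any-compute": "cluster.example.org",
}
abstract = [Step("ingest", "any-archive"), Step("execute", "any-compute")]
concrete = plan(abstract, catalog)
```

Because the abstract steps are left unchanged, the same abstract gridflow can be planned again against a different catalog.
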
Gridflow Process III
• Concrete gridflow (Data Grid Language)
• Gridflow Processor
• Gridflow P2P network

DGL - Hypothetical Picture
• SRB operation
• GridFTP operation
• Condor execution DAG
• TeraGrid scheduler
• …
DGL compiler node? Capone? (at run time: late binding)

DGL Structure (data model)
• Flow
• Flow logic
  • Pre-process
  • Structure: parallel, sequential, etc.
  • Post-process
  • ECA rule based definitions
• Runnable: recursive definition of runnables as either a data operation or an executable process (job)
• Meta-data

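The recursive nature of this data model can be sketched in Python. The classes below are an illustrative assumption, not the DGL schema: `Flow`, `Step` and every field name are hypothetical, chosen only to mirror the slide (a flow has a structure, optional pre- and post-process rules, meta-data, and children that are either steps or further flows).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Step:
    """Leaf runnable: a data operation or an executable process (job)."""
    name: str
    operation: str


@dataclass
class Flow:
    """Composite runnable: its children are steps or nested flows."""
    name: str
    structure: str = "sequential"          # e.g. "sequential" or "parallel"
    children: List[Union["Flow", Step]] = field(default_factory=list)
    pre_process: Optional[str] = None      # hypothetical rule text
    post_process: Optional[str] = None     # hypothetical rule text
    metadata: Dict[str, str] = field(default_factory=dict)


def list_operations(node: Union[Flow, Step]) -> List[str]:
    """Collect the operations of all steps, depth first, in declared order."""
    if isinstance(node, Step):
        return [node.operation]
    operations: List[str] = []
    for child in node.children:
        operations.extend(list_operations(child))
    return operations


def depth(node: Union[Flow, Step]) -> int:
    """Nesting depth: 0 for a step, 1 + deepest child for a flow."""
    if isinstance(node, Step):
        return 0
    return 1 + max((depth(child) for child in node.children), default=0)


pipeline = Flow(
    name="pipeline",
    children=[
        Step("ingest-data", "ingest"),
        Flow(
            name="analysis",
            structure="parallel",
            children=[Step("derive", "execute"), Step("copy", "replicate")],
        ),
    ],
    metadata={"owner": "demo"},
)
```

Here `list_operations(pipeline)` walks the nested flows and yields the three step operations in order, which is the kind of traversal a gridflow processor would need before dispatching work.
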
Operations in DGL
• Execute process (DAG, Java, WSDL, etc.)
• Very generic datagrid operations
  • Copy directories/files
  • Change permissions (chmod)
  • Create directory/file/archive
  • Delete directory/file/archive
  • Ingest/download a URL or any data source
  • Replicate, Rename, List
  • SeekNWrite, SeekNRead
  • Ingest, query any type of metadata

Components of DGL
• A DGL document is either a request or a response
• Data Grid Request
  • Could be a Flow (aggregation of operations)
  • Or could be a Status Query
• Data Grid Response
  • Could be a Flow Acknowledgement
  • Or could be a Status Response
• Can be made synchronous or asynchronous
  • Flexibility for any type of implementation

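The request and response pairing above suggests an asynchronous exchange: submit a flow, receive an acknowledgement, then poll with status queries. The Python sketch below illustrates that shape under stated assumptions; `FlowRequest`, `StatusQuery`, `FlowAck`, `StatusResponse`, `ToyServer` and the state strings are all hypothetical and do not reproduce the DGL document schema.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class FlowRequest:
    operations: List[str]   # aggregation of operations


@dataclass
class StatusQuery:
    request_id: str


@dataclass
class FlowAck:
    request_id: str


@dataclass
class StatusResponse:
    request_id: str
    state: str


class ToyServer:
    """In-memory stand-in for a data grid server (hypothetical)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._states: Dict[str, str] = {}

    def handle(
        self, request: Union[FlowRequest, StatusQuery]
    ) -> Union[FlowAck, StatusResponse]:
        if isinstance(request, FlowRequest):
            # Asynchronous style: acknowledge immediately, run later.
            request_id = f"req-{next(self._ids)}"
            self._states[request_id] = "submitted"
            return FlowAck(request_id)
        if isinstance(request, StatusQuery):
            state = self._states.get(request.request_id, "unknown")
            return StatusResponse(request.request_id, state)
        raise TypeError(f"unsupported request: {type(request).__name__}")


server = ToyServer()
ack = server.handle(FlowRequest(["copy", "replicate"]))
status = server.handle(StatusQuery(ack.request_id))
```

A synchronous variant would run the flow inside `handle` and return the final state directly instead of an acknowledgement.
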
Summary
• A standard description language is needed
• Requirements of the language
• Data Grid Language (DGL)
  • Recursive definition of flows and steps
  • Metadata or variable scopes
  • Rules
  • Can be partitioned (sub-divided)
• Components of Data Grid Language
• Next step: talk to scheduling or heuristics people