

GriPhyN SDSC Research and Infrastructure
Reagan Moore
San Diego Supercomputer Center
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
Particle Physics Data Grid
SAN DIEGO SUPERCOMPUTER CENTER

Topics
• Research activities
  • Advanced query interfaces - Amarnath Gupta
  • Knowledge bases - Bertram Ludaescher
• Infrastructure development
  • SRB replication - Michael Wan
  • MCAT information catalog - Arcot Rajasekar
  • Grid Portals - Mary Thomas
  • WSDL web services - Arun Jagatheesan
• Grids

LIGO Support Opportunities
• Pattern recognition in template and chirp-transform data using database technology
• Derived data product optimization through optimization of input parameters - controlled parameter sweeps
• Utilization of SRB/MCAT for storage of virtual data products

SDSS Support Opportunities
• Federation of sky survey services
• Development of a dynamic cross-match service between SDSS and other sky surveys
• WSDL based web interface for sky survey services
• UDDI based service directory
• Build topic map providing relationships between “Strasbourg sky survey” attributes
• Correlate attributes through physical laws as well as derived observations

Integration of XSIL and XQuery
• Quilt: an XML query language designed for heterogeneous data sources
• Authors: Don Chamberlin (IBM), Jonathan Robie (Software AG), and Daniela Florescu (INRIA)
• Quilt is built on previous XML query languages: XPath, XQL, XML-QL, XMAS, Lorel, YATL
• To become a standard query language for XML, called XQuery
• Example: “List the titles of all books published by Addison Wesley after 1991, in alphabetic order.”
    FOR $b IN document("www.bn.com/bib.xml")//book[publisher = "Addison Wesley" AND @year > "1991"]
    RETURN $b/@year, $b/title SORTBY (title)
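As a rough illustration of what the example query asks for, the Python sketch below applies the same filter, projection, and sort to a small in-memory bibliography. The sample XML, the function name `books_after`, and the use of `xml.etree.ElementTree` are assumptions made for illustration; this is not a Quilt or XQuery engine, and the year is compared numerically here rather than as a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for bib.xml (not taken from the slides).
SAMPLE_BIB = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><publisher>Addison Wesley</publisher></book>
  <book year="1992"><title>Advanced Programming in the Unix Environment</title><publisher>Addison Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title><publisher>Morgan Kaufmann</publisher></book>
  <book year="1990"><title>An Older Book</title><publisher>Addison Wesley</publisher></book>
</bib>
"""


def books_after(xml_text, publisher, year):
    """Return (year, title) pairs for matching books, sorted by title."""
    root = ET.fromstring(xml_text)
    hits = []
    for book in root.iter("book"):
        same_publisher = book.findtext("publisher") == publisher
        newer = int(book.get("year")) > year
        if same_publisher and newer:
            hits.append((book.get("year"), book.findtext("title")))
    # SORTBY (title) in the query corresponds to sorting on the title field.
    return sorted(hits, key=lambda pair: pair[1])


results = books_after(SAMPLE_BIB, "Addison Wesley", 1991)
for book_year, title in results:
    print(book_year, title)
```

The FOR clause maps to the loop over `book` elements, the bracketed predicate to the two boolean tests, and RETURN to the tuple that is collected for each match.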

Extensible Scientific Interchange Language (XSIL)
• A flexible, XML based, hierarchical, extensible, transport language for scientific data objects
[XSIL example; markup lost in extraction. Surviving values: 10; 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9; Hello Auntie Joan; 96]

Quilt Extensions
• Added the concept of data types
  • Float, integer, and boolean versus string
• Added operator overloading
  • “Sum” on type string concatenates
  • “Sum” on type integer adds
• Added array operations
  • Get, set, element summation, array summation, subsequence, concatenate

Data Grids Linking Collections
[Diagram; labels:]
• Logical collection: elements, attributes
• Available transforms
• Derived data process metadata
• Derived data metadata
• Export elements & attributes
• Transforms on elements
• Derived data products
• Grid container: logical name, container metadata, element attributes, (data model), elements
• Mapping of logical containers to physical files
• Import into existing or new logical collection
• Grid metadata catalog
• Grid replica catalog

SRB Status
• SRB features
  • Demonstration of the ability to coordinate bulk metadata and bulk data loads
  • Aggregate files into a “container”, simultaneously write metadata into a file for bulk load into the MCAT information repository
  • Achieved file import rate of 250 files/second
• Development in progress
  • Improved error statement management
  • mySRB.html web interface for collection support
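The bulk-load idea on this slide (pack many files into one container while writing their metadata to a single file for a later catalog load) can be sketched as follows. This is only an illustration under stated assumptions: a tar archive stands in for the SRB container, JSON lines stand in for the MCAT bulk-load file, and `build_container` is a hypothetical helper, not an SRB call. The attribute names `DATA_NAME` and `SIZE` are borrowed from the Replication Attributes slide.

```python
import io
import json
import tarfile


def build_container(files):
    """Pack files into one archive and emit one metadata record per file.

    files: dict mapping a logical file name to its bytes.
    Returns (container_bytes, metadata_text).
    """
    buf = io.BytesIO()
    records = []
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
            # The metadata record is written in the same pass as the data.
            records.append({"DATA_NAME": name, "SIZE": len(data)})
    # One JSON object per line: a stand-in for a bulk-load file.
    metadata_text = "\n".join(json.dumps(r) for r in records)
    return buf.getvalue(), metadata_text


container, metadata = build_container({"b.dat": b"hello", "a.dat": b"abc"})
print(metadata)
```

Writing one archive and one metadata file replaces many small per-file operations with two large ones, which is the kind of batching the slide credits for the import rate.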

MCAT Web Interface
• Provide collection management
  • Create a collection
  • Define collection attributes
  • Ingest data / move / replicate
  • Browse
  • Query
  • Annotate
  • Comment
• https://srb.npaci.edu/mySRB.html

Grid Portal Development
• Integrate collection management of derived data products with Grid execution portal
• Based on GridPort and SRB
• Funded by GriPhyN, NPACI, NASA IPG

GridPort + SRB Architecture
• With SRB capabilities, file access is direct, uniform
• Uses same authentication as portal and other Grid services
• Single SRB account access allows for more flexible data management

Other Data Grids
• NSF - National Virtual Observatory
• DOE - Particle Physics Data Grid - Babar
• NSF - United Kingdom data grid
• NSF - Distributed Terascale Facility

Astronomy Sky Survey Data Grid
[Layered architecture diagram; labels:]
1. Portals and Workbenches
2. Knowledge & Resource Management (concept space)
3. Metadata View and Data View (bulk data analysis, catalog analysis; standard APIs and protocols)
4. Grid: security, caching, replication, backup, scheduling
5. Discovery and delivery of information, metadata, and data (standard metadata format, data model, wire format)
6. Catalog mediator and data mediator (catalog/image specific access)
7. Compute resources, derived collections, catalogs, data archives

PPDG - Babar Support
• Installed SRB at Stanford
• Added Babar specific metadata attributes to MCAT catalog
• Developed ability to support “soft links” between collections
  • Allows same file to appear in multiple collections
  • Release in SRB version 1.1.9
• UK data grid (SRB / Condor / Globus)
  • Rutherford - opportunity for international demonstration of Babar data replication
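The “soft link” behavior described on this slide, where the same file appears in more than one collection, can be pictured with a toy catalog in which collections map logical names to a shared object identifier. The `Catalog` class, its method names, and the sample paths are all hypothetical and are not the SRB or MCAT interface.

```python
class Catalog:
    """Toy catalog: collections hold names that point at shared objects."""

    def __init__(self):
        self.objects = {}      # object id -> physical path
        self.collections = {}  # collection name -> {logical name: object id}

    def ingest(self, collection, name, physical_path):
        """Register a new object and place it in one collection."""
        obj_id = len(self.objects) + 1
        self.objects[obj_id] = physical_path
        self.collections.setdefault(collection, {})[name] = obj_id
        return obj_id

    def soft_link(self, src_collection, name, dst_collection, link_name=None):
        """Make an existing object visible in a second collection.

        No data is copied; both entries refer to the same object id.
        """
        obj_id = self.collections[src_collection][name]
        target = self.collections.setdefault(dst_collection, {})
        target[link_name or name] = obj_id

    def resolve(self, collection, name):
        """Return the physical path behind a logical name."""
        return self.objects[self.collections[collection][name]]


catalog = Catalog()
catalog.ingest("/babar/run1", "events.root", "/store/0001/events.root")
catalog.soft_link("/babar/run1", "events.root", "/babar/favorites")
print(catalog.resolve("/babar/favorites", "events.root"))
```

Because the link stores only an identifier, the object count stays at one no matter how many collections list the file.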

TeraGrid Wide Area Network
[Network map; labels:]
• StarLight International Optical Peering Point (see www.startap.net)
• DTF Backbone sites: Chicago, Indianapolis (Abilene NOC), Urbana, Los Angeles, San Diego
• Links: OC-48 (2.5 Gb/s, Abilene); multiple 10 GbE (Qwest); multiple 10 GbE (I-WIRE Dark Fiber)
• I-WIRE sites: ANL, UIC, Starlight / NW Univ, Multiple Carrier Hubs, Ill Inst of Tech, Univ of Chicago, NCSA/UIUC
• Solid lines in place and/or available by October 2001
• Dashed I-WIRE lines planned for summer 2002

PACI 13.6 TF Linux TeraGrid
[System diagram; labels:]
• Site clusters
  • Argonne: 64 nodes, 1 TF, 0.25 TB memory, 25 TB disk
  • Caltech: 32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk
  • SDSC: 256 nodes, 4.1 TF, 2 TB memory, 225 TB disk
  • NCSA: 500 nodes, 8 TF, 4 TB memory, 240 TB disk
• Chicago & LA DTF core switch/routers: Cisco 65xx Catalyst Switch (256 Gb/s crossbar)
• Other network equipment: Juniper M160, Juniper M40, Extreme Black Diamond, Cisco 6509 Catalyst Switch/Router, Myrinet Clos Spine, Fibre Channel Switch
• External networks: Calren, NTON, vBNS, Abilene, ESnet, HSCC, MREN, Starlight
• Link types: OC-3, OC-12, OC-12 ATM, OC-48, GbE, 10 GbE
• Existing systems and storage: 256p HP X-Class, 128p HP V2500, 92p IA-32, 128p Origin, 1500p Origin, 574p IA-32 Chiba City, 1024p IA-32, 320p IA-64, 1176p IBM SP Blue Horizon, Sun Starcat, Sun E10K, HPSS, UniTree, HR Display & VR Facilities
• Node configurations
  • 32 quad-processor McKinley servers (128p @ 4 GF, 8 GB memory/server)
  • 32 quad-processor McKinley servers (128p @ 4 GF, 12 GB memory/server)
  • 16 quad-processor McKinley servers (64p @ 4 GF, 8 GB memory/server)
  • IA-32 nodes
• Legend: 32 x 1 GbE, 64 x Myrinet, 32 x Myrinet, 32 x FibreChannel, 8 x FibreChannel

Further Information
http://www.npaci.edu/DICE

SDSC Storage Resource Broker & Meta-data Catalog
[Architecture diagram; labels:]
• Application interfaces: C, C++, Linux I/O; Unix Shell; Java, NT Browsers; Prolog Predicate; Web
• SRB
• MCAT: Dublin Core; Resource, User; User Defined; Application Meta-data
• Archives: HPSS, ADSM, UniTree, DMF
• HRM
• File Systems: Unix, NT, Mac OSX
• Databases: DB2, Oracle, Postgres
• Remote Proxies: DataCutter
• Third-party copy

Replication Attributes
• DATA_NAME
  • Global SRB data object name
• DATA_REPL_ENUM
  • Replica copy number
• SIZE
  • Size of data in bytes
• DATA_TYP_NAME
  • Data type (primarily specification of the data format)
• DATA_CLASS_NAME
  • Logical classification of the data (description of the type)
• DATA_CLASS_TYPE
  • Classification type
• ACCESS_CONSTRAINT
  • Access restrictions on data
• DATA_COMMENTS

Replication Attributes (2)
• DATA_COMMENTS_TIMESTAMP
  • Time and date stamp for when comments were made on the data object
• REPL_TIMESTAMP
  • Time and date stamp when the owner modified the data object
• PATH_NAME
  • Physical path name of the data object
• DATA_CREATE_TIMESTAMP
  • Time and date stamp for when the data was created
• DATA_IS_DELETED
  • A flag can be turned on that indicates a data object has been deleted, while retaining the data set on storage
• DATA_OWNER
  • Data object creator name
• DATA_OWNER_DOMAIN
  • Domain/group of the data object creator
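To make the role of these attributes concrete, the sketch below models a replica record with a subset of the fields listed on the two Replication Attributes slides and shows how `DATA_IS_DELETED` can hide a replica without removing it. The Python types, the `ReplicaRecord` class, and the `live_replicas` helper are assumptions for illustration; the slides give attribute names and meanings only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ReplicaRecord:
    # Field names follow the slide attributes in lower case; types are assumed.
    data_name: str             # DATA_NAME: global data object name
    data_repl_enum: int        # DATA_REPL_ENUM: replica copy number
    size: int                  # SIZE: size in bytes
    data_typ_name: str         # DATA_TYP_NAME: data format
    path_name: str             # PATH_NAME: physical path
    data_owner: str            # DATA_OWNER: creator name
    data_owner_domain: str     # DATA_OWNER_DOMAIN: creator domain/group
    data_create_timestamp: datetime
    repl_timestamp: Optional[datetime] = None
    data_is_deleted: bool = False  # DATA_IS_DELETED: soft-delete flag


def live_replicas(records, data_name):
    """Return non-deleted replicas of one object, ordered by copy number."""
    matches = [
        r for r in records
        if r.data_name == data_name and not r.data_is_deleted
    ]
    return sorted(matches, key=lambda r: r.data_repl_enum)


created = datetime(2001, 10, 1, 12, 0, 0)
records = [
    ReplicaRecord("run1/events", 1, 1024, "generic", "/disk/b/events",
                  "owner", "sdsc", created),
    ReplicaRecord("run1/events", 0, 1024, "generic", "/disk/a/events",
                  "owner", "sdsc", created),
    ReplicaRecord("run1/events", 2, 1024, "generic", "/tape/events",
                  "owner", "sdsc", created, data_is_deleted=True),
]
print([r.path_name for r in live_replicas(records, "run1/events")])
```

The third record stays in the list, mirroring the slide's note that a deleted object can be flagged while its data is retained on storage.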

Quilt Extension (1) – Data Type
• Original Quilt: No difference between dt1.xml and dt2.xml
  • dt1.xml: 21 … … 122 123 … … 203 … …
  • dt2.xml: 21 … … 122 123 … … 203 … …
• After we add data type …

Quilt Extension (2) – Operator Overloading
Query 1: sum of id and sponsor_id (type = string)
    FOR $bill in document("dt1.xml")//bill
    RETURN $bill//id, $bill//sponsor_id, $bill//id/text() + $bill//sponsor_id/text()
Result:
    21 122 21122
    123 203 123203
    … …

Quilt Extension (2) – Operator Overloading
Query 2: sum of id and sponsor_id (type = integer)
    FOR $bill in document("dt2.xml")//bill
    RETURN $bill//id, $bill//sponsor_id, $bill//id/text() + $bill//sponsor_id/text()
Result:
    21 122 143.0
    123 203 326.0
    … …
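The two queries differ only in the declared type of the operands, and that type selects what "+" does. A minimal Python sketch of the same dispatch follows; `typed_sum` and its `declared_type` argument are hypothetical names, and returning a float for the integer case is an assumption made only to mirror the 143.0 and 326.0 shown in the results.

```python
def typed_sum(left, right, declared_type):
    """Apply "+" to two text values according to their declared type."""
    if declared_type == "string":
        # "Sum" on strings concatenates.
        return left + right
    if declared_type == "integer":
        # "Sum" on integers adds; the result is shown as a float on the slide.
        return float(int(left) + int(right))
    raise ValueError("unsupported type: " + declared_type)


rows = [("21", "122"), ("123", "203")]
for bill_id, sponsor_id in rows:
    print(typed_sum(bill_id, sponsor_id, "string"),
          typed_sum(bill_id, sponsor_id, "integer"))
```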

Quilt Extension (3) – Array Operation
[Class diagram: ValueArray with subclasses ValueIntegerArray, ValueFloatArray, ValueStringArray, ValueBoolArray]
• Value: Interface for Kweelt base type
• ValueArray: Extend Value. Implement Compare and array-specific operation
  • Accessor – getter, setter
  • Element summation
  • Array summation
  • Subsequence
  • Zip, Unscroll, concatenation, etc
Demo: http://pamina2.sdsc.edu/cgi-bin/kweelt/demo.cgi
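The array operations named on this slide can be sketched in Python as below. The slide does not define the operations, so the semantics here are assumptions: "element summation" is taken to mean the sum of all elements of one array, "array summation" the element-wise sum of two arrays of equal length, and indices are zero-based. The class only borrows the name `ValueArray`; it is not the Kweelt Java class.

```python
class ValueArray:
    """Illustrative numeric array with the operations listed on the slide."""

    def __init__(self, items):
        self._items = list(items)

    def to_list(self):
        return list(self._items)

    def get(self, index):
        return self._items[index]

    def set(self, index, value):
        self._items[index] = value

    def element_sum(self):
        # Assumed meaning: add up every element of this array.
        return sum(self._items)

    def array_sum(self, other):
        # Assumed meaning: element-wise sum of two equal-length arrays.
        if len(self._items) != len(other._items):
            raise ValueError("arrays must have the same length")
        return ValueArray(a + b for a, b in zip(self._items, other._items))

    def subsequence(self, start, length):
        return ValueArray(self._items[start:start + length])

    def concatenate(self, other):
        return ValueArray(self._items + other._items)


a = ValueArray([1, 2, 3, 4])
b = ValueArray([10, 20, 30, 40])
print(a.element_sum())
print(a.array_sum(b).to_list())
print(a.subsequence(1, 2).to_list())
print(a.concatenate(b).to_list())
```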