a062b58bda5616e99a1a42c01737d982.ppt
- Number of slides: 41
Science Environment for Ecological Knowledge Bertram Ludäscher UC Santa Barbara U New Mexico UC San Diego Supercomputer Center University of California, San Diego U Kansas Vermont, Napier, ASU, UNC http://seek.ecoinformatics.org
Architecture Overview
• Analysis & Modeling System
  – Design and execution of ecological models and analysis
  – End user focus
  – application-/upperware
• Semantic Mediation System
  – Data integration of hard-to-relate sources and processes
  – Semantic types and ontologies
  – upper middleware
• EcoGrid
  – Access to ecology data and tools
  – middle-/underware
• Plus Working Groups:
  – Knowledge Representation (SEEK-KR)
  – Classification and Nomenclature (TAXON)
  – Biodiversity and Ecological Analysis and Modeling (BEAM)
(cf. GEON + Cyberinfrastructure)
SEEK Overview, 3/2004
SEEK EcoGrid
• Goal: standardize interfaces (using web and grid services)
  – We have standardized data via EML
  – Integrate diverse data networks from ecology, biodiversity, and environmental sciences
• Grid-standardized interfaces
  – Uniform interface to:
    • Metacat, SRB, DiGIR, Xanthoria, etc.
    • Anyone can implement these interfaces
    • Hides complexity of underlying systems
• Metadata-mediated data access
  – Supports multiple metadata standards
  – EML, Darwin Core as foci
• Computational services
  – Pre-defined analytical services
  – On-the-fly analytical services
Grid versus Web Services
• Grid Services are Web Services
  – Add authentication, lifecycle management, notification, etc.
  – Globus Toolkit 3: Implements Open Grid Services Architecture (OGSA)
• Implications for use
  – Write a normal web service extending the GridService base class
  – When deployed within GT3, you get these extra functions for ‘free’
  – Supports distributed computation via proxy authentication
• Problems
  – Complex system to understand
  – GT3 can be difficult to deploy
  – Proposals to incorporate grid services within the Web services community (Web Services Resource Framework [WSRF])
EcoGrid client interactions
• Modes of interaction
  – Client-server
  – Fully distributed
  – Peer-to-peer
• EcoGrid Registry
  – Node discovery
  – Service discovery
• Aggregation services
  – Centralized access
  – Reliability
  – Data preservation
Building the EcoGrid
[Figure: EcoGrid network diagram. Node types: Metacat node, VegBank node, Xanthoria node, SRB node, DiGIR node, legacy system. Site labels: NTL, AND, HBR, VCR, LUQ.]
Participating networks (number of sites):
• LTER Network (24)
• Natural History Collections (>> 100)
• Organization of Biological Field Stations (180)
• UC Natural Reserve System (36)
• Partnership for Interdisciplinary Studies of Coastal Oceans (4)
• Multi-agency Rocky Intertidal Network (60)
Kepler: Scientific Workflows
Figure callouts:
• Query EcoGrid to find data
• Archive output to EcoGrid
• EML provides semi-automated data binding
• Scientific workflows represent knowledge about the process; Kepler captures this knowledge
GARP Invasive Species Model
[Figure: GARP workflow diagram. Slide from D. Pennington.]
Labeled elements:
(a) Species presence & absence points, for native range and invasion area (EcoGrid Query to DiGIR)
(b) Environmental layers, for native range and invasion area (EcoGrid Query to SRB)
(c) Integrated layers, for native range and invasion area (Layer Integration)
(d) Training sample and test sample
(e) GARP rule set (Data Calculation)
(f) Native range prediction map and invasion area prediction map (Map)
(g) Model quality parameter (Validation)
Other labels: Sample, User, +A1, +A2, +A3
Scientific workflows represent knowledge about the process; AMS captures this knowledge
Kepler Team, Projects, Sponsors
• Ilkay Altintas (SDM)
• Chad Berkley (SEEK)
• Shawn Bowers (SEEK)
• Jeffrey Grethe (BIRN)
• Christopher H. Brooks (Ptolemy II)
• Zhengang Cheng (SDM)
• Efrat Jaeger (GEON)
• Matt Jones (SEEK)
• Edward A. Lee (Ptolemy II)
• Kai Lin (GEON)
• Bertram Ludäscher (BIRN, GEON, SDM, SEEK)
• Steve Mock (NMI)
• Steve Neuendorffer (Ptolemy II)
• Jing Tao (SEEK)
• Mladen Vouk (SDM)
• Yang Zhao (Ptolemy II)
• …
Kepler Understands EML Data (Chad Berkley, SEEK)
Kepler: Ecological Modeling (Chad Berkley, SEEK)
Database Access (Efrat Jaeger, GEON) Note: EML descriptions of relational sources would allow automated data ingestion
Mineral Classification with Kepler … (Efrat Jaeger, GEON)
… inside the Classifier
Standard BrowserUI: Client-Side SVG
SWF Reengineering (Ilkay, SDM; Ashraf, Efrat, Kai, GEON)
DataMapper Sub-Workflow
Result launched via BrowserUI actor (coupling with ESRI’s ArcIMS)
Distributed Workflows in KEPLER
• Web and Grid Service plug-ins
  – WSDL (now) and Grid services (stay tuned …)
  – ProxyInit, GlobusGridJob, GridFTP, DataAccessWizard
  – SSH, SCP, SDSC SRB, OGS?-??? … coming
• WS Harvester
  – Import query-defined WS operations as Kepler actors
• XSLT and XQuery Data Transformers
  – to link web services that were not “designed to fit” each other
• WS-deployment interface (planned)
Web Service Actor (Ilkay Altintas, SDM)
Given a WSDL and the name of an operation of a web service, the actor dynamically customizes itself to implement and execute that method.
Configure: select service operation
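The self-customizing behavior described on this slide can be illustrated with a minimal Python sketch. It is not Kepler's implementation (Kepler actors are built on Ptolemy II): the `WebServiceActor` class, its `configure` and `fire` methods, and the dictionary standing in for a parsed WSDL are all hypothetical names chosen for illustration.

```python
class WebServiceActor:
    """Toy actor that specializes itself to one operation of a described service."""

    def __init__(self, description):
        # `description` is a hypothetical stand-in for a parsed WSDL:
        # operation name -> {"inputs": [parameter names], "call": callable}
        self.description = description
        self.operation_name = None
        self.input_ports = []
        self._call = None

    def configure(self, operation_name):
        """Select an operation and create one input port per parameter."""
        op = self.description[operation_name]
        self.operation_name = operation_name
        self.input_ports = list(op["inputs"])
        self._call = op["call"]

    def fire(self, **tokens):
        """Consume one token per input port and return the operation's result."""
        if self._call is None:
            raise RuntimeError("actor is not configured")
        missing = [p for p in self.input_ports if p not in tokens]
        if missing:
            raise ValueError(f"missing input tokens: {missing}")
        return self._call(*(tokens[p] for p in self.input_ports))


# Hypothetical service with two operations (local callables, no network access).
service = {
    "add": {"inputs": ["a", "b"], "call": lambda a, b: a + b},
    "shout": {"inputs": ["text"], "call": lambda text: text.upper()},
}

actor = WebServiceActor(service)
actor.configure("add")          # the actor now exposes ports "a" and "b"
result = actor.fire(a=2, b=3)
```

Reconfiguring the same actor with `"shout"` replaces its ports with a single `text` port, which mirrors the "configure, set parameters, commit" sequence on the following slides.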
Set Parameters and Commit
Specialized WS Actor (after instantiation)
Web Service Harvester (Ilkay Altintas, SDM)
• Imports the web services in a repository into the actor library.
• Has the capability to search for web services based on a keyword.
Kepler: Grid Services Access (Steve Mock, NMI)
An (oversimplified) Model of the Grid
• Hosts: {h1, h2, h3, …}
• Data@Hosts: d1@{hi}, d2@{hj}, …
• Functions@Hosts: f1@{hi}, f2@{hj}, …
• Given: data/workflow: X →f→ Y →g→ Z
• … as a functional plan: […; Y := f(X); Z := g(Y); …]
• … as a logic plan: […; f(X, Y) ∧ g(Y, Z); …]
• Find Host Assignment: di → hi, fj → hj for all di, fj … s.t. […; d3@h3 := f@h2(d1@h1), …] is a valid plan
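The host-assignment problem on this slide can be sketched as a small brute-force search in Python. The slide does not define what makes a plan valid, so the sketch assumes a simple rule: a step is valid if its function is installed on the chosen function host and its input is available on the chosen input host, either stored there or placed there by an earlier step. The placement tables and the `host_assignments` function are hypothetical.

```python
# Hypothetical placement tables: where data and functions reside.
hosts = ["h1", "h2", "h3"]
data_at = {"X": {"h1"}}
funcs_at = {"f": {"h2"}, "g": {"h1", "h3"}}

# Functional plan as (output, function, input) steps: Y := f(X); Z := g(Y).
plan = [("Y", "f", "X"), ("Z", "g", "Y")]


def host_assignments(plan, data_at, funcs_at, hosts):
    """Yield every assignment as a list of (input_host, function_host, output_host)."""

    def extend(i, produced, chosen):
        if i == len(plan):
            yield list(chosen)
            return
        out, fn, inp = plan[i]
        # Assumed validity rule: input must already reside on h_in,
        # and the function must be installed on h_fn.
        in_hosts = sorted(data_at.get(inp, set()) | produced.get(inp, set()))
        fn_hosts = sorted(funcs_at.get(fn, set()))
        for h_in in in_hosts:
            for h_fn in fn_hosts:
                for h_out in hosts:
                    step = (h_in, h_fn, h_out)
                    yield from extend(i + 1, {**produced, out: {h_out}}, chosen + [step])

    yield from extend(0, {}, [])


valid = list(host_assignments(plan, data_at, funcs_at, hosts))
```

Each tuple reads like the slide's notation: `("h1", "h2", "h3")` for the first step corresponds to `Y@h3 := f@h2(X@h1)`. A real scheduler would also weigh transfer costs rather than enumerate all assignments.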
Shipping & Handling Algebra (SHA)
Logical view: plan Y@C = F@A of X@B
[Figure: f@a applied to x@b yielding y@c, shown in three variants (1), (2), (3)]
Physical view: SHA plans
1. [ X@B to A, Y@A := F@A(X@A), Y@A to C ]
2. [ F@A => B, Y@B := F@B(X@B), Y@B to C ]
3. [ X@B to C, F@A => C, Y@C := F@C(X@C) ]
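The three physical plans can be written down as data and compared, as in the Python sketch below. The operation names (`ship_data`, `ship_func`, `apply`), the size table, and the cost rule (total size shipped) are illustrative assumptions; the slide itself does not define a cost model.

```python
def sha_plans(f="F", a="A", x="X", b="B", y="Y", c="C"):
    """Return the three physical plans for the logical plan Y@C = F@A of X@B."""
    return {
        # 1. Ship the data to the function, then ship the result.
        1: [("ship_data", x, b, a), ("apply", f, a, x, y), ("ship_data", y, a, c)],
        # 2. Ship the function to the data, then ship the result.
        2: [("ship_func", f, a, b), ("apply", f, b, x, y), ("ship_data", y, b, c)],
        # 3. Ship both data and function to the destination and compute there.
        3: [("ship_data", x, b, c), ("ship_func", f, a, c), ("apply", f, c, x, y)],
    }


def shipping_cost(plan, sizes):
    """Assumed cost model: sum of the sizes of everything that is shipped."""
    return sum(sizes[op[1]] for op in plan if op[0].startswith("ship"))


# Hypothetical sizes: large input, small function code, medium result.
sizes = {"X": 100, "F": 1, "Y": 10}
plans = sha_plans()
costs = {number: shipping_cost(plan, sizes) for number, plan in plans.items()}
cheapest = min(costs, key=costs.get)
```

With a large input and small function code, shipping the function to the data (plan 2) is cheapest under this cost rule; different sizes change the ranking.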
Grid-Enabling PTII: Handles
Logical token transfer (3) requires get_handle (1, 2), then exec_handle (4, 5, 6, 7) for completion.
[Figure: actors A and B in Kepler space; grid nodes GA and GB in Grid space; arrows numbered 1–7]
1. A → GA: get_handle
2. GA → A: return &X
3. A → B: send &X
4. B → GB: request &X
5. GB → GA: request &X
6. GA → GB: send *X
7. GB → B: send done(&X)
Example: &X = “GA.17”, *X = <some_huge_file>
Candidate Formalisms:
• GridFTP
• SSH, SCP
• SDSC SRB
• OGS?-??? … WSRF?
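The seven-message exchange can be simulated in a few lines of Python to show that only the handle (&X) crosses Kepler space, while the payload (*X) moves directly between grid nodes. The `GridNode` class and `transfer` function are hypothetical; the starting ID 17 is chosen only so the handle matches the slide's example "GA.17".

```python
class GridNode:
    """Toy grid node that stores payloads under string handles."""

    def __init__(self, name, next_id=1):
        self.name = name
        self.store = {}          # handle -> payload held at this node
        self._next_id = next_id

    def get_handle(self, payload):
        """Store a payload at this node and return its handle (&X)."""
        handle = f"{self.name}.{self._next_id}"
        self._next_id += 1
        self.store[handle] = payload
        return handle

    def pull(self, handle, source, log):
        """Fetch the payload (*X) for a handle directly from another node."""
        log.append(f"{self.name} -> {source.name}: request &X")
        self.store[handle] = source.store[handle]
        log.append(f"{source.name} -> {self.name}: send *X")


def transfer(payload, ga, gb):
    """Play actors A and B: pass only the handle, let the grid move the data."""
    log = [f"A -> {ga.name}: get_handle"]
    handle = ga.get_handle(payload)           # the payload lives at GA
    log.append(f"{ga.name} -> A: return &X")
    log.append("A -> B: send &X")             # logical token transfer
    log.append(f"B -> {gb.name}: request &X")
    gb.pull(handle, ga, log)                  # node-to-node data movement
    log.append(f"{gb.name} -> B: send done(&X)")
    return handle, log


ga, gb = GridNode("GA", next_id=17), GridNode("GB")
handle, log = transfer(b"some huge file", ga, gb)
```

The log reproduces steps 1 to 7 in order, and the bytes end up in `gb.store` without ever being passed between the actor roles.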
Homogeneous Data Integration
• Integration of homogeneous or mostly homogeneous data via EML metadata is relatively straightforward
Heterogeneous Data Integration
• Requires advanced metadata and processing
  – Attributes must be semantically typed
  – Collection protocols must be known
  – Units and measurement scale must be known
  – Measurement relationships must be known
    • e.g., that ArealDensity = Count/Area
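A small worked example shows why units and measurement relationships matter when combining sources. The Python sketch below applies the relationship ArealDensity = Count/Area to two hypothetical surveys that report area in different units; the survey records and helper names are invented for illustration.

```python
M2_PER_HECTARE = 10_000.0


def to_m2(area, unit):
    """Convert an area to square metres; only the units used below are supported."""
    factors = {"m2": 1.0, "ha": M2_PER_HECTARE}
    return area * factors[unit]


def areal_density(count, area_m2):
    """ArealDensity = Count / Area, in individuals per square metre."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return count / area_m2


# Two hypothetical sources describing the same kind of measurement differently.
plot_survey = {"count": 50, "area": 25.0, "area_unit": "m2"}
transect_survey = {"count": 20_000, "area": 2.0, "area_unit": "ha"}

densities = [
    areal_density(s["count"], to_m2(s["area"], s["area_unit"]))
    for s in (plot_survey, transect_survey)
]
```

Only after both areas are expressed in the same unit do the derived densities become comparable; without the unit metadata the raw counts and areas could not be merged safely.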
Semantic Mediation
• Label data with semantic types
• Label inputs and outputs of analytical components with semantic types
• Use reasoning engines to generate transformation steps
  – Beware analytical constraints
• Use reasoning engine to discover relevant components
[Figure labels: Data, Ontology, Workflow Components]
Ecological ontologies
• What was measured (e.g., biomass)
• Type of measurement (e.g., Energy)
• Context of measurement (e.g., Psychotria limonensis)
• How it was measured (e.g., dry weight)
• SEEK intends to enable community-created ecological ontologies using OWL
  – Represents a controlled vocabulary for ecological metadata
Extensions: Semantic Types
• Take concepts and relationships from an ontology to “semantically type” the data-in/out ports
• Application: e.g., design support:
  – smart/semi-automatic wiring, generation of “massaging actors”
[Figure: actor m1 (normalize) with ports p3 and p4. Takes Abundance Count Measurements for Life Stages; returns Mortality Rate Derived Measurements for Life Stages.]
Semantic Types
• The semantic type signature
  – Type expressions over the (OWL) ontology
[Figure: actor m1 (normalize) with ports p3 and p4]
SemType m1 :: Observation & itemMeasured.AbundanceCount & hasContext.appliesTo.LifeStageProperty -> DerivedObservation & itemMeasured.MortalityRate & hasContext.appliesTo.LifeStageProperty
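To illustrate how such signatures could support wiring decisions, the Python sketch below treats a semantic type as a set of conjuncts and checks whether an output type satisfies an input type. This is a deliberate simplification of OWL reasoning: the subclass table (including the assumption that DerivedObservation is a kind of Observation), the `is_a` and `can_connect` helpers, and the upstream type are hypothetical.

```python
# Assumed toy ontology: child concept -> parent concept.
SUBCLASS_OF = {"DerivedObservation": "Observation"}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or is a (transitive) subclass of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def can_connect(out_type, in_type):
    """An output may feed an input if every required conjunct is provided."""
    return all(any(is_a(o, i) for o in out_type) for i in in_type)


# Input and output types of m1, taken from the signature above.
M1_IN = {
    "Observation",
    "itemMeasured.AbundanceCount",
    "hasContext.appliesTo.LifeStageProperty",
}
M1_OUT = {
    "DerivedObservation",
    "itemMeasured.MortalityRate",
    "hasContext.appliesTo.LifeStageProperty",
}

# Hypothetical upstream actor emitting derived abundance counts.
UPSTREAM_OUT = {
    "DerivedObservation",
    "itemMeasured.AbundanceCount",
    "hasContext.appliesTo.LifeStageProperty",
}

upstream_feeds_m1 = can_connect(UPSTREAM_OUT, M1_IN)   # subclass satisfies Observation
m1_feeds_itself = can_connect(M1_OUT, M1_IN)           # MortalityRate is not AbundanceCount
```

A full implementation would delegate the subsumption test to a description-logic reasoner over the OWL ontology; the set-based check only conveys the idea of smart wiring.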
Extended Type System (here: OWL Semantic Types)
SemType m1 :: Observation & itemMeasured.AbundanceCount & hasContext.appliesTo.LifeStageProperty -> DerivedObservation & itemMeasured.MortalityRate & hasContext.appliesTo.LifeStageProperty
Substructure association: XML raw-data =(X)Query=> object model =link=> OWL ontology
Semantic Types for Scientific Workflows
Deriving Data Transformations from Semantic Service Registration [Bowers-Ludaescher, DILS’04]
Structural and Semantic Mappings [Bowers-Ludaescher, DILS’04]
SEEK Impact
• Fundamental improvements for researchers
  – Global access to ecologically relevant data
  – Rapidly locate and utilize distributed computation
  – Capture, reproduce, extend analysis process
Acknowledgements
This material is based upon work supported by: The National Science Foundation under Grant Numbers 9980154, 9904777, 0131178, 9905838, 0129792, and 0225676.
PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)
Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON