54ea05773dcbe94b276a9766467a6ec9.ppt
- Number of slides: 21

CS 690 L Ontologies Interoperability (Integration, Mapping, Query)
Yugi Lee
STB #555
(816) 235-5932
leeyu@umkc.edu
www.sice.umkc.edu/~leeyu
CS 690 L - Lecture 4

Semantic Web Fabric
• Bootstrapping, Creation and Maintenance of Semantic Knowledge
  – Collaborative and Sociological Processes, Statistical Techniques
  – Ontology Building, Maintenance and Versioning Tools
• Re-use of Existing Semantic Knowledge (Ontologies)
• Annotation/Association/Extraction of Knowledge with/from Underlying Data
• Information Retrieval and Analysis (Distributed Querying, Search, Inference Middleware)
• Semantic Discovery and Composition of Services
• Distributed Computing/Communication Infrastructures
  – Component based technologies, Agent based systems, Web Services
• Repositories for managing data and semantic knowledge
  – Relational Databases, Content Management Systems, Knowledge Base Systems
[V. Kashyap, 2002]

What have DB researchers done?
• Semantic Data Models
• Multi-database Schema Heterogeneity
• Multi-database/Federated Database Schema Integration
• Schema Evolution
• Object Oriented/XML/Deductive Databases/Rule Based Systems
• Mediators and Wrappers
• Multidatabase/Federated Database Query Processing
• Data Mining
• Probabilistic Databases
• Workflow-based Coordination Systems
• Security in Database Systems
• Multimedia Databases
  – Text and Information Retrieval Systems
  – Image Databases
• DB research is well positioned to contribute to the Semantic Web, but:
  – there has been little interest in issues related to semantics in the DB community
  – the Semantic Web can be the underlying theme that ties together all the disparate pieces of work
[V. Kashyap, 2002]

What are the missing gaps?
• Ontology Integration/Interoperation
  – Problem is different from Schema Integration
  – Need to address “semantics” of relationships such as “synonyms”, “hyponyms”, etc.
• Ontology Impedance/Mismatch
  – Relax the requirements of consistency and completeness
  – Should be able to characterize the “information error/loss” that occurs
• Dynamic Ontologies
  – Need to relax the assumption of the “staticness” of database schemas
• Inferences based on Semantics of the Data
  – Has been relatively ignored by the DB community
[V. Kashyap, 2002]

What are the missing gaps?
• Semantics of Multimedia Data
  – Need to focus more on non-traditional data such as text, images, etc.
  – Need to focus on “annotation mechanisms” as an addition to wrappers/mediators
• Semantics of Processes/Plans/Workflows
• Performance/Scalability
  – A traditional strong point of DB research
• The next wave of research (esp. in the context of the Semantic Web) will focus on re-use of pre-existing data models/schemas/ontologies that describe the content of information sources
[V. Kashyap, 2002]

Inter-ontological relationships
• Synonyms
  – lead to semantics-preserving translations
• Hyponyms/Hypernyms
  – lead to semantics-altering translations
  – typically result in loss of recall and precision
• List of Hyponyms
  – technical-manual hyponym manual
  – book hyponym book
  – proceedings hyponym book
  – thesis hyponym book
  – misc-publication hyponym book
  – technical-reports hyponym book
  – press hyponym periodical-publication
  – periodical hyponym periodical-publication
[V. Kashyap, 2002]
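
The effect of a semantics-altering translation can be made concrete with a small sketch. The hyponym pairs below are copied from the list above; the two toy sources, their document ids, and the "relevant" sets are made-up assumptions for illustration only, not data from the lecture. Translating a term to its hypernym tends to cost precision, and translating a term to its hyponyms tends to cost recall.

```python
# Hyponym pairs from the slide: narrower term -> broader term.
# ("book hyponym book" is omitted because this sketch has no ontology prefixes
# to tell the two "book" terms apart.)
HYPONYM_OF = {
    "technical-manual": "manual",
    "proceedings": "book",
    "thesis": "book",
    "misc-publication": "book",
    "technical-reports": "book",
    "press": "periodical-publication",
    "periodical": "periodical-publication",
}


def translate_up(term):
    """Replace a term by its hypernym in the other ontology, if one is known."""
    return HYPONYM_OF.get(term, term)


def translate_down(term):
    """Replace a term by the sorted list of its known hyponyms."""
    return sorted(t for t, broader in HYPONYM_OF.items() if broader == term)


def answer(index, terms):
    """Union of the documents a source returns for the given terms."""
    result = set()
    for term in terms:
        result |= index.get(term, set())
    return result


def precision_recall(retrieved, relevant):
    """Set-based precision and recall."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical source that only knows the broad term "book".
BROAD_SOURCE = {"book": {"d1", "d2", "d3", "d4"}}
RELEVANT_THESES = {"d1", "d2"}  # assumed ground truth

# Hypothetical source that only knows some narrow terms.
NARROW_SOURCE = {"thesis": {"d1", "d2"}, "proceedings": {"d3"}}
RELEVANT_BOOKS = {"d1", "d2", "d3", "d4"}  # assumed ground truth

up = answer(BROAD_SOURCE, [translate_up("thesis")])
down = answer(NARROW_SOURCE, translate_down("book"))

print("thesis -> book:", precision_recall(up, RELEVANT_THESES))
print("book -> hyponyms:", precision_recall(down, RELEVANT_BOOKS))
```

In this toy setup the upward translation returns every book when only theses were wanted, and the downward translation misses the book that is neither a thesis nor a proceedings volume.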

Role of Ontologies
• Content explication
  Ontologies are used for the explicit description of the information source
  Approaches:
  – Single ontology
  – Multiple ontology
  – Hybrid ontology
• Query model
• Verification (query containment)
[H. Wache, 2002]

Single Ontology Approach
• SIMS
• One global ontology
• Hierarchical terminological database
• Combination of several specialized ontologies (for modularization)
• Can be used when all information sources to be integrated provide nearly the same view on a domain
• Minimal ontology commitment
• Susceptible to changes in the information sources
[H. Wache, 2002]

Multiple Ontologies
• OBSERVER
• Each information source is described by its own ontology (source ontology)
• No shared vocabulary
• No common and minimal ontology commitment is needed
• Simplifies integration and supports changes in sources
• Difficult to compare different source ontologies
• Inter-ontology mapping is needed
[H. Wache, 2002]

Hybrid Ontologies
• COIN
• Semantics of each source is described by its own ontology
• Built from a global shared vocabulary
• Shared vocabulary contains basic terms of a domain
• New sources can easily be added
• Supports acquisition and evolution of ontologies
• Source ontologies are comparable because of the shared vocabulary
• Existing ontologies cannot easily be reused, but have to be redeveloped from scratch
[H. Wache, 2002]

Query Model
• Integrated global view
• Global query schema
• User formulates query in terms of the ontology
• System reformulates queries in terms of subqueries for each source
• Structure of the query model should be more intuitive for the user
[H. Wache, 2002]
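
The reformulation step can be sketched as a lookup from global ontology terms to each source's local vocabulary. This is a minimal illustration, not the mechanism of any system named in the lecture: the source names, table and column names, and the `reformulate` helper are all hypothetical.

```python
# Hypothetical mappings: global ontology term -> local table/column name per source.
SOURCE_MAPPINGS = {
    "library_db": {"Book": "books", "author": "author_name"},
    "archive_db": {"Book": "publication", "author": "creator"},
    "press_db": {"Periodical": "journals"},  # knows nothing about Book
}


def reformulate(concept, attribute, value):
    """Rewrite the global query `concept.attribute == value` into one
    parameterized subquery per source that can express it."""
    subqueries = {}
    for source, mapping in SOURCE_MAPPINGS.items():
        # Skip sources whose local vocabulary lacks the concept or the attribute.
        if concept in mapping and attribute in mapping:
            sql = f"SELECT * FROM {mapping[concept]} WHERE {mapping[attribute]} = ?"
            subqueries[source] = (sql, (value,))
    return subqueries


if __name__ == "__main__":
    for source, (sql, params) in reformulate("Book", "author", "Kashyap").items():
        print(source, sql, params)
```

The user only ever sees the global terms `Book` and `author`; sources that cannot answer the query are silently left out, which is one place where the information loss discussed earlier would have to be accounted for.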

Mappings Connecting to Information Sources
• Relate the ontologies to the actual content of an information source
• Approaches
  – Structure resemblance
    Produce a one-to-one copy of the structure of the database and encode it in a language that makes automated reasoning possible
  – Definition of terms
    Use the ontology to define terms from the database or the database schema
  – Structure enrichment (most common)
    A logical model is built that resembles the structure of the information source and contains additional definitions and concepts
    Can be done using DLs
  – Meta-annotation
    Add semantic information to an information source
    Ontobroker, SHOE
[H. Wache, 2002]
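
As a rough illustration of the first and third approaches, the sketch below copies a relational schema one-to-one into subject/predicate/object triples (structure resemblance) and then appends extra definitions (structure enrichment). The schema, the predicate names, and both helper functions are assumptions made for this example; a real system would target a description logic or a similar language rather than plain tuples.

```python
# Hypothetical relational schema: table name -> column names.
SCHEMA = {
    "book": ["isbn", "title", "publisher_id"],
    "publisher": ["publisher_id", "name"],
}


def schema_to_triples(schema):
    """Structure resemblance: mirror every table and column as triples."""
    triples = []
    for table, columns in schema.items():
        triples.append((table, "type", "Class"))
        for column in columns:
            prop = f"{table}.{column}"
            triples.append((prop, "type", "Property"))
            triples.append((prop, "domain", table))
    return triples


def enrich(triples, extra_definitions):
    """Structure enrichment: keep the mirrored structure, add new definitions."""
    return triples + list(extra_definitions)


if __name__ == "__main__":
    base = schema_to_triples(SCHEMA)
    # An added concept that is not present in the database itself.
    model = enrich(base, [("book", "subClassOf", "publication")])
    for triple in model:
        print(triple)
```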

Inter-Ontological Mapping
• Defined Mappings (KRAFT)
  – Special customized mediator agents
  – Great flexibility
  – Fails to ensure preservation of semantics (no verification)
• Lexical Relations (OBSERVER)
  – Extend a common DL model by quantified inter-ontology relationships
  – Synonym, hypernym, overlap, covering, disjoint
  – Do not have formal semantics
[H. Wache, 2002]

Inter-Ontological Mapping
• Top-level grounding (DWQ)
  – Relate all ontologies used to a single top-level ontology
  – Inheriting concepts from a common top-level ontology
  – Can resolve conflicts and ambiguities
• Semantic correspondences
  – Rely on a common vocabulary
  – Use semantic labels in order to compute correspondences
  – Subsumption reasoning can be used to establish relations between different terminologies
[H. Wache, 2002]
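
A minimal sketch of how grounding in a shared top-level ontology lets two source terminologies be related through subsumption. The `A:`, `B:` and `TOP:` terms and the single-parent hierarchy are invented for illustration; real ontologies allow multiple parents and need a proper reasoner.

```python
# Hypothetical child -> parent edges. Terms prefixed A: and B: belong to two
# source ontologies; TOP: terms belong to the shared top-level ontology.
SUBCLASS_OF = {
    "A:thesis": "TOP:document",
    "B:dissertation": "B:publication",
    "B:publication": "TOP:document",
    "TOP:document": "TOP:thing",
}


def ancestors(term):
    """Walk parent links upward (assumes a single parent and no cycles)."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its ancestors."""
    return general == specific or general in ancestors(specific)


def common_grounding(a, b):
    """Most specific term that subsumes both a and b, or None."""
    b_line = set(ancestors(b)) | {b}
    for candidate in [a] + ancestors(a):
        if candidate in b_line:
            return candidate
    return None


if __name__ == "__main__":
    print(subsumes("TOP:document", "B:dissertation"))
    print(common_grounding("A:thesis", "B:dissertation"))
```

Here neither source term subsumes the other, but both are grounded in `TOP:document`, which is the kind of relation a top-level ontology makes computable.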

Conclusions
• Data Models/Schemas/Ontologies will form the critical infrastructure for the Semantic Web
• Re-use of pre-existing data models/schemas/ontologies is crucial in describing the semantics of various information sources
• There is a need to relax consistency and completeness requirements and estimate the “error” in the results returned
• Semantics of information should be used to minimize “error” in the information obtained
• The new environment is likely to be more “dynamic” in nature: schemas, workflows, queries, etc. can no longer be assumed to be static
• DB research is well positioned to participate in the Semantic Web if it “adapts” to these new requirements

References
• Vipul Kashyap, The Semantic Web: Has the DB Community Missed the Bus (again?), NSF Workshop on DB & IS Research for Semantic Web and Enterprises, April 3, 2002
• H. Wache, T. Vogele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hubner, Ontology-Based Integration of Information: A Survey of Existing Approaches