- Number of slides: 41
Composing Mappings between Schemas using a Reference Ontology
Eduard Dragut, Ramon Lawrence
Iowa Database and Emerging Applications (IDEA) Laboratory, University of Iowa
{eduard-dragut, ramon-lawrence}@uiowa.edu
ODBASE'04
Outline
- Motivation
- Integration Approach
- Background
- Architecture Overview
- Ontological Matching
- Composing Mappings
- Global View Construction
- Experimental Results
- Future Work and Conclusions
Motivation
- Many organizations have pre-existing ontologies that are not suitable as global views but are suitable as reference ontologies to aid integration.
  - Example: The National Cancer Institute (NCI) and National Institutes of Health (NIH) have the caBIG grid prototype, which standardizes terminology (EVS, caDSR) and data elements in the cancer domain.
- Schema-to-ontology matching requires integrators to understand only their own schema instead of all the schemas that they may want to integrate.
Integration Approach
[Figure: each source (NCBI Database, Expression Database) is related to the Reference Ontology by schema matching, giving a schema-to-ontology mapping per source; Compose & Merge turns these mappings into a Global View that serves User Queries.]
Background: Ontologies and Integration
- Ontologies as the integrated, global view
  - Carnot project (Collet 91) with Cyc ontology (Lenat 90)
  - ONTOBROKER (Decker 98), OBSERVER (Mena 00)
- Tools for semi-automatically merging ontologies
  - PROMPT (Noy 00), Ontobuilder (Gal 04)
- Use ontologies as matching/integration aids
  - MOMIS (Beneventano 03) using WordNet
  - Indirect (Xu 03), CUPID (Madhavan 01), COMA (Do 02)
- Matching ontologies (Doan 02)
- “Discovering” ontologies (Madhavan 03)
  - Corpus-based matching
Background: Model Management
- Model management as proposed by (Bernstein 03) is intended to allow high-level schema operations.
  - Operators include: Invert, Compose, Match, Merge.
  - Warning: The semantics of the operators are not yet fully defined, and some of them are not completely automatic.
- Definitions:
  - A match is a semantic correspondence between schema elements.
  - A mapping between schema elements is an expression that relates the elements.
  - Note that most schema matching systems such as COMA produce matches, not mappings.
Architecture Overview
- We assume the existence of a reference ontology that has been “accepted” in a domain.
  - The ontology is NOT a global view and may not cover the information in all schemas. It cannot be edited.
- Global view construction is a 3-step process:
  - 1) Independently match each schema to the ontology.
  - 2) Compose schema-to-ontology matches to produce schema-to-schema mappings.
  - 3) Merge the schema mappings to produce the global view.
- The challenge is to automate this as much as possible.
Benefits of Approach
- Even with manual integration there are several benefits to using a reference ontology:
  - 1) An integrator must only understand their own schema and the ontology, not the other schemas to be integrated.
  - 2) Most validation is performed once during schema-to-ontology matching and not for every schema integrated.
  - 3) Schema-to-ontology matchings can be re-used every time a new schema is integrated into the federation.
- Automation can:
  - 1) Help construct schema-to-ontology matchings.
  - 2) Perform composition of mappings.
  - 3) Build a global view from the composed mappings.
Automation Challenges
- There are several challenges in automating this process:
  - 1) Schema matching systems such as COMA are designed for simpler relational schemas. Ontologies must be mapped into a suitable format for use with COMA.
  - 2) Schema-to-ontology matching is less accurate due to the more complicated ontological structure and because the ontology may not model the entire domain or may model it differently.
  - 3) Composing matchings often results in many false matches which must be handled.
  - 4) A method for merging schemas using model management primitive operators is required.
    - Even with these operators, Merge is not fully automatic.
Background: COMA
- COMA (Do 02) is a schema matching system that can flexibly combine different match algorithms and reuse match results.
  - Match algorithms use names, paths, and schema properties in various ways.
- The mapping format between two schemas R and S is a triple (r, s, v) where r in R, s in S, and v is the similarity value in [0..1] between elements r and s.
- A schema in COMA is represented as a rooted directed acyclic graph. Schema elements are nodes which may be connected by links of different types.
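The two structures described on this slide can be sketched in a few lines of Python. This is only an illustration and not COMA's actual data model: the names Match, make_match, SchemaGraph, add_link, and the "contains" link type are all assumptions made for the example.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """A (r, s, v) triple: element r of R, element s of S, similarity v."""
    r: str
    s: str
    v: float


def make_match(r, s, v):
    """Build a Match, rejecting similarity values outside [0, 1]."""
    if not 0.0 <= v <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    return Match(r, s, v)


class SchemaGraph:
    """Rooted directed graph of schema elements (assumed to be acyclic)."""

    def __init__(self, root):
        self.root = root
        self.edges = {}  # parent -> list of (child, link_type)

    def add_link(self, parent, child, link_type="contains"):
        # "contains" is a hypothetical link type used only in this sketch.
        self.edges.setdefault(parent, []).append((child, link_type))

    def paths(self):
        """List every root-to-node path; a shared node yields several paths."""
        out = []

        def walk(node, prefix):
            path = prefix + [node]
            out.append(".".join(path))
            for child, _ in self.edges.get(node, []):
                walk(child, path)

        walk(self.root, [])
        return out


graph = SchemaGraph("PurchaseOrder")
graph.add_link("PurchaseOrder", "InvoiceTo")
graph.add_link("PurchaseOrder", "DeliverTo")
graph.add_link("InvoiceTo", "Contact")
graph.add_link("DeliverTo", "Contact")  # Contact is shared, so this is a DAG
all_paths = graph.paths()
example = make_match("postalCode", "Zip", 0.8)
```

Because Contact hangs under both InvoiceTo and DeliverTo, it is reachable by two different paths, which is why a DAG rather than a tree is the natural representation.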
Ontological Matching
- The first step is to convert ontologies in OWL/DAML format into COMA’s graph representation format.
  - Wrote a program that used the JENA parser.
- During the conversion:
  - 1) Explicitly converted a named relationship in the ontology into a node and several edges in the graph.
  - 2) Explicitly encoded attributes inherited over IS-A links since COMA does not support IS-A.
- After conversion, COMA automatically produces a schema-to-ontology match, because from its point of view it is matching two relational schemas.
Converting Ontology to a Graph
[Figure: two panels, "Converting Named Relationships" and "Making IS-A Explicit".]
* Also create a single root POOntology as required by COMA.
Ontological Matching: Max versus noMax
- One challenge is deciding what the schema-to-ontology match should look like.
- Two choices:
  - 1) Max - For each schema element, keep the best match with the ontology (if any).
  - 2) noMax - For each schema element, keep all the matches that are above the cutoff threshold.
- Since Max generates only one match per schema element, it is probably the best choice in semi-automated settings. noMax will generate many matches which must be filtered out by the user or during composition.
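The difference between the two selection strategies can be sketched as follows. The candidate triples, the threshold of 0.5, and the function names are made up for illustration, and treating the cutoff as inclusive is an assumption.

```python
# Hypothetical matcher output: (schema_element, ontology_concept, similarity).
candidates = [
    ("postalCode", "Zip", 0.8),
    ("postalCode", "City", 0.55),
    ("city", "City", 0.9),
    ("telephone", "Fax", 0.3),
]


def select_no_max(matches, threshold):
    """noMax: keep every match at or above the cutoff threshold."""
    return [m for m in matches if m[2] >= threshold]


def select_max(matches, threshold):
    """Max: keep only the best match per schema element, if it passes the cutoff."""
    best = {}
    for elem, concept, sim in matches:
        if sim >= threshold and (elem not in best or sim > best[elem][2]):
            best[elem] = (elem, concept, sim)
    return list(best.values())


no_max = select_no_max(candidates, threshold=0.5)
max_only = select_max(candidates, threshold=0.5)
print(no_max)
print(max_only)
```

Here noMax keeps both candidates for postalCode, leaving one of them to be filtered out later, while Max keeps only the higher-scoring one.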
Composing Mappings
- Schema-to-ontology mappings must be composed to produce direct schema-to-schema mappings.
- Since mappings carry no semantics, two objects are assumed to be identical if they map to the same ontological concept. Composition is performed transitively and is implemented using a natural join.
  - That is, if element r is similar to o and o is similar to s, then we assume that r is similar to s.
- For example:
  - <postalCode, Zip, 0.8> and <Zip, postCode, 0.7> can be composed to yield <postalCode, postCode, 0.75>.
  - The similarity values may be combined using various functions, although average is the most common.
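A minimal sketch of composition as a natural join on the shared ontology concept, reproducing the postalCode example above. The function name compose and the list-of-triples representation are assumptions for illustration; averaging is used as the combining function because the slide names it as the most common choice.

```python
def compose(r_to_o, o_to_s, combine=lambda a, b: (a + b) / 2):
    """Join two match lists on the ontology concept they share."""
    # Index the second mapping by ontology concept for the join.
    by_concept = {}
    for o, s, v in o_to_s:
        by_concept.setdefault(o, []).append((s, v))

    composed = []
    for r, o, v1 in r_to_o:
        for s, v2 in by_concept.get(o, []):
            composed.append((r, s, combine(v1, v2)))
    return composed


s1_to_o = [("postalCode", "Zip", 0.8)]
o_to_s2 = [("Zip", "postCode", 0.7)]
result = compose(s1_to_o, o_to_s2)
print(result)
```

Passing a different combine function (for example min) swaps in another way of combining similarity values without changing the join itself.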
Composition Example
[Figure: schema S1 (Contact: CompanyName, Email, Name, Position) and schema S2 (Contact: FirstName, LastName, Email, Position) are each matched to concepts of ontology O (Organization, Person, and their properties); composing the two sets of matches yields direct S1-to-S2 matches.]
Global View Construction
- One possible application of schema-to-schema mappings constructed in this way is building a global view.
- The paper gives a script that uses model management operators to compose any number of schema-to-ontology mappings into a single global view for all sources.
- Note that this algorithm is neither perfect nor fully automatic, as the mappings are not perfect and the Merge operator may require human intervention.
Global View Construction Example
Experimental Setup
- Matched the 5 sample order schemas used to evaluate COMA: CIDR, Excel, Noris, Paragon, and Apertum. Numbered these schemas 1, 2, 3, 4, and 5.
- Created a reference ontology that models some of the domain (but not all of it) and is quite different from the schemas (it uses IS-A, for example).
- Used the matchings specified with COMA as ground truth.
- Evaluation metrics:
  - Precision = # of correct matches / # of suggested matches
  - Recall = # of correct matches returned / # of matches in the ground truth
  - Overall = Recall * (2 - 1 / Precision)
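The three metrics can be computed as in the sketch below. The function name evaluate and the sample pairs are hypothetical. Note that Recall * (2 - 1/Precision) simplifies algebraically to (2 * correct - suggested) / total, which the code uses so that a precision of zero does not cause a division by zero.

```python
def evaluate(suggested, ground_truth):
    """Return (precision, recall, overall) for a set of suggested matches."""
    suggested, ground_truth = set(suggested), set(ground_truth)
    if not ground_truth:
        raise ValueError("ground truth must contain at least one match")
    correct = len(suggested & ground_truth)
    precision = correct / len(suggested) if suggested else 0.0
    recall = correct / len(ground_truth)
    # Same value as recall * (2 - 1 / precision) whenever precision > 0.
    overall = (2 * correct - len(suggested)) / len(ground_truth)
    return precision, recall, overall


# Hypothetical data: 4 true matches, 4 suggestions, 3 of them correct.
truth = [("a", "x"), ("b", "y"), ("c", "z"), ("d", "w")]
suggestions = [("a", "x"), ("b", "y"), ("c", "z"), ("d", "q")]
precision, recall, overall = evaluate(suggestions, truth)
print(precision, recall, overall)
```

Overall becomes negative once fewer than half of the suggestions are correct, reflecting that the user then spends more effort removing wrong matches than the matcher saved.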
Reference Order Ontology
Experiment #1: Schema-to-Ontology Matching
- Goal: Evaluate the accuracy of schema-to-ontology matching.
- Method:
  - Automatically convert the ontology into COMA format and match each schema with the ontology.
- Evaluation:
  - Measured the percent overlap of the schema and ontology. For many schemas, only 60% of their concepts were in the ontology.
  - Evaluated the precision, recall, and overall measures relative to the number of matches that could be found.
    - E.g., if overlap was 60% and recall was 50%, then only 30% of all schema elements were matched, but 50% of all the possible matches were found.
Experiment #1: Results
* noMax is poor for schema 5 because Buyer is incorrectly matched to the ontology.
Experiment #2: Schema-to-Schema Mappings
- Goal: Determine the accuracy of producing schema-to-schema mappings by composing schema-to-ontology matchings.
- Method:
  - Used automatically generated schema-to-ontology matchings and composed them. Evaluated the composition result against COMA answers for direct matching.
  - Evaluated noMax and Max techniques and manual mappings.
Experiment #2: Results (Overall)
* 1 <-> 2 is poor because of Street mapping.
* 4 <-> 5 is poor because of Buyer mapping.
Experiment #3: Improving Direct Matches
- Goal: Determine if the accuracy of producing direct schema-to-schema mappings can be improved by re-using schema-to-ontology matches.
- Method:
  - Generate schema-to-schema mappings by composing schema-to-ontology matchings and then use this as past matching information for COMA.
  - Allow COMA to perform a direct match given this information.
  - Evaluated noMax and Max techniques and manual mappings.
Experiment #3: Results (Overall)
* 1 <-> 2 is poor because of Street mapping.
Discussion and Conclusions
- Major findings:
  - 1) Schema-to-ontology mappings can be constructed with good accuracy (70-80% precision, 60% recall).
  - 2) The composition of schema-to-ontology matchings produces similar results to direct matching with COMA.
  - 3) Max has higher precision than noMax but lower recall. Max is probably the best choice when the user must filter out incorrect matches, and it always saves work.
  - 4) It is valuable to re-use schema-to-ontology matchings (either automatic or manually constructed) to improve the accuracy of direct matchings.
- Major conclusion: There is a benefit to building semi-automatic schema-to-ontology matchings for use in integration and global view construction.
Future Work and Challenges
- The major challenge is that the mappings carry no semantics, which often results in incorrect matches being suggested after composition.
  - We are currently working on extending the mappings to capture semantics to avoid many of these cases.
- The approach is not fully automatic (nor will it ever be). However, most manual work is in the schema-to-ontology matching stage. We need better algorithms and tools to support this matching.
- We want to perform experimental evaluation on larger ontologies such as those from NCI.
  - Issue: Many ontologies are not in a suitable form for intermediate mapping with schemas (they are just taxonomies).
Extra Slides...
Ontology Conversion Algorithm
- 1) Each ontology concept (class) becomes a node in the graph.
- 2) For each property (attribute) of a class, add a node to the graph and connect it to its class.
- 3) Non-basetype properties (those with domain and range in the ontology) are converted by:
  - 3a) Creating a node in the graph for the relationship.
  - 3b) Adding an edge from the domain class to this node.
  - 3c) Adding an edge from the new node to the range class.
  - Note: Properties whose domain or range is a union/intersection of concepts are not currently supported.
- 4) IS-A expanded by graph traversal.
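The four steps above can be sketched on a toy in-memory ontology. This is not the authors' JENA-based program: the dictionary layout, the single-parent "is_a" field, the "Class.attribute" node naming, and the sample classes are all assumptions made for the example.

```python
def ontology_to_graph(classes, relationships):
    """Convert a toy ontology into (nodes, edges) following steps 1 to 4."""
    nodes, edges = set(), set()

    def inherited_attributes(name):
        """Collect the attributes of a class and of all its IS-A ancestors."""
        attrs, seen = [], set()
        while name is not None and name not in seen:
            seen.add(name)
            attrs.extend(classes[name]["attributes"])
            name = classes[name]["is_a"]  # single inheritance assumed
        return attrs

    for cls in classes:
        nodes.add(cls)                           # step 1: class -> node
        for attr in inherited_attributes(cls):   # steps 2 and 4
            attr_node = f"{cls}.{attr}"
            nodes.add(attr_node)
            edges.add((cls, attr_node))

    for rel, domain, range_ in relationships:    # step 3
        nodes.add(rel)                           # 3a: node for the relationship
        edges.add((domain, rel))                 # 3b: domain class -> node
        edges.add((rel, range_))                 # 3c: node -> range class
    return nodes, edges


# Hypothetical ontology: Person and Organization are both a kind of Party.
classes = {
    "Party": {"is_a": None, "attributes": ["name"]},
    "Person": {"is_a": "Party", "attributes": ["email"]},
    "Organization": {"is_a": "Party", "attributes": []},
}
relationships = [("worksFor", "Person", "Organization")]
nodes, edges = ontology_to_graph(classes, relationships)
print(sorted(nodes))
```

After conversion, Person carries its own email attribute plus the name attribute copied down from Party, so a matcher with no notion of IS-A still sees every attribute of each class.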
Mapping Composition Challenges
- Composing an N:1 match with a 1:N match results in a cross-product.
- Cannot handle these cases as mappings have no semantics.
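A small sketch of why this happens: when two elements of one schema match the same ontology concept and that concept matches two elements of the other schema, the join produces every pairing. All element names and similarity values below are hypothetical.

```python
# N:1 - two S1 elements match the same ontology concept.
s1_to_o = [("street1", "Address", 0.7), ("street2", "Address", 0.7)]
# 1:N - that concept matches two S2 elements.
o_to_s2 = [("Address", "addrLine1", 0.6), ("Address", "addrLine2", 0.6)]

# Natural join on the ontology concept.
composed = [
    (r, s)
    for r, o1, _ in s1_to_o
    for o2, s, _ in o_to_s2
    if o1 == o2
]
print(composed)
```

The join returns all four pairings, and nothing in the (r, s, v) triples says which of them are intended.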
Global View Construction Script
Computes Global View of N Source Schemas (with ontology mappings)
Operator GlobalView(ArraySchemas, ArrayMappings, O, n)
  // ArraySchemas stores the n schemas
  // ArrayMappings stores the n schema-to-ontology mappings
  If n <= 0 Then Return empty schema;
  If n == 1 Then Return ArraySchemas[0];
  S1 = ArraySchemas[0];
  S2 = ArraySchemas[1];
  map1 = ArrayMappings[0];
  map2 = ArrayMappings[1];
  < S, map > = GlobalView2(S1, S2, map1, map2, O);
  For (i = 2; i <= n-1; i++)
    S1 = S;
    map1 = map;
    S2 = ArraySchemas[i];
    map2 = ArrayMappings[i];
    < S, map > = GlobalView2(S1, S2, map1, map2, O);
  end for;
  Return < S, map >;
Global View Construction Script (2)
Computes Global View of Two Source Schemas (with ontology mappings)
Operator GlobalView2(S1, S2, S1_O, S2_O, O)
  S1_S2 = S1_O * Invert(S2_O);
  < M, S1_M, S2_M > = Merge(S1, S2, S1_S2);
  M_O = Invert(S1_M) * S1_O + Invert(S2_M) * S2_O;
  Return < M, M_O >;
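The first step of this operator, S1_S2 = S1_O * Invert(S2_O), can be sketched on match triples as below. Reading `*` as Compose is an assumption, as are the function names, the sample triples, and the use of averaging to combine similarity values. Merge is left out because it is not fully automatic.

```python
def invert(mapping):
    """Invert: swap the two sides of every (a, b, v) triple."""
    return [(b, a, v) for a, b, v in mapping]


def compose(left, right):
    """Compose: join on the shared middle element, averaging similarities."""
    return [
        (a, c, (v1 + v2) / 2)
        for a, b1, v1 in left
        for b2, c, v2 in right
        if b1 == b2
    ]


# Hypothetical schema-to-ontology matches for two schemas.
s1_o = [("postalCode", "Zip", 0.8), ("city", "City", 0.9)]
s2_o = [("postCode", "Zip", 0.7)]

s1_s2 = compose(s1_o, invert(s2_o))
print(s1_s2)
```

Only postalCode finds a partner here, because S2 has no element matched to City; such unmatched elements are what Merge would have to carry into the merged schema on their own.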
Sample Order Schema Excel XML Schema
<?xml version="1.0"?>
<Schema name="PurchaseOrder.biz" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="PurchaseOrder" content="eltOnly">
    <element type="Header"/>
    <element type="Items"/>
    <element type="Footer"/>
    <element type="InvoiceTo"/>
    <element type="DeliverTo"/>
  </ElementType>
  <ElementType name="Items" content="eltOnly">
    <AttributeType name="itemCount" dt:type="int"></AttributeType>
    <attribute type="itemCount"/>
    <element type="Item" maxOccurs="*" minOccurs="1"/>
  </ElementType>
  <ElementType name="Item" content="empty">
    <AttributeType name="yourPartNumber" dt:type="string"></AttributeType>
    <AttributeType name="unitPrice" dt:type="number"></AttributeType>
    <AttributeType name="unitOfMeasure" dt:type="string"></AttributeType>
    <AttributeType name="salesValue" dt:type="number"></AttributeType>
    <AttributeType name="quantity" dt:type="number"></AttributeType>
    <AttributeType name="partNumber" dt:type="string"></AttributeType>
    <AttributeType name="partDescription" dt:type="string"></AttributeType>
    <AttributeType name="itemNumber" dt:type="int"></AttributeType>
Sample Order Schema Excel XML Schema (2)
    <attribute type="itemNumber"/>
    <attribute type="yourPartNumber"/>
    <attribute type="partDescription"/>
    <attribute type="quantity"/>
    <attribute type="unitOfMeasure"/>
    <attribute type="unitPrice"/>
    <attribute type="salesValue"/>
  </ElementType>
  <ElementType name="InvoiceTo" content="eltOnly">
    <element type="Contact"/>
    <element type="Address"/>
  </ElementType>
  <ElementType name="Header" content="eltOnly">
    <AttributeType name="yourAccountCode" dt:type="string"></AttributeType>
    <AttributeType name="orderNum" dt:type="string"></AttributeType>
    <AttributeType name="orderDate" dt:type="date"></AttributeType>
    <attribute type="orderNum"/>
    <attribute type="orderDate"/>
    <attribute type="ourAccountCode"/>
    <attribute type="yourAccountCode"/>
    <element type="Contact"/>
  </ElementType>
Sample Order Schema Excel XML Schema (3)
  <ElementType name="Footer" content="empty">
    <AttributeType name="totalValue" dt:type="number"></AttributeType>
    <attribute type="totalValue"/>
  </ElementType>
  <ElementType name="DeliverTo" content="eltOnly">
    <element type="Contact"/>
    <element type="Address"/>
  </ElementType>
  <ElementType name="Contact" content="empty">
    <AttributeType name="telephone" dt:type="string"></AttributeType>
    <AttributeType name="e-mail" dt:type="string"></AttributeType>
    <AttributeType name="contactName" dt:type="string"></AttributeType>
    <AttributeType name="companyName" dt:type="string"></AttributeType>
    <attribute type="contactName"/>
    <attribute type="companyName"/>
    <attribute type="e-mail"/>
    <attribute type="telephone"/>
  </ElementType>
Sample Order Schema Excel XML Schema (4)
  <ElementType name="Address" content="empty">
    <AttributeType name="street4" dt:type="string"></AttributeType>
    <AttributeType name="street3" dt:type="string"></AttributeType>
    <AttributeType name="street2" dt:type="string"></AttributeType>
    <AttributeType name="street1" dt:type="string"></AttributeType>
    <AttributeType name="stateProvince" dt:type="string"></AttributeType>
    <AttributeType name="postalCode" dt:type="string"></AttributeType>
    <AttributeType name="country" dt:type="string"></AttributeType>
    <AttributeType name="city" dt:type="string"></AttributeType>
    <attribute type="street1"/>
    <attribute type="street2"/>
    <attribute type="street3"/>
    <attribute type="street4"/>
    <attribute type="city"/>
    <attribute type="stateProvince"/>
    <attribute type="postalCode"/>
    <attribute type="country"/>
  </ElementType>
</Schema>
Experiment #2: Precision
Experiment #2: Recall
Experiment #3: Results (Precision)
Experiment #3: Results (Recall)