
Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation
Stephen W. Liddle, Information Systems Department
Yihong Ding & David W. Embley, Computer Science Department
Brigham Young University, Provo, Utah

Outline
  • Background
  • Instance recognition semantics
  • Automated semantic annotation
  • OWL-AA: OWL for Automated Annotation
  • Related work
  • Conclusion
6 November 2007, ONISW 2007, Auckland, New Zealand

Motivation
  • Big Hairy Audacious Goal: deliver a machine-processable semantic web that will let us build agents to work intelligently on our behalf
  • Realistic Shorter-term Goal: provide automatic semantic annotation machinery that turns ordinary web pages into semantic web pages by adding appropriate metadata
  • Goal of this Paper: show that we can extend OWL with epistemological declarations that let us use our data extraction engine to provide a solid automatic annotator that works with standard ontologies

Typical Approach
  • Typical approach to automatic annotation:
      • Use an existing data extraction engine to wrap and annotate pages
      • BUT these approaches do not match extracted data with an ontology
  • Main drawback:
      • A post-processing step is needed to map extracted data to the ontology

Our Approach
  • Our data extraction engine is ontology based and does not need a post-processing alignment step
      • We have demonstrated that our approach works (Ding/Embley/Liddle 2006, LNCS 4185)
      • But our ontology language is not a W3C standard
  • OWL is a standard (W3C recommendation)
      • But it lacks sufficient declarative instance-recognition semantics for automatic annotation

A Solution
  • Extend OWL with declarative instance-recognition semantics
      • Call this OWL for Automated Annotation (OWL-AA)
  • OWL-AA extension is epistemological in nature
      • Ontological definitions should be independent of the form of knowledge representation
      • Epistemological definitions may be sensitive to the form of knowledge representation

Our Contributions
  • OWL-AA extends OWL to provide for automated semantic annotation that
      • Embeds instance-recognition semantics declarations in ontologies and data extraction tools
      • Provides enhanced knowledge sharing and reuse via semantic web
      • Separates domain knowledge creation from semantic annotation process
          • Thus maintaining separation between concerns of domain experts and software developers

Instance Recognition Semantics
  • Instance Semantics Recognizers (ISRs)
      • Formal specifications that identify instances of a concept in unstructured, semistructured, or structured text
  • Example:
      • Concept: PhoneNumber
      • Text: “Call me at 555-1212.”
      • ISR should recognize that 555-1212 in the text has the intensional meaning of PhoneNumber.
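
The PhoneNumber example can be sketched in a few lines of Python. This is only an illustration of the idea: the regular expression and the `recognize` helper are assumptions made for this sketch, not the recognizer the authors actually declare.

```python
import re

# Hypothetical value pattern for the PhoneNumber concept: a seven-digit
# local number such as 555-1212. An assumption for illustration only.
PHONE_NUMBER = re.compile(r"\b\d{3}-\d{4}\b")

def recognize(concept, pattern, text):
    """Return (concept, matched text, start, end) for every match in text."""
    return [(concept, m.group(), m.start(), m.end())
            for m in pattern.finditer(text)]

annotations = recognize("PhoneNumber", PHONE_NUMBER, "Call me at 555-1212.")
print(annotations)
```

Each result pairs a span of the text with the concept it is an instance of, which is the kind of metadata an annotator needs to attach to a page.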

ISR Declarations in Data Frames
  BedroomNr
    internal representation: Integer
    external representation: [1-9]|10
    left context phrase: \b
    right context phrase: .*r(oo)?ms?
    exception phrase: \s.*ba(th)?s?\b.*r(oo)?ms?
    context keywords: b(r|d)s?|bdrms?|bed(rooms?)?
  ...
  LessThan(p1: Price, p2: Price) returns (Boolean)
    context keywords: less than|<|or less|fewer|...
  end
  Feature
    external representation: ApartmentFeature.lexicon
    ...
  end
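
One way to read such a data frame is as a value pattern plus context patterns that confirm or veto a candidate match. The sketch below assumes that reading. The `DataFrame` class, its field names, and the simplified patterns are hypothetical and only loosely modeled on the BedroomNr declaration; how the authors' engine actually combines these parts is not shown on the slide.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFrame:
    """Hypothetical, simplified stand-in for a data frame declaration."""
    concept: str
    external: str        # regex for the value itself
    right: str = ""      # regex that must follow the value
    exception: str = ""  # regex that vetoes a match when it follows the value

    def recognize(self, text):
        # \b on both sides keeps "1" from matching inside "15".
        pattern = re.compile(
            rf"\b({self.external})\b(?={self.right})", re.IGNORECASE)
        hits = []
        for m in pattern.finditer(text):
            rest = text[m.end(1):]
            if self.exception and re.match(self.exception, rest, re.IGNORECASE):
                continue  # e.g. the number belongs to "bath", not "bdrm"
            hits.append((self.concept, m.group(1), m.start(1)))
        return hits

# Assumed patterns: a deliberately loose right context plus an exception
# that rejects numbers immediately followed by a bath keyword.
BEDROOM_NR = DataFrame(
    concept="BedroomNr",
    external=r"10|[1-9]",
    right=r"[^.]*?r(?:oo)?ms?",
    exception=r"\s*ba(?:th)?s?\b",
)

hits = BEDROOM_NR.recognize("2 bath, 3 bdrm apt")
print(hits)
```

In "2 bath, 3 bdrm apt" the loose right context accepts both numbers, and the exception pattern then rejects the first one because "bath" follows it directly.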

Resiliency
  • There are no document layout dependencies in our ISR declarations
      • Other approaches typically use page layout information for extraction
  • Layout independence (resiliency) is a major benefit
      • Page layouts change periodically on the web
      • Different publishers in the same domain generally use different page layouts
      • Rewriting wrappers is costly
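
As a rough illustration of layout independence, the sketch below applies one content-based pattern to the same listing in two different layouts. Both page snippets and the pattern are invented for this example.

```python
import re

# Hypothetical recognizer: a bedroom count followed by a bedroom keyword.
BEDROOM_NR = re.compile(r"\b(10|[1-9])(?=\s*(?:bdrms?|bedrooms?)\b)",
                        re.IGNORECASE)

# Two made-up layouts for the same listing: a classified ad and a table row.
pages = {
    "classified": "CAPITOL HILL Luxury 2 bdrm 2 bath, $1250 mo.",
    "table": "<tr><td>Capitol Hill</td><td>2 bedrooms</td><td>$1250</td></tr>",
}

found = {name: BEDROOM_NR.findall(text) for name, text in pages.items()}
print(found)
```

Because the pattern depends only on the value and its nearby wording, not on tags or position, it does not need rewriting when the markup changes.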

Annotation (Apartment Rental)
  Ad text: CAPITOL HILL Luxury 2 bdrm 2 bath, 2 grg, w/d, views, 1700 sq ft. $1250 mo. Call 533-0293
  [Figure: callouts mark parts of the ad as external representations, context phrases, and a context keyword for the concepts BedroomNr, BathNr, MonthRate, Feature, and ContactPhone.]
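
The annotation of this ad can be approximated with one pattern per concept, where the context is expressed as a lookahead or lookbehind around the value. All four patterns below are assumptions written for this sketch, and the Feature concept is left out because its lexicon is not given on the slide.

```python
import re

AD = ("CAPITOL HILL Luxury 2 bdrm 2 bath, 2 grg, w/d, views, "
      "1700 sq ft. $1250 mo. Call 533-0293")

# Hypothetical recognizers: group 1 is the value (external representation),
# the lookaround is the context that ties the value to the concept.
RECOGNIZERS = {
    "BedroomNr": r"\b(10|[1-9])(?=\s*bdrms?\b)",
    "BathNr": r"\b([1-9])(?=\s*baths?\b)",
    "MonthRate": r"\$(\d+)(?=\s*mo\b)",
    "ContactPhone": r"(?<=call )(\d{3}-\d{4})\b",
}

def annotate(text):
    """Map each concept to the first value its recognizer finds in text."""
    annotations = {}
    for concept, pattern in RECOGNIZERS.items():
        m = re.search(pattern, text, re.IGNORECASE)
        if m:
            annotations[concept] = m.group(1)
    return annotations

result = annotate(AD)
print(result)
```

The same digit "2" is assigned to BedroomNr or BathNr purely by the words around it, which is the role the context phrases play in the figure.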

ISR-Based Annotation Demo
  http://www.deg.byu.edu, demos section

RDFS Graph of OWL-AA
  [Figure: RDFS graph of the OWL-AA vocabulary, with subClassOf, domain, and range edges.
   Classes: ISR, ExternalRepresentation, ContextualRepresentation, Exception, RegularExpression, LexiconList, ContextPhrase, LeftContextPhrase, RightContextPhrase, ContextKeyword.
   Properties: ISRvalue, hasException, extractionPattern, extractionLeftContext, extractionRightContext, extractionKeyword.
   The graph also references owl:Class and &xsd;string.]

OWL-AA Details (1)
  • Add namespace
      xmlns:owlaa="http://www.deg.byu.edu/OWL-AA#"
  • Add properties to OWL classes associating ISR declarations

OWL-AA Details (2)
  • Declare ISR value and contextual restriction
      <owlaa:RegularExpression rdf:…
      [1-9]|10

OWL-AA Details (3)
  • Add context phrases, context keywords, and exceptions
      <owlaa:LeftContext…
      \b
      . . . (see our paper for the rest of the details)

OWL-AA Features
  • Syntactically, OWL-AA is attachment-independent w.r.t. OWL
      • No need to change a single line of existing OWL ontologies
  • Semantically, OWL-AA is also attachment-independent w.r.t. OWL
      • ISR declarations do not change meaning of existing ontology concepts at conceptual level
  • Prototype implementation translates OWL-AA to our own internal format for processing
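
Attachment independence can be pictured as keeping ISR declarations in a separate structure keyed by class, so that attaching or detaching them never edits the ontology. The sketch below is a toy model of that idea only: the dictionaries, the `apt:` names, and the `attach` function are hypothetical and do not reflect the OWL-AA syntax or the prototype's internal format.

```python
import copy

# Hypothetical stand-in for an existing OWL ontology: class name -> axioms.
ontology = {
    "apt:BedroomNr": {"subClassOf": "apt:ApartmentProperty"},
    "apt:MonthRate": {"subClassOf": "apt:ApartmentProperty"},
}

# Hypothetical ISR declarations, stored apart from the ontology itself.
isr_declarations = {
    "apt:BedroomNr": {"externalRepresentation": r"[1-9]|10"},
}

def attach(ontology, declarations):
    """Build an annotator view pairing each class with its ISR declarations."""
    return {name: {"axioms": axioms, "isr": declarations.get(name, {})}
            for name, axioms in ontology.items()}

before = copy.deepcopy(ontology)
view = attach(ontology, isr_declarations)
unchanged = ontology == before  # attaching did not modify the ontology
print(unchanged, view["apt:BedroomNr"]["isr"])
```

Detaching is then just a matter of ignoring the second structure; a tool that knows nothing about ISR declarations still sees the original ontology.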

Related Work
  • We aren’t the first to work on ISR declarations
      • IBM’s Unstructured Information Management Architecture
  • There are many proposals for extending OWL
  • Two extensions related to ours
      • Context OWL (C-OWL)
          • Extension for ontology mapping that localizes ontology content for global visibility
      • OWL-Eu
          • Extension that enriches OWL with customized datatypes, a significant need
          • Not attachment-independent, unfortunately
          • But our approach is compatible with OWL-Eu too

Conclusion
  • Automated semantic annotation is an important and fundamental problem for the semantic web
  • Key to automated annotation is ISR declaration
  • Our epistemological OWL-AA extension augments OWL to formalize ISR declarations

Conclusion (cont.)
  • OWL-AA is fully compatible with ordinary OWL
      • Attachable/detachable
  • OWL-AA does not introduce complexity or decidability issues into OWL
  • Prototype implementation demonstrates that our approach works well
  • Embedding epistemological declarations in OWL is a fruitful area meriting further research