  • Number of slides: 10

AeroDAML: Applying Information Extraction to Generate DAML Annotations
Dr. Paul Kogut
Lockheed Martin Management & Data Systems

What is Information Extraction?
[Diagram: text or web pages -> Information Extraction (drawing on linguistic knowledge) -> entities, relationships, co-references, events]
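As a rough illustration of the inputs and outputs named on this slide, the following minimal sketch pulls entities and one relationship out of a sentence. It is a toy, not AeroDAML's method: the gazetteer, the "works for" rule, the `employeeOf` predicate, and all class and function names are hypothetical, chosen only to show what "entities" and "relationships" look like as data.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Entity:
    text: str    # surface string found in the document
    kind: str    # entity type, e.g. "Person" or "Organization"
    start: int   # character offset of the mention


@dataclass(frozen=True)
class Relationship:
    subject: Entity
    predicate: str
    obj: Entity


# Hypothetical gazetteer standing in for real linguistic knowledge.
GAZETTEER = {
    "Paul Kogut": "Person",
    "Lockheed Martin": "Organization",
}


def extract_entities(text):
    """Return gazetteer matches in document order."""
    found = []
    for name, kind in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append(Entity(name, kind, match.start()))
    return sorted(found, key=lambda e: e.start)


def extract_relationships(text, entities):
    """Toy rule: '<Person> works for <Organization>' yields employeeOf."""
    rels = []
    for person in entities:
        for org in entities:
            if person.kind != "Person" or org.kind != "Organization":
                continue
            if person.start >= org.start:
                continue
            between = text[person.start + len(person.text):org.start]
            if between.strip() == "works for":
                rels.append(Relationship(person, "employeeOf", org))
    return rels


text = "Paul Kogut works for Lockheed Martin."
entities = extract_entities(text)
relationships = extract_relationships(text, entities)
```

A real extractor would also resolve co-references (for example, linking "he" or "the company" back to these entities) and detect events, which this sketch leaves out.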

Extraction and Semantic Annotation
  • Consumer-side extraction: 3rd-party text -> database
      - Advantages:
          - Applicable to raw documents (most of the web)
      - Disadvantages:
          - Must deal with the full complexity of natural language
  • Semantic annotation was proposed to overcome the difficulty of consumer-side extraction, but annotation is labor intensive
  • Producer-side extraction: authored text -> annotation
      - Advantages:
          - Partial automation: reduces manual effort
          - Human-assisted disambiguation
          - Domain customization for intranets and B2B e-commerce
      - Disadvantages:
          - Requires manual effort to correct and add a rich set of relationships
          - Domain customization requires up-front effort from the author/webmaster
  • Both types of extraction will coexist.

AeroDAML Architecture
[Diagram. Components: Text Extraction; Extraction to DAML Translation; DAML Annotator; UBOT Annotation Editor; DAML Ontologies. Data labels: text or web pages; basic annotation; refined annotation; DAML annotated text or web pages.]
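The "Extraction to DAML Translation" step can be pictured as mapping extracted entities and relationships onto ontology classes and properties, then serializing them as RDF/XML, the syntax DAML annotations are written in. The sketch below is an assumption-laden illustration, not AeroDAML's actual translator: the `ont` namespace URI, the class and property names (`Person`, `Organization`, `name`, `employeeOf`), and the function `to_daml_annotation` are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONT_NS = "http://example.org/ontology#"  # hypothetical ontology namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ont", ONT_NS)


def to_daml_annotation(page_url, entities, relationships):
    """Serialize extraction results as RDF/XML.

    entities:      {entity_id: (ontology_class, label)}
    relationships: [(subject_id, property_name, object_id)]
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    nodes = {}
    for ent_id, (cls, label) in entities.items():
        # One typed node per entity, identified relative to the page.
        node = ET.SubElement(
            root,
            f"{{{ONT_NS}}}{cls}",
            {f"{{{RDF_NS}}}about": f"{page_url}#{ent_id}"},
        )
        ET.SubElement(node, f"{{{ONT_NS}}}name").text = label
        nodes[ent_id] = node
    for subj, prop, obj in relationships:
        # Each relationship becomes a property pointing at the object node.
        ET.SubElement(
            nodes[subj],
            f"{{{ONT_NS}}}{prop}",
            {f"{{{RDF_NS}}}resource": f"{page_url}#{obj}"},
        )
    return ET.tostring(root, encoding="unicode")


xml_out = to_daml_annotation(
    "http://example.org/page.html",
    {
        "e1": ("Person", "Paul Kogut"),
        "e2": ("Organization", "Lockheed Martin"),
    },
    [("e1", "employeeOf", "e2")],
)
print(xml_out)
```

In the architecture on this slide, output of this kind would correspond to the "basic annotation", which a person could then correct and enrich in an annotation editor to produce the "refined annotation".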

Client-Server AeroDAML
  • Users:
      - personnel who routinely produce documents (e.g., intelligence analysts)
      - personnel who have a large collection of legacy documents

Web-based AeroDAML
  • Users:
      - novice/infrequent DAML annotators
      - people who want to do quick/simple annotation of a web page

AeroDAML Output: Entities

AeroDAML Output: Relationships

AeroDAML Output: Co-reference

AeroDAML Plans
  • Integrate with annotation editor
  • Improve Web-based AeroDAML
      - Allow user to select other ontologies besides the current AeroDAML default ontology for annotation generation:
          - OpenCyc or Cyc Upper Ontology
          - CIA World Fact Book
          - IEEE Standard Upper Ontology
          - Dublin Core
          - UNSPSC ...
  • Try AeroDAML!
      - http://ubot.lockheedmartin.com/ubot/hotdaml/aerodaml.html