  • Number of slides: 48

Methods for ontology management and construction

Syllabus
• How to build an ontology
• Which tools are available
• How to learn ontologies automatically from resources

Building an ontology
• From scratch
• By re-engineering existing ontologies
• By integrating existing ontologies

Building an ontology
• Identify the purposes
• Identify the relevant "terms" (hotel, reservation)
• Distinguish concepts from relations among the terms used to denote both
  – E.g.: books(person, hotel)
• Encode the ontology
[Diagram: PERSON -books-> HOTEL]

Building an ontology
• Use available semi-formal resources (glossaries, thesauri, data/document warehouses)
• Use consensus-building and collaborative-working tools
• Integrate different kinds of expertise (lexicographers, knowledge engineers, domain experts, application users)

Ontology Editing and Management Systems

Protégé
• http://protege.stanford.edu/

Creating an OWL "project" in Protégé

OWL in Protégé
• Individuals (e.g., "FourSeasons")
• Properties
  – ObjectProperties (references)
  – DatatypeProperties (simple values)
• Classes (e.g., "Hotel")

Object Properties
• Link two instances (individuals)
• Types of relations (0..n, n..m)
[Diagram: Sydney -hasPart-> BondiBeach; Sydney -hasAccomodation-> FourSeasons]

Inverse Properties
• Represent bidirectional relations
• Adding a value to a property implies adding it to the inverse property
[Diagram: Sydney -hasPart-> BondiBeach; BondiBeach -isPartOf-> Sydney]
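
The inverse-property behaviour can be sketched in a few lines of Python. This is a toy in-memory store, not Protégé's API: the `INVERSE` table, the `assert_value` helper, and the store layout are assumptions made for illustration, while `hasPart` and `isPartOf` come from the slide.

```python
from collections import defaultdict

# Hypothetical declaration of which properties are inverses of each other.
INVERSE = {"hasPart": "isPartOf", "isPartOf": "hasPart"}

def assert_value(store, subject, prop, obj):
    """Add (subject, prop, obj) and, if prop has an inverse, the mirrored triple."""
    store[(subject, prop)].add(obj)
    inverse = INVERSE.get(prop)
    if inverse is not None:
        store[(obj, inverse)].add(subject)

store = defaultdict(set)
assert_value(store, "Sydney", "hasPart", "BondiBeach")
print(store[("BondiBeach", "isPartOf")])
```

Asserting the value in one direction is enough: the mirrored triple is added in the same call, so the two directions cannot drift apart.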

Transitive properties
[Diagram: NewSouthWales -hasPart-> Sydney; Sydney -hasPart-> BondiBeach; NewSouthWales -hasPart-> BondiBeach (derived)]
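
How the derived `hasPart` link arises can be shown with a naive transitive closure. This is a minimal sketch of the inference, not the procedure a reasoner actually uses; the function name `transitive_closure` is an assumption, and the individuals are those in the diagram.

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are both present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

has_part = {("NewSouthWales", "Sydney"), ("Sydney", "BondiBeach")}
derived = transitive_closure(has_part) - has_part
print(derived)
```

Only the pair linking NewSouthWales to BondiBeach is new; the two asserted pairs are removed from `derived` by the set difference.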

DatatypeProperties
• Link individuals to primitive values (integers, floats, strings, booleans, etc.)
[Example: Sydney: hasSize = 4,500,000; isCapital = true; rdfs:comment = "Don't miss the opera house"]

Classes
• Groups of individuals with common characteristics
• Every individual is an instance of at least one class
[Diagram: classes City (Sydney, Cairns) and Beach (BondiBeach, CurrawongBeach)]

Range and Domain
• Specification of a property
  – Domain: "the left-hand side of a relation" (Destination)
  – Range: "the right-hand side" (Accomodation)
[Diagram: Destination -hasAccomodation-> Accomodation; Sydney -hasAccomodation-> BestWestern; Sydney -hasAccomodation-> FourSeasons]

Domain
• Individuals can take values only for properties whose domain matches, e.g.: "Only Destinations can have Accomodations"
• Domains can contain multiple classes
  – Objects and Animates have Parts
• Domains can be left undefined: the property can then be used anywhere
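
The domain and range rules above can be sketched as a simple admissibility check. It follows the slides' constraint-style reading; OWL reasoners instead use domain and range to infer class membership. The `TYPES` and `PROPERTIES` tables and the `is_allowed` helper are hypothetical, and leaving `hasPart` without a domain is an assumption made to illustrate the "undefined domain" case.

```python
# Hypothetical class memberships for the individuals in the diagram.
TYPES = {
    "Sydney": {"Destination"},
    "FourSeasons": {"Accomodation"},
    "BestWestern": {"Accomodation"},
}

# An empty set means "undefined": the property may be used anywhere.
PROPERTIES = {
    "hasAccomodation": {"domain": {"Destination"}, "range": {"Accomodation"}},
    "hasPart": {"domain": set(), "range": set()},
}

def is_allowed(subject, prop, obj):
    """True if subject fits the property's domain and obj fits its range."""
    spec = PROPERTIES[prop]
    domain_ok = not spec["domain"] or bool(TYPES.get(subject, set()) & spec["domain"])
    range_ok = not spec["range"] or bool(TYPES.get(obj, set()) & spec["range"])
    return domain_ok and range_ok

print(is_allowed("Sydney", "hasAccomodation", "FourSeasons"))
print(is_allowed("FourSeasons", "hasAccomodation", "Sydney"))
```

A domain listing several classes is treated as a union here: membership in any one of them is enough.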

Relations between Classes and Superclasses
• Classes are organized in hierarchies
• Direct instances of a class are also indirect instances of its superclasses
[Diagram: individuals Cairns, Sydney, Canberra, Coonabarabran]
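
Indirect instance membership amounts to walking up the class hierarchy. In the sketch below the hierarchy City < UrbanArea < Destination and the direct type of Sydney are assumptions chosen for illustration (the slide lists the individuals only), as is the helper name `all_classes`.

```python
# Assumed hierarchy: each class maps to its direct superclasses.
SUPERCLASSES = {"City": ["UrbanArea"], "UrbanArea": ["Destination"], "Destination": []}

# Assumed direct type of one individual.
DIRECT_TYPE = {"Sydney": "City"}

def all_classes(individual):
    """Return the direct class followed by every superclass reachable from it."""
    seen = []
    stack = [DIRECT_TYPE[individual]]
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.append(cls)
            stack.extend(SUPERCLASSES.get(cls, []))
    return seen

print(all_classes("Sydney"))
```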

Relations between classes
• Classes can overlap arbitrarily
[Diagram: overlapping classes RetireeDestination and City; individuals Cairns, Sydney, BondiBeach]

Disjointness between classes
• Classes can overlap
• In some cases, however, we want them to have no instances in common
[Diagram: within Destination, UrbanArea disjointWith RuralArea; also shown: City, Sydney, Woomera, CapeYork]
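
A disjointness declaration can be checked by intersecting the instance sets of the two classes. This is a minimal sketch: the assignment of Sydney to UrbanArea and of Woomera and CapeYork to RuralArea is an assumed reading of the diagram, and `disjointness_violations` is a hypothetical helper.

```python
# Assumed class memberships (the slide shows the names without explicit links).
MEMBERS = {
    "UrbanArea": {"Sydney"},
    "RuralArea": {"Woomera", "CapeYork"},
}
DISJOINT = [("UrbanArea", "RuralArea")]

def disjointness_violations(members, disjoint_pairs):
    """Map each violated disjointWith pair to the individuals the classes share."""
    violations = {}
    for a, b in disjoint_pairs:
        shared = members.get(a, set()) & members.get(b, set())
        if shared:
            violations[(a, b)] = shared
    return violations

print(disjointness_violations(MEMBERS, DISJOINT))
```

An empty result means the ontology is consistent with the declaration; adding Sydney to RuralArea would make the pair show up with Sydney as the offending individual.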

OntoEdit
• OntoEdit http://www.ontoprise.de/products/ontoedit
  – Karlsruhe University
  – Graphical environment for ontology editing
  – Extensible architecture for adding plug-ins
  – The ontology is stored in a relational database
  – Implemented in XML, F-Logic, RDF(S), DAML+OIL

Axioms and inferences in OntoEdit

Other ontology management tools
• Chimaera
• OILEd
• Apollo
• MOMIS
• SymOntoX

Ontology learning and population

Ontology building
• Building an ontology entirely by hand is laborious and time-consuming
• Few ontologies contain more than a few hundred concepts
• Efforts focus mainly on defining core ontologies
• Machine learning and NLP tools are used to populate core ontologies automatically

Automatic methods for ontology building
• Generally based on NLP and machine learning
• Some methods:
  – Search texts for syntactic "patterns" that signal relations, for example apposition (e.g. "Shakespeare, the poet": hypernym(N2, N1) :- appositive(N2, N1))
  – Statistical methods to extract terms, and "string inclusion" to derive hypernymy relations (e.g. color laser printer)
  – Machine learning methods "learn" rules for assigning semantic relations between terms, using training sets of manually labeled texts
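
The first two methods can be illustrated with a short sketch. The regular expression is a deliberately crude stand-in for a syntactic apposition pattern, and the string-inclusion rule assumes English head-final compounds; both function names and the regex are assumptions, not part of any system named in these slides.

```python
import re

# Crude apposition pattern: "<ProperName>, the <noun>".
APPOSITION = re.compile(r"\b([A-Z][a-z]+), the ([a-z]+)\b")

def appositive_hypernyms(text):
    """Return (instance, hypernym) pairs suggested by appositions."""
    return [(m.group(1), m.group(2)) for m in APPOSITION.finditer(text)]

def string_inclusion_hypernyms(terms):
    """Link each term to the longest known term obtained by dropping leading words."""
    known = set(terms)
    pairs = []
    for term in terms:
        words = term.split()
        for i in range(1, len(words)):
            candidate = " ".join(words[i:])
            if candidate in known:
                pairs.append((term, candidate))
                break
    return pairs

print(appositive_hypernyms("Shakespeare, the poet, wrote Hamlet."))
print(string_inclusion_hypernyms(["color laser printer", "laser printer", "printer"]))
```

String inclusion yields a chain (color laser printer under laser printer, laser printer under printer) only when the shorter strings were themselves extracted as terms.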

The OntoLearn system
• Integrates machine learning, natural language processing, and statistical analysis techniques
• Uses several types of resources (semantic lexicons, text corpora, glossaries)
• Tested in several national and international projects in various domains (e-learning, interoperability, tourism, economics, computer networks, art)

Architecture of the OntoLearn System

1. Terminology Extraction
• Extract terminological candidates
  – Use a natural language processor (English or Italian)
  – Extract multiword strings conforming to syntactic structures for terminology (compounds, adj_noun, PP)
• Filter candidates, using two entropy-based measures and "contrastive" domains
  – Domain Relevance (DR)
  – Domain Consensus (DC)
• Obtain a terminology T (list of domain-relevant single and multiword expressions)
[Figure: candidates such as "Next week" and "Project partner", with DR shown over domains D1 ... Di ... Dn and DC over documents d1 ... di ... dn]
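
The slide names the two filtering measures without giving formulas. The sketch below uses one common formulation and should be read as an assumption: Domain Relevance as the term's probability in the target domain divided by its highest probability in any domain, and Domain Consensus as the entropy of the term's distribution over the documents of the domain. Function names, data layout, and the toy counts are all hypothetical.

```python
import math

def domain_relevance(term, domain, term_freqs):
    """term_freqs: {domain: {term: count}}. P(term|domain) over the max across domains."""
    def prob(d):
        total = sum(term_freqs[d].values())
        return term_freqs[d].get(term, 0) / total if total else 0.0
    best = max(prob(d) for d in term_freqs)
    return prob(domain) / best if best else 0.0

def domain_consensus(term, doc_freqs):
    """doc_freqs: one {term: count} dict per document. Entropy of the term across documents."""
    counts = [doc.get(term, 0) for doc in doc_freqs]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# Toy counts, invented for illustration only.
freqs = {
    "networks": {"packet switching": 8, "next week": 2},
    "tourism": {"hotel": 9, "next week": 1},
}
print(domain_relevance("packet switching", "networks", freqs))
print(domain_consensus("packet switching", [{"packet switching": 2}, {"packet switching": 2}]))
```

Under this formulation a term scores high on relevance when it is much more frequent in the target domain than in the contrastive ones, and high on consensus when it is spread evenly across the domain's documents rather than concentrated in one.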

2. Search Definitions
• Use existing glossaries and Google's define feature to search for term definitions
• Extract term definitions from documents (tutorials, seminal papers)
• Parse the glossary definitions
• Use a grammar-based approach to detect hyperonymy relations in glossary definitions
• Use a WSD algorithm to attach sub-tree roots to the concepts of the core ontology
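
Detecting the hypernym in a definition can be approximated by extracting the genus phrase that opens it. The regular expression below is a rough stand-in for the grammar-based approach mentioned above, not that grammar itself; the pattern, the list of stop words, and the function name are assumptions. The sample definitions are taken from these slides.

```python
import re

# Optional article, then the genus phrase, then a word that usually opens the differentia.
GENUS = re.compile(
    r"^(?:(?:an?|the)\s+)?(.+?)\s+(?:that|which|who|for|of)\b",
    re.IGNORECASE,
)

def hypernym_from_definition(definition):
    """Return the leading genus phrase of a definition, or None if no pattern matches."""
    match = GENUS.match(definition.strip())
    return match.group(1).lower() if match else None

print(hypernym_from_definition("A person who implements an expert system."))
print(hypernym_from_definition("A cryptographic algorithm for the protection of unclassified computer data"))
```

This only handles definitions that begin with the genus; a definition such as "Ontology construction is usually a manual, iterative process ..." needs real parsing to locate the predicate noun phrase.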

An example of root attachment
[Diagram: core ontology node algorithm#1, extended with:]
• Cryptographic_algorithm: no definition
• Data Encryption Standard: "A cryptographic algorithm for the protection of unclassified computer data and …"
• Type_3_Algorithm: "A cryptographic algorithm that has been registered by the National Institute of Standards and Technology and …"

Computer Networks domain

Interoperability domain

Glossary Parsing Experiments
• 6,800 terms of computer network applications
• Minor experiments on Economy, Art techniques, Tourism, and Interoperability (200-1000 terms)
• Computer networks: about 550 sub-trees, 98% precision in detecting hypernyms from definitions
• About 82% precision in attaching sub-tree roots to WordNet (ongoing experiments to reinforce the concept choice with "classical" similarity measures)
• Similar performance in the other domains

Ongoing experiment within INTEROP
• An interoperability glossary of 377 terms built from online glossaries (to be further extended)
• Ongoing evaluation being performed by 7 domain experts from different areas of interoperability (enterprise modeling, architectures & platforms, ontologies)

Term | Definition | Hypernym | Source
Knowledge engineer | A person who implements an expert system. | Person | www.pera.net/Tools/Glossary/Enterprise_Integration/Glossary.html
Ontology construction | Ontology construction is usually a manual, iterative process consisting […] | Iterative process | mia.ece.uic.edu/~papers/MediaBot/pdf00002.pdf
Process Analysis | Analytical examination of a process for the purpose of […] | Analytical examination | www.pera.net/Tools/Glossary/Enterprise_Integration/Glossary.html

What if no definitions are found? 3. Compositional interpretation
• Objective: attempt a compositional interpretation, i.e. the meaning of a complex term is obtained by composing the meanings of its parts
• Example: computer terminal is not in the core ontology, but (e.g. in WordNet) there are 2 senses for computer and 3 senses for terminal
• Method: find the appropriate core ontology concept for each term component using a WSD algorithm
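
The size of the search space in the computer terminal example can be made concrete by enumerating the candidate interpretations. This sketch only lists the combinations that a WSD algorithm must choose among; the sense labels follow the word#n notation used in these slides, and the helper name is an assumption.

```python
from itertools import product

def candidate_interpretations(senses_per_word):
    """Return every way of picking one sense per component word."""
    words = list(senses_per_word)
    combos = product(*(senses_per_word[w] for w in words))
    return [dict(zip(words, combo)) for combo in combos]

senses = {
    "computer": ["computer#1", "computer#2"],
    "terminal": ["terminal#1", "terminal#2", "terminal#3"],
}
candidates = candidate_interpretations(senses)
print(len(candidates))
```

With 2 senses for computer and 3 for terminal there are 2 x 3 = 6 candidates, and the number grows multiplicatively with each added component, which is why a disambiguation step is needed rather than manual inspection.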

SSI: an algorithm for WSD
• In OntoLearn, used for:
  – Compositional interpretation of multiword expressions (e.g. "computer terminal" = computer#1 terminal#3)
  – Attaching a sub-tree under the appropriate node of a general-purpose ontology (e.g. for the tree rooted in artificial_language#1)

Structural Semantic Interconnection (SSI)
• A WSD algorithm based on Structural Pattern Recognition
• Starts from terms (singleton and multiword expressions) in a context T and produces an inventory I of concept labels
• Relevant applications:
  – Ontology Learning
  – WSD tasks
  – Query Expansion
  – Semantic Annotation

SSI is a knowledge-based algorithm
• Step 1: A lexical knowledge base (LKB) was built by integrating several available online resources:
  – WordNet
  – Oxford Collegiate Dictionary of collocations
  – SemCor and LDC (semantically annotated corpora)
  – …
  – Integration partly manual, partly automatic

Step 2. A structural representation of a concept c is a cut over the LKB, centered in c, including all nodes in the LKB at a maximum distance of 3 from c
[Figure: two example graphs, "Bus, transport" and "Bus, connector"]
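
The cut described in Step 2 is a bounded breadth-first traversal. The sketch below assumes the LKB is available as an adjacency dictionary; the toy graph, its sense labels other than bus#2, and the function name `concept_cut` are invented for illustration.

```python
from collections import deque

def concept_cut(lkb, center, max_distance=3):
    """Return {node: distance} for all nodes within max_distance edges of center."""
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the cut radius
        for neighbour in lkb.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance

# Hypothetical fragment of an LKB: a simple chain of related concepts.
toy_lkb = {
    "bus#2": ["connector#1"],
    "connector#1": ["device#1"],
    "device#1": ["artifact#1"],
    "artifact#1": ["object#1"],
}
print(concept_cut(toy_lkb, "bus#2"))
```

In the toy chain, artifact#1 sits at distance 3 and is kept, while object#1 at distance 4 falls outside the cut.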

The task: given a context, build a semantic interpretation
T = [t1, t2, …, tn] (context) -> SSI -> I = [St1, St2, …, Stn] (semantic interpretation)

(a) Build Semantic Networks
T = [bus, network, redundancy, connection]
• For each alternative sense of a word in T, find the best matching graph with respect to the already disambiguated senses in I
• I is initialised with the unambiguous words in T; if there are no monosemous words, with the first sense of the least ambiguous word w in T (the algorithm is then forked into as many executions as there are senses of w)
[Figure: two senses of bus, with glosses "A vehicle carrying many passengers; used for transport" and "The topology of a network whose components are connected by a busbar"]

Semantic Interpretation of Terms: (b) Intersect Semantic Nets
T = [bus, network, redundancy, connection]
I = [bus#2, network#5, redundancy#3, connection#3]
• Intersect all alternative semantic networks and choose the networks with the highest number of relevant intersections
• A context-free (CF) grammar is used to detect relevant intersection patterns, e.g. the gloss rule ("network#5" appears in the definition (gloss) of "bus#2") or the hyperonymy/hyponymy rule
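
The iterative selection in steps (a) and (b) can be sketched as a greedy loop. This is a heavily simplified stand-in, not the SSI algorithm itself: direct links between senses replace the grammar of intersection patterns, the forking over the senses of the seed word is omitted, and the toy sense inventory (including treating redundancy as monosemous) and all links except the gloss link between network#5 and bus#2 are assumptions.

```python
def ssi_sketch(senses, related):
    """senses: {word: [sense, ...]}; related: set of frozenset({sense_a, sense_b}) links."""
    # Initialise I with the monosemous words.
    interpretation = {w: s[0] for w, s in senses.items() if len(s) == 1}
    pending = [w for w in senses if w not in interpretation]
    if not interpretation and pending:
        # No monosemous word: seed with the first sense of the least ambiguous word.
        seed = min(pending, key=lambda w: len(senses[w]))
        interpretation[seed] = senses[seed][0]
        pending.remove(seed)
    while pending:
        best = None
        for word in pending:
            for sense in senses[word]:
                score = sum(
                    frozenset((sense, chosen)) in related
                    for chosen in interpretation.values()
                )
                if best is None or score > best[0]:
                    best = (score, word, sense)
        if best[0] == 0:
            break  # no remaining sense connects to the interpretation
        _, word, sense = best
        interpretation[word] = sense
        pending.remove(word)
    return interpretation

toy_senses = {
    "bus": ["bus#1", "bus#2"],
    "network": ["network#1", "network#5"],
    "redundancy": ["redundancy#3"],
    "connection": ["connection#1", "connection#3"],
}
toy_links = {
    frozenset(("redundancy#3", "network#5")),
    frozenset(("network#5", "bus#2")),
    frozenset(("bus#2", "connection#3")),
}
print(ssi_sketch(toy_senses, toy_links))
```

Each pass adds the single best-connected sense, so later choices can build on earlier ones: connection#3 becomes reachable only after bus#2 has entered the interpretation.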

Example of an intersection pattern for taxi#1
T = [taxi, license, traveller, driver]

The algorithm is used in two ways
• Compositional interpretation:
  – Given a list of multiword expression components (e.g. component, interacting, computer, ...), select a sense for each: interacting-component = interact#1 component#3
• Root attachment:
  – Given the elements of a sub-tree, find the appropriate attachment between the root node and a node in WordNet (e.g. artificial_language#1)

4. Build a Semantic Tree
• A complex term now corresponds to a complex concept
• Arrange concepts according to the detected hyperonymy relations
• Hyperonymy relations have been detected either by parsing natural language definitions or by disambiguating the components of a complex term

An example of root attachment
[Diagram: core ontology node algorithm#1, extended with:]
• Cryptographic_algorithm: no definition
• Data Encryption Standard: "A cryptographic algorithm for the protection of unclassified computer data and …"
• Type_3_Algorithm: "A cryptographic algorithm that has been registered by the National Institute of Standards and Technology and …"