
Ontologies
German Rigau i Claramunt
http://www.lsi.upc.es/~rigau
TALP Research Center
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya

Ontologies: Outline
• WordNet (Miller et al. 90, Fellbaum 98)
• EuroWordNet (Vossen et al. 98)
• Spanish WordNet
• Combining Methods (Atserias et al. 97)
• Mapping hierarchies (Daudé et al. 01)
• Mikrokosmos (Viegas et al. 96)
• Cyc (Malesh et al. 96)
• WordNet 2 (Harabagiu 98)
• MindNet (Richardson et al. 97)
• ThoughtTreasure (Mueller 00)
• Meaning ...

WordNet & EuroWordNet

WordNet & EuroWordNet
• Princeton University (Miller et al. 1990)
• Lexicalized concepts (words, lexical units)
• Related to one another by semantic relations
  • synonymy
  • antonymy
  • hypernymy-hyponymy
  • meronymy
  • entailment
  • cause
  • ...

WordNet & EuroWordNet: Semantic Relations in WN 1.5
• Synonymy
  • Lexicalized concepts (SYNSETS)
  • Weak notion of synonymy: synonymy in context
  • Synset: a set of words or lexical units that express one concept in a given context
• Hypernymy / Hyponymy
  • Class-to-subclass relation

WordNet & EuroWordNet: Semantic Relations in WN 1.5
• Meronymy
  • Component part: {hand} {arm}
  • Member of a collection: {person} {people}
  • Substance: {newspaper} {paper}

WordNet & EuroWordNet: Semantic Relations in WN 1.5
• Antonymy: {big} {small}
• Cause: {kill} {die}
• Entailment: {divorce} {marry}
• Derivation: {presidential} {president}
• Similarity: {good} {positive}

WordNet & EuroWordNet: WordNet Example
[Figure: WordNet hierarchy fragment, <conveyance> <vehicle> <motor vehicle, automovile, ...]

WordNet & EuroWordNet
• Project LE-2 4003
• EU Telematics Application Programme
• Semantic networks for several languages
• Integrated and interconnected
  • English: University of Sheffield
  • Dutch: Univ. of Amsterdam
  • Italian: I.L.C., Pisa
  • Spanish: UB, UPC, UNED
• Computers and the Humanities (special issue, 1998)
• http://www.hum.uva.nl/~ewn/

WordNet & EuroWordNet: EuroWordNet Extensions
• EWN 2: German, French, Czech, Swedish, Estonian
• ITEM project: Spanish, Catalan, Basque
• CREL (Centre de Referència d’Enginyeria Lingüística): Catalan (UB, UPC)

WordNet & EuroWordNet: Applications
• Development of basic resources
• Cross-lingual information processing
  – Multilingual information retrieval systems (e.g., the Internet)
  – Lexical-semantic module of language engineering systems
    • Information extraction
    • Machine translation

WordNet & EuroWordNet: Design Requirements
• Preserve the semantic relations specific to each language
• Maximum compatibility among the different resources
• Relative independence of the WordNets
  • in the construction process
  • in the final result

WordNet & EuroWordNet: Components of EuroWordNet
• Core
  • The ILI (Interlingual Index)
  • The Top Concept Ontology (TCO)
  • The Domain Ontology (DO)
• Periphery
  • Language-specific WordNets

WordNet & EuroWordNet: Interlingual Index of EuroWordNet
• Unstructured collection of elements
• Each linked to
  • at least one synset of an EWN wordnet
  • an element of the TCO or DO
• Associated with WN 1.5 synsets
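
The role of the ILI as an unstructured pivot between wordnets can be sketched as follows. This is a minimal illustration, not EuroWordNet's actual data format: the record identifiers (`ili:behave`, `ili:perform`), the English synset names and the function `equivalents_via_ili` are hypothetical; only {actuar-1} and its gloss 'behave in a certain manner' come from the slides.

```python
def equivalents_via_ili(synset, source_links, target_links):
    """Find target-language synsets that share at least one ILI record
    with the given source-language synset.

    source_links / target_links map a synset name to the set of ILI
    record identifiers it is linked to (hypothetical identifiers).
    """
    records = source_links.get(synset, set())
    return sorted(
        target for target, target_records in target_links.items()
        if target_records & records
    )


# Toy data: a Spanish synset and two English synsets linked to ILI records.
spanish = {"actuar-1": {"ili:behave"}}
english = {"behave-1": {"ili:behave"}, "act-3": {"ili:perform"}}

print(equivalents_via_ili("actuar-1", spanish, english))
```

Because the index itself carries no structure, each wordnet keeps its own language-internal relations and only the links to ILI records are shared.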

WordNet & EuroWordNet: Top Concept Ontology of EuroWordNet
• Hierarchy of language-independent concepts
  • semantic distinctions: object, place, dynamic, ...
  • abstract (non-lexical)
• Superimposed on the ILI
• Three types of entities:
  • First order: concrete entities
  • Second order: static or dynamic situations
  • Third order: abstract propositions

WordNet & EuroWordNet: Top Concept Ontology of EuroWordNet

WordNet & EuroWordNet: Domain Ontology of EuroWordNet
• Hierarchy of domain labels
• Reduces polysemy
• Domains:
  • Road traffic, air traffic
  • International information
  • Mycology
  • Medicine

WordNet & EuroWordNet: Relations in EuroWordNet
• Richer than in WN
• Between:
  • synsets (monolingual modules)
  • ILI records (multilingual): {actuar-1} EQ-SYNONYM {‘behave in a certain manner’}
  • ILI records and the TCO or DO

WordNet & EuroWordNet: Interlingual Relations in EuroWordNet

WordNet & EuroWordNet: Relations in EuroWordNet

Spanish WordNet: Building Process

Spanish WordNet: General Methodology
1) Mapping to WN 1.5
  • manual work
  • automatic derivation of equivalents, using bilingual dictionaries
2) Manual correction
3) Re-structuring

Spanish WordNet: Main Steps: First Core (Manual Translation)
– Nouns:
  A) WN 1.5’s Tops file plus the first level of hyponyms (about 800 synsets).
  B) The rest of EWN’s Common Base Concepts (those not already in our set).
  C) Manual translation of the synsets intermediate between (A) and (B), following the WN 1.5 hierarchy (thus building a compact taxonomy equivalent to WN 1.5, without gaps).
– Verbs:
  • Manual translation of EWN’s Base Concepts (about 150 synsets)

Spanish WordNet: Main Steps: Subset 1 (Semi-automatic)
• Nouns:
  – Applying automatic methods using bilingual dictionaries
  – Manual validation of several subsets to check whether each link is correct
  – Deriving a Confidence Score (CS) for every automatic method (heuristic)
  – Selecting synset-word pairs above 85% CS
  – Some manual correction of this Subset 1 (mainly, filling gaps)
• Verbs:
  – 3600 English verbs connected to WN 1.5 senses and ambiguously translated to Spanish are manually inspected and disambiguated
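
The noun procedure above can be sketched as follows. This is a minimal illustration under one assumption the slides do not state: the confidence score of a method is taken to be the fraction of its manually validated links that were judged correct. The method names (`mono1`, `poly4`) and the sample pairs are hypothetical.

```python
from collections import defaultdict


def confidence_scores(validated):
    """Estimate a confidence score per method from manual validation.

    validated: iterable of (method, is_correct) pairs.
    Assumption: CS = correct validated links / all validated links.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for method, is_correct in validated:
        total[method] += 1
        correct[method] += bool(is_correct)
    return {method: correct[method] / total[method] for method in total}


def select_links(candidates, scores, threshold=0.85):
    """Keep (synset, word) pairs proposed by a method whose CS exceeds the threshold.

    candidates: iterable of (synset, word, method) triples.
    """
    return [
        (synset, word)
        for synset, word, method in candidates
        if scores.get(method, 0.0) > threshold
    ]


# Toy validation sample: 9/10 correct for one method, 6/10 for another.
sample = [("mono1", True)] * 9 + [("mono1", False)] \
       + [("poly4", True)] * 6 + [("poly4", False)] * 4
scores = confidence_scores(sample)
kept = select_links([("s1", "casa", "mono1"), ("s2", "banco", "poly4")], scores)
print(scores, kept)
```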

Spanish WordNet: Main Steps: Subset 1 (Results 1)

Spanish WordNet: Main Steps: Subset 1 (Results 2)

Spanish WordNet: Main Steps: Subset 2
Main goals
• enhance the quality of Subset 1 by manual revision
• extend it by manually building synsets
• 4 sub-tasks

Spanish WordNet: Main Steps: Subset 2
1) Manually covering those gaps in the hyponymy chains that are covered by other languages
2) Manual cleaning of some automatically generated variants.
  – (a) pairs of synsets which are adjacent in the hyponymy chain and share at least one variant.
    • deleting redundant variants
    • re-locating to either pre-existing or newly created synsets
  – (b) multi-word expressions present in synsets.
    • Deleting non-lexicalized

Spanish WordNet: Main Steps: Subset 2
3) Manual addition of new vocabulary considered relevant.
  – It mainly comes from the Catalan WordNet: since we are building both wordnets in parallel, we detected those synsets which were built for Catalan and not for Spanish
4) Manual addition of cross-part-of-speech relations between nominal and verbal synsets.
  – This work has been based mainly on noun-verb pairs obtained by means of morphological criteria. (Work carried out by UNED, Madrid)

Spanish WordNet: Main Steps: Subset 2 (Results)

Spanish WordNet: Main Steps: Beyond Subset 2
• Massive Manual Checking (from Nov’98)
  – Using WEI
  – Variants automatically generated
  – Filling gaps in the hierarchy
  – New vocabulary
  – New Adjectives

Spanish WordNet: Main Steps: Beyond Subset 2

Spanish WordNet: Main Steps: Parole Coverage

Spanish WordNet: Current Figures
– Spanish, Catalan, Basque, (English)
– http://nipadio.lsi.upc.es/wei2.html

Combining Multiple Methods for the Automatic Construction of Multilingual WordNets

Combining Multiple Methods...: Outline
• Ten class methods
  – Four monosemic criteria
  – Four polysemic criteria
  – Two hybrid criteria
• Three conceptual distance methods
  – CD 1: using pairwise word co-occurrences
  – CD 2: using headword and genus
  – CD 3: using bilingual Spanish entries with multiple translations

Combining Multiple Methods...: Ten class methods – Four classes
[Diagrams relating SW and EW nodes; layout not recoverable]

Combining Multiple Methods...: Ten class methods – Four monosemic criteria
[Diagrams relating SW, EW and Synset nodes; layout not recoverable]

Combining Multiple Methods...: Ten class methods – Four polysemic criteria
[Diagrams relating SW, EW and Synset+ nodes; layout not recoverable]

Combining Multiple Methods...: Ten class methods
– Variant criterion: <..., EW, ...> SW
– Field criterion: <..., headword-EW, ..., Ind-EW, ...> SW

Combining Multiple Methods...: Ten class methods
• Results

Combining Multiple Methods...: Conceptual Distance methods
• Conceptual Distance (Agirre et al. 94)
  – length of the shortest path
  – specificity of the concepts
• using WordNet
• Bilingual dictionary
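
The path-length ingredient of conceptual distance can be sketched with a breadth-first search. This is only a partial illustration: it deliberately omits the specificity weighting that the measure of Agirre et al. also uses, and the toy taxonomy edges and the function name `shortest_path_length` are hypothetical.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Number of taxonomy links on the shortest path between two concepts.

    edges: iterable of (child, parent) hypernymy links, traversed in
    both directions. Returns None when no path exists.
    """
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Toy taxonomy (hypothetical edges, not taken from WordNet).
toy_edges = [
    ("church", "building"),
    ("monastery", "building"),
    ("building", "artifact"),
    ("abbot", "person"),
]
print(shortest_path_length(toy_edges, "church", "monastery"))
```

Candidate sense pairs with a shorter path would be preferred; a full implementation would also favour paths lying deeper (more specific) in the hierarchy.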

Combining Multiple Methods...: Conceptual Distance methods
• Three conceptual distance methods
  – CD 1: using pairwise word co-occurrences
  – CD 2: using headword and genus
  – CD 3: using bilingual Spanish entries with multiple translations

Combining Multiple Methods...: Conceptual Distance methods (Example CD 2)
[Figure: WordNet hierarchy fragment, <entity> <object, ...]
abadía_1_2: Iglesia o monasterio regido por un abad o abadesa (abbey, a church or a monastery ruled by an abbot or an abbess)

Combining Multiple Methods...: Conceptual Distance methods (Example CD 2)
[Figure: WordNet hierarchy fragment, <entity> <object, ...]
06 ARTIFACT
abadía_1_2: Iglesia o monasterio regido por un abad o abadesa (abbey, a church or a monastery ruled by an abbot or an abbess)

Combining Multiple Methods...: Three CD methods
• Results

Combining Multiple Methods...: Combining methods
• Results

Combining Multiple Methods...: Resulting Spanish WordNets

Mapping Conceptual Hierarchies Using Relaxation Labelling

Mapping Conceptual Hierarchies using Relaxation Labelling: Outline
– Setting
– Relaxation Labelling Algorithm
– Constraints
– Experiments & Results I (multilingual)
– Experiments & Results II (monolingual)
– Further work

Mapping Conceptual Hierarchies using Relaxation Labelling: Setting
[Diagram: concepts C1, C2, C3, C4, C5, C6]

Mapping Conceptual Hierarchies using Relaxation Labelling: Setting
Connecting already existing hierarchies
– Relaxation labelling algorithm
– Constraints
Between
– a Spanish taxonomy automatically derived from an MRD (Rigau et al. 98)
– WordNet
  • using a bilingual MRD

Mapping Conceptual Hierarchies using Relaxation Labelling: Setting
[Figure: Spanish taxonomy fragment (animal, ave, faisán, rapaz), each sense with candidate WordNet labels such as (Tops <animal, animate_being, ...), (person), (animal), (artifact), (food); synset contents not recoverable]

Mapping Conceptual Hierarchies using Relaxation Labelling: Algorithm
– Iterative algorithm for function optimization based on local information
– It can deal with any kind of constraints
  • variables (senses of the taxonomy)
  • labels (synsets)
– Finds a weight assignment for each possible label of each variable
  • the weights for the labels of the same variable add up to one
  • the weight assignment satisfies, to the maximum possible extent, the set of constraints

Mapping Conceptual Hierarchies using Relaxation Labelling: Algorithm
1) Start with a random weight assignment.
2) Compute the support value for each label of each variable (according to the constraints).
3) Increase the weights of the labels more compatible with the context and decrease those of the less compatible labels.
4) If a stopping/convergence criterion is satisfied, stop; otherwise go to step 2.
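
The four steps can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: the slides do not give the update formula, so the sketch assumes a common multiplicative rule, w'(v,l) = w(v,l)(1 + S(v,l)) / Σ_k w(v,k)(1 + S(v,k)), with support S clipped to [-1, 1]. The variables, labels and compatibility values in the example are hypothetical.

```python
import random


def relaxation_labelling(labels, compat, iterations=100, eps=1e-6, seed=0):
    """labels: {variable: [label, ...]}
    compat: {((var, label), (other_var, other_label)): value in [-1, 1]}
    Returns {variable: {label: weight}} with weights summing to one per variable.
    """
    rng = random.Random(seed)
    # Step 1: random initial weights, normalised per variable.
    weights = {}
    for var, labs in labels.items():
        raw = [rng.random() + 1e-3 for _ in labs]
        weights[var] = {lab: r / sum(raw) for lab, r in zip(labs, raw)}
    for _ in range(iterations):
        updated, change = {}, 0.0
        for var, labs in labels.items():
            # Step 2: support of each label, given the current weights
            # of the labels it is constrained with.
            support = {}
            for lab in labs:
                total = sum(
                    value * weights[v2][l2]
                    for ((v1, l1), (v2, l2)), value in compat.items()
                    if (v1, l1) == (var, lab)
                )
                support[lab] = max(-1.0, min(1.0, total))
            # Step 3: raise well-supported labels, lower the others (assumed rule).
            norm = sum(weights[var][lab] * (1.0 + support[lab]) for lab in labs)
            if norm == 0.0:
                updated[var] = dict(weights[var])
                continue
            updated[var] = {
                lab: weights[var][lab] * (1.0 + support[lab]) / norm for lab in labs
            }
            change = max(
                change,
                max(abs(updated[var][lab] - weights[var][lab]) for lab in labs),
            )
        weights = updated
        # Step 4: stop when no weight moved by more than eps.
        if change < eps:
            break
    return weights


# Toy problem: two taxonomy senses whose "animal" readings support each other.
toy_labels = {
    "ave": ["bird#animal", "bird#food"],
    "faisan": ["pheasant#animal", "pheasant#food"],
}
toy_compat = {
    (("ave", "bird#animal"), ("faisan", "pheasant#animal")): 1.0,
    (("faisan", "pheasant#animal"), ("ave", "bird#animal")): 1.0,
}
result = relaxation_labelling(toy_labels, toy_compat)
print(result)
```

In this toy run the mutually compatible labels end up with almost all the weight, which is the "local information with global effects" behaviour the slides describe.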

Mapping Conceptual Hierarchies using Relaxation Labelling: Constraints
– Rely on the taxonomy structure
– Coded with three characters XYZ
  • X: Spanish taxonomy, I (immediate) or A (ancestor)
  • Y: English taxonomy, I (immediate) or A (ancestor)
  • Z: relation, E (hypernym), O (hyponym), B (both)
– Examples: IIE, AAB
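
The three-character coding can be made concrete with a small decoder. This is an illustrative sketch only; the function name `decode_constraint` and the dictionary keys are hypothetical, while the letter meanings are those listed on the slide.

```python
SCOPE = {"I": "immediate", "A": "ancestor"}
RELATION = {"E": "hypernym", "O": "hyponym", "B": "both"}


def decode_constraint(code):
    """Expand a three-character constraint code such as 'IIE' or 'AAB'."""
    if len(code) != 3:
        raise ValueError(f"expected three characters, got {code!r}")
    spanish, english, relation = code
    if spanish not in SCOPE or english not in SCOPE or relation not in RELATION:
        raise ValueError(f"unknown constraint code {code!r}")
    return {
        "spanish_taxonomy": SCOPE[spanish],
        "english_taxonomy": SCOPE[english],
        "relation": RELATION[relation],
    }


# Enumerate every code the scheme allows: 2 x 2 x 3 = 12 combinations.
all_codes = [s + e + r for s in SCOPE for e in SCOPE for r in RELATION]
print(decode_constraint("IIE"))
print(all_codes)
```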

Mapping Conceptual Hierarchies using Relaxation Labelling: Hierarchical Constraints (NAACL’2001)
– II Constraints: IIE, IIO, IIB
[Diagrams not recoverable]

Mapping Conceptual Hierarchies using Relaxation Labelling: Hierarchical Constraints
– AI Constraints: AIE, AIO, AIB
[Diagrams not recoverable]

Mapping Conceptual Hierarchies using Relaxation Labelling: Hierarchical Constraints
– IA Constraints: IAE, IAO, IAB
[Diagrams not recoverable]

Mapping Conceptual Hierarchies using Relaxation Labelling: Hierarchical Constraints
– AA Constraints: AAE, AAO, AAB
[Diagrams not recoverable]

Combining Multiple Methods... (RANLP’97): Eight class methods – Four monosemic criteria
(SW / EW / Synset diagrams not recoverable; rows in slide order)
Criterion   Prec.   Cov.
1           92%     5%
2           89%     1%
3           89%     2%
4           85%     4%

Combining Multiple Methods... (RANLP’97): Eight class methods – Four polysemic criteria
(SW / EW / Synset+ diagrams not recoverable; rows in slide order)
Criterion   Prec.   Cov.
1           80%     8%
2           75%     2%
3           58%     17%
4           61%     60%

Combining Multiple Methods... (RANLP’97): Experiments & Results
Poly            TOK, FOK     TOK, FNOK    total
animal          279 (90%)    30 (91%)     209 (90%)
food            166 (94%)    3 (100%)     169 (94%)
cognition       198 (67%)    27 (90%)     225 (69%)
communication   533 (77%)    40 (97%)     573 (78%)

All             TOK, FOK     TOK, FNOK    total
animal          424 (93%)    62 (95%)     486 (90%)

Combining Multiple Methods... (RANLP’97): Experiments & Results
[Figure: Spanish taxonomy fragment (piel, marta, visón), each labelled (substance <skin, fur, ...); synset contents not recoverable]

A Complete WN 1.5 to WN 1.6 Mapping... (ACL’00, NAACL’01): Generalized Constraints
• All relationships
  – also-see, similar-to, attribute, antonym, etc.

A Complete WN 1.5 to WN 1.6 Mapping...: Generalized Constraints
• Non-structural constraints
  – W: number of word coincidences
  – G: word coincidences in glosses
  – F: number of frame coincidences (verbs)

A Complete WN 1.5 to WN 1.6 Mapping...: POS mapping dependencies
[Diagram relating Nouns, Adjectives, Verbs and Adverbs; arrows not recoverable]

A Complete WN 1.5 to WN 1.6 Mapping...: Constraints for Verbs
• Structural constraints
  – hyper/hyponymy
  – antonymy
  – also-see
• Non-structural constraints
  – W, G and F

A Complete WN 1.5 to WN 1.6 Mapping...: Constraints for Adjectives
• Structural constraints
  – Adj-to-Adj: antonymy, similar-to and also-see
  – Adj-to-Verb: participle-of
  – Adj-to-Noun: pertains and attribute
• Non-structural constraints
  – W and G

A Complete WN 1.5 to WN 1.6 Mapping...: Constraints for Adverbs
• Structural constraints
  – Adv-to-Adv: antonymy
  – Adv-to-Adj: derived
• Non-structural constraints
  – W and G

A Complete...: Example extra-POS
WN 1.5
  02025107 a evangelical evangelistic
    pertainym 04237485 n Gospels evangel
WN 1.6
  00843344 a evangelical evangelistic
    similar to 00842521 a enthusiastic
  02025107 a evangelical
    pertainym 04853575 n Gospels evangel

A Complete WN 1.5 to WN 1.6 Mapping...: Example extra-POS
WN 1.5
  00057615 r impossibly absurdly
WN 1.6
  00294844 r impossibly
    derived from 01393725 a impossible
  01752468 a impossible
    antonym 00294658 a possibly

A Complete WN 1.5 to WN 1.6 Mapping...: Results
• Basic constraint set: structural constraints
  – Nouns: AA hyper/hyponym
  – Verbs: AA hyper/hyponym, II also-see
  – Adjectives: II antonymy, similar-to, also-see
  – Adverbs: II antonymy

A Complete WN 1.5 to WN 1.6 Mapping...: Results
• Basic constraint set: structural constraints
(Ambiguous and Overall are given as precision - recall)
     Coverage   Ambiguous        Overall
N    99.7%      94.9% - 99.6%    97.6% - 99.8%
V    96.9%      93.5% - 99.2%    94.6% - 99.2%
A    94.1%      82.8% - 98.9%    89.5% - 99.4%
R    80.8%      97.5% - 100%     99.0% - 100%

A Complete WN 1.5 to WN 1.6 Mapping...: Results
• Basic constraint set + W, G and F for verbs
Coverage: N 99.9%, V 99.8%, A 98.9%, R 99.5%
Ambiguous / Overall (precision - recall), values in slide order:
97.5% - 97.7%, 98.8% - 98.9%, 99.4% - 99.7%, 99.3% - 99.6%, 96.5% - 98.8%, 97.9% - 99.3%, 97.5% - 100%, 99.0% - 100%

A Complete WN 1.5 to WN 1.6 Mapping...: Results
• Basic + extra-POS relationships
Columns: Coverage, Ambiguous, Overall (precision - recall); rows N, V, A, R; only part of the table survives, values in slide order:
95.8% - 98.9%, 90.9% - 99.4%, R 88.0%, 69.2% - 94.2%, 97.9% - 98.1%

A Complete WN 1.5 to WN 1.6 Mapping...: Results
• Basic + extra-POS relationships + WGF
Coverage: N 99.9%, V 99.8%, A 99.0%, R 99.6%
Ambiguous / Overall (precision - recall), values in slide order:
97.5% - 97.7%, 98.8% - 98.9%, 99.4% - 99.7%, 99.3% - 99.6%, 96.5% - 99.1%, 97.9% - 99.5%, 98.3% - 100%, 99.3% - 100%

Mapping Conceptual Hierarchies using Relaxation Labelling: Conclusions
– First complete mapping between WordNet versions
– Combining structural and non-structural information
– Robust approach based on local information, but with global effects
– Incremental POS approach
– http://www.lsi.upc.es/~nlp
– 90 downloads (since November 2000)

Mapping Conceptual Hierarchies using Relaxation Labelling: Further Work
– Mapping other structures
  • WN-EDR, WN-LDOCE, etc.
  • Other language taxonomies to EuroWordNet
– Spanish EWN to WN 1.6
– Symmetrical philosophy rather than source-target

Mikrokosmos

Mikrokosmos: Outline
• Introduction
• Representational Issues
  • The Lexicon
  • The Ontology
• Acquisition Process
  • Lexicon Acquisition
  • Guidelines
  • Ontology/Lexicon Trade-off
• Semantics in Action

Mikrokosmos: Introduction
• Knowledge-Based Machine Translation (KBMT)
• CRL, NMSU
• 5,000 concepts
  • Events
  • Objects
  • Properties
• 7,000 Spanish word senses
• 40,000 word senses after expansion with productive Lexical Rules
  • comprar -> comprador, comprable, ...
• Text Meaning Representation

Mikrokosmos: Representational Issues: The Lexicon
• Typed Feature Structures (Pollard and Sag 87)
• language-dependent
• 10 zones
  • phonology
  • orthography
  • morphology
  • syntactic (subcategorization)
  • semantic (Lexical Semantic Representation)
  • syntax-semantic linking
  • stylistics
  • paradigmatic
  • syntagmatic

Mikrokosmos: Representational Issues: The Lexicon
Adquirir-V1
  syn: subj: cat: NP
       obj:  cat: NP
  sem: acquire
       agent: HUMAN
       theme: OBJECT
Adquirir-V2
  syn: subj: cat: NP
       obj:  cat: NP
  sem: acquire
       agent: HUMAN
       theme: INFORMATION

Mikrokosmos: Representational Issues: The Ontology
• Taxonomic, multi-hierarchical
• 14 local or inherited links on average
• language-impartial
• EVENTS, OBJECTS, PROPERTIES
• Methodology & Guidelines

Mikrokosmos: Representational Issues: The Ontology
• ACQUIRE
  DEFINITION   “The transfer of possession event where the agent transfers an object to its possession”
  IS-A         TRANSFER-POSSESSION
  SOURCE       HUMAN PLACE
  THEME        OBJECT (NOT HUMAN)
  AGENT        ANIMAL (DEFAULT HUMAN)
  DESTINATION  ANIMAL PLACE (DEFAULT HUMAN)
  INHERITED
  BENEFICIARY  HUMAN

Mikrokosmos: Acquisition Process: The Lexicon
• Multi-lingual
  • French, English, Japanese, Russian, Spanish, etc.
• Multi-media
• Multi-process
  • Analysis
  • Generation (mono and multilingual)
  • MT
  • Summarization
  • IE
  • Speech Processing
• Tools
  • corpus-search, lookup dictionary, ontology browser

Mikrokosmos: Acquisition Process: The Ontology
• Guidelines
1) Do not add instances as concepts
  • Instances do not have their own instances
  • Concepts do not have a fixed position in space/time
2) Do not decompose concepts further
3) Use close concepts
4) Do not add EVENTs with particular arguments
5) Do not add concepts with instance-specific aspects, temporal relations
6) Do not add language-specific concepts
7) Do not add ontological concepts for collections

Mikrokosmos: Acquisition Process: Ontology/Lexicon Trade-off
• Daily negotiations
  • lexicon acquirers
  • ontology acquirers
• Possibilities
  • one-to-one mapping
  • lexicon unspecification
  • lexicon-ontology balance

Mikrokosmos: Acquisition Process: Ontology/Lexicon Trade-off
• One-to-one mapping
  PREPARE-FOOD (INST: COOKING-EQUIPMENT)
    COOK (INST: STOVE)   cook : cuire sur le feu
    BAKE (INST: OVEN)    bake : cuire au four
• Problems
  • Lexical: every word in a language is a concept
  • Conceptual: cuire in French is not ambiguous

Mikrokosmos: Acquisition Process: Ontology/Lexicon Trade-off
• Lexicon Unspecification
  PREPARE-FOOD (INST: COOKING-EQUIPMENT)
    cook : cuire sur le feu
    bake : cuire au four (INST: OVEN)
• Problems
  • BAKE is not in the ontology

Mikrokosmos: Acquisition Process: Ontology/Lexicon Trade-off
• Lexicon-Ontology Balance
  [Diagram: PREPARE-FOOD (INST: COOKING-EQUIPMENT) with subconcepts BAKE and FRY; instrument labels INST: STOVE, INST: FRYING-PAN, INST: OVEN; lexical entries bake, cook : cuire]

Mikrokosmos: Semantics in Action
• El grupo Roche, a través de su compañía en España, adquirió Doctor Andreu.
• El grupo Roche adquirió Doctor Andreu a través de su compañía en España.
• La adquisición de Doctor Andreu por el grupo Roche fue hecha a través de su compañía en España.
(All three sentences say that the Roche group acquired Doctor Andreu through its company in Spain.)
ACQUIRE-1
  Agent: ORGANIZATION-1
  Theme: ORGANIZATION-2
  Instrument: ORGANIZATION-3
ORGANIZATION-1
  Object-Name: Grupo Roche
ORGANIZATION-2
  Object-Name: Doctor Andreu
ORGANIZATION-3
  Location: España

Mikrokosmos: Semantics in Action
• Onto-Search: ontological search mechanism to check constraints
  • check-onto(ACQUIRE, EVENT) = 1
    • since ACQUIRE is a type of EVENT
  • check-onto(ORGANIZATION, HUMAN) = 0.9
    • since ORGANIZATION HAS-MEMBER HUMAN
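
The two check-onto values above can be reproduced with a small path-scoring search. This is a sketch under stated assumptions, not the Mikrokosmos Onto-Search algorithm: it assumes a path's score is the product of per-link weights, with IS-A weighted 1.0 and HAS-MEMBER weighted 0.9, and that unknown relations contribute nothing. The toy ontology links (including TRANSFER-POSSESSION IS-A EVENT) are illustrative.

```python
def check_onto(links, concept, target, weights=None):
    """Best score of any path from concept to target.

    links: {concept: [(relation, neighbour), ...]}
    weights: per-relation link weights (assumed values, see above).
    Returns 0.0 when the target cannot be reached.
    """
    weights = weights or {"IS-A": 1.0, "HAS-MEMBER": 0.9}
    best = {concept: 1.0}
    frontier = [concept]
    while frontier:
        node = frontier.pop()
        for relation, neighbour in links.get(node, []):
            score = best[node] * weights.get(relation, 0.0)
            # Only revisit a node when a strictly better path is found,
            # which also prevents looping on cycles.
            if score > best.get(neighbour, 0.0):
                best[neighbour] = score
                frontier.append(neighbour)
    return best.get(target, 0.0)


toy_ontology = {
    "ACQUIRE": [("IS-A", "TRANSFER-POSSESSION")],
    "TRANSFER-POSSESSION": [("IS-A", "EVENT")],
    "ORGANIZATION": [("HAS-MEMBER", "HUMAN")],
}
print(check_onto(toy_ontology, "ACQUIRE", "EVENT"))
print(check_onto(toy_ontology, "ORGANIZATION", "HUMAN"))
```

A graded score of this kind lets the analyser prefer, rather than strictly require, fillers that match a selectional constraint.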

Mikrokosmos: Semantics in Action
1) a-través-de: INSTRUMENT, LOCATION
   adquirir requires PHYSICAL-OBJECT
2) en: LOCATION, TEMPORAL
   España is not a TEMPORAL-OBJECT
3) adquirir: ACQUIRE, LEARN
   Doctor Andreu is not an INFORMATION
4) Doctor Andreu: ORGANIZATION, HUMAN
   the Theme of ACQUIRE is not HUMAN
5) compañía: CORPORATION, SOCIAL-EVENT
   ORGANIZATIONs typically fill the INSTRUMENT slot of ACQUIRE acts

Mikrokosmos: Experiment: WSD
Text   words   words/sentence   open-class words   ambiguous words   syntax   correct   %
1      347     16.5             183                57                21       51        97
2      385     24.0             167                42                19       41        99
3      370     26.4             177                57                20       45        93
4      353     20.8             177                35                12       34        99
Mean   364     21.4             176                48                18       43        97

Mikrokosmos: Experiment: WSD
Text          words   words/sentence   open-class words   ambiguous words   syntax   correct   %
Mean          364     21.4             176                48                18       43        97
Mean Unseen   390     26               104                26                9        23        97

WordNet 2

WordNet 2 Outline
• Introduction
• Text Inferences
• Defining Features
• Plausible Inferences
• Inference Rules
• Semantic Paths
• What WordNet cannot do

WordNet 2 Introduction
• (Harabagiu 98)
• Commonsense reasoning requires extensive knowledge
  • ~100 million concepts and relations
• WordNet
  • represents almost all English words
  • 100,000 synsets
  • linked by semantic relations
• WordNet 2
  • each synset has a gloss that, when disambiguated, may increase the number of relations
  • WordNet glosses are transformed into semantic networks
  • NEW RELATIONS

WordNet 2 Text Inferences
German was hungry. He opened the refrigerator.
• hungry (feeling a need or desire to eat)
• eat (take in solid food)
• refrigerator (an appliance in which foods can be stored at low temperature)

WordNet 2 Defining Features
• Transform each concept’s gloss into a graph where concepts are nodes and lexical relations are links
• (all the knowledge shared by society) --AGENT-->
• (licensed medical practitioner) --ATTRIBUTE-->

WordNet 2 Defining Features
[Figure: gloss graph for "pilot". Nodes: pilot, person, qualified, guide, ship, water, difficult. Link labels: GLOSS, ATTRIBUTE, PURPOSE, OBJECT, LOCATION.]

WordNet 2 Inference Rules
• 16 + 1 rules

Rule 1
VC1 IS-A VC2 IS-A VC3
------------
VC1 IS-A VC3

Rule 2
VC1 IS-A VC2 R_IS-A VC3
------------
VC1 PLAUSIBLE (not VC3)

Rule 3
VC1 IS-A VC2 ENTAIL VC3
------------
VC1 ENTAIL VC3

VC1 IS-A VC2 R_ENTAIL VC3
------------
VC1 EXPLAINS VC3
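Each rule on this slide combines two chained relations into one inferred relation, so the four rules shown can be read as entries in a composition table. This is a minimal sketch under that reading. The `RULES` dictionary, the `infer` helper, and the label `PLAUSIBLE-NOT` (standing for "VC1 PLAUSIBLE (not VC3)") are illustrative names, and the remaining rules of the 16 + 1 are not reproduced.

```python
# (relation of first link, relation of second link) -> inferred relation
RULES = {
    ("IS-A", "IS-A"): "IS-A",
    ("IS-A", "ENTAIL"): "ENTAIL",
    ("IS-A", "R_IS-A"): "PLAUSIBLE-NOT",  # VC1 PLAUSIBLE (not VC3)
    ("IS-A", "R_ENTAIL"): "EXPLAINS",
}


def infer(first, second):
    """Combine links (a, r1, b) and (b, r2, c) into (a, r, c), or None."""
    a, r1, b = first
    b2, r2, c = second
    if b != b2:
        return None  # the two links do not chain through a shared concept
    relation = RULES.get((r1, r2))
    return (a, relation, c) if relation else None


print(infer(("VC1", "IS-A", "VC2"), ("VC2", "ENTAIL", "VC3")))
# ('VC1', 'ENTAIL', 'VC3')
```

Longer semantic paths can then be reduced by applying `infer` to adjacent pairs of links until a single relation remains or no rule applies.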

WordNet 2 Semantic Paths
0) Create and load the KB
1) Place markers on KB concepts
2) Propagate markers
   The algorithm avoids cycles
3) Detect collisions
   Each marker collision corresponds to a path
4) Extract inferences

WordNet 2 Semantic Paths
Inference sequence
• German was hungry
• German felt a desire to eat
• German felt a desire to take in food
COLLISION: German=he felt a desire to take food, stored in an appliance, which he opened
• He opened an appliance where food is stored
• He opened the refrigerator
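The marker-propagation steps can be sketched on a toy knowledge base built from the three glosses of the hungry/refrigerator example. This is a minimal sketch, not the system described in (Harabagiu 98): the `KB` edges, the breadth-first strategy, and the helper names `propagate` and `collisions` are assumptions made for illustration.

```python
from collections import deque

# Toy KB: each concept points to concepts taken from its gloss.
# The edge set and the relation labels are assumptions.
KB = {
    "hungry": [("GLOSS", "desire"), ("GLOSS", "eat")],
    "eat": [("GLOSS", "food")],
    "refrigerator": [("IS-A", "appliance"), ("GLOSS", "food"), ("GLOSS", "store")],
}


def propagate(source):
    """Spread a marker breadth-first; return {concept: path from source}."""
    paths = {source: [source]}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for _relation, target in KB.get(node, []):
            if target not in paths:  # never re-mark a concept: avoids cycles
                paths[target] = paths[node] + [target]
                queue.append(target)
    return paths


def collisions(a, b):
    """Concepts reached by both markers, each with the joined path a ... b."""
    paths_a, paths_b = propagate(a), propagate(b)
    return {
        concept: paths_a[concept] + paths_b[concept][-2::-1]
        for concept in paths_a
        if concept in paths_b
    }


print(collisions("hungry", "refrigerator"))
# {'food': ['hungry', 'eat', 'food', 'refrigerator']}
```

The single collision at food yields the path hungry, eat, food, refrigerator, which mirrors the inference sequence on the slide: a desire to take in food meets an appliance in which food is stored.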

WordNet 2 What WordNet cannot do
Major WordNet limitations:
1) The lack of compound concepts
2) The small number of causation and entailment relations
3) The lack of preconditions for verbs
4) The absence of case relations

ThoughtTreasure
German Rigau i Claramunt
http://www.lsi.upc.es/~rigau
TALP Research Center
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya

ThoughtTreasure Overview
• A comprehensive platform for
  • NLP (English, French)
  • commonsense reasoning
• A hotel room has a bed, night table, ...
• People have fingernails
• Soda is a drink
• One hangs up at the end of a phone call
• The sky is blue
• Dogs bark
• Someone who is 16 years old is a teenager

ThoughtTreasure Overview
• 25,000 concepts organized into a hierarchy
  EVIAN -> FLAT-WATER -> DRINKING-WATER
• 55,000 words (English, French)
  food <-> aliment <-> FOOD
• 50,000 assertions about concepts
  green-pea is green
• 100 scripts

ThoughtTreasure Overview
• Text agents for recognizing names, phones, etc.
• Mechanisms for learning new words
  • X-phile is someone who likes X
• A syntactic parser
• A NL generator
• A semantic parser
• An anaphoric parser
• Planning agents for achieving goals
• Understanding agents

ThoughtTreasure Example
• Who created Bugs Bunny?
• 1.0 (create human-interrogative-pronoun Bugs-Bunny)
• 0.9 (create rock-group-the-Who Bugs-Bunny)
• 1.0 (create Tex-Avery Bugs-Bunny)
• 0.1 (not (create rock-group-the-Who Bugs-Bunny))

Meaning
German Rigau i Claramunt
http://www.lsi.upc.es/~rigau
TALP Research Center
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya

Meaning Overview
• Knowledge bases
  – Automatic enrichment of EWN (verbal models, etc.)
  – Mixed approach (KB + ML)
  – Q/A
• Problem
  – Structural and lexical ambiguity
• Approach
  – Automatically locate examples of senses (Leacock et al. 98, Mihalcea and Moldovan 99)
  – Large-scale WSD (Boosting, SVM, transductive methods …)
  – Knowledge acquisition (Ribas 95, McCarthy 01)

Meaning Exploiting EWN Semantic Relations
[Figure: EWN synsets shown: <evento>, <agrupación grupo colectivo>, <evento social>, <grupo_social>, <competición, concurso>]

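Meaning Exploiting EWN Semantic Relations
partido 1
• Todos los partidos piden reformas legales para TV3. (All the parties call for legal reforms for TV3.)
• La derecha planea agruparse en un partido. (The right plans to unite in a single party.)
• El diputado reiteró que ni él ni UDC, “como partido”, han recibido dinero de Pellerols. (The deputy reiterated that neither he nor UDC, "as a party", has received money from Pellerols.)
partido 2
• Pero España puso al partido intensidad, ritmo y coraje. (But Spain brought intensity, rhythm and courage to the match.)
• El seleccionador cree que el partido de hoy contra Italia dará la medida de España. (The coach believes that today's match against Italy will show how good Spain is.)
• El Racing no gana en su campo desde hace seis partidos. (Racing has not won at home for six matches.)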

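Meaning Exploiting EWN Semantic Relations
partido 1
• No negociaremos nunca con un partido político que sea partidario de la independencia de Taiwan. (We will never negotiate with a political party that favours the independence of Taiwan.)
• Una vez más es noticia la desviación de fondos destinados a la formación ocupacional hacia la financiación de un partido político. (Once again the diversion of funds intended for occupational training to the financing of a political party is in the news.)
• Estas leyes fueron votadas gracias a un consenso general de los partidos políticos. (These laws were passed thanks to a general consensus among the political parties.)
partido 2
• Rivera pide el soporte de la afición para encarrilar las semifinales. (Rivera asks for the fans' support to get the semifinals on track.)
• Sólo el equipo de Valero Ribera puede sentenciar una semifinal como lo hizo ayer en un Palau Blaugrana completamente entregado. (Only Valero Ribera's team can settle a semifinal the way it did yesterday in a fully supportive Palau Blaugrana.)
• El Racing ganó los cuartos de final en su campo. (Racing won the quarter-finals at home.)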

Meaning Architecture
[Figure: architecture diagram. Components: English, Spanish, Catalan, Italian and Basque Web Corpus; English, Spanish, Catalan, Italian and Basque EWN; Multilingual Central Repository. Arrow labels: ACQ, WSD, UPLOAD, PORT.]