  • Number of slides: 17

Anaphor Resolution in Norwegian
Gordana Ilic Holen
Institut for lingvistiske fag, Det historisk-filosofiske fakultet, Universitetet i Oslo
g.i.holen@hfstud.uio.no
January 2003 Fefor

Some technical data
• Hovedfagsoppgave (master's-level thesis; incl. obligatory courses, a 4-semester project)
• Aim: making a system for resolving pronominal anaphors in Norwegian
• Mentor: Janne Bondi Johannessen
• Implementation in (CLOS) LISP
• To be finished by Christmas 2003

Where did it start?
• Martin Hassel, 2000
  ◦ Made an AR system for the Swedish pronouns han/honom/hans and hon/henne/hennes
• Differences
  ◦ Planning to cover more pronouns
  ◦ A different theoretical background

The Top List
• Han/ham/hans and hun/hennes
  ◦ Among the most used; not ambiguous
• Seg and selv
  ◦ Syntactic solutions
• Den
  ◦ Ambiguous with the determinative den, as in den (gule bilen), "the yellow car"

The Top Wish List
• De
  ◦ Ambiguous with the determinative de, as in de (gule bilene), "the yellow cars"
  ◦ Problems delimiting the antecedent
• Det
  ◦ Problems in deciding whether det is pronominal
    ▪ det (gule huset), "the yellow house"
    ▪ det (regner), "it is raining"

Approach
To be based on
• Mitkov's Anaphora Resolution System / MARS (Mitkov 1996, 1998)
and partially on
• Resolution of Anaphora Procedure / RAP (Lappin & Leass 1994)

Why MARS and RAP
• Both made for English
• MARS: intuitive, fully automated
• RAP: high precision
• Flexible

MARS
• No parsing
• The AR module uses a list of preferences called antecedent indicators
  ◦ Boosting
  ◦ Impeding
• Fully automatic, but not very high precision (60-61%)

MARS: The algorithm
• The text is POS tagged.
• NPs are extracted by an NP extractor.
• NPs which precede the anaphor (within a two-sentence scope) are located.
• Gender and number constraints are applied.
• Antecedent indicators are applied to the antecedent candidates that agree in gender and number. Each indicator assigns a score (2, 1, 0 or -1).
• The NP with the highest score is proposed as the antecedent.
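The candidate-selection steps above can be sketched roughly as follows. This is a minimal illustration, not the MARS implementation: the NounPhrase record, the gender/number tag values, the reading of "two-sentence scope" as a maximum sentence distance of 2, and the tie-break in favour of the most recent candidate are all assumptions made for the example. POS tagging, NP extraction and indicator scoring are taken as already done.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NounPhrase:
    text: str
    sentence: int   # index of the sentence containing the NP
    position: int   # token offset in the text, used to order NPs
    gender: str     # hypothetical tag set, e.g. "masc" / "fem"
    number: str     # "sg" or "pl"
    score: int = 0  # summed antecedent-indicator score, computed elsewhere


def resolve(anaphor: NounPhrase, noun_phrases: list[NounPhrase],
            max_sentence_distance: int = 2) -> Optional[NounPhrase]:
    """Propose an antecedent for the anaphor, or None if no NP qualifies."""
    candidates = [
        cand for cand in noun_phrases
        # only NPs that precede the anaphor, within the sentence scope (assumed)
        if cand.position < anaphor.position
        and anaphor.sentence - cand.sentence <= max_sentence_distance
        # gender and number constraints
        and cand.gender == anaphor.gender
        and cand.number == anaphor.number
    ]
    if not candidates:
        return None
    # highest score wins; ties go to the most recent candidate (assumption)
    return max(candidates, key=lambda cand: (cand.score, cand.position))
```

Keeping the indicator score as a precomputed field separates the filtering step from the scoring step, so the indicators can be adjusted for Norwegian without touching the selection logic.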

MARS: Antecedent indicators (boosting)
• First noun phrases +1
• Indicating verbs +1
• Lexical reiteration +2 / +1
• Section heading preference +1
• Collocation match +2
• Immediate reference +2
• Sequential instructions +2
• Term preference +2

MARS: Antecedent indicators (impeding)
• Indefiniteness -1
• Prepositional NPs -1
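As an illustration of how the boosting and impeding indicators might combine into a single candidate score, here is a small sketch. The weights are the ones listed on the slides; the identifier names, the indicator_score function and the idea of passing the active indicators as a set are hypothetical. Detecting whether an indicator applies to a given NP is not shown.

```python
# Weights as listed on the slides; the key names are made up for this sketch.
BOOSTING = {
    "first_noun_phrase": 1,
    "indicating_verb": 1,
    "section_heading_preference": 1,
    "collocation_match": 2,
    "immediate_reference": 2,
    "sequential_instructions": 2,
    "term_preference": 2,
}
IMPEDING = {
    "indefiniteness": -1,
    "prepositional_np": -1,
}
WEIGHTS = {**BOOSTING, **IMPEDING}


def indicator_score(active: set[str], lexical_reiteration: int = 0) -> int:
    """Sum the weights of the indicators that apply to one candidate NP.

    Lexical reiteration can contribute +2 or +1, so it is passed as an
    explicit value (0 when it does not apply) rather than as a flag.
    """
    if lexical_reiteration not in (0, 1, 2):
        raise ValueError("lexical_reiteration must be 0, 1 or 2")
    unknown = active - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unknown indicators: {sorted(unknown)}")
    return sum(WEIGHTS[name] for name in active) + lexical_reiteration
```

Keeping the weights in plain dictionaries makes it easy to experiment with different values when tuning on a corpus.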

RAP z. A high precision system (86% correctly resolved anaphors) z. Originally based on RAP z. A high precision system (86% correctly resolved anaphors) z. Originally based on parsed text, but there exists a version without (Kennedy and Boguraev, 1996) z. The AR module: Salience weighting January 2003 Fefor 14

RAP: Salience weighting
• Salience factors:
  ◦ Sentence recency 100
  ◦ Subject emphasis 80
  ◦ Head noun emphasis 80
  ◦ Existential emphasis 70
  ◦ Accusative emphasis 50
  ◦ Non-adverbial emphasis 50
  ◦ IO and oblique complement emphasis 40
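The salience factors above can be read as a weight table. The sketch below assumes that a candidate's salience is simply the sum of the weights of the factors that apply to it and that the most salient candidate is preferred; the identifier names and both functions are hypothetical, and nothing else about RAP (filtering, handling of earlier sentences) is modelled.

```python
# Weights as listed on the slide; the key names are made up for this sketch.
SALIENCE_WEIGHTS = {
    "sentence_recency": 100,
    "subject_emphasis": 80,
    "head_noun_emphasis": 80,
    "existential_emphasis": 70,
    "accusative_emphasis": 50,
    "non_adverbial_emphasis": 50,
    "io_oblique_emphasis": 40,
}


def salience(factors: set[str]) -> int:
    """Sum the weights of the salience factors that apply to a candidate."""
    return sum(SALIENCE_WEIGHTS[name] for name in factors)


def most_salient(candidates: dict[str, set[str]]) -> str:
    """Return the candidate whose applicable factors give the highest total."""
    return max(candidates, key=lambda name: salience(candidates[name]))
```

For example, a subject NP in the current sentence that is a head noun and not inside an adverbial phrase would get 100 + 80 + 80 + 50 under this reading.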

Modifications
Both systems exist in versions with and without parsing, so the question of whether to parse is left open for now.
Starting point: use the Oslo Corpus for training and adjustment.
• Experiment with the antecedent indicators and adjust them for Norwegian
• Try to combine them with RAP's salience factors

Open for suggestions
g.i.holen@hfstud.uio.no