
Putting Meaning Into Your Trees
Martha Palmer, University of Pennsylvania
Princeton Cognitive Science Laboratory, November 6, 2003

Outline
· Introduction
· Background: WordNet, Levin classes, VerbNet
· Proposition Bank – capturing shallow semantics
· Mapping PropBank to VerbNet
· Mapping PropBank to WordNet

Word sense in Machine Translation
· Different syntactic frames
Ø John left the room. – Juan saiu do quarto. (Portuguese)
Ø John left the book on the table. – Juan deixou o livro na mesa.
· Same syntactic frame?
Ø John left a fortune. – Juan deixou uma fortuna.

Ask Jeeves – A Q/A, IR example
What do you call a successful movie? Blockbuster
· Tips on Being a Successful Movie Vampire . . . I shall call the police.
· Successful Casting Call & Shoot for ``Clash of Empires'' . . . thank everyone for their participation in the making of yesterday's movie.
· Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague . . .
· VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

Ask Jeeves – filtering w/ POS tag
What do you call a successful movie?
· Tips on Being a Successful Movie Vampire . . . I shall call the police.
· Successful Casting Call & Shoot for ``Clash of Empires'' . . . thank everyone for their participation in the making of yesterday's movie.
· Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague . . .
· VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer.

Filtering out “call the police”
call(you, movie, what) vs. call(you, police) – the two are distinguished by syntax.
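As a toy illustration of this filtering step (the propositions below are hand-written stand-ins for the output of a semantic role labeler, not the actual system):

```python
# Keep only snippets whose proposition for "call" matches the question's
# predicate-argument pattern; keyword overlap alone cannot make this cut.
question_prop = ("call", ("you", "movie", "what"))

snippets = [
    ("I shall call the police.",
     [("call", ("I", "police"))]),
    ("What do you call a successful movie? Blockbuster.",
     [("call", ("you", "movie", "Blockbuster"))]),
]

def matches(q_prop, props):
    """A snippet matches if it shares the predicate and argument arity."""
    pred, q_args = q_prop
    return any(p == pred and len(args) == len(q_args) for p, args in props)

kept = [text for text, props in snippets if matches(question_prop, props)]
print(kept)  # only the 'Blockbuster' snippet survives the filter
```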

An English lexical resource is required
· that provides sets of possible syntactic frames for verbs,
· and that provides clear, replicable sense distinctions.
AskJeeves: Who do you call for a good electronic lexical database for English?

WordNet – call, 28 senses
1. name, call -- (assign a specified, proper name to; "They named their son David"; …) -> LABEL
2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night"; …) -> TELECOMMUNICATE
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard"; …) -> LABEL
4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") -> ORDER
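For reference, these entries can be pulled programmatically; a minimal sketch with NLTK's WordNet interface, assuming the wordnet corpus has been downloaded via nltk.download('wordnet'):

```python
# List the verb senses of "call" from WordNet, mirroring the entries above.
from nltk.corpus import wordnet as wn

for i, synset in enumerate(wn.synsets('call', pos=wn.VERB), start=1):
    lemmas = ', '.join(lemma.name() for lemma in synset.lemmas())
    print(f"{i}. {lemmas} -- ({synset.definition()})")
```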

WordNet – Princeton (Miller 1985, Fellbaum 1998)
· On-line lexical reference (dictionary)
Ø Nouns, verbs, adjectives, and adverbs grouped into synonym sets
Ø Other relations include hypernyms (ISA), antonyms, meronyms
· Limitations as a computational lexicon
Ø Contains little syntactic information
Ø No explicit predicate argument structures
Ø No systematic extension of basic senses
Ø Sense distinctions are very fine-grained, ITA 73%
Ø No hierarchical entries

Levin classes (Levin, 1993)
· 3100 verbs, 47 top-level classes, 193 second- and third-level classes
· Each class has a syntactic signature based on alternations.
John broke the jar. / The jar broke. / Jars break easily.
John cut the bread. / *The bread cut. / Bread cuts easily.
John hit the wall. / *The wall hit. / *Walls hit easily.

Levin classes (Levin, 1993)
· Verb class hierarchy: 3100 verbs, 47 top-level classes, 193 second- and third-level classes
· Each class has a syntactic signature based on alternations.
John broke the jar. / The jar broke. / Jars break easily. (change-of-state)
John cut the bread. / *The bread cut. / Bread cuts easily. (change-of-state, recognizable action, sharp instrument)
John hit the wall. / *The wall hit. / *Walls hit easily. (contact, exertion of force)


Confusions in Levin classes?
· Not semantically homogeneous
Ø {braid, clip, file, powder, pluck, etc. . . }
· Multiple class listings
Ø homonymy or polysemy?
· Conflicting alternations?
Ø Carry verbs disallow the Conative (*she carried at the ball), but include {push, pull, shove, kick, draw, yank, tug}
Ø also in the Push/pull class, which does take the Conative (she kicked at the ball)

Intersective Levin Classes
[Figure: intersecting verb classes, labeled “apart” CH-STATE, “across the room” CH-LOC, and “at” ¬CH-LOC]
Dang, Kipper & Palmer, ACL 98

Intersective Levin Classes
· More syntactically and semantically coherent
Ø sets of syntactic patterns
Ø explicit semantic components
Ø relations between senses
VERBNET: www.cis.upenn.edu/verbnet
Dang, Kipper & Palmer, IJCAI 00, Coling 00

VerbNet – Karin Kipper
· Class entries:
Ø Capture generalizations about verb behavior
Ø Organized hierarchically
Ø Members have common semantic elements, semantic roles, and syntactic frames
· Verb entries:
Ø Refer to a set of classes (different senses)
Ø Each class member linked to WN synset(s) (not all WN senses are covered)

Semantic role labels:
Christiane broke the LCD projector.
break(agent(Christiane), patient(LCD-projector))
cause(agent(Christiane), broken(LCD-projector))
agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
patient(P) -> affected(P), change(P), …
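A quick sketch of how such role decompositions can be made operational; the dictionary and function here are invented for illustration, not part of VerbNet:

```python
# Toy encoding of the role-entailment rules above: a thematic role expands
# into its component semantic properties, applied to a filler.
ROLE_ENTAILMENTS = {
    "agent":   ["intentional", "sentient", "causer", "affector"],
    "patient": ["affected", "change"],
}

def expand(role, filler):
    return [f"{prop}({filler})" for prop in ROLE_ENTAILMENTS[role]]

print(expand("agent", "Christiane"))
# ['intentional(Christiane)', 'sentient(Christiane)',
#  'causer(Christiane)', 'affector(Christiane)']
```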

Hand-built resources vs. real data
· VerbNet is based on linguistic theory – how useful is it?
· How well does it correspond to syntactic variations found in naturally occurring text?
=> PropBank

Proposition Bank: From Sentences to Propositions
Powell met Zhu Rongji. (cf. battle, wrestle, join, debate, consult)
Powell and Zhu Rongji met. / Powell met with Zhu Rongji. / Powell and Zhu Rongji had a meeting.
Proposition: meet(Powell, Zhu Rongji)
meet(Somebody1, Somebody2)
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))

Capturing semantic roles*
· George broke [Arg1 the laser pointer].
· [Arg1 The windows] were broken by the hurricane.
· [Arg1 The vase] broke into pieces when it toppled over.
(The SUBJ position is filled differently in each sentence, but Arg1 stays constant.)
*See also FrameNet, http://www.icsi.berkeley.edu/~framenet/

An English lexical resource is required
· that provides sets of possible syntactic frames for verbs with semantic role labels,
· and that provides clear, replicable sense distinctions.

A TreeBanked Sentence
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP an eventual (ADJP 30 %) stake)
                                    (PP-LOC in (NP the British company)))))))))))

The same sentence, PropBanked

(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                        (SBAR (WHNP-1 that)
                              (S Arg0 (NP-SBJ *T*-1)
                                 (VP would
                                     (VP give
                                         Arg2 (NP the U.S. car maker)
                                         Arg1 (NP an eventual (ADJP 30 %) stake)
                                         (PP-LOC in (NP the British company)))))))))))

expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)

Frames File Example: expect
Roles:
Arg0: expecter
Arg1: thing expected
Example: transitive, active:
Portfolio managers expect further declines in interest rates.
Arg0: Portfolio managers
REL: expect
Arg1: further declines in interest rates

Frames File example: give
Roles:
Arg0: giver
Arg1: thing given
Arg2: entity given to
Example: double object
The executives gave the chefs a standing ovation.
Arg0: The executives
REL: gave
Arg2: the chefs
Arg1: a standing ovation
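PropBank distributes frames files as XML; purely as an illustration of their content, the give entry above might be modeled like this (a hypothetical simplification, not the actual file format):

```python
# Hypothetical in-memory model of a PropBank frames-file entry; the
# values come from the 'give' slide above.
frameset_give = {
    "lemma": "give",
    "roles": {"Arg0": "giver", "Arg1": "thing given", "Arg2": "entity given to"},
    "examples": [{
        "name": "double object",
        "text": "The executives gave the chefs a standing ovation.",
        "args": {"Arg0": "The executives", "REL": "gave",
                 "Arg2": "the chefs", "Arg1": "a standing ovation"},
    }],
}

for arg, desc in frameset_give["roles"].items():
    print(f"{arg}: {desc}")
```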

Word Senses in PropBank
· Orders to ignore word sense not feasible for 700+ verbs
Ø Mary left the room
Ø Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from": Arg0: entity leaving, Arg1: place left
Frameset leave.02 "give": Arg0: giver, Arg1: thing given, Arg2: beneficiary
How do these relate to traditional word senses in VerbNet and WordNet?

Annotation procedure
· PTB II – extraction of all sentences with a given verb
· Create Frame File for that verb (Paul Kingsbury)
Ø 3100+ lemmas, 4400 framesets, 118 K predicates
Ø Over 300 created automatically via VerbNet
· First pass: automatic tagging (Joseph Rosenzweig)
Ø http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon
· Second pass: double-blind hand correction (Paul Kingsbury); tagging tool highlights discrepancies (Scott Cotton)
· Third pass: Solomonization (adjudication)
Ø Betsy Klipple, Olga Babko-Malaya

Trends in Argument Numbering
· Arg0 = agent
· Arg1 = direct object / theme / patient
· Arg2 = indirect object / benefactive / instrument / attribute / end state
· Arg3 = start point / benefactive / instrument / attribute
· Arg4 = end point
· Per word vs. frame level – more general?

Additional tags (arguments or adjuncts?)
· Variety of ArgMs (Arg# > 4):
Ø TMP – when?
Ø LOC – where at?
Ø DIR – where to?
Ø MNR – how?
Ø PRP – why?
Ø REC – himself, themselves, each other
Ø PRD – this argument refers to or modifies another
Ø ADV – others

Function tags for Chinese (arguments or adjuncts?)
· Variety of ArgMs (Arg# > 4):
Ø TMP – when?
Ø LOC – where at?
Ø DIR – where to?
Ø MNR – how?
Ø PRP – why?
Ø TPC – topic
Ø PRD – this argument refers to or modifies another
Ø ADV – others
Ø CND – conditional
Ø DGR – degree
Ø FRQ – frequency

Additional function tags for Chinese phrasal verbs
· Correspond to groups of “prepositions”:
Ø AS
Ø AT
Ø INTO
Ø ONTO
Ø TOWARDS

Inflection
· Verbs also marked for tense/aspect:
Ø Passive/Active
Ø Perfect/Progressive
Ø Third singular (is, has, does, was)
Ø Present/Past/Future
Ø Infinitives/Participles/Gerunds/Finites
· Modals and negations marked as ArgMs

Frames: Multiple Framesets
· Framesets are not necessarily consistent between different senses of the same verb
· Framesets are consistent between different verbs that share similar argument structures (like FrameNet)
· Out of the 787 most frequent verbs:
Ø 1 frameset – 521
Ø 2 framesets – 169
Ø 3+ framesets – 97 (includes light verbs)

Ergative/Unaccusative Verbs
Roles (no Arg0 for unaccusative verbs):
Arg1 = logical subject, patient, thing rising
Arg2 = EXT, amount risen
Arg3* = start point
Arg4 = end point
Sales rose 4% to $3.28 billion from $3.16 billion.
The Nasdaq composite index added 1.01 to 456.6 on paltry volume.

Actual data for leave
· http://www.cs.rochester.edu/~gildea/PropBank/Sort/
Leave.01 “move away from”: Arg0 rel Arg1 Arg3
Leave.02 “give”: Arg0 rel Arg1 Arg2

sub-ARG0 obj-ARG1                    44
sub-ARG0                             20
sub-ARG0 NP-ARG1-with obj-ARG2       17
sub-ARG0 sub-ARG2 ADJP-ARG3-PRD      10
sub-ARG1 ADJP-ARG3-PRD                6
sub-ARG0 sub-ARG1 VP-ARG3-PRD         5
NP-ARG1-with obj-ARG2                 4
obj-ARG1                              3
sub-ARG0 sub-ARG2 VP-ARG3-PRD         3

PropBank/FrameNet
Buy: Arg0 = buyer, Arg1 = goods, Arg2 = seller, Arg3 = rate, Arg4 = payment
Sell: Arg0 = seller, Arg1 = goods, Arg2 = buyer, Arg3 = rate, Arg4 = payment
More generic, more neutral – maps readily to VN, TR
Rambow, et al., PMLB 03

Annotator accuracy – ITA 84%
[Figure: annotator accuracy chart]

An English lexical resource is required
· that provides sets of possible syntactic frames for verbs with semantic role labels?
· and that provides clear, replicable sense distinctions.

An English lexical resource is required
· that provides sets of possible syntactic frames for verbs with semantic role labels that can be automatically assigned accurately to new text?
· and that provides clear, replicable sense distinctions.

Automatic Labelling of Semantic Relations
· Stochastic model
· Features:
Ø Predicate
Ø Phrase type
Ø Parse tree path
Ø Position (before/after predicate)
Ø Voice (active/passive)
Ø Head word
Gildea & Jurafsky, CL 02; Gildea & Palmer, ACL 02
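Of these features, the parse tree path is the least obvious: it records the chain of constituent labels from the candidate argument up to the lowest common ancestor and back down to the predicate. A minimal sketch with nltk.Tree, using ASCII ^ and ! in place of the up/down arrows usually seen in the literature:

```python
# Sketch of the parse-tree-path feature. Tree positions in nltk are
# tuples of child indices; the path walks up from the constituent to the
# lowest common ancestor, then down to the predicate.
from nltk import Tree

def tree_path(tree, const_pos, pred_pos):
    """Return a path such as 'NP^S!VP!VBD' (^ = up, ! = down)."""
    i = 0  # length of the common prefix of the two positions
    while i < min(len(const_pos), len(pred_pos)) and const_pos[i] == pred_pos[i]:
        i += 1
    up = [tree[const_pos[:j]].label() for j in range(len(const_pos), i - 1, -1)]
    down = [tree[pred_pos[:j]].label() for j in range(i + 1, len(pred_pos) + 1)]
    return '^'.join(up) + '!' + '!'.join(down)

t = Tree.fromstring("(S (NP (NNP George)) (VP (VBD broke) (NP (DT the) (NN pointer))))")
print(tree_path(t, (0,), (1, 0)))  # NP^S!VP!VBD: from the subject NP to 'broke'
```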

Semantic Role Labelling Accuracy – Known Boundaries

                       FrameNet ≥ 10 inst.   PropBank   PropBank ≥ 10 instances
Gold Standard parses   –                     77.0       83.1
Automatic parses       82.0                  73.6       79.6

· Accuracy of semantic role prediction for known boundaries – the system is given the constituents to classify.
· FrameNet examples (training/test) are hand-picked to be unambiguous.
· Lower performance with unknown boundaries.
· Higher performance with traces.
· Almost evens out.

Additional Automatic Role Labelers
· Performance improved from 77% to 88% (Colorado)
Ø (Gold Standard parses, < 10 instances)
Ø Same features plus:
§ Named Entity tags
§ Head word POS
§ For unseen verbs – backoff to automatic verb clusters
Ø SVMs:
§ Role or not role
§ For each likely role, for each Arg#, Arg# or not
§ No overlapping role labels allowed
Pradhan et al., ICDM 03; Surdeanu et al., ACL 03; Chen & Rambow, EMNLP 03; Gildea & Hockenmaier, EMNLP 03
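A toy version of that SVM setup with scikit-learn, training one binary "Arg# or not" classifier per role; the feature values are invented placeholders, not real PropBank data:

```python
# One binary classifier per role label over per-constituent features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train = [
    ({"pred": "break", "phrase": "NP", "path": "NP^S!VP!VBD",
      "position": "before", "voice": "active", "head": "George"}, "Arg0"),
    ({"pred": "break", "phrase": "NP", "path": "NP^VP!VBD",
      "position": "after", "voice": "active", "head": "pointer"}, "Arg1"),
]
X, y = zip(*train)

classifiers = {}
for role in sorted(set(y)):
    clf = make_pipeline(DictVectorizer(), LinearSVC())
    clf.fit(list(X), [label == role for label in y])  # role vs. not-role
    classifiers[role] = clf
```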

Word Senses in PropBank
· Orders to ignore word sense not feasible for 700+ verbs
Ø Mary left the room
Ø Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from": Arg0: entity leaving, Arg1: place left
Frameset leave.02 "give": Arg0: giver, Arg1: thing given, Arg2: beneficiary
How do these relate to traditional word senses in VerbNet and WordNet?

Mapping from PropBank to VerbNet
Frameset id = leave.02, sense = give, VerbNet class = future_having-13.3

Arg0   Giver         Agent
Arg1   Thing given   Theme
Arg2   Benefactive   Recipient
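A hypothetical encoding of this mapping as a lookup table, useful for projecting PropBank Arg labels onto VerbNet thematic roles (the table and function names are invented for this sketch):

```python
# Frameset-to-class role mapping from the slide above.
PB_TO_VN = {
    ("leave.02", "future_having-13.3"): {
        "Arg0": "Agent",      # PropBank: giver
        "Arg1": "Theme",      # PropBank: thing given
        "Arg2": "Recipient",  # PropBank: benefactive
    },
}

def vn_role(frameset, vn_class, arg):
    return PB_TO_VN.get((frameset, vn_class), {}).get(arg)

print(vn_role("leave.02", "future_having-13.3", "Arg2"))  # Recipient
```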

Mapping from PB to VerbNet
[Figure]

Mapping from PropBank to VerbNet
· Overlap with PropBank framesets
Ø 50,000 PropBank instances
Ø < 50% of VN entries, > 85% of VN classes
· Results
Ø MATCH – 78.63% (80.90% relaxed)
Ø (VerbNet isn't just linguistic theory!)
· Benefits
Ø Thematic role labels and semantic predicates
Ø Can extend PropBank coverage with VerbNet classes
Ø WordNet sense tags
Kingsbury & Kipper, NAACL 03, Text Meaning Workshop
http://www.cs.rochester.edu/~gildea/VerbNet/

WordNet as a WSD sense inventory
· Senses unnecessarily fine-grained?
· Word Sense Disambiguation bakeoffs:
Ø Senseval-1 – Hector, ITA = 95.5%
Ø Senseval-2 – WordNet 1.7, ITA verbs = 71%
Ø Groupings of Senseval-2 verbs, ITA = 82%
§ Used syntactic and semantic criteria

Groupings Methodology (w/ Dang and Fellbaum)
· Double-blind groupings, adjudication
· Syntactic criteria (VerbNet was useful)
Ø Distinct subcategorization frames
§ call him a bastard
§ call him a taxi
Ø Recognizable alternations – regular sense extensions:
§ play an instrument
§ play a song
§ play a melody on an instrument
SIGLEX 01, SIGLEX 02, JNLE 04

Groupings Methodology (cont.)
· Semantic criteria
Ø Differences in semantic classes of arguments
§ Abstract/concrete, human/animal, animate/inanimate, different instrument types, …
Ø Differences in the number and type of arguments
§ Often reflected in subcategorization frames
§ John left the room.
§ I left my pearls to my daughter-in-law in my will.
Ø Differences in entailments
§ Change of prior entity or creation of a new entity?
Ø Differences in types of events
§ Abstract/concrete/mental/emotional/…
Ø Specialized subject domains

Results – averaged over 28 verbs (Dang and Palmer, Siglex 02; Dang et al., Coling 02)

Total WN polysemy   16.28
Group polysemy       8.07
ITA-fine            71%
ITA-group           82%
MX-fine             60.2%
MX-group            69%

MX – Maximum Entropy WSD, p(sense|context)
Features: topic, syntactic constituents, semantic classes (+2.5%, +1.5 to +5%, +6%)

Grouping improved ITA and MaxEnt WSD
· Call: 31% of errors due to confusion between senses within same group 1:
Ø name, call -- (assign a specified, proper name to; They named their son David)
Ø call -- (ascribe a quality to or give a name of a common noun that reflects a quality; He called me a bastard)
Ø call -- (consider or regard as being; I would not call her beautiful)
· 75% accuracy with training and testing on grouped senses vs.
· 43% with training and testing on fine-grained senses

WordNet – call, 28 senses, groups:
Ø Loud cry: WN 5, WN 16, WN 12
Ø Label: WN 3, WN 19, WN 1
Ø Challenge: WN 22, WN 18, WN 27
Ø Phone/radio: WN 2, WN 13, WN 28
Ø Bird or animal cry: WN 17, WN 11, WN 15, WN 26
Ø Request: WN 4, WN 7, WN 8, WN 9
Ø Call a loan/bond: WN 20, WN 6, WN 25
Ø Visit: WN 23
Ø Bid: WN 10, WN 14, WN 21, WN 24


Overlap between Groups and Framesets – 95%
[Figure: the WordNet senses of develop partitioned into Frameset 1 and Frameset 2]
Palmer, Dang & Fellbaum, NLE 2004

Sense Hierarchy
· PropBank Framesets – coarse-grained distinctions
Ø Sense Groups (Senseval-2) – intermediate level (includes Levin classes) – 95% overlap
§ WordNet – fine-grained distinctions

An English lexical resource is available
ü that provides sets of possible syntactic frames for verbs with semantic role labels that can be automatically assigned accurately to new text,
ü and that provides clear, replicable sense distinctions.

Summary of English PropBank
Paul Kingsbury, Olga Babko-Malaya, Scott Cotton

Genre                                        Words    Frames Files   Frameset Tags   Released
Wall Street Journal* (financial subcorpus)   300 K    < 2000         400             July 02
Wall Street Journal* (Penn TreeBank II)      1000 K   < 4000         700             Dec 03? (March 03)
English Translation of Chinese TreeBank*     100 K    < 1500                         July 04
Sinorama, English corpus (NSF-ITR funding)   150 K    < 2000                         July 05
English half of DLI Military Corpus (ARL)    50 K     < 1000                         July 05

* ITIC funding

A Chinese Treebank Sentence
国会/Congress 最近/recently 通过/pass 了/ASP 银行法/banking law
“The Congress passed the banking law recently.”

(IP (NP-SBJ (NN 国会/Congress))
    (VP (ADVP (ADV 最近/recently))
        (VP (VV 通过/pass)
            (AS 了/ASP)
            (NP-OBJ (NN 银行法/banking law)))))
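Bracketed Treebank parses like this one can be loaded directly with nltk.Tree for inspection; a small sketch, with the English glosses stripped so each token is a single terminal:

```python
# Reading the bracketed Chinese Treebank parse above with NLTK.
from nltk import Tree

s = """(IP (NP-SBJ (NN 国会))
           (VP (ADVP (ADV 最近))
               (VP (VV 通过) (AS 了) (NP-OBJ (NN 银行法)))))"""
tree = Tree.fromstring(s)
print(tree.leaves())  # ['国会', '最近', '通过', '了', '银行法']
for sub in tree.subtrees(lambda t: t.label() == 'NP-OBJ'):
    print(sub)        # (NP-OBJ (NN 银行法))
```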

The same sentence, PropBanked

(IP arg0 (NP-SBJ (NN 国会))
    (VP argM (ADVP (ADV 最近))
        (VP f2 (VV 通过)
            (AS 了)
            arg1 (NP-OBJ (NN 银行法)))))

通过.f2 (pass): arg0 国会 (Congress), argM 最近 (recently), arg1 银行法 (banking law)

Chinese PropBank Status (w/ Bert Xue and Scott Cotton)
· Create Frame File for each verb
Ø Similar alternations – causative/inchoative, unexpressed object
Ø 5000 lemmas, 2000 DONE (hired Jiang)
· First pass: automatic tagging – 2000 DONE
Ø Subcat frame matcher (Xue & Kulick, MT 03)
· Second pass: double-blind hand correction
Ø In progress (includes frameset tagging), 600 DONE
Ø Ported RATS to CATS, in use since May
· Third pass: Solomonization (adjudication)

Summary of Chinese PropBank
Nianwen Xue, Meiyu Chang, Zhiyi, Ping

Genre                        Words   Frames Files   Frameset Tags   Released
Xinhua News (DOD funding)    250 K   4867           200             July 04
Sinorama (NSF-ITR funding)   150 K   < 4000                         July 05

A Korean Treebank Sentence
그는 르노가 3 월말까지 인수제의 시한을 갖고 있다고 덧붙였다.
"He added that Renault has a deadline until the end of March for a merger proposal."

(S (NP-SBJ 그/NPN+은/PAU)
   (VP (S-COMP (NP-SBJ 르노/NPR+이/PCA)
               (VP (NP-ADV 3/NNU 월/NNX+말/NNX+까지/PAU)
                   (VP (NP-OBJ 인수/NNC+제의/NNC 시한/NNC+을/PCA)
                       갖/VV+고/ECS))
               있/VX+다/EFN+고/PAD)
       덧붙이/VV+었/EPF+다/EFN)
   ./SFN)

The same sentence, PropBanked

(S Arg0 (NP-SBJ 그/NPN+은/PAU)
   (VP Arg2 (S-COMP (Arg0 NP-SBJ 르노/NPR+이/PCA)
                    (VP (ArgM NP-ADV 3/NNU 월/NNX+말/NNX+까지/PAU)
                        (VP (Arg1 NP-OBJ 인수/NNC+제의/NNC 시한/NNC+을/PCA)
                            갖/VV+고/ECS))
                    있/VX+다/EFN+고/PAD)
       덧붙이/VV+었/EPF+다/EFN)
   ./SFN)

덧붙이다 (add): Arg0 그는 (he), Arg2 르노가 3 월말까지 인수제의 시한을 갖고 있다 (that Renault has a deadline until the end of March for a merger proposal)
갖다 (have): Arg0 르노가 (Renault), ArgM 3 월말까지 (until the end of March), Arg1 인수제의 시한을 (a deadline for a merger proposal)

Korean PropBank funded by ARO – CORK
· 50 K words Virginia Corpus of Korean Treebank
· New semantic augmentations
Ø Predicate-argument relations for predicates
Ø Label arguments: Arg0, Arg1, Arg2 …
· Korean lexical resource – Frames files
Ø Use "semantic role glosses" unique to each predicate
Ø Create manually for 900 predicates
Ø Refer to the hand-corrected DSynt files from Virginia
· Extend to newswire domain – 130 K words

PropBank II
· Nominalizations (NYU)
· Lexical Frames – DONE
· Event variables (including temporals and locatives)
· More fine-grained sense tagging
Ø Tagging nominalizations w/ WordNet sense
Ø Selected verbs and nouns
· Nominal coreference
Ø not names
· Clausal discourse connectives – selected subset

PropBank II: event variables; sense tags; nominal reference; discourse connectives

{Also} [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].

ID#   REL           Arg0                      Arg1                                   Arg3-PRD   ArgM-ADV
h23   help (2, 5)   tax rates (tax rate 1)    the company keep its tax outlay flat …
k16   keep (1)      the company (company 1)   its tax outlay                         flat       relative to earnings…

Summary of Multilingual TreeBanks, PropBanks (parallel text corpora)

                   Treebank                        PropBank I                 PropBank II
Chinese Treebank   Chinese 500 K / English 400 K   Ch 100 K / English 100 K   English 350 K / En 100 K
Arabic Treebank    Arabic 500 K / English 500 K    Arabic 500 K / English ?   ?
Korean Treebank    Korean 180 K / English 50 K

Levin class: escape-51.1-1
· WordNet senses: WN 1, 5, 8
· Thematic roles: Location[+concrete], Theme[+concrete]
· Frames with semantics:
Ø Basic intransitive: "The convict escaped"
  motion(during(E), Theme), direction(during(E), Prep, Theme, ~Location)
Ø Intransitive (+ path PP): "The convict escaped from the prison"
Ø Locative preposition drop: "The convict escaped the prison"

Levin class: future_having-13.3
· WordNet senses: WN 2, 10, 13
· Thematic roles: Agent[+animate OR +organization], Recipient[+animate OR +organization], Theme[]
· Frames with semantics:
Ø Dative: "I promised somebody my time" (Agent V Recipient Theme)
  has_possession(start(E), Agent, Theme), future_possession(end(E), Recipient, Theme), cause(Agent, E)
Ø Transitive (+ Recipient PP): "We offered our paycheck to her" (Agent V Theme Prep(to) Recipient)
Ø Transitive (Theme object): "I promised my house (to somebody)" (Agent V Theme)
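Purely as an illustration (VerbNet itself is distributed as XML), the class entry above might be rendered as a Python structure like this:

```python
# Hypothetical rendering of the future_having-13.3 entry; field names
# are invented for this sketch, values come from the slide above.
future_having = {
    "class": "future_having-13.3",
    "wordnet_senses": ["WN 2", "WN 10", "WN 13"],
    "themroles": {
        "Agent": "[+animate OR +organization]",
        "Recipient": "[+animate OR +organization]",
        "Theme": "[]",
    },
    "frames": [{
        "description": "Dative",
        "example": "I promised somebody my time",
        "syntax": ["Agent", "V", "Recipient", "Theme"],
        "semantics": ["has_possession(start(E), Agent, Theme)",
                      "future_possession(end(E), Recipient, Theme)",
                      "cause(Agent, E)"],
    }],
}

for frame in future_having["frames"]:
    print(frame["description"], "->", " ".join(frame["syntax"]))
```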

Actual data for leave
· http://www.cs.rochester.edu/~gildea/PropBank/Sort/
Leave.01 “move away from”: Arg0 rel Arg1 Arg3
Leave.02 “give”: Arg0 rel Arg1 Arg2

sub-ARG0 obj-ARG1                    44
sub-ARG0                             20
sub-ARG0 NP-ARG1-with obj-ARG2       17
sub-ARG0 sub-ARG2 ADJP-ARG3-PRD      10
sub-ARG1 ADJP-ARG3-PRD                6
sub-ARG0 sub-ARG1 VP-ARG3-PRD         5
NP-ARG1-with obj-ARG2                 4
obj-ARG1                              3
sub-ARG0 sub-ARG2 VP-ARG3-PRD         3

SENSEVAL – Word Sense Disambiguation Evaluation
DARPA-style bakeoff: training data, testing data, scoring algorithm.

                        SENSEVAL 1 (1998)   SENSEVAL 2 (2001)
Languages               3                   12
Systems                 24                  90
Eng. Lexical Sample     Yes                 Yes
Verbs/Poly/Instances    13/12/215           29/16/110

NLE 99, CHUM 01, NLE 02, NLE 03

Maximum Entropy WSD (Hoa Dang), best performer on verbs
· Maximum entropy framework, p(sense|context)
· Contextual linguistic features:
Ø Topical feature for W:
§ keywords (determined automatically)
Ø Local syntactic features for W:
§ presence of subject, complements, passive?
§ words in subject, complement positions, particles, preps, etc.
Ø Local semantic features for W:
§ semantic class info from WordNet (synsets, etc.)
§ Named Entity tag (PERSON, LOCATION, ..) for proper Ns
§ words within +/- 2 word window
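A minimal sketch of such a classifier: multinomial logistic regression is the standard maximum-entropy formulation, and the window feature below is a toy stand-in for the full feature set described on this slide:

```python
# Maximum-entropy WSD, p(sense | context), with a +/-2 word window feature.
# The tiny training set for "call" is invented for illustration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def context_features(tokens, i):
    """Toy features for the target word tokens[i]: the +/-2 word window."""
    feats = {}
    for off in (-2, -1, 1, 2):
        if 0 <= i + off < len(tokens):
            feats[f"w{off:+d}={tokens[i + off].lower()}"] = 1
    return feats

train = [
    (context_features("I will call you tonight".split(), 2), "phone"),
    (context_features("they call him a hero".split(), 1), "label"),
]
X, y = zip(*train)
model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(list(X), list(y))
print(model.predict([context_features("please call me tomorrow".split(), 1)]))
```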