
EECS 595 / LING 541 / SI 661
Natural Language Processing
Fall 2004
Lecture Notes #4

Parsing with Context-Free Grammars

Introduction
• Parsing = associating a structure (parse tree) with an input string using a grammar
• CFGs are declarative: they don't specify how the parse tree will be constructed
• Parse trees are used in grammar checking, semantic analysis, machine translation, question answering, information extraction
• Example: “How many people in the Human Resources Department receive salaries above $30,000?”

Parsing as search
Grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Det Nominal
NP → Proper-Noun
Nominal → Noun
Nominal → Noun Nominal
Nominal → Nominal PP
VP → Verb
VP → Verb NP
Lexicon:
Det → that | this | a
Noun → book | flight | meal | money
Verb → book | include | prefer
Aux → does
Prep → from | to | on
Proper-Noun → Houston | TWA

Parsing as search
Book that flight.
Two types of constraints on the parses:
a) some that come from the input string,
b) others that come from the grammar
Parse tree: [S [VP [Verb Book] [NP [Det that] [Nom [Noun flight]]]]]

Top-down parsing
[Figure: top-down search space. Starting from S, the first ply expands S into NP VP, Aux NP VP, and VP; the next ply expands NP into Det Nom or PropN, and VP into V NP or V.]

Bottom-up parsing
[Figure: bottom-up search space for "Book that flight". The words are first tagged (Book as Noun or Verb, that as Det, flight as Noun), then combined step by step into NOM, NP and VP constituents.]

Comparing TD and BU parsers
• TD never wastes time exploring trees that cannot result in an S.
• BU however never spends effort on trees that are not consistent with the input.
• Needed: some middle ground.

Basic TD parser
• Practically infeasible to generate all trees in parallel.
• Use depth-first strategy.
• When arriving at a tree that is inconsistent with the input, return to the most recently generated but still unexplored tree.

A TD-DF-LR parser
function TOP-DOWN-PARSE (input, grammar) returns a parse tree
  agenda ← (Initial S tree, Beginning of input)
  current-search-state ← POP (agenda)
  loop
    if SUCCESSFUL-PARSE? (current-search-state) then
      return TREE (current-search-state)
    else
      if CAT (NODE-TO-EXPAND (current-search-state)) is a POS then
        if CAT (node-to-expand) ∈ POS (CURRENT-INPUT (current-search-state)) then
          PUSH (APPLY-LEXICAL-RULE (current-search-state), agenda)
        else
          return reject
      else
        PUSH (APPLY-RULES (current-search-state, grammar), agenda)
    if agenda is empty then
      return reject
    else
      current-search-state ← NEXT (agenda)
  end
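The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the lecture's implementation: the search state is a hypothetical triple (symbols still to expand, input position, rules used so far), the function name `top_down_parse` is invented, and the left-recursive rule Nominal → Nominal PP is left out because a depth-first top-down search would loop on it. On a part-of-speech mismatch the sketch drops the state and backtracks to the next agenda item instead of rejecting outright.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun", "Nominal"], ["Noun"]],  # left-recursive rule omitted
    "VP": [["Verb", "NP"], ["Verb"]],
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Proper-Noun": {"houston", "twa"},  # stored lowercase for matching
}


def top_down_parse(words, grammar=GRAMMAR, lexicon=LEXICON):
    """Depth-first, left-to-right top-down search.

    Returns the leftmost derivation as a list of (lhs, rhs) rules,
    or None if no parse exists.
    """
    words = [w.lower() for w in words]
    # Each state: (symbols still to expand, input position, rules used so far).
    agenda = [(("S",), 0, ())]
    while agenda:
        pending, pos, derivation = agenda.pop()  # LIFO order = depth-first
        if not pending:
            if pos == len(words):
                return list(derivation)          # successful parse
            continue                             # input left over: backtrack
        first, rest = pending[0], pending[1:]
        if first in lexicon:                     # node to expand is a POS
            if pos < len(words) and words[pos] in lexicon[first]:
                step = (first, (words[pos],))
                agenda.append((rest, pos + 1, derivation + (step,)))
            # On a mismatch the state is dropped and the loop backtracks.
        else:
            # Push in reverse so the first rule listed is explored first.
            for rhs in reversed(grammar.get(first, [])):
                step = (first, tuple(rhs))
                agenda.append((tuple(rhs) + rest, pos, derivation + (step,)))
    return None
```

For example, `top_down_parse("book that flight".split())` first tries S → NP VP and S → Aux NP VP, fails on the first word both times, and only then succeeds with S → VP. This illustrates how a top-down parser spends effort on trees that are inconsistent with the input.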

An example
Does this flight include a meal?

Problems with the basic parser
• Left-recursion: rules of the type NP → NP PP
  Solution: rewrite each rule of the form A → Aβ | α using a new symbol A′:
  A → αA′
  A′ → βA′ | ε
• Ambiguity: attachment ambiguity, coordination ambiguity, noun-phrase bracketing ambiguity
• Attachment ambiguity: I saw the Grand Canyon flying to New York
• Coordination ambiguity: old men and women
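The left-recursion rewrite can be sketched as a small function. This is an illustrative sketch that handles only immediate left recursion (a rule whose right-hand side starts with its own left-hand side); the function name and the convention of naming the new symbol by appending an apostrophe are assumptions.

```python
def remove_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A b | a as A -> a A', A' -> b A' | epsilon.

    `productions` is a list of right-hand sides (lists of symbols).
    The empty list [] stands for epsilon.
    """
    # The "b" parts: what follows the leading A in each left-recursive rule.
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # The "a" parts: every rule that does not start with A.
    others = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    new_symbol = nonterminal + "'"  # hypothetical name for the fresh symbol
    return {
        nonterminal: [alpha + [new_symbol] for alpha in others],
        new_symbol: [beta + [new_symbol] for beta in recursive] + [[]],
    }
```

Applied to NP → NP PP | Det Nominal, this yields NP → Det Nominal NP′ and NP′ → PP NP′ | ε. The rewritten grammar accepts the same strings but assigns right-branching trees, so the original structure has to be recovered separately if it matters.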

Problems with the basic parser
• Example: President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio.
• Solutions: return all parses or include disambiguation in the parser.
• Inefficient reparsing of subtrees: a flight from Indianapolis to Houston on TWA

The Earley algorithm
• Resolving:
– Left-recursive rules
– Ambiguity
– Inefficient reparsing of subtrees
• A chart with N+1 entries
• Dotted rules
– S → • VP, [0, 0]
– NP → Det • Nominal, [1, 2]
– VP → V NP •, [0, 3]
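The chart and dotted rules can be made concrete with a small recognizer. This is a simplified sketch, not the algorithm as presented in the lecture: a state is a hypothetical tuple (lhs, rhs, dot, origin), the scanner advances the dot over a part of speech directly rather than adding a separate lexical state, and the rule PP → Prep NP is an assumption added so that the left-recursive rule Nominal → Nominal PP can actually be exercised.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"], ["Nominal", "PP"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "PP": [["Prep", "NP"]],  # assumed rule, not on the slides
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on"},
    "Proper-Noun": {"houston", "twa"},  # stored lowercase for matching
}


def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    """Return True if the words can be derived from `start`."""
    words = [w.lower() for w in words]
    n = len(words)
    # chart[i] holds the states (lhs, rhs, dot, origin) that end at position i.
    chart = [[] for _ in range(n + 1)]

    def add(i, state):
        if state not in chart[i]:  # never store a state twice
            chart[i].append(state)

    add(0, ("GAMMA", (start,), 0, 0))  # dummy start state
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while it is processed
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in lexicon:  # scanner: match a part of speech
                    if i < n and words[i] in lexicon[nxt]:
                        add(i + 1, (lhs, rhs, dot + 1, origin))
                else:  # predictor: expand a nonterminal
                    for expansion in grammar.get(nxt, []):
                        add(i, (nxt, tuple(expansion), 0, i))
            else:  # completer: advance the states that were waiting for lhs
                for w_lhs, w_rhs, w_dot, w_origin in list(chart[origin]):
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        add(i, (w_lhs, w_rhs, w_dot + 1, w_origin))
    return ("GAMMA", (start,), 1, 0) in chart[n]
```

Because `add` refuses duplicates, predicting Nominal → Nominal PP does not loop, and a constituent found once is stored once. That is how the chart addresses both left recursion and repeated reparsing of subtrees.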

Parsing with FSAs
• Shallow parsing
• Useful for information extraction: noun phrases, verb phrases, locations, etc.
• The FASTUS system (Appelt and Israel, 1997)
• Sample rules for noun groups:
  NG → Pronoun | Time-NP | Date-NP
  NG → (DETP) (Adjs) HdNns | DETP Ving HdNns
  DETP → DETP-CP | DETP-CP
• Complete determiner-phrases: “the only five”, “another three”, “this”, “many”, “hers”, “all”, “the most”
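Since regular expressions and finite-state automata describe the same languages, a rule like NG → (DETP) (Adjs) HdNns can be approximated by a regular expression over part-of-speech tags. The sketch below is only an illustration of that idea and does not reproduce FASTUS: the one-letter tag set, the pattern and the function name are all assumptions.

```python
import re

# Hypothetical one-letter tags: D = determiner, A = adjective, N = noun,
# V = verb, P = preposition. One character per token keeps string offsets
# aligned with token indices.
NOUN_GROUP = re.compile(r"D?A*N+")  # loosely mirrors NG -> (DETP) (Adjs) HdNns


def noun_groups(tagged):
    """Return the noun groups found in a list of (word, tag) pairs."""
    tags = "".join(tag for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in NOUN_GROUP.finditer(tags)
    ]
```

Given the tagged tokens a/D local/A concern/N produce/V golf/N clubs/N, the pattern picks out "a local concern" and "golf clubs" without building a full parse tree.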

Sample FASTUS output
Labels: Company Name, Verb Group, Noun Group, Preposition, Location, Preposition, Noun Group, Conjunction, Noun Group, Verb Group, Preposition, Location
Text: Bridgestone Sports Co. said Friday it had set up a joint venture in Taiwan with a local concern and a Japanese trading house to produce golf clubs to be shipped to Japan

Features and unification

Introduction
• Grammatical categories have properties
• Constraint-based formalisms
• Example: this flights: agreement is difficult to handle at the level of grammatical categories
• Example: many water: count/mass nouns
• Sample rule that takes into account features: S → NP VP (but only if the number of the NP is equal to the number of the VP)

Feature structures
[CAT NP, NUMBER SINGULAR, PERSON 3]
[CAT NP, AGREEMENT [NUMBER SG, PERSON 3]]
Feature paths: {x agreement number}
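A feature structure of this kind maps naturally onto a nested dictionary, and a feature path onto a sequence of keys. The following is a minimal sketch; the names `NP_3SG` and `follow_path` are illustrative, and reentrant (shared) values are not modelled.

```python
# The second structure from the slide, with AGREEMENT as an embedded structure.
NP_3SG = {"CAT": "NP", "AGREEMENT": {"NUMBER": "SG", "PERSON": "3"}}


def follow_path(structure, path):
    """Follow a feature path; return the value reached, or None if it is absent."""
    for feature in path:
        if not isinstance(structure, dict) or feature not in structure:
            return None
        structure = structure[feature]
    return structure
```

Here `follow_path(NP_3SG, ["AGREEMENT", "NUMBER"])` corresponds to the path {x agreement number} and reaches the atomic value SG.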

Unification
[NUMBER SG] ⊔ [NUMBER SG]: succeeds (+)
[NUMBER SG] ⊔ [NUMBER PL]: fails (−)
[NUMBER SG] ⊔ [NUMBER []] = [NUMBER SG]
[NUMBER SG] ⊔ [PERSON 3] = ?
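The four cases can be checked with a short recursive function. This is a minimal sketch assuming feature structures are nested dictionaries, atomic values are strings, and the empty dictionary stands for the unspecified value []; it does not handle reentrancy, and it uses `None` to signal failure.

```python
def unify(f, g):
    """Unify two feature structures; return the result, or None on failure."""
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)  # copy so that neither argument is modified
        for feature, value in g.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # a clash anywhere makes the whole unification fail
                result[feature] = merged
            else:
                result[feature] = value  # features only in g are carried over
        return result
    if f == {}:  # an unspecified value unifies with anything
        return g
    if g == {}:
        return f
    return f if f == g else None  # atoms must be identical
```

Under this definition the last case on the slide succeeds: the two structures mention different features, so nothing clashes and the result is [NUMBER SG, PERSON 3].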

Agreement
• S → NP VP
  {NP AGREEMENT} = {VP AGREEMENT}
• Does this flight serve breakfast?
• Do these flights serve breakfast?
• S → Aux NP VP
  {Aux AGREEMENT} = {NP AGREEMENT}

Agreement
• These flights
• This flight
• NP → Det Nominal
  {Det AGREEMENT} = {Nominal AGREEMENT}
• Verb → serve
  {Verb AGREEMENT NUMBER} = PL
• Verb → serves
  {Verb AGREEMENT NUMBER} = SG
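The agreement constraints can be sketched as checks over flat feature dictionaries. The entries for serve and serves follow the slide; the entries for the determiners, nouns and auxiliaries, and all function names, are assumptions made for illustration.

```python
def unify_flat(a, b):
    """Unify two flat feature dicts; None signals a clash."""
    merged = dict(a)
    for feature, value in b.items():
        if merged.setdefault(feature, value) != value:
            return None
    return merged


# AGREEMENT values per word. Only serve/serves come from the slide;
# the remaining entries are assumed.
AGREEMENT = {
    "serve": {"NUMBER": "PL"}, "serves": {"NUMBER": "SG"},
    "this": {"NUMBER": "SG"}, "these": {"NUMBER": "PL"},
    "flight": {"NUMBER": "SG"}, "flights": {"NUMBER": "PL"},
    "does": {"NUMBER": "SG"}, "do": {"NUMBER": "PL"},
}


def np_agreement(det, nominal):
    """NP -> Det Nominal, with {Det AGREEMENT} = {Nominal AGREEMENT}."""
    return unify_flat(AGREEMENT[det], AGREEMENT[nominal])


def aux_np_agree(aux, det, nominal):
    """S -> Aux NP VP, with {Aux AGREEMENT} = {NP AGREEMENT}."""
    np = np_agreement(det, nominal)
    return np is not None and unify_flat(AGREEMENT[aux], np) is not None
```

With these entries "this flight" and "these flights" unify, "this flights" fails, and "Does this flight" passes the auxiliary check while "Do this flight" does not.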

Subcategorization
• VP → Verb
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = INTRANS
• VP → Verb NP
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = TRANS
• VP → Verb NP NP
  {VP HEAD} = {Verb HEAD}
  {VP HEAD SUBCAT} = DITRANS
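One way to read these rules is that the number of NP complements in the VP must match a frame the verb allows. The sketch below assumes exactly that reading; the verb entries in `VERB_SUBCAT` are hypothetical examples, not taken from the lecture.

```python
# Frame required by each VP rule, keyed by the number of NP complements.
FRAME_BY_COMPLEMENTS = {0: "INTRANS", 1: "TRANS", 2: "DITRANS"}

# Hypothetical lexical entries: the SUBCAT values each verb's HEAD allows.
VERB_SUBCAT = {
    "sleep": {"INTRANS"},
    "prefer": {"TRANS"},
    "give": {"TRANS", "DITRANS"},
}


def vp_is_licensed(verb, noun_phrases):
    """Check that the verb's SUBCAT matches the VP rule being applied."""
    frame = FRAME_BY_COMPLEMENTS.get(len(noun_phrases))
    return frame is not None and frame in VERB_SUBCAT.get(verb, set())
```

A transitive-only verb such as the assumed entry for "prefer" is then accepted with one NP and rejected with none, which is the kind of overgeneration the SUBCAT feature is meant to block.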

Readings for next time
• J&M Chapters 12, 13, 20
• Lecture notes #4
• FUF/CFUF documentation