EECS 595 / LING 541 / SI 661 Natural Language Processing Fall 2005 Lecture Notes #3

Context-Free Grammars for English

Context-Free Rules and Trees
• Grammars
• CFG = PSG = BNF
• Derivations, parse trees

Constituency
• Examples:
  – Josephine
  – My neighbor’s cat
  – He
  – Peter, Paul, and Mary
  – The first three people to participate in the competition
  – with (?)
• Preposed and postposed constructions:
  – In the park, he plays with his dog.
  – He plays in the park with his dog.
  – He plays with his dog in the park.

Examples of noun phrases
• Terminals, non-terminals
• Parsing: the process of mapping from a string of words to one or more parse trees

Sentence-level constructions
• Declarative vs. imperative sentences
• Imperative sentences: S → VP
• Yes-no questions: S → Aux NP VP
• Wh-type questions: S → Wh-NP VP
• Fronting (less frequent): On Tuesday, I would like to fly to San Diego

Noun phrase
• Before the noun
  – Determiner: a, the, that, this, those, any, some
  – No determiner (e.g., in plural, mass nouns “dinner”)
  – Predeterminers: all
  – Postdeterminers: cardinals, ordinals, quantifiers: one, two; first, second, next, last, past, other, another; many, (a) few, several, much, a little
  – Adjectives: a first-class fare, a nonstop flight, the longest layover
  – AP: the least expensive fare
  – NP → (Det) (Card) (Ord) (Quant) (AP) Nominal

Noun phrases (Cont’d)
• Postmodifiers:
  – any stopovers [for Delta seven fifty one]
  – all flights [from Cleveland] [to Newark]
• Nominal → Nominal PP (PP)
• Non-finite postmodifiers: gerundive, -ed, infinitive

Gerunds
• any flights [arriving after ten p.m.]
• Nominal → Nominal GerundVP
• GerundVP → GerundV NP | GerundV PP | GerundV NP PP
• GerundV → being | preferring | arriving …

Infinitives and –ed forms
• the last flight to arrive in Boston
• I need to have dinner served
• which is the aircraft used by this flight?

Postnominal relative clauses
• Restrictive relative clauses:
  – A flight that serves breakfast
  – Flights that leave in the morning
  – The United flight that arrives in San Jose at ten p.m.
• Rules:
  – Nominal → Nominal RelClause
  – RelClause → (who | that) VP
• Multiple postnominal modifiers can be combined:
  – A boy from London studying French in Spain (what are the modifiers in this example?)

Combining post-modifiers
• A flight from Phoenix to Detroit leaving Monday evening
• Evening flights from Nashville to Houston that serve dinner

A slightly more complicated example
• The earliest American Airlines flight that I can get
• What rules are needed in the grammar for this type of construction?

Coordination
• Coordinate noun phrases:
  – NP → NP and NP
  – S → S and S
  – Similar for VP, etc.

Agreement
• Examples:
  – Do any flights stop in Chicago?
  – Do I get dinner on this flight?
  – Does Delta fly from Atlanta to Boston?
  – What flights leave in the morning?
  – * What flight leave in the morning?
• Rules:
  – S → Aux NP VP
  – S → 3sgAux 3sgNP VP
  – S → Non3sgAux Non3sgNP VP
  – 3sgAux → does | has | can …
  – Non3sgAux → do | have | can …

Agreement
• We now need similar rules for pronouns, also for number agreement, etc.
  – 3sgNP → (Det) (Card) (Ord) (Quant) (AP) SgNominal
  – Non3sgNP → (Det) (Card) (Ord) (Quant) (AP) PlNominal
  – SgNominal → SgNoun | SgNoun SgNoun
  – etc.

Combinatorial explosion
• What other phenomena will cause the grammar to expand?
• Solution: parameterization with feature structures (see Chapter 11)
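
To make the explosion concrete, here is a minimal Python sketch that copies one rule once per combination of agreement values, the way the hand-written 3sg / Non3sg rules do. The feature inventory and the helper name `expand_rule` are illustrative assumptions, not part of the lecture.

```python
from itertools import product

# Hypothetical agreement features; every extra feature multiplies the rule count.
FEATURES = {
    "number": ["sg", "pl"],
    "person": ["1", "2", "3"],
}

def expand_rule(lhs, rhs, agreeing, features):
    """Copy one rule once per combination of feature values.

    Symbols listed in `agreeing` get the values attached to their name,
    mimicking hand-written rules such as S -> 3sgAux 3sgNP VP.
    """
    names = list(features)
    rules = []
    for values in product(*(features[n] for n in names)):
        tag = "_".join(values)
        new_rhs = [f"{sym}[{tag}]" if sym in agreeing else sym for sym in rhs]
        rules.append((lhs, new_rhs))
    return rules

rules = expand_rule("S", ["Aux", "NP", "VP"], {"Aux", "NP"}, FEATURES)
print(len(rules))  # 2 numbers x 3 persons = 6 copies of a single rule
```

A feature-structure grammar keeps the single rule and states the agreement constraint once, which is what the parameterization above refers to.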

The Verb phrase
• VP → Verb NP PP
• VP → Verb PP

Sentential complements
• You said there were two flights that were the cheapest
• You said you had a two hundred sixty six dollar fare
• VP → Verb S
• I want to fly from Milwaukee to Orlando
• I’m trying to find a flight that goes from Pittsburgh to Denver next Friday
• VP → Verb VP

Subcategorization
• Frames:
  – ∅: eat, sleep
  – NP: prefer, find, leave
  – NP NP: show, give
  – PPfrom PPto: fly, travel
  – NP PPwith: help, load
  – VPto: prefer, want, need
  – VPbarestem: can, would, might
  – S: mean
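
The frame list above can be read as a small lexicon lookup. The sketch below is one possible encoding in Python; the names `FRAMES`, `SUBCAT`, and `licenses` are assumptions made for illustration, and the table contains only the verbs listed on the slide.

```python
# Frames transcribed from the slide; () stands for the empty frame.
FRAMES = {
    (): ["eat", "sleep"],
    ("NP",): ["prefer", "find", "leave"],
    ("NP", "NP"): ["show", "give"],
    ("PPfrom", "PPto"): ["fly", "travel"],
    ("NP", "PPwith"): ["help", "load"],
    ("VPto",): ["prefer", "want", "need"],
    ("VPbarestem",): ["can", "would", "might"],
    ("S",): ["mean"],
}

# Invert the table: verb -> set of frames it can take.
SUBCAT = {}
for frame, verbs in FRAMES.items():
    for verb in verbs:
        SUBCAT.setdefault(verb, set()).add(frame)

def licenses(verb, complements):
    """True if `verb` is listed with exactly this sequence of complements."""
    return tuple(complements) in SUBCAT.get(verb, set())

print(licenses("prefer", ["NP"]))      # True
print(licenses("prefer", ["VPto"]))    # True: a verb may have several frames
print(licenses("sleep", ["NP"]))       # False
```

Because the table is only as complete as the slide, a frame that is fine in English but not listed (for example a double-object use of "find") is rejected by this lookup.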

Subcategorization ambiguity
• Find me a flight
  – What phenomenon is related to this sentence?
• Others?

Auxiliaries
• Modals: can, could, may, might
• Perfect: have
• Progressive: be
• Passive: be
• What are their subcategories?
• Ordering: modal < perfect < progressive < passive
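
The ordering constraint can be checked mechanically. This is a small Python sketch under the assumption that each auxiliary in a sequence has already been labeled with its type; `AUX_RANK` and `well_ordered` are illustrative names.

```python
# Rank of each auxiliary type, following modal < perfect < progressive < passive.
AUX_RANK = {"modal": 0, "perfect": 1, "progressive": 2, "passive": 3}

def well_ordered(aux_types):
    """True if the auxiliary types appear in strictly increasing rank."""
    ranks = [AUX_RANK[t] for t in aux_types]
    return all(a < b for a, b in zip(ranks, ranks[1:]))

# "could have been being eaten": modal, perfect, progressive, passive
print(well_ordered(["modal", "perfect", "progressive", "passive"]))  # True
# "*have could ...": perfect before modal
print(well_ordered(["perfect", "modal"]))                            # False
```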

Parsing with Context-Free Grammars

Introduction
• Parsing = associating a structure (parse tree) to an input string using a grammar
• CFGs are declarative: they don’t specify how the parse tree will be constructed
• Parsing programming languages is easy: they are designed to be unambiguous and efficiently parsed.
• However, natural languages are inherently ambiguous
  – I saw [the man] [with a telescope].
  – I saw [the man with a telescope].

Applications
• Parse trees are used in
  – Grammar checking: MS Word
  – Semantic analysis: explaining ambiguity
  – Machine translation: parse tree operations
  – Question answering: e.g., “How many people in the Human Resources Department receive salaries above $30,000?”
  – Speech recognition: e.g., Put the file in the folder. Put the file and the folder.
  – Information extraction, information retrieval, etc.

Parsing as search
Grammar:
  S → NP VP
  S → Aux NP VP
  S → VP
  NP → Det Nominal
  Nominal → Noun
  Nominal → Noun Nominal
  NP → Proper-Noun
  VP → Verb NP
  Nominal → Nominal PP
Lexicon:
  Det → that | this | a
  Noun → book | flight | meal | money
  Verb → book | include | prefer
  Aux → does
  Proper-Noun → Houston | TWA
  Prep → from | to | on

Parsing as search
• Book that flight.
• Two types of constraints on the parses:
  a) some that come from the input string,
  b) others that come from the grammar
• Parse tree: [S [VP [Verb Book] [NP [Det that] [Nom [Noun flight]]]]]
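
The two kinds of constraints can be checked separately on a candidate tree. The Python sketch below assumes trees are nested tuples and uses only the rules needed for "Book that flight"; the names `RULES`, `LEXICON`, `leaves`, and `licensed` are illustrative.

```python
# Only the part of the grammar and lexicon that this one sentence needs.
RULES = {
    ("S", ("VP",)),
    ("VP", ("Verb", "NP")),
    ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
}
LEXICON = {("Verb", "book"), ("Det", "that"), ("Noun", "flight")}

def is_preterminal(children):
    return len(children) == 1 and isinstance(children[0], str)

def leaves(tree):
    """Constraint (a): the words at the bottom of the tree, left to right."""
    _, *children = tree
    if is_preterminal(children):
        return [children[0]]
    return [word for child in children for word in leaves(child)]

def licensed(tree):
    """Constraint (b): every local tree must match a rule or a lexicon entry."""
    label, *children = tree
    if is_preterminal(children):
        return (label, children[0].lower()) in LEXICON
    rhs = tuple(child[0] for child in children)
    return (label, rhs) in RULES and all(licensed(c) for c in children)

tree = ("S", ("VP", ("Verb", "Book"),
                    ("NP", ("Det", "that"), ("Nominal", ("Noun", "flight")))))
print(leaves(tree) == "Book that flight".split())  # True: matches the input
print(licensed(tree))                              # True: matches the grammar
```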

Top-down parsing
[Figure: top-down search space. Starting from S, the parser expands S → NP VP, S → Aux NP VP, and S → VP; the next level expands NP → Det Nom, NP → PropN, VP → V NP, and VP → V.]

Bottom-up parsing
[Figure: bottom-up search space for “Book that flight”. Each word is first assigned its parts of speech (Book: Noun or Verb; that: Det; flight: Noun); NOM, NP, and VP nodes are then built upward from these tags.]

Comparing TD and BU parsers
• TD parser
  – never wastes time exploring trees that cannot result in an S
  – but ignores the input until it reaches the “leaves” of the tree.
• BU parser
  – never spends effort on trees that are not consistent with the input.
  – but constructs useless subtrees that do not lead to an S.
• Needed: some middle ground.

Basic TD parser
• Practically infeasible to generate all trees in parallel.
• Use depth-first strategy.
• When arriving at a tree that is inconsistent with the input, return to the most recently generated but still unexplored tree.

A TD-DF-LR (top-down, depth-first, left-to-right) parser
function TOP-DOWN-PARSE(input, grammar) returns a parse tree
  agenda ← (Initial S tree, Beginning of input)
  current-search-state ← POP(agenda)
  loop
    if SUCCESSFUL-PARSE?(current-search-state) then
      return TREE(current-search-state)
    else
      if CAT(NODE-TO-EXPAND(current-search-state)) is a POS then
        if CAT(node-to-expand) ∈ POS(CURRENT-INPUT(current-search-state)) then
          PUSH(APPLY-LEXICAL-RULE(current-search-state), agenda)
        else
          return reject
      else
        PUSH(APPLY-RULES(current-search-state, grammar), agenda)
    if agenda is empty then
      return reject
    else
      current-search-state ← NEXT(agenda)
  end
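
As a rough runnable counterpart to the pseudocode, the following Python sketch performs the same depth-first, left-to-right search with an explicit agenda. It is a simplification under stated assumptions: a search state is a tuple (symbols still to expand, input position, rules used so far), the result is a leftmost derivation rather than a tree, a failed lexical match backtracks instead of rejecting outright, and the PP rules of the grammar are left out because a depth-first top-down search does not terminate on left-recursive rules.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["Proper-Noun"]],
    "Nominal": [["Noun", "Nominal"], ["Noun"]],
    "VP": [["Verb", "NP"]],
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Proper-Noun": {"houston", "twa"},
}

def top_down_parse(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    """Return the rules of a leftmost derivation, or None if none is found."""
    words = [w.lower() for w in words]
    agenda = [((start,), 0, ())]
    while agenda:
        symbols, pos, derivation = agenda.pop()   # LIFO pop gives depth-first search
        if not symbols:
            if pos == len(words):
                return list(derivation)           # all symbols and all input consumed
            continue                              # input left over: dead end
        first, rest = symbols[0], symbols[1:]
        if first in lexicon:                      # part-of-speech node
            if pos < len(words) and words[pos] in lexicon[first]:
                step = (first, words[pos])
                agenda.append((rest, pos + 1, derivation + (step,)))
            # otherwise drop this state and fall back to the agenda
        else:
            # Push in reverse so that the first rule listed is tried first.
            for rhs in reversed(grammar[first]):
                step = (first, tuple(rhs))
                agenda.append((tuple(rhs) + rest, pos, derivation + (step,)))
    return None

for rule in top_down_parse("Book that flight".split()):
    print(rule)
```

The rule order inside `GRAMMAR` decides which trees are explored first, so reordering the alternatives changes how much backtracking happens but not whether a parse is found.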

An example
• Does this flight include a meal?
• We can add bottom-up filtering to eliminate the trees that are inconsistent with the input. This is called left-corner (LC) parsing.

Problems with the basic parser
• Left-recursion: rules of the type NP → NP PP
  – Solution: rewrite each rule of the form A → Aβ | α using a new symbol A′: A → αA′, A′ → βA′ | ε
• Ambiguity: attachment ambiguity, coordination ambiguity, noun-phrase bracketing ambiguity
• Attachment ambiguity: I saw the Grand Canyon flying to New York
• Coordination ambiguity: old men and women
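
The left-recursion rewrite is mechanical enough to sketch in a few lines of Python. The function below handles only immediate left recursion for a single non-terminal; the function name and the convention of naming the new symbol by appending an apostrophe are assumptions for illustration, and the empty list stands for ε.

```python
def remove_left_recursion(lhs, alternatives):
    """Rewrite A -> A b | a as A -> a A', A' -> b A' | epsilon.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Assumes the primed name is not already used elsewhere in the grammar.
    """
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == lhs]
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != lhs]
    if not recursive:
        return {lhs: alternatives}        # nothing to rewrite
    new = lhs + "'"
    return {
        lhs: [alpha + [new] for alpha in others],
        new: [beta + [new] for beta in recursive] + [[]],   # [] is epsilon
    }

print(remove_left_recursion("NP", [["NP", "PP"], ["Det", "Nominal"]]))
# {'NP': [['Det', 'Nominal', "NP'"]], "NP'": [['PP', "NP'"], []]}
```

The rewritten grammar accepts the same strings, but it assigns right-branching trees, so the original attachment structure has to be recovered afterwards if it matters.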

Problems with the basic parser
• Example: President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio.
• Solutions: return all parses or include disambiguation in the parser.
• Inefficient reparsing of subtrees: a flight from Indianapolis to Houston on TWA

The Earley algorithm (aka Chart Parser)
• Resolving:
  – Left-recursive rules
  – Ambiguity
  – Inefficient reparsing of subtrees
• A chart with N+1 entries
• Dotted rules
  – S → . VP, [0, 0]
  – NP → Det . Nominal, [1, 2]
  – VP → V NP ., [0, 3]

Three operations
• Predictor (expands the rules)
  – Given S → . VP, [0, 0], derive VP → . Verb, [0, 0] and VP → . Verb NP, [0, 0]
• Scanner (scans the current word in the input if applicable)
  – Given VP → . Verb NP, [0, 0], derive VP → Verb ., [0, 1] and VP → Verb . NP, [0, 1] if the current word is a Verb.
• Completer (completes parsing an entire rule)
  – Given NP → Det Nominal ., [1, 3], and VP → Verb . NP, [0, 1], derive VP → Verb NP ., [0, 3]
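
The three operations fit into a compact recognizer. The Python sketch below is a minimal version under several assumptions: it only answers yes or no (no trees are built), the end position of a state is implied by the chart entry that holds it, the grammar has no ε rules, the rule PP → Prep NP is added so that the left-recursive Nominal rule can be exercised, and the scanner advances the dot directly over a part-of-speech symbol.

```python
from collections import namedtuple

# A dotted rule plus the position where it started.
State = namedtuple("State", "lhs rhs dot start")

GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal"), ("Proper-Noun",)],
    "Nominal": [("Noun",), ("Noun", "Nominal"), ("Nominal", "PP")],
    "VP": [("Verb",), ("Verb", "NP")],
    "PP": [("Prep", "NP")],          # assumed rule, not on the grammar slide
}
LEXICON = {
    "Det": {"that", "this", "a"},
    "Noun": {"book", "flight", "meal", "money"},
    "Verb": {"book", "include", "prefer"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on"},
    "Proper-Noun": {"houston", "twa"},
}

def next_symbol(state):
    return state.rhs[state.dot] if state.dot < len(state.rhs) else None

def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    words = [w.lower() for w in words]
    chart = [[] for _ in range(len(words) + 1)]      # N+1 entries

    def add(state, i):
        if state not in chart[i]:                    # never store a state twice
            chart[i].append(state)

    add(State("GAMMA", (start,), 0, 0), 0)           # dummy start state
    for i in range(len(words) + 1):
        for state in chart[i]:                       # chart[i] grows while we iterate
            sym = next_symbol(state)
            if sym is None:                          # Completer
                for waiting in chart[state.start]:
                    if next_symbol(waiting) == state.lhs:
                        add(waiting._replace(dot=waiting.dot + 1), i)
            elif sym in lexicon:                     # Scanner
                if i < len(words) and words[i] in lexicon[sym]:
                    add(state._replace(dot=state.dot + 1), i + 1)
            else:                                    # Predictor
                for rhs in grammar[sym]:
                    add(State(sym, rhs, 0, i), i)
    return State("GAMMA", (start,), 1, 0) in chart[len(words)]

print(earley_recognize("book that flight".split()))
print(earley_recognize("book that flight from Houston".split()))
```

Because `add` refuses duplicates, predicting Nominal → Nominal PP a second time at the same position is a no-op, which is how the chart sidesteps the left-recursion problem of the basic top-down parser.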

Overview of Chart Parser
• Dynamic programming. All possible states for chart[n] are produced before reading the (n+1)st word.
• Never parses the same subtree again.
• The idea of “incremental” parsing is close to how humans parse sentences. Is the chart a representation of the human brain’s state?

Some Theoretical Limitations
• Chart parser is O(n³)
• Fast CFG parsing requires fast Boolean matrix multiplication (Lee 2002), i.e., it is very unlikely that a much better algorithm exists for parsing.
• There is strong evidence showing that natural languages may not be context-free at all (Shieber 1985).

Parsing with FSAs
• Shallow parsing
• Useful for information extraction: noun phrases, verb phrases, locations, etc.
• The FASTUS system (Appelt and Israel, 1997)
• Sample rules for noun groups:
  NG → Pronoun | Time-NP | Date-NP
  NG → (DETP) (Adjs) HdNns | DETP Ving HdNns
  DETP → DETP-CP | DETP-INCP
• Complete determiner-phrases: “the only five”, “another three”, “this”, “many”, “hers”, “all”, “the most”
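
A finite-state noun-group rule can be approximated with a regular expression over part-of-speech tags. The Python sketch below is a loose simplification of NG → Pronoun and NG → (DETP) (Adjs) HdNns, not the FASTUS implementation: the tag set, the one-letter encoding, and the names `TAG_CODE`, `NOUN_GROUP`, and `noun_groups` are all assumptions, and the input is assumed to be tagged already.

```python
import re

# One letter per coarse tag so that a regular expression can play the role of the FSA.
TAG_CODE = {"Det": "D", "Adj": "A", "Noun": "N", "Pronoun": "P"}

# Pronoun, or optional determiner + any adjectives + one or more head nouns.
NOUN_GROUP = re.compile(r"P|D?A*N+")

def noun_groups(tagged):
    """Return the noun groups found in a list of (word, tag) pairs."""
    codes = "".join(TAG_CODE.get(tag, "x") for _, tag in tagged)
    return [" ".join(word for word, _ in tagged[m.start():m.end()])
            for m in NOUN_GROUP.finditer(codes)]

tagged = [("it", "Pronoun"), ("had", "Verb"), ("set", "Verb"), ("up", "Part"),
          ("a", "Det"), ("joint", "Adj"), ("venture", "Noun"),
          ("in", "Prep"), ("Taiwan", "Noun")]
print(noun_groups(tagged))  # ['it', 'a joint venture', 'Taiwan']
```

Unlike a full parser, this pass never builds nested structure; it only marks flat, non-overlapping groups, which is what makes it fast enough for information extraction.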

Sample FASTUS output
Company Name:  Bridgestone Sports Co.
Verb Group:    said
Noun Group:    Friday
Noun Group:    it
Verb Group:    had set up
Noun Group:    a joint venture
Preposition:   in
Location:      Taiwan
Preposition:   with
Noun Group:    a local concern
Conjunction:   and
Noun Group:    a Japanese trading house
Verb Group:    to produce
Noun Group:    golf clubs
Verb Group:    to be shipped
Preposition:   to
Location:      Japan