- Number of slides: 21
CSA 3180: NLP Algorithms Sentence Parsing Algorithms 2 Problems with DFTD Parser October 2008 CSA 3180: Sentence Parsing 1
Problems with DFTD (depth-first, top-down) Parser • Left Recursion • Handling Ambiguity • Inefficiency
Left Recursion • A grammar is left-recursive if it contains at least one non-terminal A for which A ⇒* α A β and α ⇒* ε (n.b. ⇒* is the transitive closure of ⇒) • Intuitive idea: the derivation of that category includes itself along its leftmost branch. • Examples: NP → NP PP; NP → NP and NP (immediate left recursion); NP → DetP Nominal, DetP → NP 's (indirect left recursion)
Left Recursion: left recursion can lead to an infinite loop [nltk demo]
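The looping behaviour can be reproduced without NLTK. The following is a minimal sketch, assuming a hypothetical toy grammar and a naive top-down, depth-first recognizer written for this illustration (it is not the demo referred to on the slide). Because the rule NP → NP PP asks for an NP at the same input position before any word has been consumed, the recursion never bottoms out and Python eventually raises RecursionError.

```python
# Hypothetical toy grammar; the left-recursive rule NP -> NP PP is tried first.
GRAMMAR = {
    "NP": [["NP", "PP"], ["Det", "N"]],
    "PP": [["P", "NP"]],
    "Det": [["the"]],
    "N": [["cat"], ["mat"]],
    "P": [["on"]],
}

def parse(symbol, tokens, pos):
    """Return every position at which `symbol` can end when it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must match the next word
        return [pos + 1] if pos < len(tokens) and tokens[pos] == symbol else []
    ends = []
    for rhs in GRAMMAR[symbol]:  # expand rules top-down, in order
        positions = [pos]
        for part in rhs:  # expand the right-hand side left to right
            positions = [e for p in positions for e in parse(part, tokens, p)]
        ends.extend(positions)
    return ends

def try_parse(tokens):
    """Return True/False for recognition, or None if the parser recursed without end."""
    try:
        return len(tokens) in parse("NP", tokens, 0)
    except RecursionError:
        return None

print(try_parse("the cat".split()))
```

The call returns None here: expanding NP immediately re-expands NP at position 0, so the rule NP → Det N is never reached.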
Dealing with Left Recursion • Use a different parsing strategy • Reformulate the grammar to eliminate left recursion (LR): A → A β | α is rewritten as A → α A'; A' → β A' | ε
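The rewriting scheme can be sketched as a small function. This is an illustrative sketch only: the function name, the list-of-lists rule encoding, and the use of an empty list for ε are assumptions made here, and only immediate left recursion is handled.

```python
def eliminate_left_recursion(nonterminal, productions):
    """Rewrite A -> A beta | alpha as A -> alpha A', A' -> beta A' | epsilon.

    `productions` is a list of right-hand sides, each a list of symbols.
    The empty list [] stands for epsilon.
    """
    new_nt = nonterminal + "'"
    # betas: what follows the leading A in each left-recursive rule
    betas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # alphas: the right-hand sides that do not start with A
    alphas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not betas:  # nothing to do
        return {nonterminal: productions}
    return {
        nonterminal: [alpha + [new_nt] for alpha in alphas],
        new_nt: [beta + [new_nt] for beta in betas] + [[]],
    }

np_rules = [["NP", "and", "NP"], ["D", "N"], ["D", "N", "PP"]]
for lhs, rhss in eliminate_left_recursion("NP", np_rules).items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) or "ε")
```

Applied to the NP rules used in the following slides, this yields NP → D N NP' | D N PP NP' and NP' → and NP NP' | ε, which is the same grammar as the slides' version with α and β expanded inline.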
Rewriting the Grammar: NP → NP ‘and’ NP; NP → D N | D N PP
Rewriting the Grammar: NP → NP ‘and’ NP (β = ‘and’ NP); NP → D N | D N PP (α = D N | D N PP)
Rewriting the Grammar: NP → NP ‘and’ NP (β = ‘and’ NP); NP → D N | D N PP (α = D N | D N PP). New Grammar: NP → α NP1; NP1 → β NP1 | ε
Rewriting the Grammar: NP → NP ‘and’ NP (β = ‘and’ NP); NP → D N | D N PP (α = D N | D N PP). New Grammar: NP → α NP1; NP1 → β NP1 | ε; α → D N | D N PP; β → ‘and’ NP
New Parse Tree for "the cat": [NP [α [D the] [N cat]] [NP1 ε]]
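A top-down, depth-first parser terminates on the rewritten grammar and produces this tree shape. The sketch below is an assumption-laden illustration: the symbols ALPHA, BETA and NP1 stand for α, β and NP1, the lexicon is invented, and the α → D N PP alternative is left out to keep the example small.

```python
# Rewritten (non-left-recursive) grammar; [] stands for epsilon.
GRAMMAR = {
    "NP": [["ALPHA", "NP1"]],
    "NP1": [["BETA", "NP1"], []],
    "ALPHA": [["D", "N"]],        # the D N PP alternative is omitted here
    "BETA": [["and", "NP"]],
    "D": [["the"]],               # hypothetical lexicon
    "N": [["cat"], ["dog"]],
}

def parse(symbol, tokens, pos):
    """Return (tree, end) pairs for every way `symbol` can span tokens[pos:end]."""
    if symbol not in GRAMMAR:  # terminal
        if pos < len(tokens) and tokens[pos] == symbol:
            return [(symbol, pos + 1)]
        return []
    results = []
    for rhs in GRAMMAR[symbol]:
        partial = [([], pos)]  # (children built so far, current position)
        for part in rhs:
            partial = [
                (children + [tree], end)
                for children, mid in partial
                for tree, end in parse(part, tokens, mid)
            ]
        for children, end in partial:
            results.append(((symbol, children or ["ε"]), end))
    return results

def full_parses(tokens):
    """Keep only the trees that cover the whole input."""
    return [tree for tree, end in parse("NP", tokens, 0) if end == len(tokens)]

print(full_parses("the cat".split()))
```

The single tree for "the cat" has an NP1 node dominating only ε, which is the extra, arguably unnatural structure that the rewriting introduces.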
Rewriting the Grammar • Different parse tree • Unnatural parse tree?
Problems with DFTD Parser • Left Recursion • Handling Ambiguity • Inefficiency
Handling Ambiguity • Coordination ambiguity: different scope of conjunction: "Hot curry and ice taste nice with rice" vs. "Hot curry and rice taste nice with ice" • Attachment ambiguity: a constituent can be added to the parse tree in different places: "I shot an elephant in my trousers" • VP → VP PP; NP → NP PP
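The attachment ambiguity can be made concrete by counting the parses of the example sentence. The sketch below uses a CKY-style chart count rather than the depth-first parser discussed in these slides, because the rules VP → VP PP and NP → NP PP are left-recursive. The remaining rules and the lexicon are assumptions added for illustration.

```python
from collections import defaultdict

# Binary rules (parent, left child, right child); everything except
# VP -> VP PP and NP -> NP PP is an assumed toy grammar.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"], "shot": ["V"], "an": ["Det"], "elephant": ["N"],
    "in": ["P"], "my": ["Det"], "trousers": ["N"],
}

def count_parses(tokens, root="S"):
    """Count the distinct parse trees for `tokens` rooted in `root`."""
    n = len(tokens)
    chart = defaultdict(int)  # (start, end, category) -> number of subtrees
    for i, word in enumerate(tokens):
        for category in LEXICON[word]:
            chart[i, i + 1, category] += 1
    for width in range(2, n + 1):  # fill shorter spans before longer ones
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, split, left] * chart[split, end, right]
                    )
    return chart[0, n, root]

print(count_parses("I shot an elephant in my trousers".split()))
```

Under this toy grammar the count is 2: the PP "in my trousers" attaches either to the VP (the shooting happened in the trousers) or to the NP (the elephant was in the trousers).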
Real sentences are full of ambiguities: "President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio"
Prepositional Phrase Ambiguity
he will deliver - to the American people - over nationwide TV - in New York - during September - for very good reasons
No. of PPs    # parses
2             2
3             5
4             14
5             132
6             469
7             1430
8             4867
Growth of Number of Ambiguities: the nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals.
No. of PPs    # parses
2             2
3             5
4             14
5             132
6             469
7             1430
8             4867
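For reference, the Catalan numbers themselves can be computed from the closed form C(n) = (2n choose n) / (n + 1). The helper name below is chosen here for illustration.

```python
from math import comb

def catalan(n):
    """Return the nth Catalan number, C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

for n in range(2, 10):
    print(n, catalan(n))
```

This prints 2, 5, 14, 42, 132, 429, 1430 and 4862 for n = 2 to 9. The first three table entries match C(2) to C(4), but 469 and 4867 are not Catalan numbers, so the later rows of the table may contain transcription errors or follow a different indexing.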
Handling Ambiguities • Statistical disambiguation: which is the most probable interpretation? • Semantic knowledge: which is the most sensible interpretation? Example: "Subatomic particles such as positively charged protons and electrons"
Problems with DFTD Parser • Left Recursion • Handling Ambiguity • Inefficiency
Repeated Parsing of Subtrees • Local versus global ambiguity: NP → Det Noun; NP → NP PP • Because of its top-down, depth-first, left-to-right policy, the parser builds trees that fail because they do not cover all of the input. • Successive parses cover larger segments of the input, but they include structures that have already been built before.
Repeated Parsing of Subtrees
[Figure: partial parse trees for "a flight" and "a flight from Indianapolis", with node labels NP, Nom, Det, Noun, PP and P]
Constituent                                      Times built
a flight                                         4
from Indianapolis                                3
to Houston                                       2
on TWA                                           1
A flight from Indianapolis                       3
A flight from Indianapolis to Houston            2
A flight from Indianapolis to Houston on TWA     1
Repeated Parsing of Subtrees
Constituent                                      Times built
a flight                                         4
from Indianapolis                                3
to Houston                                       2
on TWA                                           1
A flight from Indianapolis                       3
A flight from Indianapolis to Houston            2
A flight from Indianapolis to Houston on TWA     1
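The repeated work can be observed by counting how often a top-down parser expands the same category at the same input position. The sketch below makes several assumptions: the grammar is a hypothetical one in which flat NP rules stand in for the left-recursive NP → NP PP, and the optional memo table is one common remedy that the slides themselves do not describe.

```python
from collections import Counter

# Hypothetical grammar: shorter NP rules are tried first and fail to cover the input.
GRAMMAR = {
    "NP": [["Det", "Noun"], ["Det", "Noun", "PP"],
           ["Det", "Noun", "PP", "PP"], ["Det", "Noun", "PP", "PP", "PP"]],
    "PP": [["P", "PropN"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "P": [["from"], ["to"], ["on"]],
    "PropN": [["Indianapolis"], ["Houston"], ["TWA"]],
}
TOKENS = "a flight from Indianapolis to Houston on TWA".split()
calls = Counter()  # (symbol, position) -> number of times it was expanded

def parse(symbol, pos, memo=None):
    """Return all end positions for `symbol` at `pos`; reuse results if `memo` is given."""
    if memo is not None and (symbol, pos) in memo:
        return memo[symbol, pos]
    calls[symbol, pos] += 1
    if symbol not in GRAMMAR:  # terminal
        ends = [pos + 1] if pos < len(TOKENS) and TOKENS[pos] == symbol else []
    else:
        ends = []
        for rhs in GRAMMAR[symbol]:
            positions = [pos]
            for part in rhs:
                positions = [e for p in positions for e in parse(part, p, memo)]
            ends.extend(positions)
    if memo is not None:
        memo[symbol, pos] = ends
    return ends

def count_calls(use_memo):
    """Parse TOKENS as an NP and report the end positions and expansion counts."""
    calls.clear()
    ends = parse("NP", 0, {} if use_memo else None)
    return ends, dict(calls)

for use_memo in (False, True):
    ends, counts = count_calls(use_memo)
    print(use_memo, ends, [counts[key] for key in [("Det", 0), ("PP", 2), ("PP", 4), ("PP", 6)]])
```

Without the memo table the determiner at position 0 is expanded 4 times and the three PPs 3, 2 and 1 times, mirroring the counts for "a flight", "from Indianapolis", "to Houston" and "on TWA" in the table. With the memo table each is expanded once.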


