- Number of slides: 68
Computational Paninian Grammar for Dependency Parsing
  Dipti Misra Sharma
  LTRC, IIIT, Hyderabad
  NLP Winter School, 25-12-2008
Outline
  Background
  Paninian Grammar: The Basic Framework
  Some Example Cases
  Conclusion
Background
  Indian languages
    Rich morphology
    Relatively flexible word order
  For example,
  1. a) baccaa phala khaataa hai
        'child' 'fruit' 'eat+hab' 'pres'
     b) phala baccaa khaataa hai
     c) phala khaataa hai baccaa
     d) baccaa khaataa hai phala
Basic Structure in PS
  1 a) baccaa phala khaataa hai
       'child' 'fruit' 'eat+hab' 'pres'
  Subject: baccaa 'child'
  Object: phala 'fruit'
  [Figure: phrase-structure tree for 1(a), with node labels S, NP, VP, V and Aux]
PS for 1(b)
  1 b) phala baccaa khaataa hai
       'fruit' 'child' 'eat' 'pres'
  Topic: phala 'fruit'
  Subject: baccaa 'child'
  Object: t (trace)
  Movement involved
  [Figure: Tree I, phrase-structure tree for 1(b)]
Problems
  Complex tree
  In what ways is the subject (baccaa) different from the object (phala)?
    Agreement does not hold
    Position does not hold
How to Draw PSs for 1 (c-d)?
  1 c) phala khaataa hai baccaa
       'fruit' 'eat+hab' 'pres' 'child'
  1 d) baccaa khaataa hai phala
       'child' 'eat+hab' 'pres' 'fruit'
  Simple and perfectly natural sentences are difficult to handle in Phrase Structure
  Dependency structures make it easy
Dependency Structure
  khaataa_hai
    k1: baccaa
    k2: phala
  One dependency structure for all of (1 a-d)
  An additional attribute 'order' can be included to capture the variation in order
  Case and postpositions can be encoded in the role
  baccaa phala khaataa hai    'child' 'fruit' 'eat' 'is'
  phala baccaa khaataa hai    'fruit' 'child' 'eat' 'is'
  baccaa khaataa hai phala    'child' 'eat' 'is' 'fruit'
  phala khaataa hai baccaa    'fruit' 'eat' 'is' 'child'
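The point that one dependency structure covers all four word orders can be illustrated with a small sketch. This is a toy lookup for the four-word example only, not a parser; the names `analyse`, `ROLES` and `HEAD` are hypothetical and do not come from the slides.

```python
from typing import FrozenSet, Tuple

Arc = Tuple[str, str, str]  # (head, relation label, dependent)

HEAD = "khaataa_hai"                      # verb plus auxiliary as one node
ROLES = {"baccaa": "k1", "phala": "k2"}   # k1 = karta, k2 = karma

def analyse(sentence: str) -> Tuple[FrozenSet[Arc], Tuple[str, ...]]:
    """Return (arcs, order) for one of the example sentences 1(a-d)."""
    nodes = []
    for word in sentence.split():
        if word == "hai":
            # Attach the auxiliary to the verb that precedes it.
            nodes[-1] = nodes[-1] + "_hai"
        else:
            nodes.append(word)
    arcs = frozenset((HEAD, ROLES[n], n) for n in nodes if n in ROLES)
    # The surface order is kept as a separate attribute of the analysis.
    return arcs, tuple(nodes)

sentences = [
    "baccaa phala khaataa hai",
    "phala baccaa khaataa hai",
    "phala khaataa hai baccaa",
    "baccaa khaataa hai phala",
]
analyses = [analyse(s) for s in sentences]
```

All four analyses share the same set of arcs; only the separately stored `order` attribute differs.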
Paninian Grammatical Formalism
  A dependency-grammar-based approach
  Motivation for following the Paninian approach:
    Inspired by an inflectionally rich language (Sanskrit)
    Better suited for handling Indian languages (ILs)
    Provides the level of syntactico-semantic interface for parsing
    Handles various linguistic phenomena seamlessly
  (Refer to Akshar Bharati et al., Natural Language Processing: A Paninian Perspective (1995), http://ltrc.iiit.net/showfile.php?filename=downloads/nlpbook/index.html)
Paninian Grammar Contd.
  The grammar facilitates realisation of the intended meaning as an 'expression' of what the speaker wants to communicate (vivaksha)
The Basic Framework
  Treats a sentence as a series of modifier-modified relations
  A sentence has a primary modified element (generally a verb)
  Provides a blueprint to identify these relations
  Syntactic cues help in identifying the relation types
Levels of Representation
  (1) Semantic information
        Assignment of karakas (Th-roles) and of abstract tense
  (2) Morphosyntactic representation
        Morphological spellout rules
  (3) Abstract morphological representation
        Allomorphy and phonology
  (4) Phonological output form
  (From Kiparsky, Lectures in CIEFL, Hyderabad, pg 2)
Some Concepts
  Speaker's intention (vivakshaa)
  Root + Suffix (prakriti + pratyaya)
  Expectancy (aakaankshaa)
  Eligibility (yogyataa)
  Proximity (sannidhi)
  Karaka and vibhakti
Speaker's Intention (vivakshaa)
  Each sentence reflects the speaker's intention
  Various sub-actions come into focus
  Participants are assigned relations accordingly
  'key' is assigned karta or karana depending on the sub-action in focus
  Syntax reflects vivaksha
Prakriti and Pratyaya (root and suffix)
  The premise: every word is composed of two parts
    1. Content part (root morpheme)
    2. Functional part (affix)
  For languages such as English and Hindi, the auxiliaries can be treated as the functional morphemes
  Morph analysers or local word groupers can provide this information
aakaankshaa (Expectation/Demand)
  Every word has certain demands to be fulfilled
  For parsing, the verb is the most critical element
  The demand frames (karaka frames) for the verbs list their demands
For example, the frame of the Hindi verb 'khaa'
  Verb      khaa
  Sense     to eat
  Sense ID  ???
  Eg        raam seb khaataa hai 'Ram eats an apple'
  ---------------------------------------------------
  arc-label  necessity  vibhakti  lextype  reln
  ---------------------------------------------------
  k1         m          0         n        c
  k2         m          0         n        c
  ---------------------------------------------------
  k1: karta; k2: karma; m: mandatory; n: noun; c: child
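A demand frame of this kind maps naturally onto a small record type. The sketch below is one possible encoding under stated assumptions: the field names mirror the column headers of the frame, while `Slot`, `KHAA_FRAME`, `mandatory_labels` and `unfilled` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class Slot:
    arc_label: str   # e.g. "k1" (karta), "k2" (karma)
    necessity: str   # "m" = mandatory
    vibhakti: str    # "0" = no overt marker
    lextype: str     # "n" = noun
    reln: str        # "c" = child

# The two rows of the frame for 'khaa' (to eat).
KHAA_FRAME: List[Slot] = [
    Slot("k1", "m", "0", "n", "c"),
    Slot("k2", "m", "0", "n", "c"),
]

def mandatory_labels(frame: Iterable[Slot]) -> List[str]:
    """Arc labels the verb demands obligatorily."""
    return [slot.arc_label for slot in frame if slot.necessity == "m"]

def unfilled(frame: Iterable[Slot], filled: Iterable[str]) -> List[str]:
    """Mandatory demands not yet satisfied by the labels in `filled`."""
    done = set(filled)
    return [label for label in mandatory_labels(frame) if label not in done]
```

A parser built on such frames could use `unfilled` to check whether every mandatory demand of the verb has been met.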
Yogyataa (Eligibility)
  Selectional restrictions
  For example, baccaa phala khaataa hai
    'phala' (fruit) does not have the eligibility to become the 'karta' of the verb 'khaa' (eat)
  Constraints based on yogyataa require semantic knowledge for each lexical item
  This knowledge can be obtained from a lexical resource such as a WordNet
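A minimal sketch of such an eligibility check follows. The feature inventory and the rule that the karta of 'khaa' must be animate are assumptions made purely for illustration; the slides only say that such knowledge could come from a lexical resource.

```python
from typing import Dict, Set

# Hypothetical semantic features; a real system might derive them from
# a lexical resource such as a WordNet.
TOY_FEATURES: Dict[str, Set[str]] = {
    "baccaa": {"animate", "human"},
    "phala": {"inanimate", "edible"},
}

# Assumed restriction, for illustration only: the karta of 'khaa'
# must be animate.
KARTA_REQUIRES: Dict[str, str] = {"khaa": "animate"}

def eligible_as_karta(noun: str, verb: str) -> bool:
    """True if `noun` satisfies the (assumed) karta restriction of `verb`."""
    required = KARTA_REQUIRES.get(verb)
    if required is None:
        return True  # no restriction recorded for this verb
    return required in TOY_FEATURES.get(noun, set())
```

With this toy lexicon, 'baccaa' is accepted as the karta of 'khaa' and 'phala' is rejected.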
Sannidhi (Proximity)
  The modifier and the modified tend to occur in close proximity in a sentence
  For example,
    'raama ne kelaa khaayaa, mohana ne duudha piyaa Ora hari ne film dekhii'
  This Hindi example contains three verbs: khaayaa (ate), piyaa (drank) and dekhii (saw)
  The arguments of each of these verbs tend to occur close to it
Karaka and Vibhakti
  Two levels of analysis
    Syntactico-semantic relations
      Karaka: direct participants of the action denoted by a verb
      Other relations: purpose, genitive, reason, etc.
    Relation markers (vibhaktis)
Semantics of the verb
  A verbal root denotes:
    The activity
    The result
  Locus of activity: karta
  Locus of result: karma
karta - karma
  The boy opened the lock
    k1: karta
    k2: karma
  karta and karma sometimes correspond to agent and theme, but not always
Action: bundle of sub-actions
  The boy opened the lock with the key
  The key opened the lock
  The lock opened
  Notion of vivaksha
  Realization of the speaker's intention in a sentence
Sub-actions - Opening of lock
  [Figure: the opening of the lock decomposed into Action 1, Action 2 and Action 3]
  The boy opened the lock with the key
  The key opened the lock
  The lock opened
  Each sentence reflects the speaker's intention
Basic karaka relations
  Only six:
    karta: subject/agent/doer
    karma: object/patient
    karana: instrument
    sampradaan: beneficiary
    apaadaan: source
    adhikarana: location in place/time/other
Other relations
  Other dependency relations
    Purpose, reason, direction, etc.
    Causatives, associatives, comparatives, etc.
    Genitive, adjective
Vibhaktis: Markers for Karaka Relations
  Relation markers (vibhaktis)
  raama ne caakuu se seba kaaTaa
  'Ram' 'erg' 'knife' 'with' 'apple' 'cut'
    raama ne: karta (doer)
    caakuu se: karana (instrument)
    seba: karma (theme)
  raama ne mohana ke_liye seba kaaTaa
  'Ram' 'erg' 'Mohan' 'for' 'apple' 'cut'
  "Ram cut the apple for Mohan" (purpose)
  maiM mohana ke_saatha baazaara gayaa
  'I' 'Mohan' 'with' 'market' 'went'
  "I went to the market with Mohan" (associative)
However
  There is no one-to-one correspondence between relations and relation markers
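The lack of a one-to-one mapping can be made concrete by tabulating marker-relation pairs taken from examples elsewhere in these slides (ne, ko and the zero marker on a karta; se on a karana and on a karma). The sketch below only indexes those pairs in both directions; the names `OBSERVED`, `by_marker` and `by_relation` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (vibhakti, relation) pairs as they occur in the slide examples:
#   raama ne ... kaaTaa          ne -> karta
#   raama jaataa hai             0  -> karta
#   raama ko jaanaa paDaa        ko -> karta
#   ... seba kaaTaa              0  -> karma
#   caakuu se ... kaaTaa         se -> karana
#   ravi se prashna kiyaa        se -> karma (k2)
OBSERVED: List[Tuple[str, str]] = [
    ("ne", "karta"), ("0", "karta"), ("ko", "karta"),
    ("0", "karma"), ("se", "karana"), ("se", "karma"),
]

by_marker: Dict[str, Set[str]] = defaultdict(set)
by_relation: Dict[str, Set[str]] = defaultdict(set)
for marker, relation in OBSERVED:
    by_marker[marker].add(relation)
    by_relation[relation].add(marker)

# Markers that signal more than one relation, and vice versa.
ambiguous_markers = sorted(m for m, rels in by_marker.items() if len(rels) > 1)
multi_marked = sorted(r for r, ms in by_relation.items() if len(ms) > 1)
```

Because the mapping is many-to-many in both directions, a parser needs further cues beyond the marker itself to choose a relation.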
Syntactic Cues
  Verbal inflections (Tense Aspect Modality (TAM))
  Passive: verb agrees with the karma
  Some other cases
    raama ko jaanaa paDaa
    'Ram+to' 'go' 'had to'
    "Ram had to go"
    raama ko calanaa caahiye
    'Ram' 'to' 'walk' 'should'
    "Ram should leave"
Example
  raama jaataa hai
  'Ram' 'go+hab' 'pres'
  "Ram goes"
    jaa 'go'
      karta: raama
  raama ko jaanaa paDaa
  'Ram+to' 'go' 'had to'
  "Ram had to go"
    jaa 'go'
      karta: raama
Some Examples
  Relative Clause
  MWEs
  Change of state verbs
  Conjuncts
  Ellipsis
Relative Clause
  A noun is modified by a clause with a relative pronoun as its coreferent
  Example
    meraa bhaaii jo dillii meM rahataa hai kala aa rahaa hai
    'my' 'brother' 'who' 'Delhi' 'in' 'live+hab' 'pres' 'tomorrow' 'come' 'prog' 'pres'
    'My brother who lives in Delhi is coming tomorrow'
  How to represent this?
  Two possible representations
Alternative 1
  [Figure: dependency tree rooted at aa 'come', with nodes meraa, bhaaii, jo, raha, dillii and kala]
Alternative 2
  [Figure: dependency tree rooted at aa 'come', with nodes meraa, bhaaii, jo, raha, dillii and kala, plus a 'coref' link]
Other Relative-Correlative Constructions
  Adjective having a clausal modifier
    tuma aisaa sundara ghara banaao jaisaa unakaa hai
    'you' 'such' 'beautiful' 'house' 'build' 'such-that' 'theirs' 'is'
    "You build a house as beautiful as theirs"
  [Figure: dependency tree rooted at banaao 'build'; arcs shown: k1 tuma, k2 ghara, adj sundara, jjmod aisaa, coref, jo-vo-jjmod hai 'is', with jaisaa and unakaa under hai]
MWEs: Conjunct Verbs
  ((raama ne)) ((bahuta dera)) ((ravi kii)) ((pratiikshaa kii))
  'Ram' 'erg' 'very' 'late' 'Ravi' 'of' 'wait' 'did'
  "Ram waited for Ravi for a long time"
  ((kaaryashaalaa ke liye)) ((biisa logoM kaa)) ((naamaaMkana kiyaa gayaa))
  'workshop' 'for' 'twenty' 'people' 'of' 'name registration' 'do+passive'
  "Twenty people were registered for the workshop"
Conjunct Verbs
  Conjunct verb 'prashna kiyaa' below
    mohana ne ravi se prashna kiyaa
    'Mohan' 'erg' 'Ravi' 'to' 'question' 'did'
    "Mohan asked Ravi a question"
  A conjunct verb can have partial modification
    mohana ne acchaa prashna kiyaa thaa
    'Mohan' 'erg' 'good' 'question' 'do+perf' 'past'
  The elements in a complex predicate can also be discontinuous
    prashna to mohana ne kiyaa thaa
    'question' 'part' 'Mohan' 'erg' 'do+perf' 'past'
Conjunct Verbs
  However,
    mohan ne ravi se acchaa prashna kiyaa
    prashna_kiyaa 'questioned'
      k1: mohan ne 'Mohan'
      k2: ravi se 'to Ravi'
      ?: acchaa 'good'
  'acchaa' is NOT a verb modifier: it modifies 'prashna', not 'prashna kiyaa'
  Solution?
Conjunct Verbs: Solution
  Don't chunk a conjunct verb as a single verbal unit
  Thus,
    mohan ne ravi se ((acchaa)) ((prashna kiyaa))_VG
  is revised to
    mohan ne ravi se ((acchaa prashna))_NP ((kiyaa))_VG
Conjunct Verbs
  Show a 'part-of' relation between the noun and the verb
  Add a tag 'pof' to achieve this
  Therefore,
    kiyaa
      k1: mohan ne
      k2: ravi se
      pof: prashna
        nmod: acchaa
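The 'pof' analysis keeps the noun as a separate node, so the complex predicate can still be recovered while the adjective stays attached to the noun. A minimal sketch, with the arcs taken from the tree for 'mohan ne ravi se acchaa prashna kiyaa'; the helper names `complex_predicate` and `modifiers` are hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[str, str, str]  # (head, relation label, dependent)

ARCS: List[Arc] = [
    ("kiyaa", "k1", "mohan"),
    ("kiyaa", "k2", "ravi"),
    ("kiyaa", "pof", "prashna"),
    ("prashna", "nmod", "acchaa"),
]

def complex_predicate(verb: str, arcs: List[Arc]) -> str:
    """Join the verb with its 'pof' dependents, e.g. prashna_kiyaa."""
    parts = [dep for head, label, dep in arcs if head == verb and label == "pof"]
    return "_".join(parts + [verb])

def modifiers(node: str, arcs: List[Arc]) -> List[str]:
    """Noun modifiers (nmod) attached directly to `node`."""
    return [dep for head, label, dep in arcs if head == node and label == "nmod"]
```

Here `acchaa` is found under `prashna` rather than under the verb, which is the distinction the revised chunking is meant to preserve.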
DS for Discontinuous Elements
  prashna to mohana ne kiyaa thaa
    kiyaa
      k1: mohana
      pof: prashna
  Use of pof ('part of' relation)
MWEs: Idioms
  ((kisaana kii)) ((patnii ko)) ((vaha ciDiyaa)) ((phuuTii aaMkha nahiiM suhaatii thii))
  'farmer' 'of' 'wife' 'to' 'that' 'bird' 'not appealed'
  The idiom (the last group, shown in bold on the slide) is functionally a verb
Idioms
  Two possible solutions
  [Figure: candidate representations for the idiom phuuTii aaMkha suhaa]
Change of State Verbs
  Change of state verbs such as 'raMganaa' (colour) pose a problem, as in
    ((usane)) ((apanaa ghara)) ((piilaa)) ((raMgaa))
    'he/she' 'own' 'house' 'yellow' 'coloured'
    raMga 'colour'
      k1: usane 'he/she'
      k2: ghara 'house'
      ?: piilaa 'yellow'
  Is 'piilaa' a complement of 'ghara'? Or is it the k2 of 'raMgaa'?
  If 'piilaa' is the k2 of 'raMgaa', then what is the relation of 'ghara' with 'raMgaa'?
  Can they both be k2?
Proposed Solution
  In Panini's framework, verbs denoting 'change of state' can have two 'karma':
    The object which is being changed
    The state after the change
  Thus,
    raMga 'coloured'
      k1: usane 'he'
      k2-1: ghara 'house'
      k2-2: piilaa 'yellow'
Conjuncts
  Need special treatment in a dependency representation
    (maiM baazaara gayaa)1 Ora (ve loga ghara para ruke)2
    'I' 'market' 'went' 'and' 'those' 'people' 'home' 'at' 'stayed'
    "I went to the market and those people stayed at home"
  What is the head of a co-ordinate structure?
  How to represent the equal status of 1 and 2 above?
Conjuncts
  Take the conjunct as the 'head'
  Label the relation as 'ccof'
    Ora 'and'
      ccof: gayaa 'went'
        k1: maiM 'I'
        k2: baazaara 'market'
      ccof: ruke 'stay'
        k1: loga 'people'
        k7p: ghara 'home'
  A subordinating conjunct will have a single child node
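The coordination tree above can be written as a flat list of arcs, from which the head and the conjoined clauses are easy to read off. This is a sketch only; `root` and `conjuncts` are hypothetical helper names, and the transliteration of the node labels is normalised.

```python
from typing import List, Tuple

Arc = Tuple[str, str, str]  # (head, relation label, dependent)

ARCS: List[Arc] = [
    ("Ora", "ccof", "gayaa"),
    ("gayaa", "k1", "maiM"),
    ("gayaa", "k2", "baazaara"),
    ("Ora", "ccof", "ruke"),
    ("ruke", "k1", "loga"),
    ("ruke", "k7p", "ghara"),
]

def root(arcs: List[Arc]) -> str:
    """The single node that is a head but never a dependent."""
    heads = {head for head, _, _ in arcs}
    dependents = {dep for _, _, dep in arcs}
    (only_root,) = heads - dependents  # fails if the tree is malformed
    return only_root

def conjuncts(conjunction: str, arcs: List[Arc]) -> List[str]:
    """Clause heads attached to the conjunction with 'ccof'."""
    return [dep for head, label, dep in arcs
            if head == conjunction and label == "ccof"]
```

Both clause heads hang from the conjunction with the same label, which is how the representation expresses their equal status.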
Some Problem Cases
  Certain complex sentences pose problems
  For example:
    agara tuma aate to hama vahaaM jaate
    'if' 'you' 'come' 'then' 'we' 'there' 'go'
    "Had you come, we would have gone there"
  Counterfactual
  'agara' and 'to': two connectives
  How to represent the dependencies?
Main Clause - Subordinate Clause
  [Figure: dependency tree rooted at jaate 'go+?'; arcs shown: ? agara, ? to, k1 hama, k7p vahaaM, ccof aate, k1 tuma]
  This representation fails to capture the relation between 'agara' and 'to'
Representation Currently Followed
  to 'then'
    ccof: jaate 'go+?'
      vmod: agara
        ccof: aate 'come'
          k1: tuma 'you'
      k1: hama 'we'
      k7p: vahaaM 'there'
Alternative Proposal
  Treat 'agara-to' as a complex conjunct
  [Figure: dependency tree rooted at agara-to; arcs shown: pof agara, pof to, ccof aate with k1 tuma, and jaate with k1 hama and k7p vahaaM]
Ellipsis
  How to show dependencies when the head is missing?
    bacce baDe ho gaye haiM kisii kii baata nahiiM sunate
    "The children have grown up, they don't listen to anyone"
  No explicit conjunct!
  Insert a NULL element to show the dependencies
    NULL_CCP
      ccof: baDe_ho_gaye
      ccof: nahiiM_sunate
  Insert a NULL node only if it is essential to represent the dependencies
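The NULL-insertion rule can be sketched as a small function: when two or more clause heads need a common parent and no overt conjunct is present, a NULL_CCP node is created; otherwise nothing is inserted. The function name `attach_clauses` and its signature are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Arc = Tuple[str, str, str]  # (head, relation label, dependent)

NULL_NODE = "NULL_CCP"

def attach_clauses(clause_heads: List[str],
                   conjunct: Optional[str] = None) -> List[Arc]:
    """Return the 'ccof' arcs that join the given clause heads.

    A NULL node is introduced only when it is needed: there are at
    least two clauses and no explicit conjunct to serve as head.
    """
    if conjunct is None and len(clause_heads) < 2:
        return []  # a single clause needs no coordinating head
    head = conjunct if conjunct is not None else NULL_NODE
    return [(head, "ccof", clause) for clause in clause_heads]
```

For the ellipsis example, `attach_clauses(["baDe_ho_gaye", "nahiiM_sunate"])` yields two `ccof` arcs headed by `NULL_CCP`, while a single-clause sentence yields no arcs at all.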
Applying Paninian Model to English
Some English Examples
  English is:
    A configurational language
    Relatively fixed in word order
  Relations are not realised in affixes
  Subject and object are positional
  Subject is sacrosanct
Passive
  Rama ate a banana
  A banana was eaten by Rama
  [Figure: dependency tree rooted at 'eat']
Interrogatives
  Did Rama eat a banana?
  A 'yes-no' interrogative
  Structurally, the interrogative is realised through a change in word order
    Subject-auxiliary inversion
    No interrogative morpheme
Interrogatives Contd.
  Proposed solution:
    eat <fs stype=interrogative__yes-no>
      fragof: Did
      k1: Rama
      k2: banana
  Position gives the cues for the constraints
Interrogatives Contd.
  What did Rama eat?
    eat <fs stype=interrogative__wh>
      k2: What
      fragof: did
      k1: Rama
  The question element 'what' and the auxiliary position provide the syntactic cues
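The positional cues described for English interrogatives can be approximated with a toy classifier that inspects only the first token. The word lists and the function name `sentence_type` are assumptions for illustration; only the two `stype` values come from the feature structures on the slides.

```python
from typing import List, Optional

# Small illustrative word lists (assumed, not exhaustive).
AUXILIARIES = {"did", "do", "does", "is", "was", "will"}
WH_WORDS = {"what", "who", "where", "when", "why", "how"}

def sentence_type(tokens: List[str]) -> Optional[str]:
    """Guess the stype feature from the sentence-initial token.

    Returns None when neither cue is present (e.g. a declarative).
    """
    if not tokens:
        return None
    first = tokens[0].lower()
    if first in WH_WORDS:
        return "interrogative__wh"
    if first in AUXILIARIES:
        # A sentence-initial auxiliary signals subject-auxiliary inversion.
        return "interrogative__yes-no"
    return None
```

Such a first-token check is deliberately crude; it is meant only to show how position, rather than morphology, supplies the cue in English.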
Control Verbs
  John persuaded Harry to leave
    persuade
      k1: John
      k2: Harry
      rt (?): leave
    The object of persuade corefers to the 'missing' karta of 'leave'
  John promised Harry to leave
    promise
      k1: John
      k4: Harry
      k2: leave
    The subject of promise corefers to the 'missing' karta of 'leave'
Verbs such as 'want'
  John wanted Harry to leave
    want
      k1: John
      k2: leave
        k1: Harry
  'want' is a transitive verb and can take a clause as its karma
Empty 'it'
  It is raining in Delhi
  [Figure: dependency tree rooted at 'rain']
Conclusion
  The Paninian Grammatical Formalism offers a dependency-based approach to sentence parsing that is better suited to morphologically rich languages with relatively free word order, such as the Indian languages.