- Number of slides: 23
Constraint-based Dependency Telugu Parser
Guided by: Dr. Rajeev Sangal, Dr. Dipti Misra, Samar Hussain
Team members: Phani Chaitanya, Ravi Kiran
Overview
• Motivation
• A word about the language
• Overview of constraint based parser
• Analysis of special cases
– Genitives
– Copula
– “ani” construction
– Conjuncts
• Future work
Motivation
– We set out to build a question answering system in Telugu, mainly for the medical and tourism domains, to help native Telugu speakers (as a preliminary diagnosis tool and as a travel guide). Such a system needs a parser to analyse questions and text.
A word about the language
• Telugu is a South Asian language
• Features
– Morphologically rich
– Free word order
– Agglutinative
• Challenges
– No treebank
– No parser
– No WordNet
Overview of constraint based parser
Telugu : rAmudu iMtiki vaccAka paMdu ni wiMtAdu
Gloss : Rama home after_coming apple eats
English : Ram eats an apple after coming home
Overview of constraint based parser
Input (chunked and morphologically analysed):
1     ((        NP
1.1   rAmudu    NN     <af=rAma,n,,0,,adj_vAdu,>
      ))
2     ((        NP
2.1   iMtiki    NN     <af=illu,n,,s,,0,,ki,>
      ))
3     ((        VG
3.1   vaccAka   VRB    <af=vaccu,v,,,any,0,,ina_Aka,>
      ))
4     ((        NP
4.1   paMdu     NN     <af=paMdu,n,,s,,0,>|<af=paMdu,n,,s,,0,,obl,>
4.2   ni        PREP   <af=ni,n,,s,,0,>
      ))
5     ((        VG
5.1   wiMtAdu   VFM    <af=winu,v,,,3_p,0,,wA,>
5.2   .         SYM
      ))
Overview of constraint based parser
1     ((        NP
1.1   rAmudu    NN     <af=rAma,n,,0,,adj_vAdu,>
      ))
2     ((        NP
2.1   iMtiki    NN     <af=illu,n,,s,,0,,ki,>
      ))
3     ((        VG
3.1   vaccAka   VRB    <af=vaccu,v,,,any,0,,ina_Aka,>
      ))
4     ((        NP
4.1   paMdu     NN     <af=paMdu,n,,s,,0,>
4.2   ni        PREP   <af=ni,n,,s,,0,>
      ))
5     ((        VG
5.1   wiMtAdu   VFM    <af=winu,v,,,3_p,0,,wA,>
5.2   .         SYM
      ))
(The slide marks each chunk as a Source group or a Demand group.)
Overview of constraint based parser
Frame for winu (eat; in basic form, so no transformation is required)
arc-label | necessity | vibhakti | lextype | posn | reln
k1 | m | 0 | n | l | c
k2 | m | ni | n | l | c
Frame for vaccu (come)
arc-label | necessity | vibhakti | lextype | posn | reln
k1 | m | 0 | n | l | c
k2 | m | ki | n | l | c
Transformation chart [ina_Aka (after + -ing)]
arc-label | necessity | vibhakti | lextype | posn | reln | op
k1 | m | 0 | n | l | c | remove
vmod | m | | v | r | p | insert
Dependency tree:
winu[wA] (eat)
– k1 : rAmudu (Ram)
– k2 : paMdu (fruit)
– vmod : vaccu[ina_Aka] (after coming)
   – k1 : rAmudu
   – k2 : iMtiki (house)
Overview of constraint based parser
Frame for vaccAka (after transformation)
arc-label | necessity | vibhakti | lextype | posn | reln
k2 | m | ki | n | l | c
vmod | m | | v | r | p
Frame for winu
arc-label | necessity | vibhakti | lextype | posn | reln
k1 | m | 0 | n | l | c
k2 | m | ni | n | l | c
Constraint graph over: rAmudu iMtiki vaccAka paMduni wiMtAdu
X1 : k1 (between wiMtAdu and rAmudu)
X2 : k2 (between vaccAka and iMtiki)
X3 : k2 (between wiMtAdu and paMduni)
X4 : vmod (between vaccAka and wiMtAdu)
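The step from the base frame of vaccu to the transformed frame of vaccAka can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the tuple layout, the names `VACCU_FRAME`, `INA_AKA_CHART` and `apply_transformation`, and the choice to match rows for removal by arc label alone are all assumptions made for the example.

```python
# Each frame row: (arc_label, necessity, vibhakti, lextype, posn, reln).
# Base demand frame for the verb root "vaccu" (come), copied from the slide.
VACCU_FRAME = [
    ("k1", "m", "0", "n", "l", "c"),
    ("k2", "m", "ki", "n", "l", "c"),
]

# Transformation chart for the TAM "ina_Aka": each entry is (row, operation).
INA_AKA_CHART = [
    (("k1", "m", "0", "n", "l", "c"), "remove"),
    (("vmod", "m", "", "v", "r", "p"), "insert"),
]


def apply_transformation(frame, chart):
    """Return a new frame with the chart's remove/insert operations applied.

    Assumption: a row is removed when its arc label matches the chart row.
    """
    result = list(frame)
    for row, op in chart:
        if op == "remove":
            result = [r for r in result if r[0] != row[0]]
        elif op == "insert":
            result.append(row)
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


if __name__ == "__main__":
    for row in apply_transformation(VACCU_FRAME, INA_AKA_CHART):
        print(row)
```

Applied to the vaccu frame, the sketch drops k1 and adds vmod, which matches the transformed frame listed on the slide.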
Overview of constraint based parser
C1 : For each mandatory karaka in the karaka chart of each demand group, there should be exactly one outgoing edge from the demand group labeled by that karaka.
C2 : For each optional or desirable karaka in the karaka chart of each demand group, there should be at most one outgoing edge from the demand group labeled by that karaka.
C3 : There should be exactly one incoming arc into each source group.
Equations formed by applying the above constraints:
C1 : X1 = 1, X2 = 1, X3 = 1, X4 = 1
C2 : No optional field found
C3 : X1 = 1, X2 = 1, X3 = 1, X4 = 1
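To make the constraints concrete, here is a small Python sketch that checks C1, C2 and C3 against the four candidate edges X1 to X4 of the example sentence. It uses plain brute-force enumeration of 0/1 assignments, which is a stand-in chosen for brevity and not necessarily how the parser solves the equations. The data layout and the names `EDGES`, `MANDATORY`, `OPTIONAL`, `SOURCE_GROUPS`, `satisfies` and `solve` are hypothetical.

```python
from itertools import product

# variable -> (arc label, demand group that licenses it, group the arc enters)
EDGES = {
    "X1": ("k1", "wiMtAdu", "rAmudu"),
    "X2": ("k2", "vaccAka", "iMtiki"),
    "X3": ("k2", "wiMtAdu", "paMdu ni"),
    # vmod has reln = p: vaccAka demands its own parent, so the arc enters vaccAka.
    "X4": ("vmod", "vaccAka", "vaccAka"),
}

# (demand group, karaka) pairs taken from the frames; none are optional here.
MANDATORY = {("wiMtAdu", "k1"), ("wiMtAdu", "k2"),
             ("vaccAka", "k2"), ("vaccAka", "vmod")}
OPTIONAL = set()
SOURCE_GROUPS = ["rAmudu", "iMtiki", "vaccAka", "paMdu ni"]


def count(assign, pred):
    """Number of selected edges whose description satisfies pred."""
    return sum(assign[var] for var, edge in EDGES.items() if pred(edge))


def satisfies(assign):
    # C1: exactly one edge per mandatory karaka of each demand group.
    c1 = all(count(assign, lambda e, d=d, k=k: e[1] == d and e[0] == k) == 1
             for d, k in MANDATORY)
    # C2: at most one edge per optional karaka of each demand group.
    c2 = all(count(assign, lambda e, d=d, k=k: e[1] == d and e[0] == k) <= 1
             for d, k in OPTIONAL)
    # C3: exactly one incoming arc for each source group.
    c3 = all(count(assign, lambda e, s=s: e[2] == s) == 1
             for s in SOURCE_GROUPS)
    return c1 and c2 and c3


def solve():
    """Enumerate every 0/1 assignment and keep those meeting C1, C2 and C3."""
    names = list(EDGES)
    candidates = (dict(zip(names, bits))
                  for bits in product((0, 1), repeat=len(names)))
    return [a for a in candidates if satisfies(a)]


if __name__ == "__main__":
    print(solve())
```

For this sentence the only assignment that survives sets all four variables to 1, in line with the equations on the slide. Enumeration grows exponentially with the number of candidate edges, so it is only suitable for small illustrations like this one.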
Overview of constraint based parser
Output (dependency relations added at chunk level):
1     ((        NP     <af=rAma,n,,0,,adj_vAdu,/drel=k1:5/name=1>
1.1   rAmudu    NN     <af=rAma,n,,0,,adj_vAdu,>
      ))
2     ((        NP     <af=illu,n,,s,,0,,ki,/drel=k2:3/name=2>
2.1   iMtiki    NN     <af=illu,n,,s,,0,,ki,>
      ))
3     ((        VG     <af=vaccu,v,,,any,0,,ina_Aka,/drel=vmod:5/name=3>
3.1   vaccAka   VRB    <af=vaccu,v,,,any,0,,ina_Aka,>
      ))
4     ((        NP     <af=paMdu,n,,s,,0,/drel=k2:5/name=4>
4.1   paMdu     NN     <af=paMdu,n,,s,,0,>|<af=paMdu,n,,s,,0,,obl,>
4.2   ni        PREP   <af=ni,n,,s,,0,>
      ))
5     ((        VG     <af=winu,v,,,3_p,0,,wA,/name=5>
5.1   wiMtAdu   VFM    <af=winu,v,,,3_p,0,,wA,>
5.2   .         SYM
      ))
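The dependency tree can be read back from the drel and name attributes in this output. The Python sketch below parses one feature structure written in the notation of this slide, with attributes separated by "/". The function name `parse_fs` and the returned dictionary layout are assumptions for illustration, and other tools may write the same attributes with different separators.

```python
def parse_fs(fs):
    """Parse a feature structure such as '<af=illu,n,,s,,0,,ki,/drel=k2:3/name=2>'.

    Returns a dict with the 'af' fields as a list, plus 'drel' as a
    (label, head name) pair and 'name' as a string when they are present.
    """
    body = fs.strip()
    if not (body.startswith("<") and body.endswith(">")):
        raise ValueError(f"not a feature structure: {fs!r}")

    # Split '<a=b/c=d>' into a raw attribute dictionary.
    attrs = {}
    for part in body[1:-1].split("/"):
        key, _, value = part.partition("=")
        attrs[key.strip()] = value.strip()

    parsed = {"af": [field.strip() for field in attrs.get("af", "").split(",")]}
    if "drel" in attrs:
        label, _, head = attrs["drel"].partition(":")
        parsed["drel"] = (label.strip(), head.strip())
    if "name" in attrs:
        parsed["name"] = attrs["name"]
    return parsed


if __name__ == "__main__":
    print(parse_fs("<af=illu,n,,s,,0,,ki,/drel=k2:3/name=2>"))
    print(parse_fs("<af=winu,v,,,3_p,0,,wA,/name=5>"))
```

A chunk without a drel attribute, such as chunk 5 here, is the root of the tree.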
Analysis of special cases
• Genitives
• Copula
• “ani” construction
• Conjuncts
Genitives
• The genitive is the case that marks a noun as the possessor of another noun (e.g. his, her, its)
• Cases
– Genitive marker exists
– Telugu : rAmudi yoVkka puswakaM
– Gloss : ram 's book
• When the marker is present the analysis is straightforward: the noun preceding “yoVkka” holds an r6 relation with the noun following “yoVkka”.
– Genitive marker is dropped
– Telugu : rAmudi puswakaM
– Gloss : ram book
• Here the suffix “udi” in “rAmudi” signals the genitive.
Genitive contd.
• Exceptions when the genitive marker is dropped
• Telugu : raGu puswakaM rAmudiki icCAdu
• Gloss : Raghu book to_Ram gave
• English (sense 1) : Raghu gave a book to Ram.
• English (sense 2) : Raghu's book is given to Ram.
• For nouns that lack the masculine suffix (e.g. Raghu, Sita), Telugu has no marker for the genitive.
• So we output all possible parses for this case. The parses are:
– Parse 1 : icCAdu, with k1 raGu, k2 puswakam, k4 rAmudiki
– Parse 2 : icCAdu, with k4 rAmudiki, k2 puswakam; puswakam, with r6 raGu
Copula
• Ex – is, are, were, etc.
• The copula is generally dropped in Telugu. For example:
– Telugu : rAmudu maMci bAludu
– Gloss : Ram good boy
– English : Ram is a good boy.
• We handle these cases by introducing a “NULL_VG”.
Frame for NULL_VG
arc-label | necessity | vibhakti | lextype | posn | reln
k1 | m | 0 | n | l | c
k1s | m | 0 | n | l | c
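One possible way to introduce the NULL_VG is sketched below in Python. The slides do not say how a dropped copula is detected or where the null chunk is placed, so both choices here are assumptions: the sketch treats a sentence with no VG chunk as copula-less and appends the NULL_VG at the end, on the grounds that the verb normally comes last in a Telugu clause. The chunk representation and the name `ensure_verb_group` are hypothetical.

```python
def ensure_verb_group(chunks):
    """Return the chunk list, adding an empty NULL_VG if no verb group exists.

    chunks is a list of (chunk_tag, text) pairs, e.g. ("NP", "rAmudu").
    Assumption: a missing VG means a dropped copula, and the null chunk
    is placed at the end of the sentence.
    """
    if any(tag == "VG" for tag, _ in chunks):
        return list(chunks)
    return list(chunks) + [("NULL_VG", "")]


if __name__ == "__main__":
    sentence = [("NP", "rAmudu"), ("NP", "maMci bAludu")]
    print(ensure_verb_group(sentence))
```

Once the NULL_VG is in place it acts as a demand group like any other verb, and its frame asks for a k1 and a k1s on its left.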
‘ani’ construction
• ‘ani’ in Telugu is sometimes similar to “that” in English.
• “ani” is used in three different ways:
Ø Used as complementizer :
• Telugu : rAmudu paMdu wiMtAdu ani mohan ceVppAdu.
• Gloss : Ram fruit will_eat that mohan said.
• English : Mohan said that Ram will eat a fruit.
Ø Used as verb :
• Telugu : mohan rAmudu paMdu wiMtAdu ani vellipoyAdu.
• English : Mohan left saying Ram eats an apple.
Ø Used to state a reason :
• Telugu : mohan rAmudu paMdu winnAdani vellipoyAdu.
• Gloss : Mohan Ram fruit had_eaten went.
• English : Mohan went because Ram had eaten the fruit.
“ani” construction Contd …
So we created a demand frame for “ani”.
Frame for ani
arc-label | necessity | vibhakti | lextype | posn | reln
ccof | m | | v_fin | l | c
ccof | m | | v_fin | r | p
Conjuncts
• In Telugu, conjuncts occur as suffixes (the TAM of the verb), as DheergAs, and as lexical items such as “inkA”, “anduke”, “mariyu”, “kAni”, “aiwe” and “anwe”.
Ø Suffixes :
Ø Here, applying the corresponding transformation chart of the verb is enough to handle the case.
• Telugu : nenu iMtiki velwe nixrapowAnu.
• Gloss : I home if_go will_sleep.
• English : I will sleep if I go home.
Contd …
• Lexical items : Here we have a frame for each lexical entry, which does the corresponding job. In the case of “mariyu” :
Frame 1 :
arc-label | necessity | vibhakti | lextype | posn | reln
ccof | m | | v | l | c
ccof | m | | v | r | c
Frame 2 :
arc-label | necessity | vibhakti | lextype | posn | reln
ccof | m | | n | l | c
ccof | m | | n | r | c
Contd …
• DheergAs :
Ø Often, lengthening the vowel at the end of a lexical item carries the conjunct information implicitly, with no need for an explicit lexical entry such as “mariyu”.
• Telugu : rAmudU siwA iMtiki vellAru.
• Gloss : Ram (implicit conj) sita home went.
• English : Ram and Sita went home.
Ø In such cases a NULL_CCP is introduced, which acts like an explicit conjunct lexical entry; its frames are similar to the ones on the previous slide.
Future work
• A thorough analysis of relative clauses.
• Analysis and handling of NULL verbs in complex constructions.
• Implementation of the above.
• Verb and TAM classification.
THANKS !!
Any queries?