
- Number of slides: 19
Morphological Analysis of Hungarian in NooJ
Peter Vajda
Hungarian Academy of Sciences, Research Institute for Linguistics
Summary
1. Hungarian morphology
2. Linguistic resources
3. Some experiments with INTEX/NooJ
4. The solution
5. Examples
6. Derivation
Hungarian morphology
- Agglutinative (and sometimes inflectional)
- The suffixes
  - Can have many forms (vowel harmony)
  - Can change the form of the stem (there are groups of variants)
    - bokor (sg.) → bokr-ok (pl.); alma (sg.) → almá-k (pl.)
  - Sometimes begin with a linking vowel
    - plural: -k / -ak / -ek / -ok / -ök
- A noun (adj., num.) can have ~700-800 forms
- A verb can have ~80 forms
- Orthography: difficulties arise when digraphs are doubled
  - cs: cscs → ccs; gy: gygy → ggy
Nominal inflections
- 18 cases (nominative, accusative, dative + grammatical relations which are expressed by prepositions in French/English)
- Possessives are expressed by suffixes
  - which mark the number and the person of the possessor and the number of the possessed
    - ház-a-m, ház-a-d, ház-a (my/your/his house)
    - ház-a-i-m, ház-a-i-d, ház-a-i (my/your/his houses)
- Anaphoric possessive
  - A ház Péteré (The house is Péter's); A házak Péteréi (The houses are Péter's)
- The maximal number of inflections can be five
  - barát-ai-tok-é-i-t: (I can see) those (things) of your friends'
Verbal inflections
- Two tenses: present, past
- Three moods: indicative, conditional, imperative
- Definite and indefinite conjugations
  - Néz-ek egy asztalt (I watch a table)
  - Néz-em az asztalt (I watch the table)
- One special form where the subject is in the 1st person and the object is in the 2nd:
  - néz-lek (I watch you)
- Infinitive and "conjugated infinitive" (sometimes subjunctive in French)
The resources
- Dictionary of Hungarian inflections (Elekfi, '92)
  - A traditional description, profound and exhaustive
  - Two-dimensional classification:
    - Vowel harmony (3 classes)
    - Complex features of the stems (stem types, linking vowel, etc., 55 classes)
  - Altogether: 1700 different sub-classes (paradigms)
  - Systematic differences and similarities are hidden
  - Not convenient to use in finite-state transducers
- We have converted it into a database, from which we can retrieve all the forms
The experiments with INTEX/NooJ
- 'Brute-force' method
  - We created one graph per sub-class for testing INTEX
  - 1700 sub-graphs, 45000 paths in the graphs…
- Using only dictionaries (.nod)
  - Dictionary of stems (70000 words)
    - ház,N+C2A+stem=1+NW
  - Dictionary of suffixes (one million entries)
    - (*)ak,<$1=N+C2A+stem=1>{$0,$1L,N$1S+ana=PL}
    - (*)am,<$1=N+C2A+stem=1>{$0,$1L,N$1S+ana=PSe1}
    - (*)at,<$1=N+C2A+stem=1>{$0,$1L,N$1S+ana=ACC}
    - (*)at,<$1=N+C2A1+stem=1>{$0,$1L,N$1S+ana=ACC}
    - (*)amat,<$1=N+C2A+stem=1>{$0,$1L,N$1S+ana=PSe1+ACC}
  - Dictionary of lexical forms (which have a zero morpheme as suffix)
    - ház,N+ana=NOM
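To make the stem entries easier to read, here is a minimal Python sketch that splits an entry such as `ház,N+C2A+stem=1+NW` into form, category and features. It is not NooJ's own parser: the function name `parse_entry` and the reading of the fields (first `+` item is the category, bare items are boolean flags, `key=value` items are valued features) are assumptions made for illustration.

```python
def parse_entry(line):
    """Split 'form,CAT+flag+key=value' into (form, category, features).

    Assumption: the text before the first comma is the word form, the first
    '+'-separated item after it is the category, the rest are features.
    """
    form, _, info = line.partition(",")
    category, *raw_features = [part.strip() for part in info.split("+")]
    features = {}
    for item in raw_features:
        key, sep, value = item.partition("=")
        # Bare items such as 'NW' become boolean flags.
        features[key] = value if sep else True
    return form.strip(), category, features


print(parse_entry("ház,N+C2A+stem=1+NW"))
print(parse_entry("ház,N+ana=NOM"))
```

The suffix entries with `<...>` constraints and `{...}` outputs are deliberately left out of this sketch, since their exact semantics are not spelled out on the slide.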
The linguistic solution
- Transform the database into a grammar based on morphophonological features
  - The grammatical features of stems and morphemes are in the dictionary
  - The features of the stems and the suffixes can be unified
- Grammar
  - We have to describe the order of the morphemes
  - Introduce features which select from the allomorphs
The order of morphemes for nominals
- barát-a-i-tok-é-i-t
- barát,N +PS +PL +ps_2 +ps_pl +ANAP +i +ACC
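The idea of a fixed morpheme order can be sketched in a few lines of Python. The slot names below (`STEM`, `PS`, `PL`, `POSSESSOR`, `ANAP`, `ANAP_PL`, `ACC`) are hypothetical labels read off the single example barát-a-i-tok-é-i-t; the actual NooJ grammar is a graph and may contain more slots and alternatives than this list.

```python
# Hypothetical slot order, inferred only from the example barát-a-i-tok-é-i-t.
SLOT_ORDER = ["STEM", "PS", "PL", "POSSESSOR", "ANAP", "ANAP_PL", "ACC"]


def respects_order(slots):
    """Return True if the slots appear in strictly increasing grammar order."""
    positions = [SLOT_ORDER.index(slot) for slot in slots]
    return all(a < b for a, b in zip(positions, positions[1:]))


# Segmentation of the example, each morph paired with its assumed slot.
analysis = [
    ("barát", "STEM"), ("a", "PS"), ("i", "PL"), ("tok", "POSSESSOR"),
    ("é", "ANAP"), ("i", "ANAP_PL"), ("t", "ACC"),
]

print(respects_order([slot for _, slot in analysis]))
print(respects_order(["STEM", "ACC", "PL"]))
```

Every slot is optional in this sketch, so shorter sequences such as stem + plural + accusative are accepted, while an accusative followed by a plural is rejected.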
Morpho-phonological features
To introduce features, we examine the allomorphs:
- HÁZ-A: ház,,N+nonj
- HÁZ-AT: ház,,N+nonj+acclink
- HAJÓ-JA: hajó,,N+j
- HAJÓ-T: hajó,,N+j+accnolink
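The way stem features select an allomorph can be illustrated with a small Python sketch. The stem features (`nonj`, `j`, `acclink`, `accnolink`) come from the entries above; the morpheme labels `POSS` and `ACC`, the data layout, and the function `inflect` are assumptions for illustration and do not reproduce the NooJ grammar itself.

```python
# Stem features as given in the dictionary entries for ház and hajó.
STEMS = {
    "ház": {"nonj", "acclink"},
    "hajó": {"j", "accnolink"},
}

# Each morpheme lists its allomorphs with the stem features they require.
# 'POSS' and 'ACC' are hypothetical labels for this sketch.
ALLOMORPHS = {
    "POSS": [("a", {"nonj"}), ("ja", {"j"})],
    "ACC": [("at", {"acclink"}), ("t", {"accnolink"})],
}


def inflect(stem, morpheme):
    """Attach the first allomorph whose required features the stem carries."""
    features = STEMS[stem]
    for form, required in ALLOMORPHS[morpheme]:
        if required <= features:  # subset test stands in for unification
            return stem + form
    raise ValueError(f"no allomorph of {morpheme} fits {stem}")


for stem in STEMS:
    print(inflect(stem, "POSS"), inflect(stem, "ACC"))
```

The subset test is a simplification of feature unification: it only checks that the suffix's requirements are present on the stem, which is enough for these four forms.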
The dictionary
The plural and the accusative
- kalap-ot (hat, SG+ACC)
- kalap-ok-at (hats, PL+ACC)
Derivation
- Can change or preserve the category (POS)
- Introduce new features
  - kosár (basket), kosar-ak (pl.)
  - kosar-as (basketball player), kosar-as-ok (pl.)
- Simple cases are handled by graphs
- Others are listed as lemmas in the dictionary
Assimilation and digraphs
- Some suffixes (e.g. -val/-vel) enforce total assimilation:
  - LÉC + VEL → LÉCCEL
  - PÉCS + VEL → PÉCCSEL
  - PLÉD + VEL → PLÉDDEL
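The three examples above can be reproduced with a short Python sketch of the spelling rule. The function name `add_val_vel` is hypothetical, the suffix variant (-val or -vel) is passed in by the caller instead of being chosen by vowel harmony, and the sketch ignores stem changes and stems that already end in a doubled consonant.

```python
# Hungarian multi-letter consonants, longest first so 'dzs' wins over 'zs'.
MULTIGRAPHS = ("dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs")
VOWELS = set("aáeéiíoóöőuúüű")


def add_val_vel(stem, suffix):
    """Attach -val/-vel; after a consonant the v assimilates completely.

    A doubled digraph is written by repeating only its first letter
    (cs + cs is spelled ccs). Output is lowercase.
    """
    stem, suffix = stem.lower(), suffix.lower()
    if stem[-1] in VOWELS:
        return stem + suffix  # no assimilation after a vowel
    # Final consonant: a multigraph if the stem ends in one, else one letter.
    final = next((m for m in MULTIGRAPHS if stem.endswith(m)), stem[-1])
    doubled = final[0] + final
    return stem[: -len(final)] + doubled + suffix[1:]


for word in ("léc", "pécs", "pléd"):
    print(add_val_vel(word, "vel"))
```

A purely orthographic check like this can misread a letter sequence that only looks like a digraph, which is one reason the rule is better handled with lexical information in the dictionary.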
Conclusion
- We have adapted the traditional description
- We have described the inflectional morphology of Hungarian in NooJ grammars/dictionaries
- Handled some of the derivational morphology
Objectives
- Find a simpler method for derivation
- Disambiguation
- Automatic methods to expand the dictionary
  - Automatic delegation of features
Thank you