
- Number of slides: 27

KI 2 – 5 Grammar inference Lambert Schomaker Kunstmatige Intelligentie / RuG

2 Grammar inference (GI)
§ methods aimed at uncovering the grammar that underlies an observed sequence of tokens
§ Two variants:
– explicit, formal GI (deterministic token generators)
– implicit, statistical GI (stochastic token generators)

3 Grammar inference
AABBCCAA . . (?) . . ABA: what's next?
ABA = 1 A, 1 B, 1 A; AABBAA = 2 A, 2 B, 2 A
or: AAB (+ mirror-symmetric); (2 A, B) (+ mirrored)
operations: repetition, mirroring, insertion, substitution

4 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world

5 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world
(figure: tokenized pattern A B A)

6 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world
(figure: tokenized pattern A B A, described as Symm(B, A))

7 GI
§ induction of structural patterns from observed data
§ representation by a formal grammar
versus:
§ emulating the underlying grammar without making the rules explicit (NN, HMM)

8 GI, the engine
Data → Grammar Induction → Grammatical rules
aaabbb → (seq (repeat 3 a) (repeat 3 b))
ab → (seq a b)
abccba → (symmetry (repeat 2 c) (seq a b))
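The rule notation on this slide can be executed directly. Below is a minimal sketch, assuming a tuple encoding of the slide's s-expressions and assuming that symmetry(body, wrap) means wrap + body + mirrored wrap; both the encoding and that reading are my assumptions, not the lecture's:

```python
# Minimal interpreter for the slide's toy rule notation.
# The tuple encoding and the symmetry semantics (wrap + body +
# reversed wrap) are illustrative assumptions.

def generate(rule):
    """Expand a toy grammar rule into the string it describes."""
    if isinstance(rule, str):              # terminal token
        return rule
    op = rule[0]
    if op == "seq":                        # concatenate sub-rules
        return "".join(generate(r) for r in rule[1:])
    if op == "repeat":                     # ("repeat", n, expr)
        _, n, expr = rule
        return generate(expr) * n
    if op == "symmetry":                   # ("symmetry", body, wrap)
        _, body, wrap = rule
        w = generate(wrap)
        return w + generate(body) + w[::-1]
    raise ValueError(f"unknown operator: {op}")
```

With this reading, (seq (repeat 3 a) (repeat 3 b)) yields aaabbb and (symmetry (repeat 2 c) (seq a b)) yields abccba, matching the slide.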

9 The hypothesis behind GI
G0 → Generator process → Data (aaabbb, ab, abccba) → Grammar Induction → find G' ≈ G0

10 The hypothesis behind GI
G0 → Generator process → Data (aaabbb, ab, abccba) → Grammar Induction → find G' ≈ G0
It is not claimed that G0 actually ‘exists’

11 Learning
§ Until now it was implicitly assumed that the data consist of positive examples
§ A very large amount of data is needed to induce an underlying grammar
§ It is difficult to find a good approximation to G0 if there are no negative examples, e.g. “aaxybb does NOT belong to the grammar”

12 Learning…
Convergence G0 = G* is assumed for infinite N
G0 → Generator process → sample 1, sample 2, sample 3, …, sample N → Grammar Induction → G' = G(1+2+3+…) → G*

13 Learning…
(Convergence G0 = G* is assumed for infinite N)
More realistic: a PAC (probably approximately correct) G*
G0 → Generator process → sample 1, sample 2, sample 3, …, sample N → Grammar Induction → G' = G(1+2+3+…) → G*

14 PAC GI
L(G0): the language generated by G0
L(G*): the language explained by G*
P[ p(L(G0) Δ L(G*)) < ε ] > 1 − δ

15 PAC GI
L(G0): the language generated by G0
L(G*): the language explained by G*
P[ p(L(G0) Δ L(G*)) < ε ] > 1 − δ
The probability that “the probability of finding elements in the symmetric difference {L(G0) xor L(G*)} is smaller than ε” will be larger than 1 − δ

16 Example
§ S+ = { aa, abba, abbba }
(figure: automaton accepting aa)

17 Example
§ S+ = { aa, abba, abbba }
(figure: prefix tree automaton over a and b for all three strings)

18 Example
§ S+ = { aa, abba, abbba }
(figure: automaton with merged states)

19 Example
§ S+ = { aa, abba, abbba }
(figure: further merged automaton)
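A common starting point for examples like these is the prefix tree acceptor (PTA) built from S+, whose states many GI algorithms then merge. The dict-of-dicts encoding and function names below are an illustrative sketch, not the lecture's construction:

```python
# Minimal prefix tree acceptor (PTA) sketch for the positive sample
# S+ = {aa, abba, abbba}. The dict encoding and the "$accept" marker
# are illustrative assumptions.

def build_pta(samples):
    """Build a trie whose accepting nodes mark the positive examples."""
    root = {}
    for s in samples:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})  # add transition if missing
        node["$accept"] = True              # end of a positive example
    return root

def accepts(pta, s):
    """True iff s follows a path ending in an accepting node."""
    node = pta
    for ch in s:
        if ch not in node:
            return False
        node = node[ch]
    return node.get("$accept", False)
```

The PTA accepts exactly S+; state merging (as suggested by the figures) then generalizes it to a larger language.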

20 Many GI approaches are known (Dupont, 1997)

21 Second group: Grammar Emulation
§ Statistical methods, aimed at producing token sequences with the same statistical properties as the generator grammar G0
§ 1: recurrent neural networks
§ 2: Markov models
§ 3: hidden Markov models

22 Grammar emulation, training
example training sequence: ABGBABGACTVYAB
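As a hedged illustration of statistical emulation on a training string like the one above, a first-order Markov (bigram) model can be fitted and sampled; the function names and the choice of bigram order are my assumptions, not the lecture's:

```python
# Illustrative first-order Markov emulator: learn bigram statistics
# from a token sequence, then emit tokens with the same statistics.
import random
from collections import defaultdict

def train_bigrams(seq):
    """Record, for each token, the list of tokens observed after it."""
    succ = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        succ[a].append(b)
    return succ

def emulate(succ, start, n, rng=None):
    """Emit up to n tokens by sampling successors of the last token."""
    rng = rng or random.Random(0)
    out = [start]
    while len(out) < n:
        choices = succ.get(out[-1])
        if not choices:            # dead end: token was never continued
            break
        out.append(rng.choice(choices))
    return "".join(out)

model = train_bigrams("ABGBABGACTVYAB")
```

Sequences from emulate(model, "A", 20) reproduce the bigram statistics of the training string without making any grammar rules explicit.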

23 Recurrent neural networks for grammar emulation
§ Major types:
– Jordan (output-layer recurrence)
– Elman (hidden-layer recurrence)

24 Jordan MLPs
§ Assumption: the current state is represented by the output-unit activation at the previous time step(s) and by the current input
(figure: input and state units feeding the output at time t, with the output fed back to the state units)

25 Elman MLPs
§ Assumption: the current state is represented by the hidden-unit activation at the previous time step(s) and by the current input
(figure: input and state units feeding the output at time t, with the hidden activation fed back to the state units)
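The two architectures differ only in which activation is copied into the context ("state") units. A bare-bones Elman forward step in pure Python, as an illustrative sketch (the weight layout and names are my assumptions):

```python
# One Elman time step: the previous hidden activation acts as the
# context units and is fed back together with the current input.
import math

def elman_step(x, h_prev, W_in, W_ctx, W_out):
    """h = tanh(W_in*x + W_ctx*h_prev); y = W_out*h."""
    n_hidden = len(W_in)
    h = [math.tanh(sum(W_in[i][j] * xj for j, xj in enumerate(x)) +
                   sum(W_ctx[i][j] * hj for j, hj in enumerate(h_prev)))
         for i in range(n_hidden)]
    y = [sum(W_out[k][i] * hi for i, hi in enumerate(h))
         for k in range(len(W_out))]
    return h, y
```

Running the step over a token sequence, feeding each returned h back in as h_prev, gives the hidden-layer recurrence of the Elman network; a Jordan network would feed back y instead.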

26 Markov variants
§ Shannon: a fixed 5-letter window of English text to predict the next letter
§ Variable-Length Markov Models (VLMM) (Guyon & Pereira): the width of the context window used to predict the next token in a sequence is variable and depends on the statistics of the data

27 Results
§ Example output of a letter-level VLMM, trained on news-item texts (250 MB training set):
“liferator member of flight since N. a report the managical including from C all N months after dispute. C and declaracter leaders first to do a lot of though a ground out and C C pairs due to each planner of the lux said the C nailed by the defender begin about in N. the spokesman standards of the arms responded victory the side honored by the accustomers was arrest two mentalisting the romatory accustomers of ethnic C C the procedure.”