

  • Number of slides: 27

KI 2 – 5 Grammar inference
Lambert Schomaker
Kunstmatige Intelligentie / RuG

2 Grammar inference (GI)
§ methods aimed at uncovering the grammar which underlies an observed sequence of tokens
§ Two variants:
  – explicit, formal GI (deterministic token generators)
  – implicit, statistical GI (stochastic token generators)

3 Grammar inference
AABBCCAA..(?)..  what's next?
ABA    → 1A 1B 1A
AABBAA → 2A 2B 2A, or AAB (+ mirror-symmetric), or (2A B) (+ mirrored)
Operations: repetition, mirroring, insertion, substitution
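The descriptions on this slide are essentially run-length encodings of the token strings (AABBAA as 2A 2B 2A). A minimal sketch of that idea, purely as an illustration (the helper name is mine, not from the slides):

```python
def run_length_encode(s):
    """Describe a token string as (count, token) runs, e.g. 'AABBAA' -> [(2,'A'), (2,'B'), (2,'A')]."""
    runs = []
    for tok in s:
        if runs and runs[-1][1] == tok:
            runs[-1] = (runs[-1][0] + 1, tok)   # extend the current run
        else:
            runs.append((1, tok))               # start a new run
    return runs

print(run_length_encode("AABBAA"))   # [(2, 'A'), (2, 'B'), (2, 'A')]
print(run_length_encode("ABA"))      # [(1, 'A'), (1, 'B'), (1, 'A')]
```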

4 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world

5 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world
(figure: a real-world pattern tokenized as A B A)

6 Strings of tokens
§ DNA: ACTGAGGACCTGAC…
§ output of speech recognizers
§ words from an unknown language
§ tokenized patterns in the real world
(figure: the pattern A B A described as Symm(B, A))

7 GI
§ induction of structural patterns from observed data
§ representation by a formal grammar
versus:
§ emulating the underlying grammar without making the rules explicit (NN, HMM)

8 GI, the engine
Data → Grammar Induction → Grammatical rules
  aaabbb → (seq (repeat 3 a) (repeat 3 b))
  ab     → (seq a b)
  abccba → (symmetry (repeat 2 c) (seq a b))
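To make the correspondence between rules and data concrete, here is a minimal sketch of a generator that expands such rules back into strings. The nested-tuple representation of the rules is my assumption; the slide does not define the rule syntax formally:

```python
def expand(rule):
    """Expand a nested rule into a token string.
    Rules are tuples: ('seq', r1, r2, ...), ('repeat', n, r), ('symmetry', middle, wing)."""
    if isinstance(rule, str):
        return rule
    op = rule[0]
    if op == "seq":
        return "".join(expand(r) for r in rule[1:])
    if op == "repeat":
        n, r = rule[1], rule[2]
        return expand(r) * n
    if op == "symmetry":
        middle, wing = expand(rule[1]), expand(rule[2])
        return wing + middle + wing[::-1]   # wing, middle, mirrored wing
    raise ValueError(f"unknown operator: {op}")

# The three examples from the slide:
print(expand(("seq", ("repeat", 3, "a"), ("repeat", 3, "b"))))        # aaabbb
print(expand(("seq", "a", "b")))                                       # ab
print(expand(("symmetry", ("repeat", 2, "c"), ("seq", "a", "b"))))     # abccba
```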

9 The hypothesis behind GI
A generator process with grammar G0 produces the data (aaabbb, ab, abccba).
Grammar induction: find a G' such that G' ≈ G0.

10 The hypothesis behind GI
A generator process with grammar G0 produces the data (aaabbb, ab, abccba).
Grammar induction: find a G' such that G' ≈ G0.
It is not claimed that G0 actually ‘exists’.

11 Learning
§ Until now it was implicitly assumed that the data consists of positive examples
§ A very large amount of data is needed to induce an underlying grammar
§ It is difficult to find a good approximation to G0 if there are no negative examples, e.g. “aaxybb does NOT belong to the grammar”

12 Learning…
A generator process with grammar G0 produces sample 1, sample 2, sample 3, …, sample N.
Grammar induction builds G' from the accumulated samples (G1+2+3…) towards G*.
Convergence G0 = G* is assumed for infinite N.

13 Learning…
(Convergence G0 = G* is assumed for infinite N.)
More realistic: a PAC (probably approximately correct) G*.
A generator process with grammar G0 produces sample 1, sample 2, sample 3, …, sample N;
grammar induction builds G' from the accumulated samples (G1+2+3…) towards G*.

14 PAC GI
L(G0): the language generated by G0
L(G*): the language explained by G*
P[ p(L(G0) Δ L(G*)) < ε ] > (1 − δ)

15 PAC GI
L(G0): the language generated by G0
L(G*): the language explained by G*
P[ p(L(G0) Δ L(G*)) < ε ] > (1 − δ)
The probability that “the probability of finding elements in {L0 xor L*} is smaller than ε” will be larger than 1 − δ.
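In other words, with probability at least 1 − δ the probability mass of strings on which the two languages disagree is below ε. A minimal Monte-Carlo sketch of that error estimate, assuming we can sample strings from the generator's distribution and test membership in both languages (the callables sample_string, in_L0 and in_Lstar are placeholders, not anything defined in the slides):

```python
def estimate_symmetric_difference(sample_string, in_L0, in_Lstar, n=10_000):
    """Estimate p(L(G0) xor L(G*)): the probability that a sampled string
    is accepted by exactly one of the two languages."""
    disagreements = sum(in_L0(s) != in_Lstar(s)
                        for s in (sample_string() for _ in range(n)))
    return disagreements / n

# G* would count as epsilon-approximately correct on this sample
# if the estimated disagreement probability stays below epsilon.
```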

16 Example
§ S+ = { aa, abba, abbba }
(figure: automaton construction, step 1: transitions labelled a)

17 Example
§ S+ = { aa, abba, abbba }
(figure: automaton construction, step 2: transitions labelled a and b)

18 Example
§ S+ = { aa, abba, abbba }
(figure: automaton construction, step 3: transitions labelled a and b)

19 Example
§ S+ = { aa, abba, abbba }
(figure: automaton construction, final step: a more compact automaton with a- and b-transitions)
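Slides 16–19 appear to illustrate the classical construction of an acceptor from the positive sample S+: first a prefix-tree acceptor, which is then generalized by merging states. A minimal sketch of the prefix-tree step under that assumption (the state-merging step and the exact representation used in the lecture are not shown here):

```python
def prefix_tree_acceptor(positive_sample):
    """Build a prefix-tree acceptor: every prefix of a sample string is a state."""
    transitions = {}      # (state, token) -> next state
    accepting = set()     # states corresponding to complete sample strings
    for word in positive_sample:
        state = ""
        for tok in word:
            nxt = state + tok
            transitions[(state, tok)] = nxt
            state = nxt
        accepting.add(state)
    return transitions, accepting

trans, acc = prefix_tree_acceptor({"aa", "abba", "abbba"})
print(sorted(acc))         # ['aa', 'abba', 'abbba']
print(trans[("a", "b")])   # 'ab'
```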

20 Many GI approaches are known (Dupont, 1997)

21 Second group: Grammar Emulation
§ Statistical methods, aiming at producing token sequences with the same statistical properties as the generator grammar G0
§ 1: recurrent neural networks
§ 2: Markov models
§ 3: hidden-Markov models

22 Grammar emulation, training
ABGBABGACTVYAB <x> …
A context window over the observed sequence is fed to the grammar emulator, which must predict the next token x.
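Training data for the emulator thus consists of (context window, next token) pairs sliced from the observed sequence. A minimal sketch, with the window width as a free parameter (the value 4 is arbitrary, not taken from the slides):

```python
def training_pairs(sequence, window=4):
    """Slice a token sequence into (context, next-token) training pairs."""
    return [(sequence[i:i + window], sequence[i + window])
            for i in range(len(sequence) - window)]

for ctx, nxt in training_pairs("ABGBABGACTVYAB", window=4)[:3]:
    print(ctx, "->", nxt)
# ABGB -> A
# BGBA -> B
# GBAB -> G
```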

23 Recurrent neural networks for grammar emulation
§ Major types:
  – Jordan (output-layer recurrence)
  – Elman (hidden-layer recurrence)

24 Jordan MLPs
§ Assumption: current state is represented by output unit activation at the previous time step(s) and by the current input
(figure: network diagram with Input, state and Output over time t)

25 Elman MLPs
§ Assumption: current state is represented by hidden unit activation at the previous time step(s) and by the current input
(figure: network diagram with Input, state and Output over time t)
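The only difference between the two architectures is which activation vector is fed back as the state: the previous output (Jordan) or the previous hidden layer (Elman). A minimal numpy sketch of one forward step of each; the sizes, initialization and tanh nonlinearity are arbitrary illustrations, not the networks used in the course:

```python
import numpy as np

def step(W_in, W_state, W_out, x, state):
    """One forward step: the hidden layer sees the current input plus the fed-back state."""
    hidden = np.tanh(W_in @ x + W_state @ state)
    output = np.tanh(W_out @ hidden)
    return hidden, output

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 5, 8, 5
W_in          = rng.normal(size=(n_hid, n_in))
W_state_elman = rng.normal(size=(n_hid, n_hid))   # Elman: state has hidden-layer size
W_state_jordan = rng.normal(size=(n_hid, n_out))  # Jordan: state has output-layer size
W_out         = rng.normal(size=(n_out, n_hid))

x = rng.normal(size=n_in)
hidden, output = np.zeros(n_hid), np.zeros(n_out)

# Elman: feed back the previous hidden activation
hidden_e, out_e = step(W_in, W_state_elman, W_out, x, hidden)
# Jordan: feed back the previous output activation
hidden_j, out_j = step(W_in, W_state_jordan, W_out, x, output)
```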

26 Markov variants
§ Shannon: fixed 5-letter window for English to predict the next letter
§ Variable-length Markov Models (VLMM) (Guyon & Pereira); idea: the width of the context window used to predict the next token in a sequence is variable and depends on the statistics
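The Shannon-style approach reduces to counting which letter most often follows each fixed-length context. A minimal sketch of that counting predictor; the backoff to shorter contexts when a long context was never observed is my own simplification and only hints at the statistics-dependent context length of a real VLMM:

```python
from collections import Counter, defaultdict

def train_counts(text, max_order=5):
    """Count next-letter frequencies for every context of length 1..max_order."""
    counts = defaultdict(Counter)
    for k in range(1, max_order + 1):
        for i in range(len(text) - k):
            counts[text[i:i + k]][text[i + k]] += 1
    return counts

def predict_next(counts, context, max_order=5):
    """Predict the next letter; back off to shorter contexts if the long one is unseen."""
    for k in range(min(max_order, len(context)), 0, -1):
        ctx = context[-k:]
        if ctx in counts:
            return counts[ctx].most_common(1)[0][0]
    return None

counts = train_counts("the theory of the thing ")
print(predict_next(counts, "the th"))   # 'e' or 'i', depending on the observed counts
```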

27 Results
§ Example output of a letter VLMM, trained on news item texts (250 MB training set):
“liferator member of flight since N. a report the managical including from C all N months after dispute. C and declaracter leaders first to do a lot of though a ground out and C C pairs due to each planner of the lux said the C nailed by the defender begin about in N. the spokesman standards of the arms responded victory the side honored by the accustomers was arrest two mentalisting the romatory accustomers of ethnic C C the procedure.”