
  • Number of slides: 43

The Chomsky Hierarchy

Sentences
The sentence as a string of words, e.g. I saw the lady with the binoculars
string = a b c d e b f

The relations of parts of a string to each other may be different.
"I saw the lady with the binoculars" is structurally ambiguous: who has the binoculars?

[I] saw the lady [with the binoculars] = [a] b c d [e b f]
I saw [the lady with the binoculars] = a b [c d e b f]

How can we represent the difference? By assigning them different structures. We can represent structures with 'trees'.
I read the book

a. I saw the lady with the binoculars
[S [NP I] [VP [V saw] [NP [NP the lady] [PP with the binoculars]]]]
I saw [the lady with the binoculars]

b. I saw the lady with the binoculars
[S [NP I] [VP [VP saw the lady] [PP with the binoculars]]]
I [saw the lady] with the binoculars

Graphs and trees
birds fly
[S [NP [N birds]] [VP [V fly]]]
Syntactic rules: S → NP VP; NP → N; VP → V

Graphs and trees
[S [NP birds] [VP fly]], with birds = a and fly = b
string = ab

Graphs and trees
[S [A a] [B b]], string = ab
Rules: S → A B; A → a; B → b
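The three rewrite rules above can be run mechanically. The following is a minimal sketch, assuming a leftmost rewriting order (the slide does not specify one); the names RULES and derive are illustrative and not part of the original deck.

```python
# Rewrite rules from the slide: S → A B, A → a, B → b.
# Each auxiliary symbol has exactly one expansion in this tiny grammar.
RULES = {"S": ["A", "B"], "A": ["a"], "B": ["b"]}


def derive(symbols):
    """Leftmost derivation: rewrite the first auxiliary symbol until none remain.

    Returns the list of intermediate strings (each a list of symbols).
    """
    steps = [list(symbols)]
    while any(s in RULES for s in steps[-1]):
        current = steps[-1]
        # Index of the leftmost symbol that can still be rewritten.
        i = next(k for k, s in enumerate(current) if s in RULES)
        steps.append(current[:i] + RULES[current[i]] + current[i + 1:])
    return steps


for step in derive(["S"]):
    print(" ".join(step))
```

Each printed line is one step of the derivation; the last line is the terminal string that the tree spells out along its leaves.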

Rules
Assumption: natural language grammars are rule-based systems.
What kind of grammars describe natural language phenomena?
What are the formal properties of grammatical rules?

Chomsky, N. (1957) Syntactic Structures. The Hague: Mouton.
Chomsky, N. and G. A. Miller (1958) Finite state languages. Information and Control 1, 91-112.
Chomsky, N. (1959) On certain formal properties of grammars. Information and Control 2, 137-167.

Rules in Linguistics 1. PHONOLOGY
/s/ → [θ] / V ___ V
Rewrite /s/ as [θ] when /s/ occurs in the context V ___ V
With: V = auxiliary node; s, θ = terminal nodes

Rules in Linguistics 2. SYNTAX
S → NP VP; VP → V; NP → N
Rewrite S as NP VP in any context
With: S, NP, VP = auxiliary nodes; V, N = terminal nodes

PHONOLOGY (sound system)
Maltese: word-final devoicing
Orthography (spelling)     Pronunciation (sound)
Sabet      sab             [sa-bet]     [sap]
Ħobża      ħobż            [hob-za]     [hops]
Vjaġġi     vjaġġ           [vjağ-ği]    [vjačč]
voiced [+vd]: [b, z, ğ]    voiceless [-vd]: [p, s, č]
[+vd] → [-vd] / ____ # (for # = end of word)
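The devoicing rule can be tried out on the slide's own examples. This is a rough sketch, not a phonological analysis: it uses only the three voiced/voiceless pairs listed on the slide, takes input in the slide's transcription symbols, and assumes that the whole word-final run of voiced obstruents devoices (which is what [hops] and [vjačč] suggest, although the rule as written mentions only the segment before #). The names DEVOICE and devoice_final are hypothetical.

```python
import re

# Voiced → voiceless pairs listed on the slide (slide transcription symbols).
# Other Maltese obstruents are left out because the slide does not list them.
DEVOICE = str.maketrans({"b": "p", "z": "s", "ğ": "č"})


def devoice_final(word):
    """Apply [+vd] → [-vd] / ____ # to a transcribed word.

    Assumption: every voiced obstruent in the word-final cluster devoices,
    not only the last segment.
    """
    return re.sub(r"[bzğ]+$", lambda m: m.group().translate(DEVOICE), word)


for form in ["sab", "hobz", "vjağğ", "sabet"]:
    print(form, "->", devoice_final(form))
```

Forms that end in a vowel or a voiceless consonant, such as sabet, pass through unchanged because the pattern is anchored to the end of the word.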

MORPHOLOGY (word formation)
Maltese: progressive assimilation in 3fsg imperfective (present)
Marker for verb in 3rd person feminine singular imperfective: t- (3fsg impf = she)
e.g. she breaks = t-kisser; I break = n-kisser
t-kisser    3fsg-break    she breaks
t-ressaq    3fsg-move     she moves
s-sakkar    3fsg-lock     she locks
d-dur       3fsg-turn     she turns
*t-sakkar, *t-dur
t → s, d, etc. / ____ [s, d, etc.] [+cor], for μ [3fsg]
(with μ = morpheme, C = consonant, cor = coronal)

SYNTAX (phrase/sentence formation)
SENTENCE: The boy kissed the girl
SENTENCE = SUBJECT + PREDICATE = NOUN PHRASE + VERB PHRASE
NOUN PHRASE = ART + NOUN; VERB PHRASE = VERB + NOUN PHRASE
S → NP VP; VP → V NP; NP → ART N

SEMANTICS (meaning)
The lion attacks the hunter
ATTACK (a, b) is built from a and λy [ATTACK (y, b)]
λy [ATTACK (y, b)] is built from λz λy [ATTACK (y, z)] and b
(with a = the lion, b = the hunter)
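The step-by-step composition of the meaning can be mimicked with curried functions. This is only an illustrative sketch: a tuple stands in for the logical formula ATTACK (y, z), and the variable names are assumptions made for the example.

```python
# λz λy [ATTACK (y, z)] as a curried function.
# A tuple ("ATTACK", y, z) stands in for the formula ATTACK (y, z).
attack = lambda z: lambda y: ("ATTACK", y, z)

a, b = "the lion", "the hunter"

attacks_hunter = attack(b)   # corresponds to λy [ATTACK (y, b)]
formula = attacks_hunter(a)  # corresponds to ATTACK (a, b)

print(formula)
```

The verb first combines with its object b and only then with its subject a, which mirrors the order of application on the slide.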

Chomsky Hierarchy
0. Type 0 (recursively enumerable) languages: the only restriction on rules is that the left-hand side cannot be the empty string (* Ø → …)
1. Context-Sensitive languages: Context-Sensitive (CS) rules
2. Context-Free languages: Context-Free (CF) rules
3. Regular languages: regular (right-linear or left-linear) rules
0 ⊃ 1 ⊃ 2 ⊃ 3, where a ⊃ b means a properly includes b (a is a proper superset of b), i.e. b is a proper subset of a

Generative power
0. Type 0 (recursively enumerable) languages: the only restriction on rules is that the left-hand side cannot be the empty string (* Ø → …); this is the most powerful system
3. Type 3 (regular languages): the least powerful system

Superset/subset relation
S1 = {a, b}
S2 = {a, b, c, d, f, g}
S1 is a subset of S2; S2 is a superset of S1

Rule Type 3
Name: Regular
Example: Finite State Automata (Markov-process grammar)
Rule type:
a) right-linear: A → x B or A → x, with A, B = auxiliary nodes and x = terminal node
b) or left-linear: A → B x or A → x
Generates: a^m b^n with m, n ≥ 1
Cannot guarantee that there are as many a's as b's; no embedding
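The right-linear rule format can be checked mechanically. The sketch below assumes rules are given as (left-hand side, right-hand side) pairs and uses a hypothetical grammar for a^m b^n (S → a S, S → a B, B → b B, B → b) that is not taken from the slides; is_right_linear is likewise an illustrative name.

```python
def is_right_linear(rules, auxiliary):
    """True if every rule has the form A → x B or A → x.

    A and B must be auxiliary symbols and x a terminal symbol.
    """
    for lhs, rhs in rules:
        if lhs not in auxiliary or not 1 <= len(rhs) <= 2:
            return False
        if rhs[0] in auxiliary:  # the first symbol must be a terminal
            return False
        if len(rhs) == 2 and rhs[1] not in auxiliary:
            return False
    return True


# Hypothetical right-linear grammar for a^m b^n with m, n >= 1.
AMBN = [("S", ["a", "S"]), ("S", ["a", "B"]), ("B", ["b", "B"]), ("B", ["b"])]

# A rule with material on both sides of the auxiliary symbol is not right-linear.
EMBEDDING = [("S", ["a", "S", "b"]), ("S", ["a", "b"])]

print(is_right_linear(AMBN, {"S", "B"}))
print(is_right_linear(EMBEDDING, {"S"}))
```

The second grammar fails the check because S → a S b places terminals on both sides of S, which is exactly the embedding that regular rules exclude.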

A regular grammar for natural language sentences
S → the A
A → cat B; A → mouse B; A → duck B
B → bites C; B → sees C; B → eats C
C → the D
D → boy; D → girl; D → monkey
Generated sentences include:
the cat bites the boy
the mouse eats the monkey
the duck sees the girl
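Because every rule above is right-linear and there are no loops, the grammar generates a finite set of sentences that can be enumerated. A minimal sketch, assuming each rule is stored as a (word, next symbol) pair with None marking a rule that ends the sentence; the names RULES and sentences are illustrative.

```python
# Rules from the slide, stored as symbol -> list of (word, next symbol).
# None as the next symbol marks a rule of the form A → x.
RULES = {
    "S": [("the", "A")],
    "A": [("cat", "B"), ("mouse", "B"), ("duck", "B")],
    "B": [("bites", "C"), ("sees", "C"), ("eats", "C")],
    "C": [("the", "D")],
    "D": [("boy", None), ("girl", None), ("monkey", None)],
}


def sentences(symbol="S"):
    """Yield every word sequence derivable from `symbol`."""
    for word, nxt in RULES[symbol]:
        if nxt is None:
            yield [word]
        else:
            for rest in sentences(nxt):
                yield [word] + rest


all_sentences = [" ".join(words) for words in sentences()]
print(len(all_sentences))
for s in all_sentences[:3]:
    print(s)
```

With three choices each for A, B and D, the grammar yields 3 × 3 × 3 = 27 sentences, including the three shown on the slide.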

Regular grammars
Grammar 1: A → a B; B → b A
Grammar 2: A → a; A → B a; B → A b
Grammar 3: A → a B; B → b A
Grammar 4: A → a; A → B a; B → b; B → A b
Grammar 5: S → a A; S → b B; A → a S; B → b b S; S → ε
Grammar 6: A → A a; A → B a; B → b; B → A b; A → a

Grammars: non-regular
Grammar 6: S → A B; S → b B; A → a S; B → b b S; S → ε
Grammar 7: A → a; A → B a; B → b A

Finite-State Automaton
States: NP, NP1, NP2; transitions labelled article, adjective, noun

Finite-state automaton as rules
NP → article NP1
NP1 → adjective NP1
NP1 → noun NP2
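The three transitions above are enough to simulate the automaton. This is a minimal sketch under two assumptions not stated on the slide: NP is the start state and NP2 is the only accepting state. Input is a sequence of word categories rather than words, and the names TRANSITIONS and accepts are illustrative.

```python
# Transition table read off the rules: (state, category) -> next state.
TRANSITIONS = {
    ("NP", "article"): "NP1",
    ("NP1", "adjective"): "NP1",  # loop: any number of adjectives
    ("NP1", "noun"): "NP2",
}


def accepts(categories, start="NP", final="NP2"):
    """Run the automaton over a sequence of categories.

    Assumption: `start` is the initial state and `final` the only accepting state.
    """
    state = start
    for category in categories:
        state = TRANSITIONS.get((state, category))
        if state is None:  # no transition defined: reject
            return False
    return state == final


print(accepts(["article", "noun"]))
print(accepts(["article", "adjective", "adjective", "noun"]))
print(accepts(["adjective", "noun"]))
```

The automaton has no memory beyond its current state, so it can allow any number of adjectives but cannot count them.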

A parse tree
[S [NP N] [VP V [NP DET N]]]
Labels on the tree mark the root node (S), the nonterminal nodes and the terminal nodes

Rule Type 2
Name: Context Free
Example: Phrase Structure Grammars / Push-Down Automata
Rule type: A → γ, with A = auxiliary node and γ = any number of terminal or auxiliary nodes
Recursiveness (centre embedding) allowed: A → α A β

CF Grammar
A Context Free grammar consists of:
a) a finite terminal vocabulary VT
b) a finite auxiliary vocabulary VA
c) an axiom S ∈ VA
d) a finite number of context free rules of the form A → γ, where A ∈ VA and γ ∈ {VA ∪ VT}*
In natural language syntax S is interpreted as the start symbol for sentence, as in S → NP VP

CF Grammars
The following languages cannot be generated by a regular grammar:
Language 1: a^n b^n, e.g. ab, aabb
Language 2: mirror image, e.g. abaaba, abba
Context-Free rules: A → a b; A → b A b
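Both languages can be recognised by peeling off one symbol from each end, which is the recursion that context-free rules allow and regular rules do not. The sketch below assumes the standard rule sets A → a A b | a b for a^n b^n and A → a A a | b A b | a a | b b for even-length mirror-image strings; these are assumptions for illustration, since the slide's rule list survives only in part.

```python
def is_anbn(s):
    """Recognise a^n b^n (n >= 1) by undoing A → a A b down to A → a b."""
    if s == "ab":
        return True
    return len(s) > 2 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])


def is_mirror(s):
    """Recognise even-length mirror-image strings over {a, b}.

    Undoes A → a A a and A → b A b down to A → a a or A → b b.
    """
    if s in ("aa", "bb"):
        return True
    return len(s) > 2 and s[0] == s[-1] and s[0] in "ab" and is_mirror(s[1:-1])


for example in ["ab", "aabb", "aab"]:
    print(example, is_anbn(example))
for example in ["abba", "abaaba", "abab"]:
    print(example, is_mirror(example))
```

Each recursive call matches the outermost pair of symbols, so the recogniser effectively keeps a stack of pending closers, much as a push-down automaton would.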

Natural language
Is English regular or CF? If centre embedding is required, then it cannot be regular.
Centre Embedding:
1. [The cat] [likes tuna fish]: a b
2. The cat the dog chased likes tuna fish: a a b b
3. The cat the dog the rat bit chased likes tuna fish: a a a b b b
4. The cat the dog the rat the elephant admired bit chased likes tuna fish: a a a a b b b b
ab, aabb, aaabbb, aaaabbbb
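The pattern behind these examples is that the noun phrases come first and the verb phrases follow in reverse order. A small sketch that rebuilds the slide's sentences from two lists; the list and function names are illustrative, and the words are taken from the examples above.

```python
# Noun phrases and their verb phrases, paired by position:
# the cat likes tuna fish, the dog chased, the rat bit, the elephant admired.
NOUNS = ["the cat", "the dog", "the rat", "the elephant"]
VERBS = ["likes tuna fish", "chased", "bit", "admired"]


def centre_embedded(n):
    """Build the sentence with n noun phrases followed by n verb phrases.

    The verb phrases appear in reverse order, so each noun phrase is
    matched with its verb phrase from the outside in.
    """
    return " ".join(NOUNS[:n] + VERBS[:n][::-1])


def pattern(n):
    """The corresponding a^n b^n string (noun phrase = a, verb phrase = b)."""
    return "a" * n + "b" * n


for n in range(1, 5):
    print(pattern(n), "=", centre_embedded(n))
```

Matching the first noun phrase with the last verb phrase is the same nesting as in a^n b^n, which is why a finite-state device cannot keep the two halves in step for arbitrary n.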

Centre embedding
[S [NP the cat] [VP likes tuna]], with NP = a and VP = b
S = ab

Centre embedding, one level
[S [NP [NP the cat] [S [NP the dog] [VP chased]]] [VP likes tuna]], with each NP = a and each VP = b
S = aabb

Centre embedding, two levels
[S [NP [NP the cat] [S [NP [NP the dog] [S [NP the rat] [VP bit]]] [VP chased]]] [VP likes tuna]], with each NP = a and each VP = b
S = aaabbb

Natural language
Is English regular or CF? If centre embedding is required, then it cannot be regular.

Centre Embedding
1. [The cat] [likes tuna fish]: a b = ab
2. [The cat] [the dog] [chased] [likes tuna fish]: a a b b = aabb

1. [The cat] [likes tuna fish]: a b
2. [The cat] [the dog] [chased] [likes ...]: a a b b

3. [The cat] [the dog] [the rat] [bit] [chased] [likes ...]: a a a b b b = aaabbb
4. [The cat] [the dog] [the rat] [the elephant] [admired] [bit] [chased] [likes ...]: a a a a b b b b = aaaabbbb

Natural language 2
More Centre Embedding:
1. If S1, then S2 (a ... a)
2. Either S3, or S4 (b ... b)
3. The man who said S5 is arriving today
4. The man who said S6 is arriving the day after
Sentence with embedding: If either the man who said S5 is arriving today or the man who said S5 is arriving tomorrow, then the man who said S6 is arriving the day after
a b b a = abba

Natural language 2
More Centre Embedding:
1. If S1, then S2 (a ... a)
2. Either S3, or S4 (b ... b)
Sentence with embedding: If either the man is arriving today or the woman is arriving tomorrow, then the child is arriving the day after.
a = [if
b = [either the man is arriving today]
b = [or the woman is arriving tomorrow]]
a = [then the child is arriving the day after]
= abba

CS languages
The following languages cannot be generated by a CF grammar (by the pumping lemma): a^n b^m c^n d^m
Swiss German: a string of dative nouns (e.g. aa), followed by a string of accusative nouns (e.g. bbb), followed by a string of dative-taking verbs (cc), followed by a string of accusative-taking verbs (ddd)
= aabbbccddd = a^n b^m c^n d^m
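Membership in a^n b^m c^n d^m is easy to test procedurally even though no context-free grammar generates the language. The sketch below first checks the overall shape with a regular expression and then compares the counts directly; the function name is illustrative, and n, m ≥ 1 is assumed.

```python
import re


def is_cross_serial(s):
    """True if s = a^n b^m c^n d^m with n, m >= 1."""
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    if match is None:  # wrong overall shape
        return False
    a, b, c, d = (len(group) for group in match.groups())
    # The crossing dependencies: a's pair with c's, b's pair with d's.
    return a == c and b == d


for example in ["aabbbccddd", "abcd", "aabbbcddd"]:
    print(example, is_cross_serial(example))
```

The two count comparisons cross each other (a with c, b with d) instead of nesting, which is what puts the pattern beyond a single stack.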

Swiss German:
Jan sait das (Jan says that) … mer em Hans es Huus hälfed aastriiche
Gloss: we Hans/DAT the house/ACC helped paint
Translation: we helped Hans paint the house
em Hans = NPdat = a; es Huus = NPacc = b; hälfed = Vdat = c; aastriiche = Vacc = d
string = abcd