Number of slides: 54

Lecture 10: Natural Language Generation, a highly complex task both for people and for machines
Slides borrowed from Michael Zock, LIMSI-CNRS, Orsay, France

Some preliminary issues: Warning
This is not a state-of-the-art talk. If you are interested in one, this could be a starting point:
• Bateman & Zock (2003). Natural Language Generation. In R. Mitkov (Ed.), Handbook of Computational Linguistics, Oxford University Press, pp. 284-304
• List of systems: http://www.fb10.unibremen.de/anglistik/langpro/NLG-tableroot.htm
• Anything related to NLG: http://www.siggen.org/

Some preliminary issues: Background material
• Willem Levelt: Speaking: From Intention to Articulation, MIT Press, 1989
• E. Reiter & R. Dale: Building Natural Language Generation Systems (2000), Cambridge University Press

Overview of this talk
Part 1: General problems
   • knowledge and constraints, architecture, process, etc.
Part 2: Deep generation
   • message planning
   • message ordering (text plan, outline)
Part 3: Surface generation
   • lexical choice (access and synthesis)
   • computation of syntactic structure

Different ways to look at text generation
• NLG FOR people: fully automated generation
• NLG WITH people: semi-automated, machine-mediated generation (writer's workbench, foreign language learning)
• NLG LIKE people: simulation of psychological processes (connectionism, online processing, incremental generation)

What is NLG? Ask Google
« Fort méconnue du grand public, la génération de textes demeure une discipline sportive essentiellement universitaire, pratiquée par d'obscurs chercheurs dans des laboratoires tristes et exigus. Cette discipline pousse ses malheureux adeptes à des pratiques honteuses : la génération par ordinateur interposé de textes longs et soporifiques à partir d'une composition sémantique produite mécaniquement. »
Hardly known by the great majority of people, text generation remains a sport basically practiced by people from academia. Those engaged in this activity usually work in sad and narrow places. The discipline induces strange kinds of behavior, like the generation of long and boring texts via computers on the basis of mechanically produced semantic representations.

What is NLG? In search of a definition
The focus and definition may depend on the domain (psychology, linguistics, computer science):
• Mapping problem: translate meanings into linguistic form
• Linguistically mediated problem solving
• Language as a search problem

What is NL-Generation? (I) Generation as a mapping process
[Figure: input concepts (C1, C2, C3) mapped onto output words (W1, W3)]
NLG viewed as a process of mapping a conceptual structure (meaning) onto a linguistic form

Catch me if you can
[Figure: conceptualization (C1, C2, C3, C4) running ahead of expression (W1, W2, W3)]
We tend to think faster than we can find the corresponding words and convert them into sounds.

There is no one-to-one mapping between linguistic structures and conceptual structures

The same conceptual structure may map onto different linguistic structures (synonyms, paraphrases)
Possession:
• This car belongs to the president. (verb)
• This is the car of the president. (preposition)
• This is the president's car. (genitive)
• This is his car. (possessive adjective)

The same linguistic structure may map onto different conceptual structures
Linguistic resource: genitive
• Peter's car is broken. (possession)
• Peter's brother is sick. (family relationship)
• Peter's leg hurts. (inalienable possession, part of)
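The two preceding slides can be read as a many-to-many relation between conceptual relations and linguistic devices. The sketch below is only an illustration of that reading: the pairs come from the slide examples, while the names PAIRS and build_indexes and the choice of labels are assumptions made here, not part of the original talk.

```python
from collections import defaultdict

# (conceptual relation, linguistic device) pairs read off the slide examples.
# The labels are illustrative, not a standard inventory.
PAIRS = [
    ("possession", "verb"),                  # This car belongs to the president
    ("possession", "preposition"),           # This is the car of the president
    ("possession", "genitive"),              # This is the president's car
    ("possession", "possessive adjective"),  # This is his car
    ("family relationship", "genitive"),     # Peter's brother is sick
    ("part of", "genitive"),                 # Peter's leg hurts
]


def build_indexes(pairs):
    """Index the relation in both directions."""
    concept_to_forms = defaultdict(set)
    form_to_concepts = defaultdict(set)
    for concept, form in pairs:
        concept_to_forms[concept].add(form)
        form_to_concepts[form].add(concept)
    return concept_to_forms, form_to_concepts


concept_to_forms, form_to_concepts = build_indexes(PAIRS)

# Generation faces a choice: one concept, several candidate forms.
print(sorted(concept_to_forms["possession"]))
# Analysis faces ambiguity: one form, several candidate concepts.
print(sorted(form_to_concepts["genitive"]))
```

Looking up a concept returns several forms (the generator must choose), and looking up a form returns several concepts (the analyzer must disambiguate).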

NLG as language-mediated problem solving

A simple generation model

Nature of choices
• pragmatic
• conceptual
• linguistic

Pragmatic choices
• Languages are indirect means for achieving goals
   • mediating devices
• Different linguistic means serve different discourse purposes
   • i.e. different forms are used in order to achieve different goals

Pragmatic choices: language as a resource
• active vs. passive voice [topic, perspective]
• main vs. subordinate clause [relative prominence]

Conceptual choices
Different meanings generally yield different forms
• NUMBER: he sings vs. they sing
• TENSE: he sings vs. he sang

Linguistic choices
The same meaning can be expressed by different words or syntactic forms (synonyms, paraphrases)
• GROWN UP MALE PERSON: man, guy, chap
• HELP: help, give a hand, assist

What is NL-Generation? Tentative definition (III) Generation as a search problem
Size of mental lexicon: approx. 30,000 words

An abstract view

An example

Input: analysis

Input: synthesis

Different search spaces

Fundamental problems
• Analysis: ambiguity
• Generation: choice

Why bother about generation? (1) Different kinds of motivation
• Theoretical
• Practical
• Industrial

Theoretical reasons: building and testing a theory
• Testbed for a linguistic theory: coverage (over/undergeneration), correctness
• Testbed for a psychological model: simulation of cognitive processes (on-line processing, language learning)

Practical reasons (industrial, full automation)
• machine translation
• text generation (business letters)
• generation of summaries (stock market reports, weather forecasts, etc.)
• help systems (audit trail, access to DB)
• abstracting

Practical reasons (help systems, semi-automation)
• Computer-assisted language learning (tools)
• Writer's workbench (pre/post-editing: correction of grammar, style, spelling, text organization)

The decomposition of the task: NLG architectures

A two-stage model: division of labor
[Figure label: GOAL]

Four components

Procedural know-how
• Planning (determine the order of the different steps; textual organisation)
• Searching (find the words; access)
• Reasoning and inferencing (« see » possible links between ideas)

[Figure: memory stores and their retention times; figure label: Rose]
• Sensory memory: 1 second
• STM: less than 30 seconds
• LTM: up to a lifetime

Basic Memory Processes

Number of choices (space + time constraints)
• We have to make a great number of choices under severe space and time constraints
• space constraint (limitation of STM)
• time constraint (speed)
   • speech is fast: 3-5 words / second
   • average number of decisions / word = 4
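The two figures on this slide combine into a rough decision rate. The following is a back-of-the-envelope sketch that uses only the slide's numbers (3-5 words per second, 4 decisions per word); the function name decisions_per_second is introduced here purely for illustration.

```python
def decisions_per_second(words_per_second, decisions_per_word=4):
    """Decision rate implied by a speech rate and an average number of decisions per word."""
    return words_per_second * decisions_per_word


for wps in (3, 5):
    dps = decisions_per_second(wps)
    # Average time budget available for a single decision, in milliseconds.
    print(f"{wps} words/s -> {dps} decisions/s, about {1000 / dps:.0f} ms per decision")
```

Under these figures a speaker makes on the order of 12 to 20 decisions per second, which leaves well under a tenth of a second for each one.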

Diversity of choices
• Conceptual choices
• Linguistic choices
• Pragmatic choices

The necessary information for synthesis is scattered all over
[Figure: GIVE with Subject = LISTENER (pronoun), Direct Object = BOOK, Indirect Object = SPEAKER (pronoun)]

How to express the notion of the speaker?
SPEAKER
• French: je, me, moi, nous
• English: I, we / us
What do the different forms depend upon?

Input: GIVE (Subj.: LISTENER, DO: BOOK, IO: SPEAKER)
Tu ME donnes le livre. (You give me the book.)
• Person: Tu me donnes le livre. (You give me the book.) vs. Tu lui donnes le livre. (You give him/her the book.)
• Number: Tu me donnes le livre. (You give me the book.) vs. Tu nous donnes le livre. (You give us the book.)

Input: GIVE (Subj.: LISTENER, DO: BOOK, IO: SPEAKER)
• Speech act: Tu me donnes le livre. (You give me the book.) vs. Donne-le-moi ! (Give it to me!)
• Tense: Tu me donnes le livre. (You give me the book.) vs. Tu m'as donné le livre. (You have given me the book.)
• Polarity: Ne me le donne pas ! (Don't give me this book!) vs. Donne-moi ce livre ! (Give me this book!)
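The French examples suggest that the form chosen for SPEAKER depends on several features at once. Below is a minimal sketch, assuming only four of them (number, syntactic function, affirmative imperative, and whether the next word starts with a vowel). The function speaker_pronoun, its parameters, and the simplified vowel test are illustrative assumptions, not a full account of French pronouns.

```python
# Simplified vowel test used for elision (an assumption; it ignores mute h, among other things).
VOWELS = "aeiouyàâéèêîôû"


def speaker_pronoun(number, function, affirmative_imperative=False, next_word=""):
    """Pick a French form for the concept SPEAKER from a few contextual features."""
    if number == "plural":
        return "nous"                      # Tu nous donnes le livre.
    if function == "subject":
        form = "je"
    elif affirmative_imperative:
        return "moi"                       # Donne-moi ce livre !
    else:
        form = "me"                        # Tu me donnes le livre. / Ne me le donne pas !
    # Elision before a vowel: me -> m' (Tu m'as donné le livre), je -> j'.
    if next_word and next_word[0].lower() in VOWELS:
        return form[0] + "'"
    return form


print(speaker_pronoun("singular", "object"))
print(speaker_pronoun("singular", "object", next_word="as"))
print(speaker_pronoun("singular", "object", affirmative_imperative=True))
print(speaker_pronoun("plural", "object"))
```

Even this small function needs information from different sources: number and person come from the conceptual input, the imperative from the speech act, and elision from the neighbouring word.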

Input: HELP (Agent: PAUL, Object: MARY), tense: present
• PRAGMATIC CHOICE: Paul = topic; Marie = given; aider = new
• LEXICALIZATION: HELP = aider; PAUL = Paul; MARY = Marie
• PART OF SPEECH: HELP = verb; Paul = noun; Mary = pronoun
• SYNT. FUNCT. & VOICE: voice = active; Paul = subject; Mary = direct object
• WORD ORDER: SUBJECT (noun), DIR. OBJECT (pronoun), VERB (verb)
• MORPHOLOGY: verb (3rd person, singular, present): aide; subject (noun): Paul; direct object (pronoun): la
• PHONOGRAPH. SYNTH. (output): Paul l'aide. (Paul helps her.)
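The decisions listed on this slide can be strung together in a toy generator. The sketch below is only an illustration under strong assumptions: a fixed processing order, a three-entry lexicon, regular -er verbs in the 3rd person singular present, and a hypothetical GENDER table. None of the names (LEXICON, GENDER, generate) come from the original talk, which argues precisely that real processing is less rigidly ordered.

```python
LEXICON = {"HELP": "aider", "PAUL": "Paul", "MARY": "Marie"}  # lexicalization, as on the slide
GENDER = {"PAUL": "m", "MARY": "f"}                           # assumption needed to pick le / la
VOWELS = "aeiouyàâéèêîôû"                                     # simplified test for elision


def conjugate_3sg_present(infinitive):
    """Morphology for regular -er verbs only: aider -> aide."""
    if not infinitive.endswith("er"):
        raise ValueError("this sketch only handles regular -er verbs")
    return infinitive[:-1]


def generate(event, given=frozenset()):
    """Active voice, agent as subject; a 'given' object is pronominalized."""
    subject = LEXICON[event["agent"]]
    verb = conjugate_3sg_present(LEXICON[event["pred"]])
    obj = event["object"]
    if obj not in given:
        # Full noun phrase: subject, verb, object.
        return f"{subject} {verb} {LEXICON[obj]}."
    # Pronoun object: the clitic precedes the verb (subject, object, verb).
    clitic = "la" if GENDER[obj] == "f" else "le"
    if verb[0].lower() in VOWELS:
        return f"{subject} l'{verb}."      # elision: la + aide -> l'aide
    return f"{subject} {clitic} {verb}."


event = {"pred": "HELP", "agent": "PAUL", "object": "MARY"}
print(generate(event, given={"MARY"}))
print(generate(event))
```

Note how the pragmatic choice (Marie = given) changes the part of speech, which changes word order, which in turn triggers elision at the surface.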

Consequences for languages, architecture & processing
• languages are and need to be flexible
• information does not become available in a strict order: it may vary on every occasion (EVENT-TIME-PLACE vs. PLACE-EVENT-TIME, etc.)
• Consequences (interaction and accommodation)
   • Data: accommodation of the different data structures (interaction between words and syntax) in the different modules (conceptual, lexical, syntactic)
   • Process: feedback to higher components

Example illustrating the consequences (i.e. functional dependencies) of the choices

Conceptual input

Let's consider the consequences of the following 2 choices
• Topicalisation: the concept to start the sentence with
• Lexical choice: synonyms

Topicalize Agent
Consequences:
• Agent --> Subject
• Voice --> active
• Patient --> Direct Object

Consequences of topicalisation

Topicalize Patient
Consequences:
• Agent --> PP
• Voice --> passive
• Patient --> grammatical Subject

Consequences of topicalisation

Summary of the consequences of the topicalization choice at the top level

           Strategy 1             Strategy 2
Topic      agent                  patient
Agent      grammatical subject    prepositional phrase
Patient    direct object          grammatical subject
Voice      active                 passive
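The summary table can be written down as a lookup from the topicalization choice to its consequences. The sketch below assumes English surface forms supplied by the caller; CONSEQUENCES, topicalize, and render are names made up for this illustration, and the example words (Paul, Mary, helps) are borrowed from an earlier slide rather than from the conceptual input used here.

```python
# Functional dependencies from the summary table: topic choice -> voice and grammatical roles.
CONSEQUENCES = {
    "agent": {"voice": "active", "agent": "grammatical subject", "patient": "direct object"},
    "patient": {"voice": "passive", "agent": "prepositional phrase", "patient": "grammatical subject"},
}


def topicalize(topic):
    """Return the consequences of starting the sentence with the given role."""
    if topic not in CONSEQUENCES:
        raise ValueError(f"unknown topic: {topic!r}")
    return CONSEQUENCES[topic]


def render(topic, agent, patient, active_verb, passive_verb):
    """Linearize a two-role event according to the chosen topic (verb forms are given, not computed)."""
    if topicalize(topic)["voice"] == "active":
        return f"{agent} {active_verb} {patient}."
    return f"{patient} {passive_verb} by {agent}."


print(render("agent", "Paul", "Mary", "helps", "is helped"))
print(render("patient", "Paul", "Mary", "helps", "is helped"))
```

A single early decision (which concept comes first) fixes voice and both grammatical functions, which is the kind of functional dependency the example slides illustrate.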

Assumptions - Conclusion
• No superexpert but a set of cooperative agents
• competition - accommodation
• no algorithmic processing but opportunistic planning
• various orders of processing
• various components need the same information
• system is heterarchical rather than hierarchical