- Number of slides: 54
Course 10: Natural Language Generation, a highly complex task both for people and for machines. Slides borrowed from Michael Zock, LIMSI-CNRS, Orsay, France
Some preliminary issues: Warning
- This is not a state-of-the-art talk. If you are interested in one, this could be a starting point: Bateman & Zock (2003), Natural Language Generation. In R. Mitkov (Ed.), Handbook of Computational Linguistics, Oxford University Press, pp. 284-304
- List of systems: http://www.fb10.unibremen.de/anglistik/langpro/NLG-tableroot.htm
- Anything related to NLG: http://www.siggen.org/
Some preliminary issues: Background material
- Willem Levelt, Speaking: From Intention to Articulation, MIT Press, 1989
- E. Reiter & R. Dale, Building Natural Language Generation Systems, Cambridge University Press, 2000
Overview of this talk
- Part 1: General problems (knowledge and constraints, architecture, process, etc.)
- Part 2: Deep generation (message planning; message ordering: text plan, outline)
- Part 3: Surface generation (lexical choice: access and synthesis; computation of syntactic structure)
Different ways to look at text generation
- NLG FOR people: fully automated generation of text
- NLG WITH people: semi-automated, machine-mediated generation (writer's workbench, foreign language learning)
- NLG LIKE people: simulation of psychological processes (connectionism, online processing, incremental generation)
What is NLG? Ask Google
French original: « Fort méconnue du grand public, la génération de textes demeure une discipline sportive essentiellement universitaire, pratiquée par d'obscurs chercheurs dans des laboratoires tristes et exigus. Cette discipline pousse ses malheureux adeptes à des pratiques honteuses : la génération par ordinateur interposé de textes longs et soporifiques à partir d'une composition sémantique produite mécaniquement. »
English translation: "Hardly known to the general public, text generation remains a sport practiced mainly in academia, by obscure researchers in sad and cramped laboratories. The discipline drives its unfortunate followers to shameful practices: the computer-mediated generation of long and boring texts from mechanically produced semantic representations."
What is NLG? In search of a definition
The focus and the definition may depend on the domain (psychology, linguistics, computer science):
- Mapping problem: translate meanings into linguistic form
- Linguistically mediated problem solving
- Language as a search problem
What is NL-Generation? (I) Generation as a mapping process
[Figure: input concepts (C1, C2, C3) mapped onto output words (W1, W3)]
NLG viewed as a process of mapping a conceptual structure (meaning) onto a linguistic form.
Catch me if you can
[Figure: conceptualization (C1, C2, C3, C4) running ahead of expression (W1, W2, W3)]
We tend to think faster than we can find the corresponding words and convert them into sounds.
There is no one-to-one mapping between linguistic structures and conceptual structures
The same conceptual structure may map onto different linguistic structures (synonyms, paraphrases)
Concept: possession
- This car belongs to the president. (verb)
- This is the car of the president. (preposition)
- This is the president's car. (genitive)
- This is his car. (possessive adjective)
The same linguistic structure may map onto different conceptual structures
Linguistic resource: genitive
- Peter's car is broken. (possession)
- Peter's brother is sick. (family relationship)
- Peter's leg hurts. (inalienable possession, part-of)
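The two slides above describe a many-to-many relation between concepts and linguistic devices. A minimal sketch of that relation as two lookup tables follows; the table and function names are hypothetical, and the entries only restate the examples from the slides.

```python
# Illustrative sketch only: the tables restate the slide examples,
# and all names (CONCEPT_TO_FORMS, devices_for, ...) are hypothetical.

# One concept, several linguistic devices.
CONCEPT_TO_FORMS = {
    "possession": [
        ("verb", "This car belongs to the president."),
        ("preposition", "This is the car of the president."),
        ("genitive", "This is the president's car."),
        ("possessive adjective", "This is his car."),
    ],
}

# One linguistic device, several concepts.
FORM_TO_CONCEPTS = {
    "genitive": [
        ("possession", "Peter's car is broken."),
        ("family relationship", "Peter's brother is sick."),
        ("part-of", "Peter's leg hurts."),
    ],
}


def devices_for(concept):
    """List the linguistic devices that can express a concept."""
    return [device for device, _ in CONCEPT_TO_FORMS.get(concept, [])]


def concepts_for(device):
    """List the concepts that a linguistic device can express."""
    return [concept for concept, _ in FORM_TO_CONCEPTS.get(device, [])]


print(devices_for("possession"))
print(concepts_for("genitive"))
```

Because the genitive appears in both tables, neither direction of the mapping can be resolved by simple lookup: the generator has to choose, and the analyzer has to disambiguate.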
NLG as language-mediated problem solving
A simple generation model
Nature of choices
- pragmatic
- conceptual
- linguistic
Pragmatic choices
- Languages are indirect means for achieving goals (mediating devices)
- Different linguistic means serve different discourse purposes, i.e. different forms are used to achieve different goals
Pragmatic choices: language as a resource
- active vs. passive voice [topic, perspective]
- main vs. subordinate clause [relative prominence]
Conceptual choices: different meanings generally yield different forms
- NUMBER: he sings vs. they sing
- TENSE: he sings vs. he sang
Linguistic choices: the same meaning can be expressed by different words or syntactic forms (synonyms, paraphrases)
- GROWN-UP MALE PERSON: man, guy, chap
- HELP: help, give a hand, assist
What is NL-Generation? Tentative definition (III): Generation as a search problem
Size of the mental lexicon: approx. 30,000 words
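To make the search view concrete, here is a small sketch that enumerates the candidate wordings for two concepts, reusing the synonym sets from the "Linguistic choices" slide. The `OPTIONS` table and the `candidate_choices` helper are hypothetical names introduced for illustration; a real lexicon of some 30,000 words makes the space far larger.

```python
from itertools import product

# Synonym sets taken from the "Linguistic choices" slide.
OPTIONS = {
    "GROWN-UP MALE PERSON": ["man", "guy", "chap"],
    "HELP": ["help", "give a hand", "assist"],
}


def candidate_choices(concepts):
    """Enumerate every combination of one wording per concept."""
    return list(product(*(OPTIONS[c] for c in concepts)))


combos = candidate_choices(["GROWN-UP MALE PERSON", "HELP"])
print(len(combos))  # 3 options * 3 options = 9 combinations
```

The number of combinations is the product of the option counts, so the space grows multiplicatively with every additional concept to be expressed.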
An abstract view
An example
Input: analysis
Input: synthesis
Different search spaces
Fundamental problems
- Analysis: ambiguity
- Generation: choice
Why bother about generation? (1) Different kinds of motivation
- Theoretical
- Practical
- Industrial
Theoretical reasons: building and testing a theory
- Testbed for a linguistic theory: coverage (over/undergeneration), correctness
- Testbed for a psychological model: simulation of cognitive processes (on-line processing, language learning)
Practical reasons (industrial, full automation)
- machine translation
- text generation (business letters)
- generation of summaries (stock market reports, weather forecasts, etc.)
- help systems (audit trail, access to databases)
- abstracting
Practical reasons (help systems, semi-automation)
- Computer-assisted language learning (tools)
- Writer's workbench (pre/post-editing: correction of grammar, style, spelling, text organization)
The decomposition of the task: NLG architectures
A two-stage model: division of labor [figure label: GOAL]
Four components
Procedural know-how
- Planning (determine the order of the different steps: textual organisation)
- Searching (find the words; access)
- Reasoning and inferencing ("see" possible links between ideas)
[Figure: memory stores. Sensory memory: 1 second; short-term memory (STM): less than 30 seconds; long-term memory (LTM): up to a lifetime. Example item: Rose]
Basic Memory Processes
Number of choices (space + time constraints)
- We have to make a great number of choices under severe space and time constraints
- Space constraint: limitation of STM
- Time constraint (speed): speech is fast, 3-5 words per second
- Average number of decisions per word: 4
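The figures on this slide imply a decision rate that is worth spelling out. The short calculation below uses only the slide's two numbers (3-5 words per second, 4 decisions per word); the variable and function names are hypothetical.

```python
# Figures from the slide; names are illustrative.
WORDS_PER_SECOND = (3, 5)   # slow and fast ends of normal speech
DECISIONS_PER_WORD = 4      # average given on the slide


def decisions_per_second(words_per_second, decisions_per_word=DECISIONS_PER_WORD):
    """Decisions a speaker must take each second at a given speech rate."""
    return words_per_second * decisions_per_word


low, high = (decisions_per_second(rate) for rate in WORDS_PER_SECOND)
print(low, high)  # 12 20
```

In other words, a fluent speaker takes roughly 12 to 20 decisions per second under these assumptions.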
Diversity of choices
- Conceptual choices
- Linguistic choices
- Pragmatic choices
The necessary information for synthesis is scattered all over
[Figure: GIVE with Subject: LISTENER (pronoun), Direct Object: BOOK, Indirect Object: SPEAKER (pronoun)]
How to express the notion of the speaker?
- SPEAKER in French: je, me, moi, nous
- SPEAKER in English: I, me, we, us
What do the different forms depend upon?
Input: GIVE (Subj.: LISTENER, DO: BOOK, IO: SPEAKER)
- Person: Tu me donnes le livre. (You give me the book.) vs. Tu lui donnes le livre. (You give him/her the book.)
- Number: Tu me donnes le livre. (You give me the book.) vs. Tu nous donnes le livre. (You give us the book.)
Input: GIVE (Subj.: LISTENER, DO: BOOK, IO: SPEAKER)
- Speech act: Tu me donnes le livre. (You give me the book.) vs. Donne-le-moi ! (Give it to me!)
- Tense: Tu me donnes le livre. (You give me the book.) vs. Tu m'as donné le livre. (You have given me the book.)
- Polarity: Donne-moi ce livre ! (Give me this book!) vs. Ne me le donne pas ! (Don't give it to me!)
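The contrasts above suggest that the French form for SPEAKER depends on several features at once. The following is a simplified sketch of that dependency, not a full account of French clitics: the function name, its parameters and the feature values are assumptions made for illustration, and only the forms attested on the slides (plus the elided j') are covered.

```python
def speaker_pronoun(number, function, affirmative_imperative=False, before_vowel=False):
    """Pick a French form for SPEAKER (hypothetical, simplified helper).

    number: "singular" or "plural"
    function: "subject" or any object function
    affirmative_imperative: True for commands such as "Donne-moi ce livre !"
    before_vowel: True when the next word starts with a vowel sound
    """
    if number == "plural":
        return "nous"                       # same form for subject and object
    if function == "subject":
        return "j'" if before_vowel else "je"
    if affirmative_imperative:
        return "moi"                        # Donne-moi ce livre !
    return "m'" if before_vowel else "me"   # Tu me donnes / Tu m'as donné


print(speaker_pronoun("singular", "indirect_object"))
print(speaker_pronoun("singular", "indirect_object", affirmative_imperative=True))
print(speaker_pronoun("plural", "indirect_object"))
```

Note that the negative imperative (Ne me le donne pas !) falls under the default branch, which is why polarity as well as speech act matters for the choice.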
Input: HELP (Agent: PAUL, Object: MARY), tense: present
- PRAGMATIC CHOICE: Paul = topic, Marie = given, aider = new
- SYNTACTIC FUNCTION & VOICE: voice = active, Paul = subject, Mary = direct object
- LEXICALIZATION: HELP = aider, PAUL = Paul, MARY = Marie
- PART OF SPEECH: HELP = verb, Paul = noun, Mary = pronoun
- MORPHOLOGY: verb: 3rd person, singular, present (aide); subject: noun (Paul); direct object: pronoun (la)
- WORD ORDER: SUBJECT (noun), DIRECT OBJECT (pronoun), VERB (verb)
- PHONOGRAPH. SYNTH.: Paul l'aide.
Output: Paul l'aide. (Paul helps her.)
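The steps on this slide can be sketched as a toy pipeline. This is a deliberately narrow illustration under stated assumptions: it handles only an active sentence with the agent as topic, regular -er verbs in the third person singular present, and a straight apostrophe for elision. The names `LEXICON`, `conjugate_present_3sg` and `generate` are hypothetical.

```python
VOWELS = "aeiouyéèêàâîôû"

# Lexicalization table from the slide.
LEXICON = {"HELP": "aider", "PAUL": "Paul", "MARY": "Marie"}


def conjugate_present_3sg(infinitive):
    """Morphology: 3rd person singular present of a regular -er verb."""
    if not infinitive.endswith("er"):
        raise ValueError("only regular -er verbs are handled in this sketch")
    return infinitive[:-2] + "e"


def generate(agent, predicate, patient, patient_is_given=True, patient_gender="f"):
    """Agent is the topic, so: active voice, agent = subject, patient = direct object."""
    subject = LEXICON[agent]                           # lexicalization, noun
    verb = conjugate_present_3sg(LEXICON[predicate])   # morphology
    if not patient_is_given:
        # New information is expressed as a full noun after the verb.
        return f"{subject} {verb} {LEXICON[patient]}."
    pronoun = "la" if patient_gender == "f" else "le"  # given -> pronoun
    # Word order: the object pronoun precedes the verb;
    # final synthesis step: elision before a vowel (la + aide -> l'aide).
    if verb[0].lower() in VOWELS:
        return f"{subject} l'{verb}."
    return f"{subject} {pronoun} {verb}."


print(generate("PAUL", "HELP", "MARY"))  # Paul l'aide.
```

Even in this toy version the choices are interdependent: treating Mary as given forces a pronoun, which changes the word order, which in turn triggers the elision.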
Consequences for languages, architecture & processing
- Languages are and need to be flexible
- Information does not become available in a strict order; it may vary on every occasion (EVENT-TIME-PLACE vs. PLACE-EVENT-TIME, etc.)
- Consequences (interaction and accommodation):
  - Data: accommodation of the different data structures (interaction between words and syntax) in the different modules (conceptual, lexical, syntactic)
  - Process: feedback to higher components
Example illustrating the consequences (i.e. functional dependencies) of the choices
Conceptual input
Let's consider the consequences of the following two choices
- Topicalisation: the concept to start the sentence with
- Lexical choice: synonyms
Topicalize Agent: consequences
- Agent --> Subject
- Voice --> active
- Patient --> Direct Object
Consequences of topicalisation
Topicalize Patient: consequences
- Agent --> PP
- Voice --> passive
- Patient --> grammatical Subject
Consequences of topicalisation
Summary of the consequences of the topicalization choice at the top level

          Strategy 1            Strategy 2
Topic     agent                 patient
Agent     grammatical subject   prepositional phrase
Patient   direct object         grammatical subject
Voice     active                passive
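The summary table can be read as a function from the topic choice to its grammatical consequences. A minimal sketch follows; the function name and the dictionary keys are hypothetical, while the values come straight from the table.

```python
def topicalization_consequences(topic):
    """Return voice and grammatical functions implied by the topic choice."""
    if topic == "agent":       # Strategy 1
        return {
            "voice": "active",
            "agent": "grammatical subject",
            "patient": "direct object",
        }
    if topic == "patient":     # Strategy 2
        return {
            "voice": "passive",
            "agent": "prepositional phrase",
            "patient": "grammatical subject",
        }
    raise ValueError(f"unsupported topic: {topic!r}")


print(topicalization_consequences("patient")["voice"])  # passive
```

A single high-level decision thus fixes three lower-level ones, which is the functional dependency the preceding slides illustrate.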
Assumptions and conclusion
- No super-expert but a set of cooperative agents
- Competition and accommodation
- No algorithmic processing but opportunistic planning
- Various orders of processing
- Various components need the same information
- The system is heterarchical rather than hierarchical


