  • Number of slides: 50

Learning to Sportscast: A Test of Grounded Language Acquisition
David Chen & Raymond Mooney
Department of Computer Sciences, University of Texas at Austin

Motivation
• Constructing annotated corpora for language learning is difficult
• Children acquire language through exposure to linguistic input in the context of a rich, relevant, perceptual environment

Goals
• Learn to ground the semantics of language
• Learn language through correlated linguistic and visual inputs

Challenge
• A linguistic input may correspond to many possible events

Overview
• Sportscasting task
• Tactical generation
• Strategic generation
• Human evaluation

Learning to Sportscast
• Robocup Simulation League games
• No speech recognition
  – Record commentaries in text form
• No computer vision
  – Rule-based system to automatically extract game events in symbolic form
• Concentrate on linguistic issues

Robocup Simulation League
Example commentary: “Pink 4’s pass was intercepted by Purple 6”

Learning to Sportscast
• Learn to sportscast by observing sample human sportscasts
• Build a function that maps between natural language (NL) and meaning representation (MR)
  – NL: Textual commentaries about the game
  – MR: Predicate logic formulas that represent events in the game

Mapping between NL/MR
NL: “Purple 3 passes the ball to Purple 5”
Semantic Parsing (NL → MR)
Tactical Generation (MR → NL)
MR: Pass(Purple 3, Purple 5)
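As a rough illustration of the MR side of this mapping, the sketch below stores an MR as a predicate name plus its arguments and reads it from a string such as Pass ( Purple 3, Purple 5 ). The names MR and parse_mr are hypothetical, introduced here for illustration only; they are not part of the authors' system.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MR:
    """A meaning representation: a predicate applied to zero or more constants."""
    predicate: str
    args: tuple = ()

    def __str__(self):
        # Events without arguments (e.g. ballstopped) are printed bare.
        if not self.args:
            return self.predicate
        return f"{self.predicate}({', '.join(self.args)})"


def parse_mr(text):
    """Parse 'Pass ( Purple 3, Purple 5 )' or 'ballstopped' into an MR (hypothetical helper)."""
    match = re.match(r"^\s*(\w+)\s*(?:\((.*)\))?\s*$", text)
    if match is None:
        raise ValueError(f"not a valid MR: {text!r}")
    predicate, arg_text = match.group(1), match.group(2)
    args = tuple(a.strip() for a in (arg_text or "").split(",") if a.strip())
    return MR(predicate, args)


mr = parse_mr("Pass ( Purple 3, Purple 5 )")
print(mr)  # Pass(Purple 3, Purple 5)
```

A semantic parser would produce such an object from a sentence, and a tactical generator would go the other way; both directions are learned in the system described here, whereas this helper only handles the MR notation itself.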

Robocup Sportscaster Trace
Natural Language Commentary:
• Purple goalie turns the ball over to Pink 8
• Purple team is very sloppy today
• Pink 8 passes the ball to Pink 11
• Pink 11 looks around for a teammate
• Pink 11 makes a long pass to Pink 8
• Pink 8 passes back to Pink 11
Meaning Representation:
• badPass(Purple 1, Pink 8)
• turnover(Purple 1, Pink 8)
• kick(Pink 8)
• pass(Pink 8, Pink 11)
• kick(Pink 11)
• ballstopped
• kick(Pink 11)
• pass(Pink 11, Pink 8)
• kick(Pink 8)
• pass(Pink 8, Pink 11)

Robocup Sportscaster Trace (same commentary, with predicate and constant names replaced by opaque symbols)
Meaning Representation:
• P6(C1, C19)
• P5(C1, C19)
• P1(C19)
• P2(C19, C22)
• P1(C22)
• P0
• P1(C22)
• P2(C22, C19)
• P1(C19)
• P2(C19, C22)

Robocup Data
• Collected human textual commentary for the 4 Robocup championship games from 2001-2004.
  – Avg # events/game = 2,613
  – Avg # sentences/game = 509
• Each sentence matched to all events within previous 5 seconds.
  – Avg # MRs/sentence = 2.5 (min 1, max 12)
• Manually annotated with correct matchings of sentences to MRs (for evaluation purposes only).
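The 5-second matching rule can be sketched as follows. This is a minimal illustration, assuming that comments and events carry timestamps in seconds and that both window boundaries are inclusive; the function name, the timestamps, and the sample data are hypothetical.

```python
def build_ambiguous_pairs(comments, events, window=5.0):
    """Pair each sentence with every event seen in the `window` seconds before it.

    comments: list of (timestamp, sentence)
    events:   list of (timestamp, mr_string)
    Returns a list of (sentence, [candidate MRs]); this is the ambiguous training data.
    """
    pairs = []
    for t_comment, sentence in comments:
        candidates = [mr for t_event, mr in events
                      if t_comment - window <= t_event <= t_comment]
        pairs.append((sentence, candidates))
    return pairs


# Hypothetical timestamps, for illustration only.
events = [
    (1.0, "kick(Pink 8)"),
    (3.0, "pass(Pink 8, Pink 11)"),
    (9.5, "kick(Pink 11)"),
]
comments = [
    (4.0, "Pink 8 passes the ball to Pink 11"),
    (10.0, "Pink 11 looks around for a teammate"),
]

for sentence, candidates in build_ambiguous_pairs(comments, events):
    print(sentence, "->", candidates)
```

The first sentence ends up with two candidate MRs and the second with one, which is the kind of ambiguity the learner has to resolve without being told which candidate, if any, is correct.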

Tactical Generation
• Learn how to generate NL from MR
• Example: Pass(Pink 2, Pink 3) → “Pink 2 kicks the ball to Pink 3”
• Two steps
  1. Disambiguate the training data
  2. Learn a language generator

System Overview
Sportscaster commentary:
• Purple 7 loses the ball to Pink 2
• Pink 2 kicks the ball to Pink 5
• Pink 5 makes a long pass to Pink 8
• Pink 8 shoots the ball
Ambiguous Training Data (events from the Robocup Simulator):
• Pass(purple 5, purple 7)
• Turnover(purple 7, pink 2)
• Kick(pink 2)
• Pass(pink 2, pink 5)
• Kick(pink 5)
• Pass(pink 5, pink 8)
• Ballstopped
• Kick(pink 8)
Pipeline: Ambiguous Training Data → Semantic Parser Learner → Initial Semantic Parser → Unambiguous Training Data → Semantic Parser Learner → Semantic Parser (repeated)
Unambiguous Training Data across the build steps:
• From the Initial Semantic Parser: Kick(pink 2), Pass(pink 2, pink 5), Kick(pink 8)
• After retraining: Turnover(purple 7, pink 2), Pass(pink 2, pink 5), Kick(pink 8)
• After further retraining: Turnover(purple 7, pink 2), Pass(pink 2, pink 5), Pass(pink 5, pink 8), Kick(pink 8)
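The retraining loop shown in the System Overview slides can be sketched roughly as below. This is a toy illustration under strong assumptions: the "parser" is a simple word/predicate co-occurrence counter standing in for a real semantic parser learner (it is not WASP or KRISP), and the assignment of candidate MRs to sentences in the sample data is hypothetical.

```python
from collections import Counter, defaultdict


def predicate_of(mr):
    return mr.split("(")[0].strip().lower()


def train(pairs):
    """Toy stand-in for a parser learner: count word / MR-predicate co-occurrences."""
    counts = defaultdict(Counter)
    for sentence, mr in pairs:
        for word in sentence.lower().split():
            counts[word][predicate_of(mr)] += 1
    return counts


def score(model, sentence, mr):
    """Sum over words of the share of that word's counts belonging to the MR's predicate."""
    total = 0.0
    for word in sentence.lower().split():
        word_counts = model.get(word)
        if word_counts:
            total += word_counts[predicate_of(mr)] / sum(word_counts.values())
    return total


def disambiguate(examples, iterations=3):
    # Start from the ambiguous data: every sentence paired with every candidate MR.
    pairs = [(s, mr) for s, candidates in examples for mr in candidates]
    for _ in range(iterations):
        model = train(pairs)
        # Keep the best-scoring candidate per sentence, then retrain on those pairs.
        pairs = [(s, max(candidates, key=lambda mr: score(model, s, mr)))
                 for s, candidates in examples]
    return pairs


# Hypothetical candidate sets; the slides do not say which events fall in which window.
examples = [
    ("Purple 7 loses the ball to Pink 2",
     ["Pass(purple 5, purple 7)", "Turnover(purple 7, pink 2)", "Kick(pink 2)"]),
    ("Pink 2 kicks the ball to Pink 5",
     ["Kick(pink 2)", "Pass(pink 2, pink 5)", "Kick(pink 5)"]),
    ("Pink 5 makes a long pass to Pink 8",
     ["Kick(pink 5)", "Pass(pink 5, pink 8)", "Ballstopped"]),
    ("Pink 8 shoots the ball",
     ["Ballstopped", "Kick(pink 8)"]),
]

for sentence, mr in disambiguate(examples):
    print(sentence, "->", mr)
```

With only four sentences the toy scorer has too little evidence to choose reliably; the point is the control flow: train on the current pairing, re-select one MR per sentence, and repeat.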

Semantic Parser Learners
• Learn a function from NL to MR
  NL: “Purple 3 passes the ball to Purple 5”
  Semantic Parsing (NL → MR)
  Tactical Generation (MR → NL)
  MR: Pass(Purple 3, Purple 5)
• We experiment with two semantic parser learners
  – WASP (Wong & Mooney, 2006; 2007)
  – KRISP (Kate & Mooney, 2006)

WASP: Word Alignment-based Semantic Parsing
• Uses statistical machine translation techniques
  – Synchronous context-free grammars (SCFG) (Wu, 1997; Melamed, 2004; Chiang, 2005)
  – Word alignments (Brown et al., 1993; Och & Ney, 2003)
• Capable of both semantic parsing and tactical generation

KRISP: Kernel-based Robust Interpretation by Semantic Parsing
• Productions of the MR language are treated like semantic concepts
• An SVM classifier is trained for each production with a string subsequence kernel
• These classifiers are used to compositionally build MRs of the sentences
• More resistant to noisy supervision but incapable of tactical generation

Matching
• Ability to find correct NL/MR pair
• 4 Robocup championship games from 2001-2004.
  – Avg # events/game = 2,613
  – Avg # sentences/game = 509
• Leave-one-game-out cross-validation
• Metric:
  – Precision: % of system’s annotations that are correct
  – Recall: % of gold-standard annotations correctly produced
  – F-measure: Harmonic mean of precision and recall
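The three metrics can be computed as in the following sketch, assuming that both the system output and the gold standard are given as sets of (sentence id, MR) pairs. The function name and the sample annotations are hypothetical.

```python
def precision_recall_f(system, gold):
    """Score system NL/MR matchings against gold-standard matchings.

    Both arguments are collections of (sentence_id, mr_string) pairs.
    """
    system, gold = set(system), set(gold)
    correct = len(system & gold)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical annotations, for illustration only.
system = {(1, "pass(Pink 8, Pink 11)"), (2, "kick(Pink 11)"), (3, "pass(Pink 11, Pink 8)")}
gold = {(1, "pass(Pink 8, Pink 11)"), (2, "ballstopped"),
        (3, "pass(Pink 11, Pink 8)"), (4, "pass(Pink 8, Pink 11)")}

p, r, f = precision_recall_f(system, gold)
print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Here two of the three system annotations are correct (precision 2/3) and two of the four gold annotations are recovered (recall 1/2), giving an F-measure of 4/7.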

Systems
System                           Learner
KRISPER (Kate & Mooney, 2007)    KRISP
WASPER-GEN                       WASP’s language generator

KRISPER and WASPER
[Diagram: the same pipeline and example data as in System Overview, with the Semantic Parser Learner instantiated as KRISP or WASP]

WASPER-GEN
[Diagram: the same pipeline and example data as in System Overview, with a Tactical Generator and a Tactical Generator Learner (WASP) in place of the semantic parser and its learner]

Matching Results

Strategic Generation
• Generation requires not only knowing how to say something (tactical generation) but also what to say (strategic generation).
• For automated sportscasting, one must be able to effectively choose which events to describe.

Example of Strategic Generation
• pass(purple 7, purple 6)
• ballstopped
• kick(purple 6)
• pass(purple 6, purple 2)
• ballstopped
• kick(purple 2)
• pass(purple 2, purple 3)
• kick(purple 3)
• badPass(purple 3, pink 9)
• turnover(purple 3, pink 9)

Strategic Generation
• For each event type (e.g. pass, kick) estimate the probability that it is described by the sportscaster.
• Requires correct NL/MR matching
  – Use estimated matching from tactical generation
  – Iterative Generation Strategy Learning
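The first option, estimating per-event-type probabilities from an NL/MR matching, can be sketched as a simple ratio of counts. This is a minimal illustration, not the IGSL algorithm; the function names and the choice of which events count as commented are hypothetical.

```python
from collections import Counter


def event_type(mr):
    """'pass(purple 7, purple 6)' -> 'pass'; 'ballstopped' -> 'ballstopped'."""
    return mr.split("(")[0].strip()


def comment_probabilities(events, commented):
    """Estimate P(described | event type) from an NL/MR matching.

    events:    list of MR strings in game order
    commented: indices of events that the matching links to some sentence
    """
    totals = Counter(event_type(mr) for mr in events)
    described = Counter(event_type(events[i]) for i in commented)
    return {etype: described[etype] / totals[etype] for etype in totals}


events = [
    "pass(purple 7, purple 6)", "ballstopped", "kick(purple 6)",
    "pass(purple 6, purple 2)", "ballstopped", "kick(purple 2)",
    "pass(purple 2, purple 3)", "kick(purple 3)",
    "badPass(purple 3, pink 9)", "turnover(purple 3, pink 9)",
]
commented = {0, 6, 8}  # hypothetical matching, for illustration only

for etype, prob in comment_probabilities(events, commented).items():
    print(f"{etype}: {prob:.2f}")
```

At generation time, such probabilities could be used to decide which events are worth describing, for example by commenting only on event types above a chosen threshold.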

Iterative Generation Strategy Learning (IGSL)
• Directly estimates the likelihood of an event being commented on
• Self-training iterations to improve estimates
• Uses events not associated with any NL as negative evidence

Strategic Generation Performance
• Evaluate how well the system can predict which events a human comments on
• Metric:
  – Precision: % of system’s annotations that are correct
  – Recall: % of gold-standard annotations correctly produced
  – F-measure: Harmonic mean of precision and recall

Strategic Generation Results

Human Evaluation (Quasi Turing Test)
• 4 fluent English speakers as judges
• 8 commented game clips
  – 2-minute clips randomly selected from each of the 4 games
  – Each clip commented once by a human, and once by the machine
• Presented in random counter-balanced order
• Judges were not told which ones were human or machine generated

Demo Clip
• Game clip commentated using WASPER-GEN with IGSL, since this gave the best results for generation.
• FreeTTS was used to synthesize speech from textual output.

Human Evaluation
Score  English Fluency  Semantic Correctness  Sportscasting Ability
5      Flawless         Always                Excellent
4      Good             Usually               Good
3      Non-native       Sometimes             Average
2      Disfluent        Rarely                Bad
1      Gibberish        Never                 Terrible

Commentator  English Fluency  Semantic Correctness  Sportscasting Ability
Human        3.94             4.25                  3.63
Machine      3.44             3.56                  2.94
Difference   0.5              0.69                  0.69

Future Work
• Expand MRs beyond simple logic formulas
• Apply approach to learning situated language in a computer video-game environment (Gorniak & Roy, 2005)
• Apply approach to captioned images or video using computer vision to extract objects, relations, and events from real perceptual data (Fleischman & Roy, 2007)

Conclusion
• Current language learning work uses expensive, unrealistic training data.
• We have developed a language learning system that can learn from language paired with an ambiguous perceptual environment.
• We have evaluated it on the task of learning to sportscast simulated Robocup games.
• The system learns to sportscast almost as well as humans.