  • Number of slides: 40

Speech, Natural Language, & Affect in Tutorial Dialogue Systems
Prof. Diane J. Litman
Computer Science Department, Intelligent Systems Program, & Learning Research and Development Center
http://www.cs.pitt.edu/~litman

A few words about me…
 • Currently
   • Professor in CS and ISP
   • Research Scientist at LRDC
   • ITSPOKE research group (3 Ph.D. students, 1 CS ugrad, 2 postdocs, 1 programmer)
   • AI Research (speech and natural language, intelligent tutoring)
     • Discourse and dialogue
     • Prosody, spoken dialogue systems
     • Speech and language technology for education (take my spring seminar!)
     • Reinforcement learning, user simulation
     • Affective computing
     • AI and education
     • Cognitive science
 • Previously
   • Member Technical Staff, AT&T Labs Research, NJ
   • Assistant Professor, CS at Columbia University, NY
   • AI Research (speech and NLP, knowledge representation and reasoning, plan recognition)

Speech-based Computer Tutors
 • What are they?
 • Example
   • Tutor: Well, if an object has non-zero constant velocity, is it moving or staying still?
   • Student: Moving
   • Tutor: Yep. If it’s moving, then its position is changing. So then what will happen to the packet’s horizontal displacement from the point of its release?
   • Student: It will change
 • Intersection of two fields:
   • Intelligent Tutoring Systems (ITS)
   • Spoken Dialogue Systems (SDS)

Intelligent Tutoring Systems (ITS)
 • Education
   • Classroom instruction [most frequent form]
   • Human (one-on-one) tutoring [most effective form]
 • Computer tutors – Intelligent Tutoring Systems
   • Not as good as human tutors
   • Ways to address the performance gap
     • Language technologies
       • Text-based dialogue
       • Talking heads
       • Speech-based dialogue: react to "how" in addition to "what"
     • Affective computing

Adding speech to ITS
 • Spoken Dialogue Systems (SDS)
   • Systems that interact with users via speech
 • Advantages
   • Naturalness
   • Efficiency
   • Eyes- and hands-free
 • Domains
   • Information access [Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000]
   • Tutoring [Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006]
   • Assistants [Allen et al., 2001; Rayner et al., 2005; Acomb et al., 2007]

Challenges in ITS
 • What does it mean to teach a subject?
 • What to teach?
 • Designing instruction
 • Delivering instruction
 • Understanding the human learning process

Challenges in SDS
 • Automated speech recognition (ASR)
   • Sphinx, Microsoft Speech, Dragon NaturallySpeaking
 • Natural language understanding (NLU)
 • Dialogue Management (DM)
   • How to keep the conversation going? Best strategy?
   • How to detect errors in communication?
   • How to recover from errors?
 • Spoken language generation

Outline
 • ITSPOKE
 • Main research tools & projects
   • Comparing systems
   • Modeling learning
   • Interactions between phenomena
   • Other projects

How to do research in speech-based computer tutors
 • ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) [Litman and Silliman, 2004]
   • Speech-enabled version of the Why2-Atlas computer tutor [VanLehn, Jordan, Rosé et al., 2002]
   • Domain: Qualitative physics
 • Sample ITSPOKE problem
   • Suppose a man is in a free-falling elevator and is holding his keys motionless right in front of his face. He then lets go. What will be the position of the keys relative to the man's face as time passes? Explain.

 • Back-end is Why2-Atlas system [VanLehn, Jordan, Rosé et al., 2002]
 • Sphinx2 speech recognition and Cepstral text-to-speech

Human-Computer Excerpt
Tutor 26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student 27: dammit (ASR: it is)
Tutor 28: Could you please repeat that?
Student 29: same (ASR: i same)
Tutor 30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student 31: zero (ASR: the zero)
Tutor 32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
Student 33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor 34: Fine. Are there any other forces acting on the apple as it falls?
Student 35: no why are you doing this again (ASR: no y and to it yes)
Tutor 36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student 37: downward you computer (ASR: downward you computer)
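The gap between what the student said and what the recognizer heard in the excerpt above can be quantified with word error rate (WER). The slides do not say how ITSPOKE's recognition performance was scored, so the sketch below is only an illustration under that assumption: it computes the standard word-level edit distance divided by the number of reference words, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: assumes whitespace tokenization and case-insensitive
    matching, which may differ from how the original study scored ASR output.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Turns taken from the excerpt: (what the student said, what the ASR returned)
turns = [
    ("same", "i same"),
    ("zero", "the zero"),
    ("downward you computer", "downward you computer"),
]
for said, heard in turns:
    print(f"{said!r} -> {heard!r}: WER = {word_error_rate(said, heard):.2f}")
```

Note that WER can exceed 1.0 when the hypothesis contains more errors than the reference has words, as happens with one-word student answers.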

How ITSPOKE/WHY works
 • Simplified conversation structure
   • Question-answer format
   • Tutoring information authored in a hierarchical structure - KCDs [VanLehn, Jordan, Rosé et al., 2002]
[Figure: Problem, Essay, Dialogue with ITSPOKE (Q1, Q2, Q3)]

ITSPOKE behavior
[Figure: essay submission & analysis; questions Q1 to Q5; remediation subdialogue]

Sample KCD (Knowledge Construction Dialogue)

Outline
 • ITSPOKE
 • Main research tools & projects
   • Comparing systems
   • Modeling learning
   • Interactions between phenomena
   • Other projects

Comparing systems
 • Metrics
 • Subjective metrics
   • Questionnaire at the end – agreement with statements like:
     • “It was easy to learn from the tutor”
     • “I enjoyed working with the tutor”
     • “It was easy to lose track of where I was in the conversation”
   • Problems
     • Unreliable
     • Need for standardization (psychometrics)

Comparing systems (2)
 • Objective metrics
   • Learning (gain)
   • Time spent with the computer tutor
   • Correctness of student answers
   • Number of help/hint requests
   • Speech recognition performance
[Figure: PreTest, PostTest, Learning]
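Learning gain is derived from the pretest and posttest scores. The slide does not state which formula was used, so the sketch below shows two common choices as assumptions: the raw difference, and a normalized gain that divides by the room left for improvement. Scores are assumed to be fractions in [0, 1], and the function names are hypothetical.

```python
def raw_gain(pretest: float, posttest: float) -> float:
    """Raw learning gain: posttest score minus pretest score."""
    return posttest - pretest


def normalized_gain(pretest: float, posttest: float) -> float:
    """Gain relative to the maximum possible improvement.

    Assumes both scores are fractions in [0, 1]. A student who already
    scored 1.0 on the pretest has no room to improve, so 0.0 is returned.
    """
    if not (0.0 <= pretest <= 1.0 and 0.0 <= posttest <= 1.0):
        raise ValueError("scores must be fractions between 0 and 1")
    if pretest == 1.0:
        return 0.0
    return (posttest - pretest) / (1.0 - pretest)


# Hypothetical student: 50% on the pretest, 75% on the posttest.
print(raw_gain(0.5, 0.75))
print(normalized_gain(0.5, 0.75))
```

Normalized gain makes students with different starting points more comparable, at the cost of being unstable when the pretest score is close to the ceiling.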

Comparing systems - projects
 • Speech vs. typed tutoring [Litman, Rosé, Forbes-Riley et al., 2006]
   • Human-speech vs. Human-typed
     • Better posttest scores
     • Shorter instruction time
   • Computer-speech (ITSPOKE) vs. Computer-typed (WHY)
     • No improvement in learning
     • No degradation due to ASR
 • Prerecorded vs. Synthesized [Forbes-Riley et al., 2006]
   • No degradation due to synthesized voice
   • High ASR errors -> prefer synthesized more
 • Navigation Map ITSPOKE vs. regular ITSPOKE [Rotaru and Litman, 2007; Rotaru, Ph.D. thesis]
 • Affective ITSPOKE vs. regular ITSPOKE [Forbes-Riley, Litman et al., 2008a, 2008b, in progress]

The Navigation Map
 • Outline of the conversation
   • Theoretical concept: discourse structure
 • Hypotheses
   • Easier for users to follow the conversation
   • Preferred / learn more

Discourse segment hierarchy

Discourse segments
TUTOR 1: To analyze this problem we will first describe the motion of the person and his keys while he is holding them. Then we will look at the motion of the person and his keys after he lets go of them. Let's begin by looking at the motion of the man and his keys while he is holding them. How does his velocity compare to that of his keys?
USER 1: it’s the same
TUTOR 2: Yes. So while he is holding his keys they both have the same velocity. Now let's see what happens when he releases them. To do so we will first find the individual forces exerted on the man and his keys, then both net forces, both accelerations, and finally we will describe and compare the velocities. So what are the forces exerted on the man after he releases his keys? Please, specify their directions (e.g., vertically up).
USER 2: vertically down gravity
TUTOR 3: Good. Because there is just one force, gravity's, it is trivial to determine the NET force (i.e., the vector sum of all forces). So what is the direction of the NET force on the person?
...

Intention/purpose structure
 • Solution walkthrough
   • Two time frames: before release, after release
   • Before release
     • Man’s velocity ? keys’ velocity
   • After release
     • Recipe: Forces, Net force, Acceleration, Velocity
     • Man: Forces/acceleration
       • Forces on the man
       • Net force on the man
       • ………….

Experiment design
 • Within-subjects design
   • 1 problem with the NM; 1 without the NM (noNM)
 • Rate tutor after each problem
   • 16 questions, 1 (Strongly Disagree) – 5 (Strongly Agree) scale
 • Two conditions (to account for order and problem)
   • F (First): 1st problem NM; 2nd problem noNM
   • S (Second): 1st problem noNM; 2nd problem NM
[Figure: Experiment procedure. Read; Pretest; Problem 1 (F condition: NM, S condition: noNM); Questionnaire; Problem 2 (F condition: noNM, S condition: NM); Questionnaire; Posttest; NM Survey; Interview. Differences due to NM.]

Experiment design (2)
 • ITSPOKE dialogue history was disabled
   • Compare Audio-Only versus Audio+Visual (NM)
[Figure: NM and noNM interfaces]

Results – subjective metrics
 • NM trend/significant effects on system perception during the dialogue:
[Figure: ratings on a scale from 1 (Strongly Disagree) to 5 (Strongly Agree)]

Outline
 • ITSPOKE
 • Main research tools & projects
   • Comparing systems
   • Modeling learning
   • Interactions between phenomena
   • Other projects

Modeling learning
 • Problem: What contributes to/causes learning?
 • Correlations with learning
   • Events that significantly correlate with learning
   • Correlation does not imply causality, but it is a requirement for it
   • What events to measure?
[Figure: PreTest; … correctness … time spent; PostTest; Learning]
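One way to test whether a dialogue event correlates with learning is to correlate a per-student event count with the posttest score while controlling for the pretest score. The slide does not specify the exact procedure, so the sketch below is an assumption: it uses Pearson correlation and a first-order partial correlation, with hypothetical function names, and leaves out the significance test.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after removing the linear effect of zs.

    Here xs would be an event count per student, ys the posttest score,
    and zs the pretest score (an assumed setup, not taken from the slides).
    """
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("partial correlation is undefined when |r| = 1 with zs")
    return (r_xy - r_xz * r_yz) / denominator
```

A significant partial correlation only flags an event as a candidate; as the slide notes, it does not establish that the event causes learning.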

What events?
 • Time on task (+), number of student words (+)
 • Student emotions [Forbes-Riley, Rotaru and Litman, 2008] [Litman, Rosé, Forbes-Riley et al., 2006] [Forbes-Riley, Rotaru and Litman, 2008]
   • Neutral on certainty (-)
   • Neutral on frustration (-)
 • Type of turns – on human-human [Forbes-Riley et al., 2005]
   • Student: introduce new concept (+)
   • Tutor: control dialogue (-)
 • Discourse structure inspired parameters [Rotaru and Litman, 2006]
 • Computational implications?

Intuition 1 – Conditioning
 • It is more important to be correct at specific “places in the dialogue”.
 • Phenomena related to performance:
   • are not uniformly important across the dialogue
   • have more weight at specific places in the dialogue
 • Discourse structure can be used to define “places in the dialogue”
[Figure: a sequence of dialogue turns marked Correct or Incorrect, with the question "Student learned?"]

Intuition 1 - Results
 • Transition – correctness parameters
   • PopUp–Correct, PopUp–Incorrect
 • Interpretation:
   • Capture successful learning events or failed learning opportunities
   • Generalizes across corpora
   • ITSPOKE modification: engage in an additional remediation dialogue
[Figure: questions Q1, Q2.1, Q2.2, Q3 annotated with correctness]
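Transition–correctness parameters pair the discourse-structure transition that leads into a question with the correctness of the student's answer to it. The slides do not define the transition labels, so the sketch below uses a deliberately simplified, assumed rule: the depth of a hierarchical question ID such as "Q2.1" decides whether a move is a Push (deeper), a PopUp (shallower), or an Advance (same level). The label set, the ID format, and all function names are illustrative, not the authors' annotation scheme.

```python
from collections import Counter


def depth(question_id: str) -> int:
    """Nesting depth of a hierarchical ID: 'Q2' -> 0, 'Q2.1' -> 1."""
    return question_id.count(".")


def transition(prev_id, curr_id: str) -> str:
    """Simplified transition label based only on the change in depth."""
    if prev_id is None:
        return "Start"
    if depth(curr_id) > depth(prev_id):
        return "Push"
    if depth(curr_id) < depth(prev_id):
        return "PopUp"
    return "Advance"


def transition_correctness_counts(turns):
    """Count (transition, correctness) pairs over one dialogue.

    `turns` is a list of (question_id, correctness) tuples in dialogue order.
    """
    counts = Counter()
    prev_id = None
    for question_id, correctness in turns:
        counts[(transition(prev_id, question_id), correctness)] += 1
        prev_id = question_id
    return counts


# Hypothetical dialogue: an incorrect answer to Q2 triggers a remediation
# subdialogue (Q2.1, Q2.2), after which the tutor pops back up to Q3.
dialogue = [
    ("Q1", "Correct"),
    ("Q2", "Incorrect"),
    ("Q2.1", "Correct"),
    ("Q2.2", "Correct"),
    ("Q3", "Incorrect"),
]
for pair, count in transition_correctness_counts(dialogue).items():
    print(pair, count)
```

Per-student counts of a pair such as ("PopUp", "Incorrect") could then be correlated with learning, which is the kind of analysis this slide summarizes.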

Intuition 2 – Discrimination
[Figure: the dialogue of a student that learned less and the dialogue of a student that learned more have different discourse structure]

Intuition 2 - Results
 • Transition – Transition parameters
 • Push–Push
   • Interpretation: system uncovers potential major knowledge gaps
[Figure: questions Q1, Q2.1, Q2.1.1, Q2.1.2, Q2.2, Q3]

Other events
 • Psychology inspired
   • Models of reading comprehension – Landscape Model [Ward and Litman, 2005]
   • Alignment model – lexical and prosodic convergence [Ward and Litman, 2007a, 2007b]
 • NLP inspired
   • Cohesion – lexical co-occurrence [Ward and Litman, 2006]

From Correlations to Causality
 • Correlation does not imply causality
 • But can inform modifications
   • E.g. more instruction after PopUp-Incorrect events
   • E.g. different instruction depending on student uncertainty
[Figure: questions Q1, Q2.1, Q2.2, Q3; Incorrect -> more tutoring]

Outline
 • ITSPOKE
 • Main research tools & projects
   • Comparing systems
   • Modeling learning
   • Interactions between phenomena
   • Other projects

Interactions between phenomena
 • Things interact in a dialogue
   • Student correctness -> tutor reply
   • Student emotion -> tutor reply
 • Why look for interactions?
   • Capture human tutor behavior
   • Extract new patterns
   • Allow us to formulate hypotheses
 • How to find interactions?
   • Dependency tests: χ² (Chi-Square)
   • Example with 2 windows
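A χ² dependency test checks whether two categorical variables, for example student state and type of tutor reply, are independent. The sketch below computes Pearson's χ² statistic for a contingency table using only the standard library. The counts are made up purely for illustration and are not results from the ITSPOKE corpora; the function names are hypothetical, and the p-value helper is valid only for one degree of freedom (a 2×2 table).

```python
import math


def chi_square(table):
    """Pearson's chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts (all rows the same length).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the independence hypothesis
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_df1(statistic: float) -> float:
    """Upper-tail p-value of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up counts: rows = student state (uncertain, certain),
# columns = tutor reply type (type A, type B).
observed = [
    [10, 20],
    [20, 10],
]
statistic, dof = chi_square(observed)
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value_df1(statistic):.4f}")
```

For tables larger than 2×2, the statistic and degrees of freedom are still valid, but the p-value needs a general χ² distribution function such as the one in SciPy.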

Projects
 • Certainty -> human tutor reply [Forbes-Riley and Litman, 2005]
   • Student uncertainty associated with
     • Increase in Bottom-up replies
     • Decrease in Expansions
   • Student certainty associated with
     • Increase in Restatements
 • Speech recognition errors [Rotaru and Litman, 2005, 2006a, 2006b]
   • Speech recognition errors -> Next student state
     • Increase in frustration
   • Student State -> Speech recognition errors
     • Incorrect, Uncertain, Frustrated -> more speech errors
   • Discourse Structure -> Speech recognition errors

Other projects
 • Affective computing (Kate Forbes-Riley’s postdoc)
   • Emotion prediction
     • What are the important emotions in tutoring
     • How to predict them
   • Emotion adaptation/handling
     • Model human tutor behavior
     • Formulate hypotheses from empirical analysis
 • Reinforcement Learning and User Modeling
   • System learns best way to react from rewards (Min Chi’s Ph.D.)
   • Needs a lot of data -> user simulations (Hua Ai’s Ph.D.)

Resources
 • Recommended classes
   • Introduction to Natural Language Processing
   • Foundations of Artificial Intelligence
   • Machine Learning
   • Knowledge Representation
   • Seminar classes
     • Advanced Topics in Artificial Intelligence (Speech and Language Technology for Educational Applications (this spring!), Affective Spoken Dialogue Systems, etc.)
 • Other resources
   • ITSPOKE Group Meetings
   • NLP @ Pitt
   • DoD @ CMU
   • YRRSDS
   • ISP Forum
   • PSLC

Further information
 • Visit my homepage and talk with me
   • http://www.cs.pitt.edu
 • Take my seminar (CS 3710), projects course (CS 2002)
 • Talk with members of the ITSPOKE group
   • http://www.cs.pitt.edu/~litman/itspoke.html