- Number of slides: 30
Commercial Applications of Natural Language Technology
April 14, 2005
Deborah A. Dahl
Principal, Conversational Technologies
Chair, World Wide Web Consortium Multimodal Interaction Working Group
dahl@conversational-technologies.com
Business Motivations
§ save money (operators, phone costs)
§ improve user satisfaction
§ provide new revenue-generating services
§ do something that couldn’t otherwise be done
§ legal requirements
Technical Motivations for Natural Language Processing
complexity of intention
• travel from San Francisco to Philadelphia
• aisle seat
• vegetarian meal
• no more than one stop
• red-eye ok if arrives after 6:00 a.m. and doesn’t stop in Chicago
• wheelchair needed
• should be on one of my preferred airlines unless fare is much higher
[Slide callouts: complexity of language; indirection]
system: Where do you want to go?
user: Philadelphia
system: When do you want to depart?
user: I need to be downtown for an 8:00 meeting
I’ve been having a lot of problems with my inkjet printer the last few weeks
Natural Language Understanding in a Spoken Dialog System
[Diagram components: Recognizer (acoustic models, language model); Meaning extraction (NLU rules); Dialog manager (DM rules, back-end application); Generate prompt (NLG rules, templates); Speech generation (TTS, recordings)]
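As a rough illustration of how the stages in this diagram hand results to one another, the sketch below runs a single turn through stub versions of each component. Every function name, the regular expression, and the prompt templates are hypothetical stand-ins, not part of any product or standard named in these slides; the recognizer is replaced by text input and speech generation is omitted.

```python
import re

# Prompt templates used by the "generate prompt" stage (hypothetical wording).
TEMPLATES = {
    "ask_destination": "Where do you want to go?",
    "ask_departure": "When do you want to depart for {destination}?",
}


def recognize(audio: str) -> str:
    """Recognizer stub: pretends the audio has already been transcribed."""
    return audio.lower().strip()


def extract_meaning(text: str) -> dict:
    """Meaning extraction with one toy NLU rule: '... to <city>' at the end."""
    match = re.search(r"\bto (\w+)$", text)
    return {"destination": match.group(1).title()} if match else {}


def dialog_manager(meaning: dict) -> str:
    """Toy DM rule: ask for the destination until it is known."""
    return "ask_departure" if "destination" in meaning else "ask_destination"


def generate_prompt(action: str, meaning: dict) -> str:
    """Fill the template chosen by the dialog manager."""
    return TEMPLATES[action].format(**meaning)


def run_turn(audio: str) -> str:
    """One pass: recognizer -> meaning extraction -> dialog manager -> prompt."""
    text = recognize(audio)
    meaning = extract_meaning(text)
    action = dialog_manager(meaning)
    return generate_prompt(action, meaning)


print(run_turn("I want to go to Philadelphia"))
```

In a deployed system each stub would be backed by the resources listed in the diagram (acoustic and language models, NLU rules, DM rules, templates, TTS or recordings); the point here is only the order of the hand-offs.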
Natural Language Processing in Commercial Spoken Dialog Systems
§ Form-filling applications
§ Classification of free-form spoken inputs
§ Standards
Form-filling Spoken Dialog Systems
§ retail banking
§ voice portals
§ access to email, voice mail
§ travel reservations (Amtrak Julie)
§ package tracking
Multimodal Form-Filling using XHTML+Voice
§ IBM Chinese food demo
Government Applications -- NASA
§ Clarissa -- International Space Station's new speech-powered virtual assistant
§ Space station checklists are very long and complex with many branches, which often require 'fill-in-the-blank' answers.
§ General purpose 'procedure reader' helps astronauts check out space suits and analyze drinking water quality
§ Scheduled to begin working with astronauts in May as part of International Space Station Expedition 11.
Statistical Classification
Sort user’s statement into bins of predefined topics (for example, place order, find out status of order, return item)
§ given examples of statements that go in different bins (training data)
§ sort new examples into the right bins
§ example of applying these kinds of techniques to text: spam filters
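The idea of learning bins from example statements can be sketched with a small naive Bayes classifier. This is one common technique for text classification, chosen here for illustration; the slides do not say which method any commercial system uses. The training sentences and bin names are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Invented training data: (statement, bin) pairs for the three example topics.
TRAINING = [
    ("i want to buy a new phone", "place_order"),
    ("i would like to order two shirts", "place_order"),
    ("where is my package", "order_status"),
    ("has my order shipped yet", "order_status"),
    ("i need to send this item back", "return_item"),
    ("how do i return a broken phone", "return_item"),
]


def train(examples):
    """Count words per bin and statements per bin."""
    word_counts = defaultdict(Counter)
    bin_counts = Counter()
    for text, label in examples:
        bin_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, bin_counts, vocab


def classify(text, model):
    """Return the bin with the highest naive Bayes score (add-one smoothing)."""
    word_counts, bin_counts, vocab = model
    total = sum(bin_counts.values())
    best_label, best_score = None, -math.inf
    for label in bin_counts:
        score = math.log(bin_counts[label] / total)  # prior for this bin
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAINING)
print(classify("has my package shipped", MODEL))
```

With realistic volumes of training data the same structure applies; only the examples and the number of bins change.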
Example of Classification: Spam Filters
§ FOR YOUR ATTENTION; Dear Sir, I am pleased to write you in view of the circumstances in which I now found myself. This rescuable situation, though with its attendant mutual benefit needs urgent action hence this letter, and I do hope you will not hesitate to come to my rescue….
[Percentage callouts attached to phrases on the original slide: 84%, 98%, 84%, 90%, 93%]
Some Current Statistical Classification Systems
§ Nuance “Say Anything”
§ Scansoft “SpeakFreely”
§ ATT “VoiceTone”
§ BBN “Call Director”
§ TuVox
Customer Service Calls
“I’m closing up my summer home and want to turn off the phone.”
Customer Service Destination?
Problem:
Touchtone menu is complex with many layers
Prompts are confusing
Customer wants just to say what they need.
[Menu tree diagram. Entry Point: Billing, Payment, Orders, Repair. Lower-level labels: Pay now, Make arrangements, Cancel Service, New Service, Balance, Order, Move, Past due notice, Seasonal, Copy of Bill, Order Status, Unauthorized call, Change]
BBN Call Director™
System: “Please tell me briefly the reason for your call today.”
User: “I’m calling to check whether there is any better rate plans than the one I currently have.”
[Diagram components: IVR, Speech Recognizer, Text, Topic Classifier, Topic, Topic Router, Statistical Grammars & Topic Models; destinations: Technical Support, Billing, Sales, Automated Services]
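The last step in this diagram, turning a classified topic into a call destination, might look like the sketch below. It is not BBN's implementation: the topic keys, the confidence threshold, and the "reprompt" fallback are all assumptions made for illustration. Only the four destination names come from the slide.

```python
# Destinations named on the slide; the topic keys are hypothetical labels.
DESTINATIONS = {
    "technical_support": "Technical Support",
    "billing": "Billing",
    "sales": "Sales",
    "automated_services": "Automated Services",
}


def route(topic_scores, min_confidence=0.5):
    """Pick the destination for the highest-scoring topic.

    Falls back to re-prompting the caller (an assumed policy) when the
    classifier is not confident enough or the topic is unknown.
    """
    if not topic_scores:
        return "reprompt"
    topic, score = max(topic_scores.items(), key=lambda item: item[1])
    if score < min_confidence or topic not in DESTINATIONS:
        return "reprompt"
    return DESTINATIONS[topic]


# Scores a topic classifier might assign to the rate-plan question above.
print(route({"billing": 0.2, "sales": 0.7}))
```

A threshold like this trades automation rate against misrouted calls; where to set it is a deployment decision.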
Standards
§ Extremely important for commercial applications
W3C Natural Language Standards
§ Aimed at form-filling dialogs
§ VoiceXML: defines dialogs
§ Speech Recognition Grammar Specification (SRGS): describes allowable sequences of words
§ Semantic Interpretation (SI): describes how sequences of words are to be interpreted
§ Extensible MultiModal Annotation (EMMA): represents final interpretation of user’s input
Form-filling Dialog
System: Welcome to the weather information service. What state?
User: help
System: Please speak the state for which you want the weather.
User: Pennsylvania
System: Please speak the city for which you want the weather.
User: Philadelphia
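The control flow behind a dialog like this one can be sketched as a loop over form fields, where "help" replays a longer prompt and any other reply fills the field. This is a simplified stand-in for what a VoiceXML interpreter does, not VoiceXML itself; the function and variable names are hypothetical, and using the same text for the city prompt and the city help message is an assumption.

```python
# Each field: (name, prompt, help text). Wording follows the dialog above.
FIELDS = [
    ("state", "What state?",
     "Please speak the state for which you want the weather."),
    ("city", "Please speak the city for which you want the weather.",
     "Please speak the city for which you want the weather."),
]


def run_form(fields, user_turns):
    """Fill each field in order from the user's replies; return form and log."""
    turns = iter(user_turns)
    filled = {}
    transcript = ["System: Welcome to the weather information service."]
    for name, prompt, help_text in fields:
        transcript.append(f"System: {prompt}")
        for reply in turns:  # keeps consuming replies until the field is filled
            transcript.append(f"User: {reply}")
            if reply.strip().lower() == "help":
                transcript.append(f"System: {help_text}")
                continue
            filled[name] = reply
            break
    return filled, transcript


form, log = run_form(FIELDS, ["help", "Pennsylvania", "Philadelphia"])
print(form)
```

A real system would also validate each reply against a grammar and handle no-input and no-match events, which are left out here.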
SRGS Examples
§ Context-free grammar
§ XML and ABNF formats are provided
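The grammar examples from the original slides did not survive extraction. As a substitute, the sketch below embeds a small grammar in the XML form of SRGS and uses Python to list the word sequences it allows. The grammar content (rule names, cities) is invented for illustration, and the expander handles only the handful of SRGS elements used here (`rule`, `item`, `one-of`, `ruleref`, and `repeat="0-1"`); it is not a conforming SRGS processor.

```python
import xml.etree.ElementTree as ET
from itertools import product

NS = "{http://www.w3.org/2001/06/grammar}"

# Illustrative grammar: an optional carrier phrase followed by a city name.
GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="request">
  <rule id="request" scope="public">
    <item repeat="0-1">the weather in</item>
    <ruleref uri="#city"/>
  </rule>
  <rule id="city">
    <one-of>
      <item>Philadelphia</item>
      <item>Pittsburgh</item>
    </one-of>
  </rule>
</grammar>"""


def expand(elem, rules):
    """Return every word sequence (as a tuple) that the element accepts."""
    tag = elem.tag.replace(NS, "")
    if tag == "ruleref":  # follow a local reference such as uri="#city"
        return expand(rules[elem.get("uri").lstrip("#")], rules)
    if tag == "one-of":  # alternatives: union of the children's expansions
        return [seq for item in elem for seq in expand(item, rules)]
    # rule or item: a sequence made of its own text and its children
    parts = [[tuple((elem.text or "").lower().split())]]
    for child in elem:
        parts.append(expand(child, rules))
        parts.append([tuple((child.tail or "").lower().split())])
    seqs = [sum(combo, ()) for combo in product(*parts)]
    if elem.get("repeat") == "0-1":  # optional item may also be skipped
        seqs.append(())
    return seqs


def accepted_sentences(grammar_xml):
    """Expand the grammar's root rule into the set of sentences it allows."""
    root = ET.fromstring(grammar_xml)
    rules = {rule.get("id"): rule for rule in root.iter(NS + "rule")}
    return {" ".join(seq) for seq in expand(rules[root.get("root")], rules)}


SENTENCES = accepted_sentences(GRAMMAR)
print(sorted(SENTENCES))
```

Enumerating sentences only works for small, non-recursive grammars like this one; a recognizer compiles the grammar into a search network instead.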
SRGS Specification
§ http://www.w3.org/TR/speech-grammar/
§ Status: W3C Candidate Recommendation
§ Quick Guide to the SRGS Specification
§ http://www.conversational-technologies.com/pages/5/index.htm
Semantic Interpretation
§ Tags are added to the grammar to describe the semantics of the user’s input
§ Format uses ECMAScript Compact Profile (ECMA-327)
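To show what tags in a grammar look like, the sketch below attaches a tag to each city in a small SRGS grammar and reads the tag back when the utterance matches. The grammar content is invented for illustration. Real tags are ECMAScript and need a script engine; this stand-in only understands literal assignments of the form `out.name="value";`, which is an assumption made to keep the example in one language.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2001/06/grammar}"

# Illustrative grammar: each city carries a tag describing its meaning.
GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" root="city" tag-format="semantics/1.0">
  <rule id="city" scope="public">
    <one-of>
      <item>Philadelphia <tag>out.city="Philadelphia"; out.state="PA";</tag></item>
      <item>Pittsburgh <tag>out.city="Pittsburgh"; out.state="PA";</tag></item>
      <item>Denver <tag>out.city="Denver"; out.state="CO";</tag></item>
    </one-of>
  </rule>
</grammar>"""

# Simplification: only literal string assignments to properties of `out`.
ASSIGNMENT = re.compile(r'out\.(\w+)\s*=\s*"([^"]*)"\s*;')


def interpret(utterance):
    """Return the tag's assignments for a matching item, or None if no match."""
    root = ET.fromstring(GRAMMAR)
    spoken = utterance.strip().lower()
    for item in root.iter(NS + "item"):
        words = (item.text or "").strip().lower()
        if words == spoken:
            tag = item.find(NS + "tag")
            if tag is None:
                return {}
            return dict(ASSIGNMENT.findall(tag.text or ""))
    return None


print(interpret("Philadelphia"))
```

The result is a structured value (here a city and a state) rather than the raw words, which is what the dialog application consumes.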
Semantic Interpretation Specification
§ http://www.w3.org/TR/semantic-interpretation/
§ Status: W3C Working Draft
EMMA
§ Developed by the W3C Multimodal Interaction Working Group
§ An XML-based approach to representing natural language meanings
§ Applicable to multimodal applications, but originally developed for speech
EMMA
§ Represents user input
§ Vehicle for transmitting user’s intention throughout application
§ Focus on language input (text, handwriting, speech)
§ Three components
  § data model
  § interpretation
  § annotation (main focus of standard)
Interpretation Example
§ I want to go from Denver to Pittsburgh
EMMA Example
“I want to go from Boston to Denver on March 11, 2003”
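The markup for this example was lost in extraction. The sketch below builds a plausible EMMA document for the utterance with Python's standard XML library. The namespace and the `emma:` annotation names follow the EMMA 1.0 Recommendation as published after this talk and may differ from the 2005 Working Draft. The application elements (`origin`, `destination`, `date`), the date format, and the confidence value are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
ET.register_namespace("emma", EMMA_NS)


def q(name):
    """Qualify a name with the EMMA namespace for ElementTree."""
    return f"{{{EMMA_NS}}}{name}"


def build_emma(tokens, confidence, slots):
    """Wrap application slots (interpretation) in EMMA annotations."""
    root = ET.Element(q("emma"), {"version": "1.0"})
    interpretation = ET.SubElement(root, q("interpretation"), {
        "id": "int1",
        q("medium"): "acoustic",      # annotation: input channel
        q("mode"): "voice",           # annotation: input mode
        q("confidence"): str(confidence),
        q("tokens"): tokens,          # annotation: recognized words
    })
    for name, value in slots.items():  # application-specific elements
        ET.SubElement(interpretation, name).text = value
    return root


DOC = build_emma(
    "i want to go from boston to denver on march 11 2003",
    0.75,  # invented confidence score
    {"origin": "Boston", "destination": "Denver", "date": "2003-03-11"},
)
XML_TEXT = ET.tostring(DOC, encoding="unicode")
print(XML_TEXT)
```

The slots inside `emma:interpretation` are the application's interpretation, and the `emma:` attributes are the annotations that the standard concentrates on.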
EMMA Specification
§ http://www.w3.org/TR/emma
§ Status: W3C Working Draft
Natural Language Understanding W3C Standards Summary
§ VoiceXML: define spoken dialogs
§ SRGS: describes allowable sequences of words
§ Semantic Interpretation: describes what intentions are represented by sequences of words
§ EMMA: represents an interpretation of user’s input
Summary of Deployed Spoken Dialog Systems
§ Form filling applications are by far the most common
§ Statistical classification systems are becoming more common and are popular with users
§ Standards are accelerating commercial adoption of technology
Resources
§ Practical Spoken Dialog Systems, Springer, 2005. (D. Dahl, editor)
§ VB website http://www.w3.org/Voice/
  § VoiceXML
  § SRGS
  § SISR
§ MMI website http://www.w3.org/2002/mmi/
  § EMMA
§ BeVocal website http://cafe.bevocal.com/
§ VoiceXML deployments (some with phone numbers you can try) http://www.kenrehor.com/voicexml/#deployments
§ Guide to speech standards http://www.speechtechmag.com/issues/9_8/cover/11619-1.html


