- Number of slides: 78
VoiceXML Overview James A. Larson Intel Corporation jim@larson-tech.com (c) 2007 Larson Technical Services
Outline • Motivation for VoiceXML • W3C Speech Interface Framework Languages • Dialog: VoiceXML 2.0 • Speech Synthesis: SSML • Grammars: SRGS • Semantic Interpretation: SI • VoiceXML 2.1
VoiceXML in the Marketplace • VoiceXML 2.0 is now ratified as a Recommendation (i.e., an official standard) by the W3C • Hundreds of millions of VoiceXML calls are answered every day • VoiceXML is the standard for building speech-enabled applications
Motivation for Speech Applications • Users access Web sites from any telephone, anywhere, any time. • Speaking and listening are the natural usage modes for phones.
Strengths of VoiceXML Applications • Traditional system-directed dialogs for novice users • Mixed-initiative dialogs for experienced users • Novice users smoothly become experienced users at their own pace
Limitations of VoiceXML Applications • No special analysis of speech input – Not suitable for training speech skills (reading, ESL, singing, etc.) • VUI conversational bandwidth is lower than GUI conversational bandwidth – Using a VUI is like drinking from Lake Superior through a straw
Exercise 1 • Name or describe a speech application you could use at work. • Name or describe a speech application you or a family member could use at home.
XML • XML = eXtensible Markup Language • Elements are surrounded by tags
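As a minimal sketch of what "surrounded by tags" means, the fragment below nests one element inside another. The element and attribute names (`note`, `priority`, `message`) are hypothetical and chosen only for illustration; they do not come from the slides.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary, for illustration only -->
<note priority="high">
  <!-- <message> is the start tag, </message> the matching end tag -->
  <message>Call the help desk</message>
</note>
```

Each element opens with a start tag and closes with a matching end tag; attributes such as `priority` are written inside the start tag.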
[Figure: architecture diagram with these labels: Web Server; Database Server (DB); Documents, Multimedia Files, HTML, Scripts; VoiceXML Scripts, Grammars, Audio Files; Web Browser; Voice Browser; Speech Server/Gateway; Capture Voice, ASR, DTMF, Replay Audio, TTS]
W3C Speech Interface Framework [Figure: framework components: VoiceXML 2.0, Speech Synthesis, Grammar, Semantic Interpretation, Call Control, Other]
Status of W3C Speech Interface Languages [Figure: maturity stages: Recommendation, Proposed Recommendation, Candidate Recommendation, Last Call Working Draft, Working Draft, Requirements. Languages shown: Semantic Interpretation (SISR), VoiceXML 2.0, Grammar (SRGS), Speech Synthesis (SSML), Call Control (CCXML), VoiceXML 2.1, V3]
Example of VoiceXML 2.0 Fragment: Dialog Language (VoiceXML 2.0) <?xml version="1.0"?>
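A minimal sketch of a VoiceXML 2.0 dialog, offered as an illustration rather than as the fragment from the original slide: a single form with one field that asks a question, listens against a small inline grammar, and reads the answer back. The form id, field name, prompt wording, and vocabulary are all assumptions.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Hypothetical form: ask for a drink and confirm the choice -->
  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <!-- Inline SRGS grammar listing the words the caller may say -->
      <grammar mode="voice" version="1.0" root="drink">
        <rule id="drink">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
            <item>milk</item>
          </one-of>
        </rule>
      </grammar>
      <!-- Runs once the field has been filled by a recognized utterance -->
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

The prompt text would typically be rendered by a TTS engine and the caller's reply matched by the ASR engine against the grammar; a larger application would more likely reference an external grammar file than inline one.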
VoiceXML 2.0 features • Menus, forms, sub-dialogs
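To make the first two of these dialog types concrete, here is a hedged sketch of a menu that routes the caller to one of two forms in the same document. The ids, choices, and prompt wording are assumptions made for illustration; sub-dialogs are omitted to keep the sketch short.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Hypothetical top-level menu: each choice jumps to a form below -->
  <menu id="main">
    <prompt>Say one of: <enumerate/></prompt>
    <choice next="#sports">sports</choice>
    <choice next="#weather">weather</choice>
    <!-- Played if the caller stays silent -->
    <noinput>Please say one of <enumerate/></noinput>
  </menu>

  <form id="sports">
    <block>
      <prompt>Here are the sports headlines.</prompt>
    </block>
  </form>

  <form id="weather">
    <block>
      <prompt>Here is the weather report.</prompt>
    </block>
  </form>
</vxml>
```

The `<enumerate/>` element reads the choice texts back to the caller, so the prompt stays in step with the list of choices.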