- Number of slides: 23
The Architecture Dream Team, Schloss Dagstuhl, Germany, October 2001
Would you build your dream house without a blueprint?
What you hope to get
… what you might get
Today’s Conventional Architecture: User(s); Presentation; Dialog Control; Application Interface; Information, Applications, People
CHAMELEON Platform (IntelliMedia Workbench) (Paul Mc Kevitt): Microphone array; Speech recognizer; NL parser; Gesture recognizer; Laser pointer; Frame semantics; Dialogue manager; Topsy; Blackboard; Domain model; Speech synthesizer
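The CHAMELEON slide lists a blackboard alongside the recognizers and the dialogue manager. As a rough illustration of the general blackboard pattern, and not of CHAMELEON's actual implementation, the Python sketch below lets modules post frames to a shared store and lets other modules read them by type. The class name `Blackboard`, the frame fields, and the module names passed as strings are all assumptions made for this example.

```python
from collections import defaultdict


class Blackboard:
    """Shared store: modules post frames, other modules read them by type."""

    def __init__(self):
        self._frames = defaultdict(list)  # frame type -> list of frames

    def post(self, frame_type, source, content):
        frame = {"type": frame_type, "source": source, "content": content}
        self._frames[frame_type].append(frame)
        return frame

    def read(self, frame_type):
        # Return a copy so readers cannot mutate the stored history.
        return list(self._frames[frame_type])


board = Blackboard()
board.post("speech", "speech_recognizer", "whose office is this")
board.post("gesture", "gesture_recognizer", {"x": 12, "y": 40})

# A hypothetical dialogue manager combines the latest frame of each type.
latest_speech = board.read("speech")[-1]["content"]
latest_gesture = board.read("gesture")[-1]["content"]
print(latest_speech, latest_gesture)
```

The point of the pattern is that the recognizers never call the dialogue manager directly; they only write to the shared store, so modules can be added or replaced independently.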
Microsoft (Derek Jacoby): A Typical Dr. Who App; MIPAD Architecture
Harry Bunt: Input Interpretation; Dialogue Management; Output Synthesis; Context Management (linguistic, semantic, physical, perceptual, cognitive, social); Pending Context; API; Application
Art Exploration (Oliviero Stock): explicit input (e.g., pointing); implicit input (e.g., movement); input analyzer; interaction history; visitor models; composer engine; Physical space model; Hypermedia information; presentation; links and image to UI; Audio message to headphone
COLLAGEN (Sidner et al.)
IBM’s Responsive Information Architect (RIA) (Michelle Zhou): user; speech; gesture; Multimodal Interpreter; Conversational Facilitator; Presentation Broker; Visual Designer; Language Designer; Media Producer; Models of: Design, Domain, User, Conversation, Environment; IRIS; Info Server
Interact (Kristiina Jokinen): ASR; Language Understanding; Topic Recognition; Input Manager; Information Storage; Dialogue Manager; Dialogue Agents/Acts (e.g., Q, A, State); Task Agents/Acts; Generator Agents; Presentation Manager; TTS; Database
EMBASSI Conceptual Architecture
- Z-Axis:
  - Underlying HW-Infrastructure
  - Software-Infrastructure (Agent / Distr. Comp. Middleware)
  - Functional building blocks of conceptual architecture (Multimodal Assistant Componentware, MAC)
  - Application-level Assistants (not shown)
- XY-Plane of MAC
  - Dialogic Assistance
  - Effectual Assistance
  - Situational Assistance
  - Explicit and implied generic (= application independent) ontologies, defining component interfaces
SMARTKOM (Wolfgang Wahlster)
DARPA Galaxy Communicator: The Galaxy Communicator Software Infrastructure (GCSI) is a distributed, message-based, hub-and-spoke infrastructure optimized for constructing spoken dialogue systems. Components: Hub; Audio Server; Speech Recognition; Frame Construction; Context Tracking; Dialogue Management; Application Backend; Language Generation; Text-to-Speech Conversion. Open source and documentation available at fofoca.mitre.org and sourceforge.net/projects/communicator
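To make the hub-and-spoke, message-based idea concrete, here is a minimal Python sketch in which every message between servers passes through a central hub. It does not use or reproduce the GCSI API; the `Hub` class, the three stand-in servers, the frame keys, and the fixed routing order are all assumptions chosen for illustration.

```python
class Hub:
    """Central router: servers register by name; all messages pass through the hub."""

    def __init__(self):
        self._servers = {}
        self.log = []  # record of (target, frame) pairs the hub has routed

    def register(self, name, handler):
        self._servers[name] = handler

    def send(self, target, frame):
        self.log.append((target, dict(frame)))
        return self._servers[target](frame)


def recognizer(frame):
    # Stand-in for speech recognition: pretend the audio decodes to fixed text.
    return {**frame, "text": "flights to boston"}


def parser(frame):
    # Stand-in for frame construction: take the last word as the destination.
    return {**frame, "destination": frame["text"].split()[-1]}


def dialogue_manager(frame):
    return {**frame, "reply": f"Looking up flights to {frame['destination'].title()}."}


hub = Hub()
hub.register("recognizer", recognizer)
hub.register("parser", parser)
hub.register("dialogue", dialogue_manager)

frame = {"audio": b"..."}
for server in ("recognizer", "parser", "dialogue"):
    frame = hub.send(server, frame)
print(frame["reply"])
```

Because servers only know the hub, swapping one recognizer for another changes a single `register` call; the other servers are untouched.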
Definitions
- Abstract Architecture
  - Components, connections (protocols), and constraints (IEEE definition)
  - Data/knowledge structures, data flow and protocols, control flow
  - Consider use cases, e.g.,
    - In-car navigation system
    - Desktop, kiosk, mobile device interaction
    - Media conversion
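The definition of an abstract architecture as components, connections (protocols), and constraints can be written down as a small data structure. The Python sketch below is one possible encoding under stated assumptions: the `Architecture` and `Connection` classes, the component names, the protocol labels, and the example "no direct link" constraint are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    source: str
    target: str
    protocol: str


@dataclass
class Architecture:
    components: set = field(default_factory=set)
    connections: list = field(default_factory=list)

    def connect(self, source, target, protocol):
        # Built-in constraint: both endpoints must be declared components.
        for name in (source, target):
            if name not in self.components:
                raise ValueError(f"unknown component: {name}")
        self.connections.append(Connection(source, target, protocol))

    def violations(self, forbidden):
        """Return connections that break a 'no direct link' constraint."""
        return [c for c in self.connections if (c.source, c.target) in forbidden]


arch = Architecture(components={"input", "dialogue", "output", "application"})
arch.connect("input", "dialogue", "frames")
arch.connect("dialogue", "application", "api-calls")
arch.connect("dialogue", "output", "frames")

# Hypothetical constraint: input may not talk to the application directly.
print(arch.violations({("input", "application")}))
```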
Requirements
- Functional
  - Modality integration (input and output)
  - Situation-appropriate (user, task, application) real-time sensing/response (e.g., supporting barge-in, perceptual sensing/feedback)
  - Representation of level of granularity (modules and data structures)
  - Manage feedback: local and global, when/where?
  - Support incremental processing
  - Support incremental development (and scalability)
- System/Technical
  - Support for processing/fusing multimodal input (e.g., parallel processing)
  - Modular, composable (possibly distributed processing)
  - Efficient implementation
  - Time scale, temporal and spatial resolution
  - Accessible (even partial) data structures
  - Open and extensible protocols
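The requirement to fuse multimodal input with attention to temporal resolution can be illustrated with a simple time-window heuristic. The sketch below pairs each speech event with the nearest gesture that overlaps it or falls within a fixed gap. This is only one naive fusion strategy, not a method taken from the slides; the `Event` fields, the `fuse` function, and the 1.0 second default are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    modality: str   # "speech" or "gesture"
    start: float    # seconds
    end: float      # seconds
    content: str


def fuse(events, max_gap=1.0):
    """Pair each speech event with the temporally nearest gesture within max_gap seconds."""
    speech = [e for e in events if e.modality == "speech"]
    gestures = [e for e in events if e.modality == "gesture"]
    fused = []
    for s in speech:
        def gap(g):
            # Zero if the intervals overlap, else the distance between them.
            return max(0.0, g.start - s.end, s.start - g.end)

        candidates = [g for g in gestures if gap(g) <= max_gap]
        best = min(candidates, key=gap) if candidates else None
        fused.append((s.content, best.content if best else None))
    return fused


events = [
    Event("speech", 0.0, 1.2, "put that"),
    Event("gesture", 0.8, 1.0, "point:lamp"),
    Event("speech", 5.0, 5.6, "there"),
    Event("gesture", 9.0, 9.2, "point:table"),
]
print(fuse(events))
```

In this run the first utterance overlaps a pointing gesture and is paired with it, while the second has no gesture within one second and stays unpaired. A real system would also need to handle incremental input and competing hypotheses.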
Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998) [diagram]: User(s); Media Input Processing; Media/Mode Analysis (Language, Gesture, Graphics, Biometrics); Media Fusion; Interaction Management (Discourse Modeling, Intention Recognition, User Modeling, Presentation Design); Media/Mode Design (Language, Graphics, Gesture, Animated Presentation Agent); Media Output Rendering; Application Interface; Information, Applications, People; Representation and Inference (Discourse Model, Domain Model, Task Model, Media Models); overlaid labels from the conventional architecture: Presentation, Dialog Control, Application Interface
Architecture [diagram]: User(s); Media Input Processing; Media/Mode Analysis (Language, Graphics, Gesture, Sound, Biometrics); Mode Coordination; Multimodal Fusion; Multimodal Reference Resolution; Interaction Management (Discourse Management, Reference Resolution, Context Management, Lexicon Management, Intention Recognition, Action Planning, User Modeling); Presentation Design (Select Content, Allocate, Coordinate, Design, Layout); Media/Mode Design (Language, Graphics, Gesture, Sound, Animated Presentation Agent); Media Output Rendering; Application Interface (Initiate, Terminate, Request, Respond, Integrate); Information, Applications, People; Representation and Inference, States and Histories (User ID, User Model, Discourse Model, Context Model, Domain Model, Task Model, Media Models, Application Models)
Media Fusion [diagram]: Media/Mode Analysis; Spoken Language; Lip Reading; Gesture; Media Fusion
COLLAGEN (Sidner et al.): USER; Speech; Speech interpretation; ViaVoice; Planning and discourse; Agent Mel; Window events; Application; Student Model