Animating Virtual Humans in Intelligent Multimedia Storytelling
Minhua Eunice Ma and Paul McKevitt
School of Computing and Intelligent Systems, Faculty of Engineering
University of Ulster, Magee, Derry, Northern Ireland

Outline
§ State-of-the-art virtual human animation standards
  - VRML/X3D & MPEG-4 for object modelling
  - H-Anim & MPEG-4 SNHC for humanoid modelling
  - VHML & STEP for human animation modelling
  - Natural language to 3D animation
§ Language visualisation (animation) in the intelligent multimodal storytelling system CONFUCIUS
  - Humanoid animation in CONFUCIUS
  - Multiple animation channels
  - Space sites of virtual humans
  - Virtual object manipulation
§ Conclusion & future work
PGNet 2005 Liverpool, 28 June 2005

Four levels of virtual human representation
Current virtual human representation languages can be classified into four groups according to their level of abstraction, from 3D geometry modelling up to language animation (high-level animation at the top, low-level animation at the bottom).

Level   Task                            Languages / systems
4       Natural language to animation   CONFUCIUS, AnimNL
3       Human animation modelling       VHML (BAML; XML-based), STEP (script-based)
2       3D human modelling              H-Anim, MPEG-4 SNHC
1       3D object modelling             VRML (X3D), MPEG-4

Level 1: 3D object modelling
§ VRML (Virtual Reality Modelling Language) is a hierarchical scene description language that defines the geometry and behaviour of a 3D scene. X3D is the successor to VRML.
§ MPEG-4 uses BIFS (Binary Format for Scenes) for real-time streaming. BIFS borrows many concepts from VRML, so BIFS and VRML can be seen as different representations of the same data.

Level 2: 3D human modelling
§ H-Anim is a standard VRML97 representation for humanoids. It defines standard joint articulations, segment dimensions, and sites for end effectors and for attachment points for clothing.
§ MPEG-4 SNHC (Synthetic/Natural Hybrid Coding) incorporates H-Anim. It provides an efficient way to animate virtual humans, along with tools for compressing the animation parameters associated with the H-Anim human model.

H-Anim joint-segment hierarchy
§ An H-Anim file contains a joint-segment hierarchy.
§ Each joint node may contain other joint nodes and a segment node that describes the body part associated with the joint.
§ Each segment is a normal VRML transform node describing the body part's geometry and texture.
§ H-Anim humanoids can be animated using keyframing, inverse kinematics, and other animation techniques.
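
The joint-segment nesting described above can be sketched as a small tree. This is an illustration in Python rather than VRML; the `Joint` and `Segment` classes and the `walk` method are assumptions made for the example, and only a fragment of one leg is modelled.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Segment:
    """Body part attached to a joint (geometry and texture omitted here)."""
    name: str


@dataclass
class Joint:
    """A joint may hold one segment and any number of child joints."""
    name: str
    segment: Optional[Segment] = None
    children: List["Joint"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Joint"]]:
        """Yield (depth, joint) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Illustrative fragment of a left leg; not a complete skeleton.
l_ankle = Joint("l_ankle", Segment("l_hindfoot"))
l_knee = Joint("l_knee", Segment("l_calf"), [l_ankle])
l_hip = Joint("l_hip", Segment("l_thigh"), [l_knee])
root = Joint("sacroiliac", Segment("pelvis"), [l_hip])

for depth, joint in root.walk():
    seg = joint.segment.name if joint.segment else "-"
    print("  " * depth + f"{joint.name} -> {seg}")
```

Rotating a joint in such a tree implicitly moves every joint and segment below it, which is what makes keyframing and inverse kinematics on the hierarchy practical.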

H-Anim models on the Web

Virtual human models                  Authors
Nancy [1]                             Cindy Ballreich
Baxter, Nana [2]                      Christian Babski
Y.T., Hiro, Dilbert, Max, Jake [3]    Matt Beitler
Dork [4]                              Michael Miller

URLs:
[1] http://www.ballreich.net/vrml/h-anim/nancy_h-anim.wrl
[2] http://ligwww.epfl.ch/~babski/StandardBody
[3] http://www.cis.upenn.edu/~beitler/H-Anim/Models/H-Anim1.1/
[4] http://students.cs.tamu.edu/mmiller/hanim/proto/dork-proto.wrl

Level 3: Human animation modelling
§ VHML (Virtual Human Mark-up Language) is an XML-based language that provides an intuitive way to define virtual human animation. It is composed of several sub-languages: DMML, FAML, BAML, SML, and EML.
§ STEP is a scripting language for human actions. It has a Prolog-like syntax, which makes it compatible with most standard logic programming languages.

VHML & STEP examples

A. A VHML example (the markup is truncated in the source)

    <left-calf-flex amount="medium">
    <right-calf-flex amount="medium">
    <left-arm-front amount="medium"> …
    Standing on my knees I beg you pardon

B. A STEP example

    script(walk_forward_step(Agent), ActionList) :-
        ActionList = [parallel([script_action(walk_pose(Agent)),
                                move(Agent, front, fast)])].

Level 4: Natural language to animation
§ High-level animation applications convert natural language to virtual human animation. Little research on virtual human animation focuses on this level.
§ The AnimNL project aims to enable people to use natural language instructions to tell virtual humans what to do.
§ CONFUCIUS also deals with language animation.
§ Research at this level will lead to powerful web-based applications.

Architecture of CONFUCIUS
[Figure: architecture diagram. Input: natural language sentences. Components: surface transformer; natural language processing (WordNet, LCS database, FDG parser), producing a semantic representation; knowledge base, comprising language knowledge and visual/audio knowledge (3D models & animations, audio encapsulated in graphic models), with a mapping between them; 3D authoring tools and existing 3D models & virtual human models; media allocator; animation engine (with nonspeech audio); text-to-speech; presentation agent (Merlin the Narrator); synchronizing the 3D virtual world with speech in VRML; narration integration. Output: multimodal presentation.]

Humanoid animation in CONFUCIUS
[Figure: flowchart, summarised below.]
§ Input: semantic representation
§ Decision: does the event predicate match basic human motions in the animation library?
  - Y: animation controller. Either load a precreated keyframe animation or provide an animation specification for animation generation (motion instantiation).
  - N: user interaction.
§ Environment placement: apply spatial information and place the OBJ/HUMAN into a specified environment.
§ Camera controller: automatic camera placement and application of cinematic rules.
§ Output: VRML file of the virtual story world
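
The branching in this slide can be sketched as a small dispatch function. Everything below is hypothetical: the library contents, the file names, the `plan_animation` function, and the assumption that placement and camera steps follow both branches are illustrative only and are not taken from CONFUCIUS itself.

```python
from typing import Dict, List

# Hypothetical animation library: event predicate -> pre-created keyframe file.
ANIMATION_LIBRARY: Dict[str, str] = {
    "walk": "walk.wrl",
    "jump": "jump.wrl",
    "wave": "wave.wrl",
}


def plan_animation(predicate: str, place: str) -> List[str]:
    """Return the ordered processing steps for one event predicate."""
    steps: List[str] = []
    if predicate in ANIMATION_LIBRARY:
        # "Y" branch: a basic motion exists, so the controller loads it.
        steps.append(f"animation controller: load {ANIMATION_LIBRARY[predicate]}")
    else:
        # "N" branch: no basic motion matches, so fall back to the user.
        steps.append(f"user interaction: ask how to animate '{predicate}'")
    # Assumed here: placement and camera steps run after either branch.
    steps.append(f"environment placement: put actor in '{place}'")
    steps.append("camera controller: place camera, apply cinematic rules")
    steps.append("output: VRML file of the virtual story world")
    return steps


for step in plan_animation("walk", "kitchen"):
    print(step)
```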

Multiple animation channels
§ Third-level human animation modelling languages (VHML, STEP) provide a facility to specify both sequential and parallel temporal relations.
§ Simultaneous animations cause a Dining Philosophers problem for higher-level animation that uses predefined animation data: multiple animations may request access to the same body parts at the same time.
§ Multiple animation channels allow a character to run several animations at once, e.g. walking with the lower body while waving with the upper body.
§ To avoid conflicts, one channel is often disabled while a specific animation is playing on another channel.

Involved joints / Animations

Animation      sacroiliac   l_hip   r_hip   …   r_shoulder
walk           2            2       2       …   1
jump           2            2       2       …   1
wave           0            0       0       …   2
run            2            2       2       …   1
scratch head   0            0       0       …   2
sit            2            2       2       …   1
…              …            …       …       …   …
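
A table like the one on this slide can drive a simple conflict check. The sketch below copies the values shown for the four visible joints and omits the elided ones. The slide does not define the values, so the reading used here is an assumption: 2 means the animation requires the joint, 1 means it uses the joint but can be overridden, and 0 means the joint is not involved. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Joints visible on the slide; the table elides the others.
JOINTS: Tuple[str, ...] = ("sacroiliac", "l_hip", "r_hip", "r_shoulder")

# Values copied from the slide. ASSUMPTION: 2 = required, 1 = used but
# overridable, 0 = not involved. The slide itself does not define them.
INVOLVEMENT: Dict[str, Tuple[int, ...]] = {
    "walk": (2, 2, 2, 1),
    "jump": (2, 2, 2, 1),
    "wave": (0, 0, 0, 2),
    "run": (2, 2, 2, 1),
    "scratch head": (0, 0, 0, 2),
    "sit": (2, 2, 2, 1),
}


def conflicts(a: str, b: str) -> List[str]:
    """Return the joints that both animations require (value 2 in both rows)."""
    rows = zip(JOINTS, INVOLVEMENT[a], INVOLVEMENT[b])
    return [joint for joint, x, y in rows if x == 2 and y == 2]


def can_run_together(a: str, b: str) -> bool:
    """Two animations may share channels if no joint is required by both."""
    return not conflicts(a, b)


print(can_run_together("walk", "wave"))   # lower body vs. upper body
print(conflicts("walk", "run"))           # both need the hips
```

Under this reading, walking and waving can play on separate channels, while walking and running compete for the same lower-body joints and one of them has to be disabled.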

Space sites of virtual humans
§ Types of virtual objects
  - Small props, manipulated by hands or feet, e.g. cup, hat, ball
  - Big props, sources or targets of actions, e.g. table, chair, tree
  - Stage props, which have internal structure, e.g. house, restaurant, chapel
§ Site tags of virtual humans
  - For manipulating small props: 6 sites on the hands (three for each hand), one site on the head (skull_tip), and one site on each foot tip.
  - For big prop placement: 5 sites indicating five directions around the human body: x_front, x_back, x_left, x_right, x_bottom. Big props such as a table or chairs are usually placed at these positions.
  - For stage prop setting: 5 more space tags indicating places further away: far_front, far_back, far_left, far_right, far_top. Stage props (e.g. a house) are often located at these far sites.
[Figure labels for hand sites: grip, pincer grip; pushing; pointing]
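
The mapping from prop type to candidate sites can be sketched as a lookup table. The names skull_tip, x_* and far_* come from the slide. The hand and foot site names are hypothetical placeholders (the slide gives only their count), and `candidate_sites` is an illustrative helper.

```python
from typing import Dict, List

# HYPOTHETICAL names for the six hand sites and two foot-tip sites; the
# slide states how many there are but does not name them.
HAND_SITES: List[str] = [
    f"{side}_hand_{use}" for side in ("l", "r") for use in ("grip", "push", "point")
]
FOOT_SITES: List[str] = ["l_foot_tip", "r_foot_tip"]

SITES: Dict[str, List[str]] = {
    # Small props attach to hands, head, or feet.
    "small": HAND_SITES + ["skull_tip"] + FOOT_SITES,
    # Big props sit in one of five directions around the body.
    "big": ["x_front", "x_back", "x_left", "x_right", "x_bottom"],
    # Stage props are placed at the five far sites.
    "stage": ["far_front", "far_back", "far_left", "far_right", "far_top"],
}


def candidate_sites(prop_type: str) -> List[str]:
    """Return the site tags at which a prop of this type may be placed."""
    if prop_type not in SITES:
        raise ValueError(f"unknown prop type: {prop_type!r}")
    return SITES[prop_type]


print(candidate_sites("big"))
```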

Virtual object manipulation
Two approaches to organizing the knowledge required for successful grasping:
1. Store the applicable objects in the animation file of an action, and use lexical knowledge of nouns to infer hypernymy relations between objects.
2. Include the manipulation hand postures and movements within the object description, alongside the object's intrinsic properties. Such objects can describe in detail their functionality and their possible interactions with virtual humans.

4 stored hand postures for interacting with 3D objects:
§ index pointing (press a button)
§ pincer grip (use thumb and index finger to pick up small objects)
§ grip (hold a cup handle, a knob, a bottle)
§ palm push (push a piece of furniture)
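
The first approach can be sketched as a walk up a hypernym chain until a class listed for some posture is found. The hypernym links below are toy data standing in for a lexical resource such as WordNet, and the posture-to-class table and function names are hypothetical; only the four posture names come from the slide.

```python
from typing import Dict, List, Optional, Set

# Toy hypernym links (noun -> more general noun); illustrative data only.
HYPERNYM: Dict[str, str] = {
    "cup": "container",
    "bottle": "container",
    "button": "control",
    "coin": "small_object",
    "sofa": "furniture",
}

# HYPOTHETICAL per-action data: object classes each stored posture applies to.
POSTURE_OBJECTS: Dict[str, Set[str]] = {
    "index pointing": {"control"},
    "pincer grip": {"small_object"},
    "grip": {"container"},
    "palm push": {"furniture"},
}


def hypernym_chain(noun: str) -> List[str]:
    """Return the noun followed by its successively more general classes."""
    chain = [noun]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def posture_for(noun: str) -> Optional[str]:
    """Pick the first posture whose applicable classes include the noun or a hypernym."""
    for cls in hypernym_chain(noun):
        for posture, classes in POSTURE_OBJECTS.items():
            if cls in classes:
                return posture
    return None


print(posture_for("cup"))
```

The benefit of this arrangement is that an action file only needs to list general classes; a new noun becomes graspable as soon as the lexical resource places it under one of them.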

Conclusion
§ We classified virtual human representation languages into four levels of abstraction.
§ CONFUCIUS is an overall framework for intelligent multimedia storytelling. It combines 3D modelling/animation techniques with natural language understanding technologies to achieve higher-level virtual human animation.
§ A number of current projects are based on virtual human animation and work in various application domains, but few of them take the modern NLP approach on which a high-level human animation system should be based.
§ The value of CONFUCIUS lies in generating 3D animation from natural language by automating language parsing, semantic representation, and animation production.
§ Potential application areas: computer games, animation production and direction, multimedia presentation, shared virtual worlds.
§ Future work: coordination and synchronization of multiple virtual humans.