Spatial and Planning Models of ASL Classifier Predicates

Spatial and Planning Models of ASL Classifier Predicates for Machine Translation
Matt Huenerfauth
10th International Conference on Theoretical and Methodological Issues in Machine Translation, October 4, 2004, Baltimore, MD, USA
Computer and Information Science, University of Pennsylvania
Research Advisors: Mitch Marcus & Martha Palmer

Motivations and Applications
• Only half of Deaf high school graduates (age 18+) can read English at a fourth-grade (age 10) level, despite ASL fluency.
• Many Deaf accessibility tools forget that English is a second language for these students (and has a different structure).
• Applications for a Machine Translation System:
– TV captioning, teletype telephones.
– Computer user-interfaces in ASL.
– Educational tools, access to information/media.
– Transcription, storage, and transmission of ASL.

Input / Output
What’s our input? English text.
What’s our output? ASL has no written form. Imagine a 3D virtual reality human being, one that can perform sign language. But this character needs a set of instructions telling it how to move!
Our job: English → these instructions.
(Image: VCom3D)

Off-the-Shelf Virtual Humans
Photos: Seamless Solutions, Inc.; Simon the Signer (Bangham et al. “Signing for the Deaf Using Virtual Humans,” IEE 2000); Vcom3D Corporation.

ASL Linguistics • Some ASL sentences: structure similar to that of spoken/written languages. • Other ASL sentences: use space around signer to topologically describe the 3D layout of a scene under discussion. – The hands indicate the movement and location of entities in the scene. – Called “Classifier Predicates.”

Example Classifier Predicate
The car parked between the cat and the house.
[Figure: timeline of the signer's Gaze, Right hand, and Left hand; the scene shows Loc#1, Loc#2, and Loc#3.]
1. Gaze: at viewer. Right: sign: HOUSE.
2. Gaze: Loc#1. Right: to Loc#1.
3. Gaze: at viewer. Right: sign: CAT.
4. Gaze: Loc#3. Right: to Loc#3.
5. Gaze: at viewer. Right: sign: CAR.
6. Gaze: eyes follow right hand. Right: path of car, stop at Loc#2. Left: to Loc#2.
Note: Facial expression, head tilt, and shoulder tilt not included in this example.

Previous ASL MT Systems • Few ASL corpora exist, so there are no statistical systems. • Previous direct and transfer systems are only partial solutions. – Some produce only Signed English, not ASL. – None handle the spatial aspects of ASL. • All ignore classifier predicates.

We can’t ignore CPs • CPs are needed to convey many concepts. • Signers use CPs frequently.* • CPs are needed for some important applications: – ASL user-interfaces – literacy educational software
* Morford and McFarland. 2003. “Sign Frequency Characteristics of ASL.” Sign Language Studies 3:2.

Focus and Assumptions • Focus of this approach: producing classifier predicates of movement and location. • Part of a larger project* to develop a multi-path English-ASL MT architecture – Direct/transfer paths: most sentences. – This path: produce Classifier Predicates.
* Huenerfauth, M. 2004. “A Multi-Path Architecture for English-to-ASL MT.” HLT-NAACL Student Workshop.

ASL Classifier Predicate Models

Overall Architecture
[Diagram components: English Sentence; Pred-Arg Structure; 3D Animation Planning Operator; 3D Animation of the Event; CP Discourse; CP Semantics; CP Syntax; CP Phonology]

CP Translation Models Discussed
• Scene Visualization
• Discourse
• Semantics
• Syntax
• Phonology (we’ll talk about this one first)

Overall Architecture: Phonological Model
Body Parts Moving Through Space: “Articulators”
[Diagram: the overall architecture, from English Sentence to CP Phonology]

ASL Phonetics/Phonology • “Phonetic” Representation of Output – Hundreds of animation joint angles. • Traditional ASL Phonological Models – Hand: shape, orientation, location, movement – Some specification of non-manual features. – Tailored to non-CP output: Difficult to specify complex motion paths. CPs don’t use as many handshapes and orientation patterns.

Example Classifier Predicate
The car parked between the cat and the house.
[Figure: timeline of the signer's Gaze, Right hand, and Left hand.]
1. Gaze: at viewer. Right: sign: HOUSE.
2. Gaze: Location #1. Right: to Loc #1.
3. Gaze: at viewer. Right: sign: CAT.
4. Gaze: Location #3. Right: to Loc #3.
5. Gaze: at viewer. Right: sign: CAR.
6. Gaze: eyes follow right hand. Right: path of car, stop at Loc #2. Left: to Location #2.
Note: Facial expression, head tilt, and shoulder tilt not included in this example.

Phonological Model
• What is the output?
– Abstract model of (somewhat) independent body parts.
• “Articulators”
– Dominant Hand (Right)
– Non-Dominant Hand (Left)
– Eye Gaze
– Head Tilt
– Shoulder Tilt
– Facial Expression
What information do we specify for each of these?

Values for Articulators • Dominant Hand, Non-Dominant Hand – 3D point in space in front of the signer – Palm orientation – Hand shape (finite set of standard shapes) • Eye Gaze, Head Tilt – 3D point in space at which they are aimed.
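The articulator values listed above could be held in a small record type. The following is only a sketch: the class and field names are hypothetical, the palm orientation is assumed to be a direction vector, and shoulder tilt and facial expression are left out because the slide does not say what values they take.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class HandSpec:
    """Values the slide lists for a hand articulator."""
    location: Point3D          # 3D point in space in front of the signer
    palm_orientation: Point3D  # assumed here to be a direction vector
    handshape: str             # one of a finite set of standard shapes


@dataclass
class AimSpec:
    """Value the slide lists for eye gaze and head tilt."""
    target: Point3D            # 3D point at which the articulator is aimed


@dataclass
class ArticulatorFrame:
    """One moment of output; None means the articulator is unspecified."""
    dominant_hand: Optional[HandSpec] = None
    non_dominant_hand: Optional[HandSpec] = None
    eye_gaze: Optional[AimSpec] = None
    head_tilt: Optional[AimSpec] = None


# Hypothetical coordinates, for illustration only.
frame = ArticulatorFrame(
    dominant_hand=HandSpec((0.1, 0.2, 0.3), (0.0, -1.0, 0.0), "Sideways 3"),
    eye_gaze=AimSpec((0.1, 0.2, 0.3)),
)
```

Keeping each articulator optional reflects the slide's point that the body parts are (somewhat) independent: a frame can specify the right hand and gaze while leaving the left hand alone.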

Overall Architecture: Scene Visualization Approach
Converting an English sentence into a 3D animation of an event.
[Diagram: the overall architecture, from English Sentence to CP Phonology]

Previously-Built Technology • AnimNL System – Virtual reality model of 3D scene. – Input: English sentences that tell the characters/objects in the scene what to do. – Output: An animation in which the characters/objects obey the English commands.
Bindiganavale, Schuler, Allbeck, Badler, Joshi, & Palmer. 2000. “Dynamically Altering Agent Behaviors Using Nat. Lang. Instructions.” Int'l Conf. on Autonomous Agents.
Related Work: Coyne and Sproat. 2001. “WordsEye: An Automatic Text-to-Scene Conversion System.” SIGGRAPH-2001. Los Angeles, CA.

How It Works
[Diagram components: English Sentence; Pred-Arg Structure; 3D Animation Planning Operator; 3D Animation of the Event]
We won’t discuss all the details, but one part of the process is important to understand. (We’ll come back to it later.)

Example Step 1: Analyzing English Input
The car parked between the cat and the house.
• Syntactic analysis.
• Identify word senses: e.g. park-23
• Identify discourse entities: car, cat, house.
• Predicate Argument Structure
– Predicate: park-23
– Agent: the car
– Location: between the cat and the house

Example Step 2: AnimNL builds 3D scene

Overall Architecture: Discourse Model
[Diagram: the overall architecture, from English Sentence to CP Phonology]

Discourse Model Motivations • Preconditions for Performing a CP – (Entity is the current topic) OR (Starting point of this CP is the same as the ending point of a previous CP) • Effect of a CP Performance – (Entity is topicalized) AND (assigned a 3D location) • Discourse Model must record: – topicalized status of each entity – whether a point has been assigned to an entity – whether entity has moved in the virtual reality since the last time the signer showed its location with a CP

Discourse Model • Topic(x) – X is the current topic. • Identify(x) – X has been associated with a location in space. • Position(x) – X has not moved since the last time that it was placed using a CP.

Example Step 3: Setting up Discourse Model
CAR: __ Topic? __ Location Identified? __ Still in Same Position?
HOUSE: __ Topic? __ Location Identified? __ Still in Same Position?
CAT: __ Topic? __ Location Identified? __ Still in Same Position?
• Model includes a subset of the entities in the 3D scene: those mentioned in the text.
• All values initially set to false for each entity.
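A minimal sketch of this discourse model, assuming one record of three boolean flags per mentioned entity. The names `EntityState`, `new_discourse_model`, `record_cp`, and `entity_moved` are hypothetical, and the slides do not say whether topicalizing one entity clears the topic flag of the others, so this sketch leaves the other entities untouched.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class EntityState:
    topic: bool = False       # Topic(x): x is the current topic
    identified: bool = False  # Identify(x): x has a location in space
    positioned: bool = False  # Position(x): x has not moved since last placed


def new_discourse_model(mentioned: Iterable[str]) -> Dict[str, EntityState]:
    """Only entities mentioned in the text are tracked; all flags start False."""
    return {name: EntityState() for name in mentioned}


def record_cp(model: Dict[str, EntityState], name: str) -> None:
    """Effect of performing a CP: the entity is topicalized and given a location."""
    state = model[name]
    state.topic = True
    state.identified = True
    state.positioned = True


def entity_moved(model: Dict[str, EntityState], name: str) -> None:
    """The entity moved in the 3D scene, so its signed position is now stale."""
    model[name].positioned = False


model = new_discourse_model(["CAR", "HOUSE", "CAT"])
record_cp(model, "CAT")
```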

Overall Architecture: Semantic Model
Invisible 3D Placeholders: “Ghosts”
[Diagram: the overall architecture, from English Sentence to CP Phonology]

Semantic Model • 3D representation of the arrangement of invisible placeholder objects in space • These “ghosts” will be positioned based on the 3D virtual reality scene coordinates • Choose the details, viewpoint, and timescale of the virtual reality scene for use by CPs

Example Step 4: Producing Ghost Scene
[Figure: ghost scene with placeholders labeled HOUSE, CAR, and CAT]

Overall Architecture: Syntactic Model
Planning-Based Generation of CPs
[Diagram: the overall architecture, from English Sentence to CP Phonology]

CP Templates • Recent linguistic analyses of CPs suggest that they can be generated by: – Storing a lexicon of CP templates. – Selecting a template that expresses the proper semantics and/or shows proper 3D movement. – Instantiating the template by filling in the relevant 3D locations in space.
Liddell, S. 2003. Grammar, Gesture, and Meaning in ASL. Cambridge University Press.
Huenerfauth, M. 2004. “Spatial Representation of Classifier Predicates for MT into ASL.” Workshop on Representation and Processing of Signed Languages, LREC-2004.

Animation Planning Process • This mechanism is actually analogous to how the AnimNL system generates 3D virtual reality scenes from English text. – Stores templates of prototypical animation movements (as hierarchical planning operators) – Select a template based on English semantics – Use planning process to work out preconditions and effects to produce a 3D animation of event

Example Database of Templates
Templates shown: WALKING-UPRIGHT-FIGURE, MOVING-MOTORIZED-VEHICLE, LOCATE-BULKY-OBJECT, TWO-APPROACHING-UPRIGHT-FIGURES, LOCATE-SEATED-HUMAN, PARKING-VEHICLE.
PARKING-VEHICLE
Parameters: g0 (ghost car parking), g1..gN (other ghosts)
Restrictions: g0 is a vehicle
Preconditions: topic(g0) or (ident(g0) and positioned(g0))
    for g=g1..gN: (ident(g) and positioned(g))
Articulator: Right Hand
Location: Follow_location_of( g0 )
Orientation: Direction_of_motion_path( g0 )
Handshape: “Sideways 3”
Effects: positioned(g0), topic(g0), express(park-23 ag: g0 loc: g1..gN)
Concurrently: PLATFORM(g0.loc.final), EYETRACK(g0)

Example Step 5: Initial Planner Goal
• Planning starts with a “goal.”
• Express the semantics of the sentence:
– Predicate: PARK-23
– Agent: “the car” discourse entity
• We know from lexical information that this “car” is a vehicle (some special CPs may apply)
– Location: 3D position calculated “between” locations for “the cat” and “the house.”
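The goal above could be written down as a small record. This is a sketch only: the slides do not say how the “between” position is calculated, so the midpoint used here is an assumption, and the ghost coordinates and the `between` helper are hypothetical.

```python
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


def between(a: Point3D, b: Point3D) -> Point3D:
    """Assumed reading of "between": the midpoint of the two locations."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


# Hypothetical ghost coordinates, for illustration only.
ghosts: Dict[str, Point3D] = {
    "HOUSE": (-1.0, 0.0, 2.0),
    "CAT": (1.0, 0.0, 2.0),
}

# The planner's initial goal: express park-23 with the car as agent.
goal = {
    "predicate": "park-23",
    "agent": "CAR",
    "agent_is_vehicle": True,  # from lexical information
    "location": between(ghosts["CAT"], ghosts["HOUSE"]),
}
```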

Example Step 6: Select Initial CP Template
PARKING-VEHICLE
Parameters: g_0, g_1, g_2 (ghost car & nearby objects)
Restrictions: g_0 is a vehicle
Preconditions: topic( g_0 ) or ( ident( g_0 ) and position( g_0 ))
    (ident( g_1 ) and position( g_1 ))
    (ident( g_2 ) and position( g_2 ))
Articulator: Right Hand
Location: Follow_location_of( g_0 )
Orientation: Direction_of_motion_path( g_0 )
Handshape: “Sideways 3”
Effects: position( g_0 ), topic( g_0 ), express(park-23 agt: g_0 loc: g_1, g_2 )
Concurrently: PLATFORM( g_0.loc.final ), EYETRACK( g_0 )

Example Step 7: Instantiate the Template
PARKING-VEHICLE
Parameters: CAR, HOUSE, CAT
Restrictions: CAR is a vehicle
Preconditions: topic(CAR) or (ident(CAR) and position(CAR))
    (ident(CAT) and position(CAT))
    (ident(HOUSE) and position(HOUSE))
Articulator: Right Hand
Location: Follow_location_of( CAR )
Orientation: Direction_of_motion_path( CAR )
Handshape: “Sideways 3”
Effects: position(CAR), topic(CAR), express(park-23 agt: CAR loc: HOUSE, CAT )
Concurrently: PLATFORM(CAR.loc.final), EYETRACK(CAR)
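Instantiation amounts to substituting discourse entities for the template's parameters. The sketch below assumes a plain dictionary encoding of the PARKING-VEHICLE template: the encoding, the `substitute` and `instantiate` helpers, and the simplification of `PLATFORM(g_0.loc.final)` to `("PLATFORM", "g_0")` are illustrative choices, not the system's actual representation. Each precondition is a list of alternatives, and each alternative is a list of literals that must all hold.

```python
from typing import Any, Dict

PARKING_VEHICLE: Dict[str, Any] = {
    "name": "PARKING-VEHICLE",
    "parameters": ("g_0", "g_1", "g_2"),
    "restrictions": [("is_vehicle", "g_0")],
    "preconditions": [
        [[("topic", "g_0")], [("ident", "g_0"), ("position", "g_0")]],
        [[("ident", "g_1"), ("position", "g_1")]],
        [[("ident", "g_2"), ("position", "g_2")]],
    ],
    "articulator": "Right Hand",
    "location": ("Follow_location_of", "g_0"),
    "orientation": ("Direction_of_motion_path", "g_0"),
    "handshape": "Sideways 3",
    "effects": [
        ("position", "g_0"),
        ("topic", "g_0"),
        ("express", "park-23", "g_0", "g_1", "g_2"),
    ],
    # Simplified: the slide writes PLATFORM(g_0.loc.final).
    "concurrently": [("PLATFORM", "g_0"), ("EYETRACK", "g_0")],
}


def substitute(obj: Any, bindings: Dict[str, str]) -> Any:
    """Recursively replace parameter names with the entities bound to them."""
    if isinstance(obj, str):
        return bindings.get(obj, obj)
    if isinstance(obj, (list, tuple)):
        return type(obj)(substitute(item, bindings) for item in obj)
    if isinstance(obj, dict):
        return {key: substitute(value, bindings) for key, value in obj.items()}
    return obj


def instantiate(template: Dict[str, Any], *entities: str) -> Dict[str, Any]:
    """Bind the template's parameters, in order, to the given entities."""
    bindings = dict(zip(template["parameters"], entities))
    return substitute(template, bindings)


parking_car = instantiate(PARKING_VEHICLE, "CAR", "HOUSE", "CAT")
```

Because `substitute` builds new containers, the stored template is left unchanged and can be instantiated again for another sentence.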

Example Step 7: Instantiate the Template
PARKING-VEHICLE
Parameters: CAR, HOUSE, CAT
Restrictions: CAR is a vehicle
Preconditions: topic(CAR) or (ident(CAR) and position(CAR))
    (ident(CAT) and position(CAT))
    (ident(HOUSE) and position(HOUSE))
Gaze: Eyes follow right hand.
Right: Path of car, stop at Loc#2.
Left: To Loc#2
Effects: position(CAR), topic(CAR), express(park-23 agt: CAR loc: HOUSE, CAT )

Example Step 8: Begin Planning Process
PARKING-VEHICLE
Parameters: CAR, HOUSE, CAT
Restrictions: CAR is a vehicle
Preconditions: topic(CAR) or (ident(CAR) and position(CAR))
    (ident(CAT) and position(CAT))
    (ident(HOUSE) and position(HOUSE))
Gaze: Eyes follow right hand.
Right: Path of car, stop at Loc#2.
Left: To Loc#2
Effects: position(CAR), topic(CAR), express(park-23 agt: CAR loc: HOUSE, CAT )

Example Other Templates in the Database
• We’ve seen these:
– PARKING-VEHICLE
– PLATFORM
– EYEGAZE
• There are also these:
– LOCATE-STATIONARY-ANIMAL
– LOCATE-BULKY-OBJECT
– MAKE-NOUN-SIGN

Example Step 9: Planning Continues…
PARKING-VEHICLE
Parameters: CAR, HOUSE, CAT
Restrictions: CAR is a vehicle
Preconditions: topic(CAR) or (ident(CAR) and position(CAR))
    (ident(CAT) and position(CAT))
    (ident(HOUSE) and position(HOUSE))
Gaze: Eyes follow right hand.
Right: Path of car, stop at Loc#2.
Left: To Loc#2
Effects: position(CAR), topic(CAR), express(park-23 agt: CAR loc: HOUSE, CAT )

LOCATE-STATIONARY-ANIMAL
Parameters: CAT
Restrictions: CAT is an animal
Preconditions: topic(CAT)
Gaze: Eyes at Cat Location.
Right: Move to Cat Location.
Left:
Effects: topic(CAT), position(CAT), ident(CAT)

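Example Step 9: Planning Continues…
[Plan diagram: operators and the conditions shown with them]
1. MAKE-NOUN: “HOUSE”, then LOCATE-BULKY-OBJECT. Conditions: topic(HOUSE), identify(HOUSE), position(HOUSE).
2. MAKE-NOUN: “CAT”, then LOCATE-STATNRY-ANIMAL. Conditions: topic(CAT), identify(CAT), position(CAT).
3. MAKE-NOUN: “CAR”, then PARKING-VEHICLE with EYEGAZE and PLATFORM (concurrently). Conditions: topic(CAR), identify(CAR).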
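The way unmet preconditions pull further templates into the plan can be sketched as follows. This is a hand-written illustration for the example sentence, not a general hierarchical planner and not the system's implementation. Three things are assumptions: that MAKE-NOUN makes its entity the topic (the slides do not define it), that only one entity is the topic at a time, and that LOCATE-BULKY-OBJECT has the same precondition and effects as the LOCATE-STATIONARY-ANIMAL template shown earlier.

```python
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str]  # (predicate, entity), e.g. ("topic", "CAR")

# Which locating template applies to which entity (from the example).
LOCATE_TEMPLATE: Dict[str, str] = {
    "HOUSE": "LOCATE-BULKY-OBJECT",
    "CAT": "LOCATE-STATIONARY-ANIMAL",
}


def set_topic(state: Set[Fact], entity: str) -> None:
    """Assumption: making one entity the topic replaces any previous topic."""
    state.difference_update({f for f in state if f[0] == "topic"})
    state.add(("topic", entity))


def make_noun(state: Set[Fact], plan: List[str], entity: str) -> None:
    """Assumed effect of MAKE-NOUN: the signed noun becomes the topic."""
    set_topic(state, entity)
    plan.append(f'MAKE-NOUN: "{entity}"')


def locate(state: Set[Fact], plan: List[str], entity: str) -> None:
    """Precondition topic(x); effects topic(x), position(x), ident(x)."""
    if ("topic", entity) not in state:
        make_noun(state, plan, entity)
    state.update({("ident", entity), ("position", entity)})
    plan.append(f"{LOCATE_TEMPLATE[entity]}({entity})")


def park(state: Set[Fact], plan: List[str], vehicle: str, others: List[str]) -> None:
    """PARKING-VEHICLE: satisfy each unmet precondition, then emit the CP."""
    for g in others:
        if not {("ident", g), ("position", g)} <= state:
            locate(state, plan, g)
    placed = {("ident", vehicle), ("position", vehicle)} <= state
    if ("topic", vehicle) not in state and not placed:
        make_noun(state, plan, vehicle)
    set_topic(state, vehicle)
    state.add(("position", vehicle))
    plan.append(f"PARKING-VEHICLE({vehicle})")


state: Set[Fact] = set()  # all discourse flags start out false
plan: List[str] = []
park(state, plan, "CAR", ["HOUSE", "CAT"])
```

Under these assumptions the resulting plan signs and locates the house, then the cat, and only then signs CAR and performs the parking predicate, which is the order the slides show.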

Example Step 10: Build Phonological Spec
[Figure: the plan laid out over Gaze, Right, and Left tracks]
1. MAKE-NOUN: “HOUSE”. Gaze: at viewer.
2. LOCATE-BULKY-OBJECT. Gaze: at Loc#1.
3. MAKE-NOUN: “CAT”. Gaze: at viewer.
4. LOCATE-STATNRY-ANIMAL. Gaze: at Loc#3.
5. MAKE-NOUN: “CAR”. Gaze: at viewer.
6. PARKING-VEHICLE, with EYEGAZE (gaze: follow car) and PLATFORM.
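As an illustration of this last step, the gaze track in the example can be read off the plan one operator at a time. The mapping below is inferred from this single example and is only a sketch; the step encoding, the `gaze_track` helper, and the `loc_of` table are hypothetical names.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, str]  # (operator name, entity)


def gaze_track(steps: List[Step], loc_of: Dict[str, str]) -> List[str]:
    """Return one gaze value per plan step, following the example's pattern."""
    track: List[str] = []
    for operator, entity in steps:
        if operator == "MAKE-NOUN":
            track.append("at viewer")            # noun signs face the viewer
        elif operator.startswith("LOCATE"):
            track.append(f"at {loc_of[entity]}")  # look where the hand places it
        elif operator == "PARKING-VEHICLE":
            track.append(f"follow {entity.lower()}")  # eyes track the moving hand
        else:
            raise ValueError(f"no gaze rule for operator {operator!r}")
    return track


steps: List[Step] = [
    ("MAKE-NOUN", "HOUSE"),
    ("LOCATE-BULKY-OBJECT", "HOUSE"),
    ("MAKE-NOUN", "CAT"),
    ("LOCATE-STATIONARY-ANIMAL", "CAT"),
    ("MAKE-NOUN", "CAR"),
    ("PARKING-VEHICLE", "CAR"),
]
loc_of = {"HOUSE": "Loc#1", "CAT": "Loc#3"}
gaze = gaze_track(steps, loc_of)
```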

Wrap-Up and Discussion

Wrap-Up • This is the first MT approach proposed for producing ASL Classifier Predicates. • Currently in early implementation phase. • Generation models for ASL CPs – discourse (topicalized/identified/positioned) – semantics (invisible ghosts) – syntax (planning operators) – phonology (simultaneous articulators)

Discussion • ASL as an MT research vehicle – Need for a spatial representation to translate some English-to-ASL sentence pairs. – Virtual reality: intermediate MT representation. – A translation pathway tailored to a specific phenomenon as part of a multi-path system. – Symmetry in use of planning in the analysis and generation sides of the MT architecture.

Questions?