
- Number of slides: 156
Perceptual Categories: Old and gradient, young and sparse. Bob McMurray, University of Iowa, Dept. of Psychology
Collaborators Richard Aslin Michael Tanenhaus David Gow Joe Toscano Cheyenne Munson Meghan Clayards Dana Subik Julie Markant Jennifer Williams The students of the MACLab
Categorization occurs when: 1) discriminably different stimuli… 2) …are treated equivalently for some purposes… 3) …and stimuli in other categories are treated differently.
Categorization Perceptual Categorization • Continuous input maps to discrete categories. • Semantic knowledge plays minor role. • Bottom-up learning processes important.
Categorization Perceptual Categorization • Continuous inputs map to discrete categories. • Semantic knowledge plays less of a role. Categories include: • Faces • Shapes • Words • Colors Exemplars include: • A specific view of a specific face • A variant of a shape • A particular word in a particular utterance • Variation in hue, saturation, lightness
Categorization occurs when: 1) Discriminably different stimuli… 2) …are treated equivalently for some purposes… 3) …and stimuli in other categories are treated differently. Premise For Perceptual Categories this definition largely falls short, and this may be a good thing. Approach Walk through work on speech and category development. Assess this definition along the way.
Overview 1) Speech perception: Discriminably different and categorical perception. 2) Word recognition: exemplars of the same word are not treated equivalently. (+Benefits) 3) Speech Development: phonemes are not treated equivalently. 4) Speech Development (model): challenging other categories treated differently. (+Benefits) 5) Development of Visual Categories: challenging other categories treated differently.
Categorical Perception [Figure: identification (% /pa/) and discrimination curves along a B–P VOT continuum] • Sharp identification of tokens on a continuum. • Discrimination poor within a phonetic category. Subphonemic variation in VOT is discarded in favor of a discrete symbol (phoneme).
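The classic CP pattern on the slide above (sharp identification, poor within-category discrimination) can be sketched in a few lines. The logistic boundary (20 ms) and slope used here are illustrative values, not fitted to any data in this talk.

```python
import math

def p_voiceless(vot_ms, boundary=20.0, slope=1.5):
    """Logistic identification function: probability of hearing /p/.
    boundary and slope are illustrative, not fitted values."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary)))

def discriminability(vot_a, vot_b, boundary=20.0, slope=1.5):
    """Under strong CP, discrimination tracks the difference in category
    labels, not the raw acoustic difference between the tokens."""
    return abs(p_voiceless(vot_a, boundary, slope)
               - p_voiceless(vot_b, boundary, slope))

# Within-category pair (both /b/, 10 ms apart): poorly discriminated
within = discriminability(0, 10)
# Cross-boundary pair (also 10 ms apart): well discriminated
across = discriminability(15, 25)
```

Under this toy model, equal acoustic steps yield very unequal discriminability depending on whether they straddle the boundary, which is the defining CP signature.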
Categorical Perception: Demonstrated across wide swaths of perceptual categorization. Line Orientation (Quinn, 2005); Basic Level Objects (Newell & Bülthoff, 2002); Facial Identity (Beale & Keil, 1995); Musical Chords (Howard, Rosen & Broad, 1992); Signs (Emmorey, McCollough & Brentari, 2003); Color (Bornstein & Korda, 1984); Vocal Emotion (Laukka, 2005); Facial Emotion (Pollak & Kistler, 2002). What's going on?
Categorical Perception Across a category boundary, CP: • enhances contrast. Within a category, CP yields • a loss of sensitivity • a down-weighting of the importance of within-category variation • discarding of continuous detail.
Categorical Perception Across a category boundary, CP: • enhances contrast. Within a category, CP yields • a loss of sensitivity • a down-weighting of the importance of within-category variation • discarding of continuous detail. Categorization occurs when: 1) discriminably different stimuli… 2) …are treated equivalently for some purposes… 3) …and stimuli in other categories are treated differently. Stimuli are not discriminably different. CP: categorization affects perception. Definition: categorization independent of perception. Need a more integrated view…
Perceptual Categorization occurs when: 1) discriminably different stimuli 2) are treated equivalently for some purposes… 3) and stimuli in other categories are treated differently. CP: perception not independent of categorization.
Categorical Perception Across a category boundary, CP: • enhances contrast. Within a category, CP yields • a loss of sensitivity • a down-weighting of the importance of within-category variation • discarding of continuous detail. Is continuous detail really discarded?
Is continuous detail really discarded? Evidence against the strong form of Categorical Perception from psychophysical-type tasks: Goodness ratings: Miller (1994, 1997…); Massaro & Cohen (1983). Discrimination tasks: Pisoni & Tash (1974); Pisoni & Lazarus (1974); Carney, Widin & Viemeister (1977). Training: Samuel (1977); Pisoni, Aslin, Perey & Hennessy (1982). Sidebar: this has never been examined with nonspeech stimuli…
Is continuous detail really discarded? No. Why not? Is it useful?
Online Word Recognition • Information arrives sequentially. • At early points in time, the signal is temporarily ambiguous: "ba…" is consistent with basic, bakery, barrier, bait, barricade, baby. • Later arriving information ("…kery") disambiguates the word.
Input unfolding over time: b… u… tt… e… r. Candidates: beach, butter, bump, putter, dog.
These processes have been well defined for a phonemic representation of the input. But there is considerably less ambiguity if we consider within-category (subphonemic) information. Example: subphonemic effects of motor processes.
Coarticulation Any action reflects future actions as it unfolds. Example: coarticulation. Articulation (lips, tongue…) reflects current, future and past events. Subtle subphonemic variation in speech reflects temporal organization (e.g. the vowel in "net" vs. "neck"). Sensitivity to these perceptual details might yield earlier disambiguation.
Experiment 1 What does sensitivity to within-category detail do? Does within-category acoustic detail systematically affect higher-level language? Is there a gradient effect of subphonemic detail on lexical activation?
Experiment 1 Gradient relationship: systematic effects of subphonemic information on lexical activation. If this gradiency is used, it must be preserved over time. We need a design sensitive to both systematic acoustic detail and the detailed temporal dynamics of lexical activation. McMurray, Tanenhaus & Aslin (2002)
Acoustic Detail Use a speech continuum: more steps yield a better picture of the acoustic mapping. KlattWorks: generate synthetic continua from natural speech. 9-step VOT continua (0–40 ms), 6 pairs of words: beach/peach, bump/pump, bale/pale, bomb/palm, bear/pear, butter/putter. Fillers (l- and sh-words): lock, shoe, lip, sheep, lamp, shark, leg, shell, ladder, ship, leaf, shirt.
Temporal Dynamics How do we tap on-line recognition? With an on-line task: eye movements. Subjects hear spoken language and manipulate objects in a visual world. The visual world includes a set of objects with interesting linguistic properties: a beach, a peach and some unrelated items. Eye movements to each object are monitored throughout the task. (Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995)
Why use eye-movements and the visual world paradigm? • Relatively natural task. • Eye movements generated very fast (within 200 ms of the first bit of information). • Eye movements time-locked to speech. • Subjects aren't aware of eye movements. • Fixation probability maps onto lexical activation.
Task A moment to view the items
Task Bear Repeat 1080 times
Identification Results [Figure: proportion /p/ responses as a function of VOT (0–40 ms): sharp B–P identification curve] High agreement across subjects and items for the category boundary. By subject: 17.25 ± 1.33 ms. By item: 17.24 ± 1.24 ms.
Task [Figure: schematic of fixation proportions (% fixations) over time across trials 1–5; ~200 ms oculomotor delay] Target = bear; Competitor = pear; Unrelated = lamp, ship.
Task [Figure: fixation proportion over time (0–2000 ms) for VOT = 0 (response B) and VOT = 40 (response P)] More looks to the competitor than to unrelated items.
Task Given that the subject heard "bear" and clicked on "bear"… how often was the subject looking at the "pear"? [Figure: predicted fixation-proportion curves for target and competitor over time: categorical vs. gradient effect]
Results [Figure: competitor fixation proportions over time (0–2000 ms since word onset), one curve per VOT step (0–40 ms in 5 ms steps), conditioned on response] Long-lasting gradient effect: seen throughout the timecourse of processing.
[Figure: area under the competitor-fixation curve as a function of VOT (0–40 ms), conditioned on response, with the category boundary marked] Area under the curve: clear effects of VOT (B: p=.017*; P: p<.001***). Linear trend: B: p=.023*; P: p=.002***.
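The area-under-the-curve and linear-trend analyses can be sketched as follows. The fixation values below are made up for illustration; they are not the experiment's data.

```python
def auc(times_ms, fixation_props):
    """Trapezoidal area under a fixation-proportion curve over time."""
    return sum((t1 - t0) * (f0 + f1) / 2.0
               for (t0, f0), (t1, f1) in zip(zip(times_ms, fixation_props),
                                             zip(times_ms[1:], fixation_props[1:])))

def linear_trend(x, y):
    """Ordinary least-squares slope of y on x (the 'linear trend' test
    statistic's effect direction, without the significance test)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

# Hypothetical AUC values for /b/ responses: competitor looks rise with VOT
vots = [0, 5, 10, 15]
aucs = [0.03, 0.04, 0.05, 0.07]   # illustrative numbers only
slope = linear_trend(vots, aucs)  # positive slope = gradient effect
```

A positive slope across VOT steps, computed separately within each response category, is what distinguishes the gradient pattern from a flat, categorical one.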
[Figure: area under the competitor-fixation curve as a function of VOT, unambiguous stimuli only, category boundary marked] Unambiguous stimuli only: clear effects of VOT (B: p=.014*; P: p=.001***). Linear trend: B: p=.009**; P: p=.007**.
Summary Subphonemic acoustic differences in VOT have a gradient effect on lexical activation. • Gradient effect of VOT on looks to the competitor. • Effect holds even for unambiguous stimuli. • Seems to be long-lasting. Consistent with a growing body of work using priming (Andruski, Blumstein & Burton, 1994; Utman, Blumstein & Burton, 2000; Gow, 2001, 2002). Variants from the same category are not treated equivalently: gradations in interpretation are related to gradations in the stimulus.
Extensions Word recognition is systematically sensitive to subphonemic acoustic detail. ✓ Voicing ✓ Laterality, Manner, Place • Natural Speech • Vowel Quality
Extensions Word recognition is systematically sensitive to subphonemic acoustic detail. ✓ Voicing ✓ Laterality, Manner, Place • Natural Speech • Vowel Quality • Metalinguistic Tasks (B/P, L/Sh response buttons)
Extensions Word recognition is systematically sensitive to subphonemic acoustic detail. ✓ Voicing ✓ Laterality, Manner, Place • Natural Speech • Vowel Quality • Metalinguistic Tasks [Figure: competitor fixations (looks to B) as a function of VOT, split by response (B vs. P), with the category boundary marked]
Categorical Perception Within-category detail survives to the lexical level. Abnormally sharp categories may be due to meta-linguistic tasks. There is a middle ground: warping of perceptual space (e.g. Goldstone, 2002). Retain: non-independence of perception and categorization.
Perceptual Categorization occurs when: 1) discriminably different stimuli CP: perception not independent of categorization. 2) are treated equivalently for some purposes… Exp 1: Lexical variants not treated equivalently (gradiency) 3) and stimuli in other categories are treated differently.
Perceptual Categorization occurs when: 1) discriminably different stimuli CP: perception not independent of categorization. 2) are treated equivalently for some purposes… Exp 1: Lexical variants not treated equivalently (gradiency) 3) and stimuli in other categories are treated differently. WHY?
Progressive Expectation Formation Any action reflects future actions as it unfolds. Can within-category detail be used to predict future acoustic/phonetic events? Yes: Phonological regularities create systematic within-category variation. • Predicts future events.
Experiment 3: Anticipation Word-final coronal consonants (n, t, d) assimilate to the place of the following segment: "maroong goose" vs. "maroon duck". Place assimilation -> ambiguous segments that anticipate upcoming material. Input unfolding over time: m… a… r… oo… ng… g… oo… s… Candidates: goose, goat, duck.
Subject hears "select the maroong goose/duck". We should see faster eye movements to "goose" after assimilated consonants.
Results [Figure: fixation proportion to "goose" over time (0–600 ms), assimilated vs. non-assimilated; onset of "goose" plus oculomotor delay marked] Anticipatory effect on looks to the non-coronal.
[Figure: fixation proportion to "duck" over time (0–600 ms), assimilated vs. non-assimilated; onset of "goose" plus oculomotor delay marked] Inhibitory effect on looks to the coronal (duck, p=.024).
Experiment 3: Extensions Possible lexical locus: green/m boat; eight/ape babies. Assimilation creates competition.
Sensitivity to subphonemic detail: • Increase priors on likely upcoming events. • Decrease priors on unlikely upcoming events. • Active temporal integration process. Possible lexical mechanism… NOT treating stimuli equivalently allows within-category detail to be used for temporal integration.
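The prior-raising/prior-lowering idea can be sketched as simple Bayesian re-weighting. The two-word lexicon and the likelihood values are hypothetical, chosen only to mirror the maroong-goose example.

```python
def update_priors(priors, likelihoods):
    """Bayesian re-weighting: multiply each word's prior by the likelihood
    of the observed (assimilated) segment given that word, renormalize."""
    posterior = {w: priors[w] * likelihoods[w] for w in priors}
    z = sum(posterior.values())
    return {w: p / z for w, p in posterior.items()}

# Hypothetical numbers: an assimilated 'maroong' is more likely before a
# non-coronal onset ('goose') than before a coronal one ('duck').
priors = {'goose': 0.5, 'duck': 0.5}
likelihood_of_ng = {'goose': 0.8, 'duck': 0.2}   # assumed values
posterior = update_priors(priors, likelihood_of_ng)
```

Under these toy numbers the posterior shifts toward "goose" before its first segment even arrives, which is the anticipatory effect the eye-movement data show.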
Adult Summary Lexical activation is exquisitely sensitive to within-category detail: gradiency. This sensitivity is useful for integrating material over time. • Progressive facilitation • Regressive ambiguity resolution (ask me about this)
Perceptual Categorization occurs when: 1) discriminably different stimuli 2) are treated equivalently for some purposes… 3) and stimuli in other categories are treated differently. CP: perception not independent of categorization. Exp 1: Lexical variants not treated equivalently (gradiency). Exp 2: non-equivalence enables temporal integration.
Development Historically, work in speech perception has been linked to development. Sensitivity to subphonemic detail forces us to revise our view of development. Use: infants face additional temporal integration problems. No lexicon is available to clean up noisy input: they must rely on acoustic regularities, extracting a phonology from the series of utterances.
Sensitivity to subphonemic detail: for 30 years, virtually all attempts to address this question have yielded categorical discrimination (e.g. Eimas, Siqueland, Jusczyk & Vigorito, 1971). Exception: Miller & Eimas (1996). • Only at extreme VOTs. • Only when habituated to a nonprototypical token.
Use? Nonetheless, infants possess abilities that would require within-category sensitivity. • Infants can use allophonic differences at word boundaries for segmentation (Jusczyk, Hohne & Bauman, 1999; Hohne & Jusczyk, 1994). • Infants can learn phonetic categories from distributional statistics (Maye, Werker & Gerken, 2002; Maye & Weiss, 2004).
Statistical Category Learning Speech production causes clustering along contrastive phonetic dimensions, e.g. voicing / Voice Onset Time: B: VOT ≈ 0 ms; P: VOT ≈ 40–50 ms. Within a category, VOT forms a Gaussian distribution. Result: a bimodal distribution. [Figure: two clusters along the VOT axis, near 0 ms and 40 ms]
To statistically learn speech categories, infants must: • Record frequencies of tokens at each value along a stimulus dimension. • Extract categories from the distribution. [Figure: bimodal frequency distribution over VOT (0–50 ms) with +voice and −voice modes] • This requires the ability to track specific VOTs.
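The two requirements above (record token frequencies, then extract categories) can be sketched with an idealized bimodal VOT distribution. The modes (0 and 45 ms), SDs, bin width, and valley-splitting rule are all illustrative assumptions, not the learning mechanism infants are claimed to use.

```python
import random

def sample_vots(n, seed=1):
    """Draw VOTs from an assumed bimodal mixture: a /b/ mode near 0 ms
    and a /p/ mode near 45 ms (means and SDs are illustrative)."""
    rng = random.Random(seed)
    return [rng.gauss(0, 8) if rng.random() < 0.5 else rng.gauss(45, 8)
            for _ in range(n)]

def vot_histogram(vots, lo=-20, hi=70, width=5):
    """Record token frequencies in 5 ms bins along the VOT dimension."""
    bins = [0] * ((hi - lo) // width)
    for v in vots:
        i = int((v - lo) // width)
        if 0 <= i < len(bins):
            bins[i] += 1
    return bins

def category_boundary(bins, lo=-20, width=5):
    """Extract two categories: find the two modes, then split them at the
    least-populated bin in between (a crude stand-in for clustering)."""
    peak1 = bins.index(max(bins))
    peak2 = max((count, i) for i, count in enumerate(bins)
                if abs(i - peak1) >= 3)[1]
    a, b = sorted((peak1, peak2))
    valley = min(range(a, b + 1), key=lambda i: bins[i])
    return lo + width * valley + width / 2.0

vots = sample_vots(2000)
boundary = category_boundary(vot_histogram(vots))
```

The point of the sketch is the slide's last bullet: the histogram step only works if specific VOT values are tracked, not just category labels.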
Experiment 3 Why no demonstrations of sensitivity? • Habituation: discrimination, not ID; possible selective adaptation; possible attenuation of sensitivity. • Synthetic speech: not ideal for infants. • Single exemplar/continuum: not necessarily a category representation. Reassess the issue with improved methods.
HTPP Head-Turn Preference Procedure (Jusczyk & Aslin, 1995). Infants are exposed to a chunk of language: • words in running speech • a stream of continuous speech (à la the statistical learning paradigm) • a word list. Memory for exposed items (or abstractions) is assessed: • compare listening time between consistent and inconsistent items.
Test trials start with all lights off.
Center Light blinks.
Brings infant’s attention to center.
One of the side-lights blinks.
Beach… When infant looks at side-light… …he hears a word
…as long as he keeps looking.
Methods 7.5-month-old infants exposed to either 4 b-words (bomb, bear, bail, beach) or 4 p-words (palm, pear, pail, peach); 80 repetitions total. Infants form a category of the exposed class of words. Measure listening time on… original words (bear), competitors (pear), and VOTs closer to the boundary (bear*, pear*).
Stimuli constructed by cross-splicing naturally produced tokens of each endpoint. B: M = 3.6 ms VOT; P: M = 40.7 ms VOT. B*: M = 11.9 ms VOT; P*: M = 30.2 ms VOT. B* and P* were judged /b/ or /p/ at least 90% consistently by adult listeners (B*: 97%; P*: 96%).
Novelty or Familiarity? Novelty/familiarity preference varies across infants and experiments. We're only interested in the middle stimuli (b*, p*). Infants were classified as novelty- or familiarity-preferring by performance on the endpoints. Counts: B: 36 novelty, 16 familiarity; P: 21 novelty, 12 familiarity. Within each group, will we see evidence for gradiency?
After being exposed to bear… beach… bail… bomb…, infants who show a novelty effect will look longer for pear than bear. What about in between? [Figure: predicted listening times for bear*: categorical vs. gradient hypotheses]
Experiment 3: Results Novelty infants (B: 36, P: 21). [Figure: listening time (ms) to Target, Target*, Competitor, by exposure (B vs. P)] Target vs. Target*: p<.001. Competitor vs. Target*: p=.017.
Familiarity infants (B: 16, P: 12). [Figure: listening time (ms) to Target, Target*, Competitor, by exposure (B vs. P)] Target vs. Target*: p=.003. Competitor vs. Target*: p=.012.
Infants exposed to /p/ [Figure: listening time (ms) to P, P*, B. Novelty infants (N=21): pairwise differences p=.009**, p=.024*. Familiarity infants (N=12): pairwise differences p=.028*, p=.018*]
Infants exposed to /b/ [Figure: listening time (ms) to B, B*, P. Novelty infants (N=36): p>.1, p>.2, p<.001**. Familiarity infants (N=16): p=.06, p=.15]
Experiment 3 Conclusions Contrary to all previous work: 7.5-month-old infants show gradient sensitivity to subphonemic detail. • Clear effect for /p/. • Effect attenuated for /b/.
Reduced effect for /b/… but: null effect, or the expected result? [Figure: hypothetical listening-time patterns over Bear, Bear*, Pear]
Actual result: [Figure: listening times over Bear, Bear*, Pear] • Category boundary may lie between Bear and Bear*, i.e. between 3 ms and 11 ms (??). • Within-category sensitivity in a different range?
Experiment 4 Same design as Experiment 3, with VOTs shifted away from the hypothesized boundary. Train: bomb, beach, bear, bale at −9.7 ms. Test: B− (bomb, beach, bear, bale; −9.7 ms), B (bomb*, beach*, bear*, bale*; 3.6 ms), P (palm, peach, pear, pail; 40.7 ms).
Familiarity infants (34 infants). [Figure: listening time (ms) to B−, B, P; pairwise differences p=.01**, p=.05*]
Novelty infants (25 infants). [Figure: listening time (ms) to B−, B, P; pairwise differences p=.002**, p=.02*]
Experiment 4 Conclusions • Within-category sensitivity in /b/ as well as /p/. Infants do NOT treat stimuli from the same category equivalently: Gradient.
Perceptual Categorization occurs when: 1) discriminably different stimuli 2) are treated equivalently for some purposes… CP: perception not independent of categorization. Exp 1: Lexical variants not treated equivalently (gradiency). Exp 2: non-equivalence enables temporal integration. Exp 3/4: Infants do not treat category members equivalently. 3) and stimuli in other categories are treated differently.
Experiment 4 Conclusions • Within-category sensitivity in /b/ as well as /p/. Infants do NOT treat stimuli from the same category equivalently: Gradient. Remaining questions: 1) Why the strange category boundary? 2) Where does this gradiency come from?
Experiment 4 Conclusions Remaining questions: 2) Where does this gradiency come from? [Figure: listening time across B−, B, B*, P*, P along the VOT dimension]
Remaining questions: 2) Where does this gradiency come from? Results resemble half a Gaussian… [Figure: listening times across B−, B, B*, P*, P along the VOT dimension]
Remaining questions: 2) Where does this gradiency come from? Results resemble half a Gaussian… and the distribution of VOTs is Gaussian (Lisker & Abramson, 1964). Statistical learning mechanisms?
Remaining questions: 1) Why the strange category boundary? The /b/ results are consistent with (at least) two mappings. [Figure: category mapping strength over VOT for /b/ and /p/] 1) Shifted boundary • Inconsistent with prior literature.
HTPP is a one-alternative task: it asks "B or not-B", not "B or P". [Figure: category mapping strength over VOT, with unmapped space between /b/ and /p/; adult boundary marked] 2) Sparse categories. Hypothesis: sparse categories are a by-product of efficient learning.
Remaining questions: 1) Why the strange category boundary? 2) Where does this gradiency come from? Are both a by-product of statistical learning? Can a computational approach contribute?
Computational Model Mixture of Gaussians model of speech categories. 1) Models the distribution of tokens as a mixture of Gaussian distributions over a phonetic dimension (e.g. VOT). 2) Each Gaussian represents a category; the posterior probability of a VOT ~ activation. 3) Each Gaussian has three parameters: weight (Φ), mean (μ) and standard deviation (σ).
Statistical Category Learning 1) Start with a set of randomly selected Gaussians. 2) After each input, adjust each parameter to find the best description of the input. 3) Start with more Gaussians than necessary: the model doesn't innately know how many categories there are; weights → 0 for unneeded categories.
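A rough sketch of this three-step scheme follows. The update rules here (responsibility-weighted nudges, with the width tracking mean absolute deviation as a stand-in for SD) are illustrative and are not the model's actual equations; the learning rate, seed, and bimodal toy data are assumptions.

```python
import math
import random

def gauss_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def learn_categories(data, k=4, start_sigma=5.0, lr=0.05, seed=2):
    """Online mixture-of-Gaussians sketch: each category has a weight (phi),
    a mean (mu) and a width (sigma). After each token, responsibilities
    drive small updates; weights of unneeded categories decay toward 0."""
    rng = random.Random(seed)
    phi = [1.0 / k] * k                                   # start with extra categories
    mu = [rng.uniform(min(data), max(data)) for _ in range(k)]
    sigma = [start_sigma] * k
    for x in data:
        # posterior responsibility of each category for this token
        resp = [phi[j] * gauss_pdf(x, mu[j], sigma[j]) + 1e-12 for j in range(k)]
        z = sum(resp)
        resp = [r / z for r in resp]
        for j in range(k):
            phi[j] += lr * (resp[j] - phi[j])      # unused categories -> 0
            mu[j] += lr * resp[j] * (x - mu[j])    # mean moves toward the token
            sigma[j] += lr * resp[j] * (abs(x - mu[j]) - sigma[j])
    return phi, mu, sigma

# Bimodal "VOT" input: /b/ tokens near 0 ms, /p/ tokens near 45 ms
data = [0.0, 45.0, 5.0, 40.0, -5.0, 50.0] * 300
phi, mu, sigma = learn_categories(data)
```

With more starting Gaussians than modes, the categories that capture no tokens lose weight, which is the "→ 0 for unneeded categories" behavior the slide describes.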
Overgeneralization • large σ • costly: lose phonetic distinctions…
Undergeneralization • small σ • not as costly: maintain distinctiveness.
To increase the likelihood of successful learning: • err on the side of caution • start with small σ. [Figure: P(success) as a function of starting σ (0–60), 2-category and 3-category models; 39,900 models run]
Small starting σ. Sparseness coefficient: % of space not strongly mapped to any category (unmapped space). [Figure: average sparseness coefficient over training epochs (0–12,000) for starting σ = 0.5–1]
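The sparseness coefficient can be sketched as the fraction of the dimension where no category's activation (weight × density) clears a threshold. The threshold value, range, and category parameters below are assumptions for illustration, not the model's settings.

```python
import math

def gauss_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def sparseness(categories, lo=-20.0, hi=70.0, step=0.5, threshold=0.01):
    """Fraction of the dimension where no category's activation
    (weight * density) exceeds a threshold: the 'unmapped' space.
    categories is a list of (weight, mean, sigma) tuples."""
    n = int((hi - lo) / step)
    unmapped = sum(
        1 for i in range(n)
        if max(phi * gauss_pdf(lo + i * step, mu, sig)
               for phi, mu, sig in categories) < threshold
    )
    return unmapped / n

# Two categories along VOT: small sigmas leave much of the space unmapped,
# large sigmas cover most of it (parameter values are illustrative).
narrow = [(0.5, 0.0, 3.0), (0.5, 45.0, 3.0)]
wide = [(0.5, 0.0, 15.0), (0.5, 45.0, 15.0)]
```

Computing `sparseness(narrow)` vs. `sparseness(wide)` shows the slide's point directly: starting with small σ yields a higher sparseness coefficient, i.e. more of the dimension strongly mapped to no category.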
Start with large σ. [Figure: average sparseness coefficient over training epochs for starting σ = 20–40 vs. 0.5–1]
Intermediate starting σ. [Figure: average sparseness coefficient over training epochs for starting σ = 0.5–1, 3–11, 12–17 and 20–40]
Model Conclusions Continuous sensitivity is required for statistical learning. Statistical learning enhances gradient category structure. To avoid overgeneralization… better to start with small estimates for σ. Small or even medium starting σ => sparse category structure during infancy: much of phonetic space is unmapped. Tokens that are treated differently may not be in different categories.
Perceptual Categorization occurs when: 1) discriminably different stimuli (CP: perception not independent of categorization; Exp 1: lexical variants not treated equivalently: gradiency) 2) are treated equivalently for some purposes… (Exp 2: non-equivalence enables temporal integration; Exp 3/4: infants do not treat category members equivalently; Model: gradiency arises from statistical learning) 3) and stimuli in other categories are treated differently (Model: tokens treated differently are not in different categories: sparseness; Model: sparseness is a by-product of optimal learning).
AEM Paradigm Examination of the sparseness/completeness of categories needs a two-alternative task: treating stimuli equivalently vs. treating stimuli differently; identification, not discrimination. Existing infant methods (habituation, head-turn preference, preferential looking) mostly test discrimination.
AEM Paradigm Exception: Conditioned Head Turn (Kuhl, 1979). • Infant hears a constant stream of distractor stimuli (a… a…). • Conditioned to turn head in response to a target stimulus (i) using a visual reinforcer. • After training, generalization can be assessed. • Approximates a Go/No-Go task.
AEM Paradigm When detection occurs this could be because • Stimulus is perceptually equivalent to target. • Stimulus is perceptually different but member of same category as target. When no detection, this could be because • Stimuli are perceptually different. • Stimuli are in different categories. A solution: the multiple exemplar approach
AEM Paradigm Multiple exemplar methods (Kuhl, 1979; 1983) • Training: single distinction i/a. • Irrelevant variation gradually added (speaker & pitch). • Good generalization. This exposure may mask natural biases: • Infants trained on irrelevant dimension(s). • Infants exposed to expected variation along irrelevant dimension. Infants trained on a single exemplar did not generalize.
AEM Paradigm HTPP, Habituation and Conditioned Head-Turn methods all rely on a single response: criterion effects. Is [one dog] a member of [another dog]'s category? Yes: both dogs; both mammals; both 4-legged animals. No: different breeds; different physical properties. How does the experimenter establish the decision criterion?
AEM Paradigm Multiple responses: Is [the stimulus] a member of [category A] or [category B]? Pug vs. poodle: decision criteria will be based on species-specific properties (hair-type, body-shape). Two-alternative tasks specify criteria without explicitly teaching: • what the irrelevant cues are • their statistical properties (expected variance).
AEM Paradigm Conditioned-Head-Turn provides right sort of response, but cannot be adapted to two-alternatives (Aslin & Pisoni, 1980). • Large metabolic cost in making head-movement. • Requires 180º shift in attention. Could we use a different behavioral response in a similar conditioning paradigm?
AEM Paradigm Eye movements may provide the ideal response. • Smaller angular displacements detectable with computer-based eye-tracking. • Metabolically cheap: quick and easy to generate. How can we train infants to make eye movements to target locations?
AEM Paradigm Infants readily make anticipatory eye movements to regularly occurring visual events: Visual Expectation Paradigm (Haith, Wentworth & Canfield, 1990; Canfield, Smith, Breznyak & Snow, 1997) Movement under an occluder (Johnson, Amso & Slemmer, 2003)
AEM Paradigm Anticipatory Eye-Movements (AEM): Train infants to use anticipatory eye movements as a behavioral label for category identity. • Two alternative response (left-right) • Arbitrary, identification response. • Response to a single stimulus. • Many repeated measures.
AEM Paradigm Each category is associated with the left or right side of the screen. Categorization stimuli followed by visual reinforcer.
AEM Paradigm Delay between stimulus and reward gradually increases throughout experiment. STIMULUS trial 1 REINFORCER STIMULUS trial 30 REINFORCER time Delay provides opportunity for infants to make anticipatory eye-movements to expected location.
AEM Paradigm
AEM Paradigm After training on original stimuli, infants are tested on a mixture of: • new, generalization stimuli (unreinforced) Examine category structure/similarity relative to trained stimuli. • original, trained stimuli (reinforced) Maintain interest in experiment. Provide objective criterion for inclusion
AEM Paradigm Gaze position assessed with an automated, remote eye-tracker. [Diagram: MHT transmitter/receiver, TV, infrared video camera, remote eye-tracker, eye-tracker and MHT control units, eye-tracking computer] Gaze position recorded on standard video for analysis.
Experiment 5 Multidimensional visual categories. Can infants learn to make anticipatory eye movements in response to visual category identity? What is the relationship between basic visual features in forming perceptual categories? • Shape • Color • Orientation
Experiment 5 Train: Shape (yellow square and yellow cross) Test: Variation in color and orientation. Yellow 0º (training values) Orange 10º Red 20º If infants ignore irrelevant variation in color or orientation, performance should be good for generalization stimuli. If infants’ shape categories are sensitive to this variation, performance will degrade.
Experiment 5: Results 9/10 scored better than chance on original stimuli (M = 68.7% correct). [Figure: percent correct for training stimuli and for Yellow 0°, Orange 10°, Red 20°; color n.s., angle p<.05] No effect of color (p>.2). Significant performance deficit due to orientation (p=.002).
Some stimuli are uncategorized (despite very reasonable responses): sparseness. [Figure: sparse region of the input space]
Perceptual Categorization occurs when: 1) discriminably different stimuli (CP: perception not independent of categorization; Exp 1: lexical variants not treated equivalently: gradiency) 2) are treated equivalently for some purposes… (Exp 2: non-equivalence enables temporal integration; Exp 3/4: infants do not treat category members equivalently; Model: gradiency arises from statistical learning) 3) and stimuli in other categories are treated differently (Model: tokens treated differently are not in different categories: sparseness; Model: sparseness is a by-product of optimal learning; Exp 5: shape categories show similar sparse structure).
Occlusion-Based AEM AEM is based on an arbitrary mapping: • an unnatural mechanism drives anticipation • it requires slowly changing the duration of the delay period. Infants do make eye movements to anticipate objects' trajectories under an occluder (Johnson, Amso & Slemmer, 2003). Can infants associate anticipated trajectories (under the occluder) with target identity?
Red Square
Yellow Cross
Yellow Square
Experiment 6 Can AEM assess auditory categorization? Can infants "normalize" for variations in pitch and duration? Or… are infants sensitive to acoustic detail during a lexical identification task?
Training: "teak!" -> rightward trajectory; "lamb!" -> leftward trajectory. Test: lamb & teak with changes in duration (33% and 66% longer) and pitch (20% and 40% higher). If infants ignore irrelevant variation in pitch or duration, performance should be good for generalization stimuli. If infants' lexical representations are sensitive to this variation, performance will degrade.
Training stimulus (lamb)
Experiment 6: Results 20 training trials; 11 of 29 infants performed better than chance. [Figure: proportion of correct trials for training stimuli, D1/P1, D2/P2, separately for duration and pitch] Pitch: p>.1. Duration: p=.002.
Variation in pitch is tolerated for word categories. Variation in duration is not, and the effect takes a gradient form. Again, some stimuli are uncategorized (despite very reasonable responses): sparseness.
Perceptual Categorization occurs when: 1) discriminably different stimuli (CP: perception not independent of categorization; Exp 1: lexical variants not treated equivalently: gradiency) 2) are treated equivalently for some purposes… (Exp 2: non-equivalence enables temporal integration; Exp 3/4: infants do not treat category members equivalently; Model: gradiency arises from statistical learning; Exp 6: gradiency in infant response to duration) 3) and stimuli in other categories are treated differently (Model: tokens treated differently are not in different categories: sparseness; Model: sparseness is a by-product of optimal learning; Exp 5, 6: shape and word categories show similar sparse structure).
Exp 7: Face Categorization Can AEM help understand face categorization? Are facial variants treated equivalently? Train: two arbitrary faces Test: same faces at 0°, 45°, 90°, 180° Facial inversion effect.
Experiment 7: Results [Figure: percent correct for vertical, 45°, 90°, 180° faces] 22/33 successfully categorized vertical faces. • 45°, 180°: chance (p>.2). • 90°: p=.111. • 90° vs. vertical: p<.001. • 90° vs. 45° & 180°: p<.001.
Experiment 7 AEM is useful with faces. The facial inversion effect is replicated. Generalization is not simple similarity (90° vs. 45°): infants' own category knowledge is reflected. Resembles the VOT (b/p) results: within a dimension, some portions are categorized, others are not. Again, some stimuli are uncategorized (despite very reasonable responses): sparseness.
Perceptual Categorization occurs when: 1) discriminably different stimuli (CP: perception not independent of categorization; Exp 1: lexical variants not treated equivalently: gradiency) 2) are treated equivalently for some purposes… (Exp 2: non-equivalence enables temporal integration; Exp 3/4: infants do not treat category members equivalently; Model: gradiency arises from statistical learning; Exp 6: gradiency in infant response to duration) 3) and stimuli in other categories are treated differently (Model: tokens treated differently are not in different categories: sparseness; Model: sparseness is a by-product of optimal learning; Exp 5, 6, 7: shape, word and face categories show similar sparse structure).
Again, some stimuli are uncategorized (despite very reasonable responses): sparseness. Variation tolerated vs. not tolerated: Exp 5 (shapes): color tolerated; orientation not. Exp 6 (words): pitch tolerated; duration not. Exp 7 (faces): 90° rotation not tolerated. Evidence for complex, but sparse categories: some dimensions (or regions of a dimension) are included in the category, others are not.
Infant Summary • Infants show graded sensitivity to continuous speech cues. • /b/-results: regions of unmapped phonetic space. • Statistical approach provides support for sparseness. - Given current learning theories, sparseness results from optimal starting parameters. • Empirical test will require a two-alternative task: AEM • Test of AEM paradigm also shows evidence for sparseness in shapes, words, and faces.
Audience-Specific Conclusions
For speech people:
• Gradiency: continuous information in the signal is not discarded and is useful during recognition.
• Gradiency: infant speech categories are also gradient, a result of statistical learning.
For infant people:
• Methodology: AEM is a useful technique for measuring categorization in infants (bonus: it works with undergrads too).
• Sparseness: through the lens of a 2AFC task (or interactions of categories), categories look more complex.
Perceptual Categorization
1) Discriminably different stimuli…
• CP: discrimination is not distinct from categorization; there is a continuous feedback relationship between perception and categorization.
2) …are treated equivalently for some purposes…
• Gradiency: infants and adults do not treat stimuli equivalently. This property arises from learning processes as well as the demands of the task.
3) …and stimuli in other categories are treated differently.
• Sparseness: infants’ categories do not fully encompass the input. Many tokens are not categorized at all…
Conclusions
• Categorization is an approximation of an underlyingly continuous system: clumps of similarity in stimulus space.
• Categories reflect underlying learning processes and the demands of online processing.
• During development, categorization is not common across the complete perceptual space: small, specific clusters may grow into larger representations.
• This is useful: it avoids overgeneralization.
Take-Home Message
Early, sparse regions of graded similarity space … grow and gain structure … but retain their fundamental gradiency.
Perceptual Categories: Old and gradient, young and sparse. Bob McMurray University of Iowa Dept. of Psychology
[Apparatus diagram: head-mounted eyetracker with IR head-tracker emitters, a head-tracker camera, two eye cameras, and a monitor; the eyetracker computer and subject computer are connected via Ethernet.]
Misperception: Additional Results
• 10 pairs of b/p items (0–35 ms VOT continua).
• 20 filler items (lemonade, restaurant, saxophone…).
• Option to click “X” (mispronounced).
• 26 subjects; 1,240 trials over two days.
Identification Results
[Two plots: response rate (0–1) as a function of VOT (0–35 ms) for voiced, voiceless, and nonword responses; barricade/parricade continuum (top) and barakeet/parakeet continuum (bottom).]
Significant target responses even at the continuum extremes.
Graded effects of VOT on correct response rate.
Phonetic “Garden-Path”
“Garden-path” effect: the difference between looks to each target (b vs. p) at the same VOT.
[Plot: fixations to target (0–1) over time (0–1500 ms) for barricade (VOT = 0, /b/) and parakeet (VOT = 35, /p/).]
Garden-Path Effect
[Two plots: the garden-path effect (barricade minus parakeet fixations) as a function of VOT (0–35 ms), for looks to the target (top) and to the competitor (bottom).]
GP effect: a gradient effect of VOT.
Target: p<.0001. Competitor: p<.0001.
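The garden-path measure can be sketched in a few lines: at each VOT step, subtract fixation proportions to the same picture when it is the competitor from when it is the target. The numbers below are made up for illustration; they are not the study’s data.

```python
# Hypothetical sketch of computing a "garden-path" effect from fixation data.
# A gradient effect means the difference changes smoothly with VOT rather
# than jumping between two values at a category boundary.
vot_steps = [0, 5, 10, 15, 20, 25, 30, 35]

# Simulated mean fixation proportions to the b-initial picture ("barricade")
# on barricade-target trials vs. parakeet-target trials, one value per step.
fix_barricade = [0.80, 0.78, 0.74, 0.65, 0.40, 0.25, 0.18, 0.15]
fix_parakeet  = [0.15, 0.17, 0.22, 0.35, 0.60, 0.72, 0.78, 0.80]

# Garden-path effect: looks given "barricade" minus looks given "parakeet"
# at the same VOT.
gp_effect = [b - p for b, p in zip(fix_barricade, fix_parakeet)]
for vot, gp in zip(vot_steps, gp_effect):
    print(f"VOT {vot:2d} ms: GP effect {gp:+.2f}")
```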
Assimilation: Additional Results
runm picks / runm takes ***
When /p/ follows, the bilabial feature can be attributed to assimilation (not an underlying /m/). When /t/ follows, the bilabial feature is likely to come from an underlying /m/.
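The context-conditional inference here can be made explicit with a toy Bayesian calculation (assumed numbers, not from the talk): a surface bilabial is ambiguous before /p/, where assimilation could have produced it, but diagnostic of an underlying /m/ before /t/, where it could not.

```python
# Toy illustration of the assimilation inference: P(underlying /m/ | heard
# a bilabial, given the following segment). Priors and the assimilation
# rate are hypothetical values chosen for illustration.
def p_underlying_m(next_segment, p_prior_m=0.5, p_assimilate=0.8):
    """Posterior probability that a heard bilabial is an underlying /m/.

    Before a bilabial like /p/, an underlying /n/ can surface as [m] via
    assimilation; before /t/ it cannot, so [m] must be underlying.
    """
    p_bilabial_given_m = 1.0  # an underlying /m/ always surfaces bilabial
    p_bilabial_given_n = p_assimilate if next_segment == "p" else 0.0
    num = p_bilabial_given_m * p_prior_m
    den = num + p_bilabial_given_n * (1 - p_prior_m)
    return num / den

print(round(p_underlying_m("p"), 2))  # → 0.56 (ambiguous before /p/)
print(p_underlying_m("t"))            # → 1.0 (diagnostic before /t/)
```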
Exp 3 & 4: Conclusions
Within-category detail is used in recovering from assimilation: temporal integration.
• Anticipate upcoming material.
• Bias activations based on context.
- Like Exp 2: within-category detail is retained to resolve ambiguity.
Phonological variation is a source of information.
Subject hears “select the mudg gear” / “select the mudg drinker”. Critical pair: mud vs. mug.
[Plot: fixation proportion over time (0–2000 ms), aligned to the onset of “gear” (average offset 402 ms), for the initial-coronal (mud gear) vs. initial-non-coronal (mug gear) conditions.]
“Mudg gear” is initially ambiguous, with a late bias toward “mud”.
[Plot: fixation proportion over time (0–2000 ms), aligned to the onset of “drinker” (average offset 408 ms), for the initial-coronal (mud drinker) vs. initial-non-coronal (mug drinker) conditions.]
“Mudg drinker” is also ambiguous, with a late bias toward “mug” (the /g/ has to come from somewhere).
[Plot: fixation proportion over time, aligned to the onset of “gear”, for assimilated vs. non-assimilated conditions.]
Looks to the non-coronal (gear) following an assimilated or non-assimilated consonant.
In the same stimuli/experiment there is also a progressive effect!
Non-Parametric Approach?
• Competitive Hebbian learning (Rumelhart & Zipser, 1986).
• Not constrained by a particular equation: can fill the space better.
• Similar properties in terms of starting parameters and sparseness.
[Figure: category units arrayed over the VOT dimension.]
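A minimal sketch, in the spirit of Rumelhart & Zipser’s competitive learning, of a winner-take-all Hebbian learner over a 1-D VOT dimension: only the closest unit updates toward each input, so units come to tile the clumps in the data, while units far from any data are never updated, leaving sparse, uncommitted regions. All parameter values are illustrative assumptions.

```python
# Winner-take-all competitive (Hebbian) learning over simulated VOT input.
# Units near the data migrate into the two VOT clumps; a unit placed far
# away never wins and stays put - an uncategorized, "sparse" region.
import random

random.seed(1)
# Bimodal VOT input: voiced near 0 ms, voiceless near 40 ms.
data = [random.gauss(0, 5) for _ in range(500)] + \
       [random.gauss(40, 8) for _ in range(500)]

units = [5.0, 20.0, 35.0, 120.0]  # many candidate categories, spread out
lr = 0.05                         # learning rate
for _ in range(20):
    random.shuffle(data)
    for x in data:
        # The unit closest to the input wins the competition...
        w = min(range(len(units)), key=lambda i: abs(units[i] - x))
        # ...and only the winner moves toward the input (Hebbian update).
        units[w] += lr * (x - units[w])

print([round(u) for u in units])
```

Because no parametric form is imposed, the units can fill the space however the data demand, while the untouched unit illustrates how sparseness arises for free.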