
  • Number of slides: 81

Attention in Psychology: I. Historical Background
• Attention was one of the first concepts to appear in Psychology texts (ca. 1730), e.g., Ebbinghaus, Titchener, …
• Early discussions (Hatfield, 1998) focused on properties such as
  - Narrowing (Aristotle, 4th century BC)
  - Active directing (Lucretius, 1st century AD)
  - Involuntary shifts (Augustine of Hippo, 400 AD)
  - Clarity (Buridan, 14th century)
  - Fixation over time (Descartes, 17th century)
  - Effector sensitivity (Descartes)
  - All the above phenomena (William James, early 1900s)
• More recent studies have been concerned with
  - The view of attention as selection
  - The analysis of attention as a process of resource allocation
  - The study of the relation between voluntary and involuntary control of attention

Attention as Selection
We will concentrate on the Selection or Filtering aspects of attention. We will ask:
1. Why do we need to select anyway?
  - Because our processing capacity is limited? The Big Question: In what way is it limited? (Miller, 1957)
  - We will return to this core question after some preliminaries on the early study of attention as selection and the filter theory.
2. On what basis do we select? Some alternatives:
  - We select according to what is important to us (e.g., affordances)
  - We select what can be described physically (i.e., “channels”)
  - We select based on what can be encoded without accessing LTM
  - We “pick out” things to which we subsequently attach concepts: i.e., we pick out objects (or regions?)
3. What happens to what we have not selected? A largely unsolved mystery (though in some cases there are plausible answers).

Big Question #1: Why do we need to select information?
Along which dimensions is human information processing capacity limited?
• Channel capacity: Shannon-Hartley Theorem
• Capacity measured in some sort of “chunks” (Miller)
• Capacity measured in terms of the number of arguments that can be simultaneously bound to cognitive routines (Newell)
  - To what things in the world can the arguments of visual predicates be bound?

Early studies: Colin Cherry’s “Cocktail Party Problem”
• What determines how well you can select one conversation among several? Why are we so good at it?
• The more controlled version of this study used dichotic presentations: one “channel” per ear.
• Cherry found that when attention is fully occupied in selecting information from one ear (through use of the “shadowing” task), almost nothing is noticed in the “rejected” ear (subjects noticed only whether or not it was speech).
• More careful observations show this was not quite true
  - Change in spectral properties (pitch) is noticed
  - You are likely to notice your name spoken
  - Even meaning is extracted, as shown by involuntary ear switching and the disambiguating effect of rejected channel content

Visual analogues illustrating the two-channel selection problem
In these examples you are to read only the text in shadows and ignore the rest. Read as quickly as you can and when you are finished, close your eyes or look away from the text.

Visual analogue #1 illustrating the two-channel selection problem
In performing an experiment like this one on man attention car it house is boy critically hat important she that candy the old material horse that tree is pen being phone read cow by book the hot subject tape for pin the stand relevant view task sky be read cohesive man and car grammatically house complete boy but hat without shoe either candy being horse so tree easy pen that phone full cow attention book is hot not tape required pin in stand order view to sky read red it nor too difficult.

Visual analogue #2 illustrating the two-channel selection problem
It is important that the subject man be car pushed slightly boy beyond hat his normal limits horse of tree competence pen for be only in phone this cow way book can hot one tape be pin certain stand that snaps he with is his paying teeth attention in to the empty relevant air task and hat minimal shoe attention candy to horse the tree second or peripheral task.

Broadbent’s Filter Theory
[Diagram with boxes labeled: Senses; Very Short Term Store; Filter; Limited Capacity Channel; Rehearsal loop; Store of conditional probabilities of past events (in LTM); Motor planner; Effectors]
Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.

Problems with the Filter Theory
• The filter “leaks.” Work by Treisman, Lackner, and many others shows that the filter could not be eliminating parts of the input using a physically-defined channel, because the properties on the basis of which the input is filtered require a high level of processing (e.g., determination of meaning). Consequently such information must have gotten through the filter!
• Many solutions to this conundrum have been proposed, ranging from replacing the filter with an attenuator, to various complex (and highly incomplete) proposals such as those of Deutsch & Deutsch (1963), Norman (1968), Morton (1969) and Neisser (1967), none of which are satisfactory, but each of which embodies some ideas that may be part of the story.
• What all these alternatives do is assume that the filter is responsive to top-down expectancy and prediction effects. But the evidence is against this sort of knowledge-based selection as a general property of perception (Pylyshyn, 1999), although it is possible within such modular domains as language processing.

Stroop Effect Baseline: Name the colors of the ink

Stroop Effect in Portuguese
Name the colors of the ink
VERMELHO VERDE AZUL MARROM ROSA ALARANJADO VERDE ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL AMARELO ROSA ALARANJADO VERDE AZUL MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO BROWN VERMELHO AZUL MARROM VERDE AMARELO VERMELHO ROSA ALARANJADO VERDE VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL

Stroop Effect in English
Name the colors of the ink
RED GREEN BLUE PINK BROWN ORANGE GREEN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW PINK ORANGE GREEN BLUE BROWN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE GREEN BROWN YELLOW GREEN YELLOW RED PINK ORANGE GREEN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW GREEN YELLOW RED BROWN PINK RED YELLOW GREEN PINK RED YELLOW

Degree of interference with the attended message, as well as its interpretation, shows that the rejected message was understood
• Moral: Although the rejected channel appears to be rejected, it is being processed enough to understand the words!
• The semantic interpretation of the attended message depends on the meaning content of the rejected message. Subjects were asked to paraphrase the attended message in:
  - Channel 1 (attended): “I think I will go down to the bank but I will be back for dinner”
  - Channel 2 (rejected): “The election results will depend on the value of the dollar against the Euro and on the state of the domestic economy”
  - OR Channel 2 (rejected): “The rain has resulted in erosion by the overflowing river”
Lackner, J. R., & Garrett, M. F. (1972). Resolving ambiguity: Effects of biasing context in the unattended ear. Cognition, 1, 359-372.

Amount of information in terms of the information-theoretic measure (entropy)
• Amount of information in a signal depends on how much one’s estimate of the probability of events is changed by the signal.
  $H = -\sum_i p_i \log_2 p_i$ … information in bits
• “One if by land, two if by sea” contains one bit of information if the two possibilities were equally likely, less if they were not (e.g., if one was twice as likely as the other, the information in the message would be $-(\tfrac{1}{3}\log_2\tfrac{1}{3} + \tfrac{2}{3}\log_2\tfrac{2}{3}) = 0.92$ bits)
• The amount of information transmitted depends on the potential amount of information in the message and the amount of correlation between message sent and message received. So information transmitted is a type of correlation measure.
• The information measure assumes an “ideal receiver”. It is the maximum information that could be transmitted, given the statistical properties of messages, assuming that the sender and receiver know the code. This maximum depends on physical properties of the channel: its Channel Capacity.
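The entropy measure on this slide can be checked numerically. The following is a minimal Python sketch; the function name `entropy_bits` is an illustrative choice, not something from the lecture. It evaluates the two cases mentioned above: two equally likely signals, and one signal twice as likely as the other.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)) in bits.

    Events with zero probability contribute nothing to the sum.
    """
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Two equally likely signals ("by land" vs "by sea").
equal = entropy_bits([0.5, 0.5])

# One signal twice as likely as the other.
skewed = entropy_bits([1 / 3, 2 / 3])

print(f"equal: {equal:.2f} bits, skewed: {skewed:.2f} bits")
```

The skewed case carries less than one bit because the more probable signal is less surprising when it arrives.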

Information transmitted in a typical absolute judgment experiment
• Information transmitted in an experiment in which subjects were presented with tones drawn from a known practiced set (of a given size, which determines the value of input information) and had to name the tones from a learned name set.
• The information transmitted was always around 2.5 bits, equivalent to about 6 equiprobable alternatives (2^2.5 ≈ 5.7)!
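As a rough illustration of how "information transmitted" is obtained in such an experiment, the sketch below computes it from a stimulus-by-response table of counts. The function name and the two small tables are hypothetical examples made up for illustration; they are not data from any study.

```python
import math

def transmitted_bits(counts):
    """Information transmitted (mutual information, in bits).

    `counts[i][j]` is how often stimulus i was given response j.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]        # stimulus frequencies
    col_sums = [sum(col) for col in zip(*counts)]  # response frequencies
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                # p(i,j) * log2( p(i,j) / (p(i) * p(j)) )
                bits += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return bits

# Hypothetical table: four tones, always named correctly.
perfect = [[10, 0, 0, 0],
           [0, 10, 0, 0],
           [0, 0, 10, 0],
           [0, 0, 0, 10]]

# Hypothetical table: two tones, responses unrelated to the stimulus.
guessing = [[5, 5],
            [5, 5]]

print(transmitted_bits(perfect))   # all of the input information gets through
print(transmitted_bits(guessing))  # nothing gets through

# Number of equiprobable alternatives equivalent to 2.5 bits.
print(2 ** 2.5)
```

Perfect naming of four equiprobable tones transmits all 2 bits of input information, while guessing transmits none. The plateau described above means that adding more tones raises the input information but not the transmitted information.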

The channel capacity hypothesis implies that the amount of information retained in STM is constant and independent of the type of items
• But it turns out that much more information is retained when the items are drawn from a larger set (e.g., more information can be retained when the input is numerals rather than binary digits, more for letters, more for words, etc.).

Why can we retain vastly different amounts of information just by using a different encoding vocabulary?
• Answer: The architecture of the cognitive system has the property that it can deal with a fixed maximum number of items, regardless of what the items are.
• This property can be exploited to get around the bottleneck of the short-term memory. We do this by recoding the input into a smaller number of discrete units, called chunks.
• There is also evidence that it takes additional time to encode and decode chunks, so the recoding technique is a case of time-capacity tradeoff or what is known in CS as a compute-vs-store tradeoff.
• Newell has a model of the time taken in the Sternberg memory scan experiment that attributes the observed RT to encoding or chunking.

Example of the use of chunking
• To recall a string of binary bits, e.g., 00101110110101001
• People can recall a string of about 8 binary digits. If they learn a binary encoding rule (00→0, 01→1, 10→2, 11→3) they can recall about 8 such chunks or 16 binary bits. If they learn a 3:1 chunking rule (called the Octal number system) they can recall a 24 bit string, etc.
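The recoding rules above can be sketched in a few lines of Python. The helper names `recode` and `decode` and the 24-bit example string are illustrative assumptions, not part of the lecture. The point is that a span of about 8 items holds 8, 16, or 24 bits depending on the chunk size.

```python
def recode(bits, chunk_size):
    """Recode a binary string into chunks, naming each chunk by its numeric value."""
    if len(bits) % chunk_size != 0:
        raise ValueError("string length must be a multiple of the chunk size")
    return [int(bits[i:i + chunk_size], 2) for i in range(0, len(bits), chunk_size)]

def decode(chunks, chunk_size):
    """Expand chunk names back into the original binary string."""
    return "".join(format(chunk, f"0{chunk_size}b") for chunk in chunks)

bits = "001011101101010010110100"   # hypothetical 24-bit string

pairs = recode(bits, 2)    # 2:1 rule: 00->0, 01->1, 10->2, 11->3
octal = recode(bits, 3)    # 3:1 rule: octal digits

print(len(bits), "bits ->", len(pairs), "chunks (2:1) or", len(octal), "chunks (3:1)")
print(octal)
```

Decoding recovers the original string exactly, so no information is lost. The cost is the extra encoding and decoding work, which is the time-capacity tradeoff mentioned earlier.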

Does the evidence support this idea?
Memory span can be greatly increased through chunking! Yet chunking has also been used to explain things it cannot explain. It is only explanatory if you have an account of how chunking occurs and what rules in LTM are being used (and what counts as a chunk).

What does visual attention select? (What are the bases for selection?)
• If attention is selection, what does visual attention select?
  - An obvious answer is places. We can select places by moving our eyes so our gaze lands on different places.
  - When places are selected, are they selected automatically?
  - Must we always move our eyes to change what we attend to?
    - Studies of Covert Attention-Movement: Posner (1980).
  - How does attention switch from one place to another?
  - Is it always the case that we attend to places? Can we attend to any other property? Can we select on the basis of color, depth, spatial frequency, affordances, or the property a painting has of having been painted by Da Vinci (a property to which Bernard Berenson was able to attend extremely well)? cf. Gibson

• How else can visual attention select? Can we control the size and shape of the region that is selected, or is selection always punctate and data-driven?
  - Zoom Lens model of spatial attention (Eriksen & St James, 1986).
  - What controls where attention moves:
    - Is this automatic or voluntary?
    - How do we know where to direct our attention? How do we specify a location prior to attending to it?
    - We need a way to specify where or what prior to attending to it! Keep this conundrum in mind. We will return to it later!
• How narrowly can we focus our attention? Can we make it pick out one out of several objects?
  - Are there special conditions under which we are able to pick out individual things? We will return to “attentional resolution” or the minimum spacing for selecting individual things.

Covert movements of attention
Example of an experiment using a cue-validity paradigm for showing that the locus of attention moves without eye movements and for estimating its speed.
Posner, M. I. (1980). Orienting of Attention. Quarterly Journal of Experimental Psychology, 32, 3-25.

Extension of Posner’s demonstration of attention switch
Does the improved detection in intermediate locations entail that the “spotlight of attention” moves continuously through empty space?

Exogenous vs endogenous control of attention
• In the Posner paradigm illustrated in the last slide, attention was automatically grabbed by the onset of a spot (exogenous attention allocation). Other experiments showed that this could be done under voluntary (endogenous) control, e.g., by providing an arrow at fixation indicating what direction to move attention.
• Posner, Tsal and others showed that when attention goes from A to B, intermediate locations are maximally sensitive to detecting a signal at intermediate times.
• Both exogenous and endogenous control produce movement of attention, but they differ in some of their effects.
  - Endogenously moved attention does not lead to Inhibition of Return (we will turn to this next)
  - Endogenous controlled movement does not appear to affect detection sensitivity, but it does affect discrimination
  - Endogenous controlled effects are stronger and appear earlier
• Although the evidence suggests a continuously moving “spotlight” of attention, there are other models that claim that this is a side-effect of an attentional activation that fades at the starting place and grows at the target place, creating an overlap in intermediate locations (Sperling).

We can select a shape even when it is intertwined among other similar shapes
Are the green items the same? On a surprise test at the end, subjects were not able to recognize or recall shapes that had been present but had not appeared in green.

The time-course of attention: Inhibition of return
• If we vary the time between the cue and target in a modified Posner paradigm, we find that when the Cue-Target-Onset-Asynchrony (CTOA) gets to around 300-900 ms, reaction time to the target begins to increase. This is called Inhibition-of-return (Klein, 2000).
• To get this effect we actually have to attract attention to the target location and then attract it back to the origin. IOR is one of many examples of an inhibition effect being produced by attention.

Other examples of attentionally induced inhibition
• Negative Priming (Treisman & DeSchepper, 1996).
  - Is there a figure on the right that is the same as the figure on the left?
  - When the figure on the left is one that had appeared as an ignored figure on the right, RT is long and accuracy poor.
  - This “negative priming” effect persisted over 200 intervening trials and lasted for a month!

Another negative attention effect: Inattentional Blindness

Inattentional Blindness
• The background task is to report which of two arms of the + is longer. One critical trial per subject, after about 3 or 4 background trials. Another “critical” trial presented as a divided attention control.
• 25% of subjects failed to see the square when it was presented in the parafovea (2° from fixation).
• But 65% failed to see it when it was at fixation!
• When the background task cross was made 10% as large, Inattentional Blindness increased from 25% to 66%.
• It is not known whether this IB is due to concentration of attention at the primary task, or whether there is inhibition of outside regions.

In what other ways might our information capacity be limited?
• We have limitations on the input side that depend on the acuity of the sensors and the range of physical properties to which they respond.
• But there is a limitation beyond that of acuity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time. The capacity to individuate is different from the capacity to discriminate.
  - This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision which we will explore in the next lecture
  - First some reason for thinking that individuating is a distinct process

Individuating is different from discriminating

Individuating as a distinct process
• Individuating has its own psychometric function: The minimum distance for individuating is much larger than for discriminating.
• It may be that in vision our attention is limited in the number of things we can individuate and simultaneously access (more on this later). But how do you determine what counts as a “thing”? See next lecture.
• Individuating is a prerequisite for recognition of patterns and other properties defined among a number of individual parts
  - An example of how we can easily detect patterns if they are defined over a small enough number of parts is in subitizing
  - Another area where the concept of an individual has become important is in cognitive development, where it is clear that babies are sensitive to the numerosity of individual things in a way that is distinct from their perceptual abilities but is limited in its capacity

Pick out 3 dots and keep track of them
• You can follow instructions to “move one up” or “move 2 right” etc. so long as at no time do you have to hold on to more than 4 dots
• You can pick out 4 dots and then search through those 4 locations if all dots change to search items (Burkell & Pylyshyn, 1997)
• You can count up to 4 dots without error (Trick & Pylyshyn, 1994)
• You can keep track of 4 dots through saccades (Irwin, 1996)
• You can detect such basic patterns as Inside(dot, contour), Collinear(x1, x2, x3, x4), or Online(dot, contour) so long as there is a small number of the relevant arguments to hold on to at one time.

Next: Objects and Attention

Are there collinear items (n>3)?

Several objects must be picked out at once in making relational judgments
• The same is true for other relational judgments like inside or on-the-same-contour… etc. We must pick out the relevant individual objects first.

When items cannot be individuated, predicates over them cannot be evaluated
Do these figures contain one or two distinct curves? Individuating these curves requires a “curve tracing” operation, so Number_of_curves(C1, C2, …) takes time proportional to the length of the shortest curve.

The figure on the left is one continuous curve, the one on the right is two distinct curves, as shown in color.

Another example: Subitizing vs Counting. How many squares are there?
Subitizing is fast, accurate and only slightly dependent on how many items there are. Only the squares on the right can be subitized. Concentric squares cannot be subitized because individuating them requires curve tracing, just as it did in the spiral example.

Signature subitizing phenomena only appear when objects are automatically individuated and indexed
Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.

Example of subitizing popout and non-popout features (Count Pink vs. Count Online)

What is attention for? Treisman’s Attention as Glue Hypothesis
• The purpose of visual attention is to bind properties together in order to recognize objects

How are conjunctions of features detected?
Read the vertical line of digits in the following display.
Under these conditions Conjunction Errors are very frequent

Rapid visual search (Treisman)
Find the following simple figure in the next slide:

Rapid visual search (conjunction)
Find the following simple figure in the next slide:

Find the unique item in this slide

Serial vs parallel search?
• Finding an object that differs from all others in a scene by a single feature, called a single-feature search, is fast, error-free and almost independent of how many nontargets there are;
• Finding an object that differs from all others by a conjunction of two or more features (and that shares at least one feature with each object in the scene), called a conjunction search, is usually slow, error-prone, and is worse the more nontargets there are in the scene*.
• These results suggest that in order to find a conjunction, which requires solving the binding problem, attention has to be scanned serially to all objects.
* This way of putting it simplifies things. Under certain conditions the serial-parallel distinction breaks down
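The contrast between the two kinds of search is often summarized as a difference in how reaction time grows with display size. The sketch below is an idealized illustration only. It assumes a serial, self-terminating scan for conjunctions, and its parameter values (a 400 ms base time and 50 ms per inspected item) are hypothetical numbers chosen for illustration, not estimates from any experiment.

```python
def feature_search_rt(set_size, base_ms=400.0):
    """Idealized parallel search: RT does not depend on the number of items."""
    return base_ms

def conjunction_search_rt(set_size, base_ms=400.0, ms_per_item=50.0):
    """Idealized serial self-terminating search on target-present trials.

    If items are inspected one at a time in random order, the target is
    found on average after (set_size + 1) / 2 inspections.
    """
    return base_ms + ms_per_item * (set_size + 1) / 2

for n in (1, 5, 9, 17):
    print(n, feature_search_rt(n), conjunction_search_rt(n))

# Slope of the conjunction function, in ms per added item.
slope = (conjunction_search_rt(17) - conjunction_search_rt(1)) / 16
```

Under these assumptions the feature-search function is flat, while the conjunction-search function rises by half the per-item inspection time for each added item. As the footnote above warns, real data do not always separate this cleanly.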

The attention-as-glue hypothesis has a converse: In addition to requiring attention to recognize objects, Attention is primarily directed at Objects
• Instead of being like a spotlight beam that can be scanned around a scene and can be zoomed to cover a larger or smaller area, perhaps attention can only be directed towards occupied places, i.e., to visual objects. (This is compatible with both kinds of attention allocation occurring).

Evidence for attentional selection based on Objects
• Single Object Advantage: pairs of judgments are faster when both apply to the same perceived object
• Entire objects acquire enhanced sensitivity from focal attention to a part of the object
• Single-Object advantage occurs even with generalized “objects” defined in feature space
• Simultanagnosia and hemispatial neglect show object-based effects
• Studies with Moving Objects
  - IOR
  - Object Files
  - MOT

Some single-object superiority studies
Duncan (1984) showed that two judgments made about the same objects are faster even when the distances and areas are controlled. He concluded “Findings support a view in which parallel, preattentive

Single-object superiority even when the shapes are controlled

More controls for the Baylis study… (Baylis, 1994)
Controls for separability, convexity, area…

“Objects” endure over time
• Several studies have shown that what counts as an object (as the same object) endures over time and over changes in location;
  - Certain forms of disappearances in time and changes in location preserve objecthood.
• This gives what we have been calling a “visual object” a real physical-object character and partly justifies our calling it an “object”.

Inhibition of return appears to be object-based (as well as location-based)
• Recall that Inhibition-of-return is the phenomenon whereby an object that has been attended (and from which attention has then moved away) is less likely to attract attention again in a period of 300 ms to 900 ms after it is first attended. The attended item is said to be inhibited.
  - This is thought to help in visual search since it prevents previously visited objects from being revisited
• The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.

Object Based Inhibition of Return

Objects appear to carry their history with them
Object-specific priming of objects and contents
Object File Theory: Kahneman, Treisman & Gibbs (1992)
Letters are faster to read if they appear in the same box where they appeared initially. Priming travels with the object. According to the theory, when an object first appears, a file is created for it and the properties of the object are encoded and subsequently accessed through this object-file.

Visual neglect syndrome is object-based
When a right neglect patient is shown a dumbbell that rotates, the patient continues to neglect the object that had been on the right, even though it is now on the left (Behrmann & Tipper, 1999).

Simultanagnosic (Balint Syndrome) patients only attend to one object at a time
Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horax, 1919).

Balint patients can only attend to one object at a time even if they are overlapping
Luria, 1959

End? (for now)
• Multiple Object Tracking is a methodology for studying Object-Based attention.

Multiple Object Tracking
• One of the clearest cases illustrating object-based attention is Multiple Object Tracking
• Keeping track of individual scene objects requires a mechanism for individuating, selecting, accessing and maintaining the identity of individuals over time
  - These are the functions we have proposed are carried out by the mechanism of visual indexes (FINSTs)
  - We have been using a variety of methods for studying visual indexing, including subitizing, subset selection for search, and Multiple Object Tracking (MOT).

Multiple Object Tracking
• In a typical experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner, usually by flashing them on and off.
• After these 4 “targets” have been briefly identified, all objects resume their identical appearance and move randomly. The subjects’ task is to keep track of which ones had earlier been designated as targets.
• After a period of 5-10 seconds the motion stops and subjects must indicate, using a mouse, which objects were the targets.
• People are very good at this task (80%-98% correct). The question is: How do they do it?

Keep track of the objects that flash

How do we do it? What properties of individual objects do we use?

Keep track of the objects that flash

How do we do it? What properties of individual objects do we use?

Explaining Multiple Object Tracking
• Basic finding: People (even 5 year old children, though not most senior professors!) can track 4 to 5 individual objects that have no unique visual properties
• How is it done?
  - We have shown that it is unlikely that the tracking is done by keeping a record of target locations, and updating them while serially visiting the objects.
  - I have proposed that individuating and keeping track of certain kinds of individuals is a primitive visual operation and uses the mechanism of visual indexes or FINSTs.
  - Tracking is preconceptual* and preattentive§ (* § explanation is left for another occasion)

A possible location-based tracking algorithm
1. While the targets are visually distinct, scan attention to each target in turn and encode its location on a list.
2. When targets begin to move, check the n’th position in the list and go to the location encoded there: Call it Loc(n).
3. Find the closest element to Loc(n).
4. Update the actual location of the element found in #3 in position n in the list: this becomes the new value of Loc(n).
5. Move attention to the location encoded in the next list position, Loc(n+1).
6. Repeat from #3 until elements stop moving.
7. Report elements whose locations are on the list.
Use of the above algorithm assumes (1) focal attention is required to encode locations (i.e., encoding is not parallel), (2) focal attention is unitary and has to be scanned continuously from location to location. It assumes no encoding (or dwell) time at each element.
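The seven steps above can be sketched as a small Python simulation. This is a simplified illustration under stated assumptions, not the model used to generate the predictions in the lecture. It assumes time is divided into discrete frames and that attention visits exactly one list position per frame, which stands in for the finite scanning speed of attention. The names `track` and `make_frames` and the three-object display are hypothetical.

```python
import math

def track(frames, target_ids):
    """Serial location-based tracking.

    `frames[t]` lists the (x, y) position of every object at time step t.
    Returns the ids of the objects reported as targets at the end.
    """
    # Step 1: encode the initial location of each target on a list.
    loc = [frames[0][i] for i in target_ids]
    n = 0
    for positions in frames[1:]:
        # Steps 2-3: go to Loc(n) and find the element closest to it.
        nearest = min(range(len(positions)),
                      key=lambda k: math.dist(positions[k], loc[n]))
        # Step 4: store that element's current location as the new Loc(n).
        loc[n] = positions[nearest]
        # Steps 5-6: move on to the next list position (assumption: one per frame).
        n = (n + 1) % len(loc)
    # Step 7: report the elements nearest to the stored locations.
    final = frames[-1]
    return [min(range(len(final)), key=lambda k: math.dist(final[k], stored))
            for stored in loc]

def make_frames(speed, steps=3):
    """Hypothetical display: object 0 moves right, objects 1 and 2 stay still."""
    return [[(speed * t, 0.0), (5.0, 0.0), (0.0, 100.0)] for t in range(steps + 1)]

slow = track(make_frames(speed=1.0), target_ids=[0, 2])   # targets are kept
fast = track(make_frames(speed=3.0), target_ids=[0, 2])   # target 0 is swapped for object 1
print(slow, fast)
```

In the fast display, target 0 moves so far while attention is visiting the other list position that the nontarget (object 1) ends up closer to the stale stored location and is picked up instead. This is the kind of error that makes predicted performance for a serial scheme depend on object speed and on how quickly attention can be moved.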

Predicted performance for the serial tracking algorithm as a function of the speed of movement of attention

If we are not using and updating objects' locations, then how are we tracking them?
- Since objects are identical, location is the only unique object property, yet we do not appear to be using locations to track.
- Other accounts of how we track (e.g., the polygon strategy, in which we view the objects as vertices of a deforming polygon), even if in some sense true descriptions, do not explain how we do it.
- We could be splitting attention, with each attentional beam moving independently (but if so, the beams act differently from focal attention: e.g., subjects do not notice properties of targets).
- The explanation we prefer, which is independently motivated, is that there are a small number of primitive indexes or pointers, each of which can pick out a particular individual object qua object and keeps providing access to the object as it changes its properties and its location.

Additional examples of MOT
- MOT with occlusion
- MOT with virtual occluders
- "Rubber band" displays

Summary of some properties of indexing revealed by our recent experiments
1. Targets can be tracked even when they disappear behind an occluder and, under certain conditions, even when all objects disappear from view (Scholl & Pylyshyn, 1999; Keane & Pylyshyn, VSS 2003). Demo: MOT occlusion with
2. Properties of targets are not encoded during MOT, nor are they used in tracking. Changes in target properties are not even noticed (Scholl, Pylyshyn & Franconeri, 1999).
3. Not all well-defined clusters of features can be tracked: only ones that correspond to objects (Scholl, Pylyshyn & Feldman, 2001). Demo: "Rubber band" displays

Summary of some properties of indexing revealed by our recent experiments (continued)
4. Indexes are assigned primarily exogenously (involuntarily). They can also be assigned endogenously (voluntarily), but only by moving focal attention to each target serially (Annon & Pylyshyn, VSS 2003).
5. Index maintenance in tracking appears to be nonpredictive and nonattentive (Keane & Pylyshyn, VSS 2003; Leonard & Pylyshyn, VSS 2003).
6. Target-target confusions are much more numerous than target-nontarget confusions. The reason appears to be that nontargets are inhibited, which may prevent them from being swapped with targets (Pylyshyn & Leonard, VSS 2003).

So what are FINSTs?
- They are a primitive reference mechanism that refers to individual objects in the world (FINGs?).
- Objects are picked out and referred to without using any encoding of their properties, including their location. Picking out objects is prior to encoding their locations!
- Indexing is nonconceptual because it does not represent the individuals as members of some conceptual category.
- FINSTs serve as visual demonstratives, much like the terms "this" or "that" do in language, by picking out and referring to individuals without using their properties.
- The central function of FINST indexes is to bind arguments of visual predicates or of motor commands to things in the world to which they must refer. Only predicates with bound arguments can be evaluated.
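A loose programming analogy may help with the last two points: an index behaves like a pointer that is bound to an object itself rather than to a description of it. The sketch below is only an analogy under stated assumptions. The class names, the `grab` and `evaluate` methods, the `left_of` predicate, and the pool size of 4 are all hypothetical and are not part of the FINST proposal.

```python
class SceneObject:
    """A thing in the world; its properties are free to change."""
    def __init__(self, x, y, color):
        self.x, self.y, self.color = x, y, color

class IndexPool:
    """A small fixed pool of indexes (the capacity of 4 is an assumption)."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.bound = {}  # index name -> reference to the object itself

    def grab(self, name, obj):
        """Bind an index to an object without recording any of its properties."""
        if name not in self.bound and len(self.bound) >= self.capacity:
            raise RuntimeError("no free index")
        self.bound[name] = obj

    def evaluate(self, predicate, *names):
        """Evaluate a predicate only if every one of its arguments is bound."""
        if any(name not in self.bound for name in names):
            return None  # an unbound argument: the predicate cannot be evaluated
        return predicate(*(self.bound[name] for name in names))

def left_of(a, b):
    """A visual predicate; it reads locations only when it is evaluated."""
    return a.x < b.x

pool = IndexPool()
a, b = SceneObject(1, 0, "red"), SceneObject(5, 0, "red")
pool.grab("this", a)
pool.grab("that", b)
print(pool.evaluate(left_of, "this", "that"))   # prints True
a.x = 9                                          # the object moves; no rebinding needed
print(pool.evaluate(left_of, "this", "that"))   # prints False
print(pool.evaluate(left_of, "this", "other"))  # prints None
```

The binding survives the change in `a.x` because the pool holds a reference to the object and not a stored location, which is the sense in which picking out an object is prior to encoding its location.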

Schema for how FINSTs function in visual-motor control

The binding hypothesis of the visual-cognitive bottleneck
- Going back to Newell's binding hypothesis, we are hypothesizing that the bottleneck between vision and cognition is in the number of objects that can be simultaneously bound to the arguments of cognitive routines.
- Another way to put this is that visual cognition can simultaneously attend to only about 4 objects.
  - There is direct evidence for the limit of about 4 visual objects in visual working memory: Luck, S. and E. Vogel (1997). "The capacity of visual working memory for features and conjunctions." Nature 390: 279-281.

Information processing capacity appears to be limited to 7 ± 2 “chunks” rather than to a number of bits or baud
- Experiments on “short term memory” (STM, or “working memory”): Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
- Experiments on the capacity of Visual Working Memory (Luck & Vogel, 1997)

Studies of the capacity of Visual Working Memory (Luck & Vogel, 1997)
- People appear to be able to retain about 4 properties of an object (4 colors, 4 shapes, 4 orientations, etc.) over a short time.
- People can also retain the identity of 4 objects for a short time.
- Luck and Vogel found that as long as there are not more than 4 properties per object, people can retain large numbers of properties (a phenomenon that is reminiscent of Miller's “chunking hypothesis”, except that the chunks are objects).