Trends in Cognitive Sciences
Opinion
Selecting and perceiving multiple visual objects
Introduction
Many everyday activities, such as driving on a busy street, require the encoding of multiple distinctive visual objects from crowded scenes. Extending previous behavioral theories and incorporating recent brain imaging and behavioral data, we describe a neural object-file theory to explain how multiple visual objects are attended and encoded. Given processing limitations, our visual system can first select a fixed number of about four objects from a crowded scene, based on their spatial information (object individuation) and then encode their details (object identification). We present evidence showing the involvement of the inferior intra-parietal sulcus (IPS) in object individuation and the superior IPS and higher visual areas in object identification. These two stages of operation could underlie the variety of ways that visual processing is capacity limited, such as in visual short-term memory (also known as visual working memory), enumeration and multiple object tracking.
Section snippets
The neural object-file theory
Our theory consists of two main components, object individuation and object identification (Figure 1).
Evidence supporting the neural object-file theory
The distinction between object individuation and object identification was noted more than 20 years ago. Sagi and Julesz [24] observed fast target detection but relatively slower target identification performance in a visual search task. Kahneman et al. [1] proposed that spatial and temporal information allows an ‘object file’ to be created (corresponding to object individuation) before it can be filled with object features to allow objects to be identified (corresponding to object
Visual short-term memory capacity limitation
The behavioral literature disagrees on whether visual short-term memory (VSTM) (also known as visual working memory [VWM]) capacity is limited to a fixed number of objects [26, 27], or whether it is resolution-limited and can be flexibly allocated to a variable number of objects depending on object complexity and the encoding demand [12, 28, 29, 30, 31, 32]. In a recent fMRI study, we examined posterior brain mechanisms supporting VSTM [13, 33]. We found that, whereas representations in the inferior IPS
Feature encoding during multiple object tracking
Pylyshyn and Storm [10] reported that observers could simultaneously track about four moving targets among otherwise identical moving distractor objects. Observers’ memory for the color or shape features of the successfully tracked objects, however, is surprisingly poor [43]. Even when all objects differed in identity, and identity was present throughout tracking, recall of target location is superior to the recall of target location with identity [44]. How could an observer successfully attend
A crucial test: neural response to object repetition
Consider a task in which observers are asked to encode the shapes of a set of objects that can all be identical or different from each other and then decide whether a particular object shape is present or absent. According to the neural object-file theory, the inferior IPS should treat these objects as multiple, separate entities in object individuation, regardless of whether they are all identical or all different. By contrast, the superior IPS and higher visual areas should treat multiple
Discussion
Why does object individuation have a roughly four-item limit? This limitation is likely to have evolved in the visual system as a functional characteristic rather than a deficiency. Pylyshyn [7, 49] argued that, to comprehend simple relational geometrical properties, we need to simultaneously reference multiple places or objects in a scene. Having access to four objects or feature-occupied spatial locations can be sufficient to comprehend most 3D object locations with respect to each other. After
Conclusions
Previous behavioral research and theoretical developments have emphasized the important and useful distinction between object individuation and identification for understanding the selection and encoding of multiple visual objects. Here, we propose how these two processes can be supported by distinct neural mechanisms: object individuation is mediated by the inferior IPS and object identification by higher visual areas and the superior IPS. Our neural object-file theory describes how the visual
Acknowledgements
We thank Nelson Cowan, Edward K. Vogel, Todd D. Horowitz and an anonymous reviewer for comments on early drafts of this paper. This work was supported by NSF grants 0518138 and 0719975 to Y.X. and NIH grant EY014193 to M.M.C.
References (70)
The reviewing of object files: object-specific integration of information. Cognit. Psychol. (1992)
Representing connected and disconnected shapes in human inferior intra-parietal sulcus. Neuroimage (2008)
Some primitive mechanisms of spatial attention. Cognition (1994)
et al. Tracking multiple targets with multifocal attention. Trends Cogn. Sci. (2005)
et al. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Res. (2000)
Indexing and the object concept: developing ‘what’ and ‘where’ systems. Trends Cogn. Sci. (1998)
Objects and attention: the state of the art. Cognition (2001)
The role of location indexes in spatial perception: a sketch of the FINST spatial-index model. Cognition (1989)
Parietal lobe contributions to episodic memory retrieval. Trends Cogn. Sci. (2005)
et al. Multisensory spatial interactions: a window on functional integration in the human brain. Trends Neurosci. (2005)