Opinion
Selecting and perceiving multiple visual objects

https://doi.org/10.1016/j.tics.2009.01.008

To explain how multiple visual objects are attended and perceived, we propose that our visual system first selects a fixed number of about four objects from a crowded scene based on their spatial information (object individuation) and then encodes their details (object identification). We describe the involvement of the inferior intra-parietal sulcus (IPS) in object individuation and the superior IPS and higher visual areas in object identification. Our neural object-file theory synthesizes and extends existing ideas in visual cognition and is supported by behavioral and neuroimaging results. It provides a better understanding of the role of the different parietal areas in encoding visual objects and can explain various forms of capacity-limited processing in visual cognition such as working memory.

Introduction

Many everyday activities, such as driving on a busy street, require the encoding of multiple distinctive visual objects from crowded scenes. Extending previous behavioral theories and incorporating recent brain imaging and behavioral data, we describe a neural object-file theory to explain how multiple visual objects are attended and encoded. Given processing limitations, our visual system can first select a fixed number of about four objects from a crowded scene, based on their spatial information (object individuation) and then encode their details (object identification). We present evidence showing the involvement of the inferior intra-parietal sulcus (IPS) in object individuation and the superior IPS and higher visual areas in object identification. These two stages of operation could underlie the variety of ways that visual processing is capacity limited, such as in visual short-term memory (also known as visual working memory), enumeration and multiple object tracking.

The neural object-file theory

Our theory consists of two main components, object individuation and object identification (Figure 1).

Evidence supporting the neural object-file theory

The distinction between object individuation and object identification was noted more than 20 years ago. Sagi and Julesz [24] observed fast target detection but relatively slower target identification performance in a visual search task. Kahneman et al. [1] proposed that spatial and temporal information allows ‘object files’ to be created (corresponding to object individuation) before they can be filled with object features to allow objects to be identified (corresponding to object

Visual short-term memory capacity limitation

The behavioral literature disagrees on whether visual short-term memory (VSTM; also known as visual working memory, VWM) capacity is limited to a fixed number of objects [26,27] or whether it is resolution-limited and can be flexibly allocated to a variable number of objects depending on object complexity and the encoding demand [12,28–32]. In a recent fMRI study, we examined posterior brain mechanisms supporting VSTM [13,33]. We found that, whereas representations in the inferior IPS

Feature encoding during multiple object tracking

Pylyshyn and Storm [10] reported that observers could simultaneously track about four moving targets among otherwise identical moving distractor objects. Observers’ memory for the color or shape features of the successfully tracked objects, however, is surprisingly poor [43]. Even when all objects differed in identity, and identity was present throughout tracking, recall of target location alone is superior to recall of target location together with identity [44]. How could an observer successfully attend

A crucial test: neural response to object repetition

Consider a task in which observers are asked to encode the shapes of a set of objects that can all be identical or different from each other and then decide whether a particular object shape is present or absent. According to the neural object-file theory, the inferior IPS should treat these objects as multiple, separate entities in object individuation, regardless of whether they are all identical or all different. By contrast, the superior IPS and higher visual areas should treat multiple

Discussion

Why does object individuation have a roughly four-item limit? This limitation is likely to have evolved in the visual system as a functional characteristic rather than a deficiency. Pylyshyn [7,49] argued that, to comprehend simple relational geometrical properties, we need to simultaneously reference multiple places or objects in a scene. Having access to four objects or feature-occupied spatial locations can be sufficient to comprehend most 3D object locations with respect to each other. After

Conclusions

Previous behavioral research and theoretical developments have emphasized the important and useful distinction between object individuation and identification for understanding the selection and encoding of multiple visual objects. Here, we propose how these two processes can be supported by distinct neural mechanisms: object individuation is mediated by the inferior IPS and object identification by higher visual areas and the superior IPS. Our neural object-file theory describes how the visual

Acknowledgements

We thank Nelson Cowan, Edward K. Vogel, Todd D. Horowitz and an anonymous reviewer for comments on early drafts of this paper. This work was supported by NSF grants 0518138 and 0719975 to Y.X. and NIH grant EY014193 to M.M.C.

References (70)

  • J. Gottlieb

    From thought to action: the parietal cortex as a bridge between perception, action, and cognition

    Neuron

    (2007)
  • J. Serences

    Parietal mechanisms of switching and maintaining attention to locations, objects, and features

  • F. Xu et al.

    Infants’ metaphysics: the case of numerical identity

    Cognit. Psychol.

    (1996)
  • Z. Káldy et al.

    A memory span of one? Object identification in 6.5-month-old infants

    Cognition

    (2005)
  • C.S. Green et al.

    Enumeration versus object tracking: Insights from video game players

    Cognition

    (2006)
  • X. Jiang

    Categorization training results in shape- and category-selective human neural plasticity

    Neuron

    (2007)
  • Y. Xu et al.

    Dissociable neural mechanism supporting visual short-term memory for objects

    Nature

    (2006)
  • Y. Xu et al.

    Visual grouping in human parietal cortex

    Proc. Natl. Acad. Sci. U. S. A.

    (2007)
  • Y. Xu

    The role of the superior intra-parietal sulcus in supporting visual short-term memory for multi-feature objects

    J. Neurosci.

    (2007)
  • Y. Xu

    Distinctive neural mechanisms supporting visual object individuation and identification

    J. Cogn. Neurosci.

    (2009)
  • R.A. Rensink

    The dynamic representation of scenes

    Vis. Cogn.

    (2000)
  • N. Cowan

    The magical number 4 in short-term memory: a reconsideration of mental storage capacity

    Behav. Brain Sci.

    (2001)
  • Z.W. Pylyshyn et al.

    Tracking multiple independent targets: evidence for a parallel tracking mechanism

    Spat. Vis.

    (1988)
  • G.A. Alvarez et al.

    The capacity of visual short-term memory is set both by visual information load and by number of objects

    Psychol. Sci.

    (2004)
  • J.J. Todd et al.

    Capacity limit of visual short-term memory in human posterior parietal cortex

    Nature

    (2004)
  • M. Corbetta et al.

    Control of goal-directed and stimulus-driven attention in the brain

    Nat. Rev. Neurosci.

    (2002)
  • L.G. Ungerleider et al.

    Two cortical visual systems

  • A.B. Sereno et al.

    Shape selectivity in primate lateral intraparietal cortex

    Nature

    (1998)
  • C.S. Konen et al.

    Two hierarchically organized neural systems for object information in human visual cortex

    Nat. Neurosci.

    (2008)
  • W.H. Merigan et al.

    How parallel are the primate visual pathways?

    Annu. Rev. Neurosci.

    (1993)
  • E. Awh

    Visual working memory represents a fixed number of items, regardless of complexity

    Psychol. Sci.

    (2007)
  • R. Rauschenberger et al.

    Masking unveils pre-amodal completion representation in visual search

    Nature

    (2001)
  • R.S. Zemel

    Experience-dependent perceptual grouping and object-based attention

    J. Exp. Psychol. Hum. Percept. Perform.

    (2002)
  • S.P. Vecera et al.

    Figure-ground organization and object recognition processes: an interactive account

    J. Exp. Psychol. Hum. Percept. Perform.

    (1998)
  • D. Sagi et al.

    “What” and “where” in vision

    Science

    (1985)