Trends in Cognitive Sciences
Review
How position dependent is visual object recognition?
Introduction
One of the biggest challenges faced by the visual object recognition system is to enable rapid and accurate recognition despite vast differences in the retinal projection of an object produced by changes in, for example, viewing angle, size, position in the visual field or illumination [1, 2, 3, 4]. Such ‘invariance’ is often considered one of the key characteristics of object recognition [4, 5, 6, 7]. Changes in position (translations) are among the simplest of these transformations, because only the retinal position of the projection of an object is affected, and not the projection itself [2]. Although it is often assumed that objects can be recognized independently of retinal position [8], the behavioral evidence is limited. In this review, we critically evaluate the behavioral studies of position dependence in visual object recognition from a computational and physiological perspective. We find that the behavioral data on position independence are inconclusive. Furthermore, these studies do not test several key predictions from neurophysiology, including the effect of translations between eccentricities and hemifields, making it difficult to understand the relationship between behavior and the proposed neural substrate. We argue that whereas the balance of the available evidence argues against complete position independence [9, 10, 11, 12, 13, 14], the role of position in visual object recognition remains essentially unknown.
Section snippets
What is visual object recognition?
We use a definition of visual object recognition similar to that of previous authors [1, 3]. For our purposes ‘visual object’ refers to a conjunction of a complex set of visual features. Successful recognition of such an object requires that the response to a current percept be in some way consistent enough with the internal representation of a previous percept to at least partially invoke it [1, 7, 15]. This formulation of visual object recognition defines it, fundamentally, as a process of
The importance of position in object recognition
Given the comparison model of object recognition described above, there are two types of preexisting object representations that might underlie object recognition in the context of position changes. Both make specific behavioral predictions about the degree to which experience with an object at one position will affect recognition during later presentations of that object at different positions (transfer).
The first possibility is that the preexisting representations are specific to the object
Position dependence in the ventral visual pathway
The cortical system supporting object recognition is often described as a ventral visual pathway extending from primary visual cortex (V1) through a series of hierarchical processing stages (V2–V4) to the anterior parts of the inferior temporal (IT) cortex [29], a region crucial for visual object recognition [30, 31]. Here we focus primarily on the response properties of neurons in monkey IT, which respond selectively to visual objects.
Behavioral studies of position dependence
Physiological considerations (receptive fields (RFs), retinal sampling) suggest there should be some effect of translations on object recognition, especially those between hemifields and eccentricities. However, without the behavioral output of the system it is impossible to know whether these characteristics have a role in determining the degree of position dependence. Most of the formal models of object recognition (Box 1) attempt to incorporate some aspect of the physiology into their architecture. Thus, the
Concluding remarks
A complete understanding of object recognition requires the integration of physiological, computational and behavioral evidence. Although the current behavioral data argue against complete position independence, future research will need to address what factors (such as task or long-term experience) affect the degree of position dependence and which properties of IT neurons are reflected in behavior (Box 2).
Acknowledgements
We would like to thank Hans Op de Beeck, Marlene Behrmann, Lalitha Chandrasekher, Jim DiCarlo, Daniel Dilks, Stephanie Manchin, Alex Martin, Mortimer Mishkin, Julianne Rollenhagen and Rebecca Schwarzlose for their insightful comments on an earlier version of this manuscript. We would also like to thank Hans Op de Beeck for contributing to Figure 2.
References (89)
- et al.
Untangling invariant object recognition
Trends Cogn. Sci.
(2007)
- et al.
Invariance of long-term visual priming to scale, reflection, translation, and hemisphere
Vision Res.
(2001)
- et al.
Computation of pattern invariance in brain-like structures
Neural Netw.
(1999)
A quantitative theory of immediate visual recognition
Prog. Brain Res.
(2007)
Object recognition and segmentation by a fragment-based hierarchy
Trends Cogn. Sci.
(2007)
- et al.
Towards structural systematicity in distributed, statically bound visual representations
Cogn. Sci.
(2003)
- et al.
Recognition invariance obtained by extended and invariant features
Neural Netw.
(2004)
- et al.
Visual areas in the temporal cortex of the macaque
Brain Res.
(1979)
Shape representation in the inferior temporal cortex of monkeys
Curr. Biol.
(1995)
Role of inferior temporal cortex in interhemispheric transfer
Brain Res.
(1979)
The neural basis of stimulus equivalence across retinal translation
High level object recognition without an anterior inferior temporal lobe
Neuropsychologia
The visual field representation in striate cortex of the macaque monkey: asymmetries, anisotropies, and individual variability
Vision Res.
Eccentricity bias as an organizing principle for human high-order object areas
Neuron
The interaction of shape- and location-based priming in object categorisation: Evidence for a hybrid “what plus where” representation stage
Vision Res.
Display symmetry affects positional specificity in same-different judgment of pairs of novel visual patterns
Vision Res.
The representation of location in visual images
Cognit. Psychol.
The reverse hierarchy theory of visual perceptual learning
Trends Cogn. Sci.
Invariant face and object recognition in the visual system
Prog. Neurobiol.
Neural mechanisms of object recognition
Curr. Opin. Neurobiol.
Constraining the neural representation of the visual world
Trends Cogn. Sci.
How parallel is visual processing in the ventral pathway?
Trends Cogn. Sci.
The role of context in object recognition
Trends Cogn. Sci.
Scene perception: detection and judging object undergoing relational violations
Cognit. Psychol.
Representations and Recognition in Vision
Models of object recognition
Nat. Neurosci.
High-Level Vision: Object Recognition and Visual Cognition
Hierarchical models of object recognition in cortex
Nat. Neurosci.
Representation and recognition of the spatial organization of three-dimensional shapes
Proc. R. Soc. Lond. B. Biol. Sci.
Vision
‘Breaking’ position-invariant object recognition
Nat. Neurosci.
Coordinate transformations in object recognition
Psychol. Bull.
Imperfect invariance to object translation in the discrimination of complex shapes
Perception
The role of visual field position in pattern-discrimination learning
Proc. Biol. Sci.
Limited translation invariance of human visual pattern recognition
Percept. Psychophys.
Internal representations and operations in the visual comparison of transformed patterns: effects of pattern point-inversion, position symmetry, and separation
Biol. Cybern.
Some results on translation invariance in the human visual system
Spat. Vis.
Size and position invariance in visual-system
Perception
Evidence for complete translational and reflectional invariance in visual object priming
Perception
Varieties of object constancy
Q. J. Exp. Psychol.
Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position
J. Neurophysiol.
Dynamic binding in a neural network for shape recognition
Psychol. Rev.
A multiscale dynamic routing circuit for forming size-invariant and position-invariant object representations
J. Comput. Neurosci.
A feedforward architecture accounts for rapid categorization
Proc. Natl. Acad. Sci. U. S. A.