Transfer of object category knowledge across visual and haptic modalities: Experimental and computational studies
Highlights
► We address people’s abilities to transfer category knowledge across sensory domains. ► We introduce the See and Grasp data set, the first visual-haptic data set. ► An experiment shows that object category knowledge transfers across sensory domains. ► A Bayesian inference algorithm is proposed for learning componential 3-D shapes. ► Forward models predict sensory-specific features from multisensory representations.
Introduction
When recording neural activity in the human medial temporal lobe, Quiroga, Kraskov, Koch, and Fried (2009) found individual neurons that explicitly encode multisensory percepts. For example, one neuron responded selectively when a person viewed images of the television host Oprah Winfrey, viewed her written name, or heard her spoken name. (To a lesser degree, the neuron also responded to the actress Whoopi Goldberg.) Another neuron responded selectively when a person saw images of the former Iraqi leader Saddam Hussein, saw his name, or heard his name. Clearly, our brains encode abstract representations of objects that are multisensory in the sense that they are activated by perceptual inputs spanning multiple sensory formats or modalities.
Why would our brains acquire abstract representations that are activated by inputs from a variety of sensory modalities? One possible answer to this question is that these representations facilitate the transfer of knowledge across modalities. Consider, for instance, a person who learns to categorize a set of objects based solely on tactile or haptic inputs. Would the person be able to categorize these same objects when the objects are viewed but not grasped? Would the person be able to categorize novel objects from the same categories when those objects are only viewed?
Here, we report experimental and computational studies of the acquisition of multisensory representations of object category, and the role these representations play in the transfer of knowledge across visual and haptic modalities. Our work includes three contributions. First, our experiment used an unusual set of visual-haptic stimuli known as “Fribbles”. Fribbles are complex, 3-D objects with multiple parts and spatial relations among the parts (see Fig. 1). Moreover, they have a categorical structure—that is, each Fribble is an exemplar from a category formed by perturbing a category prototype. Fribbles have previously been used in the study of visual object recognition (Hayward and Williams, 2000, Tarr, 2003, Williams, 1997). An innovation of our work is that we have fabricated a large set of Fribbles using a 3-D printing process and, thus, our Fribbles are physical objects that can be both seen and grasped. Based on this set of stimuli, we have created a data set, referred to as the See and Grasp data set, containing both visual and haptic features of the Fribbles. We are making this data set freely available on the web with the hope that it will encourage quantitative research on computational models of visual-haptic perception.
Second, we conducted an experiment evaluating whether people can transfer knowledge of object category across visual and haptic modalities. Previous researchers have considered the transfer of knowledge of object identity across visual and haptic modalities (e.g., Lacey et al., 2007, Lawson, 2009, Norman et al., 2004). They have also compared similarity and categorization judgements based solely on visual input with those based solely on haptic input (Gaißert and Wallraven, 2012, Gaißert et al., 2011, Gaißert et al., 2008, Gaißert et al., 2010). To our knowledge, our experiment is the first focused on the transfer of object category knowledge across visual and haptic modalities.
Lastly, we developed a computational model, referred to as the MVH (Multisensory-Visual-Haptic) model, accounting for how multisensory representations of prototypical 3-D shape might be acquired, and for the role these representations might play in the transfer of category knowledge across visual and haptic modalities. Like some previous models in the literature (Biederman, 1987; Marr & Nishihara, 1978), the model makes use of part-based representations of prototypes. However, it goes beyond previous work by introducing a learning mechanism for the acquisition of these representations. Using its acquired multisensory representations along with sensory-specific forward models for predicting visual or haptic features from multisensory representations, the model transfers object category knowledge between visual and haptic modalities, thereby providing a qualitative account of our experimental data.
Section snippets
Previous research on visual-haptic object perception
Previous research has shown that knowledge of object identity transfers (at least in part) across visual and haptic domains (e.g., Lacey et al., 2007, Lawson, 2009, Norman et al., 2004). For example, Lacey, Peters, et al. (2007) trained subjects to identify objects either visually or haptically. Following training, subjects were tested on the same task using the untrained sensory modality. Subjects showed excellent transfer to the novel modality when objects were presented at the same…
Fribbles and the See and Grasp data set
A key component of our research is the unusual visual-haptic stimuli that we used in both our experimental and computational studies. These stimuli are a subset of a larger set of stimuli known as “Fribbles”.1 Fribbles have previously been used in the vision sciences to study visual…
Experiment
Questions about categorization and generalization are fundamental to cognitive science, yet many open questions about them remain, particularly in the context of multisensory perception. Important questions include: To what extent does knowledge of object categories gained through one modality transfer to another modality? Is the amount of transfer the same for familiar and novel objects? For example, if a person learns to visually categorize a set of objects, can the person categorize these…
Preliminary remarks regarding the MVH model
Our data show that participants transferred object category knowledge between visual and haptic modalities. How did they do this? To address this question, we propose a novel computational model, referred to as the MVH (Multisensory-Visual-Haptic) model, with several important properties. This model uses multisensory representations of prototypical 3-D shape. Like some previous models in the literature (Biederman, 1987; Marr & Nishihara, 1978), the model makes use of part-based representations…
MVH (Multisensory-Visual-Haptic) model
This section provides the mathematical details of the MVH model. We describe the model from the perspective of a participant from Group V–H in our experiment. During training, the model is provided with images of Fribbles along with the Fribbles’ corresponding category labels. The model learns a multisensory representation of each category’s prototypical 3-D shape on the basis of this information. The model is provided with Fribbles’ haptic features during testing, and it estimates the category…
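The pipeline just described—learn a multisensory prototype per category from labeled visual input, then categorize haptic input by comparing observations against forward-model predictions—can be sketched schematically. This is a toy illustration, not the MVH model’s actual Bayesian inference over part-based 3-D shapes: the abstract feature vectors, the mean-based prototype estimate, and the identity forward model in the example are all assumptions made purely for exposition.

```python
import math

def fit_prototype(examples):
    # Estimate a category's multisensory prototype as the mean of its
    # training exemplars' feature vectors (a crude stand-in for the
    # model's Bayesian inference over part-based 3-D shapes).
    n = len(examples)
    return [sum(vals) / n for vals in zip(*examples)]

def classify(haptic_obs, prototypes, haptic_forward):
    # Predict haptic features from each category's multisensory
    # prototype via a sensory-specific forward model, then pick the
    # category whose prediction is closest to the observation.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(prototypes,
               key=lambda c: dist(haptic_forward(prototypes[c]), haptic_obs))

# Toy example: two categories with 2-D features; the haptic forward
# model is taken to be the identity map purely for illustration.
prototypes = {
    "A": fit_prototype([[0.0, 0.0], [0.2, 0.0]]),
    "B": fit_prototype([[1.0, 1.0], [0.8, 1.0]]),
}
print(classify([0.9, 1.0], prototypes, lambda p: list(p)))  # B
```

In the actual model, the learned representation is a componential 3-D shape rather than a flat vector, and the visual and haptic forward models are distinct mappings from that shared representation to modality-specific features.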
Simulation results
In the simulations reported here, we used a slightly modified version of the See and Grasp data set for the four categories used in the experiment. We used three images of each Fribble rendered from three orthogonal viewpoints—a top view, a front view, and a right view. In addition, we simplified the images by reducing their resolution (80 pixels × 80 pixels) and by converting pixel values to binary numbers using a thresholding scheme. Therefore, the visual representation of a Fribble was a…
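The image preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the views are plain nested lists of grayscale values in [0, 1], the downsampling is nearest-neighbor, and the 0.5 threshold is hypothetical (the article does not specify its thresholding scheme).

```python
import random

def binarize_views(views, size=80, threshold=0.5):
    # Flatten a list of 2-D grayscale views (values in [0, 1]) into one
    # binary feature vector: nearest-neighbor downsample each view to
    # size x size pixels, then threshold each pixel value to 0 or 1.
    features = []
    for img in views:
        h, w = len(img), len(img[0])
        for r in range(size):
            for c in range(size):
                pixel = img[r * h // size][c * w // size]
                features.append(1 if pixel > threshold else 0)
    return features

# Three synthetic 160 x 160 "views" stand in for the top, front, and
# right renderings of a Fribble.
random.seed(0)
views = [[[random.random() for _ in range(160)] for _ in range(160)]
         for _ in range(3)]
x = binarize_views(views)
print(len(x))  # 19200 = 3 views x 80 x 80 binary pixels
```

Under this scheme, each Fribble’s visual representation is a single binary vector concatenating its three thresholded views.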
Discussion
In summary, this article has addressed people’s abilities to transfer object category knowledge across visual and haptic domains. Our work has made three contributions. First, by fabricating Fribbles (3-D, multi-part objects with a categorical structure), we developed (and are making freely available on the web) visual-haptic stimuli that are highly complex and realistic. Second, we conducted an experiment evaluating whether people transfer object category knowledge across visual and haptic…
Acknowledgements
We thank M. Tarr for making the 3-D object files for Fribbles available on his web pages. This work was supported by research grants from the National Science Foundation (DRL-0817250) and the Air Force Office of Scientific Research (FA9550-12-1-0303).
References (45)
- Similarity and categorization: From vision to touch. Acta Psychologica (2011).
- Reuniting perception and conception. Cognition (1998).
- Haptic study of three-dimensional objects activates extrastriate visual areas. Neuropsychologia (2002).
- Forward models: Supervised learning with a distal teacher. Cognitive Science (1992).
- Hand movements: A window into haptic object recognition. Cognitive Psychology (1987).
- The metamodal organization of the brain. Progress in Brain Research (2001).
- Multiple paired forward and inverse models for motor control. Neural Networks (1998).
- Convergence of visual and tactile shape processing in the human lateral occipital complex. Cerebral Cortex (2002).
- Functional imaging of human cross-modal identification and object recognition. Experimental Brain Research (2005).
- Cross-modal repetition priming in young and old adults. European Journal of Cognitive Psychology (2009).
- Grounded cognition. Annual Review of Psychology.
- Recognition-by-components: A theory of human image understanding. Psychological Review.
- Pattern recognition and machine learning.
- The psychophysics toolbox. Spatial Vision.
- Do vision and haptics share common representations? Implicit and explicit memory within and between modalities. Journal of Experimental Psychology: Learning, Memory and Cognition.
- Bayesian estimation of the shape skeleton. Proceedings of the National Academy of Sciences.
- Long-term deprivation affects visual perception and cortex. Nature.
- Categorizing natural objects: A comparison of the visual and the haptic modalities. Experimental Brain Research.
- Analyzing perceptual representations of complex, parametrically-defined shapes using MDS.
- Visual and haptic perceptual spaces show high similarity in humans. Journal of Vision.
- Sample criteria for testing outlying observations. Annals of Mathematical Statistics.
- Effects of vision and haptics on categorizing common objects. Cognitive Processes.