Abstract
In this paper we offer a theory of cross-modal objects. To begin, we discuss two kinds of linkages between vision and audition. The first is a duality. The visual system detects and identifies surfaces; the auditory system detects and identifies sources. Surfaces are illuminated by sources of light; sound is reflected off surfaces. However, the visual system discounts sources and the auditory system discounts surfaces. These and similar considerations lead to the Theory of Indispensable Attributes, which states the conditions for the formation of gestalts in the two modalities. The second linkage involves the formation of audiovisual objects, that is, integrated cross-modal experiences. We describe research that reveals the role of cross-modal causality in the formation of such objects. These experiments use the canonical example of a causal link between vision and audition: a visible impact that causes a percussive sound.
[A fire is] a terrestrial event with flames and fuel. It is a source of four kinds of stimulation, since it gives off sound, odor, heat and light .... One can hear it, smell it, feel it, and see it, or get any combination of these detections, and thereby perceive a fire .... For this event, the four kinds of stimulus information and the four perceptual systems are equivalent.
If the perception of fire were a compound of separate sensations of sound, smell, warmth and color, they would have had to be associated in past experience in order to explain how any one of them could evoke memories of all the others. ...
[T]he problem of perception is not how sensations get associated; it is how the sound, the odor, the warmth, or the light that specifies fire gets discriminated from all the other sounds, odors, warmths, and lights that do not specify fire. (Gibson 1966, pp. 54–55)
In this paper, we offer a theory of cross-modal objects. We agree with Gibson’s assertion that such a theory is unlikely to be an associative theory. Instead, our theory is built on the notion of privileged inter-modal binding. As an example of such privileged binding, we will examine the relation between visible impacts and percussive sounds, which allows for a particularly powerful form of binding that produces audio-visual objects. To motivate these conclusions, we devote the first two sections of this article to a review of Kubovy and Van Valkenburg’s (Cognition 80(1–2):97–126, 2001) theory of auditory and visual objects. In the final section, we present our new approach, along with empirical data that support it.
Notes
The concept of privileged binding is similar to Stoffregen and Bardy’s (2001) notion of global arrays. We do not agree that their approach (and a fortiori ours) undermines the idea of separate senses; a justification of this disagreement would go beyond the scope of this article.
References
Amano, K., D.H. Foster, and S.M.C. Nascimento. 2006. Color constancy in natural scenes with and without an explicit illuminant cue. Visual Neuroscience 23: 351–356.
Armontrout, J.A., M. Schutz, and M. Kubovy. 2009. Visual determinants of a cross-modal illusion. Attention, Perception, & Psychophysics 71: 1618–1627.
Aschersleben, G., and P. Bertelson. 2003. Temporal ventriloquism: Crossmodal interaction on the time dimension—2. Evidence from sensorimotor synchronization. International Journal of Psychophysiology 50(1–2): 157–163.
Bertelson, P., and G. Aschersleben. 2003. Temporal ventriloquism: Crossmodal interaction on the time dimension—1. Evidence from auditory-visual temporal order judgment. International Journal of Psychophysiology 50(1–2): 147–155.
Blauert, J. 1997. Spatial hearing: The psychophysics of human sound localization. Cambridge: MIT (revised edition).
Bon, L., and C. Lucchetti. 2006. Auditory environmental cells and visual fixation effect in area 8B of macaque monkey. Experimental Brain Research 168(3): 441–449.
Bradley, D.R., and H.M. Petry. 1977. Organizational determinants of subjective contour: The subjective Necker cube. American Journal of Psychology 90: 253–262.
Campbell, J. 2002. Reference and consciousness. Oxford: Oxford University Press. Published to Oxford Scholarship Online: November 2003. doi:10.1093/0199243816.001.0001. Accessed: 25 December 2007.
Clifton, R.K., R.L. Freyman, R.Y. Litovsky, and D. McCall. 1994. Listeners’ expectations about echoes can raise or lower echo threshold. The Journal of the Acoustical Society of America 95(3): 1525–1533.
Clifton, R.K., R.L. Freyman, and J. Meo. 2002. What the precedence effect tells us about room acoustics. Perception & Psychophysics 64(2): 180–188.
Gibson, J.J. 1966. The senses considered as perceptual systems. Boston: Houghton Mifflin.
Goldring, J., M. Dorris, B. Corneil, P. Balantyne, and D. Munoz. 1996. Combined eye-head gaze shifts to visual and auditory targets in humans. Experimental Brain Research 111: 68–78.
Hötting, K., F. Rösler, and B. Röder. 2003. Crossmodal and intermodal attention modulate event-related brain potentials to tactile and auditory stimuli. Experimental Brain Research 148: 26–37.
Jack, C.E., and W.R. Thurlow. 1973. Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Perceptual & Motor Skills 37: 967–979.
Kubovy, M., and D. Van Valkenburg. 2001. Auditory and visual objects. Cognition 80(1–2): 97–126.
Matthen, M. 2004. Features, places, and things: Reflections on Austen Clark’s theory of sentience. Philosophical Psychology 17(4): 497–518.
Matthen, M. 2005. Seeing, doing, and knowing—a philosophical theory of sense perception. Oxford, UK: Oxford University Press. Published to Oxford Scholarship Online: April 2005. doi:10.1093/0199268509.001.0001. Accessed: 25 December 2007.
McGurk, H., and J. MacDonald. 1976. Hearing lips and seeing voices. Nature 264(5588): 746–748.
Milner, A.D., and M.A. Goodale. 1995. The visual brain in action. Oxford: Oxford University Press.
Mollon, J. 1995. Seeing colour. In Colour: Art & science, eds. T. Lamb, and J. Bouriau, 127–150. Cambridge: Cambridge University Press.
Oxford English Dictionary. 2004. Object. Retrieved on 25 December 2006 from the Oxford English Dictionary. Online: http://dictionary.oed.com/cgi/entry/00329075.
Robart, R.L., and L.D. Rosenblum. 2005. Hearing space: Identifying rooms by reflected sound. In Studies in perception and action XIII, eds. H. Heft, and K.L. Marsh, 153–161. Hillsdale: Lawrence Erlbaum.
Schutz, M., and M. Kubovy. 2009a. Causality and cross-modal integration. Journal of Experimental Psychology: Human Perception and Performance 35(6): 1791–1810.
Schutz, M., and M. Kubovy. 2009b. Deconstructing a musical illusion: Point-light representations capture salient properties of impact motions. Canadian Acoustics 37: 23–28.
Schutz, M., and S. Lipscomb. 2007. Hearing gestures, seeing music: Vision influences perceived tone duration. Perception 36: 888–897.
Stoffregen, T.A., and B.G. Bardy. 2001. On specification and the senses. Behavioral and Brain Sciences 24: 195–213.
Watkins, A.J. 1991. Central, auditory mechanisms of perceptual compensation for spectral-envelope distortion. Journal of the Acoustical Society of America 90: 2942–2955.
Watkins, A.J. 1998. The precedence effect and perceptual compensation for spectral envelope distortion. In Psychophysical and physiological advances in hearing, eds. A. Palmer, A. Rees, A.Q. Summerfield, and R. Meddis, 336–343. London: Whurr.
Watkins, A.J. 1999. The influence of early reflections on the identification and lateralization of vowels. Journal of the Acoustical Society of America 106: 2933–2944.
Watkins, A.J., and S.J. Makin. 1996. Effects of spectral contrast on perceptual compensation for spectral-envelope distortion. Journal of the Acoustical Society of America 99: 3749–3757.
Witten, I.B., J.F. Bergan, and E.I. Knudsen. 2006. Dynamic shifts in the owl’s auditory space map predict moving sound location. Nature Neuroscience 9(11): 1439–1445.
Additional information
The authors’ work is supported by grants from NEI and NIDCD.
Cite this article
Kubovy, M., Schutz, M. Audio-Visual Objects. Rev.Phil.Psych. 1, 41–61 (2010). https://doi.org/10.1007/s13164-009-0004-5
DOI: https://doi.org/10.1007/s13164-009-0004-5