Audio-Visual Objects

Review of Philosophy and Psychology

Abstract

In this paper we offer a theory of cross-modal objects. To begin, we discuss two kinds of linkages between vision and audition. The first is a duality. The visual system detects and identifies surfaces; the auditory system detects and identifies sources. Surfaces are illuminated by sources of light; sound is reflected off surfaces. However, the visual system discounts sources and the auditory system discounts surfaces. These and similar considerations lead to the Theory of Indispensable Attributes, which states the conditions for the formation of gestalts in the two modalities. The second linkage involves the formation of audiovisual objects, integrated cross-modal experiences. We describe research that reveals the role of cross-modal causality in the formation of such objects. These experiments use the canonical example of a causal link between vision and audition: a visible impact that causes a percussive sound.

[A fire is] a terrestrial event with flames and fuel. It is a source of four kinds of stimulation, since it gives off sound, odor, heat and light .... One can hear it, smell it, feel it, and see it, or get any combination of these detections, and thereby perceive a fire .... For this event, the four kinds of stimulus information and the four perceptual systems are equivalent.

If the perception of fire were a compound of separate sensations of sound, smell, warmth and color, they would have had to be associated in past experience in order to explain how any one of them could evoke memories of all the others. ...

[T]he problem of perception is not how sensations get associated; it is how the sound, the odor, the warmth, or the light that specifies fire gets discriminated from all the other sounds, odors, warmths, and lights that do not specify fire. Gibson (1966) (pp. 54–55)

In this paper, we offer a theory of cross-modal objects. We agree with Gibson’s assertion that such a theory is unlikely to be an associative theory. Instead, our theory is built on the notion of privileged inter-modal binding. As an example of such privileged binding, we will examine the relation between visible impacts and percussive sounds, which allows for a particularly powerful form of binding that produces audio-visual objects. To motivate these conclusions, we devote the first two sections of this article to a review of Kubovy and Van Valkenburg’s (Cognition 80(1–2):97–126, 2001) theory of auditory and visual objects. In the final section, we present our new approach and the empirical data that support it.

Notes

  1. The concept of privileged binding is similar to Stoffregen and Bardy’s (2001) notion of global arrays. We do not agree that their approach (and a fortiori ours) undermines the idea of separate senses; a justification of this disagreement would go beyond the scope of this article.

References

  • Amano, K., D.H. Foster, and S.M.C. Nascimento. 2006. Color constancy in natural scenes with and without an explicit illuminant cue. Visual Neuroscience 23: 351–356.

  • Armontrout, J.A., M. Schutz, and M. Kubovy. 2009. Visual determinants of a cross-modal illusion. Attention, Perception, & Psychophysics 71: 1618–1627.

  • Aschersleben, G., and P. Bertelson. 2003. Temporal ventriloquism: Crossmodal interaction on the time dimension—2. Evidence from sensorimotor synchronization. International Journal of Psychophysiology 50(1–2): 157–163.

  • Bertelson, P., and G. Aschersleben. 2003. Temporal ventriloquism: Crossmodal interaction on the time dimension—1. Evidence from auditory-visual temporal order judgment. International Journal of Psychophysiology 50(1–2): 147–155.

  • Blauert, J. 1997. Spatial hearing: The psychophysics of human sound localization. Cambridge: MIT (revised edition).

  • Bon, L., and C. Lucchetti. 2006. Auditory environmental cells and visual fixation effect in area 8B of macaque monkey. Experimental Brain Research 168(3): 441–449.

  • Bradley, D.R., and H.M. Petry. 1977. Organizational determinants of subjective contour: The subjective Necker cube. American Journal of Psychology 90: 253–262.

  • Campbell, J. 2002. Reference and consciousness. Oxford: Oxford University Press. Published to Oxford Scholarship Online: November 2003. doi:10.1093/0199243816.001.0001. Accessed: 25 December 2007.

  • Clifton, R.K., R.L. Freyman, R.Y. Litovsky, and D. McCall. 1994. Listeners’ expectations about echoes can raise or lower echo threshold. The Journal of the Acoustical Society of America 95(3): 1525–1533.

  • Clifton, R.K., R.L. Freyman, and J. Meo. 2002. What the precedence effect tells us about room acoustics. Perception & Psychophysics 64(2): 180–188.

  • Gibson, J.J. 1966. The senses considered as perceptual systems. Boston: Houghton Mifflin.

  • Goldring, J., M. Dorris, B. Corneil, P. Balantyne, and D. Munoz. 1996. Combined eye-head gaze shifts to visual and auditory targets in humans. Experimental Brain Research 111: 68–78.

  • Hötting, K., F. Rösler, and B. Röder. 2003. Crossmodal and intermodal attention modulate event-related brain potentials to tactile and auditory stimuli. Experimental Brain Research 148: 26–37.

  • Jack, C.E., and W.R. Thurlow. 1973. Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Perceptual & Motor Skills 37: 967–979.

  • Kubovy, M., and D. Van Valkenburg. 2001. Auditory and visual objects. Cognition 80(1–2): 97–126.

  • Matthen, M. 2004. Features, places, and things: Reflections on Austen Clark’s theory of sentience. Philosophical Psychology 17(4): 497–518.

  • Matthen, M. 2005. Seeing, doing, and knowing—a philosophical theory of sense perception. Oxford, UK: Oxford University Press. Published to Oxford Scholarship Online: April 2005. doi:10.1093/0199268509.001.0001. Accessed: 25 December 2007.

  • McGurk, H., and J. MacDonald. 1976. Hearing lips and seeing voices. Nature 264(5588): 746–748.

  • Milner, A.D., and M.A. Goodale. 1995. The visual brain in action. Oxford: Oxford University Press.

  • Mollon, J. 1995. Seeing colour. In Colour: Art & science, eds. T. Lamb, and J. Bourriau, 127–150. Cambridge: Cambridge University Press.

  • Oxford English Dictionary. 2004. Object. Retrieved on 25 December 2006 from the Oxford English Dictionary. Online: http://dictionary.oed.com/cgi/entry/00329075.

  • Robart, R.L., and L.D. Rosenblum. 2005. Hearing space: Identifying rooms by reflected sound. In Studies in perception and action XIII, eds. H. Heft, and K.L. Marsh, 153–161. Hillsdale: Lawrence Erlbaum.

  • Schutz, M., and M. Kubovy. 2009a. Causality and cross-modal integration. Journal of Experimental Psychology: Human Perception and Performance 35(6): 1791–1810.

  • Schutz, M., and M. Kubovy. 2009b. Deconstructing a musical illusion: Point-light representations capture salient properties of impact motions. Canadian Acoustics 37: 23–28.

  • Schutz, M., and S. Lipscomb. 2007. Hearing gestures, seeing music: Vision influences perceived tone duration. Perception 36: 888–897.

  • Stoffregen, T.A., and B.G. Bardy. 2001. On specification and the senses. Behavioral and Brain Sciences 24: 195–213.

  • Watkins, A.J. 1991. Central, auditory mechanisms of perceptual compensation for spectral-envelope distortion. Journal of the Acoustical Society of America 90: 2942–2955.

  • Watkins, A.J. 1998. The precedence effect and perceptual compensation for spectral envelope distortion. In Psychophysical and physiological advances in hearing, eds. A. Palmer, A. Rees, A.Q. Summerfield, and R. Meddis, 336–343. London: Whurr.

  • Watkins, A.J. 1999. The influence of early reflections on the identification and lateralization of vowels. Journal of the Acoustical Society of America 106: 2933–2944.

  • Watkins, A.J., and S.J. Makin. 1996. Effects of spectral contrast on perceptual compensation for spectral-envelope distortion. Journal of the Acoustical Society of America 99: 3749–3757.

  • Witten, I.B., J.F. Bergan, and E.I. Knudsen. 2006. Dynamic shifts in the owl’s auditory space map predict moving sound location. Nature Neuroscience 9(11): 1439–1445.

Author information

Corresponding author

Correspondence to Michael Kubovy.

Additional information

The authors' work is supported by grants from NEI and NIDCD.

About this article

Cite this article

Kubovy, M., Schutz, M. Audio-Visual Objects. Rev.Phil.Psych. 1, 41–61 (2010). https://doi.org/10.1007/s13164-009-0004-5

  • DOI: https://doi.org/10.1007/s13164-009-0004-5
