Review of Zenon Pylyshyn's Seeing and Visualizing: It's Not What You Think

Catharine Abell
Department of Philosophy, Macquarie University, Sydney, NSW 2109
cabell@scmp.mq.edu.au

PSYCHE 11 (1), February 2005
PSYCHE: http://psyche.cs.monash.edu.au/

REVIEW OF: Pylyshyn, Z. 2003. Seeing and Visualizing: It's Not What You Think. Cambridge, MA: MIT Press. 563pp. US$42.98 hbk. ISBN: 0-262-16217-2.

This book has three principal aims: to show that neither vision nor mental imagery involves the creation or inspection of picture-like mental representations; to defend the claim that our visual processes are, in significant part, cognitively impenetrable; and to develop a theory of "visual indexes". In what follows, I assess Pylyshyn's success in realising each of these aims in turn. I focus primarily on his arguments against "picture theories" of vision and mental imagery, to which approximately half the book is devoted. I argue that Pylyshyn adopts an unnecessarily restricted interpretation of what it would be for mental representations to be picture-like, and that this leads him prematurely to reject the possibility of explaining the introspective evidence concerning the nature of mental imagery.

1. The Denial of Pictures in Our Heads

The first chapter and the final three chapters of this book are devoted to discussion of views according to which either vision or mental imagery is pictorial in nature. Pylyshyn is adamantly opposed to such views, and argues that those who hold them do so only either because, in the case of vision, they succumb to the "intentional fallacy" of attributing to representations properties of what they represent or because, in the case of imagery, they mistakenly believe that introspection provides us with access to the underlying structure of our mental representations. However, he construes the claim that we have picture-like mental representations very narrowly, as entailing the claim that intrinsic features of our cognitive architecture are somehow pictorial. Consequently, as I will argue, he fails to consider several possible construals of mental representations as picture-like that would enable him to explain the introspective evidence about mental imagery.

Chapter 1 comprises a detailed argument against the theory according to which vision creates a picture in the head. Pylyshyn explains the influence of the picture theory as due to the discrepancy between the information provided by our phenomenal visual experience and that contained on our retinas. Visual experience extends beyond the spatial and temporal boundaries imposed by the sensors in the fovea. Pylyshyn believes that this leads some to assume that there is an inner image that holds visual information for longer periods, and records spatially extended information by integrating the information available from the retinas at various moments. However, he argues that the informational richness of perceptual experience is better explained by the hypothesis that we have some method of identifying elements of a visible scene as they appear on successive retinal images. He claims that we assign labels to parts of a visible scene and keep track of the labels that have previously been assigned to those parts that remain in view, thus obviating the need for visual storage of off-foveal visual patterns.

Pylyshyn invokes a range of claims from experimental psychology against the picture view. Most of these rely on evidence about the interpretation of pictures to support inferences about perception in general. Such claims depend on the implicit assumption that picture interpretation depends on perceptual mechanisms which process pictures and their objects in the same way. While this is a common assumption among psychologists, it is a matter of continuing debate among philosophers (see, for example, Walton 1990, who argues that picture interpretation depends on imagination in a way that vision does not).
Moreover, even if it is warranted, this assumption causes problems for Pylyshyn's argument. He argues that vision cannot involve the creation of an inner picture because the information vision provides about a scene is not nearly so rich or uniformly detailed as that associated with pictures. However, if the interpretation of pictures depends on general visual mechanisms, the information provided by pictures should be no more or less detailed than that provided by vision itself. Furthermore, it is not at all clear that the information provided by pictures is especially rich or detailed. For example, stick figure drawings provide only very schematic information about their objects. This point is made forcefully by Michael Tye (1991) in his discussion of mental imagery, but is neglected by Pylyshyn.

The final three chapters address the nature of mental imagery. Pylyshyn's main concern in these chapters is to demonstrate the implausibility of the picture theory of mental imagery. In Chapter 6, he argues that the plausibility of the picture theory results from equivocating between images' content (the properties they represent their objects as having) and their form (the underlying system of representation they employ). In order for the picture theory to be true, he argues, the form of mental images, and not merely their contents, must be picture-like. Because those properties of images that are due to their form are unalterable, he argues, they will be cognitively impenetrable, whereas those that are due to their content will be penetrable. He then argues that the various imaging tasks whose outcomes are commonly invoked in support of the picture theory are cognitively penetrable and thus do not demonstrate that mental images are pictorial in form.

In Chapter 7, Pylyshyn seeks to support his argument that mental images are distinguished by their contents rather than their form by showing that they lack any of the essential characteristics of pictures. In particular, he argues that mental images do not share the spatial properties of pictures, and that they are not processed by the visual system in a way that supports the claim that they can themselves be seen. His argument for the latter claim relies heavily on empirical evidence for the dissociation of vision and imagery. However, this evidence does not directly support the claim that our visual mechanisms are not applied to the interpretation of mental imagery. It could equally well be explained as due to the different causes of each, one external and one internal.

Chapter 8 addresses the role of mental images in thought. Pylyshyn argues that the contribution that imagery makes to our problem solving abilities is not analogous to that which diagrams make. For example, he argues that, unlike diagrams, mental images do not enable us to notice new spatial relations. However, the evidence he cites for this (for example, that we do not notice the ambiguity of imaged Necker Cubes) may reflect limitations to our capacity to create images rather than limitations to what we can learn from them. He argues that images contribute to thought because their contents are different from those of other mental representations. Consequently, he claims, they get us to notice aspects of things that can be helpful in solving certain problems. However, while he acknowledges that the properties that images typically encode are also encoded by pictures, he denies that they share any of the other characteristics of pictures and thus denies that their role in thought is evidence that they are pictorial.
However, several philosophers (eg Lopes 1996, Peacocke 1987) argue that what makes a representation pictorial is precisely the nature of the information it provides. On such accounts, that images typically encode the same properties as pictures is constitutive of their being pictorial.

Pylyshyn's argumentative strategy shows that he takes picture theories to challenge the language of thought hypothesis, according to which all mental representations are essentially linguistic. However, it is not clear that picture theorists must deny this hypothesis. Many picture theorists deny that they need do so (for example Kosslyn 1983, Tye 1988). Kosslyn and Tye both propose that mental images are pictorial in virtue of having functional spatial properties. They have such properties in virtue of comprising digital data structures that correspond to two-dimensional matrices, which in turn function like two-dimensional displays that are interpreted in the same way as pictures. Pylyshyn argues that the functional space proposal cannot be invoked in support of the picture theory, since it does not identify any intrinsic properties of images and thus does not say anything about the underlying representational form of mental imagery. However, so long as the picture theory is construed as an attempt to explain the introspective evidence about mental imagery and not as a claim about our cognitive architecture (as indeed Kosslyn and Tye construe it), this argument has no force.

Pylyshyn acknowledges that we still lack a good account of precisely what distinguishes mental images from other mental representations. This is surely the most interesting question about mental imagery. It is therefore difficult to understand why Pylyshyn chooses to focus instead on the question of what, if anything, imagery shows about our cognitive architecture.
Pylyshyn addresses the former question only insofar as he concludes Chapter 7 by enumerating several constraints that should be met by any theory of mental imagery. These constraints require images to be explained as possessing certain features, including the following: containing information about appearances and the relative locations of objects; referring to individual things; and lacking explicit quantifiers, disjunctions and negations. He does not make any attempt to develop an account that meets such constraints himself. Moreover, he ignores the possibility that a picture theory of mental imagery which is neutral regarding the underlying structure of mental representations could meet these constraints.

There is reason to believe that such an account could succeed. Pictures typically meet the constraints that Pylyshyn identifies. Furthermore, contemporary philosophical accounts do not construe pictorial representation as essentially spatial, or in any other way that would have implications for the underlying representational structure of mental images, were they picture-like. Instead, they typically construe pictures' representational properties as depending on their capacity to elicit appropriate visual experiences from their viewers (eg Hopkins 1998, Wollheim 1987), or to engage their viewers' visual mechanisms in appropriate ways (eg Schier 1986). These accounts explain why pictures exhibit their characteristic features by appeal to the experiences they elicit or mechanisms they engage. This suggests that the apparently picture-like features of mental images could likewise be explained by appeal to the role our visual systems play in imaging, irrespective of the underlying representational structure of either vision or imagery.

2. The Cognitive Impenetrability of Early Vision

Pylyshyn devotes the second chapter to discussion of whether and in what way vision is cognitively penetrable.
He argues that the cognitive penetrability of vision is limited to the initial allocation of visual attention and to late visual processing and that there is an intermediary stage called "early vision" that is informationally encapsulated. The cognitive penetration of vision can thus take two forms: our beliefs may determine what features of a scene we attend to (Pylyshyn goes on to discuss focal attention and its underlying mechanisms in Chapters 4 and 5); and they may influence the way in which we classify the objects that are detected in early vision as chairs, or tables, etc. Pylyshyn argues persuasively that the evidence that is commonly invoked in favour of the thesis that vision is cognitively penetrable can be explained as resulting from one of three things: our allocation of focal attention; the way in which we interpret what we see; or evolved, informationally encapsulated constraints on interpretation in early vision.

In Chapter 3, Pylyshyn elaborates on the nature of the evolved constraints he believes to govern interpretation in early vision. These constraints enable the visual system to recover a unique 3D structure from proximal 2D data that are intrinsically ambiguous because they are logically compatible with an indefinite number of different 3D structures. Unlike inferences from general knowledge, he argues, these constraints do not influence processes outside the visual system and do not respond to general knowledge concerning their appropriateness or applicability. However, they reflect principles that apply frequently, but not invariably, in our world. Therefore, he claims, they usually, but not always, result in veridical visual representations. He gives examples of such constraints that are drawn both from computer vision and from studies of human vision.

3. The Allocation of Visual Indexes to Primitive Visual Objects

Pylyshyn's main claim in Chapter 4 is that our attentional focus is allocated to objects rather than locations. He claims that the visual system detects and tracks "primitive visual objects", which are tracked as individuals rather than according to their spatial location or other properties. This claim is supported by two arguments. Firstly, although Pylyshyn identifies focal attention, along with the way we interpret what we see, as one of the two sources of cognitive influence on vision, he argues that attentional focus can nonetheless only be given to "transducable" features of a scene, that is, features whose detection does not involve accessing memory or drawing inferences. If this were not the case, he argues, the notion of attentional focus would have little explanatory value, since any difference whatever between subjects' perceptual experiences could be explained as resulting from differences in what is attended to. Secondly, he argues that the transducable features to which attention is allocated must be objects in a scene, rather than properties of the scene or of objects. He presents a range of empirical evidence in support of this claim, including evidence for the endurance of attention despite change in physical location, temporal interruption, and objects' not being spatially defined. He also presents a range of neuropsychological evidence, including evidence from dysfunctions of attention.

The claim that attention is allocated to objects independently of their properties assumes that we have some means of individuating objects that does not depend on property individuation. In Chapter 5, Pylyshyn presents his argument for this assumption. He claims that there is a mechanism, called a visual index, which is deployed prior to focal attention being allocated, that individuates objects and allows them to be tracked before any of their properties have been detected.
Visual indexes are references to individual objects in the world. Pylyshyn argues that indexing is a causal, rather than a conceptual process: while certain properties of an object secure assignment of an index, the index does not represent the object it picks out as possessing those properties and continues to pick out that object however its properties may change.

The claim that vision involves an indexing mechanism is an interesting one. However, it is not clear how an index, once allocated, could track an object without representing any of its properties. Without some method of comparing those properties that initially secured allocation of an index with the properties of a subsequently viewed visual scene, there seems to be no basis for picking out a particular part of that scene as comprising the object to which the index refers. While Pylyshyn discusses a range of empirical evidence that he takes to support his claim, this evidence is not conclusive and he does not adequately defend the coherence of the notion of a visual index.

This book is valuable in many ways. It is engaging and clearly written. It also presents a persuasive defence of the claim that our visual and cognitive processes are independent, which nonetheless does not neglect the various ways in which our beliefs may influence what we see. Furthermore, it is valuable both as a summary of Pylyshyn's views on vision and mental imagery and for its presentation and discussion of the relevant empirical data. However, it does not significantly advance the issues it sets out to discuss. Much of this book re-presents arguments that Pylyshyn has previously developed elsewhere.
Moreover, in his discussion of mental imagery in the final three chapters, Pylyshyn is content to argue against a very limited range of construals of the claim that images are pictorial, and does not address either the issue of how else their pictorial nature might be construed or that of how else the difference between mental imagery and other mental representations could be explained. Unfortunately, the book is also ridden with minor typographic errors.

References

Hopkins, R. 1998. Picture, Image and Experience. Cambridge: Cambridge University Press.
Kosslyn, S.M. 1983. Ghosts in the Mind's Machine. New York: W.W. Norton.
Lopes, D. 1996. Understanding Pictures. Oxford: Clarendon Press.
Peacocke, C. 1987. Depiction. Philosophical Review XCVI (3): 383-410.
Schier, F. 1986. Deeper Into Pictures. Cambridge: Cambridge University Press.
Tye, M. 1988. The picture theory of mental images. Philosophical Review XCVII (4): 497-520.
Tye, M. 1991. The Imagery Debate. Cambridge, MA: MIT Press.
Walton, K. 1990. Mimesis as Make-Believe. Cambridge, MA: Harvard University Press.
Wollheim, R. 1987. Painting as an Art. Princeton: Princeton University Press.