This paper explores the difference between Connectionist proposals for cognitive architecture and the sorts of models that have traditionally been assumed in cognitive science. We claim that the major distinction is that, while both Connectionist and Classical architectures postulate representational mental states, the latter but not the former are committed to a symbol-level of representation, or to a ‘language of thought’: i.e., to representational states that have combinatorial syntactic and semantic structure. Several arguments for combinatorial structure in mental representations are then reviewed. These include arguments based on the ‘systematicity’ of mental representation: i.e., on the fact that cognitive capacities always exhibit certain symmetries, so that the ability to entertain a given thought implies the ability to entertain thoughts with semantically related contents. We claim that such arguments make a powerful case that mind/brain architecture is not Connectionist at the cognitive level. We then consider the possibility that Connectionism may provide an account of the neural (or ‘abstract neurological’) structures in which Classical cognitive architecture is implemented. We survey a number of the standard arguments that have been offered in favor of Connectionism, and conclude that they are coherent only on this interpretation.
The computational view of mind rests on certain intuitions regarding the fundamental similarity between computation and cognition. We examine some of these intuitions and suggest that they derive from the fact that computers and human organisms are both physical systems whose behavior is correctly described as being governed by rules acting on symbolic representations. Some of the implications of this view are discussed. It is suggested that a fundamental hypothesis of this approach is that there is a natural domain of human functioning that can be addressed exclusively in terms of a formal symbolic or algorithmic vocabulary or level of analysis. Much of the paper elaborates various conditions that need to be met if a literal view of mental activity as computation is to serve as the basis for explanatory theories. The coherence of such a view depends on there being a principled distinction between functions whose explanation requires that we posit internal representations and those that we can appropriately describe as merely instantiating causal physical or biological laws. In this paper the distinction is empirically grounded in a methodological criterion called the "cognitive impenetrability condition." Functions are said to be cognitively impenetrable if they cannot be influenced by such purely cognitive factors as goals, beliefs, inferences, tacit knowledge, and so on. Such a criterion makes it possible to empirically separate the fixed capacities of mind from the particular representations and algorithms used on specific occasions. In order for computational theories to avoid being ad hoc, they must deal effectively with the "degrees of freedom" problem by constraining the extent to which they can be arbitrarily adjusted post hoc to fit some particular set of observations. This in turn requires that the fixed architectural function and the algorithms be independently validated.
It is argued that the architectural assumptions implicit in many contemporary models run afoul of the cognitive impenetrability condition, since the required fixed functions are demonstrably sensitive to tacit knowledge and goals. The paper concludes with some tactical suggestions for the development of computational cognitive theories.
Although the study of visual perception has made more progress in the past 40 years than any other area of cognitive science, there remain major disagreements as to how closely vision is tied to general cognition. This paper sets out some of the arguments for both sides and defends the position that an important part of visual perception, which may be called early vision or just vision, is prohibited from accessing relevant expectations, knowledge and utilities - in other words it is cognitively impenetrable. That part of vision is complex and articulated and provides a representation of the 3-D surfaces of objects sufficient to serve as an index into memory, with somewhat different outputs being made available to other systems such as those dealing with motor control. The paper also addresses certain conceptual and methodological issues, including the use of signal detection theory and event-related potentials to assess cognitive penetration of vision. A distinction is made among several stages in visual processing. These include, in addition to the inflexible early-vision stage, a pre-perceptual attention allocation stage and a post-perceptual evaluation, memory-accessing, and inference stage which provide several different highly constrained ways in which cognition can affect the outcome of visual perception. The paper discusses arguments that have been presented in both computer vision and psychology showing that vision is "intelligent" and involves elements of "problem solving". It is suggested that these cases do not show cognitive penetration, but rather they show that certain natural constraints on interpretation, concerned primarily with optical and geometrical properties of the world, have been compiled into the visual system. The paper also examines a number of examples where instructions and "hints" are alleged to affect ...
Although the study of visual perception has made more progress in the past 40 years than any other area of cognitive science, there remain major disagreements as to how closely vision is tied to cognition. This target article sets out some of the arguments for both sides (arguments from computer vision, neuroscience, psychophysics, perceptual learning, and other areas of vision science) and defends the position that an important part of visual perception, corresponding to what some people have called early vision, is prohibited from accessing relevant expectations, knowledge, and utilities in determining the function it computes – in other words, it is cognitively impenetrable. That part of vision is complex and involves top-down interactions that are internal to the early vision system. Its function is to provide a structured representation of the 3-D surfaces of objects sufficient to serve as an index into memory, with somewhat different outputs being made available to other systems such as those dealing with motor control. The paper also addresses certain conceptual and methodological issues raised by this claim, such as whether signal detection theory and event-related potentials can be used to assess cognitive penetration of vision.
In "Things and Places," Zenon Pylyshyn argues that the process of incrementally constructing perceptual representations, solving the binding problem (determining which properties go together), and, more generally, grounding perceptual ...
In three experiments, subjects attempted to track multiple items as they moved independently and unpredictably about a display. Performance was not impaired when the items were briefly (but completely) occluded at various times during their motion, suggesting that occlusion is taken into account when computing enduring perceptual objecthood. Unimpaired performance required the presence of accretion and deletion cues along fixed contours at the occluding boundaries. Performance was impaired when items were present on the visual field at the same times and to the same degrees as in the occlusion conditions, but disappeared and reappeared in ways which did not implicate the presence of occluding surfaces (e.g. by imploding and exploding into and out of existence, instead of accreting and deleting along a fixed contour). Unimpaired performance did not require visible occluders (i.e. Michotte’s tunnel effect) or globally consistent occluder positions. We discuss implications of these results for theories of objecthood in visual attention.
Marr (1982) may have been one of the first vision researchers to insist that in modeling vision it is important to separate the location of visual features from their type. He argued that in early stages of visual processing there must be “place tokens” that enable subsequent stages of the visual system to treat locations independent of what specific feature type was at that location. Thus, in certain respects a collinear array of diverse features could still be perceived as a line, and under certain conditions could function as such in perceptual phenomena like the Poggendorff illusion.
This paper argues that a theory of situated vision, suited for the dual purposes of object recognition and the control of action, will have to provide something more than a system that constructs a conceptual representation from visual stimuli: it will also need to provide a special kind of direct (preconceptual, unmediated) connection between elements of a visual representation and certain elements in the world. Like natural language demonstratives (such as ‘this’ or ‘that’) this direct connection allows entities to be referred to without being categorized or conceptualized. Several reasons are given for why we need such a preconceptual mechanism which individuates and keeps track of several individual objects in the world. One is that early vision must pick out and compute the relation among several individual objects while ignoring their properties. Another is that incrementally computing and updating representations of a dynamic scene requires keeping track of token individuals despite changes in their properties or locations. It is then noted that a mechanism meeting these requirements has already been proposed in order to account for a number of disparate empirical phenomena, including subitizing, search-subset selection and multiple object tracking (Pylyshyn et al., Canadian Journal of Experimental Psychology 48(2) (1994) 260). This mechanism, called a visual index or FINST, is briefly ...
It is generally accepted that there is something special about reasoning by using mental images. The question of how it is special, however, has never been satisfactorily spelled out, despite more than thirty years of research in the post-behaviorist tradition. This article considers some of the general motivation for the assumption that entertaining mental images involves inspecting a picture-like object. It sets out a distinction between phenomena attributable to the nature of mind, that is, to what is called the cognitive architecture, and ones that are attributable to tacit knowledge used to simulate what would happen in a visual situation. With this distinction in mind, the paper then considers in detail the widely held assumption that in some important sense images are spatially displayed or are depictive, and that examining images uses the same mechanisms that are deployed in visual perception. I argue that the assumption of the spatial or depictive nature of images is only explanatory if taken literally, as a claim about how images are physically instantiated in the brain, and that the literal view fails for a number of empirical reasons – for example, because of the cognitive penetrability of the phenomena cited in its favor. Similarly, while it is arguably the case that imagery and vision involve some of the same mechanisms, this tells us very little about the nature of mental imagery and does not support claims about the pictorial nature of mental images. Finally, I consider whether recent neuroscience evidence clarifies the debate over the nature of mental images. I claim that when such questions as whether images are depictive or spatial are formulated more clearly, the evidence does not provide support for the picture-theory over a symbol-structure theory of mental imagery.
Even if all the empirical claims were true, they would not warrant the conclusion that many people have drawn from them: that mental images are depictive or are displayed in some (possibly cortical) space. Such a conclusion is incompatible with what is known about how images function in thought. We are then left with the provisional counterintuitive conclusion that the available evidence does not support rejection of what I call the “null hypothesis”; namely, that reasoning with mental images involves the same form of representation and the same processes as that of reasoning in general, except that the content or subject matter of thoughts experienced as images includes information about how things would look.
In the past decade there has been renewed interest in the study of mental imagery. Emboldened by new findings from neuroscience, many people have revived the idea that mental imagery involves a special format of thought, one that is pictorial in nature. But the evidence and the arguments that exposed deep conceptual and empirical problems in the picture theory over the past 300 years have not gone away. I argue that the new evidence from neural imaging and clinical neuropsychology does little to justify this recidivism because it does not address the format of mental images. I also discuss some reasons why the picture theory is so resistant to counterarguments and suggest ways in which non-pictorial theories might account for the apparent spatial nature of images.
The task of tracking a small number (about four or five) of visual targets within a larger set of identical items, each of which moves randomly and independently, has been used extensively to study object-based attention. Analysis of this multiple object tracking (MOT) task shows that it logically entails solving the correspondence problem for each target over time, and thus that the individuality of each of the targets must be tracked. This suggests that when successfully tracking objects, observers must also keep track of them as unique individuals. Yet in the present studies we show that observers are poor at recalling the identity of successfully tracked objects (as specified by a unique identifier associated with each target, such as a number or starting location). Studies also show that the identity of targets tends to be lost when they come close together and that this tendency is greater between pairs of targets than between targets and nontargets. The significance of this finding in relation to the multiple object tracking paradigm is discussed.
Using a novel enumeration task, we examined the encoding of spatial information during subitizing. Observers were shown masked presentations of randomly placed discs on a screen and were required to mark the perceived locations of these discs on a subsequent blank screen. This provided a measure of recall for object locations and an indirect measure of display numerosity. Observers were tested on three stimulus durations and eight numerosities. Enumeration performance was high for displays containing up to six discs, a higher subitizing range than reported in previous studies. Error in the location data was measured as the distance between corresponding stimulus and response discs. Overall, location errors increased in magnitude with larger numerosities and shorter display durations. When errors were computed as disc distance from display centroid, the results suggested a compressed representation by observers. Additionally, enumeration and localization accuracy increased with display regularity.
Establishment holds that the psychological mechanism of inference is the ment psychological theorizing. Moreover, given this conciliatory reading, transformation of mental representations, it follows that perception is in.
I recently discovered that work I was doing in the laboratory and in theoretical writings was implicitly taking a position on a set of questions that philosophers had been worrying about for much of the past 30 or more years. My clandestine involvement in philosophical issues began when a computer science colleague and I were trying to build a model of geometrical reasoning that would draw a diagram and notice things in the diagram as it drew it (Pylyshyn, Elcock, Marmor, & Sander, 1978). One problem we found we had to face was that if the system discovered a right angle it had no way to tell whether this was the intersection of certain lines it had drawn earlier while constructing a certain figure, and if so which particular lines they were. Moreover, the model had no way of telling whether this particular right angle was identical to some bit of drawing it had earlier encountered and represented as, say, the base of a particular triangle. There was, in other words, no way to determine the identity of an element (I use the term “element” when referring to a graphical unit such as used in experiments. Otherwise when speaking informally I use the term “thing” on the grounds that nobody would mistake that term for a technical theoretical construct. Eventually I end up calling them “Visual Objects” to conform to usage in psychology) at two different times if it was represented differently at those times. This led to some speculation about the need for what we called a “finger” that could be placed at a particular element of interest and that could be used to identify it as a particular token thing (the way you might identify a particular feature on paper by labeling it). In general we needed something like a finger that would stay attached to a particular element and could be used to maintain a correspondence between the individual element that was just noticed now and one that had been represented in some fashion at an earlier time.
The idea of such fingers (which ...
influence. One of the principal characteristics that distinguishes Cognitive Science from more traditional studies of cognition within Psychology is the extent to which it has been influenced by both the ideas and the techniques of computing. It may come as a surprise to the outsider, then, to discover that there is no unanimity within the discipline on either (a) the nature (and in some cases the desirability) of the influence or (b) what computing is, or at least on its ...
We present three studies examining whether multiple-object tracking (MOT) benefits from the active inhibition of nontargets, as proposed in Pylyshyn (2004). Using a probe-dot technique, the first study showed poorer probe detection on nontargets than on either the targets being tracked or in the empty space between objects. The second study used a matching nontracking task to control for possible masking of probes, independent of target tracking. The third study examined how localized the inhibition is to individual nontargets. The results of these three studies led to the conclusion that nontargets are subject to a highly localized object-based inhibition. Implications of this finding for the FINST visual index theory are discussed. We suggest that we need to distinguish between the differentiation (or individuation) of enduring token objects and the process of making the objects accessible through indexes, with only the latter being limited to 4 or 5 objects.
6. Seeing With the Mind’s Eye 1: The Puzzle of Mental Imagery. 6.1 What is the puzzle about mental imagery? 6.2 Content, form and substance of representations. 6.3 What is responsible for the pattern of results obtained in imagery studies? ...
In four experiments we address the question whether several visual objects can be selected voluntarily (endogenously) and then tracked in a Multiple Object Tracking paradigm and, if so, whether the selection involves a different process. Experiment 1 showed that items can indeed be selected based on their labels. Experiment 2 showed that to select the complement set to a set that is automatically (exogenously) selected (e.g. to select all objects not flashed) observers require additional time and that given 1080 ms they were able to select and track them as well as those selected automatically. Experiment 3 showed that the additional time needed in the previous experiment cannot be attributed solely to time required to disengage attention from the initially automatic selections. Experiment 4 showed that the added time provides a monotonically greater benefit when there are more targets, suggesting a serial process. These results are discussed in relation to the Visual Index (FINST) theory which assumes that visual indexes are captured by a data-driven process. It is suggested that voluntarily allocated attention can be used to facilitate the automatic attention capture by objects of interest.
The target article proposes that visual experience arises when sensorimotor contingencies are exploited in perception. This novel analysis of visual experience fares no better than the other proposals that the article rightly dismisses, and for the same reasons. Extracting invariants may be needed for recognition, but it is neither necessary nor sufficient for having a visual experience. While the idea that vision involves the active extraction of sensorimotor invariants has merit, it does not replace the need for perceptual representations. Vision is not just for the immediate controlling of action; it is also for finding out about the world, from which inferences may be drawn and beliefs changed.
called Cognitive Science was to bring back scientific realism. This may strike you as a very odd claim, for one does not usually think of science as needing to be talked into scientific realism. Science is, after all, the study of reality by the most precise instruments of measurement and ...
It is argued that the traditional distinction between artificial intelligence and cognitive simulation amounts to little more than a difference in style of research - a different ordering in goal priorities and different methodological allegiances. Both enterprises are constrained by empirical considerations and both are directed at understanding classes of tasks that are defined by essentially psychological criteria. Because of the different ordering of priorities, however, they occasionally take somewhat different stands on such issues as the power/generality trade-off and on the relevance of the sort of data collected in experimental psychology laboratories.
The target article claimed that although visual apprehension involves all of general cognition, a significant component of vision (referred to as early vision) works independently of cognition and yet is able to provide a surprisingly high level interpretation of visual inputs, roughly up to identifying general shape-classes. The commentators were largely sympathetic, but frequently disagreed on how to draw the boundary, on exactly what early vision delivers, on the role that attention plays, and on how to interpret the neurophysiological data showing top-down effects. A significant number simply asserted that they were not willing to accept any distinction between vision and cognition, and a surprising number even felt that we could never tell for sure, so why bother? Among the topics covered were the relation of cognition and consciousness, the relation of early vision to other modules such as face recognition and language, and the role of natural constraints.
People have always wondered how thinking takes place and what thoughts are constructed from. We typically experience our thoughts as involving pictorial (or sensory) contents or as being in words. Although this idea has been enshrined in psychology as the “dual code” theory of reasoning and memory, serious questions have been raised concerning this view. It appears that, whatever the form of our thoughts, it is unlikely that it is anything like our experience of them. But if thought is not in pictures or words, what form does it take? If we do not sometimes think in words, then what actually goes on when we think by engaging in an “inner dialogue”? And if we do not sometimes think in pictures, what goes on when we reason by creating and examining “mental images”?
You might reasonably surmise from the title of this paper that I will be discussing a theory of vision. After all, what is a theory of vision but a theory of how the world is connected to our visual representations? Theories of visual perception universally attempt to give an account of how a proximal stimulus (presumably a pattern impinging on the retina) can lead to a rich representation of a three-dimensional world and thence to either the recognition of known objects or to the coordination of actions with visual information. Such theories typically provide an effective (i.e., computable) mapping from a 2D pattern to a representation of a 3D scene, usually in the form of a symbol structure. But such a mapping, though undoubtedly the essential purpose of a theory of vision, leaves at least one serious problem that I intend to discuss here. It is this problem, rather than a theory of vision itself, that is the subject of this talk.
Jacques Mehler was notoriously charitable in embracing a diversity of approaches to science and to the use of many different methodologies. One place where his ecumenism brought the two of us into disagreement is when the evidence of brain imaging was cited in support of different psychological doctrines, such as the picture-theory of mental imagery. Jacques remained steadfast in his faith in the ability of neuroscience data (where the main source of evidence has been from clinical neurology and neuro-imaging) to choose among different psychological positions. I personally have seen little reason for this optimism so Jacques and I frequently found ourselves disagreeing on this issue, though I should add that we rarely disagreed on substantive issues on which we both had views. This particular bone of contention, however, kept us busy at parties and during the many commutes between New York and New Jersey, where Jacques was a frequent visitor at the Rutgers Center for Cognitive Science. Now that I am in a position where he is a captive audience it seems an opportune time to raise the question again.