Word recognition is the Petri dish of the cognitive sciences. The processes hypothesized to govern naming, identifying and evaluating words have shaped this field since its origin in the 1970s. Techniques to measure lexical processing are not just the backbone of the typical experimental psychology laboratory, but are now routinely used by cognitive neuroscientists to study brain processing and, increasingly, by social and clinical psychologists (Eder, Hommel, and De Houwer 2007). Models developed to explain lexical processing have also aspired to be statements about the nature of human cognition (e.g., connectionist models; Plaut, McClelland, Seidenberg, and Patterson 1996). Words were convenient objects of study for cognitive psychologists because they are well-defined, and their nature as alphabetic strings was a good fit to analysis with the computer programming languages of the 1970s and 1980s, which excelled at string manipulation. But are words actually the privileged unit of mental representation and processing that all of this scientific attention makes them out to be? Like a growing number of other language researchers, our answer is no (see, e.g., Bybee and Hopper 2001; Wray 2002). We propose that the mental representations for lexical structures form a continuum, from word combinations that have fossilized into single units (nightclub) to those that both exist as independent units and yet have bonds, varying in tightness, with the words with which they frequently co-occur (Harris 1998). The first line of support for this view is the simple observation that fluent speakers easily recognize the familiarity and cohesive quality of word combinations in their language. Examples in English include common noun compounds (last year, brand new), verb phrases (cut down, get a hold of, faced with) and other multi-word expressions such as common sayings and references to cultural concepts (saved by the bell, speed of light; Jackendoff 1995).
We tested the hypothesis that more frequent exposure to multiword phrases results in deeper entrenchment of their representations by examining the performance of subjects differing in religiosity in the recognition of briefly presented liturgical and secular phrases drawn from several frequency classes. Three of the sources were prayer texts that religious Jews are required to recite on a daily, weekly, and annual basis, respectively; two others were common and rare expressions encountered in the general secular Israeli culture. As expected, a linear dependence of recognition score on frequency was found for the religious subjects (being most pronounced for men, who are usually more observant than women); both religious and secular subjects performed better on common than on rare general-culture items. Our results support the notion of graded entrenchment introduced by Langacker and shared by several cognitive linguistic theories of language comprehension and production.
The distributional principle, according to which morphemes that occur in identical contexts belong, in some sense, to the same category, has been advanced as a means for extracting syntactic structures from corpus data. We extend this principle by applying it recursively, and by using mutual information for estimating category coherence. The resulting model learns, in an unsupervised fashion, highly structured, distributed representations of syntactic knowledge from corpora. It also exhibits promising behavior in tasks usually thought to require representations anchored in a grammar, such as systematicity.
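A minimal sketch of one way to estimate category coherence with mutual information, in the spirit of the model just described (the exact estimator and context definition used by the model may differ; the corpus, the candidate categories, and the single-word context window here are illustrative assumptions):

```python
# Score a candidate word category by the mutual information between member
# identity and the single-word contexts in which the members occur.
# Low MI means members are largely interchangeable across contexts, taken here
# as a proxy for high category coherence (an assumption of this sketch).
from collections import Counter
from math import log2

def category_coherence(corpus_sentences, category):
    """corpus_sentences: list of token lists; category: set of candidate members."""
    joint = Counter()                       # counts of (member, adjacent context word)
    for sent in corpus_sentences:
        for i, w in enumerate(sent):
            if w in category:
                for ctx in (sent[i - 1] if i > 0 else "<s>",
                            sent[i + 1] if i + 1 < len(sent) else "</s>"):
                    joint[(w, ctx)] += 1
    total = sum(joint.values())
    if total == 0:
        return None
    p_w, p_c = Counter(), Counter()
    for (w, c), n in joint.items():
        p_w[w] += n
        p_c[c] += n
    mi = 0.0
    for (w, c), n in joint.items():
        mi += (n / total) * log2(n * total / (p_w[w] * p_c[c]))
    return mi   # lower MI ~ more interchangeable members ~ more coherent category

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"],
          ["the", "dog", "ran"], ["the", "cat", "ran"]]
print(category_coherence(corpus, {"cat", "dog"}))   # near 0: highly interchangeable
print(category_coherence(corpus, {"cat", "sat"}))   # larger: members occupy different slots
```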
Beer’s paper devotes much energy to buttressing the walls of Castle Dynamic and dredging its moat in the face of what some of its dwellers perceive as a besieging army chanting “no cognition without representation”. The divide is real, as attested by the contrast between titles such as “Intelligence without representation” (Brooks, 1991) and “In defense of representation” (Markman and Dietrich, 2000), to pick just one example from each side. It is, however, not too late for people from both sides of the moat to meet on the drawbridge and see if all that energy can be put to a better use. The parley can be organized around an attempt to identify those attributes of representations that both sides may find useful. In my view, one such attribute is the capacity for hierarchical abstraction, an aspect of the representational approach not actually mentioned by Beer. The capacity for abstraction is gained automatically by those theoretical frameworks that opt for the explanatory benefits of representation in understanding cognition. Moreover, without appropriately structured mediating states, a cognitive system would be incapable of dealing with complex reality. Thus, hierarchical abstraction is useful not just for the cognitive scientists in their attempts to scale up their understanding of cognition: it is also indispensable for cognitive systems that aspire to scale up their understanding of the world.
A metaphor that has dominated linguistics for the entire duration of its existence as a discipline views sentences as edifices consisting of Lego-like building blocks. It is assumed that each sentence is constructed (and, on the receiving end, parsed) de novo, proceeding from atomic constituents to logical semantic specifications (or, in parsing, the other way around), in a recursive process governed by a few precise algebraic rules. The assumptions underlying the Lego metaphor, as it is expressed in generative grammar theories, are: (1) perfect regularity of what Saussure called langue, (2) infinite potential recursivity of syntactic structures, (3) unlimited human capacity for linguistic creativity, (4) the impossibility of acquiring structural knowledge from examples, and (5) the impossibility of such knowledge being stored in a memory-intensive form (ensembles of exemplars).
We describe a unified framework for the understanding of structure representation in primate vision. A model derived from this framework is shown to be effectively systematic in that it has the ability to interpret and associate together objects that are related through a rearrangement of common “middle-scale” parts, represented as image fragments. The model addresses the same concerns as previous work on compositional representation through the use of what+where receptive fields and attentional gain modulation. It does not require prior exposure to the individual parts, and avoids the need for abstract symbolic binding.
We compare our model of unsupervised learning of linguistic structures, ADIOS [1, 2, 3], to some recent work in computational linguistics and in grammar theory. Our approach resembles Construction Grammar in its general philosophy (e.g., in its reliance on structural generalizations rather than on syntax projected by the lexicon, as in the current generative theories), and Tree Adjoining Grammar in its computational characteristics (e.g., in its apparent affinity with Mildly Context-Sensitive Languages). The representations learned by our algorithm are truly emergent from the (unannotated) corpus data, whereas those found in published works on cognitive and construction grammars and on TAGs are hand-tailored. Thus, our results complement and extend both the computational and the more linguistically oriented research into language acquisition. We conclude by suggesting how empirical and formal study of language can best be integrated.
Nearest-neighbor correlation-based similarity computation in the space of outputs of complex-type receptive fields can support robust recognition of 3D objects. Our experiments with four collections of objects resulted in mean recognition rates between 84% (for subordinate-level discrimination among 15 quadruped animal shapes) and 94% (for basic-level recognition of 20 everyday objects), over a 40° × 40° range of viewpoints, centered on a stored canonical view and related to it by rotations in depth. This result has interesting implications for the design of a front end to an artificial object recognition system, and for the understanding of the faculty of object recognition in primate vision.
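The recognition scheme itself is easy to caricature in code. In the sketch below, the complex-type receptive fields are stood in for by rectified random linear filters max-pooled over small windows (an assumption of this sketch, not the receptive-field model used in the experiments), and a test view is assigned the label of the stored view whose response vector correlates with it most strongly:

```python
# Nearest-neighbor recognition by correlation over "complex RF" response vectors.
import numpy as np
rng = np.random.default_rng(0)

def rf_bank(n_filters=64, size=7):
    return rng.standard_normal((n_filters, size, size))

def encode(image, filters, pool=4):
    """Vector of pooled, rectified filter responses (stand-in for complex RF outputs)."""
    n, s, _ = filters.shape
    H, W = image.shape
    feats = []
    for f in filters:
        resp = np.zeros((H - s + 1, W - s + 1))
        for i in range(resp.shape[0]):
            for j in range(resp.shape[1]):
                resp[i, j] = abs(np.sum(image[i:i + s, j:j + s] * f))   # rectification
        ph, pw = resp.shape[0] // pool, resp.shape[1] // pool
        pooled = resp[:ph * pool, :pw * pool].reshape(ph, pool, pw, pool).max(axis=(1, 3))
        feats.append(pooled.ravel())                                    # tolerance to small shifts
    return np.concatenate(feats)

def correlation(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def recognize(test_vec, stored):          # stored: list of (label, response vector)
    return max(stored, key=lambda lv: correlation(test_vec, lv[1]))[0]

filters = rf_bank()
stored_views = [("obj_a", encode(rng.standard_normal((24, 24)), filters)),
                ("obj_b", encode(rng.standard_normal((24, 24)), filters))]
probe = rng.standard_normal((24, 24))
print(recognize(encode(probe, filters), stored_views))
```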
We report a quantitative analysis of the cross-utterance coordination observed in child-directed language, where successive utterances often overlap in a manner that makes their constituent structure more prominent, and describe the application of a recently published unsupervised algorithm for grammar induction to the largest available corpus of such language, producing a grammar capable of accepting and generating novel well-formed sentences. We also introduce a new corpus-based method for assessing the precision and recall of an automatically acquired generative grammar without recourse to human judgment. The present work sets the stage for the eventual development of more powerful unsupervised algorithms for language acquisition, which would make use of the coordination structures present in natural child-directed speech.
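The abstract does not spell the evaluation method out, so the following is only an illustrative scheme in the same spirit, not the method introduced in the paper: recall is estimated as the fraction of held-out corpus sentences the learned grammar accepts, and precision as the fraction of sentences generated by the learned grammar that are accepted by a reference grammar induced from a disjoint portion of the corpus, so that no human judgment is needed. The toy bigram "grammar" below merely makes the evaluation code runnable.

```python
import random

class BigramGrammar:
    """Toy stand-in for an induced grammar: accepts a sentence iff all its
    bigrams were seen in training; generates by a random walk over bigrams."""
    def __init__(self, sentences):
        self.bigrams = {}
        for s in sentences:
            toks = ["<s>"] + s + ["</s>"]
            for a, b in zip(toks, toks[1:]):
                self.bigrams.setdefault(a, set()).add(b)
    def accepts(self, sentence):
        toks = ["<s>"] + sentence + ["</s>"]
        return all(b in self.bigrams.get(a, ()) for a, b in zip(toks, toks[1:]))
    def generate(self, rng, max_len=10):
        out, cur = [], "<s>"
        while cur != "</s>" and len(out) < max_len:
            cur = rng.choice(sorted(self.bigrams.get(cur, {"</s>"})))
            if cur != "</s>":
                out.append(cur)
        return out

def recall(grammar, held_out):
    return sum(grammar.accepts(s) for s in held_out) / len(held_out)

def precision(grammar, reference, rng, n=200):
    samples = [grammar.generate(rng) for _ in range(n)]
    return sum(reference.accepts(s) for s in samples) / n

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "dog", "ran"],
          ["the", "cat", "ran"], ["a", "cat", "sat"], ["the", "dog", "ran"]]
rng = random.Random(0)
learned = BigramGrammar(corpus[:3])
reference = BigramGrammar(corpus[3:])       # induced from a disjoint portion
print("recall:", recall(learned, corpus[3:]),
      "precision:", precision(learned, reference, rng))
```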
This paper examines four current theoretical approaches to the representation and recognition of visual objects: structural descriptions, geometric constraints, multidimensional feature spaces, and shape-space approximation. The strengths and the weaknesses of the theories are considered, with a special focus on their approach to categorization — a computationally challenging task which is not widely addressed in computer vision (where the stress is rather on the generalization of recognition across changes of viewpoint).
We examined the role of fitness, commonly assumed without proof to be conferred by the mastery of language, in shaping the dynamics of language evolution. To that end, we introduced island migration (a concept borrowed from population genetics) into the shared lexicon model of communication (Nowak et al., 1999). The effect of fitness linear in language coherence was compared to a control condition of neutral drift. We found that in the neutral condition (no coherence-dependent fitness) even a small migration rate – less than 1% – suffices for one language to become dominant, albeit after a long time. In comparison, when fitness-based selection is introduced, the subpopulations stabilize quite rapidly to form several distinct languages. Our findings support the notion that language confers increased fitness. The possibility that a shared language evolved as a result of neutral drift appears less likely, unless migration rates over evolutionary times were extremely small.
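A drastically simplified simulation of the manipulation described above (each agent "speaks" one of several discrete language variants rather than holding a full lexicon matrix, and all parameter values are assumptions of this sketch) contrasts neutral drift with fitness that is linear in local coherence under a small migration rate; it is meant to illustrate the setup, not to reproduce the paper's quantitative findings:

```python
import random
random.seed(0)

N_ISLANDS, ISLAND_SIZE, N_LANGS = 4, 100, 8

def step(islands, migration_rate, selection):
    new_islands = []
    for isl in islands:
        if selection:                               # fitness linear in local coherence
            counts = {l: isl.count(l) for l in set(isl)}
            fitness = [1.0 + counts[l] / len(isl) for l in isl]
        else:                                       # neutral drift
            fitness = [1.0] * len(isl)
        new_islands.append(random.choices(isl, weights=fitness, k=len(isl)))
    # migration: swap a small number of agents between randomly chosen islands
    for _ in range(int(migration_rate * ISLAND_SIZE)):
        a, b = random.sample(range(N_ISLANDS), 2)
        i, j = random.randrange(ISLAND_SIZE), random.randrange(ISLAND_SIZE)
        new_islands[a][i], new_islands[b][j] = new_islands[b][j], new_islands[a][i]
    return new_islands

def dominant_share(islands):
    pool = [l for isl in islands for l in isl]
    return max(pool.count(l) for l in set(pool)) / len(pool)

for selection in (False, True):
    islands = [[random.randrange(N_LANGS) for _ in range(ISLAND_SIZE)]
               for _ in range(N_ISLANDS)]
    for _ in range(2000):
        islands = step(islands, migration_rate=0.005, selection=selection)
    label = "coherence-linked fitness" if selection else "neutral drift"
    print(f"{label}: share of most common variant = {dominant_share(islands):.2f}")
```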
An image of a face depends not only on its shape, but also on the viewpoint, illumination conditions, and facial expression. A face recognition system must overcome the changes in face appearance induced by these factors. This paper investigates two related questions: the capacity of the human visual system to generalize the recognition of faces to novel images, and the level at which this generalization occurs. We approach these problems by comparing the identification and generalization capacity for upright and inverted faces. For upright faces, we found remarkably good generalization to novel conditions. For inverted faces, the generalization to novel views was significantly worse for both new illumination and viewpoint, although the performance on the training images was similar to the upright condition. Our results indicate that at least some of the processes that support generalization across viewpoint and illumination are neither universal (because subjects did not generalize as easily for inverted faces as for upright ones), nor strictly object-specific (because in upright faces nearly perfect generalization was possible from a single view, by itself insufficient for building a complete object-specific model). We propose that generalization in face recognition occurs at an intermediate level that is applicable to a class of objects, and that at this level upright and inverted faces initially constitute distinct object classes.
The computational program for theoretical neuroscience initiated by Marr and Poggio (1977) calls for a study of biological information processing on several distinct levels of abstraction. At each of these levels — computational (defining the problems and considering possible solutions), algorithmic (specifying the sequence of operations leading to a solution) and implementational — significant progress has been made in the understanding of cognition. In the past three decades, computational principles have been discovered that are common to a wide range of functions in perception (vision, hearing, olfaction) and action (motor control). More recently, these principles have been applied to the analysis of cognitive tasks that require dealing with structured information, such as visual scene understanding and analogical reasoning. Insofar as language relies on cognition-general principles and mechanisms, it should be possible to capitalize on the recent advances in the computational study of cognition by extending its methods to linguistics.
Lasnik’s review of the Minimalist program in syntax offers cognitive scientists help in navigating some of the arcana of the current theoretical thinking in transformational generative grammar. One may observe, however, that this journey is more like a taxi ride gone bad than a free tour: it is the driver who decides on the itinerary, and questioning his choice may get you kicked out. Meanwhile, the meter in the cab of the generative theory of grammar is running, and has been since the publication of Chomsky’s Syntactic Structures in 1957. The fare that it ran up is none the less daunting for the detours made in his Aspects of the Theory of Syntax in 1965, Government and Binding in 1981, and now The Minimalist Program, in 1995. Paraphrasing Winston Churchill, it seems that never in the field of cognitive science was so much owed by so many of us to so few (the generative linguists). For most of us in the cognitive sciences this situation will appear quite benign (that is, if we don’t hold a grudge for having been taken for a longer than necessary ride), if we realize that it is the generative linguists who should by rights be paying this bill. The reason for that is simple and is well known in the philosophy of science: putting forward a theory is like taking out a loan, to be repaid by gleaning an empirical basis for it; theories that fail to do so (or their successors that may have bought their debts) are declared bankrupt. In the sciences of the mind, this maxim translates into the need to demonstrate the psychological (behavioral), and, eventually, the neurobiological, reality of the theoretical constructs. Many examples of this process can be found in the study of human vision, where, as in language, direct observation of the underlying mechanisms is difficult; for instance, the concept of multiple parallel spatial frequency channels, introduced in the late 1960s, was completely vindicated by purely behavioral means over the following decade. In linguistics, the nature of the requisite evidence is well described by Townsend and Bever: “What do we test today if we want to explore the behavioral implications of syntax? …
By what empirical means can a person determine whether he or she is presently awake or dreaming? Any conceivable test addressing this question, which is a special case of the classical metaphysical doubting of reality, must be statistical (for the same reason that empirical science is, as noted by Hume). Subjecting the experienced reality to any kind of statistical test (for instance, a test for bizarreness) requires, however, that a set of baseline measurements be available. In a dream, or in a simulation, any such baseline data would be vulnerable to tampering by the same processes that give rise to the experienced reality, making the outcome of a reality test impossible to trust. Moreover, standard cryptographic defenses against such tampering cannot be relied upon, because of the potentially unlimited reach of reality modification within a dream, which may range from the integrity of the verification keys to the declared outcome of the entire process. In the face of this double predicament, the rational course of action is to take reality at face value. The predicament also has some intriguing corollaries. In particular, even the most revealing insight that a person may gain into the ultimate nature of reality (for instance, by attaining enlightenment in the Buddhist sense) is ultimately unreliable, for the reasons just mentioned. At the same time, to adhere to this principle, one has to be aware of it, which may not be possible in various states of reduced or altered cognitive function such as dreaming or religious experience. Thus, a subjectively enlightened person may still lack the one truly important piece of the puzzle concerning his or her existence.
Construction-based approaches to syntax (Croft, 2001; Goldberg, 2003) posit a lexicon populated by units of various sizes, as envisaged by Langacker (1987). Constructions may be specified completely, as in the case of simple morphemes or idioms such as take it to the bank, or partially, as in the expression what’s X doing Y?, where X and Y are slots that admit fillers of particular types (Kay and Fillmore, 1999). Constructions offer an intriguing alternative to traditional rule-based syntax by hinting at the extent to which the complexity of language can stem from a rich repertoire of stored, more or less entrenched (Harris, 1998) representations that address both syntactic and semantic issues, and encompass, in addition to general rules, “totally idiosyncratic forms and patterns of all intermediate degrees of generality” (Langacker, 1987, p. 46). Because constructions are by their very nature language-specific, the question of acquisition in Construction Grammar is especially pressing. We address this issue by offering an unsupervised algorithm that learns constructions from raw corpora.
(a) Learn a grammar G_A for the source language (A). (b) Estimate a structural statistical language model SSLM_A for (A). Given a grammar (consisting of terminals and nonterminals) and a partial sentence (a sequence of terminals t_1 … t_i), an SSLM assigns probabilities to the possible choices of the next terminal t_{i+1}.
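A minimal sketch of the SSLM interface described above: given the terminals t_1 … t_i produced so far, return a probability distribution over the next terminal t_{i+1}. The toy grammar standing in for G_A and the suffix-based probability estimate are assumptions of this sketch; a genuinely structural SSLM would condition on the grammar's constituent structure rather than on a fixed-length suffix.

```python
import random
from collections import Counter, defaultdict
random.seed(0)

grammar_A = {                              # toy CFG standing in for the learned G_A
    "S": [["NP", "VP"]],
    "NP": [["the", "cat"], ["the", "dog"], ["a", "bird"]],
    "VP": [["sleeps"], ["chases", "NP"]],
}

def generate(symbol="S"):
    """Sample a terminal string from the toy grammar."""
    if symbol not in grammar_A:
        return [symbol]
    return [t for s in random.choice(grammar_A[symbol]) for t in generate(s)]

class SSLM:
    def __init__(self, sentences, order=2):
        self.order, self.counts = order, defaultdict(Counter)
        for sent in sentences:
            toks = ["<s>"] * order + sent + ["</s>"]
            for i in range(order, len(toks)):
                self.counts[tuple(toks[i - order:i])][toks[i]] += 1
    def next_terminal_probs(self, prefix):
        """P(t_{i+1} | t_1 ... t_i), estimated here from the suffix of the prefix."""
        ctx = tuple((["<s>"] * self.order + prefix)[-self.order:])
        c = self.counts[ctx]
        total = sum(c.values())
        return {t: n / total for t, n in c.items()} if total else {}

model = SSLM([generate() for _ in range(2000)])
print(model.next_terminal_probs(["the"]))          # e.g. distribution over {cat, dog}
print(model.next_terminal_probs(["the", "cat"]))   # distribution over VP-initial terminals
```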
Two of the premises of the target paper -- surface reconstruction as the goal of early vision, and inaccessibility of intermediate stages in the process presumably leading to such reconstruction -- are questioned and found wanting.
Language is a rewarding field if you are in the prediction business. A reader who is fluent in English and who knows how academic papers are typically structured will readily come up with several possible guesses as to where the title of this section could have gone, had it not been cut short by the ellipsis. Indeed, in the more natural setting of spoken language, anticipatory processing is a must: performance of machine systems for speech interpretation depends critically on the availability of a good predictive model of how utterances unfold in time (Baker, 1975; Jelinek, 1990; Goodman, 2001), and there is strong evidence that prospective uncertainty affects human sentence processing too (Jurafsky, 2003; Hale, 2006; Levy, 2008). The human ability to predict where the current utterance is likely to be going is just another adaptation to the general pressure to anticipate the future (Hume, 1748; Dewey, 1910; Craik, 1943), be it in perception, thinking, or action, which is exerted on all cognitive systems by evolution (Dennett, 2003). Look-ahead in language is, however, special in one key respect: language is a medium for communication, and in communication the most interesting (that is, informative) parts of the utterance that the speaker is working through are those that cannot be predicted by the listener ahead of time.
Subjects differentially rated pairwise similarity when confronted with two pairs of objects, each revolving in a separate window on a computer screen. Subject data were pooled using individually weighted MDS (ref. 11; in all the experiments, the solutions were consistent among subjects). In each trial, the subject had to select, among two pairs of shapes, the one consisting of the most similar shapes. The subjects were allowed to respond at will; most responded within 10 sec. Proximity (that is, perceived similarity) tables derived from the judgments were processed to verify their degree of transitivity (4% of all triplets were found intransitive) and then submitted to MDS. In the long-term memory (LTM) variant of this experiment, the subjects were first trained to associate a label (a three-letter nonsensical string, such as "BON" or "POM") with each object and then carried out the pairs-of-pairs comparison task from memory, prompted by the object labels rather than by the objects themselves. Six subjects participated in each of the two LTM experiments (Star and Triangle). The subjects were taught each shape in a separate session and had to discriminate between that shape and six similar nontargets from various viewpoints. Training continued until the recognition rate reached 90%, over a period of several days. The subjects were never exposed to more than one target in one session and were not told the ultimate purpose of the experiment. After 2 to 3 days of rest, they were tested with questions such as: "is the BON more similar to POM than TOC to ROX?", for all pairs of pairs of stimuli. In the LTM experiments, 8% of the …
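One way (an assumption of this sketch, not necessarily the procedure used in the study) to operationalize the transitivity check mentioned above is to build a dominance relation over object pairs from the pairs-of-pairs judgments and count the triplets of pairs that form a cycle:

```python
from itertools import combinations

def intransitivity_rate(wins):
    """wins[(P, Q)] == True means pair P was judged more similar than pair Q;
    every unordered pair of object-pairs is assumed to have been compared."""
    def beats(a, b):
        return wins[(a, b)] if (a, b) in wins else not wins[(b, a)]
    items = sorted({p for pq in wins for p in pq})
    bad = total = 0
    for P, Q, R in combinations(items, 3):
        score = {x: 0 for x in (P, Q, R)}
        for a, b in ((P, Q), (Q, R), (P, R)):
            score[a if beats(a, b) else b] += 1
        bad += sorted(score.values()) == [1, 1, 1]   # a cycle: each beats exactly one other
        total += 1
    return bad / total

# toy example: object pairs labeled by letters; one deliberately cyclic triplet
judgments = {("AB", "CD"): True, ("CD", "EF"): True, ("EF", "AB"): True,   # cycle
             ("AB", "GH"): True, ("CD", "GH"): True, ("EF", "GH"): True}
print(f"{intransitivity_rate(judgments):.0%} of triplets are intransitive")
```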
Supposing the symbol system postulated by Barsalou is perceptual through and through -- what then? The target article outlines an intriguing and exciting theory of cognition in which (1) well-specified, event- or object-linked percepts assume the role traditionally allotted to abstract and arbitrary symbols, and (2) perceptual simulation is substituted for processes traditionally believed to require symbol manipulation, such as deductive reasoning. We take a more extreme stance on the role of perception (in particular, vision) in shaping cognition, and propose, in addition to Barsalou's postulates, that (3) spatial frames, endowed with a perceptual structure not unlike that of the retinotopic space, pervade all sensory modalities and are used to support compositionality.
We describe a new approach to the visual recognition of cursive handwriting. An effort is made to attain humanlike performance by using a method based on pictorial alignment and on a model of the process of handwriting.
Idealized models of receptive fields (RFs) can be used as building blocks for the creation of powerful distributed computation systems. The present report concentrates on investigating the utility of collections of RFs in representing 3D objects under changing viewing conditions. The main requirement in this task is that the pattern of activity of RFs vary as little as possible when the object and the camera move relative to each other. I propose a method for representing objects by RF activities, based on the observation that, in the case of rotation around a fixed axis, differences of activities of RFs that are properly situated with respect to that axis remain invariant. Results of computational experiments suggest that a representation scheme based on this algorithm for the choice of stable pairs of RFs would perform consistently better than a scheme involving random sets of RFs. The proposed scheme may be useful under object or camera rotation, both for ideal Lambertian objects, and for real-world objects such as human faces.
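A toy version of the stable-pairs idea can be sketched as follows (the object model, the Gaussian receptive fields, and all parameters are assumptions of this sketch): RF activities are computed for views generated by rotating a weighted point cloud about a vertical axis, pairs of RFs whose activity difference varies least over training views are selected, and the resulting pair-difference representation is compared with one based on randomly chosen pairs.

```python
import numpy as np
rng = np.random.default_rng(1)

n_points, n_rfs, sigma = 40, 30, 0.4
points = rng.uniform(-1, 1, (n_points, 3))            # object features in 3D
weights = rng.uniform(0.5, 1.5, n_points)             # feature "contrasts"
rf_centers = rng.uniform(-1.2, 1.2, (n_rfs, 2))       # Gaussian RF positions in the image plane

def view_activities(theta):
    """RF activities for the object rotated by theta about the vertical axis."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    proj = (points @ rot.T)[:, :2]                     # orthographic projection
    d2 = ((proj[:, None, :] - rf_centers[None, :, :]) ** 2).sum(-1)
    return (weights[:, None] * np.exp(-d2 / (2 * sigma ** 2))).sum(0)

train = np.stack([view_activities(t) for t in np.linspace(-0.3, 0.3, 7)])
test = np.stack([view_activities(t) for t in np.linspace(-0.35, 0.35, 15)])

pairs = [(i, j) for i in range(n_rfs) for j in range(i + 1, n_rfs)]
diff_var = np.array([np.var(train[:, i] - train[:, j]) for i, j in pairs])
stable = [pairs[k] for k in np.argsort(diff_var)[:20]]                 # most stable pairs
random_pairs = [pairs[k] for k in rng.choice(len(pairs), 20, replace=False)]

def spread(view_acts, pair_list):
    """How much the pair-difference code varies across views (lower = more invariant)."""
    reps = np.stack([[view_acts[v, i] - view_acts[v, j] for i, j in pair_list]
                     for v in range(view_acts.shape[0])])
    return reps.std(axis=0).mean()

print("stable pairs:", spread(test, stable))
print("random pairs:", spread(test, random_pairs))    # typically larger
```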
We outline an unsupervised language acquisition algorithm and offer some psycholinguistic support for a model based on it. Our approach resembles Construction Grammar in its general philosophy, and Tree Adjoining Grammar in its computational characteristics. The model is trained on a corpus of transcribed child-directed speech (CHILDES). The model’s ability to process novel inputs makes it capable of taking various standard tests of English that rely on forced-choice judgment and on magnitude estimation of linguistic acceptability. We report encouraging results from several such tests, and discuss the limitations revealed by other tests in our present method of dealing with novel stimuli.
We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in which the system learns from the corpus a set of typical usages for each of the senses of the polysemous word listed in the MRD. A new instance of a polysemous word is assigned the sense associated with the typical usage most similar to its context. Experiments show that this method performs well, and can learn even from very sparse training data.
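A small sketch of the circular word/context similarity computation follows; the corpus, the sense glosses, and the final assignment step are toy assumptions, and the intermediate step of learning typical usages per sense is skipped, with senses scored directly against their MRD gloss words instead:

```python
import numpy as np

docs = ["deposit money in the bank account",
        "the bank raised the interest rate",
        "the river bank was muddy after rain",
        "fish swim near the river bank"]
contexts = [d.split() for d in docs]
vocab = sorted({w for c in contexts for w in c})
w_idx = {w: i for i, w in enumerate(vocab)}

A = np.zeros((len(vocab), len(contexts)))             # word-by-context occurrence matrix
for j, c in enumerate(contexts):
    for w in c:
        A[w_idx[w], j] = 1.0

deg_w, deg_c = A.sum(1), A.sum(0)
S_w = np.eye(len(vocab))                              # start: each word similar only to itself
for _ in range(10):                                   # iterate the circular definition
    S_c = (A.T @ S_w @ A) / np.outer(deg_c, deg_c)    # contexts similar if they contain similar words
    S_w = (A @ S_c @ A.T) / np.outer(deg_w, deg_w)    # words similar if they occur in similar contexts
    np.fill_diagonal(S_w, 1.0)

senses = {"bank/finance": ["money", "account", "interest"],      # toy MRD glosses
          "bank/river": ["river", "water", "shore"]}

def disambiguate(context_words):
    def score(gloss):
        pairs = [(g, c) for g in gloss if g in w_idx for c in context_words if c in w_idx]
        return np.mean([S_w[w_idx[g], w_idx[c]] for g, c in pairs]) if pairs else 0.0
    return max(senses, key=lambda s: score(senses[s]))

print(disambiguate("fish in the muddy river".split()))
print(disambiguate("money in the account".split()))
```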
Although computational considerations suggest that a resource-limited memory system may have to trade off capacity for generalization ability, such a trade-off has not been demonstrated in the past. We describe a simple model of memory that exhibits this trade-off and describe its performance in a variety of tasks.
Converging evidence from anatomical studies (Maunsell, 1983) and functional analyses (Hubel & Wiesel, 1968) of the nervous system suggests that the feed-forward pathway of the mammalian perceptual system follows a largely hierarchic organization scheme. This may be because hierarchic structures are intrinsically more viable and thus more likely to evolve (Simon, 2002). But it may also be because objects in our environment have a hierarchic structure and the perceptual system has evolved to match it. We conducted a behavioral experiment to investigate the effect of the degree of hierarchy of the generative probabilistic structure on categorization. We generated one set of stimuli using a hierarchic underlying probability distribution, and another set according to a non-hierarchic one. Participants were instructed to categorize these images into one of the two possible categories. Our results suggest that participants perform more accurately in the case of hierarchically structured stimuli.
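The contrast between the two stimulus-generating distributions can be sketched as follows (the particular tree, features, and probabilities are assumptions of this sketch, not the ones used in the experiment): in the hierarchic case binary features are generated by a tree-structured process in which each feature depends on its parent, while in the non-hierarchic case the same features are sampled independently with matched marginal probabilities.

```python
import random
random.seed(0)

TREE = {"root": ["body", "limbs"],         # parent -> children; leaves are observable features
        "body": ["f1", "f2", "f3"],
        "limbs": ["f4", "f5", "f6"]}
FLIP = 0.1                                 # probability that a child disagrees with its parent

def sample_hierarchic():
    values, stimulus, stack = {"root": random.random() < 0.5}, {}, ["root"]
    while stack:
        node = stack.pop()
        for child in TREE.get(node, []):
            values[child] = values[node] ^ (random.random() < FLIP)
            if child in TREE:
                stack.append(child)
            else:
                stimulus[child] = int(values[child])
    return stimulus

def sample_flat(marginals):
    return {f: int(random.random() < p) for f, p in marginals.items()}

hier = [sample_hierarchic() for _ in range(5000)]
marginals = {f: sum(s[f] for s in hier) / len(hier) for f in hier[0]}   # match marginals
flat = [sample_flat(marginals) for _ in range(5000)]

def corr(samples, a, b):
    xs, ys = [s[a] for s in samples], [s[b] for s in samples]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    sx = (sum((x - mx) ** 2 for x in xs) / len(xs)) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / len(ys)) ** 0.5
    return cov / (sx * sy + 1e-12)

# feature correlations within vs across the two tree branches
print("hierarchic, same branch   (f1,f2):", round(corr(hier, "f1", "f2"), 2))
print("hierarchic, across branch (f1,f4):", round(corr(hier, "f1", "f4"), 2))
print("flat,       same features (f1,f2):", round(corr(flat, "f1", "f2"), 2))
```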
We describe a pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structured knowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentences as paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented by trees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected to a standard test of English as a Second Language (ESL) proficiency. The results are encouraging: the model attains a level of performance considered to be “intermediate” for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children.
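A very reduced sketch of this kind of procedure is shown below; the real algorithm (ADIOS) uses a context-sensitive statistical significance criterion to decide which sub-paths become patterns, for which a plain frequency threshold is substituted here, so this is an illustration of the representational move (paths over word vertices, pattern vertices, rewiring), not of the inference itself.

```python
from collections import Counter

def most_frequent_subpath(paths, min_len=2, max_len=4, min_count=3):
    """Return the most frequent (then longest) sub-path above the count threshold."""
    counts = Counter()
    for p in paths:
        for n in range(min_len, max_len + 1):
            for i in range(len(p) - n + 1):
                counts[tuple(p[i:i + n])] += 1
    candidates = [(c, len(s), s) for s, c in counts.items() if c >= min_count]
    return max(candidates)[2] if candidates else None

def rewire(paths, pattern):
    """Replace every occurrence of the pattern sub-path with a single new vertex."""
    label, n, out = "P(" + " ".join(pattern) + ")", len(pattern), []
    for p in paths:
        q, i = [], 0
        while i < len(p):
            if tuple(p[i:i + n]) == pattern:
                q.append(label)
                i += n
            else:
                q.append(p[i])
                i += 1
        out.append(q)
    return out

paths = [s.split() for s in ["where is the big dog ?", "where is the ball ?",
                             "is that the big dog ?", "the big dog is here",
                             "where is daddy ?"]]
for _ in range(3):                                 # a few rounds of pattern extraction
    pat = most_frequent_subpath(paths)
    if pat is None:
        break
    paths = rewire(paths, pat)
    print("new pattern vertex:", pat)
print(paths)
```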
We describe a linguistic pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of corpus data. This is achieved by compactly coding recursively structured constituent patterns, and by placing strings that have an identical backbone and similar context structure into the same equivalence class. The resulting representations constitute an efficient encoding of linguistic knowledge and support systematic generalization to unseen sentences.
To learn a visual code in an unsupervised manner, one may attempt to capture those features of the stimulus set that would contribute significantly to a statistically efficient representation (as dictated, e.g., by the Minimum Description Length principle). Paradoxically, all the candidate features in this approach need to be known before statistics over them can be computed. This paradox may be circumvented by confining the repertoire of candidate features to actual scene fragments, which resemble the “what+where” receptive fields found in the ventral visual stream in primates. We describe a single-layer network that learns such fragments from unsegmented raw images of structured objects. The learning method combines fast imprinting in the feedforward stream with lateral interactions to achieve single-epoch unsupervised acquisition of spatially localized features that can support systematic treatment of structured objects.
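A schematic sketch of single-epoch fragment acquisition in this spirit is given below; the synthetic image set, patch size, and response threshold are assumptions of the sketch, and the lateral interaction is caricatured as winner-take-all competition that blocks imprinting whenever an existing unit already responds strongly.

```python
import numpy as np
rng = np.random.default_rng(0)

PATCH, THRESH = 8, 0.5                              # assumed patch size and response threshold

def make_image():
    """Unsegmented 'scene': a few structured parts (bars and blobs) on a noisy background."""
    img = 0.1 * rng.standard_normal((32, 32))
    for _ in range(3):
        r, c = rng.integers(0, 24, 2)
        if rng.random() < 0.5:
            img[r:r + 8, c:c + 2] += 1.0            # vertical bar
        else:
            img[r:r + 3, c:c + 3] += 1.0            # blob
    return img

def normalized(patch):
    v = patch.ravel() - patch.mean()
    return v / (np.linalg.norm(v) + 1e-12)

units = []                                          # imprinted feedforward weight vectors
for _ in range(200):                                # single pass over the image stream
    img = make_image()
    r, c = rng.integers(0, 32 - PATCH, 2)           # attend to one location per step
    x = normalized(img[r:r + PATCH, c:c + PATCH])
    responses = [w @ x for w in units]
    if not units or max(responses) < THRESH:        # no existing unit wins the competition
        units.append(x)                             # fast imprinting of a new fragment unit
print(f"imprinted {len(units)} localized fragment units from 200 fixations")
```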
Unsupervised statistical learning is the standard setting for the development of the only advanced visual system that is both highly sophisticated and versatile, and extensively studied: that of monkeys and humans. In this extended abstract, we invoke philosophical observations, computational arguments, behavioral data and neurobiological findings to explain why computer vision researchers should care about (1) unsupervised learning, (2) statistical inference, and (3) the visual brain. We then outline a neuromorphic approach to structural primitive learning motivated by these considerations, survey a range of neurobiological findings and behavioral data consistent with it, and conclude by mentioning some of the more challenging directions for future research.
The statistical structure of a class of objects such as human faces can be exploited to recognize familiar faces from novel viewpoints and under variable illumination conditions. We present computational and psychophysical data concerning the extent to which class-based learning transfers or generalizes within the class of faces. We first examine the computational prerequisite for generalization across views of novel faces, namely, the similarity of different faces to each other. We next describe two computational models which exploit the similarity structure of the class of faces. The performance of these models constrains hypotheses about the nature of face representation in human vision, and supports the notion that human face processing operates in a class-based fashion. Finally, we relate the computational data to well-established findings in the human memory literature concerning the relationship between the typicality and recognizability of faces.
A computational-level analysis of the processes dealing with object and scene structure requires that we first identify the common functional characteristics of structure-related behavioral tasks. In other problems in high-level vision, effective functional characterization typically led to advances in the computational understanding, and to better modeling, of the relevant aspects of the human visual system. In the study of visual motion, for example, the realization of the central role of the correspondence problem constituted just such an advance. Likewise, object recognition tasks, such as identification or categorization, have at their core a common operation, namely, the matching of the stimulus against a stored memory trace. For the structure-processing tasks, a good candidate for the signature common characteristic is the restriction of the spatial scope of at least some of the operations involved to some fraction of the visual extent of the object or scene under consideration. In other words, a task should only qualify for the label “structural” if it calls for a separate treatment of some fragment(s) of the stimulus (and not merely of the whole). Here are a few examples of behavioral tasks that qualify as structural according to this criterion.
Computer vision systems are, on most counts, poor performers when compared to their biological counterparts. The reason for this may be that computer vision is handicapped by an unreasonable assumption regarding what it means to see, which became prevalent as the notions of intrinsic images and of representation by reconstruction took over the field in the late 1970s. Learning from biological vision may help us to overcome this handicap.
The publication in 1982 of David Marr’s Vision has delivered a singular boost and a course correction to the science of vision. Thirty years later, cognitive science is being transformed by the new ways of thinking about what it is that the brain computes, how it does that, and, most importantly, why cognition requires these computations and not others. This ongoing process still owes much of its impetus and direction to the sound methodology, engaging style, and unique voice of Marr’s Vision.
Variation set structure — partial alignment of successive utterances in child-directed speech — has been shown to correlate with progress in the acquisition of syntax by children. The present study demonstrates that arranging a certain proportion of utterances in a training corpus in variation sets facilitates word segmentation and phrase structure learning in miniature artificial languages by adults. Our findings have implications for understanding the mechanisms of L1 acquisition by children, and for the development of more efficient algorithms for automatic language acquisition, as well as better methods for L2 instruction.
What insights does comparative biology provide for furthering scientific understanding of the evolution of dynamic coordination? Our discussions covered three major themes: (a) the fundamental unity in functional aspects of neurons, neural circuits, and neural computations across the animal kingdom; (b) brain organization–behavior relationships across animal taxa; and (c) the need for broadly comparative studies of the relationship of neural structures, neural functions, and behavioral coordination. Below we present an overview of neural machinery and computations that are shared by all nervous systems across the animal kingdom, and the related fact that there really are no “simple” relationships in coordination between nervous systems and the behavior they produce. The simplest relationships seen in living organisms are already fairly complex by computational standards. These realizations led us to think about ways that brain similarities and differences could be used to produce new insights into complex brain–behavior phenomena (including a critical appraisal of the roles of cortical and noncortical structures in mammalian behavior), and to think briefly about how future studies could best exploit comparative methods to elucidate better general principles underlying the neural mechanisms associated with behavioral coordination. In our view, it is unlikely that the intricacies interrelating neural and behavioral coordination are due to one particular manifestation (such as neural oscillation or the possession of a six-layered cortex). Instead of considering the human cortex to be the standard against which all things are measured (and thus something to crow about), both broad and focused comparative studies on behavioral similarities and differences will be necessary to elucidate the fundamental principles underlying dynamic coordination.
The proponents of machine consciousness predicate the mental life of a machine, if any, exclusively on its formal, organizational structure, rather than on its physical composition. Given that matter is organized on a range of levels in time and space, this generic stance must be further constrained by a principled choice of levels on which the posited structure is supposed to reside. Indeed, not only must the formal structure fit well the physical system that realizes it, but it must do so in a manner that is determined by the system itself, simply because the mental life of a machine cannot be up to an external observer. To illustrate just how tall this order is, we carefully analyze the scenario in which a digital computer simulates a network of neurons. We show that the formal correspondence between the two systems thereby established is at best partial, and, furthermore, that it is fundamentally incapable of realizing both some of the essential properties of actual neuronal systems and some of the fundamental properties of experience. Our analysis suggests that, if machine consciousness is at all possible, conscious experience can only be instantiated in a class of machines that are entirely different from digital computers, namely, time-continuous, open, analog dynamical systems.
By what empirical means can a person determine whether he or she is presently awake or dreaming? Subjecting the experienced reality to a statistical test for bizarreness requires a set of baseline measurements. In a dream or in a simulation, those would be vulnerable to tampering by the same processes that give rise to the experienced reality, making the outcome of a reality test impossible to trust. Moreover, cryptographic defenses against tampering cannot be relied upon, because of the potentially unlimited reach of reality modification, which may range from the integrity of the verification keys to the declared outcome of the entire process. Although the rational course of action in the face of this double predicament is to take reality at face value, even the most revealing insight that a person may gain into the ultimate nature of reality (for instance, by attaining enlightenment) is ultimately unreliable, for the reasons just mentioned. However, to adhere to this principle, one has to be aware of it, which may not be possible in various states of altered cognitive function (e.g., dreaming). Thus, a subjectively enlightened person may still lack the one truly important piece of the puzzle concerning his or her existence.
Shanahan’s eloquently argued version of the global workspace theory fits well into the emerging understanding of consciousness as a computational phenomenon. His disinclination toward metaphysics notwithstanding, Shanahan’s book can also be seen as supportive of a particular metaphysical stance on consciousness — the computational identity theory.
A standing challenge for the science of mind is to account for the datum that every mind faces in the most immediate – that is, unmediated – fashion: its phenomenal experience. The complementary tasks of explaining what it means for a system to give rise to experience and what constitutes the content of experience (qualia) in computational terms are particularly challenging, given the multiple realizability of computation. In this paper, we identify a set of conditions that a computational theory must satisfy for it to constitute not just a sufficient but a necessary, and therefore naturalistic and intrinsic, explanation of qualia. We show that a common assumption behind many neurocomputational theories of the mind, according to which mind states can be formalized solely in terms of instantaneous vectors of activities of representational units such as neurons, does not meet the requisite conditions, in part because it relies on inactive units to shape presently experienced qualia and implies a homogeneous representation space, which is devoid of intrinsic structure. We then sketch a naturalistic computational theory of qualia, which posits that experience is realized by dynamical activity-space trajectories (rather than points) and that its richness is measured by the representational capacity of the trajectory space in which it unfolds.
The three commentaries of Van Orden, Spivey and Anderson, and Dietrich (with Markman’s as a backdrop) form a tableau that reminds me of a fable by Ivan Andreevich Krylov (1769-1844), in which a swan, a pike, and a crawfish undertake jointly to move a cart laden with goods. What transpires then is not unexpected: the swan strives skyward, the pike pulls toward the river, and the crawfish scrambles backward. The call for papers for the present ecumenically minded special issue of JETAI was designed to minimize this kind of discord, by charging the authors to examine the possibility of epistemological pluralism in cognitive science — a field whose very diversity makes fundamental disagreement more likely than in other sciences. No doubt, the road mapped out by the editor had been conceived with good intentions in mind, but where did it lead us? It has been said that no good intention must go unpunished. To celebrate this venerable academic tradition (and also because I have a reputation to maintain), the following remarks will therefore be mostly other than conciliatory; caveat lector.
Are minds really dynamical or are they really symbolic? Because minds are bundles of computations, and because computation is always a matter of interpretation of one system by another, minds are necessarily symbolic. Because minds, along with everything else in the universe, are physical, and insofar as the laws of physics are dynamical, minds are necessarily dynamical systems. Thus, the short answer to the opening question is “yes.” It makes sense to ask further whether some of the computations that constitute a human mind are constrained by functional, algorithmic, or implementational factors to be essentially of the discrete symbolic variety (even if they supervene on an apparently continuous dynamical substrate). I suggest that here too the answer is “yes” and discuss the need for such discrete, symbolic cognitive computations in communication-related tasks.