In recent articles Fodor and Lepore have argued that not only do considerations of learnability dictate that meaning must be compositional in the well-known sense that the meanings of all sentences are determined by the meanings of a finite number of primitive expressions and a finite number of operations on them, but also that meaning must be 'reverse compositional' as well, in the sense that the meanings of the primitive expressions of which a complex expression is composed must be determined by the meaning of that complex expression plus the manner of its composition. I argue against the requirement of reverse compositionality and against the claim that learnability requires it. I consider some objections and close the paper by arguing against the related claim that concepts are reverse compositional.
A striking cross-linguistic generalisation about the semantics of determiners is that they never express non-conservative relations. To account for this one might hypothesise that the mechanisms underlying human language acquisition are unsuited to non-conservative determiner meanings. We present experimental evidence that 4- and 5-year-olds fail to learn a novel non-conservative determiner but succeed in learning a comparable conservative determiner, consistent with the learnability hypothesis.
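The conservativity constraint discussed in this abstract has a compact set-theoretic statement: a determiner relation D is conservative iff D(A)(B) holds exactly when D(A)(A ∩ B) does, so that only the restrictor-internal part of the scope set ever matters. The sketch below is a toy illustration of that definition, not code from the study; the function names, the example relations, and the brute-force test over a small universe are my own:

```python
# A toy check of determiner conservativity: D is conservative iff
# D(A, B) == D(A, A & B) for all restrictor sets A and scope sets B.
from itertools import combinations

def every(A, B):
    return A <= B              # "every A is B": conservative

def only(A, B):
    return B <= A              # "only As are Bs": the classic non-conservative relation

def is_conservative(det, universe):
    subsets = [set(c) for r in range(len(universe) + 1)
               for c in combinations(universe, r)]
    return all(det(A, B) == det(A, A & B) for A in subsets for B in subsets)

U = {1, 2, 3}
print(is_conservative(every, U))  # True
print(is_conservative(only, U))   # False: B <= A differs from (A & B) <= A
```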
This paper investigates the learnability by positive examples, in the sense of Gold, of Pregroup Grammars. In the first part, Pregroup Grammars are presented and a new parsing strategy is proposed. Then, theoretical learnability and non-learnability results for subclasses of Pregroup Grammars are proved. In the last two parts, we focus on learning Pregroup Grammars from a special kind of input called feature-tagged examples. A learning algorithm based on the parsing strategy presented in the first part is given. Its validity is proved and its properties are exemplified.
In this paper we present learning algorithms for classes of categorial grammars restricted by negative constraints. We modify the learning functions of Kanazawa and apply them to these classes of grammars. We also prove the learnability of the intersection of the class of minimal grammars with the class of k-valued grammars.
This paper proposes solutions to two semantic learnability problems that have featured prominently in the literature on language acquisition. Both problems have often been deemed unsolvable for language learners as a matter of logic, and they have accordingly been taken to motivate principles making sure they will not actually arise in the course of language acquisition. One problem concerns the acquisition of ambiguous sentences whose readings are related by entailment. Crain et al.'s (1994) Semantic Subset Principle is intended to preempt the problem by preventing acquisition of the weaker reading before the stronger reading has been acquired. In contrast, we demonstrate that this very order of acquisition becomes feasible in principle if children can exploit non-truth-conditional evidence of various kinds or evidence from sentences containing downward entailing operators. The other learnability problem concerns the potential need for expunction of certain readings of ambiguous sentences from a child's grammar. It has often been assumed that, in the absence of negative evidence, such expunction is impossible, and Wexler and Manzini (1987) posit a Subset Principle to preempt the problematic learning scenario. We argue, however, that if the evidence available to the child includes dialogues, and if listeners are expected to interpret speakers' utterances charitably, then expunction of unavailable readings is possible in principle.
We propose that free viewing of natural images in human infants can be understood and analyzed as the product of intrinsically-motivated visual exploration. We examined this idea by first generating five sets of center-of-gaze (COG) image samples, which were derived by presenting a series of natural images to groups of both real observers (i.e., 9-month-olds and adults) and artificial observers (i.e., an image-saliency model, an image-entropy model, and a random-gaze model). In order to assess the sequential learnability of the COG samples, we paired each group of samples with a simple recurrent network, which was trained to reproduce the corresponding sequence of COG samples. We then asked whether an intrinsically-motivated artificial agent would learn to identify the most successful network. In Simulation 1, the agent was rewarded for selecting the observer group and network with the lowest prediction errors, while in Simulation 2 the agent was rewarded for selecting the observer group and network with the largest rate of improvement. Our prediction was that if visual exploration in infants is intrinsically-motivated – and more specifically, the goal of exploration is to learn to produce sequentially-predictable gaze patterns – then the agent would show a preference for the COG samples produced by the infants over the other four observer groups. The results from both simulations supported our prediction. We conclude by highlighting the implications of our approach for understanding visual development in infants, and discussing how the model can be elaborated and improved.
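The modelling pipeline here pairs each observer group's COG sequence with a simple recurrent network trained to predict the next sample. Below is a minimal sketch of that ingredient under stated assumptions: toy dimensionality, a plain Elman-style network, and a crude delta-rule update on the output weights only; none of this reproduces the authors' exact model.

```python
# A toy Elman-style recurrent network predicting the next item in a
# sequence standing in for center-of-gaze (COG) samples.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 16, 32                        # assumed toy sizes
W_xh = rng.normal(0, 0.1, (n_hid, n_in))    # input -> hidden
W_hh = rng.normal(0, 0.1, (n_hid, n_hid))   # hidden -> hidden (context)
W_hy = rng.normal(0, 0.1, (n_in, n_hid))    # hidden -> prediction
lr = 0.01

seq = rng.normal(size=(100, n_in))          # stand-in for one COG sequence
for epoch in range(20):
    h = np.zeros(n_hid)
    total_err = 0.0
    for t in range(len(seq) - 1):
        h = np.tanh(W_xh @ seq[t] + W_hh @ h)
        err = W_hy @ h - seq[t + 1]         # prediction error at t+1
        total_err += float((err ** 2).mean())
        W_hy -= lr * np.outer(err, h)       # delta rule on output weights only
    # total_err is the quantity the intrinsically-motivated agent would
    # compare across observer groups (Simulation 1) or track over time
    # (Simulation 2) in the abstract above.
```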
It is sometimes argued that if PDP networks can be trained to make correct judgements of grammaticality, we have an existence proof that there is enough information in the stimulus to permit learning grammar by inductive means alone. This seems superficially inconsistent with Gold's theorem and, at a deeper level, with the fact that networks are designed on the basis of assumptions about the domain of the function to be learned. To clarify the issue I consider what we should learn from Gold's theorem, then go on to inquire into what it means to say that knowledge is domain specific. I first try sharpening the intuitive notion of domain-specific knowledge by reviewing the alleged difference between processing limitations due to shortages of resources vs. shortages of knowledge. After rejecting different formulations of this idea, I suggest that a model is language specific if it transparently refers to entities and facts about language, as opposed to entities and facts of more general mathematical domains. This is a useful but not a necessary condition. I then suggest that a theory is domain specific if it belongs to a model family which is attuned in a law-like way to domain regularities. This leads to a comparison of PDP and parameter-setting models of language learning. I conclude with a novel version of the poverty of stimulus argument.
How do minds emerge from developing brains? According to neural constructivism, the representational features of cortex are built from the dynamic interaction between neural growth mechanisms and environmentally derived neural activity. Contrary to popular selectionist models that emphasize regressive mechanisms, the neurobiological evidence suggests that this growth is a progressive increase in the representational properties of cortex. The interaction between the environment and neural growth results in a flexible type of learning: constructive learning minimizes the need for prespecification, in accordance with recent neurobiological evidence that the developing cerebral cortex is largely free of domain-specific structure. Instead, the representational properties of cortex are built by the nature of the problem domain confronting it. This uniquely powerful and general learning strategy undermines the central assumption of classical learnability theory: that the learning properties of a system can be deduced from a fixed computational architecture. Neural constructivism suggests that the evolutionary emergence of neocortex in mammals is a progression toward more flexible representational structures, in contrast to the popular view of cortical evolution as an increase in innate, specialized circuits. Human cortical postnatal development is also more extensive and protracted than generally supposed, suggesting that cortex has evolved so as to maximize the capacity of environmental structure to shape its structure and function through constructive learning.
Although neural encoding by bats and owls presents seductive analogies, the major contribution of locus equations and orderly output constraints discussed by Sussman et al. is the demonstration that important acoustic information for speech perception can be captured by elegant and neurally-plausible learning processes.
It is proved that for any k, the class of classical categorial grammars that assign at most k types to each symbol in the alphabet is learnable, in the Gold (1967) sense of identification in the limit from positive data. The proof crucially relies on the fact that the concept known as finite elasticity in the inductive inference literature is preserved under the inverse image of a finite-valued relation. The learning algorithm presented here incorporates Buszkowski and Penn's (1990) algorithm for determining categorial grammars from input consisting of functor-argument structures.
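Identification in the limit, the success criterion used in this result, requires the learner's conjectures to converge on a correct grammar after finitely many inputs from any fair presentation of the language. A minimal sketch of the criterion on a deliberately easy class (finite languages, not the k-valued categorial grammars of the paper; the learner simply conjectures the set of strings seen so far):

```python
# Gold-style identification in the limit from positive data, for the toy
# class of finite languages: conjecture exactly the strings seen so far.

def learner(text):
    """text: an enumeration of the target language, repeats allowed."""
    seen = set()
    for sentence in text:
        seen.add(sentence)
        yield frozenset(seen)        # the learner's current conjecture

target = ["a", "ab", "abb"]
presentation = target * 3            # a fair text eventually lists every member
conjectures = list(learner(presentation))
print(conjectures[-1] == frozenset(target))  # True: the learner has converged
```

On the infinite language classes the paper studies, convergence rests on finite elasticity rather than on exhaustive enumeration; the sketch only fixes the success criterion.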
A variety of inaccurate claims about Gold's Theorem have appeared in the cognitive science literature. I begin by characterizing the logic of this theorem and its proof. I then examine several claims about Gold's Theorem, and I show why they are false. Finally, I assess the significance of Gold's Theorem for cognitive science.
We sketch the view we call contextual semantics. It asserts that truth is semantically correct affirmability under contextually variable semantic standards, that truth is frequently an indirect form of correspondence between thought/language and the world, and that many Quinean commitments are not genuine ontological commitments. We argue that contextual semantics fits very naturally with the view that the pertinent semantic standards are particularist rather than being systematizable as exceptionless general principles.
Boersma’s (1997, 1998) Gradual Learning Algorithm (GLA) performs a sequence of slight re-rankings of the constraint set, triggered by mistakes on the incoming stream of data. Data consist of underlying forms paired with the corresponding winner forms. At each iteration, the algorithm needs to complete the current data pair with a corresponding loser form. Tesar and Smolensky (Linguist Inq 29:229–268, 1998) suggest that this current loser should be set equal to the winner predicted by the current ranking. This paper develops a new argument for Tesar and Smolensky’s proposal, based on the GLA’s factorizability. The underlying typology often encodes non-interacting phonological processes, so that it factorizes into smaller typologies that encode a single process each. The GLA should be able to take advantage of this factorizability, in the sense that a run of the algorithm on the original typology should factorize into independent runs on the factor typologies. Factorizability of the GLA is guaranteed when the current loser is set equal to the current prediction, lending new support to Tesar and Smolensky’s proposal.
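One GLA iteration is a small, symmetric re-ranking triggered by a winner/loser mismatch. A minimal sketch of that update step follows; the constraint names and violation profiles are my own toy choices, and the evaluation noise of stochastic OT is omitted:

```python
# One Gradual Learning Algorithm update: promote constraints that favor the
# observed winner, demote constraints that favor the grammar's current loser.
PLASTICITY = 0.1

def gla_update(ranking, winner_viols, loser_viols):
    for c in ranking:
        w = winner_viols.get(c, 0)
        l = loser_viols.get(c, 0)
        if w > l:                    # constraint prefers the loser: demote
            ranking[c] -= PLASTICITY
        elif l > w:                  # constraint prefers the winner: promote
            ranking[c] += PLASTICITY

ranking = {"Max": 10.0, "Dep": 10.0, "*CodaObstruent": 10.0}
# The loser is the winner predicted by the current ranking, per Tesar and
# Smolensky's proposal discussed in the abstract above.
gla_update(ranking, winner_viols={"Dep": 1}, loser_viols={"*CodaObstruent": 1})
print(ranking)   # Dep demoted, *CodaObstruent promoted, Max unchanged
```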
We formalise the notion of those infinite binary sequences z that admit a single program P which expresses the entire algorithmic structure of z. Such a program P minimizes the information which must be used in a relative computation for z. We propose two concepts of different strength for this notion, the learnable and the super-learnable sequences. We establish three different equivalent characterizations of learnable (super-learnable, resp.) sequences. In particular, we prove that a sequence z is learnable (super-learnable, resp.) if and only if there is a computable probability measure p such that z is Schnorr (Martin-Löf, resp.) p-random. There is a recursively enumerable sequence which is not learnable. The learnable sequences are invariant with respect to all total and effective transformations of infinite binary sequences.
A celebrated argument for the claim that natural languages are compositional is the learnability argument. Briefly: for it to be possible to learn an entire natural language, which has infinitely many sentences, the language must have a compositional semantics. This argument has two main problems. One of them concerns the difference between compositionality and computability: if the argument is good at all, it only shows that the language must have a computable semantics, which allows speakers to compute the meanings of new sentences. But a semantics may be computable without being compositional (and vice versa). Why would we want the semantics to be compositional over and above being computable? The learnability argument doesn’t tell us.
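The gap between computability and compositionality in this argument can be made concrete with a toy example (my own construction, not the paper's): a semantics every speaker could compute, yet in which the meaning of a complex expression is not a function of the meanings of its parts.

```python
# A computable but non-compositional toy semantics: 'a' and 'b' are
# synonymous atoms, yet complexes built from them can differ in meaning,
# so whole-meaning is not a function of part-meanings -- although a
# trivial algorithm computes every meaning from the expression itself.

def meaning(expr):
    if expr in ("a", "b"):
        return 0                      # synonymous atoms
    left, right = expr                # a complex expression is a pair
    return 1 if left == "a" else 0    # depends on the FORM of the part

print(meaning(("a", "b")))  # 1
print(meaning(("b", "b")))  # 0: same part-meanings, different whole-meaning
```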
Christiansen & Chater (C&C) suggest that language is an organism, like us, and that our brains were not selected for Universal Grammar (UG) capacity; rather, languages were selected for learnability with minimal trial-and-error experience by our brains. This explanation is circular: Where did our brain's selective capacity to learn all and only UG-compliant languages come from?
The primary goal of modern linguistic theory (at least in the circles I inhabit) is an explanation of the human language capacity and how it enables the child to acquire adult competence in language. Adult competence in turn is understood as the ability (or knowledge) to creatively map between sound and meaning, using a rich combinatorial system – the lexicon and grammar of the language. An adequate theory must satisfy at least three crucial constraints, which I will call the Descriptive Constraint, the Learnability Constraint, and the Evolutionary Constraint.
The main task is to discuss how, in belief dynamics, rational introspective agents incorporate Moorean-type new information into their beliefs. First, a brief survey is conducted on Moore’s Paradox, and one of its solutions is introduced with the help of Update Semantics. Then, we present a Dynamic Doxastic Logic (DDL), put forward by Lindström & Rabinowicz, which revises the beliefs of introspective agents. Next, we attempt to incorporate Moorean-type new information within the DEL (DDL) framework, as advised by van Benthem, Segerberg et al. Though we maintain the principle of “the primacy of new information” from the literature on traditional belief revision theory, several unsuccessful ways are also presented. We then conclude that some special kind of success (weak success) can still be found in those revision processes, although absolute success does not hold. Finally, the related problem of “learnability” is reconsidered in light of weak success.
Natural language is full of patterns that appear to fit with general linguistic rules but are ungrammatical. There has been much debate over how children acquire these “linguistic restrictions,” and whether innate language knowledge is needed. Recently, it has been shown that restrictions in language can be learned asymptotically via probabilistic inference using the minimum description length (MDL) principle. Here, we extend the MDL approach to give a simple and practical methodology for estimating how much linguistic data are required to learn a particular linguistic restriction. Our method provides a new research tool, allowing arguments about natural language learnability to be made explicit and quantified for the first time. We apply this method to a range of classic puzzles in language acquisition. We find that some linguistic rules appear to be easily statistically learnable from language experience alone, whereas others appear to require additional learning mechanisms (e.g., additional cues or innate constraints).
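The MDL logic behind such estimates can be shown with toy numbers (entirely illustrative, not the paper's figures): a grammar with the restriction costs more bits to state but saves bits on every observed sentence, so it wins once enough data has accumulated, and the crossover point estimates the data requirement.

```python
# MDL crossover: the restricted grammar is preferred once its larger
# grammar cost is offset by its smaller per-sentence data cost.
G_GENERAL, G_RESTRICTED = 10.0, 18.0      # grammar code lengths in bits (toy)
PER_GENERAL, PER_RESTRICTED = 4.0, 3.5    # bits per observed sentence (toy)

def total_description_length(grammar_bits, per_item_bits, n_sentences):
    return grammar_bits + n_sentences * per_item_bits

n = 0
while (total_description_length(G_RESTRICTED, PER_RESTRICTED, n)
       >= total_description_length(G_GENERAL, PER_GENERAL, n)):
    n += 1
print(n)  # 17: sentences needed before the restriction is the shorter theory
```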
This paper contributes to the debate about ‘tenseless languages’ by defending a tensed analysis of a superficially tenseless language. The language investigated is St’át’imcets (Lillooet Salish). I argue that although St’át’imcets lacks overt tense morphology, every finite clause in the language possesses a phonologically covert tense morpheme; this tense morpheme restricts the reference time to being non-future. Future interpretations, as well as ‘past future’ would-readings, are obtained by the combination of covert tense with an operator analogous to Abusch’s (1985) WOLL. I offer St’át’imcets-internal evidence (of a kind not previously adduced) that the WOLL-like operator is modal in nature. It follows from the analysis presented here that there are only two (probably related) differences between St’át’imcets and English in the area of tense. The first is that St’át’imcets lacks tense morphemes which are pronounced. The second is that the St’át’imcets tense morpheme is semantically underspecified compared to English ones. In each of these respects, the St’át’imcets tense morpheme displays similar properties to pronouns, which may be covert and which may fail to distinguish person, number or gender. Along the way, I point out several striking and subtle similarities in the interpretive possibilities of St’át’imcets and English. I suggest that these similarities may reveal non-accidental properties of tense systems in natural language. I conclude with discussion of the implications of the analysis for cross-linguistic variation, learnability and the possible existence of tenseless languages.
It is argued that Donald Davidson has not succeeded in showing that we need a constructive theory of meaning--a theory for a natural language which Davidson considers to have as its base a finite number of semantic primitives--in order to explain language learning and, in particular, linguistic productivity. This linguistic productivity is the ability of a speaker who has mastered the meaning of a finite stock of words and a finite number of grammatical rules to produce and understand sentences which he has not constructed or encountered before.
The acquisition of the passive in English poses a learnability problem. Most transitive verbs have passive forms (e.g., kick/was kicked by), tempting the child to form a productive rule of passivization deriving passive participles from active forms. However, some verbs cannot be passivized (e.g., cost/*was cost by). Given that children do not receive negative evidence telling them which strings are ungrammatical, what prevents them from overgeneralizing a productive passive rule to the exceptional verbs (or, if they do incorrectly passivize such verbs, how do they recover)? One possible solution is that children are conservative: they only generate passives for those verbs that they have heard in passive sentences in the input. We show that this proposal is incorrect.
Computational learning theory explores the limits of learnability. Studying language acquisition from this perspective involves identifying classes of languages that are learnable from the available data, within the limits of time and computational resources available to the learner. Different models of learning can yield radically different learnability results, where these depend on the assumptions of the model about the nature of the learning process, and the data, time, and resources that learners have access to. To the extent that such assumptions accurately reflect human language learning, a model that invokes them can offer important insights into the formal properties of natural languages, and the way in which their representations might be efficiently acquired. In this chapter we consider several computational learning models that have been applied to the language learning task. Some of these have yielded results that suggest that the class of natural languages cannot be efficiently learned from the primary linguistic data (PLD) available to children, through...
Learning theory has frequently been applied to language acquisition, but discussion has largely focused on information-theoretic problems—in particular on the absence of direct negative evidence. Such arguments typically neglect the probabilistic nature of cognition and learning in general. We argue first that these arguments, and analyses based on them, suffer from a major flaw: they systematically conflate the hypothesis class and the learnable concept class. As a result, they do not allow one to draw significant conclusions about the learner. Second, we claim that the real problem for language learning is the computational complexity of constructing a hypothesis from input data. Studying this problem allows for a more direct approach to the object of study—the language acquisition device—rather than the learnable class of languages, which is epiphenomenal and possibly hard to characterize. The learnability results informed by complexity studies are much more insightful. They strongly suggest that target grammars need to be objective, in the sense that the primitive elements of these grammars are based on objectively definable properties of the language itself. These considerations support the view that language acquisition proceeds primarily through data-driven learning of some form.
Jablonka & Lamb's (J&L's) extended evolutionary theory is more amenable to being applied to human cultural change than standard neo-Darwinian evolutionary theory. However, the authors are too quick to dismiss past evolutionary approaches to human culture. They also overlook a potential parallel between evolved genetic mechanisms that enhance evolvability and learned cognitive mechanisms that enhance learnability.
The natural communication system of chimpanzees has some unique characteristics rooted in two possible ways of producing call variants in primates. The chimpanzee call repertoire contains variants available to all group members. The transfer presupposes voluntary control and learnability. Chimpanzee vocalization (or its homologue in the common ancestor of chimpanzee and man) seems to represent a real precursor of human language.
Selection through iterated learning explains no more than other non-functional accounts, such as Universal Grammar (UG), why language is so well designed for communicative efficiency. It does not predict several distinctive features of language, such as central embedding, large lexicons, or the lack of iconicity, which seem to serve communication purposes at the expense of learnability.
Children learn their native language by exposure to their linguistic and communicative environment, but apparently without requiring that their mistakes be corrected. Such learning from “positive evidence” has been viewed as raising “logical” problems for language acquisition. In particular, without correction, how is the child to recover from conjecturing an over-general grammar, which will be consistent with any sentence that the child hears? There have been many proposals concerning how this “logical problem” can be dissolved. In this study, we review recent formal results showing that the learner has sufficient data to learn successfully from positive evidence, if it favors the simplest encoding of the linguistic input. Results include the learnability of linguistic prediction, grammaticality judgments, language production, and form-meaning mappings. The simplicity approach can also be “scaled down” to analyze the learnability of specific linguistic constructions, and it is amenable to empirical testing as a framework for describing human language acquisition.
Limiting identification of r.e. indexes for r.e. languages (from a presentation of elements of the language) and limiting identification of programs for computable functions (from a graph of the function) have served as models for investigating the boundaries of learnability. Recently, a new approach to the study of "intrinsic" complexity of identification in the limit has been proposed. This approach, instead of dealing with the resource requirements of the learning algorithm, uses the notion of reducibility from recursion theory to compare and to capture the intuitive difficulty of learning various classes of concepts. Freivalds, Kinber, and Smith have studied this approach for function identification and Jain and Sharma have studied it for language identification. The present paper explores the structure of these reducibilities in the context of language identification. It is shown that there is an infinite hierarchy of language classes that represent learning problems of increasing difficulty. It is also shown that the language classes in this hierarchy are incomparable, under the reductions introduced, to the collection of pattern languages. Richness of the structure of intrinsic complexity is demonstrated by proving that any finite, acyclic, directed graph can be embedded in the reducibility structure. However, it is also established that this structure is not dense. The question of embedding any infinite, acyclic, directed graph is open.
The learnability of features and their dependence on task and context do not rule out the possibility that primitives used for constructing new features are as small as pixels, nor that they are as large as object parts, or even entire objects. In fact, the simplest approach to feature acquisition may be to treat objects not as if they are composed of unknown primitives according to unknown rules, but rather as if they are what they seem: patterns of atomic features, standing in various similarity relationships to other objects, which serve as holistic features.
Recent studies of Italian past definite and past participle forms show that human performance with regular and irregular inflections is not dissociated as Clahsen's model would predict. Some performance profiles, accounted for by dual-mechanism models in terms of an underlying symbol-manipulating combinatorial procedure, are generated in Italian by the higher learnability and generalizability of phonologically regular morphological processes.
The aim of this paper is to introduce Robert Brandom’s Inferentialism (the inferential theory of meaning) and Fodor and Lepore’s compositionality objection, and to defend Inferentialism against that objection. According to Inferentialism, to grasp or understand a concept is to have practical mastery over the inferences in which it is involved. However, Fodor and Lepore oppose Inferentialism with the compositionality objection. They argue that since compositionality is needed to explain the productivity, systematicity, and learnability of language, meaning is compositional; since inferential role is not compositional, however, meaning is not an inferential role. Against Fodor and Lepore’s objection, I present Brandom’s responses and develop my own views.