Emergent constraints on word-learning: a computational perspective

https://doi.org/10.1016/S1364-6613(03)00108-6

Abstract

In learning the meanings of words, children are guided by a set of constraints that give privilege to some potential meanings over others. These word-learning constraints are sometimes viewed as part of a specifically linguistic endowment. However, several recent computational models suggest concretely how word-learning – constraints included – might emerge from more general aspects of cognition, such as associative learning, attention and rational inference. This article reviews these models, highlighting the link between general cognitive forces and the word-learning they subserve. Ultimately, these cognitive forces might leave their mark not just on language learning, but also on language itself: in constraining the space of possible meanings, they place limits on cross-linguistic semantic variation.

Section snippets

Explaining word-learning through general learning processes

Many models of word-learning are grounded in general learning processes, rather than language-specific ones. Either implicitly or explicitly, these models suggest that although word-learning constraints are linguistic in nature, the learning mechanisms they spring from might not be. Instead, these linguistic constraints might emerge from general learning processes as they operate on linguistic experience.
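As a rough illustration of how a constraint might emerge from a general mechanism, the toy sketch below pairs words with objects by simple co-occurrence counting and lets words compete for each object. It is not an implementation of any model reviewed here: the class name `AssociativeLearner`, the `smoothing` parameter, the competition rule, and the example words and objects are all assumptions made for illustration. Under these assumptions, a novel word is mapped to a novel object rather than to an object that already has a name, a behaviour that resembles mutual exclusivity even though no such constraint is built in.

```python
class AssociativeLearner:
    """Toy word-object associator with competition among words.

    Hypothetical illustration only; not a model from the reviewed literature.
    """

    def __init__(self, smoothing=0.01):
        self.smoothing = smoothing  # small baseline strength (assumed value)
        self.assoc = {}             # (word, obj) -> accumulated co-occurrence
        self.words = set()

    def observe(self, word, obj):
        """Strengthen the association between a word and a co-present object."""
        self.words.add(word)
        self.assoc[(word, obj)] = self.assoc.get((word, obj), 0.0) + 1.0

    def claim(self, word, obj):
        """Share of obj's total association strength that belongs to word."""
        vocab = self.words | {word}
        own = self.assoc.get((word, obj), 0.0) + self.smoothing
        total = sum(self.assoc.get((w, obj), 0.0) + self.smoothing for w in vocab)
        return own / total

    def pick_referent(self, word, candidates):
        """Choose the candidate object on which the word has the largest claim."""
        return max(candidates, key=lambda obj: self.claim(word, obj))


learner = AssociativeLearner()
for _ in range(10):
    learner.observe("ball", "BALL")
    learner.observe("cup", "CUP")

# "dax" has never been heard; WHISK has never been named.
novel_choice = learner.pick_referent("dax", ["BALL", "WHISK"])
familiar_choice = learner.pick_referent("ball", ["BALL", "WHISK"])
print(novel_choice, familiar_choice)
```

In this sketch the preference for the unnamed object falls out of the normalization in `claim`: an object that already has a strongly associated name leaves little share for a new word. The learning rule itself is fully general and says nothing about words as such.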

Accelerating representations

In some word-learning models, early learning produces expectations that enable faster subsequent learning – which further strengthens the expectations, leading to yet faster learning. We may think of these expectations as ‘accelerating representations’: they permit a slow entry into word-learning to give way to accelerated learning as the expectations gradually become more accurate (cf. ‘autonomous bootstrapping’ [29]). This concept could help to explain the vocabulary spurt – a sometimes

Grounding meaning in the world and in words

Several computational models ground words in perceptual representations of objects and events in the world [11, 17, 18, 43, 44, 45]. Most others ground word meaning in more abstract featural representations, but still on the assumption that there is some concrete element of experience to which the word is being linked.

However, much word-learning does not occur in this fashion – people eventually learn words for things that are not grounded in their personal experience at all (e.g. ‘prehistoric’).

Constraints and semantic universals

Word-learning constraints are a possible source of cross-linguistic semantic universals. For if words are learned in the same constrained manner across languages, the meanings of words in different languages should bear some mark of the constraints that produced them.

Regier's connectionist model of spatial term learning [11] illustrates this idea. The model learns to categorize spatial events according to the spatial system of a given target language. Because languages differ in their spatial

Objections and limitations

Many of the models discussed in this review assume an associative basis of some sort for word-learning. This basic assumption is one that has encountered two broad sorts of objection.

The first objection is that word-learning is too fast to be a reflection of an associative or statistical process [8, 30, 31]. As we have seen, children can eventually learn a new word given only a very few exposures. But is this really a problem? It is true that one often thinks of associative learning as requiring

Conclusions

Word-learning is generally thought to be an underdetermined inductive problem, such that children require a set of constraints to tackle it. This view has been bolstered by the considerable empirical evidence for such constraints. However, several recent computational models of word-learning suggest that these constraints need not spring from language-specific forces. General-purpose learning mechanisms, accelerating representations, and perceptual and textual forces might all combine to

Acknowledgements

This work was supported by NIH grant DC03384. I thank Susanne Gahl and the anonymous reviewers for helpful comments on an earlier draft of this paper.

References (51)

  • C.B. Mervis et al. Acquisition of the novel name-nameless category (N3C) principle. Child Dev. (1994)
  • N. Chomsky. Knowledge of Language: Its Nature, Origin and Use (1986)
  • L. Markson et al. Evidence against a dedicated system for word-learning in children. Nature (1997)
  • P. Bloom. How Children Learn the Meanings of Words (2000)
  • W. Merriman. Competition, attention, and young children's lexical processing
  • T. Regier. The Human Semantic Potential: Spatial Language and Constrained Connectionism (1996)
  • B. MacWhinney. Competition and lexical categorization
  • T. Regier. The emergence of words
  • G. Cottrell et al. Acquiring the mapping from meaning to sounds. Connect. Sci. (1994)
  • K. Plunkett. Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net. Connect. Sci. (1992)
  • V. Nenov et al. Perceptually grounded language learning: II. Dete: a neural/procedural model. Connect. Sci. (1994)
  • D. Plaut. Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychol. Rev. (1996)
  • P. Li et al. Cryptotype, overgeneralization and competition: a connectionist model of the learning of English reversive prefixes. Connect. Sci. (1996)
  • T. Landauer et al. A solution to Plato's problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. (1997)