Review
Models of word production

https://doi.org/10.1016/S1364-6613(99)01319-4

Abstract

Research on spoken word production has been approached from two angles. In one research tradition, the analysis of spontaneous or induced speech errors led to models that can account for speech error distributions. In another tradition, the measurement of picture naming latencies led to chronometric models accounting for distributions of reaction times in word production. Both kinds of models are, however, dealing with the same underlying processes: (1) the speaker’s selection of a word that is semantically and syntactically appropriate; (2) the retrieval of the word’s phonological properties; (3) the rapid syllabification of the word in context; and (4) the preparation of the corresponding articulatory gestures. Models of both traditions explain these processes in terms of activation spreading through a localist, symbolic network. By and large, they share the main levels of representation: conceptual/semantic, syntactic, phonological and phonetic. They differ in various details, such as the amount of cascading and feedback in the network. These research traditions have begun to merge in recent years, leading to highly constructive experimentation. Currently, they are like two similar knives honing each other. A single pair of scissors is in the making.

Section snippets

Two kinds of model

All current models of word production are network models of some kind. In addition, they are, with one exception [5], all ‘localist’, non-distributed models. That means that their nodes represent whole linguistic units, such as semantic features, syllables or phonological segments. Hence, they are all ‘symbolic’ models. Of the many models with ancestry in the speech error tradition [6, 7, 8] only a few have been computer-implemented [9, 10, 11]. Among them, Dell’s two-step interactive activation model [9]
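To make the idea of activation spreading through a localist network concrete, here is a minimal sketch in Python. It is not Dell's model or WEAVER: the toy lexicon, the node names, the update rule and all parameter values (rate, decay, feedback) are assumptions chosen only for illustration. Each node stands for one whole unit (a semantic feature, a word or a segment), and a single feedback parameter switches between a strictly feed-forward network and an interactive one.

```python
from collections import defaultdict

# Hypothetical toy lexicon. Every node stands for one whole unit (localist).
# Links run top-down: semantic feature -> word -> phonological segment.
LINKS = {
    "FELINE": ["cat"],
    "PET": ["cat", "dog"],
    "CANINE": ["dog"],
    "cat": ["/k/", "/ae/", "/t/"],
    "dog": ["/d/", "/o/", "/g/"],
    "mat": ["/m/", "/ae/", "/t/"],
}


def spread(sources, steps=3, rate=0.5, decay=0.4, feedback=0.0):
    """Spread activation for a fixed number of discrete time steps.

    sources  -- nodes that start with an activation of 1.0
    rate     -- fraction of a node's activation sent down each link
    decay    -- fraction of its own activation a node loses per step
    feedback -- fraction sent back up each link (0.0 = feed-forward only)
    """
    activation = defaultdict(float)
    for node in sources:
        activation[node] = 1.0

    for _ in range(steps):
        incoming = defaultdict(float)
        for parent, children in LINKS.items():
            for child in children:
                incoming[child] += rate * activation[parent]      # forward
                incoming[parent] += feedback * activation[child]  # backward
        nodes = set(activation) | set(incoming)
        activation = defaultdict(
            float,
            {n: (1 - decay) * activation[n] + incoming[n] for n in nodes},
        )
    return activation


if __name__ == "__main__":
    for fb in (0.0, 0.2):
        act = spread(["FELINE", "PET"], feedback=fb)
        print(f"feedback={fb}:", {w: round(act[w], 3) for w in ("cat", "dog", "mat")})
```

In this toy network the semantically related word dog is always co-activated with cat. The word mat, which is related to cat only in form, stays at zero in the feed-forward run and picks up activation only when feedback from the shared segments is allowed. That contrast is one way to picture what a difference in the amount of feedback means for such a network.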

Conceptual preparation

The first step in accessing content words such as cat or select is the activation of a lexical concept, a concept for which you have a word or morpheme in your lexicon. Usually, such a concept is part of a larger message, but even in the simple case of naming a single object it is not trivial which lexical concept you should activate to refer to that object. It will depend on the discourse context whether it will be more effective for you to refer to a cat as cat, animal, siamese or anything

Lexical selection

In the chronometric tradition, lexical selection has been studied with interference paradigms, in particular picture-word interference (see Box 1). The recurring finding has been that naming an object is slowed down when a distracter word is presented with the picture. The effect is stronger when the distracter word is semantically related to the target than when it is unrelated, and it is at its maximum when picture and distracter word are presented simultaneously [27]. The WEAVER model

Morpho-phonological encoding

When you are planning the sentence ‘they are selecting me’, you must retrieve from your lexicon the morpho-phonological codes for each of the selected words, among them the two morpheme-size codes select and ing (see Fig. 3), and compute their syllabification and accent structure in context (se-léc-ting). This naturally divides the process into ‘code retrieval’ and ‘prosodification’.
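The step from retrieved codes to syllabification in context can be sketched in a few lines of Python. The sketch assumes a simple maximal-onset rule, a rough segmental transcription of select and ing, and a deliberately tiny inventory of legal onsets; all three are illustrative assumptions, not the procedure of any particular model. Accent assignment is left out.

```python
# Hypothetical segment inventory for this one example.
VOWELS = {"ə", "ɛ", "ɪ"}

# Hypothetical, deliberately tiny set of legal syllable onsets.
# The empty tuple allows a syllable to start directly with its vowel.
LEGAL_ONSETS = {(), ("s",), ("l",), ("t",), ("k",), ("s", "l"), ("k", "l")}


def syllabify(segments):
    """Group segments into syllables, giving each vowel the longest legal onset."""
    nuclei = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not nuclei:
        return ["".join(segments)]

    boundaries = [0]  # index at which each syllable starts
    for prev, nxt in zip(nuclei, nuclei[1:]):
        start = nxt
        # Try the longest candidate onset first, then shorter ones.
        for cand in range(prev + 1, nxt + 1):
            if tuple(segments[cand:nxt]) in LEGAL_ONSETS:
                start = cand
                break
        boundaries.append(start)
    boundaries.append(len(segments))

    return ["".join(segments[a:b]) for a, b in zip(boundaries, boundaries[1:])]


# Code retrieval: one stored code per morpheme (rough transcriptions).
SELECT = ["s", "ə", "l", "ɛ", "k", "t"]
ING = ["ɪ", "ŋ"]

if __name__ == "__main__":
    print(syllabify(SELECT))        # the stem on its own
    print(syllabify(SELECT + ING))  # the stem plus suffix, syllabified in context
```

Under these assumptions the stem on its own comes out as two syllables with t in the final coda, whereas the combined form comes out as three syllables with t as the onset of the last one, matching se-léc-ting. The syllable boundaries are therefore computed over the combined word form rather than stored with each morpheme.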

Phonetic encoding and articulation

As incremental prosodification proceeds, the resulting syllabic and larger prosodic structures should acquire phonetic shape. As a speaker you will incrementally prepare articulatory gestures for the syllables in their prosodic context. A core feature of the WEAVER model is the notion of a syllabary [51]. Statistics show that native speakers of English or Dutch do 80 percent of their talking with no more than about 500 different syllables [18] (although these languages have many more than 10 000

Conclusion

There is still a long way to go before the two research traditions emerging from speech error analysis and from naming chronometry are fully reconciled. But there has been lively and highly constructive interaction, leading to a much improved understanding of the processes involved in lexical selection and phonological encoding. One unifying force has been computational modeling. Current implemented models share their major strata, they are localist and symbolic; they compute quite similar

Outstanding questions

  • How should error-based and chronometric models be further reconciled computationally and empirically?

  • What causes a speech error? Is it caused by occasional cascading or occasional feedback in a normally non-cascading, feed-forward system? Is it the product of noise in a normally cascading interactive system? Or is the origin of speech error something else entirely?

  • How does the word-production network relate to the word-perception network? How is self-monitoring realized in this combined system?

Acknowledgements

I gratefully acknowledge helpful commentary by Antje Meyer and by Gary Dell.

References (57)

  • E. Marx, Gender processing in speech production: evidence from German, J. Psycholinguist. Res. ((in...
  • J.A. Goldsmith

    Autosegmental and Metrical Phonology

    (1990)
  • A. Roelofs et al.

    Metrical structure in planning the production of spoken words

    J. Exp. Psychol. Learn. Mem. Cognit.

    (1997)
  • Indefrey, P. & Levelt, W.J.M. in The Cognitive Neurosciences (2nd edn) (Gazzaniga, M., ed.), MIT Press (in...
  • R.D. Kent et al.

    Models of speech production

  • C.J. Price

    The functional anatomy of word comprehension and production

    Trends Cognit. Sci.

    (1998)
  • G.A. Miller

    The Science of Words

    (1991)
  • A. Garnham

    Slips of the tongue in the London–Lund corpus of spontaneous conversation

    Linguistics

    (1981)
  • W.J.M. Levelt

    Speaking: From Intention to Articulation

    (1989)
  • W.J.M. Levelt

    Language production: a blueprint of the speaker

  • D.G. Mackay

    The Organization of Perception and Action: A Theory for Language and other Cognitive Skills

    (1987)
  • G.S. Dell

    A spreading-activation theory of retrieval in sentence production

    Psychol. Rev.

    (1986)
  • T.A. Harley

    Phonological activation of semantic competitors during lexical access in speech production

    Lang. Cognit. Process.

    (1993)
  • U. Schade et al.

    The role of inhibition in a spreading-activation model of language production: II. Simulational perspective

    J. Psycholinguist. Res.

    (1992)
  • G.S. Dell

    Lexical access in aphasic and non-aphasic speech

    Psychol. Rev.

    (1997)
  • M.F. Damian et al.

    Semantic and phonological codes interact in single word production

    J. Exp. Psychol. Learn. Mem. Cognit.

    (1999)
  • G.W. Humphreys et al.

    An interactive activation approach to object processing: effects of structural similarity, name frequency and task in normality and pathology

    Memory

    (1995)
  • M.O. Glaser et al.

    Time course analysis of the Stroop phenomenon

    J. Exp. Psychol. Hum. Percept. Perform.

    (1982)