Elsevier

Cognition

Volume 122, Issue 2, February 2012, Pages 135-149

The relation between event apprehension and utterance formulation in children: Evidence from linguistic omissions

https://doi.org/10.1016/j.cognition.2011.10.002

Abstract

The relation between event apprehension and utterance formulation was examined in children and adults. English-speaking adults and 4-year-olds viewed motion events while their eye movements were monitored. Half of the participants in each age group described each event (Linguistic task), whereas the other half studied the events for an upcoming memory test (Nonlinguistic task). All participants then completed a memory test in which they identified changes to manners of motion and path endpoints in target events. In the Nonlinguistic task, eye movements and memory responses revealed striking similarities across age groups. Adults and preschoolers attended to manner and path endpoints with similar timing, and in the memory test both successfully detected manner and path changes at similar rates. Substantial differences in production emerged between age groups in the Linguistic task: whereas adults usually mentioned both manners and paths in their event descriptions, preschoolers tended to omit one event component or the other. However, eyegaze patterns remained equivalent across the two age groups, with both children and adults allocating more attention to event components that they planned to talk about. Children in the Linguistic task were at chance in the memory test, whereas adults actually showed a memory benefit as compared to the Nonlinguistic task. We conclude that developmental differences in the description of motion events are not due to pure attentional differences between adults and children, but leave open the possibility that they stem from limitations that are solely linguistic in nature or that arise at the interface of attention and language production.

Highlights

► Motion event apprehension and description in English-speaking children and adults.
► Children were less likely to mention certain event components compared to adults.
► Eyegaze patterns and memory for events were similar across age groups.
► Developmental differences in motion event description are not due to attention.

Introduction

The ultimate goal in studies of language production is to gain an understanding of the processes that carry an idea from message formation to linguistic formulation. This area of research is complicated by the fact that the object most readily available for study is the output of language production, i.e., utterances, but the representations and processes that contribute to this output are more difficult to observe. Recent studies have demonstrated that in adults, on-line tracking of attention allocation provides a powerful window into the process of utterance formulation. Studying the time course from apprehension of some visual stimulus to grammatical encoding of some components of that stimulus has allowed researchers to infer the content of underlying representations as they are built and to observe the way that adults transform those conceptual representations into linguistic representations (e.g., Bock et al., 2003, Gleitman et al., 2007, Griffin and Bock, 2000, Papafragou et al., 2008, Trueswell and Papafragou, 2010). However, the way that language production relates to attention allocation and event representation in young children has not yet been studied systematically. In this paper, we describe a novel experimental method that sheds new light on this process, offering an opportunity to better understand how children accomplish the mapping between thinking and speaking.

When adults view a static or dynamic depiction of an event while planning to talk about what they see, they very quickly direct their attention to components of the scene that they plan to talk about, usually in the order that they plan to speak about them (e.g., Bock et al., 2003, Gleitman et al., 2007, Griffin and Bock, 2000). Griffin and Bock (2000) tracked eyegaze as adults viewed static line drawings of simple actions that could be described with transitive sentences (e.g., a mouse spraying a turtle with a water pistol). Analysis of eye movements in relation to language production revealed the influence of linguistic formulation on allocation of attention to the participants in the depicted events. Regardless of whether speakers produced active or passive descriptions of the images, they turned their attention to the event participant that they would encode as the subject of their sentence up to a full second before they began their utterance, followed by looks to the event participant they planned to encode as the object. For example, speakers who were preparing an active event description like “The mouse is spraying the turtle” directed their attention to the mouse approximately one second before they began to speak, followed by looks to the turtle. Gleitman and colleagues (2007) replicated this tight coupling of attention allocation and word order, demonstrating moreover that even the earliest looks to event participants can be used to predict which of them will be mentioned first in a description of the event.

Bock and colleagues (2003) have argued that patterns of attention allocation during language planning are informative not only about the process of linguistic formulation, but also about the way that speakers form a representation of (conceptualize) the relationships between components of a visual stimulus during apprehension. In their study, speakers of English and Dutch who were asked to report times presented on analog and digital clocks using either absolute (e.g., “four thirty”) or relative (e.g., “half-past four”) linguistic formulations directed their attention to the region of the clock that would be relevant to the first part of their report (hours for absolute reports and minutes for relative reports) within a few hundred milliseconds after onset of the visual stimulus. Gleitman and colleagues (2007) showed, moreover, that the starting point of event apprehension can be manipulated by nonlinguistic attention capture mechanisms, which give rise, in turn, to differences in the order in which event components are mentioned in linguistic output.

Taken together, these studies demonstrate clear relations between patterns of attention allocation and language planning in adults. Other studies have shown that the attention that adults allocate to different components of a scene in preparation for speaking varies by language background (Papafragou et al., 2008). This suggests that crosslinguistic differences in the encoding of event components run at least as deep as linguistic formulation. However, adults do not exhibit these language-specific patterns of scene inspection while performing nonlinguistic tasks (e.g., when asked to inspect a scene in preparation for a memory task; Papafragou et al., 2008) or tasks in which access to language has been blocked during event viewing (Trueswell & Papafragou, 2010).

Very little is known about how children allocate their attention as they view scenes, either while they are planning a description of a scene or while they are simply inspecting the scene. Does speech planning direct attentional resources in children in the same ways that it does in adults? The development of remote eye-tracking systems has made it easier to capture the relations between the way children apprehend the world and the way they formulate utterances to describe their experience of that world. In this study, we take advantage of these technological advances to look more closely at the relation between language production and the ongoing inspection of event scenes in preschool-aged children. Specifically, we examine children’s attention allocation (via eyetracking) while they describe short animated events. As we argue below, a fuller picture of what children understand about events as well as the attentional mechanisms that support language planning can be gained by examining children’s visual inspection of events when they are, and are not, engaged in the task of describing them. We focus on children’s apprehension and description of simple motion events, which provide a particularly suitable domain in which to probe for developmental differences.

Following Talmy (1985), we define a motion event as one in which a particular Figure experiences a change in location with respect to some Ground object. The details of a motion event may be filled in by specifying the Manner in which the Figure moves (e.g., roll, bounce, slide) and the trajectory, or Path, that the Figure takes in relation to the Ground object (e.g., exit, circle, down, up). In the case of a bounded motion event, the Ground object may define either the Origin (often referred to as the Source) or the Endpoint (often referred to as the Goal) of motion. In addition, motion events may involve some Instrument (e.g., a car) and/or Cause (e.g., a catapult) that determines either the manner or path of motion. When describing a motion event, speakers make choices about which of these event components they want to include in their utterance and how they want to package those components in the linguistic representation. There are well-known typological trends in the encoding of motion event components (e.g., Slobin, 1996, Talmy, 1985). Adult speakers of so-called satellite-framed languages like English and Chinese tend to conflate motion with manner in the main verb of a sentence, and to either express path in a non-verb position or to omit it altogether. This pattern is demonstrated in sentence (1) below, in which the verb “flew” describes the bird’s manner of motion and the optional post-verbal prepositional phrase “to its nest” describes its path of motion. In contrast, speakers of verb-framed languages like Spanish, Greek, and Turkish are more likely to conflate motion with path in the main verb and to encode manner in a non-verb position (if at all). 
In sentence (2), the verb “entered” describes the path of the bird’s motion and the post-verbal modifier “flying” describes its manner of motion.

These language-specific patterns begin to emerge in early childhood, according to some reports as early as 3 years of age (Allen et al., 2007; Naigles, Eisenberg, Kako, Highter, & McGraw, 1998; Özyürek et al., 2008; Papafragou et al., 2006, Papafragou and Selimis, 2010, Slobin, 1996, Zheng and Goldin-Meadow, 2002). There are, however, substantial developmental differences in the description of motion events. In particular, children’s descriptions tend to be less informative than those provided by adults. Developmental studies of motion event description find differences between adults and children in the frequency with which event components are encoded in linguistic representations (Allen et al., 2007, Papafragou and Selimis, 2010, Papafragou et al., 2006, Özyürek et al., 2008). Papafragou and Selimis (2010) report, for example, that 4- and 5-year-old English-speaking children are far less likely than adults to include both manner and path in their descriptions of dynamic motion events. In their study, adults mentioned both the manner and the path of motion events in 72% of their event descriptions, whereas children mentioned both event components in only 29% of their descriptions. Özyürek and colleagues (2008) report a similar pattern of omissions: although English-speaking 3- and 5-year-olds and adults showed a similar syntactic distribution of manner and path elements in their motion event descriptions, adults mentioned both event components in 84% of their event descriptions compared to 58% for 5-year-olds and 49% for 3-year-olds. Papafragou and colleagues (2006) found that this trend toward omission extends to children as old as 8 years of age: in their study English-speaking 8-year-olds encoded both manner and path in only 50% of motion event descriptions as compared to around 80% for adults.

The developmental literature suggests that linguistic gaps in children’s utterances about motion are unlikely to arise as a result of developmental gaps in conceptual sophistication (cf. Gentner, 1982, Huttenlocher et al., 1983, Johnston and Slobin, 1978, Piaget, 1955). On the contrary, children from a range of cultural and linguistic backgrounds both perceive and describe motion event components from an early age (cf. Lakusta et al., 2007, Mandler, 2004, Pruden et al., 2008). Children appear to be able to discriminate both manners and paths of motion in nonlinguistic tasks by 14 months of age (Pulverman and Golinkoff, 2004, Pulverman et al., 2008, Pulverman et al., 2006, Pulverman et al., 2003). Pulverman et al. (2008) report, for example, that after being habituated to an event in which an animated character moved with a particular manner along a particular path, 14- to 17-month-old children dishabituated to changes in either the manner or the path of motion. This finding held across language backgrounds: dishabituation to each event component was equivalent in children learning English (a satellite-framed language) and Spanish (a verb-framed language). Young children also seem to be able to form categories based on motion event components across varied motion events: paths of motion by 10 months and manners of motion by 13 months of age (Pruden et al., 2004, Pruden et al., 2008). In addition, expressions that specify manners and paths of motion events appear very early in children’s vocabularies and, in the case of deaf home signers, in their gestural repertoire (e.g., Fenson et al., 1994, Naigles et al., 1998, Zheng and Goldin-Meadow, 2002).

Given that children appear to have a clear conceptual grasp of motion event components from a very early age, the question remains why they do not mention these components as often as adults do when describing motion events. One possibility is that children, unlike adults, lack the attentional capacity to simultaneously encode the complete set of components that make up a complex event—i.e., limitations in the event apprehension process, rather than in the linguistic formulation process, are resulting in this developmental difference in what children and adults talk about. There is independent evidence that the cognitive mechanisms that regulate attentional control develop over time, and that young children are more limited than adults in, for example, their ability to divide their attention between multiple locations in a visual array (e.g., Johnson, 1995, Scerif et al., 2005). If a general attentional deficit underlies children’s omissions, then it may be the case that children do not mention a particular event component simply because they have not paid attention to it. That is, perhaps children sometimes attend to manner and sometimes to path, based on the individual properties of a given event, and whichever event component they have attended to is the one they end up mentioning. Adults, in contrast, can adequately divide/allocate their attention among all relevant event components, and so are more likely to mention more of those components.

Alternatively, it may be the case that children’s omission of manner and path information is driven by constraints introduced by the task of language production itself. That is, children may attend to motion events and conceptually encode them with the same sophistication as adults, but linguistic output limitations—related to cognitive or linguistic resources or specific task demands—make it difficult for them to actually talk about everything they have encoded, resulting in the dropping of information from their event descriptions. These limitations could arise for a number of reasons, including developmental differences in lexical and/or syntactic accessibility associated with the typical ways of describing components, or even the working memory demands associated with planning certain structures.

In the current study, we adjudicate between these two general explanations for child omissions by assessing the relations between event apprehension and event description in preschool-aged English-speaking children. Pairing eyetracking with a language production task allows us to ask whether age-related discrepancies in linguistic output stem from developmental differences in attention to motion events or whether they arise as part of the process of encoding those events in language. In particular, we investigate children’s allocation of attention to particular components of a dynamic motion event and the encoding of those components in a verbal description of the event. We ask whether differences in children’s utterances relative to adults can be attributed to differences in attentional patterns. In addition, we ask whether, like adults, children show a tight relationship between the allocation of attentional resources during utterance formulation and the content of the resulting utterance. To our knowledge, this is one of the first studies to compare the attentional signatures of speech planning in children and adults.

More specifically, we compare eye movements of English-speaking 4-year-old children and adults viewing dynamic motion events while manipulating viewing conditions to present a Linguistic task (in which participants viewed and described motion events) and a Nonlinguistic task (in which participants freely viewed motion events in preparation for a memory task). Examining the eyegaze data allows us to assess the event components that participants in each age group actually attend to, whether those components make it into their event descriptions or not. Eye movement patterns in the Nonlinguistic task should reveal the way that children and adults assemble an event representation to be used for committing the event to memory (Papafragou et al., 2008), and eye movement patterns in the Linguistic task are expected to be informative about the process of apprehending an event while formulating an utterance describing the event (see Griffin and Bock, 2000, Papafragou et al., 2008 for discussion of adult data).

If children’s omissions of manners and paths from their event descriptions are due to an attentional difference between children and adults, we would expect to also observe this difference in their eye movements as the event unfolds, regardless of the task presented to them. That is, children in the Linguistic task should pay less attention than adults to the same event components that they tend to omit, and crucially, children in the Nonlinguistic task should show an equivalently restricted pattern of attention allocation. Alternatively, if the omissions arise as a result of limitations introduced by language planning, any age-related differences in event apprehension should only surface in the Linguistic task. That is, if children apprehend and conceive of the motion events as adults do, allocation of attention to manners and paths should be quite similar for children and adults in the Nonlinguistic task. However, in the Linguistic task, differences in attention allocation might arise as children grapple with competing resources.

A memory task presented to all participants at the end of the experiment provides an independent measure of event conceptualization, on the assumption that memory for the details of particular event components is revealing of the structure of underlying event representations. If children differ from adults in the way they apprehend motion events (especially manners and paths of motion), this difference should surface in memory performance in the Nonlinguistic task: children should have a poorer memory for the same event components that they did not attend to as much as adults. Alternatively, if children apprehend motion events as adults do, memory for manner and path elements should be comparable between the two age groups.

Participants

The final sample included 20 4-year-old children (mean age 4;6 years;months, range 4;1–5;0) and 20 adults. Data from eight additional preschool-aged participants were excluded from the analysis for the following reasons: equipment failure (n = 4), unwillingness to cooperate (n = 1), experimenter error (n = 1), inability to calibrate (n = 1), or significant trackloss during stimulus viewing (n = 1; see Section 2.5 for trackloss criteria). All participants were native monolingual speakers of English. The

Spoken event descriptions

Our analyses of the descriptions that were provided by participants in the Linguistic condition revealed that children were more likely than adults to omit motion event components from their descriptions of target events. Children produced event descriptions that were shorter than those produced by adults (preschool MLU in words = 5.8; adult = 7.7). Table 1 provides a coarse report of the semantic content of those utterances: we found that adults mentioned both Manners and Paths in their event

General discussion

The experiment reported here introduced a novel method for assessing the processes that underlie language production in young children: observing children’s eye movements as they examine and prepare descriptions of dynamic events. To accomplish this, we investigated the origins of linguistic omissions in children’s speech, focusing on motion event descriptions. Both prior studies (Allen et al., 2007, Özyürek et al., 2008, Papafragou et al., 2006, Papafragou and Selimis, 2010) and our own

Acknowledgements

This research was supported in part by Grant #R01-HD055498 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development to A. Papafragou and J.C. Trueswell. Correspondence concerning this article should be addressed to Ann Bunger, Department of Psychology, University of Delaware, 108 Wolf Hall, Newark, DE 19716.

References (48)

  • A. Papafragou et al. When English proposes what Greek presupposes: The cross-linguistic encoding of motion events. Cognition (2006)
  • R. Pulverman et al. Infants discriminate manners and paths in non-linguistic dynamic events. Cognition (2008)
  • J.C. Trueswell et al. Perceiving and remembering events cross-linguistically: Evidence from dual-task paradigms. Journal of Memory and Language (2010)
  • M. Zheng et al. Thought before language: How deaf and hearing children express motion events across cultures. Cognition (2002)
  • A.-M. Adams et al. Limitations in working memory: Implications for language development. International Journal of Language & Communication Disorders (2000)
  • W.A. Croft. Syntactic categories and grammatical relations (1991)
  • H. Feldman et al. Beyond Herodotus: The creation of a language by linguistically deprived deaf children
  • Fenson, L., Dale, P. S., Reznick, J. S., & Bates, E. (1994). Variability in early communicative development. Monographs...
  • D. Gentner. Why nouns are learned before verbs: Linguistic relativity versus natural partitioning
  • A.E. Goldberg. The relationships between verbs and constructions
  • Z.M. Griffin et al. What the eyes say about speaking. Psychological Science (2000)
  • J. Huttenlocher et al. Emergence of action categories in the child: Evidence from verb meanings. Psychological Review (1983)
  • G. Johansson. Visual perception of biological motion and a model for its analysis. Perception and Psychophysics (1973)
  • M.H. Johnson. The inhibition of automatic saccades in early infancy. Developmental Psychobiology (1995)