Segmentation in the perception and memory of events
Abstract
People make sense of continuous streams of observed behavior in part by segmenting them into events. Event segmentation seems to be an ongoing component of everyday perception. Events are segmented simultaneously at multiple timescales, and are grouped hierarchically. Activity in brain regions including the posterior temporal and parietal cortex and lateral frontal cortex increases transiently at event boundaries. The parsing of ongoing activity into events is related to the updating of working memory, to the contents of long-term memory, and to the learning of new procedures. Event segmentation might arise as a side effect of an adaptive mechanism that integrates information over the recent past to improve predictions about the near future.
Making sense by segmenting
Imagine walking with a friend to a coffee shop. If asked to describe this activity in more detail you might list a few of the events that make it up. The events listed could be broken up by changes in the physical features of the activity, such as location: ‘We started out by going down to the laboratory. We grabbed our coats and put them on. Then we walked out of the building to the corner by the subway station…’ Or, they could be broken up by changes in conceptual features, such as your goals: ‘We started our walk talking about how much construction is going on. When the topic turned to the new building with the coffee shop we decided to head over there to give it a try…’ Such descriptions are typical of how people talk about events, and they illustrate something important about perception: people make sense of a complex dynamic world in part by segmenting it into a modest number of meaningful units. Recent research on event perception reveals that, as an ongoing part of normal perception, people segment activity into events and subevents. This segmentation is related to core functions of cognitive control and memory encoding, and is subserved by isolable neural mechanisms.
Events and their boundaries
By ‘event’ we mean a segment of time at a given location that is conceived by an observer to have a beginning and an end [1]. In particular we focus on the events that make up everyday life on the timescale of a few seconds to tens of minutes – things like opening an envelope, pouring coffee into a cup, changing the diaper of a baby or calling a friend on the phone. Event Segmentation Theory (EST) [2] (see Glossary) proposes that perceptual systems spontaneously segment activity into events as a side effect of trying to anticipate upcoming information (see Box 1). When perceptual or conceptual features of the activity change, prediction becomes more difficult and errors in prediction increase transiently. At such points, people update memory representations of ‘what is happening now’. The processing cascade of detecting a transient increase in error and updating memory is perceived as the subjective experience that a new event has begun.
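The cascade described above (predict, detect a transient rise in error, update) can be made concrete with a toy sketch. This is an illustration only, not the computational model proposed by EST: the one-dimensional `signal`, the moving-average predictor, and the `window` and `threshold` parameters are all assumptions introduced here for the example.

```python
def detect_event_boundaries(signal, window=3, threshold=1.0):
    """Return the indices at which a toy predictor's error spikes.

    Assumptions (not from the source): the input is a one-dimensional
    feature stream, the prediction for the next value is the mean of the
    last `window` values in the current event model, and a boundary is
    declared when the absolute error exceeds a fixed `threshold`.
    """
    boundaries = []
    model = []  # toy "event model": observations since the last update
    for t, x in enumerate(signal):
        if model:
            recent = model[-window:]
            predicted = sum(recent) / len(recent)
            if abs(x - predicted) > threshold:
                boundaries.append(t)  # error spike: a new event begins
                model = []            # update: discard the old event's contents
        model.append(x)
    return boundaries


# A stream with two abrupt feature changes, at positions 4 and 8.
stream = [0.0, 0.1, 0.0, 0.1, 5.0, 5.1, 4.9, 5.0, 0.0, 0.1]
print(detect_event_boundaries(stream))
```

Resetting the model at each boundary is the simplest way to mimic the updating of 'what is happening now'; a more realistic sketch would predict from learned structure rather than from a running mean.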
Segmentation tasks
How can a researcher discover when a person perceives that a new event has begun? One simple but surprisingly powerful answer is to ask them, usually by having them press a button [3]. Viewers tend to identify event boundaries at points of change in the stimulus, ranging from physical changes, such as changes in the movements of the actors, to conceptual changes, such as changes in goals or causes. To investigate movements of actors, Newtson et al. [4] asked participants to segment movies of an actor conducting everyday activities. The physical pose of the actor was coded at one-second intervals. Event boundaries tended to be marked at larger changes in pose. Zacks [5] extended this qualitative finding to a quantitative analysis of movement variables in a set of studies using simple animated stimuli. Event boundaries were predicted by changes in movement parameters including the acceleration of the objects and their location relative to one another (see also [6]). Larger physical changes have been studied using commercial cinema as a stimulus, in which the locations and temporal setting of characters can change from shot to shot. In one study, such changes were found to predict where viewers segmented Hollywood movies [7]. Conceptual changes that are correlated with event boundaries include changes in goals of actors, in causal relations and in interactions amongst characters [8,9].
Automatic event segmentation
Does asking people for conscious judgments about event boundaries really tell us anything about ongoing perception outside the laboratory? Event segmentation tasks have good intersubjective agreement [10] and reliability [11], which suggests they tap into ongoing processing. Nonetheless, a basic limitation of directly applying segmentation tasks is that they might interfere with the ongoing perceptual processes they attempt to measure. Stronger evidence that event segmentation is automatic comes from implicit behavioral measures and from neurophysiological measures that require no overt task.
Reading-time evidence for automaticity of event segmentation
Studies of the pace of reading during narrative text comprehension indicate that readers slow down at event boundaries. These studies arise mostly from discourse comprehension theories, which propose that readers construct a series of mental models of a situation described by a narrative. In the Structure Building Framework [12], new mental models are initiated when the text refers to a new set of people, places or things, when the action in the text changes its temporal or spatial location, or when a new causal sequence is initiated. In the Event Indexing Model [13], new mental models are initiated when there is a change in space, time, protagonist, objects, goals or causes. When the changes identified by these models occur, readers have been found to read more slowly. This is consistent with the processing load hypothesis put forward by the Event Indexing Model, which states that changes in situational features increase the difficulty of integrating newly encountered information into the current mental model. In one such study [14], participants read two literary narratives one sentence at a time on a computer screen, and reading time was recorded. The texts were coded to identify changes in time, space and causal contingency. Reading time increased consistently at changes in time and cause; reading time sometimes increased at spatial changes, but this depended on the previous knowledge of the reader and task goals. More recent work has found that reading time increases at shifts in characters and their goals [15,16]. Such results support the notion that reading times increase at event boundaries because, as noted previously, event boundaries are associated with changes in time, space, causes, characters and goals. Although most reading-time studies have not measured event segmentation, two recent studies have directly compared event segmentation and narrative reading time. 
In one [17], cues to event boundaries were experimentally manipulated by changing the temporal contiguity of events (Figure 1). Phrases marking a temporal discontinuity (‘an hour later’) increased the likelihood that a clause would be identified as an event boundary during segmentation, and slowed reading during self-paced comprehension. The second study looked at the relation between event segmentation and reading time correlationally [8]. Two groups of participants read narratives about a young boy. One group segmented them into events; the other read them one clause at a time while reading time was recorded. Clauses in which more readers identified event boundaries produced longer reading times. These results support the view that when comprehenders encounter boundaries between events they perform extra processing operations.
Neurophysiological evidence for automaticity of event segmentation
Converging evidence for the automaticity of event segmentation comes from noninvasive neurophysiological measures, including functional magnetic resonance imaging (fMRI) and electroencephalography (EEG). These studies have been motivated by the hypothesis that if the brain is undertaking processing that relates to event boundaries, transient changes in brain activity should be observed at those points in time corresponding to event boundaries – whether or not you are attending to event segmentation. In one study [18], participants passively viewed short movies of everyday activities while their brain activity was recorded with fMRI. During the initial viewing and fMRI data recording, participants were asked simply to watch the movies and try to remember as much as possible. In the second phase of the experiment these participants segmented the movies into events. Event boundaries were associated with increases in brain activity during passive viewing in bilateral posterior occipital, temporal and parietal cortex and right lateral frontal cortex. The posterior activation included the MT complex [11], an area associated with the processing of motion [19]. A subsequent study using simple animations of geometric objects found that activity in the MT complex was correlated with the speed of motion of objects, and with the presence of event boundaries [20].
Whereas most of the neurophysiological research on event perception involves visual events, a new study by Sridharan et al. [21] investigated the perception of event structure in music. This study examined the extent to which musically untrained listeners use transitions between movements to segment classical pieces into coarse-grained events. A particular advantage in studying musical movements is that they provide objective, normative events. As Figure 2 illustrates, two dissociable networks in the right hemisphere were selectively responsive at transitions between movements: an early-responding ventral network that included the ventrolateral prefrontal cortex and posterior temporal cortex, and a late-responding dorsal network that included the dorsolateral prefrontal cortex and posterior parietal cortex.
fMRI data have also provided evidence for automatic event segmentation during reading. Speer et al. [22] had participants read narrative texts, one word at a time, while brain activity was recorded with fMRI. Participants subsequently segmented the texts into events. Brain activity increased at the later-identified event boundaries in a set of regions that corresponded substantially with the regions that increased at event boundaries in movies; these included bilateral regions in medial posterior temporal, occipital and parietal cortex, in the lateral temporal-parietal and anterior temporal cortex, and in the right posterior dorsal frontal cortex.
Converging with the fMRI data, recent experiments using EEG indicate that perceptual processing is modulated at event boundaries on an ongoing basis. In one set of experiments [23], participants viewed movies of goal-directed activities while undergoing EEG recording, after which they segmented the movies into events. Evoked responses were detected at frontal and parietal electrode sites, and these responses were modulated by whether or not participants were familiar with the movies themselves or with the activities they depicted.
Finally, measurements of pupil diameter provide further evidence that information processing increases transiently after the perception of an event boundary. Pupil diameter provides an online measure of cognitive processing load, as demonstrated in studies using a variety of motor and cognitive paradigms [24]. In one study [25], participants viewed movies while their pupil diameters and eye movements were recorded. They then segmented the movies into events. Pupil diameter transiently increased following those points that were later identified as event boundaries. (Saccades also became more frequent around event boundaries, which might reflect the reorienting of viewers at the beginning of a new event.)
Together, the reading-time, neurophysiological and oculomotor data strongly suggest that there are transient changes in brain activity correlated with the subjective experience that one event has ended and another has begun. We can be confident these effects are independent of task demands such as the conscious intention to segment activity or covert attention to the location of event boundaries, because for all the studies discussed here the crucial reading-time or physiological data were collected before participants were introduced to the event segmentation task. An important question for ongoing research is: to what extent do such effects reflect neural processing that has a causal role in event segmentation, and to what extent do they reflect processing that is correlated with the presence of event boundaries but not a cause or consequence of event segmentation as such?
Hierarchical event perception
Events can be identified at a range of temporal grains, from brief (fine-grained) to extended (coarse-grained). In goal-directed human activity it is natural to think of such events as being hierarchically organized, with groups of fine-grained events clustering into larger units; this is partly because actions are hierarchically organized by goals and subgoals [1]. For example, the activity of making a sandwich includes the coarse-grained events of (i) removing ingredients from the refrigerator, and (ii) assembling them. The event of assembling the ingredients, in turn, might include subevents such as adding meat, adding cheese and spreading mayonnaise. When understanding activities, people seem spontaneously to track the hierarchical grouping of events. Evidence for hierarchical grouping has been found by asking a viewer or reader to segment an activity twice, on different occasions, once to identify coarse-grained event boundaries and once to identify fine-grained boundaries. If coarse-grained events subsume a group of finer-grained events, it would be expected that each coarse event boundary would fall slightly later in time than the fine event boundary to which it is closest. That is exactly what is observed [26]. Hierarchical dividing of coarse events into finer-grained events also would predict that each coarse event boundary should fall near one of the fine boundaries; this is also the case [27].
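The two alignment predictions above can be expressed as simple summary statistics. The following sketch is a minimal illustration under stated assumptions: boundary times are given in seconds, the function names are hypothetical, and the example data are invented. The analyses in the cited studies may differ, for example by comparing observed alignment with the alignment expected by chance.

```python
def nearest_fine_offsets(coarse, fine):
    """For each coarse boundary, return its time minus the nearest fine boundary's time.

    A positive offset means the coarse boundary falls later than the
    fine boundary to which it is closest.
    """
    offsets = []
    for c in coarse:
        nearest = min(fine, key=lambda f: abs(c - f))
        offsets.append(c - nearest)
    return offsets


def alignment_summary(coarse, fine):
    """Return (mean signed offset, mean absolute offset) in seconds."""
    offsets = nearest_fine_offsets(coarse, fine)
    mean_signed = sum(offsets) / len(offsets)
    mean_absolute = sum(abs(o) for o in offsets) / len(offsets)
    return mean_signed, mean_absolute


# Invented boundary times (seconds) from one hypothetical viewer.
fine_boundaries = [2.0, 5.0, 9.0, 12.0, 16.0, 20.0]
coarse_boundaries = [9.5, 20.5]
print(alignment_summary(coarse_boundaries, fine_boundaries))
```

On this reading, a small mean absolute offset corresponds to coarse boundaries falling near fine boundaries, and a positive mean signed offset corresponds to coarse boundaries falling slightly later than their nearest fine boundary.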
Events on different temporal grains can be sensitive to different features of activities. Running descriptions by viewers of coarse- and fine-grained events in movies have provided evidence for such differences [27]. Descriptions of coarse-grained events focused on objects, using more precise nouns and less precise verbs. Descriptions of fine-grained events focused on actions on those objects, using more precise verbs but specifying the objects less precisely. If different temporal grains depend on different features, you would expect that segmentation of fine- or coarse-grained events could be selectively impaired. Selective impairments of coarse-grained segmentation have been found in patients with frontal lobe lesions [28] and in patients with schizophrenia [29] (Box 2).
Parsing ongoing activity into hierarchically organized events and subevents might be important for relating ongoing perception to knowledge about activities. Studies of memory for events and story comprehension suggest that people use hierarchically organized event representations (scripts or schemata) to understand a particular activity in relation to similar previously experienced activities (see, e.g., [30,31]). Infants as young as 12 months seem to be sensitive to the hierarchical organization of behavior [32,33] (Box 3) and non-human primates seem to be sensitive to hierarchical organization in the behavior of conspecifics such that they can use such organization to learn manual skills [34]. Recent research has found that learning of a new procedure by adults can be facilitated by explicitly representing the hierarchical structure of the activity, or impaired by misrepresenting that structure [35]. Such learning might be related to the mechanisms by which people form ‘chunks’ in semantic knowledge [36]. In sum, people seem to track the hierarchical structure of activity during perception because this enables them to use prior knowledge for understanding, and to adapt that prior knowledge to learning new skills.
Memory for events
Event boundaries relate systematically to both the online maintenance of information (working memory) and the permanent storage of information for later retrieval (long-term memory). According to EST [2], this is because at event boundaries people update representations of the current event, which frees information from working memory, and orient to incoming perceptual information, which encodes it particularly strongly for long-term memory. Evidence for working memory updating at event boundaries comes from the comprehension of text narratives, picture stories and cinema.
Working memory for events in text
Studies of memory access during text reading generally have not directly measured event boundaries, but have found evidence that those cues associated with event boundaries reduce memory access. Memory for recently mentioned objects is poorer after reading sentences that indicate a shift in time or space [37–41]. As noted previously, one study [17] experimentally manipulated the presence of event boundaries in narratives by varying the passage of time indicated in an auxiliary phrase. As is shown in Figure 1, implicit and explicit measures of memory indicated that information became less accessible following a time shift.
Working memory for events in pictures
Gernsbacher [42] had participants identify episode boundaries in a picture story. New participants then viewed the picture story, and from time to time were probed to discriminate pictures that had recently been presented in the story from pictures that were left–right reversed. Recognition was better for images from within the current episode than for images from the previous episode, indicating that some of the surface information in the pictures was less available once a new event had begun.
Working memory for events in cinema and virtual reality
Converging with the results from text and picture stories, experiments looking at recognition memory for objects in cinema and virtual reality indicate that working memory is updated at event boundaries. In one recent study [43], participants navigated in a virtual reality environment, and memory for recently appearing objects was tested. Memory was reduced after walking through a door into an adjacent room – a probable event boundary. In another recent set of studies [44], participants viewed excerpts from movies that were occasionally interrupted by recognition memory tests for objects that had been on the screen five seconds previously. The information that participants could retrieve about these objects differed systematically depending on whether an event boundary had occurred during those five seconds. Further, neuroimaging data suggested that the basis for responding also differed: retrieval of information about objects from within the current event selectively activated brain areas including the bilateral occipital and lateral temporal cortices and the right inferior lateral frontal cortex, whereas retrieval of information about objects from the previous event selectively activated brain areas including the medial temporal cortex and medial parietal cortex. The medial temporal regions correspond to the hippocampal formation, which is known to have a key role in long-term memory storage and retrieval [45]. This is consistent with the hypothesis that participants depended preferentially on working memory for within-event retrieval and shifted to long-term memory for across-event retrieval – although the information to be retrieved was only a few seconds old.
Long-term memory for events
Of course, the core function of long-term memory is to support access to information over much longer delays. Evidence from tests in which participants retrieve information over delays of minutes to hours indicates that event boundaries serve as anchors in long-term memory. Recognition memory for pictures drawn from event boundaries has been found to be better than memory for pictures drawn from points between the boundaries [46]. Manipulations that affect where viewers perceive event boundaries in a film also affect their memory for the film when tested later. Boltz [47] had participants watch feature films with or without embedded commercials. The commercials were inserted at event boundaries, or between event boundaries. Memory for the activity was better for movies without commercials and for movies in which commercials were inserted at event boundaries. Similarly, Schwan and Garsoffky [48] had participants view movies of everyday events with or without deletions. The deletions were either of segments of time surrounding an event boundary, or segments of time within an event. The researchers found that recall was better for events when there were no deletions and when the deletions were of segments within an event, preserving the event boundaries. Segmentation grain also affects memory: recall for details is better after fine-grained segmentation than after coarse-grained segmentation [49–51].
If event boundaries serve as anchors for long-term memory encoding, then individuals who segment an activity effectively should have better later memory for it. As described in Box 2, recent evidence supports this hypothesis [52].
How does segmentation help?
Why do people segment ongoing activity into events? Segmentation results from the continual anticipation of future events. This anticipation enables you adaptively to encode structure from the continuous perceptual stream, to understand what an actor will do next, and to select your own future actions [53]. Segmentation simplifies, enabling you to treat an extended interval of time as a single chunk. If you segment well, this chunking saves on processing resources and improves comprehension. However, to segment well, you must identify the correct units of activity – the correct events. One possible mechanism for identifying events is to monitor your ongoing comprehension and break activity into units when comprehension begins to falter (Box 1). Once events have been individuated you can start to learn to recognize sequences of events and plan reactions based on such sequences. Grouping fine-grained events hierarchically into larger units enables you to learn not just rote one-after-the-other relations but more complex ones. One such relation is partial ordering, which is ubiquitous in problem-solving and planning. Think of baking a cake, where there are several different orders in which you could mix the ingredients, but all those steps must be complete before the cake goes in the oven.
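The cake example can be written down as a small partial order. The sketch below is illustrative only: the step names and the `respects_partial_order` helper are hypothetical, and `graphlib.TopologicalSorter` (Python standard library, version 3.9 or later) is used simply to generate one admissible ordering.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps: each key maps to the set of steps that must precede it.
CAKE_STEPS = {
    "add flour": set(),
    "add sugar": set(),
    "add eggs": set(),
    "bake": {"add flour", "add sugar", "add eggs"},
}


def respects_partial_order(sequence, prerequisites):
    """True if every step occurs exactly once and only after all of its prerequisites."""
    if sorted(sequence) != sorted(prerequisites):
        return False
    done = set()
    for step in sequence:
        if not prerequisites[step] <= done:  # some prerequisite not yet done
            return False
        done.add(step)
    return True


# The mixing steps may come in any order, but baking must come last.
print(respects_partial_order(["add eggs", "add flour", "add sugar", "bake"], CAKE_STEPS))
print(respects_partial_order(["add flour", "bake", "add sugar", "add eggs"], CAKE_STEPS))

# One admissible ordering, generated rather than memorized.
one_valid_order = list(TopologicalSorter(CAKE_STEPS).static_order())
print(one_valid_order)
```

A learner who had stored only rote one-after-the-other sequences would accept a single ordering; representing the prerequisites instead accepts every ordering that satisfies them.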
Events segmented during perception also can form the units for memory encoding, enabling you to store compact representations of extended activities. Identifying the correct events facilitates memory, much as it is easier to remember a sequence of vocalizations if it comes from a language you know, enabling you to segment it into words, clauses and sentences.
Event segmentation is automatic and important for perception, comprehension, problem-solving and memory (see Box 4 for open questions). None of this would be true if the structure of the world were not congenial to segmentation. If sequential dependencies were not predictable, if activity were not hierarchically organized, there would be no advantage to imposing chunking and grouping on the stream of behavior. In this regard, as in many others, human perceptual systems seem to be specialized information-processing devices that are tuned to the structure of their environment.
Acknowledgements
Preparation of this article was supported in part by grants R01-MH070674 and T32 AG000030–31 from the National Institutes of Health. We thank Devarajan Sridharan for providing figure materials, and thank the following colleagues for thoughtful comments on the manuscript: Dare Baldwin, Joe Magliano, G.A. Radvansky, Stephan Schwan, Ric Sharp, Khena Swallow and Barbara Tversky.
Glossary
Event model: an actively maintained representation of the current event, which is updated at perceptual event boundaries.
Event segmentation: the perceptual and cognitive processes by which a continuous activity is segmented into meaningful events.
Temporal grain: events can be perceived on a range of temporal grains, or timescales, from a second or less to tens of minutes.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
References
Full text links
Read article at publisher's site: https://doi.org/10.1016/j.tics.2007.11.004
Read article for free, from open access legal sources, via Unpaywall: https://europepmc.org/articles/pmc2263140?pdf=render