Original Articles
Chunk formation in immediate memory and how it relates to data compression
Introduction
Individuals tend to make information easier to retain by recoding it into chunks (e.g., Cowan et al., 2004). Chunking simplifies memorization by taking advantage of prior knowledge to reduce the quantity of information to be retained (Miller, 1956). As a key learning mechanism, chunking (or grouping) has had considerable impact on the study of expertise (e.g., Chase and Simon, 1973; Ericsson et al., 1980; Hu and Ericsson, 2012), immediate recall (e.g., Chen and Cowan, 2005; Farrell et al., 2011), and memory development (e.g., Cowan et al., 2010; Gilchrist et al., 2009).
For chunking to benefit memory, people need to be able to retrieve the chunks they stored. One way people retrieve chunks is via long-term memory processes (French et al., 2011; Gobet et al., 2001; Guida et al., 2012; Reder et al., 2016). Consider the letter string IBMCIAFBI. As Miller (1956) discussed, this letter string can easily be simplified into three chunks if one uses long-term memory to recall the U.S. agencies whose acronyms are IBM, CIA, and FBI.
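The IBMCIAFBI example can be sketched in a few lines of Python. This is only an illustration of knowledge-based recoding, not a model from the article: the function name `chunk_with_knowledge`, the greedy longest-match strategy, and the small `KNOWN_ACRONYMS` set standing in for long-term memory are all assumptions made for the example.

```python
def chunk_with_knowledge(letters, known_chunks):
    """Greedily split `letters` into known chunks, scanning left to right.

    Letters that start no known chunk are kept as single-item chunks.
    """
    # Try longer chunks first; ignore empty strings so the scan always advances.
    candidates = sorted((c for c in known_chunks if c), key=len, reverse=True)
    chunks = []
    i = 0
    while i < len(letters):
        for candidate in candidates:
            if letters.startswith(candidate, i):
                chunks.append(candidate)
                i += len(candidate)
                break
        else:
            # No stored chunk matches here: the letter stays on its own.
            chunks.append(letters[i])
            i += 1
    return chunks


# Hypothetical stand-in for acronyms already stored in long-term memory.
KNOWN_ACRONYMS = {"IBM", "CIA", "FBI"}

print(chunk_with_knowledge("IBMCIAFBI", KNOWN_ACRONYMS))  # ['IBM', 'CIA', 'FBI']
print(len(chunk_with_knowledge("IBMCIAFBI", set())))      # 9
```

With the stored acronyms, nine letters reduce to three chunks; without them, the same string remains nine separate items.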
Previous work on chunking has focused on how long-term memory aids chunk creation. However, immediate memory might also play a fundamental role in the creation of chunks. People may form chunks in immediate memory by rapidly encoding patterns before any consolidation in long-term memory occurs. For example, it is easy to remember the letter string AQAQAQ using a simple rule of repetition (e.g., AQ three times). This type of simplification does not necessarily depend on the use of long-term memory to recall past knowledge that relates items to each other. Instead, this process depends on the apprehension of regularities inherent to the stimulus at hand, i.e., compression.
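The repetition rule for AQAQAQ can be made concrete with a minimal Python sketch. It assumes that "compression" here means finding the shortest unit whose repetition rebuilds the string; the function name `repetition_rule` is hypothetical, and the article's own compressibility measure may differ.

```python
def repetition_rule(sequence):
    """Return (unit, count) for the shortest unit whose repetition rebuilds `sequence`."""
    n = len(sequence)
    for size in range(1, n + 1):
        if n % size == 0:
            unit = sequence[:size]
            if unit * (n // size) == sequence:
                return unit, n // size
    # Only reached for an empty sequence.
    return sequence, 1


print(repetition_rule("AQAQAQ"))  # ('AQ', 3)
print(repetition_rule("AQXBTZ"))  # ('AQXBTZ', 1)
```

The rule is found from the stimulus alone: no stored knowledge about "AQ" is consulted, which is the contrast with the long-term-memory route drawn in the text.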
The idea that immediate memory might play a fundamental role in the creation of chunks has generally been overlooked. Some previous findings are consistent with the proposal that chunks can increase memory capacity (Brady et al., 2009; Feigenson and Halberda, 2008). However, these studies have mostly focused on how long-term-memory representations contribute to encoding in immediate memory. In contrast, our goal is to provide a principled quantitative approach to how immediate memory relates to the formation of chunks. Treating chunking as a process that originates in immediate memory requires a precise conceptualization, and the concept of compressibility could provide one.
We propose a two-factor theory of the formation of chunks in immediate memory. The first factor is compressibility (i.e., the idea that a more compact representation can be used to recode information in a lossless fashion). Compressibility could predict chunking because it measures the degree to which the material is patterned, and hence the degree to which the material can be simplified. Memory for compressible sequences should be superior to memory for non-compressible sequences (the same way that studies in the domain of categorization have shown that compressible material is better learned over the long term; see Feldman (2000)).
The second factor is the order of the information to memorize. Presentation order might influence the ease with which patterns or regularities in the stimuli can be discovered, and compression algorithms typically depend on this kind of information. A presentation order that aligns with the process of simplifying the material may increase the likelihood that chunking occurs. In contrast, presentation orders that do not aid in the discovery of regularities might result in a failure to chunk compressible materials, causing them to be remembered in a way similar to non-compressible materials. Presentation order should therefore interact with compressibility. As a simple example, one can compress the set “2, 3, 4, 5, 6” with the rule “all numbers between 2 and 6”, whereas with the series “2, 4, 6, 3, 5”, that same rule might not be noticed by the participant, so compression might not take place.
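The interaction between order and compressibility can be illustrated with a deliberately simple sequential coder. The sketch below is an assumption made for illustration, not the model tested in the article: the hypothetical `encode_runs` function scans items in presentation order and merges an item into the current run only when it is exactly one more than the previous item.

```python
def encode_runs(numbers):
    """Encode `numbers`, in presentation order, as (start, end) runs of consecutive integers."""
    runs = []
    for n in numbers:
        if runs and n == runs[-1][1] + 1:
            # Extend the current run: n continues the "+1" pattern.
            runs[-1] = (runs[-1][0], n)
        else:
            # Pattern broken (or first item): start a new run.
            runs.append((n, n))
    return runs


ordered = [2, 3, 4, 5, 6]
shuffled = [2, 4, 6, 3, 5]

print(encode_runs(ordered))        # [(2, 6)]
print(len(encode_runs(shuffled)))  # 5
print(sorted(ordered) == sorted(shuffled))  # True: same set, different order
```

Both series contain the same five numbers, yet this coder reduces the first to a single run and leaves the second as five separate items. A coder that also looked for other step sizes would find some structure in the second series, so the size of the order effect depends on which regularities the encoder (or participant) is able to notice.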
This two-factor theory is adapted from the domain of categorization, which has provided a framework for studying category formation in long-term memory, with explanations based on the compressibility of descriptions (Bradmetz and Mathy, 2008; Feldman, 2000, 2003; Goodwin and Johnson-Laird, 2013; Lafond et al., 2007; Vigo, 2006) using different types of presentation orders (based on rules, similarity, or dissimilarity; see Elio and Anderson, 1981, 1984; Gagné, 1950; Mathy and Feldman, 2009; Medin and Bettger, 1994). This framework accounts for a wide range of categorization performance in long-term memory and could in principle provide similar predictions for immediate memory. Our theory is that a compression model (e.g., Feldman, 2000) can be adapted to immediate memory. The rationale is that elementary structures, i.e., the redundancies that make a structure compressible, are simple enough to be used rapidly in immediate memory to cope with new situations.
We conducted an experiment to test the proposal outlined above, namely, that chunk formation occurs in immediate memory to optimize capacity before any consolidation process in long-term memory occurs. Our prediction is that immediate-memory span is proportional to stimulus compressibility, but only when the order of the information allows the participant to spontaneously detect redundancies such as pairs of similar features.
In the Discussion, we show that two major classes of competing models cannot correctly predict our results. The first class is interference-based models of short-term memory, which predict poorer performance when participants see sequences containing similar features, whereas our model predicts that participants can take advantage of these similarities to compress information. The second class includes the minimal description length (MDL) approaches to long-term memory, which rely on the repetition of trials and, as such, offer no predictions about the compression process at play in our task.
Method
Two key aspects were investigated in the present experiment: compressibility of a sequence and presentation order within a sequence. These two factors were studied using categorizable multi-dimensional objects, with discrete features, such as small green spiral, large green spiral, small red square. The sequences used could not conform to already-learned chunks. Although the features themselves are part of basic knowledge, we are reasonably confident, for instance, that none of our participants
Results
The analyses were conducted on correct (1) or incorrect (0) serial-recall scores for each trial (a response was scored correct when both the items and the positions were correctly recalled), and on the average recall score across conditions (proportion correct). The data were first aggregated for a given variable, e.g., ‘presentation time’, in order to run a separate univariate ANOVA for each dependent variable (e.g., the mean proportion correct for all trials pooled).
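The all-or-none scoring rule described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example trials are hypothetical, and items are represented as plain strings.

```python
def score_trial(presented, recalled):
    """Return 1 if every item is recalled in its presented position, else 0."""
    return int(list(presented) == list(recalled))


def proportion_correct(trials):
    """Mean all-or-none score over (presented, recalled) pairs."""
    if not trials:
        raise ValueError("at least one trial is required")
    scores = [score_trial(presented, recalled) for presented, recalled in trials]
    return sum(scores) / len(scores)


# Hypothetical trials: the second has the right items in the wrong positions.
trials = [
    (["small green spiral", "large green spiral"],
     ["small green spiral", "large green spiral"]),
    (["small red square", "small green spiral", "large green spiral"],
     ["small red square", "large green spiral", "small green spiral"]),
]

print([score_trial(p, r) for p, r in trials])  # [1, 0]
print(proportion_correct(trials))              # 0.5
```

Under this rule a transposition counts as an error even when all items are recalled, which is what distinguishes serial-recall scoring from item-only scoring.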
Discussion
We explored the ability of untrained participants to increase their immediate memory by parsing sequences of objects into newly-formed chunks. The main reason for exploring chunk formation in immediate memory is that a chunk is too often thought of solely as a product of an already-formed long-term representation. Showing that recoding can occur very rapidly in immediate memory is a different undertaking. This idea may seem merely intuitive, but no other model can give a precise quantitative
Conclusion
One common idea is that working memory simply uses pointers to retrieve chunks in long-term memory, and that chunks are only consolidated pieces of information stored in long-term memory. We believe that we observed a chunking process that seems to result from the temporary creation of new representations. Recall performance in the current study was as low as about three items, but the total increased when the sequential order of the items was favorable to the formation of new chunks, i.e.,
Acknowledgement
This research was supported by a grant from the Agence Nationale de la Recherche (Grant # ANR-09-JCJC-0131-01) awarded to Fabien Mathy, a grant from the Région de Franche-Comté AAP2013 awarded to Fabien Mathy and Mustapha Chekaf, and by NIH Grant R01 HD-21338 awarded to Nelson Cowan. We are grateful to Jacob Feldman, Ori Friedman, Alessandro Guida, Harry H. Haladjian, and the Attention & Working Memory Lab at Georgia Tech for their helpful remarks.
References (61)
- et al. Word length and the structure of short-term memory. Journal of Verbal Learning & Verbal Behavior (1975)
- et al. Perception in chess. Cognitive Psychology (1973)
- Phonemic coding and rehearsal in short-term memory for letter strings. Journal of Verbal Learning & Verbal Behavior (1973)
- A catalog of Boolean concepts. Journal of Mathematical Psychology (2003)
- et al. Investigating the childhood development of working memory using sentences: new evidence for the growth of chunk capacity. Journal of Experimental Child Psychology (2009)
- et al. Chunking mechanisms in human learning. Trends in Cognitive Sciences (2001)
- et al. The acquisition of Boolean concepts. Trends in Cognitive Sciences (2013)
- et al. How chunks, long-term working memory and templates offer a cognitive explanation for neuroimaging data on expertise acquisition: A two-stage framework. Brain and Cognition (2012)
- et al. Memorization and recall of very long lists accounted for within the long-term working memory framework. Cognitive Psychology (2012)
- et al. The enigma of organization and distinctiveness. Journal of Memory and Language (1993)
- Chunking: Associative chaining versus coding. Journal of Verbal Learning & Verbal Behavior
- The effect of a difficult word on the transitional error probabilities within a sequence. Journal of Verbal Learning & Verbal Behavior
- Questioning short-term memory and its measurement: Why digit span measures long-term associative learning. Cognition
- Complexity minimization in rule-based category learning: Revising the catalog of Boolean concepts and evidence for non-minimal rules. Journal of Mathematical Psychology
- What’s magic about magic numbers? Chunking and data compression in short-term memory. Cognition
- A formal model of capacity limits in working memory. Journal of Memory and Language
- Modeling by shortest data description. Automatica
- A note on the complexity of Boolean concepts. Journal of Mathematical Psychology
- Effects of visual similarity on serial report and item recognition. Quarterly Journal of Experimental Psychology: Section A
- A common prefrontal-parietal network for mnemonic and mathematical recoding strategies within working memory. Cerebral Cortex
- Response times seen as decompression times in Boolean concept use. Psychological Research
- Compression in visual working memory: Using statistical regularities to form more efficient memory representations. Journal of Experimental Psychology: General
- The magic number seven after fifteen years
- A temporal ratio model of memory. Psychological Review
- Memory for serial order: A network model of the phonological loop and its timing. Psychological Review
- Chunk limits and length limits in immediate recall: A reconciliation. Journal of Experimental Psychology: Learning, Memory, and Cognition
- Core verbal working-memory capacity: The limit in words retained without covert articulation. Quarterly Journal of Experimental Psychology
- Applied multiple regression/correlation analysis for the behavioral sciences
- The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences
- Constant capacity in an immediate serial-recall task: A logical sequel to Miller (1956). Psychological Science
Cited by (52)
- Identification and conceptualization of procedural chunks in chess. 2021, Cognitive Systems Research
  Citation excerpt: Therefore, we can assume that STM capacity limit is a combination of chunk size and number. Our results are in line with several recent studies stating that in complex situations where complementary procedures are used STM capacity cannot be expressed as a simple concept (Cowan, Rowder, Blume, & Saults, 2012; Chekaf et al., 2016). We propose that in procedural information processing, the level of expertise and sorting order of retrieved information are important factors in defining STM capacity.
- How does semantic knowledge impact working memory maintenance? Computational and behavioral investigations. 2021, Journal of Memory and Language
- Chunking and data compression in verbal short-term memory. 2021, Cognition
  Citation excerpt: This link between compression and the set of possible messages corresponds to a central result in the field of information theory: efficient storage and transmission of information depends on knowledge of the statistical properties of the signal (Shannon & Weaver, 1949). Only a small number of studies of STM have considered chunking in the context of data compression (e.g. Brady et al., 2009; Chekaf et al., 2016; Mathy & Feldman, 2012; Norris et al., 2020; Thalmann et al., 2019). Mathy and Feldman had participants recall digit sequences which varied in compressibility.
- Similarity-Based Compression in Working Memory: Implications for Decay and Refreshing Models. 2024, Computational Brain and Behavior