In the final pages of Supersizing the Mind, Clark describes Dawkins’ “mental flip.” Dawkins asked biologists to abandon their focus on individual organisms and instead to imagine “bodies falling transparent so as to reveal the near-seamless play of replicating DNA.” By taking such a view, one might see that an organism is just a gene’s way of replicating itself. Clark goes on to say that a similar mental flip is needed in the sciences of mind “…to cease to unreflectively privilege the inner, the biological, and the neural.” In this view, “[t]he human mind … emerges as the productive interface of brain, body, and social and material world.” Clark is to be congratulated for making this case and bolstering it with empirical evidence. From low-level processes of motor control to high-level processes of reasoning, Clark shows us how cognitive processes are enacted in systems that transcend the boundaries of the individual organism. Supersizing the Mind delivers us to a point from which yet another “mental flip” is both possible and necessary. In this commentary I will try to describe this next flip and some of the things that become apparent when it is made.

1 The assembly of cognitive systems

Clark introduces the Principle of Ecological Assembly (PEA) as a central motif in his argument. According to the PEA, “the canny cognizer tends to recruit, on the spot, whatever mix of problem-solving resources will yield an acceptable result with a minimum of effort” (p. 13).[fn 1] Clark exploits an important ambiguity in the meaning of the word “assembly.” Assembly can be either a process of putting things together, or the product that consists of the things that have been put together. His description of the principle does not commit to one reading or the other. Throughout Supersizing the Mind, Clark provides many examples of assemblies: products that then house extended cognitive processes, but he says less about the assembly processes. In fact, accounting for the organization of ecological assemblies is the central and unsolved problem in the book.

The agent in Clark’s description of ecological assembly is the “canny cognizer,” but who is that? Clark seems unsure how to answer that question. He addresses the question directly in Chap. 6 where he describes some clever experiments indicating that “our problem-solving performances … accord no special status or privilege to specific types of operations (motoric, perceptual, introspective) or modes of encoding (in the head or in the world)” (p. 121). Clark dubs this indifference to the location and type of resource the Hypothesis of Cognitive Impartiality. He then notes that this hypothesis introduces a difficult problem, which Clark labels “A Brain Teaser.” He asks, “Just what is it that is so potentially impartial concerning its sources of order and information? The answer looks to be “the biological brain.” So haven’t we (rather deliciously) ended up firmly privileging the biological brain in the very act of affirming its own impartiality?” (p. 122, emphasis in the original). This ironic contradiction is presented with a flourish, as though it were a surprising result. In fact, the seeds of this brainteaser were sown a few pages earlier when Clark suggested, “Let us make the (surely uncontroversial) assumption that the biological brain is, currently at least, the essential core element in all episodes of individual human cognitive activity” (p. 118).

In order to protect the claim of extended mind from this internal contradiction, Clark is careful to distinguish the two meanings of assembly as two separable problems: First, who or what controls the assembly process? And second, where is the cognitive process when the assembled product is functioning? In the text he gives the control of the assembly process to the biological brain while defending the distributed and extended nature of assembled cognitive systems. Meanwhile, in parentheses and especially in footnotes, Clark shows a great deal of ambivalence about retreating to the biological brain.

He says,

… in rejecting the vision of human cognitive processing as organism bound, we should not feel forced to deny that it is (in most, perhaps all, real-world cases) organism centered. It is indeed primarily the biological organism that, courtesy especially of its potent neural apparatus, spins and maintains (or more minimally, selects and exploits) the webs of additional structure that then form parts of the machinery that accomplishes its own cognizing.[fn 18] … it is the biological human organism that spins, selects or maintains the webs of cognitive scaffolding that participate in the extended machinery of its own thought and reason.[fn 19] Individual cognizing, then, is organism centered even if it is not organism bound. (p. 123)

What is a reader supposed to make of the hedges provided in parentheses? The first parenthetical condition raises the question: are real-world performances sometimes not organism centered? The second asks: is it sometimes the case that something other than the biological organism spins the webs that are exploited by individuals? The footnotes answer both of these parenthetically posed questions in the affirmative. Footnote 18 says, “This is not to deny, of course, that much of the spinning is done by social groups of organisms spread out over long swaths of history” (p. 243). Footnote 19 says, “One difference [from spider webs] is that in the case of the webs of cognitive scaffolding, it is often the human organism acting in concert with existing webs of scaffolding that spins, selects, or maintains new layers of scaffolding, resulting in the powerful process that Sterelny (2004) dubs ‘incremental downstream epistemic engineering’” (p. 243).

Notice that what was initially presented as an all-or-nothing proposition is now a distributional question. How much spinning is done by social groups? How often is the spinning accomplished in concert with existing webs of scaffolding? Answering these empirical questions should be a central enterprise in cognitive science.

Clark plays to the traditional brainbound interests in the text while also leaving himself a small trapdoor (opened inconspicuously in parenthetical comments and footnotes) through which he might slip into a less brainbound future. As with the hedge “currently at least” in the claim “…the biological brain is, currently at least, the essential core element…,” Clark uses parenthetical comments and footnotes to point to the (as yet unrealized) possibility that something other than the brain could be responsible for the dynamic organization of cognitive systems.[fn 2] A straightforward way to deal with this situation is to abandon the assumption that the biological brain is the essential core element. Doing so, of course, requires that one look elsewhere for the apparently impartial forces that assemble cognitive systems.

2 Self-organizing systems? Yes, but where?

Two of the pieces needed to solve this puzzle are present in Clark’s exposition, but he does not assemble them into a coherent solution. These two pieces are the notion of self-organization and the roles of cultural practices in the organization of cognitive processes.

In a subsection titled “Anarchic Self-stimulation” Clark insists that the “inner executive” must be rejected. He approvingly quotes Dennett, who maintained that “the manipulanda have to manipulate themselves.” Discussing the gating of relations between verbal and gestural representation processes (the gating organizes cognitive processes) Clark says, “For the gating routines themselves may be just more experience driven microdemons added to the semianarchic mix: demons whose activity, though in some sense higher order, does not reflect the judgments of any highly informed inner homunculus monitoring or controlling the flow of thought and reason” (p. 135).

Here is internal self-organization. Clark continues the project of sacking the inner executive by listing positive views concerning human cognitive organization. One of these is “The flow of control is itself fragmented and distributed, allowing different inner resources to interact with, or call upon, different external resources without such activity being routed via the bottleneck of conscious deliberation or the intervention of an all-seeing, all-orchestrating inner executive” (pp. 136–137). Clark hints that the self-organization principle works as well at the level of extended systems as it does for internal systems. But he still thinks he is confronting a deficit of organizational causes. Clark says of this fragmented and distributed view, “It invokes an ill-understood process of “recruitment” that soft-assembles a problem solving whole from a candidate pool that may include neural storage and processing routines, perceptual and motoric routines, external storage and operations, and a variety of self-stimulating cycles involving self-produced material scaffolding” (p. 137).

A good start on understanding this process of recruitment would be to notice the role of cultural practices in the orchestration of soft-assembly of extended systems. Consider, for example, the “experience driven microdemons” that are proposed to control the gating of relations between verbal and gestural representation processes. What sort of experience can shape and drive such microdemons? Since verbal and gestural representations are prototypical constituents and products of cultural activities, the obvious yet overlooked answer is that such microdemons are driven and shaped by cultural practices. Certainly, one might correctly point out that the microdemons exist in the biological brain. And so, it might seem that the question is this: if such microdemons are formed by culturally organized activity, should the organizing control be attributed to the brainbound microdemons or to the cultural activities that create them, scaffold them, and hold them in place while they do their work? This, however, is a false choice. Posing the correct question calls for the next conceptual flip: a perspective in which both the constraints of cultural practices and the malleable internal microdemons can be seen as elements of a single adaptive dynamical system. After a brief flirtation with the self-organization of inner and outer cognitive ecosystems, Clark quickly retreats to the Hypothesis of Organism-centered cognition. “But the organism (and within the organism, the brain/CNS) remains the core and currently the most active element”[fn 3] (p. 139).

3 Cultural practices: present but not appreciated

As I read Supersizing the Mind I saw evidence of the role of cultural practices in every chapter. Cultural practices are the things people do and their ways of being in the world. A practice is cultural if it exists in a cognitive ecology such that it is constrained by or coordinated with the practices of other persons. Above all else, cultural practices are the things people do in interaction with one another. Virtually all external representations are produced by cultural practices. All forms of language are produced by and in cultural practices. Speaking is accomplished via discursive cultural practices. Reading and writing are cultural practices par excellence. The specifics of each language require its speakers to attend to some distinctions and permit them to ignore others. This “thinking for speaking” implies that even low-level perceptual processes are often organized by cultural practices. Cultural practices include particular ways of seeing (or hearing, or feeling, or smelling, or tasting) the world. Cultural practices are not cultural models traditionally construed as disembodied mental representations of knowledge. Rather they are fully embodied skills. Cultural practices organize the action in situated action. Cultural practices are emergent products of dynamic distributed networks of constraints. Some constraints may be internal and mental (some of these are perhaps consciously experienced, but most are implicit and affectively charged), some constraints arise from the mechanics and physiology of the body, some constraints may be provided by engagement with material artifacts and some from interactions with social others.

In Chap. 1, Clark summarizes the PEA, saying, “…embodied agents exploit the opportunities provided by dynamic loops, active sensing, and iterated bouts of environmental exploitation and intervention.” This account is correct, but it demands an examination of the role of cultural practices in the organization of both the processes of exploitation and the exploitable environments. Isolated embodied human agents probably do little of this exploitation without the shaping influences of culture.

According to Clark, this exploitation happens “on the spot,” but the constraints that determine which resources are exploited and how they are related to one another are not entirely formed “on the spot.” The “on the spot” phrase highlights the opportunistic nature of cognitive systems. However, without additional discussion, this wording may also bias the solution toward the biological brain by isolating the activity from the context of cultural historical processes. For example, few of the dynamic loops that link people to their environments are invented by the people who exploit them. Rather, the ability to establish and maintain such loops is acquired via participation in culturally organized activities with other people.

Cultural practices shape active sensing and ways of seeing the world by highlighting what to attend to and what to see when so attending. Clark mentions the activity of seeing a star. A far more interesting example is seeing a constellation, since a constellation exists only by virtue of someone enacting it via a cultural practice that allocates visual attention in a particular way (Hutchins 2008).[fn 4] For humans, the environment that is to be exploited consists almost entirely of products of previous cultural activity, much of it having been produced by the culturally orchestrated environmental interventions of self or others. On average, each individual human’s cumulative lifetime contribution to the store of ways of exploiting cognitive environments is negligible. Every one of Clark’s descriptions of the transformative effects of language is an example of cultural practices orchestrating the organization of an ecological assembly. In Chap. 7 Clark makes the important point that “surrogate situations” (external models and representations of the world) are themselves worlds with which the brain and body can establish productive cognitive interactions. Of course, every such surrogate situation is the product of prior cultural practice, and every ongoing interaction with a surrogate situation is orchestrated by cultural practices. In an extended discussion of Noë’s Strong Sensorimotor Model, Clark shows why it must be the case that the brain can entertain representations that are not tightly tied to the specifics of the organism’s sensorimotor apparatus. As examples he cites “skills of sifting, sorting, classifying, selecting, choosing, reidentifying, and comparing” (p. 179). While some low-level versions of some of these may be innate, in adult human cognition these skills are mostly enacted in cultural practices.

Clark hints at the richness of the buildup of cultural practices when he says, “Developmental investigations … strongly suggest that space, classification, and language are made for each other, with spatial indexing of various forms … playing a major role in the learning of language, and language itself … playing a cognitive role very similar to that of space” (p. 66). This passage suggests the dynamics of a rich cognitive ecosystem in which the cultural practices of language learning interact with the resources of spatial processing to produce the skills of classification. The use of the various structures of language to organize thinking is the canonical example of ecological assembly. When undertaken jointly, it is also the best-known example of the power of ongoing cultural practice to organize thinking. Clark comes closest to seeing and articulating the role of cultural practices in the section in Chap. 4 titled “Epistemic Engineers” where he discusses Sterelny’s idea that theory of mind arises from cultural practices. “This explanatory strategy thus depicts much of what is most distinctive in human cognition as rooted in the reliable effects, on developmentally plastic brains, of immersion in a well-engineered, cumulatively constructed cognitive niche” (p. 68). This quote contains the key ideas that the brain is a plastic medium and that cultural practices may give that medium its shape. Unfortunately, Clark seems unable to hang on to this important idea. He goes on to use Sterelny’s analysis in support of the idea that assembled cognitive systems may be extended, while skipping over the implication of Sterelny’s argument that the assembly process itself is extended and orchestrated by the cultural practices that constitute the cognitive niche.

These examples illustrate how human cognition makes use of culturally constructed assemblies (products) and how the original construction process is orchestrated through the joint participation in cultural practices with social others. However, in spite of presenting this evidence of the roles of cultural practices, Clark fails to frame the discussion in a way that makes this point apparent.[fn 5] I believe this is because of his use of some currently fashionable ways of speaking that render the roles of culture in the organization of cognition invisible. I offer a short list of simple claims about the cultural context of cognition as an antidote to the assumptions underlying these ways of speaking.

3.1 The cultural world is dynamic

One of the key problems created by Clark’s descriptions of the interactions of “brain, body, and world” is that the world is conceived as primarily static, not dynamic. If the world is static, then the dynamics of cognition must be provided by the biological brain and body. The most frequently mentioned example of extended cognition is that of Otto and his notebook. Otto suffers from a memory disorder, but can function by offloading memory, writing things in the notebook. In this example, the world (the notebook) is static, unless modified by Otto, and asocial, whereas the cultural world experienced by humans is pervasively dynamic and social.[fn 6] Because the cultural world is dynamic, including as it does the dynamic activities of social others, the brain and body of a focal individual are not the only possible sources of dynamic organizing processes.

3.2 Cultural practices are not simply mental representations

Clark says, “The cultural transmission of knowledge and practices resulting from individual lifetime learning, when combined with the physical persistence of artifacts, yields another source of potentially selection-impacting feedback” (p. 62). This is useful because it indicates some realization of the importance of cultural practices in organizing the selection of resources in cognitive systems. Unfortunately, this small step forward is accompanied by a step backwards. The “transmission of knowledge” framing can be read to imply that cultural practices are bits of internally represented knowledge acquired by individuals via individual learning. Cultural practices often include internal representations, but, as described above, it is a mistake to identify the cultural practice with mental representations. Doing so is another way of implicitly granting the organization of ecological assemblies to the nervous system via “organism centered cognition.”

3.3 Ecological assemblies can be organized by coordination with social others

Citing Donald (2001), who describes acquisition of driving skill through self-teaching, self-rehearsal, and self-evaluation, Clark says, “the human agent, one might say, is nature’s expert at becoming expert.” There is an important grain of truth in this claim; humans are masters of meta-cognition, but again the framing invites an image of a person becoming expert in isolation. Self-rehearsal surely happens sometimes, but when? The self-rehearsal of high-level skills such as those that comprise driving is part of a cultural practice: a spinoff of joint rehearsals of those same skills. The criteria and the processes for self-evaluation of progress toward expertise are always culturally established.

This point is brought into high relief when one considers interactions between human and nonhuman primates. Clark’s discussion, in Chap. 3, of the behavior of the chimp Sheba could be improved by noting that every demonstration of chimpanzee cognitive capacities in captivity involves the animals acquiring the ability to engage in cultural practices with their keepers (and sometimes with their fellow chimps) (Johnson and Karin-D’Arcy 2006; Hutchins 2008). Learning number symbols and interpreting number symbols are cultural practices that the chimps engage in with their keepers. These practices are necessarily grounded in the social relations that have been established between the chimps and their keepers. Engaging in this cultural practice clearly depends on skills that the chimp learns. The chimp must acquire new internal resources that are recruited by engaging in the cultural practice. But does anyone imagine that the chimp is wholly responsible (organism centered) for the organization of the practices that it engages in while interacting with its keepers? The cultural practices do not exist entirely inside the chimp, nor do they exist entirely inside the keepers.[fn 7] The organization of these cultural practices emerges from the enactment of relations among resources that are inside and outside both the human and the nonhuman participants.

The embodied perspective at the center of Clark’s presentation implies a previously underappreciated richness to social interaction. This has the consequence of providing unexpected forms of embodied learning in interaction, some of which imply the organization of cognitive systems by social others in more profound ways than either instruction or imitation. Jointly engaging in embodied cultural practices gives rise to behavior shaped by complementarity of action (Hutchins and Johnson 2009). Some of the constraints on the organization of joint action come from resources and structures that are not internal to the cognitive agent. Thus, real world skill learning typically provides good examples of assembly (process) that is controlled not solely by the biological brain, but by interactions with the organized activity of social others as well.

Clark’s Hypothesis of Organism-centered cognition claimed that “the organism (and within the organism, the brain/CNS) remains the core and currently the most active element” (p. 139). Clark retreated here when he could not find an alternate source of active organizing processes. This retreat was forced by Clark’s use of popular, but misleading, ways of thinking about culture. If culture is reduced to mental representations, then it cannot counter the Hypothesis of Organism Centered Cognition because mental representations are parts of the organism. If culture is reduced to a collection of lifeless artifacts, then it cannot counter the Hypothesis of Organism Centered Cognition because it contains no active dynamic processes. If we recall, however, that cultural practices are the things we humans do together, then cultural practices, which have their own dynamics and transcend the boundaries of individual organisms, can contribute organization to Clark’s “ill-understood” recruitment process.

4 Enculturating the Supersized Mind

When Clark combined the question of who or what is responsible for the organization of ecological assemblies with Dennett’s notion of self-organizing “manipulanda that manipulate themselves” he found himself staring into an organizational abyss. He seems to have proposed the Hypothesis of Organism Centered Cognition as a sort of safety barrier to prevent us falling into the void. Clark says it is a mystery how the organism produces this “ill-understood process of recruitment,” yet in footnotes and parenthetical comments he expresses doubt that the organism alone can do it.

This is not a very satisfying position. However, if one makes the mental flip I proposed at the beginning of this essay, new processes come into focus. In particular, what may have appeared to Clark as a static organizational void surrounding the isolated human organism only looked empty because cognitive science has adopted ways of speaking and thinking that render cultural practices invisible. With the entire dynamic cognitive system in view, the scaffolding of brainbound thinking can be removed. Cultural practices clearly contribute a great deal to the organization of ecological assemblies. Exactly how much they contribute remains an empirical question. As Clark notes, we need a lot more careful documentation of real world cognitive systems. My experience of more than 30 years studying cognition in the wild (Hutchins 1980, 1995, in press) leads me to believe that cultural practices supply much of what is needed to account for the organization of human cognitive systems. In this perspective, the brain appears as a special super-flexible medium that can form functional subsystems that establish and maintain dynamic coordination among constraints imposed by the world of cultural activity, by the body, and by the brain’s own prior organization. The brain has causal powers, but when it comes to human cognition, most of the causal powers of the human brain derive from previous experience in cultural practices. In order to spur the program forward, I propose the hypothesis of enculturated cognition: The ecological assemblies of human cognition make pervasive use of cultural products. They are always initially, and often subsequently, assembled on the spot in ongoing cultural practices. With Supersizing the Mind, Clark has delivered the sciences of mind to a prospect from which the field can turn from the tunnel vision of brainbound thinking to the panorama of the enculturated Supersized Mind.