Journal of Cognitive Semiotics, IV(1): 225-250. http://www.cognitivesemiotics.com.
Address for correspondence: Ruhr-Universität Bochum, Institut für Philosophie II, Universitätsstr. 150, 44801 Bochum, Germany; lcdebruin@gmail.com.

Leon de Bruin* and Sanneke de Haan∫
∫Academic Medical Center, University of Amsterdam
*Ruhr-University, Bochum, Germany

Enactivism & Social Cognition: In Search of the Whole Story

Although the enactive approach has been very successful in explaining many basic social interactions in terms of embodied practices, there is still much work to be done when it comes to higher forms of social cognition. In this article, we discuss and evaluate two recent proposals by Shaun Gallagher and Daniel Hutto that try to bridge this 'cognitive gap' by appealing to the notion of narrative practice. Although we are enthusiastic about these proposals, we argue that (i) it is difficult to see them as continuous with the enactivist notion of direct coupling, and (ii) the failure to account for folk psychological action interpretation suggests that the enactive approach should adopt a broader notion of coupling.

Keywords: social cognition, enactivism, narrative practice, embodied cognition, cognitive gap

INTRODUCTION

For a long time now, it has been common practice to explain social cognition under the heading of 'theory of mind'. Discussions of theory of mind are, almost without exception, dominated by two main approaches: theory theory (TT) and simulation theory (ST). According to TT, social cognition depends on a folk psychological theory that specifies how mental states (in particular, beliefs and desires) interrelate and give rise to intentions and actions. Churchland (1986: 299) describes this theory as a 'rough-hewn set of concepts, generalizations, and rules of thumb we all standardly use in explaining and predicting human behavior'. There are different stories about how human beings acquire such a folk psychological theory.
The 'modular' subdivision of TT argues that the theory is acquired through an innately specified, domain-specific mechanism: a 'theory of mind module' (Fodor 1983, Baron-Cohen 1995, Leslie 1994).1 When it comes to the ontogenetic development of this module, some advocates of the modular view have suggested that it is already in place from the moment of birth. Fodor (1995), one of the champions of the modular view, suggests that 'the child's theory of mind undergoes no alteration; what changes is only his ability to exploit what he knows' (110). However, there are also versions of modular TT that are committed to a less substantial innate component. For example, Garfield et al. (2000) emphasize the importance of developmental processes in proposing that theory of mind is an 'acquired module' (502), shaped by interactions with the (social) environment. Scientific TT (or STT) downplays the importance of an innate module even further. It claims that, with the exception of a select number of specific theoretical principles, theory of mind is not innate but acquired through a course of development: children develop their everyday knowledge of the world by using the same cognitive devices used in science. They proceed like little scientists, testing and revising their hypotheses about intersubjectivity in the light of new evidence (Gopnik & Meltzoff 1997; Gopnik & Wellman 1992, 1994). Therefore, STT is also nicknamed 'the child-scientist hypothesis'. Simulation theory (ST) is usually portrayed as the main rival of TT. What the early papers on ST (Gordon 1986, Heal 1986, Goldman 1989) had in common was a strong desire to move away from the over-intellectualized picture of social interaction offered by TT, which demanded 'a highly developed theoretical intellect and a methodological sophistication rivalling that of modern-day cognitive scientists.
That is an awful lot to impute to the four-year-old, or to our savage ancestors' (Gordon 1986: 71). ST was proposed as a solution to the problem of theory in TT, and as such posed a direct challenge to the latter. Its main claim is that, during their everyday intersubjective encounters, people use their own minds as an 'internal model' to understand the minds of others. By putting themselves 'in the other's shoes', they simulate how they will proceed (or would have proceeded) under the same circumstances, while making adjustments for the relevant differences. This proposal is different from TT because it is 'process driven' instead of 'theory driven' (cf. Goldman 1989). According to Goldman, human beings are capable of accurately simulating a 'target system' (another human being) even if they lack a theory, so long as their initial mental states are the same as those of the target system and the process that drives the simulation is the same as (or relevantly similar to) the process that drives the system (that is, their own system). Despite these differences, many ST and TT approaches share a commitment to the idea that social cognition is primarily about mindreading.2 First of all, this means that they uncritically accept the claim that 'our basic grip on the social world depends on our being able to see our fellows as motivated by beliefs and desires we sometimes share and sometimes do not... social understanding is deeply and almost exclusively mentalistic' (Currie & Sterelny 2000: 145-146). Moreover, ST and TT commonly assume that intersubjective capacities are primarily recruited for the purpose of behaviour prediction and explanation. A third commonality is that these premises usually entail a third-person perspective on social cognition.3 The question is whether such a third-person, mindreading approach to social cognition is able to capture the very practical, embodied nature of what goes on in many of one's social engagements with others. Recent enactive approaches to social cognition (cf. Hutto 2004, 2008; Reddy & Morris 2004; Iacoboni 2006; Ratcliffe 2007; Gallagher & Zahavi 2008; Fuchs & De Jaegher 2009) think this is not the case. They argue that most dealings with others should be explained in terms of second-person embodied practices which involve various capabilities – imitation, intentionality detection, gaze following, social referencing, etc. – but do not depend on mindreading. These practices constitute the baseline for social understanding – what Bruner & Kalmar (1998) call a 'massively hermeneutic' background – and pave the way for more advanced forms of social understanding.

1 Tooby and Cosmides (1995: xvii), for example, claim that 'humans everywhere interpret the behavior of others in... mentalistic terms because we all come equipped with a "theory of mind" module... that is compelled to interpret others this way, with mentalistic terms as its natural language'.

2 However, there are also proponents of ST who explicitly reject the elements characteristic of mindreading. Gordon (1995: 53), for example, argues for a kind of 'radical simulation' that does not involve: (i) an analogical inference from oneself to others, (ii) introspection-based attributions of mental states to oneself, and (iii) the prior possession of the concepts of the mental states ascribed. Clearly, this gives a much more parsimonious picture of social interaction. At the same time, however, Gordon's account is not without problems (cf. Gallagher 2007).

According to proponents of the enactive approach, there are two ways in which embodied practices are primary to the more reflective modes of social understanding promoted by TT and ST.
In the first place, they involve social abilities that come earlier in development and may be partially innate. Gallagher (2001), for example, claims that 'before we are in a position to theorize, simulate, explain or predict mental states in others, we are already in a position to interact with and to understand others in terms of their gestures, intentions and emotions, and in terms of what they see, what they do or pretend'. Secondly, embodied practices are also primary in the sense that they continue to characterize most interactions with others, and remain the default mode for understanding them. We think that advocates of enactivism are right in emphasizing the importance of second-person practices, and we agree with their claims about the ontogenetic and pragmatic priority of these practices. At the same time, however, we feel that they still have to offer a convincing story about the kind of high-level folk psychological interpretation that has traditionally been the focus of TT and ST. So far, most of them have directed their arrows at the presumed scope of folk psychological interpretation; but the arguments put forward are usually to the effect that there is no job description for mindreading in social interaction. Because no attention is paid to the possibility of folk psychological interpretation of a non-mindreading kind, this threatens to throw the baby out with the bathwater.

3 Hutto (2004) argues that the tendency to postulate complicated mindreading routines in fact reveals a theoretical bias, since it is assumed that the processes involved in basic acts of recognition tacitly mimic those of mature reasoners who would tackle the same problem using a set of abstract concepts and general principles so as to make explicit inferences. According to Hutto, people are systematically misled on this score because 'in the very act of classifying such behavior we must employ our own conceptual scheme of reference. But it is nothing more than an intellectual bias to suppose that, for example, young children or animals must be tacitly employing it' (557).

Recent proposals have tried to address this lacuna by appealing to the notion of narrative practice. The main aim of this article is to discuss and assess the accounts that have been put forward by Shaun Gallagher and Daniel Hutto (Gallagher 2003; Gallagher & Hutto 2007; Hutto 2007, 2008). We argue that, although these narrative proposals look promising, it is hard to fit them into an enactivist framework. This is not just because they have a number of serious shortcomings. Rather, we think that the failure of these proposals to explain interpretation of folk psychological action from an enactive perspective is symptomatic of a general problem for enactivism: how to account for more advanced, remote, and abstract forms of social cognition. De Jaegher and Froese (2009: 439) refer to this problem as 'the cognitive gap'; they argue that the biggest challenge for enactivism is 'to show how an explanatory framework that accounts for basic biological processes can be systematically extended to incorporate the highest reaches of human cognition.' Although we are not going to take up this challenge in this article, we will say something about what would be required for enactivism to account for the 'highest reaches' of social cognition. In this section, we have presented a brief overview of the traditional TT and ST accounts of social cognition. In the next section, we take a closer look at the enactive approach: in particular, the idea that social cognition can be understood in terms of embodied interactions between (directly) coupled systems. We argue that the notion of embodied practice, thus understood, provides a basic but solid foundation for everyday intersubjectivity; we appeal to recent empirical findings to support this claim.
In the third section, we introduce an important argument often made against TT and ST by proponents of the enactive approach: the argument from phenomenology. We intend to show that, although the argument certainly has some force, it is not conclusive and (when taken too far) actually obscures the above-mentioned problem of the 'cognitive gap': that is, it encourages one to ignore the question of how to account for folk psychological action interpretation from an enactive point of view. The fourth section discusses Gallagher's articulation of narrative practice in some detail. We think that Gallagher takes a good first step towards an embodied account of high-level action interpretation, but he does not address the ability to make sense of others in folk psychological terms. Moreover, his account includes a number of elements that, prima facie, seem incompatible with enactivism insofar as it focuses exclusively on direct coupled systems. In the fifth section, we comment on Hutto's 'narrative practice hypothesis' (NPH), which is specifically designed to capture the folk psychological understanding of others. Although we very much agree with Hutto's emphasis on the sociocultural, second-person nature of folk psychology, we show that his folk psychological narratives still betray a substantial commitment to the traditional belief-desire model of action interpretation. It seems hard to reconcile this with the kind of enactivism usually endorsed in discussions about social cognition. As a first step towards a possible solution, in the sixth section we propose that enactivism adopt a broader notion of coupling: one that does not fully rely on direct interactions between embodied systems, but instead allows for a continuous process of decoupling and re-coupling. 

ENACTIVISM AND EVERYDAY EMBODIED PRACTICE

Enactivism

The aim of enactivism is to study the mind in a way that does justice to human experience while remaining scientifically sound. In order to do so, it combines insights from disciplines such as biology, dynamical systems theory, and the phenomenological tradition. Enactivism has been proposed as an alternative to cognitivism, according to which the mind is basically an information-processing system manipulating mental representations of the outside world by means of explicit rules. Mental representations have content by virtue of their ability to correspond with (things in) the world. At the same time, however, mental representations have a functional structure that makes it possible to study them independently of their contents. Consequently, cognitivists maintain that cognition can (to some extent) be studied in isolation from the world in which it is embedded. Enactivism questions the representational framework put forward by cognitivism and instead proposes to view cognition as a kind of embodied action. Varela et al. (1991), for example, explain this as follows: 'by using the term embodied we mean to highlight two points: first that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological, and cultural context. By using the term action we mean to emphasize once again that sensory and motor processes, perception and action, are fundamentally inseparable in lived cognition' (1991: 172-3).4 In other words, proponents of the enactive approach emphasize that cognition is not so much about passive information processing, but rather should be understood in terms of active sense making.
Enactivism focuses on a dynamic coupling between world and organism; it argues for a notion of cognition as 'the enactment of a world and a mind on the basis of a history of the variety of actions that a being in the world performs' (1991: 9). When the basic principles of enactivism are applied to social cognition, sense making becomes participatory sense making (De Jaegher & Di Paolo 2007). The proper unit of analysis is not the individual (let alone the individual brain) but rather the coupled system as a whole, including the participants, their dynamic interactions, and the context in which these interactions take place. Characteristic of such a coupled-system view of social cognition is its emphasis on reciprocal interaction and recurrent feedback loops. It is the active process of engaging with others that gives rise to an understanding of them. This is why the enactive approach is very much in line with a second-person methodology.

4 Note that according to this definition, the notion 'embodied' already includes embeddedness as well.

Everyday embodied practice

What is attractive about a second-person approach is that it offers enactivism a way to explain everyday social interactions in a direct fashion: that is, without appeal to mental-state management or mindreading procedures. Gallagher (2008) articulates a notion of direct perception to emphasize this point. He argues that 'in seeing the actions and expressive movements of the other person in the context of the surrounding world, one already sees their meaning; no inference to a hidden set of mental states (beliefs, desires, etc.) is necessary. When I see the other's action or gesture, I see (I immediately perceive) the meaning in the action or gesture. I see the joy or I see the anger, or I see the intention in the face or in the posture or in the gesture or action of the other' (2008: 542).
Most of the time, according to Gallagher, direct perception delivers all one needs to know in order to interact with others and make sense of them. We think Hutto (2008) is right to claim that it is more accurate to say that humans are directly moved by another's psychological situation than that they directly perceive it. Hutto certainly agrees with Gallagher about the directness of intersubjective engagements and Gallagher's rejection of conceptual or inferential interventions. Hutto claims that 'we react directly to the attitudes of others as expressed bodily and we do so because of our natural predisposition, some of which gets reformed by experience and enculturation.... Our nonverbal acts of intersubjective responding are not prosecuted by the deployment of theory, inferential reasoning, or projective simulation' (115). The idea that people normally have a direct understanding of others – that they do not have to attribute mental states to others – is perfectly compatible with the notion of a coupled system. Ratcliffe (2006) argues that social interaction is 'seldom, if ever, a matter of two people assigning intentional states to each other.... Self and other form a coupled system rather than two wholly separate entities equipped with an internalized capacity to assign mental states to the other' (31). One of the main objectives of the enactive approach to social cognition has been to demonstrate that many basic interactions with others can be explained in terms of directly coupled systems. To give an impression of the progress that has been made in this area, we will briefly discuss some interesting findings in developmental psychology on both 'primary intersubjectivity' and 'secondary intersubjectivity' (Trevarthen 1979). Primary intersubjectivity is the label for those innate or early developing sensory-motor (emotionally informed) capabilities that provide a direct form of social understanding.
For example, various developmental studies point out that neonates are not only able to discriminate between themselves and their environment (e.g., Rochat & Hespos 1997), but also respond selectively to other human agents. Despite not yet having acquired the appropriate concept of agent or face, they differentiate effectively between agents and non-agents, faces and non-faces. Meltzoff and Moore (1977, 1994) have shown that neonates are already capable of picking out a human face from the crowd of objects in their environment, with sufficient detail to enable them to imitate the expression they see on that face. Although neonates' first imitative attempts lack a high degree of accuracy, neonates are able to correct and improve their gestural performance over time. This allows them to increasingly fine-tune their interactions with others. As Meltzoff and Moore (1994) point out, there is an interesting developmental change in the expression of imitative behavior. Although the ability to imitate is present from the moment of birth, infants need a lot of practice to pull off the more advanced modes of imitation that come later in development. Neonates imitate novel acts. Research on older infants reveals a generative imitation of novelty that is beyond the scope of younger infants (Bauer & Mandler 1992, Barr et al. 1996). One could say that there is a progression in imitation from pure body actions, to actions on objects, to using one object as a tool for manipulating other objects. What all of these forms of embodied responsiveness demonstrate is that neonates and young infants are perfectly capable of interacting with others in a dyadic way; but primary intersubjectivity alone does not allow them to interact with other agents in a world-involving way, nor does it provide them with an understanding of others in the pragmatically contextualized situations of everyday life.
Such an understanding only starts to emerge when infants enter the realm of 'secondary intersubjectivity'. The embodied practices characteristic of secondary intersubjectivity are triadic: they involve a referential triangle of child, adult, and environment: an outside object or event to which they jointly attend. As Hobson (2002) points out, the defining feature of this shared attention is that 'an object or event can become a focus between people. Objects and events can be communicated about.... The infant's interactions with another person begin to have reference to the things that surround them' (62). From six months of age onwards, infants are capable of perceiving others as directing attention towards objects – first in their grasps of objects, later when they gaze and point at (distant) objects (Woodward 2005). 'Real' shared attention has been located around the age of twelve months. Some psychologists regard shared attention as the first sign of infants having (1) 'any understanding of the internal structure of intentional actions' (Tomasello et al. 2005: 678-9), and (2) an ability to interpret others' actions as 'goal-directed' (Gergely & Csibra 2003: 288). Reddy (2003) points to a very interesting lacuna in this line of thinking. She argues that shared attention is usually defined as involving shared attending to an outside object. However, long before infants are able to do that, they are aware of situations in which they themselves are the object of attention. Reddy argues that a second-person approach and real interaction with the infant are required to discover these kinds of abilities. From an enactive point of view, it is important to provide the right interpretation of these findings – one that acknowledges their interactive and direct nature. Take the imitation studies mentioned above. Tomasello (1999: 195) has suggested that the results from these studies indicate young infants to be 'imitation machines'. 
Such terms invite a mechanical, almost reflex-like description of what goes on.5 Imitation is better characterized as an embodied responsiveness that stresses the (slight) modulations each participant brings to bear in her interactions. Imitative behavior allows participants to mutually tune in to rather than merge into each other. The individual modulations attest to the autonomy of the participants: for perfect contingency you only need a mirror, but for genuine social interaction you need another person. As De Jaegher and Di Paolo (2007) remark, participatory sensemaking is only participatory so long as the participants remain autonomous. Otherwise, it is merely one person forcing a sense upon another: a one-way interaction (see also Fuchs & De Jaegher 2009).6 Research shows that infants from three months on prefer these slight modulations in their embodied responses – time-delay for example – over perfect contingency. The exception is autistic children, who continue to prefer a perfect mirroring (Gergely 2001). Again, such a perfect contingency displays only one's own agency; a non-perfect contingency suggests the influence of another person and thus interpersonal contact. Given that normal infants are still developing their sense of agency during this period, it seems plausible to argue that they are mostly interested in finding out what they themselves can bring about. That said, as soon as their sense of agency has reached a certain level of sophistication, a pure reflection of their own deeds probably becomes a bit boring – especially compared to the novelty introduced by interaction with another person. Autistic children, however, continue to prefer perfectly contingent feedback to modulated feedback.
Gergely (2001) explains autistic children's preference for perfectly contingent over modulated feedback in terms of a 'faulty switch' in a 'contingency detection module', which leads to symptomatic difficulties in social interactions. Although there is still an ongoing debate on the underlying mechanism(s) of autism, we are skeptical whether the assumption of modules will push understanding any further. Given autists' difficulties with social interaction and novelty, it is not surprising that they find the suggestion of another person and the possibility of interpersonal contact less attractive.

5 The issue is not merely terminological. On the contrary: it has important empirical ramifications. For example, on a mechanical view, it is much harder to explain why infants are more likely to imitate after the experimenter has attended to them – as Csibra & Gergely (2009) have shown recently. This highlights the relevance of the enactive approach to scientific experimentation, precisely in the sense Reddy & Trevarthen (2004) propose.

6 That is, the notion of social interaction as a coupled system, by itself, is not sufficient to explain what happens in cases of imitation. What is needed is a definition that safeguards the autonomy of each participant. De Jaegher and Di Paolo put forward such a definition, characterizing social interaction as 'the regulated coupling between at least two autonomous agents, where the regulation is aimed at aspects of the coupling itself so that it constitutes an emergent autonomous organization in the domain of relational dynamics, without destroying in the process the autonomy'.

THE APPEAL TO PHENOMENOLOGY

Enactive approaches are frequently promoted by philosophers taking their inspiration from the phenomenological tradition (e.g., Zahavi 2005, Gallagher & Zahavi 2007). They argue against (mentalistic) TT or ST interpretations of embodied practices from a phenomenological point of view: if mindreading processes are primary, pervasive, and explicit in everyday social interaction, then one might expect them to show up in everyday experience – but they do not. On the contrary: if one looks at the phenomenological evidence, one must conclude that everyday encounters with others tend to be second-person and interactive. This is what Gallagher (2004: 202) calls 'the simple phenomenological argument'.7 Proponents of TT and ST usually parry by admitting that human beings do not consciously or explicitly employ theoretical principles or simulation routines during social engagements. If mindreading is something done in a tacit way, then what is experienced (or seemingly experienced) during social interaction is not a good guide for what is 'really' happening, and the appeal to phenomenology is inappropriate. Goldman (2006), for example, defends his version of ST against phenomenological objections by claiming that most simulation procedures are 'simple, primitive, automatic, and largely below the level of consciousness' (113). He argues that the phenomenological argument cannot be used to characterize social experience in a positive way (i.e., as 'enactive'), because phenomenological properties are elusive, 'incapable of supporting weighty thesis', hard to agree upon, and 'hotly disputed' (249). We think Goldman actually has a point: perhaps the simple phenomenological argument is a bit too simple. Although the argument does important work restricting the scope of TT and ST claims with respect to the personal level of social interaction, it is certainly not decisive.
Gallagher (2004) recognizes this, admitting that introspective reports about the phenomenology of everyday social interaction can be notoriously suspect guides to what people are doing at the conscious level.8 He explicitly does not rule out the possibility that one sometimes engages in the kind of 'specialized' procedures promoted by TT and ST; but he claims that this happens only in those cases in which everyday second-person interactions break down, or where one has problems understanding the other person. 'Such specialized cognitive approaches do not characterize our primary or everyday encounters with others' (2004: 202).9 Still, an all-encompassing enactivist explanation of social interaction must be able to account for these exceptional situations as well. Moreover, the question is precisely to what extent these situations of social misunderstanding are indeed exceptional. Here, the simple phenomenological argument obscures an important issue. The everyday experience of others is usually not a good criterion to assess theories of social interaction, because many social skills – especially the ones used very frequently – are attentively recessive and do not take centre stage in conscious awareness. This was already observed by William James (1890), who wrote: 'it is a general principle in psychology that consciousness deserts all processes where it can no longer be of use' (496). Of course, one does not want a theory of social interaction that postulates all kinds of unconscious processes, ultimately justifying this by claiming that these processes are 'just' innate. This seems nothing more than an excuse for any real understanding.

7 See Ratcliffe (2007: 23) for a similar argument.

8 Gallagher thinks that an appeal to social phenomenology should go beyond an appeal to good-old introspection: i.e., subjective reports about everyday social encounters. He proposes to use phenomenology in its technical (Husserlian) sense: that is, as a strict method for the analysis of the common structures of experience (cf. Gallagher & Varela 2003; Gallagher & Brøsted Sørensen 2006).

9 See also Gallagher (2001: 85), where he acknowledges that people sometimes understand others by enacting theoretical attitudes or simulation, while observing that 'such instances are rare, however, relative to the majority of our interactions'.

We think that a solution to this dilemma can be found in a clearer focus on the development of social interaction. As the quote by James indicates, although the automatization of highly developed skills might be a psychological commonplace, this does not mean that they have not been consciously learned at some point in development. Take the ability to walk, for example. As I walk out to the beach – to borrow Gallagher's (2006) example – I am not normally conscious of how I activate my leg muscles. More generally, as I move through the world, I do not normally monitor the specifics of my motor action in any explicitly conscious way. Yet I did learn how to walk in early infancy (at least in a normal developmental scenario), and this required me to attend explicitly to my movements. What the simple phenomenological argument seems to ignore is that social learning happens primarily in those situations in which understanding of others is not smooth and direct. De Jaegher (2009: 540) points this out when she says that 'failures in understanding another's behavior are not exceptional. On the contrary, they form part and parcel of the ongoing process of social understanding. More even, misunderstandings are the pivots around which the really interesting stuff of social understanding revolves.
In these instances where coordination is lost, we have the potential to gain a lot of understanding'.10 The moral here is that the simple phenomenological argument should be handled very carefully. Enactivism still has a job to do when it comes to explaining the folk psychological action interpretation that has traditionally been the focus of TT and ST. Furthermore, as the above considerations make clear, one cannot simply claim on phenomenological grounds that most of the work has already been done, since this kind of action interpretation is exceptional. The challenge for enactivism is to offer a convincing story about these more advanced modes of social understanding; in order to do so, it must go beyond the embodied practices discussed in the previous section. Recently, Gallagher has taken up the challenge by articulating a notion of narrative practice. Acknowledging that primary and secondary intersubjectivity are necessary but not sufficient for a full account of human intersubjectivity, he claims that there is 'much more to say about the role of language and narrative competency in a fuller account of intersubjectivity' (Gallagher 2007: 75). In the next section, we take a closer look at his proposal. NARRATIVE PRACTICE, ACCORDING TO GALLAGHER Gallagher's account of narrative practice is very much in line with his enactive approach to the embodied practices discussed in the previous section. The idea is that narrative competency enables 'a direct interpretation of the other's actions and intentions, without the mediation of folk psychology' (Gallagher & Hutto 2008).
10 She further observes that 'where the ongoing flow of interacting and coordinating breaks, opportunities for redirecting sense-making open up. On such occasions, one of the things we can do is to attempt to repair our interaction, for instance through a re-attunement of movements and or utterances' (ibid).
It allows for a context-sensitive, nuanced, and sophisticated understanding of others that cannot be captured in terms of TT or ST. An important question is how narrative practice provides such a 'direct interpretation' of others. A crucial feature of a narrative is its concern with the concrete and the particular. This is where it significantly differs from a theory. According to proponents of TT, as seen above, understanding of others is facilitated by a folk psychological theory that deals with the universal, abstracting away from particular contexts towards descriptions of the way the world tends to be in general. If Bruner (1986) is right, a narrative does exactly the opposite: it takes context to be primary in the determination of meaning, since it deals with particular, concrete situations. A narrative is always situated: it must be interpreted in light of a specific discourse, to cue interpreters to draw inferences about a structured time-course of particularized events. As a result, narrative practice has the potential to offer a kind of practical or applied understanding of behavior that functions very differently from a theoretical one. At the same time, narrative practice provides the means to reach beyond what is immediately given. Language represents what is not currently present. From an enactive perspective, it is natural to promote a pragmatic, practical take on language. Language is for use; it develops through use. Many theorists have made a case for the enactive, embodied, and embedded grounding of language. Mead (1962) already proposed that language develops out of gestures; along the same lines, Heal (2005) recently suggested that language provides an immensely delicate and useful way of pointing. 
According to Tomasello (1999), people use linguistic symbols primarily because of their perspectival nature, which allows them to induce others to view the world in one way rather than another: i.e., in a way that goes beyond the direct perceptual or motoric aspects of a given situation. He argues that when a child 'internalizes' a linguistic symbol and comes to understand the human perspective embodied in that symbol, she learns that a particular situation may be construed intentionally in a way that is convenient for the purpose at hand: that is, she begins to comprehend that, by using a particular symbol, she intends for another agent to pay attention to some specific aspect of their shared environment. In this way, language helps children to take an outside perspective on what they are doing; it allows them to distance themselves, in a way that is both literal and figurative, from their immediate impressions. This is not exclusively a linguistic affair. On the contrary: the increase in distancing goes hand in hand with the development of embodied practices that can no longer be described in terms of direct couplings simpliciter, since they involve couplings in which the original object of interaction is replaced by something else. This is what happens in cases of pretend play, for example, when children pretend that a thing might actually be something else. Leslie (1987) has shown that, by two years of age, children are already able to use a banana as if it were a telephone. The child might pick up a banana, hold it to her ear and mouth, and say: 'Hi. How are you? [Brief pause.] I'm fine. OK. Bye.' Lillard (2002) gives an example of a child who is pretending that a pile of sand is a fantastic chocolate cake, calling it cake, mimicking eating it, and saying, 'yum-yum, what delicious cake!' 
What is important, according to Lillard, is that the child does not actually eat the sand, clearly aware of its real identity all the while that she treats it as if it were something else. What happens in these interactions is that the child simultaneously considers one object (or person) as having two different identities. As Perner (1991) puts it, the child no longer directly represents the world 'as it is', but instead learns to entertain 'multiple mental models'. Language can be seen as a crucial facilitator of this development. We are not very happy with Perner's terminology, but that should not obscure what is important here: that during development children increasingly learn to distance themselves from the here and now. This process of distancing is central to the ontogeny of narrative practice; Gallagher's account of how children learn to understand others in terms of narrative seems to confirm this. According to Gallagher (2003), narrative practice depends on a number of precursors. In what follows, we will briefly discuss these precursors and show that they are representative of a development that cannot be described in terms of direct coupling. Before they are able to tell stories, children first need to master the so-called 'internal' time frame of a narrative that reflects the serial order in which events follow each other. This ability emerges by the first year, as children gradually begin to distinguish between past and future and sequence actions in order to construct coherent and cohesive events. They start to remember dynamic events – so-called scripts – and seem to understand sequences of familiar repeated events involving several related actions (Bauer et al. 1994). A study by Bauer and Mandler (1990) showed that one-year-old children are already able to remember brief sequences of novel events (two or three actions) over several days. 
This rapidly improves as they get older: by the age of three, children can verbalize a larger number of familiar scripts in a reliable sequence (cf. Nelson & Gruendel 1981, Friedman 1992). Scripts do not yet qualify as narratives: they are still mainly based on the child's immediate experience of the here and now, and very much lack a temporal dimension. The only temporal differentiation that children are capable of making until their second year is that between the present activity and everything else that has been experienced and memorized: sequences of events, people, places, and associated objects. Still, this form of temporal decoupling already indicates that these children are no longer solely interacting with their environment in a direct fashion. In order to engage in narrative practice, children not only must recollect the specific time when an event occurred; they must be able to attribute this event to themselves or others. According to Gallagher (2003), the first-person pronoun 'I' serves as the most minimal referent around which experienced events can be organized; the precise way in which children learn to use it, starting around twelve months, gives them an 'extremely secure anchor' for the construction of a self-narrative. The first-person pronoun is not just a 'deflated pronoun, grammatical structure or piece of vocabulary', however. On the contrary, it has an 'embodied referent'. Gallagher argues that its use depends ontogenetically on the minimal self. Both the capacity for temporal integration and the ability to self-refer by means of the first-person pronoun are necessary for the proper functioning of autobiographical memory. It has been claimed that two-year-old children already possess such memory. 
Howe (2000) argues that, even though the autobiographical memories of children around this age must be elicited by questions and prompts, 'by 18-24 months of age infants have a concept of themselves that is sufficiently viable to serve as a referent around which personally experienced events can be organized in memory.... The self at 18-24 months of age achieves whatever "critical mass" is necessary to serve as an organizer and regulator of experience.... This achievement in self-awareness (recognition) is followed shortly by the onset of autobiographical memory' (1991: 2). Autobiographical memory provides the background knowledge out of which a coherent narrative is formed. This is not simply a matter of 'encoding' and 'retrieving' information. The creation of a self-narrative is very much a reconstructive process: it does not merely depend on the proper functioning of memory but, in an important sense, contributes to the functioning of that memory. Gallagher (2003: 419) suggests that, in order to form a self-narrative, 'one needs to do more than simply remember life events. One must see in such events a significance that goes beyond the events themselves; to reflectively consider them, deliberate on their meaning, and decide how they fit together semantically'. He argues that this interpretation process is facilitated by 'reflective metacognition', which enables people to fit and sometimes force memories into a narrative structure. This process generates a lot of confabulation. 'It is not unusual to construe certain events in a way that they did not in fact happen, for the sake of a unified or coherent meaning. Self-deception is not unusual; false memories are frequent. To some degree, and for the sake of creating a coherency to life, it is normal to confabulate and to enhance one's story' (ibid). 
All these elements – temporal integration of information, minimal self-reference, autobiographical memory, and reflective metacognition – are crucial for the emergence of what Gallagher calls a narrative self, which he defines as 'a more or less coherent self (or self-image) that is constituted with a past and a future in the various stories that we and others tell about ourselves' (2000: 15). Gallagher argues that self-narrative, in contrast to narratives of others, has a certain primacy in shaping self-identity. This might be true for mature and full-grown self-narratives, but it should not be forgotten that, during early development, the narrative self is primarily given form by others. As Fivush (1994) points out, children only gradually move from the contribution of one or more bits of information about a certain experience to a more equal co-construction of a narrative account of experience. Also, they frequently appropriate someone else's story as their own (Miller et al. 1990). Gallagher's account of what is required for narrative practice is restricted to the development of self-narrative. We think it fair to assume that similar capacities to the ones discussed above will be required in order to create and understand the narratives of others. The important question is whether Gallagher's proposal is continuous with the embodied practices discussed in the previous sections. It is certainly a good first step towards an embodied account of narrative practice; but to what extent is it possible to give a full-blown enactive explanation of the above capacities in terms of direct coupling? All seem to involve something better described as a kind of de-coupling: the active suppression of a direct mode of interaction with the environment. To distinguish between past and future events, the child must be able to distance herself from what is going on right now. 
Self-reference by means of the first person entails a distancing from the self: the child no longer coincides with, but rather takes a stance towards, its own self. All the capacities discussed above are closely connected to the emergence of a more objective self no longer chained to the here and now, with the ability to refrain from direct interaction with the environment and other people. The notion of direct coupling seems no longer sufficient. This is especially obvious when it comes to explaining the more advanced capacities for narrative practice. In these cases, it is easy to forget one's enactivist agenda and fall back on explanations of the TT/ST variety. Gallagher (2003), for example, explains the capacity for reflective metacognition by referring to Gazzaniga's (1988) idea of a specific left-hemisphere mechanism called the interpreter. According to Gazzaniga (1988), 'human brain architecture is organized in terms of functional modules capable of working both cooperatively and independently. These modules can carry out their functions in parallel and outside of the realm of conscious experience. The modules can effect internal and external behaviours, and do this at regular intervals. The interpreter considers all the outputs of the functional modules as soon as they are made and immediately constructs a hypothesis as to why particular actions occurred. In fact the interpreter need not be privy to why a particular module responded. Nonetheless, it will take the behavior at face value and fit the event into the large ongoing mental schema (belief system) that it has already constructed' (219). Apart from the apparent comeback of the homunculus, what is problematic about this explanation is the clear choice for modular TT. 
Gallagher is not necessarily committed to such a story, nor do we want to suggest that this is the explanation of reflective metacognition he prefers, since it seems at odds with the rest of his work; but the example does show how easy it is to succumb to a TT/ST explanation, all the more since there is no enactivist alternative available. Still, the example is interesting because it suggests that metacognition basically comes down to a process of interpretation. From an enactive point of view, this process could be explained in terms of re-coupling: coming up with new ways of organizing and structuring the interactions with the environment. The development of narrative practice involves a process of distancing that is hard to reconcile with direct coupling. Does this mean that narrative understanding of others cannot be direct, as Gallagher suggested at the beginning of this section? It depends on how 'direct' is explicated. We agree with Gallagher that embodied practices provide a form of understanding that is direct in the sense that it does not require conceptual or inferential capabilities; but if this is what is meant by 'direct', then narrative understanding is clearly not direct, since it is mediated by language and crucially depends on the mastery of linguistic concepts. Gallagher argues that narrative understanding does not require inference: 'crucially, coming to appreciate the other's story – to see why they are doing what they are doing – does not require mentalistic inference or simulation' (Gallagher & Hutto 2008: 34). However, narrative understanding depends to a large extent on the ability to make appropriate inferences over linguistic concepts – e.g., to connect goals with actions and to connect hierarchically related goals with each other. 
Lynch and Van den Broek (2007: 323) state that 'goals provide reasons for a character's actions throughout a narrative, and a character's success or failure in achieving his or her primary goal marks the natural conclusion of a story. Because of their centrality, readers or listeners of a narrative must be able to infer the connections between goals and other elements of a story to fully comprehend it'. Presumably Gallagher has a different notion of 'inference' in mind, one that is 'mentalistic' in nature and is usually put forward by proponents of TT. According to TT, people understand others (i.e., predict or explain their behavior) because they are able to infer mental states from a folk psychological law, in combination with certain starting premises (the initial conditions needed to connect this law to the specific explanation or prediction) and a ceteris paribus clause. Gallagher is probably right that this notion of inference – along with the theoretical approach to social understanding in which it is embedded – is not required for a narrative understanding of others. In fact, his rejection of inference seems to come down to a more general rejection of third-person approaches to social interaction, in which one must make sense of the hidden mental states of others. 'In seeking a narrative understanding of the other it is not their "inner" life – if understood as a series of causally efficacious mental states – that we are attempting to access, but simply the other's life as it unfolds in response to worldly/situational contexts, and that is best captured in a narrative form' (Gallagher & Hutto 2007: 33).11 Gallagher admits that narrative understanding is not always direct: for example, when one is not already familiar with the story of the other person, or when one is perplexed or surprised by her action. When the actions of others deviate from what one normally expects and one encounters trouble, one may need to appeal to folk psychology. 
How often this happens is difficult to say. Gallagher could argue that these cases of misunderstanding are the exception rather than the rule and invoke the simple phenomenological argument to support this claim. However, as seen in the previous section, this is not without problems. Even if it is true that folk psychology is only appealed to in the minority of social interactions, it still needs to be explained how people make sense of others in folk psychological terms: i.e., by referring to their beliefs and desires. Gallagher's account of narrative practice does not provide an explanation of this ability. Moreover, as we have shown above, even the direct narrative understanding promoted by Gallagher is hard to explain from an enactive point of view. We will show in the next section that similar considerations apply to Hutto's Narrative Practice Hypothesis, despite the way it is designed specifically to capture folk psychological understanding of others.
11 Ratcliffe (2006: 37) gives an excellent illustration of this kind of narrative understanding, one that is worth quoting in full: 'when meeting somebody for a chat, we seldom have a pre-prepared, exhaustive list of discussion topics and viewpoints. Indeed we very often do not have a clue what we will talk about. Instead, the conversational narrative takes form through our interactions with each other. Facial expressions, body movements and verbal tones interact in intricate ways and seem to flow in harmony with the words spoken. Mutual interpretation is constrained by this interaction and by the shared narrative that unfolds. The flow of conversation is not simply facilitated by two discrete thinkers interpreting each other by ascribing internal mental states. My ability to interpret you is partially constituted by your interactions with me. You are a part of the interpretive process.'
HUTTO'S FOLK PSYCHOLOGICAL NARRATIVES Defenders of TT/ST often argue that the embodied practices put forward by advocates of enactivism are, in principle, not incompatible with their own positions. They admit it might well be the case that practices such as primary and secondary intersubjectivity are, in fact, precursors to their accounts of social cognition in terms of mindreading. For example, Currie (2008: 212) claims that 'theorists in both camps have insisted that well before children acquire belief-desire psychology there is much going on which underpins competent interaction with other people, and some have speculated on what these precursor states might be. If their speculations do not give rise to hypotheses the same as those of Gallagher and Hutto, that is incidental: neither ST nor TT commits anyone to a view about exactly what these precursor states are'. This is precisely the kind of reaction that motivated Gallagher to articulate his story about narrative practice. However, as seen in the previous section, Gallagher does not directly address the capacity for folk psychological action interpretation and so leaves open the door for a TT/ST explanation of this capacity. In other words, proponents of TT/ST could still claim that Gallagher has merely sketched the precursors to their mindreading story of social cognition. A recent proposal that does compete directly with TT/ST as an explanation of folk psychological action interpretation is Hutto's Narrative Practice Hypothesis (NPH). What the NPH tries to explain is precisely how children acquire workaday skills in wielding folk psychology, understood as the ability to make sense of actions in terms of reasons. Hutto (2009: 10) remarks that this is a 'sophisticated high level capacity' and claims that it 'involves being able to answer a particular sort of "why"-question by skilfully deploying the idiom of mental predicates (beliefs, desires, hopes, fears, etc.)'. 
According to Hutto (2007), folk psychology is acquired by being introduced to, being made familiar with, and actively engaging with folk-psychological narratives: stories about reasons for actions. The NPH focuses on paradigmatic practices of storytelling, such as children listening to and actively participating in the telling of fables and fairy tales (i.e., asking questions, being invited to make sense of the protagonist's actions, retelling the story, etc.). 'The stories about those who act for reasons... are the foci of this practice. Stories of this special kind provide the crucial training set needed for understanding reasons. They do this by serving as exemplars, having precisely the right features to foster an understanding of the forms and norms of folk psychology' (2007: 53). The NPH radically departs from the mainstream TT and ST rivals in two ways. First, it locates the primary origin/basis of folk psychology in second- instead of third-person encounters. Exercising folk psychological skills is not a 'spectator sport' (Hutto 2004) of inferring reasons from actions and vice versa from a distance. The requisite training takes place in conditions of mutual engagement, when people ask for and give each other reasons for their actions. Third-person prediction of action in terms of motivating reasons is, Hutto claims, a derivative and not highly reliable activity that necessarily involves speculation. Hutto argues that, although folk psychology can be exercised in different contexts, most everyday social interactions take place in socially structured, normalized environments in which the need for action explanation is obviated. The NPH's second departure from orthodoxy is in shifting the explanatory burden from the individual to the individual within a sociocultural context. 
The acquisition of folk psychological skills, Hutto claims, cannot properly be explained by focusing on the individual in abstraction from its sociocultural background. Advocates of TT and ST often argue that the core of intersubjectivity (the ability to practice folk psychology) is grounded in an internal set of principles, claiming that its acquisition is effectuated either through the biological triggering and maturation of innate folk psychological modules or through the child's private search for theoretical consistency in a social world she tries to understand. Hutto argues that folk psychological narratives provide more than merely a 'framework for disinterested prediction and explanation'; folk psychology is an 'instrument of culture', providing grounds for 'evaluative expectations about what constitutes good reasons' (cf. Hutto 2004). We very much agree with Hutto's emphasis on the sociocultural, second-person nature of folk psychology. The question is to what extent Hutto departs from a cognitivist view of folk psychology. The NPH is definitely a huge improvement over TT and ST insofar as it claims that reason interpretation is primarily a second-person practice. However, Hutto remains committed to the belief-desire model of action interpretation. According to this orthodoxy, action interpretation in terms of reasons is primarily about the attribution of belief-desire combinations.12 In some places, Hutto straightforwardly endorses this classical, psychologized picture of action interpretation. One reads, for example, that folk psychology minimally incorporates 'the practice of making sense of a person's actions using belief/desire propositional attitude psychology' (2007: 3). 
Elsewhere, Hutto claims that, in order to make sense of an action as performed for a reason, 'it is not enough to imagine it as being sponsored by a singular kind of propositional attitude; one must also be able to ascribe other kinds of attitudes that act as relevant and necessary partners in motivational crime' (2007: 26). Knowledge of how the propositional attitudes interrelate with one another 'comprises what we might think of as the "core principles" of intentional psychology' (2007: 29).13 Hutto stresses that these principles are not supposed to be theoretical in any meaningful sense: they do not have the form of a theory, nor are they acquired like one. At the same time, however, he seems to take the folk psychological principles out of the head in order to replace them by the 'principles' in folk psychological narratives.
12 The belief-desire model of action interpretation has been held as close to common sense amongst theorists. Consider Frith & Happé (1999: 2) who claim that 'in everyday life we make sense of each other's behavior by appeal to a belief-desire psychology'. Elsewhere, we have criticized the belief-desire model of action understanding that gives rise to this picture (De Bruin & Strijbos 2010, Strijbos & De Bruin 2009).
13 See also Hutto (2007: 3) where he agrees with Baker (1999) that 'belief-desire reasoning forms the core of common sense psychology'.
Hutto might object that understanding of folk psychological narratives does not necessarily take the form of communing with a pre-existing set of theoretical principles in the mind. The point is that, if Hutto wants to avoid appealing to a tacit body of intrinsic knowledge, then the 'principles' he is after must be operative in the folk psychological narratives themselves. Hutto thinks this is indeed the case and gives the example of Little Red Riding Hood: 'Little Red Riding Hood learns from the woodcutter that her grandmother is sick. 
She wants to make her grandmother feel better [she's a nice caring girl], and she thinks that a basket full of treats will help, so she brings such a basket through the woods to her grandmother's house [beliefs and desires lead to actions]. When she arrives there, she sees the wolf in her grandmother's bed, but she falsely believes that the wolf is her grandmother [appearances can be deceiving]. When she realizes it is a wolf, she is frightened and she runs away, because she knows that wolves can hurt people. The wolf, who indeed wants to eat her, leaps out of the bed and runs after her trying to catch her' (Hutto 2007: 30, citing Lillard 1997: 268). The example is misleading: the belief-desire structure Hutto claims to discern in this story is imposed upon it by Lillard, who argues that 'if we distill out our mentalistic interpretation, the tale is rather dry. A little girl hears from a woodcutter that her grandmother is sick. She walks to her grandmother's house, carrying a basket of treatments. A wolf who is in her grandmother's bed jumps up and runs after the girl. Incorporating an interpretation guided by our theory of mind makes the story a good deal more coherent and interesting' (1997: 268). It seems to us that this is not a claim Hutto would be willing to defend. It is not our intention to argue that folk psychological narratives do not exist. That said, Ratcliffe (2008) is certainly right that folk psychological narratives usually lack an explicit belief-desire structure, and this presents potential trouble for the NPH. Of course, Hutto could argue that folk psychological narratives do not display the relations between beliefs, desires, and other propositional attitudes explicitly, and instead propose that they only do so in an implicit manner: the folk psychological patterns one is looking for are potentially there, but they still have to be articulated. 
However, this would prompt the question of how children are able to do this – is this not precisely what the NPH promised to explain? Moreover, it would reopen the door to the suggestion that children are able to recognize and identify the belief-desire structures implicit in folk psychological narratives because they already possess a tacit belief-desire psychology. Of course, all this does not amount to an argument against the belief-desire model itself. What our short analysis of the NPH does show is how hard it is to give a decent account of folk psychological understanding from a thoroughly enactivist standpoint. At the end of the day, Hutto does not explain how this form of cognition is enacted but instead succumbs to the idea that folk psychological narratives can be seen as blueprints that already represent the belief-desire structure required for folk psychological action interpretation. It is interesting to compare Hutto's proposal to Gallagher's account of narrative practice. Whereas the latter places more emphasis on embodiment but fails to capture the folk-psychological mode of social cognition, Hutto seems to give up on embodiment to achieve his goal. This implies that much work still needs to be done when it comes to the embodiment of folk psychological understanding. Admittedly, Hutto pays a lot of attention to developmental studies that investigate the emergence of language and the acquisition of belief and desire concepts. He is certainly correct in his observation that traditional accounts have been lacking in this department. As Carpendale and Lewis (2004: 91) put it: 'proponents of the dominant theories have been notably quiet about what happens in development after the child's fifth birthday. However, research that explores whether 5-year-olds can use simple false belief knowledge to make inferences about their own and other's perspectives finds that they singularly fail to do so'. 
At the same time, it is not yet clear if and how the NPH comes closer to an enactive account of folk psychology as embodied cognition. It is even less clear how one could explain children's engagements with folk psychological narratives in terms of direct coupling. FURTHER DISCUSSION AND FUTURE DIRECTIONS The enactive approach still has a long way to go when it comes to explaining remote and abstract forms of social cognition. We have focused on the recent attempts of Gallagher and Hutto to account for folk psychological action interpretation in terms of narrative. Although we are enthusiastic about these proposals, we have shown that it is difficult to understand how they can be continuous with an enactive approach that puts a lot of emphasis on direct coupling. On the one hand, the proposals themselves have certain shortcomings. On the other hand, and maybe more importantly, the failure to explain higher-level forms of social understanding seems symptomatic of a general problem for the enactive approach. The problem is how to reconcile the enactivist notion of direct coupling with the process of distancing inherent in more reflective modes of social cognition, including the folk psychological one that has been the traditional explanandum of TT and ST. On the one hand, direct coupling is a precious part of the enactivist argument against the mental representational framework of cognitivist theories of social cognition. On the other, if enactivism wants to bridge the cognitive gap, it has to account for the capacity of human beings to distance themselves from their direct involvement with the environment and other human beings. We think that a reconsideration of direct coupling can provide a way out of this dilemma. First of all, it is important to be aware of the many different usages of the term 'direct' in the debate on social cognition – as seen in our discussion of Gallagher's narrative practice proposal. 
'Direct' can refer both to the immediacy of knowing (i.e., without requiring explicit deliberation) and to the immediacy of the presence of the participants in a social situation (e.g., direct face-to-face contact). These do not necessarily coincide: people can be in a direct face-to-face encounter with someone while still using deliberation to find out what to make of the situation. Even if one takes directness to be primarily a feature of knowing, it can still mean several things: without (folk-psychological) concepts, without explicit or conscious inferences, without any inferences whatsoever, or without a representational framework. It obviously matters which of these notions of directness enactivism is committed to. So far, it has mainly focused on directness in both the situational and the epistemological sense: i.e., on social understanding between participants in each other's immediate presence, without the mediation of folk psychological concepts or inference. When one goes beyond primary and secondary intersubjectivity and enters the realm of narrative practice, however, the situational directness of social understanding is lost to a certain extent (as explained in the previous sections). Hence, if narrative understanding is to be counted as a genuine species of social sense making, then enactivism cannot be constrained completely by situational directness. What remains is the directness of knowing. In the previous sections, we argued that narrative practice requires (folk psychological) concepts and inferential abilities, and therefore the enactivist appeal to directness should not be constrained in this respect, either. Thus, it seems that enactivism is only committed to a notion of directness as 'without a representational framework'. But even this does not mean that representations have to be banned from the enactive framework altogether.
Enactivism à la Varela et al. (1991), for example, does allow for representations in what they call a weak sense: '[representation] refers to anything that can be interpreted as being about something. This is the sense of representation as construal, since nothing is about something else without construing it as being some way' (1991: 134). It does oppose the stronger sense of representation that carries the heavy cognitivist commitments, namely that '(1) the world is pregiven; (2) our cognition is of this world even if only to a partial extent, and (3) the way in which we cognize this pregiven world is to represent its features and then act on the basis of these representations' (1991: 135). Regardless of which interpretation of directness enactivism eventually embraces, the notion of direct coupling must be able to incorporate the process of distancing, of de-coupling and re-coupling, that is characteristic of the development of narrative practice and already inherent in primary and secondary intersubjectivity. Is it still meaningful to call this a process of coupling? We think that the answer to this question should be affirmative for two reasons. First, de-coupling is never an end in itself but always takes place within a wider framework of ongoing coupling. Organisms never go completely 'off-line'. Therefore, we are a bit sceptical of the recent distinction between 'online' and 'off-line' forms of embodied cognition. Wilson (2002: 635) writes that 'online aspects of embodied cognition... include the arenas of cognitive activity that are embedded in a task-relevant external situation.... Off-line aspects of embodied cognition, in contrast, include any cognitive activities in which sensory and motor resources are brought to bear on mental tasks whose referents are distant in time and space or are altogether imaginary'.
Applied to social cognition, the distinction boils down to immediate, face-to-face interactions on the one hand and interactions with absent others on the other. Although this distinction gets to the heart of the cognitive gap, Wilson's formulation is problematic in several respects. On a theoretical note, talk of 'task-relevant inputs and outputs' (2002: 626) puts one right back in the computational cognitivism that the enactive approach tries to overcome. Wilson takes task-relevance as a decisive criterion for judging whether a cognitive activity counts as online. This sounds fair enough – except that task relevance will not always be easy to determine in a clear-cut manner. When something in my present interaction triggers a memory, how can I be sure whether or not it is task relevant? Moreover, task relevance may be too narrow a category when it comes to social cognition: human beings continuously engage in social interactions, even when there is no specific task toward which the interaction is directed. Many social interactions are initiated for their own sake. Finally, a binary on/off distinction does not allow for gradations in the way people interact with others: for instance in terms of involvement, reciprocity, and presence. Wilson equates off-line processing with reflection (2002: 627). It is true that reflection (by definition) entails a distancing from the here and now; but to describe this as 'off-line' is misleading. For one thing, reflection usually does not start from thin air: it is often triggered by something, embedded in and motivated by a given situation, and is thus very much task relevant. The opposite is true as well: many direct interactions or other 'online' activities incorporate off-line elements.14 From an enactive point of view, distancing or reflection should be described not as an instance of going off-line, but rather as something that is an integral part of the process of coupling.
One might even say that coupling is nothing more than a continuous oscillation between de-coupling and re-coupling. This brings us to the second reason why we still want to use the term coupling: as a dynamic process, coupling already entails de-coupling and re-coupling. This is so even when it is 'direct' (both in a situational and epistemological sense). The notion of coupling goes back to Von Weizsäcker's (1986) theory of the Gestaltkreis: a functional circle. According to his theory, the reaction of an organism to its environment should not be understood as a fixed response; instead, it depends upon the organism's previous experiences and is constantly being re-patterned through these experiences. This indicates that coupling is not a static feature but a dynamic process that involves a degree of de-coupling – thereby allowing for processes of distancing as well. Perhaps the enactivist emphasis on the active role of the subject, against the cognitivist view of the subject as a passive sponge soaking up information the world throws at it, has obscured how receptivity is a fundamental part of coupling as well. Sense-making includes sense discovering. Coupling is not a static feature but a dynamic process; it may well be described as a dynamic and dialectical process of de-coupling and re-coupling.

14 In the phenomenological literature, the fundamental intertwining of presence and transcendence is well documented. Take for example Husserl's (1985) famous analysis of time consciousness. Husserl showed that the present is always intrinsically linked to what has passed (retention) and an anticipation of what is about to happen (protention). The 'primal impression' is not a self-sufficient point in time but already transcends itself. This dimension of transcendence is also present in the phenomenological distinction between the 'lived body' or the body as object, and the 'living body' or the body as subject (see Zahavi 1999, 2004 for a comprehensive overview). Transcendence is already part of what it means to be a subject. That my body exists as a lived body means that I do not necessarily coincide with my body, and that I can therefore relate to it. The exteriority of my body entails a potential distance or transcendence from what is immediately given.

What can one reasonably expect from enactivism?

The challenge for enactivism will be to show how a richer notion of coupling can be put to work to explain the development from low-level to high-level social cognition. Ideally, this notion should accommodate a number of factors that play an important role in this process, such as (without attempting to be exhaustive):

(i) Level of verbality: it is hard to overestimate the space of possibilities for social interactions that language opens up. This does not mean, however, that bodily (including facial) expression does not continue to play a crucial role. Out of 'linguistic chauvinism', the role of bodily expression is, perhaps, too easily overlooked when language enters the interaction.

(ii) Level of presence: social interactions can range from bodily presence in the same situation to differences in place and time and even existence. Even presence and absence are not clear-cut categories, especially when one considers modern means of communication. For instance, I can have a video chat with someone in another time zone: we are not present in each other's environment, but at the same time we create a shared situation through our contact.

(iii) Level of reciprocity:15 social cognition can be one-way or two-way directed.16 On one end of the spectrum, instances exist in which there is no reciprocity at all: for example, when I try to figure out the motives of the protagonist of a movie. The protagonist does not even know that I exist; on top of that, she does not exist herself.
On the other end of the spectrum, one could think of an intense conversation or argument between two very involved participants. Intermediate forms are possible and prevalent. The involvement of the participants may vary over time: sometimes one person attends more to the other; sometimes the interaction is more equally balanced. Reciprocity is, of course, not limited to dyadic interactions, so one should allow not only for one- and two-way but also for multiple-way directedness.

15 Note that this distinction differs from the distinction between second- and third-person interactions. Second-person interactions are those in which participants are mutually involved. In third-person interactions, by contrast, participants are not actively involved but stand, as Gallagher (2007) puts it, 'at the margins of the situation'. There is obviously a difference in engagement between when I argue with you and when I watch you arguing with someone else. This is not to say that I am not engaged at all. I may be interested, amused, or emotionally invested in your argument. Not only can 'merely' observing imply a form of involvement on the part of the onlooker; it is plausible that this observation influences the interaction between the participants as well. Two people arguing will probably be affected by being watched. The trouble with the second- vs. third-person distinction is that it is hard to pin down the extent to which an onlooker plays a role. It may be more appropriate to speak of differences in active participation within an interaction. This active participation can be further specified with respect to the degree of reciprocity (level iii) on the one hand and the mode of activity (initiating, reacting, observing) on the other (level iv).

16 Fuchs and De Jaegher (2009) call this uni-directional incorporation and mutual incorporation respectively. The one-way directedness can be described as a 'coordination to', whereas the two-way directedness displays a 'coordination with'.
(iv) Level of active initiation: participation can range from initiation of interactions to reaction to others' initiatives to observing interactions. Here, too, the boundaries between initiation and reaction may not always be easy to draw.

(v) Level of reflectivity: on the one end of the scale would be the situation in which I, prereflectively and thus without any deliberation, coordinate my movements with those of other people walking on the pavement. At the other end would be my eager attempts to interpret the intentions behind a text message I received.

A more fine-grained understanding of the different factors at play in the enormous variety of social situations helps to spell out what any theory of social cognition – including an enactivist one – should be able to account for. These factors should also make a difference when it comes to designing experiments to tackle these issues from an empirical angle. This will make future attempts by enactivism to develop an adequate notion of coupling all the more challenging and interesting.

Acknowledgements

During the writing of this paper, Sanneke de Haan was supported by the EU Marie Curie Research Training Network 035975 DISCOS (Disorders and Coherence of the Embodied Self).

REFERENCES

Baker, L. (1999). What is this thing called 'commonsense psychology'? Philosophical Explorations, 2: 3-19.
Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.
Bauer, P. (1996). The development of memory in childhood. London: University College London Press.
Bauer, P. & Mandler, J. (1992). Putting the horse before the cart: The use of temporal order in recall of events by one-year-old children. Developmental Psychology, 28: 441-52.
Bruner, J. & Kalmar, D. (1998). Narrative and metanarrative in the construction of self. In Ferrari, M. & Sternberg, R.J. (eds.) Self Awareness: Its Nature and Development (308-31). New York: Guilford Press.
Carpendale, J.
& Lewis, C. (2004). Constructing an understanding of the mind: The development of children's social understanding within social interaction. Behavioral and Brain Sciences, 27: 79-151.
Currie, G. & Sterelny, K. (2000). How to think about the modularity of mind-reading. The Philosophical Quarterly, 50(199): 145-160.
De Bruin, L. & Strijbos, D. (2010). Interpretation without principles: An alternative to belief-desire psychology. Philosophical Explorations, 13: 257-274.
De Jaegher, H. (2009). Social understanding through direct perception? Yes, by interacting. Consciousness and Cognition, 18(2): 535-42.
De Jaegher, H. & Di Paolo, E. (2007). Participatory sense-making: An enactive approach to social cognition. Phenomenology and the Cognitive Sciences, 6(4): 485-507.
De Jaegher, H. & Froese, T. (2009). On the role of social interaction in individual agency. Adaptive Behavior, 17: 444-460.
Fivush, R. (1994). Constructing narrative, emotion, and self in parent-child conversations about the past. In Neisser, U. (ed.) The Remembering Self: Construction and Accuracy in the Self-Narrative (136-57). New York: Cambridge University Press.
Fuchs, T. & De Jaegher, H. (2009). Enactive intersubjectivity: Participatory sense-making and mutual incorporation. Phenomenology and the Cognitive Sciences, 8(4): 465-86.
Friedman, W. (1992). Children's time memory: The development of a differentiated past. Cognitive Development, 7: 171-87.
Frith, U. & Happé, F. (1999). Theory of mind and self consciousness: What is it like to be autistic? Mind and Language, 14: 1-22.
Gallagher, S. (2001). The practice of mind: Theory, simulation or primary interaction? Journal of Consciousness Studies, 8(5-7): 83-108.
Gallagher, S. (2003). Self-narrative, embodied action, and social context. In Wiercinski, A. (ed.) Between Suspicion and Sympathy: Paul Ricoeur's Unstable Equilibrium (409-423). Toronto: The Hermeneutic Press.
Gallagher, S. (2008). Direct perception in the intersubjective context. Consciousness and Cognition, 17(2): 535-543.
Gallagher, S. & Brøsted Sørensen, J. (2006). Experimenting with phenomenology. Consciousness and Cognition, 15: 119-34.
Gallagher, S. & Hutto, D. (2008). Understanding others through primary interaction and narrative practice. In Zlatev, J., Racine, T., Sinha, C. & Itkonen, E. (eds.) The Shared Mind: Perspectives on Intersubjectivity (17-38). Amsterdam: John Benjamins.
Gallagher, S. & Varela, F. (2003). Redrawing the map and resetting the time: Phenomenology and the cognitive sciences. Canadian Journal of Philosophy, 29: 93-132.
Gallagher, S. & Zahavi, D. (2008). The Phenomenological Mind: An Introduction to Philosophy of Mind and Cognitive Sciences. London, New York: Routledge.
Gazzaniga, M. (1988). Mind Matters. Boston: Houghton Mifflin.
Gergely, G. (2001). The obscure object of desire: "nearly, but clearly not, like me": Contingency preference in normal children versus children with autism. Bulletin of the Menninger Clinic, 65(3): 411-26.
Gergely, G. & Csibra, G. (2003). Teleological reasoning in infancy: The naive theory of rational action. Trends in Cognitive Sciences, 7(7): 287-292.
Gopnik, A. & Meltzoff, A. (1997). Words, Thoughts, and Theories. Cambridge, MA: MIT Press.
Heal, J. (2005). Joint attention and understanding the mind. In Eilan, N., Hoerl, C., McCormack, T. & Roessler, J. (eds.) Joint Attention: Communication and Other Minds (34-44). New York: Oxford University Press.
Hobson, R. (2002). The Cradle of Thought. London: Macmillan.
Howe, M. (2000). The Fate of Early Memories: Developmental Science and the Retention of Childhood Experiences. Washington DC: American Psychological Association.
Husserl, E. (1985). Texte zur Phänomenologie des inneren Zeitbewusstseins (1893-1917). Hamburg: Meiner.
Hutto, D. (2004). The limits of spectatorial folk psychology. Mind & Language, 19(5): 548-73.
Hutto, D. (2007). The narrative practice hypothesis: Origins and applications of folk psychology. In Hutto, D. (ed.) Narrative and Understanding Persons (43-68). Cambridge: Cambridge University Press.
Hutto, D. (2008). Limited engagements and narrative extensions. International Journal of Philosophical Studies, 16(3): 419-44.
Iacoboni, M. (2006). Failure to deactivate in autism: the co-constitution of self and other. Trends in Cognitive Sciences, 10: 431-433.
Mead, G. H. (1962). Mind, Self and Society. Chicago: The University of Chicago Press.
Meltzoff, A. & Moore, M. (1977). Imitation of facial and manual gestures by human neonates. Science, 198: 75-8.
Meltzoff, A. & Moore, M. (1994). Imitation, memory, and the representation of persons. Infant Behavior and Development, 17: 83-99.
Miller, P., Potts, R., Fung, H., Hoogstra, L. & Mintz, J. (1990). Narrative practices and the social construction of self in childhood. American Ethnologist, 17: 292-311.
Nelson, K. & Gruendel, J. (1981). Generalized event representations: Basic building blocks of cognitive development. In Lamb, M. & Brown, A. (eds.) Advances in Developmental Psychology Volume I (131-58). Hillsdale NJ: Erlbaum.
Ratcliffe, M. (2007). Rethinking Commonsense Psychology: A Critique of Folk Psychology, Theory of Mind and Simulation. New York: Palgrave Macmillan.
Reddy, V. (2003). On being the object of attention: implications for self-other consciousness. Trends in Cognitive Sciences, 7(9): 397-402.
Rietveld, E. (2008). Situated normativity: The normative aspect of embodied cognition in unreflective action. Mind, 117(468): 973-1001.
Strijbos, D. & De Bruin, L. (2009). Towards an analytic pragmatist account of folk psychology. In Amoretti, C., Penco, C. & Pitto, F. (eds.) Towards an Analytic Pragmatism: Workshop on Brandom's Recent Philosophy of Language (152-58).
Tomasello, M. (1999). The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
Tooby, J. & Cosmides, L. (1995). Foreword to Baron-Cohen, S. Mindblindness: An Essay on Autism and Theory of Mind (xi-xviii). Cambridge, MA: MIT Press.
Varela, F., Thompson, E., & Rosch, E. (1991). The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.
Weizsäcker, V.v. (1986). Der Gestaltkreis. Theorie der Einheit von Wahrnehmen und Bewegen. Stuttgart: Thieme.
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin and Review, 9(4): 625-636.
Woodward, A. (2005). The infant origins of intentional understanding. In Kail, R.V. (ed.) Advances in Child Development and Behavior (229-62). Oxford: Elsevier.
Zahavi, D. (1999). Self-Awareness and Alterity. A Phenomenological Investigation. Evanston, IL: Northwestern University Press.
Zahavi, D. (2004). Alterity in Self. In Gallagher, S., Watson, S., Brun, P. & Romanski, P. (eds.) Ipseity and Alterity: Interdisciplinary Approaches to Intersubjectivity (137-152). Rouen: Presses Universitaires de Rouen.
Zahavi, D. (2005). Subjectivity and selfhood. Investigating the first-person perspective. Cambridge MA: MIT Press.