'What the...!' The role of inner speech in conscious thought
Abstract
Introspection reveals that one is frequently conscious of some form of inner speech, which may appear in either a condensed or an expanded form. It has been claimed that this speech reflects the way in which language is involved in conscious thought, fulfilling a number of cognitive functions. We criticize three theories that address this issue: Bermúdez's view of language as a generator of second-order thoughts, Prinz's development of Jackendoff's intermediate-level theory of consciousness, and Carruthers's theory of inner speech as a rehearsal of action-schemata. We contend that they have difficulty accounting for those cases in which inner speech is fragmentary, and for the difference between these and those instances in which it appears more sentence-like. In addition, we present verbal overshadowing as a phenomenon that none of them can easily explain. Finally, we propose an account in which inner speech is fundamentally silent outer speech and argue that it is more explanatory than the alternatives.
1. Introduction
Introspection reveals a regular use of a kind of "inner voice". The reported frequency of this type of experience differs among studies, but according to Heavey and Hurlburt's (2008) latest research it seems to occur at least a quarter of the time [1]. Despite this fact, several authors have pointed out the striking lack of phenomenological data about this phenomenon. For instance, Jackendoff (2007: 80) complains that there is little phenomenal description in the cognitive neuroscience of consciousness, and that most of it is devoted to visual experience, while Zlatev (2008: 7) remarks that even in the new 'phenomenological turn' in cognitive science (e.g., Gallagher and Zahavi, 2008) there is surprisingly little said about language. Part of this shortage of data could be explained by the methodological problems posed by the study of inner speech. However, there are a number of methods to which researchers can resort (cf. Guerrero, 2005, chapter 4), with increasing attention being paid to the form of inner speech, its role, and possible individual differences.
[1] Heavey and Hurlburt point out that experiences that are different in kind may occur at the same time, e.g., inner speech and images. The possibility that participants may be reporting only the predominant state could partly explain the gross discrepancies with base rates observed in other studies, such as Klinger and Cox (1987-88), who report 75% of inner talk.
The Vygotskyan approach to inner speech regards it as a phenomenologically degraded form of our own talking, which is also syntactically simplified. In this view, inner speech not only lacks pitch and volume but also appears typically in subsentential linguistic items. So inner speech is not experienced in the way of the internal monologues that classical novels depict but in the fragmentary way exemplified by the opening of Samuel Beckett's "The Unnamable": "Where now? Who now? When now? Unquestioning". Our "outer talk" also often has subsentential utterances [2], but our inner talk often seems to be full of expressions like 'ah', 'yes', 'not this way', 'where the hell?' and 'the meeting!' (see Peacocke, 2007). The meaning of those linguistic items is typically clear to us, but they might be to a large extent unintelligible to others if we uttered them (see in this respect the transcription of writers' notebooks in John-Steiner, 1997: 111ff). In addition, people can also engage in a more "sophisticated" inner talk, seemingly carried out in full sentences. This is especially noteworthy in cases such as when we prepare a lecture, think hard about an argument, or imagine possible conversations. Many of those cases seem to be related to linguistic actions, i.e., what our inner speech is doing can be characterized as a sort of rehearsal of the utterances that the subject will eventually make public. However, sophisticated inner speech may also take place in other kinds of situations. For instance, in a study comparing the phenomenological qualities of inner speech in voice-hearing schizophrenia patients and healthy controls, Langdon et al. (2009) found that both groups were most likely to report thinking in full sentences.
Some authors (Fernyhough, 2004; Guerrero, 2005: 15-16) suggest that the different characterizations of inner speech as elaborate or impoverished talk may be a reflection of a phenomenon that occurs in stages or levels of processing. So at some point in the progression inner speech may appear almost without words, and at another it might be more specific in syntax and meaning; Fernyhough calls these condensed and expanded inner speech, respectively. Let us summarize these differences in two different claims about the occurrence of inner speech:
(i) We have phenomenological acquaintance with our inner voice even when we are not rehearsing linguistic actions, and in many cases we do not experience phonological representations of sentences but dispersed linguistic items.
(ii) We can experience richer, more sentence-like inner speech, typically but not exclusively related to linguistic activities.
We think that any theory of the role of inner speech has to do justice to (i) and (ii). It has to be able to explain why it is often condensed or fragmentary, and whether there is a principled difference between cases of condensed and expanded inner speech.
[2] Pinker (1994: 221ff) provides a nice example of elliptical, fragmentary conversation in the transcripts of Nixon's Watergate tapes.
In this paper we want to motivate and defend an explanation for these claims that differs from others that have been put forward before. In particular, we will review Bermúdez's view of language as a generator of second-order thoughts (Bermúdez, 2003), Prinz's development of Jackendoff's (1987) intermediate-level theory of consciousness (Prinz, 2000, 2007), and Carruthers's theory of inner speech as a rehearsal of action-schemata (Carruthers, 2006). The three theories, however, share a basic claim about the function of inner speech, namely, that language is used to bring thoughts to attention (or to consciousness, to the extent to which we can speak interchangeably of consciousness and attention; more on this later). We share this basic claim, which can also be found in Clark (1998), and before him in Vygotsky. In a nutshell, language is claimed to be the means by which we can "objectify and contemplate our own thoughts". Converting thoughts into objects by formulating them linguistically enables us to hold the focus on our thinking, which in turn enables us to have better control of our behaviour, of our planning and of our cognitive processing in general. It is not that language allows us to have second-order thoughts; rather, the idea is that language allows us to make thoughts conscious. We can be conscious of our thoughts through language, just as we are able to express our thoughts by language, and neither of those abilities requires that linguistic representations carry the thoughts themselves, i.e., it is not necessary to suppose that the representational format of thoughts is natural language itself.
To put the idea in other terms: it is not that we recruit "language", whatever that is, to play a "second-order dynamics", as Clark (1998) calls it. Rather, we recruit linguistic communication, that is, we convert a pattern of outward actions into a form of cognition. When we hear our inner voice we are listening to ourselves talking to ourselves. We recruit the way we have to tell our thoughts to others in order to tell them to ourselves (with all the cognitive changes this brings with it). This explains, in our view, the introspective data we mentioned above: in general, when we engage in inner talk we are talking to someone who shares a lot of contextual information with the speaker, so we can keep our talk to a minimum.
2. Bermúdez on second-order dynamics
Bermúdez claims to be following Clark's (1998) suggestion that the use of natural language (henceforth NL) in cognition gives rise to a second-order dynamics, by which Clark means, basically, that thanks to NL we are able to think about our own thoughts by making them present to our minds. This basic capacity enables us to focus our attention on our own thinking, and by so doing we can control our behaviour, retrieve memories more easily and, in general, facilitate certain otherwise demanding cognitive processes. Bermúdez concedes that there may be second-order thoughts without NL, but argues that these cannot be conscious thoughts, or, in his own words, thoughts that can be entertained at the personal level. Thus, they cannot give rise to a second-order dynamics, which Bermúdez equates with reflexive thinking. Bermúdez corrects Clark's specific proposal on the role of NL in thought. Clark argues that inner speech is an internalized tool that shares a good number of functions with the external tool. In contrast, Bermúdez claims that some of those functions can be carried out by non-linguistic thinking.
What cannot be done without NL, according to his proposal, is (a) have proper reasoning (which implies being able to contemplate thoughts and consider how they relate to each other logically and evidentially), (b) revise and also (c) ascribe beliefs (i.e., have a theory of mind), (d) have embedded compound thoughts (such as disjunctive, conditional and quantified thoughts), and, finally, as a result of this latter capacity, (e) have intermodular thoughts. The argument is that all these capacities, according to Bermúdez, hinge on the general capacity of having second-order thoughts at a personal level. And the only way to think about a thought at the personal level is by having thoughts carried by NL sentences. We will not discuss whether all these capacities do involve a second-order dynamics (see Fodor, 2003 for a rebuttal). The ascription of beliefs, for instance, may be an entirely modular affair, which does not require any thought to be entertained at the personal level. It is also disputable that we need a second-order dynamics in order to have compound thoughts. Be that as it may, we think that Bermúdez's position is at odds with our claim (i), viz., the fact that we experience inner speech even when we are not rehearsing linguistic actions, and that what we experience are dispersed linguistic items. Let us see why. As a first step, let us focus on Bermúdez's thesis that NL is the only possible means to have second-order personal thoughts. Bermúdez discusses the possibility that we, or other beings, used some other, analogue format to make thoughts conscious, or to entertain thoughts at a personal level. He considers images, maps and mental models.
Speaking about maps, he rejects the possibility on the following grounds: "Those very features of maps (their analogue nature and structural isomorphism with what they represent) that make them so useful for guiding action serve to make them inappropriate for the type of inferential evaluation characteristic of second-order dynamics. In order for such evaluation to take place, the maps must be interpreted in broadly propositional terms. We must interpret one map as expressing a proposition and the second as representing a further proposition, and then evaluate the inferential relations between the two propositions. Once again, our only understanding of how to do this rests on the two propositions being linguistically formulated". (p. 162) Here Bermúdez is claiming that the only way to have thoughts as the object of further thoughts is by having them encoded (formulated) in some vehicle that is (a) propositional and (b) introspectable. According to him, the only vehicles that fulfil these conditions are NL sentences, given that "[A]ll the propositional thoughts that we consciously introspect (...) take the form of sentences in a public language" (2003: 159-60, his emphasis). We think that this claim is problematic for two reasons. First, there are good arguments to support the thesis that most NL sentences are not able to carry propositions by themselves. In order to do so, our mentally rehearsed NL sentences should be semantically determinate, and not something to be further interpreted. Yet there are by now a myriad of examples that show that our usual utterances are semantically underdetermined at one level or another (see Carston, 2002; Recanati, 2004; Travis, 2000), and it is suggested that all possible sentences would suffer from this "plague" [3].
So we take it that there are well-founded reasons to disbelieve that NL could be the vehicle of our thinking, under the assumption that the vehicle of thought must be explicit (see Fodor, 2001; BLIND REFERENCE, 2008a).
[3] The inference from the weak "usual utterances are semantically underdetermined" to the strong "all possible utterances are semantically underdetermined" is illegitimate. Yet, most authors dealing with this issue do make, covertly or overtly, such an inference (see e.g. Pinker, 1994; Fodor, 2001; Gleitman and Papafragou, 2004). For a justification of the strong claim see BLIND REFERENCE (2005).
Secondly, introspective evidence does not reveal the use of "propositions being linguistically formulated" at all [4]. Let us remark again that the linguistic items that appear to conscious thought can often be condensed and subsentential. These simple linguistic items have to be "interpreted as expressing a proposition" much in the way Bermúdez suggests that maps have to be. Thus, there is, prima facie at least, no difference between an analogue vehicle and NL in this respect. If NL can help us have a second-order dynamics, despite its inexplicitness, there is no reason why analogue formats cannot.
[4] Fodor (2003) criticizes Bermúdez on the grounds that "it's among the important results in cognitive psychology that much of what you introspectively believe about your mind just isn't true". Our argument, on the contrary, is that Bermúdez's claim is not consistent with the evidence that he offers on its behalf.
One exception is to be found, as we hinted above, in those cases in which language is used in thought to give a second-order dynamics to itself. This happens, for instance, when we reflect about the very same sentences that we have uttered or are planning to utter. We tinker with them, select a more appropriate word, wonder whether there is a misleading inference that the hearer may make. But from those cases one can hardly obtain a general model of language as a generator of second-order dynamics. We need to explain how language can play the same role for many other domains of our mental life, and why it is that inner speech may become sparser in them.
Putting it in other words: Bermúdez takes it that for a vehicle to bring thoughts to consciousness, that vehicle must carry propositions. In particular, he takes the claim that NL makes thoughts arise at the personal level to mean that NL introspected sentences encode propositions. This conception of what it is to "bring thoughts to consciousness" leads him to deny that any other representational format but NL could bring thoughts to consciousness (given that NL is the only propositional format apart from the "language of thought", which cannot be introspected). We claim that this is wrong because NL cannot bring thoughts to consciousness by encoding them. On the one hand, it cannot encode thoughts due to the problem of semantic underdeterminacy. On the other, introspective data provide no good reason to think that NL is performing that function. We suggest that NL does bring thoughts to consciousness, but in the same way a chairperson can bring to the consciousness of a speaker that she has run out of time by saying 'no time'. She could just as well have pointed at her watch. A gesture or a few words are equally good ways of making people conscious of what other people think. An image or a few words, we submit, are both good ways of making someone conscious of what she's thinking. At least, Bermúdez provides no strong reason to suggest why things should be otherwise [5].
[5] Lurz (2007: 292) gestures towards this same idea when he says that "there is nothing incoherent in the idea that the representational vehicles of the thoughts that we entertain during bouts of second-order cognitive dynamics are subpersonal sentences in our language of thought while the concrete particulars that we need to attend to so as to hold these thoughts in mind are some type of analogue representation".
3. Prinz: All consciousness is perceptual
Jackendoff (1987) proposed a theory of consciousness known as the "intermediate-level theory of consciousness". Jesse Prinz has been arguing for a variant of this theory for almost a decade (see Prinz, 2000, 2003, 2007, forth.). According to Jackendoff, information processing proceeds hierarchically, much in the way Marr (1982) proposes visual information is processed, i.e., starting with disparate representations of local features of the stimulus (a primal sketch) and working, step by step, towards a coherent conceptual representation. It makes sense, thus, to speak about an intermediate level of representations: these are located halfway between the primal sketch and the conceptual representation. It is mistaken to identify the latter with conscious representations because concepts abstract away too much detail, that is, they represent the distal stimulus as a certain entity, disregarding particular aspects related to the point of view from which it is perceived. Yet it is clear that our conscious perception is intimately linked to the particular perspective from which we engage the object, and this is the perspective provided by the intermediate representational level. The claim thus is that we are only conscious of these intermediate-level representations, which Prinz equates with perceptual representations. In the case of vision, these intermediate-level representations are Marr's 2½-D representations, that is, we are not conscious of 3D representations, which represent objects as invariant from perspectives or points of view: we are only conscious of objects as seen from a certain viewpoint.
In his 1996 article Jackendoff adds that, just as we are unable to consciously access our high-level visual 3D representations, we are also incapable of being aware of our own thoughts as such. The only way we have of being conscious of what we think is by putting thoughts in words, or rather, by generating the phonological representations that correspond to, and behave as surrogates for, such thoughts. Phonological representations are the intermediate-level representations in linguistic processing, so what we do in order to bring thoughts to consciousness is to recruit the linguistic modality and bring to consciousness particular instances of its intermediate-level representations. Jackendoff adds that only by having a linguistic modality can we have experience of abstract thoughts and chains of reasoning.
Prinz (forth.) endorses this overall account and supports it with numerous neurological findings. Yet there are a couple of considerations that urge him to amend Jackendoff's account. First, Jackendoff contends that the reason why we have access only to intermediate-level representations is that the intermediate level is also intermediate in the sense that it is the level where top-down and bottom-up processes meet: the intermediate level receives input from low levels, but it can also be affected by high-level information. Low levels, however, are encapsulated from high-level information. Yet, as Prinz says, this is wrong: for instance, there are well-documented top-down effects on low-level vision. Thus, the reason why the intermediate level is so special regarding consciousness must be another one. Prinz proposes that it is because that level is optimal for guiding action.
Low-level representations are disunified and the high-level ones are too abstract to guide what he calls "egocentric planning", that is, planning that requires that things be presented from a certain point of view [6].
[6] At first blush, this does not sound very convincing: we need high-level representations that guide our allocentric planning, i.e., according to the representation of things as what they are. Prinz even concedes that both kinds of planning must interact. Why then are we conscious only of intermediate representations?
The second amendment to Jackendoff's proposal comes from considerations regarding the role of attention. In the context of discussing the role of language in thought and how it enables us to attend to our thoughts, Jackendoff (1996) claims that we can only pay attention to something we are conscious of. However, Prinz (2000, 2003) argues for the opposite thesis, namely, that attention is a prerequisite for consciousness, that is, we are conscious only of the things we attend to. Prinz cites research on neglect as evidence for this claim, as he takes it that it has been shown that neglect is an attention disorder.
We think that these considerations are misleading with respect to conscious inner speech. One important problem lies in the fact that both Jackendoff and Prinz, but especially the latter, take perception as the paradigm to deal with consciousness [7]. They wonder what the locus of consciousness is when one perceives an object, and reasonably regard low-level and high-level representations as unfitting to be that locus.
[7] The roots of the perceptual theory are, of course, deeper in the case of Prinz, who notoriously defends not only that all consciousness is perceptual, but also the neoempiricist claim that all concepts are copies of perceptual representations that become endogenously controlled (Prinz, 2002). We do not intend to discuss his theory as such but only to argue that, even if we granted it, it has problems accounting for the phenomenology and role of inner speech.
Consider the role of attention and what happens if you extrapolate Prinz's considerations to the case of inner speech. To put it in a crude manner, his suggestion is that the external object is there and one may or may not pay attention to it: if attention is directed to the object, one is conscious of it; if attention is directed elsewhere, one is not conscious of the object even if one is visually registering it. Accordingly, the parallel suggestion for inner speech would be that the internal sentence "is there" and one may or may not pay attention to it: we are conscious of the sentence the moment we direct our attention to it. This strikes us as an implausible consequence: to produce inner speech is precisely to produce a piece of conscious mental representation. It does not make much sense to claim that our linguistic processor is producing, say, a continuous flux of inner speech and that sometimes, when we direct our attention to the objects of this flux, we become conscious of those mental sentences. Nevertheless, one may not find this implausible, perhaps seduced by something like the 'multiple drafts' theory of consciousness put forward by Dennett (1991). So here is a different reason to persuade the reader of the implausibility of the view: it imposes an intolerable computational cost. Notice that such a view requires that the linguistic processor be constantly in the business of producing linguistic items, which wait to fall under the attentional focus to reach working memory. Now, this is computationally expensive for such a small pay-off. Moreover, it is utterly unparsimonious: it is much simpler to suppose that the production of a piece of inner speech by the linguistic processor and the allocation of attentional resources to it amount to the same thing.
Attention is not an extra ingredient that is added to the linguistic item so produced. Rather, we produce phonological forms precisely in order to focus our attention on our own thinking. A second problem arises in what Prinz has to say about the role of language in consciousness. As we said, Prinz endorses Jackendoff's idea that we become conscious of our thoughts by putting them in phonological forms. Now, he acknowledges and deals with two problems for his position (Prinz, forth.). First, the experience of hearing a word differs depending on whether you understand that word, a simple fact that implies that we must be conscious of something other than bare phonological forms. Second, temporary aphasics report that they continued to have conscious thoughts during their aphasia. In both cases, Prinz resorts to images to save the day. In the first case, the difference between hearing or telling oneself a word one understands (vs. a word one does not understand) is spelled out in terms of the images (verbal or otherwise) that such a word triggers. An unknown word, being associated with no image, and being unable to trigger the occurrence of other words associated with it, feels very different from a word that is meaningful to us. So the "something else" we are conscious of when hearing a known word is some image, e.g., purely visual images or images of other words. As for the case of aphasics, Prinz proposes that their alleged conscious thinking would be mediated not by language but by images: it is possible to be conscious of our thoughts by entertaining images at the conscious level. Neither of these responses sounds convincing. One problem is how language could then help, as Jackendoff (1996) points out, to express abstract thoughts and to aid reasoning.
In this respect, Prinz's perceptual theory of concepts offers some resources to deal with what he calls the "hard cases" (Prinz, 2002: 166ff): concepts of unobservable stuff, such as causation or electron; concepts of abstract and complex entities, such as democracy; or concepts of formal objects, such as disjunction. One solution is couched in terms of clever but not too persuasive examples of images associated with such concepts, e.g., one can picture democracy by a concrete scenario of people queuing in front of voting urns. Another solution, more important for this paper, is that most of our concepts are associated with representations of words. Verbal skills would hence allow one to know how words are related and hence to grasp and reason about intangible properties. In fact, he claims that some concepts (such as concepts known only by deference to experts) may be entirely dependent on words. However, we think that this does not help to explain the problem of unknown words. Known words whose content is abstract, or even completely deferential, feel different from unknown words, yet if the associated image were the word itself, as Prinz seems to suggest, then there should be no difference between the two cases. In the case of aphasics, one might try to criticize Prinz's theory by arguing that if aphasics are using a non-linguistic medium for their conscious thinking, then they could not revise and attend to their inferential chains of thought: as we saw in the discussion about Bermúdez's account, it seems that images are not well-suited for mediating inferential reasoning, since they are not propositional. However, we want to grant that images can do the general work Prinz wants them to do in aphasics, that is, make their thoughts conscious, or at least bring them to their attention. Yet, for images to fill the bill, the theory must change in what seem to be essential claims.
In particular, it must forsake the claim that all consciousness is perceptual, or of intermediate-level representations. Let us explain: If Prinz's thesis that all consciousness is perceptual is applied to thought-consciousness, it means that all consciousness of our own thoughts comes in a perceptual format, namely, perceptual representations of NL or images, in some special cases. But how can this be? Consider the cases of condensed inner speech. If we are right about the context-dependent (inexplicit) nature of language in general, the content of a thought carried explicitly by a condensed linguistic representation is minimal. Yet, we seem to be conscious of the whole content of the thought, that is, that content which is (perhaps forever) out of the reach of NL [8]. This does not entail a denial of Jackendoff's claim that conscious thought is perspectival: on the contrary, our claim is that, precisely because the actual content of a thought takes into account perspective and context while NL sentences do not, having conscious thoughts does not amount to perceiving inner speech. Prinz (2007) deals with the following case put forward by Peacocke (2007): You suddenly experience the words 'Meeting tomorrow!'. Such words, and such an experience, do not determine by themselves what kind of thought (propositional attitude) you are having. It may be that the words just came to you as a case of unbidden imagination; it may be that you are making a judgement based on memory, or it may be that you are making a decision. According to Peacocke, there is no conscious perception that can decide between these three possibilities [9]. Prinz (2007: 354), however, disputes this: he claims that, depending on what propositional attitude we are having, we will perceive different emotional states. A judgement is accompanied by a feeling of affirmation; a decision goes together with emotions of an "imperative nature", etc.
This could perhaps work in Peacocke's case, where the focus is on the underdeterminacy of the force of an inner utterance: were those words uttered as an assertion or as an imperative? We cannot know if the only raw data are the uttered words themselves. However, invoking emotions is of no help if we focus on the underdeterminacy of the content of an inner utterance. 'Meeting tomorrow!' is not only underdetermined with respect to its force, but also with respect to its content. When those words come to someone's mind, she is thinking about a particular meeting and her use of 'tomorrow' is anchored to a particular day. However, this part of the content of her thought is not carried by the inner utterance. Yet, it seems that she is conscious of what she is thinking. And it seems that part of what she is conscious of does not come in any perceptual format whatever, that is, we seem to be conscious of thoughts, or parts of thoughts, that are not perceptually presented. Note that Prinz himself would concede, or would have to concede, at any rate, that we are conscious of the contents of our thoughts when these are entertained at the conscious level. Otherwise, his debate with Peacocke would not make sense. Both parties in the debate assume that we are conscious of the (apparently unrepresented) force of our inner speech acts. It is safe to assume that both parties would also concede that we are conscious of the locutionary content of our inner speech acts. The problem for Prinz is that in cases of condensed inner speech we perceptually represent only a minor part of the content of such speech acts. Thus, we have to be conscious of things that are not perceptually presented10.

8 As we said above, this is the lesson to be extracted from the claims by authors from the field of pragmatics, such as Carston, Recanati and Travis. We have dealt with this issue elsewhere, arguing that all sentential utterances suffer from semantic underdeterminacy –i.e. their propositional content cannot be given by their semantics alone (see BLIND REFERENCE, 2005, 2008a).

9 Peacocke's thesis is that the difference comes from action-awareness, i.e. that we are conscious (but not perceptually conscious) of our making a decision, making a judgement, etc. when entertaining those words in mind.

10 Prinz might claim that NL does not have to be the only perceptual format we use to make thoughts conscious. He might suggest that we use a mixed vehicle of NL, images and perhaps emotions. But this is difficult to support. On the one hand, NL is full of underdeterminacies that cannot be resolved by means of images (think about the genitive in 'John's car is fast'). On the other, in the case of condensed inner speech, if images and/or emotions were going to "fill the gaps", we should enjoy a rich variegated imagery and collection of emotional states each time we tell ourselves something. But that does not seem to be the case.

Let us go back to images and to the debate about what role they can play in aphasics. In principle, Prinz cannot resort to images to explain aphasics' conscious thinking. The reason is, according to Bermúdez, and possibly according to Jackendoff, that they are not proposition-carriers. Images cannot represent propositional contents. This is a problem for Prinz, because Prinz wants perceptual formats to carry propositional thoughts. However, he could relax this last requirement. Indeed, that is what we think he should do, given that, just like images, language is unable to carry propositions and, in any event, our inner talk does not carry propositions. So, there is, as a matter of principle, no reason why images could not be recruited to do the same as language does for us, namely, prompting thoughts that we can attend to. But the cost of going this way is clear: images can fulfil this role only if it is conceded that not all consciousness is perceptual, that is, if it is conceded that we are conscious of contents that do not come in an intermediate-level perceptual format.

In conclusion, Prinz's thesis fails as an explanation for our use of language in cognition. The role of language is to enable our attention to be focused on our thinking. However, it is not the case that it realizes this role by encoding thoughts in a perceptual format or by using perceptual surrogates for thoughts, surrogates that are the only things we may be conscious of. This idea does not sit well with the introspective data. It does not sit well either with the thesis that linguistic utterances are semantically underdetermined. Finally, it is not coherent with the aphasic data, for images could not be vehicles of thought. The account has to be revised so that it is allowed that we are conscious not just of phonological representations or images, but also of the propositional contents they prompt.

4. Carruthers: from language as a medium of thought to the rehearsal of actions

Carruthers (1996) brought introspective data into the debate about the cognitive functions of language, putting forward a direct and compelling argument for the use of language as a vehicle of thought and against the idea that the medium of our thinking is another kind of "language of thought". In a nutshell, it says that our own introspection reveals that we do use language when we think. What introspection reveals, according to Carruthers, is that we linguistically codify episodic conscious thoughts. By considerations of simplicity, he further argues that the mind also uses NL to codify latent thoughts and unconscious token-thoughts that belong to the same types as those conscious episodic thoughts. Carruthers qualifies his
claim, conceding that some thoughts use a mixed vehicle formed by NL words and images (for example, 'I want this chair to go there [insert image]'). As we have argued elsewhere (BLIND REFERENCE, 2005), the main problems that we find in Carruthers's former picture are two: one is that introspective data do not show, at least in an uncontroversial way, that we are using NL as a vehicle of any kind of thinking. The other is that whatever these data reveal, as we have mentioned several times, there are independent reasons to hold that NL cannot be, in any case, a vehicle of thought. Indeed, this last point seems to be the reason (see Carruthers 2006, ch. 3) why he has abandoned his former (1996) position, giving way to his present account of the use of language in the rehearsal of actions.

Carruthers's (2006) new position no longer regards language as a vehicle of conscious thinking, but as a tool for it, a position that brings him close to Clark (1998). Carruthers's current concerns about language arise from his commitment to a massively modular architecture of mind. According to his view, minds are composed of a myriad of distinct processing systems, each performing some specific task in the functioning of the whole, in a manner that is largely independent of the functioning of the others. One of the problems for this architecture is to explain human cognitive flexibility –a multifarious phenomenon that involves the capacity to be less context-sensitive and less stimulus-bound, the ability to combine different kinds of contents, and the possibility of selecting a reasoning process appropriate for the circumstances– as well as human creativity. Carruthers argues that language plays a prominent role in the explanation of both flexibility and creativity. On the one hand, language provides the means to integrate information coming from different domain-specific modules.
As the language system is both an input and an output system, it is ideally placed to do this job. The use of language for intermodular integration would help to explain experimental evidence that shows that when subjects are carrying out a task that keeps the linguistic processor busy, there is a significant deterioration in the performance of a simultaneous task that requires the integration of information from disparate domains –such as locating an object by integrating spatial and colour cues. In contrast, when the task to be carried out is as demanding computationally but not a linguistic task, there is no such impairment on the object location task (Spelke, 2003; Hermer-Vazquez and Spelke, 1996; Hermer-Vazquez et al., 1999). Carruthers explains intermodular integration by means of the recursive character of language. Syntax provides us with recursively generated bare structures that can be filled up with pieces drawn from the lexicon, provided they have the required syntactic profile. This means that a sentence such as 'the object is located between the short wall and the long wall' can be expanded into 'the object is located between the short red wall and the long blue wall' because the logical form of the former has an implicit slot after 'short' that can be filled up by another adjective.

On the other hand, Carruthers now thinks that language is used to make thoughts available to our conscious thinking. In contrast with other authors who have held a similar account, he takes pains to explain how this can be made possible in massively modular minds like ours. In chapter 2 of his 2006 book he tries to show how some thinking can be globally broadcast and made accessible both to consciousness and to the central modules. Very roughly: actions can be aborted in the moment of their execution, generating a quasi-perception of themselves, an action-schema which is globally broadcast and turned into a suitable input for the conceptual modules.
Thus, it is possible to generate actions in a "suppositional mode", contemplate them and evaluate their consequences before taking a decision. This is, according to Carruthers, what explains animal thought. In the case of humans (see 2006, chapter 4), language is used to generalize the "global broadcasting" model. The action-schemata that can be globally broadcast are of two kinds, so that they can come in two formats. Like animals, we can generate images of actions; but, distinctively, we can also generate phonological images of linguistic actions. Our conscious inner talk thus consists in this rehearsal of linguistic actions. In more detail, the idea would be: first, the output linguistic module receives instructions to produce speech; second, such speech is not produced, but quasi-produced, generating a phonological image of the speech; third, this phonological image is processed by the input linguistic module, which decodes it and makes it available to the central modules, which extract from it information relevant to them. One of these central modules is the mind-reading module, which is in charge of generating a second-order thought whose content is the content of the quasi-produced speech. To sum up, the language production systems broadcast sensory representations of natural language sentences. These representations are subsequently processed –"consumed", to use an expression Carruthers is fond of– by conceptual modules. It must be remarked that "processed" or "consumed" means, in this context, that the central modules must extract the conceptual information carried by the linguistic representation.

There are two basic problems with this general account, however. The first has to do with the role of NL as an intermodular lingua franca, and will not be discussed here, as it goes beyond the scope of this paper (see BLIND REFERENCE 2008b for an extended discussion; Machery (2008) points out an analogous problem).
It is enough to say that the account faces an "audience" problem, to wit: there is no central module capable of "understanding" the information the linguistic module is allegedly integrating. According to the thesis of massive modularity, most modules are domain-specific, so they do not have the representational resources to handle the conceptual information delivered by the linguistic system. The problem, then, is that they will be able to use effectively only those contents of the sentences/utterances that carry information related to their restricted domains. Hence it is unclear how language's integrating and broadcasting capacities can be of any use to them.

The second problem is that Carruthers draws a parallel between animal thinking, enabled by the rehearsal of action-schemata, and human thinking, enabled mostly by the rehearsal of linguistic action-schemata. We think that this implementation of the general insight behind vaguely Vygotskyan proposals sounds reasonable. We also think that it provides a good explanation of the introspective data. If inner talk is, in effect, talk to oneself, and is processed much in the same way the speech of others is processed (i.e., by means of linguistic and pragmatic processors), then we are in a position to explain those cases in which inner speech is condensed. Given that the context required to process speech is immediately available to us when we are our own interlocutors, we can minimize our talk to a considerable extent11. However, it is not clear how far the analogy with animal thought can go. According to Carruthers, animal minds broadcast possible courses of action so that, after foreseeable consequences are processed, an emotionally based decision is taken. If what we do with inner speech were indeed parallel to what animals do with images of action-schemata, then we would use language only to imagine possible conversations. However, this is only a part of what we do with inner talk.
So the analogy cannot be good. It seems that our inner speech serves more functions than broadcasting possible courses of action. More often than not, what we tell ourselves is not something we would go on to tell anybody. Rather, it seems that we have to quasi-speak just in order to know what we think, and thus control our subsequent actions, but 'quasi-speech' is not 'possible outer speech', as animals' quasi-actions are possible actions. To be sure, we have doubts about Carruthers's own interpretation of the analogy with animal thinking. At times it seems he is defending that the function of verbalisation is to make thoughts accessible. But other times12 he remains closer to the analogy, and regards inner talk as rehearsal of possible courses of action. Insofar as he is faithful to his own labelling of his model, that is, insofar as he considers that, when talking to ourselves, we are rehearsing linguistic actions, the model cannot account for all, or even the main, uses of language in thought.

11 This is the way we think the picture should go. Carruthers tends to dispose of the pragmatics processor, which we think is an error.

12 See, e.g. p. 306: "[T]his capacity might be deployed so as to help people to know which utterances might be worth making, enabling them to predict the likely effects of those utterances on others".

5. Verbal overshadowing: a riddle for the three accounts

In this section we want to present a piece of evidence that, in our view, is difficult to explain on any of the accounts we criticized: the phenomenon known as verbal overshadowing (Chin and Schooler, 2008; Memon and Meissner, 2002). The evidence will also be important for our own positive account, which we will offer in the next section. Verbal overshadowing occurs when verbalising mental contents impairs performance on a task in which those contents appear to be involved.
For instance, in a classical experimental setting, all subjects watched a video about a certain salient individual whom they would have to identify afterwards. After watching the film and before their identification capacity was tested, some subjects had to describe the target individual verbally while others had to read an unrelated text for the same amount of time. The results showed that the subsequent performance in recognizing the individual (e.g., picking him/her out of a line-up) was poorer for those subjects that had been asked to describe the individual. The phenomenon is robust in the domain of face recognition –where it was originally demonstrated– but Chin and Schooler (2008) report that it has also been observed in domains such as decision making, problem solving, analogical reasoning, or visual imagery.

Now, we claim that none of the views we just reviewed is able to give an appropriate account of verbal overshadowing. To begin with Bermúdez's picture, verbal overshadowing shows that language is not only unnecessary to reason properly but it can even be a hindrance to performing successfully in many domains. In other words, by giving a second-order dynamics to contents provided by other non-linguistic systems, the subject worsens her performance instead of improving it. Bermúdez's claim that propositional vehicles are needed in order for thoughts to be objects of other thoughts is similarly mistaken, given that the more elaborate and propositional the product of the linguistic description, the more verbal overshadowing is obtained (Meissner et al., 2001).

Attempting an explanation of verbal overshadowing along the lines of Prinz's account would amount to showing that employing one representational modality –words– to represent concepts from a domain in which another modality dominates –visual images of faces– impairs performance with respect to the latter domain. We think that Prinz's theory cannot show that.
To see why, let us review another effect of verbal overshadowing demonstrated by Melcher and Schooler (1996). They produced evidence that individuals who are perceptual experts in some domain but do not have comparable verbal expertise (e.g., because they received only perceptual training) are especially prone to verbal overshadowing, as compared to individuals with neither perceptual nor verbal expertise in the domain, and to individuals with both perceptual and verbal expertise. (Their subsequent discussion makes it clear that they identify verbal expertise with conceptual expertise, as verbal expertise is just a way in which conceptual expertise is displayed.) This result is at odds with Prinz's perceptual theory. In his view, concepts are perceptual representations that have come under internal control. So a consequence is that people with perceptual expertise but no conceptual expertise should only differ from people with both perceptual and conceptual expertise with respect to the fact that the latter are able to deploy at will the relevant percepts. Yet this leaves unexplained why the former are more prone to verbal overshadowing: they may lack the words but they still possess the perceptual representations that matter in Prinz's account. Moreover, if concepts are often, as Prinz contends, multimedia representations that recruit different sensory modalities, it is unclear why the lack of one of those modalities –the verbal modality– should have an effect on the others when the lacking modality is exercised (i.e., when we ask the subjects to put their percepts into words). It seems that Prinz's account should predict a gradation in performance, with subjects who have neither perceptual nor verbal expertise as the worst off.

Finally, turning to Carruthers's model, he relied on results from Spelke et al. to argue that language is necessary to integrate certain concepts so as to produce complex thoughts.
However, verbal overshadowing seems to pull in the opposite direction: where Spelke et al. offer evidence that having the linguistic processor busy impairs performance in a certain nonlinguistic task, verbal overshadowing shows that performance in the task is equally impaired when the linguistic processor is busy integrating contents relevant for that task. Taken together, both pieces of evidence suggest that whether language is or is not integrating information from the other systems has nothing to do with the observed impairment. Rather, what they have in common is simply that in both cases the linguistic processor is busy. Yet this cannot be the end of the story either, as we will argue in the next section.

6. Inner talk as inner talk

We have presented and critically assessed three influential theories about the role of language in cognition that share one basic claim, namely, that language is used in our conscious thinking. We took as a starting point that our introspective data support this basic claim, as well as two others: (i) that we experience inner speech accompanying non-linguistic actions and that often that speech is condensed and fragmentary13, (ii) that we experience fuller, sentence-like inner speech, especially but not exclusively associated with the mental rehearsal of linguistic activities. Most of the theories also coincide in fleshing out Clark's (1998) and Jackendoff's (1996) insight that language, which is a communicative tool, is recruited to make thoughts conscious and help us focus our attention on our own cognitive processes. The theories differ in the ways they flesh out this basic idea, however. We have tried to show that none of them accounts satisfactorily for the introspective data: Bermúdez's and Prinz's accounts entail that language is a means of carrying thoughts, an implausible thesis that flouts claim (i) in cases when inner speech is not couched in full sentences. Carruthers's view of language as an intermodular integrator does not help to explain the role of condensed inner speech either. In addition, we have argued that the phenomenon of verbal overshadowing is at odds with any of those accounts. It is time, however, that we present our own positive attempt to overcome those difficulties.

13 As a referee points out, this is consistent with Galton's observation that "often while engaged in thinking out something I catch an accompaniment of nonsense words, just as the notes of a song might accompany thought. Also, that after I have made a mental step, the appropriate word frequently follows as an echo; as a rule, it does not accompany it. Lastly, I frequently employ nonsense words as temporary symbols, as the logical x and y of ordinary thought" (Galton 1887, 29).

To put it in a slogan-like manner, we will say that, in our view, inner talk is just inner talk. By this apparently tautological statement, we mean that inner speech shares fundamental properties and computational demands with outer speech. Inner speech is like outer speech, but used in a very particular context14.

14 Like Schlinger (2008), we think there is a continuity between outer and inner speech, yet we do not want to prejudge the question whether there are peculiarities in the way the latter is produced. In this respect, Frawley (1997) argues that the particularities of inner talk make it different from outer talk. For instance, he claims that inner talk lacks functors, so all its predication is referential, having words such as 'red' but no definite articles like 'the'. However, we do not think that this makes inner talk fundamentally different from outer talk given that they draw on basically the same syntactic and semantic resources, and the same communicative channels. Differences come, we contend, from adapting those resources to communication with oneself. In addition, Frawley seems to be overextending the characteristics of children's monologues to inner talk in general. Yet children's talk is peculiar and it is not clear that such an extension is warranted. Finally, Frawley ignores those cases in which inner talk actually possesses the same syntactic complexity as outer talk.

Let us first note a consequence of the basic claim that language is used in conscious thought –a consequence that we take to be quite obvious yet sometimes overlooked. If inner speech is to be used in cognition it must be processed, like any other representation in the mind. Yet inner speech can only be processed by the linguistic processor. No other system in the cognitive architecture is capable of taking linguistic items as input15. The consequence is that if we use inner speech to serve some further cognitive purpose, we have to exploit the computational resources of the linguistic processor. The picture that we want to reject is that when one experiences inner speech and uses it for cognitive purposes, one is taking the products of the linguistic system to serve a different end from the one they ordinarily serve. We contend that the products of the linguistic processor can only re-enter the linguistic system itself. For instance, if I am putting aside red marbles from a box, I may be experiencing inner speech of the form "red... another red... not this... good". The aim of those words is to keep me focused on the task and to draw the attention of my visual system to the relevant feature. Yet the visual system does not use the words as such. Those linguistic items, meagre as they may be, have to be handled by the language system16.
15 That is, as linguistic input –they might be able to process it as mere auditory input though.

16 Clowes (2007), following the lead of Steels (2003), offers a model in which inner speech is a tool for focusing attentional resources and restructuring internal organization. He supports his claim by means of a series of experiments with artificial agents in which words can re-enter the language interpretation system. By re-using language the agent is capable of self-regulating so as to reach a better performance in its task (e.g., moving objects in a block world). To be sure, the model is still too undeveloped to provide answers to many of the questions we raised about inner speech, for instance, how it is that verbalisation can lead to poorer performance, and it seems to reduce linguistic re-entrance to commands. Still, it helps to illustrate the idea that linguistic products have to re-enter the linguistic processor if they are to be put to any cognitive service.

Unlike the theories we criticized, we contend that our view can account for verbal overshadowing, as well as for Spelke et al.'s results that are behind Carruthers's proposal. As we said in the previous section, what both cases have in common is that the linguistic processor is busy. Yet this obviously cannot be the full explanation: if it were, then subjects engaged in a linguistic task, such as reading a text, should also be prone to verbal overshadowing. What matters is the trade-off between the computational demands of the linguistic and non-linguistic systems, and the trade-off will vary depending on the kind of task, the kind of verbalisation, and the context17. For instance, the trade-off may roughly go like this: (1) The less demanding the non-linguistic task is, the more linguistic resources can be allocated to facilitate it, so depriving the subject of those resources will impair performance on the non-linguistic task. This is what happens in the spatial reorientation tasks studied by Spelke and others. (2) The more demanding the non-linguistic task, the fewer linguistic resources can be devoted to assist it, so increasing the demands on the linguistic processor will impair performance as well, especially when the linguistic system draws on resources that belong to the non-linguistic system (e.g., its domain of concepts) in a way that has not been previously automated (e.g., when the subject does not possess a previous verbal expertise that facilitates the search and processing of the concept-related words). We submit that this is what may happen in cases of verbal overshadowing, especially in the case of perceptual experts without verbal expertise18.

17 As Chin and Schooler (2008) point out, currently there is no unified explanation of what is going on in verbal overshadowing. There are at least three different accounts for it: one "suggests that it is something about the specific contents of verbalisations that impairs memory" (p. 399); another suggests that "verbalisation causes a shift from a holistic/global processing orientation towards a more analytic/local processing orientation" (p. 403); the third suggests that "verbalisation causes a shift in criterion towards more conservative choosing, causing participants to be more likely to choose a target not present option if available" (p. 407). Whatever the answer, the tendency in any case is to increase the overall computational burdens.

18 We conjecture that it may be possible to obtain verbal overshadowing in experimental settings analogous to the spatial reorientation task studied by Hermer-Vazquez and Spelke. To this end it would be necessary to make the spatial reorientation task more computationally demanding, and then ask the subjects to provide elaborate verbal descriptions of the elements involved. The reason why we think that these amendments would be necessary is that the spatial reorientation task seems to be simpler than the face recognition task.

We contend that the simple view we are presenting explains in a better way the cases of condensed inner speech. Firstly, if the use of inner speech is straightforwardly parasitic on language's function as a tool for communication, then inner speech often does not need to be more elaborate. If inner talk is, in effect, inner talk, we are producing speech for an audience that is as attuned to our communicative intentions as it can possibly be. Instead of talking to someone else, we talk to ourselves, and this way we can economize our talk as much as we need. Probably, on most occasions, we speak more than we need, but that is just what happens most commonly with our outer speech too. We recruit the means we have to focus someone else's attention on our thoughts in order to focus our own attention on our own thoughts. A possible way in which we do it is (as Carruthers proposes) by giving orders to produce speech that we then abort. By so doing we generate a kind of shadow of our linguistic action, a quasi-percept that is then processed by the channels we use in processing customary speech. That is, our silent speech activates the linguistic module and the pragmatics module, which probably work in tandem to extract the thought that has been communicated, i.e., the thought we try to tell ourselves. This way the thought is made available to consciousness and accessible to our attention.

Secondly, the fact that what we are doing is re-processing our linguistic output, and that this takes computational resources, shows that inner speech had better be fragmentary on many occasions. In contrast with Bermúdez and Carruthers, who coincide in assuming that language is necessary for certain higher cognitive abilities, such as thinking about thoughts or integrating information, we are suggesting that language is optional: it is one more of the resources available to our mind in order to perform its functions. It will be co-opted when the computational cost of having the linguistic processor active pays dividends, and one can possibly learn how to exploit its possibilities in order to become more proficient.

Our proposal helps to explain, on the other hand, why inner speech is at other times elaborate, propositional-like. It will be so, to put it roughly, when it can be and when it has to be. It can be elaborate if the computational costs are reasonable. It will have to be so if the task demands it. The latter is clearly the case when the task that is carried out is linguistic itself, e.g., when we are mentally rehearsing pieces of a paper we are preparing, or a conversation that might go this or the other way. Notice that in these cases the condition that inner speech can be elaborate is also satisfied: as there is no extra computational cost apart from what the linguistic system itself demands, one can afford to be more articulate, and one had better be so, given that it is one's linguistic performance itself that is at stake.

An anonymous referee suggests that inner speech gets more sophisticated as the task difficulty increases, whatever its nature. We agree that linguistic tasks are not the only cases in which we have expanded inner speech –only perhaps the clearest cases in terms of inner speech coming at no extra computational cost. Yet we think that task difficulty cannot be the only parameter at stake, as the phenomenon of verbal overshadowing shows: if face recognition is impaired by verbalisation then sophisticated verbalisation would not help either. However, one might hypothesize that increasing the difficulty of the recognition task could correlate with higher degrees of spontaneous verbalisation, showing greater reliance on the linguistic system for more complex tasks. The literature on overshadowing does not seem to back this hypothesis but it is still an empirical possibility to test.

Let us consider a possible objection.
Aphasics seem to tell against the picture we have presented, for aphasics cannot recruit linguistic communication in order to be aware of their thoughts. Moreover, according to the model sketched, it seems that both comprehension and production aphasics would be unable to monitor their own thinking, control their behaviour, revise their thoughts and make plans the way non-aphasics do: production aphasics because they cannot produce language, and comprehension aphasics because they would be unable to understand what they tell themselves. However, our model is not committed to these results. Even if it is obvious that language is our best communicative tool, it is not the only one we have, especially if our communication is directed to someone who shares a lot of contextual knowledge with us. Simple gestures and images can be just as effective means of communication when our audience is attuned to the state of our minds. Thus, a non-linguistic adult could recruit other communicational abilities to focus her attention on her own thinking. What seems to be necessary, however, is that she has such communicative abilities.

7. Conclusion

Inner speech has a definite function in our cognition, which is to focus our attention on our thinking so that we can monitor our cognitive processes from a personal point of view. This is not an original claim and the positions that we have criticized in this paper share this much. However, we have argued that their explanations fail to account for what we take to be the basic phenomenological facts about the phenomenon: that it often may appear as fragmentary and subsentential, and that there is a difference to explain between those cases in which it is condensed and those others in which it appears as more articulate and sentence-like. First, sometimes they are committed to the claim that a sentence of inner speech is a conscious thought.
However, we argued that there are independent reasons to reject the idea that a natural language sentence can constitute a determined, explicit thought. Instead, we contend that one can be conscious of a piece of inner speech and of a thought, yet this does not mean that the former is the representational vehicle by means of which the latter is formed. Second, they offer accounts that ask either too much or too little from inner speech. They ask too much from inner speech when they demand that it deliver full sentences in order to make thoughts conscious, or that it be in the business of producing a continuous flux of speech that may or may not be attended to. They ask too little when they lead to the assumption that inner speech has to be supplemented by other perceptual representations in order to be able to make a certain thought conscious. Third, the theories have trouble accounting for the differences in inner speech as an aid to certain processes and as a nuisance to others, as exemplified by the phenomenon of verbal overshadowing.

We take it that the alternative we are sketching is simple, yet it complies with significant pragmatic views on how linguistic items relate to their contents, and gets rid of the problematic idea that thoughts are actually couched in those items. Moreover, it respects the basic phenomenological data and it can begin to give an explanation of why language is sometimes not only unnecessary but even a burden for cognitive processes to fix the attention on their tasks. By endorsing the Vygotskyan view that inner speech is internalized outer speech that is used as an aid to cognition, we do not mean to imply that there are no more possible roles for inner speech to play19. We dare to say that many times we do not seem to talk to ourselves in order to solve demanding cognitive tasks. The pieces of information that we raise to the personal, conscious, level do not look like pieces we should be conscious of in order to attain some cognitive goal.
Just as we do not communicate with others exclusively, or even mainly, to exchange important information, but often just to "keep things going" and strengthen bonds, we may be using inner talk to strengthen our sense of ourselves, to "keep ourselves going". Inner talk, thus, would be the main means we use to elaborate our own narratives.

19 For instance, there is an ongoing debate regarding the extent of the involvement of inner speech in self-awareness and the identification of personal boundaries (cf. Morin 2005, 2008; Mitchell, 2008; Schlinger, 2008). Still, we think that these are roles that also depend on language's fundamental communicative function, e.g., as one talks to oneself one comes to know that one is thinking, and if it goes well, that it is oneself that is doing the thinking.

References

Bermúdez, J. L. (2003), Thinking without Words (Oxford: Oxford University Press).
Carruthers, P. (1996), Language, Thought, and Consciousness: An Essay in Philosophical Psychology (Cambridge: Cambridge University Press).
Carruthers, P. (2002), 'The cognitive functions of language', Behavioral and Brain Sciences, 25 (6), pp. 657-726.
Carruthers, P. (2006), The Architecture of the Mind (Oxford: Oxford University Press).
Carston, R. (2002), Thoughts and Utterances (London: Blackwell).
Chin, J. M. & Schooler, J. W. (2008), 'Why do words hurt? Content, process, and criterion shift accounts of verbal overshadowing', European Journal of Cognitive Psychology, 20, pp. 396-413.
Clark, A. (1998), 'Magic words: How language augments human computation', in P. Carruthers & J. Boucher, eds. Language and Thought: Interdisciplinary Themes (Cambridge: Cambridge University Press), pp. 162-183.
Clowes, R. (2007), 'A self-regulation model of inner speech and its role in the organisation of human conscious experience', Journal of Consciousness Studies, 14 (7), pp. 59-71.
Dennett, D. C. (1991), Consciousness Explained (Boston: Little Brown).
Fernyhough, C. (2004), 'Alien voices and inner dialogue: towards a developmental account of auditory verbal hallucinations', New Ideas in Psychology, 22, pp. 49-68.
Fodor, J. (2001), 'Language, thought and compositionality', Mind and Language, 16, pp. 1-15.
Fodor, J. (2003), 'More peanuts. Review of José Luis Bermúdez "Thinking without Words"', London Review of Books, 25.
Frawley, W. (1997), Vygotsky and Cognitive Science (Cambridge, MA: Harvard University Press).
Gallagher, S. & Zahavi, D. (2008), The Phenomenological Mind: An Introduction to Philosophy of Mind and Cognitive Science (London: Routledge).
Galton, F. (1887), 'Thought without words', Nature, 36, pp. 28-29.
Gleitman, L. & Papafragou, A. (2005), 'Language and thought', in K. Holyoak & B. Morrison, eds. Cambridge Handbook of Thinking and Reasoning (Cambridge: Cambridge University Press), pp. 633-661.
Guerrero, M. C. M. de (2005), Inner Speech – L2: Thinking Words in a Second Language (New York: Springer).
Heavey, C. L. & Hurlburt, R. T. (2008), 'The phenomena of inner experience', Consciousness and Cognition, 17, pp. 798-810.
Hermer-Vazquez, L. & Spelke, E. (1996), 'Modularity and development: the case of spatial reorientation', Cognition, 61, pp. 195-232.
Hermer-Vazquez, L., Spelke, E., & Katsnelson, A. (1999), 'Sources of flexibility in human cognition: Dual-task studies of space and language', Cognitive Psychology, 39, pp. 336.
Jackendoff, R. (1987), Consciousness and the Computational Mind (Cambridge, MA: MIT Press).
Jackendoff, R. (1996), 'How language helps us think', Pragmatics and Cognition, 4 (1), pp. 135.
Jackendoff, R. (2007), Language, Consciousness, Culture: Essays on Mental Structure (Cambridge, MA: MIT Press).
John-Steiner, V. (1997), Notebooks of the Mind: Explorations of Thinking (Oxford: Oxford University Press).
Klinger, E. & Cox, W. M. (1987-1988), 'Dimensions of thought flow in everyday life', Imagination, Cognition and Personality, 7, pp. 105-128.
Langdon, R., Jones, S. R., Connaughton, E. & Fernyhough, C. (2009), 'The phenomenology of inner speech: comparison of schizophrenia patients with auditory verbal hallucinations and healthy controls', Psychological Medicine, 39, pp. 655-663.
Lurz, R. (2007), 'In defense of wordless thoughts about thoughts', Mind and Language, 22, pp. 270-296.
Machery, E. (2008), 'Massive modularity and the flexibility of human cognition', Mind and Language, 23, pp. 263-272.
Marr, D. (1982), Vision (New York: Freeman).
Meissner, C. A., Brigham, J. C., & Kelley, C. M. (2001), 'The influence of retrieval processes in verbal overshadowing', Memory and Cognition, 29 (1), pp. 176-186.
Melcher, J. M. & Schooler, J. W. (2004), 'Perceptual and conceptual training mediate the verbal overshadowing effect in an unfamiliar domain', Memory and Cognition, 32, pp. 618-631.
Memon, A. & Meissner, C. A. (2002), 'Special issue: Investigations of the effects of verbalization on memory', Applied Cognitive Psychology, 16 (8).
Mitchell, R. W. (2008), 'Self-awareness without inner speech: A commentary on Morin', Consciousness and Cognition, 18, pp. 532-534.
Morin, A. (2005), 'Possible links between self-awareness and inner speech: Theoretical background, underlying mechanisms, and empirical evidence', Journal of Consciousness Studies, 12, pp. 115-134.
Morin, A. (2008), 'Self-awareness deficits following loss of inner speech: Dr. Jill Bolte Taylor's case study', Consciousness and Cognition, 18, pp. 524-529.
Peacocke, C. (2007), 'Mental action and self-awareness', in J. Cohen & B. McLaughlin, eds. Contemporary Debates in the Philosophy of Mind (Oxford: Blackwell), pp. 358-376.
Pinker, S. (1994), The Language Instinct (New York: Morrow).
Prinz, J. (2000), 'A neurofunctional theory of visual consciousness', Consciousness and Cognition, 9, pp. 243-59.
Prinz, J. (2002), Furnishing the Mind (Cambridge, MA: MIT Press).
Prinz, J. (2003), 'A neurofunctional theory of consciousness', in A. Brook & K. Akins, eds. Cognition and the Brain (Cambridge: Cambridge University Press), pp. 381-396.
Prinz, J. (2007), 'All consciousness is perceptual', in J. Cohen & B. McLaughlin, eds. Contemporary Debates in the Philosophy of Mind (Oxford: Blackwell), pp. 335-357.
Prinz, J. (forthcoming), The Conscious Brain (Oxford: Oxford University Press).
Recanati, F. (2004), Literal Meaning (Cambridge: Cambridge University Press).
Schlinger, H. D. (2008), 'Some clarifications on the role of inner speech in consciousness', Consciousness and Cognition, 18, pp. 530-531.
Spelke, E. (2002), 'Developing knowledge of space: Core systems and new combinations', in A. M. Galaburda, S. M. Kosslyn, & Y. Christen, eds. Languages of the Brain (Cambridge, MA: Harvard University Press), pp. 239-258.
Steels, L. (2003), 'Language re-entrance and the "Inner Voice"', Journal of Consciousness Studies, 10, pp. 173-185.
Travis, C. (2000), Unshadowed Thought (Cambridge, MA: Harvard University Press).
Zlatev, J. (2008), 'The dialectics of consciousness and language', Journal of Consciousness Studies, 15, pp. 5-14.