Default Semantics and the architecture of the mind

Alessandro Capone
Department of Philosophy of Language, Viale delle Scienze, University of Palermo, Italy

1. Introduction

The main topic of this paper is the relationship between Default Semantics and the Principle of Relevance within the modular architecture of the mind. I propose to come to a better grasp of the interaction in question by drawing on what is known about the modularity of mind. In particular, I analyze the phenomenon Bach (1998) calls 'standardization' and propose that once inferences become standardized, they are no longer processed through the Principle of Relevance, given that they can be furnished directly by the default inferences archive. I consider potential objections to this idea, based on experimental pragmatics, and arrive at the conclusion that merger representations, which guarantee compositionality at the level of the utterance, take into account both Default Semantics and modulated effects due to context. Following Horn (2005), I assume that contextual information may seep into the pragmatic interpretation while the default semantics is considered.
I shall assume the existence of a module called the 'Mind-reading module', within which the inferential work related to understanding utterances is carried out. I shall argue that the Mind-reading module includes the processes described by Jaszczolt's Default Semantics as well as the pragmatic inferential processes for which the Principle of Relevance is responsible. I shall argue that Default Semantics describes inferences that make use of inferential shortcuts stored in an archive and also utilizes simple heuristic procedures (fast and frugal procedures in the terms of Gigerenzer et al., 1999). Default Semantics includes the study of the inferential processes explained through the Principle of Relevance, and its basic heuristics (e.g. 'What is not said is not') are specializations of the Relevance heuristic. Specializations arise in response to recurrent environmental problems and consist of heuristics which, though based on the Principle of Relevance, have formed a distinct sub-component of the theory of mind, and can be characterized as associative. A certain stimulus will trigger a certain output based on these specialized heuristics as a result of semantic associations.

Journal of Pragmatics 43 (2011) 1741–1754

ARTICLE INFO
Article history: Received 27 August 2009; received in revised form 22 May 2010; accepted 9 November 2010; available online 8 January 2011.
Keywords: Default semantics; Relevance Theory; Modularity of mind; Inferential pragmatics; Cognitive science

Dedicated to Jacob L. Mey

ABSTRACT
In this paper, I explore the relationship between Relevance Theory and Jaszczolt's Default Semantics, framing this debate within the picture of massive modularity tempered by the idea of brain plasticity (Perkins, 2007). While Relevance Theory focuses on processing (see the interplay of cognitive efforts and contextual effects), Default Semantics focuses on types of sources from which addressees draw information and types of processes that interact in providing it.
In particular, I argue that Relevance Theory interacts with default semantics by standardizing inferences which are ultimately compressed (to use a term by Bach, 1998) into a default semantics. I briefly discuss potential obstacles to the idea of default semantics coming from the experimental pragmatics literature (e.g. Noveck and Sperber, 2007; Breheny et al., 2005) and I support further the idea of a division of labor between default inferences and the inferences derivable through the Principle of Relevance. In the end, I compare Relevance Theory and Default Semantics, in an attempt to come to a more unified picture.

© 2010 Elsevier B.V. All rights reserved.
E-mail addresses: alessandro.capone@unipa.it, alessandro.capone@istruzione.it.
Journal homepage: www.elsevier.com/locate/pragma
0378-2166/$ – see front matter. doi:10.1016/j.pragma.2010.11.004

In this paper I will not advocate a conflation of the mental and the neuropsychological levels or advocate reductionism of the mental to the neuropsychological level (see Chomsky, 2000). A theory about the mind may surely proceed along a separate dimension, being based on deductions on the basis of what we know about language, comprehension and other most important introspection data and analyses of those data (see also Capone, 2010); however, if the findings about neuropsychology can independently support our speculative considerations about the way language and language use work, this cannot but be a welcome result. Neuropsychological data are normally treated as corroborating evidence, but one may even consider the possibility that the neuropsychological facts are at odds with theoretical considerations. It is theoretical considerations that matter most.
Chomsky is methodologically right: for example, we can make sense of certain patterns of electrical activity in the brain by the notion of semantic deviance. Surely we could not understand what those patterns of electrical activity would signify if we did not have an independent notion of syntactic deviance and of its importance for a theory of language and of the mental. In this paper, I do my best to disentangle the mental from the neuropsychological, even if sometimes the two levels will meet.

2. On modularity of mind

In this paper I presuppose a modular picture of the mind. I shall confine myself to only sketching this picture with broad brush strokes. Work on the modularity of the mind started with Fodor (1983) and was extended in Fodor (2000). Fodor mainly distinguishes between input systems (e.g. perceptual processes) and a central system. While modular processes are encapsulated, domain-specific, specialized, shallow, fast and obligatory, central processes for Fodor are not encapsulated but draw inputs from a variety of domains. This strict picture of modularity is being replaced by the massive modularity picture, mainly advocated by Carruthers (2006a), Sperber (2005), Wilson (2005), Sperber and Wilson (2002) and Carston (1996, 2002), among others, where the mind consists of myriad modules, each correlating with a certain (specialized) function, whose input domain is highly restricted (e.g. the visual system is for processing visual percepts only, it does not deal with inputs from other organs) and with certain transducers. In this picture modules can share resources, especially if they are situated in close neural areas and if they do not perform concurrent functions. The main advantage of the modularity picture is that it explains how the mind can react in such a fast manner to simultaneous inputs coming from different types of transducers and how it facilitates learning in cases in which two or more different types of input have to be analyzed simultaneously.
Another advantage of modularity is that each module, in so far as it is to some extent insulated from the remaining architecture, can be damaged without affecting the remaining modules. So, breakdown in a module creates limited damage, since other modules are available to process information coming from the outside. People can have their language system damaged while keeping much of the remainder of cognition intact (aphasia); people can lack the ability to reason about mental states while still being capable of much else (autism); people can lose their ability to recognize just human faces; and so forth and so on. An obvious advantage of massive modularity is that processing of information does not create an informational bottleneck, given that various tasks can be carried out in parallel. Another advantage is that the modular system allows evolution to add new modules or to tinker with existing ones, without affecting the remainder of the system. A modular mind is capable of evolving and adding new specialized mechanisms in response to environmental challenges.

Unlike most other authors on modularity, Karmiloff-Smith (2010) puts forward the view that the mind is gradually modularized during development, as a result of the fact that different areas of the brain are more suitable for dedicated mechanisms (more relevant to certain types of processing). So, on this view, starting out with tiny differences across brain regions in terms of the patterns of connectivity, synaptic density, neuronal type, etc., some areas of the brain are somewhat more suited than others to processing of certain types of input. These ideas are interesting, as we may be open to the possibility that modules emerge, not as a result of genetic endowment, but as a result of interaction with the environment and with repeatable patterns of experience.

2.1. The mind-reading module

In this paper, I am particularly interested in the mind-reading module.
Theories of mind have largely been of two types: the theory–theory of the mind and the simulation approach. The theory–theory of the mind sees mind-reading as essentially scientific thought that is based on generalizations. To give an example, if you know that human beings suffer severely due to the loss of their dear relatives, then by seeing a person who has lost her father you make the prediction that that person is severely suffering (of course, like scientific theories, this sort of mind-reading can go wrong in circumstances in which people deviate strikingly from certain (near)-universal dispositions). The most influential theories of this type were worked out by Wimmer and Perner (1983), who studied how children's minds developmentally change in relation to false-belief tasks. The simulation approaches see mind-reading as running a simulation. You put yourself into another person's shoes and create certain conditions in imagination (suppose I am terminally ill; what would I do?) and then run a simulation on those conditions. This would be unlike theory–theory approaches, in that all you do is to experience your own reactions in response to simulated conditions. However, you have to quarantine those states of mind which are likely to interfere with the simulation, by NOT letting in states of mind which belong to you and not to the person whose behavior you are trying to simulate.

In this article, in consonance with Sperber (2005), Wilson (2005), Sperber and Wilson (2002) and Carston (1996, 2002), I shall try to explain mind-reading in connection with pragmatic inferences (inferences arising from verbal behavior) by accepting the existence of the Principle of Relevance that processes inputs in a very fast way and provides shallow inferences, that is to say inferences triggered by procedures that are designed to provide responses to environmental stimuli in real time, without getting bogged down in laborious reasoning. These inferential processes are fast, automatic, and schematic – in other words they follow what Gigerenzer et al. (1999) call fast and frugal heuristics. In connection with this approach, Carruthers (2006a) writes:

Most cognitive scientists now think that the processing rules deployed in the human mind have been designed to be good enough, not to be optimal. Given the speed of processing is always one constraint for organisms that may need to think and act swiftly in order to survive, evolution will have led to compromises on the question of reliability. Indeed, it will favor a satisficing strategy, rather than an optimal one. (Carruthers, 2006a:54).

I produce some simple examples of the type of fast-and-frugal heuristics proposed by Gigerenzer et al. Suppose you are required to answer the question: which of two cities is the larger? A simple heuristic can be useful – you can choose the only one of two options that you recognize. Another heuristic to be used in response to the same question can be the following: you may look first at beliefs about which properties of cities have correlated best with size in the past. If having a top-division team correlated with greater size in the past, then you select the town which has a top-division team, while neglecting the town which does not have one. If neither of the two towns has a top-division team, you move on to the next best predictor of size.
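The two heuristics just described can be put in the form of a short sketch. It is only an illustration under stated assumptions, not Gigerenzer et al.'s own implementation: the function names, the town names and the cue values are all invented here, and the second function assumes that the cues are supplied already ordered from best to worst predictor of size.

```python
from typing import Dict, Optional, Sequence, Set


def recognition_heuristic(a: str, b: str, recognized: Set[str]) -> Optional[str]:
    """Choose the only option that is recognized; None means 'cannot decide'."""
    a_known, b_known = a in recognized, b in recognized
    if a_known != b_known:
        return a if a_known else b
    return None


def take_the_best(a: str, b: str, cues: Sequence[str],
                  facts: Dict[str, Dict[str, bool]]) -> Optional[str]:
    """Check cues from best to worst predictor; stop at the first that discriminates."""
    for cue in cues:
        a_has = facts.get(a, {}).get(cue, False)
        b_has = facts.get(b, {}).get(cue, False)
        if a_has != b_has:
            # Stopping rule: one discriminating cue is good enough.
            return a if a_has else b
    return None  # no cue discriminates; the heuristic gives no answer


# Hypothetical towns and cue values, invented for illustration only.
CUES = ["top_division_team", "has_university"]
FACTS = {
    "Town A": {"top_division_team": False, "has_university": True},
    "Town B": {"top_division_team": False, "has_university": False},
    "Town C": {"top_division_team": True, "has_university": False},
}

print(recognition_heuristic("Town A", "Town B", {"Town A"}))
print(take_the_best("Town C", "Town A", CUES, FACTS))
print(take_the_best("Town A", "Town B", CUES, FACTS))
```

In the last call neither town has a top-division team, so the search moves on to the next cue and stops there; no further information is consulted once one cue separates the two options.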
The nice thing about these fast-and-frugal heuristics is that they come with stopping rules; you know when you can stop and you do not indefinitely process information in order to arrive at an optimal answer. What is good enough will suffice, and you stop there.

Before closing this section, I need to remind readers of the kind of evidence generally bearing on the question whether there is (or not) a mind-reading module and whether certain mind-reading tasks are executed within it. The most widely cited study showing that there is a mind-reading module is Baron-Cohen et al.'s (1985) study of autistic children. Autistic children are impaired in mind-reading tasks, and they seem to be blind to the notion that other people have minds. Thus, an autistic child will drag his father as if he were a toy. Autistic children, however, seem to be impaired only in the kind of tasks that involve attributing intentions to others, but they need not be impaired in more generic cognitive tasks. That mind-reading activities may involve automatic, fast heuristics is proven by patients afflicted with Williams' syndrome. This disorder results in an average IQ of around 50, combined with linguistic abilities and social skills. People with Williams' syndrome have good abilities for mind-reading and communication but poor general reasoning abilities. This dissociation seems to support the existence, within the mind-reading module, of a sub-module (or more sub-modules) dedicated to fast and automatic inferences concerning a speaker's intended meaning (see, however, Perkins, 2007, for a deeper discussion). The existence of this sub-module dedicated to mind-reading is supported by another dissociation. Patients with Asperger's syndrome have good general reasoning abilities but serious impediments in mind-reading. These people can use general reasoning to compensate for the lacking special-purpose skills (see Wilson, 2005 for a deeper treatment of this point).
For lack of space I cannot expand on these ideas, but I finally refer the reader to Happé and Loth's (2002) important paper on the dissociation between the mind-reading module in connection with actions and the mind-reading module in connection with communication. This is evidence that the Relevance Theory module is a sub-module of the more general mind-reading module.

3. Default semantics

Now that I have discussed the issue of modularity of mind, I turn to Default Semantics, as my aim in this paper is to situate Default Semantics in a modular view of the mind, which helps explain the relationship with the Principle of Relevance. I discuss the important approach to inferential pragmatics developed by Jaszczolt in a series of publications starting from her 1997 seminal paper (Jaszczolt, 1999, 2005, 2009), which she called 'Default Semantics'. In discussing it, I propose crucial modifications of the framework.

A crucial feature of this framework is the centrality of the notion of the speaker's intention. Linguistic actions – like nonlinguistic ones – are animated by intentions and are successful in so far as hearers recognize such intentions. The theory elaborated by Jaszczolt is complex and multi-faceted. However, here I freely draw on the most central ideas of her theory. Jaszczolt (1999) notes that there might be a tension between the individual and the social path of interpretation and she correctly remarks that it is the social path of interpretation that must win. The individual path of interpretation gives room to a number of idiosyncrasies and allows the hearer to manipulate the speaker's intentions on the basis of what it is convenient or palatable for her to believe. The tension between the individual and the social path of intentionality can be best represented as a tension between selfishness and altruism/responsibility.
In the words of Dascal (2003), communication is regulated by the duty to understand (on the part of the hearer) and the duty to make oneself understood (on the part of the speaker) and the interpretation process is described as a reaching out towards the speaker. This terminology is particularly felicitous, because, in describing the speaker's and the hearer's duties in the communication process, it emphasizes the notion of responsibility, and the notion of reaching out towards the speaker emphasizes the idea of altruism vs. selfishness as being implied in the notion of a responsible communicator (the social path of intentionality, as Jaszczolt says). As Dascal (2003) says, this reaching towards the other – one might say, this altruistic orientation – is inherent in communication qua coordinated action (such considerations are dealt with in detail in four maxims underlying communication in Capone, 2004).

This is part of the picture. The other part of the picture is the attempt to get rid of ambiguity whenever possible. Brandishing Occam's Modified Razor, à la Grice, Jaszczolt says that if a lexical item can be interpreted as 'a' and 'b' in different circumstances, instead of positing ambiguity proper, one can say that one is faced with an interpretative ambiguity and one can assign the lexical item a Default Semantics in case one of the two interpretative options is chosen in most contexts. As Mey says:

In real life, that is, among real language users, there is no such thing as ambiguity – excepting certain, rather special occasions, on which one tries to deceive one's partner, or 'keep a door open'. (Mey, 2001:12).

Take for example the lexical item 'some'. If you utter (1)

(1) Some of the students arrived

the Default interpretation is that 'some but not all of the students arrived' (so some did not arrive, the class is partially empty).
If my interpretation of Jaszczolt is correct, her approach is similar but not completely identical to Levinson (2000) on presumptive meanings. And her approach is to be differentiated from the one by Relevance theorists (e.g. Sperber and Wilson, 1986; Carston, 2002), despite the fact that both approaches can be classified as post-Gricean, intention-based, contextualist accounts of utterance processing. As far as one can see, the main difference between the two enterprises is that while Relevance Theory focuses on processing (see effort/effect), Default Semantics focuses on types of sources from which addressees draw information and types of processes which interact in providing it. It appears that there is scope for an eclectic account (more on differences later). Relevance theorists always describe inferences in context. They are mainly concerned with the class of phenomena Grice dubbed 'particularized implicatures'. Default inferences à la Jaszczolt, instead, do not arise from particular contexts – in this sense they are similar to Levinson's presumptive meanings. Jaszczolt's approach is distinct from Levinson's because, while Levinson explicitly ties his inferential augmentations to pragmatic principles (specifically to his neo-Gricean Horn-based revision of the Gricean maxims), he considers pragmatic augmentations 'local'. Instead Jaszczolt opts for a global inferential approach, in which inferences are computed by integrating at the utterance level both semantic and pragmatic information. Jaszczolt gives one the impression that she opts for a semantic view of pragmatic phenomena by proposing the idea of 'default interpretations'.
As I wrote in Capone (2002), she devises a system in which one stores default interpretations in an archive and such interpretations are automatically activated in a default context, unless there are visible clues that militate against the default interpretation and, thus, favor contextual modulation (I do not remember that Jaszczolt makes use of the word 'archive' though). If one wonders how this picture originates, one may probably go back to Wittgenstein's equation of meaning and use. Alternatively, one may see phenomenology as inspiring Jaszczolt's perspective – pp. 88–90 of her book 'Discourse beliefs and intentions' are instructive in this respect. She takes both Husserl and Brentano to support the idea that the contents of our attitudes are things of the world (in Husserl's work, the connection between perception and belief is emphasized). On p. 48 of her 'Default Semantics' Jaszczolt clarifies the issue:

Now, intentional acts can be about mental objects, real objects, or whole states of affairs (eventualities); states, events, or processes. I shall follow the later phenomenological tradition and assume that our mental acts are directed at real rather than mental objects, and at real eventualities. (Jaszczolt, 2005:48).

In this paper, I would like to give a more distinctively cognitive slant to Jaszczolt's theory and I would like to claim that there are cognitive principles responsible for the assignment of referential interpretations to NPs. If my view is correct, we need not use premises from phenomenology to support the Default Semantics of NP, nor should we use explanations of Levinsonian or relevance-theoretic inspiration. I propose that the mind works this way. If you encounter an utterance of

(2) Mary thinks that Ortcutt is crazy

there are two possible readings:

a. The de re reading. The reporter (of the belief) ascribes to Mary a belief about a particular, known individual (de re).
b. The de dicto reading.
The reporter of the belief says that Mary believes in the existence of Ortcutt and Mary ascribes to him a certain property.

You will be inclined to give a referential interpretation to 'Ortcutt' and a 'de re' reading of the sentence in (2) (Ortcutt is such that Mary thinks of him that he is crazy) not because the referential and 'de re' reading is the most informative one (the one which has greater contextual effects) but because this is the way the mind works. I assume that what Jaszczolt calls the 'Default De Re Principle' is nothing less than an a priori form of our interpretation processes. In the same way in which, for Kant, the ideas of space, time and cause are the a priori principles of knowledge, the Default De Re principle is an a priori form of interpretation processes. The principle is set out below:

The default De Re principle
The de re reading of sentences ascribing beliefs is the Default reading. Other readings constitute degrees of departure from the Default, arranged on the scale of the strength of intentionality of the corresponding mental state.
(i) The hearer of an expression of belief of the form 'B φs' normally presumes that the speaker holds a belief de re and that the referring term is useful to refer to an individual, unless the content of utterance signals otherwise;
(ii) The hearer of a belief report of the form 'A believes that B φs' normally interprets the utterance as de re, unless the context signals otherwise.

The Default de re principle can be derived through reasoning. Suppose that there is no default de re principle in language. Then the child, presented with utterances, could very well interpret NPs as not being referential, taking them to refer to abstract categories. For example, faced with an utterance such as 'The dog is in the garage' a child could very well take 'the dog' to refer to the abstract category 'dogs'.
This means that language would be unlearnable for the child (or learnable with great difficulty), as for every concrete object, the linguistic item referring to it would be multiply ambiguous, capable of referring either to a concrete object or to an abstract category. However, if a child is guided by something like the default 'de re' principle, language acquisition is facilitated, as he will not run the risk of misinterpreting concrete categories by applying abstract categories (gradually the child will move from the sphere of concrete objects to the sphere of abstract ones, but it is clear that the Default De Re Principle is a very useful heuristic principle for the child who needs to orient himself in otherwise chaotic pieces of language behavior). That children are guided by cognitive principles in language acquisition comes as no surprise, as Carruthers (2006a) claims, following Bloom (2000), that in learning a language children are guided by the innate notion that the speaker's intention as manifested through gaze or gestures is important; furthermore, he claims that a child will not try to apply a new word to an object part, but to the entire object (so on hearing 'rabbit' for the first time, the child will not apply the concept 'rabbit' to a rabbit part (say, the hind leg) but to the entire object). (Goldman, 2006:178, discusses the following principle: prefer parsing the world into whole objects rather than arbitrary parts of whole objects or arbitrary mereological sums of whole objects.)

It goes without saying that the Default de re principle is responsible both for the preferred referential interpretation of belief reports and for the preferred referential interpretation of utterances such as:

(3) Smith's murderer is insane

Sentences such as (3) are interpretatively ambiguous between a referential and an attributive interpretation: however, the hearer normally settles immediately and automatically for the referential interpretation.
Examples of this type can be multiplied ad libitum. In fact, an utterance such as (4)

(4) John wants to sell his cello (originally from Heim, 1992).

is usually assigned the interpretation that John believes he has a cello and that such a belief is matched by the speaker's belief that John has a cello. On this theory, the case in which crazy John falsely believes he has a cello, even if (4) is uttered, is a marginal one, a reading that constitutes a 'degree of departure from the default'.

Are there any other interesting cases of default inferences? I would like to list a number of inferential phenomena which are not discussed by Jaszczolt, but which may support the case for cognitive Defaults. I will freely resort to cases discussed by Bach (2001) and Dascal (2003), adapting them to the purposes of the present discussion. (I am responsible for modifications.)

Speak seriously!
When a speaker A proffers her utterance U, there is a presumption that she seriously intends to say U, unless some clues from the context clarify that she cannot have a serious intention (e.g. she is speaking ironically, humorously, etc.). Could we not consider this presumption as part of our cognitive make-up – of the way our cognition works? Certainly a cognitive universal to the effect that the speaker's words are taken non-literally or non-seriously unless clues indicate otherwise could not be of any help to mankind; that would only disorient communicators. The reasoning would have to proceed like this. Suppose that there is no default procedure for discerning seriousness of intention. Then the child learning a language would have no guiding principle available orienting her towards the selection of the right lexical items corresponding to appropriate concepts.
Language learning would have to be an unachievable task, as for each word, there is the possibility that the instructor (say, the mother or the father) is not speaking seriously, in which case the motivation for learning a language would decrease.

Only one language!
When a speaker utters more than one utterance, there is a presumption that he would use the same language as that of the previous utterance. You do not struggle to guess, each time: 'Which language is she likely to use next?'. In a non-bilingual community, you are pretty sure that a language will be constant throughout the communication process. (In a bilingual community you know in advance that either of the two languages, but NOT a third one, will be selected in actual communication.) The reasoning would proceed as follows. Suppose that it is a logical possibility that a sequence of utterances can be uttered by using any language whatsoever, at random. If such a logical possibility were countenanced by the child, the motivation for learning a language would decrease, because, by being exposed to a sequence of utterances, he would not know which language he has to learn and which language the speaker is likely to speak. Instead, if the child expects the speaker to use a single language (or at most a very limited set of alternatives) through a sequence of utterances, the motivation for learning a language would increase.

The speaker as principal
There is a presumption that the speaker is also the author and principal of his utterance content. These are notions from Goffman's (1981) Forms of Talk. A person can speak without being responsible for the words uttered (e.g. an actor/an ambassador); he simply voices the message. However, in normal speech we do not dissociate the role of the speaker as the source of the utterance from his role as the means of communication.
The reasoning for this cognitive principle would have to proceed as follows. Suppose that there is no such cognitive principle; then for every utterance she hears, the child would not know whether the speaker speaks in her capacity as animator or author or principal. For all the child knows, it is a logical possibility that the speaker is just an animator and that everything she says comes from a source different from the animator (or the instructor). If such a logical possibility were contemplated seriously by the child, then it would not be possible for the child to associate the language spoken by the animator with the principal's language (so the chances that he is learning a language which is not the one spoken by the animator increase); furthermore, it would not be possible to associate linguistic items and utterances with the speaker's intentions. But we saw, given what Carruthers says about gaze and gestures, that intentions are important (in fact, indispensable) for learning a language.

These are ideal cases of preferential interpretations expressible through Default Semantics. You do not rely on particular clues – especially words – to draw these inferences. To argue that these are socio-cultural Defaults would imply that these aspects of communication are learnable, rather than innate. But this would require showing that at least in some societies things are not this way. It would also involve explaining how adult–infant communication can occur, since these defaults are presuppositions of communication, rather than learnable aspects of communication or of culture. You do not first teach the infant these defaults and then proceed with communication with her; on the contrary, these defaults are presupposed in communication.

We wonder whether the cognitive defaults are reducible to more general principles.1 Now, this question is clearly a question about the link between Default Semantics and Relevance Theory.
While a cognitive default may work as an instruction to interpret a certain fragment of language use in a certain way, it is possible that behind it there is a cognitive principle of basic rationality. This I will not deny, although I will insist that cognitive defaults are short-circuited inferences, in which the mind is not busy calculating inferences on the basis of general principles of rationality. We can, however, note important connections. Each of these defaults may arise from the need to avoid ambiguities and obscurities which would impede not only language processing, but also language acquisition. Since the mind works by promoting contextual effects while keeping efforts as low as possible, and since without such cognitive defaults language acquisition would be impeded or retarded, the mind recruits Sperber and Wilson's Principle of Relevance for the purpose of creating cognitive defaults which, if implemented as simple instructions, are even more frugal and faster than the application of the Principle of Relevance each time a certain input occurs. We may see the cognitive defaults as specializations of the application of the Principle of Relevance.

One may wonder why one should consider these tendencies as best explained as a priori, hardwired principles of (linguistic) interpretation when they can typically be derived from basic rationality. A possible reply to this question is that, granting the connection between these tendencies and the cognitive Principle of Relevance, which originally motivated them, a person is better off having these tendencies hardwired, because of the possibility that more than one environmental problem of the type resolvable through these tendencies may present itself at the same time. The presence of two or three environmental problems of this type would impose great costs on inferential reasoning, since cognition would have to work out responses to two or three or more environmental problems simultaneously.
Instead, a hardwired, modular account of these tendencies resolves the problem of simultaneity of inference. This problem, as shown by Carruthers (2006b), is at the basis of modular stories. Surely, one could go on arguing that the cognitive Principle of Relevance can deal with more than one problem in parallel (in other words, it can deal with a series of problems by resolving them in parallel). Now, even assuming that this is so, one must admit that cognitive efforts will increase enormously and there is also a chance that the processor will not be able to cope with so much processing because of the cognitive load. Instead, if we posit that there are some principles related to, but different from, the Principle of Relevance that can deal with matters such as 'serious speech', 'author vs. speaker', 'only one language', and referential interpretations, the cognitive load for the operation of the Principle of Relevance is diminished, and this is clearly necessitated by the fact that the Principle of Relevance is always busy computing the relevance of an utterance in a given context.

Footnote 1: An interesting reduction of the speaker's 'sincerity stance' to a cognitive story is to be found in Paglieri and Castelfranchi (2010). For the sake of space I cannot go into this discussion.

It is also natural that the Principle of Relevance, being a cognitive principle, should be designed to cope with novel problems, leaving problems that are recurrent to default heuristics, which are prepared, instead, to deal with problems which have repeated themselves in the past and for which 'standardized solutions' are ready. We can compare the work done by the default semantics heuristics with fast-and-frugal heuristics utilized, for example, by people who invest money in the stock exchange.
Someone who has witnessed the recent events in the world stock exchanges has developed a prudential strategy based on experience: if you buy shares when the price falls, buy them in two or three installments, so that if the price continues to fall, you are still able to buy the shares at the cheapest possible price. If cognition tells you that the best moment to buy shares is when the price falls, experience tells you that if the price continues falling you must go on buying shares. Fast-and-frugal heuristics are the result of cognitive principles being adapted to experiential data. Analogously, I argue that default semantics heuristics are the systematic response to environmental problems that have recurred and, because of their recurring quality, require dedicated mechanisms and more specialized solutions. Default interpretations are also more suited to interpretation problems for which there is little doubt or uncertainty, whereas cognitive mechanisms of the inferential type are at work when one needs to eliminate uncertainty. As Gallistel (2002) says, the function of learning is to extract from experience properties of the environment likely to be useful in the determination of future behavior. We use memory to carry forward in time information that is useful in responding to environmental problems.

We may go further and claim that there are other cognitive defaults. Suppose that the Default Semantics sub-module consists of fast and frugal heuristics such as: What is not said isn't (Levinson, 2000). Given this fast and frugal heuristic, hearing the utterance (5)

(5) I have three children

the hearer assigns the interpretation 'I have at most three children'. This fast and frugal heuristic could be seen as a specialization of the Principle of Relevance by Sperber & Wilson:

Cognitive principle of relevance
Human cognition tends to be geared to the maximization of relevance.

Relevance of an input to an individual
a. Other things being equal, the greater the positive cognitive effects achieved by processing an input, the greater the relevance of the input to the individual at that time.
b. Other things being equal, the greater the processing effort expended, the lower the relevance of the input to the individual at that time.

Presumption of optimal relevance
a. The ostensive stimulus is relevant enough to be worth the auditor's processing effort.
b. It is the most relevant one compatible with the communicator's abilities and preferences.

Relevance-theoretic comprehension procedure
a. Follow a path of least effort in computing cognitive effects: test interpretative hypotheses (disambiguations, reference resolutions, implicatures, etc.) in order of accessibility.
b. Stop when your expectations of relevance are satisfied (or abandoned).

(From Wilson and Sperber, 2004).

I should mention, before ending this section, that I am open to the possibility that the principles above (such as the default de re Principle), which I have related to the Principle of Relevance, may be the result of modularization, that is, the creation of a dedicated module on the basis of repeatable patterns of experience via a link to innate cognitive operations for which the Principle of Relevance could be held responsible. If the modularization hypothesis is accepted, one need not claim that the principles in question are hardwired in the human brain, but only that what is hardwired is a tendency towards modularizing certain abilities by exploiting principles already present in the mind and repeated patterns of inference. I will address this problem, which is certainly not uninteresting, in another paper. Of course, if the modularization hypothesis is accepted, one should accept that the Principle of Relevance has a crucial role especially until the modularization process has come to completion (in which case more specific dedicated principles would be at work).
I note that even on this hypothesis one must accept that there are innate mechanisms of cognition which allow certain inferential operations.

Some readers may remain perplexed by the fact that I consider the heuristic 'What is not said is not' as related to the Principle of Relevance and nevertheless as belonging to the Default Semantics sub-module. However, given Carruthers's claim that different modules can share a number of parts, I also claim that the Default Semantics sub-module and the Principle of Relevance share some common mechanisms (the relationship between the two being one of specialization). The heuristic principle 'What is not said is not' can be said to follow from the Principle of Relevance, given that if the scalar interpretative options higher up in the scale obtained, then the speaker would put the hearer to undue processing efforts, in that one could never know in principle which of the two interpretation options should be chosen (hence the higher options in a scale are excluded). As Mey (2001:69) says: when communicating, speakers try to be understood correctly, and avoid giving false impressions. I have reasons for wanting to place the heuristic 'What is not said is not' in the Default Semantics sub-module: (1) it furnishes pre-contextual processing and (2) it is a more frugal principle than the Principle of Relevance. If the Default Semantics sub-module can intercept a number of lexical items on the basis of their form or abstract semantic type and use a principle which is simpler and more specialized than the Principle of Relevance (albeit related to it), then this is a welcome result. Of course the form can include syntactic form. Quantifiers, in fact, are typically involved in scalar inferences provided that they do not constitute the upper end of the scale.

3.1. On the compatibility of Relevance Theory, Gricean theories, and Default Semantics

The project of integrating Gricean theories and, in particular, Relevance Theory and Default Semantics is not without problems. There are, in fact, terminological differences, as well as more substantial differences. If Grice's original theory is taken into account, the most obvious differences between the original Gricean picture and Relevance Theory are as follows. The Gricean picture is philosophically motivated, while RT aims at a psychologically plausible theory in which Relevance is part of a broad picture of the way the mind works and of its cognitive architecture. Grice had a view of implicatures that had literal meanings as a point of departure, while RT abandons the idea that the literal content gives rise to implicatures, which may be aborted in context. Grice had a minimal conception of what is said, while Relevance Theorists have an intrusionistic view of what is said. RT adopts the idea that semantics underdetermines pragmatic interpretation to a greater extent (see Horn, 2005; Carston, 2005). En passant, I should stress that despite much insistence on the part of the current literature on the different emphasis of the Gricean project and RT concerning the speaker's intentions, Carston (2005) has (as always) said that RT is very much concerned with speaker meaning, both what it is and how an addressee attempts to recover it (and I approve of Carston's precis). We may agree with all this, and say that while we are obviously interested in the Gricean notion of speaker's meaning, which we hope to inherit for the purpose of our current intellectual enterprise, we are less interested in rehabilitating the less tractable parts of the Gricean program. Instead, I am interested in integrating, if possible, an offshoot of Gricean pragmatics – Default Semantics – with Relevance Theory. Here too compatibility is not to be taken for granted, as there are differences.
However, one should not conclude that these differences prevent us from contemplating the possibility of a unified picture. In the following I do my best to reconcile the two positions. I focus on the differences between the two projects and argue that these differences cannot prevent us from trying to reconcile these two theories.

While Relevance Theory focuses on processing (see effort and effect interplay), Default Semantics focuses on types of sources from which addressees draw information and types of processes that interact in providing it. In Default Semantics, rationality of interlocutors is assumed, on a par with any Gricean approach and Relevance Theory. Rational communicative behavior is an assumption integral to the theory of Default Semantics. Like Relevance Theory, DS differs from Levinson's theory of GCI in that Levinson assumes local inferences, often word-based ones, while DS professes 'methodological globalism': it is methodologically prudent to assume global, proposition-based inferences (and defaults). Inferences, according to DS, are NOT fast and instantaneous. DS is a theory in which merger representations combine outputs of various processes, some of them default and some conscious, costly and inferential. Duration of processing is not a problem for DS: GCIs will arise in some contexts as defaults, in others as the outcome of conscious processing, and in yet others will not arise at all. Defaults in DS mean automatic interpretations, assessed for a particular utterance. We must distinguish between Default Semantics as a theory and default interpretations (CD, SCWD), which constitute only two out of four types of processes in the theory. DS is a model of utterance interpretation which is self-contained and also comprises those types of processes that Relevance Theory postulates (contextual inferences).

Relevance Theory is clearly and explicitly related to mental heuristics.
The framework deals with cognitive processes that tend to reduce the amount of information needed and produce effects in a fast and efficient way. If you ask me whether Milan is bigger than Caltanissetta, I may use the information that Milan has a Serie A soccer team while Caltanissetta has not as a predictor of size, and if I do not have information about soccer then I will move to the next best predictor of size (for example, having an airport is a predictor of size). In the same way, Relevance Theory uses cognitive efforts and contextual effects to predict whether an interpretation is relevant or not. However, one could say that while heuristics such as using a predictor of size (e.g. knowledge about a soccer team) are intelligent but nevertheless shortcuts grounded in experience, Relevance Theory interpretations are shortcuts grounded in general principles of cognition. If anything at all resembles Gigerenzer's heuristic shortcuts, this is Default Semantics and not Relevance Theory, since default interpretations are often derived by default, and often include social defaults. They are shortcuts that are grounded in experience and associative links, given that sociocultural defaults are created by associative links.

Default Semantics must be related to cognitive architecture because, since it provides principles of compositionality for the integration of linguistic semantics and pragmatic information, one must explain what features of cognitive architecture permit inter-modular communication. As previously said, DS is a theory in which merger representations combine outputs of various processes. It is Default Semantics that permits the integration of, say, perceptual and linguistic information or of syntactico/semantic and pragmatic information.
While Relevance Theory confines itself to adding pragmatic increments to semantic templates, Default Semantics gives us the principles whereby such integration can be effected. Carruthers (2006b) discusses how information coming from different modules can be integrated and, mentioning that some scholars prefer the answer that the theory of mind is responsible for such integration, he proposes that the language module is responsible for this integration. If we assume, as Jaszczolt does, that Default Semantics is responsible for this integration, we must surely articulate a mental architecture that corresponds to the workings of Default Semantics. Since Default Semantics is said to be characterized by combinatorial principles, it is not completely outlandish that the workings of Default Semantics should subsume a linguistic competence, but it is also obvious that such combinatorial principles must be different from linguistic compositionality.

What is clear so far is that an obvious difference between Default Semantics and Relevance Theory is the former's insistence on default inferences. These include socio-cultural defaults, and these cannot be easily accommodated by Relevance Theory. A clear-cut advantage of Default Semantics is that it can explain inference to stereotype (nurse → female nurse; surgeon → male surgeon, etc.) with relative ease. Default inferences of a lexical type are also a characteristic of Default Semantics, which is in conflict with the assumptions of Relevance Theory.

In my opinion, one must attempt to integrate DS and RT by recognizing that much of the burden of cognitive processing must be alleviated by the construction of a mental archive where default inferences are stored, ready for use. Even if it is not clear that this approach involves faster mechanisms of inference, one thing is certain. Processing costs will be minimized, and this is useful especially when the brain is at pains locating an utterance in a wide co-text and context.
If co-text and context are not at odds with the default semantics, this can be safely added to utterance interpretation by following the default semantics compositional principles and can be integrated in a wider context where it is necessary to compute further contextual effects (the effect of adding the utterance to the context in question). Computing costs will thus be predicted to be alleviated by default interpretations.

These considerations make sense in the light of Perkins's (2007) work on compensatory strategies, which clearly involve brain plasticity as well as the natural inclination of the human mind to compensate for certain impairments. While in the context of this discussion we are not discussing pragmatic impairments, it is natural to extend the notion of compensation to those cases in which the summation of inferential procedures would ultimately lead to an unbearable or anyway costly cognitive load. In such cases brain plasticity, and in particular the ability to shift information from the cognitive component to a default semantics archive, takes away this cognitive load and compensates for it by imposing a burden on the default semantics archive.

In order to be more persuasive, I will make use of an example which is familiar to everyone: the use of the multiplication tables. I choose this example from Ryle (1949), since the philosopher used it to make a distinction between an intelligent activity (multiplying numbers according to a rule) and an activity which seemed to him to be less intelligent (the ability to give by rote the correct solution to a multiplication problem). We do not know what Ryle would have said of the strategy of shifting the burden of cognitive operations to a semantic archive in which semantic associations replace cognitive operations. Would this be an intelligent activity? If we could show that this activity is governed by the Principle of Relevance, then we could prove that it is intelligent. I will address this issue later on.
When we were children we found it tedious to commit to memory so much information about multiplication, which was independently derivable through a cognitive strategy (e.g. counting, or multiplying with the usual multiplication methods). This clearly involved a cognitive load on memory. However, once the multiplication tables were memorized, it was clear that a lot of effort was saved. Of course, committing information to memory is useful only if there is a high likelihood that this information will turn out to be useful. Thus, if in our practice (buying and selling things, measuring, etc.) there is a high likelihood that the multiplication tables will turn out to be useful, it will be felt to be justified to put children to such a memory load. Could we not then arbitrarily extend the capacity of multiplication tables? I found out through Wikipedia that while in some societies multiplication tables include the number 9, in the USA multiplication tables include the number 12. However, one usually does not find multiplication tables that include the number 19 or 99. Why should this be the case? It must be the case because, given that one is unlikely to find multiplications like 99 × 88 in practical life, the cost of memorizing such richer multiplication tables is not justified. Even for mathematicians, this is not justified, given the use of electronic calculators. However, if a given person found it was very useful to multiply numbers up to 99 in practical life (say in a society of thieves where everyone will take an opportunity to steal money from you if you do not have suitable mathematical abilities), then it would be justified to learn the multiplication tables up to the number 99. When we have to choose between a simple associative cognitive strategy, like memorizing, and a more complex cognitive strategy, such as having to make computations, we compare the costs and benefits involved and then decide which strategy is the best.
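The cost and benefit comparison described for the multiplication tables can be made concrete with a small sketch. It is only an illustration: the function names and all the numerical costs are hypothetical assumptions, not values proposed in this paper, and effort is treated as a single abstract quantity.

```python
# Illustrative sketch only: names and cost values are assumptions,
# not part of the theory discussed in the text.

def cost_if_memorized(uses: int, memorization_cost: float, recall_cost: float) -> float:
    """One-off effort of committing results to memory plus a cheap recall per use."""
    return memorization_cost + uses * recall_cost


def cost_if_computed(uses: int, compute_cost: float) -> float:
    """Effort of working out the result from scratch on every use."""
    return uses * compute_cost


def worth_memorizing(uses: int, memorization_cost: float,
                     recall_cost: float, compute_cost: float) -> bool:
    """Memorization is justified only if it is cheaper overall than computing."""
    return (cost_if_memorized(uses, memorization_cost, recall_cost)
            < cost_if_computed(uses, compute_cost))


# A table that is needed very often (hypothetical numbers).
print(worth_memorizing(uses=1000, memorization_cost=50.0, recall_cost=0.1, compute_cost=2.0))
# A table that is almost never needed (hypothetical numbers).
print(worth_memorizing(uses=2, memorization_cost=50.0, recall_cost=0.1, compute_cost=2.0))
```

On these assumed figures the frequently used table is worth memorizing and the rarely used one is not, which is the pattern the text appeals to in explaining why tables stop at 9 or 12 rather than 99.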
We can invoke the basic assumption of Relevance Theory, that is, the trade-off between cognitive effects and cognitive efforts, to justify the shift from repetitive processing within the module where the Principle of Relevance is operative to systematic storage in a Default Semantics archive of information coming from RT processing. Furthermore, if ease of memorization is associated with repetition, then it goes without saying that the more repetitive a cognitive strategy is, the greater is the likelihood that it will be replaced by associative learning and memorization of its results. Suppose there is a cognitive rule of this type: after finding that item x is associated with interpretation y for the nth time, commit y to memory by associating y with item x. Then it goes without saying that after encountering the association x, y for the nth time, one will commit it to memory. However, at this point we could also say that Relevance Theory mechanisms have some say in the fact that the rule 'Commit the association x, y to memory after encountering it for the nth time' exists, since memorization effort is now offset by cognitive rewards (this is in line with Millikan, 2006, who also stresses the connection between memorization of facts and cognitive rewards). All I am suggesting is that the number n involved in the expression 'the nth time', whatever it is, is not arbitrarily determined, but may be the result of a combination of factors, including the dispositional qualities of memory (synaptic density in neo-cortical tissues, changes in synaptic conductance; Gallistel, 2002) and cognitive rewards that are likely to offset memorization effort.
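The rule 'commit the association to memory after encountering it for the nth time' can be sketched as a simple counting cache. This is a minimal illustration under stated assumptions: the class name DefaultArchive, the stand-in inference function, and the choice n = 3 are all hypothetical, and the sketch does not model the defeasibility of default interpretations in context.

```python
from collections import Counter
from typing import Callable, Dict


class DefaultArchive:
    """Hypothetical archive: costly inference until an association has recurred n times."""

    def __init__(self, infer: Callable[[str], str], n: int) -> None:
        self.infer = infer              # stand-in for costly, relevance-driven inference
        self.n = n                      # threshold for 'the nth time' (an assumption)
        self.counts: Counter = Counter()
        self.archive: Dict[str, str] = {}
        self.inference_calls = 0        # how often the costly route was taken

    def interpret(self, item: str) -> str:
        # Archived associations are retrieved directly, with no inference.
        if item in self.archive:
            return self.archive[item]
        # Otherwise run the costly inferential process and count the association.
        self.inference_calls += 1
        interpretation = self.infer(item)
        self.counts[(item, interpretation)] += 1
        # On the nth encounter, commit the association to the archive.
        if self.counts[(item, interpretation)] >= self.n:
            self.archive[item] = interpretation
        return interpretation


def toy_inference(item: str) -> str:
    """Toy placeholder for pragmatic inference (illustrative only)."""
    return {"some": "some but not all"}.get(item, item)


archive = DefaultArchive(toy_inference, n=3)
for _ in range(5):
    reading = archive.interpret("some")
print(reading, archive.inference_calls)
```

With n = 3, the first three encounters go through the inferential route and the remaining ones are served from the archive, so processing effort stops growing once the association has been stored.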
To finish this defense of default inferences, I want to say that when a cognitive resource is very precious, as in the case of pragmatic inferences, then it will be good to make sure that other cognitive resources are allocated to the same function, so that this function will not be lost as a consequence of neurological damage. In the same way as our perceptual systems have a certain redundancy (we have two hands, two eyes, two ears, a nose and a mouth that can both be used for breathing, two legs, two feet), our cognitive resources are best used by creating redundancy and by replicating them, shifting the burden away from cognition to memory.

A further crucial difference between DS and RT is that, unlike the other contextualist accounts, DS does not recognize the level of meaning at which the logical form is pragmatically developed/modulated as a real, interesting, and cognitively justified construct. To do so would be to assume that syntax plays a privileged role among various carriers of information and that the syntax/pragmatics interaction is confined to pragmatic additions, embellishments, or 'developments' of the output of syntactic processing. Now, it is possible that this insistence on the idea that semantics is not a cognitively justified construct needs to be mitigated. After all, the idea of merger representations is that we merge information, and linguistic information is clearly one of the sources of information that we merge. What Jaszczolt probably wants to challenge is the independence of semantics. Now, there are reasons for resisting the idea that semantics is an autonomous level of meaning, since semantic representations, in order to be merged, must have tags that allow for the merging. In other words, linguistic information must have appropriate tags that allow the binding (or merging) with non-linguistic information (the simplest example is the use of pronouns or demonstratives); that is, it must have a potential for merging.
Seeing things in this light, the insistence by Relevance Theorists on semantics as an autonomous, independent, cognitively justified level can be tempered. However, given that Jaszczolt deals with compositionality at the level of the utterance, we should remark in passing that having the potential for compositionality at the level of merger representations does not prevent semantics from being readable as a self-sufficient level of representation. One knows what could be meant by saying a given sentence and one also knows in advance what increments various and different contexts would add, because we know what the semantic potential for merger representations is. It is probably this abstract semantic potential for combining with contexts that we call sentential semantics, and there is no doubt that Relevance Theorists do well in paying attention to this level, which may require different principles of compositionality from those of Jaszczolt's merger representations.

3.2. Default semantics and Levinson's theory of default implicatures

While there is no doubt that Jaszczolt's theory of default semantics seems (prima facie) to resemble Levinson's (2000) theory of generalized conversational implicatures, in the light of evidence from experimental pragmatics (especially Noveck and Sperber, 2007), we should make some effort to differentiate the two theories. Noveck and Sperber (2007) note that default implicatures à la Levinson presuppose the following theoretical picture:

Generalized implicatures are more parsimonious than totally explicit communication;
Generalized implicatures are more parsimonious than particularized implicatures (in the sense that they are less time-consuming, less effortful);
Generalized implicatures get through in a default context, but can be cancelled in specific contexts;
The contexts where they are cancelled are fewer than those in which they are not.
Noveck and Sperber (2007) consider that even if generalized implicatures were cancelled only in a third of all cases, one would have to sum the effortfulness of these cancellations with the effortfulness of having generalized implicatures. The criticism leveled against Levinson could be leveled against Default Semantics. Thus, we must work out how to answer the objections to Levinson while at the same time differentiating Default Semantics further from the Levinsonian picture.

3.3. Taking experimental pragmatics seriously

The experimental side of Noveck and Sperber's article seems to show 'prima facie' that children are more likely to go for literal interpretations of utterances such as 'There might be a parrot in the box' (which are preferred to interpretations like 'There might but there must not be a parrot in the box'; see also Chierchia et al., 2001 on this issue). The experiment is not much different in its results when children are given explicit instruction concerning the presence of a scalar conversational implicature. Now, Noveck and Sperber agree that these results can be interpreted by saying that all they show is that pragmatic competence in children and in adults are differentiated (does the theory of mind module undergo a paced pattern of evolution?) and, in fact, the temptation to interpret them as giving support at least to a picture of default semantics à la Jaszczolt is strong. If my idea is correct that information is gradually shifted to the default semantics archive, after inferences become routinized or standardized, say after the nth time they are repeated, then the experimental picture so far illustrated need not be inimical to default semantics, as the Default Semantics Archive could be considered a component of the theory of mind which partially emerges with experience.
However, the literature on experimental pragmatics is not unanimous on the idea that children are not pragmatically competent to calculate scalar implicatures. In fact, Papafragou and Tantalou (2004) argue that the methodology of the previous experiments, in which questions were asked of the children concerning the truth of certain statements, was wrong and, thus, use instead a different methodology which does not privilege questions, but exploits informative statements containing quantifiers to assess whether a certain task was completely executed and thus deserved a reward (or not). In the light of the new methodology, children were shown to be competent in calculating scalar implicatures if enough contextual clues were given.

Another important article belonging to experimental pragmatics is the one by Breheny et al. (2005), in which it is shown that contextual clues may lead either to a scalar implicature or to a literal interpretation, respectively in greater or less time. Now, this is certainly an important study of inferential processes, but there are things to be said. If understanding a sentence is a question of connecting together its various parts, it is clear that syntactic complexity should be taken into account. It could be argued that the syntactic complexity of prior segments has an effect on the reading time of the last segment, given that, after all, reading in this experiment meant understanding, and understanding is always a holistic matter.

The problems noted above do not appear to frighten Jaszczolt, who distances herself from Levinson and claims that:

It is much harder to provide experimental evidence for or against salient meaning that are so construed that they draw on some contextual information, arise late in utterance processing, and are not normally cancellable. The latter also seem much more intuitively plausible in that they are nothing less but shortcuts through costly pragmatic inference.
They are just normal, unmarked meanings for the context and it is not improbable that such default, salient interpretations will prove to constitute just the polar end of a scale of degrees of inference rather than have qualitatively different properties from non-default, clearly inference-based interpretations. They will occupy the area towards the 'zero' end of the scale of inference but will not trigger the dichotomy 'default vs. inferential interpretation'. (Jaszczolt, 2006).

4. Further considerations on the interaction between Relevance Theory and Default Semantics

One of the ideas of this article is that inferences that start as pragmatic processes become standardized (to use Bach's term) or routinized. This involves a gradual shift of information from pragmatic processes based on fast and frugal heuristics to a (lexical) archive. This move is not extraneous to Relevance Theory, as is evident from Wilson and Carston's (2007) brilliant paper. In that paper, Wilson and Carston attempt to unify diverse pragmatic processes such as loosening or narrowing of concepts, or metaphorical extensions, hyperbole, etc. They convincingly argue that there is no clear division among them. They can successfully unify such diverse phenomena under the idea that Relevance will furnish 'ad hoc' concepts (an idea based on Barsalou, 1987 and Glucksberg, 2003). In other words, pragmatics will be able to modulate in context the concepts provided by lexical semantics. In that paper, they also argue that "However, some of these pragmatically constructed senses may catch on in the communicative interactions of a few people or a group, and so become regularly and frequently used. In such cases, the pragmatic process of concept construction becomes progressively more routinized, and may ultimately spread through a speech community and stabilize an extra lexical sense" (Wilson and Carston, 2007:15). As Mey says: (. . .) certain apt metaphors (e.g.
'sharp' for 'intelligent'), due to their 'success', obtain near-lexical status, analogous to certain fixed expressions (compare the role of the English modal verbs can and may in indirect speech acts and negation) (Mey, 2004, 113).

Now, it is clear that both Default Semantics and Relevance Theory recognize the importance of routinized inferences (see also Mey, 2004). One may, thus, object to Default Semantics that, if all it does is create an archive for the new senses that seep into the language, it amounts to no more than something like the lexicon as classically conceived. One can reply that, of course, Default Semantics is more than a lexicon. I have already said that it contains some very useful Cognitive Defaults, in addition to principles for the calculation of merger representations. Relevance Theory provides reasoning principles, while Default Semantics provides the mechanism and the algorithm by which compositionality applies to utterances, understood as the product of the interaction of information from various sources. So Default Semantics is a set of cognitive defaults and principles determining merger representations, but it also has an archive of a very specific kind. While the lexicon, as normally understood, contains lexical rules that support monotonic inferences (if every student passed the exam, then some student passed the exam), the Default Semantics archive contains default interpretations, that is, interpretations which can be defeated in a certain context. I would like to make the bolder claim that Default Semantics information is not simply added or defeated in specific contexts, but also interacts with other pieces of information. So, in principle, one cannot rule out that an input that travels first to the Default Semantics sub-module for a rough interpretation is then passed on to the Relevance mechanisms and subjected to further modulation.

Although Default Semantics does not specifically address the issue of metaphors, it is clear that this is the area where the interaction between Default Semantics and the Principle of Relevance is most evident. Consider the cases of incandescent metaphors used to speak about Berlusconi in the recent political debate. Surely, people who utter the following literal translations of Italian sentences, 'one must get rid of Berlusconi', 'one must give a punch to Berlusconi', 'one must stop Berlusconi', etc., are not speaking loosely, but metaphorically. Such metaphors, especially in the Italian versions of these sentences, have become so standardized that one would hardly say that they are incitements to violence, as right-wing politicians immediately argued after the recent physical attack on Berlusconi. However, it is also true that in a discourse where many of these metaphors are used, the conventional metaphors may be transformed into 'ad hoc' concepts again through Relevance heuristics (a deactivated concept can be reactivated).

It is a pity that Jaszczolt does not say much about metaphors, but in principle it is possible to bring under the rubric of Default Semantics many of the data on metaphors discussed by Giora (2003). The main idea is that with (conventional) metaphors the hearer does not access the literal meaning first; rather, these are cases in which literal interpretations take more time than non-literal ones. They are, therefore, ideal cases for Default Semantics. To defend the idea that conventional metaphors are stored within the Default Semantics archive, one needs to argue that conventionalized metaphors are midway between genuinely pragmatic processes and lexical inferences.
A recent debate on the language used by the media and politicians about Berlusconi (prior to a violent attack on him), in which he was described by means of (violent) metaphors, and the charge that such language was intentionally used to create a climate of violent opposition, show that the idea of storing conventionalized metaphors in the Default Semantics archive is not outlandish, since these metaphors still live in some intermediate stage between what counts as langue and language use.

5. Modularity and innateness

I briefly discuss the question whether the Default Semantics archive is completely innate or, instead, develops through exposure to experience. My answer is of a mixed type. On the one hand, heuristics such as the Default De Re Principle seem to be a priori principles (or at least related to innate principles such as the Cognitive Principle of Relevance), hence innately present in the mind. On the other hand, this module grows as a result of exposure to experience. It is similar to the mental lexicon, in the sense that experience leads to a progressive accumulation of information. Of course, some ordering principles must be present in the archive, and these are presumably innate.

The growth of the Default Semantics archive closely resembles the modularization process involved in skills such as driving and reading. According to Barrett (2005) and Karmiloff-Smith (1992), the modularization involved in novel skills such as reading recruits evolved modular capacities such as object recognition. The experience of reading would influence the development of the reading module in such a way that the developed system, as observed in reading adults, appears to contain a specialization for reading. According to Barrett and Kurzban (2006), it is possible that, while the reading module recruited the object recognition module to start with, there was a bifurcation of modular skills during development.
As Barrett and Kurzban say:

In this case novel tasks such as identifying letters or words, would still be treated by the evolved developmental system as a special case (or token) of an evolved skill (object recognition) if they satisfied its input criteria. However, the development system in question could contain a procedure or mechanism that partitioned off certain tasks – shunting them into a dedicated developmental pathway – under certain conditions, for example when the cue structure of repeated instances[2] of the task clustered tightly together, and when it was encountered repeatedly, as when highly practiced (...). (Barrett and Kurzban, 2006:639) (Bold mine)

The considerations above seem to me to be of extraordinary importance. The Default Semantics archive could be seen as originally recruiting information from the operation of the Principle of Relevance, until the tasks it carried out started to be partitioned from the mechanism of Relevance in such a way as to allow for dedicated mechanisms and specializations that ease cognitive load. If the ontogenesis of the Default Semantics sub-module reflects that of the reading module, we expect dissociations between the mechanism of Relevance and the Default Semantics sub-module, as a result of the bifurcation in cognitive specialization that produced the new sub-module. In other words, my expectation is that damage to the Default Semantics archive may leave the operation of the Principle of Relevance untouched, and vice versa. This dissociation is, of course, ultimately very valuable for the system: suppose that the Default Semantics archive, for some reason, ceases to exist or to function properly; then its inferential power will have to be replaced by the less dedicated Relevance heuristics.
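The architecture just described can be made concrete with a toy computational sketch. It is offered only as an illustration under strong simplifying assumptions, not as a model proposed by Jaszczolt or by Relevance theorists. The class name MindReadingModule, the function toy_relevance_inference and the threshold of three uses are all hypothetical, and the sketch deliberately ignores the contextual defeasibility and modulation of defaults discussed in Section 4. It shows only three things: an inference is first computed by a general (Relevance-like) routine, it is shunted into a dedicated archive once it has been used often enough, and the general routine takes over again if the archive is damaged.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class MindReadingModule:
    """Toy model: a default archive backed by a slower, general inference routine."""

    # Hypothetical stand-in for Relevance-guided inference (not from the source).
    relevance_inference: Callable[[str, str], str]
    # Assumed number of uses after which an inference counts as routinized.
    routinization_threshold: int = 3
    # Set to False to simulate damage to the default archive.
    archive_intact: bool = True
    archive: Dict[str, str] = field(default_factory=dict)
    usage_counts: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def interpret(self, utterance: str, context: str) -> Tuple[str, str]:
        """Return (interpretation, route); route is 'default' or 'relevance'."""
        # Shortcut: a stored default bypasses the general routine.
        if self.archive_intact and utterance in self.archive:
            return self.archive[utterance], "default"

        # Otherwise fall back on the general, more costly routine.
        interpretation = self.relevance_inference(utterance, context)

        # Count how often this utterance has received this interpretation.
        key = (utterance, interpretation)
        self.usage_counts[key] = self.usage_counts.get(key, 0) + 1

        # Routinization: a highly practiced inference is stored as a default.
        if self.archive_intact and self.usage_counts[key] >= self.routinization_threshold:
            self.archive[utterance] = interpretation
        return interpretation, "relevance"


def toy_relevance_inference(utterance: str, context: str) -> str:
    """Hypothetical inference: enrich 'some' to 'some but not all'."""
    return utterance.replace("some", "some but not all", 1)


module = MindReadingModule(relevance_inference=toy_relevance_inference)
routes = [module.interpret("some students passed", "exam results")[1] for _ in range(4)]
print(routes)
```

In this sketch the first three calls are handled by the general routine and the fourth is served from the archive. Setting archive_intact to False afterwards sends the same utterance back through the general routine, which is the kind of dissociation envisaged above.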
The considerations by Barrett and Kurzban also support the existence of the Default Semantics sub-module as a separate module, in that the accumulation of Default Semantics information in the dedicated archive is the result of repeated practice. We may suppose that there is a mechanism specifying that, if a certain inferential procedure is routinely used, and its repeated use diverts cognitive resources from the computation of relevance, then the output will be systematically stored in the Default Semantics archive and the corresponding inferential mechanism in the Relevance Theory archive will be inhibited.

[2] Bold mine, in this case.

6. Conclusion

The most important idea of the paper is to link the interaction between Default Semantics and the Principle of Relevance to a modular picture of the mind. In fact, I have sketched with broad brush strokes a picture that integrates Relevance Theory and Default Semantics within the same modular architecture (the mind-reading module). I have also advanced the hypothesis that the Default Semantics heuristics share mechanisms with the Principle of Relevance: they originated ontogenetically by recruiting cognitive operations from that mechanism, and these operations, once routinized, turned into cognitive defaults which then split off from the mechanism of Relevance (the two mechanisms, which originally shared resources, were subsequently partitioned into two separate and dissociable mechanisms).

Acknowledgments

I would like to thank a number of scholars who contributed to my education and encouraged me for many years: Wayne Davis, Tullio De Mauro, Igor Douven, James Higginbotham, Yan Huang, K. Jaszczolt, Istvan Kecskes, Franco Lo Piparo and Jacob L. Mey. All remaining errors are clearly my own.

References

Bach, Kent, 1998. Standardization revisited. In: Kasher, A. (Ed.), Pragmatics: Critical Assessment. Routledge, London.
Bach, Kent, 2001. Semantically speaking. In: Kenesei, I., Harnish, R. (Eds.), Perspectives on Semantics, Pragmatics and Discourse. A Festschrift for Ferenc Kiefer. John Benjamins, Amsterdam, pp. 146–170.
Baron-Cohen, S., Leslie, A., Frith, U., 1985. Does the autistic child have a 'theory of mind'? Cognition 21, 37–46.
Barrett, Clark H., 2005. Enzymatic computation and cognitive modularity. Mind and Language 20/3, 259–287.
Barrett, Clark, Kurzban, Robert, 2006. Modularity in cognition: framing the debate. Psychological Review 113/3, 628–647.
Barsalou, L., 1987. The instability of graded structure in concepts. In: Neisser, U. (Ed.), Concepts and Conceptual Development. CUP, New York, pp. 101–140.
Bloom, P., 2000. How Children Learn the Meanings of Words. MIT Press, Cambridge, MA.
Breheny, Richard, Katsos, Napoleon, Williams, John, 2005. Are generalized scalar implicatures generated by default? An on-line investigation into the role of context in generating pragmatic inferences. Cognition 100, 434–463.
Capone, Alessandro, 2002. Review of Jaszczolt's 'Discourse, beliefs and intentions'. Pragmatics and Cognition 9/2, 354–361.
Capone, Alessandro, 2004. 'I saw you' (towards a theory of pragmemes). RASK 20, 27–44.
Capone, Alessandro, 2010. Between Scylla and Charybdis: the semantics and pragmatics of attitudes de se. Intercultural Pragmatics 7/3, 471–503.
Carston, Robyn, 1996. The architecture of the mind: modularity and modularization. In: Green, D. (Ed.), Cognitive Science: An Introduction. Blackwell, Oxford.
Carston, Robyn, 2002. Thoughts and Utterances. Blackwell, Oxford.
Carston, Robyn, 2005. Relevance Theory, Grice and neo-Griceans: a response to L. Horn. Intercultural Pragmatics 2/3, 303–319.
Carruthers, Peter, 2006a. The Architecture of the Mind. OUP, Oxford.
Carruthers, Peter, 2006b. Simple heuristics meet massive modularity. In: Stainton, R. (Ed.), Contemporary Debates in Cognitive Science. Blackwell, Oxford.
Chierchia, G., Crain, S., Guasti, M.T., Gualmini, A., Meroni, L., 2001. The acquisition of disjunction: evidence for a grammatical view of scalar implicatures. In: Do, H.J., Dominguez, L., Johansen, A. (Eds.), Proceedings of the 25th Boston University Conference on Language Development. Cascadilla Press, Somerville, pp. 157–168.
Chomsky, Noam, 2000. New Horizons in the Study of Language and Mind. CUP, Cambridge.
Dascal, Marcelo, 2003. Interpretation and Understanding. John Benjamins, Amsterdam.
Fodor, Jerry A., 1983. The Modularity of Mind. MIT Press, Cambridge, MA.
Fodor, Jerry A., 2000. The Mind Doesn't Work That Way: The Scope and Limits of Computational Psychology. MIT Press, Cambridge, MA.
Gallistel, C.R., 2002. The principle of adaptive specialization as it applies to learning and memory. Mn.
Gigerenzer, G., Todd, P., The ABC Research Group, 1999. Simple Heuristics That Make us Smart. OUP, Oxford.
Giora, Rachel, 2003. On Our Mind: Salience, Context, and Figurative Language. OUP, Oxford.
Glucksberg, S., 2003. The psycholinguistics of metaphor. Trends in Cognitive Sciences 7, 92–96.
Goffman, E., 1981. Forms of Talk. University of Pennsylvania Press, Philadelphia.
Goldman, Alvin I., 2006. Simulating Minds. OUP, Oxford.
Happé, Francesca, Loth, Eva, 2002. 'Theory of mind' and tracking speakers' intentions. Mind and Language 17, 24–36.
Heim, Irene, 1992. Presupposition projection and the semantics of attitude verbs. Journal of Semantics 9 (3), 183–221.
Horn, Laurence, 2005. Current issues in neo-Gricean pragmatics. Intercultural Pragmatics 2/2, 191–204.
Jaszczolt, K., 1997. The Default De Re Principle for the interpretation of belief utterances. Journal of Pragmatics 28, 315–336.
Jaszczolt, K., 1999. Discourse, Beliefs and Intentions. Elsevier, Oxford.
Jaszczolt, K., 2005. Default Semantics. Foundations of a Compositional Theory of Acts of Communication. OUP, Oxford.
Jaszczolt, K., 2006. Default Semantics. Stanford Encyclopedia of Philosophy.
Jaszczolt, K., 2009. Representing Time. An Essay on Temporality as Modality. OUP, Oxford.
Karmiloff-Smith, Annette, 1992. Beyond Modularity: A Developmental Perspective on Cognitive Science. MIT Press.
Karmiloff-Smith, Annette, 2010. A developmental perspective on modularity. In: Karmiloff-Smith, A. (Ed.), On Thinking. Springer, Berlin.
Levinson, Stephen, 2000. Presumptive Meanings. MIT Press, Cambridge, MA.
Mey, Jacob L., 2001. Pragmatics. An Introduction. Blackwell, Oxford.
Mey, Jacob L., 2004. Review of R. Giora, On our mind: Salience, context, and figurative language. RASK: International Journal of Language and Communication 20, 99–126.
Millikan, Ruth Garrett, 2006. Styles of rationality. In: Hurley, S., Nudds, M. (Eds.), Rational Animals? OUP, Oxford.
Noveck, Ira, Sperber, Dan, 2007. The why and how of experimental pragmatics: The case of 'scalar inferences'. In: Burton-Roberts, N. (Ed.), Advances in Pragmatics. Palgrave.
Paglieri, Fabio, Castelfranchi, Cristiano, 2010. In parsimony we trust: non-cooperative roots of linguistic cooperation. In: Capone, A. (Ed.), Perspectives on Language Use and Pragmatics. Lincom, Muenchen, pp. 99–118.
Papafragou, Anna, Tantalou, Niki, 2004. Children's computation of implicatures. Language Acquisition 12/1, 71–82.
Perkins, Michael, 2007. Pragmatic Impairment. CUP, Cambridge.
Ryle, Gilbert, 1949. The Concept of Mind. Chicago University Press, Chicago.
Sperber, Dan, Wilson, Deirdre, 1986. Relevance, 2nd ed. Blackwell, Oxford.
Sperber, Dan, Wilson, Deirdre, 2002. Pragmatics, modularity and mind-reading. Mind and Language 17, 3–23.
Sperber, Dan, 2005. Modularity and relevance: how can a massively modular mind be flexible and context-sensitive? In: Carruthers, P., Laurence, S., Stich, S. (Eds.), The Innate Mind: Structure and Content. OUP, Oxford.
Wimmer, H., Perner, J., 1983. Beliefs about beliefs. Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition 13, 103–128.
Wilson, Deirdre, Sperber, Dan, 2004. Relevance theory. In: Horn, L.R., Ward, G. (Eds.), The Handbook of Pragmatics. Blackwell, Oxford, pp. 607–632.
Wilson, Deirdre, 2005. New directions for research on pragmatics and modularity. Lingua 115, 1129–1146.
Wilson, Deirdre, Carston, Robyn, 2007. A unitary approach to lexical pragmatics: relevance, inference and 'ad hoc' concepts. In: Burton-Roberts, N. (Ed.), Pragmatics. Palgrave/Macmillan, Edinburgh.
Wittgenstein, Ludwig, 1953. Philosophical Investigations. Blackwell, Oxford.

Alessandro Capone was taught by Yan Huang and James Higginbotham at the University of Oxford, where he obtained a Ph.D. in linguistics. His interests lie in pragmatics and philosophy. He is editing volumes on 'de se' attitudes and on pragmatics and philosophy, to be published by CSLI Stanford and Springer. He edited a special issue of the Journal of Pragmatics on 'Pragmemes' and a volume entitled 'Perspectives on language use and pragmatics. A volume in memory of Sorin Stati'. He has authored several books, among which 'Modal adverbs and discourse', published by ETS, Pisa, in 2001, and 'Dilemmas and excogitations: an essay on modality, clitics and discourse', Messina, Armando Siliano, 2000. He has published various papers and reviews in journals such as Journal of Pragmatics, La Linguistique, Australian Journal of Linguistics, RASK, Lingua e Stile, Oxford University working papers, etc.