Contrasts between the ‘inner’ and the ‘outer’, that which is ‘inside’ and ‘outside’ of us (our minds, our heads), pervade the way we talk about mental phenomena and their relation to the non-mental world. Philosophers like Wittgenstein, Ryle, and Rorty have influentially argued that the uncritical use of such language is responsible for major philosophical confusions about the mind (Wittgenstein 1993; Ryle 1949/2000; Rorty 1979). In this article, we aim to show that ambiguous and misunderstood usage of inner/outer-terminology has had pernicious effects also in a more specialized context, namely, the study of Auditory Verbal Hallucinations (AVHs)—i.e., of episodes in which subjects ‘hear voices’ in the absence of a stimulus. Inner/outer-terminology is one important linguistic resource that voice-hearers, therapists, researchers and teachers use to articulate abnormal and sometimes bizarre experiences. Contrasts between ‘inner’ and ‘outer’ voices—voices that seem to be located inside or outside one’s body, head, or mind—have been used to define subtypes of AVH-like experiences (e.g., Kandinsky 1885; Jaspers 1911/1963, 1912/1963; 1959/1997), diagnostic criteria (in particular, for schizophrenia; e.g., Fish 1962; Sedman 1966; Mellor 1970), items in assessment tools (e.g., World Health Organization 1990a, 1990b; Mental Health Research Institute 1992; Haddock et al. 1999), and variables in empirical studies (e.g., Junginger and Frame 1985; Nayani and David 1996; Copolov et al. 2004; Docherty et al. 2015). 
As we will argue, much of this work is flawed by an ambiguity between different meanings of inner/outer-terminology—in particular, between a literal meaning (do voices seem to come from within or without the physical boundaries of one’s body/head?), and a metaphorical meaning concerning the reality that voices seem to have (do voices appear to really be there in the ‘outer’, mind-independent world, or is it obvious to subjects that the voices are there only ‘in their mind’?). The resulting confusion has deleterious effects in various regards, as we shall argue—including impaired reliability of empirical data, confusion about what exact hypotheses are supported or disproven by empirical findings, as well as potential negative effects on decisions in diagnosis and treatment and on the patient-therapist relation.

To argue for these claims, we proceed as follows. In section 1, we distinguish the readings of inner/outer-terminology that are most relevant in this context. In section 2, we reconstruct an influential tradition in earlier work on hallucinations (including Kandinsky, Jaspers and Schneider) according to which ‘outer’ and ‘inner’ voices belong to two different subtypes of hallucinatory experiences, namely, genuine hallucinations and pseudo-hallucinations. We argue that in the context of this tradition, the inner/outer-contrast is understood in the metaphorical sense of whether voices are experienced as having mind-independent reality or not. The following sections analyze the emergence and effects of ambiguity and confusion concerning ‘inner’ and ‘outer’ voices in AVH research since the 1960s. Section 3 focuses on two cases from the 1960s/1970s debate on diagnostic criteria for schizophrenia. Section 4 examines the item on ‘internal voices’ in the Present State Examination as an important example of conceptual flaws in assessment tools for AVHs. In section 5, we turn to a set of empirical studies that address the diagnostic value of the distinction between ‘inner’ and ‘outer’ voices. Section 6 concludes by stating recommendations for future work in the area.

1 Disambiguating Inner/Outer-Terminology

Consider the following two first-personal reports from an online forum for schizophrenia patients:

Internal [voices] feel absolutely out of my control and say mean things usually, one of the prominent voices is very sinister and mocking, he says he has a silver tongue as well… which is pretty accurate. His location sounds like someone inputted the voice in the back of my head, and outside of my brain. (Cipher 2017)

My voices are internal in the sense that I can hear them as if I were hearing real people whispering to me from different positions around me but they are obviously in my head. (Xade 2017)

Both subjects report ‘internal’ voices that are located ‘in the head’. But context suggests that they mean quite different things by these terms. Cipher’s voice is ‘internal’ and ‘in’ Cipher’s head in the sense that its apparent spatial location (the place where it seems to come from) lies within the spatial boundaries of Cipher’s head, i.e. inside his/her skull. [Footnote 1] By contrast, Xade experiences voices with apparent spatial location outside his/her head and body: they seem to be “real people whispering to me from different positions around me”. When writing that these voices are nevertheless ‘internal’ and ‘in’ his/her head, Xade seems to mean that (s)he is aware of the unreality of the voices: it is “obvious[…]” to Xade that these voices are not really existing, here and now, independently of his/her mind; they exist ‘only’ in his/her mind.

The two reports illustrate the two most important meanings of inner/outer-terminology in the context of AVHs, which can be contrasted as a literal and a metaphorical meaning. Literally understood (cf. Cipher), that something is ‘inner’ or ‘outer’ means that it is located within or without the spatial boundaries of the subject’s body or head. For example, the computer that I see in front of me, the voice that I hear behind me, and the ship that I imagine sailing by on the sea outside of my window, are all represented as being ‘outside’ me, located in ‘outer’ space, in this sense. The sound that I hear in my head when making a noise with my tongue, the food that I feel sitting heavy in my stomach, and the pneumatic hammer I imagine in my skull when I have a strong headache, are all represented as being ‘inside’ me, located in ‘inner’ space, in this sense. By contrast, in the relevant metaphorical sense (cf. Xade), the inner/outer-contrast is used to distinguish that which seems to be part of the real world ‘outside of our minds’ from that which doesn’t seem so. In this metaphorical sense, the computer that I see in front of me and the voice that I hear behind me, but also the sound in my head that I produce with my tongue and the food I feel in my stomach, are ‘outer’, while the ship that I imagine on the sea outside of my window and the pneumatic hammer I imagine in my head fail to seem real and are therefore ‘inner’.

Some instances of inner/outer-terminology are (relatively) unambiguous: for example, talking about a voice ‘in one’s mind’ signals the metaphorical reading, while talking about a voice at a determinate inner or outer location (as in Cipher’s report) indicates the literal reading. Other formulations—e.g., a voice ‘in one’s head’, ‘inner/outer voices’, ‘internal/external voices’—are themselves ambiguous between both meanings. Often the context suffices to disambiguate (as in the case of Xade’s report: the ‘inner’ character of the voices is contrasted with their perceived outer location, implying that ‘inner’ is used metaphorically). But when, for example, a rating scale asks subjects without further comment to agree or disagree with the affirmation “I have never been troubled by voices in my head” (Launay and Slade 1981, 227), it is up to the interviewer and to the interviewee to select the relevant meaning, and confusion and misunderstanding are bound to occur.

Things get further complicated by the fact that the metaphorical reading of inner/outer-terminology itself comes in different variants. To begin with, when Xade writes that his/her voices are “obviously in my head”, this can either be taken as a claim about the phenomenal character of the hallucinatory experience itself, or as a claim about a belief that accompanies (or follows) this experience. To bring out this contrast, consider an analogy. In the Müller-Lyer illusion, the phenomenal character of the subject’s perception is such that the lines look to be of different length. Initially, subjects normally also believe that the lines differ in length. When they learn about the illusion, they abandon this belief, while at the phenomenal level, the illusory effect persists: the lines still look to be of different length. That the phenomenal appearance of length and the subject’s belief about length can come apart in this way shows that they are two distinct elements.

Similarly, mental states can, as part of their phenomenal character, present their objects as real, or they can fail to do so. Normal perceptions as well as realistic hallucinations make it phenomenally seem to the subject that she is directly confronted with objects that exist here and now, ‘out there’ in mind-independent reality. But regardless of whether a given episode does or does not create this appearance, the subject can either believe that the objects represented by the episode are real, or lack such a belief.

The phenomenal character in virtue of which the subject of both genuine perceptions and realistic hallucinations seems to directly confront real objects—the ‘perceptual presentation of reality’, as we will call it—is itself likely to be composed of various different elements, which may come apart from each other (cf. Aggernaes 1972; Farkas 2013). This leads to different intermediate possibilities in between full presence and full absence of perceptual presentation of reality, together with different variants of the inner/outer-contrast. The following two seem particularly important for the case of hallucinatory experiences:

  1. Perceptions and realistic hallucinations make it phenomenally seem to us that the represented objects are part of mind-independent reality, and in this sense ‘outside’ our mind (or ‘head’). Other episodes (e.g. mental imagery) lack this feature, and do not phenomenally suggest more than a mind-dependent reality of the represented objects: a reality ‘inside our mind’ (‘head’) which objects have merely in virtue of being represented by us.

  2. Perceptions and realistic hallucinations provide a full sensory presentation of an object (or a scene) with rich and vivid sensory detail and a certain amount of stability, so much so that they create the impression that the subject is directly confronted with physical objects themselves. Other episodes, such as mental imagery, thoughts and episodic memories, lack this sensory detailedness, vividness and stability, and are instead phenomenally manifest as mental; in this sense they can be characterized as ‘inner’. Consider an episode where I recall a walk on a particular beach some time ago: while this episode does suggest to me some (past) mind-independent reality of the represented objects (and hence is not ‘inner’ in the sense of (1)), it is phenomenally evident to me that the more or less fragmentary and fleeting scenes that show up are themselves mental, ‘inner’ images of the beach, not the real thing.

Existing descriptions suggest that less-than-fully-realistic hallucinatory experiences can take both forms (cf. section 2 below for (1), and Baillarger 1846 for (2) [Footnote 2]; in either case, those hallucinatory experiences differ from normal imagery in virtue of some other elements of the perceptual presentation of reality which are not present in normal imagery, such as richness of sensory detail and/or lack of direct voluntary control in case (1), and lack of direct voluntary control and/or sense of mind-independence in case (2)). [Footnote 3]

However, it is often impossible to decide which of these (or possibly other) variants is being used when inner/outer-terminology is applied metaphorically at the phenomenal level. We will therefore mostly focus on the more coarse-grained threefold disambiguation that has emerged above:

outerLIT vs. innerLIT: experienced as literally located outside vs. inside the boundaries of the subject’s head/body

outerBEL vs. innerBEL: involving vs. lacking belief in reality

outerEXP vs. innerEXP: experienced with full/robust perceptual presentation of reality vs. experienced with reduced/lacking perceptual presentation of reality.

In order to further specify (where relevant) ‘inner/outerEXP’ in terms of the above subdistinction between (1) and (2), we will use the following abbreviations:

outerEXP-1 vs. innerEXP-1: experienced with vs. without sense of mind-independent reality

outerEXP-2 vs. innerEXP-2: direct presentation of physical objects with rich, vivid, stable sensory features vs. experience of mental episode with reduced or absent sensory features.

2 Inner vs. Outer Voices: A Tradition

While inner/outer-terminology was used in taxonomic and diagnostic debates on hallucinations occasionally and informally already in the 1840s (Baillarger 1846, 386–388), taxonomic distinctions between ‘inner’ (‘internal’) and ‘outer’ (‘external’) voices in more recent decades build on a tradition that has its roots in the work of the Russian psychiatrist Victor Kandinsky (1885). In this tradition, ‘inner’ hallucinatory experiences are classified as ‘pseudo-hallucinations’, while only ‘outer’ hallucinatory experiences qualify as genuine hallucinations. Moreover, as we shall see, some authors in this tradition also ascribe diagnostic value to this distinction, arguing that ‘outer’ voices are more characteristic of disease than ‘inner’ voices. (On the history of the distinction between hallucinations and pseudo-hallucinations in general, cf. Spitzer 1987; Berrios 1996; van der Zwaard and Polak 2001; Telles-Correia et al. 2015.)

2.1 Kandinsky

In his 1885 monograph on hallucinations and related phenomena, Kandinsky distinguishes genuine hallucinations from phenomena that he calls ‘proper pseudo-hallucinations’ [Footnote 4], or ‘pseudo-hallucinations’ for short, on the ground that the former, but not the latter, possess a feature that he calls ‘character of objectivity’ (e.g., 1885, 30; sometimes, he also speaks of ‘character of reality’, e.g., 1885, 30, and of the hallucination being ‘palpable’ [leibhaftig], 1885, 135 [Footnote 5]). Hallucinations share this ‘character of objectivity’ with perceptions, while the lack of this feature is common to pseudo-hallucinations, episodic memories and mental images (1885, 53). Otherwise, pseudo-hallucinations are more like perceptions: they have fully detailed, vivid and stable sensory features, and are not directly controlled by the will (1885, 29–30, 42–3).

While Kandinsky does not define what he means by ‘character of objectivity’, he provides some important clues. He illustrates the term by writing that hallucinations have “for consciousness (not for judgment or the intellect) the same validity as [perceptions]” (1885, 135), and explains this ‘validity’ by writing that a proper hallucination “is reality itself for him [sc. the patient]” (1885, 40). So in virtue of their ‘character of objectivity’, perceptions and hallucinations make it seem to the subject that what they represent is real.

However, this point is ambiguous between a reading in terms of what we have called perceptual presentation of reality and a reading in terms of belief in reality, according to our distinction in section 1. Which of these readings is the relevant one for Kandinsky? He points out: “It is of course also possible that the hallucinating person does not regard the phantom as reality; but this will not in the least alter the sensual dimension of the phenomenon” (1885, 53). This is most plausibly read in the following sense: a (proper) hallucination can come with or without belief in reality (“regard the phantom as reality”), but this makes no difference to its phenomenal character (“sensual dimension”). Hence, the ‘character of objectivity’ that Kandinsky considers as characteristic for hallucinations must consist in perceptual presentation of reality, not in belief in reality. Conversely, this means that for Kandinsky, pseudo-hallucinations, mental images and episodic memories have reduced or lacking perceptual presentation of reality.

Kandinsky goes on to argue that not only perceptions and proper hallucinations, but also pseudo-hallucinations, mental images and episodic memories present the experienced objects as located in space (1885, 48). However, imagined, recalled or pseudo-hallucinated objects are (at least normally: 1885, 51n.) not experienced as standing in any determinate spatial relations to the “real [...] objects” (1885, 50) in the environment that we are simultaneously perceiving—they are not experienced as integrated in the “objective” space that we access through perception (cf. 1885, 50). Rather, they are placed in a “subjective space” (1885, 50), an imaginary counterpart of objective space.

It is in this context that Kandinsky employs the terminology of ‘inner’ and ‘outer’. To begin with, he contrasts at one point in this discussion an “outer” and an “inner” “spatiality” (Räumlichkeit) (1885, 50). In a footnote, he comments:

There are [...] two kinds of visual representations: first, primary ones, with the specific character of reality and objectivity, and second, secondary or reproduced ones, which do not have this character. [...] [T]he spatiality of the primary visual image appears in consciousness with the character of objectivity and reality; the spatiality of the memory image remains merely ideal. In this sense, one can speak of outer and inner spatiality [der äussern und innern Räumlichkeit] and, if one likes, of objective and subjective space [...]. (1885, 51n.).

Importantly, Kandinsky does not introduce here any new content over and above the distinction between real space (with full/robust perceptual presentation of reality) and imaginary space (with reduced/absent perceptual presentation of reality): he merely points out that this latter distinction could also be referred to in terms of ‘outer/inner’ or ‘objective/subjective’ space. Of the different meanings of inner/outer-terminology that we distinguished in section 1, the one that is relevant here must therefore be inner/outerEXP, not inner/outerLIT or inner/outerBEL. Indeed, Kandinsky explicitly describes proper auditory hallucinations—and hence experiences in ‘outer’ space—of voices that seem to be located within the subject’s ear (1885, 82). This would be very puzzling if he understood ‘outer’ space in the sense of outerLIT. Moreover, Kandinsky explains that the term ‘inner voice’ is due to the fact that in pseudo-hallucinations, “patients know from immediate feeling that the source of voices lies in their own inner nature [in ihrem eigenen innern Wesen]” (1885, 82)—i.e. such voices are experienced as mind-dependent, as originating and located ‘in the subject’s mind’, and hence, they are innerEXP. This latter passage and Kandinsky’s term ‘character of objectivity’ also suggest that the precise variant of ‘inner/outerEXP’ that Kandinsky employs is ‘inner/outerEXP-1’ (absence/presence of a sense of mind-independent reality). (Notice also that Kandinsky’s pseudo-hallucinations are characterized by full sensory detail and stability (1885, 29–30), so they are not innerEXP-2.)

2.2 Jaspers

Kandinsky’s discussion strongly influenced Karl Jaspers’s views on the matter. Jaspers published detailed discussions of hallucinations and pseudo-hallucinations in two long, undeservedly little-known articles (1911/1963; 1912/1963), and a short account of this distinction in his textbook Allgemeine Psychopathologie (General Psychopathology, first published in 1913; 1959/1997, 68–74). Unlike Kandinsky’s book and Jaspers’s two articles, the textbook was translated into English and had a vast influence. The brief discussion in this book therefore became the main source through which both Jaspers’s own account and Kandinsky’s account of the distinction between hallucinations and pseudo-hallucinations became known to anglophone authors and clinicians.

Jaspers adopts all the main points of Kandinsky’s views on the matter, clarifying and elaborating them in several regards (but see Spitzer 1987 for a discussion of important points that remain unclear in Jaspers’s presentation). In particular, he insists against Goldstein (1908) [Footnote 6] that the ‘character of objectivity’ or ‘palpability’ (Leibhaftigkeit) [Footnote 7] common to proper hallucinations and perceptions is a phenomenal feature of the hallucinatory/perceptual state which can come apart from the subject’s belief about the reality of the experienced object (1911/1963, 192–193), and he puts particular emphasis on the phenomenal distinctness of subjective or ‘inner’ and objective or ‘outer’ space (1911/1963, 206; 1959/1997, 69). Jaspers makes extensive use of the inner/outer-terminology for the two spaces, which Kandinsky had only briefly suggested. Like Kandinsky, Jaspers does not explicitly define this terminology, but his usage makes it possible to reconstruct how he understands it. First, Jaspers uses ‘inner subjective space’ and ‘inner imaginary space’ interchangeably with ‘subjective space’, and ‘outer objective space’ interchangeably with ‘objective space’. This indicates that, like Kandinsky in his footnote, Jaspers uses ‘inner’ and ‘outer’ in this context merely as an additional characterization of the same distinction that is also marked by the terminology of ‘subjective’ vs. ‘objective’ space. Second, in his comparative table of the characteristics of hallucinations and pseudo-hallucinations in General Psychopathology, Jaspers introduces “outer objective space” as the space in which the objects of perception are located, and “inner subjective imaginary space” [innerer subjektiver Vorstellungsraum] as the space in which the objects of imagery are located (1959/1997, 69). This suggests that he is working with the same contrast as Kandinsky, i.e. a contrast between a perceptual space that involves robust perceptual presentation of reality, and an imaginary space that fails to do so (hence, ‘inner/outerEXP’, and more precisely ‘inner/outerEXP-1’, since Jaspers follows Kandinsky in all relevant points). This reading is confirmed by some of Jaspers’s examples: he describes a case of pseudo-hallucinatory ‘inner voices’ that are located in inner (subjective) space but outside the subject’s body (namely, in a remote town) (1911/1963, 211); and he discusses cases of proper AVHs, and hence AVHs in ‘outer objective space’, where subjects hear voices as “localised inside their body [in ihrem Körper lokalisiert], in the body-trunk, head, eyes, etc.” (1959/1997, 74, translation modified [Footnote 8]).

2.3 Diagnostic Value: Bumke and Schneider

Neither Kandinsky nor Jaspers assigns any diagnostic value to the distinction between hallucinations and pseudo-hallucinations (despite frequent claims to the contrary in more recent literature, as we shall see below). Kandinsky emphasizes that his pseudo-hallucinations occur both in healthy and mentally ill subjects (1885, 40); he does not clarify whether the same holds for proper hallucinations, or whether these are (more) specific for disease. Jaspers is entirely silent about any diagnostic relevance of the distinction.

By contrast, several later authors do argue on the basis of their own clinical experience that outerEXP hallucinatory experiences are more characteristic of disease than innerEXP ones. Thus, in a textbook from 1919, the German psychiatrist Oswald Bumke offers a discussion of hallucinatory experiences that is strongly influenced by Jaspers. He adopts Kandinsky’s and Jaspers’s account of the distinction between hallucinations and pseudo-hallucinations and adds, apparently on the basis of his own observations, that hallucinations are very rare in healthy subjects, while pseudo-hallucinations occur both in health and disease (1919, 26–27; the passage he quotes from Kandinsky on p. 27 suggests that Bumke, too, understands the contrast in terms of inner/outerEXP-1).

A similar view can be found in Kurt Schneider’s seminal monograph on schizophrenia (Schneider 1959). The symptoms of schizophrenia as described by Schneider include several forms of AVH-like experiences: audible thoughts, arguing voices, and voices that comment on the subject’s behavior (1959, 93–94). While Schneider does not explicitly distinguish hallucinations from pseudo-hallucinations, he insists that only episodes in which the subject really “sees”, “hears” etc. a non-existing object count as hallucinations, as opposed to less realistic episodes that are qualified by “as if” in subjects’ reports (1959, 92). Given the Jaspersian background of Schneider’s work, this strongly suggests that for him, only genuine AVHs in Jaspers’s sense, i.e. AVHs with full/robust perceptual presentation of reality—and hence outerEXP voices—are strong indicators of schizophrenia (cf. Koehler 1979, 245). This is also the view of Fish (1962) (on the most plausible reading) and Mellor (1970), who both draw on Schneider’s work (cf. next section). (However, it is impossible to decide how exactly Schneider, Fish and Mellor understand ‘inner/outerEXP’.)

So we have seen that in the earlier literature, inner/outer-terminology is granted an important taxonomic role in the context of the distinction between proper hallucinations and pseudo-hallucinations (Kandinsky, Jaspers). Inner/outer-terminology is understood here in the metaphorical sense of lacking vs. involving full/robust perceptual presentation of reality (and more precisely, as absent vs. present sense of mind-independent reality). Some authors who take up this distinction (Bumke, Fish, Mellor, and in an implicit way, Schneider) also assign diagnostic value to it, arguing that proper/outerEXP hallucinations are more indicative of disease than pseudo-/innerEXP hallucinations.

By contrast, the literal location of voices (and of hallucinatory experiences in general) inside or outside the subject’s body or head plays no comparable role in the earlier literature. Where it is mentioned at all (e.g., Jaspers, 1959/1997, 74; Kraepelin 1899, 116–117), it figures merely as one of many different phenomenological variables. Only Bleuler (1911) puts more emphasis on literal location when he reports that his patients typically distinguish two forms of voice-hearing, one where the voices “come from outside just like the natural ones”, and one where the voices are “projected into one’s own body” (1911, 91)—for example, in the subject’s heart, ears, nose, abdomen, genitals, and even in the urine in the bladder (1911, 82). However, neither he nor, as far as we can see, any other authors in the earlier literature ascribe any diagnostic relevance to literal location.

3 Emerging Confusions

When anglophone psychiatrists became aware of an urgent need for better, more reliable and comparable criteria for the diagnosis of schizophrenia in the 1960s, many authors in the resulting debate on the diagnosis of schizophrenia took the ‘first-rank symptoms’ that they extracted from Schneider (1959) as a basis for further discussion (e.g. Fish 1962; Sedman 1966; Mellor 1970; Taylor and Heiser 1971; Koehler 1979). The diagnostic status of AVHs, in particular of arguing and commenting voices, became an important topic in this debate. In addition to Schneider, some authors also drew on Jaspers’s work in this context, including his distinction between hallucinations and pseudo-hallucinations (e.g., Fish 1962; Sedman 1966; Mellor 1970). So the earlier tradition that we reconstructed in the previous subsections became an important resource for the debate on diagnostic criteria for schizophrenia in the 1960s and 1970s.

However, this partial appropriation of earlier work took place against a methodological and conceptual watershed. Short characterizations and definitions that were easy to operationalize for the purposes of diagnosis and the collection of quantifiable data mostly replaced the careful description of a complex experiential reality that one could find in the work of phenomenologically oriented authors like Jaspers and Schneider. A downside of this development was that it favored formulations (including formulations employing inner/outer-terminology) which, lacking sufficient commentary and descriptive context, were ambiguous between different readings.

An important example is found in Frank Fish’s influential 1962 monograph on schizophrenia. (For further examples of ambiguous accounts of the traditional distinction between genuine/outer and pseudo-/inner AVHs, cf. Casey and Kelly 2007, 18–19; Sedman 1966; Koehler 1979; Oulis et al. 1995, 100.) Fish’s monograph played a crucial role in bringing to bear the work of Schneider and Jaspers on the contemporary anglophone debate about the diagnosis of schizophrenia. Fish not only offered one of the first discussions of Schneiderian first-rank symptoms, including commenting and arguing voices (1962, 81), he also followed Jaspers in distinguishing hallucinations and pseudo-hallucinations, claiming that “[t]he pseudo-hallucination is purely of academic interest and has no prognostic or diagnostic value in schizophrenia” (1962, 36); i.e., only genuine hallucinations (when commenting and arguing) indicate schizophrenia. However, Fish’s explanation of the distinction between hallucinations and pseudo-hallucinations is very condensed: “A pseudo-hallucination is one in which the hallucination does not have all the qualities of a percept, so that it is not related to external space in a normal way and it lacks the quality of bodiliness which is normally associated with a percept” (1962, 35–36; “bodiliness” seems to be Fish’s translation for Kandinsky’s and Jaspers’s ‘Leibhaftigkeit’). This explanation is clearly inspired by Jaspers, and a more detailed account in a later work (Fish 1967, 19–20) shows that Fish, like Jaspers, understands the distinction between hallucinations and pseudo-hallucinations in terms of presence vs. absence of full/robust perceptual presentation of reality. But the explanation in the 1962 monograph is so compressed that it is very easily misunderstood. In particular, Fish’s notion of “external space” in the above quote can be misunderstood in the sense of outerLIT space.

It will help to distinguish at this point three different kinds of statements. The first one is ambiguous:

I/O: Outer voices are more indicative of disease than inner voices.

The other two stand for two different ways in which I/O can be disambiguated:

I/OEXP: Voices with full/robust perceptual presentation of reality are more indicative of disease than voices with lacking/reduced perceptual presentation of reality.

I/OLIT: Voices with experienced location literally outside the subject’s head (body) are more indicative of disease than voices with experienced location literally inside the subject’s head (body).

Fish seems to endorse I/OEXP, but since his statement of the view is ambiguous (an instance of I/O), a reader can easily misunderstand it and conclude that Fish’s observations support I/OLIT (at least for the case of arguing and commenting voices).

Exactly this misunderstanding occurred in print in Taylor and Heiser (1971). In what they present as a summary of Schneider’s first-rank symptoms as understood by Fish (1962) (1971, 483), the authors classify ‘complete auditory hallucinations’ (independently of whether they are arguing, commenting or audible thoughts or not) as first-rank symptoms, as opposed to ‘pseudo-hallucinations’, which they treat as second-rank symptoms. “Complete auditory hallucinations”, they explain, are experienced as “clearly audible voices coming from outside the patient’s head”, while auditory “pseudo-hallucinations” are “experienced as coming from inside the head (inner voices)” (1971, 485). The formulation ‘coming from inside/outside the head’ strongly supports a reading in terms of literal location. The example that the authors give for a ‘complete auditory hallucination’ seems to confirm this: “A 20-yr-old in boot camp began hearing several people in his barracks discussing his homosexuality […]” (1971, 485). While Taylor and Heiser formulate their account of first- and second-rank symptoms in relatively clear terms, this account itself rests on a misunderstanding of Fish’s ambiguously worded 1962 discussion of hallucinations vs. pseudo-hallucinations, which they cite as the basis for their account (1971, 483).

Taylor and Heiser are, as far as we can see, the first authors who erroneously claim that I/OLIT is a ‘traditional’ view, a claim that rests on a misunderstanding of inner/outer-terminology. Their claim is repeated by many later authors who, for some reason, cite Jaspers’s General Psychopathology as the source for I/OLIT, even though Jaspers is completely silent about the diagnostic status of the inner/outer-distinction (Docherty et al. 2015, 188; Casey and Kelly 2007, 19; Nayani and David 1996, 185; Blom 2010, 284; there have also been misascriptions to other authors, such as Baillarger, in Junginger and Frame 1985, 149–150, and Hagen and Kandinsky, in van der Zwaard and Polak 2001, 42–43). Correspondingly, one can also often read that Jaspers’s distinction between hallucinations and pseudo-hallucinations is a distinction between outerLIT and innerLIT hallucinations (e.g., Bentall 1990, 82; Looijestijn et al. 2013, 314; Hunter et al. 2003, 16).

So the ambiguity and confusion about inner and outer voices that emerged in the 1960s and 1970s also affected accounts of past work on inner vs. outer voices, leading to gross misrepresentations of the work of authors like Jaspers and Schneider. Ambiguities and misunderstandings of this kind are bound to have negative consequences in more than one way. For one thing, when ambiguous or misrepresented accounts of diagnostic criteria are used in educational and clinical contexts, they can directly lead to wrong or unwarranted diagnoses and choices of treatment. In addition, they can have a detrimental impact on further research, which in its turn can negatively affect diagnosis and treatment. In the next two sections, we examine two cases in point: first, the design of assessment tools for AVHs, and second, empirical studies that tested the hypothesis I/OLIT.

4 Assessment Tools

The development of numerous assessment tools for AVHs since the 1970s—including both questionnaires for self-report and rating scales for structured and semi-structured interviews (for review, see Ratcliff et al. 2011)—was an immediate response to the need for more reliable diagnostic tools, especially for schizophrenia, that was perceived in that period. Not surprisingly, the debate on first-rank symptoms in the 1960s and 1970s strongly influenced many of these tools. Partly as a consequence of this, such tools standardly contain questions about the ‘location’ of voices, referring to voices ‘being inside/outside’ of one’s head or ‘coming from inside/outside’ of one’s head, and similar. In several cases, assessment tools employ formulations that are ambiguous, and not disambiguated by context (e.g., the available options for answers).

For reasons of space, we focus here on one particularly interesting example, the Present State Examination (PSE). This case illustrates how even an assessment tool that is based on a long process of development and displays a high degree of methodological and conceptual reflection (it is even accompanied by a 200-page glossary) can struggle to find a clear language when it comes to the conceptual intricacies of the phenomenology of voice-hearing. The PSE is influenced by the tradition of Schneider’s first-rank symptoms and was used for and improved through the US/UK Diagnostic Study and the International Pilot Study on Schizophrenia. Its most recent version, PSE 10, became the main component of the World Health Organization’s Schedules for Clinical Assessment in Neuropsychiatry (SCAN; World Health Organization 1990a, 1990b). While earlier versions had an item on ‘pseudo-hallucinations’ (Wing et al. 1974, 211), this is replaced in PSE 10 by the item ‘internal hallucinations’ (#17.007). The item reads:

Do you hear them [sc. the voices] [a] in your head or mind, or [b] in your ears, or [c] as though coming from outside?

–Where do they seem to come from?

0 No AH.

1 Predominantly external hallucinations.

2 Partly within head/mind, partly outside.

3 Predominantly inner hallucinations. (World Health Organization 1990a, 202, #17.007)

A first problem with this item is that it is formulated in terms of a threefold distinction—voices in head or mind, voices in the ears, voices coming from outside—while the answer options work with the twofold distinction of “external” vs. “inner hallucinations” (or “within head/mind” vs. “outside”). It is therefore unclear what response the interviewer should choose when a subject reports voices located in the ears.

Next, the formulation of option [a] (and the first half of answer 2) is unfortunate for two reasons. First, it is left open whether the “or” (in answer 2: the slash) is meant to indicate equivalence between “in your head” and “in your mind”, or to signal two different options that are both counted as ‘inner hallucinations’ here. Second, while “in your mind” excludes a reading in terms of literal location, “in your head” is ambiguous: both those who experience innerLIT voices and those who experience innerEXP voices may respond positively to this option.

Option [b], “in your ears”, seems unambiguously to speak about location in the literal sense. But this suggests that the other options, too, should be read as concerned with location in the literal sense, while the formulation “in your mind” excludes such a reading. The question therefore contains a logical inconsistency, in addition to its various ambiguities.

Finally, option [c], “as though coming from outside”, can be read either in terms of literally external location, or of full/robust perceptual presentation of reality.

To what extent does the glossary clarify the meaning of the question? The relevant entry begins as follows:

Inner voices or images, perceived with the vividness and concreteness characteristic of hallucinations but lacking external projection. When asked to describe them, R [the respondent] may localize them as occurring ‘within the mind’ or ‘within the head’ but they cannot readily be provoked or altered at will. The lack of projection into external, objective space is not ‘insight’, which may or may not be present (see item 17.013). (World Health Organization 1990b, 132)

The reference to “insight” seems to exclude a reading of the inner/outer-contrast in terms of belief in reality. But otherwise, this entry does not help to clarify the item from the interview schedule so far: it mostly adds further ambiguous terms—“inner voices”, “external projection”, “projection into external, objective space”. It then goes on to explain:

Confusion should not occur between the space of the body and the mind as they may not overlap and may not be identical. Voices that occur in the mind must be therefore separated from those occurring outside the mind. (World Health Organization 1990b, 132)

Despite the praiseworthy effort at conceptual clarification, even this part of the glossary entry is not free from ambiguity. Is the “space of the body” contrasted with “the mind”, or with “the space […] of the mind”? Does “space of the body” stand for the area of space that is occupied by the body, or for space as such (including both the area occupied by the body and everything around it)? What is the point of the remark that the “space of the body” and “the mind” (or “the space of […] the mind”?) may not overlap and not be identical—that a voice that is described as being ‘in the mind’ is not necessarily experienced as being innerLIT, too? Or that real and imaginary space are distinct, i.e. that no unified experience of both is possible (as was claimed by Kandinsky and Jaspers, whose distinction between ‘outer objective space’ and ‘inner subjective space’ is echoed by the authors’ notion of “projection into external, objective space” in the first part of the glossary entry)?

The only genuine clarification in this glossary entry comes from the sentence “Voices that occur in the mind must be therefore separated from those occurring outside the mind”. For this remark makes it relatively clear that the schedule item about ‘internal hallucinations’ is really meant to deal with a contrast between innerEXP and outerEXP voices (although it remains open whether this contrast is understood in the sense of inner/outerEXP-1, inner/outerEXP-2, or some other way). However, one single sentence in the midst of a glossary entry that abounds with ambiguous terminology is not sufficient to clarify an interview question that allows for plenty of different readings and is even logically inconsistent.

The authors conclude the glossary entry with a terminological remark:

Such symptoms (i.e. voices occurring in the mind) are often termed ‘pseudo-hallucinations’, but this term can be used in other ways. (World Health Organization 1990b, 132–133)

This skepticism about the term ‘pseudo-hallucinations’ reflects the observations of, among others, Hare (1973) and Koehler (1979) about how differently the term is used in contemporary psychiatry. But in response to the problem, the authors merely replace one ambiguous terminology with another.

Given the pivotal role of AVH assessment tools both for clinical purposes and for the collection of data that serve as basis for empirical research on the phenomenology, epidemiology, explanation, treatment and other aspects of voice-hearing, the ambiguities that we have identified are bound to have damaging consequences both in clinical contexts and in research. To begin with, these ambiguities increase the likelihood of false positives and negatives for the item on ‘internal hallucinations’, and therefore reduce the reliability of the tool. If a tool with such ambiguous items is employed in the collection of research data, there is a genuine danger for the construction of diagnostic criteria and subtypes that are mere artifacts, and as a consequence, for clinical decisions about diagnosis, prognosis and treatment that lack empirical foundation and can have a negative impact on patients. There is indeed some evidence suggesting that scale items about the location of voices, and hence items which involve inner/outer-terminology, have low reliability:

A complication in the study of location [sc. of voices] is the doubtful reliability of the judgement. It has been noted (Gelder et al. 1985, 6) that in many cases, the patient is uncertain, and an analysis of clinician interrater reliability (Oulis et al. 1995) found relatively modest agreement, as indicated by a weighted kappa of only 0.38 for location. (Copolov et al. 2004, 5)

While Copolov and colleagues blame patients’ judgment for this (Copolov et al. 2004, 5), we hypothesize that in many cases, patients’ uncertainty and low interrater reliability are really due to unclear formulations in the questions that patients are asked.
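To make the reliability figure in the quotation more concrete, the following is a minimal sketch of how a weighted kappa for two raters can be computed. It assumes linear disagreement weights and ordered rating codes; the weighting scheme actually used by Oulis et al. (1995) is not stated in the passage quoted above, and the ratings below are invented for illustration only. The function name `weighted_kappa` and the example codes (1 = predominantly external, 2 = mixed, 3 = predominantly inner) are hypothetical choices, the latter loosely modelled on the PSE 10 answer options.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa with linear disagreement weights (an assumed scheme).

    rater_a, rater_b: equally long sequences of ratings for the same cases.
    categories: the ordered list of possible rating codes.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of cases")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: cell [i][j] is the share of cases that
    # rater A put in category i and rater B put in category j.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted disagreement, observed and expected by chance.
    d_obs = 0.0
    d_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 for maximal distance
            d_obs += w * observed[i][j]
            d_exp += w * row[i] * col[j]
    if d_exp == 0.0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - d_obs / d_exp


# Invented ratings of four cases by two interviewers (not real data).
LOCATION_CODES = [1, 2, 3]
interviewer_1 = [1, 2, 3, 1]
interviewer_2 = [1, 3, 3, 2]
print(weighted_kappa(interviewer_1, interviewer_2, LOCATION_CODES))
```

On this reading, a value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance, so a reported weighted kappa of 0.38 sits much closer to chance than to perfect agreement. The sketch says nothing about why raters disagree; that question is what the hypothesis about unclear formulations is meant to address.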

Moreover, when PSE 10 is used for clinical purposes, interviewers are supposed to count both ‘external’ and ‘internal’ hallucinations when rating items like commenting and arguing voices (World Health Organization 1990b, 131; cf. Wing et al. 1974, 164). Hence, the authors take a stance in the debate on the definition of first-rank symptoms, counting both arguing and commenting hallucinations with and without full/robust perceptual presentation of reality as first-rank symptoms. A user who misreads their formulations about ‘internal’ hallucinations in terms of literal location will also fail to appreciate this view about first-rank symptoms. As a consequence, the user may not count arguing and commenting voices that lack full/robust perceptual presentation of reality in assessing the relevant items, and hence give diagnoses that are mistaken by the lights of the authors of PSE, and not supported by the clinical experience and evidence that this tool is based on.

Finally, good assessment tools facilitate the communication between patient and therapist and provide both with a shared vocabulary in which the patient can articulate her experiences, and the therapist can talk with her about them. There are several positive effects this can have: it improves the all-important patient-therapist relation, it may improve the patient’s ability to cope with her experiences (Trygstad et al. 2015), and it empowers the patient insofar as it makes it more likely that she is listened to, taken seriously, and believed when it comes to her experiences of voice-hearing (cf. Scrutton 2017). By contrast, conceptual confusions like the ones we found in PSE 10 are likely to reduce if not undermine all these positive effects of assessment tools, and they may even damage the patient-therapist relation, create unnecessary stress for the patient, and disempower her. PSE 10 counts answers only if the patient is “able to understand the questions and clearly describe the relevant experiences without undue prompting” (World Health Organization 1990a, 194). If a patient finds it difficult to respond to an unclear question, this may not only be frustrating for her; it can also have the consequence that she is silenced, in that what she says is not counted as a proper answer and not taken seriously for the purposes of diagnosis and research.

5 Empirical Studies on ‘Internal’ Vs. ‘External Voices’

Over the past 25 years, a large body of empirical studies on AVHs in general, and on the phenomenology of AVHs and related experiences in particular, has been carried out, and many of them use interview items on ‘inner’ vs. ‘outer’ location of voices. Four of these studies (Junginger and Frame 1985; Nayani and David 1996; Copolov et al. 2004; Docherty et al. 2015) directly address the question whether the distinction between ‘inner’ and ‘outer’ voices has any diagnostic value. These four studies share a common structure:

  1. They set out (among other things) to empirically assess the view I/O (‘Outer voices are more indicative of disease than inner voices’), which they find in the Jaspersian tradition (most of them wrongly ascribe the view to Jaspers himself). (Nayani and David 1996, 185: “Jaspers argued that those experiences apprehended in inner subjective space (sometimes called pseudo-hallucinations) are phenomena distinct from the classical, externally projected hallucination, and are less useful diagnostically”; Copolov et al. 2004, 1: “The differential significance of auditory hallucinations (AHs) heard inside or outside the head has a long and confusing history. The first treatment of the issue was by Jaspers who distinguished true hallucinations, heard in external space, from pseudo-hallucinations, which were located internally”; Docherty et al. 2015, 188: “Jaspers, among other early theorists […], viewed internal hallucinations as indicative of less severe pathology than external ones, and this thinking is still reflected in some current conceptualisations of symptom severity […]”. Junginger and Frame 1985, 149 claim: “Traditionally, voices perceived as originating outside of the head have been given greater emphasis in diagnosis”, giving a mistaken second-hand reference to “Baillanger” [sic] as source of the view; it is likely that their real source for the view is some author in the tradition of Jaspers.)

  2. The studies ask their subjects questions about the experienced literal location of voices, inside or outside the head or body. Hence, they test the hypothesis I/OLIT. (The wording used by Junginger and Frame 1985, 155, in one of their questions is admirably clear: in response to the question “What was the location or origin of the voice (or, voices)?”, subjects are supposed to indicate a point on the spectrum between “definitely inside my skull” and “definitely outside my skull”. Copolov et al. 2004 and Docherty et al. 2015 used rating scales with unambiguous questions about the literal location of voices, namely, the Mental Health Research Institute Unusual Perception Schedule (MUPS) and the Psychotic Symptoms Rating Scales (PSYRATS; Haddock et al. 1999). The case of Nayani and David (1996) is interesting, as they use a scale that is an elaboration of PSE 10 (which, remember, contains a highly ambiguous item on ‘Internal Hallucinations’). The authors adopt the terminology of ‘internal’ vs. ‘external’ hallucinations that is suggested in the SCAN glossary, but interpret these terms literally, as is shown by the different parameters they use to determine the precise external or internal location, e.g. “head/chest/abdomen; right/left/centre/back/unsure” for “internal locus”: 1996, 180.)

  3. The studies fail to find significant correlations between literal inner/outer-location, on the one hand, and clinical factors like diagnoses with schizophrenia. Hence, they find I/OLIT disconfirmed by their observations. (Junginger and Frame 1985, 151: “A sign test failed to show that schizophrenics were more likely to perceive the location of their last hallucination as outside the head […] than inside the head […]”; Nayani and David 1996, 185: “Schizophrenic subjects were evenly divided between externalizers […] and internalizers […]”; Copolov et al. 2004, 4: “All associations of location with the principal psychiatric diagnoses of schizophrenia, affective disorder, other non-organic psychoses, and borderline personality disorder were non-significant”; Docherty et al. 2015, 193: “Patients with internal hallucinations did not differ from those with external-only hallucinations in severity of positive, negative, general or overall symptoms”.)

  4. The studies argue (or at least imply) that this shows the ‘traditional’ view I/O to be either wrong or unwarranted. (Junginger and Frame 1985, 153: “There is currently no compelling evidence to warrant the de facto importance of the perceived location of verbal hallucinations”; Copolov et al. 2004, 1: “There appears to be no consistent differential impact and effect of internal and external AHs, and there was no support for the historical view that internal AHs are more benign”; Nayani and David 1996, 185; Docherty et al. 2015, 193 with 188.)

However, the evidence provided by the studies does not establish the conclusion in 4: the relevant reading of I/O that can be found in some authors working in the Jaspersian tradition is I/OEXP, and the findings under 3 do not bear on this latter hypothesis. The studies in question do not themselves put forward ambiguous formulations, but their fallacious argument rests on a misunderstanding of earlier work that is linked to different interpretations of inner/outer-terminology.

The problem with this is not so much the resulting historical inaccuracy but rather the consequence that a hypothesis that seems to have some support from the clinical experience of authors like Bumke, Schneider, Fish and Mellor, namely, I/OEXP, is not recognized as a separate hypothesis that merits empirical assessment: since they confuse this hypothesis with I/OLIT, the authors of the above studies think that they have tested this ‘traditional’ hypothesis while this is not really the case.

The situation is rendered more complicated by the fact that three of the four studies, as well as some other studies, include apparent reality of voices as an independent factor that is assessed and checked for correlations with other factors. So, prima facie, it might be the case that these and other studies actually did test I/OEXP, too.

However, questions about the reality of voices in most of these studies are afflicted by another ambiguity that we addressed, too, in section 1, namely, the ambiguity between perceptual presentation of reality (a part of the phenomenal character of the hallucinatory experience) and belief in reality (a part of the subject’s cognitive assessment of the experience). The case of Junginger and Frame (1985) is particularly instructive in this respect. The authors’ questionnaire contains the following question about apparent reality: “How real was the voice (or were the voices)?”, with possible answers lying on a continuum from “[c]ompletely imaginary” to “very real”. This question is equally open to a reading in terms of the character of the experience and of the subject’s beliefs about the reality of the voices. When they realized this because of comments made by subjects during the interviews, the authors changed the question (halfway through the interviews!) into “How real did the voice seem to you?” (1985, 152), which, however, is still not entirely free from the ambiguity (‘it seems to me that p’ often means as much as ‘I believe that p’). The subjects’ answers about reality yielded a very low score of reliability; the authors think it “appeared inherently difficult for the subjects to estimate” this parameter (1985, 152), but once again we suggest that the ambiguities in the questions that were used provide a more likely explanation. Due to this low reliability, this parameter was excluded from further analysis, and was not tested for correlations with clinical factors.

An ambiguity very similar to the one in Junginger and Frame’s initial formulation is found in several other studies (Lowe 1973; Singh et al. 2003; McCarthy-Jones et al. 2014). Copolov et al. (2004) use the MUPS, whose item on reality displays the same ambiguity (Mental Health Research Institute 1992, 31). Nayani and David (1996) apply Aggernaes’s framework for assessing the sense of reality (Aggernaes 1972), but their comments strongly suggest that they misunderstood this framework as a tool for assessing belief in reality (1996, 184: “Summing the Aggernaes scores provides a measure of reality conviction, the belief that the hallucination has objectivity, sensory qualities and so on”, cf. 187). Other empirical studies of AVH phenomenology are either silent about apparent reality (e.g. Leudar et al. 1997; Stephane et al. 2003; Docherty et al. 2015), or they address only belief in reality (Oulis et al. 2007).

The only findings we were able to identify that really bear on I/OEXP are contained in studies by Ramanathan (1982, 1983), who used the framework from Aggernaes (1972) to interview subjects with AVHs about the experienced reality of the voices and compared the findings with several further factors. The comments on Aggernaes’s criteria in these articles (e.g. Ramanathan 1982, 58–59) show that the author is aware of the crucial distinction between experience and belief. Ramanathan’s studies support I/OEXP, but their evidential weight is limited by the small scale of the samples.

This leads us to the following conclusion: I/OEXP—i.e., the hypothesis that voices with full/robust perceptual presentation of reality are more indicative of disease than other voices—still stands in need of empirical assessment (and of further disambiguation in terms of the distinction between inner/outerEXP-1, inner/outerEXP-2, and possibly further relevant variants). At the same time, current research shows no awareness of this need because it does not recognize I/OEXP as a hypothesis in its own right that is distinct both from I/OLIT and from the hypothesis that presence vs. absence of belief in the reality of experienced voices makes a diagnostic difference. Instead, ambiguous and misunderstood inner/outer-terminology in earlier work and ambiguities between perceptual presentation of reality and belief in reality in more recent studies had the consequence that I/OEXP is confused partly with I/OLIT and partly with hypotheses about the diagnostic status of present or absent belief in reality.

6 Conclusion: Implications for Further Work

A first, obvious lesson to be drawn from our findings is that extreme care is needed when using terminology like ‘inner’ vs. ‘outer’, ‘internal’ vs. ‘external’ etc. when assessing AVHs. We suggest that wherever such terminology is used, it ought to be accompanied by explicit commentary. Where it is used by voice-hearers, interviewers should ask for clarification. Here are some examples of possible questions and commentaries. (The answer options may consist in yes/no, or in a scale with different frequencies. The scale/questionnaire should also specify the relevant meaning of ‘voices’; for example: Many people ‘hear voices’ that are not really there, or that cannot be heard by others even if they are close enough. The following questions are about experiences of ‘voices’ in this sense.)

  • Questions about location, literally understood:

    (a) Do you experience voices that sound as if they come from the space around you, outside your body?

    (b) Do you experience voices that sound as if they come from inside your body?

    Commentary: An example of a voice that sounds as if it comes from outside your body is the voice of a person who stands in front of you and is talking with you. An example of something that sounds as if it comes from inside your body is the sound you hear when you click your tongue, with mouth closed. We are only interested here in the location in space that the voices seem to have, not in whether they sound realistic or not.

  • Questions about perceptual presentation of reality:

    For these questions, we wish you to focus only on how the voices sound and feel, and forget for a moment all the other things that may help you to decide if there is really someone speaking or not (e.g., whether you see someone at the place where the voice seems to come from; whether other people can hear the voice; etc.). Considering only the voices themselves:

    (a) Do you experience voices that feel real, as if there really was someone else saying or communicating or thinking these things right now?

    (b) Do you experience voices with a clear and detailed sound?

    Commentary: We are not interested here in where in space the voices seem to come from. Please answer these questions independently of each other, i.e., answer (a) regardless of whether you experience voices with a clear and detailed sound or not, and answer (b) regardless of whether you experience voices that feel real or not.

  • Questions about belief in reality:

    Do you experience voices even though you believe or know that there is no one really there who is speaking?

    Commentary: We are asking here only about what you believe or know about the voices, not about whether the voice sounds realistic or not.

Second, the hypothesis that voices with full/robust perceptual presentation of reality are more indicative of disease than voices without (i.e., I/OEXP) needs to be further disambiguated and empirically tested in future work.

Third, the confusion that we observed is, arguably, also due to some extent to a lack of sufficiently thorough engagement with historical sources such as Jaspers. If these authors had been read more closely, misunderstandings like those that we detected could have been avoided.

Finally, our discussion in this article was restricted in several ways. Arguably, there are many further cases of conceptual problems besides confusion about the inner/outer-contrast that have led to ambiguities and confusion in assessment tools and empirical studies about AVHs, for example, the contrast between experience and belief more generally, or the contrast between ‘voice’ and ‘thought’. Also, we have focused on one example of an assessment tool (the PSE) where the question on the inner/outer-contrast in AVH is unclear and needs to be improved, but there are other cases as well that require discussion. Lastly, conceptual problems of this kind are not restricted to AVH, but are relevant to other psychopathological phenomena and their assessment and research, too.