1 Introduction

Perspective-taking refers to the ability to recognize another person’s viewpoint and has two types: visual and social. Visual perspective-taking refers to the ability to visualize objects from the vantage point of an imaginary observer who has moved to some other point in space (Huttenlocher and Presson 1973). Social perspective-taking, in contrast, refers to understanding others’ beliefs, feelings, or motivations and has been studied as mindreading, Theory of Mind (ToM), and mentalizing (Conway et al. 2017; Low et al. 2016). Both are essential for daily interactions with others and are pivotal for human development. Recently, a growing number of studies have suggested that even young children show altercentric responses and that adults are deeply influenced by others’ perspectives. I will first present these findings briefly and show that there is insufficient consensus on their interpretation, while giving an overview of the major hypotheses and identifying their characteristics. Next, I describe two points, embodiment and aging, that have not been considered sufficiently in the existing hypotheses. Thereafter, I will show that these hypotheses do not sufficiently answer the questions that arise from these points, and I will present a possible alternative. Finally, the advantages of this alternative hypothesis and the issues that remain to be verified are presented.

2 Two Groundbreaking Findings

The recent debate on perspective-taking has been sparked by two important findings: the understanding of perspectives in infancy and egocentric/altercentric bias. While children were thought to be incapable of mindreading before the age of 4, recent studies with less explicit measures have overturned this assumption. Some studies with implicit, non-verbal measures such as violation of expectation techniques, interactive techniques, and anticipatory-looking techniques have suggested that infants may use implicit perspective-taking. For example, the pioneering study of Onishi and Baillargeon (2005) found, using the violation of expectation paradigm, that 15-month-old infants demonstrated false-belief understanding. Buttelmann et al. (2009) also showed false-belief understanding in infancy through helping tasks. Recently, Király et al. (2018) provided clear evidence that 18-month-olds could track false beliefs based on episodic memories. These results suggest that ToM emerges much earlier than previously assumed. Additionally, Moll and Tomasello (2006) reported that 24-month-old children successfully handed an adult the occluded object in a case wherein one object was visible to all while another was visible only to the child, and Sodian et al. (2007) demonstrated, using a looking-time paradigm, that 14-month-old infants have an appropriate understanding of “seeing.” These reports indicate that even infants can take others’ perspectives.

Another important finding is that others’ perspectives can distort the understanding of one’s own perspective and slow decision-making or, conversely, facilitate perceptual and memory processes (Mattan et al. 2015). This is called the “consistency effect” (Simpson and Todd 2017) or “altercentric bias” (Ferguson et al. 2018), and it occurs automatically even in cases in which the others (avatars) are not directly relevant to the task. However, it has also been reported that when participants believed that another person could see the object, they were slower to judge their own perspective, whereas such an altercentric bias did not occur when they believed that the other could not see the object (Furlanetto et al. 2016). This effect is not obligatorily triggered by the mere presence of others (Cole et al. 2017; Gardner et al. 2018), which indicates that awareness of another mind, drawn from cues in the situation, is essential for the bias to occur (Zwickel and Müller 2010). The opposite effect, well known as “egocentric bias,” refers to a decrease in the accuracy or speed of responses when the self-perspective interferes with taking another’s viewpoint; it stems from egocentrism, which is characterized by a lack of differentiation between the subject and the external world (Piaget and Inhelder 1948). For example, in a false-belief task, adults with knowledge of the object’s true location often overestimate the possibility that a person holding a false belief will search for that object in the correct location (Maehara and Saito 2011). Both altercentric and egocentric biases may therefore emerge from the difficulty of differentiating between the perspective to be taken and irrelevant perspectives.

3 Recent Theoretical Conflicts

3.1 Two-systems Account

To explain the discrepancy between the findings that children younger than four years cannot pass explicit false-belief tasks, even though infants younger than two years pass implicit perspective-taking tasks, two kinds of ideas with somewhat different emphases have been proposed. One of them, the Two-Systems account, posits an early developing system that tracks simple mental states (an implicit system) and a later developing system based on conceptual belief (an explicit system). The former system works in a fast, automatic, and inflexible fashion, whereas the latter operates in a slower, controlled, and flexible fashion (Apperly and Butterfill 2009; Butterfill and Apperly 2013; Low et al. 2016). The implicit system is assumed to be evolutionarily and ontogenetically ancient. It is also shared by infants, children, and adults and is independent of cognitive resources such as working memory. In contrast, the explicit system develops later, is characteristic of older children, and makes substantial demands on executive functions. This means that the systems draw on separate conceptual resources. Because the same system cannot simultaneously be fast and slow, or automatic and effortful (Ferguson et al. 2015), these two distinct systems must be assumed. In fact, no link was found between early mindreading measured at 14 and 18 months with the violation of expectation paradigm and explicit false-belief understanding at five years of age (Poulin-Dubois et al. 2020), which is indirect evidence of the existence of two qualitatively different systems. Furthermore, this accurate, rapid, and unconscious inference regarding their own and others’ perspectives makes it difficult for adults to completely ignore the unrelated perspective, so that they experience egocentric or altercentric interference, especially when two viewpoints are in conflict (Samson et al. 2010).
Another kind of idea comes from nativists, who argue that children already have perspective understanding when they pass an implicit false-belief task, while they may fail an explicit one because of extra task demands (Baillargeon et al. 2010). They emphasize an implicit mechanism of mindreading; Scholl and Leslie (2001) expressed it as an “innate modular basis for the ToM” and Carruthers (2013) called it “basic mindreading capacities.” However, both Two-Systems proponents and nativists agree that infants acquire implicit perspective-taking functions early in their development.

3.2 Criticism from Submentalizing Accounts

There are objections to the claim that real perspective-taking exists in infants. On the early existence of ToM in infancy, for example, Dörrenberg et al. (2018) claimed that the non-verbal methods frequently used to measure implicit ToM (anticipatory looking, looking times, pupil dilation in violation of expectation paradigms, and spontaneous communicative interaction) were not reliable. Furthermore, Barone et al. (2019) performed a meta-analysis of the empirical evidence on spontaneous false-belief understanding in infants younger than two years and pointed out the dependence of the findings on the type of experimental paradigm and the suspicion of publication bias. Several other studies have questioned the replicability and robustness of spontaneous false-belief understanding in infancy (Cole et al. 2016; Crivello and Poulin-Dubois 2018; Edwards and Low 2019; Kulke et al. 2018a, b; Kulke et al. 2019a, b). Whether infants indeed possess what can be regarded as true implicit ToM is far from certain.

Even if we accept the implicit system, its identity is debatable (Kuhn et al. 2018). For example, Santiesteban et al. (2014) disputed the interpretation of the consistency effect as the main evidence of implicit mentalizing and showed that it resulted from domain-general processes such as attentional orienting. More recently, Gardner et al. (2018) indicated that reflexive attention orienting may explain findings that have been attributed to an implicit perspective-taking process, and von Salm-Hoogstraeten et al. (2020) emphasized referential coding for acting on the basis of others’ perspectives. These ideas, that domain-general information processing can provide fast and efficient substitutes for implicit mentalizing, are called “submentalizing” (Heyes 2014) or “minimalist” accounts (Ruffman 2014).

3.3 What’s Causing the Confusion?

Which account is appropriate remains controversial (Kulke et al. 2019a, b). This may be because the different positions view the same phenomena in different ways or discuss different phenomena that they nevertheless believe to be the same (Cole and Millett 2019). In fact, both sides explain the same phenomenon in different ways. Qureshi et al. (2020) view the implicit system, which is the crux of the Two-Systems accounts, as a complex construct that includes the inhibitory control function (Frick and Baumeler 2017), with which we can account for individual differences. Conversely, Baillargeon et al. (2018) concluded that the failures to replicate implicit understanding in infants often reported by submentalizing supporters can be explained by procedural differences between studies. They warned against jumping to conclusions because the possibility of mindreading in infancy, on which the Two-Systems accounts are based, remains. It is true that direct comparisons are made difficult by confusion arising either from different theoretical standpoints or from experimental tasks that differ in subtle ways (O’Grady et al. 2020). In this respect, the debate is similar to past controversies on “imagery” (Cole et al. 2020).

To overcome these problems, a unique idea has been proposed in recent years to reinterpret the implicit system and to show the inevitability of two different ways of perspective-taking. Southgate (2020) insisted that humans can take others’ perspectives without mature executive functions (implicit perspective-taking) only if their cognitive mechanism places attention on others with the simultaneous absence of a competing self-perspective; on the other hand, an intentional (explicit) perspective-taking system accompanying executive functions is required after the emergence of competition between self and others’ perspectives at age two (the altercentric hypothesis). Following this hypothesis, what makes it possible for infants to take other perspectives easily and spontaneously comes from the altercentric bias and not the early acquired perspective-taking system that is assumed in the Two-Systems accounts. The bias leads infants to understand others’ feelings or beliefs as self-evident only in unseparated situations of self and others, while the Two-Systems accounts explain that the once acquired implicit system can work in conjunction with the explicit system even in situations where self and others exist separately (e.g., in adults). Altercentricism in infants is not acquired by overcoming egocentrism (Piaget 1959) in their development; on the contrary, infants have altercentric tendencies from the onset, which gives them an advantage in understanding others. However, even this hypothesis has some weaknesses. After all, existing theories, including the altercentric hypothesis, do not adequately consider some of the important features of perspective-taking, which will be discussed below. We are now at a stage when we must stop and reconsider to determine if there is anything that has been overlooked. To determine a more appropriate hypothesis, I present two crucial points to be considered.

4 Two Crucial Questions

4.1 Embodiment

Perspective-taking is considered the psychological counterpart of the idiom “putting oneself in someone else’s shoes.” In spatial perspective-taking, it is a mental simulation of the physical actions required to adjust to another person’s perspective (Erle and Topolinski 2017). Fischer and Demiris (2020) showed that this metaphor is psychologically real by using computational models to reproduce three types of response time differences: those related to the angular disparity between a participant and avatar, to body posture variations, and to perspective-taking strategy differences. Such operation of an embodied self-image is evident in children aged 4–12 (Hirai et al. 2020), students aged 18–24, and elderly people aged 63–76 (Watanabe 2011), and also in the inference of others’ beliefs in ToM (Xie et al. 2018). Additionally, the skill of balancing on one leg, as an index of motor control, is significantly related to children’s spatial performance (Frick and Möhring 2016) and to the spatial perspective-taking of young and old people (Watanabe 2018). All these findings support the idea of embodied perspective-taking (Kessler and Thomson 2010).

Even though it is evident that perspective-taking is an embodied process, important things about its features have not yet been fully clarified. What part of the brain is responsible for the embodiment of perspective-taking, and how does it work? How does the embodied self in perspective-taking change throughout life? For the first question, Wang et al. (2016) used magnetoencephalography (MEG) and found that the mental operation of body schema results from a distributed network comprising body schema, somatosensory, and motor-related regions centered at the right Temporo-Parietal Junction (rTPJ). As reported in their research, deep participation of rTPJ in both social perspective-taking (ToM) (Kobayashi et al. 2007) and visual perspective-taking (Nijhof et al. 2018) has already been demonstrated. However, it is still not fully understood how embodied self-images are generated and controlled in the brain. To answer the second question, Watanabe (2016) showed that the effect of increased somatosensory stimulation on the performance of the perspective-taking task is the biggest in students, followed by young children and the older adults. However, there are no findings about why the embodied self changes its impact on perspective-taking during lifelong development. The role of embodiment in perspective-taking has not yet been sufficiently explored, even though it can be the key to innovations in future research. Therefore, it is important for an ideal hypothesis to adequately answer the question of why perspective-taking is characterized by embodiment.

4.2 Aging

Although many developmental studies have involved participants from infancy to young adulthood (Meinhardt-Injac et al. 2020; Symeonidou et al. 2016; Warnell and Redcay 2019), there are only a few reports on the changes in perspective-taking occurring from mid-adulthood to old age. One example of such aging research suggests that even older adults may show excellent performance in perspective-taking (Happé et al. 1998). De Beni et al. (2007) found that older adults are inferior to students in mental rotation but superior in visual perspective-taking. Watanabe and Takamatsu (2014) used an original video game task to measure the implicit visual perspective-taking ability in eight age groups ranging from younger to older adults with a 10-year range. They also found that the oldest group performed similarly to the student group in taking perspectives, while the oldest group was inferior to the student group in spatial information processing. Additionally, it is known that, compared to younger adults, older adults experience difficulties in situations that require explicit perspective-taking. Grainger et al. (2018), for example, measured the eye movement patterns of older and younger adults in both explicit and implicit false-belief tasks and found that the two age groups did not differ in implicit false-belief processing, while older adults were inferior to younger ones in explicit ToM. Thus, it has been suggested that implicit false-belief understanding is maintained relatively well even in old age. For a more reasonable understanding of the features of lifelong development and aging in perspective-taking, we must answer the question of how to explain the robustness of implicit perspective-taking in older adults.

4.3 The Adequacy of the Theories

The Two-Systems accounts cannot fully account for the reason why an embodied self-image is deeply relevant to implicit and explicit perspective-taking. Ward et al. (2019) reported that, in implicit spatial perspective-taking, the well-known positive linear relationship between response times and angle of orientation was replicated, and that recognition becomes slower the more the participants must mentally rotate their perspectives to the canonical orientation of the item; however, this is the only indication that the implicit system may include an embodied self-image. Similarly, the involvement of embodiment in the innate mindreading module, from which the nativists claim that implicit perspective understanding emerges, has not been fully examined. The Two-Systems account is also unexpectedly weak for developmental interpretation, although it assumes systems with different emergence periods. There is insufficient explanation of the mechanism behind why the implicit system appears in infancy and is maintained thereafter. The explicit system is also assumed to be the conventional perspective-taking that has been thoroughly studied, but the timing of its appearance and the effects of aging have not been fully examined to see whether they meet the predictions derived from the hypothesis. Additionally, evidence of the distinct systems has been gathered from two main approaches: research that finds its germ in infancy and research on egocentric/altercentric bias in adults. Nonetheless, attempts to apply the suggestions from infant research to adults and vice versa are insufficient, which leads to a danger of over-applying age-specific characteristics to other age groups without sufficient evidence. There has been little evidence derived from applying the same measurement method to compare two or more different age groups, especially younger children and older adults, in the Two-Systems framework.
In fact, the balance between the two functions in a visual perspective-taking task, operating perspectives and processing necessary spatial information, differs between infants and older adults, even when they produce the same number of correct answers (Watanabe and Takamatsu 2014), which suggests that the age groups may use different strategies.

So, can the submentalizing approach adequately explain developmental changes? The submentalizing accounts could be more adequate than the Two-Systems account if limited to adult ages; for example, it would be relatively easy to explain age-related decline, such as lower scores in older adults than in younger people, by declines in some executive functions. However, how can these hypotheses explain why perspective-taking, in practice, remains relatively well maintained while most cognitive functions generally decline with age? One possible explanation is that only the relevant functions are maintained at a high level, whereas the others decline with aging. For example, Long et al. (2018) examined the developmental relationship between perspective-taking and executive functions in two age groups, younger and older adults. They found that both age groups maintained their perspective-taking ability similarly; however, the older group’s scores in perspective-taking tasks were best captured by switching performance, whereas the younger group’s scores were predicted by inhibitory performance. As these results show, perspective-taking ability may not be significantly compromised because cognitive functions do not decline uniformly with aging (Hasan et al. 2011). However, in this case, it remains to be explained why the specific functions that help to maintain perspective-taking happen to be the ones that survive. Alternatively, we can assume that compensation between functions, which is often observed in older adults, maintains their performance, considering the finding that the executive functions that underlie perspective-taking ability differ between age groups.
Decreases in cognitive ability during old age in one hemisphere could be compensated for by other executive functions in the other hemisphere, assuming that the laterality in the prefrontal cortex, which is present at younger ages, decreases with age (Cabeza 2002). Nevertheless, it would be difficult to explain the superior understanding of others in the older adults with only compensation between executive functions because of an actual decline in many executive functions related to perspective-taking with aging (Treitz et al. 2007). In fact, there have been no successful reports of this description, including by nativists, who usually explain the development in terms of some changes in executive functions. Furthermore, this approach of submentalizing will be harder to apply to infants because it is difficult to capture their abilities as accurately and stably as adults’. Moreover, this account seems to pay little attention to the embodied features of perspective-taking.

In contrast, it seems that the altercentric hypothesis has more advantages than the Two-Systems or submentalizing accounts for describing the developmental data in mindreading. It considers innate altercentric tendencies as a source of implicit perspective-taking and explains the emergence of the explicit one as an outcome of the differentiation between self and others. In this sense, the altercentric hypothesis provides a good idea in which perspective-taking in children and infants can be explained in a different manner from the existing theories. Also, in terms of aging, its general assumption that a balance of memory traces between the self- and other-based representations determines which perspective to take might make it possible to better describe a sequential mechanism from infancy to old age. However, neither the details of the mechanism nor any demonstrable data have been presented. In this sense, the altercentric hypothesis is inadequate to describe the phenomenon of aging at this point.

Even if we accept the altercentric hypothesis as somewhat better than the Two-Systems or submentalizing accounts in describing the development of perspective-taking, there are still issues to be addressed. The hypothesis does not sufficiently consider embodiment in perspective-taking, even though operating the embodied self-image is a central mechanism not only in visual perspective-taking but also in ToM. This shortcoming may come from the complete reliance of the switching mechanism between self and other perspectives on the balance of memory traces as concepts of perception or cognition (de Guzman et al. 2016). It is necessary and natural to include the embodied self in the process of distinguishing between self and others, because the sense that the entity performing the action is “me” is brought about not only by conceptual representations but also by sensory-motor information at a basal level, such as proprioception and visual feedback (Synofzik et al. 2008). The following section introduces a new idea to overcome these shortcomings.

5 Self–other Distinction for Perspective-Taking

5.1 Loosening Phenomenon

The self–other distinction plays an important role in understanding others’ mental states developmentally and neurocognitively (Steinbeis 2016). Although the nature of the developmental process of self–other distinction throughout infancy remains debated, there could be a consensus that the constant and repeated inputs of somatosensory information play an important role in the formation of a sense of self (Proulx et al. 2016) and that the true altercentricity is acquired in the first two years. Such subjective experiences through which people perceive their bodies foster feelings of self-ownership and self-agency (Gallagher 2000). A sense of self-ownership is the perception that the observed object belongs to one’s own body, and a sense of self-agency means perceiving the motion of the observed object as being caused by oneself. The first object that is perceived with a sense of self-ownership in infancy is one’s own body (Montirosso and McGlone 2020); this is because muscle tonus, which is involved in the posture maintenance mechanism and body temperature regulation mechanism, constantly provides passive resistance to muscle stretch, or interoception about the tension that the muscle makes (Palmer and Tsakiris 2018). The processing and awareness of bodily signals arising from visceral organs stabilizes the mental existence of oneself as distinct from others, and the sense of self-ownership is established resulting from that coherence of bodily self-awareness. Simultaneously, infants rely on multisensory processing of concurrent visual and tactile inputs to perceive the sense of self-agency in their physical and social surroundings (Lewkowicz and Bremner 2020). It is thought that parental mind-mindedness (Meins et al. 2002) and embodied simulation (Gallese 2009) also foster a sense of self-agency as a subject who works and is worked on. 
As a result, infants aged over 9 months become aware of the distinction between self and others; for example, they can be aware of others’ behavior, imitating their movements; and then they perceive their bodies as containers that are clearly separated from the outside and in which they exist as agents in the first year (Tomasello 1993).

The body senses bundled through these processes, which generate self-ownership and self-agency, enhance the self–other distinction and suppress the innate altercentric tendency. This ought to interfere with taking other perspectives through an altercentric bias in late infancy. However, the bundled body senses are loosened easily in some cases in young children. Hartmann (1991) proposed a personality dimension of thin vs. thick ego boundaries and considered that the nature of boundaries may vary depending on the situation and that thin boundaries may be associated with psychoticism or artistic creativity. Especially in young children, the shackles of the physical self often seem to loosen easily, as evidenced by their thin ego boundaries. This loosening phenomenon of the self–other distinction can be considered to allow the resurgence of altercentric bias and to give the infant access to the perceptions and beliefs of others again (Fig. 1). Following this assumption, among the previous reports on infants’ competency in mentalizing, some can be interpreted as examples of the emergence of altercentric bias before self–other distinction at eight months (Choi et al. 2018; Kampis et al. 2015; Kovács et al. 2010; Luo and Johnson 2009; Southgate and Vernetti 2014), while the others, with infants older than nine months, may reflect the reappearance of the altercentric tendency owing to the loosened physical self in specific task situations (Luo and Baillargeon 2007; Onishi and Baillargeon 2005; Surian and Geraci 2012; Thoermer et al. 2012). In fact, one of the former studies (Kovács et al. 2010) reported that 7-month-old infants automatically pay attention to others’ beliefs and their own beliefs similarly, whereas one of the latter (Luo and Baillargeon 2007) found that 12.5-month-old infants can suppress their own more complete representation to use the agent’s representation for interpreting others’ actions.
Thus, the loosening phenomenon makes it possible to implicitly take others’ perspectives, even after the establishment of a self–other distinction. However, this phenomenon is not the same as the implicit system of perspective-taking in the Two-Systems account because evolutionarily it is not acquired for the purpose of perspective-taking; conversely, implicit perspective-taking may be a byproduct of this function. Indeed, an overreliance on fast, automatic, biased mentalizing and, as a result, the conflation of the mental states of self and others is often observed in people with borderline personality disorder who are characterized by thin ego boundaries (Luyten et al. 2021).

Fig. 1 Lifelong development of the loosening phenomena and the detachment function in self–other distinction

Here, we must pay attention to the fact that behaviors such as perspective-taking by the altercentric bias before self–other distinction are not true perspective-taking. Perspective-taking needs an imaginary observer and the other to be observed, which is achieved after self–other distinction. Joint actions such as affective entrainment, which can be seen in the undifferentiated state of self and others in younger infants, could also not be considered as a perspective-taking behavior, because it is not accompanied by a sense of self-agency. Although human infants can engage in rich social interaction shortly after birth because of their altercentric nature, the state of coexistence with others during this period should not be considered true perspective-taking.

It is assumed that the loosening phenomenon appears in our embodied self throughout life, even after the self–other distinction in late infancy. Therefore, this phenomenon can describe not only the mechanism of implicit perspective-taking but also the egocentric/altercentric biases in lifelong development, including old age. The egocentric bias, in which the self-perspective interferes with the other perspectives to be taken, is caused by the difficulty of abandoning the self-consciousness that originates from somatosensory information once the self and others have become distinct. This indicates that somatosensory information is more or less mixed into altercentric judgments even in adulthood. In fact, sensorimotor interference (Riecke et al. 2007), in which characteristics such as the limits of body motion or the direction of movement appear in the reaction times and correct response rates in body representational operations, has also been observed in adults. In contrast, altercentric bias, in which we are influenced by information about other perspectives even though our own perspective is the one required, arises because it is difficult to completely separate ourselves from others and close ourselves off from the outside world. The rubber hand illusion (Botvinick and Cohen 1998), the feeling of ownership of a rubber hand that is displaced from an occluded real hand while both hands are stroked synchronously, is a good example of this. The physical self constantly perceives information from the outside throughout life. Therefore, the altercentric bias has often been observed in both young children and adult participants.

5.2 Detachment Function

There is also something inadequate about explicit perspective-taking in the altercentric hypothesis. It is not suitable to treat the contrast between self and other representations as the beginning of explicit perspective-taking, because other representations do not always mean the perspective of a specific other. Rather, it is suitable to define another representation as one other than the self-body schema. In fact, healthy people experience difficulty in simultaneously imagining both a self-representation and their self-body schema, while those with hallucinatory symptoms unintentionally have the experience of viewing their own body from another self, which indicates confusion between self-representation and the real body (Blanke and Arzy 2005). In this sense, explicit perspective-taking could be defined as a process of separating self-representation from the self-body schema. More attention should be paid to the resistance in trying to “detach” the self-representation from the self-body schema, which occurs when some characteristics of one’s own body are detached from inherent experiences and behaviors to externalize them as self-representations (Wallon 1934), than to the conflicts between self- and other-based representations. This detachment function means dividing the self-body schema from the external world in a state where self and others are distinct, under the control of the antecedent loosening phenomenon. However, once the self–other distinction is loosened, the innate altercentric tendency should reappear as altercentric bias, even in adults. Thus, this hypothesis of loosening and detachment in the process of objectifying the self-body could describe both the lifelong appearance of spontaneous or intentional perspective-taking and the mechanism of frequent shifting between egocentric and altercentric perspectives as byproducts of such functions.

It should be noted that this developmental sequence of loosening phenomena and detachment function does not correspond to the well-known developmental stages—Level 1 or virtual perspective-taking of what someone else sees and Level 2, or identifying how someone else sees (Flavell et al. 1981). Level 1 refers to the process of movement of one’s own perspective to other positions, and Level 2 includes information processing in addition to the process of Level 1 to derive others’ beliefs or views. Loosening and detachment are included in both Levels 1 and 2 to a varying degree. My hypothesis focuses on the division process of perspectives, which is common to these two levels and is most crucial in perspective-taking activities; on the contrary, Two-Systems and submentalizing accounts intend to describe all the processes, including actual reactions in perspective-taking activities. Even considering the insufficiency of my hypothesis in theoretical inclusiveness, it is worthy of consideration as follows.

5.3 Advantages and Issues

In this paper, it is argued that the mechanism of perspective-taking can be redefined from that inherent to humans through the well-known process of self–other distinction without resorting to either the assumption of novel systems or an excessive reduction of executive functions. Therefore, it was hypothesized that implicit perspective-taking by altercentric tendency in infancy lasts throughout life because of the loosening phenomenon, even after self–other distinction, and that the self-representation is separated from the self-body image with the detachment function to be used for explicitly taking perspectives. This new hypothesis compensates for the features of embodiment and aging in perspective-taking that are lacking in the altercentric hypothesis. It can also explain a series of self–other distinction processes in both the decline of explicit perspective-taking performance and the robustness of implicit perspective-taking in older adults, as well as the appearance of egocentric/altercentric bias in adults. However, evidence from empirical data is not yet sufficient. In particular, compared with perspective-taking in children, fewer studies have examined perspective-taking in aging. Nevertheless, this hypothesis may be more useful than other explanations in that it provides a consistent explanation for the findings already available and allows us to recognize perspective-taking development across the lifespan.

Although we have been looking for a perspective-taking module for a long time, this may have been a false assumption from the beginning. Humans have acquired the ability to objectify themselves through evolution, which may make it possible to take other perspectives as a second-order effect. In fact, it has been reported that some kinds of apes exhibit mirror self-recognition responses very similar to those of human infants (Inoue-Nakamura 2001), and they also seem to be capable of implicitly taking perspectives (Buttelmann et al. 2017; Krupenye et al. 2016). However, the differences between humans and apes are also important. For example, Tomasello (2018) introduced the concept of shared intentionality to explain why only older children pass some kinds of false-belief tasks, whereas both infants and apes pass others. This shared intentionality of coordinating self and other perspectives for understanding other people as agents may facilitate the detachment process to explicitly take others’ perspectives. In short, it seems that the loosening phenomenon works for understanding others’ intentions in early life, with such joint actions being common to both humans and apes; on the other hand, the detachment function has a core role in intentional mindreading as a human-specific activity. Progress in comparative psychology may lead us to verify these anticipations in the near future.

Additionally, the central role of the embodied self in this hypothesis is consistent with the fact that the rTPJ is deeply involved in perspective-taking. For example, Blanke and Arzy (2005) and Blanke and Mohr (2005) conducted a meta-analysis of previous studies that reported on patients who exhibited hallucinatory symptoms such as autoscopy and out-of-body experiences, which are abnormal phenomena in which the world is perceived from a position outside the physical body; they found that the rTPJ is generally involved in these kinds of experiences. Blanke et al. (2005) and Arzy et al. (2006a, b) similarly pointed out the involvement of the rTPJ by examining brain activity in autoscopy patients. It is also known that stimulation of the angular gyrus, a part of the rTPJ, can produce an out-of-body experience (Arzy et al. 2006a, b). Considering the intersectional role of the rTPJ in tactile, auditory, and visual processing, as well as the suggestion that out-of-body experiences are somatosensory disruptions (Braithwaite and Dent 2011), it can be inferred that some processing of the embodied self-image takes place in the rTPJ. It is expected that compatibility with brain science will help validate this hypothesis (Luyten et al. 2021).

So, can it be stated that this hypothesis has as much or more explanatory power than other theories? The Two-Systems account, in which the features of the systems are defined according to the existing experimental results, and the submentalizing accounts, in which some general functions can be freely selected to explain the experimental results, are strong in how adequately they describe research findings. However, my hypothesis can also describe them as reasonably as the existing theories without resorting to either the assumption of controversial systems or an excessive reduction in executive functions. For example, it could provide a satisfactory explanation for the difference between implicit and explicit perspective-taking. Grosse Wiesmann et al. (2020) found that children aged three years were worse in the explicit false-belief tasks than those aged four years and above, whereas there was no significant difference between these groups in the implicit false-belief tasks. Additionally, they reported that explicit false-belief tasks correlate with linguistic abilities, whereas implicit false-belief tasks do not. These facts could be described as follows: the loosening phenomenon, which an individual is already equipped with at three years and is driven without verbalizing, made all the children pass the implicit false-belief tasks, whereas only the four-year-old group passed the explicit false-belief tasks, in which the detachment function is needed to induce a self-representation for intentional perspective-taking. Additionally, relatively poor performance on an explicit false-belief task compared to an implicit one in older adults (Grainger et al. 2018) could be explained by assuming that the deterioration of operating self-representations because of the decline of cognitive functions and motor organs in normal aging disturbs the detachment function, whereas the physical self, which concerns the loosening phenomenon, is more robust against aging. Whether such explanations are overwhelmingly more effective than those from the Two-Systems account or submentalizing accounts remains to be seen. More research is needed in the fields of self-psychology, developmental psychology, and brain science to elucidate the features of the loosening phenomenon and detachment function and to show the validity of the hypothesis, with careful attention to not adding excess assumptions to try to reconcile the results.

First, the existence of the loosening phenomenon and detachment function must be demonstrated with concrete evidence. In particular, the loosening phenomenon is a new concept that has not been examined before, and it also seems to be difficult to define operationally; therefore, difficulties in demonstrating it are expected. The reaction time independent of the distance between self and other perspectives, which Watanabe (2016) and Watanabe and Takamatsu (2014) successfully separated, could be a good clue for the loosening phenomenon according to their presumption that it is an essential process of their implicit perspective-taking tasks. Alternatively, the surprising finding that sensorimotor stimulation of participants’ bodies enhanced their performance in an implicit perspective-taking task (Watanabe 2016) could indicate that the increase in attention to one’s own body helps trigger the loosening phenomenon. Moreover, if the assumption of the loosening phenomenon is correct, some qualitative changes in the performance of implicit perspective-taking are expected around nine months of age, when the embodied self is thought to be established. In addition to the verification of this prediction, the existence of the loosening phenomenon and detachment function must be clarified.

Second, the recently reported effects of empathy and emotions on perspective-taking should be incorporated into the hypotheses. For example, Todd and Simpson (2016) reported that anxiety disrupts spontaneous visual perspective-taking. Bukowski and Samson (2016) also found that guilt made participants more altercentric while anger made them more egocentric in Level 1 visual perspective-taking tasks. It should be noted that these effects were found in implicit perspective-taking, in which the loosening phenomenon would be strongly involved. Judging from the fact that the development of emotional empathy is rooted in early infancy and supports the ability to differentiate between self and others in daily empathic experiences (Tousignant et al. 2017), and that various emotional experiences are encoded to map within rTPJ in the brain (Lettieri et al. 2019), it could be assumed that the emotional changes loosened or strengthened the bonds of the physical self, depending on the circumstances. Thus, although it is also possible to reasonably interpret some findings on emotions under the assumption of a loosening phenomenon, more efforts should be made to further improve the consistency of this hypothesis based on future findings.

Finally, many important research questions on developmental changes and individual differences can be derived from the characteristics of the loosening phenomenon and detachment function. For example, why is the loosening phenomenon, which supports implicit perspective-taking, maintained relatively well even in later life, despite the general decline of physical functions in aging? Does the detachment function follow the development of general cognition, such as executive functions, because of the indispensable features of manipulating representations, or is it strongly influenced after all by one’s own body because of the constraints imposed by the embodied self? Is it safe to assume that those with autism spectrum disorder or phenomena such as autoscopy or out-of-body experiences have trouble with the healthy development of a sense of self? Although there are many issues to be resolved, the concepts of perspective-taking based on the self–other distinction are expected to lead to major developments in developmental psychology and psychopathology.