1 Introduction

There is a longstanding debate, renewed by Siegel (2010), as to whether high-order properties are perceivable ones, inasmuch as, unlike other properties, they are targets not only of cognitive but also of perceptual states. Some high-order properties are definitely non-perceivable, e.g. those that involve institutional factors, insofar as they contain normative elements that are irreducible to factual ones: for example, one cannot perceive whether someone is a President.Footnote 1 Some others are intuitively perceivable, e.g. some aesthetic properties: we usually say that we see the beauty of a landscape, of a person, of a painting. Yet not only are there many high-order properties in between the above cases, but it is not even clear why such cases are paradigmatic instances of perceivability vs. non-perceivability.Footnote 2

Now, one may shed light on this debate once a criterion of perceivability is available that provides necessary (and hopefully jointly sufficient) conditions of perceivability for high-order properties. We will try to provide such a criterion, on the basis of the idea that a high-order property is perceivable only if (i) it is given immediately and non-volitionally, and (ii) it is grasped via a holistic form of attention. This form of attention is indeed one of the forms of attention that are perceptually relevant. For, unlike spatial attention (or even conceptual noticing), it does not operate pre-perceptually (nor, for that matter, post-perceptually). We will also see that this criterion essentially captures grouping properties, i.e., properties corresponding to organizational aspects of an object, which make it the case that its elements are arranged under a certain 'polar' direction in one of the three dimensions (Sect. 2).

On the basis of this criterion, we will first of all claim that, qua a sort of grouping properties, facial properties are perceivable, for they are high-order properties that match the above criterion (Sect. 2). Moreover, given that for us something is a high-order property if it depends but does not supervene on lower-order properties (either straightforwardly low-order or merely less high-order ones), we will claim that our criterion allows other high-order properties to be perceivable as well, namely those that are also grouping properties yet are located higher in a hierarchy of high-order properties: primarily, gender properties and racial properties (Sect. 4). Finally, we will see how these claims are empirically supported, since they are confirmed by neurological and psychological findings (Sects. 3, 4).

2 Facial Properties as Both High-Order and Perceivable Properties

In the contemporary debate about the admissible contents of perceptual states, people wonder whether such states can have a rich content determined not only by low-order properties–i.e., properties directly and basically grasped by perceptual modalities; as is standard in the literature, these properties are given not definitionally, but by lists: “The distinction between low‐level and high‐level features is generally made on the basis of a list of each of the relevant sensory modality's paradigm low‐level features […] paradigm low-level features include shape and color; in the auditory modality, paradigm low‐level features include volume and pitch” (Helton 2016:852)–but also by high-order properties. So, in order to enter this debate, one must first settle what a high-order property is.

Here is our criterion, which provides only sufficient conditions for being a high-order property: a property is high-order if it existentially depends on, but does not supervene on, lower-order properties.Footnote 3 According to this criterion, a high-order property could not be instantiated unless some lower-order property were instantiated as well. Yet there may be a difference in the instantiation of high-order properties without there being any difference in the instantiation of the lower-order properties on which such properties depend.
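The criterion can be regimented as follows. This is only one possible rendering, offered as a sketch: the symbols $H$ (a candidate high-order property), $\mathcal{L}$ (the family of lower-order properties forming its base) and the modal operators are notational assumptions introduced here, and nothing in the argument hinges on this particular formalization.

```latex
% Sketch under stated assumptions: H, \mathcal{L}, \Box and \Diamond are
% illustrative notation, not part of the original criterion's wording.

% (D) Existential dependence: necessarily, whatever instantiates H
%     also instantiates some lower-order property in \mathcal{L}.
\[
\text{(D)}\quad \Box\,\forall x\,\bigl(Hx \rightarrow \exists P \in \mathcal{L}\; Px\bigr)
\]

% (NS) Failure of supervenience: there can be a difference in H
%      without any difference in the properties in \mathcal{L}.
\[
\text{(NS)}\quad \Diamond\,\exists x\,\exists y\,
\Bigl[\forall P \in \mathcal{L}\,(Px \leftrightarrow Py)\;\wedge\; Hx \;\wedge\; \neg Hy\Bigr]
\]

% Sufficient (not necessary) condition for being high-order.
\[
\bigl(\text{(D)} \wedge \text{(NS)}\bigr) \;\rightarrow\; H \text{ is high-order relative to } \mathcal{L}
\]
```

Since the criterion states only a sufficient condition, the last line is a conditional rather than a biconditional.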

Now, facial properties exhibit precisely both features, i.e., dependence on low-order properties and failure of supervenience on them. By “facial properties” we mean high-order organizational properties, which, as cases of pareidolia in which e.g. one sees faces in rocks (see the following Section) clearly show, can be instantiated both by animate (typically human) and by inanimate beings, even if they do not share the same low-order properties.Footnote 4 Taken in this sense, first, the property of being a face of kind F can be instantiated only if certain low-order properties (notably, certain colors and shapes) are instantiated as well. Yet second, different facial properties may be instantiated while the very same low-order properties, actually the very same assemblage of colors and shapes, are instantiated, as Fig. 1 nicely shows: in it, the same assemblage of colors and shapes corresponds both to a face frontally given and to another face given in profile. Thus, facial properties are high-order properties.

Fig. 1 Ambiguous face (by courtesy of Paola Tosti)

Yet saying that a property is high-order is obviously not yet saying that such a property is perceivable. There may be high-order properties, e.g. institutional properties like being a UK citizen or being a university professor, which are clearly not perceivable–one can perhaps perceive that someone else is a human, but one can perceive neither that she is a UK citizen nor that she is a university professor, for both properties involve normative elements that, in their non-factual nature, cannot be perceptually grasped. So, what must be added in order for facial properties, qua high-order properties, to be perceivable?

According to Siegel (2007, 2010), in order to answer the question of what makes a high-order property perceivable one must appeal to the method of phenomenal contrast. According to this method, there clearly are perceptual experiences, what she calls the contrast experience and the target experience, which differ phenomenally as a whole, even though they are quite similar. This phenomenal difference must be ultimately traced back to a difference in the rich content that the target experience does not share with the contrast experience, since the former represents a certain (fine-grained) high-order property (in her most famous example, the natural kind property of being a pine tree) that the latter does not represent. Hence, this high-order property is perceivable.

The method appeals to an argument to the best explanation consisting of three steps and a conclusion (Siegel 2007). First, the phenomenological datum: the overall experience of which the contrast experience is a part differs in its phenomenology from the overall experience of which the target experience is a part. Second, the appeal to a sensory, not cognitive, phenomenology: if the overall experience of which the contrast experience is a part differs in its phenomenology from the overall experience of which the target experience is a part, then there is a sensory phenomenal difference between the contrast experience and the target experience. Third, the appeal to content: if there is a sensory phenomenal difference between those experiences, then the contrast and the target experience differ in content (to be conceived in terms of accuracy conditions). Fourth, the expected conclusion appealing to rich content: if there is a difference in content between such experiences, it is a difference with respect to (fine-grained) high-order perceptual properties represented in the target but not in the contrast experience. The method has been applied both by Kriegel (2007) and by Byrne (2009) precisely to facial properties, in order to account for the phenomenal difference between a prosopagnosic and a normal subject: unlike the former, the latter grasps her mother's face in her experience.
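Schematically, the argument is a chain of conditionals discharged by the phenomenological datum. The labels below ($\Delta_{\mathrm{phen}}$ and so on) are shorthand introduced here for illustration only; they are not Siegel's own notation.

```latex
% Illustrative shorthand (an assumption of this sketch, not Siegel's notation):
%   \Delta_{phen}: the overall experiences differ phenomenally
%   \Delta_{sens}: contrast and target experience differ in sensory phenomenology
%   \Delta_{cont}: contrast and target experience differ in content
%   \Delta_{rich}: they differ in the high-order properties they represent
% Requires the amsmath and amssymb packages.
\begin{align*}
\text{(1)}\quad & \Delta_{\mathrm{phen}} && \text{phenomenological datum}\\
\text{(2)}\quad & \Delta_{\mathrm{phen}} \rightarrow \Delta_{\mathrm{sens}} && \text{sensory, not cognitive, phenomenology}\\
\text{(3)}\quad & \Delta_{\mathrm{sens}} \rightarrow \Delta_{\mathrm{cont}} && \text{appeal to content}\\
\text{(4)}\quad & \Delta_{\mathrm{cont}} \rightarrow \Delta_{\mathrm{rich}} && \text{appeal to rich content}\\
\therefore\quad & \Delta_{\mathrm{rich}} && \text{from (1)--(4) by repeated modus ponens}
\end{align*}
```

The chain itself is deductively valid; what the best-explanation reasoning is meant to support is each of the conditionals (2) to (4).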

Yet precisely because it is an argument to the best explanation, each proposition in the argument can be criticized by attempting to show how, in specific cases, the explanations respectively ruled out (i.e., appealing to cognitive phenomenology, to non-representationalism, and to less fine-grained representationalism respectively) can actually prove to be better explanations than the endorsed ones (i.e., appealing to sensory phenomenology, to representationalism, and to more fine-grained representationalism respectively).Footnote 5 Thus, by our lights, it seems better to put the method of phenomenal contrast aside and appeal to other perceivability criteria. For us, this appeal is also to be recommended since the method of phenomenal contrast applies only to perceptual experiences but not to perceptual states in general, whether conscious or not.

Now, in the aforementioned debate, some criteria have already been put forward that independently work for establishing the perceptual character of low-order properties. In order for a property to be perceivable, that property must be given immediately (Nes 2016; Brogaard 2018), involuntarily, hence compellingly and not amendably (Toribio 2018), and in a form amenable to adaptation, as in the case of the well-known ‘waterfall illusion’ (Block 2014; Di Bona 2017).

Now, let us grant that the above criteria hold for low-order properties. Let us also suppose that they hold for high-order properties too, which definitely is a matter of debate.Footnote 6 Yet even if this were so, it would not be the end of the matter. For when applied to high-order properties the above criteria provide merely necessary conditions of perceivability. Indeed, a high-order property can be given in all such ways and yet not be perceivable.Footnote 7

By our lights, a further necessary condition of perceivability for high-order properties must be added. Hopefully this condition will provide, along with the previous ones, jointly sufficient conditions of perceivability of high-order properties, thereby showing how a high-order property can also be a perceivable one.Footnote 8 This condition has to do with being attentionally given in a holistic way.Footnote 9 Holistic attention is a specific form of attention, which consists in both focusing on an object and wandering around all its properties (this is the form of attention that Nanay 2016 considers the fourth form of attention, focused on an object and distributed across its properties). As such, this form of attention is one of the kinds of attention that have a perceptual import, as Stokes (2018, 2021) underlines. Unlike spatial attention, which basically consists in pre-perceptually fixating one’s gaze on a certain area and illuminating it as it were, all the perceptually-relevant kinds of attention select something, whether objects or properties, already given in the perceptual field. In the case of holistic attention, such an import transpires from the fact that it manages to perceptually grasp certain organizational aspects of an object that correspond to certain high-order properties, its grouping properties as one may say; namely, those organizational aspects of an object corresponding to the fact that its elements are grouped under a certain 'polar' direction in one of the three dimensions. Such high-order properties would indeed not be grasped if attention merely played a pre-perceptual role, that is, if it worked as a spotlight in order for the perceptual mechanism to operate once one’s gaze is fixated (Pylyshyn 2003; Raftopoulos 2009).Footnote 10

The perceptual character of holistic attention can clearly be seen if one sticks to ambiguous figures (Jagnow 2011), in which the alternate mobilization of grouping properties is particularly evident. In the following figure made of a grid of nine squares (Fig. 2), one may see either a cross-shaped configuration standing in front of a certain background, or a diamond-shaped configuration standing in front of another background. Since the contours of the two organizations overlap, the phenomenological switch between such organizations cannot refer to a presumed pre-perceptual fact that a certain area of the figure (say, the two top squares on its left) is spotted, rather than another one. Rather, in order to grasp an organization of the figure, fixing our gaze on certain points of the figure merely prompts a certain holistic perceptual attending to it, while in order to grasp a different organization of the figure, fixing our gaze on other points of the figure would merely prompt another holistic perceptual attending to it.

Fig. 2 The nine-squared grid

On behalf of Stokes (2021), one may say that, insofar as this perceptual holistic form of attention is cognitively penetrable, high-order perception is cognitively penetrable as well. Yet it must be noted that this form of attention fits the model that Macpherson (2015) has labelled cognitive penetration lite. For not only does its activation modify only the phenomenology, but not the kind of content, of the perceptual state it affects, as happens in cases of weak cognitive penetration (Macpherson 2012): the mobilization of this form of attention lets the state's content remain non-conceptual, just like the content the state has before that mobilization, as many (e.g. Jagnow 2011; Orlandi 2011; Raftopoulos 2011a) have underlined. But also, the perceptual state with that content may be prompted both by top-down and by bottom-up phenomena.Footnote 11

As Wittgenstein (1980, 2009) originally envisaged, this clearly happens with ambiguous figures again. Consider the case of the Mach figure (Fig. 3). Clearly enough, two different perceptual experiences correspond to the fact that, in conformity with two different mobilizations of the holistic form of attention with respect to the figure one faces, one sees either a diamond or a tilted square. Now, this experiential perceptual difference may certainly be prompted by mobilizing different concepts (as in reacting to questions like: can you see the diamond? can you see the tilted square?). Yet it may also be prompted by spontaneously reacting to the different aspects the figure alternatively manifests, purely optical aspects as Wittgenstein (1980:I§§970,1017) labelled them. Thus, the content of such different perceptual experiences is not conceptual–one does not see either that a diamond or that a tilted square is out there–but non-conceptual–one sees either a grouping organization in which a diamond-like silhouette emerges or a grouping organization in which a tilted-square-like silhouette emerges.

Fig. 3 The Mach figure

So first of all, on the basis of what we have said, grouping properties turn out to be perceivable properties. Not only is their grasping characterized by immediacy and involuntariness: aspects light up on us (Wittgenstein 2009:IIxi§118), inevitably (one way of seeing the object will eventually replace another), and randomly (the duration of one alternation in that switch is not a function of previous durations) (Block 2014:567). But also, as we have just shown, what enables one to grasp such different organizations is perceptually-based holistic attention. Differently attending to Fig. 3 as a whole makes one grasp different grouping organizations of it. Moreover, what we have just said shows that grouping properties are high-order properties according to our criterion. For they existentially depend but do not supervene on certain low-order properties–certain colors and shapes. As Fig. 3 exhibits, different grouping organizations correspond to one and the same assemblage of colors and shapes without which such organizations would not exist. Thus, grouping properties are both high-order and perceivable properties.

At this point, we are also able to show that, in their already established dependence without supervenience on assemblages of colors and shapes, facial properties are not only high-order properties, but also a kind of grouping properties. Figure 1 gives a vivid example of this situation. In the phenomenological aspectual switch with the figure, two different faces are grasped depending on how the very same colors and shapes of the figure are differently organized. Thus, qua grouping properties, facial properties are not only high-order properties, but also perceivable ones. Fig. 1 again confirms this: the different faces are not only grasped immediately in the figure, but also grasped via differently attending to the figure as a whole.

Strikingly enough, one can get the same results about facial properties so meant not only via phenomenological considerations, but also via empirical methods. We will try to prove this in the next Section.

3 The Cognitive Basis of Facial Experiences

By our lights, we can arrive at the same conclusion as in the previous Section if we consider empirical data in addition to phenomenological considerations. To begin with, there is almost universal agreement in visual neuroscience that faces are processed in a holistic way, that is, “people are inclined to process the multiple facial parts as a perceptual whole or gestalt” (Jin et al. 2021:1). This hypothesis has received converging support from multiple sources. For instance, in the so-called "composite task" (Young et al. 1987), people are asked to recognize the top half of the face of a familiar person while ignoring the bottom half. Recognition is slower and less accurate when the top half is aligned with the bottom half of a different face, suggesting that facial parts are automatically grouped at the perceptual level, creating the illusion of a new facial configuration (for review, see Tanaka and Gordon 2011). In a modified version of the task (e.g. Hole 1994), people must say whether the top halves of two unfamiliar faces are identical. Two identical top halves are perceived as different when aligned with distinct bottom halves, suggesting that holistic attention makes it hard to process one facial part while ignoring the other. In both versions of the task, the ‘composite face illusion’ is reduced or even eliminated when the two facial halves are misaligned or presented upside down (Fig. 4).

Fig. 4 The composite task with unfamiliar faces (Murphy et al. 2017)

Yet this is not the end of the matter, for the processing of any single piece of facial information appears to be strongly influenced by the whole-face context. In the so-called "part-whole task" (for review, see Tanaka and Simionyi 2016), people are presented with a picture of an unfamiliar face to memorize (Fig. 5). Then they are asked to recognize a facial part, such as the eyes, both when the part is presented in isolation alongside a distractor facial part (isolated condition) and when it is presented within the original face-context alongside a distractor whole-face (whole-face condition). People's memory tends to be better when facial parts are presented in the context of whole faces rather than in isolation, consistent with the hypothesis that facial properties are processed holistically at the cognitive level. Interestingly, the part-whole effect disappears in the context of scrambled and inverted faces, or non-face objects (Tanaka and Farah 1993).

Fig. 5 The part-whole task (Tanaka and Sung 2016)

Yet further empirical evidence is consistent with the phenomenological observation that the grasping of facial properties is perceptual in nature. To begin with, facial properties are given immediately, between 140 and 200 ms after visual presentation, a time window that is fully compatible with the involvement of perceptual processes.Footnote 12 This has been demonstrated by means of the event-related potential (ERP) technique. In a typical ERP study, the brain's activity is measured through electrodes placed on a subject's scalp when she sees photographs of whole faces, scrambled faces, and non-face objects. A particular ERP component labelled N170, a negative component that reaches its maximum at about 170 ms after stimulus onset, is consistently reported as being larger for facial properties than for any other properties (for review, see Rossion and Jaques 2008). Moreover, the N170 has been correlated with behavioral markers of holistic processing such as the composite effect (e.g. Jaques and Rossion 2008), indicating that holistic attention already operates at this relatively early stage of visual perception.

Interestingly, there is evidence that people can grasp facial properties even faster, with the first eye movements towards images of faces occurring in just 100 ms (Crouzet et al. 2010). Moreover, it seems that these very early saccades are not entirely under voluntary control, since they occur even when people are searching for non-facial objects (e.g. images of vehicles). This is consistent with the finding that facial properties ‘pop out’ in an array of non-facial distractors; that is, they tend to grab attention immediately and automatically (e.g., Mayer et al. 2015). Furthermore, and critically, the pop-out effect does not emerge when low-order properties that are usually co-instantiated with faces are separately available (e.g. being round, being pink) but exclusively when faces are presented as a whole. Hence, this visual search advantage appears to be specifically due to holistic attention (Hershler and Hockstein 2005).

Another piece of empirical evidence supporting the perceptual character of facial properties qua grasped both immediately and holistically comes from the phenomenon of face pareidolia, the persistent tendency to see ‘illusory’ faces in inanimate objects.Footnote 13 Illusory faces, just like human faces, are processed very quickly by the brain and generate a clear pop-out effect (Keys et al. 2021). Thus, facial experiences can be triggered even if typical low-order properties, such as typical skin color, are absent. More generally, there is evidence that ‘illusory’ faces are processed by the very cognitive mechanisms that support human-face perception, despite our knowledge that they are not human. For instance, experiences with Arcimboldo paintings–that is, portraits of imaginative faces made entirely of objects like fruits and vegetables–are associated with enhancement of N170 (Caharel et al. 2013), a clear marker of human facial processing (see above). In this sense, face-detection cognitive mechanisms appear to be informationally encapsulated, just as perceptual processes are, thereby allowing at most cognitive penetration lite, as we said in the previous Section.

Further evidence supporting the perceivability of facial properties comes from neuroimaging data. A promising criterion of empirical adequacy holds that if a given property is claimed to be perceptual according to theoretical research, then it must be represented in regions of the brain that have been independently associated with perceptual processes, such as the visual cortex. Is that so for facial properties?

Facial experiences are known to activate a widely distributed network of brain regions, which includes not only a ‘core system’ of traditional visual areas, such as the inferior occipital gyrus and the fusiform gyrus, but also an ‘extended system’ of limbic (e.g., amygdala) and associative, supramodal areas, such as the inferior frontal gyrus (for review, see Ishai 2008). Yet the consensus view in cognitive neuroscience is that a sub-region in the high-order visual cortex, labelled the fusiform face area (FFA; see Fig. 6), plays a critical role in the processing of facial properties, exerting the dominant influence on the core and extended networks of facial regions (Kessler et al. 2021). Neuroimaging studies have shown that this high-order visual region responds preferentially to facial properties in a variety of tasks and paradigms (for review, see Kanwisher and Yovel 2006). Furthermore, studies using neural adaptation–a paradigm in which repeated exposure to a certain property reduces activation in neural areas representing that property–have shown that FFA is sensitive to many aspects of faces (e.g. Xu et al. 2009). In addition, stimulation of FFA by means of electrodes implanted on the cortical surface for clinical reasons causes the perception of ‘illusory faces’ in objects (Schalk et al. 2017).

Fig. 6 The experimental design of Wang et al. (2017)

Now, activity in FFA specifically reflects the engagement of holistic attention. This visual region responds specifically to facial properties, not to low-order properties that co-occur with faces, such as specific colors or shapes. As noted by Kanwisher and Yovel, “the FFA responds strongly and similarly to a wide variety of face stimuli that appear to have few low-level features in common, including front and profile photographs of faces […], line drawings of faces […], cat faces […], and two-tones stylized ‘Mooney faces’” (2006:123). The holistic character of the FFA response can clearly be seen if one considers the case of bistable stimuli, such as the Rubin vase. Neuroimaging studies indicate that this visual region shows an increased activation only when participants report face-interpretations of the Rubin vase as compared to vase-interpretations, although the low-order features of the stimulus, as well as the retinal stimulation, remain unchanged (Andrews et al. 2002).Footnote 14 In a recent study, Wang et al. (2017) used multivoxel pattern analysisFootnote 15 (MVPA; see Norman 2006) to further explore the hypothesis that FFA, in its activity patterns, carries information about fluctuating face-content during perception of ambiguous pictures. In one condition (ambiguous condition), people were presented with the Rubin vase image and were asked to report, by pressing one of two buttons, any alternation between face and vase readings as soon as it was perceived (Fig. 6). In another condition (unambiguous condition), people were presented with unambiguous, black and white photographs of faces and vases. The results of this study confirm that activity patterns in FFA are sufficient to discriminate between facewise and vasewise segmentation of the Rubin vase. In other words, it is possible to use activity patterns in this visual region to predict when people are experiencing faces.

Fig. 7 Fe/male faces (Fu et al. 2014)

Is the activity in the high-order visual cortex, specifically FFA, necessary for perceiving facial properties? The answer to this question appears to be affirmative. To begin with, studies using transcranial magnetic stimulation (TMS) have demonstrated that stimulation of visual regions in the vicinity of FFA, such as the occipital face area, temporarily impedes the processing of facial properties, leaving the processing of other body properties intact (Pitcher et al. 2009). More direct evidence comes from people with prosopagnosia, a syndrome associated with impaired facial processing in spite of normal vision, memory, and general intelligence (for review, see Corrow et al. 2016). When prosopagnosia is not innate but consequent to a brain insult, as in the case of acquired prosopagnosia, a lesion in the ventral temporal cortex, including FFA, is generally reported (e.g. Barton et al. 2002), suggesting that the preservation of this neural structure is necessary for normal grasping of facial properties. Critically, subjects with prosopagnosia are usually insensitive to effects due to holistic processing, such as the composite effect and the part-whole effect, indicating functional damage of holistic attention in this syndrome (Finzi et al. 2016).

All in all, empirical data strongly support phenomenological considerations about the perceptual character of facial properties. Not only is a perceptually-based holistic form of attention engaged when people are experiencing faces, but this holistic attention appears to be a necessary component of facial experiences.

4 A Hierarchy of High-Order Perceivable Properties

Once we have established that a further necessary, and hopefully jointly sufficient, condition for a high-order property to be perceivable is that its instantiator be holistically attended to, as in the case of facial properties qua a kind of grouping properties, other high-order properties in the vicinity of facial properties that are grouping properties as well may also be taken to be perceivable. Moreover, there may be a full-fledged hierarchy of high-order perceivable properties. For the dependence bases of these properties are suitably concatenated: each such property is the dependence base for another such property, starting from facial properties themselves. To anticipate, this may be the hierarchy among certain high-order perceivable properties: starting from the top, racial properties, gender properties, and facial properties, which ultimately depend on low-order properties. All such properties may indeed be conceived as respectively more and less sophisticated forms of grouping properties that, qua organizational properties, can also be possessed by inanimate beings. Let us see things in more detail.

To begin with, consider gender properties, in the sense of properties like being a male or being a female.Footnote 16 First, such properties are clearly high-order in the above sense, yet their dependence but not supervenience base is not constituted by low-order properties but precisely by facial properties, as the case of an androgynous face that may appear both as masculine and as feminine clearly shows.Footnote 17 Yet second, such properties are also perceivable ones. For first of all, their grasping may depend on adaptation phenomena, showing the typical kind of compellingness that perceptual states have. Depending on whether an androgynous face is seen after either clearly masculine or clearly feminine faces, it is grasped either as feminine or as masculine respectively (Fu et al. 2014; for a similar situation concerning masculine vs. feminine voices, cf. Di Bona 2017). Now, whether or not adaptation can be applied to high-order properties just as it is applied to low-order properties,Footnote 18 the fact remains that such properties are grasped with the kind of immediacy that is a necessary condition of perceivability. Moreover and more importantly, holistic attention must again be mobilized in order to grasp a gender property, thereby showing its grouping nature. For it is the holistic way one attends to the traits of one and the same androgynous face–the eyes, the nose, the lips of that face–that lets one grasp that face either as masculine or as feminine; those traits are seen differently. In this case too, a phenomenological consideration may easily show the situation at stake. In Fig. 7, depending on whether the second and the third face are assimilated either to the clearly masculine or to the clearly feminine face standing on the left and on the right of the figure respectively, those central faces are seen either as male or as female faces.

This kind of phenomenological import is at stake also in other psychological experiments. Using the composite task, it has been shown that recognition of the gender of one half of the face is more difficult when the other half belongs to a different gender (Baudouin and Humphreys 2006). Moreover, the grasping of gender properties is impeded when faces are scrambled or presented upside down, as shown in Fig. 8 (Zhao and Hayward 2010). These findings suggest that people encounter difficulties in grasping gender properties when holistic attention is interfered with.

Fig. 8 The gendered composite task (Zhao and Hayward 2010)

The perceptual nature of gender properties is also supported by neuroscience data. Although gender experiences are known to activate a widely distributed network of the brain (see Kaul et al. 2011), the high-order visual cortex appears to have a critical role, as in the case of facial properties. For instance, neuroimaging studies using the MVPA technique have shown that gender properties are encoded in a distinctive way in the fusiform gyrus, more precisely in FFA (Contreas et al. 2013; Wegrzyn et al. 2015). The pattern of brain activity in this area could be used to predict the gender of the current face stimulus as a function of cortical topography, suggesting that visual cortex activity alone may allow for critical aspects of gender experiences. Moreover, it has been shown that neural activity in FFA and lateral fusiform gyrus, independently of attention, is modulated as a function of linear changes in face gender (from extremely masculine to extremely feminine), suggesting the existence of a graded representation of gender properties in this visual region (Freeman et al. 2016).

On empirical grounds, the dependency of gender properties on facial properties is consistent with the observation that the former are processed more slowly than the latter, although still within the time window of perception. Grasping of gender properties occurs in a relatively late phase of visual perception, between 200 and 285 ms (Yokoyama et al. 2014), whilst facial properties, as we have seen, are processed between 140 and 200 ms (see Sect. 2). This dependency is also supported by patient data. For impairment in the recognition of facial properties usually affects the grasping of gender properties, as demonstrated by the poor performance of prosopagnosic patients on gender recognition (Marsh et al. 2019), but not vice versa. In this sense, neuropsychological disorders of facial properties seem to be more basic than disorders of gender properties. In accordance with these data, several models in cognitive neuroscience (e.g. Diamond and Carey 1986) have postulated a hierarchical ordering between facial and gender properties. According to these models, while grasping of facial properties relies on information about basic holistic relations among key features within a face, such as the fact that the eyes are above the nose (first-order relational information), grasping of social properties such as gender depends on secondary, more precise holistic information, such as the spacing between the eyes, or between the eyebrows and hairline (second-order relational information).

All in all, in their grouping nature gender properties are both high-order and perceivable properties, yet they depend on properties that, as we have already seen, are both high-order and perceivable as well: i.e., facial properties.

Given that things stand thus as regards the relationship between gender and facial properties, one may try to hold that the same relationship obtains between racial propertiesFootnote 19 and gender properties as well: not only are the former both high-order and perceivable, as their grouping nature may lead one to think, but they also depend on the latter. Let us see.

To begin with, there is no doubt that racial properties are high-order properties, yet this time their dependence (not supervenience) base is constituted by gender properties. The grouping property of, say, being black would not be instantiated if the grouping property of, say, being male were not instantiated as well. This dependency is consistent with the observation that gender and racial properties have different developmental trajectories in children, with gender categorization emerging significantly earlier in childhood than racial categorizationFootnote 20 (Rhodes and Gelman 2009; Shutts et al. 2013). It is also consistent with the fact that, in adults, gender differences appear to bias racial recognition, suggesting that gender properties might have causal efficacy over racial properties. For example, as the femininity of a face increases, people become more efficient in categorizing it as white but less efficient in categorizing it as black (Carpinella et al. 2015). Yet it may be the case that a difference in racial properties is not matched by a difference in gender properties: one and the same (face of a) male may turn out to be either black or white.

Admittedly, the dependency of racial properties on gender properties might be objected to. For instance, one might argue that the fact that gender properties seem to emerge earlier in infancy does not support the hierarchy thesis. It might simply be due to exposure: infants are often exposed from birth to parents of both sexes, but they are exposed only later to people of a different racial group.Footnote 21 Moreover, the converse effect of racial properties on gender properties has also been experimentally demonstrated, with racial differences affecting the categorization of the gender of a face (e.g., Johnson et al. 2012). On this basis, it has been suggested that the representations of gender and racial properties in the human brain are intrinsically tied, both at the perceptual level and at the level of the associated stereotypical knowledge, with no clear ordering between the two (for a review, see Freeman and Johnson 2016). Granted, both points are disputable: perhaps exposure to racial difference is not necessary for one to grasp a racial aspect in a gendered face, and the influence on gender categorization may mobilize races not at a perceptual, but at a cognitive level. In any case, more research is needed to further empirically establish this point.

The perceivability of racial properties, however, seems to be firmly established. First, once again their grasping may depend on adaptation phenomena that do not involve low-order properties (colors, in this case), but racial properties: seeing determinately racially black faces may prompt one to see a certain racially indeterminate face as racially white, while seeing determinately racially white faces may prompt one to see that racially indeterminate face as racially black (Fu et al. 2014). Again, what counts here is not adaptation per se, but the sort of immediacy lying behind adaptation phenomena. Second, and more importantly, they are perceptually grasped via a holistic form of attention, which thereby shows their grouping nature. Again, phenomenology shows that this is the case. Consider Fig. 9. If one and the same face is surrounded either by certainly black or by certainly white faces, it is seen either as white or as black respectively (Levin and Banaji 2006).

Fig. 9 Different context, different racially-shaped faces (Levin and Banaji 2006)

From the empirical point of view, moreover, neuroimaging studies using MVPA have shown that racial properties, as well as gender properties, can be decoded from neural activity in the high-order visual cortex, more specifically in FFA, conforming to the notion that such properties are encoded in perceptual areas of the brain (Contreras et al. 2013).

Thus, being a kind of grouping property, racial properties are not only high-order, but also perceivable properties, grasped via a perceptually-based holistic form of attention.

Incidentally, this does not mean that grasping of racial properties is not cognitively penetrated. For instance, it has been shown that racial bias (Brosch et al. 2013) or stereotypical knowledge about social categories (Stolier and Freeman 2016) are reflected in the structure of neural patterns in the visual cortex (i.e., fusiform gyrus),Footnote 22 consistently with a growing body of behavioural data suggesting that racial perception is strongly influenced by high-order cognitive states (for a review, see Bagnis et al. 2019). Yet this influence conforms to the aforementioned model of cognitive penetration lite. This is shown by the fact that the grasping of racial features may be prompted both by top-down phenomena and by bottom-up phenomena. On the one hand, being aware of belonging to a certain racial group may be an example where a top-down phenomenon is involved. For example, being aware of belonging to white people makes it easier for one to grasp the central male faces in Fig. 10 as racially black.

Fig. 10 Top-down racially-shaped faces (Levin and Banaji 2006)

Yet on the other hand, grasping the difference in racial lightness may also be prompted by bottom-up phenomena, as when (Fig. 11) two blurred faces are still seen not as merely differing in color, but as one being racially darker than the other, just as in a non-blurred case (Firestone and Scholl 2015).

Fig. 11 Bottom-up racially-shaped faces (Firestone and Scholl 2014)

The examples we have considered here concern high-order properties as instantiated by animate (typically human) beings. Theoretically speaking, however, nothing prevents such properties from being instantiated also by inanimate entities, especially when these are linked with humans by some pragmatically relevant bond (e.g. dolls, or maybe even non-humanlike items such as cars and shoes), even if they do not share the same low-order properties. For, to stress the point again, the properties in question are not biologically, but organizationally dependent: (more and more sophisticated) grouping properties, which depend merely genericallyFootnote 23 on the lower-order properties in the hierarchy.

So, among grouping properties, we have possibly found a hierarchy of at least three properties that are both high-order and perceivable, each of which depends on the next lower one in the hierarchy: racial properties allegedly depend on gender properties (although further research is needed to conclusively establish this specific point), which in their turn depend on facial properties, which in turn depend on low-order properties. And other concatenations may be supposed as well.Footnote 24

5 Conclusions

To sum up, in this article we have proposed a new criterion for a high-order property to be perceivable: such a property is perceivable at least only if (i) it is given both immediately and non-volitionally, and (ii) it is grasped via a holistic form of attention. On this basis, we have first argued that, qua grouping properties, facial properties are both high-order and perceivable, for they are grasped via perceptually-based holistic attention. As we have seen, this claim is supported not only by phenomenological considerations but also by empirical data, such as the fact that facial properties are holistically processed in regions of the brain that have been independently associated with visual perception. Second, we have argued, on both the phenomenological and the empirical level, that other high-order grouping properties located in a hierarchy of high-order properties, notably gender properties and racial properties on top of them, are perceivable as well. With the present contribution, we hope to have provided a general framework for rigorously assessing the perceivability of further high-order properties, such as age properties or expressive properties, in future research.Footnote 25