Understanding What We See: How We Derive Meaning From Vision
Abstract
Recognising objects goes beyond vision, and requires models that incorporate different aspects of meaning. Most models focus on superordinate categories (e.g., animals, tools) which do not capture the richness of conceptual knowledge. We argue that object recognition must be seen as a dynamic process of transformation from low-level visual input through categorical organisation to specific conceptual representations. Cognitive models based on large normative datasets are well-suited to capture statistical regularities within and between concepts, providing both category structure and basic-level individuation. We highlight recent research showing how such models capture important properties of the ventral visual pathway. This research demonstrates that significant advances in understanding conceptual representations can be made by shifting the focus from studying superordinate categories to basic-level concepts.
Trends
We view object recognition as a dynamic process of transformation from low-level visual analyses through superordinate category to basic-level conceptual representations.
Understanding this process is facilitated by using semantic cognitive models that can capture feature-based statistical regularities between concepts, providing both superordinate category and basic-level information.
We highlight research using fMRI, MEG, and neuropsychological and behavioural testing to show how feature-based cognitive models can relate to object semantic representations in the ventral visual pathway.
The posterior fusiform and perirhinal cortex are shown to process complementary aspects of object semantics.
The temporal coordination between these regions is also highlighted, with superordinate category information preceding basic-level semantic information in time.
Flexible Access to Conceptual Representations
How do we understand what we see? We interpret this fundamental question as asking how visual inputs are transformed into conceptual representations. Our conceptual knowledge (see Glossary) reflects what we know about the world, such as learned facts, and the meanings of both abstract (e.g., freedom) and concrete (e.g., tiger) concepts. Our focus here is on concrete concepts. When conceptual knowledge is accessed, the information retrieved needs to be behaviourally relevant. Acting appropriately requires flexible access to different types of conceptual information. Depending on perceptual context and behavioural goals, objects are recognised in different ways, for example, as a cow, an animal, or living thing. The way objects are naturally recognised is by accessing information specific enough to differentiate them from similar objects (e.g., recognising an object as a cow rather than a horse or a buffalo) – a notion termed the basic or entry-level of representation 1, 2. However, part of understanding the meaning of an object also necessitates that more-general information is accessed – for example, the commonalities between similar objects that enable us to know that an object is part of a superordinate category (e.g., as an animal or living thing). To understand the cortical underpinnings of this flexible access to different aspects of conceptual representations, we need to specify the neurocomputational processes underlying meaningful object recognition. This in turn requires that conceptual representations are studied as the expression of a set of dynamic processes of transformation – from the visual input and different stages of visual processing in the brain, through different types of categorical organisation, to a basic-level conceptual representation.
Object recognition has generally not been conceptualised in these terms. It is a domain of research that straddles many different subdisciplines – most saliently vision science and semantic memory – but these different strands tend to remain fragmented owing to the complexity and depth of individual areas. A central theme in vision science is to develop computational accounts of the ventral visual pathway, based on visual image properties, which try to explain non-human primate and human brain data (e.g., 3, 4, 5, 6). However, these models are unable to capture the relationships between different concepts – that an apple and a banana are more related than an apple and a ball (which are more visually similar). Further, models of vision alone cannot account for properties such as conceptual priming and flexible access to different aspects of meaning.
Research in semantic memory, by contrast, focuses on the organisation of semantic knowledge in the brain resulting in a variety of accounts drawing upon neuropsychology, functional neuroimaging, computational modelling, and behavioural paradigms. Providing a review of these perspectives is beyond the scope of this article, and many excellent contemporary reviews are available 7, 8, 9, 10, 11, 12, 13, 14. Our focus here is on understanding the neural processes that underpin how meaning is accessed from vision. We describe a neurocognitive model that integrates (i) a cognitive account of meaning based on the statistical regularities between semantic features (e.g., ‘has 4 legs’, ‘has a mane’, ‘is black and white’) that can explain a range of semantic effects, with (ii) the neurocomputational properties of the hierarchically organised ventral visual pathway.
Basic-Level Concepts and their Superordinate Categories
Most cognitive models of object meaning address semantics through one of two approaches – focusing on superordinate category organisation (e.g., 9, 15) or basic-level concepts (e.g., [16]). However, a comprehensive account needs to consider both these facets.
Research into the organisation of semantic knowledge in the brain has been largely motivated by the observation of semantic deficits resulting from brain damage and disease – most strikingly those deficits that seemed to be specific to only some superordinate categories. Such category-specific deficits after neurological diseases such as herpes simplex viral encephalitis (HSVE) have shown that tissue loss in anteromedial temporal cortex (AMTC; Figure 1) can disproportionately impair knowledge for living things, with relative preservation of knowledge for nonliving things 17, 18. Complementing these neuropsychological data, functional imaging and electrophysiology studies of healthy individuals show increased activity in the AMTC for living things versus nonliving things 19, 20, 21, 22, 23.
By studying patients showing category-specific deficits following AMTC atrophy, we can gain important insights into the nature of the information that is lost. A striking illustration of this comes from patient drawings, where they are asked to sketch a range of living and non-living objects from memory. In the examples in Figure 2A, all the nonliving objects are well-drawn and easily identifiable, while the drawings of animals mostly reflect their shared properties (e.g., four legs, a tail, eyes, horizontal body), making it impossible to identify them as basic-level concepts. It is clear from these examples that the informational loss underpinning the impairments of such patients involves accessing the distinctive properties of living things, rather than a loss of all information (see [17]). This type of perspective suggests that a more nuanced view of category-specificity in the AMTC is needed, one that takes into account the nature of the deficits at a more specific level than superordinate categories.
Functional brain imaging studies of healthy individuals have provided key evidence that apparent superordinate category effects are not restricted to the AMTC. In the posterior fusiform gyrus (Figure 1), animal images have been shown to produce enhanced effects in the lateral posterior fusiform gyrus, and tool images show effects in the medial posterior fusiform gyrus 15, 24. The nature of this lateral-to-medial gradient in the posterior fusiform is especially intriguing given the range of parameters that produce similar distinctions – such as real world object size [25], animacy [26], expertise [27], and retinotopy [28], suggesting that highly complex representations in this region encompass multiple types of stimulus properties 29, 30.
The effect of animals and tools on the posterior fusiform is one of a range of category-specific effects that have been observed in the temporal and parietal lobes for different categories – animals in the lateral fusiform, superior temporal sulcus (STS), and amygdala 31, 32; tools in the medial fusiform, middle temporal gyrus (MTG), and inferior parietal lobule (IPL) [33]; places in the lingual, medial fusiform, and parahippocampal gyrus [34]; faces in the lateral occipital, lateral fusiform, and STS 35, 36; and bodies in the lateral fusiform and STS [37]. While understanding the organisation of different categories remains a central issue for cognitive neuroscience, we focus here on one aspect of this, category effects of animals and tools in the posterior fusiform, to illustrate the insights and advances we can make by studying part of this system in detail.
The effects of superordinate category in the AMTC and posterior fusiform must reflect different, complementary aspects of semantic computation. However, research focusing on superordinate categories has been insufficient to resolve the roles these regions might play.
A largely separate strand of research has focused on basic-level conceptual entities and centres on the anterior temporal lobe (ATL, often defined as the anteroventral and anterolateral aspects of the temporal lobe) which is claimed to represent amodal conceptual information 11, 38. This idea draws upon the notion of convergence zones in the ATL, which act to bring together information from other brain regions to represent concepts 38, 39, 40. Widespread damage to the ATL is associated with semantic deficits at the level of basic-level concepts for all categories, while superordinate category knowledge itself is unimpaired. Thus, damage to the ATL and damage to the AMTC seem to have very different effects on conceptual knowledge, which have yet to be fully explained.
While these lines of research have fundamentally enhanced our understanding of the neural basis of conceptual knowledge, two significant issues arise. First, theories that focus on the organisation of superordinate category information alone ignore what is perhaps the most salient aspect of semantics – the information which differentiates between basic-level concepts – because it is these concepts that are claimed to be the most necessary in daily usage [2]. Consequently, we believe that concepts, not categories, should be the focus of research. Second, research focusing on basic-level concepts has little to say about superordinate category representations. As a consequence, research into superordinate category representations and basic-level concepts is rarely integrated to provide an account of how meaning is accessed from vision.
Conceptual Structure in the Ventral Visual Pathway
A comprehensive cognitive model of conceptual representations in the brain needs to provide an account of both these sets of issues, and we argue that this can be achieved through the use of semantic feature models of conceptual knowledge. The model that we adopt here, the conceptual structure account 12, 41, claims that concepts can be represented in terms of their semantic features (e.g., ‘has legs’, ‘made of metal’) and statistical measures, termed conceptual structure statistics, based on the regularities of features both across concepts and within a concept. Conceptual structure statistics can be informative about both the superordinate category of a concept (e.g., a camel is an animal and a mammal) and how distinctive a concept is within the category (e.g., a camel is distinctive because of its hump which no other animals have). As Box 1 explains, category membership is strongly indicated by the features a concept shares with many other concepts (e.g., many animals have fur, and have legs etc.), while the relationship between the shared and the distinctive features of a concept reflects the ease with which a concept can be differentiated from similar concepts (or conceptual individuation). Further, statistics derived from property norms can reveal systematic differences between categories, such that living things (e.g., animals) have many shared and few distinctive features (all animals have eyes, but few have a hump), whereas nonliving things (e.g., tools) have fewer shared and relatively more distinctive features. The information captured with conceptual structure statistics shows how feature-based models can provide a single theoretical framework that captures information about conceptual representations at different levels of description.
Recent fMRI data from healthy participants [42] and lesion behaviour mapping in brain-damaged patients [43] show how conceptual structure statistics – capturing either superordinate category information or the ease of conceptual individuation – differentially relate to regions along the ventral visual pathway. In one study [42], we calculated conceptual structure statistics for a large and diverse set of common objects that participants named during fMRI scanning. We then related brain activation across these objects to different conceptual measures to determine how conceptual structure statistics influence object processing (Figure 3A). The results show that the conceptual structure of an object affects processing at two key sites along the ventral visual pathway. First, there is a gradient effect across the lateral-to-medial posterior fusiform that reflects the mean feature sharedness of a concept. Objects with many shared features (typically animals) show greater effects in the lateral posterior fusiform gyrus, and objects with fewer shared features (typically tools) show greater effects in the medial posterior fusiform gyrus. Second, effects in the AMTC, specifically in perirhinal cortex (PRC), are related to the ease of conceptual individuation: more-confusable concepts evoke greater activation. Evidence from lesion–behaviour mapping [43] confirms this relationship between conceptual structure statistics and the PRC. Damage to the PRC results in an increased deficit for naming semantically more-confusable objects, where confusability is defined by conceptual structure statistics (‘correlation × distinctiveness’; Figure 3B). Together, these two studies converge to highlight a specific relationship, between a conceptual structure statistic capturing conceptual individuation and the PRC, that was only indirectly suggested from prior brain lesion-mapping evidence 44, 45, 46, 47.
The statistical measures derived from feature-based accounts shed new light on the nature of category-specific effects in different regions of the ventral visual pathway, and do so with a framework situated at the level of basic-level concepts. Lateral-to-medial effects in the posterior fusiform gyrus, previously associated with category-specific effects for animals and tools, in fact seem to reflect a gradient of feature sharedness, whereas category-specific effects for living things in the AMTC can be explained in terms of the ease of conceptual individuation – two measures derived from a single account to explain category-specific effects in different regions of the ventral visual pathway for different computational reasons.
This research points to a key computational role for the human PRC in the individuation of semantically-confusable concepts. This role is not relevant for all semantic distinctions, but only for those requiring highly differentiated representations, such as distinctions between a lion, leopard, and cheetah. This is clear from studies showing increased AMTC activity only during basic-level conceptual recognition and not during superordinate category distinctions 22, 48, and from studies showing that activity increases in the PRC during the recognition of semantically more-confusable objects 42, 49.
There are close parallels here with research on the resolution of visual ambiguity and confusability in the PRC in both human and non-human primates 50, 51, 52, and on conceptual effects in humans 23, 42, 46, 49, 53, 54, 55, 56, 57, 58, 59, 60. Functionally, it can be argued that the PRC serves to differentiate between objects that have many overlapping features, and are therefore nearby in semantic space, while objects in sparse areas, with few semantic competitors, require less involvement of the PRC. This is directly supported by research showing that activation patterns in the human PRC reflect the semantic similarity of concepts, as defined by semantic features (Figure 3C) 49, 55.
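The idea that concepts with many overlapping features lie close together in semantic space can be sketched in a few lines. The example below is only an illustration under stated assumptions: the feature lists are invented toy data (not taken from any property-norm dataset), cosine similarity over binary feature sets is one possible similarity measure among several, and `neighbourhood_density` is a hypothetical helper name rather than a statistic defined in the studies cited above.

```python
import math

# Hypothetical toy feature lists, for illustration only (not real norm data).
CONCEPTS = {
    "lion": {"has legs", "has fur", "has a tail", "has a mane"},
    "leopard": {"has legs", "has fur", "has a tail", "has spots"},
    "cheetah": {"has legs", "has fur", "has a tail", "is fast"},
    "hammer": {"has a handle", "made of metal"},
}


def cosine_similarity(a, b):
    """Cosine similarity between two concepts coded as binary feature sets.

    For binary vectors this reduces to |A & B| / sqrt(|A| * |B|).
    """
    return len(a & b) / math.sqrt(len(a) * len(b))


def neighbourhood_density(name, concepts):
    """Mean similarity of one concept to all others (assumed crowding index)."""
    others = [feats for other, feats in concepts.items() if other != name]
    target = concepts[name]
    return sum(cosine_similarity(target, feats) for feats in others) / len(others)


if __name__ == "__main__":
    for concept in CONCEPTS:
        print(concept, round(neighbourhood_density(concept, CONCEPTS), 3))
```

In this toy set the three big cats share most of their features and therefore sit in a dense neighbourhood, whereas the hammer has no close competitors. On the account described above, the former are the kind of items expected to place greater demands on the PRC.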
This computational role of the PRC helps to explain two phenomena from neuropsychology. First, patients who present with category-specific deficits for living things following AMTC damage show intact superordinate category knowledge. The basic-level nature of the deficits can be explained in terms of the role of the PRC being predominantly limited to differentiating between entities within superordinate categories. However, not all categories are equally affected following AMTC damage, leading to the second phenomenon: that the observed category-specific deficits for living things occur as a result of a differentiation impairment within denser areas of semantic space, more typical for living things, while these patients can easily differentiate within the less-dense areas typically occupied by nonliving things – resulting in the phenomena seen in Figure 2A.
These findings suggest a conceptual hierarchy in the ventral visual pathway, where a network of regions supports recognition of meaningful objects, and that category-specific effects emerge in different regions owing to categorical differences across complementary semantic feature statistics. This also has the implication that our individual knowledge about objects may reshape the distribution of effects in the ventral stream, consistent with research showing that expertise with different categories, and thus an increased ability to individuate between highly-similar objects, also increasingly engages the lateral posterior fusiform and anterior temporal regions 27, 61 – those regions most important for individuating objects with many shared features and few distinctive features.
The Temporal Dynamics of Conceptual Processing
We have shown how a semantic feature-based approach can account for observations of superordinate category-specific effects at different loci in the ventral visual pathway. Any comprehensive account of conceptual processing must also be able to capture the temporal dynamics during the retrieval of semantic knowledge. During object recognition, the system dynamics follow an initial feedforward phase of processing as signals propagate along the ventral temporal lobe, followed by recurrent, long-range reverberating interactions between cortical regions [62, 63, 64, 65, 66]. The exact nature of the computations supported by these dynamics remains unclear, though there is clear evidence that information relevant to superordinate category distinctions can be accessed very rapidly (within 150 ms [67, 68, 69]) whereas specific conceptual information is only accessible after approximately 200 ms [59, 68, 70, 71, 72].
How the temporal dynamics map onto the processing of conceptual information is an issue we have recently begun to investigate [73]. By measuring neural activity with a high temporal resolution, and using machine-learning methods, we can determine whether feature-based models can predict patterns of brain activity over time. One magnetoencephalography (MEG) study along these lines [68] showed that by combining a computational model of visual processing from V1 to posterior temporal cortex [74] with semantic feature information, the neural activity for single objects could be well explained and this model could be used to predict neural activity for other (new) objects. While the model including both visual and semantic information could successfully account for single-object neural activity from 60 ms, the semantic feature information made unique contributions over and above those that the visual information could explain. Semantic feature information explained a significant amount of single object data in the first 150 ms, and this in turn could predict neural activity that dissociated between objects from different superordinate categories. After around 150 ms, the predictions become more specific, and differentiated between members of the same category (i.e., the basic-level concept could be predicted solely based on semantics; Figure 3D).
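The notion of a "unique contribution over and above" another predictor can be made concrete with a small worked example. The sketch below is not the analysis used in the MEG study [68]; it assumes a single visual predictor, a single semantic predictor, and a single neural measure per object, and uses the standard two-predictor formula for multiple R² to compute how much variance the semantic predictor explains beyond the visual one. All names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def unique_semantic_r2(visual, semantic, neural):
    """Variance in `neural` explained by `semantic` beyond `visual`.

    Computed as R^2(visual + semantic) - R^2(visual alone), using the
    closed-form R^2 for a regression with two predictors. Assumes the two
    predictors are not perfectly correlated.
    """
    r_nv = pearson(neural, visual)
    r_ns = pearson(neural, semantic)
    r_vs = pearson(visual, semantic)
    r2_full = (r_nv ** 2 + r_ns ** 2 - 2 * r_nv * r_ns * r_vs) / (1 - r_vs ** 2)
    return r2_full - r_nv ** 2


if __name__ == "__main__":
    # Toy data: one value per object (illustrative only).
    visual = [1, -1, 1, -1]
    semantic = [1, 1, -1, -1]
    neural = [2, 0, 0, -2]  # sum of the two predictors
    print(unique_semantic_r2(visual, semantic, neural))
```

With these toy values the two predictors are uncorrelated and each accounts for half of the variance, so the unique semantic contribution is about 0.5. If the neural measure depended only on the visual predictor, the same function would return 0.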
In a direct assessment of the influence of conceptual structure statistics on the time-course of object recognition, a second MEG study [75] demonstrated that MEG signals correlated first with the visual statistics of an object, followed by rapid effects driven by the feature sharedness of the object within the first 150 ms. After 150 ms, both shared and distinctive features correlated with MEG signals. Together, these MEG studies highlight two important time-frames of conceptual processing during object recognition: early information, rapidly activated by visual properties, that dissociates superordinate categories and is driven by shared feature information; and later conceptual integration of information that individuates basic-level concepts from semantically similar items.
Importance of Anterior–Posterior Interactions in the Ventral Stream
Taken together, data from neuropsychology, fMRI, and MEG reveal that semantic representations are transformed from primarily reflecting superordinate category information to basic-level conceptual information within a few hundred milliseconds, supported by processing along the ventral visual pathway. In particular, the posterior fusiform gyrus and PRC are important to this transition. Electrophysiological recordings in the PRC and posterior ventral temporal cortex of macaques suggest that visual information becomes more differentiated as information flows from posterior to anterior regions [76], a general process along the ventral stream in which object representations are increasingly differentiated [3]. With regard to the mechanism by which basic-level concepts become differentiated within their category, we have shown that connectivity between the ATL and the posterior fusiform increases during tasks requiring access to basic-level concepts compared to those requiring access to superordinate category information [70]. This highlights that the temporal relationship between neural activity in anterior and posterior temporal lobe regions plays an important role in the formation of detailed basic-level conceptual representations.
An important issue is whether interactions involving anterior and posterior regions in the ventral visual pathway are predominantly feedforward or feedback in nature, and how this might change during the course of perception. Combining neuropsychology and functional imaging is particularly illuminating. Patients with semantic deficits following neurological diseases affecting the anterior temporal lobes show reduced functional activity in the posterior aspects of the ventral stream 77, 78, suggesting that anterior damage impacts on the functioning of more-posterior sites. Consistent with this, small lesions to the temporal pole and rhinal cortices (perirhinal and entorhinal) create network dysfunction in the ventral visual pathway, specifically resulting in reduced feedback connectivity from the anterior temporal lobes to posterior fusiform [79]. Overall, these studies strongly suggest that feedback from the anterior temporal lobes, and from PRC, to the posterior ventral stream constitutes a necessary mechanism for accessing specific conceptual representations.
The role that brain connectivity plays in the organisation and orchestration of conceptual knowledge in the brain is yet to be fully appreciated [80]. We have emphasised that connectivity between anterior and posterior temporal lobe sites provides a key underpinning to forming specific basic-level conceptual representations [70], but how this within-temporal-lobe connectivity is coordinated with other networks (e.g., frontotemporal connectivity) remains an important unresolved issue 62, 81. One avenue for progress requires understanding how different brain networks are coordinated, the oscillatory nature of such connectivity and, vitally, how connectivity is modulated by well-characterised and distinct cognitive processes (see Outstanding Questions).
Concluding Remarks
We have argued here for a single explanatory framework, based on a feature-based account, to understand semantic cognition in the ventral visual pathway. This framework can account for several phenomena, previously unconnected, across behaviour, functional neuroimaging (fMRI, MEG), and brain-damaged patients. Progress in understanding conceptual representations in the brain is significantly advanced by shifting focus to the representation of basic-level concepts and to the relationships between them. We can then harness the potential of large feature-norming datasets to provide well-characterised models of semantic space whose regularities can be exploited using multivariate analysis methods applied to multiple imaging modalities.
Acknowledgments
We thank William Marslen-Wilson for his helpful comments on this manuscript. The research leading to these results has received funding to L.K.T. from the European Research Council under the European Commission Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement 249640.
Glossary
Basic-level concept | we can categorise the same object in many different ways ranging from more to less specific. Examples of the basic-level category are ‘dog’, ‘chair’, ‘hammer’, rather than more-specific (subordinate level; e.g., poodle) or less-specific (superordinate level; e.g., animal) names. The basic-level category of an object is typically the name you would give if asked the question – can you name this object? |
Conceptual knowledge | the information we know about things in the world. We use the term conceptual interchangeably with semantic. In contrast to episodic memory, our conceptual knowledge is not tied to any particular place or time; for example, it reflects our knowledge about tigers, rather than our memory of encountering a specific tiger in a specific context. |
Conceptual structure statistics | measures based on the regularities and co-occurrences of semantic feature information across different concepts, where the semantic features are typically obtained from large databases (e.g., large norming studies, corpus data). For example, ‘feature sharedness’, or how common a feature is across different concepts, may be calculated as 1/{the number of concepts a specific feature occurs in}. The mean ‘sharedness’ of a concept is then the mean ‘feature sharedness’ over all features in the concept. These statistics can be used to estimate the statistical structure of individual concepts and the relationship of concepts to each other, and have been shown to influence how conceptual information is accessed. |
Semantic features | Many models of conceptual knowledge assume that meaning is componential in that the meaning of a concept can be characterised by many smaller units of meaning. Semantic features, such as ‘has legs’ or ‘is round’, are one such approximation of those units and can be derived from property norming studies. Although semantic features are not claimed to be the neural units of meaning, the regularities and statistics derived from them are predicted to share some properties with how meaning is instantiated in the brain. |
Superordinate category | refers to groups made up of many concepts, where the grouping is based on semantic properties shared over the group. Superordinate categories can range from more specific categories such as animals, plants, and tools, to less specific categories such as nonliving things (artifacts). |
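The 'feature sharedness' calculation described in the glossary entry for conceptual structure statistics can be written out as a short sketch. The code follows the formula exactly as worded there (1 divided by the number of concepts a feature occurs in, then averaged over the features of a concept). The feature lists are invented toy data, and the function names are hypothetical. Note that, with the formula as written, values close to 1 correspond to features that occur in few concepts, and smaller values to features that occur in many.

```python
from collections import defaultdict

# Hypothetical toy property norms: concept -> set of semantic features.
NORMS = {
    "tiger": {"has legs", "has fur", "has eyes", "has stripes"},
    "lion": {"has legs", "has fur", "has eyes", "has a mane"},
    "camel": {"has legs", "has fur", "has eyes", "has a hump"},
    "hammer": {"has a handle", "made of metal", "used for hitting"},
}


def feature_sharedness(norms):
    """Per-feature statistic: 1 / (number of concepts the feature occurs in)."""
    counts = defaultdict(int)
    for features in norms.values():
        for feature in features:
            counts[feature] += 1
    return {feature: 1.0 / n for feature, n in counts.items()}


def mean_sharedness(norms):
    """Per-concept statistic: mean of the per-feature values over its features."""
    per_feature = feature_sharedness(norms)
    return {
        concept: sum(per_feature[f] for f in features) / len(features)
        for concept, features in norms.items()
    }


if __name__ == "__main__":
    for concept, value in mean_sharedness(NORMS).items():
        print(concept, round(value, 3))
```

In this toy set the three animals share most of their features with one another, so their per-concept values are pulled well below 1, whereas every feature of the hammer occurs in only one concept, giving it a value of exactly 1.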
Full text links
Read article at publisher's site: https://doi.org/10.1016/j.tics.2015.08.008
Read article for free, from open access legal sources, via Unpaywall: http://www.cell.com/article/S1364661315001989/pdf
Funding
Funders who supported this work.
European Research Council (1)
From perception to conception: How the brain processes meaningful concepts (PERCEPCON)
Prof Lorraine K TYLER, University of Cambridge
Grant ID: 249640