Despite its centrality in the philosophy of cognitive science, there has been little prior philosophical work engaging with the notion of representation in contemporary NLP practice. This paper attempts to fill that lacuna: drawing on ideas from cognitive science, I introduce a framework for evaluating the representational claims made about components of neural NLP models, proposing three criteria with which to evaluate whether a component of a model represents a property and operationalising these criteria using probing classifiers, a popular analysis technique in NLP (and deep learning more broadly). The project of operationalising a philosophically informed notion of representation should be of interest to both philosophers of science and NLP practitioners. It affords philosophers a novel testing-ground for claims about the nature of representation, and helps NLP practitioners organise the large literature on probing experiments, suggesting novel avenues for empirical research.
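The probing methodology the abstract refers to can be sketched in a few lines (a toy illustration with simulated data, not the paper's own experiments): a simple classifier is trained to predict a property of interest from a model's frozen internal representations, and above-chance probe accuracy is then taken as defeasible evidence that the representations encode that property. The "hidden states" and property labels below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "hidden states" from a frozen model: 200 tokens, 16 dimensions.
# We pretend dimension 3 linearly encodes a binary property (e.g. plurality).
hidden = rng.normal(size=(200, 16))
labels = (hidden[:, 3] > 0).astype(int)

# Probing classifier: logistic regression trained by gradient descent.
w = np.zeros(16)
b = 0.0
for _ in range(500):
    logits = hidden @ w + b
    probs = 1 / (1 + np.exp(-logits))
    grad_w = hidden.T @ (probs - labels) / len(labels)
    grad_b = np.mean(probs - labels)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

# High probe accuracy is the (defeasible) evidence that the
# representation encodes the probed property.
accuracy = np.mean((hidden @ w + b > 0) == labels)
print(accuracy)
```

Note that the criteria proposed in the paper go beyond raw probe accuracy; the sketch only shows the basic experimental setup that those criteria refine.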
The last 5 years have seen a series of remarkable achievements in deep-neural-network-based artificial intelligence research, and some modellers have argued that their performance compares favourably to human cognition. Critics, however, have argued that processing in deep neural networks is unlike human cognition for four reasons: they are (i) data-hungry, (ii) brittle, and (iii) inscrutable black boxes that merely (iv) reward-hack rather than learn real solutions to problems. This article rebuts these criticisms by exposing comparative bias within them, in the process extracting some more general lessons that may also be useful for future debates.
This book surveys and examines the most famous philosophical arguments against building a machine with human-level intelligence. From claims and counter-claims about the ability to implement consciousness, rationality, and meaning, to arguments about cognitive architecture, the book presents a vivid history of the clash between philosophy and AI. Tellingly, the AI Wars are mostly quiet now. Explaining this crucial fact opens new paths to understanding the current resurgence of AI (especially deep learning AI and robotics), what happens when philosophy meets science, and the role of philosophy in the culture in which it is embedded.

Organising the arguments into four core topics - 'Is AI possible', 'Architectures of the Mind', 'Mental Semantics and Mental Symbols' and 'Rationality and Creativity' - this book shows the debate that played out between philosophers on both sides of the question, as well as the debate between philosophers and the AI scientists and engineers building AI systems. Up-to-date and forward-looking, the book is packed with fresh insights and supporting material, including:

- Accessible introductions to each war, explaining the background behind the main arguments against AI
- Chapters detailing what happened in each of the AI wars, the legacy of the attacks, and what new controversies are on the horizon
- An extensive bibliography of key readings
Context sensitivity is one of the distinctive marks of human intelligence. Understanding the flexible way in which humans think and act in a potentially infinite number of circumstances, even though they're only finite and limited beings, is a central challenge for the philosophy of mind and cognitive science, particularly for those using representational theories. In this work, the frame problem, that is, the challenge of explaining how human cognition efficiently distinguishes what is relevant from what is not in each context, has been adopted as a guide. By using it, we've been able to describe a fundamental tension between context sensitivity and the mental representations used in theories of cognition. The first chapter discusses the nature of the frame problem, as well as the reasons for its persistence. In the second and third chapters, the problem is used as a measuring tool in order to inquire into a few representational approaches and check how well suited they are to deal with context dependencies. The problems found are then correlated with the frame problem. Throughout the discussion, we try to show that 1) none of the evaluated approaches is capable of dealing with context sensitivity in a proper manner, but 2) that's not a reason to think that the frame problem constitutes an argument against representational approaches in general, and 3) that it constitutes a fundamental conceptual tool in contemporary research.
This book provides a framework for thinking about foundational philosophical questions surrounding machine learning as an approach to artificial intelligence. Specifically, it links recent breakthroughs in deep learning to classical empiricist philosophy of mind. In recent assessments of deep learning's current capabilities and future potential, prominent scientists have cited historical figures from the perennial philosophical debate between nativism and empiricism, which primarily concerns the origins of abstract knowledge. These empiricists were generally faculty psychologists; that is, they argued that the active engagement of general psychological faculties - such as perception, memory, imagination, attention, and empathy - enables rational agents to extract abstract knowledge from sensory experience. This book explains a number of recent attempts to model roles attributed to these faculties in deep-neural-network-based artificial agents by appeal to the faculty psychology of philosophers such as Aristotle, Ibn Sina (Avicenna), John Locke, David Hume, William James, and Sophie de Grouchy. It illustrates the utility of this interdisciplinary connection by showing how it can provide benefits to both philosophy and computer science: computer scientists can continue to mine the history of philosophy for ideas and aspirational targets to hit on the way to more robustly rational artificial agents, and philosophers can see how some of the historical empiricists' most ambitious speculations can be realized in specific computational systems.
Methods of machine learning (ML) are gradually complementing and sometimes even replacing methods of classical statistics in science. This raises the question whether ML faces the same methodological problems as classical statistics. This paper sheds light on this question by investigating a long-standing challenge to classical statistics: the reference class problem (RCP). It arises whenever statistical evidence is applied to an individual object, since the individual belongs to several reference classes and the evidence might vary across them. Thus, the problem consists in choosing a suitable reference class for the individual. I argue that deep neural networks (DNNs) are able to overcome specific instantiations of the RCP. Whereas the criteria of narrowness, reliability, and homogeneity, which have been proposed to determine a suitable reference class, pose an inextricable tradeoff for classical statistics, DNNs are able to satisfy them in some situations. On the one hand, they can exploit the high dimensionality in big-data settings. I argue that this corresponds to the criteria of narrowness and reliability. On the other hand, ML research indicates that DNNs are generally not susceptible to overfitting. I argue that this property is related to a particular form of homogeneity. Taking both aspects together reveals that there are specific settings in which DNNs can overcome the RCP.
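The reference class problem the abstract describes can be stated numerically in a few lines (a toy illustration with made-up counts, not drawn from the paper): the same individual belongs to several reference classes at once, and the statistical evidence assigns a different relative frequency to the outcome depending on which class is chosen.

```python
# One individual, several overlapping reference classes,
# and a different relative frequency of the outcome in each.
# All counts below are hypothetical.
classes = {
    "smokers":          {"cases": 400, "total": 1000},
    "marathon_runners": {"cases": 20,  "total": 1000},
    "smokers_who_run":  {"cases": 50,  "total": 1000},
}

# The evidence yields a different probability for the very same
# individual depending on the chosen class -- that is the problem.
frequencies = {name: c["cases"] / c["total"] for name, c in classes.items()}
for name, freq in frequencies.items():
    print(name, freq)
```

The paper's claim is that a DNN trained on high-dimensional data can, in effect, carve out a narrow yet reliable reference class for each input, dissolving this tradeoff in some settings.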
Two universal functional principles of Grossberg’s Adaptive Resonance Theory decipher the brain code of all biological learning and adaptive intelligence. Low-level representations of multisensory stimuli in their immediate environmental context are formed on the basis of bottom-up activation and under the control of top-down matching rules that integrate high-level, long-term traces of contextual configuration. These universal coding principles lead to the establishment of lasting brain signatures of perceptual experience in all living species, from aplysiae to primates. They are re-visited in this concept paper on the basis of examples drawn from the original code and from some of the most recent related empirical findings on contextual modulation in the brain, highlighting the potential of Grossberg’s pioneering insights and groundbreaking theoretical work for intelligent solutions in the domain of developmental and cognitive robotics.
Machine learning operates at the intersection of statistics and computer science. This raises the question as to its underlying methodology. While much emphasis has been put on the close link between the process of learning from data and induction, the falsificationist component of machine learning has received little attention. In this paper, we argue that the idea of falsification is central to the methodology of machine learning. It is commonly thought that machine learning algorithms infer general prediction rules from past observations. This is akin to a statistical procedure by which estimates are obtained from a sample of data. But machine learning algorithms can also be described as choosing one prediction rule from an entire class of functions. In particular, the algorithm that determines the weights of an artificial neural network operates by empirical risk minimization and rejects prediction rules that lack empirical adequacy. It also exhibits a behavior of implicit regularization that pushes hypothesis choice toward simpler prediction rules. We argue that taking both aspects together gives rise to a falsificationist account of artificial neural networks.
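The falsificationist reading sketched in the abstract can be made concrete with a toy example (my own illustration, not taken from the paper): given a finite hypothesis class, empirical risk minimization scores every candidate prediction rule against the data, and rules with high empirical risk are "falsified" while the rule with minimal risk survives.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data generated by the true rule y = 2x plus noise.
x = rng.uniform(-1, 1, size=100)
y = 2 * x + rng.normal(scale=0.1, size=100)

# A finite hypothesis class: linear rules y = a*x for candidate slopes a.
candidates = [-1.0, 0.0, 1.0, 2.0, 3.0]

def empirical_risk(a):
    """Mean squared error of the rule y = a*x on the sample."""
    return np.mean((y - a * x) ** 2)

# Empirical risk minimization: rules that lack empirical adequacy
# (high risk) are rejected; the rule with minimal risk is retained.
risks = {a: empirical_risk(a) for a in candidates}
best = min(risks, key=risks.get)
print(best)
```

Training a neural network does the same thing over a continuous, vastly larger function class, with gradient descent performing the search instead of enumeration.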
Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam’s razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building-block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly from large sets of unstructured image or video data parsimoniously through relatively simple unsupervised machine learning algorithms. The review then examines why we still massively lack the expertise to exploit big data wisely in order to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data, and brings forward examples illustrating why we need subjective views and pragmatic methods to analyze big data contents. The review concludes on how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future. Keywords: big data; non-dimensionality; applied data science; paradigm shift; artificial intelligence; principle of parsimony (Occam’s razor).
Can the machines that play board games or recognize images only in the comfort of the virtual world be intelligent? To become reliable and convenient assistants to humans, machines need to learn how to act and communicate in the physical reality, just like people do. The authors propose two novel ways of designing and building Artificial General Intelligence (AGI). The first seeks to unify all participants at any instance of the Turing test – the judge, the machine, and the human subject, as well as the means of observation – instead of building a separating wall. The second aims to design AGI programs in such a way that they can move in various environments. The authors of the article thoroughly discuss four areas of interaction for robots with AGI and introduce a new idea of the techno-umwelt, bridging artificial intelligence with biology in a new way.
A crucial question for artificial cognition systems is what meaning is and how it arises. In pursuit of that question, this paper extends earlier work in which we show the emergence of simple signaling in biologically inspired models using arrays of locally interactive agents. Communities of "communicators" develop in an environment of wandering food sources and predators using any of a variety of mechanisms: imitation of successful neighbors, localized genetic algorithms, and partial neural net training on successful neighbors. Here we focus on environmental variability, comparing results for environments with (a) constant resources, (b) random resources, and (c) cycles of "boom and bust." In both simple and complex models across all three mechanisms of strategy change, the emergence of communication is strongly favored by cycles of "boom and bust." These results are particularly intriguing given the importance of environmental variability in fields as diverse as psychology, ecology, and cultural anthropology.
The ontology of social objects and facts remains a field of continued controversy. This situation complicates the life of social scientists who seek to make predictive models of social phenomena. For the purposes of modelling a social phenomenon, we would like to avoid having to make any controversial ontological commitments. The overwhelming majority of models in the social sciences, including statistical models, are built upon ontological assumptions that can be questioned. Recently, however, artificial neural networks have made their way into the social sciences, raising the question whether they can avoid controversial ontological assumptions. ANNs are largely distinguished from other statistical and machine learning techniques by being a representation-learning technique. That is, researchers can let the neural networks select which features of the data to use for internal representation instead of imposing their preconceptions. On this basis, I argue that neural networks can avoid ontological assumptions to a greater degree than common statistical models in the social sciences. I then go on, however, to establish that ANNs are not ontologically innocent either. The use of ANNs in the social sciences introduces ontological assumptions typically in at least two ways, via the input and via the architecture.
A connectome is a comprehensive map of the neural connections in a nervous system of a given species. This chapter provides a critical perspective on the role of connectomes in neuroscientific practice and asks how the connectomic approach fits into a larger context in which network thinking permeates technology, infrastructure, social life, and the economy. In the first part of this chapter, we argue that, seen from the perspective of ongoing research, the notion of connectomes as “complete descriptions” is misguided. Our argument combines Rachel Ankeny’s analysis of neuroanatomical wiring diagrams as “descriptive models” with Hans-Joerg Rheinberger’s notion of “epistemic objects,” i.e., targets of research that are still partially unknown. Combining these aspects, we conclude that connectomes are constitutively epistemic objects: there just is no way to turn them into permanent and complete technical standards, because the possibilities to map connection properties under different modeling assumptions are potentially inexhaustible. In the second part of the chapter, we use this understanding of connectomes as constitutively epistemic objects to critically assess the historical and political dimensions of current neuroscientific research. We argue that connectomics shows how the notion of the “brain as a network” has become the dominant metaphor of contemporary brain research. We further point out that this metaphor shares (potentially problematic) affinities with the form of contemporary “network societies.” We close by pointing out how the relation between connectomes and networks in society could be used in a more fruitful manner.
In recent years, the family of algorithms collected under the term ``deep learning'' has revolutionized artificial intelligence, enabling machines to reach human-like performance in many complex cognitive tasks. Although deep learning models are grounded in the connectionist paradigm, their recent advances were basically developed with engineering goals in mind. Despite their applied focus, deep learning models eventually seem fruitful for cognitive purposes. This can be thought of as a kind of biological exaptation, where a physiological structure becomes applicable to a function different from that for which it was selected. In this paper, it will be argued that it is time for cognitive science to seriously come to terms with deep learning, and we try to spell out the reasons why this is the case. First, the path of the evolution of deep learning from the connectionist project is traced, demonstrating the remarkable continuity as well as the differences. Then, it will be considered how deep learning models can be useful for many cognitive topics, especially those where they have achieved performance comparable to humans, from perception to language. It will be maintained that deep learning poses questions that the cognitive sciences should try to answer. One such question is why deep convolutional models, which are disembodied, inactive, unaware of context, and static, are by far the closest to the patterns of activation in the brain's visual system.
This paper attempts to describe and address a specific puzzle related to compositionality in artificial networks such as Deep Neural Networks and machine learning in general. The puzzle identified here touches on a larger debate in Artificial Intelligence related to epistemic opacity but specifically focuses on computational applications of human-level linguistic abilities or properties and a special difficulty in relation to these. Thus, the resulting issue is both general and unique. A partial solution is suggested.
In this paper, I argue that theories of perception that appeal to Helmholtz’s idea of unconscious inference (“Helmholtzian” theories) should be taken literally, i.e. that the inferences appealed to in such theories are inferences in the full sense of the term, as employed elsewhere in philosophy and in ordinary discourse.

In the course of the argument, I consider constraints on inference based on the idea that inference is a deliberate action, and on the idea that inferences depend on the syntactic structure of representations. I argue that inference is a personal-level but sometimes unconscious process that cannot in general be distinguished from association on the basis of the structures of the representations over which it’s defined. I also critique arguments against representationalist interpretations of Helmholtzian theories, and argue against the view that perceptual inference is encapsulated in a module.
This chapter focuses on what’s novel in the perspective that the prediction error minimization (PEM) framework affords on the cognitive-scientific project of explaining intelligence by appeal to internal representations. It shows how truth-conditional and resemblance-based approaches to representation in generative models may be integrated. The PEM framework in cognitive science is an approach to cognition and perception centered on a simple idea: organisms represent the world by constantly predicting their own internal states. PEM theories often stress the hierarchical structure of the generative models they posit. The novel explanatory power of the PEM account derives largely from the way in which pairs of generative and recognition models interact. “Predictive coding” refers to an encoding strategy in which predicted portions of an input signal are subtracted from the actual signal received, so that only the difference between the two is passed as output to the next stage of information processing.
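The predictive-coding encoding strategy defined in the abstract amounts to a single subtraction (a schematic sketch of the general idea, not of any specific model): the top-down prediction is subtracted from the incoming signal, and only the residual error is passed forward.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0])      # actual input received
prediction = np.array([1.0, 2.0, 2.5, 4.0])  # top-down prediction

# Predictive coding: only the prediction error (the difference between
# signal and prediction) is passed to the next processing stage.
error = signal - prediction
print(error)
```

Where the prediction succeeds, the error is zero and nothing needs to be transmitted; this is what makes the scheme an efficient encoding strategy.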
Deep learning is currently the most prominent and widely successful method in artificial intelligence. Despite having played an active role in earlier artificial intelligence and neural network research, philosophers have been largely silent on this technology so far. This is remarkable, given that deep learning neural networks have blown past predicted upper limits on artificial intelligence performance—recognizing complex objects in natural photographs and defeating world champions in strategy games as complex as Go and chess—yet there remains no universally accepted explanation as to why they work so well. This article provides an introduction to these networks as well as an opinionated guidebook on the philosophical significance of their structure and achievements. It argues that deep learning neural networks differ importantly in their structure and mathematical properties from the shallower neural networks that were the subject of so much philosophical reflection in the 1980s and 1990s. The article then explores several different explanations for their success and ends by proposing three areas of inquiry that would benefit from future engagement by philosophers of mind and science.
Artificial intelligence (AI) research enjoyed an initial period of enthusiasm in the 1970s and 80s. But this enthusiasm was tempered by a long interlude of frustration when genuinely useful AI applications failed to be forthcoming. Today, we are experiencing once again a period of enthusiasm, fired above all by the successes of the technology of deep neural networks or deep machine learning. In this paper we draw attention to what we take to be serious problems underlying current views of artificial intelligence encouraged by these successes, especially in the domain of language processing. We then show an alternative approach to language-centric AI, in which we identify a role for philosophy.
Endowing artificial systems with explanatory capacities about the reasons guiding their decisions represents a crucial challenge and research objective in the current fields of Artificial Intelligence (AI) and Computational Cognitive Science [Langley et al., 2017]. Current mainstream AI systems, in fact, despite the enormous progress reached in specific tasks, mostly fail to provide a transparent account of the reasons determining their behavior (in cases of both successful and unsuccessful output). This is due to the fact that the classical problem of opacity in artificial neural networks (ANNs) explodes with the adoption of current Deep Learning techniques [LeCun, Bengio, Hinton, 2015]. In this paper we argue that the explanatory deficit of such techniques represents an important problem that limits their adoption in the cognitive modelling and computational cognitive science arena. In particular, we will show how the current attempts at providing explanations of deep nets' behaviour (see e.g. [Ritter et al. 2017]) are not satisfactory. As a possible way out of this problem, we present two different research strategies. The first strategy aims at dealing with the opacity problem by providing a more abstract interpretation of neural mechanisms and representations. This approach is adopted, for example, by the biologically inspired SPAUN architecture [Eliasmith et al., 2012] and by other proposals suggesting, for example, the interpretation of neural networks in terms of the Conceptual Spaces framework [Gärdenfors 2000; Lieto, Chella and Frixione, 2017]. All such proposals presuppose that the neural level of representation can be considered somehow irrelevant for attacking the problem of explanation [Lieto, Lebiere and Oltramari, 2017]. In our opinion, pursuing this research direction can still preserve the use of deep learning techniques in artificial cognitive models, provided that novel and additional results in terms of “transparency” are obtained.
The second strategy is somewhat at odds with the previous one and tries to address the explanatory issue without directly solving the “opacity” problem. In this case, the idea is that of resorting to pre-compiled plausible explanatory models of the world used in combination with deep nets (see e.g. [Augello et al. 2017]). We argue that this research agenda, even if it does not directly fit the explanatory needs of Computational Cognitive Science, can still be useful for providing results in the area of applied AI, shedding light on the models of interaction between low-level and high-level tasks (e.g. between perceptual categorization and explanation) in artificial systems.
In artificial intelligence, recent research has demonstrated the remarkable potential of Deep Convolutional Neural Networks (DCNNs), which seem to exceed state-of-the-art performance in new domains weekly, especially on the sorts of very difficult perceptual discrimination tasks that skeptics thought would remain beyond the reach of artificial intelligence. However, it has proven difficult to explain why DCNNs perform so well. In philosophy of mind, empiricists have long suggested that complex cognition is based on information derived from sensory experience, often appealing to a faculty of abstraction. Rationalists have frequently complained, however, that empiricists never adequately explained how this faculty of abstraction actually works. In this paper, I tie these two questions together, to the mutual benefit of both disciplines. I argue that the architectural features that distinguish DCNNs from earlier neural networks allow them to implement a form of hierarchical processing that I call “transformational abstraction”. Transformational abstraction iteratively converts sensory-based representations of category exemplars into new formats that are increasingly tolerant to “nuisance variation” in input. Reflecting upon the way that DCNNs leverage a combination of linear and non-linear processing to efficiently accomplish this feat allows us to understand how the brain is capable of bi-directional travel between exemplars and abstractions, addressing longstanding problems in empiricist philosophy of mind. I end by considering the prospects for future research on DCNNs, arguing that rather than simply implementing 80s connectionism with more brute-force computation, transformational abstraction counts as a qualitatively distinct form of processing ripe with philosophical and psychological significance, because it is significantly better suited to depict the generic mechanism responsible for this important kind of psychological processing in the brain.
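The combination of linear and non-linear processing the abstract invokes can be gestured at with a one-dimensional toy example (my own sketch, not the paper's): a linear filter detects a feature, a non-linearity discards irrelevant detail, and max-pooling makes the resulting representation tolerant to where in the input the feature occurred, a simple instance of "nuisance variation".

```python
import numpy as np

def detect(signal, kernel):
    """Linear filtering + ReLU + global max-pooling over positions."""
    responses = np.convolve(signal, kernel, mode="valid")  # linear step
    responses = np.maximum(responses, 0)                   # non-linear step
    return responses.max()                                 # pooling step

kernel = np.array([1.0, -1.0])  # responds to a sharp rise in the signal

a = np.array([0.0, 5.0, 0.0, 0.0, 0.0])  # feature early in the input
b = np.array([0.0, 0.0, 0.0, 5.0, 0.0])  # same feature, shifted later

# Pooling yields the same response regardless of position: the
# representation is tolerant to this form of nuisance variation.
print(detect(a, kernel), detect(b, kernel))
```

DCNNs stack many such filter/non-linearity/pooling stages, which is what lets tolerance to simple translations compound into tolerance to much richer nuisance variation.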
Between the later views of Wittgenstein and those of connectionism on the subject of the mastery of language there is an impressively large number of similarities. The task of establishing this claim is carried out in the second section of this paper.
Within the controversy between the combinatorial and the connectionist approaches to cognition it has been argued that our semantic and syntactic capacities provide evidence for the combinatorial approach. In this paper I offer a counter-weight to this argument by pointing out that the same type of considerations, when applied to the pragmatics of adjectives, provide evidence for connectionism.
In “On Begging the Systematicity Question,” Wayne Davis criticizes the suggestion of Cummins et al. that the alleged systematicity of thought is not as obvious as is sometimes supposed, and hence not reliable evidence for the language of thought hypothesis. We offer a brief reply.