Cognitive Architecture, Holistic Inference and Bayesian Networks

Abstract

Two long-standing arguments in cognitive science invoke the assumption that holistic inference is computationally infeasible. The first is Fodor’s skeptical argument toward computational modeling of ordinary inductive reasoning. The second advocates modular computational mechanisms of the kind posited by Cosmides, Tooby and Sperber. Based on advances in machine learning related to Bayes nets, as well as investigations into the structure of scientific and ordinary information, I maintain neither argument establishes its architectural conclusion. Similar considerations also undermine Fodor’s decades-long diagnosis of artificial intelligence research as confounded by an inability to circumscribe the amount of information relevant to inferential processes. This diagnosis is particularly inapposite with respect to Bayes nets, since one of their strengths as machine learning systems has been their capacity to reason probabilistically about large data sets whose size overwhelms the capacities of individual human reasoners. A general moral follows from these criticisms: Insights into artificial and human cognitive systems are likely to be cultivated by focusing greater attention on the structure and density of connections among items of information that are available to them.

Notes

  1. “Isotropy” is one of two senses in which Fodor maintains non-deductive inferences are holistic; the other is “Quinean” (Fodor 1983, 2008). These senses of holistic inference are characterized in Sects. 5.1 and 5.2, respectively.

  2. A canonical example of combinatorial explosion concerns updating according to the probability calculus (Harman 1986). Such updating requires calculating the probability of every combination of simple propositions in one’s background theory joined by logical connectives, where the set of such complex propositions grows exponentially with the number of simple propositions. As Harman points out, calculating the probabilities of just thirty simple propositions “combinatorially explodes” in the sense of requiring more than a billion calculations.
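
To make Harman’s arithmetic concrete, the following sketch (Python, purely illustrative) counts the truth-value combinations over n simple propositions; at thirty propositions the joint table already exceeds a billion entries.

```python
# Illustration of Harman's combinatorial explosion: a joint probability
# distribution over n binary propositions has 2**n entries, so naive
# updating must consider a number of combinations exponential in n.

def joint_table_size(n_propositions: int) -> int:
    """Number of truth-value combinations over n simple propositions."""
    return 2 ** n_propositions

for n in (10, 20, 30):
    print(f"{n} propositions -> {joint_table_size(n):,} combinations")
# 30 propositions -> 1,073,741,824 combinations: Harman's "more than a billion".
```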

  3. In this quote I have substituted “non-deductive” for “abductive”. Over several decades, Fodor alternated between presenting holism as a feature of non-deductive inference generally (Fodor 1983, 2008) and as a feature of abduction in particular (Fodor 2000). Here I will follow Fodor 1983, 2008 in treating holism as a feature of non-deductive inference generally.

  4. Fodor’s characterization of the frame problem is intentionally broad compared to a narrower and more standard characterization within AI that restricts the problem to action-planning (cf. Pylyshyn 1987). Fodor, however, believes that characterizing the frame problem as solely relevant to action-planning masks that it is an instance of a more general problem (cf. Fodor 1987).

  5. See Cosmides and Tooby 1994, p. 104 and Sperber 1994, p. 49 for claims that non-modular processing is likely computationally infeasible. See Carruthers 2006, chapter 1 for similar arguments.

  6. Notoriously, the notion of a “module” is multiply ambiguous in cognitive science. Some conceptions of a module do not treat such informational restrictions as central features, but instead treat performing a proprietary function as criterial for modularity (e.g., Barrett and Kurzban 2006). However, our focus will be on modules conceived as informationally restricted cognitive systems, since they are typically the subjects of computational feasibility arguments. “Modularity” is also used in a mathematical sense, which is relevant to this article, but does not refer to architectural features of cognitive systems. Rather, it refers to a measure of connection density as opposed to sparsity among communities of nodes in a network. One aim of this article is to argue that a mathematical notion of modularity is at least as important to understanding the computational feasibility of cognitive processing as the more traditional conception of modularity in terms of informationally restricted computational systems. A similar point, albeit from the angle of characterizing neural connectivity rather than the structure of Bayes nets, is also pursued in Colombo 2013.
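
To illustrate the mathematical sense of “modularity” at issue, the following sketch scores a hypothetical six-node network, two dense communities joined by a single bridge, against an equally hypothetical two-community partition; networkx is assumed to be available.

```python
# Modularity in the mathematical sense: a score of how much denser
# connections are within communities than between them.
import networkx as nx
from networkx.algorithms.community import modularity

# Two triangles joined by one bridge edge (a hypothetical sparse network).
g = nx.Graph([(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)])
print(modularity(g, [{1, 2, 3}, {4, 5, 6}]))  # ~0.36: clearly modular
```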

  7. Bayes nets are “directed” in the sense that the edges or arrows between variables point only in one direction. They are “acyclic” in the sense that tracing the direction of the arrows will not result in a path that begins and ends with the same variable.
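
As a minimal sketch (the three-variable network and its names are invented), a Bayes net’s structure can be encoded as a mapping from each variable to its parents, with acyclicity checked by attempting a topological ordering:

```python
# Acyclicity guarantees that the factorization
# P(X1, ..., Xn) = prod_i P(Xi | parents(Xi)) is well defined.
# graphlib (Python 3.9+) raises CycleError if no topological order exists.
from graphlib import TopologicalSorter, CycleError

parents = {
    "Rain": [],
    "Sprinkler": [],
    "WetGrass": ["Rain", "Sprinkler"],  # arrows point from parents to child
}

try:
    print("Acyclic; valid ordering:",
          list(TopologicalSorter(parents).static_order()))
except CycleError:
    print("Cycle detected: not a valid Bayes net structure")
```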

  8. “Causal” Bayes nets are networks in which directed edges are given a causal interpretation; these have been the focus of Bayes nets’ applications to early cognitive development.

  9. Gopnik and Glymour 2006, for example, allow that Bayes nets are likely only partial models of childhood cognitive development (p. 42). And Glymour 2004 claims that the Bayes nets formalism should not be regarded as the entire story concerning causal reasoning in science (p. 784). However, see Danks 2014 for an extended argument that a core of the Bayes nets formalism is a significantly more general model of cognition than many have supposed.

  10. Fodor has better-known criticisms of artificial neural networks based on their purported incapacity to account for productivity and systematicity (Fodor and Pylyshyn 1988). But he also faults them for their incapacity to accommodate the holistic character of non-deductive inference (Fodor 2000), which is our focus here.

  11. A variable X is conditionally independent of another variable Y given a third variable Z just in case: P(X | Y, Z) = P(X | Z). Perhaps more intuitively, conditional independence might also be defined in terms of events that variables denote. On one such definition, an event A is conditionally independent of an event B given an event C just in case: P(A∩B | C) = P(A | C)P(B | C), where P(C) > 0.
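
For concreteness, the following sketch (with invented numbers) builds a joint distribution in which X and Y are each generated from Z alone, so X is conditionally independent of Y given Z by construction, and checks the event-based definition numerically:

```python
# X and Y each depend only on Z, so X is independent of Y given Z by design.
from itertools import product

p_z = {0: 0.6, 1: 0.4}
p_x = {0: 0.2, 1: 0.9}  # P(X=1 | Z=z)
p_y = {0: 0.7, 1: 0.3}  # P(Y=1 | Z=z)

def joint(x, y, z):
    px = p_x[z] if x else 1 - p_x[z]
    py = p_y[z] if y else 1 - p_y[z]
    return p_z[z] * px * py

def prob(event):
    """Total probability of all (x, y, z) assignments satisfying `event`."""
    return sum(joint(*a) for a in product((0, 1), repeat=3) if event(*a))

pc = prob(lambda x, y, z: z == 1)
lhs = prob(lambda x, y, z: x and y and z == 1) / pc        # P(A∩B | C)
rhs = (prob(lambda x, y, z: x and z == 1) / pc) * \
      (prob(lambda x, y, z: y and z == 1) / pc)            # P(A|C) P(B|C)
assert abs(lhs - rhs) < 1e-12                              # both 0.27
```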

  12. A brief explanation of the steps for constructing and identifying a clique or “fully connected” set of variables will provide a clearer characterization of computational feasibility on a Bayes net. The first step is to “moralize” a Bayesian network by removing the directionality of its arrows and inserting an (undirected) edge between all of its spouses. (In the language of familial relations used to characterize relationships among nodes on a Bayes net, a “parent” node sits at the tail of a directed edge and its “child” at the head, so that arrows point from parents to children. “Spouses” are parents of at least one common child.) Next, make the graph “chordal” by adding an (undirected) edge between non-adjacent nodes in every cycle of length greater than three. Finally, identify the sets of nodes in which a single (undirected) edge connects every pair of nodes—i.e., sets in which any two variables are directly linked to each other by a single (undirected) edge. Such “fully connected” sets are referred to as “cliques” (cf. Koller and Friedman 2009, p. 35; Bertolero and Griffiths 2014, p. 188).
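
The pipeline just described can be run end to end with networkx (assumed available); the four-node network below is hypothetical, with A and B as spouses sharing the child C:

```python
import networkx as nx

# A and B are spouses (both parents of C), so moralization links them.
dag = nx.DiGraph([("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")])

moral = nx.moral_graph(dag)                       # drop directions, marry spouses
chordal, _ = nx.complete_to_chordal_graph(moral)  # add chords to long cycles
cliques = list(nx.chordal_graph_cliques(chordal))

print(sorted(map(sorted, cliques)))               # [['A', 'B', 'C'], ['B', 'C', 'D']]
print("largest clique:", max(map(len, cliques)))  # 3; see note 13 for why this matters
```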

  13. Why does largest clique size rather than network size determine the computational complexity of probabilistic inference on a Bayes net? Essentially this is a theorem whose full appreciation requires understanding its proof (cf. Koller and Friedman 2009, chapters 9 and 10). At an intuitive level, “variable elimination” algorithms efficiently and exactly compute the probabilities of all variables on a Bayes net by avoiding the repeated computations that would otherwise be exponentially costly in the size of the net. However, even variable elimination algorithms cannot “go around” the subgraph that constitutes a Bayes net’s largest clique, in the sense that calculations that are exponentially costly in the number of the largest clique’s constituent nodes are necessary intermediary steps.
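
A minimal variable-elimination sketch (binary variables and invented conditional probability tables over a chain A → B → C) makes the intuition tangible: each elimination step costs on the order of two raised to the size of the factor’s scope, so the largest scope encountered, rather than the number of nodes in the net, sets the price of exact inference.

```python
from itertools import product

def eliminate(factors, var):
    """Multiply all factors mentioning `var`, then sum `var` out.
    Cost is O(2**len(scope)): exponential in the factor's scope, not the net."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    scope = sorted(set().union(*(f[0] for f in touching)) - {var})
    table = {}
    for assign in product((0, 1), repeat=len(scope)):
        env = dict(zip(scope, assign))
        total = 0.0
        for v in (0, 1):
            env[var] = v
            term = 1.0
            for sc, tab in touching:
                term *= tab[tuple(env[s] for s in sc)]
            total += term
        table[assign] = total
    return rest + [(tuple(scope), table)]

# Chain A -> B -> C with made-up conditional probability tables.
factors = [
    (("A",), {(0,): 0.3, (1,): 0.7}),
    (("A", "B"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.6}),
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}),
]
for v in ("A", "B"):                   # eliminate upstream variables in order
    factors = eliminate(factors, v)

(scope, table), = factors
print(scope, table)                    # ('C',) {(0,): 0.708, (1,): 0.292}
```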

  14. There is a wrinkle here, however, given that one might charitably interpret at least one sense of “centrality” within the framework of Bayes nets as “member of the net’s largest clique”. This interpretation raises the worry that conservative revision processes might be computationally infeasible because identifying a Bayes net’s largest clique is NP-hard. However, such an interpretation of “centrality” would not in fact undermine this article’s central critique, which targets Fodor and several massive modularity theorists for failing to establish that bodies of information relevant to inferential processes are densely rather than sparsely structured. This critique holds in the present context because a prominent method for bypassing the problem of identifying a Bayes net’s largest clique, in the sense of nevertheless identifying the net’s overall community structure, is highly accurate and computationally feasible precisely when a network is sparse rather than dense (Girvan and Newman 2002). Thanks to an anonymous reviewer for raising this objection.
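
A sketch of the Girvan and Newman method via networkx (assumed available) shows why sparsity helps: in the hypothetical network below, a single bridge edge carries all between-community traffic, so removing edges by betweenness immediately exposes the two communities without any clique-finding.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two dense triangles joined by one sparse bridge (hypothetical network).
g = nx.Graph([(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)])

first_split = next(girvan_newman(g))     # removes highest-betweenness edge(s)
print([sorted(c) for c in first_split])  # [[1, 2, 3], [4, 5, 6]]
```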

  15. Cf. http://conceptnet.io.

  16. It should be noted, however, that learning a Bayes net’s structure from data (construed as variables bearing probabilistic dependencies), without any prior structural knowledge, is in general NP-hard. Thus, this article’s critique of Fodorian skepticism does not directly respond to a narrower frame problem that might be formulated in terms of specific (albeit empiricist) kinds of learning processes, as opposed to non-deductive inference generally. However, even restricting our scope to learning a Bayes net’s structure, Fodor’s frame problem arguably does not apply. This is because some of the most prominent heuristics for learning Bayes nets, including score-based structure learning algorithms, provably converge on accurate structures as the amount of data grows (Danks 2014, p. 55). Thus, as with probabilistic inference on extant Bayes nets, information size is not a central indicator of computational infeasibility for learning Bayes nets. In this case, more information leads to closer approximations of optimal performance rather than to computational infeasibility. Thanks to an anonymous reviewer for raising this objection.
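
To indicate what “score-based” learning involves, here is a toy comparison (data and candidate structures invented) in which two graphs over binary variables A and B are ranked by BIC, a penalized likelihood of the kind such algorithms maximize:

```python
import math
from collections import Counter

# Toy data over (A, B): B tends to match A, so the edge A -> B should win.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40
n = len(data)

def bic_empty():
    """BIC of the empty graph: A and B modeled as independent."""
    ll = sum(c * math.log(c / n)
             for i in (0, 1)
             for c in Counter(row[i] for row in data).values())
    return ll - (2 / 2) * math.log(n)          # 2 free parameters

def bic_edge():
    """BIC of the graph A -> B: factorization P(A) * P(B | A)."""
    a_counts = Counter(row[0] for row in data)
    ll = sum(c * math.log(c / n) for c in a_counts.values())
    ll += sum(c * math.log(c / a_counts[a])
              for (a, b), c in Counter(data).items())
    return ll - (3 / 2) * math.log(n)          # 3 free parameters

print("empty graph:", round(bic_empty(), 1))   # lower score
print("A -> B     :", round(bic_edge(), 1))    # higher score: edge preferred
```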

References

  • Barrett, H. C., & Kurzban, R. (2006). Modularity in cognition: Framing the debate. Psychological Review, 113, 628–647.

  • Bertolero, M. & Griffiths, T. (2014). “Is holism a problem for inductive inference? A computational analysis.” in Proceedings of the 36th Annual Conference of the Cognitive Science Society (pp. 188–193).

  • Carruthers, P. (2003). On Fodor’s problem. Mind & Language, 18, 502–523.

  • Carruthers, P. (2006). The architecture of the mind: Massive modularity and the flexibility of thought. Oxford: Oxford University Press.

  • Clark, A. (2002). Local associations and global reason: Fodor’s frame problem and second-order search. Cognitive Science Quarterly, 2(2), 115–140.

  • Colombo, M. (2013). Moving forward (and beyond) the modularity debate: A network perspective. Philosophy of Science, 80(3), 356–377.

  • Cosmides, L., & Tooby, J. (1994). Origins of domain-specificity: The evolution of functional organization. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain-specificity in cognition and culture. New York: Cambridge University Press.

  • Danks, D. (2014). Unifying the mind: Cognitive representations as graphical models. Cambridge: MIT Press.

  • Fodor, J. (1983). The modularity of mind. Cambridge: MIT Press.

  • Fodor, J. (1985). Précis of The Modularity of Mind (with peer commentaries and author’s response). Behavioral and Brain Sciences, 8, 1–42.

  • Fodor, J. (1987). Modules, frames fridgeons, sleeping dogs and the music of the spheres. In Z. Pylyshyn (Ed.), The Robot’s dilemma: The frame problem in artificial intelligence. Norwood: Ablex.

  • Fodor, J. (2000). The mind doesn’t work that way: The scope and limits of computational psychology. Cambridge: MIT Press.

  • Fodor, J. (2008). LOT 2: The language of thought revisited. New York: Oxford University Press.

  • Fodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3–71.

  • Fuller, T., & Samuels, R. (2014). Scientific inference and ordinary cognition: Fodor on holism and cognitive architecture. Mind and Language, 29(2), 201–237.

  • Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.

  • Glymour, C. (1980). Theory and evidence. Princeton: Princeton University Press.

  • Glymour, C. (1985). In Précis of The Modularity of Mind (with peer commentaries and author’s response). Behavioral and Brain Sciences, 8, 1–42.

  • Glymour, C. (2000). Bayes nets as psychological models. In F. C. Keil & R. A. Wilson (Eds.), Explanation and cognition (pp. 169–198). Cambridge: MIT Press.

  • Glymour, C. (2002). The mind’s arrows: Bayes nets and graphical causal models in psychology. Cambridge: MIT Press.

  • Glymour, C. (2004). The automation of discovery. Daedalus, Winter 2004, 69–77.

  • Glymour, C. (2010). What is right with ‘Bayes Net Methods’ and what is wrong with ‘Hunting Causes and Using Them’? British Journal for the Philosophy of Science, 61, 161–211.

  • Gopnik, A., & Glymour, C. (2006). A brand new ball game: Bayes net and neural net learning mechanisms in children. In Processes of change in brain and cognitive development: Attention and performance XXI (pp. 349–372).

  • Gopnik, A., Glymour, C., Sobel, D., Schulz, L., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111(1), 3–32.

  • Gopnik, A., & Schulz, L. (Eds.). (2007). Causal learning: Philosophy, psychology and computation. New York: Oxford University Press.

  • Gopnik, A., Sobel, D., Schulz, L., & Glymour, C. (2001). Causal learning mechanisms in very young children: Two-, three-, and four-year-olds infer causal relations from patterns of variation and covariation. Developmental Psychology, 37(5), 620–629.

  • Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.

  • Harman, G. (1986). Change in view: Principles of reasoning. Cambridge: MIT Press.

  • Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Cambridge: MIT Press.

  • Murphy, D. (2006). On Fodor’s analogy: Why psychology is like philosophy of science after all. Mind & Language, 21, 553–564.

  • Murzi, J., & Steinberger, F. (2017). “Inferentialism”, Blackwell Companion to Philosophy of Language (pp. 197–224). Hoboken: Wiley Blackwell.

  • Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Francisco: Morgan Kaufmann.

  • Pearl, J. (2000). Causality: models, reasoning, and inference. New York: Cambridge University Press.

  • Pinker, S. (2005). So how does the mind work? Mind & Language, 20, 1–24.

  • Prinz, J. (2006). Is the mind really modular? In R. Stainton (Ed.), Contemporary debates in cognitive science (pp. 22–36). Oxford: Blackwell.

  • Pylyshyn, Z. (Ed.). (1987). The Robot’s Dilemma: The Frame Problem in Artificial Intelligence. Norwood: Ablex.

  • Quine, W. (1953). From a logical point of view. Cambridge, Mass.: Harvard University Press.

  • Quine, W., & Ullian, J. (1970). The web of belief. New York: Random House.

  • Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118–1123.

  • Samuels, R. (1998). Evolutionary psychology and the massive modularity hypothesis. British Journal for the Philosophy of Science, 49, 575–602.

  • Samuels, R. (2005). The complexity of cognition: tractability arguments for massive modularity. In P. Carruthers, S. Laurence, & S. Stich (Eds.), The Innate Mind: Structure and Contents (pp. 107–121). Oxford: Oxford University Press.

  • Schneider, S. (2007). Yes, it does: A diatribe on Jerry Fodor’s The Mind Doesn’t Work that Way. Psyche, 13(1), 1–15.

  • Sloman, S. (2005). Causal models: How we think about the world and its alternatives. New York: Oxford University Press.

  • Sober, E. (1999). Testability. Proceedings and Addresses of the American Philosophical Association, 73, 47–76.

  • Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the Mind (pp. 39–67). Cambridge: Cambridge University Press.

  • Spiegelhalter, D. J., Franklin, R., & Bull, K. (1989). “Assessment, Criticism, and Improvement of Imprecise Probabilities for a Medical Expert System.” In Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence. Mountain View, CA (pp. 285–294).

  • Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. Berlin: Springer-Verlag.

  • Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press.

Acknowledgements

The author would like to thank Derek Baker, Maxwell Bertolero, Bennett Holman, Benjamin Jantzen, Edouard Machery, Richard Samuels, Kelly Trogdon, Carl Voss and Jiji Zhang for helpful feedback on drafts of this article. Additional thanks are extended to the philosophy departments at Virginia Tech University, Lingnan University and Underwood International College for valuable discussion on this article’s central ideas.

Corresponding author

Correspondence to Timothy J. Fuller.

Cite this article

Fuller, T.J. Cognitive Architecture, Holistic Inference and Bayesian Networks. Minds & Machines 29, 373–395 (2019). https://doi.org/10.1007/s11023-019-09505-7
