Cognitive Architecture, Holistic Inference and Bayesian Networks

Abstract

Two long-standing arguments in cognitive science invoke the assumption that holistic inference is computationally infeasible. The first is Fodor’s skeptical argument toward computational modeling of ordinary inductive reasoning. The second advocates modular computational mechanisms of the kind posited by Cosmides, Tooby and Sperber. Based on advances in machine learning related to Bayes nets, as well as investigations into the structure of scientific and ordinary information, I maintain neither argument establishes its architectural conclusion. Similar considerations also undermine Fodor’s decades-long diagnosis of artificial intelligence research as confounded by an inability to circumscribe the amount of information relevant to inferential processes. This diagnosis is particularly inapposite with respect to Bayes nets, since one of their strengths as machine learning systems has been their capacity to reason probabilistically about large data sets whose size overwhelms the capacities of individual human reasoners. A general moral follows from these criticisms: Insights into artificial and human cognitive systems are likely to be cultivated by focusing greater attention on the structure and density of connections among items of information that are available to them.

Notes

  1. “Isotropy” is one of two senses in which Fodor maintains non-deductive inferences are holistic; the other is “Quinean” (Fodor 1983, 2008). These senses of holistic inference are characterized in Sects. 5.1 and 5.2, respectively.

  2. A canonical example of combinatorial explosion concerns updating according to the probability calculus (Harman 1986). Such updating requires calculating the probability of every combination of simple propositions in one’s background theory joined by logical connectives, where the set of such complex propositions grows exponentially with the number of simple propositions. As Harman points out, calculating the probabilities of just thirty simple propositions “combinatorially explodes” in the sense of requiring more than a billion calculations.
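
To make Harman’s arithmetic concrete, the following sketch (Python, purely illustrative) counts the truth-value combinations over n simple propositions; at thirty propositions the joint table already exceeds a billion entries.

```python
# Illustration of Harman's combinatorial explosion: a joint probability
# distribution over n binary propositions has 2**n entries, so naive
# updating must consider a number of combinations exponential in n.

def joint_table_size(n_propositions: int) -> int:
    """Number of truth-value combinations over n simple propositions."""
    return 2 ** n_propositions

for n in (10, 20, 30):
    print(f"{n} propositions -> {joint_table_size(n):,} combinations")
# 30 propositions -> 1,073,741,824 combinations: Harman's "more than a billion".
```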

  3. In this quote I have substituted “non-deductive” for “abductive”. Over several decades, Fodor alternated between presenting holism as a feature of non-deductive inference generally (Fodor 1983, 2008) and as a feature of abduction in particular (Fodor 2000). Here I will follow Fodor 1983, 2008 in treating holism as a feature of non-deductive inference generally.

  4. Fodor’s characterization of the frame problem is intentionally broad compared to a narrower and more standard characterization within AI that restricts the problem to action-planning (cf. Pylyshyn 1987). Fodor, however, believes that characterizing the frame problem as solely relevant to action-planning masks that it is an instance of a more general problem (cf. Fodor 1987).

  5. See Cosmides and Tooby 1994, p. 104 and Sperber 1994, p. 49 for claims that non-modular processing is likely computationally infeasible. See Carruthers 2006, chapter 1 for similar arguments.

  6. Notoriously, the notion of a “module” is multiply ambiguous in cognitive science. Some conceptions of a module do not treat such informational restrictions as central features, but instead treat performing a proprietary function as criterial for modularity (e.g., Barrett and Kurzban 2006). However, our focus will be on modules conceived as informationally restricted cognitive systems, since they are typically the subjects of computational feasibility arguments. “Modularity” is also used in a mathematical sense, which is relevant to this article, but does not refer to architectural features of cognitive systems. Rather, it refers to a measure of connection density as opposed to sparsity among communities of nodes in a network. One aim of this article is to argue that a mathematical notion of modularity is at least as important to understanding the computational feasibility of cognitive processing as the more traditional conception of modularity in terms of informationally restricted computational systems. A similar point, albeit from the angle of characterizing neural connectivity rather than the structure of Bayes nets, is also pursued in Colombo 2013.
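
To illustrate the mathematical sense of “modularity” at issue, the following sketch scores a hypothetical six-node network, two dense communities joined by a single bridge, against an equally hypothetical two-community partition; networkx is assumed to be available.

```python
# Modularity in the mathematical sense: a score of how much denser
# connections are within communities than between them.
import networkx as nx
from networkx.algorithms.community import modularity

# Two triangles joined by one bridge edge (a hypothetical sparse network).
g = nx.Graph([(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)])
print(modularity(g, [{1, 2, 3}, {4, 5, 6}]))  # ~0.36: clearly modular
```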

  7. Bayes nets are “directed” in the sense that the edges or arrows between variables point only in one direction. They are “acyclic” in the sense that tracing the direction of the arrows will not result in a path that begins and ends with the same variable.
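
As a minimal sketch (the three-variable network and its names are invented), a Bayes net’s structure can be encoded as a mapping from each variable to its parents, with acyclicity checked by attempting a topological ordering:

```python
# Acyclicity guarantees that the factorization
# P(X1, ..., Xn) = prod_i P(Xi | parents(Xi)) is well defined.
# graphlib (Python 3.9+) raises CycleError if no topological order exists.
from graphlib import TopologicalSorter, CycleError

parents = {
    "Rain": [],
    "Sprinkler": [],
    "WetGrass": ["Rain", "Sprinkler"],  # arrows point from parents to child
}

try:
    print("Acyclic; valid ordering:",
          list(TopologicalSorter(parents).static_order()))
except CycleError:
    print("Cycle detected: not a valid Bayes net structure")
```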

  8. “Causal” Bayes nets are networks in which directed edges are given a causal interpretation; these have been the focus of Bayes nets’ applications to early cognitive development.

  9. Gopnik and Glymour 2006, for example, allow that Bayes nets are likely only partial models of childhood cognitive development (p. 42). And Glymour 2004 claims that the Bayes nets formalism should not be regarded as the entire story concerning causal reasoning in science (p. 784). However, see Danks 2014 for an extended argument that a core of the Bayes nets formalism is a significantly more general model of cognition than many have supposed.

  10. Fodor has better-known criticisms of artificial neural networks based on their purported incapacity to account for productivity and systematicity (Fodor and Pylyshyn 1988). But he also faults them for their incapacity to accommodate the holistic character of non-deductive inference (Fodor 2000), which is our focus here.

  11. A variable X is conditionally independent of another variable Y given a third variable Z just in case: P(X | Y, Z) = P(X | Z). Perhaps more intuitively, conditional independence might also be defined in terms of events that variables denote. On one such definition, an event A is conditionally independent of an event B given an event C just in case: P(A∩B | C) = P(A | C)P(B | C), where P(C) > 0.
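
For concreteness, the following sketch (with invented numbers) builds a joint distribution in which X and Y are each generated from Z alone, so X is conditionally independent of Y given Z by construction, and checks the event-based definition numerically:

```python
# X and Y each depend only on Z, so X is independent of Y given Z by design.
from itertools import product

p_z = {0: 0.6, 1: 0.4}
p_x = {0: 0.2, 1: 0.9}  # P(X=1 | Z=z)
p_y = {0: 0.7, 1: 0.3}  # P(Y=1 | Z=z)

def joint(x, y, z):
    px = p_x[z] if x else 1 - p_x[z]
    py = p_y[z] if y else 1 - p_y[z]
    return p_z[z] * px * py

def prob(event):
    """Total probability of all (x, y, z) assignments satisfying `event`."""
    return sum(joint(*a) for a in product((0, 1), repeat=3) if event(*a))

pc = prob(lambda x, y, z: z == 1)
lhs = prob(lambda x, y, z: x and y and z == 1) / pc        # P(A∩B | C)
rhs = (prob(lambda x, y, z: x and z == 1) / pc) * \
      (prob(lambda x, y, z: y and z == 1) / pc)            # P(A|C) P(B|C)
assert abs(lhs - rhs) < 1e-12                              # both 0.27
```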

  12. A brief explanation of the steps for constructing and identifying a clique or “fully connected” set of variables will provide a clearer characterization of computational feasibility on a Bayes net. The first step is to “moralize” a Bayesian network by removing the directionality of its arrows and inserting an (undirected) edge between all of its spouses. (In the language of familial relations used to characterize relationships among nodes on a Bayes net, a “parent” node sits at the tail of a directed edge and its “child” at the head, so that arrows point from parents to children. “Spouses” are parents of at least one common child.) Next, make the graph “chordal” by adding an (undirected) edge between non-adjacent nodes in every cycle of length greater than three. Finally, identify the sets of nodes in which a single (undirected) edge connects every pair of nodes—i.e., sets in which any two variables are directly linked to each other by a single (undirected) edge. Such “fully connected” sets are referred to as “cliques” (cf. Koller and Friedman 2009, p. 35; Bertolero and Griffiths 2014, p. 188).
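
The pipeline just described can be run end to end with networkx (assumed available); the four-node network below is hypothetical, with A and B as spouses sharing the child C:

```python
import networkx as nx

# A and B are spouses (both parents of C), so moralization links them.
dag = nx.DiGraph([("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")])

moral = nx.moral_graph(dag)                       # drop directions, marry spouses
chordal, _ = nx.complete_to_chordal_graph(moral)  # add chords to long cycles
cliques = list(nx.chordal_graph_cliques(chordal))

print(sorted(map(sorted, cliques)))               # [['A', 'B', 'C'], ['B', 'C', 'D']]
print("largest clique:", max(map(len, cliques)))  # 3; see note 13 for why this matters
```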

  13. Why does largest clique size rather than network size determine the computational complexity of probabilistic inference on a Bayes net? Essentially this is a theorem whose full appreciation requires understanding its proof (cf. Koller and Friedman 2009, chapters 9 and 10). At an intuitive level, “variable elimination” algorithms efficiently and exactly compute the probabilities of all variables on a Bayes net by avoiding the repeated computations that would otherwise be exponentially costly in the size of the net. However, even variable elimination algorithms cannot “go around” the subgraph that constitutes a Bayes net’s largest clique, in the sense that calculations that are exponentially costly in the number of the largest clique’s constituent nodes are necessary intermediary steps.
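
A minimal variable-elimination sketch (binary variables and invented conditional probability tables over a chain A → B → C) makes the intuition tangible: each elimination step costs on the order of two raised to the size of the factor’s scope, so the largest scope encountered, rather than the number of nodes in the net, sets the price of exact inference.

```python
from itertools import product

def eliminate(factors, var):
    """Multiply all factors mentioning `var`, then sum `var` out.
    Cost is O(2**len(scope)): exponential in the factor's scope, not the net."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    scope = sorted(set().union(*(f[0] for f in touching)) - {var})
    table = {}
    for assign in product((0, 1), repeat=len(scope)):
        env = dict(zip(scope, assign))
        total = 0.0
        for v in (0, 1):
            env[var] = v
            term = 1.0
            for sc, tab in touching:
                term *= tab[tuple(env[s] for s in sc)]
            total += term
        table[assign] = total
    return rest + [(tuple(scope), table)]

# Chain A -> B -> C with made-up conditional probability tables.
factors = [
    (("A",), {(0,): 0.3, (1,): 0.7}),
    (("A", "B"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.6}),
    (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}),
]
for v in ("A", "B"):                   # eliminate upstream variables in order
    factors = eliminate(factors, v)

(scope, table), = factors
print(scope, table)                    # ('C',) {(0,): 0.708, (1,): 0.292}
```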

  14. There is a wrinkle here, however, given that one might charitably interpret at least one sense of “centrality” within the framework of Bayes nets as “member of the net’s largest clique”. This interpretation raises the worry that conservative revision processes might be computationally infeasible because identifying a Bayes net’s largest clique is NP-hard. However, such an interpretation of “centrality” would not in fact undermine this article’s central critique, which targets Fodor and several massive modularity theorists for failing to establish that bodies of information relevant to inferential processes are densely rather than sparsely structured. This critique holds in the present context because a prominent method for bypassing the problem of identifying a Bayes net’s largest clique, in the sense of nevertheless identifying the net’s overall community structure, is highly accurate and computationally feasible precisely when a network is sparse rather than dense (Girvan and Newman 2002). Thanks to an anonymous reviewer for raising this objection.
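
A sketch of the Girvan and Newman method via networkx (assumed available) shows why sparsity helps: in the hypothetical network below, a single bridge edge carries all between-community traffic, so removing edges by betweenness immediately exposes the two communities without any clique-finding.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two dense triangles joined by one sparse bridge (hypothetical network).
g = nx.Graph([(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)])

first_split = next(girvan_newman(g))     # removes highest-betweenness edge(s)
print([sorted(c) for c in first_split])  # [[1, 2, 3], [4, 5, 6]]
```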

  15. Cf. http://conceptnet.io.

  16. It should be noted, however, that learning a Bayes net’s structure from data (construed as variables bearing probabilistic dependencies), without any prior structural knowledge, is in general NP-hard. Thus, this article’s critique of Fodorian skepticism does not directly respond to a narrower frame problem that might be formulated in terms of specific (albeit empiricist) kinds of learning processes, as opposed to non-deductive inference generally. However, even restricting our scope to learning a Bayes net’s structure, Fodor’s frame problem arguably does not apply. This is because some of the most prominent heuristics for learning Bayes nets, including score-based structure learning algorithms, provably converge on accurate structures as the amount of data grows (Danks 2014, p. 55). Thus, as with probabilistic inference on extant Bayes nets, information size is not a central indicator of computational infeasibility for learning Bayes nets. In this case, more information leads to closer approximations of optimal performance rather than to computational infeasibility. Thanks to an anonymous reviewer for raising this objection.
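
To indicate what “score-based” learning involves, here is a toy comparison (data and candidate structures invented) in which two graphs over binary variables A and B are ranked by BIC, a penalized likelihood of the kind such algorithms maximize:

```python
import math
from collections import Counter

# Toy data over (A, B): B tends to match A, so the edge A -> B should win.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40
n = len(data)

def bic_empty():
    """BIC of the empty graph: A and B modeled as independent."""
    ll = sum(c * math.log(c / n)
             for i in (0, 1)
             for c in Counter(row[i] for row in data).values())
    return ll - (2 / 2) * math.log(n)          # 2 free parameters

def bic_edge():
    """BIC of the graph A -> B: factorization P(A) * P(B | A)."""
    a_counts = Counter(row[0] for row in data)
    ll = sum(c * math.log(c / n) for c in a_counts.values())
    ll += sum(c * math.log(c / a_counts[a])
              for (a, b), c in Counter(data).items())
    return ll - (3 / 2) * math.log(n)          # 3 free parameters

print("empty graph:", round(bic_empty(), 1))   # lower score
print("A -> B     :", round(bic_edge(), 1))    # higher score: edge preferred
```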

References

  • Barrett, H. C., & Kurzban, R. (2006). Modularity in cognition: Framing the debate. Psychological Review, 113, 628–647.

  • Bertolero, M. & Griffiths, T. (2014). “Is holism a problem for inductive inference? A computational analysis.” in Proceedings of the 36th Annual Conference of the Cognitive Science Society (pp. 188–193).

  • Carruthers, P. (2003). On Fodor’s problem. Mind & Language, 18, 502–523.

  • Carruthers, P. (2006). The architecture of the mind: Massive modularity and the flexibility of thought. Oxford: Oxford University Press.

  • Clark, A. (2002). Local associations and global reason: Fodor’s frame problem and second-order search. Cognitive Science Quarterly, 2(2), 115–140.

  • Colombo, M. (2013). Moving forward (and beyond) the modularity debate: A network perspective. Philosophy of Science, 80(3), 356–377.

  • Cosmides, L., & Tooby, J. (1994). Origins of domain-specificity: The evolution of functional organization. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain-specificity in cognition and culture. New York: Cambridge University Press.

  • Danks, D. (2014). Unifying the mind: Cognitive representations as graphical models. Cambridge: MIT Press.

  • Fodor, J. (1983). The modularity of mind. Cambridge: MIT Press.

  • Fodor, J. (1985). Précis of The Modularity of Mind (with peer commentaries and author’s response). Behavioral and Brain Sciences, 8, 1–42.

  • Fodor, J. (1987). Modules, frames fridgeons, sleeping dogs and the music of the spheres. In Z. Pylyshyn (Ed.), The Robot’s dilemma: The frame problem in artificial intelligence. Norwood: Ablex.

  • Fodor, J. (2000). The mind doesn’t work that way: The scope and limits of computational psychology. Cambridge: MIT Press.

  • Fodor, J. (2008). LOT 2: The language of thought revisited. New York: Oxford University Press.

  • Fodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3–71.

  • Fuller, T., & Samuels, R. (2014). Scientific inference and ordinary cognition: Fodor on holism and cognitive architecture. Mind and Language, 29(2), 201–237.

  • Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.

  • Glymour, C. (1980). Theory and evidence. Princeton: Princeton University Press.

  • Glymour, C. (1985). In Précis of The Modularity of Mind (with peer commentaries and author’s response). Behavioral and Brain Sciences, 8, 1–42.

  • Glymour, C. (2000). Bayes nets as psychological models. In F. C. Keil & R. A. Wilson (Eds.), Explanation and cognition (pp. 169–198). Cambridge: MIT Press.

  • Glymour, C. (2002). The mind’s arrows: Bayes nets and graphical causal models in psychology. Cambridge: MIT Press.

  • Glymour, C. (2004). The automation of discovery. Daedalus, Winter 2004, 69–77.

  • Glymour, C. (2010). What is right with ‘Bayes Net Methods’ and what is wrong with ‘Hunting Causes and Using Them’? British Journal for the Philosophy of Science, 61, 161–211.

  • Gopnik, A., & Glymour, C. (2006). A brand new ball game: Bayes net and neural net learning mechanisms in children. In Processes of change in brain and cognitive development: Attention and performance XXI (pp. 349–372).

  • Gopnik, A., Glymour, C., Sobel, D., Schulz, L., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111(1), 3–32.

  • Gopnik, A., & Schulz, L. (Eds.). (2007). Causal learning: Philosophy, psychology and computation. New York: Oxford University Press.

  • Gopnik, A., Sobel, D., Schulz, L., & Glymour, C. (2001). Causal learning mechanisms in very young children: Two-, three-, and four-year-olds infer causal relations from patterns of variation and covariation. Developmental Psychology, 37(5), 620–629.

  • Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.

  • Harman, G. (1986). Change in view: Principles of reasoning. Cambridge: MIT Press.

  • Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Cambridge: MIT Press.

  • Murphy, D. (2006). On Fodor’s analogy: Why psychology is like philosophy of science after all. Mind & Language, 21, 553–564.

  • Murzi, J., & Steinberger, F. (2017). “Inferentialism”, Blackwell Companion to Philosophy of Language (pp. 197–224). Hoboken: Wiley Blackwell.

  • Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Francisco: Morgan Kaufmann.

  • Pearl, J. (2000). Causality: models, reasoning, and inference. New York: Cambridge University Press.

  • Pinker, S. (2005). So how does the mind work? Mind & Language, 20, 1–24.

  • Prinz, J. (2006). Is the mind really modular? In R. Stainton (Ed.), Contemporary debates in cognitive science (pp. 22–36). Oxford: Blackwell.

  • Pylyshyn, Z. (Ed.). (1987). The Robot’s Dilemma: The Frame Problem in Artificial Intelligence. Norwood: Ablex.

  • Quine, W. (1953). From a logical point of view. Cambridge, Mass.: Harvard University Press.

  • Quine, W., & Ullian, J. (1970). The web of belief. New York: Random House.

  • Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118–1123.

  • Samuels, R. (1998). Evolutionary psychology and the massive modularity hypothesis. British Journal for the Philosophy of Science, 49, 575–602.

  • Samuels, R. (2005). The complexity of cognition: tractability arguments for massive modularity. In P. Carruthers, S. Laurence, & S. Stich (Eds.), The Innate Mind: Structure and Contents (pp. 107–121). Oxford: Oxford University Press.

  • Schneider, S. (2007). Yes, it does: A diatribe on Jerry Fodor’s The Mind Doesn’t Work that Way. Psyche, 13(1), 1–15.

  • Sloman, S. (2005). Causal models: How we think about the world and its alternatives. New York: Oxford University Press.

  • Sober, E. (1999). Testability. Proceedings and Addresses of the American Philosophical Association, 73, 47–76.

  • Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld & S. A. Gelman (Eds.), Mapping the Mind (pp. 39–67). Cambridge: Cambridge University Press.

  • Spiegelhalter, D. J., Franklin, R., & Bull, K. (1989). “Assessment, Criticism, and Improvement of Imprecise Probabilities for a Medical Expert System.” In Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence. Mountain View, CA (pp. 285–294).

  • Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. Berlin: Springer-Verlag.

  • Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press.

Acknowledgements

The author would like to thank Derek Baker, Maxwell Bertolero, Bennett Holman, Benjamin Jantzen, Edouard Machery, Richard Samuels, Kelly Trogdon, Carl Voss and Jiji Zhang for helpful feedback on drafts of this article. Additional thanks are extended to the philosophy departments at Virginia Tech University, Lingnan University and Underwood International College for valuable discussion on this article’s central ideas.

Corresponding author

Correspondence to Timothy J. Fuller.

Cite this article

Fuller, T.J. Cognitive Architecture, Holistic Inference and Bayesian Networks. Minds & Machines 29, 373–395 (2019). https://doi.org/10.1007/s11023-019-09505-7
