In this note we develop a method for constructing finite, totally-ordered m-zeroids and prove that there exists a categorical equivalence between the category of finite, totally-ordered m-zeroids and the category of pseudo Łukasiewicz-like implicators.
Shepard has argued that a universal law should govern generalization across different domains of perception and cognition, as well as across organisms from different species or even different planets. Starting with some basic assumptions about natural kinds, he derived an exponential decay function as the form of the universal generalization gradient, which accords strikingly well with a wide range of empirical data. However, his original formulation applied only to the ideal case of generalization from a single encountered stimulus to a single novel stimulus, and for stimuli that can be represented as points in a continuous metric psychological space. Here we recast Shepard's theory in a more general Bayesian framework and show how this naturally extends his approach to the more realistic situation of generalizing from multiple consequential stimuli with arbitrary representational structure. Our framework also subsumes a version of Tversky's set-theoretic model of similarity, which is conventionally thought of as the primary alternative to Shepard's continuous metric space model of similarity and generalization. This unification allows us not only to draw deep parallels between the set-theoretic and spatial approaches, but also to significantly advance the explanatory power of set-theoretic models. Key Words: additive clustering; Bayesian inference; categorization; concept learning; contrast model; features; generalization; psychological space; similarity.
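To make the framework's central computation concrete, here is a minimal sketch (our illustration, not the authors' code) of Bayesian generalization with the size principle, assuming one-dimensional interval hypotheses ("consequential regions") and a uniform prior; the function names and grid settings are our own choices:

```python
# Bayesian generalization over 1-D interval hypotheses ("consequential regions").
# Size principle: the likelihood of n examples sampled from a region of size s
# is proportional to 1/s^n, so small consistent regions dominate the posterior.
import numpy as np

def generalization_gradient(examples, candidates, lo=0.0, hi=10.0, step=0.1):
    """P(y lies in the consequential region | examples), marginalized over intervals."""
    edges = np.arange(lo, hi + step, step)
    # Enumerate all intervals [a, b) on the grid as candidate hypotheses.
    hyps = [(a, b) for i, a in enumerate(edges) for b in edges[i + 1:]]
    n = len(examples)
    probs = np.zeros(len(candidates))
    total = 0.0
    for a, b in hyps:
        if all(a <= x < b for x in examples):   # hypothesis consistent with data?
            w = (b - a) ** (-n)                 # size-principle likelihood
            total += w
            probs += w * np.array([a <= y < b for y in candidates])
    return probs / total

ys = np.linspace(0, 10, 101)
gradient = generalization_gradient([5.0], ys)  # decays with distance from 5
```

The marginalized gradient falls off with distance from the examples and tightens as more examples arrive, which is the size-principle behavior that the multiple-stimulus extension turns on.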
In many learning or inference tasks, human behavior approximates that of a Bayesian ideal observer, suggesting that, at some level, cognition can be described as Bayesian inference. However, a number of findings have highlighted an intriguing mismatch between human behavior and standard assumptions about optimality: people often appear to make decisions based on just one or a few samples from the appropriate posterior probability distribution, rather than using the full distribution. Although sampling-based approximations are a common way to implement Bayesian inference, the very limited numbers of samples often used by humans seem insufficient to approximate the required probability distributions very accurately. Here, we consider this discrepancy in the broader framework of statistical decision theory, and ask: if people make decisions based on samples, and samples are costly, how many samples should they use to optimize their total expected or worst-case reward over a large number of decisions? We find that under reasonable assumptions about the time costs of sampling, making many quick but locally suboptimal decisions based on very few samples may be the globally optimal strategy over long periods. These results help to reconcile a large body of work showing sampling-based or probability matching behavior with the hypothesis that human cognition can be understood in Bayesian terms, and they suggest promising future directions for studies of resource-constrained cognition.
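The core trade-off can be sketched numerically. Below is a rough illustration (our construction, not the paper's code), assuming a two-alternative decision made by majority vote over k posterior samples, a fixed time cost per sample, and decision problems whose posterior probability of the better option is uniform on [0.5, 1]:

```python
# How many posterior samples maximize correct decisions per unit time?
from math import comb

def p_correct(p, k):
    """P(majority of k Bernoulli(p) samples favors the truly better option)."""
    total = 0.0
    for i in range(k + 1):
        w = comb(k, i) * p**i * (1 - p) ** (k - i)
        if 2 * i > k:
            total += w        # clear majority for the better option
        elif 2 * i == k:
            total += 0.5 * w  # tie: guess at random
    return total

def reward_rate(k, t_action=1.0, t_sample=0.1, grid=200):
    """Expected accuracy per unit time, with p uniform on [0.5, 1]."""
    ps = [0.5 + 0.5 * (j + 0.5) / grid for j in range(grid)]
    avg_acc = sum(p_correct(p, k) for p in ps) / grid
    return avg_acc / (t_action + k * t_sample)

best_k = max(range(1, 101), key=reward_rate)  # very small k wins here
```

With these (arbitrary) costs the rate peaks at k = 1: extra samples buy slowly diminishing accuracy while their time cost grows linearly, which is the paper's qualitative point.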
We investigated people's ability to infer others’ mental states from their emotional reactions, manipulating whether agents wanted, expected, and caused an outcome. Participants recovered agents’ desires throughout. When the agent observed, but did not cause, the outcome, participants’ ability to recover the agent's beliefs depended on the evidence they received. When the agent caused the event, participants’ judgments also depended on the probability of the action; when actions were improbable given the mental states, people failed to recover the agent's beliefs even when they saw her react to both the anticipated and actual outcomes. A Bayesian model captured human performance throughout, consistent with the proposal that people rationally integrate information about others’ actions and emotional reactions to infer their unobservable mental states.
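As a toy version of the kind of integration described (entirely our construction; the study's model is richer), the sketch below jointly infers a binary desire and a binary belief from an action plus a two-dimensional emotional reaction, where valence tracks the desire-outcome match and surprise tracks the belief-outcome mismatch:

```python
# Joint inference over (desire, belief) from an action and an emotional reaction.
from itertools import product

OUTCOME = True  # the outcome occurred

def p_action(acted, desire):
    """Agents tend to act toward outcomes they want."""
    p = 0.9 if desire else 0.1
    return p if acted else 1 - p

def p_reaction(reaction, desire, belief):
    """Happiness tracks desire-outcome match; surprise tracks belief-outcome mismatch."""
    happy, surprised = reaction
    p = 0.9 if happy == (desire == OUTCOME) else 0.1
    p *= 0.9 if surprised == (belief != OUTCOME) else 0.1
    return p

def infer(acted, reaction):
    post = {}
    for desire, belief in product([True, False], repeat=2):
        post[(desire, belief)] = (0.25 * p_action(acted, desire)
                                  * p_reaction(reaction, desire, belief))
    z = sum(post.values())
    return {k: v / z for k, v in post.items()}

# The agent acted, the outcome occurred, and she looks happy and surprised:
print(infer(acted=True, reaction=(True, True)))  # peak at desire=True, belief=False
```

Because the action likelihood and the reaction likelihood multiply into one posterior, both sources of evidence constrain the same mental-state hypotheses, which is the rational-integration idea.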
In order to receive controlled pain medications for chronic non-oncologic pain, patients often must sign a “narcotic contract” or “opioid treatment agreement” (OTA) in which they promise not to give pills to others, use illegal drugs, or seek controlled medications from other health care providers. In addition, they must agree to use the medication as prescribed and to come to the clinic for drug testing and pill counts. Patients acknowledge that if they violate the agreement, they may no longer receive controlled medications. OTAs have been widely implemented since being recommended by multiple national bodies as a way to decrease misuse and diversion of narcotic medications. But critics argue that OTAs are ethically suspect, if not unethical, and should be used with extreme care if at all. We agree that OTAs pose real dangers and must be implemented carefully. But we also believe that the most serious criticisms stem from a mistaken understanding of OTAs’ purpose and ethical basis.
In order to account for how children can generalize words beyond a very limited set of labeled examples, Bloom's account of word learning requires two extensions: a better understanding of the “general learning and memory abilities” involved, and a principled framework for integrating multiple conflicting constraints on word meaning. We propose a framework based on Bayesian statistical inference that meets both of these needs.
Hierarchical Bayesian models (HBMs) provide an account of Bayesian inference in a hierarchically structured hypothesis space. Scientific theories are plausibly regarded as organized into hierarchies in many cases, with higher levels sometimes called ‘paradigms’ and lower levels encoding more specific or concrete hypotheses. Therefore, HBMs provide a useful model for scientific theory change, showing how higher‐level theory change may be driven by the impact of evidence on lower levels. HBMs capture features described in the Kuhnian tradition, particularly the idea that higher‐level theories guide learning at lower levels. In addition, they help resolve certain issues for Bayesians, such as scientific preference for simplicity and the problem of new theories.
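A minimal two-level example (ours, not the authors') may make the mechanism vivid: a top-level parameter plays the role of a 'paradigm', fixing how coin biases produced by a factory are distributed, while data about individual coins propagate upward to revise it. The hypothesis grid and Beta-Binomial setup are illustrative assumptions:

```python
# Two-level hierarchical Bayes: evidence about coins (lower level) updates
# the posterior over the factory parameter alpha (higher level).
import numpy as np
from scipy.stats import beta, binom

alphas = np.array([0.5, 2.0, 10.0])  # candidate 'paradigms': coin biases ~ Beta(a, a)
prior_alpha = np.ones(len(alphas)) / len(alphas)
thetas = np.linspace(0.01, 0.99, 99)  # grid over each coin's bias

def posterior_over_alpha(counts):
    """Posterior over alpha given (heads, flips) counts for several coins."""
    log_post = np.log(prior_alpha)
    for j, a in enumerate(alphas):
        p_theta = beta.pdf(thetas, a, a)
        p_theta /= p_theta.sum()
        for heads, flips in counts:
            # P(coin's data | alpha): average the likelihood over theta.
            lik = (binom.pmf(heads, flips, thetas) * p_theta).sum()
            log_post[j] += np.log(lik)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Coins landing mostly heads or mostly tails favor alpha = 0.5 (extreme biases):
print(posterior_over_alpha([(10, 10), (0, 10), (9, 10)]))
```

Once the higher level favors "coins are extreme," a single flip of a new coin licenses a strong prediction about it, which is the guidance-of-lower-level-learning feature the abstract attributes to paradigms.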
Learning to understand a single causal system can be an achievement, but humans must learn about multiple causal systems over the course of a lifetime. We present a hierarchical Bayesian framework that helps to explain how learning about several causal systems can accelerate learning about systems that are subsequently encountered. Given experience with a set of objects, our framework learns a causal model for each object and a causal schema that captures commonalities among these causal models. The schema organizes the objects into categories and specifies the causal powers and characteristic features of these categories and the characteristic causal interactions between them. A schema of this kind allows causal models for subsequent objects to be rapidly learned, and we explore this accelerated learning in four experiments. Our results confirm that humans learn rapidly about the causal powers of novel objects, and we show that our framework accounts for our data better than alternative models of causal learning.
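The predicted acceleration can be caricatured as follows (our sketch; the schema here is assumed rather than learned, unlike in the paper's framework): once causal strengths are known to cluster into "strong" and "weak" categories, a single trial with a novel object already yields a confident estimate of its causal power.

```python
# Schema-based transfer: categories with shared causal strengths make
# one-shot causal learning about a novel object possible.
import numpy as np
from scipy.special import betaln, comb

# An already-learned schema (assumed for illustration): two categories of
# objects with Beta-distributed causal strengths and equal base rates.
categories = [
    {"weight": 0.5, "a": 9.0, "b": 1.0},  # "strong causes" (strength near 0.9)
    {"weight": 0.5, "a": 1.0, "b": 9.0},  # "weak causes"  (strength near 0.1)
]

def predict_next(successes, trials):
    """P(the novel object's next trial succeeds), marginalizing over categories."""
    post, pred = [], []
    for c in categories:
        # Beta-Binomial marginal likelihood of the data under this category.
        ml = comb(trials, successes) * np.exp(
            betaln(c["a"] + successes, c["b"] + trials - successes)
            - betaln(c["a"], c["b"]))
        post.append(c["weight"] * ml)
        pred.append((c["a"] + successes) / (c["a"] + c["b"] + trials))
    post = np.array(post) / sum(post)
    return float(post @ pred)

print(predict_next(1, 1))  # one success already pushes the prediction near 0.84
```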
People are adept at inferring novel causal relations, even from only a few observations. Prior knowledge about the probability of encountering causal relations of various types and the nature of the mechanisms relating causes and effects plays a crucial role in these inferences. We test a formal account of how this knowledge can be used and acquired, based on analyzing causal induction as Bayesian inference. Five studies explored the predictions of this account with adults and 4-year-olds, using tasks in which participants learned about the causal properties of a set of objects. The studies varied the two factors that our Bayesian approach predicted should be relevant to causal induction: the prior probability with which causal relations exist, and the assumption of a deterministic or a probabilistic relation between cause and effect. Adults’ judgments (Experiments 1, 2, and 4) were in close correspondence with the quantitative predictions of the model, and children’s judgments (Experiments 3 and 5) agreed qualitatively with this account.
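The two manipulated factors map directly onto parameters of a standard Bayesian causal-induction model. The sketch below (our own construction, loosely following the noisy-OR parameterization common in this literature) scores hypotheses about which objects are causes: `prior` plays the role of the prior probability that a causal relation exists, and `power` at 1 versus well below 1 expresses the deterministic versus probabilistic assumption.

```python
# Bayesian causal induction over a 'detector': which objects are causes?
from itertools import product

def posterior_causes(events, n_objects=2, prior=0.3, power=1.0, background=0.0):
    """P(each object is a cause | events). Each event: (objects_present, activated).
    power=1, background=0 is a deterministic detector; power<1 a probabilistic one."""
    post = [0.0] * n_objects
    norm = 0.0
    for h in product([0, 1], repeat=n_objects):  # hypothesis: which objects are causes
        p = 1.0
        for i in range(n_objects):
            p *= prior if h[i] else (1 - prior)  # prior over hypotheses
        for objs, activated in events:
            k = sum(h[i] for i in objs)          # causes present in this event
            p_act = 1 - (1 - background) * (1 - power) ** k  # noisy-OR
            p *= p_act if activated else (1 - p_act)
        norm += p
        for i in range(n_objects):
            if h[i]:
                post[i] += p
    return [x / norm for x in post]

# Backward blocking: A and B together activate the detector, then A alone does.
events = [((0, 1), True), ((0,), True)]
print(posterior_causes(events))  # belief in B falls back to the prior (0.3)
```

Under the deterministic setting, B's posterior returns exactly to the prior once A alone explains the first event, illustrating how the prior over causal relations controls the strength of such "explaining away."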
Both scientists and children make important structural discoveries, yet their computational underpinnings are not well understood. Structure discovery has previously been formalized as probabilistic inference about the right structural form, where the form could be a tree, ring, chain, or grid. Although this approach can learn intuitive organizations, including a tree for animals and a ring for the color circle, it assumes a strong inductive bias that considers only these particular forms, and each form is explicitly provided as initial knowledge. Here we introduce a new computational model of how organizing structure can be discovered, utilizing a broad hypothesis space with a preference for sparse connectivity. Because the inductive bias is more general, the model's initial knowledge shows little qualitative resemblance to some of the discoveries it supports. As a consequence, the model can also learn complex structures for domains that lack an intuitive description, as well as predict human property-induction judgments without explicit structural forms. By allowing form to emerge from sparsity, our approach clarifies how both the richness and flexibility of human conceptual organization can coexist.
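To isolate just the sparsity idea (a deliberately crude caricature, not the article's model, which scores structures by how well they explain data probabilistically), one can search a broad space, here all undirected graphs over a few objects, with a per-edge penalty, and watch a form such as a chain emerge without ever being named:

```python
# Structure discovery as penalized search over ALL graphs (no fixed form library).
import itertools
import numpy as np

def score(adj, sim, edge_penalty):
    """Reward edges between similar objects; penalize every edge (sparsity)."""
    n = len(sim)
    fit = sum(sim[i][j] * adj[i][j] for i in range(n) for j in range(i))
    n_edges = sum(adj[i][j] for i in range(n) for j in range(i))
    return fit - edge_penalty * n_edges

def best_graph(sim, edge_penalty=2.0):
    n = len(sim)
    pairs = list(itertools.combinations(range(n), 2))
    best, best_score = None, -np.inf
    for bits in itertools.product([0, 1], repeat=len(pairs)):  # every graph
        adj = np.zeros((n, n))
        for (i, j), b in zip(pairs, bits):
            adj[i][j] = adj[j][i] = b
        s = score(adj, sim, edge_penalty)
        if s > best_score:
            best, best_score = adj, s
    return best

# Similarities generated by an (unstated) chain 0-1-2-3:
sim = np.array([[0, 3, 1, 0],
                [3, 0, 3, 1],
                [1, 3, 0, 3],
                [0, 1, 3, 0]], dtype=float)
print(best_graph(sim))  # the chain's three edges survive the penalty
```

Nothing in this hypothesis space mentions chains; the chain is simply the sparsest graph that best accounts for the similarities, which is the "form emerges from sparsity" point in miniature.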
Current psychological theories of human causal learning and judgment focus primarily on long-run predictions: two theories do so by estimating the parameters of a causal Bayes net, and a third through structure learning. This paper focuses on people's short-run behavior by examining dynamical versions of these three theories and comparing their predictions to a real-world dataset.