Abstract
In this paper, I propose and defend a pair of necessary conditions on evidence-based knowledge which resemble the troubled sensitivity principles defended in the philosophical literature (notably by Fred Dretske). We can think of the traditional principles as simple but inaccurate approximations of the new proposals. Insofar as the old principles are intuitive and used in scientific and philosophical contexts, yet are plausibly false, there is a real need to develop precise and correct formulations. The new renditions turn out to be more cautious, so they cannot do everything the old principles promised they could. For example, they respect closure for knowledge. But these sober formulations, or something like them, might be the best that we can do with respect to sensitivity. And there is value in understanding the limits of these types of principles.
Notes
Dretske (1971, page 1) argues that ‘S knows P (based on reason R)’ entails ‘R would not be the case unless P were the case’, which I interpret to entail that ‘if P were false, R would be false’.
Dretske’s motivating example also concerns a medical case with a faulty instrument (1971, page 2). He considers an agent who forms a belief that a child has a normal temperature (98.6) based on a thermometer which gives the correct readings only for normal temperatures but is “stuck” at 98.6 for higher temperatures. In this case, the following subjunctive is true: if the child were to have a fever, the thermometer would still say the temperature is 98.6. And as a consequence, according to Dretske, the agent does not know that the child has a normal temperature.
Roush (2005, page 28).
The conditional in box A is in a subjunctive mood whereas conditional probabilities are thought to correspond to conditionals in the indicative mood. There are important differences between these as we discuss below.
In other words, the Bayes factor is k1/k2 < 1 signifying that learning the result (the instrument predicts the patient has the disease) adds some support to the hypothesis that the patient has the disease (assuming your “prior” for the disease was < 1).
In a sense, this gives the new principles an internalist flavor. I discuss this in Sect. 6.
We have left it unclear when ~H > E is supposed to be true. Is it at t1 or t2? When we derive a more precise version of D’s advice we will be concerned with knowledge that ~H > E at t1.
Although we are able to preserve single premise closure, we don’t accept multi-premise closure. In particular, an agent can know p and can know q where both sentences are under discussion in a learning scenario with ignorance zone k (so credence in p and in q will both be above or equal to k). The conjunction p and q will also be under discussion in the same learning scenario (by definition). And the agent’s confidence in the conjunction may very well fall below k (and inside the ignorance zone) since the probability of a conjunction can (and often does) fall below the probability of either conjunct.
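The failure of multi-premise closure described here can be sketched numerically; the threshold and the joint distribution below are invented purely for illustration:

```python
# Hypothetical ignorance zone [0, 0.8): each conjunct clears the
# threshold, but the conjunction falls inside the zone.
k = 0.8
# An illustrative joint distribution over the four truth-value cells.
joint = {("p", "q"): 0.70, ("p", "~q"): 0.15,
         ("~p", "q"): 0.15, ("~p", "~q"): 0.00}
assert abs(sum(joint.values()) - 1.0) < 1e-12

c_p = joint[("p", "q")] + joint[("p", "~q")]   # credence in p = 0.85
c_q = joint[("p", "q")] + joint[("~p", "q")]   # credence in q = 0.85
c_pq = joint[("p", "q")]                       # credence in p & q = 0.70

assert c_p >= k and c_q >= k   # both conjuncts are above the threshold
assert c_pq < k                # the conjunction drops into [0, k)
```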
We are not interested in determining what one’s confidence in H should be when one learns the conditional (if ~H then E). Plausibly, one’s confidence in H could be affected by learning the conditional. I assume that this updating has been settled by t1. I assume that at t1 one has some stable antecedent rational credence in H as well as in (If ~H then E).
David Lewis (1976).
Lewis argued that C(Y > X) = C(X//Y), for positive C(Y). Notice how C(X//Y) can be radically different from C(X|Y) as our example reveals.
A non-trivial assumption in the derivation of DAS is that learning E does not destroy knowledge of ~H > E (i.e. does not lead to a violation of the constraints). In the medical case we started with, this is a reasonable assumption. If I know that my instrument would say that I am sick if I weren’t, and I further learn that the instrument says I am sick, it need not affect my confidence in the subjunctive.
See Huemer (2016) for an interpretation of the priors as something closer to ‘a priori’. What is the a priori probability of not being a BIV? I find myself unable to get a handle on this question.
As an anonymous referee correctly points out, even if our confidence that the evidence is the same conditional on being a BIV is 1, it doesn’t follow that we should accept the skeptical conclusion. Suppose, for example, that the knowledge threshold is .8 and that the prior that we are not BIVs is .9. If we also assume that the confidence of our evidence being the same conditional on not being a BIV is 1, then the posterior (the probability we are not BIVs conditional on the evidence) will also be .9 (applying Bayes’ theorem) which is above the knowledge threshold. The same result holds if we assume instead that the confidence in the corresponding subjunctive (if we were a BIV, then our evidence would be the same) is 1. This is because C(X//Y) = 1 entails C(X|Y) = 1 under reasonable assumptions.
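The referee’s point can be verified with a short Bayesian calculation; the threshold .8 and prior .9 are the hypothetical values from this note:

```python
# Hypothetical values from the note: knowledge threshold .8, prior that
# we are not BIVs .9, and evidence equally likely either way.
threshold = 0.8
p_not_biv = 0.9            # prior C(~BIV)
p_e_given_not_biv = 1.0    # C(E | ~BIV)
p_e_given_biv = 1.0        # C(E | BIV): the evidence "would be the same"

# Bayes' theorem: C(~BIV | E)
p_e = p_e_given_not_biv * p_not_biv + p_e_given_biv * (1 - p_not_biv)
posterior = p_e_given_not_biv * p_not_biv / p_e

assert abs(posterior - 0.9) < 1e-12   # posterior equals the prior
assert posterior > threshold          # and stays above the threshold
```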
Where ‘>’ denotes the subjunctive conditional, this principle is (strictly speaking) entailed by Dretske’s account (as discussed in Sect. 1). They are equivalent under reasonable assumptions. In counter-factual logics these assumptions are ‘anti-symmetry’ and ‘limit’, Sider (2010), which are taken on by Stalnaker (1968).
Another way to explain why our agent does not know H (it is 3:31 pm) is to argue that the proposition ~H > E is itself a propositional defeater for knowing H (where E is the watch reading). A propositional defeater D for H is a proposition that would lower the justification for H if it were accepted (Bergmann, 2006). On this strategy we would need to show how accepting ~H > E can lead to a reduction in the credence for H, under certain further assumptions.
References
Adams, E. (1975). The logic of conditionals. D. Reidel, Synthese Library.
Anderson, B., Williams, S., & Schulkin, J. (2013). Statistical literacy of obstetrics-gynecology residents. Journal of Graduate Medical Education, 5(2), 272–275.
Bennett, J. (2003). A philosophical guide to conditionals. Oxford University Press.
Bergmann, M. (2006). Justification without awareness. Oxford University Press.
Briggs, R. (2017). Two interpretations of the Ramsey test. In H. Beebee, C. Hitchcock, & H. Price (Eds.), Making a difference: Essays in honour of Peter Menzies. Oxford: Oxford University Press.
DeRose, K. (1995). Solving the skeptical problem. Philosophical Review, 104(1), 1–52.
Dhaliwal, G. (2011). Going with your gut. Journal of General Internal Medicine, 26, 107.
Douven, I., & Verbrugge, S. (2010). The Adams family. Cognition, 117(3), 302–318.
Dretske, F. (1970). Epistemic operators. The Journal of Philosophy, 67(24), 1007–1023.
Dretske, F. (1971). Conclusive reasons. Australasian Journal of Philosophy, 49, 1–22.
Edgington, D. (1995). On conditionals. Mind, 104, 235–329.
Gettier, E. L. (1963). Is justified true belief knowledge? Analysis, 23(6), 121–123.
Gibbard, A. (1981). Two recent theories of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Ifs (pp. 211–247). Reidel.
Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest, 8(2), 53–96.
Hawthorne, J. (2003). Knowledge and lotteries. Oxford University Press.
Huemer, M. (2016). Serious theories and skeptical theories: Why you are probably not a brain in a vat. Philosophical Studies, 173(4), 1031–1052.
Ichikawa, J. J. (2017). Contextualizing knowledge. Oxford University Press.
Kratzer, A. (1986). Conditionals. Chicago Linguistics Society, 22(2), 1–15.
Leitgeb, H. (2012). A probabilistic semantics for counterfactuals. Part A. The Review of Symbolic Logic, 5(1), 26–84.
Lewis, D. (1975). Adverbs of quantification. In E. L. Keenan (Ed.), Formal semantics of natural language (pp. 3–15). Cambridge University Press.
Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Review, 85(3), 297–315.
Lewis, D. (1986). Probabilities of conditionals and conditional probabilities II. The Philosophical Review, 95, 581–589.
Manrai, A. K., Bhatia, G., Strymish, J., & Kohane, I. S. (2014). Medicine’s uncomfortable relationship with math: Calculating positive predictive value. JAMA Internal Medicine, 174(6), 991–993.
Mayo-Wilson, C. (2018). Epistemic closure in science. The Philosophical Review, 127(1), 73–114.
Nozick, R. (1981). Philosophical explanations. Harvard University Press.
Over, D., Hadjichristidis, C., Evans, J. S. B. T., Handley, S., & Sloman, S. (2007). The probability of causal conditionals. Cognitive Psychology, 54(1), 62–97.
Pfeifer, N., & Kleiter, G. D. (2010). The conditional in mental probability logic. In M. Oaksford & N. Chater (Eds.), Cognition and conditionals (pp. 153–173). Oxford University Press.
Putnam, H. (1981). Reason, truth and history. Cambridge University Press.
Roush, S. (2005). Tracking truth. Oxford University Press.
Russell, J. S., & Hawthorne, J. (2016). General dynamic triviality theorems. The Philosophical Review, 125(3), 307–339.
Sider, T. (2010). Logic for philosophy. Oxford University Press.
Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37(1), 64–80.
Stalnaker, R. C. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in logical theory (American philosophical quarterly monographs 2) (pp. 98–112). Oxford: Blackwell.
Stolper, E., Wiel, M., Royen, P., Bokhoven, M., Weijden, T., & Dinant, G. J. (2011). Gut feelings as a third track in general practitioners’ diagnostic reasoning. Journal of General Internal Medicine, 26, 197–203.
Van den Bruel, A., Thompson, M., Buntinx, F., & Mant, D. (2012). Clinicians’ gut feeling about serious infections in children: Observational study. BMJ, 345, e6144.
Williams, J. R. G. (2012). Counterfactual triviality: A Lewis-impossibility argument for counterfactuals. Philosophy and Phenomenological Research, 85(3), 648–670.
Williamson, T. (2000). Knowledge and its limits. Oxford University Press.
Acknowledgements
I would like to thank Brad Armendt, Shyam Nair, Bryan Lietz, audience members at the Society for Exact Philosophy, and anonymous referees for helpful feedback and suggestions.
Appendices
Appendix 1: Notation convention
We will be making repeated use of Bayes’ theorem. Here it is, applied to a learning scenario L:
(Bayes) Ct1(H|E) = Ct1(E|H) Ct1(H)/Ct1(E).
where by the law of total probability: Ct1(E) = Ct1(E|H)Ct1(H) + Ct1(E|~H)Ct1(~H).
Let’s agree to some friendlier notation which I summarize here for convenience (Table 4).
This allows us to rewrite Bayes as:
(BT) Ct1(H|E) = lp/(lp + x(1 − p)).
Finally, we always assume l, x, p can only take values in the interval [0,1] since they are probabilities.
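As a sanity check, (BT) can be verified against the unabbreviated statement of Bayes’ theorem above; a minimal sketch, where the particular values of l, x, p are arbitrary illustrative choices:

```python
# l = Ct1(E|H), x = Ct1(E|~H), p = Ct1(H)  (the Table 4 abbreviations)
def posterior(l, x, p):
    """(BT): Ct1(H|E) = lp / (lp + x(1 - p))."""
    return l * p / (l * p + x * (1 - p))

# Unabbreviated Bayes, with total probability in the denominator
def posterior_long(e_given_h, e_given_not_h, h):
    e = e_given_h * h + e_given_not_h * (1 - h)
    return e_given_h * h / e

# Illustrative values only; any l, x, p in [0, 1] (with a nonzero
# denominator) agree.
l, x, p = 0.95, 0.2, 0.3
assert abs(posterior(l, x, p) - posterior_long(l, x, p)) < 1e-12
```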
Appendix 2: Deriving f (indicative)
Consider an arbitrary learning situation L with an ignorance zone [0,k) where (a) H is knownt2 (upon learning E) and (b) ~H → E is knownt1. Typically, this situation won’t be possible. But there are cases in which it is possible, revealing the limits of sensitivity. These limits are expressed by function f which we now derive.
From the premise that L has ignorance zone [0,k), (a), plus conditionalization, we get (i) Ct1(H|E) ≥ k. From the premise that L has ignorance zone [0,k), (b) and ICA (see main text), we get (ii) x ≥ k. We will add one more constraint which maximizes the posterior as we discussed in the text. We assume (iii) l = Ct1(E|H) = 1. These assumptions are summarized in Table 5.
These are the constraints where our new sensitivity principle would not work (would not yield a denial of knowledge).
Now from (i), (iii) and BT we derive:
(1) p/(p + x(1 − p)) ≥ k
Algebraic manipulation of (1) and (ii) entails:
(2) (p − kp)/(k(1 − p)) ≥ x ≥ k
Eliminating x:
(3) 0 ≥ k² − k²p + kp − p
What (3) represents are the possible values of k, p for which the constraints (i–iii) are met (Fig. 3). Suppose we pick a pair k1, p1 which does not satisfy (3); it follows that the constraints are violated. That is, either l < 1, the agent fails to know​t2 H, or she fails to know​t1 the indicative ~H → E. Let’s grant that l = 1 (the strongest case for knowing​t2 H) and that she knows​t1 the indicative. We conclude that she fails to know​t2 H. In other words, l = 1, knowledge of the indicative, plus violation of (3) entails failure to know​t2 H (e.g. failure to know the patient has the disease) upon learning E. We have the makings of a necessary condition for knowledge.
Looking at Fig. 3, the possible values of k, p that satisfy the constraints will just be the two-dimensional projection of the surface on the top of the cube. This will just be a region of space bounded by a curve we will call f. f is a function which is given implicitly in (4), derived from (3). Graphing f, the area above it (inclusive) represents the pairs k, p which satisfy our constraints (see Fig. 1 in the main paper).
(4) k² − k²p + kp − p = 0 (function p = f(k), given implicitly)
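Equation (4) can also be solved explicitly for p, which makes it easy to check (3) numerically and to confirm that the region on or above the curve satisfies it; a minimal sketch:

```python
def f(k):
    # Explicit solution of (4) for p: k^2 - k^2*p + k*p - p = 0
    # rearranges to p = k^2 / (k^2 - k + 1); the denominator is
    # positive for every k in [0, 1].
    return k**2 / (k**2 - k + 1)

for i in range(11):
    k = i / 10
    p = f(k)
    # On the curve, (4) holds with equality.
    assert abs(k**2 - k**2 * p + k * p - p) < 1e-12
    # Raising p keeps (3) satisfied, since the right-hand side of (3)
    # only decreases as p grows.
    p_up = min(1.0, p + 0.05)
    assert k**2 - k**2 * p_up + k * p_up - p_up <= 1e-12
```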
Appendix 3: Deriving g (Subjunctive)
The constraints translate to (i)–(iii) as depicted in Table 6. (i) and (iii) are derived in the same way that they were derived in Table 5. (ii) is gotten from SCA (see main text) plus the fact that L has ignorance zone [0,k).
We begin by dividing the totality of possible worlds W into 4 disjoint sets (some of these may be empty): WHE, WH~E, W~HE and W~H~E. WHE is just the set of worlds where both H and E are true, and so on. Propositions are just sets of worlds, so WHE is just the proposition H&E, and so on.
Before we get to the details of the proof, I want to briefly point out the main driving idea. Once we see this, we will be able to get a feel for how the proof is supposed to work. In particular, we will be able to see how g is set independently of any selection function (the function which selects the closest worlds in a counter-factual logic). This may be surprising.
The key idea is to note that with the help of imaging, we can deduce facts about one’s priors and conditional probabilities from facts about our credences regarding subjunctives. Once we deduce these facts, we can just plug them into Bayes’ theorem to get constraints on the posterior, as we did with the indicative case. To see this, suppose that C(E//~H) ≥ k. By the definition of imaging, this means that if you add up the probabilities of all the worlds (assume for simplicity there are just a finite number of possible worlds) such that the nearest ~H world to them is an E world, then this sum will be greater than or equal to k. Consider the set W* of all such worlds. C(W*) ≥ k. W* will be a subset of WHvE (because no element of W* can be a world where both ~H and ~E are true: if ~H is true in a W* world, then the closest ~H world will be itself, and so it must be an E world by definition). Since C(W*) ≥ k and W* is a subset of WHvE, it follows that C(HvE) ≥ k. C(HvE) is itself equivalent to C(H) + C(E|~H)C(~H), if defined. Notice that these are elements of Bayes’ theorem, and so (via algebraic manipulation) we can get constraints on the posterior C(H|E). And note that we made no appeal to any fact about selection functions.
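This imaging computation can be made concrete in a toy finite model; the four worlds, their credences, and the nearest-world assignments below are all invented purely for illustration:

```python
# Four worlds, one per cell: (H, E) truth values, with made-up credences.
worlds = {"HE": (True, True), "H~E": (True, False),
          "~HE": (False, True), "~H~E": (False, False)}
cred = {"HE": 0.5, "H~E": 0.2, "~HE": 0.2, "~H~E": 0.1}

# Hypothetical selection function: the nearest ~H world to each world.
# ~H worlds select themselves; for the H worlds we simply stipulate.
nearest_not_h = {"HE": "~HE", "H~E": "~HE", "~HE": "~HE", "~H~E": "~H~E"}

# Imaging: C(E//~H) = sum of C(w) over worlds w whose nearest ~H world
# is an E world.  That set of worlds is W*.
c_e_image_not_h = sum(cred[w] for w in worlds
                      if worlds[nearest_not_h[w]][1])

# Every member of W* makes H or E true, so C(W*) <= C(HvE).
c_h_or_e = sum(cred[w] for w in worlds if worlds[w][0] or worlds[w][1])
assert c_e_image_not_h <= c_h_or_e
```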
Let’s begin the proof. The product rule in the probability calculus gives us (5):
(5) Ct1(W~HE) = Ct1(~H&E) = Ct1(E|~H) Ct1(~H) = x(1-p)
Applying (5) and (iii) to BT followed by setting the inequality in (i) yields:
(6) [p/(p + Ct1(W~HE))] ≥ k
Algebraic manipulation gets us:
(7) [(p/k) − p] ≥ Ct1(W~HE).
Next, I will prove (8).
(8) Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k
Here’s the proof for (8). From (ii) and IMA (see main text) we get Ct1(E//~H) = ∑w∈W Ct1(w)·1E(w~H) ≥ k. So ∑w∈W Ct1(w)·1E(w~H) = ∑w∈HvE Ct1(w)·1E(w~H) + ∑w∈~H&~E Ct1(w)·1E(w~H) ≥ k. We know that for all w in W~H~E, 1E(w~H) = 0. This is because if w is in W~H~E, then both ~H and ~E are true in w, so the closest ~H world (w itself) is a ~E world, which means that 1E(w~H) = 0. It follows that ∑~H&~E Ct1(w)·1E(w~H) = 0 and hence ∑HvE Ct1(w)·1E(w~H) ≥ k. Now since ∑HvE Ct1(w) ≥ ∑HvE Ct1(w)·1E(w~H), we deduce that ∑HvE Ct1(w) ≥ k. But this is just Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k. QED.
Continuing, manipulation of (8) gets us Ct1(W~HE) ≥ k − [Ct1(WHE) + Ct1(WH~E)]. But since Ct1(WHE) + Ct1(WH~E) = Ct1(H) = p, we deduce Ct1(W~HE) ≥ k − p. This, together with (7) (eliminating Ct1(W~HE)), yields (p/k) − p ≥ k − p, which simplifies to our function g:
(9) p ≥ k²
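As a cross-check on the derivation of (9), a brute-force sketch (the grid resolution is arbitrary) confirms that whenever some value of Ct1(W~HE) jointly satisfies (7) and the rearranged form of (8), the prior p must be at least k²:

```python
# (7):  (p/k) - p >= Ct1(W_~HE)
# (8'): Ct1(W_~HE) >= k - p   (rearranged from (8), using
#       Ct1(W_HE) + Ct1(W_H~E) = p)
# Whenever some q = Ct1(W_~HE) satisfies both, p >= k^2 must follow.
n = 50
for i in range(1, n + 1):        # k in (0, 1]
    k = i / n
    for j in range(n + 1):       # p in [0, 1]
        p = j / n
        for m in range(n + 1):   # candidate q in [0, 1]
            q = m / n
            if (p / k) - p >= q >= k - p:
                assert p >= k**2 - 1e-9   # (9), up to float tolerance
```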
Pinillos, Á. Bayesian sensitivity principles for evidence based knowledge. Philos Stud 179, 495–516 (2022). https://doi.org/10.1007/s11098-021-01668-3