
Bayesian sensitivity principles for evidence based knowledge

Philosophical Studies

Abstract

In this paper, I propose and defend a pair of necessary conditions on evidence-based knowledge which bear resemblance to the troubled sensitivity principles defended in the philosophical literature (notably by Fred Dretske). We can think of the traditional principles as simple but inaccurate approximations of the new proposals. Insofar as the old principles are intuitive and used in scientific and philosophical contexts, but are plausibly false, there is a real need to develop precise and correct formulations. These new renditions turn out to be more cautious, so they won’t be able to do everything the old principles promised they could. For example, they respect closure for knowledge. But these sober formulations, or something like them, might be the best that we can do with respect to sensitivity. And there is value in understanding the limits of these types of principles.


Notes

  1. Dhaliwal (2011), Van den Bruel et al. (2012) and Stolper et al. (2011).

  2. Anderson et al. (2013), Gigerenzer et al. (2007) and Manrai et al. (2014).

  3. Dretske (1971, page 1) argues that ‘S knows P (based on reason R)’ entails ‘R would not be the case unless P were the case’, which I interpret to entail that ‘if P were false, R would be false’.

  4. Dretske’s motivating example also concerns a medical case with a faulty instrument (1971, page 2). He considers an agent who forms a belief that a child has a normal temperature (98.6) based on a thermometer which gives the correct readings only for normal temperatures but is “stuck” at 98.6 for higher temperatures. In this case, the following subjunctive is true: if the child were to have a fever, the thermometer would still say the temperature is 98.6. And as a consequence, according to Dretske, the agent does not know that the child has a normal temperature.

  5. Though it features prominently in some contextualist approaches to knowledge (DeRose, 1995; Ichikawa, 2017).

  6. Roush (2005, page 28).

  7. The conditional in box A is in a subjunctive mood whereas conditional probabilities are thought to correspond to conditionals in the indicative mood. There are important differences between these as we discuss below.

  8. In other words, the Bayes factor k1/k2 is < 1, signifying that learning the result (the instrument predicts the patient has the disease) adds some support to the hypothesis that the patient has the disease (assuming your “prior” for the disease was < 1).

  9. In a sense, this gives the new principles an internalist flavor. I discuss this in Sect. 6.

  10. We have left it unclear when ~H > E is supposed to be true. Is it at t1 or t2? When we derive a more precise version of D’s advice we will be concerned with knowledge that ~H > E at t1.

  11. Although we are able to preserve single premise closure, we don’t accept multi-premise closure. In particular, an agent can know p and can know q where both sentences are under discussion in a learning scenario with ignorance zone k (so credence in p and in q will both be above or equal to k). The conjunction p and q will also be under discussion in the same learning scenario (by definition). And the agent’s confidence in the conjunction may very well fall below k (and inside the ignorance zone) since the probability of a conjunction can (and often does) fall below the probability of either conjunct.
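The arithmetic behind this failure of multi-premise closure can be illustrated with a small sketch (the threshold and credences below are my own illustrative numbers, not from the paper):

```python
# Each conjunct clears the threshold k, yet the conjunction falls into the
# ignorance zone [0, k). Numbers are illustrative only.

k = 0.8            # knowledge threshold: credence must be >= k
credence_p = 0.85  # credence in p
credence_q = 0.85  # credence in q

# If p and q are probabilistically independent, the credence in the
# conjunction is the product of the credences in the conjuncts.
credence_p_and_q = credence_p * credence_q  # 0.7225

assert credence_p >= k and credence_q >= k
assert credence_p_and_q < k  # the conjunction lands inside the ignorance zone
```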

  12. We are not interested in determining what one’s confidence in H should be when one learns the conditional (if ~H then E). Plausibly, one’s confidence in H could be affected by learning the conditional. I assume that this updating has been settled by t1. I assume that at t1 one has some stable antecedent rational credence in H as well as in (If ~H then E).

  13. Stalnaker (1970), Adams (1975), Gibbard (1981), Edgington (1995) and Bennett (2003).

  14. Over et al. (2007), Douven and Verbrugge (2010) and Pfeifer and Kleiter (2010).

  15. David Lewis (1976).

  16. Adams (1975), Kratzer (1986), Lewis (1975, 1986) and Russell and Hawthorne (2016).

  17. Briggs (2017), Leitgeb (2012) and Williams (2012).

  18. Lewis argued that C(Y > X) = C(X//Y), for positive C(Y). Notice how C(X//Y) can be radically different from C(X|Y) as our example reveals.

  19. A non-trivial assumption in the derivation of DAS is that learning E does not destroy knowledge of ~H > E (i.e. does not lead to a violation of the constraints). In the medical case we started with, this is a reasonable assumption. If I know that my instrument would say that I am sick if I weren’t, and I further learn that the instrument says I am sick, it need not affect my confidence in the subjunctive.

  20. See Huemer (2016) for an interpretation of the priors as something closer to ‘a priori’. What is the a priori probability of not being a BIV? I find myself unable to get a handle on this question.

  21. As an anonymous referee correctly points out, even if our confidence that the evidence is the same conditional on being a BIV is 1, it doesn’t follow that we should accept the skeptical conclusion. Suppose, for example, that the knowledge threshold is .8 and that the prior that we are not BIVs is .9. If we also assume that the confidence of our evidence being the same conditional on not being a BIV is 1, then the posterior (the probability we are not BIVs conditional on the evidence) will also be .9 (applying Bayes’ theorem) which is above the knowledge threshold. The same result holds if we assume instead that the confidence in the corresponding subjunctive (if we were a BIV, then our evidence would be the same) is 1. This is because C(X//Y) = 1 entails C(X|Y) = 1 under reasonable assumptions.
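The referee’s computation can be verified directly with Bayes’ theorem (a minimal sketch using the numbers given in this note):

```python
# Verifying the note's numbers: threshold .8, prior .9 that we are not BIVs,
# and evidence that is certain on both hypotheses.

threshold = 0.8
prior_not_biv = 0.9
p_e_given_not_biv = 1.0  # our evidence would be the same if we are not BIVs
p_e_given_biv = 1.0      # ...and the same if we are BIVs

# Law of total probability, then Bayes' theorem.
p_e = p_e_given_not_biv * prior_not_biv + p_e_given_biv * (1 - prior_not_biv)
posterior_not_biv = p_e_given_not_biv * prior_not_biv / p_e

print(posterior_not_biv)               # 0.9
print(posterior_not_biv >= threshold)  # True: above the knowledge threshold
```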

  22. Where ‘>’ denotes the subjunctive conditional, this principle is (strictly speaking) entailed by Dretske’s account (as discussed in Sect. 1). They are equivalent under reasonable assumptions. In counterfactual logics these assumptions are ‘anti-symmetry’ and ‘limit’ (Sider, 2010), which are taken on by Stalnaker (1968).

  23. Another way to explain why our agent does not know H (it is 3:31 pm) is to argue that the proposition ~H > E is itself a propositional defeater for knowing H (where E is the watch reading). A propositional defeater D for H is a proposition that would lower the justification for H if it were accepted (Bergman, 2006). On this strategy we would need to show how accepting ~H > E can lead to a reduction in the credence for H, under certain further assumptions.

References

  • Adams, E. (1975). The logic of conditionals. D. Reidel, Synthese Library.

  • Anderson, B., Williams, S., & Schulkin, J. (2013). Statistical literacy of obstetrics-gynecology residents. Journal of Graduate Medical Education, 5(2), 272–275.

  • Bennett, J. (2003). A philosophical guide to conditionals. Oxford University Press.

  • Bergman, M. (2006). Justification without awareness. Oxford University Press.

  • Briggs, R. (2017). Two interpretations of the Ramsey test. In H. Beebee, C. Hitchcock, & H. Price (Eds.), Making a difference: Essays in honour of Peter Menzies. Oxford University Press.

  • DeRose, K. (1995). Solving the skeptical problem. Philosophical Review, 104(1), 1–52.

  • Dhaliwal, G. (2011). Going with your gut. Journal of General Internal Medicine, 26, 107.

  • Douven, I., & Verbrugge, S. (2010). The Adams family. Cognition, 117(3), 302–318.

  • Dretske, F. (1970). Epistemic operators. The Journal of Philosophy, 67(24), 1007–1023.

  • Dretske, F. (1971). Conclusive reasons. Australasian Journal of Philosophy, 49, 1–22.

  • Edgington, D. (1995). On conditionals. Mind, 104, 235–329.

  • Gettier, E. L. (1963). Is justified true belief knowledge? Analysis, 23(6), 121–123.

  • Gibbard, A. (1981). Two recent theories of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Ifs (pp. 211–247). Reidel.

  • Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest, 8(2), 53–96.

  • Hawthorne, J. (2003). Knowledge and lotteries. Oxford University Press.

  • Huemer, M. (2016). Serious theories and skeptical theories: Why you are probably not a brain in a vat. Philosophical Studies, 173(4), 1031–1052.

  • Ichikawa, J. J. (2017). Contextualizing knowledge. Oxford University Press.

  • Kratzer, A. (1986). Conditionals. Chicago Linguistics Society, 22(2), 1–15.

  • Leitgeb, H. (2012). A probabilistic semantics for counterfactuals. Part A. The Review of Symbolic Logic, 5(1), 26–84.

  • Lewis, D. (1975). Adverbs of quantification. In E. L. Keenan (Ed.), Formal semantics of natural language (pp. 3–15). Cambridge University Press.

  • Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Review, 85(3), 297–315.

  • Lewis, D. (1986). Probabilities of conditionals and conditional probabilities II. The Philosophical Review, 95, 581–589.

  • Manrai, A. K., Bhatia, G., Strymish, J., & Kohane, I. S. (2014). Medicine’s uncomfortable relationship with math: Calculating positive predictive value. JAMA Internal Medicine, 174(6), 991–993.

  • Mayo-Wilson, C. (2018). Epistemic closure in science. The Philosophical Review, 127(1), 73–114.

  • Nozick, R. (1981). Philosophical explanations. Harvard University Press.

  • Over, D., Hadjichristidis, C., Evans, J. S. B. T., Handley, S., & Sloman, S. (2007). The probability of causal conditionals. Cognitive Psychology, 54(1), 62–97.

  • Pfeifer, N., & Kleiter, G. D. (2010). The conditional in mental probability logic. In M. Oaksford & N. Chater (Eds.), Cognition and conditionals (pp. 153–173). Oxford University Press.

  • Putnam, H. (1981). Reason, truth and history. Cambridge University Press.

  • Roush, S. (2005). Tracking truth. Oxford University Press.

  • Russell, J. S., & Hawthorne, J. (2016). General dynamic triviality theorems. The Philosophical Review, 125(3), 307–339.

  • Sider, T. (2010). Logic for philosophy. Oxford University Press.

  • Stalnaker, R. C. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in logical theory (American Philosophical Quarterly Monographs 2) (pp. 98–112). Blackwell.

  • Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37(1), 64–80.

  • Stolper, E., Wiel, M., Royen, P., Bokhoven, M., Weijden, T., & Dinant, G. J. (2011). Gut feelings as a third track in general practitioners’ diagnostic reasoning. Journal of General Internal Medicine, 26, 197–203.

  • Van den Bruel, A., Thompson, M., Buntinx, F., & Mant, D. (2012). Clinicians’ gut feeling about serious infections in children: Observational study. BMJ, 345, e6144.

  • Williams, J. R. G. (2012). Counterfactual triviality: A Lewis-impossibility argument for counterfactuals. Philosophy and Phenomenological Research, 85(3), 648–670.

  • Williamson, T. (2000). Knowledge and its limits. Oxford University Press.


Acknowledgements

I would like to thank Brad Armendt, Shyam Nair, Bryan Lietz, audience members at the Society for Exact Philosophy, and anonymous referees for helpful feedback and suggestions.

Author information


Corresponding author

Correspondence to Ángel Pinillos.


Appendices

Appendix 1: Notation convention

We will be making repeated use of Bayes’ theorem. Here it is, applied to a learning scenario L:

(Bayes) Ct1(H|E) = Ct1(E|H) Ct1(H)/Ct1(E).

where by the law of total probability: Ct1(E) = Ct1(E|H)Ct1(H) + Ct1(E|~H)Ct1(~H).

Let’s agree to some friendlier notation which I summarize here for convenience (Table 4).

Table 4 Notation convention

This allows us to rewrite Bayes as:

(BT) Ct1(H|E) = lp/(lp + x(1 − p)).

Finally, we always assume l, x, p can only take values in the interval [0,1] since they are probabilities.
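For readers who want to experiment, (BT) transcribes directly into code (a sketch in the appendix’s notation; the sample values are mine, not the paper’s):

```python
# (BT): Ct1(H|E) = l*p / (l*p + x*(1 - p)), where l = Ct1(E|H), x = Ct1(E|~H),
# and p = Ct1(H). Assumes the denominator is positive, i.e. Ct1(E) > 0.

def posterior(l: float, x: float, p: float) -> float:
    """Posterior credence in H upon learning E, via Bayes' theorem."""
    return (l * p) / (l * p + x * (1 - p))

# Illustrative values (not from the paper):
print(posterior(1.0, 1.0, 0.9))  # 0.9 -- evidence equally likely either way leaves the prior unchanged
print(posterior(1.0, 0.1, 0.5))  # ~0.909 -- evidence unlikely under ~H raises credence in H
```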

Appendix 2: Deriving f (indicative)

Consider an arbitrary learning situation L with an ignorance zone [0,k) where (a) H is known at t2 (upon learning E) and (b) ~H → E is known at t1. Typically, this situation won’t be possible. But there are cases in which it is possible, revealing the limits of sensitivity. These limits are expressed by the function f, which we now derive.

From the premise that L has ignorance zone [0,k), (a), plus conditionalization, we get (i) Ct1(H|E) ≥ k. From the premise that L has ignorance zone [0,k), (b) and ICA (see main text), we get (ii) x ≥ k. We will add one more constraint which maximizes the posterior as we discussed in the text. We assume (iii) l = Ct1(E|H) = 1. These assumptions are summarized in Table 5.

Table 5 Constraints for →, for L with ignorance zone = [0,k)

These are the constraints where our new sensitivity principle would not work (would not yield a denial of knowledge).

Now from (i), (iii) and BT we derive:

(1) p/(p + x(1 − p)) ≥ k

Algebraic manipulation of (1) and (ii) entails:

(2) (p − kp)/(k(1 − p)) ≥ x ≥ k

Eliminating x:

(3) 0 ≥ k² − k²p + kp − p

What (3) represents are the possible values of k, p for which the constraints (i)–(iii) are met (Fig. 3). Suppose we pick a pair k1, p1 which does not satisfy (3); it follows that the constraints are violated. That is, either l < 1, the agent fails to know H at t2, or she fails to know the indicative ~H → E at t1. Let’s grant that l = 1 (the strongest case for knowing H at t2) and that she knows the indicative at t1. We conclude that she fails to know H at t2. In other words, l = 1, knowledge of the indicative, plus violation of (3) entails failure to know H at t2 (e.g. failure to know the patient has the disease) upon learning E. We have the makings of a necessary condition for knowledge.
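The role of (3) can be checked numerically: with l = 1 and x at its minimal permitted value k, the posterior clears the threshold exactly when (3) holds (a quick sanity check of my own, not part of the paper’s proof):

```python
# Check that (3) characterizes the boundary case: setting l = 1 and x = k,
# Ct1(H|E) >= k if and only if 0 >= k^2 - k^2*p + k*p - p.

def posterior(l, x, p):
    return (l * p) / (l * p + x * (1 - p))

for k in [0.6, 0.8, 0.9]:
    for p in [i / 100 for i in range(1, 100)]:
        meets_threshold = posterior(1.0, k, p) >= k
        satisfies_3 = 0 >= k**2 - k**2 * p + k * p - p
        assert meets_threshold == satisfies_3
print("inequality (3) matches the posterior condition")
```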

Fig. 3: The surface is a graph of equation (3) and represents the k,p values that satisfy the constraints.

Looking at Fig. 3, the possible values of k,p that satisfy the constraints will just be the two-dimensional projection of the surface onto the top of the cube. This will be a region of space bounded by a curve we will call f. f is a function which is given implicitly in (4), derived from (3). Graphing f, the area above it (inclusive) represents the pairs k,p which satisfy our constraints (see Fig. 1 in the main paper).

(4) k² − k²p + kp − p = 0 (the function p = f(k), given implicitly)

Appendix 3: Deriving g (Subjunctive)

The constraints translate to (i)–(iii) as depicted in Table 6. (i) and (iii) are derived in the same way as in Table 5. (ii) is obtained from SCA (see main text) plus the fact that L has ignorance zone [0,k).

Table 6 Constraints for >, for L with ignorance zone = [0,k)

We begin by dividing the totality of possible worlds W into four disjoint sets (some of which may be empty): WHE, WH~E, W~HE and W~H~E. WHE is just the set of worlds where both H and E are true, and so on. Propositions are just sets of worlds, so WHE is just the proposition H&E, and so on.

Before we get to the details of the proof, I want to briefly point out the main driving idea. Once we see this, we will get a feel for how the proof is supposed to work. In particular, we will be able to see how g is fixed independently of any choice function (the function which selects the closest worlds in a counterfactual logic). This may be surprising.

The key idea is to note that with the help of imaging, we can deduce facts about one’s priors and conditional probabilities from facts about our credences regarding subjunctives. Once we deduce these facts, we can just plug them into Bayes’ theorem to get constraints on the posterior, as we did in the indicative case. To see this, suppose that C(E//~H) ≥ k. By the definition of imaging, this means that if you add up the probabilities of all the worlds (assume for simplicity there are just finitely many possible worlds) such that the nearest ~H world to them is an E world, then this sum will be greater than or equal to k. Consider the set W* of all such worlds. C(W*) ≥ k. W* will be a subset of WHvE (because no element of W* can be a world where both ~H and ~E are true — if ~H is true in a W* world, then the closest ~H world will be itself, and so it must be an E world by definition). Since C(W*) ≥ k and W* is a subset of WHvE, it follows that C(HvE) ≥ k. C(HvE) is itself equivalent to C(H) + C(E|~H)C(~H), when defined. Notice that these are elements of Bayes’ theorem, and so (via algebraic manipulation) we can get constraints on the posterior C(H|E). And note that we made no appeal to any fact about selection functions.
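The driving idea can be made concrete in a toy model (my own construction; the credences and the selection function below are stipulated for illustration, not taken from the paper):

```python
# Four worlds classified by the truth values of H and E, with a credence function C.
# Imaging: C(E//~H) sums the credence of every world whose nearest ~H world is an
# E world. Credences and selection function are stipulated for illustration.

C = {('H', 'E'): 0.5, ('H', '~E'): 0.2, ('~H', 'E'): 0.2, ('~H', '~E'): 0.1}

def nearest_not_H(world):
    """One admissible selection function: a ~H world is its own nearest ~H world;
    for an H world we stipulate that the nearest ~H world agrees on E."""
    h, e = world
    return ('~H', e)

# W*: the worlds whose nearest ~H world is an E world.
w_star = [w for w in C if nearest_not_H(w)[1] == 'E']
c_e_image = sum(C[w] for w in w_star)  # C(E//~H)

# As the text argues, W* is a subset of the H-or-E worlds: a ~H&~E world is its
# own nearest ~H world, which is a ~E world, so it never makes it into W*.
assert all(h == 'H' or e == 'E' for (h, e) in w_star)
print(round(c_e_image, 10))  # 0.7, so C(HvE) >= C(E//~H) in this model
```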

Let’s begin the proof. The product rule in the probability calculus gives us (5):

(5) Ct1(W~HE) = Ct1(~H&E) = Ct1(E|~H) Ct1(~H) = x(1-p)

Applying (5) and (iii) to BT followed by setting the inequality in (i) yields:

(6) [p/(p + Ct1(W~HE))] ≥ k

Algebraic manipulation gets us:

(7) [(p/k) − p] ≥ Ct1(W~HE).

Next, I will prove (8).

(8) Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k

Here’s the proof for (8). From (ii) and IMA (see main text) we get Ct1(E//~H) = ∑W Ct1(w)·1E(w~H) ≥ k. So ∑W Ct1(w)·1E(w~H) = ∑HvE Ct1(w)·1E(w~H) + ∑~H&~E Ct1(w)·1E(w~H) ≥ k. We know that for all w in W~H~E, 1E(w~H) = 0. This is because if w is in W~H~E, then both ~H and ~E are true in w, so the closest ~H world is a ~E world (the closest ~H world is w itself), which means that 1E(w~H) = 0. It follows that ∑~H&~E Ct1(w)·1E(w~H) = 0 and hence ∑HvE Ct1(w)·1E(w~H) ≥ k. Now since ∑HvE Ct1(w) ≥ ∑HvE Ct1(w)·1E(w~H), we deduce that ∑HvE Ct1(w) ≥ k. But this is just Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k. QED.

Continuing, manipulation of (8) gets us Ct1(W~HE) ≥ k − [Ct1(WHE) + Ct1(WH~E)]. But since Ct1(WHE) + Ct1(WH~E) = Ct1(H) = p, we deduce Ct1(W~HE) ≥ k − p from these last equations. This, together with (7) (eliminating Ct1(W~HE)), yields (p/k) − p ≥ k − p, which is just our function g:

(9) p ≥ k² (the function p = g(k) = k²)
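Since (7) bounds Ct1(W~HE) from above and the manipulation of (8) bounds it from below, the two are jointly satisfiable only when (p/k) − p ≥ k − p, i.e. p ≥ k². A quick numeric check (my own, not part of the paper’s proof):

```python
# The constraints leave room for Ct1(W_~HE) only when the upper bound from (7)
# is at least the lower bound k - p from (8). That happens iff p >= k^2.

def bounds_compatible(k: float, p: float) -> bool:
    upper = p / k - p  # from (7): (p/k) - p >= Ct1(W_~HE)
    lower = k - p      # from (8): Ct1(W_~HE) >= k - p
    return upper >= lower

for k in [0.51, 0.77, 0.93]:
    for p in [i / 1000 for i in range(1, 1000)]:
        assert bounds_compatible(k, p) == (p >= k * k)
print("constraints compatible exactly when p >= k^2")
```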


Cite this article

Pinillos, Á. Bayesian sensitivity principles for evidence based knowledge. Philos Stud 179, 495–516 (2022). https://doi.org/10.1007/s11098-021-01668-3
