Abstract
In this paper, I propose and defend a pair of necessary conditions on evidence-based knowledge which resemble the troubled sensitivity principles defended in the philosophical literature (notably by Fred Dretske). We can think of the traditional principles as simple but inaccurate approximations of the new proposals. Insofar as the old principles are intuitive and used in scientific and philosophical contexts, yet are plausibly false, there is a real need to develop precise and correct formulations. The new renditions turn out to be more cautious, so they cannot do everything the old principles promised they could. For example, they respect closure for knowledge. But these sober formulations, or something like them, might be the best that we can do with respect to sensitivity. And there is value in understanding the limits of these types of principles.
Notes
Dretske (1971, page 1) argues that ‘S knows P (based on reason R)’ entails ‘R would not be the case unless P were the case’, which I interpret to entail that ‘if P were false, R would be false’.
Dretske’s motivating example also concerns a medical case with a faulty instrument (1971, page 2). He considers an agent who forms a belief that a child has a normal temperature (98.6) based on a thermometer which gives the correct readings only for normal temperatures but is “stuck” at 98.6 for higher temperatures. In this case, the following subjunctive is true: if the child were to have a fever, the thermometer would still say the temperature is 98.6. And as a consequence, according to Dretske, the agent does not know that the child has a normal temperature.
Roush (2005, page 28).
The conditional in box A is in a subjunctive mood whereas conditional probabilities are thought to correspond to conditionals in the indicative mood. There are important differences between these as we discuss below.
In other words, the Bayes factor is k1/k2 < 1 signifying that learning the result (the instrument predicts the patient has the disease) adds some support to the hypothesis that the patient has the disease (assuming your “prior” for the disease was < 1).
In a sense, this gives the new principles an internalist flavor. I discuss this in Sect. 6.
We have left it unclear when ~H > E is supposed to be true. Is it at t1 or t2? When we derive a more precise version of D’s advice we will be concerned with knowledge that ~H > E at t1.
Although we are able to preserve single premise closure, we don’t accept multi-premise closure. In particular, an agent can know p and can know q where both sentences are under discussion in a learning scenario with ignorance zone k (so credence in p and in q will both be above or equal to k). The conjunction p and q will also be under discussion in the same learning scenario (by definition). And the agent’s confidence in the conjunction may very well fall below k (and inside the ignorance zone) since the probability of a conjunction can (and often does) fall below the probability of either conjunct.
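The failure of multi-premise closure described here can be sketched numerically; the threshold and the joint distribution below are invented purely for illustration:

```python
# Hypothetical ignorance zone [0, 0.8): each conjunct clears the
# threshold, but the conjunction falls inside the zone.
k = 0.8
# An illustrative joint distribution over the four truth-value cells.
joint = {("p", "q"): 0.70, ("p", "~q"): 0.15,
         ("~p", "q"): 0.15, ("~p", "~q"): 0.00}
assert abs(sum(joint.values()) - 1.0) < 1e-12

c_p = joint[("p", "q")] + joint[("p", "~q")]   # credence in p = 0.85
c_q = joint[("p", "q")] + joint[("~p", "q")]   # credence in q = 0.85
c_pq = joint[("p", "q")]                       # credence in p & q = 0.70

assert c_p >= k and c_q >= k   # both conjuncts are above the threshold
assert c_pq < k                # the conjunction drops into [0, k)
```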
We are not interested in determining what one’s confidence in H should be when one learns the conditional (if ~H then E). Plausibly, one’s confidence in H could be affected by learning the conditional. I assume that this updating has been settled by t1. I assume that at t1 one has some stable antecedent rational credence in H as well as in (If ~H then E).
David Lewis (1976).
Lewis argued that C(Y > X) = C(X//Y), for positive C(Y). Notice how C(X//Y) can be radically different from C(X|Y) as our example reveals.
A non-trivial assumption in the derivation of DAS is that learning E does not destroy knowledge of ~H > E (i.e. does not lead to a violation of the constraints). In the medical case we started with, this is a reasonable assumption. If I know that my instrument would say that I am sick if I weren’t, and I further learn that the instrument says I am sick, it need not affect my confidence in the subjunctive.
See Huemer (2016) for an interpretation of the priors as something closer to ‘a priori’. What is the a priori probability of not being a BIV? I find myself unable to get a handle on this question.
As an anonymous referee correctly points out, even if our confidence that the evidence is the same conditional on being a BIV is 1, it doesn’t follow that we should accept the skeptical conclusion. Suppose, for example, that the knowledge threshold is .8 and that the prior that we are not BIVs is .9. If we also assume that the confidence of our evidence being the same conditional on not being a BIV is 1, then the posterior (the probability we are not BIVs conditional on the evidence) will also be .9 (applying Bayes’ theorem) which is above the knowledge threshold. The same result holds if we assume instead that the confidence in the corresponding subjunctive (if we were a BIV, then our evidence would be the same) is 1. This is because C(X//Y) = 1 entails C(X|Y) = 1 under reasonable assumptions.
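The referee’s point can be verified with a short Bayesian calculation; the threshold .8 and prior .9 are the hypothetical values from this note:

```python
# Hypothetical values from the note: knowledge threshold .8, prior that
# we are not BIVs .9, and evidence equally likely either way.
threshold = 0.8
p_not_biv = 0.9            # prior C(~BIV)
p_e_given_not_biv = 1.0    # C(E | ~BIV)
p_e_given_biv = 1.0        # C(E | BIV): the evidence "would be the same"

# Bayes' theorem: C(~BIV | E)
p_e = p_e_given_not_biv * p_not_biv + p_e_given_biv * (1 - p_not_biv)
posterior = p_e_given_not_biv * p_not_biv / p_e

assert abs(posterior - 0.9) < 1e-12   # posterior equals the prior
assert posterior > threshold          # and stays above the threshold
```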
Where ‘>’ denotes the subjunctive conditional, this principle is (strictly speaking) entailed by Dretske’s account (as discussed in Sect. 1). They are equivalent under reasonable assumptions. In counter-factual logics these assumptions are ‘anti-symmetry’ and ‘limit’, Sider (2010), which are taken on by Stalnaker (1968).
Another way to explain why our agent does not know H (it is 3:31 pm) is to argue that the proposition ~H > E is itself a propositional defeater for knowing H (where E is the watch reading). A propositional defeater D for H is a proposition that would lower the justification for H if it were accepted (Bergmann, 2006). On this strategy we would need to show how accepting ~H > E can lead to a reduction in the credence for H, under certain further assumptions.
References
Adams, E. (1975). The logic of conditionals. D. Reidel, Synthese Library.
Anderson, B., Williams, S., & Schulkin, J. (2013). Statistical literacy of obstetrics-gynecology residents. Journal of Graduate Medical Education, 5(2), 272–275.
Bennett, J. (2003). A philosophical guide to conditionals. Oxford University Press.
Bergmann, M. (2006). Justification without awareness. Oxford University Press.
Briggs, R. (2017). Two interpretations of the Ramsey test. In H. Beebee, C. Hitchcock, & H. Price (Eds.), Making a difference: Essays in honour of Peter Menzies. Oxford: Oxford University Press.
DeRose, K. (1995). Solving the skeptical problem. Philosophical Review, 104(1), 1–52.
Dhaliwal, G. (2011). Going with your gut. Journal of General Internal Medicine, 26, 107.
Douven, I., & Verbrugge, S. (2010). The Adams family. Cognition, 117(3), 302–318.
Dretske, F. (1970). Epistemic operators. The Journal of Philosophy, 67(24), 1007–1023.
Dretske, F. (1971). Conclusive reasons. Australasian Journal of Philosophy, 49, 1–22.
Edgington, D. (1995). On conditionals. Mind, 104, 235–329.
Gettier, E. L. (1963). Is justified true belief knowledge? Analysis, 23(6), 121–123.
Gibbard, A. (1981). Two recent theories of conditionals. In W. L. Harper, R. Stalnaker, & G. Pearce (Eds.), Ifs (pp. 211–247). Reidel.
Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest, 8(2), 53–96.
Hawthorne, J. (2003). Knowledge and lotteries. Oxford University Press.
Huemer, M. (2016). Serious theories and skeptical theories: Why you are probably not a brain in a vat. Philosophical Studies, 173(4), 1031–1052.
Ichikawa, J. J. (2017). Contextualizing knowledge. Oxford University Press.
Kratzer, A. (1986). Conditionals. Chicago Linguistics Society, 22(2), 1–15.
Leitgeb, H. (2012). A probabilistic semantics for counterfactuals. Part A. The Review of Symbolic Logic, 5(1), 26–84.
Lewis, D. (1975). Adverbs of quantification. In E. L. Keenan (Ed.), Formal semantics of natural language (pp. 3–15). Cambridge University Press.
Lewis, D. (1976). Probabilities of conditionals and conditional probabilities. Philosophical Review, 85(3), 297–315.
Lewis, D. (1986). Probabilities of conditionals and conditional probabilities II. The Philosophical Review, 95, 581–589.
Manrai, A. K., Bhatia, G., Strymish, J., & Kohane, I. S. (2014). Medicine’s uncomfortable relationship with math: Calculating positive predictive value. JAMA Internal Medicine, 174(6), 991–993.
Mayo-Wilson, C. (2018). Epistemic closure in science. The Philosophical Review, 127(1), 73–114.
Nozick, R. (1981). Philosophical explanations. Harvard University Press.
Over, D., Hadjichristidis, C., Evans, J. S. B. T., Handley, S., & Sloman, S. (2007). The probability of causal conditionals. Cognitive Psychology, 54(1), 62–97.
Pfeifer, N., & Kleiter, G. D. (2010). The conditional in mental probability logic. In M. Oaksford & N. Chater (Eds.), Cognition and conditionals (pp. 153–173). Oxford University Press.
Putnam, H. (1981). Reason, truth and history. Cambridge University Press.
Roush, S. (2005). Tracking truth. Oxford University Press.
Russell, J. S., & Hawthorne, J. (2016). General dynamic triviality theorems. The Philosophical Review, 125(3), 307–339.
Sider, T. (2010). Logic for philosophy. Oxford University Press.
Stalnaker, R. (1970). Probability and conditionals. Philosophy of Science, 37(1), 64–80.
Stalnaker, R. C. (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in logical theory (American philosophical quarterly monographs 2) (pp. 98–112). Oxford: Blackwell.
Stolper, E., Wiel, M., Royen, P., Bokhoven, M., Weijden, T., & Dinant, G. J. (2011). Gut feelings as a third track in general practitioners’ diagnostic reasoning. Journal of General Internal Medicine, 26, 197–203.
Van den Bruel, A., Thompson, M., Buntinx, F., & Mant, D. (2012). Clinicians’ gut feeling about serious infections in children: Observational study. BMJ, 345, e6144.
Williams, J. R. G. (2012). Counterfactual triviality: A Lewis-impossibility argument for counterfactuals. Philosophy and Phenomenological Research, 85(3), 648–670.
Williamson, T. (2000). Knowledge and its limits. Oxford University Press.
Acknowledgements
I would like to thank Brad Armendt, Shyam Nair, Bryan Lietz, audience members at the Society for Exact Philosophy, and anonymous referees for helpful feedback and suggestions.
Appendices
Appendix 1: Notation convention
We will be making repeated use of Bayes’ theorem. Here it is, applied to a learning scenario L:
(Bayes) Ct1(H|E) = Ct1(E|H) Ct1(H)/Ct1(E).
where by the law of total probability: Ct1(E) = Ct1(E|H)Ct1(H) + Ct1(E|~H)Ct1(~H).
Let’s agree to some friendlier notation which I summarize here for convenience (Table 4).
This allows us to rewrite Bayes as:
(BT) Ct1(H|E) = lp/(lp + x(1 − p)).
Finally, we always assume l, x, p can only take values in the interval [0,1] since they are probabilities.
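As a sanity check, (BT) can be verified against the unabbreviated statement of Bayes’ theorem above; a minimal sketch, where the particular values of l, x, p are arbitrary illustrative choices:

```python
# l = Ct1(E|H), x = Ct1(E|~H), p = Ct1(H)  (the Table 4 abbreviations)
def posterior(l, x, p):
    """(BT): Ct1(H|E) = lp / (lp + x(1 - p))."""
    return l * p / (l * p + x * (1 - p))

# Unabbreviated Bayes, with total probability in the denominator
def posterior_long(e_given_h, e_given_not_h, h):
    e = e_given_h * h + e_given_not_h * (1 - h)
    return e_given_h * h / e

# Illustrative values only; any l, x, p in [0, 1] (with a nonzero
# denominator) agree.
l, x, p = 0.95, 0.2, 0.3
assert abs(posterior(l, x, p) - posterior_long(l, x, p)) < 1e-12
```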
Appendix 2: Deriving f (indicative)
Consider an arbitrary learning situation L with an ignorance zone [0,k) where (a) H is knownt2 (upon learning E) and (b) ~H → E is knownt1. Typically, this situation won’t be possible. But there are cases in which it is possible, revealing the limits of sensitivity. These limits are expressed by function f which we now derive.
From the premise that L has ignorance zone [0,k), (a), plus conditionalization, we get (i) Ct1(H|E) ≥ k. From the premise that L has ignorance zone [0,k), (b) and ICA (see main text), we get (ii) x ≥ k. We will add one more constraint which maximizes the posterior as we discussed in the text. We assume (iii) l = Ct1(E|H) = 1. These assumptions are summarized in Table 5.
These are the constraints where our new sensitivity principle would not work (would not yield a denial of knowledge).
Now from (i), (iii) and BT we derive:
(1) p/(p + x(1 − p)) ≥ k
Algebraic manipulation of (1) and (ii) entails:
(2) (p − kp)/(k(1 − p)) ≥ x ≥ k
Eliminating x:
(3) 0 ≥ k² − k²p + kp − p
What (3) represents are the possible values of k, p for which the constraints (i–iii) are met (Fig. 3). Suppose we pick a pair k1, p1 which does not satisfy (3); it follows that the constraints are violated. That is, either l < 1, the agent fails to know​t2 H, or she fails to know​t1 the indicative ~H → E. Let’s grant that l = 1 (the strongest case for knowing​t2 H) and that she knows​t1 the indicative. We conclude that she fails to know​t2 H. In other words, l = 1, knowledge of the indicative, plus violation of (3) entails failure to know​t2 H (e.g. failure to know the patient has the disease) upon learning E. We have the makings of a necessary condition for knowledge.
Looking at Fig. 3, the possible values of k, p that satisfy the constraints will just be the two-dimensional projection of the surface on the top of the cube. This will just be a region of space bounded by a curve we will call f. f is a function which is given implicitly in (4), derived from (3). Graphing f, the area above it (inclusive) represents the pairs k, p which satisfy our constraints (see Fig. 1 in the main paper).
(4) k² − k²p + kp − p = 0 (function p = f(k), given implicitly)
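Equation (4) can also be solved explicitly for p, which makes it easy to check (3) numerically and to confirm that the region on or above the curve satisfies it; a minimal sketch:

```python
def f(k):
    # Explicit solution of (4) for p: k^2 - k^2*p + k*p - p = 0
    # rearranges to p = k^2 / (k^2 - k + 1); the denominator is
    # positive for every k in [0, 1].
    return k**2 / (k**2 - k + 1)

for i in range(11):
    k = i / 10
    p = f(k)
    # On the curve, (4) holds with equality.
    assert abs(k**2 - k**2 * p + k * p - p) < 1e-12
    # Raising p keeps (3) satisfied, since the right-hand side of (3)
    # only decreases as p grows.
    p_up = min(1.0, p + 0.05)
    assert k**2 - k**2 * p_up + k * p_up - p_up <= 1e-12
```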
Appendix 3: Deriving g (Subjunctive)
The constraints translate to (i)–(iii) as depicted in Table 6. (i) and (iii) are derived in the same way that they were derived in Table 5. (ii) is gotten from SCA (see main text) plus the fact that L has ignorance zone [0,k).
We begin by dividing the totality of possible worlds W into 4 disjoint sets (some of these may be empty): WHE, WH~E, W~HE and W~H~E. WHE is just the set of worlds where both H and E are true, and so on. Propositions are just sets of worlds, so WHE is just the proposition H&E, and so on.
Before we get to the details of the proof, I want to briefly point out the main driving idea. Once we see this, we will be able to get a feel for how the proof is supposed to work. In particular, we will be able to see how g is set independently of any selection function (the function which selects the closest worlds in a counter-factual logic). This may be surprising.
The key idea is to note that with the help of imaging, we can deduce facts about one’s priors and conditional probabilities from facts about our credences regarding subjunctives. Once we deduce these facts, we can just plug them into Bayes’ theorem to get constraints on the posterior, as we did with the indicative case. To see this, suppose that C(E//~H) ≥ k. By the definition of imaging, this means that if you add up the probabilities of all the worlds (assume for simplicity there are just a finite number of possible worlds) such that the nearest ~H world to them is an E world, then this sum will be greater than or equal to k. Consider the set W* of all such worlds. C(W*) ≥ k. W* will be a subset of WHvE (because no element of W* can be a world where both ~H and ~E are true: if ~H is true in a W* world, then the closest ~H world will be itself, and so it must be an E world by definition). Since C(W*) ≥ k and W* is a subset of WHvE, it follows that C(HvE) ≥ k. C(HvE) is itself equivalent to C(H) + C(E|~H)C(~H), if defined. Notice that these are elements of Bayes’ theorem, and so (via algebraic manipulation) we can get constraints on the posterior C(H|E). And note that we made no appeal to any fact about selection functions.
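This imaging computation can be made concrete in a toy finite model; the four worlds, their credences, and the nearest-world assignments below are all invented purely for illustration:

```python
# Four worlds, one per cell: (H, E) truth values, with made-up credences.
worlds = {"HE": (True, True), "H~E": (True, False),
          "~HE": (False, True), "~H~E": (False, False)}
cred = {"HE": 0.5, "H~E": 0.2, "~HE": 0.2, "~H~E": 0.1}

# Hypothetical selection function: the nearest ~H world to each world.
# ~H worlds select themselves; for the H worlds we simply stipulate.
nearest_not_h = {"HE": "~HE", "H~E": "~HE", "~HE": "~HE", "~H~E": "~H~E"}

# Imaging: C(E//~H) = sum of C(w) over worlds w whose nearest ~H world
# is an E world.  That set of worlds is W*.
c_e_image_not_h = sum(cred[w] for w in worlds
                      if worlds[nearest_not_h[w]][1])

# Every member of W* makes H or E true, so C(W*) <= C(HvE).
c_h_or_e = sum(cred[w] for w in worlds if worlds[w][0] or worlds[w][1])
assert c_e_image_not_h <= c_h_or_e
```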
Let’s begin the proof. The product rule in the probability calculus gives us (5):
(5) Ct1(W~HE) = Ct1(~H&E) = Ct1(E|~H) Ct1(~H) = x(1-p)
Applying (5) and (iii) to BT followed by setting the inequality in (i) yields:
(6) [p/(p + Ct1(W~HE))] ≥ k
Algebraic manipulation gets us:
(7) [(p/k) − p] ≥ Ct1(W~HE).
Next, I will prove (8).
(8) Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k
Here’s the proof for (8). From (ii) and IMA (see main text) we get Ct1(E//~H) = ∑w∈W Ct1(w)·1E(w~H) ≥ k. So ∑w∈W Ct1(w)·1E(w~H) = ∑w∈HvE Ct1(w)·1E(w~H) + ∑w∈~H&~E Ct1(w)·1E(w~H) ≥ k. We know that for all w in W~H~E, 1E(w~H) = 0. This is because if w is in W~H~E, then both ~H and ~E are true in w, so the closest ~H world (w itself) is a ~E world, which means that 1E(w~H) = 0. It follows that ∑~H&~E Ct1(w)·1E(w~H) = 0 and hence ∑HvE Ct1(w)·1E(w~H) ≥ k. Now since ∑HvE Ct1(w) ≥ ∑HvE Ct1(w)·1E(w~H), we deduce that ∑HvE Ct1(w) ≥ k. But this is just Ct1(WHE) + Ct1(WH~E) + Ct1(W~HE) ≥ k. QED.
Continuing, manipulation of (8) gets us Ct1(W~HE) ≥ k − [Ct1(WHE) + Ct1(WH~E)]. But since Ct1(WHE) + Ct1(WH~E) = Ct1(H) = p, we deduce Ct1(W~HE) ≥ k − p. This, together with (7) (eliminating Ct1(W~HE)), yields (p/k) − p ≥ k − p, which simplifies to our function g:
(9) p ≥ k²
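As a cross-check on the derivation of (9), a brute-force sketch (the grid resolution is arbitrary) confirms that whenever some value of Ct1(W~HE) jointly satisfies (7) and the rearranged form of (8), the prior p must be at least k²:

```python
# (7):  (p/k) - p >= Ct1(W_~HE)
# (8'): Ct1(W_~HE) >= k - p   (rearranged from (8), using
#       Ct1(W_HE) + Ct1(W_H~E) = p)
# Whenever some q = Ct1(W_~HE) satisfies both, p >= k^2 must follow.
n = 50
for i in range(1, n + 1):        # k in (0, 1]
    k = i / n
    for j in range(n + 1):       # p in [0, 1]
        p = j / n
        for m in range(n + 1):   # candidate q in [0, 1]
            q = m / n
            if (p / k) - p >= q >= k - p:
                assert p >= k**2 - 1e-9   # (9), up to float tolerance
```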
Pinillos, Á. Bayesian sensitivity principles for evidence based knowledge. Philos Stud 179, 495–516 (2022). https://doi.org/10.1007/s11098-021-01668-3