1 Introduction

In previous work I have argued in detail that the comparative application of the hypothetico-deductive (HD-)method is functional not only for making empirical progress but even for achieving truth approximation, in both cases of course in some deductive sense (see Kuipers 2000 for a monograph synthesis). For the latter conclusion the so-called (deductive) success theorem is crucial: being closer to the truth entails always being (empirically) at least as successful, and it entails almost always becoming more successful in the long run, that is, showing empirical progress.

Three intuitions form the basis of this paper. The first is the idea that there must be some kind of probabilistic version of the HD-method, a ‘Hypothetico-Probabilistic (HP-)method’, in terms of something like probabilistic consequences instead of deductive consequences. According to the second intuition, the comparative application of this method should also be functional for some probabilistic kind of empirical progress, and according to the third intuition this should be functional for truth approximation. In all three cases, the guiding idea is to explicate these intuitions by explicating the crucial notions as appropriate ‘concretizations’ of their deductive analogs, which serve as ‘idealizations’.

In a previous paper (Kuipers 2007a), the HP-paper for short, the first intuition, already suggested by Ilkka Niiniluoto in (Niiniluoto and Tuomela 1973, pp. 204, 209), was successfully explicated, and the second as well, at least to a large extent. However, it turned out to be a rather difficult task to explicate the third intuition, for two reasons. First, although ‘probabilistically closer to the truth’ would have to be explicated as a genuine concretization of ‘deductively closer to the truth’, this was very difficult to prove for the most plausible prima facie concretization. Second, whereas a conceptually interesting, but impractical, probabilistic success theorem in terms of ‘expected success’ was plausible to think of, finding a probabilistic version of the success theorem with practical value turned out to be a difficult challenge.

In the present paper these problems have been solved, with some surprising aspects. It integrates brief versions of the explication of the first intuition and a relevant adaptation of that of the second. I would like to stress that for this kind of analysis the general (meta-)method of explication (Kuipers 2007b) is indispensable. It goes back to the constructive attitude of Carnap and Hempel toward the analysis of ‘scientific common language and common sense’: it requires no genuine artificial language, but may well require some well-argued revision of concepts and intuitions. More specifically, this project requires the ‘conceptual version of idealization and concretization (conceptual-I&C)’ (Kuipers 2007b, and more specifically, 2007c), going from the relevant deductive notions to probabilistic ones. In order to be a proper concretization, a notion should satisfy the C-test, which has a formal and a material side. The idealizational point of departure should formally appear as an extreme special case of the concretization. Moreover, the latter should be a case of ‘conceptual progress’ relative to the problems and limitations of the former.

Unfortunately, as we will see, a conceptually satisfactory probabilistic concretization of the deductive story does not yet imply that we have reached the level of real historical cases.

2 The ‘Hypothetico-Probabilistic (HP-) Method’ as a Concretization of the ‘Hypothetico-Deductive (HD-) Method’

2.1 The Basic Concretization

The basic concretization needed is that of the crucial notion of the HD-method, viz. deductive consequences, d-consequences. Quite a number of alternatives have been proposed in the literature. They are briefly presented and compared in the HP-paper (Kuipers 2007a). For our purposes, there are two related, useful possibilities. Starting from the idealization:

$$ H \vDash E:\qquad E\ \text{is a deductive (d-)consequence of}\ H $$

and assuming some probability function p(.) the first attractive possibility is:

$$ p\left( {E/H} \right) \ge p\left( E \right):\qquad E\ \text{is a probabilistic (p-)consequence of}\ H $$

This is also known as the (weak) positive relevance condition in the context of explicating the notion of ‘confirmation’. The probability function may be of any kind; in particular, in terms of a classification introduced in (Kuipers 2001, 2006), it may be Popperian (non-inductive), Carnapian (inductive likelihoods), Bayesian (inductive priors), or Hintikkian (double inductive). As to the formal C-test, it directly follows from

$$ \text{if}\ H \vDash E \text{, then}\ 1 = p\left( {E/H} \right) \ge p\left( E \right) $$

that a d-consequence is an extreme special case of a p-consequence. Here, and later, it is easy to see how we get conditional versions, i.e., assuming conditions such as initial conditions, auxiliary hypotheses, background assumptions, and core principles. Indicating them with C, we get in the present case:

$$ \text{if}\ H, C \vDash E \text{, then}\ 1 = p\left( {E/H \& C} \right) \ge p\left( {E/C} \right) $$

The alternative possibility would be to take p(E/H) ≥ p(¬E/H), i.e., p(E/H) ≥ ½, as the defining condition of a p-consequence, a condition that is again satisfied in an extreme sense by a d-consequence. Although I have a slight preference for the first possibility, there is not much reason to debate the choice, because its relevance disappears as soon as we enter comparisons of the kind p(E/H1) ≥ p(E/H2), which will become crucial for our analysis. For both versions it is interesting, for various formal reasons, to study the set of p-consequences of a statement: PCn(A) =def {B/p(B/A) ≥ p(B)} and {B/p(B/A) ≥ p(¬B/A)}, respectively. In the HP-paper we have made a start.
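Both explications of ‘p-consequence’ can be illustrated with a small sketch; the following is my own toy example (not from the paper), using a finite outcome space with the uniform ‘numerical’ probability function p(A) = |A|/|Mp|:

```python
from fractions import Fraction

# Toy finite outcome space Mp with the uniform probability function.
Mp = frozenset(range(8))

def p(A):
    """Absolute probability of a subset A of Mp under the uniform measure."""
    return Fraction(len(A), len(Mp))

def p_cond(A, B):
    """Conditional probability p(A/B), assuming B non-empty."""
    return Fraction(len(A & B), len(B))

def is_p_consequence(E, H):
    """First explication: E is a p-consequence of H iff p(E/H) >= p(E)."""
    return p_cond(E, H) >= p(E)

def is_p_consequence_alt(E, H):
    """Alternative explication: p(E/H) >= 1/2, i.e. p(E/H) >= p(not-E/H)."""
    return p_cond(E, H) >= Fraction(1, 2)

# A d-consequence (H a subset of E, i.e. H entails E) is the extreme case:
# p(E/H) = 1 >= p(E), so both explications are satisfied.
H = frozenset({0, 1})
E = frozenset({0, 1, 2, 3})
assert H <= E                      # H entails E
assert p_cond(E, H) == 1
assert is_p_consequence(E, H) and is_p_consequence_alt(E, H)
```

The example confirms the formal C-test in miniature: the deductive case appears as the extreme special case p(E/H) = 1 of both probabilistic definitions.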

2.2 From the HD- to the HP-Method of Testing

Although the context of testing a hypothesis (H) resulting in evidence (E) was suggested by our abbreviations, so far nothing substantial has been assumed about such a context. In the context of the HD-method of testing, the crucial notion is that of a (deductive) test implication (E) of a hypothesis (H), which is of course explicated as a (non-trivial) d-consequence of an observational nature. ‘Observational’ will here be understood in the ‘theory-relative’ sense that relative to H the non-formal terms occurring in E are supposed to be observational, however theory-laden they may be with underlying or background theories. In sum, we get as the idealized point of departure:

  • E is a d-test implication of H iff

  • E is an ‘observational’ d-consequence of H such that not \(\vDash E\)

The concretization is now almost as plausible:

  • E is a p-test implication of H iff

  • E is an ‘observational’ p-consequence of H such that \( p\left( {E/H} \right)\, > \,p\left( E \right)\)

It satisfies the C-test, with a marginal exception, for “if \( H \vDash E \) and not \( \vDash E \) then p(E/H) = 1 > p(E)” holds for almost all kinds of evidence, but not for ‘almost tautological’ evidence, i.e., non-tautological evidence for which p(E) nevertheless equals 1. Although such evidence is conceptually possible, it is not something we need to keep taking into account; hence we may conclude that a d-test implication is an extreme case of a p-test implication.

Now we are in a position to formulate the basic aim of the HD-method of testing: it aims at deriving d-test implications and finding out, by experiment or mere observation, whether they are true or false, which is called (d-)confirmation and falsification of the hypothesis, respectively. In sum, the HD-method aims at d-confirmation or falsification of d-test implications. Being restricted to d-consequences, it is an idealization. Its suggested concretization, the Hypothetico-Probabilistic (HP-)method, aims at p-(dis)confirmation of p-test implications. Here ‘p-confirmation’ amounts to a p-test implication coming out true, and ‘p-disconfirmation’ to its coming out false. See (Kuipers 2001, Sect. 7.1.2) for an analysis of probabilistic (dis)confirmation.

2.3 Further Challenges

Starting from the deductive explications presented in (Kuipers 2000), which take counterexamples and falsified theories into account, the main challenges were to give coherent I&C-explications of probabilistic versions of (1) confirmation and falsification, (2) testing and separate and comparative evaluation, and (3) empirical progress and truth approximation. The first challenge was already met in (Kuipers 2000, Ch. 3) and improved in (Kuipers 2001, Sect. 7.1), the second one in the HP-paper (Kuipers 2007a). The present paper builds upon these results, partially redirects them, and combines them with meeting the third challenge.

The main claim of the HP-paper, integrated in the present one, was that the method of likelihood comparisons (the LC-method) is the concretization of (instrumentalist) comparative HD-evaluation. The present paper adds the claim that the LC-method is, like the comparative HD-method, not only functional for empirical progress but also for truth approximation.

More specifically, the HP-paper shows that there is an HP-method as a straightforward concretization of the HD-method. It concerns testing, and separate and comparative evaluation,Footnote 1 and is, depending on the choice of probability function, like the explication of ‘probabilistic confirmation’, of a Popperian, Carnapian, Bayesian, or Hintikkian nature. The first is non-inductive, using only p-functions without an ampliative effect on the probability of the re-occurrence of a property, also called ‘instantial confirmation’. Carnapian p-functions achieve this inductive feature by a likelihood function having this effect, whereas Bayesian p-functions achieve it by assuming a prior distribution over a set of hypotheses. Finally, Hintikkian p-functions achieve it by the combination of the two, for which reason these p-functions may be called ‘double-inductive’.

The relevant aspects of HP-testing and separate HP-evaluation for this paper have been presented above. The core of the explication of ‘comparative HP-evaluation’ amounts to likelihood comparisons p(E/X) > p(E/Y) for evidence E and theories X and Y, here called the LC-method. In Sect. 4 we will present in an integrated form how this method is functional for empirical progress and truth approximation. It will also provide a summary of the way in which it is argued in (Kuipers 2000) that the comparative HD-method is functional for empirical progress and truth approximation. But first we have to define probabilistic truthlikeness.

3 From Deductive to Probabilistic Truthlikeness

3.1 General Assumptions and Restrictions

The present analysis is restricted to laws and theories of a deterministic nature dealing with what is nomically possible versus impossible. Such laws and theories can be characterized by subsets of the set Mp of conceptual possibilities generated by a given vocabulary for the intended domain. In the structuralist approach to theories it has been demonstrated by many realistic examples that it is at least possible to define the type of set-theoretic structures, here called conceptual possibilities, that are involved. See Ch. 12 of (Kuipers 2001) or Sect. 2 of (Kuipers 2007d) for a survey.

When subsets A and B of Mp can be axiomatized, \( \hbox{A}\, \subseteq \,\hbox{B} \) amounts to \( A \vDash B \). Note that we do not make a notational distinction between statements and their models, in the present case between the possible axiomatization of a subset and the subset itself, i.e., the class of models of the axiomatization. It may take some effort to get acquainted with this simplifying convention.

Let T denote the subset of Mp of nomic possibilities. We associate a strong and a weak claim with a theory X. The strong claim is “X = T”, and theory X is true (in this sense) iff this claim is true. We associate a weaker claim primarily with (potential) laws or nomic hypotheses. A nomic hypothesis H claims “\( \hbox{T}\, \subseteq \,\hbox{H} \)”, and H is true iff this claim is true, in which case it is called ‘a (weak nomic) truth’. In line with this, we call the claim “\( \hbox{T}\, \subseteq \,\hbox{X} \)” the weak claim of theory X, or the claim of the theory ‘as a nomic hypothesis’; and if true, the theory is a weak nomic truth, that is, it is ‘true as a nomic hypothesis’. From now on, ‘a truth’ is a weak nomic truth. From the above assumptions and conventions it directly follows that a characterization of T, axiomatic or set theoretic, also indicated by T, represents “the strongest true hypothesis”, called “the (nomic) truth”.

3.2 Deductive Truthlikeness

The basic definition of ‘(comparative) deductive truthlikeness’ can be given in three ways, a model, a consequence and a mixed version. Here we prefer the first one:

  • Basic model (D-)definition of ‘more truthlikeness’:

  • Y is (‘deductively’) at least as close/similar to the (nomic) truth T as X (Y is D-at-least-as-close to T as X) iff:

  • (i) \( {\text{T}} \cap {\text{Y}} \supseteq {\text{T}} \cap {\text{X}}\quad \quad \left[ { \Leftrightarrow {\text{T}} - {\text{Y}}\, \subseteq \,{\text{T}} - {\text{X}}} \right] \)

  • (ii) \( {\text{Y}} - {\text{T}}\, \subseteq \,{\text{X}} - {\text{T}} \)

  • Y is D-closer to T than X iff in addition both subset relations are proper subset relations.

Note first that the chosen definition of ‘D-closer to’ is ‘two-sided’. Weaker versions (‘at least one proper subset relation’, ‘the first (second) subset relation should be proper’) are also possible.

Note also that in terms of symmetric differences (i) and (ii) together amount to \( \hbox{T}\,\Delta\,\hbox{Y}\, \subseteq \,\hbox{T}\,\Delta\,\hbox{X} \) and, in case the theories are axiomatizable, also to: \( \hbox{T} \cap \hbox{X}\, \vDash \,\hbox{Y}\, \vDash \,\hbox{T} \cup \hbox{X} \). This equivalence already makes it plausible that the definition is in a crucial sense of a deductive nature. In terms of the consequence version (Kuipers 2000, 2001) this is still more evident. That version will be touched upon in a paraphrase of the definition of ‘D-more successfulness’ (Sect. 4.2).
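The basic model (D-)definition is purely set theoretic, so it can be checked mechanically. The following is my own toy Python sketch (the sets are illustrative, not from the paper):

```python
# Basic model (D-)definition of 'more truthlikeness' on finite sets.
def d_at_least_as_close(Y, X, T):
    # (i)  T ∩ Y ⊇ T ∩ X   and   (ii)  Y − T ⊆ X − T
    return (T & Y) >= (T & X) and (Y - T) <= (X - T)

def d_closer(Y, X, T):
    # Two-sided strict version: both subset relations proper.
    return (T & Y) > (T & X) and (Y - T) < (X - T)

# Toy example: Y gains a nomic possibility (2) and drops a mistake (6),
# so Y is, as it were, moving from X toward T.
T = frozenset({1, 2, 3, 4})
X = frozenset({1, 5, 6})
Y = frozenset({1, 2, 5})
assert d_closer(Y, X, T)
# Equivalent symmetric-difference formulation: T Δ Y ⊆ T Δ X
assert (T ^ Y) <= (T ^ X)
```

The last assertion checks the symmetric-difference equivalence stated above on the same toy sets.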

Figure 1 depicts “Y is D-closer to T than X” set theoretically. Note that Y is as it were moving from X to T.

Fig. 1 “Y is D-closer to T than X”: shaded areas empty, *-areas non-empty

3.3 Probabilistic Truthlikeness

Let p(.) represent any probability measure (function) on the set of (measurable) subsets of Mp. It is important to note that p(X) is not a prior in the sense of the probability that theory X is true, i.e., p(T = X). It is the absolute probability (measure) of X as a subset of the total outcome space Mp. Hence, p(.) is such that \( {\text{p}}(\emptyset )\, = \,0\, \le \,{\text{p}}\left( {\text{X}} \right)\, \le \,{\text{p}}\left( {\text{Mp}} \right)\, = \,1. \)

  • Conditional Probabilistic model (CP-)definition of ‘closer to the truth’

  • Y is CP-at least as close/similar to the (nomic) truth T as X (to the degree m, see below) iff:

  • (i-p) \( {\text{p}}\left( {\text{Y/T}} \right)\, \ge \,{\text{p}}\left( {\text{X/T}} \right) \quad \quad \left[ { \Leftrightarrow {\text{p}}\left( {{\text{T}} - {\text{Y}}/{\text{T}}} \right) \le {\text{ p}}\left( {{\text{T}} - {\text{X}}/{\text{T}}} \right)} \right] \)

  • (ii-p) \( {\text{p}}\left( {{\text{T}}/{\text{Y}}} \right) \ge {\text{p}}\left( {{\text{T}}/{\text{X}}} \right)\quad \left[ { \Leftrightarrow {\text{p}}\left( {{\text{Y}} - {\text{T}}/{\text{Y}}} \right) \le {\text{p}}\left( {{\text{X}} - {\text{T}}/{\text{X}}} \right)} \right] \)

  • Y is CP-closer to T than X iff in addition the two inequalities are strict, to be indicated by (i-p+) and (ii-p+), respectively.

The strong clauses can be interpreted in terms of random sampling: (i-p+) amounts to the claim that the probability of getting a member of Y by random sampling in T is larger than that of getting a member of X, and (ii-p+) to the claim that the probability of getting a nomic possibility by random sampling in Y is larger than by random sampling in X.

The following (elementary) equivalence is crucial for the present paper: (i-p) is equivalent to p(T/Y)/p(T/X) ≥ p(X)/p(Y). From this it follows that (i-p) and (ii-p) together amount to the claim that the L(ikelihood)-ratio p(T/Y)/p(T/X) has a minimum value or degree, viz. 1 or p(X)/p(Y), whichever of these numbers is the higher. In sum, ‘being CP-as close to’ amounts to the:

  • CP-condition

  • $$ {\text{p}}\left( {{\text{T}}/{\text{Y}}} \right)/{\text{p}}\left( {{\text{T}}/{\text{X}}} \right)\, \ge \,\max \left( {1,\,{\text{p}}\left( {\text{X}} \right)/{\text{p}}\left( {\text{Y}} \right)} \right)\, =_{\text{df}} \,{\text{m}}. $$

Note that m is essentially known as soon as p(.) has been fixed.
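The equivalence of the two-clause CP-definition and the single CP-condition can be illustrated concretely. The following is my own toy sketch, again using the uniform measure on a finite Mp (all sets are illustrative assumptions):

```python
from fractions import Fraction

# Toy finite outcome space with the uniform measure.
Mp = frozenset(range(10))

def p(A):
    return Fraction(len(A), len(Mp))

def p_cond(A, B):
    return Fraction(len(A & B), len(B))

def cp_at_least_as_close(Y, X, T):
    """(i-p) p(Y/T) >= p(X/T) and (ii-p) p(T/Y) >= p(T/X)."""
    return p_cond(Y, T) >= p_cond(X, T) and p_cond(T, Y) >= p_cond(T, X)

def cp_condition(Y, X, T):
    """Equivalent single condition: the L-ratio p(T/Y)/p(T/X) reaches
    m = max(1, p(X)/p(Y))."""
    m = max(Fraction(1), p(X) / p(Y))
    return p_cond(T, Y) / p_cond(T, X) >= m

T = frozenset({0, 1, 2, 3})
X = frozenset({0, 7, 8, 9})
Y = frozenset({0, 1, 8})
# Both formulations agree on this example (and in general).
assert cp_at_least_as_close(Y, X, T) == cp_condition(Y, X, T) == True
```

Here p(T/Y)/p(T/X) = (2/3)/(1/4) = 8/3, which exceeds m = max(1, (4/10)/(3/10)) = 4/3, so Y is CP-at least as close to T as X.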

3.4 Survey of Successive Concretizations and Generalizations

In order to show that the CP-definition is, at least in the formal sense, a concretization of the D-definition it is illuminating to schematically present some intermediate definitions and their relations. An arrow indicates ‘entails’, and ‘(e)sc’ indicates ‘is (extreme) special case of’. ‘N-closer to’ is short for ‘Numerically closer to’, and is only defined for finite Mp. ‘UP-closer to’ is short for ‘Unconditional Probabilistically closer to’.

$$ \begin{array}{llll}\hbox{D-closer to} &\quad\quad {\rm T}- {\rm Y} \subset {\rm T} -{\rm X} &\quad \& &\quad\quad {\rm Y} - {\rm T} \subset {\rm X} - {\rm T}\\&\quad\quad\qquad\Downarrow &\quad {\rm esc} &\quad\quad \qquad\Downarrow \\\hbox{N-closer to} &\quad\,\left|{\rm T}- {\rm Y}\right| < \left|{\rm T} -{\rm X}\right| &\quad \& &\quad\,\left|{\rm Y} -{\rm T}\right| < \left|{\rm X} - {\rm T}\right|\\&\quad\quad\qquad\Downarrow &\quad {\rm sc} &\quad \quad\qquad\Downarrow \\\hbox{UP-closer to} &\quad{\rm p}({\rm T}- {\rm Y}) < {\rm p}({\rm T} -{\rm X}) &\quad \& &\quad{\rm p}({\rm Y} -{\rm T}) < {\rm p}({\rm X} - {\rm T})\\&\quad\quad\qquad\Downarrow &\quad {\rm esc} &\quad \quad\qquad\Downarrow \hbox{ if p(X)}= \hbox{p(Y)}\\\hbox{CP-closer to} &\,{\rm p}({\rm T}- {\rm Y/T}) < {\rm p}({\rm T} -{\rm X/T}) &\quad \& &\,{\rm p}({\rm Y} -{\rm T/Y}) < {\rm p}({\rm X} - {\rm T/X})\\&\,\,\quad\qquad\Leftrightarrow &\quad &\,\,\quad \qquad\Leftrightarrow \\&\,\quad{\rm p}({\rm Y/T}) > {\rm p}({\rm X/T})& &\,\quad{\rm p}({\rm T/Y}) > {\rm p}({\rm T/X})\\\end{array}$$

It is interesting to see that in all but one case a higher clause entails its immediate lower successor. The right-hand clause of ‘UP-closer to’ only entails that of ‘CP-closer to’ in the special case that p(X) = p(Y), in which case it is easy to prove. Hence, although it is easy to show that ‘D-closer to’ is an extreme special case of ‘N-closer to’, assuming finite Mp, and both of ‘UP-closer to’, it is not easy to show that all three of them are extreme special cases of ‘CP-closer to’. All these claims already hold for the weak ‘at least as close’ versions.

3.5 ‘D-as-Close-to’ as (Extreme) Special Case

It appears that the right-hand clause of ‘CP-at least as close to’, that is (ii-p), can be derived from the conjunction of the two clauses of the next higher level.

Theorem

‘D-at least as close’ (entails ‘N-at least as close’, if Mp finite, which) entails ‘UP-at least as close’, which entails ‘CP-at least as close’.

As already suggested, the proof ‘from D to N’ and ‘from N to UP’ is rather trivial. The proofs ‘from UP to CP’ and directly ‘from N to CP’ are similar to the following direct proof ‘from D- to CP’:

  • (i) \( {\text{T}} \cap {\text{Y}} \supseteq {\text{T}} \cap {\text{X}} \) trivially entails (i-p) p(Y/T)  ≥ p(X/T)

For deriving (ii-p) from (i) and (ii) we first note some of its equivalents:

  • (ii-p) \( \begin{aligned}& {\text{p}}\left( {{\text{T}}/{\text{Y}}} \right)\, \ge \,{\text{p}}\left( {{\text{T}}/{\text{X}}} \right) \Leftrightarrow\\ &{\text{p}}\left( {{\text{T}} \cap {\text{Y}}} \right)/\left\{ {{\text{p}}\left( {{\text{T}} \cap {\text{Y}}} \right) + {\text{p}}\left( {{\text{Y}} - {\text{T}}} \right)} \right\} \ge {\text{p}}\left( {{\text{T}} \cap {\text{X}}} \right)/\left\{ {{\text{p}}\left( {{\text{T}} \cap {\text{X}}} \right) + {\text{p}}\left( {{\text{X}} - {\text{T}}} \right)} \right\} \Leftrightarrow \\&{\text{p}}\left( {{\text{Y}} - {\text{T}}} \right)/{\text{p}}\left( {{\text{T}} \cap {\text{Y}}} \right)\, \le \,{\text{p}}\left( {{\text{X}} - {\text{T}}} \right)/{\text{p}}\left( {{\text{T}} \cap {\text{X}}} \right) \end{aligned}\)

It is easy to check that the last version trivially follows from the combination of (i) \( \left( {{\text{T}} \cap {\text{Y}} \supseteq {\text{T}} \cap {\text{X}}} \right) \) and (ii) \( \left( {{\text{Y}} - {\text{T}}\, \subseteq \,{\text{X}} - {\text{T}}} \right). \)
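The theorem ‘from D to CP’ can also be checked by brute force on a small finite Mp. The following is my own sketch (uniform measure, theories restricted to those overlapping T so that all conditional probabilities are defined):

```python
from fractions import Fraction
from itertools import combinations

# Brute-force check that 'D-at least as close' entails 'CP-at least as close'
# on a small finite Mp with the uniform measure.
Mp = frozenset(range(5))

def p_cond(A, B):
    return Fraction(len(A & B), len(B)) if B else Fraction(0)

def subsets(S):
    s = list(S)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def d_le(Y, X, T):
    # (i) T ∩ Y ⊇ T ∩ X and (ii) Y − T ⊆ X − T
    return (T & Y) >= (T & X) and (Y - T) <= (X - T)

def cp_le(Y, X, T):
    # (i-p) p(Y/T) >= p(X/T) and (ii-p) p(T/Y) >= p(T/X)
    return p_cond(Y, T) >= p_cond(X, T) and p_cond(T, Y) >= p_cond(T, X)

T = frozenset({0, 1, 2})
# Restrict to non-empty theories overlapping T, so p(T/X) > 0 throughout.
theories = [A for A in subsets(Mp) if A and A & T]
assert all(cp_le(Y, X, T) for X in theories for Y in theories if d_le(Y, X, T))
```

The exhaustive check over all qualifying theory pairs finds no counterexample, in line with the theorem.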

One important problematic aspect of the definition of ‘D-closer to’ is that it holds only very rarely, because most theories are incomparable according to that definition. The present theorem, according to which ‘CP-closer to’ is weaker than ‘D-closer to’, naturally raises the question to what extent the possibilities for more truthlikeness, and hence for truth approximation, have increased. It is clear that the number of possibilities has increased enormously in any informal sense. In fact, theories X and Y are only incomparable according to the CP-condition when

$$ \begin{gathered} {\text{p}}\left( {{\text{T}}/{\text{X}}} \right)\, < \,{\text{p}}\left( {{\text{T}}/{\text{Y}}} \right)\, < \,\left[ {{\text{p}}\left( {\text{X}} \right)/{\text{p}}\left( {\text{Y}} \right)} \right]{\text{ p}}\left( {{\text{T}}/{\text{X}}} \right)\quad \quad \quad {\text{if p}}\left( {\text{X}} \right)\, \ge \,{\text{p}}\left( {\text{Y}} \right) \hfill \\ {\text{or}}\quad {\text{p}}\left( {{\text{T}}/{\text{Y}}} \right)\, < \,{\text{p}}\left( {{\text{T}}/{\text{X}}} \right)\, < \,\left[ {{\text{p}}\left( {\text{Y}} \right)/{\text{p}}\left( {\text{X}} \right)} \right]{\text{ p}}\left( {{\text{T}}/{\text{Y}}} \right)\quad \;{\text{if p}}\left( {\text{X}} \right)\, < \,{\text{p}}\left( {\text{Y}} \right) \hfill \\ \end{gathered} $$

However, this does not much affect the extent to which it now covers real cases of ‘closer to’, i.e., cases going beyond the toy level of, for example, theories about simple electric circuits (Kuipers 2000, Ch. 7). The main problem of the ‘basic’ approaches, deductive or probabilistic, is that in real cases theories X and Y will have no overlap with T at all, due to remaining idealizations in X and Y. Hence, p(T/Y) and p(T/X) will be 0, in which case the CP-condition is not applicable, while the original definition straightforwardly leads to the verdict ‘equally close to the truth’. To meet this ‘problem of non-overlap’ we have to refine the probabilistic approach by taking structurelikeness into account, as has been done for the deductive case (Kuipers 2000, Ch. 10).

4 The Comparative HP-Method is Functional for Empirical Progress and Truth Approximation

4.1 Empirical Evidence

In order to deal with the way in which the HP-method might be helpful for empirical progress and truth approximation we need a representation of empirical findings. Let Rn represent the set of realized possibilities at time n and Sn the strongest law induced from Rn, i.e., the resulting conjunction of all induced laws. In the former we can of course make descriptive mistakes, in the latter inductive mistakes. However, if we assume, by way of strong idealizations, that we have not made such mistakes, we may conclude that Rn is a subset of T, for we cannot realize nomic impossibilities, and that Sn is a superset of T, for it must then be true as a nomic hypothesis, that is, true in the weak sense. Moreover, while Rn can only grow, not shrink, in the course of time, Sn can only shrink, i.e., become stronger. In sum, we have the following ‘cumulative evidence relations’:

$$ {\text{R}}_{0} = \emptyset \subseteq \ldots \subseteq {\text{R}}_{n} \subseteq {\text{R}}_{n+1} \subseteq \ldots \subseteq {\text{T}}, \quad {\hbox{provided no descriptive mistakes}} $$
$$ {\text{S}}_{0} = {\text{Mp}} \supseteq \ldots \supseteq {\text{S}}_{n} \supseteq {\text{S}}_{n+1} \supseteq \ldots \supseteq {\text{T}}, \quad {\hbox{provided no inductive mistakes}} $$

This representation of evidence is useful for both our deductive and probabilistic purposes.

4.2 Deductive Success Theorem

We now turn to the introduction of the crucial theorem for the claim that the HD-method is functional for empirical progress and truth approximation. First we need an appropriate definition of (more) successfulness:

  • Definition of ‘deductive more successfulness’:

  • Y is D-at least as successful as X (relative to R/S) iff:

  • (j) \( {\text{R}} \cap {\text{Y}}\, \supseteq \,{\text{R}} \cap {\text{X}}\quad \left[ { \Leftrightarrow {\text{R}} - {\text{Y}} \subseteq {\text{ R}} - {\text{X}}} \right] \)

  • (jj) \( {\text{Y}} - {\text{S}}\, \subseteq \,{\text{X}} - {\text{S}} \)

  • Y is D-more successful than X (relative to R/S) iff both subset relations are proper.

This result may well be seen as the ideal result of applying the HD-method, more specifically the method of comparative HD-evaluation, to two theories. Let H be a test implication of Y, hence a superset of Y. If we get a counterexample of H, that is, a member of R that is excluded by H, hence by Y, condition (j) requires that it is also a counterexample of X. In other words, according to (j), all established counterexamples of Y should also be counterexamples of X. On the other hand, if H* is a test implication of X, hence a superset of X, that turns out not to get counterexamples despite all our attempts, and hence is inductively concluded to be true, that is, to be a superset of S, also called a general success of X, then (jj) requires that H* is also a superset of Y, hence a test implication of Y. In other words, according to (jj), all general successes of X should also be general successes of Y. This paraphrase of (jj) typically reflects the consequence version of one of the crucial notions. Of course, Y is more successful than X if it moreover lacks at least one counterexample of X and has at least one extra general success.
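The definition of ‘D-more successful’ relative to the evidence R/S is again a purely set-theoretic check; here is my own toy sketch (the sets are illustrative assumptions):

```python
# 'D-at least as successful' relative to evidence R (realized possibilities)
# and S (strongest induced law), with R ⊆ T ⊆ S assumed.
def d_at_least_as_successful(Y, X, R, S):
    # (j)  R ∩ Y ⊇ R ∩ X : every counterexample of Y is a counterexample of X
    # (jj) Y − S ⊆ X − S : every general success of X is a general success of Y
    return (R & Y) >= (R & X) and (Y - S) <= (X - S)

def d_more_successful(Y, X, R, S):
    # Both subset relations proper.
    return (R & Y) > (R & X) and (Y - S) < (X - S)

R = frozenset({1, 2})              # realized, hence nomic, possibilities
S = frozenset({1, 2, 3, 4, 5})     # strongest induced law
X = frozenset({1, 6, 7})           # excludes realized possibility 2
Y = frozenset({1, 2, 6})           # drops that counterexample, and 7 besides
assert d_more_successful(Y, X, R, S)
```

In this example Y lacks the counterexample 2 of X and drops the extra mistaken possibility 7 outside S, so both clauses hold properly.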

The rest of the comparative HD-method story in terms of HD-results, i.e., general successes and counterexamples, now goes as follows. Suppose that at time n, Y happens to be more successful than X, relative to Rn/Sn. This result need not indicate that Y is really better than X, for it may result from a choice of experiments that happens to favor Y. Hence, before we draw far-reaching conclusions we have to consider and test the Comparative Success Hypothesis (CSH), according to which Y remains more successful than X. CSH is a genuine empirical hypothesis, albeit of a comparative nature, which can also be tested by the HD-method. One route is trying to derive test implications from X that do not follow from Y. If they turn out to be (general) successes of X, they form a kind of counterexample to CSH, more specifically, to the claim that all general successes of X are general successes of Y. If they turn out to have counterexamples (on the Mp-level), they confirm it, even in a strong sense as far as it concerns examples of Y. Similarly, one may derive test implications from Y that do not follow from X. If they turn out to be (general) successes, they confirm CSH in the strong sense; if they are falsified and the relevant counterexamples happen to be allowed by X, these counterexamples also form a kind of counterexample to CSH, viz. to the claim that all counterexamples of Y should be counterexamples of X.

As in the context of inductive generalization, one cannot continue testing CSH indefinitely. After ‘sufficiently many and varied’ tests, supporting the claim that Y seems so far persistently more successful, one will inductively jump to the conclusion that CSH is true, and hence that we may eliminate X in favor of Y, for the time being. I have called this the Instrumentalist Rule of Success (IRS), rather than a falsificationist rule of success, for it continues to take falsified theories seriously. IRS typically amounts to the conclusion that we have made ‘empirical progress’ in the transition from theory X to theory Y; hence the comparative HD-method is functional for achieving empirical progress.

The first question to answer in the sequel is how to extend this story about empirical progress to the comparative HP-method, more specifically, to the LC-method. However, first we will introduce the second question.

The next claim is that IRS is also functional for truth approximation. This is based on the deductive success theorem (DST), according to which ‘D-at least as close to’ entails ‘D-at least as successful’, assuming \( \hbox{Rn}\, \subseteq \,\hbox{T}\, \subseteq \,\hbox{Sn} \) (i.e., provided no mistakes have been made). The proof is rather easy, as is immediately clear from confronting Fig. 1 with Fig. 2. Under certain conditions we may even conclude that, in the long run, ‘D-closer to’ leads to ‘D-more successful’.

Fig. 2 ‘Y is at least as successful as X relative to R/S’: shaded areas empty

The functionality for truth approximation can now be argued as follows. If, as assumed by IRS, Y seems so far persistently more successful than X, the following holds:

  • first, it is possible that Y is D-closer to the truth than X, a possibility which, when conceived as a hypothesis, would, according to DST, explain the greater success;

  • second, it is impossible that X is D-closer to the truth than Y, for otherwise, as DST teaches, Y could not be more successful;

  • third, it is also possible that neither is the case, in which case, however, another specific explanation has to be given for Y’s seeming so far persistently more successful than X.

Hence, since IRS is based on the HD-method, more specifically, the comparative HD-method, the second question to answer in the sequel is how to extend this story about truth approximation to the comparative HP-method, more specifically, the LC-method.

4.3 Probabilistic Success Theorems

Before we can formulate something like a probabilistic success theorem, we need to specify the probabilistic versions of the cumulative evidence relations and the notion of being ‘probabilistically more successful’. Recall the following relations:

$$ {\text{R}}_{0} = \emptyset \subseteq \ldots \subseteq {\text{R}}_{n} \subseteq {\text{R}}_{n+1} \subseteq \ldots \subseteq {\text{T}}, \quad {\hbox{provided no descriptive mistakes}} $$
$$ {\text{S}}_{0} = {\text{Mp}} \supseteq \ldots \supseteq {\text{S}}_{n} \supseteq {\text{S}}_{n+1} \supseteq \ldots \supseteq {\text{T}}, \quad {\hbox{provided no inductive mistakes}} $$

hence, assuming no mistakes, we have in probabilistic terms, for any X with p(X) > 0:

$$ 0 = {\text{p}}\left( {{\text{R}}_{0}/{\text{X}}} \right) \le \ldots \le {\text{p}}\left( {{\text{R}}_{n}/{\text{X}}} \right) \le {\text{p}}\left( {{\text{R}}_{n+1}/{\text{X}}} \right) \le \ldots \le {\text{p}}\left( {{\text{T}}/{\text{X}}} \right) $$
$$ 1 = {\text{p}}\left( {{\text{S}}_{0}/{\text{X}}} \right) \ge \ldots \ge {\text{p}}\left( {{\text{S}}_{n}/{\text{X}}} \right) \ge {\text{p}}\left( {{\text{S}}_{n+1}/{\text{X}}} \right) \ge \ldots \ge {\text{p}}\left( {{\text{T}}/{\text{X}}} \right) $$

Assuming in addition that in the end all nomic possibilities will be realized, we may even sharpen these ‘probabilistic cumulative evidence relations’ to:

p(Rn/X) is a monotone non-decreasing function approaching p(T/X)

p(Sn/X) is a monotone non-increasing function approaching p(T/X)
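These probabilistic cumulative evidence relations can be illustrated with a small sketch; the following is my own toy example (uniform measure, illustrative nested sets), showing that the two likelihood sequences are indeed monotone and both approach p(T/X):

```python
from fractions import Fraction

# Toy finite Mp, nomic truth T, and a theory X, under the uniform measure.
Mp = frozenset(range(8))
T = frozenset({0, 1, 2, 3})
X = frozenset({0, 1, 4, 5})

def p_cond(A, B):
    return Fraction(len(A & B), len(B))

# A growing evidence sequence R0 ⊆ … ⊆ T and a shrinking S0 ⊇ … ⊇ T.
Rs = [frozenset(), frozenset({0}), frozenset({0, 2}), T]
Ss = [Mp, frozenset(range(7)), frozenset(range(6)), frozenset(range(5)), T]

r_likelihoods = [p_cond(R, X) for R in Rs]
s_likelihoods = [p_cond(S, X) for S in Ss]
assert r_likelihoods == sorted(r_likelihoods)                # non-decreasing
assert s_likelihoods == sorted(s_likelihoods, reverse=True)  # non-increasing
assert r_likelihoods[-1] == s_likelihoods[-1] == p_cond(T, X)
```

Both sequences converge on p(T/X) = 1/2 in this example, as the relations require.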

Now we are able to define the probabilistic analog of ‘deductive more successfulness’:

  • Definition of ‘probabilistic more successfulness’:

  • Y is P-more successful than X relative to Rn/Sn and to degree r ≥ 1 iff

  • p(Rn/Y)/p(Rn/X) and p(Sn/Y)/p(Sn/X) both exceed r.

Note that both ratios are likelihood ratios, which shows that the particular kind of comparative HP-method we are supposing is indeed the Likelihood Comparison (LC-) method. The role of the ‘degree r’ will become clear; in particular, why it may be preferable to require a degree larger than 1, even though 1 might prima facie seem high enough.
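To make the definition concrete, the LC-comparison can be sketched on a toy finite set of conceptual possibilities, using the simple counting measure p(A) = |A|/|Mp| introduced in Sect. 4.3.1 below. All sets (Mp, T, X, Y, Rn, Sn) and the degree r here are illustrative assumptions, not taken from the text:

```python
# Toy LC-comparison on a finite set of conceptual possibilities Mp.
Mp = set(range(10))          # conceptual possibilities
T  = {0, 1, 2, 3}            # the (unknown) nomic truth
X  = {0, 1, 4, 5, 6, 7}      # theory X
Y  = {0, 1, 2, 4}            # theory Y
Rn = {0, 1}                  # realized possibilities so far (Rn a subset of T)
Sn = {0, 1, 2, 3, 4, 8}      # strongest induced law so far (Sn a superset of T)

def p(A, given=None):
    """Numerical probability: p(A) = |A|/|Mp|, p(A/given) = |A & given|/|given|."""
    if given is None:
        return len(A) / len(Mp)
    return len(A & given) / len(given)

# Y is P-more successful than X relative to Rn/Sn and degree r iff
# both likelihood ratios exceed r.
r = 1.0
ratio_R = p(Rn, Y) / p(Rn, X)   # here (2/4)/(2/6) = 1.5
ratio_S = p(Sn, Y) / p(Sn, X)   # here (4/4)/(3/6) = 2.0
print(ratio_R, ratio_S, ratio_R > r and ratio_S > r)
```

Both ratios exceed r = 1, so in this toy situation Y counts as P-more successful than X relative to Rn/Sn.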

4.3.1 Expected Success Theorem

Before we present the more general and more relevant ‘threshold success theorem’, we would like to present a success theorem in terms of (mathematical) expectations, which is restricted to the ‘numerical probability function’ on finite Mp, according to which \( p(\mathrm{A}) = |\mathrm{A}|/|\mathrm{M}_p| \). It is illuminating because it expresses very well what we might intuitively expect.

Recall that the expectation value \( \mathrm{E}\,f(\underline{u}) \) of a real-valued function f(u) of a random variable \( \underline{u} \) is defined as \( \sum f(u)\,p(\underline{u} = u) \), with summation over all possible values of u. Suppose that the set R, at a certain time n, but, more important now, with size r, results from random selection out of the set of all subsets of T of size r, denoted by Tr. Hence, R is a random variable, with probability \( q(\underline{\mathrm{R}} = \mathrm{R}) = 1/|\mathrm{Tr}| \). Similarly, suppose that S, with size s, results from random selection out of the set of all supersets of T of size s, denoted by Ts. Hence, S is a random variable, with probability \( q(\underline{\mathrm{S}} = \mathrm{S}) = 1/|\mathrm{Ts}| \). Recall also that m was defined as max(1, p(X)/p(Y)).

Expected Success Theorem

If Mp is finite and ‘CP-at least as close’ is based on the numerical p, then “Y is CP-closer to T than X to degree m” implies \( \mathrm{E}\,p(\underline{\mathrm{R}}/\mathrm{Y}) > \mathrm{E}\,p(\underline{\mathrm{R}}/\mathrm{X}) \) and \( \mathrm{E}\,p(\underline{\mathrm{S}}/\mathrm{Y}) > \mathrm{E}\,p(\underline{\mathrm{S}}/\mathrm{X}) \), in both cases to degree m.

Hence, ‘CP-closer to’ entails that the ratios of the expected likelihoods are also favorable, or, informally, we may expect ‘P-more successfulness’. Note, however, that we do not claim that the expected likelihood ratios, i.e., the expectations of the ratios rather than the ratios of the expectations, are favorable. We did not succeed in proving this stronger and still more appealing claim.

  • Proof of the first (R-) part (that of the S-part is similar).

  • Let Tr indicate the set of subsets of T of size r.

  • $$ \mathrm{E}\,p(\underline{\mathrm{R}}/\mathrm{Y}) = \sum_{\mathrm{R} \in \mathrm{Tr}} p(\mathrm{R}/\mathrm{Y}) \cdot q(\underline{\mathrm{R}} = \mathrm{R}) $$
  • Given \( p(\mathrm{R}/\mathrm{Y}) = |\mathrm{R} \cap \mathrm{Y}|/|\mathrm{Y}| \) and \( q(\underline{\mathrm{R}} = \mathrm{R}) = 1/|\mathrm{Tr}| \), and denoting ‘truth value’ by tv (being 1 or 0), we get that E p(R/Y) is equal to:

  • $$ (1/|\mathrm{Y}|)(1/|\mathrm{Tr}|) \sum_{\mathrm{R} \in \mathrm{Tr}} |\mathrm{R} \cap \mathrm{Y}| = (1/|\mathrm{Y}|)(1/|\mathrm{Tr}|) \sum_{\mathrm{R} \in \mathrm{Tr}} \sum_{y \in \mathrm{Y}} \mathrm{tv}(\text{``}y \in \mathrm{R}\text{''}) = (1/|\mathrm{Y}|)(1/|\mathrm{Tr}|) \sum_{y \in \mathrm{Y}} \sum_{\mathrm{R} \in \mathrm{Tr}} \mathrm{tv}(\text{``}y \in \mathrm{R}\text{''}) $$
  • Note now that the double summation is proportional to \( |\mathrm{T} \cap \mathrm{Y}| \), for every y in T is a member of equally many sets R, whereas every y outside T is a member of none; hence the total expression is proportional to \( |\mathrm{T} \cap \mathrm{Y}|/|\mathrm{Y}| = p(\mathrm{T}/\mathrm{Y}) \). Combining this with the same proportionality of E p(R/X), the claim of the theorem directly follows from the defining condition of ‘CP-at least as close to’, i.e., the CP-condition. Q.e.d.

Of course, the expected success theorem not only has a limited and artificial (random sampling) character, it also has no practical ‘detachment’ value for a probabilistic analog of the (instrumentalist) Rule of Success.

4.3.2 Threshold Success Theorem

As mentioned in the introduction, finding a sensible success theorem with practical value was a difficult task. But it turns out that ‘CP-closer to’ entails that both ratios will in the long run irreversibly pass a certain threshold; however, we never know for certain whether a particular passing is irreversible.

Threshold Success Theorem

If Y is CP-closer to T than X (to degree m), Y will in the long run become irreversibly P-more successful than X to degree m.

Proof

Let ‘\( >_{lr} \)’ and ‘\( <_{lr} \)’ indicate ‘in the long run larger than’ and ‘in the long run smaller than’. Recall the CP-condition, capturing ‘CP-closer to’: p(T/Y) > m·p(T/X), where m = max(1, p(X)/p(Y)). According to the probabilistic cumulative evidence relations, p(Rn/X) will not exceed p(T/X), whereas p(Rn/Y) will in the long run approach p(T/Y) and hence, in view of the CP-condition, pass the threshold m·p(T/X), which entails that the ratio p(Rn/Y)/p(Rn/X) will pass m. In sum:

p(Rn/X) ≤ p(T/X) and p(Rn/Y) \( >_{lr} \) m·p(T/X), hence p(Rn/Y)/p(Rn/X) \( >_{lr} \) m

Similarly, according to the probabilistic cumulative evidence relations, p(Sn/Y) will not come below p(T/Y), whereas p(Sn/X) will in the long run approach p(T/X) and hence, in view of the CP-condition, pass the threshold p(T/Y)/m, which entails that the ratio p(Sn/Y)/p(Sn/X) will pass m. In sum:

p(Sn/Y) ≥ p(T/Y) and p(Sn/X) \( <_{lr} \) p(T/Y)/m, hence p(Sn/Y)/p(Sn/X) \( >_{lr} \) m

Figure 3 illustrates the theorem by way of possible curves. Note that one of the two ratios actually passing m need not be irreversible; hence we run the risk of concluding too early that the passing of the threshold was irreversible. Note also that if Mp is denumerably infinite, the irreversible passing of m need not be reached in practice, the more so when Mp is non-denumerable. Assuming more specific conditions, it will be possible to make ‘in the long run’ more precise.

Fig. 3

m ≥ 1. The top graph shows possible curves of p(Rn/X) and p(Rn/Y), starting unfavorably for Y, with a ratio smaller than 1, until p(Rn/Y) later passes the hidden threshold m·p(T/X). The bottom graph shows the analogous curves for p(Sn/X) and p(Sn/Y), now with threshold p(T/Y)/m for p(Sn/X)
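The threshold behaviour pictured in Fig. 3 can be illustrated by a small simulation. The sets Mp, T, X, Y and the random realization order below are illustrative assumptions, chosen so that the CP-condition holds; the R-side likelihood ratio is then guaranteed to exceed m once all of T has been realized:

```python
import random

random.seed(0)
Mp = list(range(40))                      # illustrative universe
T  = set(range(12))                       # nomic truth
X  = set(range(8)) | set(range(20, 32))   # |X| = 20
Y  = set(range(10)) | {20, 21}            # |Y| = 12

def p(A, B):
    """Numerical conditional probability p(A/B) = |A & B| / |B|."""
    return len(A & B) / len(B)

m = max(1.0, len(X) / len(Y))             # here m = 20/12
assert p(T, Y) > m * p(T, X)              # CP-condition holds in this example

# Evidence accumulation: Rn grows by realizing members of T one by one
# (no descriptive mistakes); we track when the R-side ratio first exceeds m.
Rn = set()
passed = None
for n, z in enumerate(random.sample(sorted(T), len(T)), 1):
    Rn.add(z)
    if passed is None and p(Rn, X) > 0 and p(Rn, Y) / p(Rn, X) > m:
        passed = n
print("R-side ratio first exceeds m after", passed, "realized possibilities")
```

As the theorem warns, an early passing observed in such a run need not be irreversible; only the eventual passing, once Rn approaches T, is guaranteed by the CP-condition.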

4.4 The LC-Method is Functional for Empirical Progress and Truth Approximation

Now it is not difficult to extend the functionality story to the probabilistic case. We have defined in general what it means that Y is P-more successful than X (relative to Rn/Sn) to degree r, but in view of the threshold success theorem it is now plausible to start with the assumption that this happens to be the case to degree m. Again this may be a matter of lucky choices of experiments in favor of Y. So the probabilistic version of the Comparative Success Hypothesis (PCSH) hypothesizes in this situation that this is from now on irreversibly so. The adapted version of IRS becomes the equally instrumentalist rule:

Probabilistic Instrumentalist Rule of Success (PIRS): When PCSH is ‘sufficiently confirmed’, i.e., Y’s being P-more successful than X to degree m seems irreversible, eliminate X in favor of Y, at least for the time being.

Again we may claim that this is an ideal form of ‘empirical progress’, now of a probabilistic nature. Since the crucial notion of ‘P-more successfulness’ is couched in likelihood ratios (relative to cumulated evidence), we may conclude that the LC-method is functional for achieving (probabilistic) empirical progress, viz. by applying PIRS whenever this seems justified.

As suggested, the expected success theorem is not useful for a functionality claim. But the Threshold Success Theorem (TST) certainly is. Let Y seem so far irreversibly P-more successful than X to the degree m, i.e., the condition assumed in PIRS. Then there are the following possibilities:

  • Y may be CP-closer to T than X, which would explain its so far seeming irreversibly P-more successful to the degree m from some time on, for in that case it may in fact have become irreversibly P-more successful,

  • it can’t be the reverse, i.e., that X is CP-closer to T than Y and that Y has nevertheless become irreversibly P-more successful to the degree m,

  • if neither is CP-closer to the other, a specific explanation has to be given for the apparent success dominance from some time on, i.e., for Y’s so far seeming irreversibly P-more successful than X to the degree m.

Note that there is a difference with the corresponding second clause of the deductive case. In the latter case the simple fact that Y is more successful than X at some time already prevents X from being D-closer to the truth than Y. In the present case Y may, despite all efforts to test PCSH, wrongly seem irreversibly P-more successful than X to the degree m. As a matter of fact, the tables may still turn irreversibly in favor of X, in which case X’s being CP-closer to the truth than Y would be the explanation.

However this may be, the condition of PIRS still provides good reasons to conclude, at least for the time being, that Y is CP-closer to the truth than X. Since PIRS was based on the LC-method, we may conclude that the LC-method is functional for probabilistic truth approximation.

The quest for real cases of ‘CP-closer to’ now extends of course to the quest for historical cases of ‘seemingly irreversibly P-more successful’, even of three kinds: cases in which the tables have never turned irreversibly in favor of the other, cases in which this has happened, and cases where the tables have become mixed.

4.5 Inference to the Best Theory

Very briefly I will indicate the probabilistic extension of the claim in (Kuipers 2000), rephrased in terms of abduction in (Kuipers 2004), that the (deductive) success theorem enables the explication of a revised version of so-called Inference to the Best Explanation (IBE).

In the deductive case, the good reasons for extending the instrumentalist conclusion of IRS to the conclusion, for the time being, that Y is D-closer to the truth than X, suggest the following revision of IBE into ‘Inference to the Best Theory’ (IBT). Whereas the former prescribes concluding that the best, apparently non-falsified, explanation is true, the latter prescribes concluding that the best theory, i.e., the persistently D-most successful theory, falsified or not, is the D-closest to the truth. Under even more stringent conditions, including not being falsified, this may justify an inductive jump to the conclusion that the best theory represents the truth, i.e., the strongest true theory (Kuipers 2004).

In the probabilistic case we have similar good reasons for extending the instrumentalist conclusion of PIRS to the conclusion, for the time being, that Y is CP-closer to the truth than X, which now suggests an ‘Inference to the P-Best Theory’ (IPBT). It prescribes concluding that the P-best theory, that is, the persistently P-most successful theory to all relevant degrees, falsified or not, is the CP-closest to the truth. Again, under more stringent conditions this may justify an inductive jump to the conclusion that the P-best theory represents the truth. In the present case the conditions are of course that we have good reasons to assume that the P-best theory, say Y, is such that p(Rn/Y) and p(Sn/Y) equal the corresponding values for T: p(Rn/T) and p(Sn/T). Whereas p(Sn/T) has to be 1, and hence p(Sn/Y) as well, it does not seem possible to formulate a specific value for p(Rn/Y), except that it should be greater than 0, for otherwise Y would apparently have been falsified.

5 Concluding Remarks: Possible Extensions/Concretizations

The above probabilistic extension of the deductive story according to which the HD-method is functional for empirical progress and truth approximation is what I wanted to achieve. However, it has a number of abstractions, idealizations, and restrictions asking for specification, concretization and extension.

The main abstraction is that we have left the presupposed probability function p(.) further unspecified. Of course, p(.) is primarily supposed to be used epistemically, despite occasional use of ‘as-if-objective’ talk, e.g., interpreting clauses in terms of ‘random sampling’. Moreover, the likelihoods, in particular, may well be of an objective nature. However this may be, all claims are relative to the chosen p(.). This raises at least two questions. The first question is whether there are good reasons to restrict probabilistic claims of confirmation, empirical progress and truth approximation to cases in which they hold for all conceivable p-functions, or for all members of a certain subclass, e.g., as distinguished in (Kuipers 2001, 2006): all Popperian (non-inductive), Carnapian (inductive likelihoods), Bayesian (inductive priors), or Hintikkian (doubly inductive) p-functions. The suggested restriction to a subclass already presupposes an answer to the second question, viz. whether there are good reasons to restrict p-specific claims to certain kinds of p-functions.

Turning to idealizations, as in the deductive case (Kuipers 2000, Ch. 9), a stratified version in terms of the (theory-relative) distinction between observation and theoretical terms would be attractive. However, a simple probabilistic version of the ‘projection theorem’ seems unlikely, which is, by the way, also absent in the quantitative similarity approach of Niiniluoto (1987).

As we have already noted, as in the deductive case (Kuipers 2000, Ch. 10), a refined probabilistic version, taking structurelikeness, or distances between structures, into account, would be most welcome in order to get realistic cases. In terms of distances, it then seems rather plausible to define ‘closer to’ in terms of:

$$ d(\mathrm{T}\backslash \mathrm{X}) + d(\mathrm{X}\backslash \mathrm{T}) = \sum_{z \in \mathrm{T}-\mathrm{X}} p(z/\mathrm{T})\, d(z,\mathrm{X}) + \sum_{z \in \mathrm{X}-\mathrm{T}} p(z/\mathrm{X})\, d(z,\mathrm{T}) $$

However plausible it may look now, this measure is not included among the several possible definitions presented in the ‘quantitative’ chapter of (Kuipers 2000, Ch. 2).
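The two-sided sum above can be sketched numerically. The universe, the sets T and X, the uniform conditional probability p(z/·), and the integer-valued distance d are all illustrative assumptions:

```python
# Minimal sketch of the distance-based closeness sum; all ingredients
# (sets, p, d) are illustrative assumptions.
T = {0, 1, 2, 3}        # the truth
X = {2, 3, 6, 9}        # a theory

def d(z, A):
    """Distance from possibility z to the set A (here: on integers)."""
    return min(abs(z - a) for a in A)

def p_given(z, A):
    """Uniform p(z/A) for z in A, matching the numerical p."""
    return 1 / len(A)

closeness = (sum(p_given(z, T) * d(z, X) for z in T - X)
             + sum(p_given(z, X) * d(z, T) for z in X - T))
print(closeness)   # smaller values mean X is closer to T
```

Elements shared by T and X contribute nothing, so the sum only penalizes the probability-weighted distances of the mutual deviations, which is what makes it a candidate for a refined, realistic notion of ‘closer to’.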

Regarding restrictions, the laws and theories assumed above were of a deterministic nature, as was the truth. Even in the context of the (implicit) assumption that the truth is deterministic, however, the laws and theories taken into account frequently have a statistical or probabilistic nature. It may be possible to reinterpret such use of statistical laws and probabilistic theories as laws and theories that probabilistically deviate from the (deterministic) truth.

Perhaps the most intriguing question is how to extend the analysis to non-deterministic theories, ‘the truth’ included. One might think of theories and the truth itself as probability measures on (subsets) of Mp, but this is left for future research.

A final concern is the limitations of the LC-method in general. Likelihood comparison has come under attack. Forster (2006) has recently shown by way of detailed counterexamples that there are exceptions to the rule that all the empirical information relevant to the comparison of theories is contained in the likelihoods. Although likelihoods can measure how well a theory accommodates the data, they do not contain the information about how well it can predict one part of the data from another. The counterexamples all deal with curve fitting of one kind or another. The basic approaches, deductive and probabilistic, do not yet cover such ‘theories’, but refined versions, if based on real-valued distances between structures, certainly do. Hence, a final question for further research is how problematic such counterexamples are for refined probabilistic approaches to empirical progress and truth approximation in the line of the present article.