1 Introduction

Most scientific realists claim that the success of a scientific theory is indicative of its truth, that we have good reason to believe in things like evolution and the big bang given how successful the corresponding theories are at explaining and/or predicting certain phenomena (such as the fossil record and cosmic microwave background radiation). However, it is well known that many scientific successes are born of theories which are false, and sometimes radically false. How can the realist maintain her realism in the face of this fact?

The common answer amongst philosophers of science today is that significant scientific successes usually are born of truth, even if that truth is often buried within a wider theoretical framework that is largely misguided. So-called ‘selective’ realists attempt to explain various successes in the history of science by reconstructing the relevant derivations so as to show that the success of a given theory depends only on those things it actually got right (be those things properties, structure, or whatever).Footnote 1 However, clearly even these sophisticated selective realists will be in trouble if there are episodes in the history of science where significant theoretical successes were quite obviously dependent on things the theory got wrong. In such cases one would have to accept that success was not born of truth, and if several such cases could be uncovered it would suggest that one is ill-advised to believe our current scientific theories (or even their ‘best’ parts) on the basis of their success. The objective of the current paper is to introduce the old quantum theory (OQT) of Bohr and Sommerfeld as one such relevant historical episode.

At the end of the nineteenth century and the beginning of the twentieth century one phenomenon was proving particularly difficult to explain. Every element emits and absorbs light at only certain frequencies, which we call that element’s characteristic spectrum. These were first observed in the mid-nineteenth century, but proved largely inexplicable until Niels Bohr provided an explanation in 1913.Footnote 2 His theory explained the discrete set of spectral lines associated with any particular element in terms of a discrete set of possible orbits an electron can inhabit within the atoms of that element. He had tremendous success in explaining the spectral lines of hydrogen and ionised helium—so much success, in fact, that Einstein was moved to remark, ‘This is a tremendous result. The theory of Bohr must then be right.’ (cited in Pais 1991, p. 154). As Pais writes,

Up to that time no one had ever produced anything like it in the realm of spectroscopy, agreement between theory and experiment to five significant figures. (1991, p. 149)

The realist would surely like to be able to explain how Bohr could be so successful with a theory we now know to be so fundamentally wrong.Footnote 3

But despite Bohr’s success there were elements of the spectral lines of hydrogen which remained unexplained. In particular it remained unexplained why, when one looks closely at the spectral lines with a high resolution spectroscope, some of them are actually a number of separate lines grouped extremely closely together. This is known as the “fine structure” of the spectral lines. But in 1916 Arnold Sommerfeld—de-idealizing and expanding on Bohr’s theory—derived a formula which predicted the fine structure of the hydrogen spectrum with stunning accuracy. In fact, the formula he derived is the very same formula used today for the fine structure energy levels of hydrogen, although today it stands on completely different theoretical foundations (including the Schrödinger equation and electron spin). Kronig calls this ‘perhaps the most remarkable numerical coincidence in the history of physics’ (cited in Kragh 1985, 84). Kragh puts it as follows:

By some sort of historical magic, Sommerfeld managed in 1916 to get the correct formula from what turned out to be an utterly inadequate model...[This] illustrates the well-known fact that incorrect physical theories may well lead to correct formulae and predictions. (Ibid.)

Brown et al. (1995) write that ‘Sommerfeld’s answer has to be considered a fluke’ (p.92). The flavour of this literature is clearly anti-realist, and most realists would be keen to respond by explaining the success in another way. The success isn’t a matter of luck, they would want to say—it is born of truth buried within the theory. But can this be demonstrated?

The first job of this paper will be to show that these cases really should worry the realist. The modern, sophisticated realist does not profess to be moved by the ‘mere’ explanatory success of theories, but demands ‘novel predictive success’.Footnote 4 This will be examined in Section 2, and the candidacy of Bohr’s success and Sommerfeld’s success for the realism debate will be made clearer in Sections 2.1 and 2.2 respectively. Then in Section 3 I will go into the details of Bohr’s derivation of the Rydberg constant and the spectral lines of ionised helium, and ask whether the reason he was so successful was because of the truth content hidden within his theory (judging by current theory). Drawing on Norton (2000) it will be shown that the selective realist strategy does have some chance of succeeding here. But then in Section 4 I will argue that this strategy almost certainly can’t work for Sommerfeld’s derivation of the fine structure formula for hydrogen. Finally in Section 5, the conclusion, I sum up what this means for realism.

2 Two successes of old quantum theory

The question of this section will be whether Bohr’s success vis-à-vis the spectral lines of hydrogen and ionised helium, and Sommerfeld’s success vis-à-vis the fine structure of hydrogen, were significant enough to make the realist take notice. They are usually referred to as explanatory successes, whereas the modern realist usually demands predictive success in order to make a realist commitment. However, often what became explanations started out as predictions: e.g. big-bang cosmology originally predicted the microwave background radiation, and now it explains why we see this radiation. This is how we should understand certain successes of OQT that are now referred to as ‘explanations’. What’s important to the realism debate is that they started out as predictions.

Psillos (1999, p. 105ff)—drawing on Earman—distinguishes two types of prediction: ‘use-novel’ and ‘temporally-novel’. A theory that achieves the latter is most obviously successful: it predicts some phenomenon that we are completely unaware of at the time. In the case of ‘use-novel’ predictions the phenomenon is already known about, but the theory is put together as if it isn’t known about. In this case the prediction has less psychological impact but, as Psillos argues, any theory deserves just as much credit for achieving it. As we will see (and as discussed further in Robotti (1986) and elsewhere) the theories of Bohr and Sommerfeld achieved both types of predictive success: use-novel for lines already known about, and temporally-novel for lines not known about at the time the theories were proposed. Even the more careful brand of realist ought to be moved by these successes.Footnote 5

2.1 Relevant history 1: Bohr

The story of Bohr’s success is often misreported. His original ‘explanation’ of the spectral lines of hydrogen was not very convincing at all, and many saw it as ‘merely an ingenious play with numbers and formulas’ (Heilbron and Kuhn 1969, p. 266). Even if it did persuade some in the community, the modern day realist certainly wouldn’t need to take notice. The problem is that, when Bohr derived the Balmer formula in 1913, he clearly knew what he was trying to derive—the spectral lines of hydrogen—and in the earliest version of his theory he manipulates certain assumptions so as to make his theory fit the phenomena. His derivation ‘contained an arbitrary deviation from the theory of Planck, justified only by its success in giving the right result.’ (Norton 2000, p. 83).Footnote 6 This certainly isn’t what convinced Einstein, and so many others, to commit to the theory.

Things became much more convincing when Bohr re-structured his theory so as to explicitly include a formula encoding the frequencies of the spectral lines as part of his theory (Bohr 1913a). In other words the theory didn’t make any attempt to derive the spectral lines anymore. Instead he merely hypothesised a mechanism for them, and the derivation of interest became that of the Rydberg constant. As Heilbron and Kuhn put it,

He took the Balmer formula, interpreted from the start as a statement about energy levels, as his point of departure, and could deduce only the value of the multiplicative constant, the Rydberg coefficient. (Ibid., p.277, see also p.270)

This was pretty impressive. Scientists had long had an empirical measure of the Rydberg constant from the Balmer formula, but now Bohr had presented a way of deriving it theoretically, in terms of primitive constants (such as the charge of the electron). And in 1913 the agreement between the empirically measured value and Bohr’s theoretically derived value was within one percent (Heilbron and Kuhn, p. 266, fn.140), close enough for Einstein to exclaim, ‘Very remarkable! There must be something behind it. I do not believe that the derivation of the absolute value of the Rydberg constant is purely fortuitous.’ (Jammer 1966, p. 86). Clearly what we have here is a ‘no miracles’ intuition: Einstein’s thought is that it just isn’t reasonable to assume that such quantitatively accurate success could be born of assumptions that are not at least approximately true.

However, even this isn’t what convinced most people to commit seriously to Bohr’s theory. Within a few months Alfred Fowler (1913) had objected that when one applies Bohr’s theory to ionised helium the predictions are outside experimental error.Footnote 7 But Bohr noted that Fowler’s calculation was based on an idealization assumption: namely that the nucleus of the atom is infinitely heavy compared with the mass of the electron. Bohr showed (Bohr 1913b) that when one makes the relevant de-idealization the predictions match experiment to five significant figures (see quotation above). It is this success which motivated Einstein to remark that the ‘theory of Bohr must then be right’. This is so persuasive because what we have here is quite obviously novel predictive success: both use-novel predictive success (for lines already known about) and temporally-novel predictive success (for new lines). Thus this really is the kind of thing that should draw the attention of the realist. One needn’t be a ‘naïve optimist’ as Einstein’s remark suggests: that is, one needn’t assume that Bohr’s theory must be right to be so successful.Footnote 8 But the modern day realist does at least want to say that such success is usually born of truth, so this certainly counts as a case which matters to the realism debate.Footnote 9

2.2 Relevant history 2: Sommerfeld

The fine structure of hydrogen had been noticed as early as 1887 by Michelson and Morley. At least three serious measurements of the line known as Hα were taken—in 1891, 1895 and 1912—before Sommerfeld developed his theory in 1916.Footnote 10 There is no doubt that Sommerfeld knew about the fine structure at the time he was developing Bohr’s theory. If we want to describe Sommerfeld’s success as ‘use-novel’ predictive success, we need to be sure that Sommerfeld did not manipulate his development of the theory to ensure it predicted the fine structure.

On this issue opinions are divided. Kragh writes,

Sommerfeld’s theory of fine structure, although inspired by experimental results on the Balmer series, was not specifically designed to explain the hydrogen spectrum. On the contrary, it was part of a general program of quantum theory that proved to have many applications. (p.71)

But Arabatzis (criticising a similar sentiment courtesy of Dudley Shapere) writes, ‘Sommerfeld expected that this manoeuvre [introducing a second quantum number] would ... resolve the problem of fine structure.’ (2006, p. 149).

Arabatzis’s thought appears to be this: Sommerfeld certainly knew that he needed extra energy levels to predict the fine structure, and he quickly realised that these wouldn’t appear if electrons could orbit the atom in all possible ellipses, at any given eccentricity. Thus Sommerfeld’s qualitative prediction of more spectral lines should not count as ‘use-novel’—this really was written into the theory in its construction. However, what is really impressive about Sommerfeld’s prediction is the quantitative agreement between theory and experiment, and this certainly wasn’t written into the theory in its construction. Here we again have use-novel prediction for the lines already known, and also temporally novel prediction for new lines.Footnote 11

Thus even at the time Sommerfeld’s fine structure formula was seen as making a novel prediction, which was subsequently confirmed by Friedrich Paschen. Paschen wrote to Sommerfeld in 1916,

My measurements are now finished and they agree everywhere most beautifully with your fine structures. (cited in Kragh, p. 75)

The agreement was quantitative and not merely qualitative, as Sommerfeld subsequently proclaimed (Kragh, p.76).

As Kragh continues,

Sommerfeld’s theory was generally considered to be excellently confirmed by experiments... To many physicists the theory was the final proof of the soundness of Bohr’s quantum theory of the atom. For example, in letters of 1916 Einstein called Sommerfeld’s theory “a revelation.” “Your investigation of the spectra belongs among my most beautiful experiences in physics. Only through it do Bohr’s ideas become completely convincing.” Paul Epstein was converted to Bohr’s theory only “after Sommerfeld in his theory of the fine structure of the hydrogen lines achieved such a striking agreement with the experiment.”

During the reign of the old quantum theory, that is, until 1926, Sommerfeld’s explanation of the fine structure was regarded as undisputable by the leading atomic physicists. (p. 80)

In fact, as Kragh goes on to note, Max Planck compared Sommerfeld’s explanation of the fine structure with Le Verrier’s explanation of deviations in the orbit of Uranus in terms of an unseen planet which came to be called Neptune. The suggestion was that Sommerfeld’s success provided very strong evidence that there really are quantized elliptical orbits, just as there really is a planet beyond Uranus which we now call Neptune.Footnote 12

However, Robotti (1986) has argued that the community wasn’t quite as convinced as Kragh’s paper makes out. She mentions several experiments carried out between 1916 and 1925 that seemed to go against the theory to one degree or another. Thus some in the community argued that the Sommerfeld predictions were not confirmed by experiment. In 1924 Lau even went as far as to remark ‘For the hydrogen doublet series there exists so far no theoretical explanation which can be considered good.’ (cited in Robotti, p.46). The realist might try to make something of this, and argue that Sommerfeld’s theory was not very successful, and so they don’t need to make a realist commitment. But my argument can’t work unless the realist is motivated to make a realist commitment in this case.

In fact I think the finer details of the history do support my argument here. The key point is that at the time, everything taken into account, the evidence was seen as overwhelming by the vast majority of the relevant individuals. Now, with hindsight, we of course see small discrepancies between theory and experiment (especially regarding spectral line intensities) as indications that something was wrong. And it is tempting to regard isolated objections to Sommerfeld’s theory as prescient. But in fact any discrepancies between theory and experiment were not seen as problematic at the time by the vast majority of the relevant experts, and with good reason. In particular, as Robotti discusses in detail, minor discrepancies could be explained away in perfectly sensible ways. Here are the main points:

  (i) There were many issues concerning exactly how best to carry out the experiments, and many sources of error inevitably crept into the experimental process (e.g. as detailed in Table 6 on p.83 of Robotti’s paper);

  (ii) Some measure of electrical field disturbance was inevitable, and Kramers (in 1919) was influential in explaining discrepancies between theory and experiment in terms of such electric fields (as Robotti explains in Section 3 of her paper and elsewhere). As Kramers put it in 1919 ‘[T]his does not constitute a difficulty for the theory but it is just what should be expected according to the above considerations of the effect of perturbing fields on the fine structure.’ (cited in Robotti, p. 73);

  (iii) There were further possible de-idealizations to be made to Sommerfeld’s theory, any of which could explain small discrepancies between theory and experiment (Robotti, p.99, fn.163);

  (iv) Where Sommerfeld’s theory did fail was in the ‘selection rules’, which Sommerfeld added to his theory ‘ad hoc’ (as Robotti explains on p.59). These were added to explain the spectral line intensities, but here the successes I am concerned with are the spectral line frequencies. Sommerfeld’s theory predicts these exactly (see below);

  (v) Finally, the biggest problems for Sommerfeld’s theory came several years after 1916, so that there was at the very least a considerable period when belief in the theory was fully justified. E.g. Robotti writes ‘[I]n 1925 the problem of the fine structure ... was no longer considered as definitely and thoroughly solved.’ (p. 67). And Lau’s comment (above) came in 1924. But several years of success is more than enough for present purposes.

Of course, how convinced the community was by the fine-structure predictions is only indirect evidence that the realist ought to take notice of this episode in the history of science. Sometimes scientists have committed to theories which were not particularly successful, and which the modern day realist would not feel obliged to consider. However, in this case the level of success of the theory is clear: Sommerfeld derived the exact same formula used today for the fine structure energy levels of hydrogen. That is, Sommerfeld reached the exact same formula we get from the modern-day fully-relativistic Dirac formulation of QM. There are differences between the old and the new theories regarding spectral line intensities, but not regarding spectral line frequencies. It is predictions of the latter that I am concerned with here.

3 Bohr’s derivation of the spectral lines of ionised helium

The method follows Bohr’s third derivation of the Rydberg constant given in a lecture at the end of 1913, and found in Bohr 1922, pp. 1–19. Useful secondary sources are Pais 1991, p. 151ff. and Heilbron and Kuhn 1969, p. 276ff.

Norton (2000) has done much of the work required to explain the success of Bohr’s derivation in realist terms (although Norton’s reconstruction isn’t actually part of an argument for realismFootnote 14). Bohr’s first significant success was a derivation of the Rydberg constant, as explained in Section 2.1, above. For Bohr’s derivation of this constant I will closely follow Norton (2000, pp. 83–86).

Bohr’s assumptions were as follows:

  (1) The electrostatic model of electron orbits: the electron in the hydrogen atom orbits the nucleus, held in its orbit by a Coulomb attraction.

  (2) Indexing of electron orbits: the electron can only orbit the nucleus at certain energies E. These energies are indexed by the formula \( E_n = f(n) \), for some as yet unspecified function f of (integer) n.

  (3) Emission of light by quanta: light is emitted when an electron jumps from one allowed orbit to another, less energetic orbit. The frequency ν of the emitted light is given by the formula \( E_{n_2} - E_{n_1} = h\nu \) (drawing on Einstein’s formula \( E = h\nu \)).

  (4) The Balmer formula: the frequencies of light emitted by hydrogen are given by the Balmer formula: \( \nu_{n_2,n_1} = R\left( \frac{1}{n_2^2} - \frac{1}{n_1^2} \right) \), where \( n_2 \) and \( n_1 \) range over the corresponding energies of the orbiting electrons as per premises (2) and (3). From (2), (3) and (4) we have it that the energies of the possible orbits are given by \( E_n = \frac{hR}{n^2} \) for n = 1,2,3,4....

  (5) Classical electrodynamics governs emissions for large quantum numbers: When n is large, and thus the electron is relatively far from the nucleus, the emitted frequency can be calculated according to classical electrodynamics (the electron behaves classically, more or less). Now according to classical theory a particle with mechanical frequency (frequency of revolution) \( \nu_n \) and energy \( E_n \) circling a central charge Ze (under the influence of Coulomb attraction) emits light with that same frequency \( \nu_n \) such that: \( \nu_n^2 = \frac{2E_n^3}{\pi^2 m Z^2 e^4} \). Combining this with premise (4) we find that \( \nu_n^2 = \frac{2R^3h^3}{\pi^2 m Z^2 e^4 n^6} \) for electrons far enough away from the nucleus.Footnote 15

Now, using (4), for large enough quantum number n we have it that,

$$ \nu_{n,n-1} = R\left( \frac{1}{(n-1)^2} - \frac{1}{n^2} \right) \approx \frac{2R}{n^3}. $$

Bohr can then compare this frequency with the one derived classically for large quantum numbers as follows:

$$ \frac{{2{R^3}{h^3}}}{{{\pi^2}m{Z^2}{e^4}{n^6}}} = {\left( {\frac{{2R}}{{{n^3}}}} \right)^2}. $$

From here a simple calculation yields \( R = \frac{{2{\pi^2}m{e^4}}}{{{h^3}}} \), since Z = 1 for hydrogen. So we have a theoretical derivation of the Rydberg constant, which can be compared with the one measured empirically using Balmer’s formula. As noted in Section 2.1, the two results match almost perfectly.Footnote 16
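The arithmetic of this step can be checked numerically. The following is only a minimal sketch, not part of Bohr’s or Norton’s presentation: it assumes modern values of the constants in Gaussian (CGS) units, which is the unit system in which the formula takes the \( e^4 \) form used above, and the function names are hypothetical. It evaluates \( R = 2\pi^2 m Z^2 e^4 / h^3 \) as a frequency and confirms that \( 2R/n^3 \) approximates the exact Balmer difference for large n.

```python
import math

# Modern constant values in Gaussian (CGS) units; these are assumptions for
# illustration and are not the 1913 values available to Bohr.
M_E = 9.1093837e-28       # electron mass, g
E_CHARGE = 4.8032047e-10  # elementary charge, esu (statcoulomb)
H = 6.62607015e-27        # Planck constant, erg s


def rydberg_frequency(z: int = 1) -> float:
    """R = 2 pi^2 m Z^2 e^4 / h^3 in Hz, for an infinitely heavy nucleus."""
    return 2 * math.pi**2 * M_E * z**2 * E_CHARGE**4 / H**3


def balmer_frequency(r: float, n2: int, n1: int) -> float:
    """nu = R (1/n2^2 - 1/n1^2), as in premise (4)."""
    return r * (1 / n2**2 - 1 / n1**2)


R = rydberg_frequency()

# Large-n check: the exact difference between neighbouring levels
# should approach 2R/n^3.
n = 1000
exact = balmer_frequency(R, n - 1, n)
approx = 2 * R / n**3
relative_gap = abs(exact - approx) / exact

print(f"R = {R:.4e} Hz")
print(f"relative gap between exact and 2R/n^3 at n = {n}: {relative_gap:.2e}")
```

The relative gap shrinks roughly as 1/n, which is why the comparison with the classical frequency is only made in the limit of large quantum numbers.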

From here the challenge is to apply Bohr’s theory to ionised helium. Now, instead of merely predicting the value of the Rydberg constant, Bohr’s theory predicts the spectral lines of ionised helium. It tells us that the setup is very similar to that of the hydrogen atom, with the one significant difference being that the charge on the nucleus is doubled. This means that the Rydberg constant changes, since Z = 2 instead of 1. But it is still the case that the possible energies of the electron orbits are given by the formula \( {E_n} = \frac{{hR}}{{{n^2}}} \), although now ‘R’ stands for the new Rydberg constant, which we can label \( R_2 \); since R scales with \( Z^2 \) in the derivation above, \( R_2 = 4R \). So the predicted frequencies of the spectral lines of ionised helium are given by,

$$ \nu = R_2\left( \frac{1}{n_2^2} - \frac{1}{n_1^2} \right). $$

So in other words the only extra assumption Bohr needs in order to predict the spectral lines of ionised helium is that the positive charge on the nucleus is doubled. And of course, as noted in Section 2.1, the predictions are extremely successful.Footnote 17
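The size of the de-idealization mentioned in Section 2.1 (dropping the assumption of an infinitely heavy nucleus) can be illustrated with a short calculation. This is a hedged sketch rather than a reconstruction of Bohr’s own working: it assumes the standard reduced-mass substitution, in which the electron mass m in R is replaced by m/(1 + m/M) for a nucleus of mass M, and it uses modern mass ratios. The function name `rydberg_scale` is hypothetical.

```python
import math

# Nucleus-to-electron mass ratios (modern values, assumed for illustration).
PROTON_TO_ELECTRON = 1836.15267
ALPHA_TO_ELECTRON = 7294.29954   # helium-4 nucleus


def rydberg_scale(z: int, nucleus_to_electron: float = math.inf) -> float:
    """Rydberg constant relative to the infinite-mass hydrogen value.

    R scales as Z^2 and, under the reduced-mass assumption, as 1 / (1 + m/M).
    The default math.inf reproduces the idealized infinitely heavy nucleus.
    """
    return z**2 / (1 + 1 / nucleus_to_electron)


# Idealized ratio of the helium-ion constant to the hydrogen constant.
ideal_ratio = rydberg_scale(2) / rydberg_scale(1)

# The same ratio once both nuclei are allowed to move.
corrected_ratio = (
    rydberg_scale(2, ALPHA_TO_ELECTRON) / rydberg_scale(1, PROTON_TO_ELECTRON)
)

print(f"idealized R_2 / R : {ideal_ratio:.5f}")
print(f"corrected R_2 / R : {corrected_ratio:.5f}")
```

The correction is small, a shift in the fourth decimal place of the ratio, which is why it only matters once theory and experiment are compared to several significant figures.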

The crucial question for the realist is how it is possible for Bohr’s theory to be so successful when we now know that it is so far off the mark (in the light of modern quantum theory). At first it seems that the success can be explained with a relatively simple selective realist move, following Norton (2000, pp. 86–88). The point is that Bohr’s assumptions about electron orbits appear to play no role in the derivation given above; they are just heuristically useful. The derivation still goes through if one merely assumes that the electrons can persist in different possible ‘stationary states’, each with an associated energy \( {E_n} = \frac{{hR}}{{{n^2}}} \). And, crucially, this assumption is already a part of Bohr’s original assumption set. Of course he added the extra assumption that the reason electrons have these energies is because of the kinetic and potential energies of their orbits. But this extra assumption doesn’t play any part in the derivation of either the Rydberg constant or the spectral lines of ionised helium. As Norton writes,

The reduced form [of the derivation] eschews all talk of elliptical orbits other than in the domain of correspondence with classical theory... No assumption is made or needed that these stationary states are elliptical orbits of some definite size and frequency of localised electrons. What is retained is that these states possess a definite energy. (2000, pp. 86–87)

Building on this the realist would claim that all of the assumptions truly necessary to Bohr’s derivation are still true according to the new quantum theory. As Norton puts it,

[A]ll of the assumptions of this reduced demonstrative induction [“derivation”Footnote 18] are compatible with the new quantum mechanics that emerged in the 1920s. The stationary states ..., for example, would simply correspond to the energy eigenstates of a bound electron.Footnote 19

A possible objection to this story would be that Bohr still draws on assumption (5), which makes use of the assumption of elliptical orbits for very large quantum numbers (in the ‘domain of correspondence with classical theory’, as Norton puts it). Some might be tempted to claim that this is crucial to Bohr’s derivation, but not even approximately true according to modern QM. Another worry might be the way in which Bohr answered Fowler’s objections, as noted in Section 2.1 above. Bohr drew on the assumption of real electron orbits to argue that the nucleus does not stay absolutely stationary, but ‘wobbles’ due to its attraction to the electron. Prima facie it would appear that without the assumption of real electron orbits Bohr’s ‘reduced mass’ assumption makes no sense, and without this latter assumption Bohr cannot achieve the really startling success which made Einstein remark ‘the theory of Bohr must then be right’.

Whether these two objections to the realist’s story can be satisfactorily answered is a question I leave for selective realists. Once one has taken away what is radically false from the Bohr theory, is there enough left for the successful derivations to go through? One thing worth emphasising here is that the selective realist cannot answer as follows:

It doesn’t matter if just one or two assumptions used in the derivation of a novel prediction are not-even-approximately-true. As long as the vast majority are, one can maintain that the theory as a whole is approximately true, and that this explains the success.

On the contrary, all of the ‘working posits’ must be at least approximately true if the success of the theory is to be explained in terms of truth. Consider a successful prediction P which can be derived from assumptions A – E. If A – D are at-least-approximately-true, but E is radically false, then the truth in the theory does not explain the success at all. The success in question, prediction P, doesn’t follow from what the theory got (approximately) right, namely A – D; it only follows if we include the radically false assumption E. Thus the realist fails in her basic aim: to explain the success of science in terms of truth.Footnote 20

In the specific case of Bohr’s derivation of the spectral lines of ionised helium, it may yet be possible to show that all of the relevant ‘working posits’ are approximately true. Work remains to be done, but I think the realist is already in a tight spot. However, in a sense it is immaterial whether the realist can answer her critics on these points: things are about to get significantly more difficult.

4 Sommerfeld’s derivation of the fine structure of hydrogen

Sommerfeld took Bohr’s theory as a starting point and developed it significantly. He noted first that for every allowed energy \( {E_n} = \frac{{hR}}{{{n^2}}} \) in Bohr’s model there should (classically speaking) be an infinite number of possible orbits, because there are an infinite number of eccentricities of ellipse which could describe the electron’s trajectory (ranging from very long and thin ellipses to circles). Motivated by various empirical phenomena (including the fine-structure, see above), Sommerfeld postulated that only certain eccentricities of ellipse are possible trajectories for the electron. For the smallest energy \( E_1 \) there is just one possible orbit, a circle. For the second smallest energy \( E_2 \) there are two possible orbits, a circle and an ellipse with a given eccentricity. For the third smallest energy \( E_3 \) there are three possible orbits, and so on. These possible orbits are all indexed by a second quantum number k, in addition to Bohr’s principal quantum number n.

But Sommerfeld does not stop here. His next move is to apply the laws of special relativity to the different possible electron trajectories.Footnote 21 In essence, Sommerfeld makes use of the equation for a precessing elliptical orbit, but introduces relativity by making a change to the equation for angular momentum:

$$ p = m{r^2}\omega = \gamma {m_0}{r^2}\omega \,{\hbox{where}}\,\gamma = {\left( {1 - \frac{{{v^2}}}{{{c^2}}}} \right)^{{ - \frac{1}{2}}}}. $$

Here \( m_0 \) is the ‘bare’ mass, m is the relativistic mass, r is the distance between the electron and the proton, and ω is the angular rate of rotation. One then has a relativistic equation for a precessing elliptical orbit, which can be arranged to tell us the energy of an electron in such an orbit.

Introducing the two quantum numbers n and k we find that different orbits which before had the same energy (same n, different k) now have very slightly different energies. It is these very slightly different energies which explain the very closely grouped spectral lines which we call the “fine-structure”. With these assumptions Sommerfeld is led to the following formula for the allowed energies of the hydrogen atom:

$$ E(n,k) = \frac{{ - Rhc{Z^2}}}{{{n^2}}}\left( {1 + \frac{{{\alpha^2}{Z^2}}}{{{n^2}}}\left( {\frac{n}{k} - \frac{3}{4}} \right)} \right) + \ldots $$

Here c is the speed of light, α ≈ 1/137, and the dots stand for negligible terms.Footnote 22
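To give a sense of the scale involved, the formula can be evaluated for the n = 2 level of hydrogen. This is only an illustrative sketch: the numerical constants are modern values supplied here as assumptions, the product Rhc is treated as a single energy (the Rydberg energy in eV), the negligible higher-order terms are dropped, and the function name is hypothetical.

```python
ALPHA = 1 / 137.035999     # fine-structure constant (modern value, assumed)
RHC_EV = 13.605693         # the product R*h*c expressed in eV (assumed)
H_EV_S = 4.135667696e-15   # Planck constant in eV s (assumed)


def sommerfeld_energy(n: int, k: int, z: int = 1) -> float:
    """E(n, k) in eV from the fine-structure formula, leading terms only."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    bohr_term = -RHC_EV * z**2 / n**2
    return bohr_term * (1 + (ALPHA * z) ** 2 / n**2 * (n / k - 0.75))


# Same n, different k: the two n = 2 orbits no longer share one energy.
splitting_ev = sommerfeld_energy(2, 2) - sommerfeld_energy(2, 1)
splitting_ghz = splitting_ev / H_EV_S / 1e9

print(f"E(2,1) = {sommerfeld_energy(2, 1):.8f} eV")
print(f"E(2,2) = {sommerfeld_energy(2, 2):.8f} eV")
print(f"splitting = {splitting_ev:.3e} eV, about {splitting_ghz:.2f} GHz")
```

The splitting is some five orders of magnitude smaller than the level energy itself, which is why a high resolution spectroscope is needed to see it.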

As noted above in Section 2.2, this formula is still accepted as the correct expression of the hydrogen energy levels. But today the explanation of the fine-structure is completely different, in particular because the derivation of the fine-structure formula is completely different.Footnote 23 For example, orbital trajectories play no part whatsoever in the new QM. Instead one starts with the time-independent Schrödinger equation,

$$ H\Psi = E\Psi $$

where H is the Hamiltonian operator:

$$ H = \left[ { - \frac{{{\hbar^2}}}{{2m}}{\nabla^2} + V({\mathbf{x}})} \right]. $$

For the hydrogen atom V(x) is the familiar Coulomb potential, which in polar co-ordinates can be written as,

$$ V(r) = -\frac{e^2}{4\pi \varepsilon_0 r}. $$

Now if we solve the Schrödinger equation in this form, the energy levels for the hydrogen atom turn out to be the same as those predicted by Bohr’s theory (for the full derivation see Eisberg and Resnick p. 235ff.). But now it is possible to make corrections to the Hamiltonian to account for relativity and electron spin.

First, instead of assuming the (classical) relation between (kinetic) energy and momentum \( E = p^2/2m \), one can make use of the relativistic alternative:

$$ {E^2} = {c^2}{{\mathbf{p}}^2} + {m^2}{c^4}. $$

The first term in the above (non-relativistic) Hamiltonian can be reached by starting with \( E = p^2/2m \) and replacing p with the corresponding operator: \( \mathbf{p} \to -i\hbar \nabla \). If one does the same substitution but now with the relativistic alternative, one reaches a new, relativistically corrected Hamiltonian.Footnote 24 In its unrefined form we have,

$$ {H_{{REL}}} = \left[ {\pm c{{\left( { - {\hbar^2}{\nabla^2} + {m^2}{c^2}} \right)}^{{\frac{1}{2}}}} + V({\mathbf{x}})} \right]. $$

In addition there is a correction due to the electron ‘spin’, and thus a final Hamiltonian which can be expressed as,

$$ {H_{{FINAL}}} = {H_{{REL}}} + {H_{{SPIN}}} $$

In short, the electron is said to have an ‘intrinsic angular momentum’ S, which causes it to interact with the field of the proton about which it is orbiting. This is referred to as the ‘spin-orbit’ interaction. The resultant correction term in the Hamiltonian is,

$$ {H_{{SPIN}}} = \left( {\frac{1}{{4\pi {\varepsilon_0}}}} \right)\frac{{{e^2}}}{{{m^2}{c^2}{r^3}}}{\mathbf{S}} \cdot {\mathbf{L}} $$

where L is the regular ‘orbital’ angular momentum (mvr).Footnote 25

With these corrections to the Hamiltonian, one has only to solve the Schrödinger equation and Sommerfeld’s fine-structure formula re-emerges as if by magic. At least, almost exactly the same formula emerges (see below for discussion). It turns out that two quantum numbers have to be introduced—just as in Sommerfeld’s theory—to avoid physically unacceptable results. Of course, the quantum numbers are interpreted differently in the new theory (see Series 1988, p. 24), but the restricted range over which these numbers can span is exactly the same as in Sommerfeld’s theory.

As we saw earlier, the feeling in the physics community is certainly not that Sommerfeld’s success is born of truth; instead, they see it as a lucky coincidence:

That these two theories [the old and the new QM] lead to essentially the same results for the hydrogen atom is a coincidence that caused much confusion in the 1920s, when the modern quantum theories were being developed. The coincidence occurs because the errors made by the Sommerfeld model, in ignoring the spin-orbit interaction and in using classical mechanics to evaluate the average energy shift due to the relativistic dependence of mass on velocity, happen to cancel for the case of the hydrogen atom. (Eisberg and Resnick, p. 286)

But perhaps we shouldn’t think of it as two errors cancelling each other out, as if we have a positive and a negative which cancel to give zero. If you take away these ‘errors’ from OQT you certainly don’t get the right result: you get nothing at all! Instead, it just so happens that the work done by the Sommerfeld model in introducing a second quantum number to restrict the possible elliptical orbits, and then applying special relativity to those orbits, is exactly equivalent (as far as the end result is concerned) to the work done in modern QM by (i) applying relativistic effects to quantum mechanics, and (ii) taking spin-orbit interaction into account.
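The equivalence of end results can be checked exactly in a small sketch. It assumes the standard textbook first-order perturbation expressions for the relativistic and spin-orbit energy shifts in hydrogen (quoted, not derived here), identifies Sommerfeld's k with j + 1/2 where j is the total angular momentum quantum number, and leaves out states with l = 0, which need a further term not discussed in the text. All function names are hypothetical.

```python
from fractions import Fraction


def sommerfeld_shift(n, k):
    """Sommerfeld's correction to the Bohr level E_n.

    Expressed in units of E_n**2 / (m c**2), where the leading
    fine-structure term becomes 3/2 - 2n/k.
    """
    return Fraction(3, 2) - Fraction(2 * n, k)


def modern_shift(n, l, two_j):
    """Relativistic plus spin-orbit first-order shifts, same units.

    two_j is 2*j, so a half-integer j is passed as an exact integer.
    Assumed textbook expressions; valid here only for l >= 1.
    """
    if l < 1:
        raise ValueError("l = 0 needs an extra term, omitted in this sketch")
    j = Fraction(two_j, 2)
    l_half = Fraction(2 * l + 1, 2)  # l + 1/2
    relativistic = Fraction(3, 2) - 2 * n / l_half
    spin_orbit = (
        n * (j * (j + 1) - l * (l + 1) - Fraction(3, 4))
        / (l * l_half * (l + 1))
    )
    return relativistic + spin_orbit


def check(max_n=6):
    """Compare both shifts for every allowed (n, l, j) with l >= 1."""
    for n in range(2, max_n + 1):
        for l in range(1, n):
            for two_j in (2 * l - 1, 2 * l + 1):
                k = (two_j + 1) // 2  # identify k with j + 1/2
                if modern_shift(n, l, two_j) != sommerfeld_shift(n, k):
                    return False
    return True


if __name__ == "__main__":
    print("shifts agree for all tested states:", check())
```

Each of the two modern contributions depends on l separately, yet their sum depends only on j and matches the Sommerfeld term once k is read as j + 1/2. Exact rational arithmetic is used so that the agreement is not an artefact of rounding.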

How can the selective realist respond? Despite the vast difference between the provenance of the fine-structure formula in the old and new QM, the selective realist might still insist that old QM has enough in common with the new QM to explain Sommerfeld’s success in terms of what he got right. After all, Sommerfeld was right in thinking that hydrogen atoms consist of one proton and one electron, that there is a Coulomb potential caused by the positive charge of the proton nucleus, that the energies of the possible electron states are quantized, that photon emission is caused by transitions between electron states, and so on. This all resides within the new QM just as it resides within the old QM. But of course, this isn’t enough. The issue is whether Sommerfeld’s success is due to what he got right, whether all of the assumptions which play an essential role in Sommerfeld’s derivation are at least approximately true.

One might simply say that, on the face of it, Sommerfeld drew on assumptions which are clearly false according to the new theory, that his derivation is dependent upon assumptions about elliptical trajectories and the effect of relativity on those trajectories. However, we can do better than this.

Consider Fig. 1. On the left hand side modern QM is represented as a body of assumptions, with electron ‘spin’ being one of these assumptions. Spin is a necessary part of the derivation of the fine-structure formula in the new QM, as noted above. In fact it is explicitly stated in relevant literature everywhere that the fine-structure splitting is caused by spin-orbit interaction.Footnote 26 On Fig. 1 this is signalled by the fact that there is an arrow leading from ‘spin’ to the ‘fine-structure formula’ on the diagram. But of course electron spin, even in a classical sense, is not playing a role in OQT.

Fig. 1. A schematic representation of how the fine-structure follows from the new QM, and how the selective realist wants it to follow from the old QM. The elliptical ‘egg’ on the right hand side (RHS) represents the subset of assumptions OQT got right, according to the new QM

Herein lies a major difficulty for the selective realist. Such a realist wants to say that the fine-structure formula is ultimately born of assumptions Sommerfeld was right about, represented by the ‘egg’ on the RHS of Fig. 1. Thus, in the (realist) hope that this is true, an arrow on the diagram points from the ‘egg’ of truth on the RHS to the fine-structure formula in the middle. But, since all of those assumptions are true according to the new theory, they are in the new theory, and that same ‘egg’ appears on the left hand side (LHS). It follows that, if the selective realist is right, one could draw an arrow from the ‘egg’ on the LHS to the fine-structure formula. That would show that spin isn’t a necessary part of the explanation of the fine-structure formula in the new QM! Since spin is a necessary part, a simple application of modus tollens tells us that the arrow on the RHS cannot extend all the way into the ‘egg’ of truth within Sommerfeld’s theory. The conclusion is that Sommerfeld’s derivation of the fine-structure formula necessarily draws on at least some false assumptions within his theory.Footnote 27
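One way to set out the logical skeleton of this argument is sketched below. The notation is introduced here purely for illustration and is not in the original: N stands for the assumptions of the new QM, S for electron spin, T for the ‘egg’ of assumptions OQT got right, and F for the fine-structure formula.

$$ \begin{aligned} &(1)\quad T \subseteq N \setminus \{ S\} && \text{(the egg lies within the new QM and does not contain spin)} \\ &(2)\quad N \setminus \{ S\} \nvdash F && \text{(spin is necessary for the new derivation)} \\ &(3)\quad T \vdash F \; \Rightarrow \; N \setminus \{ S\} \vdash F && \text{(from (1): adding premises cannot block a derivation)} \\ &(4)\quad T \nvdash F && \text{(from (2) and (3), by modus tollens)} \end{aligned} $$

On this reading, the realist's options are to deny (1), by placing spin inside the egg, or to deny (2), by claiming that spin is not really necessary.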

There is still one last hope for the selective realist here. In 1928 Dirac showed that, if one develops quantum mechanics in a fully relativistic way, electron spin ‘drops out’ of the formalism, and doesn’t have to be ‘bolted on’ as a separate assumption, as in the above analysis.Footnote 28 And it is really this version of modern QM that delivers exactly the same fine-structure formula as Sommerfeld’s theory.Footnote 29 The realist might then suggest that Fig. 1 misrepresents the situation. ‘Spin’ on the left hand side, they might claim, should go inside the egg on the left hand side, and therefore also inside the egg on the right hand side. Sommerfeld’s old 1916 theory, they would have to claim, does implicitly include ‘spin’ effects (they are somehow ‘buried’ within the theory).

This seems to be a highly implausible response, however. Even if we can say, with Dirac, that spin ‘drops out’ of a fully relativistic development of the new QM, we can’t say that it follows from relativity alone. Thus the fact that Sommerfeld’s theory includes relativity as a part is nowhere near enough for realist purposes. I find it impossible to fathom how spin could be ‘hiding’ in Sommerfeld’s theory. In Dirac’s treatment of the hydrogen atom it turns out that the electron has a certain, special, quantum property, and its having that property brings about the fine-structure splitting. In Sommerfeld’s theory the electron has no similar property whatsoever (it is not even spinning on its axis like a ball).Footnote 30

Simply put, if the selective realist could make this work, it would be a startling new development in our understanding of spin, and what causes the fine-structure splitting. This seems extremely unlikely: modern quantum mechanics has now reached an outstanding level of maturity, and any suggestion that one of its explanations is misguided is not to be tolerated without very good reason. The dictates of selective realism do not represent a good reason at all, relatively speaking. Thus, even if the selective realist strategy can work for Bohr’s prediction of the spectral lines of ionised helium, it seems safe to assume that it fails for Sommerfeld’s prediction of the fine-structure formula. If there is another way, the burden of proof lies squarely with the realist.

5 Ramifications for realism

How should the realist respond to this historical episode? First of all I should acknowledge that I have hardly proved that the selective realist’s favoured strategy cannot work for the Sommerfeld case. There may still exist truths or approximate truths buried within Sommerfeld’s theory which allow the derivation to go through. It seems highly unlikely—in particular because the spin-orbit interaction of modern QM appears to have no counterpart whatsoever in the Sommerfeld model (circa 1916 when the derivation was made)—but many selective realists allow that the truths in question can be highly abstract and far from obvious (cf. Saatsi 2005). It’s not clear what it would take to show beyond all doubt that there is no possible reconstruction of a given derivation in terms of what the theory got at least approximately right.

In fact, a paper by Biedenharn (1983) shows how it might be possible for the selective realist to proceed here. Biedenharn shows that there is a formal correspondence between what he calls the ‘Sommerfeld derivation’ and the ‘Dirac derivation’. Whether this correspondence is one the selective realist can draw upon is an open question, and a task I leave to such realists. Biedenharn makes quite substantial changes to both the old and the new derivations, employing some quite technical and complex mathematics, to achieve the correspondence. Are such extravagant formal manipulations a legitimate way for the realist to show that the old theory does indeed contain enough truth (in the light of the new theory) to explain its successes? It seems clear at the very least that not all selective realists could say so. Only realists who focus on continuity of highly abstract ‘structure’ across theory change could be satisfied with this: structural realists who follow in the footsteps of Worrall (1989). But of course, many selective realists want to be realists about much more than mere ‘structure’. If the realist has to say that it only looks like Sommerfeld’s relativistic elliptical orbits of electrons are essential features of his derivation, but that they aren’t really essential, then there seems to be nothing stopping the anti-realist saying the same thing about apparently ‘essential’ components of current successful science. This has been emphasised by Kyle Stanford (2006, 2009) and others: if one can show that so little is ‘essential’ to a theory’s success, then the selective realist ends up not being able to commit to many features of modern scientific theories that she most definitely wants to commit to (according to the anti-realist). Or at least, the predictive success of theories tells us so very little about what we are entitled to commit to.

Thus, in the end, many selective realists may prefer to let this example stand as a case that does go against their position. This wouldn’t be the end for selective realism—far from it. The realist can suggest that OQT is ‘the exception which proves the rule’, that it truly is historical magic, a ‘little miracle’. Only the ‘naïve optimist’ says that every significant scientific success must be explainable in terms of truth (see Saatsi and Vickers 2010). A more sophisticated, more realistic realist says that success is usually born of truth, so that we have good reason to believe that modern scientific successes are born of the truth in our current scientific theories (although it remains possible that our current theories are radically false, despite their successes). Following this route, the realist might even claim that Norton’s ‘reduced derivation’ of the Rydberg constant goes in favour of the realist’s thesis almost as much as Sommerfeld’s success goes against it, so that one walks away from OQT with the books fairly evenly balanced.

This move is representative of a general realist strategy that I have implicitly appealed to a few times in this paper, and which may look a little suspicious. Basically the move is, when faced with an apparent counterexample to one’s realist beliefs from the history of science, to say ‘Oh, that was just a simplification of my position! Actually I’m more sophisticated than that. What I really believe is...’ Five such ‘aspects of sophistication’ a realist might appeal to are as follows:

(i) Success indicates at least approximate truth, not truth simpliciter;

(ii) Commitment to only parts of a theory, not all of it (the selective realist move);

(iii) Only predictive success is motivating, not (necessarily) explanatory success;

(iv) Non-naive realism: success is only usually connected with truth;

(v) If sufficiently successful, then realist commitment is warranted, but not only if sufficiently successful.

So, a realist who wanted to put all of these together might say,

When a theory achieves novel predictive success, then, probably (although not always), that means that the parts of the theory responsible for that success are at least approximately true (but such success is not the only reason one could have for making such a commitment).Footnote 31

Of course, the question arises whether the realist is justified in all of these ‘aspects of sophistication’, or whether they are introduced merely to get around difficult examples of historical evidence. For example, Stanford (2009, p. 383) has complained that the realist focus on predictive success is not justified given various current theories—typically from the biological sciences—that any realist would want to believe in but that don’t achieve novel predictive success. In fact Stanford’s worry is unfounded given aspect (v) on the above list: a realist is free to make a commitment to theories that don’t achieve novel predictive successes. But similar worries about how the realist might justify (i)–(v) might be harder to shake off.

It is not my intention to explore these issues properly here, but what I can say is that even if the realist is granted all of these qualifications, OQT still stands as an important and relevant case study in the debate. Even the most ‘sophisticated’ realist position becomes increasingly implausible if further examples mount up of novel predictive success born of assumptions at least some of which are not even approximately true. In fact many prima facie examples—such as those on Laudan’s list (1981)—have now been dismissed by the realist as either not successful enough or as harbouring hidden truths which explain the success. But there may yet be further examples of the Bohr and Sommerfeld type out there to be found. The example recently introduced to the debate in Saatsi and Vickers (2010) is just such a relevant example.

Clearly much work remains to be done to establish just how today’s philosopher of science should respond to successful but false theories in the history of science. According to this author, we still haven’t collected enough evidence from the history of science to make an informed decision on realism. Selective realism seems to make good sense as a ‘working hypothesis’, but as no more than that. Until much more philosophically informed historical work has been done, intuitions and rhetoric will play an overly prominent role in the debate. The aim of the present paper is to indicate the way forward, and take a small step in the right direction.