forthcoming in Why More is Different: Philosophical Issues in Condensed
Matter Physics and Complex Systems, ed. B. Falkenburg & M. Morrison,
Heidelberg: Springer 2015
Between Rigor and Reality: Many-Body Models
in Condensed Matter Physics
Axel Gelfert (National University of Singapore)
December 29, 2013
Abstract
The present paper focuses on a particular class of models intended to
describe and explain the physical behaviour of systems that consist of a
large number of interacting particles. Such many-body models are characterized by a specific Hamiltonian (energy operator) and are frequently
employed in condensed matter physics in order to account for such phenomena as magnetism, superconductivity, and other phase transitions.
Because of the dual role of many-body models as models of physical systems (with specific physical phenomena as their explananda) as well as
mathematical structures, they form an important sub-class of scientific
models, from which one can expect to draw general conclusions about the
function and functioning of models in science, as well as to gain specific
insight into the challenge of modelling complex systems of correlated particles in condensed matter physics. In particular, it is argued that many-body models contribute novel elements to the process of inquiry and open
up new avenues of cross-model confirmation and model-based understanding. In contradistinction to phenomenological models, which have received
comparatively more philosophical attention, many-body models typically
gain their strength not from ‘empirical fit’ per se, but from their being
the result of a constructive application of mature formalisms, which frees
them from the grip of both ‘fundamental theory’ and an overly narrow
conception of ‘empirical success’.
1 Introduction
Scientific models are increasingly being recognized as central to the success and
coherence of scientific practice. In the present paper, I focus on a particular class
of models intended to describe and explain the physical behaviour of systems
that consist of a large number of interacting particles. Such many-body models,
usually characterized by a specific Hamiltonian (energy operator), are frequently
employed in condensed matter physics in order to account for phenomena such as
magnetism, superconductivity, and other phase transitions. Because of the dual
role of many-body models as models of physical systems (with specific physical
phenomena as their explananda) as well as mathematical structures, they form
an important sub-class of scientific models, from which one can expect to draw
general conclusions about the function and functioning of models in science, as
well as to gain specific insight into the challenge of modelling complex systems
of correlated particles in condensed matter physics. Throughout the present
paper, equal emphasis is placed on the process of constructing models and on
the various considerations that enter into their evaluation.
The rest of this paper is organized as follows. In Section 2, I place many-body
models in the context of the general philosophical debate about scientific models
(especially the influential ‘models as mediators’ view), paying special attention
to their status as mathematical models. Following this general characterization,
Section 3 then discusses a number of historical examples of many-body models
and the uses to which they have been put in 20th-century physics, not least in
the transition from classical models of interacting particles to a full appreciation of the quantum aspects of condensed-matter phenomena. On the basis of
these historical examples, Section 4 distinguishes between different strategies of
model construction in condensed matter physics. Contrasting many-body models with phenomenological models (which are typically derived from interpolating between specific empirical phenomena), it is argued that the construction of
many-body models may proceed either from theoretical ‘first principles’ (sometimes called the ab initio approach) or may be the result of a more constructive
application of the formalism of many-body operators. This formalism-based
approach, it is argued in Section 5, leads to novel theoretical contributions by
the models themselves (one example of which are so-called ‘rigorous results’;
Section 5.1), which in turn gives rise to cross-model support between models of
different origins (Section 5.2) and opens up room for exploratory uses of models
in the service of fostering model-based understanding (Section 5.3). The paper
concludes with an appraisal of many-body models as a specific way of investigating condensed matter phenomena that steers a middle path ‘between rigor
and reality’.
2 Many-body models as mathematical models
Among the various kinds of models used in condensed matter physics, an important subclass is that of many-body models, which represent a system’s overall behaviour
as the collective result of the interactions between its constituents. The present
section discusses many-body models in general terms, situating them within
the general philosophical debate about scientific models and discussing, more
specifically, their status as mathematical models.
Mathematical models can take different forms and fulfill different purposes.
They may be limiting cases of a more fundamental, analytically intractable theory, for example in the case of modelling planetary orbits as if planets were independent mass-points revolving around an infinitely massive sun. Sometimes,
models connect different theoretical domains, as is the case in hydrodynamics,
where Prandtl’s boundary layer model interpolates between the frictionless ‘classical’ domain and the Navier-Stokes domain of viscous flows (26). Even where
a fundamental theory is lacking, mathematical models may be constructed, for
example by fitting certain dynamical equations to empirically observed causal
regularities (as in population cycles of predator-prey systems in ecology) or by
analyzing statistical correlations (as in models of stock-market behavior). In
the economic and social sciences, identifying the relevant parameters and constructing a mathematical model that connects them may often precede theory-construction. Frequently, what scientists are interested in are qualitative features, such as the stability or instability of certain systems, and these may be
reflected better by a mathematical model than by any available partial evaluation of the underlying theory.
Given this diversity, it would be hopeless to look for shared properties that
all mathematical models have in common. Fortunately, there are other ways
one can approach the problem. First, the characteristics one is most interested
in need not themselves be mathematical properties, but may encompass ‘soft’
factors such as ease of use, elegance, simplicity and other factors pertaining
to the uses to which mathematical models typically are put. Second, it may
be possible to identify a subclass of mathematical models – such as the many-body models to be discussed in this paper – which is sufficiently comprehensive to
allow for generalizations but whose members are not too disparate. Finally, it
will often be possible to glean additional insight from contrasting mathematical
models with other, more general, kinds of models.
On one influential general account, which will prove congenial to the present
paper, models are to be regarded as ‘mediating instruments’. (See ref. (28)) It
is crucial to this view that models are not merely understood as an unavoidable
intermediary step in the application of general theories to specific situations.
Rather, as ‘mediators’ between our theories and the world, models inform the
interpretation of our theories just as much as they allow for the application of
these theories to nature. As Morrison and Morgan are keen to point out, ‘models
are not situated in the middle of an hierarchical structure between theory and
the world’, but operate outside the hierarchical ‘theory-world axis’. (28, p.
17f.) This can be seen by realizing that models ‘are made up from a mixture of
elements, including those from outside the original domain of investigation’ (p.
14); it is this partial independence of original theory and data that is required
in order to allow models to play an autonomous role in scientific enquiry. In
this respect, Margaret Morrison and Mary Morgan argue, scientific models are
much like scientific instruments. Indeed, it is part and parcel of this view that
model building involves an element of creativity and skill – it is ‘not only a craft
but also an art, and thus not susceptible to rules’ (28, p. 12).
A number of case studies have examined specific examples from the natural
and social sciences from within this framework. (A cross-section of these are
collected in ref. (27).) The upshot of many of these studies is that ‘model
construction involves a complex activity of integration’ (26, p. 44). This
integration need not be perfect and, as Daniela Bailer-Jones points out, may
involve ‘a whole range of different means of expression, such as texts, diagrams
or mathematical equations’ (1, p. 60). Quite often, the integration cannot
be perfect, as certain elements of the model may be incompatible with one
another. Even in cases where successful integration of the various elements is
possible, the latter can be of very different sorts – they may differ not only in
terms of their medium of expression (text, diagram, formula) but also in terms
of content: Some may consist in mathematical relations, others may draw on
analogies; some may reflect actual empirical data, others, perhaps in economics,
may embody future target figures (e.g., for inflation).
It is in comparison with this diversity of general aspects of scientific models, I
argue, that several characteristic features of mathematical models can be singled
out. The first of these concerns the medium of expression, which for mathematical models is, naturally, the formal language of mathematics. It would, however,
be misguided to simply regard a model as a set of (uninterpreted) mathematical
equations, theorems and definitions, as this would deprive models of their empirical relevance: A set of equations cannot properly be said to ‘model’ anything,
neither a specific phenomenon nor a class of phenomena, unless some of the variables are interpreted so as to relate them to observable phenomena. One need
not be committed to the view (as Morrison paraphrases Nancy Cartwright’s position on the matter) that ‘fundamental theory represents nothing, [that] there
is simply nothing for it to represent since it doesn’t describe any real world
situations’ (25, p. 69), in order to acknowledge that mathematical models cannot merely be uninterpreted mathematical equations if they are to function as
mediators of any sort; that is, if they are to model a case that, for whatever
reason, cannot be calculated or described in terms of theoretical first principles.
The fact that mathematical models, like other kinds of models, require background assumptions and rules of interpretation, of course, does not rule out that
in each case there may be a core set of mathematical relationships that model
users regard as definitive of the mathematical model in question. Indeed, this
assumption should be congenial to the proposed analysis of models as mediators,
as the mathematical features of a model – where these are not merely ‘inherited’
from a fundamental theory – may provide it with precisely the autonomy and
independence (from theory and data) that the role as mediator requires. This
applies especially to the case of many-body models which, as I shall discuss in
Section 4, are typically the output of what has been called ‘mature mathematical formalisms’ (in this case: the formalism of second quantization, as adapted
to the case of many-body physics).
While it may be true that, as Giere puts it, ‘[m]uch mathematical modeling
proceeds in the absence of general principles to be used in constructing models’
(16, p. 52), there are good terminological reasons to speak of a mathematical
model of a phenomenon (or a class of phenomena) only if the kind of mathematical techniques and concepts employed are in some way sensitive to the
kind of phenomenon in question. For example, while it may be possible, if only
retrospectively, to approximate the stochastic trajectory of a Brownian particle by a highly complex deterministic function, for example a Fourier series of
perfectly periodic functions, this would hardly count as a good mathematical
model: There is something about the phenomenon, namely its stochasticity,
that would not be adequately reflected by a set of deterministic equations; such
a set of equations would quite simply not be a mathematical model of Brownian
motion.[1]
In addition to the requirement that the core mathematical techniques and
concepts be sensitive to the kind of phenomenon that is being modelled, there
is a further condition regarding what should count as a mathematical model.
Loosely speaking, the mathematics of the model should do some work in integrating the elements of the ‘extended’ model, where the term ‘extended’ refers
to the additional information needed to apply a bare mathematical structure to
individual cases. If, for example, a mathematical model employs the calculus
of partial differential equations, then it should also indicate which (classes of)
initial and boundary conditions need to be distinguished; likewise, if a mathematical model depends crucially on certain parameters, it should allow for
systematic methods of varying, or ‘tweaking’, those parameters, so their significance can be studied systematically.[2] This capacity of successful models to
integrate different cases, or different aspects of the same case, has occasionally
been called ‘moulding’ (4, p. 90), (1, p. 62):
Mathematical moulding is shaping the ingredients in such a mathematical form that integration is possible, and contains two dominant
elements. The first element is moulding the ingredient of mathematical formalism in such a way that it allows the other elements
to be integrated. The second element is calibration, the choice of
the parameter values, again for the purpose of integrating all the
ingredients. (4, p. 90)
Successful mathematical models, on this account, display a capacity to integrate
different elements – some theoretical, others empirical – by deploying an adaptable, yet principled formalism that is mathematically characterizable (largely)
independently of the specifics of the theory and data in the case under consideration.
For the remainder of the present paper, I shall therefore be relying on an
understanding of many-body models that recognizes their dual status as models
of physical systems (which, importantly, may include purely hypothetical systems) and as mathematical structures. This is in line with the following helpful
characterization presented by Sang Wook Yi:
What I mean by a model in this paper is a mathematical structure
of three elements: basic entities (such as ‘spins’), the postulated
arrangement of the basic entities (say, ‘spins are located on the lattice point’) and interactions among the basic entities (‘spin-magnetic
[1] There may, of course, be independent reasons why one might represent, say, a specific trajectory by a certain set of deterministic equations, or by a (non-mathematical) pictorial representation. However, in such cases, as well as in contexts where the stochasticity of the causal process is irrelevant, one would not be dealing with a model of Brownian motion, in the proposed narrower sense of ‘mathematical model’.
[2] Systematic ‘tweaking’, as Martin Krieger observes, ‘has turned out to be a remarkably effective procedure’ (22, p. 428). By varying contributions to the model, e.g. by adding disturbances, one can identify patterns in the response of the model, including regions of stability.
field interactions’). As a rough criterion, we may take a model to be
given when we have the Hamiltonian of the model and its implicit
descriptions that can motivate various physical interpretations (interpretative models) of the model. (37, p. 82)
If this sounds too schematic, or too general, then perhaps a look at some historical examples will make vivid how many-body models have been put to use
in condensed matter physics.
3 A brief history of many-body models
In this, and the next section, a class of mathematical models will be discussed
that was first developed in connection with research on the magnetic properties
of solids. The standard way of picturing a solid as a crystal, with the atoms
being arranged in a highly ordered lattice so as to display certain spatial symmetries, and the electrons being possibly delocalized, as in a metal, already
contains a good deal of assumptions that may or may not be realized in a given
physical system. In order to regard this general characterization as a faithful
representation of any real physical object, for example of a lump of metal in
a given experiment, certain background assumptions have to be in place. For
example, it has to be assumed that the piece of metal, which more often than
not will display no crystalline structure to the naked eye, really consists of a
number of microcrystals, each of which is highly ordered; that the imperfections,
which may arise at the boundaries of two adjoining microcrystals or from the
admixture of contaminating substances, are negligible; that, for the purpose of
the experiment, the description in terms of ions and electrons is exhaustive (for
example, that no spontaneous generation of particles occurs, as may happen at
high energies).
Picturing a solid as a lattice consisting of ions and electrons is, of course, a
rather rudimentary model, as it does not yet tell us anything (except perhaps by
analogies we may draw with macroscopic mechanical lattices) about the causal
and dynamic features of the system. For this, the acceptance of a physical theory
is required – or, in the absence of a theoretical account of the full system, the
construction of a mathematical many-body model. (Often a physical theory – to
the extent that it is accessible by researchers – will include general principles
that constrain, but underdetermine, the specifics of a given system.) The earliest
many-body model of the kind to be discussed in this paper was the Ising model,
proposed in 1925 by the German physicist Ernst Ising at the suggestion of
his then supervisor Wilhelm Lenz. It was published under the modest title
‘A Contribution to the Theory of Ferromagnetism’ and its conclusions were
negative throughout. According to the summary published in that year’s volume
of Science Abstracts, the model is
an attempt to modify Weiss’ theory of ferromagnetism by consideration of the thermal behavior of a linear distribution of elementary magnets which (in opposition to Weiss) have no molecular field
but only a non-magnetic action between neighboring elements. It is
shown that such a model possesses no ferromagnetic properties, a
conclusion extending to a three-dimensional field.[3]
Ising’s paper initially did not generate much interest among physicists, as perhaps one would expect of a model that self-confessedly fails to describe the
phenomenon for which it was conceived. It was not until the late 1930s that
the Ising model was recognized as displaying a highly complex mathematical behaviour, which, as one contemporary physicist puts it, ‘continues to provide us
with new insights’ (9, p. 47).[4]
As a model of ferromagnetic systems the Ising model pursues the idea that
a magnet can be thought of as a collection of elementary magnets, whose orientation determines the overall magnetization of the system. If all the elementary
magnets are aligned along the same axis, then the system will be perfectly ordered and will display a maximum value of the magnetization. In the simplest
one-dimensional case, such a state can be visualized as a chain of ‘elementary
magnets’, all pointing the same way:
· · · ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ · · ·
The alignment of elementary magnets can be brought about either by a strong
enough external magnetic field or it can occur spontaneously, as will happen
below a critical temperature, when certain substances (such as iron and nickel)
undergo a ferromagnetic phase transition. The parameter that characterizes a
phase transition, in this case the magnetization M, is also known as the order
parameter of the transition. The guiding principle behind the theory of phase
transitions is that discontinuities in certain thermodynamic quantities can occur
spontaneously as a result of the system’s minimizing other such quantities in
order to reach an equilibrium state. Hence, if the interaction between individual
elementary magnets, i, j, characterized by a constant Jij , is such that it favours
the parallel alignment of elementary magnets, then one can hope to expect
a phase transition below a certain temperature. The energy function of the
system as a whole will, therefore, play an important role for the dynamics of
the model, and indeed, in the language of mathematical physics, this is what
constitutes the many-body model. In the language of ‘mathematical moulding’,
the energy function will be the core element of the many-body model. In the
case of the Ising model, this function can be simply expressed as the sum over
all interactions of one elementary magnet with all the others (the variable Si
represents the elementary magnet at lattice site i and takes the values +1 or −1,
depending on the direction in which the elementary magnet points; the minus
sign is merely a matter of convention):
$$E = -\sum_{i,j} J_{ij}\, S_i S_j$$
[3] Quoted in (20, p. 104).
[4] The domain of application has broadened further in recent years; the Ising model is now also used to model networks, spin glasses, population distribution etc.; see, for example, refs. (24), (31), (10).
If one restricts the interaction to nearest neighbours only and assumes that
Ji,i±1 > 0, then it is obvious that the energy will be minimized when all the
elementary magnets point in the same direction, that is when Si Si+1 = +1
for all i.
As Ising himself acknowledged, the one-dimensional model fails to predict a
spontaneous magnetization, where the latter can simply be defined as the sum
over the orientations (Si = ±1) of all elementary magnets, in the absence of an
external field, divided by their total number:
$$M = \frac{1}{N} \sum_i S_i \, .$$
The reason for the absence of a spontaneous magnetization in the case of the
Ising ‘chain’ essentially lies in the instability, at finite temperatures (T ≠ 0),
of a presumed ordered state against fluctuations.[5] In the truly one-dimensional
case, the chain is infinitely extended (N → ∞), and the contribution of an
individual elementary magnet to the total system is of only infinitesimal significance. However, one need only introduce one defect – that is, one pair of
antiparallel (rather than parallel) elementary magnets – in order to eliminate
the assumed magnetization, as the orientations of the elementary magnets on
either side of the ‘fault line’ will cancel out (see figure below). Given that even
the least ‘costly’ (in terms of energy) fluctuation will destroy the magnetization,
the presumed ordered state cannot obtain.
· · · ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↓ ↓ ↓ ↓ ↓ ↓ ↓ · · ·
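The arithmetic behind this argument can be made concrete with a minimal sketch (my own illustration, not part of the original discussion; the parameter values are arbitrary), which computes E and M for a perfectly ordered chain and for a chain with a single ‘fault line’:

```python
# A minimal sketch (not from the paper): energy and magnetization of a
# finite 1D Ising chain with nearest-neighbour coupling J > 0 and open
# boundary conditions. Parameter values are illustrative.
J, N = 1.0, 10

def energy(spins):
    # E = -J * sum_i S_i * S_{i+1}
    return -J * sum(s1 * s2 for s1, s2 in zip(spins, spins[1:]))

def magnetization(spins):
    # M = (1/N) * sum_i S_i
    return sum(spins) / len(spins)

ordered = [+1] * N                             # ... up up up up ...
defect = [+1] * (N // 2) + [-1] * (N // 2)     # one 'fault line' in the middle

print(energy(ordered), magnetization(ordered))   # -9.0 1.0
print(energy(defect), magnetization(defect))     # -7.0 0.0
```

The single defect raises the energy by only 2J – a cost independent of the system size N – while driving the magnetization from 1 to 0, which is precisely the instability described above.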
Whereas Ising’s proof of the non-occurrence of a phase-transition in one
dimension has stood up to scrutiny, Ising’s conjecture that the same holds
also for the two- and three-dimensional case has since been proven wrong. In
1935, Rudolf Peierls demonstrated that the two-dimensional Ising model exhibits spontaneous magnetization below a critical temperature Tc > 0. This
marked a turning-point in the ‘career’ of the Ising model as an object of serious
study. In what has been described as ‘a remarkable feat of discrete mathematics’ (20, p. 106), Lars Onsager was able to produce an exact solution, at all
temperatures, of the two-dimensional version of the Ising model (32). His results
concerned not only the existence, or absence, in general of a phase transition,
but they also delivered a precise value of the critical temperature (at least for
the square lattice) and gave a rigorous account of the behavior of other quantities, such as the specific heat. (See (5) and (29) for a more detailed study of
the history of the Ising model.)
In summary, the lessons of this brief history of many-body models are as
follows. First, it is worth reminding oneself that as a model of ferromagnetism,
the Ising model was initially considered a failure. At the time Ising proposed his
model in 1925, he recognized that its failure lay in not predicting a spontaneous
[5] The zero-temperature case (T = 0) is of special significance in a variety of many-body models; however, in order to keep the presentation accessible, T ≠ 0 will be assumed throughout the following discussion.
magnetization in one dimension (and, as Ising wrongly conjectured, also in three
dimensions). By the time it was recognized, by Peierls and Onsager, that the
model could explain the occurrence of a phase transition in two (and possibly
three) dimensions, however, the theory of ferromagnetism had moved on. For
one, Werner Heisenberg, in a paper in 1928, had proposed a quantum theoretical model, essentially by replacing the number-valued variable Si in the Ising
model by operator-valued vectors Ŝi . At first glance, this formal change may
seem minor, but it indicates a radical departure from the classical assumptions
that Ising’s model was based on. Where Ising had to postulate the existence of
‘elementary magnets’, Heisenberg was able to give a physical interpretation in
terms of the newly discovered spin of atoms and electrons. The departure from
classical assumptions also manifests itself mathematically in the use of spin operators, together with their commutation relations (which have no equivalent in
classical physics), which fundamentally changes the algebraic properties of the
mathematical core of the model. The novelty of quantum theory, and of Heisenberg’s model, however, is only one reason why, despite Peierls and Onsager’s
seeming vindication, the Ising model did not gain a foothold as a good model of
ferromagnetism. For, as Bohr (1911) and van Leeuwen (1919), independently of
each other, had rigorously shown in their doctoral dissertations, a purely classical system that respects the (classical) laws of electrodynamics could never
display spontaneous magnetization (though, of course, it may develop a non-zero
magnetization in an external field). Hence, the explanatory power of the Ising
model as a model of spontaneous ferromagnetism was doubly compromised: It
could not offer an explanation of why there should be ‘elementary magnets’ in
the first place, and it purported to model, using the conceptual repertoire of
classical physics, a phenomenon that could be shown to be incompatible with
classical physics.[6]
One might question whether at any point in time the Ising model could have
been a good model of ferromagnetism. Had Onsager’s solution already been
published by Ising, could Heisenberg, in his 1928 paper, still have dismissed
Ising’s model as ‘not sufficient to explain ferromagnetism’ (19)? Hardly, one
might argue. But as things stand, this is not what happened. Models are
employed in fairly specific contexts, and in the case of mathematical models in
particular, the uses to which they are put determine their empirical content.
As Bailer-Jones argues, it is ‘[t]he model users’ activity of intending, choosing
and deciding [that] accounts for the fact that models, as they are formulated,
submit to more than sheer data match’ (1, p. 71). Applying this idea to the
Ising model with its varied history, one could perhaps argue that even a model
that was initially considered a failure may experience a comeback later, when
it is used to model other phenomena or is considered as a testing ground for
new theoretical techniques or mathematical concepts – only, of course, that
this Ising model, now conceived of as an instrument for generating rigorous
results and exact solutions for their own sake, would no longer be a model
[6] As Martin Niss notes, during the first decade of the study of the Lenz-Ising model ‘[c]omparisons to experimental results were almost absent’ and especially its initial ‘development was not driven by discrepancies between the model and experiments’ (29, pp. 311-312).
of ferromagnetism.
4 Constructing quantum Hamiltonians
Because in the Heisenberg model the hypothetical ‘elementary magnets’ of the
Ising model are replaced by quantum spins and the nature of ‘spin’ as a nonclassical internal degree of freedom is accepted (by fully embracing the algebraic
peculiarities of spin operators), this model is a much better candidate for mimicking spontaneous magnetization. Nonetheless, it still represents the spins as
rigidly associated with nodes of a lattice in real (geometrical) space. This is
plausible for magnetic insulators but not for substances such as iron and nickel
where the electrons are mobile and can, for example, sustain an electric current.
However, one can define a concept of pseudo-spins, which retains the idea that
spins can interact directly, even when it is clear that in a metal with delocalized
electrons all spin-spin interaction must eventually be mediated by the entities
that are in fact the spin-carriers – that is, by electrons. ‘Pseudo-spins’ were first
constructed mathematically via the so-called electron number operators. This
mathematical mapping of kinds of operators onto each other, however, does not
yet result in a self-contained, let alone intuitive, many-body model for systems
with delocalized electrons. This is precisely the situation a model builder finds
herself in when she sets out to construct a model for a specific phenomenon,
or class of phenomena, such as the occurrence of spontaneous magnetization in
a number of physical materials. Since ‘fundamental theory’ allows for almost
limitless possible scenarios, the challenge lies in constructing a model that is
interpretable by the standards of the target phenomenon. Questions of empirical accuracy – which, after all, cannot be settled in advance – are secondary
during the phase of model construction. If a model is indeed an instrument
of inquiry and, as some claim, is ‘inherently intended for specific phenomena’
(35, p. 75), then at the very least there must be a way of interpreting some
(presumably the most salient) elements of the model as representing a feature
of the phenomenon or system under consideration. This demand has direct
implications for how models are being constructed. If models indeed aim at
representing their respective target system, then success in constructing models will be judged by their power to represent. Thus, among proponents of the
models-as-mediators view, it is a widely held view that ‘[t]he proof or legitimacy
of the representation [by a model] arises as a result of the model’s performance
in experimental, engineering and other kinds of interventionist contexts’ (25,
p. 81). One would expect, then, that the process of model construction should
primarily be driven by a concern for whether or not its products – the models –
are empirically successful.
By contrast, I want to suggest that the case of many-body models is a
paradigmatic example of a process of model construction that neither regards
models as mere limiting cases of ‘fundamental theory’ nor appeals to empirical
success as a guide to (or, indeed, the goal of) model construction. Instead,
it involves the interplay of two rather different strategies, which I shall refer
to as the ‘first-principles’ (or ‘ab initio’) approach, on the one hand, and the
‘formalism-driven’ approach on the other. Whereas the expression ‘formalism-driven’ is my coinage, the first pair of expressions – ‘first principles’ and ‘ab
initio’ – reflects standard usage in theoretical condensed matter physics, where
it is used in contradistinction to so-called ‘phenomenological’ approaches which
aim to develop models by interpolating between specific empirical observations:
The first principles approach to condensed matter theory is entirely
different from this. It starts from what we know about all condensed
matter systems – that they are made of atoms, which in turn are
made of a positively charged nucleus, and a number of negatively
charged electrons. The interactions between atoms, such as chemical
and molecular bonding, are determined by the interactions of their
constituent electrons and nuclei. All of the physics of condensed
matter systems arises ultimately from these basic interactions. If
we can model these interactions accurately, then all of the complex
physical phenomena that arise from them should emerge naturally
in our calculations. (15)
A clear, but overambitious, example of a first-principles approach would be the
attempt to calculate the full set of ∼10²³ coupled Schrödinger equations, one
for each of the ∼10²³ nodes in the crystal lattice. For obvious reasons, solving such a complex system of equations is not a feasible undertaking – indeed,
it would merely restate the problem in the terms of fundamental theory, the
complexity of which prompted the introduction of (reduced) models in the first
place. But less ambitious, and hence more tractable, first-principles approaches
exist. Thus, instead of taking the full system – the extended solid-state crystal
– as one’s starting point, one may instead begin from the smallest ‘building
block’ of the extended crystal, by considering the minimal theory of two atoms
that are gradually moved together to form a pair of neighbouring atoms in
the crystal. One can think of this way of model construction as involving a
thought experiment regarding how a many-body system ‘condenses’ from a collection of isolated particles. Such an approach, although it does not start from
the ‘full’ theory of all ∼10²³ particles, remains firmly rooted in ‘first principles’, in that the thought experiment involving the two ‘neighbouring’ atoms
approaching one another is being calculated using the full theoretical apparatus
(in this case, the theoretical framework of non-relativistic quantum mechanics).[7]
This is the ‘derivation’ of many-body models that is usually given in textbooks
of many-body theory (e.g. ref. (30)), often with some degree of pedagogical
hindsight. However, while such a derivation makes vivid which kinds of effects – e.g., single-particle kinetic energy, particle-particle Coulomb repulsion,
and genuine quantum exchange interactions between correlated particles – may
be expected to become relevant, it typically remains incomplete as a model of
[7] Needless to say, considerable background assumptions are necessary in order to identify which unit is indeed the smallest one that still captures the basic mechanisms that determine the behaviour of the extended system.
the extended many-body system: what is being considered is only the smallest ‘building block’, and a further constructive move is required to generate a
many-body model of the full crystal.
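For concreteness, the ‘full’ first-principles description alluded to above is the standard non-relativistic Hamiltonian for interacting electrons and nuclei – reproduced here in its textbook form (Gaussian units) for illustration, not quoted from any of the works cited:

```latex
% Electrons: coordinates r_i, mass m; nuclei: coordinates R_I, mass M_I,
% charge Z_I e. Kinetic terms for both species, plus all pairwise Coulomb
% interactions between and among electrons and nuclei.
\hat{H} = -\sum_{i}\frac{\hbar^{2}}{2m}\nabla_{i}^{2}
          -\sum_{I}\frac{\hbar^{2}}{2M_{I}}\nabla_{I}^{2}
          +\frac{1}{2}\sum_{i\neq j}\frac{e^{2}}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}
          -\sum_{i,I}\frac{Z_{I}e^{2}}{|\mathbf{r}_{i}-\mathbf{R}_{I}|}
          +\frac{1}{2}\sum_{I\neq J}\frac{Z_{I}Z_{J}e^{2}}{|\mathbf{R}_{I}-\mathbf{R}_{J}|}
```

Solving the corresponding Schrödinger equation for ∼10²³ interacting coordinates is precisely the intractable task that motivates the reduced models discussed below.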
This is where the second kind of procedure in model construction – what I
shall call the formalism-driven approach – needs to be highlighted. This approach, in my view, is far more ubiquitous than is commonly acknowledged,
and it sheds light on the interplay between mathematical formalism and rigor
on the one hand, and the interpretation of models and the assessment of their
validity on the other hand. In particular, it also reinforces the observation that
many-body models enjoy a considerable degree of independence from specific
experimental (or other interventionist) contexts, and even from quantitative
standards of accuracy. On the account I am proposing, a ‘mature mathematical formalism’ is ‘a system of rules and conventions that deploys (and often
adds to) the symbolic language of mathematics; it typically encompasses locally applicable rules for the manipulation of its notation, where these rules are
derived from, or otherwise systematically connected to, certain theoretical or
methodological commitments’ (13, p. 272). In order to understand how the
formalism-driven strategy in model construction works, let us return to the case
under consideration, namely ferromagnetic systems with itinerant electrons.
How is one to model the itinerant nature of conduction electrons in such
metals as cobalt, nickel and iron? The formalism of so-called creation and
annihilation operators, â†i,σ and âi,σ, allows one to describe the dynamics of
electrons in a crystal. Since electrons cannot simply be annihilated completely
or created ex nihilo (at least not by the mechanisms that govern the dynamics
in a solid at room temperature), an annihilation operator acting at one lattice
site must always be matched by a creation operator acting at another lattice
site. But this is precisely what describes itinerant behaviour of electrons in the
first place. Hence, the formalism of second quantization, in conjunction with the
basic assumption of preservation of particle number, already suggests how to
model the kinetic behaviour of itinerant electrons, namely through the following
contribution to the Hamiltonian:
$$\hat{H}_{\text{kin}} = \sum_{ij\sigma} T_{ij}\, \hat{a}^{\dagger}_{i\sigma} \hat{a}_{j\sigma}$$
When the operator product â†i,σ âj,σ acts on a quantum state, it first[8] annihilates an electron of spin σ at lattice site j (provided such an electron happens
to be associated with that lattice site) and then creates an electron of spin σ at
another site i. Because electrons are indistinguishable it appears, from within
the formalism, as if an electron of spin σ had simply moved from j to i. The
parameters Tij , which determine the probability with which such electron ‘hopping’ from one place to another occurs, are known as hopping integrals. In
cobalt, nickel and iron, the electrons are still comparatively tightly bound to
[8] Operators need to be read from right to left; hence, if an operator product acts on a quantum state, â†i,σ âj,σ |Ψ⟩, the operator âj,σ directly in front of |Ψ⟩ acts first, then â†i,σ. Because operators do not always commute, the order of operation is important.
their associated ions; hence, hopping to distant lattice sites will be rare. This
is incorporated into the model by assuming that hopping occurs only between nearest
neighbours.
Hopping is not the only phenomenon that a model for itinerant electrons
should reflect. The Coulomb force between electrons – that is, the fact that
two negatively charged entities will experience electrostatic repulsion – also needs
to be taken into consideration. Again, the formalism of second quantization
suggests a straightforward way of accounting for the Coulomb contribution to
the Hamiltonian. Since the Coulomb force will be greatest for electrons at the
same lattice site (which, due to the Pauli exclusion principle, then must have
different spins), the dominating term will be
$$\hat{H}_{\text{Coulomb}} = \sum_{i\sigma} \frac{U}{2}\, \hat{n}_{i\sigma} \hat{n}_{i,-\sigma} \, .$$
The sum of these two terms – the hopping term (roughly, representing movement of electrons throughout the lattice) and the Coulomb term (the potential
energy due to electrostatic repulsion) – already constitutes the Hubbard model:
$$\hat{H}_{\text{Hubbard}} = \hat{H}_{\text{kin}} + \hat{H}_{\text{Coulomb}} = \sum_{ij\sigma} T_{ij}\, \hat{a}^{\dagger}_{i\sigma} \hat{a}_{j\sigma} + \sum_{i\sigma} \frac{U}{2}\, \hat{n}_{i\sigma} \hat{n}_{i,-\sigma} \, .$$
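To see how these formal ingredients combine in practice, the following sketch (my own illustration, not drawn from the paper; L, t and U are arbitrary values) builds the Hubbard Hamiltonian for a two-site chain as an explicit matrix, representing the creation and annihilation operators via the standard Jordan-Wigner construction, and diagonalizes it at half filling:

```python
import numpy as np

# A minimal sketch (not from the paper): the Hubbard Hamiltonian for a short
# chain, with fermionic operators represented as matrices via the standard
# Jordan-Wigner construction. L, t, U are illustrative values.
L, t, U = 2, 1.0, 4.0
M = 2 * L                                  # spin-orbitals: (site, spin) pairs

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])                   # Jordan-Wigner string factor
sm = np.array([[0.0, 1.0],
               [0.0, 0.0]])                # annihilates the local fermion

def annihilator(k):
    """a_k on the 2**M-dimensional Fock space: Z x ... x Z x sm x I x ... x I."""
    out = np.array([[1.0]])
    for op in [Z] * k + [sm] + [I2] * (M - k - 1):
        out = np.kron(out, op)
    return out

a = [annihilator(k) for k in range(M)]
idx = lambda i, s: 2 * i + s               # orbital index for site i, spin s

H = np.zeros((2**M, 2**M))
# Kinetic term: T_ij = -t for nearest neighbours, zero otherwise.
for i in range(L - 1):
    for s in (0, 1):
        hop = a[idx(i, s)].T @ a[idx(i + 1, s)]
        H += -t * (hop + hop.T)
# Coulomb term: (U/2) sum over i, sigma of n_{i,sigma} n_{i,-sigma},
# which equals U * n_up * n_dn on each site.
for i in range(L):
    n_up = a[idx(i, 0)].T @ a[idx(i, 0)]
    n_dn = a[idx(i, 1)].T @ a[idx(i, 1)]
    H += U * (n_up @ n_dn)

# Ground-state energy in the half-filled sector (L electrons):
filled = [b for b in range(2**M) if bin(b).count("1") == L]
E0 = np.linalg.eigvalsh(H[np.ix_(filled, filled)]).min()
print(E0)
```

For two sites, the printed value can be checked against the exact half-filled ground-state energy (U − √(U² + 16t²))/2 ≈ −0.828 for these parameter values – a small instance of the kind of ‘rigorous result’ discussed in Section 5.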
Note that, unlike the first-principles approach, the formalism-based approach
to model construction does not begin with a description of the physical situation in terms of fundamental theory, either in the form of the ‘full’ set of ∼ 1023
coupled Schrödinger equations, or via the thought experiment of neighbouring
atoms gradually approaching each other so as to form the elementary ‘building
block’ of an extended crystal lattice. Instead, it models the presumed microscopic processes (such as hopping and Coulomb interaction) separately, adding
up the resulting components and, in doing so, constructing a many-body model
‘from scratch’, as it were, without any implied suggestion that the Hamiltonian
so derived is the result of approximating the full situation as described by fundamental theory. Interestingly, Nancy Cartwright argues against what she calls
‘a mistaken reification of the separate terms which compose the Hamiltonians
we use in modelling real systems’ (8, p. 261). Although Cartwright grants
that, on occasion, such terms ‘represent separately what it might be reasonable to think of as distinct physical mechanisms’, she insists that ‘the break
into separable pieces is purely conceptual’ (ibid.) and that what is needed
are ‘independent ways of identifying the representation as correct’ (8, p. 262).
Cartwright’s critique of formalism-based model construction must be understood against the backdrop of her emphasis on phenomenological approaches,
which she regards as the only way ‘to link the models to the world’ (ibid.). To
be sure, the formalism-driven approach often proceeds in disregard of specific
empirical phenomena and in this respect might be considered as remote from
Cartwright’s preferred level of description – the world of physical phenomena
– as the more ‘first-principles’-based approaches. But it would be hasty to reject the formalism-driven approach for this reason alone, just as it would be
hasty to consider it simply an extension of ‘fundamental theory’. It is certainly true that the formalism-driven approach is not theory-free. But much
of the fundamental theory is hidden in the formalism – the formalism, I have
argued elsewhere, may be said to ‘enshrine’ various theoretical, ontological, and
methodological commitments and assumptions. (See (13).) Consider, for example, how the construction of the kinetic part of the model proceeded from
purely heuristic considerations of how itinerant motion in a discrete lattice could
be intuitively pictured in terms of the annihilation of an electron at one place
and its subsequent creation at another place in the lattice. The hopping integrals Tij were even introduced as mere parameters, when, on the first-principles
approach, they ought to be interpreted as matrix elements, which contain the
bulk of what quantum theory can tell us about the probability amplitude of
such events. The Coulomb term, finally, was constructed almost entirely by
analogy with the classical case, except for the reference to the Pauli principle.
(Then again, the Pauli principle itself is what makes the formalism of second
quantization and of creation/annihilation operators work in the first place – a
fact that the present formalism-driven derivation did not for a moment have to
reflect upon.[9]) Rather than thinking of the formalism-based approach as drawing a veil over the world of physical phenomena, shrouding them in a cocoon of
symbolic systems, one should think of formalisms such as the many-body operators discussed above as playing an enabling role: not only do they allow the
model builder to represent selected aspects of complex systems, but in addition
one finds that ‘in many cases, it is because Hamiltonian parts can be interpreted
literally, drawing on the resources furnished by fundamental theory as well as
by (interpreted) domain-specific mathematical formalisms, that they generate
understanding’ (14, p. 264; see also Section 5.3 of this paper). While the
formalism-based approach is not unique in its ability to model selected aspects
of complex systems (in particular, different co-existing ‘elementary’ processes),
it does so with an especially high degree of economy, thereby allowing the well-versed user of a many-body model to develop a ‘feel’ for the model and to probe
its properties with little explicit theoretical mediation.
5 Many-body models as mediators and contributors
Earlier, I argued that mathematical models should be sensitive to the phenomena they are intended to model. As argued in the previous section (Section
4), the existence of a mature formalism – such as second quantization with its
rules for employing creation and annihilation operators – can guarantee certain
[9] For a discussion of the formalism of creation and annihilation operators as a ‘mature mathematical formalism’, see (13, pp. 281-282).
kinds of sensitivity, for example the conformity of many-body models to certain basic theoretical commitments (such as the Pauli principle). At the same
time, however, the formalism frees the model from some empirical constraints:
By treating the hopping integrals as parameters that can be chosen largely arbitrarily (except perhaps for certain symmetry requirements), it prevents the
relationship of sensitivity between model and empirical data from turning into
a relationship of subjugation of the model by the data.
On a standard interpretation, applying models to specific physical systems
is a two-step process. First, a ‘reduced’ mathematical model is derived from
fundamental theory (a simplistic view that has already been criticized earlier);
second, approximative techniques of numerical and analytical evaluation must
be employed to calculate physical observables from the model, again at the expense of the mathematical complexity of the (still not exactly solvable) Hamiltonian. This way of speaking of two successive steps of approximation, however,
puts undue emphasis on the loss of accuracy involved in the process. For, it
is not clear how lamentable this ‘loss’ really is, given the unavailability of an
exact solution to the full problem. Crucially, such a view also overlooks that
the model itself contributes new elements to the theoretical description of the
physical system, or class of systems, under consideration – elements, which are
not themselves part of the fundamental theory (or, as it were, cannot be ‘read
off’ from it) but which may take on an interpretative or otherwise explanatorily
valuable role.
Contributions of this sort, originating from the model rather than from either
fundamental theory or empirical data, do, however, considerably inform the
way physicists think about a class of systems and frequently suggest new lines
of research. Consider the limitation to two sets of parameters, U and {Tij}, in the
case of the Hubbard model. Assuming a cubic lattice and nearest-neighbour
interaction, the interaction Tij between different lattice sites will either be zero
or have the same fixed value t. Hence, the quotient U/t reflects the relative
strength of the interaction between electrons (as compared with their individual
kinetic movement), and within the model it is a unique and exact measure of
this important aspect of the dynamics of electron behaviour in a solid. The
individual quantities U and t, thus, are seen to be no mere parameters, but
are linked, through the model, in a meaningful way, which imposes constraints
on which precise values are, or aren’t, plausible. Not only does this restrict
the freedom one enjoys in arbitrarily choosing U and t to fit the model to the
empirical data, but it also imposes constraints on structural modifications of
the model. For example, an attempt to make the model more accurate by
adding new (higher-order) terms to the model (perhaps accounting for higher-order interactions of strengths V, W, X, Y, Z < U, t) may be counterproductive,
as it may be more useful, for explanatory purposes, to have one measure of
the relative strength of the electron-electron interaction (namely, U/t) rather
than a whole set {U/t, V/t, W/t, ...}. To the extent that the model’s purpose is
explanatory and not merely predictive, a gain in numerical accuracy may not
be desirable if it requires replacing an intuitively meaningful quantity with a
set of parameters that lack a straightforward interpretation. Fitting a model to
the data does not by itself make the model any more convincing.
The ‘active’ contribution of the model, that is, its contributing new elements
rather than merely integrating theoretical and experimental (as well as further,
external) elements, is not only relevant to interpretative issues, but also has
direct consequences for assessing the techniques used to evaluate the model and
to calculate, either numerically or analytically, observable quantities from it.
5.1 Rigorous results and relations
One particularly salient class of novel contributions that many-body models
make to the process of inquiry in condensed matter physics are known as rigorous results. The expression ‘rigorous results’, which is not without its problems,
has become a standing expression in theoretical physics, especially among practitioners of statistical and many-body physics. (See for example (3).) It therefore
calls for some clarification. What makes a result ‘rigorous’ is not the qualitative or numerical accuracy of a particular prediction of the theory or model. In
fact, the kind of ‘result’ in question will often have no immediate connection
with the empirical phenomenon (or class of phenomena) a model or theory is
supposed to explain. Rather, it concerns an exact mathematical relationship
between certain mathematical variables, or certain structural components, of
the mathematical model, which may or may not reflect an empirical feature of
the system that is being modelled. One, perhaps crude, way of thinking about
rigorous results would be to regard them as mathematical theorems that are
provable from within the model or theory under consideration.[10] Much like
Pythagoras’ theorem, a² + b² = c², is not merely true of a particular set of
parameters, e.g. {a, b, c} = {3, 4, 5}, but holds for all right-angled triangles, so
a rigorous result in the context of a mathematical model holds for a whole class
of cases rather than for particular parameter values. Yet, importantly, rigorous
results are true only of a model (or a class of models) as defined by a specific
Hamiltonian; unlike, say, certain symmetry or conservation principles, they do
not follow directly from fundamental theory.
An important use of rigorous results and relations is as ‘benchmarks’ for the
numerical and analytical techniques of calculating observable quantities from
the model.[11] After all, an evaluative technique that claims to be true to the
model should preserve its main features, and rigorous results often take the form
either of exact relations holding between two or more quantities, or of lower and
upper bounds to certain observables. If, for example, the order parameter in
question is the magnetization, then rigorous results – within a given model –
may obtain, dictating the maximum (or minimum) value of the magnetization or
the magnetic susceptibility. These may then be compared with results derived
[10] The notions of ‘theorem’ and ‘rigorous result’ are frequently used interchangeably in scientific texts, especially in theoretical works such as (18).
[11] This is noted in passing, though not elaborated on, by R.I.G. Hughes in his case study of one of the first computer simulations of the Ising model: ‘In this way the verisimilitude of the simulation could be checked by comparing the performance of the machine against the exactly known behaviour of the Ising model.’ (20, p. 123)
numerically or by other approximative methods of evaluation.
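As a simple illustration of this benchmarking role (my example, not one drawn from the literature cited above): Onsager’s exact critical temperature for the two-dimensional square-lattice Ising model provides a standard against which an approximate evaluation scheme, such as a mean-field treatment, can be measured:

```python
import numpy as np

# Onsager's exact (rigorous) critical temperature for the 2D square-lattice
# Ising model with nearest-neighbour coupling J, in units where k_B = 1:
J = 1.0
Tc_exact = 2 * J / np.log(1 + np.sqrt(2))   # about 2.269 J

# A mean-field evaluation of the same model predicts Tc = z*J, with
# coordination number z = 4 on the square lattice.
Tc_mean_field = 4 * J

print(f"Onsager (exact): {Tc_exact:.3f}")
print(f"mean-field:      {Tc_mean_field:.3f}")
# The roughly 76% overestimate quantifies how far the mean-field evaluation
# strays from the rigorous benchmark.
```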
5.2 Cross-model support
Rigorous results may also connect different models in unexpected ways, thereby
allowing for cross-checks between methods that were originally intended for different domains. Such connections can neither be readily deduced from fundamental theory, since the rigorous results do not hold generally but only between
different (groups of) models; nor can they justifiably be inferred from empirical
data, since the physical systems corresponding to the two groups of mathematical many-body models may be radically different. As an example consider
again the Hubbard model. It can be shown rigorously (see, for example, (11))
that, at half filling (that is, when half of the quantum states in the conduction
band are occupied) and in the strong-coupling interaction limit, U/t → ∞, the
Hubbard model can be mapped onto the spin-1/2 antiferromagnetic Heisenberg model (essentially in the form described earlier, with Jij = 4t²/U). Under
the specified conditions, the two models are isomorphic and display the same
mathematical behavior. Of course, the Hubbard model with infinitely strong
electron-electron interaction (U/t → ∞) cannot claim to describe an actual
physical system, where the interaction is necessarily finite, but to the extent
that various mathematical and numerical techniques can nonetheless be applied
in the strong-coupling limit, comparison with the numerically and analytically
more accessible antiferromagnetic Heisenberg model provides a test also for the
adequacy of the Hubbard model.
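The origin of the effective coupling Jij = 4t²/U can be made intuitive by the standard second-order perturbation argument for two neighbouring sites at half filling. The following is a textbook-style sketch, not a reconstruction of the rigorous proofs cited above:

```latex
% Two neighbouring sites, one electron each, in the limit U >> t. Treating the
% hopping as a perturbation, an electron can virtually hop onto its neighbour
% and back; the intermediate, doubly occupied state costs Coulomb energy U.
% For the spin singlet this virtual process lowers the energy in second order,
% whereas for parallel spins the Pauli principle blocks the hop altogether:
\Delta E_{\text{singlet}} = -\frac{4t^{2}}{U}, \qquad
\Delta E_{\text{triplet}} = 0 .
% The low-energy physics is therefore that of an antiferromagnetic exchange
% coupling between the two remaining spin degrees of freedom:
\hat{H}_{\text{eff}} = J\,\hat{\mathbf{S}}_{1}\cdot\hat{\mathbf{S}}_{2}
  + \text{const.}, \qquad J = \frac{4t^{2}}{U} > 0 .
```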
Rigorous relations between different many-body models not only provide fertile ground for testing of mathematical and numerical techniques, and for the
‘exploration’ (in the sense discussed in the next subsection) of models more generally. They can also give rise to a transfer of empirical warrant across models
that were intended to describe very different physical systems. The mapping, in
the strong-coupling limit (U/t → ∞), of the Hubbard model onto the spin-1/2
antiferromagnetic Heisenberg model is one such example. For, the latter – the
antiferromagnetic Heisenberg model – has long been known as an empirically
successful ‘“standard model” for the description of magnetic insulators’ (Gebhard 1997: 75), yet the Hubbard model at low coupling (U/t = 0, indicating
zero electron-electron interaction) reduces to an ideal Fermi electron gas – a
perfect conductor. It has therefore been suggested that, for some finite value
between U/t = 0 and U/t → ∞, the Hubbard model must describe a system
that undergoes a transition from conductor to insulator. Such transitions, for
varying strengths of electron-electron interaction, have indeed been observed in
physical systems and are known as Mott insulators. Thanks to the existence
of a rigorous relation between the two models, initial empirical support for the
Heisenberg model as a model of a magnetic insulator thus translates into support for a new – and originally unintended – representational use of the Hubbard
model, namely as a model of Mott insulators. In other words, ‘empirical warrant first flows from one model to another, in virtue of their standing in an
appropriate mathematically rigorous relation’ (12, p. 516), from which one may
then gain new insights regarding the empirical adequacy of the model.[12] As
this example illustrates, rigorous results neither borrow their authority from
fundamental theory nor do they need to prove their mettle in experimental contexts; instead, they are genuine contributions of the models themselves, and it
is through them that models – at least those of the kind discussed in this paper
– have ‘a life of their own’.
5.3 Model-based understanding
The existence of rigorous results and relations, and of individual cases of cross-model support between many-body models of quite different origins, may perhaps seem too singular. Can any general lessons be inferred from them regarding
the character of many-body models more broadly? I wish to suggest that both
classes of cases sit well with general aspects of many-body models and their
construction, especially when viewed from the angle of the formalism-based approach. By reconceptualizing many-body models as outputs of a mature mathematical formalism – rather than conceiving of them either as approximations
of the ‘full’ (but intractable) theoretical description or as interpolating between
specific empirical phenomena – the formalism-based approach allows for a considerable degree of flexibility and exploration, which in turn generates understanding. For example, one may construct a many-body model (which may even
be formulated in arbitrary spatial dimensions) by imagining a crystal lattice of a
certain geometry, with well-formed (by the lights of the many-body formalism)
mathematical expressions associated with each lattice point, and adding the
latter up to give the desired ‘Hamiltonian’: ‘Whether or not this “Hamiltonian”
is indeed the Hamiltonian of a real physical system, or an approximation of it,
is not a consideration that enters at this stage of model construction.’ (14, p.
262) The phenomenological approach advocated by Cartwright might lament
this as creating an undue degree of detachment from the world of empirical
phenomena, but what is gained in the process is the potential for exploratory
uses of models. As Yi puts it:
One of the major purposes of this ‘exploration’ is to identify what
the true features of the model are; in other words, what the model
can do with and without additional assumptions that are not a part
of the original structure of the model. (37, p. 87)
Such exploration of the intrinsic features of a model ‘helps us shape our physical intuitions about the model’, even before these intuitions become, as Yi
puts it, ‘canonical’ through ‘successful application of the model in explaining a
phenomenon’ (ibid.).
Exploratory uses of models feed directly into model-based understanding, yet
they do so in a way that is orthogonal to the phenomenological approach and
its emphasis on interpolation between observed physical phenomena. As I have argued elsewhere, microscopic many-body models ‘are often deployed in order to
account for poorly understood phenomena (such as specific phase transitions);
a premature focus on empirical success (e.g., the exact value of the transition
temperature) might lead one to add unnecessary detail to a model before one has
developed a sufficient understanding of which microscopic processes influence
the macroscopically observable variable’ (14, p. 264). A similar observation is
made by those who argue for the significance of minimal models. Thus Robert
Batterman argues (quoting a condensed matter theorist, Nigel Goldenfeld):
On this view, what one would like is a good minimal model—a model
‘which most economically caricatures the essential physics’ (Goldenfeld 1992, p. 33). The adding of details with the goal of ‘improving’
the minimal model is self-defeating – such improvement is illusory. (2, p. 22)
The formalism-based approach thus differs from the phenomenological approach
in two important ways. First, it conceives of model construction as a constructive and exploratory process, rather than as one that is driven by tailoring a
model to specific empirical phenomena. This is aptly reflected by Yi in his
account of model-based understanding, which posits two stages:
(1) understanding of the model under consideration, and this involves, among other things, exploring its potential explanatory power
using various mathematical techniques, figuring out various plausible physical mechanisms for it and cultivating our physical intuition about the model; (2) matching the phenomenon with a well-motivated interpretative model of the model. (37, pp. 89-90)
Second, the two approaches differ in the relative weight they accord to empirical
adequacy and model-based understanding as measures of the performance of a
model. In the formalism-based approach, empirical adequacy is thought of as
a ‘bonus’ – in the sense that ‘model-based understanding does not necessarily
presuppose empirical adequacy’ (37, p. 85). Such model-based understanding
need not be restricted to purely internal considerations, such as structural features of the model, but may also extend to general questions about the world,
especially where these take the form of ‘how-possibly’ questions. For example, in the many-body models under discussion, an important driver of model
construction has been the question of how there could possibly arise any magnetic phase transition (given the Bohr-van Leeuwen prohibition on spontaneous
magnetization in classical systems; see Section 3) – regardless of any actual,
empirically observed magnetic systems. By contrast, the phenomenological approach is willing to trade in understanding of the inner workings of a model for
specific empirical success. As Cartwright puts it, ‘[a] Hamiltonian can be admissible under a model – and indeed under a model that gives good predictions
– without being explanatory if the model itself does not purport to pick out
basic explanatory mechanisms’ (8, p. 271).
As an illustration of how the formalism-based approach and the phenomenological approach pull in different directions, consider which contributions to a
many-body model (that is, additive terms in a Hamiltonian) each approach
deems admissible. According to Cartwright, only those terms are admissible
that are based on ‘basic interpretative models’ that have been studied independently and are well-understood, both on theoretical grounds and in other
empirical contexts; these are the textbook examples of the central potential,
scattering, the Coulomb interaction, the harmonic oscillator, and kinetic energy
(8, p. 264). What licenses their use – and, in turn, excludes other (more ‘arbitrary’ or ‘formal’) contributions to the Hamiltonian – is the existence of ‘bridge
principles’ which ‘attach physics concepts to the world’ (8, p. 255). Indeed,
Cartwright goes so far as to assert that quantum theory ‘applies exactly as far
as its interpretative models can stretch’: Only those situations that are captured
adequately by the half-dozen or so textbook examples of interpretative models
‘fall within the scope of the theory’ (8, p. 265). By contrast, the formalism-based approach tells a very different story. As long as one ‘plays by the rules’ of
the formalism – which now enshrines theoretical constraints, without the need
to make them explicit even to the experienced user – any newly constructed
Hamiltonian terms are admissible in principle. And, indeed, in Section 4 we
already encountered a contribution to the Hamiltonian – the hopping term –
which was not inspired by the limited number of stock examples allowed on the
phenomenological approach, but instead resulted from a creative application of
the formalism-based rules for the ‘creation’ and ‘annihilation’ of particles at
distinct lattice sites. By freeing model construction from the overemphasis on
empirical adequacy, the formalism-based approach not only allows for a more
flexible way of modelling specific processes that are thought to contribute to the
overall behaviour of a complex system, but also gives modellers the theoretical tools
to sharpen their understanding of the diverse interactions that together make
up the behaviour of many-body systems.
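For concreteness, the hopping term in question has the standard second-quantized form (conventional notation, not peculiar to any one of the sources cited):

    H_{hop} = \sum_{i \neq j} \sum_{\sigma} T_{ij} \, c^{\dagger}_{i\sigma} c_{j\sigma} ,

where c^{\dagger}_{i\sigma} creates, and c_{j\sigma} annihilates, an electron of spin \sigma at the respective lattice site. By the rules of the formalism, this is a well-formed contribution to a Hamiltonian whether or not any actual material realizes the process it describes.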
6 Between rigor and reality: Appraising many-body models
Traditionally, models have been construed as being located at a definite point
on the ‘theory-world axis’ (28, p. 18). Unless their role was seen as merely
heuristic, models were to be judged by how well they fit with the fundamental
theory and the data, or, more specifically, how well they explain the data by the
standards of the fundamental theory. Ideally, a model should display a tight fit
with both the theory and the empirical data or phenomena. As Tarja Knuuttila
has pointed out, large parts of contemporary philosophy of science continue to
focus on ‘the model-target dyad as a basic unit of analysis concerning models and
their epistemic values’ (23, p. 142). The proposed alternative view of models as
mediators presents a powerful challenge to the traditional picture. It takes due
account of the fact that, certainly from an epistemic point of view, theories can
only ever be partial descriptions of what the world is like. What is called for is
an account of models that imbues them with the kind of autonomy that does not
require close fit with fundamental theory, but nevertheless enables us to explain
and understand physical phenomena where no governing fundamental theory
has been identified. On this view, any account of real processes and phenomena
also depends on factors that are extraneous to the fundamental theory, and
those who deny this are ‘interested in a world that is not our world, not the world of appearances but rather a purer, more orderly world, a world which is
thought to be represented “directly” by the theory’s equations’ (7, p. 189).
The mediator view of models acknowledges from the start that ‘it is because
[models] are made up from a mixture of elements, including those from outside
the original domain of investigation, that they maintain [their] partially independent status’ (28, p. 14). This is what makes them mediators in the first
place:
Because models typically include other elements, and model building proceeds in part independently of theory and data, we construe
models as being outside the theory-world axis. It is this feature
which enables them to mediate effectively between the two. (28, p.
17f.)
Note that this is essentially a claim about the construction of models, their
motivation and etiology. Once a model has been arrived at, however, it is
its empirical success in specific interventionist contexts which is the sole arbiter of its validity. This follows naturally from a central tenet of the mediator
view, namely that models are closer to instruments than to theories and, hence,
warranted by their instrumental success in specific empirical contexts.13 That
models are to be assessed by their specificity to empirically observed phenomena, rather than by, say, theoretical considerations or mathematical properties
intrinsic to the models themselves, appears to be a widely held view among
proponents of the models-as-mediators view. As Mauricio Suárez argues, models ‘are inherently intended for specific phenomena’ (35, p. 75), and Margaret
Morrison writes: ‘The proof or legitimacy of the representation arises as a result of the model’s performance in experimental, engineering and other kinds
of interventionist contexts – nothing more can be said!’ (25, p. 81) It appears
then that, whilst the mediator view of models has ‘liberated’ models from the
grip of theory, by stressing their capacity to integrate disparate elements, it
has retained, or even strengthened, the close link between models and empirical
phenomena.
Yet, on a descriptive level, it is by no means clear that, for example in the
case of the Hubbard model, the main activity of researchers is to assess the
model’s performance in experimental or other kinds of interventionist contexts.
13 As Cartwright argues, it is for this reason that warrant to believe in predictions must be
established case by case on the basis of models. She criticizes the ‘vending-machine view’,
in which ‘[t]he question of transfer of warrant from the evidence to the predictions is a short
one since it collapses to the question of transfer of warrant from the evidence to the theory’.
This, Cartwright writes, ‘is not true to the kind of effort that we know it takes in physics to
get from theories to models that predict what reliably happens’; hence, ‘[w]e are in need of
a much more textured, and I am afraid much more laborious view’ regarding the claims and
predictions of science. (7, p. 185)
A large amount of work, for example, goes into calibrating and balancing different methods of numerical evaluation and mathematical analysis. That is,
the calibration takes place not between model and empirical data, but between
different methods of approximation, irrespective of their empirical accuracy.
Even in cases where ‘quasi-exact’ numerical results are obtainable for physical observables (for example via Quantum Monte Carlo calculations), these will often be compared not to experimental data but to predictions derived by other approximative methods. It is not uncommon to come across
whole papers on, say, the problem of ‘magnetism in the Hubbard model’ that do not contain a single reference to empirical data. (As an example, see (36).)
Rather than adjust the parameters of the model to see whether the empirical behaviour of a specific physical system can be modelled accurately, the parameters
will be held fixed to allow for better comparison of the different approximative
techniques with one another, often singling out one set of results (e.g., those
calculated by Monte Carlo simulations) as authoritative.
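The flavour of such benchmarking can be conveyed by a deliberately simple sketch (a toy example of my own, not taken from the literature): for the two-site Hubbard model at half-filling, exact diagonalization plays the role of the ‘quasi-exact’ benchmark against which a strong-coupling approximation is calibrated, with parameters held fixed and no experimental data in sight:

    import numpy as np

    def hubbard_dimer_ground_energy(t, U):
        # Exact ground-state energy of the two-site Hubbard model at
        # half-filling (S_z = 0 sector), in the basis
        # {|up,dn>, |dn,up>, |updn,0>, |0,updn>}.
        H = np.array([[0.0, 0.0,  -t,  -t],
                      [0.0, 0.0,   t,   t],
                      [ -t,   t,   U, 0.0],
                      [ -t,   t, 0.0,   U]])
        return np.linalg.eigvalsh(H).min()

    t, U = 1.0, 8.0                                # held fixed, not fitted
    benchmark = hubbard_dimer_ground_energy(t, U)  # 'quasi-exact' reference
    estimate = -4 * t**2 / U                       # strong-coupling approximation
    print(benchmark, estimate)                     # ~ -0.47 vs -0.50 at U/t = 8

What is being assessed here is the approximative technique against the benchmark, not the model against the world.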
One might object that a good deal of preliminary testing and cross-checking
of one’s methods of evaluation has to happen before the model predictions can
be compared with empirical data, but that such comparison nonetheless remains the ultimate goal. While there may be some truth to this interpretation, it should be
noted that in many cases this activity of cross-checking and ‘bench-marking’
is what drives research and makes up the better part of it. It appears that at
the very least this calls for an acknowledgment that some of the most heavily researched models typically are not being assessed by their performance in
experimental, engineering and other kinds of interventionist contexts. In part,
this is due to many models’ not being intended for specific phenomena, but for
a range of physical systems. This is true of the Hubbard model, which is studied in connection with an array of quite diverse physical phenomena, including
spontaneous magnetism, electronic properties, high-temperature superconductivity, and metal-insulator transitions, among others. It is particularly obvious in the case of the Ising model, which, even though it has been discredited as an
accurate model of magnetism, continues to be applied to problems ranging from
soft condensed-matter physics to theoretical biology. In some areas of research,
models are not even intended, in the long-term, to reflect, or be ‘customizable’
to, the details of a specific physical system. For example, as R.I.G. Hughes
argues, when it comes to critical phenomena ‘a good model acts as an exemplar
of a universality class, rather than as a faithful representation of any one of its
members’ (20, p. 115).
The reasons why many-body models can take on roles beyond those defined
by performance in empirical and interventionist contexts are identical to those
that explain their capacity to ‘survive’ empirical refutation in a specific context
(as was the case with the Ising model); as I have argued in this paper, they
are twofold. First, models often actively contribute new elements, introducing cohesion and flexibility. One conspicuous class of such contributions, as
discussed in Section 5.1, are the rigorous results and relations that hold for a
variety of many-body models, without being entailed either by the fundamental
theory or the empirical data. It is such rigorous results, I submit, which guide
much of the research by providing important ‘benchmarks’ for the application of
numerical and analytical methods. Rigorous results need not have an obvious
empirical interpretation in order to guide the search for better techniques of evaluation or analysis. This is frequently overlooked in discussions of the role of
many-body models by philosophers of science. Cartwright, for example, writes:
When the Hamiltonians do not piggy-back on the specific concrete
features of the model – that is, when there is no bridge principle that
licenses their application to the situation described in the model
– then their introduction is ad hoc and the power of the derived
prediction to confirm the theory is much reduced. (7, p. 195)
It is certainly true that many-body Hamiltonians that do ‘piggy-back’ on concrete features of the model frequently fare better than more abstract representations – if only because physicists may find the former more ‘intuitive’ and
easier to handle than the latter. But it is questionable whether the absence of
‘specific concrete features’, which would pick out a specific empirical situation,
is enough to render such Hamiltonians ad hoc. For there typically exist additional constraints on the choice of the Hamiltonian, in the form of rigorous results and relations, and these may hold for a quite general class of
models, irrespective of the specific concrete features of a given empirical case.
In particular, the process of ‘bench-marking’ across models on the basis of such
rigorous results and relations is not merely another form of ‘moulding’ a mathematical model to concrete empirical situations; rather, it fulfils a normative
function by generating cross-model cohesion.
The second main reason why the role of many-body models in condensed
matter physics is not exhausted by their empirical success lies in their ability to
provide insight and understanding into the likely microscopic processes underlying
macroscopic phenomena, even in the absence of a fully developed theory. As
discussed in Section 5.3, this is directly related to the exploratory use of many-body models – which in turn is made possible by the formalism-based mode
of model-building, which allows for the ‘piece-meal’ construction of many-body
Hamiltonians. Especially in the case of physical systems that are marked by
complexity and strong correlations among their constituents, what is aimed for is
a model which, in Goldenfeld’s apt formulation, ‘most economically caricatures
the essential physics’ (17, p. 33).
Given my earlier endorsement of the view that models need to be liberated
from the grip of self-proclaimed ‘fundamental theories’, one might worry that
further liberating them from the burden of empirical success leads to an evaporation of whatever warrant models previously had. This is indeed a legitimate
worry, and it is one that is shared by many scientists working on just those models. If all there is to a model is a set of mathematical relations together with a
set of background assumptions, how can we expect the model to tell us anything
about the world? There are several points in reply to this challenge. First, while
it is true that many of the rigorous relations do not easily lend themselves to
an empirical interpretation, there are, of course, still many quantities (such as
the order parameter, temperature etc.) that have a straightforward empirical
meaning. Where the model does make predictions about certain empirically significant observables, these predictions will often be an important (though not
the only) measure of the model’s significance.14 Second, models can mutually
support each other. As the example of the mapping of the strong-coupling Hubbard model at half-filling onto the Heisenberg model showed, rigorous results
and relations can connect different models in unexpected ways. This allows for
some degree of transfer of warrant from one model to the other. Note that
this transfer of warrant does not involve any appeal to fundamental theory, but
takes place ‘horizontally’ at the level of models.15 Third, in many cases a model
can be constructed in several different ways, each of which may bring out the connection with theory and phenomenon differently. The first-principles
derivation of the Hubbard model is one such example. It provides a meaningful
interpretation of the otherwise merely parameter-like quantities Tij (namely, as
matrix elements that describe the probability of the associated hopping processes). While this interpretation requires some appeal to theory, it does not
require an appeal to the full theoretical problem – that is, the full problem
of 10²³ particles each described by its ‘fundamental’ Schrödinger equation. A
similar point can even be made for the formalism-driven approach. There, too,
model construction does not operate in a conceptual vacuum, but makes use of
general procedures, which range from the highly abstract (e.g., the formalism
of second quantization) to the largely intuitive considerations that go into the
selection of elementary processes judged to be relevant.
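The interpretation of the T_{ij} can be made explicit in the standard first-principles expression (assuming localized orbitals \varphi centred at the lattice sites \mathbf{R}_i; this is textbook notation, not a formula reproduced from the text):

    T_{ij} = \int d^3r \, \varphi^*(\mathbf{r} - \mathbf{R}_i) \left[ -\frac{\hbar^2 \nabla^2}{2m} + V(\mathbf{r}) \right] \varphi(\mathbf{r} - \mathbf{R}_j) ,

so that each T_{ij} is a matrix element of the one-particle Hamiltonian between orbitals centred at sites i and j, governing the amplitude of the associated hopping process.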
By recognizing that models can be liberated both from the hegemony of
fundamental theory and from the burden of empirical performance in every
concrete specific case, I believe one can appreciate the role of models in science
in a new light. For one, models are as much contributors as they are mediators in
the process of representing the physical world around us. But more importantly,
they neither merely execute fundamental theory nor accommodate empirical
phenomena. Rather, as the example of many-body models in condensed-matter
physics demonstrates, they are highly structured entities, which are woven into, and give stability to, scientific practice.

14 A model whose predictions of the order parameter are systematically wrong (e.g., consistently too low) but which gets the qualitative behaviour right (e.g., the structure of the phase diagram) may be preferable to a model that is more accurate for most situations but is vastly (qualitatively) mistaken for a small number of cases. Likewise, a model that satisfies certain symmetry requirements or obeys certain other rigorous relations may be preferable to a more accurate model (with respect to the physical observables in question) that lacks these properties.

15 See also Section 5.2 above; for a full case study of cross-model transfer of warrant, see (12).
References
[1] Daniela M. Bailer-Jones: “When scientific models represent”, International
Studies in the Philosophy of Science, 17, 2003, pp. 59-74.
[2] Robert Batterman: “Asymptotics and the role of minimal models”, British Journal for the Philosophy of Science, 53, 2002, pp. 21-38.
[3] Rodney J. Baxter: Exactly Solved Models in Statistical Mechanics, New
York: Academic Press 1982.
[4] Marcel Boumans: “Built-in justification”, in Margaret Morrison and Mary
S. Morgan (eds.): Models as Mediators. Perspectives on Natural and
Social Science, Cambridge: Cambridge UP 1999, pp. 68-96.
[5] Stephen Brush: “History of the Lenz-Ising Model”, Reviews of Modern
Physics, 39, 1967, pp. 883-893.
[6] Nancy Cartwright, Towfic Shomar and Mauricio Suárez: “The tool box of
science”, in: William E. Herfel, Wladyslaw Krajewski, Ilkka Niiniluoto
and Ryszard Wójcicki (eds.): Theories and Models in Scientific Processes (Poznań Studies in the Philosophy of the Sciences and the Humanities, Vol. 44), Amsterdam: Rodopi 1995, pp. 137-149.
[7] Nancy Cartwright: The Dappled World. A Study of the Boundaries of Science, Cambridge: Cambridge UP 1999.
[8] Nancy Cartwright: “Models and the Limits of Theory: Quantum Hamiltonians and the BCS Model of Superconductivity”, in Mary S. Morgan
and Margaret Morrison (eds.): Models as Mediators: Perspectives on
Natural and Social Science, Cambridge: Cambridge UP 1999, pp. 241-281.
[9] Michael E. Fisher: “Scaling, Universality, and Renormalization Group
Theory”, in F.J.W. Hahne (ed.): Critical Phenomena. (Lecture Notes
in Physics, Vol. 186), Berlin: Springer 1983, pp. 1-139.
[10] Serge Galam: “Rational group decision-making: A random-field Ising
model at T=0”, Physica A, 238, 1997, pp. 66-80.
[11] Florian Gebhard: The Mott Metal-Insulator Transition: Models and Methods. (Springer Tracts in Modern Physics, Vol. 137), Berlin: Springer
1997.
[12] Axel Gelfert: “Rigorous results, cross-model justification, and the transfer of empirical warrant: the case of many-body models in physics”,
Synthese, 169, 2009, pp. 497-519.
[13] Axel Gelfert: “Mathematical formalisms in scientific practice: From denotation to model-based representation”, Studies in History and Philosophy of Science, 42, 2011, pp. 272-286.
[14] Axel Gelfert: “Strategies of model-building in condensed matter physics:
trade-offs as a demarcation criterion between physics and biology?”,
Synthese, 190, 2013, pp. 253-272.
[15] Michael C. Gibson: Implementation and Application of Advanced Density Functionals. (PhD Dissertation, University of Durham, 2006.)
[16] Ronald N. Giere: “Using Models to Represent Reality”, in Lorenzo Magnani, Nancy J. Nersessian and Paul Thagard (eds.): Model-Based Reasoning in Scientific Discovery, New York: Plenum Publishers 1999, pp. 41-57.
[17] Nigel Goldenfeld: Lectures on Phase Transitions and the Renormalization
Group (Frontiers in Physics, Vol. 85), Reading, Mass.: Addison-Wesley
1992.
[18] Robert B. Griffiths: “Rigorous results and theorems”, in Cyril Domb and
Melville S. Green (eds.): Phase Transitions and Critical Phenomena,
New York: Academic Press, 1972, pp. 8-109.
[19] Werner Heisenberg: “Theorie des Ferromagnetismus”, Zeitschrift für
Physik, 49, 1928, pp. 619-636.
[20] R. I. G. Hughes: “The Ising model, computer simulation, and universal
physics”, in Margaret Morrison and Mary S. Morgan (eds.): Models
as Mediators. Perspectives on Natural and Social Science, Cambridge:
Cambridge UP 1999, pp. 97-145.
[21] Ernst Ising: “Beitrag zur Theorie des Ferromagnetismus”, Zeitschrift für
Physik, 31, 1925, pp. 253-258.
[22] Martin H. Krieger: “Phenomenological and Many-Body Models in Natural
Science and Social Research”, Fundamenta Scientiae, 2, 1981, pp. 425-431.
[23] Tarja Knuuttila: “Some Consequences of the Pragmatist Approach to Representation: Decoupling the Model-Target Dyad and Indirect Reasoning”, in Mauricio Suárez, Mauro Dorato and Miklos Rédei (eds.): EPSA
Epistemology and Methodology of Science, Dordrecht: Springer 2010,
pp. 139-148.
[24] H. Matsuda: “The Ising Model for Population Biology”, Progress of Theoretical Physics, 66, 1981, pp. 1078-1080.
[25] Margaret C. Morrison: “Modelling Nature: Between Physics and the Physical World”, Philosophia Naturalis, 35, 1998, pp. 65-85.
[26] Margaret Morrison: “Models as autonomous agents”, in Margaret Morrison and Mary S. Morgan (eds.): Models as Mediators. Perspectives on
Natural and Social Science, Cambridge: Cambridge UP 1999, pp. 38-65.
[27] Margaret Morrison and Mary S. Morgan (eds.): Models as Mediators. Perspectives on Natural and Social Science, Cambridge: Cambridge UP
1999.
[28] Margaret Morrison and Mary S. Morgan: “Models as mediating instruments”, in Margaret Morrison and Mary S. Morgan (eds.): Models
as Mediators. Perspectives on Natural and Social Science, Cambridge:
Cambridge UP 1999, pp. 10-37.
[29] Martin Niss: “History of the Lenz-Ising model 1920-1950: from ferromagnetic to cooperative phenomena”, Archive for History of Exact Sciences,
59, 2005, pp. 267-318.
[30] Philippe Nozières: Theory of Interacting Fermi Systems. New York: Benjamin 1963.
[31] Andrew T. Ogielski and Ingo Morgenstern: “Critical behavior of 3-dimensional Ising model of spin glass”, Journal of Applied Physics, 57,
1985, pp. 3382-3385.
[32] Lars Onsager: “Crystal statistics. I. A two-dimensional model with an
order-disorder transition”, Physical Review, 65, 1944, p. 117.
[33] Sam Schweber and Matthias Wächter: “Complex Systems, Modelling and
Simulation”, Studies in History and Philosophy of Modern Physics, 31
No.4, 2000, pp. 583-609.
[34] Mark Steiner: “The Application of Mathematics to Natural Science”, The
Journal of Philosophy, 86 No. 9, 1989, pp. 449-480.
[35] Mauricio Suárez: “Theories, Models, and Representations”, in Lorenzo
Magnani, Nancy J. Nersessian and Paul Thagard (eds.): Model-Based
Reasoning in Scientific Discovery, New York: Plenum Publishers 1999,
pp. 75-83.
[36] Michael A. Tusch, Yolande H. Szczech, and David E. Logan: “Magnetism in
the Hubbard model: An effective spin Hamiltonian approach”, Physical
Review B, 53 No. 9, 1996, pp. 5505-5517.
[37] Sang Wook Yi: “The Nature of Model-Based Understanding in Condensed
Matter Physics”, Mind & Society, 5, 2002, pp. 81-91.