forthcoming in Why More is Different: Philosophical Issues in Condensed
Matter Physics and Complex Systems, ed. B. Falkenburg & M. Morrison,
Heidelberg: Springer 2015
Between Rigor and Reality: Many-Body Models
in Condensed Matter Physics
Axel Gelfert (National University of Singapore)
December 29, 2013
Abstract
The present paper focuses on a particular class of models intended to
describe and explain the physical behaviour of systems that consist of a
large number of interacting particles. Such many-body models are characterized by a specific Hamiltonian (energy operator) and are frequently
employed in condensed matter physics in order to account for such phenomena as magnetism, superconductivity, and other phase transitions.
Because of the dual role of many-body models as models of physical systems (with specific physical phenomena as their explananda) as well as
mathematical structures, they form an important sub-class of scientific
models, from which one can expect to draw general conclusions about the
function and functioning of models in science, as well as to gain specific
insight into the challenge of modelling complex systems of correlated particles in condensed matter physics. In particular, it is argued that many-body models contribute novel elements to the process of inquiry and open
up new avenues of cross-model confirmation and model-based understanding. In contradistinction to phenomenological models, which have received
comparatively more philosophical attention, many-body models typically
gain their strength not from ‘empirical fit’ per se, but from their being
the result of a constructive application of mature formalisms, which frees
them from the grip of both ‘fundamental theory’ and an overly narrow
conception of ‘empirical success’.
1 Introduction
Scientific models are increasingly being recognized as central to the success and
coherence of scientific practice. In the present paper, I focus on a particular class
of models intended to describe and explain the physical behaviour of systems
that consist of a large number of interacting particles. Such many-body models,
usually characterized by a specific Hamiltonian (energy operator), are frequently
employed in condensed matter physics in order to account for phenomena such as
magnetism, superconductivity, and other phase transitions. Because of the dual
role of many-body models as models of physical systems (with specific physical
phenomena as their explananda) as well as mathematical structures, they form
an important sub-class of scientific models, from which one can expect to draw
general conclusions about the function and functioning of models in science, as
well as to gain specific insight into the challenge of modelling complex systems
of correlated particles in condensed matter physics. Throughout the present
paper, equal emphasis is placed on the process of constructing models and on
the various considerations that enter into their evaluation.
The rest of this paper is organized as follows. In Section 2, I place many-body
models in the context of the general philosophical debate about scientific models
(especially the influential ‘models as mediators’ view), paying special attention
to their status as mathematical models. Following this general characterization,
Section 3 then discusses a number of historical examples of many-body models
and the uses to which they have been put in 20th-century physics, not least in
the transition from classical models of interacting particles to a full appreciation of the quantum aspects of condensed-matter phenomena. On the basis of
these historical examples, Section 4 distinguishes between different strategies of
model construction in condensed matter physics. Contrasting many-body models with phenomenological models (which are typically derived from interpolating between specific empirical phenomena), it is argued that the construction of
many-body models may proceed either from theoretical ‘first principles’ (sometimes called the ab initio approach) or may be the result of a more constructive
application of the formalism of many-body operators. This formalism-based
approach, it is argued in Section 5, leads to novel theoretical contributions by
the models themselves (one example of which are so-called ‘rigorous results’;
Section 5.1), which in turn gives rise to cross-model support between models of
different origins (Section 5.2) and opens up room for exploratory uses of models
in the service of fostering model-based understanding (Section 5.3). The paper
concludes with an appraisal of many-body models as a specific way of investigating condensed matter phenomena that steers a middle path ‘between rigor
and reality’.
2 Many-body models as mathematical models
Among the various kinds of models used in condensed matter physics, an important subclass is that of many-body models, which represent a system’s overall behaviour
as the collective result of the interactions between its constituents. The present
section discusses many-body models in general terms, situating them within
the general philosophical debate about scientific models and discussing, more
specifically, their status as mathematical models.
Mathematical models can take different forms and fulfill different purposes.
They may be limiting cases of a more fundamental, analytically intractable theory, for example in the case of modelling planetary orbits as if planets were independent mass-points revolving around an infinitely massive sun. Sometimes,
models connect different theoretical domains, as is the case in hydrodynamics,
where Prandtl’s boundary layer model interpolates between the frictionless ‘classical’ domain and the Navier-Stokes domain of viscous flows (26). Even where
a fundamental theory is lacking, mathematical models may be constructed, for
example by fitting certain dynamical equations to empirically observed causal
regularities (as in population cycles of predator-prey systems in ecology) or by
analyzing statistical correlations (as in models of stock-market behavior). In
the economic and social sciences, identifying the relevant parameters and constructing a mathematical model that connects them may often precede theory-construction. Frequently, what scientists are interested in are qualitative features, such as the stability or instability of certain systems, and these may be
reflected better by a mathematical model than by any available partial evaluation of the underlying theory.
Given this diversity, it would be hopeless to look for shared properties that
all mathematical models have in common. Fortunately, there are other ways
one can approach the problem. First, the characteristics one is most interested
in need not themselves be mathematical properties, but may encompass ‘soft’
factors such as ease of use, elegance, simplicity and other factors pertaining
to the uses to which mathematical models typically are put. Second, it may
be possible to identify a subclass of mathematical models – such as the many-body models to be discussed in this paper – which is sufficiently comprehensive to
allow for generalizations but whose members are not too disparate. Finally, it
will often be possible to glean additional insight from contrasting mathematical
models with other, more general, kinds of models.
On one influential general account, which will prove congenial to the present
paper, models are to be regarded as ‘mediating instruments’. (See ref. (28)) It
is crucial to this view that models are not merely understood as an unavoidable
intermediary step in the application of general theories to specific situations.
Rather, as ‘mediators’ between our theories and the world, models inform the
interpretation of our theories just as much as they allow for the application of
these theories to nature. As Morrison and Morgan are keen to point out, ‘models
are not situated in the middle of an hierarchical structure between theory and
the world’, but operate outside the hierarchical ‘theory-world axis’. (28, p.
17f.) This can be seen by realizing that models ‘are made up from a mixture of
elements, including those from outside the original domain of investigation’ (p.
14); it is this partial independence of original theory and data that is required
in order to allow models to play an autonomous role in scientific enquiry. In
this respect, Margaret Morrison and Mary Morgan argue, scientific models are
much like scientific instruments. Indeed, it is part and parcel of this view that
model building involves an element of creativity and skill – it is ‘not only a craft
but also an art, and thus not susceptible to rules’ (28, p. 12).
A number of case studies have examined specific examples from the natural
and social sciences from within this framework. (A cross-section of these are
collected in ref. (27).) The upshot of many of these studies is that ‘model
construction involves a complex activity of integration’ (26, p. 44). This
integration need not be perfect and, as Daniela Bailer-Jones points out, may
involve ‘a whole range of different means of expression, such as texts, diagrams
or mathematical equations’ (1, p. 60). Quite often, the integration cannot
be perfect, as certain elements of the model may be incompatible with one
another. Even in cases where successful integration of the various elements is
possible, the latter can be of very different sorts – they may differ not only in
terms of their medium of expression (text, diagram, formula) but also in terms
of content: Some may consist in mathematical relations, others may draw on
analogies; some may reflect actual empirical data, others, perhaps in economics,
may embody future target figures (e.g., for inflation).
It is in comparison with this diversity of general aspects of scientific models, I
argue, that several characteristic features of mathematical models can be singled
out. The first of these concerns the medium of expression, which for mathematical models is, naturally, the formal language of mathematics. It would, however,
be misguided to simply regard a model as a set of (uninterpreted) mathematical
equations, theorems and definitions, as this would deprive models of their empirical relevance: A set of equations cannot properly be said to ‘model’ anything,
neither a specific phenomenon nor a class of phenomena, unless some of the variables are interpreted so as to relate them to observable phenomena. One need
not be committed to the view (as Morrison paraphrases Nancy Cartwright’s position on the matter) that ‘fundamental theory represents nothing, [that] there
is simply nothing for it to represent since it doesn’t describe any real world
situations’ (25, p. 69), in order to acknowledge that mathematical models cannot merely be uninterpreted mathematical equations if they are to function as
mediators of any sort; that is, if they are to model a case that, for whatever
reason, cannot be calculated or described in terms of theoretical first principles.
The fact that mathematical models, like other kinds of models, require background assumptions and rules of interpretation, of course, does not rule out that
in each case there may be a core set of mathematical relationships that model
users regard as definitive of the mathematical model in question. Indeed, this
assumption should be congenial to the proposed analysis of models as mediators,
as the mathematical features of a model – where these are not merely ‘inherited’
from a fundamental theory – may provide it with precisely the autonomy and
independence (from theory and data) that the role as mediator requires. This
applies especially to the case of many-body models which, as I shall discuss in
Section 4, are typically the output of what has been called ‘mature mathematical formalisms’ (in this case: the formalism of second quantization, as adapted
to the case of many-body physics).
While it may be true that, as Giere puts it, ‘[m]uch mathematical modeling
proceeds in the absence of general principles to be used in constructing models’
(16, p. 52), there are good terminological reasons to speak of a mathematical
model of a phenomenon (or a class of phenomena) only if the kind of mathematical techniques and concepts employed are in some way sensitive to the
kind of phenomenon in question. For example, while it may be possible, if only
retrospectively, to approximate the stochastic trajectory of a Brownian particle by a highly complex deterministic function, for example a Fourier series of
perfectly periodic functions, this would hardly count as a good mathematical
model: There is something about the phenomenon, namely its stochasticity,
that would not be adequately reflected by a set of deterministic equations; such
a set of equations would quite simply not be a mathematical model of Brownian
motion.[1]
In addition to the requirement that the core mathematical techniques and
concepts be sensitive to the kind of phenomenon that is being modelled, there
is a further condition regarding what should count as a mathematical model.
Loosely speaking, the mathematics of the model should do some work in integrating the elements of the ‘extended’ model, where the term ‘extended’ refers
to the additional information needed to apply a bare mathematical structure to
individual cases. If, for example, a mathematical model employs the calculus
of partial differential equations, then it should also indicate which (classes of)
initial and boundary conditions need to be distinguished; likewise, if a mathematical model depends crucially on certain parameters, it should allow for
systematic methods of varying, or ‘tweaking’, those parameters, so their significance can be studied systematically.[2] This capacity of successful models to
integrate different cases, or different aspects of the same case, has occasionally
been called ‘moulding’ (4, p. 90), (1, p. 62):
Mathematical moulding is shaping the ingredients in such a mathematical form that integration is possible, and contains two dominant
elements. The first element is moulding the ingredient of mathematical formalism in such a way that it allows the other elements
to be integrated. The second element is calibration, the choice of
the parameter values, again for the purpose of integrating all the
ingredients. (4, p. 90)
Successful mathematical models, on this account, display a capacity to integrate
different elements – some theoretical, others empirical – by deploying an adaptable, yet principled formalism that is mathematically characterizable (largely)
independently of the specifics of the theory and data in the case under consideration.
For the remainder of the present paper, I shall therefore be relying on an
understanding of many-body models that recognizes their dual status as models
of physical systems (which, importantly, may include purely hypothetical systems) and as mathematical structures. This is in line with the following helpful
characterization presented by Sang Wook Yi:
What I mean by a model in this paper is a mathematical structure
of three elements: basic entities (such as ‘spins’), the postulated
arrangement of the basic entities (say, ‘spins are located on the lattice point’) and interactions among the basic entities (‘spin-magnetic
[1] There may, of course, be independent reasons why one might represent, say, a specific trajectory by a certain set of deterministic equations, or by a (non-mathematical) pictorial representation. However, in such cases, as well as in contexts where the stochasticity of the causal process is irrelevant, one would not be dealing with a model of Brownian motion, in the proposed narrower sense of ‘mathematical model’.
[2] Systematic ‘tweaking’, as Martin Krieger observes, ‘has turned out to be a remarkably effective procedure’ (22, p. 428). By varying contributions to the model, e.g. by adding disturbances, one can identify patterns in the response of the model, including regions of stability.
field interactions’). As a rough criterion, we may take a model to be
given when we have the Hamiltonian of the model and its implicit
descriptions that can motivate various physical interpretations (interpretative models) of the model. (37, p. 82)
If this sounds too schematic, or too general, then perhaps a look at some historical examples will make vivid how many-body models have been put to use
in condensed matter physics.
3 A brief history of many-body models
In this, and the next section, a class of mathematical models will be discussed
that was first developed in connection with research on the magnetic properties
of solids. The standard way of picturing a solid as a crystal, with the atoms
being arranged in a highly ordered lattice so as to display certain spatial symmetries, and the electrons being possibly delocalized, as in a metal, already
contains a good deal of assumptions that may or may not be realized in a given
physical system. In order to regard this general characterization as a faithful
representation of any real physical object, for example of a lump of metal in
a given experiment, certain background assumptions have to be in place. For
example, it has to be assumed that the piece of metal, which more often than
not will display no crystalline structure to the naked eye, really consists of a
number of microcrystals, each of which is highly ordered; that the imperfections,
which may arise at the boundaries of two adjoining microcrystals or from the
admixture of contaminating substances, are negligible; that, for the purpose of
the experiment, the description in terms of ions and electrons is exhaustive (for
example, that no spontaneous generation of particles occurs, as may happen at
high energies).
Picturing a solid as a lattice consisting of ions and electrons is, of course, a
rather rudimentary model, as it does not yet tell us anything (except perhaps by
analogies we may draw with macroscopic mechanical lattices) about the causal
and dynamic features of the system. For this, the acceptance of a physical theory
is required – or, in the absence of a theoretical account of the full system, the
construction of a mathematical many-body model. (Often a physical theory – to
the extent that it is accessible by researchers – will include general principles
that constrain, but underdetermine, the specifics of a given system.) The earliest
many-body model of the kind to be discussed in this paper was the Ising model,
proposed in 1925 by the German physicist Ernst Ising at the suggestion of
his then supervisor Wilhelm Lenz. It was published under the modest title
‘A Contribution to the Theory of Ferromagnetism’ and its conclusions were
negative throughout. According to the summary published in that year’s volume
of Science Abstracts, the model is
an attempt to modify Weiss’ theory of ferromagnetism by consideration of the thermal behavior of a linear distribution of elementary magnets which (in opposition to Weiss) have no molecular field
but only a non-magnetic action between neighboring elements. It is
shown that such a model possesses no ferromagnetic properties, a
conclusion extending to a three-dimensional field.[3]
Ising’s paper initially did not generate much interest among physicists, as perhaps one would expect of a model that self-confessedly fails to describe the
phenomenon for which it was conceived. It was not until the late 1930s that
the Ising model was recognized as displaying a highly complex mathematical behaviour, which, as one contemporary physicist puts it, ‘continues to provide us
with new insights’ (9, p. 47).[4]
As a model of ferromagnetic systems the Ising model pursues the idea that
a magnet can be thought of as a collection of elementary magnets, whose orientation determines the overall magnetization of the system. If all the elementary
magnets are aligned along the same axis, then the system will be perfectly ordered and will display a maximum value of the magnetization. In the simplest
one-dimensional case, such a state can be visualized as a chain of ‘elementary
magnets’, all pointing the same way:
· · · ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ · · ·
The alignment of elementary magnets can be brought about either by a strong
enough external magnetic field or it can occur spontaneously, as will happen
below a critical temperature, when certain substances (such as iron and nickel)
undergo a ferromagnetic phase transition. The parameter that characterizes a
phase transition, in this case the magnetization M, is also known as the order
parameter of the transition. The guiding principle behind the theory of phase
transitions is that discontinuities in certain thermodynamic quantities can occur
spontaneously as a result of the system’s minimizing other such quantities in
order to reach an equilibrium state. Hence, if the interaction between individual
elementary magnets, i, j, characterized by a constant Jij , is such that it favours
the parallel alignment of elementary magnets, then one can hope to expect
a phase transition below a certain temperature. The energy function of the
system as a whole will, therefore, play an important role for the dynamics of
the model, and indeed, in the language of mathematical physics, this is what
constitutes the many-body model. In the language of ‘mathematical moulding’,
the energy function will be the core element of the many-body model. In the
case of the Ising model, this function can be simply expressed as the sum over
all interactions of one elementary magnet with all the others (the variable Si
represents the elementary magnet at lattice site i and takes the values +1 or −1,
depending on the direction in which the elementary magnet points; the minus
sign is merely a matter of convention):
$$E = -\sum_{i,j} J_{ij}\, S_i S_j$$
[3] Quoted in (20, p. 104).
[4] The domain of application has broadened further in recent years; the Ising model is now also used to model networks, spin glasses, population distribution etc.; see, for example, refs. (24), (31), (10).
If one restricts the interaction to nearest neighbours only and assumes that
Ji,i±1 > 0, then it is obvious that the energy will be minimized when all the
elementary magnets point in the same direction, that is when Si Si+1 = +1
for all i.
As Ising himself acknowledged, the one-dimensional model fails to predict a
spontaneous magnetization, where the latter can simply be defined as the sum
over the orientations (Si = ±1) of all elementary magnets, in the absence of an
external field, divided by their total number:
$$M = \frac{1}{N} \sum_i S_i \, .$$
The reason for the absence of a spontaneous magnetization in the case of the
Ising ‘chain’ essentially lies in the instability, at finite temperatures (T ≠ 0),
of a presumed ordered state against fluctuations.[5] In the truly one-dimensional
case, the chain is infinitely extended (N → ∞), and the contribution of an
individual elementary magnet to the total system is of only infinitesimal significance. However, one need only introduce one defect – that is, one pair of
antiparallel (rather than parallel) elementary magnets – in order to eliminate
the assumed magnetization, as the orientations of the elementary magnets on
either side of the ‘fault line’ will cancel out (see figure below). Given that even
the least ‘costly’ (in terms of energy) fluctuation will destroy the magnetization,
the presumed ordered state cannot obtain.
· · · ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↓ ↓ ↓ ↓ ↓ ↓ ↓ · · ·
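The arithmetic behind this argument can be made concrete with a minimal sketch (my own illustration, not part of the original discussion; the parameter values are arbitrary), which computes E and M for a perfectly ordered chain and for a chain with a single ‘fault line’:

```python
# A minimal sketch (not from the paper): energy and magnetization of a
# finite 1D Ising chain with nearest-neighbour coupling J > 0 and open
# boundary conditions. Parameter values are illustrative.
J, N = 1.0, 10

def energy(spins):
    # E = -J * sum_i S_i * S_{i+1}
    return -J * sum(s1 * s2 for s1, s2 in zip(spins, spins[1:]))

def magnetization(spins):
    # M = (1/N) * sum_i S_i
    return sum(spins) / len(spins)

ordered = [+1] * N                             # ... up up up up ...
defect = [+1] * (N // 2) + [-1] * (N // 2)     # one 'fault line' in the middle

print(energy(ordered), magnetization(ordered))   # -9.0 1.0
print(energy(defect), magnetization(defect))     # -7.0 0.0
```

The single defect raises the energy by only 2J – a cost independent of the system size N – while driving the magnetization from 1 to 0, which is precisely the instability described above.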
Whereas Ising’s proof of the non-occurrence of a phase-transition in one
dimension has stood up to scrutiny, Ising’s conjecture that the same holds
also for the two- and three-dimensional case has since been proven wrong. In
1935, Rudolf Peierls demonstrated that the two-dimensional Ising model exhibits spontaneous magnetization below a critical temperature Tc > 0. This
marked a turning-point in the ‘career’ of the Ising model as an object of serious
study. In what has been described as ‘a remarkable feat of discrete mathematics’ (20, p. 106), Lars Onsager was able to produce an exact solution, at all
temperatures, of the two-dimensional version of the Ising model (32). His results
concerned not only the existence, or absence, in general of a phase transition,
but they also delivered a precise value of the critical temperature (at least for
the square lattice) and gave a rigorous account of the behavior of other quantities, such as the specific heat. (See (5) and (29) for a more detailed study of
the history of the Ising model.)
In summary, the lessons of this brief history of many-body models are as
follows. First, it is worth reminding oneself that as a model of ferromagnetism,
the Ising model was initially considered a failure. At the time Ising proposed his
model in 1925, he recognized that its failure lay in not predicting a spontaneous
[5] The zero-temperature case (T = 0) is of special significance in a variety of many-body models; however, in order to keep the presentation accessible, T ≠ 0 will be assumed throughout the following discussion.
magnetization in one dimension (and, as Ising wrongly conjectured, also in three
dimensions). By the time it was recognized, by Peierls and Onsager, that the
model could explain the occurrence of a phase transition in two (and possibly
three) dimensions, however, the theory of ferromagnetism had moved on. For
one, Werner Heisenberg, in a paper in 1928, had proposed a quantum theoretical model, essentially by replacing the number-valued variable Si in the Ising
model by operator-valued vectors Ŝi . At first glance, this formal change may
seem minor, but it indicates a radical departure from the classical assumptions
that Ising’s model was based on. Where Ising had to postulate the existence of
‘elementary magnets’, Heisenberg was able to give a physical interpretation in
terms of the newly discovered spin of atoms and electrons. The departure from
classical assumptions also manifests itself mathematically in the use of spin operators, together with their commutation relations (which have no equivalent in
classical physics), which fundamentally changes the algebraic properties of the
mathematical core of the model. The novelty of quantum theory, and of Heisenberg’s model, however, is only one reason why, despite Peierls and Onsager’s
seeming vindication, the Ising model did not gain a foothold as a good model of
ferromagnetism. For, as Bohr (1911) and van Leeuwen (1919), independently of
each other, had rigorously shown in their doctoral dissertations, a purely classical system that respects the (classical) laws of electrodynamics could never
display spontaneous magnetization (though, of course, it may develop a non-zero
magnetization in an external field). Hence, the explanatory power of the Ising
model as a model of spontaneous ferromagnetism was doubly compromised: It
could not offer an explanation of why there should be ‘elementary magnets’ in
the first place, and it purported to model, using the conceptual repertoire of
classical physics, a phenomenon that could be shown to be incompatible with
classical physics.[6]
One might question whether at any point in time the Ising model could have
been a good model of ferromagnetism. Had Onsager’s solution already been
published by Ising, could Heisenberg, in his 1928 paper, still have dismissed
Ising’s model as ‘not sufficient to explain ferromagnetism’ (19)? Hardly, one
might argue. But as things stand, this is not what happened. Models are
employed in fairly specific contexts, and in the case of mathematical models in
particular, the uses to which they are put determine their empirical content.
As Bailer-Jones argues, it is ‘[t]he model users’ activity of intending, choosing
and deciding [that] accounts for the fact that models, as they are formulated,
submit to more than sheer data match’ (1, p. 71). Applying this idea to the
Ising model with its varied history, one could perhaps argue that even a model
that was initially considered a failure may experience a comeback later, when
it is used to model other phenomena or is considered as a testing ground for
new theoretical techniques or mathematical concepts – only, of course, that
this Ising model, now conceived of as an instrument for generating rigorous
results and exact solutions for their own sake, would no longer be a model
[6] As Martin Niss notes, during the first decade of the study of the Lenz-Ising model ‘[c]omparisons to experimental results were almost absent’ and especially its initial ‘development was not driven by discrepancies between the model and experiments’ (29, pp. 311-312).
of ferromagnetism.
4 Constructing quantum Hamiltonians
Because in the Heisenberg model the hypothetical ‘elementary magnets’ of the
Ising model are replaced by quantum spins and the nature of ‘spin’ as a nonclassical internal degree of freedom is accepted (by fully embracing the algebraic
peculiarities of spin operators), this model is a much better candidate for mimicking spontaneous magnetization. Nonetheless, it still represents the spins as
rigidly associated with nodes of a lattice in real (geometrical) space. This is
plausible for magnetic insulators but not for substances such as iron and nickel
where the electrons are mobile and can, for example, sustain an electric current.
However, one can define a concept of pseudo-spins, which retains the idea that
spins can interact directly, even when it is clear that in a metal with delocalized
electrons all spin-spin interaction must eventually be mediated by the entities
that are in fact the spin-carriers – that is, by electrons. ‘Pseudo-spins’ were first
constructed mathematically via the so-called electron number operators. This
mathematical mapping of kinds of operators onto each other, however, does not
yet result in a self-contained, let alone intuitive, many-body model for systems
with delocalized electrons. This is precisely the situation a model builder finds
herself in when she sets out to construct a model for a specific phenomenon,
or class of phenomena, such as the occurrence of spontaneous magnetization in
a number of physical materials. Since ‘fundamental theory’ allows for almost
limitless possible scenarios, the challenge lies in constructing a model that is
interpretable by the standards of the target phenomenon. Questions of empirical accuracy – which, after all, cannot be settled in advance – are secondary
during the phase of model construction. If a model is indeed an instrument
of inquiry and, as some claim, is ‘inherently intended for specific phenomena’
(35, p. 75), then at the very least there must be a way of interpreting some
(presumably the most salient) elements of the model as representing a feature
of the phenomenon or system under consideration. This demand has direct
implications for how models are being constructed. If models indeed aim at
representing their respective target system, then success in constructing models will be judged by their power to represent. Thus, among proponents of the
models-as-mediators view, it is a widely held view that ‘[t]he proof or legitimacy
of the representation [by a model] arises as a result of the model’s performance
in experimental, engineering and other kinds of interventionist contexts’ (25,
p. 81). One would expect, then, that the process of model construction should
primarily be driven by a concern for whether or not its products – the models –
are empirically successful.
By contrast, I want to suggest that the case of many-body models is a
paradigmatic example of a process of model construction that neither regards
models as mere limiting cases of ‘fundamental theory’ nor appeals to empirical
success as a guide to (or, indeed, the goal of) model construction. Instead,
it involves the interplay of two rather different strategies, which I shall refer
to as the ‘first-principles’ (or ‘ab initio’) approach, on the one hand, and the
‘formalism-driven’ approach on the other. Whereas the expression ‘formalism-driven’ is my coinage, the first pair of expressions – ‘first principles’ and ‘ab
initio’ – reflects standard usage in theoretical condensed matter physics, where
it is used in contradistinction to so-called ‘phenomenological’ approaches which
aim to develop models by interpolating between specific empirical observations:
The first principles approach to condensed matter theory is entirely
different from this. It starts from what we know about all condensed
matter systems – that they are made of atoms, which in turn are
made of a positively charged nucleus, and a number of negatively
charged electrons. The interactions between atoms, such as chemical
and molecular bonding, are determined by the interactions of their
constituent electrons and nuclei. All of the physics of condensed
matter systems arises ultimately from these basic interactions. If
we can model these interactions accurately, then all of the complex
physical phenomena that arise from them should emerge naturally
in our calculations. (15)
A clear, but overambitious, example of a first-principles approach would be the
attempt to calculate the full set of ∼10²³ coupled Schrödinger equations, one
for each of the ∼10²³ nodes in the crystal lattice. For obvious reasons, solving such a complex system of equations is not a feasible undertaking – indeed,
it would merely restate the problem in the terms of fundamental theory, the
complexity of which prompted the introduction of (reduced) models in the first
place. But less ambitious, and hence more tractable, first-principles approaches
exist. Thus, instead of taking the full system – the extended solid-state crystal
– as one’s starting point, one may instead begin from the smallest ‘building
block’ of the extended crystal, by considering the minimal theory of two atoms
that are gradually moved together to form a pair of neighbouring atoms in
the crystal. One can think of this way of model construction as involving a
thought experiment regarding how a many-body system ‘condenses’ from a collection of isolated particles. Such an approach, although it does not start from
the ‘full’ theory of all ∼10²³ particles, remains firmly rooted in ‘first principles’, in that the thought experiment involving the two ‘neighbouring’ atoms
approaching one another is being calculated using the full theoretical apparatus
(in this case, the theoretical framework of non-relativistic quantum mechanics).[7]
This is the ‘derivation’ of many-body models that is usually given in textbooks
of many-body theory (e.g. ref. (30)), often with some degree of pedagogical
hindsight. However, while such a derivation makes vivid which kinds of effects – e.g., single-particle kinetic energy, particle-particle Coulomb repulsion,
and genuine quantum exchange interactions between correlated particles – may
be expected to become relevant, it typically remains incomplete as a model of
[7] Needless to say, considerable background assumptions are necessary in order to identify which unit is indeed the smallest one that still captures the basic mechanisms that determine the behaviour of the extended system.
the extended many-body system: what is being considered is only the smallest ‘building block’, and a further constructive move is required to generate a
many-body model of the full crystal.
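For concreteness, the ‘full’ first-principles description alluded to above is the standard non-relativistic Hamiltonian for interacting electrons and nuclei – reproduced here in its textbook form (Gaussian units) for illustration, not quoted from any of the works cited:

```latex
% Electrons: coordinates r_i, mass m; nuclei: coordinates R_I, mass M_I,
% charge Z_I e. Kinetic terms for both species, plus all pairwise Coulomb
% interactions between and among electrons and nuclei.
\hat{H} = -\sum_{i}\frac{\hbar^{2}}{2m}\nabla_{i}^{2}
          -\sum_{I}\frac{\hbar^{2}}{2M_{I}}\nabla_{I}^{2}
          +\frac{1}{2}\sum_{i\neq j}\frac{e^{2}}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}
          -\sum_{i,I}\frac{Z_{I}e^{2}}{|\mathbf{r}_{i}-\mathbf{R}_{I}|}
          +\frac{1}{2}\sum_{I\neq J}\frac{Z_{I}Z_{J}e^{2}}{|\mathbf{R}_{I}-\mathbf{R}_{J}|}
```

Solving the corresponding Schrödinger equation for ∼10²³ interacting coordinates is precisely the intractable task that motivates the reduced models discussed below.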
This is where the second kind of procedure in model construction – what I
shall call the formalism-driven approach – needs to be highlighted. This approach, in my view, is far more ubiquitous than is commonly acknowledged,
and it sheds light on the interplay between mathematical formalism and rigor
on the one hand, and the interpretation of models and the assessment of their
validity on the other hand. In particular, it also reinforces the observation that
many-body models enjoy a considerable degree of independence from specific
experimental (or other interventionist) contexts, and even from quantitative
standards of accuracy. On the account I am proposing, a ‘mature mathematical formalism’ is ‘a system of rules and conventions that deploys (and often
adds to) the symbolic language of mathematics; it typically encompasses locally applicable rules for the manipulation of its notation, where these rules are
derived from, or otherwise systematically connected to, certain theoretical or
methodological commitments’ (13, p. 272). In order to understand how the
formalism-driven strategy in model construction works, let us return to the case
under consideration, namely ferromagnetic systems with itinerant electrons.
How is one to model the itinerant nature of conduction electrons in such
metals as cobalt, nickel and iron? The formalism of so-called creation and
annihilation operators, â†i,σ and âi,σ, allows one to describe the dynamics of
electrons in a crystal. Since electrons cannot simply be annihilated completely
or created ex nihilo (at least not by the mechanisms that govern the dynamics
in a solid at room temperature), an annihilation operator acting at one lattice
site must always be matched by a creation operator acting at another lattice
site. But this is precisely what describes itinerant behaviour of electrons in the
first place. Hence, the formalism of second quantization, in conjunction with the
basic assumption of preservation of particle number, already suggests how to
model the kinetic behaviour of itinerant electrons, namely through the following
contribution to the Hamiltonian:
$$\hat{H}_{\text{kin}} = \sum_{ij\sigma} T_{ij}\, \hat{a}^{\dagger}_{i\sigma} \hat{a}_{j\sigma}$$
When the operator product â†i,σ âj,σ acts on a quantum state, it first[8] annihilates an electron of spin σ at lattice site j (provided such an electron happens
to be associated with that lattice site) and then creates an electron of spin σ at
another site i. Because electrons are indistinguishable it appears, from within
the formalism, as if an electron of spin σ had simply moved from j to i. The
parameters Tij , which determine the probability with which such electron ‘hopping’ from one place to another occurs, are known as hopping integrals. In
cobalt, nickel and iron, the electrons are still comparatively tightly bound to
[8] Operators need to be read from right to left; hence, if an operator product acts on a quantum state, â†i,σ âj,σ |Ψ⟩, the operator âj,σ directly in front of |Ψ⟩ acts first, then â†i,σ. Because operators do not always commute, the order of operation is important.
their associated ions; hence, hopping to distant lattice sites will be rare. This
is incorporated into the model by assuming that hopping occurs only between nearest
neighbours.
Hopping is not the only phenomenon that a model for itinerant electrons
should reflect. The Coulomb force between electrons – that is, the fact that
two negatively charged entities will experience electrostatic repulsion – also needs
to be taken into consideration. Again, the formalism of second quantization
suggests a straightforward way of accounting for the Coulomb contribution to
the Hamiltonian. Since the Coulomb force will be greatest for electrons at the
same lattice site (which, due to the Pauli exclusion principle, then must have
different spins), the dominating term will be
$$\hat{H}_{\text{Coulomb}} = \sum_{i\sigma} \frac{U}{2}\, \hat{n}_{i\sigma} \hat{n}_{i,-\sigma} \, .$$
The sum of these two terms – the hopping term (roughly, representing movement of electrons throughout the lattice) and the Coulomb term (the potential
energy due to electrostatic repulsion) – already constitutes the Hubbard model:
$$\hat{H}_{\text{Hubbard}} = \hat{H}_{\text{kin}} + \hat{H}_{\text{Coulomb}} = \sum_{ij\sigma} T_{ij}\, \hat{a}^{\dagger}_{i\sigma} \hat{a}_{j\sigma} + \sum_{i\sigma} \frac{U}{2}\, \hat{n}_{i\sigma} \hat{n}_{i,-\sigma} \, .$$
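To see how these formal ingredients combine in practice, the following sketch (my own illustration, not drawn from the paper; L, t and U are arbitrary values) builds the Hubbard Hamiltonian for a two-site chain as an explicit matrix, representing the creation and annihilation operators via the standard Jordan-Wigner construction, and diagonalizes it at half filling:

```python
import numpy as np

# A minimal sketch (not from the paper): the Hubbard Hamiltonian for a short
# chain, with fermionic operators represented as matrices via the standard
# Jordan-Wigner construction. L, t, U are illustrative values.
L, t, U = 2, 1.0, 4.0
M = 2 * L                                  # spin-orbitals: (site, spin) pairs

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])                   # Jordan-Wigner string factor
sm = np.array([[0.0, 1.0],
               [0.0, 0.0]])                # annihilates the local fermion

def annihilator(k):
    """a_k on the 2**M-dimensional Fock space: Z x ... x Z x sm x I x ... x I."""
    out = np.array([[1.0]])
    for op in [Z] * k + [sm] + [I2] * (M - k - 1):
        out = np.kron(out, op)
    return out

a = [annihilator(k) for k in range(M)]
idx = lambda i, s: 2 * i + s               # orbital index for site i, spin s

H = np.zeros((2**M, 2**M))
# Kinetic term: T_ij = -t for nearest neighbours, zero otherwise.
for i in range(L - 1):
    for s in (0, 1):
        hop = a[idx(i, s)].T @ a[idx(i + 1, s)]
        H += -t * (hop + hop.T)
# Coulomb term: (U/2) sum over i, sigma of n_{i,sigma} n_{i,-sigma},
# which equals U * n_up * n_dn on each site.
for i in range(L):
    n_up = a[idx(i, 0)].T @ a[idx(i, 0)]
    n_dn = a[idx(i, 1)].T @ a[idx(i, 1)]
    H += U * (n_up @ n_dn)

# Ground-state energy in the half-filled sector (L electrons):
filled = [b for b in range(2**M) if bin(b).count("1") == L]
E0 = np.linalg.eigvalsh(H[np.ix_(filled, filled)]).min()
print(E0)
```

For two sites, the printed value can be checked against the exact half-filled ground-state energy (U − √(U² + 16t²))/2 ≈ −0.828 for these parameter values – a small instance of the kind of ‘rigorous result’ discussed in Section 5.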
Note that, unlike the first-principles approach, the formalism-based approach
to model construction does not begin with a description of the physical situation in terms of fundamental theory, either in the form of the ‘full’ set of ∼ 1023
coupled Schrödinger equations, or via the thought experiment of neighbouring
atoms gradually approaching each other so as to form the elementary ‘building
block’ of an extended crystal lattice. Instead, it models the presumed microscopic processes (such as hopping and Coulomb interaction) separately, adding
up the resulting components and, in doing so, constructing a many-body model
‘from scratch’, as it were, without any implied suggestion that the Hamiltonian
so derived is the result of approximating the full situation as described by fundamental theory. Interestingly, Nancy Cartwright argues against what she calls
‘a mistaken reification of the separate terms which compose the Hamiltonians
we use in modelling real systems’ (8, p. 261). Although Cartwright grants
that, on occasion, such terms ‘represent separately what it might be reasonable to think of as distinct physical mechanisms’, she insists that ‘the break
into separable pieces is purely conceptual’ (ibid.) and that what is needed
are ‘independent ways of identifying the representation as correct’ (8, p. 262).
Cartwright’s critique of formalism-based model construction must be understood against the backdrop of her emphasis on phenomenological approaches,
which she regards as the only way ‘to link the models to the world’ (ibid.). To
be sure, the formalism-driven approach often proceeds in disregard of specific
empirical phenomena and in this respect might be considered as remote from
Cartwright’s preferred level of description – the world of physical phenomena
– as the more ‘first-principles’-based approaches. But it would be hasty to reject the formalism-driven approach for this reason alone, just as it would be
hasty to consider it simply an extension of ‘fundamental theory’. It is certainly true that the formalism-driven approach is not theory-free. But much
of the fundamental theory is hidden in the formalism – the formalism, I have
argued elsewhere, may be said to ‘enshrine’ various theoretical, ontological, and
methodological commitments and assumptions. (See (13).) Consider, for example, how the construction of the kinetic part of the model proceeded from
purely heuristic considerations of how itinerant motion in a discrete lattice could
be intuitively pictured in terms of the annihilation of an electron at one place
and its subsequent creation at another place in the lattice. The hopping integrals Tij were even introduced as mere parameters, when, on the first-principles
approach, they ought to be interpreted as matrix elements, which contain the
bulk of what quantum theory can tell us about the probability amplitude of
such events. The Coulomb term, finally, was constructed almost entirely by
analogy with the classical case, except for the reference to the Pauli principle.
(Then again, the Pauli principle itself is what makes the formalism of second
quantization and of creation/annihilation operators work in the first place – a
fact that the present formalism-driven derivation did not for a moment have to
reflect upon.[9]) Rather than thinking of the formalism-based approach as drawing a veil over the world of physical phenomena, shrouding them in a cocoon of
symbolic systems, one should think of formalisms such as the many-body operators discussed above as playing an enabling role: not only do they allow the
model builder to represent selected aspects of complex systems, but in addition
one finds that ‘in many cases, it is because Hamiltonian parts can be interpreted
literally, drawing on the resources furnished by fundamental theory as well as
by (interpreted) domain-specific mathematical formalisms, that they generate
understanding’ (14, p. 264; see also Section 5.3 of this paper). While the
formalism-based approach is not unique in its ability to model selected aspects
of complex systems (in particular, different co-existing ‘elementary’ processes),
it does so with an especially high degree of economy, thereby allowing the well-versed user of a many-body model to develop a ‘feel’ for the model and to probe
its properties with little explicit theoretical mediation.
5 Many-body models as mediators and contributors
Earlier, I argued that mathematical models should be sensitive to the phenomena they are intended to model. As argued in the previous section (Section
4), the existence of a mature formalism – such as second quantization with its
rules for employing creation and annihilation operators – can guarantee certain
[9] For a discussion of the formalism of creation and annihilation operators as a ‘mature mathematical formalism’, see (13, pp. 281-282).
kinds of sensitivity, for example the conformity of many-body models to certain basic theoretical commitments (such as the Pauli principle). At the same
time, however, the formalism frees the model from some empirical constraints:
By treating the hopping integrals as parameters that can be chosen largely arbitrarily (except perhaps for certain symmetry requirements), it prevents the
relationship of sensitivity between model and empirical data from turning into
a relationship of subjugation of the model by the data.
On a standard interpretation, applying models to specific physical systems
is a two-step process. First, a ‘reduced’ mathematical model is derived from
fundamental theory (a simplistic view that has already been criticized earlier);
second, approximative techniques of numerical and analytical evaluation must
be employed to calculate physical observables from the model, again at the expense of the mathematical complexity of the (still not exactly solvable) Hamiltonian. This way of speaking of two successive steps of approximation, however,
puts undue emphasis on the loss of accuracy involved in the process. For, it
is not clear how lamentable this ‘loss’ really is, given the unavailability of an
exact solution to the full problem. Crucially, such a view also overlooks that
the model itself contributes new elements to the theoretical description of the
physical system, or class of systems, under consideration – elements, which are
not themselves part of the fundamental theory (or, as it were, cannot be ‘read
off’ from it) but which may take on an interpretative or otherwise explanatorily
valuable role.
Contributions of this sort, originating from the model rather than from either
fundamental theory or empirical data, do, however, considerably inform the
way physicists think about a class of systems and frequently suggest new lines
of research. Consider the limitation to two sets of parameters, U and {Tij}, in the
case of the Hubbard model. Assuming a cubic lattice and nearest-neighbour
interaction, the interaction Tij between different lattice sites will either be zero
or have the same fixed value t. Hence, the quotient U/t reflects the relative
strength of the interaction between electrons (as compared with their individual
kinetic movement), and within the model it is a unique and exact measure of
this important aspect of the dynamics of electron behaviour in a solid. The
individual quantities U and t, thus, are seen to be no mere parameters, but
are linked, through the model, in a meaningful way, which imposes constraints
on which precise values are, or aren’t, plausible. Not only does this restrict
the freedom one enjoys in arbitrarily choosing U and t to fit the model to the
empirical data, but it also imposes constraints on structural modifications of
the model. For example, an attempt to make the model more accurate by
adding new (higher-order) terms to the model (perhaps accounting for higher-order interactions of strengths V, W, X, Y, Z < U, t) may be counterproductive,
as it may be more useful, for explanatory purposes, to have one measure of
the relative strength of the electron-electron interaction (namely, U/t) rather
than a whole set {U/t, V/t, W/t, ...}. To the extent that the model’s purpose is
explanatory and not merely predictive, a gain in numerical accuracy may not
be desirable if it requires replacing an intuitively meaningful quantity with a
set of parameters that lack a straightforward interpretation. Fitting a model to
the data does not by itself make the model any more convincing.
The ‘active’ contribution of the model, that is, its contributing new elements
rather than merely integrating theoretical and experimental (as well as further,
external) elements, is not only relevant to interpretative issues, but also has
direct consequences for assessing the techniques used to evaluate the model and
to calculate, either numerically or analytically, observable quantities from it.
5.1 Rigorous results and relations
One particularly salient class of novel contributions that many-body models
make to the process of inquiry in condensed matter physics are known as rigorous results. The expression ‘rigorous results’, which is not without its problems,
has become a standing expression in theoretical physics, especially among practitioners of statistical and many-body physics. (See for example (3).) It therefore
calls for some clarification. What makes a result ‘rigorous’ is not the qualitative or numerical accuracy of a particular prediction of the theory or model. In
fact, the kind of ‘result’ in question will often have no immediate connection
with the empirical phenomenon (or class of phenomena) a model or theory is
supposed to explain. Rather, it concerns an exact mathematical relationship
between certain mathematical variables, or certain structural components, of
the mathematical model, which may or may not reflect an empirical feature of
the system that is being modelled. One, perhaps crude, way of thinking about
rigorous results would be to regard them as mathematical theorems that are
provable from within the model or theory under consideration.[10] Much like
Pythagoras’ theorem, a² + b² = c², is not merely true of a particular set of
parameters, e.g. {a, b, c} = {3, 4, 5}, but holds for all right-angled triangles, so
a rigorous result in the context of a mathematical model holds for a whole class
of cases rather than for particular parameter values. Yet, importantly, rigorous
results are true only of a model (or a class of models) as defined by a specific
Hamiltonian; unlike, say, certain symmetry or conservation principles, they do
not follow directly from fundamental theory.
An important use of rigorous results and relations is as ‘benchmarks’ for the
numerical and analytical techniques of calculating observable quantities from
the model.[11] After all, an evaluative technique that claims to be true to the
model should preserve its main features, and rigorous results often take the form
either of exact relations holding between two or more quantities, or of lower and
upper bounds to certain observables. If, for example, the order parameter in
question is the magnetization, then rigorous results – within a given model –
may obtain, dictating the maximum (or minimum) value of the magnetization or
the magnetic susceptibility. These may then be compared with results derived
[10] The notions of ‘theorem’ and ‘rigorous result’ are frequently used interchangeably in scientific texts, especially in theoretical works such as (18).
[11] This is noted in passing, though not elaborated on, by R.I.G. Hughes in his case study of one of the first computer simulations of the Ising model: ‘In this way the verisimilitude of the simulation could be checked by comparing the performance of the machine against the exactly known behaviour of the Ising model.’ (20, p. 123)
numerically or by other approximative methods of evaluation.
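As a simple illustration of this benchmarking role (my example, not one drawn from the literature cited above): Onsager’s exact critical temperature for the two-dimensional square-lattice Ising model provides a standard against which an approximate evaluation scheme, such as a mean-field treatment, can be measured:

```python
import numpy as np

# Onsager's exact (rigorous) critical temperature for the 2D square-lattice
# Ising model with nearest-neighbour coupling J, in units where k_B = 1:
J = 1.0
Tc_exact = 2 * J / np.log(1 + np.sqrt(2))   # about 2.269 J

# A mean-field evaluation of the same model predicts Tc = z*J, with
# coordination number z = 4 on the square lattice.
Tc_mean_field = 4 * J

print(f"Onsager (exact): {Tc_exact:.3f}")
print(f"mean-field:      {Tc_mean_field:.3f}")
# The roughly 76% overestimate quantifies how far the mean-field evaluation
# strays from the rigorous benchmark.
```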
5.2 Cross-model support
Rigorous results may also connect different models in unexpected ways, thereby
allowing for cross-checks between methods that were originally intended for different domains. Such connections can neither be readily deduced from fundamental theory, since the rigorous results do not hold generally but only between
different (groups of) models; nor can they justifiably be inferred from empirical
data, since the physical systems corresponding to the two groups of mathematical many-body models may be radically different. As an example consider
again the Hubbard model. It can be shown rigorously (see, for example, (11))
that, at half filling (that is, when half of the quantum states in the conduction
band are occupied) and in the strong-coupling interaction limit, U/t → ∞, the
Hubbard model can be mapped onto the spin-1/2 antiferromagnetic Heisenberg model (essentially in the form described earlier, with Jij = 4t²/U). Under
the specified conditions, the two models are isomorphic and display the same
mathematical behavior. Of course, the Hubbard model with infinitely strong
electron-electron interaction (U/t → ∞) cannot claim to describe an actual
physical system, where the interaction is necessarily finite, but to the extent
that various mathematical and numerical techniques can nonetheless be applied
in the strong-coupling limit, comparison with the numerically and analytically
more accessible antiferromagnetic Heisenberg model provides a test also for the
adequacy of the Hubbard model.
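The origin of the effective coupling Jij = 4t²/U can be made intuitive by the standard second-order perturbation argument for two neighbouring sites at half filling. The following is a textbook-style sketch, not a reconstruction of the rigorous proofs cited above:

```latex
% Two neighbouring sites, one electron each, in the limit U >> t. Treating the
% hopping as a perturbation, an electron can virtually hop onto its neighbour
% and back; the intermediate, doubly occupied state costs Coulomb energy U.
% For the spin singlet this virtual process lowers the energy in second order,
% whereas for parallel spins the Pauli principle blocks the hop altogether:
\Delta E_{\text{singlet}} = -\frac{4t^{2}}{U}, \qquad
\Delta E_{\text{triplet}} = 0 .
% The low-energy physics is therefore that of an antiferromagnetic exchange
% coupling between the two remaining spin degrees of freedom:
\hat{H}_{\text{eff}} = J\,\hat{\mathbf{S}}_{1}\cdot\hat{\mathbf{S}}_{2}
  + \text{const.}, \qquad J = \frac{4t^{2}}{U} > 0 .
```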
Rigorous relations between different many-body models not only provide fertile ground for testing of mathematical and numerical techniques, and for the
‘exploration’ (in the sense discussed in the next subsection) of models more generally. They can also give rise to a transfer of empirical warrant across models
that were intended to describe very different physical systems. The mapping, in
the strong-coupling limit (U/t → ∞), of the Hubbard model onto the spin-1/2
antiferromagnetic Heisenberg model is one such example. For, the latter – the
antiferromagnetic Heisenberg model – has long been known as an empirically
successful ‘“standard model” for the description of magnetic insulators’ (Gebhard 1997: 75), yet the Hubbard model at low coupling (U/t = 0, indicating
zero electron-electron interaction) reduces to an ideal Fermi electron gas – a
perfect conductor. It has therefore been suggested that, for some finite value
between U/t = 0 and U/t → ∞, the Hubbard model must describe a system
that undergoes a transition from conductor to insulator. Such transitions, for
varying strengths of electron-electron interaction, have indeed been observed in
physical systems and are known as Mott insulators. Thanks to the existence
of a rigorous relation between the two models, initial empirical support for the
Heisenberg model as a model of a magnetic insulator thus translates into support for a new – and originally unintended – representational use of the Hubbard
model, namely as a model of Mott insulators. In other words, ‘empirical warrant first flows from one model to another, in virtue of their standing in an
appropriate mathematically rigorous relation’ (12, p. 516), from which one may
then gain new insights regarding the empirical adequacy of the model.[12] As
this example illustrates, rigorous results neither borrow their authority from
fundamental theory nor do they need to prove their mettle in experimental contexts; instead, they are genuine contributions of the models themselves, and it
is through them that models – at least those of the kind discussed in this paper
– have ‘a life of their own’.
5.3 Model-based understanding
The existence of rigorous results and relations, and of individual cases of cross-model support between many-body models of quite different origins, may perhaps seem too singular. Can any general lessons be inferred from them regarding
the character of many-body models more broadly? I wish to suggest that both
classes of cases sit well with general aspects of many-body models and their
construction, especially when viewed from the angle of the formalism-based approach. By reconceptualizing many-body models as outputs of a mature mathematical formalism – rather than conceiving of them either as approximations
of the ‘full’ (but intractable) theoretical description or as interpolating between
specific empirical phenomena – the formalism-based approach allows for a considerable degree of flexibility and exploration, which in turn generates understanding. For example, one may construct a many-body model (which may even
be formulated in arbitrary spatial dimensions) by imagining a crystal lattice of a
certain geometry, with well-formed (by the lights of the many-body formalism)
mathematical expressions associated with each lattice point, and adding the
latter up to give the desired ‘Hamiltonian’: ‘Whether or not this “Hamiltonian”
is indeed the Hamiltonian of a real physical system, or an approximation of it,
is not a consideration that enters at this stage of model construction.’ (14, p.
262) The phenomenological approach advocated by Cartwright might lament
this as creating an undue degree of detachment from the world of empirical
phenomena, but what is gained in the process is the potential for exploratory
uses of models. As Yi puts it:
One of the major purposes of this ‘exploration’ is to identify what
the true features of the model are; in other words, what the model
can do with and without additional assumptions that are not a part
of the original structure of the model. (37, p. 87)
Such exploration of the intrinsic features of a model ‘helps us shape our physical intuitions about the model’, even before these intuitions become, as Yi
puts it, ‘canonical’ through ‘successful application of the model in explaining a
phenomenon’ (ibid.).
Exploratory uses of models feed directly into model-based understanding, yet
they do so in a way that is orthogonal to the phenomenological approach and
its emphasis on interpolation between observed physical phenomena. As I have argued elsewhere, microscopic many-body models ‘are often deployed in order to
account for poorly understood phenomena (such as specific phase transitions);
a premature focus on empirical success (e.g., the exact value of the transition
temperature) might lead one to add unnecessary detail to a model before one has
developed a sufficient understanding of which microscopic processes influence
the macroscopically observable variable’ (14, p. 264). A similar observation is
made by those who argue for the significance of minimal models. Thus Robert
Batterman argues (quoting a condensed matter theorist, Nigel Goldenfeld):
On this view, what one would like is a good minimal model—a model
‘which most economically caricatures the essential physics’ (Goldenfeld 1992, p. 33). The adding of details with the goal of ‘improving’
the minimal model is self-defeating – such improvement is illusory. (2, p. 22)
The formalism-based approach thus differs from the phenomenological approach
in two important ways. First, it conceives of model construction as a constructive and exploratory process, rather than as one that is driven by tailoring a
model to specific empirical phenomena. This is aptly reflected by Yi in his
account of model-based understanding, which posits two stages:
(1) understanding of the model under consideration, and this involves, among other things, exploring its potential explanatory power
using various mathematical techniques, figuring out various plausible physical mechanisms for it and cultivating our physical intuition about the model; (2) matching the phenomenon with a well-motivated interpretative model of the model. (37, pp. 89-90)
Second, the two approaches differ in the relative weight they accord to empirical
adequacy and model-based understanding as measures of the performance of a
model. In the formalism-based approach, empirical adequacy is thought of as
a ‘bonus’ – in the sense that ‘model-based understanding does not necessarily
presuppose empirical adequacy’ (37, p. 85). Such model-based understanding
need not be restricted to purely internal considerations, such as structural features of the model, but may also extend to general questions about the world,
especially where these take the form of ‘how-possibly’ questions. For example, in the many-body models under discussion, an important driver of model
construction has been the question of how there could possibly arise any magnetic phase transition (given the Bohr-van Leeuwen prohibition on spontaneous
magnetization in classical systems; see Section 3) – regardless of any actual,
empirically observed magnetic systems. By contrast, the phenomenological approach is willing to trade in understanding of the inner workings of a model for
specific empirical success. As Cartwright puts it, ‘[a] Hamiltonian can be admissible under a model – and indeed under a model that gives good predictions
– without being explanatory if the model itself does not purport to pick out
basic explanatory mechanisms’ (8, p. 271).
As an illustration of how the formalism-based approach and the phenomenological approach pull in different directions, consider which contributions to a
many-body model (that is, additive terms in a Hamiltonian) each approach
deems admissible. According to Cartwright, only those terms are admissible
that are based on ‘basic interpretative models’ that have been studied independently and are well-understood, both on theoretical grounds and in other
empirical contexts; these are the textbook examples of the central potential,
scattering, the Coulomb interaction, the harmonic oscillator, and kinetic energy
(8, p. 264). What licenses their use – and, in turn, excludes other (more ‘arbitrary’ or ‘formal’) contributions to the Hamiltonian – is the existence of ‘bridge
principles’ which ‘attach physics concepts to the world’ (8, p. 255). Indeed,
Cartwright goes so far as to assert that quantum theory ‘applies exactly as far
as its interpretative models can stretch’: Only those situations that are captured
adequately by the half-dozen or so textbook examples of interpretative models
‘fall within the scope of the theory’ (8, p. 265). By contrast, the formalism-based approach tells a very different story. As long as one ‘plays by the rules’ of
the formalism – which now enshrines theoretical constraints, without the need
to make them explicit even to the experienced user – any newly constructed
Hamiltonian terms are admissible in principle. And, indeed, in Section 4 we
already encountered a contribution to the Hamiltonian – the hopping term –
which was not inspired by the limited number of stock examples allowed on the
phenomenological approach, but instead resulted from a creative application of
the formalism-based rules for the ‘creation’ and ‘annihilation’ of particles at
distinct lattice sites. By freeing model construction from the overemphasis on
empirical adequacy, the formalism-based approach not only allows for a more
flexible way of modelling specific processes that are thought to contribute to the
overall behaviour of a complex system, but also gives modellers the theoretical tools
to sharpen their understanding of the diverse interactions that together make
up the behaviour of many-body systems.
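For concreteness, the hopping term in question has the standard second-quantized form (conventional notation, not peculiar to any one of the sources cited):

    H_{hop} = \sum_{i \neq j} \sum_{\sigma} T_{ij} \, c^{\dagger}_{i\sigma} c_{j\sigma} ,

where c^{\dagger}_{i\sigma} creates, and c_{j\sigma} annihilates, an electron of spin \sigma at the respective lattice site. By the rules of the formalism, this is a well-formed contribution to a Hamiltonian whether or not any actual material realizes the process it describes.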
6 Between rigor and reality: Appraising many-body models
Traditionally, models have been construed as being located at a definite point
on the ‘theory-world axis’ (28, p. 18). Unless their role was seen as merely
heuristic, models were to be judged by how well they fit with the fundamental
theory and the data, or, more specifically, how well they explain the data by the
standards of the fundamental theory. Ideally, a model should display a tight fit
with both the theory and the empirical data or phenomena. As Tarja Knuuttila
has pointed out, large parts of contemporary philosophy of science continue to
focus on ‘the model-target dyad as a basic unit of analysis concerning models and
their epistemic values’ (23, p. 142). The proposed alternative view of models as
mediators presents a powerful challenge to the traditional picture. It takes due
account of the fact that, certainly from an epistemic point of view, theories can
only ever be partial descriptions of what the world is like. What is called for is
an account of models that imbues them with the kind of autonomy that does not
require close fit with fundamental theory, but nevertheless enables us to explain
and understand physical phenomena where no governing fundamental theory
has been identified. On this view, any account of real processes and phenomena
also depends on factors that are extraneous to the fundamental theory, and
those who deny this are ‘interested in a world that is not our world, not the world of appearances but rather a purer, more orderly world, a world which is
thought to be represented “directly” by the theory’s equations’ (7, p. 189).
The mediator view of models acknowledges from the start that ‘it is because
[models] are made up from a mixture of elements, including those from outside
the original domain of investigation, that they maintain [their] partially independent status’ (28, p. 14). This is what makes them mediators in the first
place:
Because models typically include other elements, and model building proceeds in part independently of theory and data, we construe
models as being outside the theory-world axis. It is this feature
which enables them to mediate effectively between the two. (28, p.
17f.)
Note that this is essentially a claim about the construction of models, their
motivation and etiology. Once a model has been arrived at, however, it is
its empirical success in specific interventionist contexts which is the sole arbiter of its validity. This follows naturally from a central tenet of the mediator
view, namely that models are closer to instruments than to theories and, hence,
warranted by their instrumental success in specific empirical contexts.13 That
models are to be assessed by their specificity to empirically observed phenomena, rather than by, say, theoretical considerations or mathematical properties
intrinsic to the models themselves, appears to be a widely held view among
proponents of the models-as-mediators view. As Mauricio Suárez argues, models ‘are inherently intended for specific phenomena’ (35, p. 75), and Margaret
Morrison writes: ‘The proof or legitimacy of the representation arises as a result of the model’s performance in experimental, engineering and other kinds
of interventionist contexts – nothing more can be said!’ (25, p. 81) It appears
then that, whilst the mediator view of models has ‘liberated’ models from the
grip of theory, by stressing their capacity to integrate disparate elements, it
has retained, or even strengthened, the close link between models and empirical
phenomena.
Yet, on a descriptive level, it is by no means clear that, for example in the
case of the Hubbard model, the main activity of researchers is to assess the
model’s performance in experimental or other kinds of interventionist contexts.
13 As Cartwright argues, it is for this reason that warrant to believe in predictions must be
established case by case on the basis of models. She criticizes the ‘vending-machine view’,
in which ‘[t]he question of transfer of warrant from the evidence to the predictions is a short
one since it collapses to the question of transfer of warrant from the evidence to the theory’.
This, Cartwright writes, ‘is not true to the kind of effort that we know it takes in physics to
get from theories to models that predict what reliably happens’; hence, ‘[w]e are in need of
a much more textured, and I am afraid much more laborious view’ regarding the claims and
predictions of science. (7, p. 185)
A large amount of work, for example, goes into calibrating and balancing different methods of numerical evaluation and mathematical analysis. That is,
the calibration takes place not between model and empirical data, but between
different methods of approximation, irrespective of their empirical accuracy.
Even in cases where ‘quasi-exact’ numerical results are obtainable for physical observables (for example via Quantum Monte Carlo calculations), these will often be compared not to experimental data but to predictions derived by other approximative methods. It is not uncommon to come across
whole papers on, say, the problem of ‘magnetism in the Hubbard model’ that do not contain a single reference to empirical data. (As an example, see (36).)
Rather than adjust the parameters of the model to see whether the empirical behaviour of a specific physical system can be modelled accurately, the parameters
will be held fixed to allow for better comparison of the different approximative
techniques with one another, often singling out one set of results (e.g., those
calculated by Monte Carlo simulations) as authoritative.
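The flavour of such benchmarking can be conveyed by a deliberately simple sketch (a toy example of my own, not taken from the literature): for the two-site Hubbard model at half-filling, exact diagonalization plays the role of the ‘quasi-exact’ benchmark against which a strong-coupling approximation is calibrated, with parameters held fixed and no experimental data in sight:

    import numpy as np

    def hubbard_dimer_ground_energy(t, U):
        # Exact ground-state energy of the two-site Hubbard model at
        # half-filling (S_z = 0 sector), in the basis
        # {|up,dn>, |dn,up>, |updn,0>, |0,updn>}.
        H = np.array([[0.0, 0.0,  -t,  -t],
                      [0.0, 0.0,   t,   t],
                      [ -t,   t,   U, 0.0],
                      [ -t,   t, 0.0,   U]])
        return np.linalg.eigvalsh(H).min()

    t, U = 1.0, 8.0                                # held fixed, not fitted
    benchmark = hubbard_dimer_ground_energy(t, U)  # 'quasi-exact' reference
    estimate = -4 * t**2 / U                       # strong-coupling approximation
    print(benchmark, estimate)                     # ~ -0.47 vs -0.50 at U/t = 8

What is being assessed here is the approximative technique against the benchmark, not the model against the world.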
One might object that a good deal of preliminary testing and cross-checking
of one’s methods of evaluation has to happen before the model predictions can
be compared with empirical data, but that such comparison nonetheless remains the ultimate goal. While there may be some truth to this interpretation, it should be
noted that in many cases this activity of cross-checking and ‘bench-marking’
is what drives research and makes up the better part of it. It appears that at
the very least this calls for an acknowledgment that some of the most heavily researched models typically are not being assessed by their performance in
experimental, engineering and other kinds of interventionist contexts. In part,
this is due to many models’ not being intended for specific phenomena, but for
a range of physical systems. This is true of the Hubbard model, which is studied in connection with an array of quite diverse physical phenomena, including
spontaneous magnetism, electronic properties, high-temperature superconductivity, and metal-insulator transitions, among others. It is particularly obvious in the case of the Ising model, which, even though it has been discredited as an
accurate model of magnetism, continues to be applied to problems ranging from
soft condensed-matter physics to theoretical biology. In some areas of research,
models are not even intended, in the long-term, to reflect, or be ‘customizable’
to, the details of a specific physical system. For example, as R.I.G. Hughes
argues, when it comes to critical phenomena ‘a good model acts as an exemplar
of a universality class, rather than as a faithful representation of any one of its
members’ (20, p. 115).
The reasons why many-body models can take on roles beyond those defined
by performance in empirical and interventionist contexts are identical to those
that explain their capacity to ‘survive’ empirical refutation in a specific context
(as was the case with the Ising model); as I have argued in this paper, they
are twofold. First, models often actively contribute new elements, introducing cohesion and flexibility. One conspicuous class of such contributions, as
discussed in Section 5.1, are the rigorous results and relations that hold for a
variety of many-body models, without being entailed either by the fundamental
theory or the empirical data. It is such rigorous results, I submit, which guide
much of the research by providing important ‘benchmarks’ for the application of
numerical and analytical methods. Rigorous results need not have an obvious
empirical interpretation in order to guide the search for better techniques of evaluation or analysis. This is frequently overlooked in discussions of the role of
many-body models by philosophers of science. Cartwright, for example, writes:
When the Hamiltonians do not piggy-back on the specific concrete
features of the model – that is, when there is no bridge principle that
licenses their application to the situation described in the model
– then their introduction is ad hoc and the power of the derived
prediction to confirm the theory is much reduced. (7, p. 195)
It is certainly true that many-body Hamiltonians that do ‘piggy-back’ on concrete features of the model frequently fare better than more abstract representations – if only because physicists may find the former more ‘intuitive’ and
easier to handle than the latter. But it is questionable whether the absence of
‘specific concrete features’, which would pick out a specific empirical situation,
is enough to render such Hamiltonians ad hoc. For there typically exist additional constraints on the choice of the Hamiltonian, in the form of rigorous results and relations, and these may hold for a quite general class of
models, irrespective of the specific concrete features of a given empirical case.
In particular, the process of ‘bench-marking’ across models on the basis of such
rigorous results and relations is not merely another form of ‘moulding’ a mathematical model to concrete empirical situations; rather, it fulfils a normative
function by generating cross-model cohesion.
The second main reason why the role of many-body models in condensed
matter physics is not exhausted by their empirical success lies in their ability to
provide insight and understanding into the likely microscopic processes underlying
macroscopic phenomena, even in the absence of a fully developed theory. As
discussed in Section 5.3, this is directly related to the exploratory use of many-body models – which in turn is made possible by the formalism-based mode
of model-building, which allows for the ‘piece-meal’ construction of many-body
Hamiltonians. Especially in the case of physical systems that are marked by
complexity and strong correlations among their constituents, what is aimed for is
a model which, in Goldenfeld’s apt formulation, ‘most economically caricatures
the essential physics’ (17, p. 33).
Given my earlier endorsement of the view that models need to be liberated
from the grip of self-proclaimed ‘fundamental theories’, one might worry that
further liberating them from the burden of empirical success leads to an evaporation of whatever warrant models previously had. This is indeed a legitimate
worry, and it is one that is shared by many scientists working on just those models. If all there is to a model is a set of mathematical relations together with a
set of background assumptions, how can we expect the model to tell us anything
about the world? There are several points in reply to this challenge. First, while
it is true that many of the rigorous relations do not easily lend themselves to
an empirical interpretation, there are, of course, still many quantities (such as
the order parameter, temperature etc.) that have a straightforward empirical
meaning. Where the model does make predictions about certain empirically significant observables, these predictions will often be an important (though not
the only) measure of the model’s significance.14 Second, models can mutually
support each other. As the example of the mapping of the strong-coupling Hubbard model at half-filling onto the Heisenberg model showed, rigorous results
and relations can connect different models in unexpected ways. This allows for
some degree of transfer of warrant from one model to the other. Note that
this transfer of warrant does not involve any appeal to fundamental theory, but
takes place ‘horizontally’ at the level of models.15 Third, in many cases a model
can be constructed in several different ways, each of which may bring out the connection with theory and phenomenon differently. The first-principles
derivation of the Hubbard model is one such example. It provides a meaningful
interpretation of the otherwise merely parameter-like quantities Tij (namely, as
matrix elements that describe the probability of the associated hopping processes). While this interpretation requires some appeal to theory, it does not
require an appeal to the full theoretical problem – that is, the full problem
of 10²³ particles each described by its ‘fundamental’ Schrödinger equation. A
similar point can even be made for the formalism-driven approach. There, too,
model construction does not operate in a conceptual vacuum, but makes use of
general procedures, which range from the highly abstract (e.g., the formalism
of second quantization) to the largely intuitive considerations that go into the
selection of elementary processes judged to be relevant.
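The interpretation of the T_{ij} can be made explicit in the standard first-principles expression (assuming localized orbitals \varphi centred at the lattice sites \mathbf{R}_i; this is textbook notation, not a formula reproduced from the text):

    T_{ij} = \int d^3r \, \varphi^*(\mathbf{r} - \mathbf{R}_i) \left[ -\frac{\hbar^2 \nabla^2}{2m} + V(\mathbf{r}) \right] \varphi(\mathbf{r} - \mathbf{R}_j) ,

so that each T_{ij} is a matrix element of the one-particle Hamiltonian between orbitals centred at sites i and j, governing the amplitude of the associated hopping process.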
By recognizing that models can be liberated both from the hegemony of
fundamental theory and from the burden of empirical performance in every
concrete specific case, I believe one can appreciate the role of models in science
in a new light. For one, models are as much contributors as they are mediators in
the process of representing the physical world around us. But more importantly,
they neither merely execute fundamental theory nor accommodate empirical
phenomena. Rather, as the example of many-body models in condensed-matter
physics demonstrates, they are highly structured entities, which are woven into, and give stability to, scientific practice.

14 A model whose predictions of the order parameter are systematically wrong (e.g., consistently too low) but which gets the qualitative behaviour right (e.g., the structure of the phase diagram) may be preferable to a model that is more accurate for most situations but is vastly (qualitatively) mistaken for a small number of cases. Likewise, a model that satisfies certain symmetry requirements or obeys certain other rigorous relations may be preferable to a more accurate model (with respect to the physical observables in question) that lacks these properties.

15 See also Section 5.2 above; for a full case study of cross-model transfer of warrant, see (12).
References
[1] Daniela M. Bailer-Jones: “When scientific models represent”, International
Studies in the Philosophy of Science, 17, 2003, pp. 59-74.
[2] Robert Batterman: “Asymptotics and the role of minimal models”, British Journal for the Philosophy of Science, 53, 2002, pp. 21-38.
[3] Rodney J. Baxter: Exactly Solved Models in Statistical Mechanics, New
York: Academic Press 1982.
[4] Marcel Boumans: “Built-in justification”, in Margaret Morrison and Mary
S. Morgan (eds.): Models as Mediators. Perspectives on Natural and
Social Science, Cambridge: Cambridge UP 1999, pp. 68-96.
[5] Stephen Brush: “History of the Lenz-Ising Model”, Reviews of Modern
Physics, 39, 1967, pp. 883-893.
[6] Nancy Cartwright, Towfic Shomar and Mauricio Suárez: “The tool box of
science”, in: William E. Herfel, Wladyslaw Krajewski, Ilkka Niiniluoto
and Ryszard Wójcicki (eds.): Theories and Models in Scientific Processes (Poznań Studies in the Philosophy of the Sciences and the Humanities, Vol. 44), Amsterdam: Rodopi 1995, pp. 137-149.
[7] Nancy Cartwright: The Dappled World. A Study of the Boundaries of Science, Cambridge: Cambridge UP 1999.
[8] Nancy Cartwright: “Models and the Limits of Theory: Quantum Hamiltonians and the BCS Model of Superconductivity”, in Mary S. Morgan
and Margaret Morrison (eds.): Models as Mediators: Perspectives on
Natural and Social Science, Cambridge: Cambridge UP 1999, pp. 241-281.
[9] Michael E. Fisher: “Scaling, Universality, and Renormalization Group
Theory”, in F.J.W. Hahne (ed.): Critical Phenomena. (Lecture Notes
in Physics, Vol. 186), Berlin: Springer 1983, pp. 1-139.
[10] Serge Galam: “Rational group decision-making: A random-field Ising
model at T=0”, Physica A, 238, 1997, pp. 66-80.
[11] Florian Gebhard: The Mott Metal-Insulator Transition: Models and Methods. (Springer Tracts in Modern Physics, Vol. 137), Berlin: Springer
1997.
[12] Axel Gelfert: “Rigorous results, cross-model justification, and the transfer of empirical warrant: the case of many-body models in physics”,
Synthese, 169, 2009, pp. 497-519.
[13] Axel Gelfert: “Mathematical formalisms in scientific practice: From denotation to model-based representation”, Studies in History and Philosophy of Science, 42, 2011, pp. 272-286.
[14] Axel Gelfert: “Strategies of model-building in condensed matter physics:
trade-offs as a demarcation criterion between physics and biology?”,
Synthese, 190, 2013, pp. 253-272.
[15] Michael C. Gibson: Implementation and Application of Advanced Density Functionals. (PhD Dissertation, University of Durham, 2006.)
[16] Ronald N. Giere: “Using Models to Represent Reality”, in Lorenzo Magnani, Nancy J. Nersessian and Paul Thagard (eds.): Model-Based Reasoning in Scientific Discovery, New York: Plenum Publishers 1999, pp. 41-57.
[17] Nigel Goldenfeld: Lectures on Phase Transitions and the Renormalization
Group (Frontiers in Physics, Vol. 85), Reading, Mass.: Addison-Wesley
1992.
[18] Robert B. Griffiths: “Rigorous results and theorems”, in Cyril Domb and
Melville S. Green (eds.): Phase Transitions and Critical Phenomena,
New York: Academic Press, 1972, pp. 8-109.
[19] Werner Heisenberg: “Theorie des Ferromagnetismus”, Zeitschrift für
Physik, 49, 1928, pp. 619-636.
[20] R. I. G. Hughes: “The Ising model, computer simulation, and universal
physics”, in Margaret Morrison and Mary S. Morgan (eds.): Models
as Mediators. Perspectives on Natural and Social Science, Cambridge:
Cambridge UP 1999, pp. 97-145.
[21] Ernst Ising: “Beitrag zur Theorie des Ferromagnetismus”, Zeitschrift für
Physik, 31, 1925, pp. 253-258.
[22] Martin H. Krieger: “Phenomenological and Many-Body Models in Natural
Science and Social Research”, Fundamenta Scientiae, 2, 1981, pp. 425-431.
[23] Tarja Knuuttila: “Some Consequences of the Pragmatist Approach to Representation: Decoupling the Model-Target Dyad and Indirect Reasoning”, in Mauricio Suárez, Mauro Dorato and Miklos Rédei (eds.): EPSA
Epistemology and Methodology of Science, Dordrecht: Springer 2010,
pp. 139-148.
[24] H. Matsuda: “The Ising Model for Population Biology”, Progress of Theoretical Physics, 66, 1981, pp. 1078-1080.
[25] Margaret C. Morrison: “Modelling Nature: Between Physics and the Physical World”, Philosophia Naturalis, 35, 1998, pp. 65-85.
[26] Margaret Morrison: “Models as autonomous agents”, in Margaret Morrison and Mary S. Morgan (eds.): Models as Mediators. Perspectives on
Natural and Social Science, Cambridge: Cambridge UP 1999, pp. 38-65.
[27] Margaret Morrison and Mary S. Morgan (eds.): Models as Mediators. Perspectives on Natural and Social Science, Cambridge: Cambridge UP
1999.
[28] Margaret Morrison and Mary S. Morgan: “Models as mediating instruments”, in Margaret Morrison and Mary S. Morgan (eds.): Models
as Mediators. Perspectives on Natural and Social Science, Cambridge:
Cambridge UP 1999, pp. 10-37.
[29] Martin Niss: “History of the Lenz-Ising model 1920-1950: from ferromagnetic to cooperative phenomena”, Archive for History of Exact Sciences,
59, 2005, pp. 267-318.
[30] Philippe Nozières: Theory of Interacting Fermi Systems. New York: Benjamin 1963.
[31] Andrew T. Ogielski and Ingo Morgenstern: “Critical behavior of 3-dimensional Ising model of spin glass”, Journal of Applied Physics, 57,
1985, pp. 3382-3385.
[32] Lars Onsager: “Crystal statistics. I. A two-dimensional model with an
order-disorder transition”, Physical Review, 65, 1944, p. 117.
[33] Sam Schweber and Matthias Wächter: “Complex Systems, Modelling and
Simulation”, Studies in History and Philosophy of Modern Physics, 31
No.4, 2000, pp. 583-609.
[34] Mark Steiner: “The Application of Mathematics to Natural Science”, The
Journal of Philosophy, 86 No. 9, 1989, pp. 449-480.
[35] Mauricio Suárez: “Theories, Models, and Representations”, in Lorenzo
Magnani, Nancy J. Nersessian and Paul Thagard (eds.): Model-Based
Reasoning in Scientific Discovery, New York: Plenum Publishers 1999,
pp. 75-83.
[36] Michael A. Tusch, Yolande H. Szczech, and David E. Logan: “Magnetism in
the Hubbard model: An effective spin Hamiltonian approach”, Physical
Review B, 53 No. 9, 1996, pp. 5505-5517.
[37] Sang Wook Yi: “The Nature of Model-Based Understanding in Condensed
Matter Physics”, Mind & Society, 5, 2002, pp. 81-91.