Imagination Rather Than Observation in Econometrics: Ragnar Frisch's Hypothetical Experiments as Thought Experiments*

Catherine Herfeld†

Forthcoming in HOPOS: The Journal of the International Society for the History of Philosophy of Science, 9 (1), 2019.

* I want to thank Olav Bjerkholt, Marcel Boumans, Mary Morgan, as well as the audiences of the session on Thought Experiments in Economics, Past and Present organized by the International Network of Economic Methodology at the ASSA Annual Meeting 2016 in San Francisco and the International Conference on Integrated History and Philosophy of Science at the University of Edinburgh in 2016 for their constructive feedback on earlier versions of this paper. I also want to thank four anonymous referees for their useful comments. This project was financially supported by the Humboldt-Foundation and a fellowship of the Center for the History of Political Economy at Duke University.

† Institute of Philosophy and Institute of Sociology, University of Zurich; address: Zurichbergstrasse 43, 8044 Zurich, Switzerland, contact: catherine.herfeld@uzh.ch

Abstract

In economics, thought experiments are frequently justified by the difficulty of conducting controlled experiments. They serve several functions, such as establishing causal facts, isolating tendencies, and allowing inferences from models to reality. In this paper, I argue that thought experiments served a further function in economics: facilitating the quantitative definition and measurement of the theoretical concept of utility, thereby bridging the gap between theory and statistical data. I support my argument by a case study, the "hypothetical experiments" of the Norwegian economist Ragnar Frisch (1895-1973). Frisch aimed to eliminate introspection and a subjective concept of utility from economic reasoning. At the same time, he sought behavioral foundations for economic theory that enabled quantitative reasoning. By using thought experiments to justify his set of choice axioms and facilitating the operationalization of utility, Frisch circumvented the problem of observing utility via actual experiments without eliminating the concept of utility from economic theory altogether. As such, these experiments helped Frisch to empirically support the theory's most important results, such as the laws of demand and supply, without the input of new empirical findings. I suggest that Frisch's experiments fulfill the main characteristics of thought experiments.

1. Introduction

Thought experiments are conducted for multiple reasons in the natural sciences, the social sciences, and philosophy. As such, their multifaceted nature and various roles in the knowledge-generating process have been discussed extensively in the literature (Brown et al. 2017). One well-established view is that thought experiments are primarily used in disciplines where observation and controlled experimentation are difficult (Sørensen 1992). One such discipline is economics, where the methodological value of thought experiments is frequently justified by referring to the challenges of conducting actual experiments (e.g., Reiss 2012, 177). It has been argued that thought experiments serve several functions in economics, such as establishing causal facts, isolating tendencies, justifying social institutions and their persistence, establishing the credibility of mathematical models to draw inferences about economic reality, and questioning established scientific commitments (Reiss 2012, Maas 2014, Thoma 2016). In this paper, I suggest that thought experiments have had an additional function, one not yet discussed in the philosophical literature on thought experiments in economics. Looking closely at a particular case in early econometrics, I argue that thought experiments were employed to bridge the gap between established economic theories and statistical data.
Specifically, thought experiments played two central roles. First, they helped justify the core behavioral principles underlying economic theory to establish a quantitative definition of unobservable entities, such as utility. Second, they allowed for operationalizing such economic concepts. Some econometricians considered this to be highly beneficial. To establish theoretical regularities and test them, the variables used in economic theories required measurement. However, as mental concepts such as utility did not have an observable counterpart, it was impossible to specify them on the basis of direct observation. Thought experiments allowed for specifying such concepts without conducting actual experiments to investigate the psychological mechanisms behind human behavior. They enabled the measurement of relevant theoretical relationships and variables by connecting them to statistical data.1

In support of my argument, I offer a historical case study: the "imaginary experiments" (Frisch 1932b, 102) of the Norwegian economist and Nobel laureate, Ragnar Frisch (1895-1973). Frisch is well known today as one of the founders of modern econometrics.2 In his paper Sur un Problème d'Economie Pure (1926) and a series of lectures in the early 1930s, Frisch introduced his hypothetical experiments to analyze demand behavior. According to Frisch, such experiments freed economic reasoning from the unwanted method of introspection and from metaphysical commitments that could enter economic analysis through mental concepts such as utility. He considered both to be major obstacles on the road to developing economics into a proper science comparable to physics. By using thought experiments, Frisch circumvented the problem of directly observing utility via actual experiments without purging the concept of utility from economic theory altogether.
As such, his thought experiments helped Frisch to empirically support the theory's most important implications, such as the laws of demand and supply, without the input of new empirical findings.3

By establishing that Frisch's imaginary experiments were in fact thought experiments and analyzing their merits in econometrics, this case study contributes to the literature on the nature and role of thought experiments in economics. Because Frisch's imaginary experiments have not yet been extensively researched, this paper also provides the first detailed account of those experiments. Additionally, the paper offers a fresh perspective on the more general debate about the justification of behavioral principles in economics. Economists and philosophers have long debated the methodological status and role of the behavioral principles economic theory is grounded upon. Many economists have long abstained from including concepts and findings from psychology in economic theory and from empirically testing their behavioral principles. Yet their neglect has also provoked much criticism. In showing that Frisch's hypothetical experiments bridged the gap between theory and statistical data, this case study helps us clarify some of the positions economists have taken in this debate and better understand the justifications they have offered for their positions.4

The paper is structured as follows. To introduce sufficient historical and philosophical background to set Frisch's hypothetical experiments in context, I first present a sketch of William Stanley Jevons's conceptual ideas concerning utility that functioned for Frisch as a departure point (Section 2). I then outline the debate about utility measurement in economics as it stood around the turn of the 20th century (Section 3).
Because there are very few models for Frisch's hypothetical experiments, considering some early instances of precursors proposed by Vilfredo Pareto and Irving Fisher reveals their character (Section 4). Looking at their hypothetical experiments also shows how closely aligned Frisch's methodology was with that of his predecessors. I then turn to Frisch's views on the nature of quantification and the role of axiomatization in quantifying and measuring marginal utility (Section 5). I introduce Frisch's axiomatic choice theory (Section 6) and finally show how his hypothetical experiments bridged the gap between economic theory and statistical data, which in turn allowed for the empirical testing of economic theory (Section 7). Against this background, I discuss the status of Frisch's hypothetical experiments and argue that they should be understood as thought experiments (Section 8).

2. Jevons' Dream: Measuring Utility before Ragnar Frisch

Frisch was strongly influenced by William Stanley Jevons's views on utility measurement and would fuse them with a set of solutions Irving Fisher proposed at the end of the 19th century. It would become Frisch's self-proclaimed goal to "realize the dream of Jevons" (Frisch 1926/1971, 386), namely to "quantify at least some of the laws and regularities of economics" (Frisch 1981, 3; italics in original). By enabling utility to be measured, Frisch's hypothetical experiments – so I argue – bridged the gap between economic theory describing economic laws and statistical data. Those experiments are therefore best understood in light of Jevons's work and the debates about quantification and utility measurement that followed him at the beginning of the 20th century.5

Roughly speaking, measurement is the process of mapping a property of an object found in the empirical world onto a set of numbers (Boumans 2005, 859).
One major question in debates about economic measurement is how to arrive at meaningful numbers that inform us about the phenomena in question. This is a special concern because, first, measurement in a controlled lab is often not possible and, second, measurement of social and mental phenomena potentially invites a host of problems, such as inaccuracy, subjectivity, and challenges to quantification more generally (e.g., Boumans 2015). An important role in measurement is played by models, which can function as instruments that allow the quantification of a phenomenon by making it observable in terms of numerical facts (Woodward 1989). Such measurement instruments are sometimes not easy to produce and quantification processes are generally not trivial; yet, obtaining proper instruments and quantitatively measuring specific phenomena are especially challenging in field sciences such as economics.

Before Frisch became interested in utility measurement in econometrics, economists had long debated how sensations could be quantified and measured. At the end of the 19th century, skeptics argued that sensations do not possess properties like homogeneity that were required to undertake measurement operations such as the addition and difference of magnitude in psychology. Furthermore, the comparison of magnitudes of sensations across time, contexts, or people was considered impossible, largely because of the subjective dimension of sensations (Moscati 2013, 382). At the same time, economic theories made at least implicit reference to mental states, particularly to sensations like pleasure and pain. This is why the problem of quantifying and measuring utility had long been a major concern in economics. Economists began to address the problem of measuring utility at the latest by the end of the 19th century, when the concept of utility was introduced as part of the conceptual core of the Marginalist theory of exchange value.
In the 1870s, with the Marginalist revolution, Jevons, the Austrian economist Carl Menger, and the French economist Léon Walras introduced utility as a key concept in political economy. Jevons aimed to measure the exchange value (or price) of commodities, and he treated utility as the measurable unit indicating this value. He assessed the utility ratios of different commodities, that is, whether and by how much the utility of one commodity exceeded that of another, where commodities could be money as well as other goods (Moscati 2013). The key notion in their theories was not the total utility a commodity would offer to an individual but rather the marginal utility, i.e., the additional utility that consuming an extra unit of a commodity offered.

The problem Jevons faced was that the unit for measuring exchange value was not fixed as a unit of physical magnitudes. He had established economics as concerned with the calculus of pleasure and pain. Jevons thought of pleasure and pain as quantities capable of "being more or less in magnitude" (Jevons quoted in Maas 2005a, 172; italics in original). For him, economics was the study of those quantities and the relations between them. But Jevons acknowledged that pleasure and pain could not easily be defined and measured as quantities – their numerical expression was not possible. Furthermore, there was no instrument available for their quantification (Moscati 2013).6

To understand Frisch's hypothetical experiments, it is also important to note that it was Jevons's idea that, like gravity, pleasure and pain were only indirectly measurable through their observable market effects. More specifically, the market price of a commodity was the only test of the (marginal) utility that a commodity provided to a consumer. This was because of the assumed relationship between observable demand behavior, the marginal utility of the commodity, and the relative market price of that commodity.
Jevons believed that "if we could tell exactly how much people reduce their consumption of each important article when the price rises, we could determine, at least approximately, the variation of the final degree of utility [i.e. marginal utility] – the all-important element in Economics" (Jevons quoted in Chipman 1998, 59). Marginal utility and its changes could thus be indirectly observed and determined through its manifestations in observable consumer behavior and the changes of that behavior in reaction to price changes.7

In light of these arguments, Jevons specified utility further in his theory of exchange value. While he took utility to be "a quality of things," he made clear that utility for the economist is not an inherent property of a good (Jevons 1871/1888, III.13). Rather, Jevons defined it as a relational property, "better described as a circumstance of things arising out of their relation to man's requirements" (ibid.). To characterize utility in terms of its properties, Jevons postulated various principles. One crucial principle underlying his law of the variation of utility depicted in Figure 1 was that the marginal utility obtained from consuming an additional unit of a particular commodity (y) would decrease with the quantity of that commodity consumed (x). If an individual had consumed the quantity oa, the marginal utility of that quantity corresponded to the length of the line ab in Figure 1 (Jevons 1871/1888, III.22). Frisch's hypothetical experiments would later help to measure the marginal utility of commodities.

Figure 1: Jevons' law of the variation of utility (Jevons 1871/1888, III.21).

Marginal utility determined the laws of exchange between different commodities in Jevons's theory and was thus crucial to the theory of market exchange.
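Jevons's principle that marginal utility decreases with the quantity consumed can be illustrated numerically. The following is a minimal sketch, not Jevons's own procedure: the logarithmic form of the utility function and the names `total_utility` and `marginal_utility` are assumptions chosen purely for illustration.

```python
import math


def total_utility(x: float) -> float:
    """Hypothetical total utility of consuming quantity x (assumed form)."""
    return math.log1p(x)


def marginal_utility(x: float, dx: float = 1e-6) -> float:
    """Approximate the final degree of utility du/dx by a central difference."""
    return (total_utility(x + dx) - total_utility(x - dx)) / (2 * dx)


# Evaluate the marginal utility at increasing quantities consumed.
quantities = [1.0, 2.0, 4.0, 8.0]
final_degrees = [marginal_utility(x) for x in quantities]

for x, mu in zip(quantities, final_degrees):
    print(f"x = {x:4.1f}  marginal utility ~ {mu:.4f}")
```

Under this assumed function, each further unit adds less utility than the previous one, which is the downward-sloping shape the text attributes to the curve in Figure 1.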
Marginal utility was indirectly measurable by prices or by money, and it would potentially become directly measurable in the future.8 Jevons's understanding of the indirect measurability of marginal utility via prices was circular. Exchange value (or price), the explanandum of his theory, depended upon marginal utility, while marginal utility was meant to be measurable via exchange value. It was ultimately Jevons's dream to find a way to measure utility. Frisch's hypothetical experiments would play a crucial role in pursuing that dream while avoiding Jevons's circularity.9

3. Frisch's Behavioristic Approach to Utility Measurement

Frisch had been frustrated with the work done in utility analysis since Jevons. From the start of his career in the early 1920s, he criticized loose reasoning and the use of vague concepts like utility in demand theory. He was convinced that they could be avoided by using better concepts and mathematics. Frisch also rejected introspection as an alternative method to access mental states, which had been a method accepted by some economists to justify the main propositions of utility theory. Appealing to "a certain mental association process in the reader or the listener" could only lead to a subjective and blurred understanding of utility (Frisch quoted in Bjerkholt and Qin 2010, 82). Frisch was convinced that for economics to develop into a science comparable to physics, a quantitative concept of marginal utility was needed. It required an objective definition, which would allow for its measurement (Frisch 1926/1971, 386).
Only then could economic theory describing abstract economic laws, such as the law of demand and supply, be subjected to the same numerical testing and empirical verification as found in the natural sciences.10

Frisch's lectures at Yale University in 1930 and his Poincaré lectures in Paris in 1933 reveal that he did not consider the task of utility measurement in economics as similar to that of a psychologist directly investigating mental states (see Bjerkholt and Dupont 2009, Bjerkholt and Qin 2010). While Frisch recognized that utility could be defined as a quantity in the "psychological" sense of Edgeworth, he committed himself to a "behavioristic theory of choice" (Frisch 1930/2010, 83). Frisch clarified that he used terms such as 'behavioristic' in a "special economic-technical sense" to explicitly distinguish them from the psychological and behavioristic schools in psychology (Frisch 1930/2010, 83). His thinking here was much influenced by the American economist Irving Fisher, who had also been reluctant to accept psychological foundations for utility theory (Lenfant 2012, 118).

Frisch's theory of choice and his use of hypothetical experiments were strongly influenced by an ongoing debate between Fisher and the Irish economist Francis Edgeworth. The debate was over whether utility was directly or indirectly measurable in Jevons's sense and thus over whether economics required a psychological or a behavioristic approach. Edgeworth and Fisher agreed on the importance of a workable measure of utility but disagreed on whether economics should focus on the human psyche or on the observable behavior of economic agents to measure it. Edgeworth believed that utility was quantifiable and could in fact be measured directly in a similar way to temperature.
He was also convinced that its sensory measurement could be undertaken, if economists only had the appropriate instruments (e.g., Colander 2007, Moscati 2013, 381).11 He argued that economics would ultimately get solid physiological foundations.12

Fisher argued that instead of searching for physiological underpinnings of utility, economists should rely upon backward induction from observed behavior to measure utility. The underlying idea was that of Jevons, namely that people's marginal utility was manifested in their observable choices (Colander 2007). According to Fisher, economic agents reveal the marginal utility they receive from consuming specific commodities through their consumption decisions. The assumption was that a consumption decision is made where the marginal utility of an extra unit of the commodity is equal to the price an individual has to pay for this extra unit. If the price is higher than the utility gained, an individual will not consume any more of the commodity. This idea could allow the economist to directly connect utility with demand behavior and thus with objective quantities of consumed commodities and prices (Fisher 1892/2007). More specifically, statistical data of demand behavior would enable economists to indirectly measure the utility that specific consumer groups gained from specific commodities (see, e.g., Colander 2007). This way, any mental variables could be excluded from economic theory. The actual desires of individual agents were irrelevant for the economist, as only the observable "economic act of choice" was needed for measuring marginal utility (ibid., 11). The choice act itself did not need psychological justification, as it could trivially be explained by the simple and acceptable postulate "[e]ach individual acts as he desires" (ibid.; italics in original).13

Frisch was deeply impressed by Fisher's advances in connecting economic theory and statistical analysis.
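The decision rule attributed to Fisher above (consume further units until the marginal utility of an extra unit no longer covers its price) can be sketched in a few lines. This is an illustration under stated assumptions, not Fisher's own method: the marginal utility schedule `10 / q`, expressed in money terms, and the function names are hypothetical.

```python
def marginal_utility_of_unit(q: int) -> float:
    """Hypothetical money-valued marginal utility of the q-th unit (assumed)."""
    return 10.0 / q


def units_demanded(price: float, max_units: int = 1000) -> int:
    """Buy one more unit as long as its marginal utility covers the price."""
    q = 0
    while q < max_units and marginal_utility_of_unit(q + 1) >= price:
        q += 1
    return q


# "Observed" demand behavior: quantities bought at various prices.
prices = [10.0, 5.0, 2.5, 2.0, 1.0]
observed = [(p, units_demanded(p)) for p in prices]

# Reasoning backward from behavior: each observed (price, quantity) pair
# brackets the unobserved marginal utility of the last unit bought.
for p, q in observed:
    print(f"price {p:5.2f} -> {q:2d} units; "
          f"MU of unit {q} is at least {p:.2f}, MU of unit {q + 1} is below it")
```

The observer in this sketch never inspects `marginal_utility_of_unit` directly; the price and quantity pairs alone pin down bounds on it, which is the sense in which utility is measured indirectly through choices.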
Crediting Fisher's dissertation as the first work to develop a promising theory of choice, Frisch fully adopted Fisher's anti-psychologist attitude (see Frisch 1930/2010, 83-84, Frisch 1932a, 1 f., Frisch 1932b, 102). He was equally convinced that a behavioral theory of choice was sufficient to quantitatively measure utility and make predictions about demand behavior while circumventing the problem that mental states were not observable. He argued that with a set of specific quantitative relations, which could in turn be deduced from a set of imaginary experiments, utility could be defined as a quantity (Frisch 1932b, 103). That would require a strictly axiomatic theory of choice of the kind Frisch would ultimately construct in the 1920s. Such an objective definition of utility suggested by Fisher then held out the possibility of developing the much-needed quantitative theory of demand and exchange more generally (Frisch 1932b, 103). The advantage then was that in such a deduced theory, observable phenomena and imagined choices could be related. "There appears not only the imaginary choice experiments with which one started but also a number of actually observable phenomena, such as market prices, quantities sold, composition of consumption budget and so on. In other words there appears a number of those things in regard to which modern statistics of prices and consumption provide us information. Thereby the connection is joined between theory and observation" (Frisch 1932b, 103).

As did Fisher, Frisch doubted that observed regularities could be used to reveal anything about people's mental states or "psychological motivation mechanisms" (Frisch 1930/2010, 84).
He wanted to overcome this difficulty "by confining the analysis to the observable choice regularities" only (Frisch 1930/2010, 85).14 These regularities could be observed at the market level, in data about relationships between observable price and income levels, as well as consumption at different points in time. The "psychological motivation" behind the regularities was irrelevant: In a behavioristic theory of choice "[i]t is on the choice acts themselves that the utility definition is based" (Frisch 1930/2010, 84; italics in original).

Taking up Jevons's idea of indirect measurement outlined above, the data needed for a behavioristic approach were limited to statistical data about behavioral patterns, such as demand, commodity prices, and income. According to economic theory, marginal utility was directly linked to such observable patterns. Because those patterns were observable and quantifiable, marginal utility could be objectively measured on the basis of this data. However, the goal was not to measure mental states but only to measure changes in economic regularities, described for example by the laws of demand and supply. Jevons had argued for this possibility because although individuals would deviate from the behavioral assumptions of utility theory, those individual deviations would cancel out on the macro-level. Fisher had argued that while the marginal utility for an individual was not measurable in this way, the relevant economic laws were. This was because the measurement of "the whole [was] simpler than its parts" (Fisher quoted in Colander 2007, 222). As such, a behavioral theory of choice fully dispensed with psychological concepts and could even accommodate the idea that individuals might have irrational motives, which Edgeworth had taken as preventing a quantitative definition of marginal utility.

4. Pareto's and Fisher's Experiments to Justify Indifference Curves

As we will see in this section, Frisch's hypothetical experiments were foreshadowed by Fisher, who used similar experiments to justify his behavioral theory of choice. Frisch's views were also closely connected to Vilfredo Pareto's work on ordinal utility theory. In defending his ordinal utility theory, Pareto had shown that the major results of utility theory were independent of utility measurement in the classical sense conceived by Jevons. Ordinal utility theory was grounded on the idea that it was only the difference between two quantities of utility and not the size of the quantities themselves that mattered in deriving the main results of demand theory. This was an important step towards a non-psychological interpretation of utility. For Pareto, the psychology of a human agent could be ignored in the economic theory of price and exchange; observable behavior was sufficient to construct the indifference curves underlying demand analysis. Graphically representing different quantities of two commodities between which a consumer was taken to be indifferent and their distinct utility levels respectively, indifference curves could simply be represented by a utility index function that would in turn allow the economist to derive the laws of the market and the market equilibrium (Lenfant 2012, 117-18).

However, this anti-psychological picture would raise the question of the origin of the shape of the indifference curves. Indifference curves were on the one hand thought to give a positivist foundation to demand analysis, while, on the other hand, conceived as theoretical constructs to arrive at an index utility function (Lenfant 2012, 119-20). This tension would also become visible in Frisch's work. Pareto himself had justified indifference curves and the utility index in various ways.
For their construction and to arrive at the respective utility function, he had referred to introspection, to everyday experience, and to observed behavior. But importantly, Pareto was one of the first economists also to argue for the option of experiments on people's preferences. It was important that such experiments, if not practically feasible, were theoretically possible. Such theoretical possibility of empirically constructing indifference curves was sufficient for an economic theory of choice according to Pareto (Lenfant 2012, 119 ff.). One way in which Pareto specified the idea of 'theoretically possible' was by introducing 'hypothetical' experiments in his Manual to hypothetically inquire into people's tastes. Those experiments foreshadowed later experiments of Fisher and Frisch. To identify the indifference curves and a utility-index function that Pareto considered necessary to obtain precise data for statistical analysis, it was sufficient for these experiments to be possible only in principle (Pareto 1909/1971, 415).

What did those experiments look like? To find out about consumer tastes, Pareto asked his reader to imagine – with him – some hypothetical scenario in which the reader would place herself in the shoes of the experimenter questioning some individual or household about hypothetical choices they would make in some hypothetical choice scenario. The reader thereby not only had to put herself in the shoes of the experimenter but also in those of the fictitious consumer. One of Pareto's experiments reads as follows:

Instead of conducting experiments to determine the indifference lines, or varieties, let us make some experiments to find out what quantities [x, y, z, ...] of goods [X, Y, ...] the individual will buy at certain given prices [py, pz, ...]. We set y0 = 0, z0 = 0, ... and give x0 a certain value; the experiment will reveal the quantities y, z, u, ... which the individual purchases by disposing of a part of his x0 amount of X.
Let us repeat these experiments varying x0; we will get the values of y, z, u ... as a function of x0, py, pz, ....

According to Pareto, after a set of calculations, the experiments would give the same result as the analytical derivation when considering indifference varieties. "We could deduce the theory of economic equilibrium directly from the experiments which have just been indicated. Indeed, these experiments give us py = ay, pz = bz, ... where ay and bz are known functions. ... The point of equilibrium is determined. But in this method, as long as the experiments are not actually performed, we do not have the few notions about the quantities ay, bz, ... which at least are furnished by the consideration of choices" (Pareto 1909/1971, 415-16).

Pareto knew that the information obtained from such experiments was not the complete differential equation of an indifference curve but only the ratio of marginal utilities at a specific point (Lenfant 2012, 118). The obtained preference ordering over commodity bundles reflected the tastes of the individual for multiple combinations of commodity quantities. However, it was not intended to give a detailed description of a person's actual mental states in a particular decision situation (Pareto 1927/1972, 127). Rather, as the following quote by Pareto indicates, such experiments relied on Jevons's idea that it is the logical relationship between observable behavior and theoretical utility considerations that the economist is interested in. "[W]e are concerned only with certain relations between objective facts and subjective facts, principally the tastes of men. Moreover, we will simplify the problem still more by assuming that the subjective fact conforms perfectly to the objective fact. This can be done because we will consider only repeated actions to be a basis for claiming that there is a logical connection uniting such actions" (Pareto 1909/1971, 103).
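Pareto's purchase experiment (fix an endowment x0 of X, announce a price for Y, record the quantity of Y bought, then repeat with a different x0) can be mimicked in a small simulation. This is a sketch under strong assumptions that are not Pareto's: there are only two goods, X serves as the unit of account, and the simulated individual's tastes follow a hypothetical Cobb-Douglas form. The names `utility` and `purchase` are illustrative.

```python
def utility(x: float, y: float, share: float = 0.5) -> float:
    """Hypothetical tastes of the simulated individual (assumed form)."""
    return (x ** share) * (y ** (1.0 - share))


def purchase(x0: float, py: float, steps: int = 1000) -> float:
    """Quantity of Y bought by giving up part of x0, found by grid search."""
    best_y, best_u = 0.0, utility(x0, 0.0)
    for i in range(1, steps):
        y = (x0 / py) * i / steps        # an affordable quantity of Y
        u = utility(x0 - py * y, y)      # what remains of X, plus Y
        if u > best_u:
            best_y, best_u = y, u
    return best_y


# Repeat the experiment while varying the endowment x0 and the price py.
# The resulting table gives y as a function of x0 and py.
table = {(x0, py): purchase(x0, py)
         for x0 in (10.0, 20.0)
         for py in (1.0, 2.0)}

for (x0, py), y in sorted(table.items()):
    print(f"x0 = {x0:5.1f}, py = {py:3.1f} -> y = {y:5.2f}")
```

The experimenter in this sketch only ever sees the table of purchases, not the `utility` function that generated them, which mirrors the point that the experiments record choices rather than mental states.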
Pareto's experiments allowed him to justify the logical relationship between observable behavior and tastes in concrete yet hypothetical choice scenarios.

Discussing the work of Jevons and Pareto, Fisher justified his behavioral theory of choice by what he called 'metaphoric experiments'. In his doctoral dissertation, Fisher suggested using such experiments to find out about variations in people's utility levels. Fisher described one experiment as follows:

Confine attention first to two commodities (a) and (b) consumed by one individual. Let this individual first arrange this whole consumption combination to suit himself. Then in order to partially analyze this equilibrium of choice let us metaphorically experiment on him as follows. He is directed to alter this consumption combination by arranging his quantities A and B of the two selected commodities (a) and (b) in all possible ways, but without changing the quantities C, D, etc. of other commodities. The marginal utility of each will vary not only in relation to its own quantity but also the quantity of the other commodity. Thus,

$$\frac{dU}{dA_1} = F(A_1, B_1), \qquad \frac{dU}{dB_1} = F(B_1, A_1)$$

These may be regarded as derivatives with respect to $A$ and $B$ of

$$U_1 = \varphi(A_1, B_1)$$

where $U_1$ is the total utility to I of the consumption combination $A_1$ and $B_1$ (Fisher 1925/2012, 68).

Fisher's experiment already comes close to the thought experiment that Frisch would eventually propose. Like Pareto, Fisher asks us, his readers, to imagine ourselves as choice makers in a particular situation in which we have chosen bundles of two commodities in such a way that it maximizes our utility; we are in an equilibrium situation in which the marginal rates of substitution of both goods are identical.
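To see concretely how the marginal utility of each commodity can depend on the quantities of both, consider a worked example. The functional form below is a hypothetical choice made only for illustration; Fisher's passage leaves the total utility function (written here as φ) unspecified.

```latex
\begin{align*}
% Hypothetical total utility (an assumption, not Fisher's)
U_1 &= \varphi(A_1, B_1) = A_1^{1/2} B_1^{1/2} \\
% Marginal utilities: each depends on both quantities
\frac{dU}{dA_1} &= \tfrac{1}{2} A_1^{-1/2} B_1^{1/2},
\qquad
\frac{dU}{dB_1} = \tfrac{1}{2} A_1^{1/2} B_1^{-1/2} \\
% Evaluated at A_1 = 4, B_1 = 9
\frac{dU}{dA_1}\Big|_{(4,\,9)} &= \tfrac{1}{2}\cdot\tfrac{1}{2}\cdot 3 = \tfrac{3}{4},
\qquad
\frac{dU}{dB_1}\Big|_{(4,\,9)} = \tfrac{1}{2}\cdot 2\cdot\tfrac{1}{3} = \tfrac{1}{3} \\
% Raising B_1 to 16 while holding A_1 = 4 fixed
\frac{dU}{dA_1}\Big|_{(4,\,16)} &= \tfrac{1}{2}\cdot\tfrac{1}{2}\cdot 4 = 1
\end{align*}
```

In this example, increasing the quantity of (b) from 9 to 16 raises the marginal utility of (a) from 3/4 to 1 even though the quantity of (a) is unchanged, which is the kind of interdependence the quoted passage describes.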
Fisher's metaphorical experiment has the character of an experiment in that we are asked to imagine a direct manipulation of this equilibrium situation, namely the rearrangement of the commodity bundle in multiple ways, considering and imagining all the possible combinations of the two commodities, holding the quantities of all other commodities in the bundle constant, and then reflecting upon the implications of these rearrangements for our utility considerations. Fisher appeals to our intuitions about how we would value commodities on the market and in relation to each other. Nevertheless, the experiment concerns a hypothetical scenario. It would be cognitively difficult or even impossible for any reader to imagine all possible combinations of the two commodities and clearly determine for each possible combination how much the change in marginal utility of one good depends upon the changed quantity of both goods in the bundle.

Fisher used this thought experiment to justify his quantitative definition of marginal utility. By imagining the various potential combinations of the quantities of two commodities an individual could choose, given a fixed income, the experiments allowed him to derive the properties of marginal utility by way of the indifference map (Figure 2) without actually making the effort to reconstruct the indifference curves in a first step from actual data or observation.15

Figure 2: Indifference curve map (Fisher 1892/2007, 68).

What Fisher demonstrated with his experiments becomes clear when we look at his graphical demonstration of the indifference curve map. Instead of interrogating a person directly or observing her behavior, he imagined the hypothetical judgments of a 'typical' individual regarding her ordering of two commodity bundles, A and B.
This allowed Fisher to arrive at a generalized view of the ordering of all possible combinations of two commodities and their marginal utilities for a typical individual and to determine her utility function. He derived the whole map of indifference curves of a typical individual consumer for each possible situation. The utility function could then be applied to statistical data for estimating the marginal utility of a particular good. Fisher considered this quantification of utility objective because the indifference curves that he identified appealed to general intuitions about the consumption of goods, such as, for example, the law of diminishing marginal utility. His experiments thus mainly demonstrated a theoretical result – a definition of marginal utility and the map of indifference curves. Unlike Jevons, Fisher did not simply postulate the existence of a utility function without further justification. But as he still had to postulate the indifference curves and utility index to determine the utility function, he used metaphorical experiments to make this postulation plausible.

5. Axiomatic Foundations for Quantifying Utility in Demand Analysis

Frisch directly expanded on Fisher's work. But to develop Fisher's choice theory one step further, he formulated an axiomatic foundation for utility measurement that theoretically grounded indifference curve analysis and gave an axiomatic definition of utility. This was highly innovative, as the axiomatic method was a new mathematical approach for economists in the 1920s. While axiomatic choice theories have been well known in economics since John von Neumann and Oskar Morgenstern's axiomatic formulation of the expected utility principle in their Theory of Games and Economic Behavior, Frisch's axiomatization of consumer demand theory was presumably the first of its kind in economics (Arrow 1960, 176).
Echoing Fisher, Frisch did not aim at providing an explanatory choice theory or at formulating testable hypotheses about individual behavior. In Frisch's picture, "man's choice is similar to that of a bird which selects between two distant alternatives", not between any two alternatives in an actual choice situation (Georgescu-Roegen 1954, 122). His axiomatic system provided a formal "representation" of the outcome of choice, namely demand behavior of the typical individual (Frisch 1926/1971, 387). Furthermore, the axiomatic method enabled him to generate estimates about demand behavior from incomplete statistical data about consumption, prices, and income. Frisch's understanding of the quantification of utility was directly related to his research program in econometrics. For the econometrician of his day, economics had two dimensions, one theoretical and one empirical.16 As Frisch recapitulates from a memorandum about a meeting with Joseph Schumpeter and Gottfried Haberler in 1928 when founding the Econometric Society, "[t]he terms econometric and econometrics are interpreted as including both pure economics and the statistical verification of the laws of pure economics, in essential distinction to the purely empirical manipulation of statistical data on economic phenomena" (Frisch 1970, 21). By formulating solid theoretical foundations for empirical analysis, Frisch wanted to ensure rigorous scientific analysis in the tradition of Jevons, Pareto and Fisher to distinguish scientific from non-scientific research. Furthermore, Frisch stressed the importance of verification of economic laws; the econometrician had "to subject abstract laws of theoretical political economy or 'pure' economics to experimental and numerical verification, and thus to turn pure economics, as far as is possible, into a science in the strict sense of the word" (Frisch 1926/1971, 386).
To test and, if possible, verify economic theory against experimental or statistical data, an objective definition of utility as a quantity was essential. For Frisch, quantification had two aspects, which guaranteed that utility measurement, too, had both a theoretical and an empirical dimension. It had an axiomatic aspect, referring to the abstract quantification by means of mathematics that was necessary for establishing a basis for theoretical analysis. To arrive at a quantitative theory of economic relations, Frisch wanted to establish logical and quantitative definitions of utility. And it had a statistical aspect, referring to its concrete quantification by means of numerical data to fill abstractly formulated quantitative relationships of economic theory, test them, and show "how the theoretical laws manifest themselves at present in this or that industry or for this or that consumption category, etc. The true unification of these quantitative elements is the foundation for econometrics" (Frisch 1933/2009, 5). For Frisch, the abstract dimension of quantification required the axiomatic method, while the concrete dimension of quantification needed data mainly from statistics but could in principle also come from experiments. Against this background, two steps were required for quantifying and measuring utility. First, it required the formulation of choice axioms that gave a rigorous quantitative definition of utility. Second, it required a method of measuring utility statistically and the application of this method to actual data. Frisch showed that the postulation and measurement of a marginal utility function could be justified by an axiomatic structure that represented an agent's preferences in terms of a binary relation à la Pareto such that the existence of a utility function could be formally proven. Frisch's axioms would ground the behavioral choice theory that Frisch took as the theoretical basis for a theory of the market.
Frisch's hypothetical experiments were meant to bridge the gap between economic theory and empirical analysis by justifying those axioms. Giocoli (2003) distinguishes between two approaches to axiomatization, both of which can be found in the economic theory of the first half of the 20th century and help us to better understand Frisch's endeavor. Frisch represented the first approach while later attempts by Gerard Debreu to axiomatize the theory of value are representative of the second (Debreu 1959, Weintraub 2002). The approaches are distinct mainly with respect to the reasoning procedures involved, their relationship to empirical evidence, and the purposes for which they were undertaken. The first approach to axiomatization sought to validate results that had already been established empirically. The idea was to start from some empirically confirmed theory and then see whether the results of the theory could also be obtained through deduction from a limited set of axioms that were necessary and sufficient to theoretically establish the result. The interest was primarily to use the axiomatic method in order to ensure that the theory would be subjected to tests, thereby ultimately allowing for explanation and predictions and at the same time fulfilling the prerequisites necessary to conduct the empirical analysis. This double role could only be fulfilled when the axioms could be justified empirically. In contrast, the second approach to axiomatization was a highly formal procedure and as such open to application to a much broader set of problems. One started from primitive concepts, such as a binary relation, formulated axioms that gave structure to the relation and at the same time satisfied certain consistency conditions, and then deduced theorems from those axioms according to specific logical rules of inference.
This approach, later represented by Debreu (1959) but to a certain extent also by John von Neumann and Oskar Morgenstern (1944), was largely empirically neutral in the sense that the axioms and theorems, consisting only of a set of symbols, required interpretation only in a second step in order to be tested empirically – their empirical interpretation and application was not automatically given or suggested by the axiomatic formulation of a theory. This also increased the generality of the framework, as its formal-mathematical formulation allowed for application to structurally similar yet discipline-independent problems, thereby making it highly flexible with respect to its use. Frisch's attempts to axiomatize the theory of consumer behavior were of the first kind. Frisch sought to theoretically support results that were empirically already established, such as the law of demand and supply (Giocoli 2003).17 That the use of the axiomatic method allowed for exactness in defining theoretical terms and concepts was a natural thought for Frisch. Yet he did not consider axiomatization as useful for its own sake or for the sake of the formalism involved, which certainly appealed to later economists like Debreu. Axiomatic economics "is abstract, but neither in the sense of a logic game nor in the sense of metaphysical verbiage, of which we have had some in economics, at times. Axiomatic economics will construct its quantitative notions in the same way as theoretical physics has constructed its quantitative notions" (Frisch quoted in Bjerkholt and Dupont 2007, 5).18 Axiomatization, while implying a certain degree of abstraction and generalization, was a step towards developing a scientific basis of economics. While concerned with human behavior, Frisch wanted to bring the rigor of the natural sciences to demand analysis.
He introduced axiomatics "to find a basis from which [the sort of mathematics – arithmetic, algebra or geometry – that is useful for the economist] can be deduced by a rigorous logical structure" (Frisch 1930/2010, 106). At the same time, his motivation was ultimately an empirical one, making use of mathematical economics as complementary to statistical data analysis.19 His challenge thus became establishing a link between deductive theory and empirical analysis in line with his epistemology in econometrics. As in the natural sciences, economic theory would "draw its fundamental conceptions from the actual observation technique" and thereby eventually develop into something close to an "experimental science" (Frisch 1932, 99-100; italics in original). To meet this challenge, Frisch's hypothetical experiments would play a crucial role.

6. Two Types of Axioms for Measuring Consumer Utility

Frisch introduced his quantitative definition of utility in his paper Sur un Problème d'Economie Pure, published in 1926. The paper focused primarily on consumer behavior and the measurement of the marginal utility of money. Its goal was to measure utility in the tradition of the physical sciences and thereby realize Jevons's dream. Achieving this was only three concrete steps away: "(1) To point out the choice axioms that are implied when we think of utility as a quantity, and to define utility in a rigorous way by starting from a certain set of such axioms; (2) To develop a method of measuring utility statistically; (3) To apply the method to actual data" (Frisch 1932, 2-3). Measurability for Frisch meant in this context the numerical representation of the marginal utility of a particular commodity for an individual consumer on the basis of an objective definition of marginal utility. Departing from Pareto's idea of measuring utility through understanding choices in terms of binary comparisons (Chipman 1971, 326), Frisch introduced two types of axioms for binary choices.
These axioms were meant to capture the properties of marginal utility as the object under measurement. They would allow for the representation of particular patterns of choice and the formal establishment of the utility function, constrained or unconstrained, which in turn could be tested empirically. The first set of axioms related to binary comparisons of commodity bundles p and q (or choice objects), understood as infinitesimal 'displacements' from a vector of the given initial bundle x that Frisch took as the starting point of the analysis. The second set of axioms related to binary comparisons of pairs of commodity bundles p and q of distinct displacement vectors. In each pair, the bundles were treated as infinitesimal displacements from the original bundles x0 and x1; in this case, the binary comparison concerned the bundles (x0 + p) and (x1 + q).20 For each case, Frisch suggested three axioms to give structure to the binary relation (see also Chipman 1971, 326):

a) The axiom of choice (completeness), which stated that the choice of an individual between two options was always well-defined. This axiom guaranteed that there existed a preference relation such that the agent would either be indifferent between the two options or prefer one to the other. Hence, Frisch assumed that the economic agent would be able to compare combinations of choice objects pairwise and would reach a definite choice in each case;
b) The axiom of coordination (transitivity), which stated that choices are non-circular, thereby ensuring the absence of inconsistencies in choice;
c) The axiom of addition, which allowed for the approximation of indifference varieties in small neighborhoods by their supporting hyperplanes.21

Together, the first set of axioms, i.e., the axioms relating the displacement vector to a given initial commodity bundle, ensured the existence of an ordering of the displacements and, as such, served as an ordinal measure of the marginal utility of those displacements. In combination with the second set of axioms, i.e., the axioms relating the displacement vector to different initial positions, Frisch could describe individual choices through the marginal utility the individual would gain for any displacement around an (arbitrary) initial position x and could thus characterize the whole "choice field"22 of that individual in the choice space.23 As such, his set of axioms allowed for a complete numerical representation of an individual's preferences, thereby measuring marginal utility in a systematic way.24 Taking the step to determine the whole choice field of an individual consumer from a set of axioms allowed for more than the numerical estimation of marginal utility at each point of the choice field.
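What the first two axioms demand of a binary relation can be made concrete on a small finite set of options. The following is a minimal sketch under stated assumptions, not Frisch's formalism: the displacement labels, the score table, and the function names are hypothetical, and the axiom of addition is not modeled.

```python
from itertools import product
from typing import Callable, Iterable

# A weak preference relation: weakly_prefers(p, q) is True when p is chosen over q
# or the agent is indifferent between them.
Relation = Callable[[str, str], bool]


def is_complete(options: Iterable[str], weakly_prefers: Relation) -> bool:
    """Axiom of choice: every pair of options yields a definite answer."""
    opts = list(options)
    return all(
        weakly_prefers(p, q) or weakly_prefers(q, p)
        for p, q in product(opts, repeat=2)
    )


def is_transitive(options: Iterable[str], weakly_prefers: Relation) -> bool:
    """Axiom of coordination: choices are non-circular."""
    opts = list(options)
    return all(
        weakly_prefers(p, r)
        for p, q, r in product(opts, repeat=3)
        if weakly_prefers(p, q) and weakly_prefers(q, r)
    )


# Hypothetical displacements from an initial bundle, labeled for illustration.
DISPLACEMENTS = ["p", "q", "r"]

# An agent whose answers follow an assumed score table satisfies both axioms.
SCORES = {"p": 3, "q": 2, "r": 1}


def consistent_agent(a: str, b: str) -> bool:
    return SCORES[a] >= SCORES[b]


# An agent with circular answers (p over q, q over r, r over p).
CYCLE = {("p", "q"), ("q", "r"), ("r", "p")}


def cyclic_agent(a: str, b: str) -> bool:
    return a == b or (a, b) in CYCLE


if __name__ == "__main__":
    for name, agent in [("consistent", consistent_agent), ("cyclic", cyclic_agent)]:
        print(name, is_complete(DISPLACEMENTS, agent), is_transitive(DISPLACEMENTS, agent))
```

The cyclic agent answers every choice question definitely and so passes the completeness check, yet fails the transitivity check; only answers of the first kind admit the ordering of displacements described above.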
As a purely technical concept, utility logically defined in this way also stripped the concept of any (metaphysical) psychological connotation and so sidestepped "sterile discussions about the 'cause' of value and analogous questions" (Frisch 1926/1971, 395).25 The definition characterized marginal utility as nothing but a "coefficient of contingent choice" that made the components of the vector u(x)26 proportional to the prices in the case of market equilibrium, where x denoted the quantity of a commodity and u the marginal utility that quantity provided the agent (Frisch 1926/1971, 394-95). Yet the length of vector u(x) was not defined by the axioms and left room for the distinctive nature of individuals. As such, the definition of marginal utility was "not universal" in that it permitted "the measurement of marginal utilities relating to different individuals. Each individual's choice field is affected by a proportionality coefficient that we have not defined, and which would probably be impossible to define in any objective manner" (Frisch 1926/1971, 394). This was the only difference between individuals that Frisch left room for. Using a set of choice axioms allowed Frisch to circumvent the ad hoc postulation of an individual's utility function, refocus on choice, and thereby turn away from the idea of utility as an actual quantity in people's heads. His solution was theoretical, systematic, and methodologically pragmatic. Yet, it brought him closer than any of his predecessors to measuring the average marginal utility of commodities. He considered it to be an obvious fact that people increased their level of satisfaction from consuming a particular commodity. That was a process through which "everybody has many times lived" (Frisch 1932, 102). However, he did not argue that every single individual would in fact maximize utility in consumption.
His axiomatic theory of demand behavior was a theory and thus could accommodate deviant behavior in individual cases. Frisch left psychology largely outside his conceptual world. His axiomatic approach was not an attempt to reveal or formalize psychological effects. In his view, individuals may be driven by psychological factors in their behavior, but he viewed economics as dealing with the regular behavioral patterns that emerge from whatever provokes an individual to act. While the consequences of people's behavior were the object under investigation for the econometrician insofar as they studied those consequences as reflected in data, neither the causes behind individual action nor the mechanisms that lead to observable behavior were of any interest for the econometrician, nor should they be investigated through observation. Thus, "Frisch's image of science, and of his contemporaries, can be identified as the scientific worldview of (logical) positivism" (Boumans and Dupont-Kieffer 2011, 20). Frisch was concerned with verification of observable economic regularities. It was not necessary for the axioms to capture adequately the causes or mechanisms behind behavior; what mattered was that they could lead to approximately accurate propositions about the consequences of those behaviors on the market level. Nevertheless, ambiguity was perceived concerning the origins of the axioms. If they were not true of the actual psychology behind people's observable choices, where did they come from? Frisch opted for the same strategy as Pareto and Fisher and introduced hypothetical experiments to justify his axioms as well as the concept of a preference ordering more generally.

7. Frisch's Hypothetical Experiments

Frisch introduced his hypothetical experiments in passing in his 1926 paper, in some methodological writings, and in a set of lectures. These experiments gave his rather abstract axioms an empirical flavor.
They entailed hypothetically consulting individual agents in order to find out how they would choose. One of their major ingredients was 'choice questions'. Those were questions about potential choices posed by the scientist to a hypothetical economic agent.27 Frisch imagined an interview with what he called homo oeconomicus, a fictitious rational individual who is asked to imagine making choices in specific situations that the interviewer poses to her and to tell the interviewer her choices. Specifically, Frisch "supposed" that a series of questions "have been posed to a given individual or family" (Frisch 1933/2009, 10).28 As he explained elsewhere, "A system of fictitious interrogation experiments [are] performed on an individual. We invent, so to speak, a series of situations, and imagine that we ask the individual questions as to what he would do in these various situations" (Frisch 1930/2010, 93). The experiments provided the base for "the axiomatic definition of utility as a quantity" (ibid.). They could help the scientist discover how a particular individual would rank her options, as "it is in principle possible, by means of such 'experiments by interrogation' carried out on homo oeconomicus, to determine objectively which of the three cases [i.e., >, <, ≈] occurs" (Frisch 1926/1971, 388). Likewise, Frisch believed it in principle possible through "experiments by trial and error" to ascertain when an individual is indifferent between two displacements (ibid.). As such, the researcher could from "the description of the situations involved in the choice-questions and from the answers given" try to formulate the rigorous definition of utility (Frisch quoted in Bjerkholt and Dupont 2007, 10). Frisch cautioned that in order to do so "[t]he choice-questions must, of course, be such that both the situations and the answers can be formulated in objective terms. Sometimes it may even be necessary to require that they can be formulated in quantitative terms" (ibid.).
For such experiments to generate some information about economic choices, the researcher had to specify two main elements in those choice questions: She had to give (1) a description of the alternative transactions between which an individual could choose, the choice-objects (Frisch 1930/2010, 95), and (2) "to each of the alternatives specified under (1) there must be associated a description of the complete economic situation in which it is assumed that the individual finds himself just before the transaction in question is to be effectuated" – Frisch called this the "delivery-situation" (ibid.). The idea was to systematically reveal choice data about consumer behavior as resulting from the imagined choices of a fictitious economic agent in any possible market scenario that agent would find herself in. It was thereby assumed that her answer to a choice-question would crucially depend on, and possibly vary significantly with, the specific situation she found herself in, even if the choice objects stayed the same. The questions posed in the experiments and the answers Frisch imagined receiving from this individual were dictated by his axioms. As such, the experiments gave choice data of a homo oeconomicus in any possible choice situation to determine the whole choice field of that individual and, therefore, would expand the data set of demand behavior beyond actually observable demand data (which were always limited to actual price/income situations) to behavior in all theoretically possible situations. What was the concrete setup of the hypothetical experiments? Frisch gave the following description, which is worth quoting in full:

Suppose we have a typical working man's family with a certain household budget for the year: certain specified items of food, of clothing, of entertainment, etc.
Let us imagine that all these items are planned in detail for the year and let us assume that the expenses are provided for by the salary of the head of the family (the 'breadwinner'), and that we have reason to believe that everything in the economic life of the family in this year is going to happen just as it is planned. We are here only concerned with this single year and do not take into account what is going to happen in the future. This being so, let us assume that we offer to the family as a present, in addition to the specified budget, either one of the following two things:
1. one pound of ham per month for the coming year;
2. 36 talkie [i.e., a movie with sound] admissions in a year
The family can choose either one of these two things, but when the choice is made it must actually consume the chosen thing. It is not allowed to sell it and instead buy something else. Nor is it allowed, after the choice is made, to make any modifications in the previously specified budget. In this example, the specified budget for this year is the delivery situation and the ham and the talkie admissions represent two choice-objects. The choice question appears in its clearest form when the choice-objects are put up as alternative presents, but in point of principle this is not necessary. Suppose for instance, that ham is included in the specific budget, but no talkie admissions, because up to now there has been no talkie in town. Then to the family's surprise a talkie is opened. We may now ask the question: in case the family had an offer to give up 12 pounds of ham in exchange for 36 admissions, would it accept it or not? In a concrete situation there may perhaps be other alternatives, for instance, to give up just a few pounds of ham and to receive three admissions for each of the pounds given up. The family may even consider other and more far-reaching rearrangements of the budget. But all this we now assume to be excluded.
The question is only: if an exchange of 12 pounds of ham for 36 tickets, without any other change in the budget, is the only offer made, will the family accept it or not? There is only one choice-object now, namely, the transaction consisting in exchanging 12 pounds for 36 tickets, and this object may be accepted or declined. The delivery-situation is the same as before, namely, the one defined by the specific budget (Frisch 1930/2010, 96).

The setup is similar to those experiments described by Pareto and Fisher. Yet it is far more detailed and contains many more elements of actual experiments. The epistemic status of these hypothetical experiments, however, was dubious. First, as the label Frisch gave them suggests, such experiments did not actually have to be conducted in order to facilitate the measurement of utility. Like Pareto, Frisch argued that their execution only had to be theoretically possible. "[I]f we only have the definition of utility in mind, it is not necessary that the interrogation experiments shall be actually possible in a technical and practical sense. It does not matter if the cost of, or the practical difficulties involved in an actual statistical survey would be prohibitive. It is sufficient that the experiments are possible in principle" (Frisch quoted in Bjerkholt and Dupont 2007, 10; italics in original). But Frisch went beyond Pareto's view of in-principle performability of his experiments. Rather than relating them to actual experiments, these imaginary experiments could best be understood as analogies to what Frisch called "axiomatic experiments" in physics, such as the "light signals of relativity axiomatics," which Frisch considered to be not actually conductible in a technical sense and as rather functioning as a particular 'way of thinking' about the speed of light, or even as 'theoretical tools' (Frisch 1932b, 102 ff.). "The logical process" of such experiments in physics is "precisely the same as in those economic studies, which are built up on the axiomatic choice theory." Take, for instance, "the connection between axiomatic theory and actual observation" in economic theory and in, say, the theory of relativity. The connection in both cases is "wholly the same." As Frisch explains: "When the physicist is to define the conception of time in relativistic philosophy he is in the first place awake to the fact that he can make no progress by refering [sic] to the psychological view of the passage of time. He must" (just as Frisch was attempting to do) "build up his definition on experiments." But just as Frisch would say of his own experiments, the experiments conducted by the physicist do not necessarily require to be concrete experiments, actually capable of accomplishment. It is sufficient if they are possible in principle. They are indeed not intended to serve the purpose of acquiring numerical information, but only to give clarity to the thoughts. For this reason, the relativity theorist, in the setting up of the definition of the concept of time, can operate with experiments which are quite valueless in practice, for instance the experiment that one individual sends out a signal flash and another individual sends a signal flash back as soon as he has understood the first signal, and so on. But, subsequently, when a complicated theory has been erected on this basis, the relativity theorist arrives at relations which are verifiable also in a practical sense (Frisch 1932b, 105-6; italics in original). The "logical process" that the physicist went through was, in Frisch's view, identical to the process that an economic theorist goes through (ibid.). Frisch located these thought experiments in physics and the tradition of the Hilbert school that aimed at axiomatizing the foundations of geometry.
Thought experiments in physics served as theoretical tools and were as such only "a way of thinking" that gave "a precise and concrete significance to our ideas" (Frisch 1930/2010, 93). As those experiments were "not actually 'possible' in a technical sense", they were labeled 'experiments' because their "similarity with actual experiments furnishes the preciseness and clearness of thought that are necessary in the logical construction of the science. A similar role is played by the interrogation experiments of utility theory" in economics (Frisch 1930/2010, 10). Frisch's imaginary experiments were also different from actual experiments in other respects, mainly in technique. "Sometimes in economics it is possible to construct an actual interrogation scheme that is very nearly a true copy of an axiomatic interrogation scheme. ... as a rule, the actual observations have a technique of their own and involve a series of practical experiments quite different from those involved in an axiomatic procedure" (Frisch 1930/2010, 93).29 The imaginary experiments that Frisch advocated also differed from actual experiments in that they were merely hypothetical, i.e., not actually conducted with "some living individual" (ibid.), which is why they presupposed a set of choice axioms that determined the reaction of a fictitious individual. Frisch thought that such choices could be observable in principle, under ideal conditions, but not in practice and certainly not for every human being (Frisch 1932b, 105). Most importantly, the agents interrogated in his experiments were fictitious, which was one of their distinctive elements compared to actual experiments. "There is also this difference between an axiomatic and an actual interrogation experiment, that the latter must always be made on some living individual (or on a concrete group of individuals that are guided by some sort of joint action), while the former can be conceived of as made on some average or typical individual.
In this latter case we have to adopt some general assumptions or choice-axioms regarding the way in which our typical individual will react when subjected to our 'experiments.' Otherwise our 'experiments' would be a complete chaos without any meaning" (Frisch 1930/2010, 93-94). Hypothetical experiments thus differed from actual experiments mainly in that they could not in fact be conducted. As Frisch admitted, no human being actually would behave as a homo oeconomicus in such choice situations (Frisch 1926/1971, Frisch 1930/2010). So, the subject of the experiment did not in fact exist. That the subject of the experiments was not an existing agent but rather an imagined homo oeconomicus was only a problem insofar as the investigator would have to think of some properties of this agent's choices that could plausibly be accepted as characteristics of a typical individual. Yet typical meant rational for Frisch. The nature of the questions and the imagined answers the scientist expected in her experiments were "defined by a series of choice axioms" that embodied plausible characteristics of a typical individual (Frisch 1933/2009, 10). It was up to the scientist and her experience to decide which axioms would be plausible so as to ensure a close relationship between economic theory and reality. "From a formal point of view we are at liberty to choose any set of choice-axioms that we may favour, but in reality we are under a very severe restriction, that namely of adopting a set of axioms that will lead to a really fruitful theory fitting the facts, that is, a theory which will be able to 'explain' the results of actual observations or actual experiments" (Frisch 1930/2010, 93-94). This reasoning revealed the mutual reliance of axioms, hypothetical interrogation, and measurement.
The results of the experiment were meant to inspire the formulation of the axioms and the axioms had to be chosen in such a way that the theory would be able to hold up to actual relations between economic variables in the data. It was ultimately up to the scientist, her expertise and common sense as well as her capacity as a human being to introspect, to make those experiments work. This, of course, brought Frisch back to what he had wanted to exclude from demand analysis in the first place, namely a commitment to introspection and the circularity that demand analysis confronted.

What exactly was the function of Frisch's experiments? In light of the previous sections, I suggest that his hypothetical experiments served two major functions: justification of the choice axioms and the measurement of marginal utility. The first, justificatory function was obvious from Frisch's anti-psychological methodology and anti-metaphysical epistemology. Frisch refused to decide about the utility function and his axioms on purely a priori grounds. He also thought that psychological results about actual choices were not needed, as he took econometricians to be interested in "general phenomena, and not in the isolated cases" and to be concerned with the typical, not the actual, individual (Frisch 1930/2010, 12). At the same time, however, he argued that his imaginary experiments described typical individuals and situations that were not too distant from real life, giving his axioms and thus the data resulting from his experiments plausibility. Furthermore, he held that the plausibility of axioms was also to be judged by testing the propositions deduced from his axioms against reality by drawing upon statistical analysis:

[T]he theory gets its concepts from the observation technique. ... For the logical definition it is enough that ... [observations] ... exist as a thought experiment. ...
[T]his form of conceptualization has opened a possibility for realizing the connection between the abstract concepts of theoretical economics and economic life as it is reflected in the numerical data of economic statistics. – Although the observations that can corroborate the abstract quantitative definitions are not possible in practice, they are even so the first step towards efficient observations. They pose a target where there used to be none. They show the point that the statistical technique of approximation shall try to hit (Frisch 1926/1971, 302).

The second major function of Frisch's experiments was to estimate the marginal utility of specific commodities. Frisch thought that the econometrician had to cope with incomplete data because it was impractical and infeasible to study every individual's choice field. Through hypothetical experiments the scientist would know the position the ideal individual would choose, namely the one that is the equilibrium where price and marginal utility are proportional to each other. This information was given by the generated choice field. This proportionality would then reveal properties of the concrete choice fields existing in the economic world if equilibrium points in the data could be observed. Average data of the quantities actually demanded under different prices would be observable. But we could also use data from hypothetical experiments in combination with statistical demand data to estimate marginal utility and to test economic relationships statistically. It would suffice to estimate the marginal utility of a particular good, say sugar, by studying the choices of a homo oeconomicus to arrive at knowledge about average properties characterizing preferences of sets of typical individuals. This way, one got at a choice field of the homo oeconomicus that could then be combined with average market data and household surveys.
Thus, after conducting these experiments, the economist would end up with the whole 'consumption surface' (as Fisher had named it), i.e., all consumption decisions of a typical individual for different price and income situations on the basis of data partly imagined and partly observed. That way, Frisch's experiments helped bridge the gap between theory and data. They allowed for quantitatively defining and measuring utility as a theoretical concept, which would in turn allow for quantifying economic relationships.

8. Frisch's Hypothetical Experiments as Thought Experiments

Do Frisch's hypothetical experiments qualify as thought experiments, and if so what did Frisch learn from them? There is no agreement among philosophers of science about the necessary and sufficient conditions for something to qualify as a thought experiment. Thought experiments can involve a highly diverse set of mental activities (Brown 2011, Stuart et al. 2018). However, some characteristics have been suggested as common features of thought experiments.[30] A first feature of most thought experiments is that they are, in fact, experiments (e.g., Schabas 2018). Experiments are characterized by the controlled intervention of a scientist in nature within an artificial environment, usually the laboratory. This environment "shields the objects and events of the experimental materials from the effects of other factors and disturbances by strict protocols of both intervention and control" (Morgan 2013, 343-44). Most of the time, the goal "is to reproduce the conditions required by a theory and then to manipulate the relevant variables in order to take measurements of a particular scientific parameter or to test the theory" (Morgan 1990, 9). In the case of thought experiments, both are done hypothetically. In that sense, thought experiments are not actually executed but are hypothetical, which is a second feature of them.
They are hypothetical in that we reason with them about an imaginary scenario (e.g., Gendler 2004). Thought experiments do not simply describe an actual experiment that is or could be conducted but so far has not been carried out. Rather, we can "get a grip on nature just by thinking" (Brown and Fehige 2017). Some thought experiments could potentially be conducted one day but are not conducted at present, mainly because doing so is not yet physically feasible, ethically acceptable, or technologically possible – for instance, due to lack of instruments (ibid.). If we think it likely that they will one day be conducted, we could consider them as so-called "experiments-in-waiting," i.e., experiments that will eventually be conducted, if the conditions and instruments allow for their actual execution (Schabas 2018). Other thought experiments are experiments that can never be conducted. A third, and related, feature of thought experiments is that their execution engages our senses and requires us to appeal to our imagination and not to observation (Brown 1991, Gendler 2004, 1155). We are asked to imagine a specific setup. We are then asked to imagine, given the conditions of a theory, an intervention in or mental manipulation of that setup, which leads to a change in the setup we are imagining (Brown 1991, 2011, Schabas 2018). Intervention is required to secure the status of a proper experiment. Consequently, imagined intervention constitutes a fourth feature of thought experiments. Such imagined interventions can involve visualization or the use of other senses, a putting ourselves into the imaginary experiment (e.g., Brown and Fehige 2017). As such, the phenomenon and intervention imagined in the thought experiment have to be conceivable for us. Fifth, Schabas (2018) and Sørensen (1992) argue that a thought experiment can enable "expeditions to possible worlds" (Sørensen 1992, 135; italics mine).
Some important thought experiments in economics, such as Milton Friedman's famous helicopter drop example, have often been launched by a sometimes "bizarre counterfactual," but then "[restore] some mental equanimity by introducing familiar objects to assist the mind of the experimenter as she reaches her destination" (Schabas 2018, 173). For Schabas's demonstrative purposes, the scientist introduces a counterfactual that asks the experimenter to imagine a set of circumstances fundamentally different from the actual world (ibid.). Note, however, that a thought experiment does not necessarily have to contain this feature. Such counterfactuals are mostly combined with other objects describing the hypothetical situation that are usually objects familiar to the experimenter and which therefore assist her imagination. A sixth and final essential feature of a thought experiment is that it demonstrates something. It can do so because it rests upon some stable truths, maybe a set of established law-like regularities according to which specific elements in the experiment behave. In economics, economists must find the experiment plausible in that they find the reasoning involved typical. But this does not mean that the agents we are imagining (possibly including ourselves) have to act in the specific way in which the hypothetical scenario in the thought experiment dictates. This is what distinguishes a thought experiment in economics from a thought experiment in the natural sciences, in physics in particular. Our general idea is that the law of gravity must operate in a specific way, by necessity. Human behavior could conform to some stable regularity. But human agents do not necessarily behave in the particular way that the thought experiment dictates (see also Thoma 2016).
While we find plausible considerations about human behavior that rest upon some conventional assumption – about the rationality of human agents, for example – we would have to make a case that thought experiments in economics rest upon stable generalizations of human behavior that operate with something close to necessity, or that the rationality principle is actually operating in the real world with necessity. In that sense, in offering a demonstration of something, the experiment normally comes with a particular result, which we could thus take to be not a necessary result but rather one contingent upon our assumptions about the agents that we imagine populate our thought experiment. This result can be empirical, i.e., potentially saying something about the world that we take the thought experiment to be about. But the result can also be theoretical, such as, for example, in the case of deriving a specific theoretical solution to a theoretical problem or of deducing a theorem that, in the first instance, does not say anything about the actual world.

Given these characteristics of thought experiments, can we understand Frisch's experiments as thought experiments? Prima facie, one might interpret Frisch's hypothetical experiments as experiments-in-waiting. Actual experiments were for a long time considered unfeasible in economics and thought experiments could be thought of as having been introduced in order to fill this gap until the day arrived when they could be carried out; examples of such experiments-in-waiting are Friedman's helicopter experiment just mentioned (1969) and Daniel Ellsberg's experiments showing his paradox in expected utility theory (Ellsberg 1961).[31] Although he was skeptical about the possibility of conducting experiments in economics, one could argue that Frisch might have believed that the experiments would be carried out eventually.
While he later acknowledged certain drawbacks of what he had called the 'interview method', he continued to believe that 'interview questions' were a fruitful method for investigating the preferences of individuals. In later years he argued that determining a preference function was useful in an econometric model on the national (not on the individual) level, "provided the questions are wisely formulated in a conversational manner, and not simply carried out by some youngster in the opinion poll trade" (Frisch 1970, 24; italics in original). Frisch himself would repeatedly try out his interview method in the context of political decision making to discover social preference functions used in macroeconomic models by questioning econometric experts and politicians. Experts were meant to give their judgment about choices for different policies independently of their personal preferences (e.g., Boumans 2015). This might support an argument in favor of Frisch's hypothetical experiments being experiments-in-waiting. One could further argue that Frisch's hypothetical experiments used a rational agent as a measurement instrument and may, as such, be seen as foreshadowing a set of experimental approaches of the 1950s and 1960s by mathematical psychologists and behavioral decision researchers. One dominant idea of these later experiments was to understand measurement theory in psychology as behavior theory. Central questions included how human beings inferred decisions based on their perceptions and whether or not these were rational decisions. Paralleled by an increasing interest in rational choice, psychologists started to see the human being as the ultimate measurement instrument (Heukelom 2010) and as reliably measuring unobservable mental states (e.g., Michell 1999, Heidelberger 2004). Schabas (2018) argues that thought experiments are rare in economics and when we think we have found one, it usually isn't.
So, have we found in Frisch's hypothetical experiments thought experiments that have eventually become actual experiments? Notwithstanding the questionable empirical status of his experiments, the axioms formulated by Frisch were clearly intended to have a relation to reality, to be inspired by actual observation.[32] They were intended to capture the essential features characterizing the phenomenon in question, in this case preference orderings and respective choices, which could then be used as a basis for determining a utility function and thus deriving a quantitative theory of the phenomenon that produced the results of the experiment, namely consumption behavior. They could, furthermore, be used for making several other observations, such as the relation between market prices and quantities (Giocoli 2003, 118, Hands 2006, 163). But their function was not to show something that Frisch could eventually have shown in real experiments. Frisch's hypothetical experiments motivated the choice axioms, which in turn allowed him to do two things: first, to introduce a behavioral theory of choice into the analysis of consumer behavior in order to justify the core behavioral regularities described by economic theory; second, to operationalize a theoretical variable, i.e., utility as used in economic theory. Although utility was a concept referring to some unobservable entity and had as such no observable counterparts, it had to be measured. An objective definition of utility would enable Frisch to quantitatively measure utility. This operationalization was not meant to be eventually undertaken by designing an actual experiment. Rather, the goal was to connect economic theory to statistics.

I argue that we should understand Frisch's hypothetical experiments not as experiments-in-waiting but as proper thought experiments. They have the aforementioned six characteristics. There is no doubt that Frisch's hypothetical experiments have the character of experiments.
The controlled intervention by the scientist targets a rational family's choice situation. Frisch details the 'rules' for the experiment, specifying the one-year period, imposing strict rules on how nature is supposed to develop (everything "is going to happen as planned") and on how the family is supposed to choose and handle the chosen option, etc. Frisch asks us to imagine not only the rational family, their income and their consumption situation, but also the intervention, namely the offer of a choice between two options as a present. He then asks us to imagine how the family would choose under those controlled and highly idealized conditions. Frisch's experiments are also hypothetical in at least two other ways. First, Frisch could not have aimed at eventually performing these experiments with real people because the agents populating his experiments were fictitious and he intended them to be fictitious. A homo oeconomicus only existed theoretically and in hypothetical scenarios. As such, the realization of these experiments was impossible in principle, and they could never have been implemented. There could never be an actual situation in which, because we have the right instruments, this fictitious agent would populate our actual world. At the same time, the actual individual agent and his particularities were also not Frisch's primary concern. For Frisch, neither the axioms nor the 'choice coefficient' had to be descriptively accurate of the preference ordering of every single individual or of any individual for that matter. Frisch was very clear that the axiomatic representation of preferences was not meant to capture a causal mechanism behind behavior to explain individual consumption.
While his axiomatic choice theory was based upon the concept of utility, by introducing his axioms his aim was not to measure utility as a mental state but to measure changes in the general relationships characterizing the economy according to economic theory; that is why the axioms were not directly tested on the basis of the empirical data of individual behavior.

It is not called for to argue that one could, by reference to actual life situations, find situations more or less bizarre where this or that axiom is not satisfied. It is always necessary to remind ourselves that in economics we are interested in the general phenomena and not in the isolated cases. The individual is for us only the typical individual. The scientific attitude behind the axiomatic structure is to investigate which consequences we can deduce adopting this or that axiom and then see whether the consequences agree with the observations. It is by the subsequent agreement of the consequences of the axioms with reality that we can judge the plausibility of them (Frisch 1933/2009, 12; italics mine).

To Frisch, the general term 'hypothetical' did not only mean that these experiments were far from being, and need not be, actually conducted. They were hypothetical in a second sense, namely in that they allowed the economist to isolate robust regularities – or a causal structure – that could be described by economic theory and would, as such, be useful in thinking about the real world. For Frisch, economic theory describes a 'causal structure' characterizing economic phenomena. But he did not believe that this structure is often shown in reality. This was because, first, the causal complexity of economic phenomena prevents law-like regularities from being visible.
What we passively observe by casual observation and in statistical data is the result of multiple causal factors operating at the same time and not a stable regularity that operates in isolation to bring the effect about.[33] Controlled experimentation would have been the instrument to isolate causal laws. As it was not available to the econometrician, hypothetical experiments could do the work. How could hypothetical experiments do the work of isolating actual regularities if they are hypothetical in the first sense described above? For Frisch, the laws were hypothetical not only because we cannot passively observe them but also because he believed that economic regularities are not actual facts about the world at all. For him, the external world is "essentially chaotic"; the laws that we observe are not "objective," nor are statistical laws invariant (Frisch 1933/2009, 131). That led him to think about truth in a particular way. For instance, in a paper published in 1933 together with the economist Frederick V. Waugh, Frisch explains what it meant for a relation to be true: "An empirically determined relation is 'true' if it approximates fairly well a certain well-defined theoretical relationship, assumed to represent the nature of the phenomenon studied. There does not seem to be any other way of giving a meaning to the expression 'a true relationship.' For clearness of statement we must therefore first define the nature of the a priori relationship that is taken as the ideal" (Frisch and Waugh quoted in Morgan 1990, 150; italics in original). Empirical relationships are considered 'true' when approximating theoretical relations (Morgan 1990, 150). But theoretical relations postulated in economic theories, which Frisch and Waugh also call 'structural relationships', describe only ideal relationships (Morgan 1990). Being ideal theoretical relationships, they are themselves hypothetical according to Frisch.
They are not true of the world but result from the scientist's attempt to divide and structure the world. Frisch was explicit about the world ultimately not being so ordered and causality only being the product of the scientist, originating in her mind (see also Bjerkholt and Dupont 2009, sect. 7). As such, the causal structure that economic theory proposed is hypothetical not only because it is not passively observable but because it does not exist in the first place.

[A]ny law, any regularity that we have observed is just an effect of the special manner in which we have chosen our coordinate system. But what is then the object of science? The incessant preoccupation of science is to find theoretical schemas, new coordinate systems that fit better and better to the so-called facts. If science finds a discrepancy, it modifies its theoretical scheme, it introduces other variables .... Having done that, it declares triumphantly that now it has succeeded in finding a scheme fitting even better with experience. ... You probably find such a view of science disgusting, you will like better to regard scientific activity as disinterested research for objective truths which are perpetually outside us. And you will probably say that even if we have philosophized long about the chaotic nature of our observations, you will not be convinced by that, because you feel more or less intuitively that man must after all possess a faculty of distinguishing between what you have brought yourself and what nature has provided (Frisch 1933/2009, 142 f.).

But Frisch rejects our intuition. He played with an evolutionary explanation of science as a way to secure our survival as a species by helping us structure the world, pointing out a close interdependence between science and the evolutionary development of our species.[34] "During this evolution science will certainly from time to time register new fundamental discoveries.
But the world which science in that way will discover will be very, very distant from being an objective world. Why then do science? Because we can perhaps by that hope to soften at least a little the pain, that is the only universal and eternal principle which we never will have to question the existence of" (Frisch 1933/2009, 143). On this view, although the world is not ordered in such a way as scientific theories describe, they help mitigate the pain we feel as a species. Hypothetical experiments were thus hypothetical also in that they established a set of theoretical relationships describing behavioral regularities that the scientist could well have formulated differently. To my knowledge, Frisch did not further elaborate on what exactly he meant by economic theory and by the 'structure' this theory referred to. Yet his student Trygve Haavelmo, who was heavily influenced by Frisch (Bjerkholt 2005), elaborated on his epistemology: "whatever 'explanation' we prefer, ... they are all our own artificial inventions in a search for an understanding of real life; they are not hidden truths to be 'discovered'" (Haavelmo 1944, 3).[35] In this light, we might conclude that Frisch's hypothetical experiments allow for expeditions to possible worlds while allowing for an infinite number of counterfactual worlds that we could imagine to establish economic theories. Some economic theories will be more useful than others because they are better confirmed by the data than others. But none of those theories is true. Albeit hypothetical, Frisch's experiment nevertheless demonstrated something. Although law-like regularities are a product of the scientist's mind and the theoretical relationships describing them are not true of the world, Frisch considered the kind of reasoning in his choice experiment plausible. Reasoning with a fictional agent had been firmly established in economics since the days of John Stuart Mill.
Combined with ceteris paribus reasoning common in early econometrics (e.g., Morgan 1990), Frisch's experiments demonstrated how a consumer would behave if placed in a controlled setup imagined by the econometrician and endowed with the preferences of a homo oeconomicus. As such, his experiments helped establish behavioral regularities (albeit non-existent ones, and as such contingent upon the scientist's judgement) for economic theories (albeit untrue ones) to describe, and helped operationalize utility as a theoretical variable used in those theories so that, in the form of structural equations, they could be used for measurement.

9. Conclusion

In this paper, I have examined the nature and role of thought experiments in early econometrics, in particular Ragnar Frisch's 'hypothetical experiments'. I have shown that besides the functions of thought experiments already discussed in the literature, they once had yet another function in economics: they helped bridge the gap between theory and statistical data. More specifically, they played two central roles. First, thought experiments helped establish Frisch's axiomatic theory of consumer behavior by justifying the core principles of human behavior used in economic theory. Second, they allowed for quantitatively defining and measuring utility as a theoretical concept referring to an unobservable entity. This was highly beneficial for econometrics. As mental concepts such as utility did not have an observable counterpart, direct observation was impossible for specifying them. However, to establish theoretical regularities and test economic theories, the variables used in economic theories required measurement. As such, Frisch's hypothetical experiments enabled measurement of relevant economic relationships without requiring actual experimentation into human psychology. But in light of Frisch's epistemology, they clearly had the status of thought experiments in this capacity.

10.
Conflicts of interest

I hereby confirm that I do not have any conflict of interests that could arise. This project was not financially supported by any funding agency or third party. I am fully responsible for the content of this paper.

11. References

Arrow, Kenneth J. 1960. "The Work of Ragnar Frisch, Econometrician." Econometrica 28 (2): 175–92.
Bjerkholt, Olav. 2005. "Frisch's Econometric Laboratory and the Rise of Trygve Haavelmo's Probability Approach." Econometric Theory 21 (3): 491–533.
---. 2012. "Ragnar Frisch's Axiomatic Approach to Econometrics." Memorandum [Online], No 21/2012. Oslo: University of Oslo, Department of Economics. https://www.sv.uio.no/econ/english/research/unpublished-works/working-papers/pdffiles/2012/memo-21-2012.pdf (accessed March 25, 2018).
Bjerkholt, Olav, and Ariane Dupont. 2007. "Ragnar Frisch's Axiomatic Approach in Econometrics." European Conference on the History of Economics (Siena), 4-6 October, 2007. http://www.frisch.uio.no/Frisch_Website/Axiomatics.pdf (accessed September 17, 2012).
---. 2009. "Ragnar Frisch's Conception of Econometrics." History of Political Economy 42 (1): 21–73.
Boumans, Marcel. 2005. "Measurement Outside the Laboratory." Philosophy of Science 72 (5): 850–63.
---. 2015. Science Outside the Laboratory: Measurement in Field Science and Economics. Oxford: Oxford University Press.
Boumans, Marcel, and Ariane Dupont-Kieffer. 2011. "A History of the Histories of Econometrics." In Histories on Econometrics: Annual Supplement to Volume 43, ed. Marcel Boumans, Ariane Dupont-Kieffer, and Duo Qin, 43:5–31. Durham and London: Duke University Press.
Brown, James R. 1991. The Laboratory of the Mind: Thought Experiments in the Natural Sciences. London: Routledge.
---. 2011. The Laboratory of the Mind: Thought Experiments in the Natural Sciences. 2nd ed. London: Routledge.
Brown, James R., and Yiftach Fehige. 2017. "Thought Experiments." The Stanford Encyclopedia of Philosophy.
Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/entries/thought-experiment (accessed August 15, 2017).
Chipman, John S. 1971. "Introduction to Part II." In Preferences, Utility, and Demand: A Minnesota Symposium, ed. John S. Chipman, Leonid Hurwicz, Marcel K. Richter, and Hugo F. Sonnenschein, 321–31. New York, Chicago et al.: Harcourt Brace Jovanovich.
---. 1998. "The Contributions of Ragnar Frisch to Economics and Econometrics." In Econometrics and Economic Theory in the 20th Century: The Ragnar Frisch Centennial Symposium, ed. Steinar Strom, 58–110. Cambridge: Cambridge University Press.
Colander, David. 2007. "Retrospectives: Edgeworth's Hedonimeter and the Quest to Measure Utility." Journal of Economic Perspectives 21 (2): 215–25.
Debreu, Gérard. 1959. Theory of Value: An Axiomatic Analysis of Economic Equilibrium. New Haven/London: Yale University Press.
Edgeworth, Francis Y. 1881. Mathematical Psychics: An Essay on the Application of Mathematics to the Moral Sciences. London: C. Kegan Paul & Co.
Ellsberg, Daniel. 1961. "Risk, Ambiguity and the Savage Axioms." Quarterly Journal of Economics 75: 643–69.
Fisher, Irving. 1892/2007. Mathematical Investigations in the Theory of Value and Prices. New York: Cosimo, Inc.
Friedman, Milton. 1969. The Optimum Quantity of Money and Other Essays. London: Macmillan.
Frisch, Ragnar. 1926/1971. "On a Problem in Pure Economics." In Preferences, Utility, and Demand: A Minnesota Symposium, ed. John S. Chipman, Leonid Hurwicz, Marcel K. Richter, and Hugo F. Sonnenschein, 386–423. New York, Chicago et al.: Harcourt Brace Jovanovich.
---. 1930/2010. A Dynamic Approach to Economic Theory: The Yale Lectures of Ragnar Frisch. Ed. Olav Bjerkholt and Duo Qin. Routledge Studies in the History of Economics. London and New York: Routledge.
---. 1932a. New Methods of Measuring Marginal Utility. Tübingen: J. C. B. Mohr (Paul Siebeck).
---. 1932b. "New Orientation of Economic Theory.
Economics as an Experimental Science." Nordic Statistical Journal 4: 97–111.
---. 1933/2009. "Problems and Methods of Econometrics: The Poincaré Lectures of Ragnar Frisch." Ed. Olav Bjerkholt and Ariane Dupont-Kieffer. Routledge Studies in the History of Economics. London and New York: Routledge.
---. 1970. "From Utopian Theory to Practical Applications: The Case of Econometrics." In Nobel Prize in Economics Documents [Online], No 1969-1, 9-39. Available from: <http://nobelprize.org/nobel_prizes/economics/laureates/1969/frisch-lecture.pdf>.
---. 1981. "From Utopian Theory to Practical Applications: The Case of Econometrics." The American Economic Review 71 (6): 1–16.
Georgescu-Roegen, Nicholas. 1954. "Choice and Revealed Preference." Southern Economic Journal 21 (1): 119–30.
Giocoli, Nicola. 2003. Modeling Rational Agents: From Interwar Economics to Early Modern Game Theory. Cheltenham: Edward Elgar.
Hands, D. Wade. 2006. "Integrability, Rationalizability, and Path-Dependency in the History of Demand Theory." History of Political Economy 38: 153–85.
Heidelberger, Michael. 2004. Nature from Within: Gustav Theodor Fechner and His Psychophysical Worldview. Pittsburgh, PA: University of Pittsburgh Press.
Herfeld, Catherine. 2018. "The Diversity of Rational Choice Theory: A Review Note." TOPOI: An International Review of Philosophy, in New Trends in Rational Choice Theory, special issue ed. Cédric Paternotte, forthcoming.
Heukelom, Floris. 2010. "Measurement and Decision Making at the University of Michigan in the 1950s and 1960s." Journal of the History of Behavioral Sciences 46 (2): 189–207.
Hoover, Kevin, and Katarina Juselius. 2015. "Trygve Haavelmo's Experimental Methodology and Scenario Analysis in a Cointegrated Vector Autoregression." Econometric Theory 31 (2): 249–74.
Jevons, William Stanley. 1871/1888. The Theory of Political Economy. 3rd ed. London: Macmillan and Co.
Lenfant, Jean-Sébastien. 2012. "Indifference Curves and the Ordinal Revolution."
History of Political Economy 44 (1): 113–55.
Lewin, Shira B. 1996. "Economics and Psychology: Lessons for Our Own Day From the Early Twentieth Century." Journal of Economic Literature 34 (3): 1293–1323.
Maas, Harro. 2005a. "Jevons, Mill and the Private Laboratory of the Mind." Manchester School 73 (5): 620–49.
---. 2005b. William Stanley Jevons and the Making of Modern Economics. Cambridge: Cambridge University Press.
---. 2014. Economic Methodology: A Historical Introduction. London and New York: Routledge.
Michell, Joel. 1999. Measurement in Psychology: A Critical History of a Methodological Concept. Cambridge: Cambridge University Press.
Morgan, Mary S. 1990. The History of Econometric Ideas. Cambridge: Cambridge University Press.
---. 2013. "Nature's Experiments and Natural Experiments in the Social Sciences." Philosophy of the Social Sciences 43 (3): 341–57.
Moscati, Ivan. 2013. "Were Jevons, Menger and Walras Really Cardinalists? On the Notion of Measurement in Utility Theory, Psychology, Mathematics and Other Disciplines, 1870-1910." History of Political Economy 45 (3): 373–414.
---. 2016a. "Measurement Theory and Utility Analysis in Suppes' Early Work, 1951-1958." Journal of Economic Methodology 23: 252–67.
---. 2016b. "Measuring the Economizing Mind in the 1940s and 1950s: The Mosteller-Nogee and Davidson-Suppes-Siegel Experiments to Measure the Utility of Money." History of Political Economy 48 (annual supplement): 239–69.
Pareto, Vilfredo. 1909/1971. Manual of Political Economy. Trans. Ann S. Schwier. New York: Augustus M. Kelley.
---. 1927/1972. Manual of Political Economy. Trans. Ann S. Schwier, Ed. Ann S. Schwier and Alfred D. Page. London, Basingstoke: Macmillan.
Reiss, Julian. 2012. "Genealogical Thought Experiments in Economics." In Thought Experiments in Science, Philosophy, and the Art, ed. Mélanie Frappier, Letitia Meynell, and James R. Brown, 177–90. New York: Routledge.
Schabas, Margaret. 2018. "Thought Experiments in Economics."
In The Routledge Companion to Thought Experiments, ed. Michael T Stuart, Yiftach Fehige, and James R. Brown. London: Routledge. Schultz, Henry. 1933. "Frisch on the Measurement of Utility." Journal of Political Economy 41 (1): 95–116. Sørensen, Roy A. 1992. Thought Experiments. Oxford: Oxford University Press. Stuart, Michael T, Yiftach Fehige, and James R. Brown. 2018. "Thought Experiments: State of the Art." In The Routledge Companion to Thought Experiments, ed. Michael T Stuart, Yiftach Fehige, and James R. Brown. London: Routledge. Stuart, Michael T. 2016. "Taming Theory with Thought Experiments: Understanding and Scientific Progress." Studies in the History and Philosophy of Science Part A 58: 24-33. Thoma, Johanna. 2016. "On the Hidden Thought Experiments of Economic Theory." Philosophy of the Social Sciences 46 (2): 129–46. 50 Von Neumann, John, and Oskar Morgenstern. 1944. Theory of Games and Economic Behavior. Princeton, NJ: Princeton University Press. Van Fraassen, Bas C. 2008. Scientific Representation: Paradoxes of Perspective. New York: Oxford University Press. Weintraub, E Roy. 2002. How Economics Became a Mathematical Science. Durham and London: Duke University Press. Woodward, James. 1989. "Data and Phenomena." Synthese 79: 393–472. 1 The literature on the nature and role of thought experiments in science is large and cannot be reviewed in this paper; besides, it would be of limited use, as their role and nature differ across disciplines (for an excellent overview of the literature, see Stuart et al., 2018). That thought experiments sometimes bridge the gap between theory and reality more generally has been shown by other philosophers. See, e.g., Stuart (2016) discussing this role for some thought experiments in physics and biology. 2 Econometrics is a branch of economics that aims at quantifying and measuring economic relationships with statistical methods while heavily relying on economic theory. 
3 This paper focuses on potential roles that thought experiments can play in economics. Note that I do not make any claims about the nature of thought experiments, i.e. whether they are models, arguments, or something else entirely.
4 The account of Frisch's hypothetical experiments and his early work in demand analysis is part of a much more complex history of econometrics in general and of demand analysis in particular. See Morgan (1990) for a highly instructive account of the history of econometrics, in particular Part II for an introduction to demand analysis in early econometrics.
5 See also Herfeld (2018).
6 Jevons got around the problem by arguing it away. He speculated that the difficulties of measuring utility would eventually be resolved. He also suggested that the problematic unit of pleasure might not be important because individuals already compared quantities of different pleasures in their minds, which made a unit unnecessary. However, Jevons himself did not work through this idea on a conceptual level (Moscati 2013).
7 A related argument along Bentham's lines was that the marginal utility was measurable by the price a consumer would be willing to pay for an additional unit, given that the utility of his income would not be affected by price changes of the commodity (Moscati 2013, 392).
8 For an exposition of Jevons's utility theory, see especially the section "Numerical Determination of the Laws of Utility" in his chapter "Theory of Exchange" in his The Theory of Political Economy (1871/1888, IV.105-107). While Jevons was certainly concerned with the question of how an individual's utility could be measured, he did not think that the measurement of aggregate utility would confront similar challenges. According to Jevons, astronomy and meteorology faced measurement problems similar to those in economics. Nevertheless, theories in those fields could be tested via statistical data despite lacking a definite measuring rod. For him, utility on the aggregate level could be measured in a similar way.
9 See Maas (2005b) for an excellent discussion of Jevons's views on the relationship between the social and the natural sciences, on experimentation in economics, and on the measurement of utility.
10 Economics was influenced by philosophical ideas of the logical positivists and their notion of verificationism around the turn of the 20th century (e.g., Lewin 1996, 1298).
11 Edgeworth was strongly influenced by the German experimental psychologists and psycho-physiologists, such as Gustav Fechner and Ernst Weber, and designed a so-called hedonimeter, a measurement instrument that could relate external stimuli to internal feelings and perceptions (Edgeworth 1881).
12 For a more detailed discussion of the relationship between Edgeworth's, Fisher's, and Jevons's theoretical and methodological approaches to economic behavior as well as their use of psychophysiology in them, see Chaigneau (2002).
13 Frisch remarks: "One could say that Edgeworth represents the psychological, Fisher the antipsychological point of view" (Frisch 1930/2010, 83). One primary reason why Fisher rejected a psychological interpretation of utility was the problem of interpersonal utility comparisons (see Frisch 1933/2009, xxvi).
14 Frisch is explicitly referring to von Böhm-Bawerk's study of the regularities of observable choices to "use them as a tool in his motivation theory" (Frisch 1930/2010, 84; italics in original).
15 Ultimately, Fisher did not make any attempt to actually conduct his metaphorical experiments, nor did he ground his analysis on them. Rather, for the case of perfect substitutes and perfect complements, he derived the shape of indifference curves from properties of demand behavior that he took the cases of complements and substitutes to have (Lenfant 2012, 119).
16 Econometrics was supposed to unify economic theory, statistics, and mathematics.
17 Note that Frisch's problem is a version of the so-called "problem of coordination" in the philosophy of measurement, which has been extensively discussed by Mach, Reichenbach, van Fraassen, and others. The problem of coordination originates in the challenge of properly coordinating theoretical concepts with empirical measurement procedures (e.g., van Fraassen 2008, 115 ff.). The problem is that a circularity emerges from the attempt to determine the empirical adequacy of a theory by testing it: testing presupposes a reliable method for measuring the concepts used in the theory, while arriving at such a reliable method presupposes in turn theoretical knowledge about those concepts.
18 This is an unpublished statement that Frisch originally prepared in 1928 for a publication following up on his participation in a panel discussion of the joint meetings of the American Economic Association (see Bjerkholt and Dupont 2007, 5).
19 As will become apparent below, it does not appear that Frisch committed himself to a priorism, i.e. the view that the fundamental axioms of economics are derived prior to, and independent of, any empirical observation – a view that had been prevalent in economics at the end of the 19th century (Lewin 1996, 1298).
20 Hands (2006, 163) interprets this formulation of Frisch as position-dependent preferences. However, that interpretation seems to hold only for the first type of axioms.
21 Frisch modified and extended these axioms in his later work, such as in his Poincaré lectures, to solve problems that remained within his first system of axioms (such as the integrability problem, i.e., defining total utility as the integral of the marginal utility function, as determined by the axioms, along the consumption path); see Frisch (1933/2009, 13). For Frisch's discussion of this problem, see the appendix added to the translated version of his essay and Frisch (1933/2009, 18 ff.).
22 Frisch attempted to generalize the "choice field" approach at a conference at the Cowles Commission in 1937 (Bjerkholt 2012, 3). The use of the term 'choice field' was clearly inspired by the term 'field theory' in physics.
23 In more technical terms, the "inner product u(x)δx will be the utility of the displacement (x,δx). The vector u(x) will be the marginal utility of the resources x, and the components u1,u2,...,uM of u(x) will be the marginal utilities of the goods 1, 2, ... , M. The vector field so defined will be called the choice field of the individual considered" (Frisch 1926/1971, 394; italics in original). Frisch provided a proof sketch of the measurability of utility in this paper but developed the ideas of the 1926 paper further in subsequent work (Frisch 1926/1971, 388; Chipman 1971, 326 f.; 1998, 60), such as in Frisch (1932).
24 See also Bjerkholt and Dupont (2009) for an overview of Frisch's axiomatic approach in the context of his conception of econometrics.
25 Frisch intentionally used Pareto's technical term 'ophelimity' interchangeably with utility to highlight its purely technical nature in econometrics and to avoid any risk of psychological connotations (see, e.g., Frisch 1933/2009).
26 In his paper, Frisch used the vector u to refer to the marginal utility of money with respect to a bundle of goods.
27 As choice questions were a central part of hypothetical experiments, Frisch also referred to these experiments as a method of hypothetical interrogation (e.g., Bjerkholt and Qin 2010).
28 See also Giocoli (2003, 118 f.).
29 Frisch further elaborates on those techniques, for example, in his Nobel Lecture (Frisch 1981, 7). He refers to interview techniques for what he calls "conversational interviews" with decision makers that should be conducted by the econometric expert (ibid.).
30 Note that the following characteristics are neither exclusive nor exhaustive. Thought experiments vary within and across disciplines (see, e.g., Brown 2011, ch. 1, for an overview of various types of thought experiments).
31 See Schabas (2018, 175) on Friedman's helicopter experiment as a thought experiment-in-waiting.
32 Hands makes the claim that Frisch even took those preferences to be actually observable through his experiments (see Hands 2006, 163). Again, I contend that it is important not to overlook Frisch's formulation that those experiments were 'in principle' possible and, as such, need not be actually conducted and/or might confront difficulties when actually conducted (Frisch 1926/1971, 388).
33 This problem is called the 'problem of passive observation' and has been a concern for econometricians ever since (see Boumans 2015, chapter 4).
34 Frisch repeated this evolutionary view of science later in his Nobel Lecture (Frisch 1980, 4).
35 Haavelmo himself introduces the role of experiments in establishing theoretical relationships: "What makes a piece of mathematical economics not only mathematics but also economics is, I believe, this: When we set up a system of theoretical relationships and use economic names for the otherwise purely theoretical variables involved, we have in mind some actual experiment, or some design of an experiment, which we could at least imagine arranging, in order to measure those quantities in real economic life that we think might obey the laws imposed on their theoretical namesakes" (Haavelmo 1944, 5). While an elaboration on the differences and similarities between Haavelmo's and Frisch's views on experiments would be instructive for better understanding the role of thought experiments in econometrics more generally, this can only be a separate project. For a discussion of Haavelmo's methodology and epistemology, see Hoover and Juselius (2015) and Boumans (2015, chapter 4). See also Heckman and Pinto (2015) for an interesting reading of Frisch and Haavelmo and the role of thought experiments in modern econometrics.