1 Introduction

It is a recurring topic in philosophical studies of computer simulations to draw comparisons with mathematical models and laboratory experimentation.Footnote 1 Fritz Rohrlich famously located computer simulations somewhere between traditional theoretical physical science and empirical methods of experimentation and observation, emphasizing that their primary feature is theoretical modeling (Rohrlich, 1990, p. 514). Paul Humphreys presents a working definition on which computer simulations implement and find solutions to mathematical models where analytic methods are unavailable, and provide numerical experiments in situations where natural experimentation is inappropriate or unattainable (Humphreys, 1990, pp. 501–502). Eric Winsberg advances a hierarchy of models. At the top of the hierarchy is a theory (general physical and modeling assumptions). After a series of specifications, alterations, and inferences at each level of modeling, the hierarchy terminates with a model of the phenomena, which represents the outcome of the computational model immediately prior in the hierarchy (Winsberg, 1999, p. 280).

Recently, philosophers have weakened the relation between mathematics and simulations. Johannes Lenhard argues that computer simulations are a new type of mathematical modeling. Simulation models must be “counted into the established classical and modern class of mathematical modeling” (Lenhard, 2019, p. 7). However, one must also take stock of how they “contribute to a novel explorative and iterative mode of modeling characterized by the ways in which simulation models are constructed and fitted” (Lenhard, 2019, p. 7). Lenhard also states as follows:

[o]ne direction seems self-evident: the (further) development of computers is based primarily on mathematical models. However, the other direction is at least just as important: the computer as an instrument channels mathematical modeling (Lenhard, 2019, p. 8).

In previous work, I have distanced myself from these viewpoints when reflecting on the plurality of mathematical models involved in the architecture of computer simulations. On my account, mathematical models are recast into a “super-class” of simulation models. This includes: (1) kernel simulations, understood as the implementation of each individual model in the formalism of a programming language, and (2) “integration modules”, which play two fundamental roles, namely, they integrate external databases, protocols, libraries and the like with [each kernel simulation], and ensure the synchronization and compatibility among [the kernel simulations] (Durán, 2020, p. 307).

The common assumption here is that computer simulations cannot characterize phenomena without the background conceptual framework that mathematical models provide. This, I submit, configures a form of theory-driven computer simulation. But scientific practice using simulations sometimes tells a different story. Increasingly, simulationists use computer simulations to characterize phenomena (and previously unknown regularities specific to those phenomena) without implementing a clearly defined mathematical model. In such cases, one must ask about the scientific merits of running simulations without a conceptual framework driving the relevant inquiry. What sort of regularities can simulationists uncover? How do they justify the scientific value of these regularities? And how should this kind of scientific practice be understood?

To answer these questions, I make explicit the hitherto uncharacterized exploratory strategies related to computer simulations (Sect. 3). To do this, however, I must first address core issues in the debate around (1) exploratory strategies and (2) the conditions for theory-driven experimentation (Sect. 2). Based on the conclusions I reach, I then present my first thesis: the non-theory-drivenness of (some) computer simulations. I also discuss some provisos related to the methodology of computer simulations (I discuss the main example in Sect. 3.2). Once the non-theory-drivenness of computer simulations is established, I discuss three exploratory strategies related to computer simulations.Footnote 2 These exploratory strategies are designed specifically for computer simulations. I use the work of Axel Gelfert (2016, 2018) on exploratory mathematical models as a contrastive baseline. I conclude by suggesting further lines of inquiry for exploratory strategies related to computer simulations (Sect. 5).

2 The Theory-Exploration Divide

In this section, I briefly reconstruct Friedrich Steinle’s work on exploratory experimentation. He argues that standard views of theory-drivenness (e.g., (Hanson, 1958)) fall short of capturing the complexity and diversity of scientific experiments (Steinle, 1997). Steinle maintains that the standard view of experimentation takes cases where theory underpins experiment to be the only genuine kind of experimentation. Experimental activity is understood in terms of “a theory that led to expecting a certain effect; the expectation led to designing and conducting an experiment; and the success of the experiment counted as support for the theory” (Steinle, 2002, p. 418). However, there are other types of experimentation in scientific research. One such type “typically takes place in those periods of scientific development in which – for whatever reasons – no well-formed theory or even no conceptual framework is available or regarded as reliable” (Steinle, 1997, p. S70). For Steinle, a more adequate approach to scientific experimentation discriminates between two non-exclusive types: (1) theory-driven experimentation and (2) exploratory experimentation.

Theory-driven experiments are set up and carried out with “a well-formed theory in mind, from the very-first idea, via the specific design and the execution, to the evaluation” (Steinle, 1997, p. 69). Thus, the theory anticipates the results of the experiment. In fact, Steinle states explicitly that “[t]heory-driven experiments are typically done with quite specific expectations of the various possible outcomes in mind” (Steinle, 1997, p. 70). In this respect, to say that an experiment is theory-driven suggests, at least, three interpretations:

  1. The researcher’s expectations about the results of the experiment fall within the framework provided by the theory grounding the experiment,

  2. The design and success of the experiment depend on a given theory, or

  3. The instruments used for the experiment are theory-dependent.

Some combination of these interpretations is also possible. A canonical example of a theory-driven experiment that combines at least the first two interpretations is the crucial experiment on parity non-conservation in weak interactions (see (Franklin & Smokler, 1981; Wu & Ambler, 1957)).

By contrast, exploratory experimentation is characterized by Steinle as generating findings about phenomena that do not appeal to (1) the framework provided by the theory, (2) the theory used to design the experiment, or (3) the theory used to build the relevant instruments. In other words, the experiment and its results render information about phenomena independently, on their own account.Footnote 3

Exploratory experimentation is, by definition, not theory-driven. Nonetheless, it should not be understood as the counterpart of theory-driven experimentation (Steinle, 1997, p. 71). As Steinle puts it, “exploratory experimentation is not one specific and well-defined procedure, but includes a whole bundle of different experimental strategies” (Steinle, 1997, p. 73). A shortlist of such procedures can be found in (Steinle, 1997, p. 70).

With these ideas in mind, I characterize exploratory experimentation in terms of the following: (1) its relative independence from strong theoretical restrictions, and (2) its capacity to generate significant findings that cannot be framed (or cannot be easily framed) within current theoretical frameworks.

A paradigmatic example is the experiments on static electricity performed by Charles Dufay, André-Marie Ampère, and Michael Faraday. As Koray Karaca points out, these experiments were carried out in a new research field. At the time, this research field did not have a well-defined or well-established theoretical framework (Karaca, 2013, p. 97). The results that Dufay, Ampère, and Faraday recorded have helped to advance the study of electromagnetism into the discipline we know today.

Thus understood, exploratory experimentation is meant to fulfill very specific epistemic functions. It is particularly important in cases where a scientific field is open to revision (perhaps, due to its empirical inadequacy). In such cases, exploratory experimentation plays a fundamental role in the fortune of the relevant theories. This is because the experimental findings are not framed within the theory at hand. Exploratory experimentation is also important when it provides useful information about the world that is not implied by the theory itself. More generally, the findings obtained by exploratory experimentation are significant with respect to a variety of goals. These range from practical matters (e.g., learning how to manipulate phenomena) to theoretical goals (e.g., developing alternative conceptual frameworks) (Waters, 2007). Steinle also suggests that findings from exploratory experimentation might have significant implications for our understanding of existing theoretical concepts. This is a primary epistemic function of exploratory experimentation. An example is when researchers attempt to formulate empirical regularities found in exploratory experiments. Researchers are required to (1) revisit existing theoretical concepts and categories or (2) formulate new theoretical concepts and categories to ensure a stable and general formulation of experimental results (Steinle, 2002, p. 419). On the face of it, exploratory experimentation looms large in scientific research. It has a complicated coexistence with theory. Among other factors, this coexistence is based on degrees of independence, capacity to produce findings, and robustness of results.

On the above interpretation, exploratory strategies will have the general character of research activities. These research activities have the goal of generating significant findings about phenomena without fully appealing to, nor entirely relying on a theory of such phenomena. There are two things to note at this point. Firstly, a key issue here is the degree of dependence of experiment on theory. It is rather straightforward that an experiment is theory-driven if results can be anticipated. As we will see in the next section, this issue is pervasive in discussions about computer simulations. Secondly, we must pay attention to the epistemic functions that exploratory experimentation fulfills. Such functions can involve bringing about observable changes in the world illuminated by the experiments. They can, though, also involve more subtle modifications by serving as testing grounds for novel and yet to be stabilized concepts in a new theory.

I now draw on recent discussions of exploratory models (Gelfert, 2016, 2018; Shech & Gelfert, 2019) to address the above-discussed issues in the context of computer simulations.

3 The Exploratory Character of Computer Simulations

A primary concern regarding computer simulations as exploratory strategies is whether computer simulations are theory-driven. To authors like those mentioned in the introduction, the goal of computer simulations is to find a set of solutions to mathematical models. A computer simulation is, then, dependent on theory or (sets of) mathematical models exogenous to the simulation. If one understands things this way, then one is naturally inclined to agree that simulations are theory-driven. This is because either (1) the modeler’s expectations regarding simulation outputs fall within the framework provided by the theory or mathematical model forming the basis of the simulation, or (2) the design of the simulation and its outputs depend on a given theory (I elaborate on this claim later in this section).

In what follows, I argue that many simulations are not theory-driven in Steinle’s sense. This is especially the case for simulations involving complex target systems. Many simulations proceed without a fully developed theory or mathematical model. Moreover, exploratory research employing such simulations is neither driven by pre-existing theoretical concerns nor specifically aimed at testing theoretical constructs. I illustrate these findings with a case of simulating airborne anthrax infection outbreaks in Sect. 3.2. Another issue of interest is that the core epistemic functions of computer simulations can be illuminated in light of exploratory strategies. This is particularly the case for simulations that do not fulfill theory-driven criteria. I discuss three such strategies in Sect. 4.

Drawing from Steinle, theory-driven computer simulations suggest, at least, three different interpretations:

  1. Simulationists’ expectations regarding a simulation output fall within the framework provided by the theory or mathematical model that is implemented by the simulation,

  2. The design and success of the simulation depend on a given theory, or

  3. The instrument used for the simulation (the physical computer) is theory-dependent.

In the context of theory-driven computer simulations, I submit that interpretations 1 and 2 are equivalent. Specifically, the simulation (and therefore its design) is, by definition, an implemented mathematical model. Thus, the success of the simulation depends on how successfully that mathematical model is computed. It follows that any expectation about simulation outputs can be ascribed to the implemented mathematical model. If so, then 1 and 2 are equivalent.

Later in this section, I argue for the non-theory-drivenness of computer simulations. For this, I return to interpretations 1 and 2. However, this time around I will not discuss interpretation 2. This is because rejecting 1 is sufficient for entrenching the non-theory-drivenness of computer simulations. Indeed, if it can be shown that simulationist expectations regarding the outputs of their simulations do not fall within an implemented and exogenous mathematical model, then it does not really matter whether those simulations depend on a given theory. Even if they did, the opacity of simulations (epistemic, methodological, procedural, etc.) would hamper any attempt to show that a simulation depends on any given theory (Durán & Formanek, 2018; Beisbart, 2021; Humphreys, xxxx).

Admittedly, the validity of my claim depends on how we interpret the ‘dependence’ relation in 2. Unfortunately, Steinle says very little about this. Nonetheless, we can gain some insight if we consider exploratory experiments that are “driven by the elementary desire to obtain empirical regularities and to find out proper concepts and classifications by means of which those regularities can be formulated” (Steinle, 1997, p. 70). In this respect, experiments are dependent on theory when the empirical regularities uncovered can be formulated in terms of previous theoretical concepts and classifications. Similarly, simulations are dependent on theory when their outputs can be formulated in terms of previous and exogenous mathematical models. However, as before, opacity prevents researchers from doing so.

Regarding interpretation 3, it seems obvious that simulation outputs depend on a computer. A computer, in turn, depends on transistors, silicon-based chips, and other physical components. These, naturally, depend on theory in some or other way. Thus, simulations are theory-driven in, at least, one trivial sense. This interpretation is, though, largely irrelevant. Unless the simulation is of the physical workings of the computer, the output will never be related to theories about the computer itself.Footnote 4

3.1 The Strong and Weak Sense of Expecting

Interpretations 1 and 2 appear conceptually equivalent, while interpretation 3 does not seem to apply to computer simulations at all. Interpretation 1 suggests that computer simulations are theory-driven when their outputs can be expected from implemented mathematical models. This is a strong condition that Steinle imposes. Yet, it works as a preliminary benchmark for distinguishing simulations that are “designed for the one well-informed theoretical question - and only for that” from other kinds of simulations (potentially designed for exploratory purposes) (Steinle, 1997, p. 70).

Upon closer inspection, it is possible to identify, at least, two different senses of expectation. A strong sense, where simulationists can anticipate (i.e., know a priori or predict) simulation outputs before running a simulation; and a weak sense, where simulationists can trace back simulation outputs to implemented mathematical models (e.g., by explaining the simulation outputs with the mathematical model).

The strong sense takes inspiration from laboratory experiments where experimenters anticipate that an effect or phenomenon will occur, and they then set up their experiment to that end. For example, theory indicates that a stone thrown at 45\(^{\circ }\) will travel the longest possible distance compared to those thrown at other angles. Experimenters corroborate this by designing experiments under varying conditions.
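
For reference, the textbook result behind this expectation (ignoring air resistance and assuming launch and landing at the same height) can be stated in one line:

\[ R(\theta) = \frac{v_0^2 \sin (2\theta)}{g}, \]

where \(v_0\) is the launch speed and \(g\) the gravitational acceleration. Since \(\sin(2\theta)\) peaks at \(2\theta = 90^{\circ}\), the range is maximized at \(\theta = 45^{\circ}\), which is precisely what the experimenters set out to corroborate.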

For computer simulations, anticipation in the strong sense is more difficult to obtain. It is only possible if simulationists can solve the mathematical model analytically. As it turns out, not every model can be analytically solved (Humphreys, 1990). And not every simulationist can predict the outputs of their simulations. As mentioned, different forms of opacity prevent simulationists from knowing a priori and predicting simulation outputs. I take it, then, that we need not discuss the strong sense any further.Footnote 5

We are left with the weak sense (which seems to influence most of the literature on computer simulations). At its heart is the intuition that a conceptual link can be established between simulation outputs and the implemented mathematical model.Footnote 6 Mary Morgan suggested that simulation outputs can be traced back to the implemented mathematical models by explaining the former using the latter.Footnote 7 Because of this possibility of tracing back outputs, simulations have the capacity to surprise simulationists, but not to confound them ((Morgan, 2003, p. 219) and (Morgan, 2005, p. 321)). Surprise involves simulation outputs ‘shaking’ the simulationists’ conceptual framework by revealing unforeseen new patterns (mostly, for the reasons given against the strong sense). However, simulation outputs do not confound simulationists. This is because simulationists “[know] the resources that went into the model [...] so that however unexpected the model outcomes, they can be traced back to, and re-explained in terms of, the model” (Morgan, 2005, pp. 324–325).Footnote 8 Contrast this thought with the idea that only laboratory experiments can confound. Laboratory experiments discover new entities in the world, they offer unprecedented observations and measurements, and they can be used to confirm or refute hypotheses. The critical difference lies in the fact that experiments track causal relations in the world while simulations only represent them (Morgan, 2005, p. 324).

I take it to be uncontroversial that simulation outputs can sometimes be traced back to an exogenous mathematical model. Measurements with simulations can be informed by mathematical models (Morrison, 2009; Tal, 2011). For example, computer simulations are employed to correct for the effects of interfering factors when measuring temperature. Outputs can be traced back to “the earlier temperature [...], thermodynamic theory, and our knowledge of the initial temperature of the thermometer” (Parker, 2017, p. 285). Similarly, in climate modeling, simulations are thought of as being “path dependent” in the sense that the choices that modelers make about how to solve problems at a certain time will affect which options are available for solving problems at a later time (Lenhard & Winsberg, 2010, p. 257).

That said, tracing back simulation outputs is increasingly becoming either impossible or undesirable. In (2017), I argue that simulations carry artefacts coded in their algorithm. These artefacts cannot be explained by an exogenous mathematical model (or, if they can, the mathematical models will misrepresent the target phenomenon). The example is an orbiting satellite under tidal stress. Upon computation, the output shows an orbital eccentricity trending steadily downwards. As I state, “[i]f the entire set of results is taken to represent real-world phenomena, [as an advocate of explaining with an exogenous mathematical model would], then we are wrongly ascribing a trend towards a circular orbit to the behaviour of the real-world satellite, when in fact it is an artefact in the computation of the simulation model” (Durán, 2017, p. 32).

Another example is simulations that upscale or downscale time, space, and other units of representation. An interesting case is the simulation of an imaginary piston to accelerate reactions in molecular dynamics. The piston is a computational trick to periodically force molecules into high-density configurations, thereby increasing the number of collisions and kinetic barrier crossings. This triggers specific chemical reactions, which can be mapped out and analyzed using high-level quantum calculations to determine the minimum energy pathways of specific reactions. The simulation assists in identifying new materials and reactive conditions worth investigating (Goldman, 2014). Simulationists can only trace back simulation outputs to a mathematical model using very laborious techniques (if it can be done at all). Neither empirical measurements nor observations of the piston are available. Derivations from theory are a non-starter. And, full explanations are only possible under specific conditions related to the model, the explanatory relation, and the level of description. Yet, these simulations are very valuable for science and engineering.
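
To make the idea of such a computational trick more tangible, the following sketch (a toy illustration under assumed parameters, not Goldman's actual code) periodically rescales particle positions toward the center of the box, mimicking a piston that forces the system into high-density configurations, and counts close approaches as a crude proxy for collision events:

```python
import numpy as np

# Toy sketch of a periodic "piston" compression in a molecular-dynamics-style loop.
# This is an illustrative placeholder, not the method of Goldman (2014): the
# parameters, units, and the absence of real interatomic forces are all assumptions.

rng = np.random.default_rng(0)
n_atoms, box, dt = 64, 10.0, 0.01
pos = rng.uniform(0.0, box, size=(n_atoms, 3))
vel = rng.normal(0.0, 1.0, size=(n_atoms, 3))

close_encounters = 0
for step in range(1000):
    pos = (pos + vel * dt) % box                   # free streaming with periodic boundaries
    if step % 100 == 0 and step > 0:
        pos = box / 2 + 0.8 * (pos - box / 2)      # "piston": squeeze positions toward the center
    # count close approaches as a crude proxy for collision / barrier-crossing events
    diffs = pos[:, None, :] - pos[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    close_encounters += int((dists[np.triu_indices(n_atoms, k=1)] < 0.5).sum())

print("close-approach events:", close_encounters)
```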

It is important to note the widely accepted practice of omitting any formal representation of a target system in favor of a ready-made algorithmic structure. This means that simulationists increasingly opt to code their simulations directly rather than write equations and then combine them into a mathematical model that is later implemented on the computer. Writers like Steven Peck (2012) and Donald et al. (2014) have shown how agent-based simulations might be nothing more than an algorithm framing agents’ behavior. In this sense, the model’s representation takes place at the level of algorithmic structures without the mediation of a mathematical model. The representation is built up from suspected relational structures, structures that are abstracted from the target system and directly coded into the simulation model. This happens partly because it is a faster way to code simulations. However, as I will argue in the next section, it is also because programming languages are designed to represent the dynamics of target systems in ways that mathematical formalisms struggle to handle.Footnote 9 In such cases, simulations do not conform to any sense of theory-drivenness. Nevertheless, they can uncover regularities about real-world phenomena and about the simulationists’ hypotheses.Footnote 10

3.2 Simulating Airborne Anthrax Infection Outbreaks

To illustrate the non-conformity of simulations to theory-driven claims, consider simulating the dynamics of airborne anthrax infections in mid-dense populations. Such a simulation has two primary components: (1) a Bayesian network containing millions of nodes, represented in pure mathematical formalism; and (2) a large partition of different types of nodes representing individual persons strategically distributed within the network. At the highest level, the network consists of a set of global nodes, G, a set of interface nodes, I, and a set of person sub-networks \(P = \{P_1, P_2, \dots ,P_n\}\) (Cooper et al., 2004, p. 95).

A fair simplification of the workings of the simulation takes the following form. It first computes the actual probability of a person being exposed to anthrax. This is done by inferring the posterior probabilities of outbreak diseases in the Bayesian network. Next, it computes the spatial distribution of a person exposed to anthrax. This is determined by the person node, which contains a network of interconnected sub-nodes (e.g., age decile, gender, and anthrax infection) (see Figure 3, (Cooper et al., 2004, p. 98)); and the anthrax infection node, which takes up to four different states: no infection, 24 h infection, 48 h infection, and 72 h infection (for details, see (Cooper et al., 2004, pp. 98–99)). For timely detection, inferences must be performed in real time as data comes in. Once the probability of an outbreak exceeds a particular threshold, the system generates an alert.
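
A schematic rendering of this high-level structure might look as follows; the class names, fields, toy inference, and alert threshold are illustrative assumptions, since the actual system performs Bayesian inference over millions of nodes in real time:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the high-level structure described by Cooper et al. (2004):
# global nodes G, interface nodes I, and person sub-networks P_1..P_n.
# Names, fields, and the toy "inference" below are assumptions for exposition only.

INFECTION_STATES = ["none", "24h", "48h", "72h"]

@dataclass
class PersonSubnetwork:
    age_decile: int
    gender: str
    infection_state: str = "none"   # one of INFECTION_STATES

@dataclass
class OutbreakNetwork:
    global_nodes: dict = field(default_factory=dict)      # G: e.g., outbreak present, release location
    interface_nodes: dict = field(default_factory=dict)   # I: mediate between G and the person nodes
    persons: list = field(default_factory=list)           # P = {P_1, ..., P_n}

    def posterior_outbreak_probability(self) -> float:
        # Placeholder for real-time Bayesian inference over the full network.
        infected = sum(p.infection_state != "none" for p in self.persons)
        return infected / max(len(self.persons), 1)

ALERT_THRESHOLD = 0.05  # arbitrary illustrative value

net = OutbreakNetwork(persons=[PersonSubnetwork(3, "F"), PersonSubnetwork(6, "M", "24h")])
if net.posterior_outbreak_probability() > ALERT_THRESHOLD:
    print("outbreak alert")
```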

To illustrate how computer simulations represent the dynamics of anthrax infection in ways that exceed mathematical representation, consider the case of nested conditionals (see pseudo-algorithm 1).Footnote 11 Nested conditionals are suitable for representing different paths in the proliferation and spread of the infection, and possible states representing how the infection is transmitted across different networks. However, simulationists do not write down nested conditionals in mathematical formalism. Instead, they are directly coded into the simulation.

[Pseudo-algorithm 1: nested conditionals encoding the proliferation and transmission paths of the infection; listing not reproduced here.]
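
In the spirit of that pseudo-algorithm, the hedged sketch below shows how nested conditionals might encode branching infection paths directly in code; the specific conditions, thresholds, and state names are invented for exposition and do not reproduce Cooper et al.'s actual algorithm:

```python
# Hedged illustration of nested conditionals for infection spread; the conditions,
# state progression, and time windows are invented for exposition and do not
# reproduce pseudo-algorithm 1 from Cooper et al. (2004).

def update_infection_state(person, exposed_to_release, hours_since_exposure, in_affected_area):
    if exposed_to_release:
        if hours_since_exposure < 24:
            person["infection_state"] = "24h"
        elif hours_since_exposure < 48:
            person["infection_state"] = "48h"
        else:
            person["infection_state"] = "72h"
    else:
        if in_affected_area:
            person["at_risk"] = True          # flagged for monitoring, not yet infected
        person["infection_state"] = "none"
    return person

print(update_infection_state({"infection_state": "none"}, True, 30, False))
```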

Moreover, mathematical formalism does not account for the selection of nodes, their spatial distribution, or the fact that a given node can take up to four different states. The reason seems to be that aggregating these details in a mathematical syntax would unnecessarily complicate the development of the simulation, with no obvious added value to the simulationists’ overall understanding of the simulation. Programming languages can effortlessly and adequately aggregate these details. In the process, they foster epistemic virtues like simplicity, conceptual clarity, precision, and scope of application.

Finally, consider what would happen if some branches in the conditionals were removed or connected in different ways. In the former case, the simulation would simply be a handful of Bayesian networks disconnected at the high level (i.e., at the level of relations between G, I, and the person nodes \(P_i\)). If a connection breaks during execution, the simulation will not provide any reliable information. In the latter case, different probabilities of contagion will be measured, and a different dynamics of the outbreak will be represented. But none of this can be captured by mathematical formalism (despite mathematical syntax being key to representing the dynamics of anthrax infections). Again, the probabilities are mathematical, and they can therefore be formulated. But the overall structure and behavior of the simulation are not.

4 Functions and Uses of Exploratory Simulations

If the above considerations are correct, then simulations generate significant findings about phenomena. And, they do so without having to appeal to or rely on exogenous mathematical models of those phenomena.

In what follows, I cash out these results by discussing three exploratory functions for computer simulations. While these are designed for computer simulations, it will be informative to compare them to Gelfert’s exploratory functions of mathematical models. This allows us to understand how simulations depart from models when it comes to exploratory functions and uses.

4.1 Exploratory Simulations as Starting Points and Continuation of Scientific Inquiry

Gelfert’s first use of exploratory modeling finds inspiration in William C. Wimsatt’s work on false models as a means to truer theories (Wimsatt, 2007, p. 94). Wimsatt’s core idea is that models are epistemically biased in one way or another. Idealizations, abstractions, and approximations are proof of this. Given that models are presumably false, Wimsatt wants to know how they ever contribute to truth (or at least to the best approximation to truth). This is, in a nutshell, Wimsatt’s project of counterbalancing anti-realist claims about idealizations and abstractions in modeling in favor of local realism. In any event, the truth or falsity of models is not Gelfert’s primary concern. To his mind, “in the early stages of inquiry, [it] may be impossible to judge, given the lack of a good theoretical measure” (Gelfert, 2016, p. 84). Gelfert does, however, recognize that the issue raises a legitimate concern. Models are highly idealized and abstract units of analysis. Oftentimes, very little epistemic input can be ‘squeezed’ out of them. Gelfert’s response is that models have great potential as future avenues of research. This is where their exploratory character ultimately resides (Gelfert, 2016, p. 84). It is in this general sense that Gelfert presents exploratory models as starting points for future inquiry.

Computer simulations, I submit, are excellent as starting points for future inquiry. They can represent the target system with impressive levels of realism. Simulations can also be highly predictively accurate, making them a crucial tool for opening new lines of scientific research. Consider two side-by-side simulations used to investigate the conditions of an epidemic in a population-dense country like Italy (Ajelli & Gonçalves, 2010).

The first simulation is a stochastic agent-based simulation that represents individuals through highly detailed data on (1) the relevant socio-demographic structure, (2) the probability of commuting from municipality to municipality, and (3) the integration of susceptible, latent, asymptomatic, and symptomatic forms of infection. Marco Ajelli and colleagues define this agent-based model as “a stochastic, spatially-explicit, discrete-time, simulation model where the agents represent human individuals [...] One of the key features of the model is the characterization of the network of contacts among individuals based on a realistic model of the socio-demographic structure of the Italian population” (Ajelli & Gonçalves, 2010, p. 4).

The second simulation is a multiscale mobility network known as GLobal Epidemic and Mobility (GLEaM). It is based on high-resolution population data, where the resolution is given by cells of 15 × 15 minutes of arc. A typical GLEaM consists of three data layers. A first layer contains population size and mobility data, which allow one to partition the world into geographical regions. This partition defines a second layer, the sub-population network, where the inter-connections among nodes stand for the fluxes of individuals via general mobility patterns and transportation infrastructures (train stations, taxi pick-up spots, residential parking, etc.). Finally, superimposed onto the second layer is the epidemic layer. It defines the disease dynamic inside each sub-population group (Balcan & Colizza, 2009). In Ajelli et al.’s study, the GLEaM also represents a grid-like partition where each cell is assigned to the closest airport. The sub-population network uses geographic census data. The mobility layers obtain data from different databases. These include the International Air Transport Association database, which consists of a list of the world’s airports connected by direct flights.
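
A rough sketch of this layered design (not the actual GLEaM code) can represent the three layers as nested data structures plus a per-subpopulation epidemic update; the cell aggregation is omitted, and the population sizes, mobility fluxes, and SIR-style parameters below are placeholders:

```python
import numpy as np

# Hedged sketch of a GLEaM-style layered metapopulation model (Balcan & Colizza, 2009).
# Layer 1: geographic cells aggregated into subpopulations; Layer 2: a mobility network
# of fluxes between subpopulations; Layer 3: an epidemic dynamic inside each subpopulation.
# All numbers and the simple SIR update are illustrative assumptions.

rng = np.random.default_rng(1)
n_sub = 5
population = rng.integers(50_000, 500_000, size=n_sub)            # layer 1 (aggregated cells)
flux = rng.integers(0, 2_000, size=(n_sub, n_sub))                # layer 2: daily travelers i -> j
np.fill_diagonal(flux, 0)

S = population.astype(float).copy()
I = np.zeros(n_sub); I[0] = 10.0; S[0] -= 10.0                    # seed the outbreak
R = np.zeros(n_sub)
beta, gamma = 0.3, 0.1                                            # layer 3 parameters (assumed)

for day in range(100):
    # layer 3: within-subpopulation SIR update
    new_inf = beta * S * I / population
    new_rec = gamma * I
    S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
    # layer 2: move infectious individuals proportionally along the mobility fluxes
    out_frac = flux.sum(axis=1) / population
    moved = I * out_frac
    shares = flux / np.maximum(flux.sum(axis=1, keepdims=True), 1)
    I = I - moved + shares.T @ moved

print("final attack rate per subpopulation:", np.round(R / population, 3))
```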

The two simulations are dissimilar in what they represent. GLEaM considers spatial structure and age structure, while the agent-based model is highly structured and considers households, schools, etc. The simulations are expected to present different attack rates at different times. However, the difference in peak amplitudes decreases for increasing values of the reproductive number \(R_0\). Ajelli et al. explain this as follows: “[a]t the end of the epidemic outbreak, the average size predicted by GLEaM ranges from 36% for \(R_0=1.5\) to 56% for \(R_0=2.3\), as compared to the one observed in the agent-based model which ranges from 26% for \(R_0 = 1.5\) to 49% for \(R_0 = 2.3\), with an absolute difference of about 10% for \(R_0 = 1.5\) and 7% for \(R_0 = 2.3\)” (Ajelli & Gonçalves, 2010, p. 8).

The good match between the two simulations’ predictions of the geotemporal spreading pattern of an epidemic supports obvious lines of inquiry that pertain to both the target system and the simulations. A short list would include, among others, the identification of the most vulnerable nodes in the network, the best way to calibrate initial conditions, and mechanisms for assessing the reliability of medical reporting and notification systems.

Computer simulations do not only serve as starting points for future inquiry. They also enable continued exploration by other means, and they can pave the way for new kinds of research (vis-à-vis what, where, and how to explore). This is because they accompany the entire process of research. Thus understood, simulations offer exploratory strategies for the continuation of scientific inquiry on two accounts.

Firstly, adding new modules to the simulation facilitates the inclusion of new target phenomena. This allows the simulation to represent more phenomena (quantitatively and qualitatively), add numerical and visual realism to the target system, and enjoy higher levels of predictive accuracy. An example is a simulation that quantifies the economic and demographic impact of transportation infrastructure investments (Diaz & Behr, 2016). Initially, the research depends on a simple simulation. But, over time and with module accumulation, the research finds new avenues of inquiry.Footnote 12

The most recent version of this simulation employs a dynamic model, one that mimics the behavior of complex and cyclical transportation relations over time. These include transportation infrastructure, levels of productivity, congestion, net migration patterns, and travel behavior and demand. These relations do not come from a single model, nor do they come from one set of model-building practices. Rather, they come from an accumulative process. This process is triggered by adding new simulation modules to the initial simulation, modules that potentially expand the line of research. As such, the simulation provides insight into the duration of critical cyclical patterns given prospective infrastructure investments. It also “seeks to be utilized as guidance to support decision-making processes that lead to the execution of more exhaustive transportation studies that organize the execution of such investments” (Diaz & Behr, 2016, Abstract). As the authors indicate, building simulations with multiple simulation modules representing the transportation infrastructure enables more exhaustive studies on transportation.

Secondly, simulations offer exploratory strategies for the continuation of scientific inquiry. They do so by advancing new research in neighboring disciplines. Take, for instance, simulating the resistance of human bones, which is necessary for understanding their internal structure. In real material experiments, force is mechanically exerted, it is measured, and data is collected. Unfortunately, a material experiment does not allow experimentalists to distinguish the strength of the material from the strength of its structure. The mechanical process involved also destroys the bone. This makes it difficult to observe and analyze how the detailed internal structure responds to increasing force. Running a computer simulation is the best way to obtain reliable information about the resistance of human bones. Two types of simulations were utilized at the Orthopaedic Biomechanics Laboratory at the University of California, Berkeley (Keaveny & Wachtel, 1994; Niebur & Feldstein, 2000).

Type 1 involved converting a real cow hipbone into a computerized image. The team cut very thin slices of the bone sample. They then prepared them in a way that allowed the complicated bone structure to stand apart from spaces where there was no bone. Each slice, afterwards, was turned into a digital image (Beck & Canfield, 1997). These digitalized images were later reassembled in a computer, creating a high-quality 3-D image of the real cow hipbone. The benefit of this simulation is that it retains a high degree of verisimilitude in structure and appearance for each bone sample. Little is added, removed, filtered, or replaced in the process of preparing the bone and in the process of turning it into a computer model.

Type 2 involved computerizing a stylized bone as a 3-D grid image. Each individual square within the grid was given assorted widths. These were based on average measurements of internal strut widths from real cow bones and angled in relation to each other by a random assignment process (see (Morgan, 2003)). The advantage of the stylized bone comes from familiarity with the process of modeling. Simulationists begin by hypothesizing a simple grid structure. They then add details and features as needed. In this way, an idealized and simplified abstract structure of the bone is created from the beginning.
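
A hedged sketch of this type 2 construction might generate a regular lattice of struts whose widths are drawn around an assumed mean and whose orientations are randomly perturbed; the grid size, mean width, and spread are invented values, not the Berkeley laboratory's measurements:

```python
import numpy as np

# Hedged sketch of a stylized (type 2) bone model: a regular 3-D lattice of struts
# with widths drawn around an assumed average and randomly perturbed orientations.
# Mean width, spread, and grid size are illustrative assumptions only.

rng = np.random.default_rng(2)
grid_shape = (10, 10, 10)                                 # lattice of strut junctions
mean_width, width_sd = 0.12, 0.02                         # mm, assumed average strut width

strut_widths = rng.normal(mean_width, width_sd, size=grid_shape)
strut_tilts = rng.uniform(-5.0, 5.0, size=grid_shape)     # degrees of random tilt per strut

# One could then mesh these struts for finite-element analysis; here we only
# report summary structural statistics as a stand-in.
print("mean simulated strut width (mm):", round(float(strut_widths.mean()), 3))
print("width range (mm):", round(float(strut_widths.min()), 3), "to", round(float(strut_widths.max()), 3))
```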

Both type 1 and type 2 simulations are theory-driven in design and aim. They investigate how the bone structure behaves under conditions of stress and pressure. Both simulations also allow researchers to learn how bone architecture responds in real accidents and how bones are best repaired. However, the simulations’ exploratory character stems from generating significant findings about phenomena in neighboring disciplines without fully appealing to the implemented mathematical model. For instance, these simulations advance medical inquiry into musculoskeletal forces in the body, the dynamic behavior of collagen in tissue, and the effect of ageing and drug treatment (Christen & Webster, 2010, p. 2660). They also enable the development of mathematical formulae related to micro-finite element analyses (Christen & Webster, 2010, p. 2657) and the physical basis of energy absorption prior to bone fractures (Christen & Webster, 2010, p. 2661). None of these advancements and findings fully depend on an exogenous mathematical model, even for cases of theory-driven simulations.

4.2 Exploratory Simulations as Varying Parameters

In his treatment of exploratory experimentation, Steinle argues that a key exploratory property is the possibility of instantiating different experimental parameters. This has the purpose of allowing experimenters to learn which conditions are indispensable for the phenomena of interest (Steinle, 1997, p. S69). Gelfert is less enthusiastic about varying parameters. He thinks that doing so “may come too cheaply” (Gelfert, 2016, p. 79). Varying parameters is, at best, a generic kind of exploratory strategy. It is not the most favorable way of “generating understanding or granting more solid epistemic access to a target phenomenon” (Gelfert, 2016, p. 79). Gelfert’s caution stems from his view that experimentation and modeling afford different kinds of access to a target system. Experimenters causally intervene in nature to explore its dynamics, while modelers vary parameters for curve-fitting purposes (Gelfert, 2016, p. 79). I contend that computer simulations tell a different story.

Scanning the space of model parameters in computer simulations does, indeed, come ‘cheaply.’ This is for good reason. Simulationists can considerably increase the number of parameters under study (degrees of freedom, number of independent and dependent variables, etc.). Simulationists can also manipulate the range of values that each parameter can take, amounting to a wider range of intervals than experimentalists and modelers can deal with. Yet, this is surely not a reason to consider simulations the least favorable way to generate understanding of a target phenomenon. Recall the simulation of human bone resistance. Simulationists can test a larger number of parameters. Simulationists can also test these parameters within the total physical range (from no pressure to the point of breakage) (Keaveny & Wachtel, 1994, p. 1313).
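
To illustrate how cheaply such a sweep can be set up, the sketch below loops over an assumed range of applied pressures and flags the first value at which a toy failure criterion is met; the stiffness, failure strain, and pressure range are placeholders rather than values from Keaveny & Wachtel (1994):

```python
import numpy as np

# Hedged sketch of a parameter sweep from "no pressure to the point of breakage".
# The linear stress-strain response and the failure strain are invented placeholders.

pressures_mpa = np.linspace(0.0, 200.0, 401)      # assumed sweep range and resolution
elastic_modulus_mpa = 12_000.0                    # assumed effective stiffness
failure_strain = 0.01                             # assumed breakage criterion

for p in pressures_mpa:
    strain = p / elastic_modulus_mpa              # toy linear response
    if strain >= failure_strain:
        print(f"toy model 'breaks' at roughly {p:.1f} MPa")
        break
else:
    print("no breakage within the swept range")
```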

But, there is more to it. Varying parameters in computer simulations assists multiple scientific goals. These include bringing about outputs that might or might not represent real-world phenomena, laying out the conditions for generating knowledge and understanding of these outputs, enabling the exploration of factual and counterfactual scenarios, and offering ways to manipulate time, space, and other scales to permit the collection of data-based evidence for settling scientific disputes.

Consider the problem of the resistance of materials to heat and pressure. Simulationists use computer simulations to explore an array of materials’ atomic properties. They also explore structures and their reactions to a given range of heat and pressure. Without the simulationists’ intervention, the simulation can form combinations for the required materials by selecting from known atom types and properties. Nir Goldman (2014) demonstrates how one can effortlessly achieve this with computer simulations. Experimentalists can later test the newly found structures under the required conditions of heat and pressure. By scanning the space of parameters, simulations work as exploratory experiments. And they are recognized as such: “[t]hese types of results [i.e., the simulation outputs] can make experiments more tractable by aiding in their interpretation, and helping to narrow the number of different materials and reactive conditions to investigate” (Goldman, 2014, p. 1033). This specific simulation is also able to freely predict structures and reactions based on quantum interactions and classical equations of motion in the simulation model (along with databases, data structures, design decisions, etc.). The simulation model that initially lacked predefined parameters can now ‘find’ those parameters through computation. Goldman does not think that this is a disadvantage (we cannot even begin to understand what an experiment or a model would be without predefined parameters). Instead, he believes that it is of crucial scientific value for the simulation. He writes: “[t]his lack of predefined parameters or elementary steps makes the technique an appealing tool to help determine the seemingly innumerable synthesis mechanisms that sometimes occur in a mixture of reactive compounds” (Goldman, 2014, p. 1034).

Varying the number of parameters also assists in theoretical inquiries (e.g., finding optimal solutions to multidimensional problems). Consider simulated annealing, an optimization technique that aims to find the best solution to a multidimensional problem. Multidimensionality here means that the simulation must deal with several variables involved in the solution to the problem. This makes it impossible to simply “draw a plot in two or three dimensions and to inspect it visually” (Bailer-Jones, 2009, p. 63). The optimization technique of simulated annealing enables the exploration of the domain of optimal solutions. These solutions can later assist in the physical process of annealing, in which a material is heated to a very high temperature and then slowly cooled.
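
For concreteness, a minimal simulated-annealing loop over a multidimensional objective (here an assumed quadratic test function, not a physical annealing model) looks roughly as follows:

```python
import math
import random

# Minimal sketch of simulated annealing on a multidimensional objective.
# The objective function, cooling schedule, and step size are illustrative assumptions.

def objective(x):
    return sum((xi - 1.0) ** 2 for xi in x)       # toy multidimensional "energy"

random.seed(0)
dim = 6
current = [random.uniform(-5, 5) for _ in range(dim)]
best = list(current)
temperature = 10.0

for step in range(20_000):
    candidate = [xi + random.gauss(0.0, 0.1) for xi in current]
    delta = objective(candidate) - objective(current)
    # accept downhill moves always; accept uphill moves with Boltzmann-like probability
    if delta < 0 or random.random() < math.exp(-delta / max(temperature, 1e-9)):
        current = candidate
        if objective(current) < objective(best):
            best = list(current)
    temperature *= 0.9995                         # slow cooling schedule

print("best objective value found:", round(objective(best), 4))
```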

The same applies to computer simulations that provide theoretical feedback and predictions of values for parameters by systematically exploring their range of possible values. For instance, LOTUSES is a computer simulation used for predicting the theoretical performance of explosives (e.g., TNT, PETN, RDX, and HMX), their decomposition products after explosion, and their power index. The predicted parameters include performance parameters (density, detonation factor, velocity of detonation, etc.) and thermodynamic properties (heat of detonation and explosion, volume of explosion gaseous products, etc.) (Muthurajan & Sivabalan, 2004).

I submit that simulationists can grant genuine epistemic access to target phenomena by varying the parameters of computer simulations. This constitutes a legitimate exploratory strategy.

4.3 Exploratory Simulations as Scientific Prototyping

A final exploratory strategy associated with computer simulations is prototyping. Prototyping is the capacity to produce preliminary versions of phenomena from unrealistic scenarios. A case in point is the anthrax infection outbreak discussed earlier. Cooper and colleagues used the simulation as an ‘experiment’. They mapped simulated cases of patients with anthrax (generated from a separate model) onto background data of real patients who visited emergency departments during a period when no known outbreaks were occurring (Cooper et al., 2004, p. 95). In this case, the patients with anthrax were not real, nor were the simulated periods actual outbreak periods. Nonetheless, the simulated experiment allowed simulationists to measure some important variables. These included the predictive accuracy of the simulation (key for judging its reliability in real cases), the reaction time of the simulation in detecting outbreaks (key for knowing the minimum time required by first responders), and the breadth of applicability of the simulation (key for calibrating it to realistic scenarios).
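
A hedged sketch of this kind of semi-synthetic experiment might inject simulated case counts into an assumed non-outbreak baseline of emergency-department visits and record how long a simple threshold detector takes to raise an alert; the baseline rates, injected counts, and threshold are invented and do not reproduce Cooper et al.'s evaluation protocol:

```python
import numpy as np

# Hedged sketch of a semi-synthetic detection experiment: simulated outbreak cases
# are injected into an assumed non-outbreak baseline of emergency-department visits,
# and the detection delay is measured. Baseline rates, injected counts, and the
# alert threshold are illustrative assumptions.

rng = np.random.default_rng(3)
days = 60
baseline = rng.poisson(lam=100, size=days)                  # assumed background ED visits
injected = np.zeros(days, dtype=int)
outbreak_start = 30
injected[outbreak_start:] = rng.poisson(lam=8, size=days - outbreak_start)  # simulated cases
observed = baseline + injected

# simple alert rule: exceed the pre-outbreak mean by three standard deviations
threshold = baseline[:outbreak_start].mean() + 3 * baseline[:outbreak_start].std()
alert_days = np.nonzero(observed > threshold)[0]
alert_days = alert_days[alert_days >= outbreak_start]
if alert_days.size:
    print("detection delay (days):", int(alert_days[0] - outbreak_start))
else:
    print("no alert within the simulated window")
```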

What separates prototyping from the previous exploratory strategies is that the target phenomena are highly speculative, contingently non-existent, necessarily impossible, or counterfactual. Thus understood, prototyping is a genuine form of exploratory strategy, one that simulationists use to manufacture multiple kinds of unrealistic scenarios. This is so on two accounts. Prototyping is possible under conditions of non-theory-driven simulations (as the anthrax example demonstrates). But it is also possible under conditions of well-known mathematical models. Consider simulating a satellite orbiting a planet, where the gravitational constant is set to \(G = 2.0 \times 10^{-11} {\text{m}}^3 {\text{kg}}^{-1} {\text{s}}^{-2}\) (Woolfson & Pert, 1999). Here, G patently deviates from the actual gravitational constant. As such, any simulation output will be physically impossible (relative to the known physical world). Yet, it would not be too difficult to trace back and explain why the satellite spirals away from the planet by using the equations of classical mechanics. Prototyping can, then, be considered a genuine form of exploratory strategy, one that does not depend on whether the simulation is theory-driven or not. Like other forms of exploration, prototyping aims at furthering scientific research. This research ultimately contributes knowledge to diverse scientific and engineering disciplines.
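
As a hedged illustration (not the Woolfson & Pert code), one can integrate a two-body orbit with the gravitational constant deliberately set to the unphysical value above and watch the radial distance grow; the planet's mass, the initial circular orbit, the time step, and the simple integrator are all assumptions:

```python
import math

# Hedged sketch: a satellite integrated with an unphysical gravitational constant
# G = 2.0e-11 (instead of 6.674e-11). Planet mass, initial conditions, time step,
# and the semi-implicit Euler integrator are illustrative assumptions.

G_ALTERED = 2.0e-11          # m^3 kg^-1 s^-2, as in the thought experiment
M_PLANET = 5.97e24           # kg, an Earth-like mass (assumed)

r0 = 7.0e6                                           # m, initial orbital radius (assumed)
x, y = r0, 0.0
vx, vy = 0.0, math.sqrt(6.674e-11 * M_PLANET / r0)   # circular speed under the *real* G
dt = 1.0                                             # s

for step in range(20_000):
    r = math.hypot(x, y)
    ax = -G_ALTERED * M_PLANET * x / r**3
    ay = -G_ALTERED * M_PLANET * y / r**3
    vx, vy = vx + ax * dt, vy + ay * dt
    x, y = x + vx * dt, y + vy * dt

# with the weakened G the initial speed exceeds escape velocity, so the distance grows
print("final distance / initial radius:", round(math.hypot(x, y) / r0, 2))
```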

5 Final Remarks

Exploratory experiments and explorative models do not exhaust the domain of exploratory strategies in scientific research. Computer simulations can contribute in meaningful ways. In this article, I have attempted to build and expand on these issues. In doing so, I hope to have contributed to advancing the literature on exploratory strategies.

I have discussed three forms of exploratory simulations: computer simulations as starting points and the continuation of scientific inquiry, as varying parameters, and as scientific prototyping. Computer simulations are increasingly employed across scientific disciplines. Advances in computer languages, data sources, and programming practices mean that we can anticipate a rise in novel and unprecedented exploratory strategies. As such, the exploratory functions and uses I have presented here are unlikely to be exhaustive. This underscores why the ongoing debate around this important subject should continue.

I wish to mention two reasons as to why philosophers should pay more attention to computer simulations as exploratory strategies. Firstly, computer simulations often encompass exploratory functions even when they are driven by specific theories. This is demonstrated by examples like simulating human bone breakage and a satellite orbiting a planet. This suggests that the realm of exploratory strategies for computer simulations extends beyond the non-theory-driven requirement.

Secondly, computer simulations can incorporate multiple exploratory strategies without one strategy dominating another. To illustrate, we can compare a stochastic agent-based model and a structured meta-population stochastic model (GLEaM) used to study the dynamics of an epidemic outbreak. The agent-based model represents the population with detailed socio-demographic data. In contrast, the GLEaM simulation focuses on population data that represents individual flows through transportation infrastructure and general mobility patterns. I previously discussed these simulations as a starting point and continuation of scientific inquiry. But a similar argument can be made for using them in prototyping. The simulations depict how an epidemic outbreak might unfold in a country like Italy (considering its population and transportation characteristics). However, they lack the empirical data required to validate the simulation. The simulations rely heavily on speculative assumptions and counterfactual representations.Footnote 13