Calibration, Validation, and Confirmation

Part of the book series: Simulation Foundations, Methods and Applications ((SFMA))

Abstract

This chapter examines the role of parameter calibration in the confirmation and validation of complex computer simulation models. I ask to what extent calibration data can confirm or validate the calibrated model, focusing in particular on Bayesian approaches to confirmation. I distinguish several Bayesian approaches to confirmation and argue that complex simulation models exhibit a predictivist effect: complex computer simulation models constitute a case in which predictive success, as opposed to the mere accommodation of evidence, provides a more stringent test of the model. Data used in tuning do not validate or confirm a model to the same extent as data successfully predicted by the model do.

Notes

  1. See also Box (1979, p. 202): “All models are wrong but some are useful.”

  2. This can be seen as follows. By Bayes's Theorem, p(h|e.b) = p(e|h.b) × p(h|b)/p(e|b). If p(e|b) = 1 − ε for some small number ε, then p(h|e.b)/p(h|b) ≈ p(e|h.b) × (1 + ε). That is, the posterior probability p′(h|b) = p(h|e.b) cannot be appreciably larger than p(h|b). A version of the problem also arises if we replace the Principle of Conditionalization with Jeffrey Conditionalization, which presupposes that observations result in non-inferential changes in the probability of an evidential statement e. As in the traditional formulation, the problem is that the probability of evidential statements does not change for old evidence.
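    Written out as a display (this simply restates the calculation in this note), the point is that for old evidence, with p(e|b) = 1 − ε, conditionalization can raise the prior by at most a factor of roughly 1 + ε:

    $$ \frac{p(h|e.b)}{p(h|b)} = \frac{p(e|h.b)}{p(e|b)} = \frac{p(e|h.b)}{1 - \varepsilon} \approx p(e|h.b)\,(1 + \varepsilon) \le 1 + \varepsilon. $$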

  3. Steele and Werndl (2016) are among the very few dissenters from this consensus; they suggest that the Bayesian formalism can be applied directly to the case of climate-model calibration, without any modification. Curiously, they argue that a direct application of the Bayesian formalism implies that successful calibration against known data can confirm a model. Their argument, however, is mistaken. Their discussion focuses on the case of comparative confirmation. The degree to which one hypothesis h1 is confirmed relative to another hypothesis h2 is given by the following ratio (where conditionalization on background beliefs is left implicit): p(h1|e)/p(h2|e) = p(e|h1)/p(e|h2) × p(h1)/p(h2). Steele and Werndl maintain that this ratio can change as a consequence of calibrating our models against known evidence, and hence that one model can be incrementally confirmed or disconfirmed relative to another model: “For the Bayesian, calibration is not really distinct from confirmation.” Yet they also, as is standard, assign known evidence probability one: “When new data are learnt, the relevant evidence proposition is effectively assigned a probability of one.” (Ibid.) But then, in the case of calibration against data e that were previously known, the likelihoods p(e|h1) and p(e|h2) are both equal to one, and hence p(h1|e)/p(h2|e) = p(h1)/p(h2). Thus, a direct application of Bayesian reasoning yields exactly the opposite of the conclusion Steele and Werndl want us to reach.
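    Displayed explicitly, once the known calibration data e are assigned probability one (so that p(e|h1) = p(e|h2) = 1), the comparative ratio collapses to the ratio of the priors:

    $$ \frac{p(h_1|e)}{p(h_2|e)} = \frac{p(e|h_1)}{p(e|h_2)} \times \frac{p(h_1)}{p(h_2)} = \frac{1}{1} \times \frac{p(h_1)}{p(h_2)} = \frac{p(h_1)}{p(h_2)}. $$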

  4. The discussion below closely follows my presentation in Frisch (2015).

References

  • Barnes, E. C. (2008). The paradox of Predictivism. Cambridge, New York: Cambridge University Press.

  • Barnes, E. C. (1999). The quantitative problem of old evidence. The British Journal for the Philosophy of Science, 50(2), 249–264.

  • Bellprat, O., Kotlarski, S., Lüthi, D., & Schär, C. (2012). Objective calibration of regional climate models. Journal of Geophysical Research: Atmospheres, 117(D23). https://doi.org/10.1029/2012JD018262.

  • Box, G. E. P. (1979). Robustness in the strategy of scientific model building. In R. L. Launer & G. N. Wilkinson (Eds.), Robustness in Statistics (pp. 201–36). Academic Press. https://www.sciencedirect.com/science/article/pii/B9780124381506500182.

  • Brush, S. G. (1994). Dynamics of theory change: The role of predictions. In PSA Proceedings of the Biennial Meeting of the Philosophy of Science Association 1994 (January) (pp. 133–45).

  • Baumberger, C., Knutti, R., & Hirsch Hadorn, G. (2017). Building confidence in climate model projections: An analysis of inferences from fit. Wiley Interdisciplinary Reviews: Climate Change, 8(3), e454. https://doi.org/10.1002/wcc.454.

  • Cartwright, N. (1983). How the laws of physics lie. Oxford University Press.

  • Douglas, H., & Magnus, P. D. (2013). State of the field: Why novel prediction matters. Studies in History and Philosophy of Science Part A, 44(4), 580–589.

  • Eells, E., & Fitelson, B. (2000). Comments and criticism: Measuring confirmation and evidence. Journal of Philosophy, 97(12), 663–72.

  • Eells, E., & Fitelson, B. (2002). Symmetries and asymmetries in evidential support. Philosophical Studies, 107(2), 129–42.

  • Frisch, M. (2015). Predictivism and old evidence: A critical look at climate model tuning. European Journal for Philosophy of Science, 5(2), 171–190. https://doi.org/10.1007/s13194-015-0110-4.

  • Garber, D. (1983). Old evidence and logical omniscience in Bayesian confirmation theory. http://conservancy.umn.edu/handle/11299/185350.

  • Gleckler, P. J., Taylor, K. E., & Doutriaux, C. (2008). Performance metrics for climate models. Journal of Geophysical Research: Atmospheres, 113(D6), D06104. https://doi.org/10.1029/2007JD008972.

  • Glymour, C. (2010). Why I Am Not a Bayesian. In Philosophy of Probability: Contemporary Readings. Routledge.

  • Glymour, C. N. (1980). Theory and evidence. Princeton, N.J.: Princeton University Press.

  • Golaz, J.-C., Horowitz, L. W., & Levy, H. (2013). Cloud tuning in a coupled climate model: impact on 20th century warming. Geophysical Research Letters, 40(10), 2246–2251. https://doi.org/10.1002/grl.50232.

  • Golaz, J.-C., Salzmann, M., Donner, L. J., Horowitz, L. W., Ming, Y., & Zhao, M. (2010). Sensitivity of the aerosol indirect effect to subgrid variability in the cloud parameterization of the GFDL atmosphere general circulation model AM3. Journal of Climate, 24(13), 3145–3160. https://doi.org/10.1175/2010JCLI3945.1.

  • Held, I. M. (2005). The gap between simulation and understanding in climate modeling. Bulletin of the American Meteorological Society, 86(11), 1609–1614. https://doi.org/10.1175/BAMS-86-11-1609.

  • Hourdin, F., Mauritsen, T., Gettelman, A., Golaz, J.-C., Balaji, V., Duan, Q., et al. (2016). The art and science of climate model tuning. Bulletin of the American Meteorological Society, 98(3), 589–602. https://doi.org/10.1175/BAMS-D-15-00135.1.

  • Howson, C. (1991). The ‘Old Evidence’ problem. The British Journal for the Philosophy of Science, 42(4), 547–555. https://doi.org/10.1093/bjps/42.4.547.

  • Howson, C., & Franklin, A. (1991). Maher, Mendeleev and Bayesianism. Philosophy of Science, 58(4), 574–585.

  • Intergovernmental Panel on Climate Change, ed. (2014). Evaluation of climate models. In Climate Change 2013—The Physical Science Basis (pp. 741–866). Cambridge: Cambridge University Press. http://ebooks.cambridge.org/ref/id/CBO9781107415324A028.

  • Intergovernmental Panel on Climate Change. (2015). Climate Change 2014: Mitigation of Climate Change: Working Group III Contribution to the IPCC Fifth Assessment Report. Cambridge University Press.

  • Kennedy, M. C., & O’Hagan, A. (2001). Bayesian calibration of computer models. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 63 (3), 425–64.

  • Knutti, R., Allen, M. R., Friedlingstein, P., Gregory, J. M., Hegerl, G. C., Meehl, G. A., et al. (2008). A review of uncertainties in global temperature projections over the twenty-first century. Journal of Climate, 21(11), 2651–2663. https://doi.org/10.1175/2007JCLI2119.1.

  • Maher, P. (1988). Prediction, accommodation, and the logic of discovery. In PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1988 (January) (pp. 273–85).

  • Masson, D., & Knutti, R. (2012). Predictor screening, calibration, and observational constraints in climate model ensembles: An illustration using climate sensitivity. Journal of Climate, 26(3), 887–898. https://doi.org/10.1175/JCLI-D-11-00540.1.

  • Mauritsen, T., Stevens, B., Roeckner, E., Crueger, T., Esch, M., Giorgetta, M., Haak, H. et al. (2012). Tuning the climate of a global model. Journal of Advances in Modeling Earth Systems, 4(3), M00A01. https://doi.org/10.1029/2012MS000154.

  • Oberkampf, W. L., Trucano, T. G., & Hirsch, C. (2004). Verification, validation, and predictive capability in computational engineering and physics. Applied Mechanics Reviews, 57(5), 345–384. https://doi.org/10.1115/1.1767847.

  • Oberkampf, W. L., & Barone, M. F. (2006). Measures of agreement between computation and experiment: Validation metrics. Journal of Computational Physics, Uncertainty Quantification in Simulation Science, 217(1), 5–36. https://doi.org/10.1016/j.jcp.2006.03.037.

  • Parker, W. S. (2009). Confirmation and adequacy-for-purpose in climate modelling. Aristotelian Society Supplementary Volume, 83(1), 233–249.

  • Parker, W. S. (2010). Predicting weather and climate: Uncertainty, ensembles and probability. Studies in History and Philosophy of Science Part B, 41(3), 263–272.

  • Parker, W. S. (2013). Computer simulation. In S. Psillos & M. Curd (Eds.), The Routledge Companion to Philosophy of Science, 2nd Edition. Routledge.

  • Sprenger, J. (2015). A novel solution to the problem of old evidence. Philosophy of Science, 82(3), 383–401. https://doi.org/10.1086/681767.

  • Steele, K., & Werndl, C. (2016). Model-selection theory: The need for a more nuanced picture of use-novelty and double-counting. The British Journal for the Philosophy of Science. https://doi.org/10.1093/bjps/axw024.

  • Worrall, J. (1980). The methodology of scientific research programmes: Philosophical papers (Vol. 1). Cambridge: Cambridge University Press.

  • Worrall, J. (2014). Prediction and accommodation revisited. Studies in History and Philosophy of Science Part A, 45(March), 54–61. https://doi.org/10.1016/j.shpsa.2013.10.001.

Author information

Corresponding author

Correspondence to Mathias Frisch.

Appendix

We want to show that

$$ p(f|e.t) < p(f|e.\neg t). \qquad (\mathrm{C}) $$

Proof:

$$ \begin{aligned} p(f|e.t) &= p(e|f.t)\,p(f|t)/p(e|t) && \text{Bayes's Theorem} \\ &= p(f|t) && \text{premise (1)} \\ &= p(f) = 1 - p(\neg f) && \text{premise (2)} \end{aligned} $$

On the other hand:

$$ \begin{aligned} p(f|e.\neg t) &= 1 - p(\neg f|e.\neg t) = 1 - p(e|\neg f.\neg t)\,p(\neg f|\neg t)/p(e|\neg t) \\ &= 1 - p(e|\neg f.\neg t)\,p(\neg f)/p(e|\neg t) \end{aligned} $$

Thus, (C) is equivalent to:

$$ 1 - p(\neg f) < 1 - p(e|\neg f.\neg t)\,p(\neg f)/p(e|\neg t) $$

or:

$$ p(e|\neg t) > p(e|\neg f.\neg t) \qquad (\mathrm{C}') $$

But (C’) can be shown to follow from premise (3) as follows:

$$ \begin{aligned} p(e|\neg t) &= p(f)\,p(e|f.\neg t) + p(\neg f)\,p(e|\neg f.\neg t) \\ &> p(f)\,p(e|\neg f.\neg t) + p(\neg f)\,p(e|\neg f.\neg t) && \text{premise (3)} \\ &= p(e|\neg f.\neg t). \end{aligned} $$
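As a numerical sanity check, one can pick illustrative values satisfying the conditions the proof appeals to in premises (1)–(3) (the specific numbers are assumptions chosen purely for illustration, not values from the chapter): let f and t be independent, p(f) = 0.4, p(e|f.t) = p(e|¬f.t) = 0.7, p(e|f.¬t) = 0.9, and p(e|¬f.¬t) = 0.3. Then

$$ p(f|e.t) = p(f) = 0.4 \; < \; \frac{0.4 \times 0.9}{0.4 \times 0.9 + 0.6 \times 0.3} = \frac{0.36}{0.54} \approx 0.67 = p(f|e.\neg t), $$

so inequality (C) holds for these values, as the proof guarantees.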

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Frisch, M. (2019). Calibration, Validation, and Confirmation. In: Beisbart, C., Saam, N. (eds) Computer Simulation Validation. Simulation Foundations, Methods and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-70766-2_41

  • DOI: https://doi.org/10.1007/978-3-319-70766-2_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70765-5

  • Online ISBN: 978-3-319-70766-2

  • eBook Packages: Computer Science, Computer Science (R0)
