Extinction Risks from AI: Invisible to Science?


To inform the discussion of existential risks from AI, we formulate Extinction-level Goodhart’s Law as “Virtually any goal specification, pursued to the extreme, will result in the extinction of humanity”, and we aim to understand which formal models are suitable for investigating this hypothesis. We remain agnostic as to whether Extinction-level Goodhart’s Law holds. As our key contribution, we identify a set of conditions that a model must satisfy to be informative for evaluating specific arguments for Extinction-level Goodhart’s Law. Since each of these conditions seems to add significantly to the complexity of the resulting model, formally evaluating the hypothesis might be exceedingly difficult. This raises the possibility that, whether or not the risk of extinction from artificial intelligence is real, the underlying dynamics might be invisible to current scientific methods.



Author's Profile

Vojtech Kovarik
Carnegie Mellon University
