The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

Philosophical Studies (forthcoming)

Abstract

I explain the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems show that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. And patience trades off against shutdownability: the more patient an agent, the greater the costs that agent is willing to incur to manipulate the shutdown button. I end by noting that these theorems can guide our search for solutions.

Analytics

Added to PP
2023-10-23

Downloads
655 (#26,432)

6 months
473 (#3,397)

Author's Profile

Elliott Thornley
University of Oxford
