The Shutdown Problem: Incomplete Preferences as a Solution

Abstract

I explain and motivate the shutdown problem: the problem of creating artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I then propose a solution: train agents to have incomplete preferences. Specifically, I propose that we train agents to lack a preference between every pair of different-length trajectories. I suggest a way to train such agents using reinforcement learning: we give the agent lower reward for repeatedly choosing same-length trajectories.
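The training idea in the last sentence can be illustrated with a minimal sketch. The abstract does not say how the reward is lowered, so the geometric discount used here is an assumption, as are the names `discounted_reward`, `counts`, and `discount`. The sketch only shows the general shape: the more often a trajectory length has already been chosen, the less reward choosing it again earns.

```python
from collections import Counter


def discounted_reward(base_reward, length, counts, discount=0.9):
    """Return base_reward scaled down by how often `length` was chosen before.

    Assumption: the penalty is geometric, discount ** (number of earlier
    choices of this trajectory length). The abstract does not fix this form.
    """
    reward = base_reward * discount ** counts[length]
    counts[length] += 1  # record this choice for later episodes
    return reward


# Hypothetical sequence of choices: the same length twice, then a different one.
counts = Counter()
rewards = [discounted_reward(1.0, length, counts)
           for length in ["short", "short", "long"]]
```

Under this assumed scheme, the second choice of the same trajectory length earns less than the first, while the first choice of a different length earns full reward. An agent maximising total reward is therefore pushed to vary which trajectory length it chooses.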

Analytics

Added to PP
2024-03-05

Downloads
109 (#160,094)

6 months
109 (#37,982)


Author's Profile

Elliott Thornley
University of Oxford
