Will AI avoid exploitation? Artificial general intelligence and expected utility theory

Philosophical Studies 1-20 (forthcoming)

Abstract

A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they're to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expect advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure is instructive: in exploring the argument, we gain insight into how to model advanced AI systems.
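
To make the exploitation idea concrete: the classic form of this argument is a money pump, in which an agent with cyclic preferences pays a small fee for each of a sequence of trades that leaves it holding exactly what it started with. The sketch below is illustrative only and not from the paper; the goods A, B, C, the PREFERS relation, and the fee are all assumptions made for the example.

```python
# Minimal money-pump sketch (illustrative assumptions, not the paper's
# formalism). An agent with cyclic preferences (A > B, B > C, C > A)
# pays a fee for each "upgrade" and ends up back where it started,
# strictly poorer.

# Hypothetical cyclic preference relation: (x, y) means x is strictly
# preferred to y.
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def accepts_trade(current: str, offered: str) -> bool:
    """The agent trades whenever it strictly prefers the offered good."""
    return (offered, current) in PREFERS

def money_pump(start: str, offers: list[str], fee: float) -> float:
    """Run a sequence of paid trades; return the total fees extracted."""
    holding, extracted = start, 0.0
    for offered in offers:
        if accepts_trade(holding, offered):
            holding = offered
            extracted += fee
    return extracted

if __name__ == "__main__":
    # Starting with C, the agent pays to upgrade to B, then A, then C
    # again: it ends holding what it began with, minus three fees.
    total = money_pump("C", ["B", "A", "C"], fee=1.0)
    print(f"Fees extracted from the cyclic agent: {total}")  # 3.0
```

An agent whose choices maximise expected utility ranks options by a single numerical score, so its preferences cannot cycle and this extraction strategy never gets started; that is the intuition the paper's target argument trades on.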

Links

PhilArchive

Author's Profile

Adam Bales
University of Oxford

Citations of this work

Non-Ideal Decision Theory. Sven Neth - 2023 - Dissertation, University of California, Berkeley
