AI Deception: A Survey of Examples, Risks, and Potential Solutions

Abstract

This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI systems. Finally, we outline several potential solutions to the problems posed by AI deception: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.

Similar books and articles

Self-deception: A Reflexive Dilemma. T. S. Champlin - 1977 - Philosophy 52 (201):281-299.
Self-deception. Eric Funkhouser - 2019 - New York, NY: Routledge.
Self-deception vs. self-caused deception: A comment on professor Mele. Robert Audi - 1997 - Behavioral and Brain Sciences 20 (1):104-104.
The philosophy of deception. Clancy W. Martin (ed.) - 2009 - New York: Oxford University Press.
Understanding and explaining real self-deception. Alfred R. Mele - 1997 - Behavioral and Brain Sciences 20 (1):127-134.
A Case of Insincerity: What Does it Mean to Deceive Someone? Kevin Kinghorn - 2012 - In Philip Tallon & David Baggett (eds.), The Philosophy of Sherlock Holmes. University Press of Kentucky. pp. 37-48.
The uses of self-deception. Howard Rachlin & Marvin Frankel - 1997 - Behavioral and Brain Sciences 20 (1):124-125.
Conceptual Art, Social Psychology, And Deception. Peter Goldie - 2004 - Postgraduate Journal of Aesthetics 1 (1):32-41.
Belief, Deception, and Self-Deception. Rick Alan Fairbanks - 1990 - Dissertation, University of Minnesota.
Instantaneous self-deception. Maiya Jordan - 2022 - Inquiry: An Interdisciplinary Journal of Philosophy 65 (2):176-201.

Analytics

Added to PP
2023-09-19

Author Profiles

Simon Goldstein
University of Hong Kong
