Artificial Intelligence, Values, and Alignment

Minds and Machines 30 (3):411-437 (2020)

Abstract

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.

Similar books and articles

The value alignment problem: a geometric approach. Martin Peterson - 2019 - Ethics and Information Technology 21 (1):19-28.
Robustness to Fundamental Uncertainty in AGI Alignment. G. G. Worley III - 2020 - Journal of Consciousness Studies 27 (1-2):225-241.
Beyond linguistic alignment. Allan Mazur - 2004 - Behavioral and Brain Sciences 27 (2):205-206.
Alignment and commitment in joint action. Matthew Rachar - 2018 - Philosophical Psychology 31 (6):831-849.
Interactive alignment: Priming or memory retrieval? Michael Kaschak & Arthur Glenberg - 2004 - Behavioral and Brain Sciences 27 (2):201-202.
Machines learning values. Steve Petersen - 2020 - In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press.
The emergence of active/stative alignment in Otomi. Enrique L. Palancar - 2008 - In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment. Oxford University Press.

Analytics

Added to PP
2020-10-02

