Calibrating machine behavior: a challenge for AI alignment

Ethics and Information Technology 25 (3):1-8 (2023)

Abstract

When discussing AI alignment, we usually refer to the problem of teaching or training advanced autonomous AI systems to make decisions that are aligned with human values or preferences. Proponents of this approach believe it can serve as a means of retaining control over sophisticated intelligent systems, thereby avoiding certain existential risks. We identify three general obstacles on the path to implementing value alignment: a technological/technical obstacle, a normative obstacle, and a calibration problem. Presupposing, for the purposes of this discussion, that the technical and normative problems are solved, we focus on how to calibrate a system, for a specific value, so that it sits at a specific location on a spectrum stretching between righteous and normal or average human behavior. Calibration, or more specifically mis-calibration, also raises the issue of trustworthiness: if we cannot trust AI systems to perform tasks as we intended, we will not use them on our roads or in our homes. In an era in which we strive to build autonomous machines endowed with common sense, reasoning abilities, and a connection to the world, so that they can act in alignment with human values, such mis-calibrations can make the difference between trustworthy and untrustworthy systems.
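To make the calibration idea more concrete, the following is a minimal illustrative sketch, not drawn from the paper itself. It assumes, purely for illustration, that a single value can be scored on a one-dimensional scale; the names righteous_score, average_score, calibrate_target, and is_miscalibrated are hypothetical and not the authors' terminology.

```python
# Illustrative sketch only: assumes one value (say, honesty) can be scored on a
# single numeric scale, which the paper does not claim. All names are hypothetical.

def calibrate_target(righteous_score: float, average_score: float, alpha: float) -> float:
    """Place the intended behavior on the spectrum between average human
    behavior (alpha = 0.0) and righteous behavior (alpha = 1.0)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie within the spectrum [0, 1]")
    return average_score + alpha * (righteous_score - average_score)

def is_miscalibrated(observed_score: float, target_score: float, tolerance: float) -> bool:
    """A system whose observed behavior drifts beyond a tolerance band around
    the intended target counts, on this sketch, as mis-calibrated and hence
    as a candidate for distrust."""
    return abs(observed_score - target_score) > tolerance

# Example: aim two thirds of the way from average toward righteous behavior.
target = calibrate_target(righteous_score=1.0, average_score=0.4, alpha=2/3)
print(target)                               # 0.8
print(is_miscalibrated(0.95, target, 0.1))  # True: drifted too far from the target
```

The point of the sketch is only that "where on the spectrum" and "how much drift we tolerate" are separate choices, and that getting either wrong is what the abstract calls mis-calibration.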

Links

PhilArchive





Similar books and articles

Instructions for Authors. [author unknown] - 2001 - Ethics and Information Technology 3 (4):303-306.
Instructions for Authors. [author unknown] - 2001 - Ethics and Information Technology 3 (2):151-154.
Instructions for Authors. [author unknown] - 2002 - Ethics and Information Technology 4 (1):93-96.
Instructions for Authors. [author unknown] - 2003 - Ethics and Information Technology 5 (4):239-242.
Instructions for Authors. [author unknown] - 1999 - Ethics and Information Technology 1 (1):87-90.
Instructions for Authors. [author unknown] - 2000 - Ethics and Information Technology 2 (4):257-260.
Editorial. [author unknown] - 2005 - Ethics and Information Technology 7 (2):49-49.
Governing (ir)responsibilities for future military AI systems. Liselotte Polderman - 2023 - Ethics and Information Technology 25 (1):1-4.
The ethics of hacking. Ross W. Bellaby. Cécile Fabre - 2023 - Ethics and Information Technology 25 (3):1-4.
The Ethics of AI in Human Resources. Evgeni Aizenberg & Matthew J. Dennis - 2022 - Ethics and Information Technology 24 (3):1-3.
Correction to: The Ethics of AI in Human Resources. Evgeni Aizenberg & Matthew J. Dennis - 2023 - Ethics and Information Technology 25 (1):1-1.

Analytics

Added to PP: 2023-09-20

Downloads: 32 (#502,127)

Downloads (last 6 months): 23 (#120,782)


Citations of this work

No citations found.


References found in this work

Artificial Intelligence, Values, and Alignment. Iason Gabriel - 2020 - Minds and Machines 30 (3):411-437.
Just consequentialism and computing. James H. Moor - 1999 - Ethics and Information Technology 1 (1):61-65.
The missing G. Erez Firt - 2020 - AI and Society 35 (4):995-1007.
