Safety Engineering for Artificial General Intelligence


Abstract

Machine ethics and robot rights are quickly becoming hot topics in artificial intelligence and robotics communities. We will argue that attempts to attribute moral agency and assign rights to all intelligent machines are misguided, whether applied to infrahuman or superhuman AIs, as are proposals to limit the negative effects of AIs by constraining their behavior. As an alternative, we propose a new science of safety engineering for intelligent artificial agents based on maximizing for what humans value. In particular, we challenge the scientific community to develop intelligent systems that have human-friendly values that they provably retain, even under recursive self-improvement.

Notes

  1. The term AGI can also refer more narrowly to engineered AI, in contrast to those derived from the human model, such as emulated or uploaded brains (Goertzel and Pennachin 2007). In this article, unless specified otherwise, we use AI and AGI to refer to artificial general intelligences in the broader sense.

  2. The term “artimetrics” was coined (Yampolskiy and Govindaraju 2008) on the basis of “artilect,” which is Hugo de Garis’s (2005) neologism for “artificial intellect.”

References

  • Allen C, Varner G, Zinser J (2000) Prolegomena to any future artificial moral agent. J Exp Theor Artif Intell 12:251–261

  • Allen C, Smit I, Wallach W (2005) Artificial morality: top-down, bottom-up, and hybrid approaches. Ethics Inf Technol 7(3):149–155

  • Allen C, Wallach W, Smit I (2006) Why machine ethics? IEEE Intell Syst 21(4):12–17

  • Anderson M, Anderson SL (2007) Machine ethics: creating an ethical intelligent agent. AI Mag 28(4):15–26

  • Arneson RJ (1999) What, if anything, renders all humans morally equal? In: Jamieson D (ed) Peter singer and his critics. Blackwell, Oxford

  • Asimov I (1942) Runaround. In: Astounding science fiction, March, pp 94–103

  • Berg P, Baltimore D, Brenner S, Roblin RO, Singer MF (1975) Summary statement of the Asilomar conference on recombinant DNA molecules. Proc Natl Acad Sci USA 72(6):1981–1984

  • Bishop M (2009) Why computers can’t feel pain. Mind Mach 19(4):507–516

  • Bostrom N (2002) Existential risks: analyzing human extinction scenarios and related hazards. J Evol Technol 9(1)

  • Bostrom N (2006) How long before superintelligence? Linguist Philos Investig 5(1):11–30

  • Butler S (1863) Darwin among the machines, letter to the Editor. The Press, Christchurch, New Zealand, 13 June 1863

  • Butler S (1970/1872) Erewhon: or, over the range. Penguin, London

  • Chalmers DJ (2010) The singularity: a philosophical analysis. J Conscious Stud 17:7–65

  • Churchland PS (2011) Brain trust. Princeton University Press, Princeton

  • Clarke R (1993) Asimov’s laws of robotics: implications for information technology, part 1. IEEE Comput 26(12):53–61

  • Clarke R (1994) Asimov’s laws of robotics: implications for information technology, part 2. IEEE Comput 27(1):57–66

  • de Garis H (2005) The artilect war: Cosmists versus Terrans. ETC Publications, Palm Springs

  • Dennett DC (1978) Why you can’t make a computer that feels pain. Synthese 38(3):415–456

  • Drescher G (2006) Good and real: demystifying paradoxes from physics to ethics. MIT Press, Cambridge

  • Drexler E (1986) Engines of creation. Anchor Press, New York

  • Fox J (2011) Morality and super-optimizers. Paper presented at the Future of Humanity Conference, 24 Oct 2011, Van Leer Institute, Jerusalem

  • Fox J, Shulman C (2010) Superintelligence does not imply benevolence. In: Mainzer K (ed) Proceedings of the VIII European conference on computing and philosophy. Verlag Dr. Hut, Munich

  • Gauthier D (1986) Morals by agreement. Oxford University Press, Oxford

  • Gavrilova M, Yampolskiy R (2011) Applying biometric principles to avatar recognition. Trans Comput Sci XII:140–158

  • Goertzel B (2011) Does humanity need an AI nanny? H+ Magazine, 17 Aug 2011

  • Goertzel B, Pennachin C (eds) (2007) Essentials of general intelligence: the direct path to artificial general intelligence. Springer, Berlin

  • Good IJ (1965) Speculations concerning the first ultraintelligent machine. Adv Comput 6:31–88

  • Gordon DF (1998) Well-behaved Borgs, bolos, and berserkers. Paper presented at the 15th International Conference on Machine Learning (ICML98), San Francisco, CA

  • Gordon-Spears DF (2003) Asimov’s laws: current progress. Lect Notes Comput Sci 2699:257–259

  • Gordon-Spears DF (2005) Assuring the behavior of adaptive agents. In: Hinchey M, Rash J, Truszkowski W, Gordon-Spears DF, Rouff C (eds) Agent technology from a formal perspective. Kluwer, Amsterdam, pp 227–259

  • Grau C (2006) There is no “I” in “Robot”: robots and utilitarianism. IEEE Intell Syst 21(4):52–55

  • Guo S, Zhang G (2009) Robot rights. Science 323(5916):876

  • Hall JS (2007a) Beyond AI: creating the conscience of the machine. Prometheus, Amherst

  • Hall JS (2007b) Self-improving AI: an analysis. Mind Mach 17(3):249–259

  • Hanson R (2010) Prefer law to values. Overcoming Bias, 10 Oct 2010. Retrieved 15 Jan 2012, from http://www.overcomingbias.com/2009/10/prefer-law-to-values.html

  • Hobbes T (1998/1651) Leviathan. Oxford University Press, Oxford

  • Hutter M (2005) Universal artificial intelligence: sequential decisions based on algorithmic probability. Springer, Berlin

  • Joy B (2000) Why the future doesn’t need us. Wired Magazine 8.04, April 2000

  • Kaczynski T (1995) Industrial society and its future. The New York Times, 19 Sep 1995

  • Kurzweil R (2006) The singularity is near: when humans transcend biology. Penguin, New York

  • LaChat MR (1986) Artificial intelligence and ethics: an exercise in the moral imagination. AI Mag 7(2):70–79

  • Legg S (2006) Unprovability of Friendly AI. Vetta Project, 15 Sep 2006. Retrieved Jan. 15, 2012, from http://www.vetta.org/2006/09/unprovability-of-friendly-ai/

  • Legg S, Hutter M (2007) Universal intelligence: a definition of machine intelligence. Mind Mach 17(4):391–444

  • Lin P, Abney K, Bekey G (2011) Robot ethics: mapping the issues for a mechanized world. Artif Intell 175(5–6):942–949

  • McCauley L (2007) AI Armageddon and the three laws of robotics. Ethics Inf Technol 9(2):153–164

  • McDermott D (2008) Why ethics is a high hurdle for AI. Paper presented at the North American Conference on Computers and Philosophy, Bloomington, IN

  • Moor JH (2006) The nature, importance, and difficulty of machine ethics. IEEE Intell Syst 21(4):18–21

  • Omohundro SM (2008) The basic AI drives. In: Wang P, Goertzel B, Franklin S (eds) The proceedings of the first AGI conference. IOS Press, Amsterdam, pp 483–492

  • Pierce MA, Henry JW (1996) Computer ethics: the role of personal, informal, and formal codes. J Bus Ethics 14(4):425–437

  • Powers TM (2006) Prospects for a Kantian machine. IEEE Intell Syst 21(4):46–51

  • Pynadath DV, Tambe M (2001) Revisiting Asimov’s first law: a response to the call to arms. Paper presented at the Intelligent Agents VIII. International Workshop on Agents, Theories, Architectures and Languages (ATAL’01)

  • Rappaport ZH (2006) Robotics and artificial intelligence: Jewish ethical perspectives. Acta Neurochir Suppl 98:9–12

  • Roth D (2009) Do humanlike machines deserve human rights? Wired 17, 19 Jan 2009

  • Ruvinsky AI (2007) Computational ethics. In: Quigley M (ed) Encyclopedia of information ethics and security. IGI Global, Hershey, p 76

  • Salamon A, Rayhawk S, Kramár J (2010) How intelligible is intelligence? In: Mainzer K (ed) Proceedings of the VIII European conference on computing and philosophy. Verlag Dr. Hut, Munich

  • Sawyer RJ (2007) Robot ethics. Science 318(5853):1037

  • Sharkey N (2008) The ethical frontiers of robotics. Science 322(5909):1800–1801

  • Sotala K (2010) From mostly harmless to civilization-threatening: pathways to dangerous artificial general intelligences. In: Mainzer K (ed) Proceedings of the VIII European conference on computing and philosophy. Verlag Dr. Hut, Munich

  • Sotala K (2012) Advantages of artificial intelligences, uploads, and digital minds. Int J Mach Conscious 4:275–291

  • Sparrow R (2007) Killer robots. J Appl Philos 24(1):62–77

  • Tonkens R (2009) A challenge for machine ethics. Mind Mach 19(3):421–438

  • Tooby J, Cosmides L (1992) The psychological foundations of culture. In: Barkow J, Tooby J, Cosmides L (eds) The adapted mind: evolutionary psychology and the generation of culture. Oxford University Press, Oxford, pp 19–136

  • Vassar M (2005) AI boxing (dogs and helicopters), 2 Aug 2005. Retrieved 18 Jan 2012, from http://sl4.org/archive/0508/11817.html

  • Veruggio G (2010) Roboethics. IEEE Robot Autom Mag 17(2):105–109

  • von Ahn L, Blum M, Hopper N, Langford J (2003) CAPTCHA: using hard AI problems for security. In: Biham E (ed) Advances in cryptology—EUROCRYPT 2003. Lecture notes in computer science, vol 2656. Springer, Berlin, pp 293–311

  • Wallach W, Allen C (2006) EthicALife: a new field of inquiry. Paper presented at the AnAlifeX workshop, USA

  • Wallach W, Allen C (2008) Moral machines: teaching robots right from wrong. Oxford University Press, Oxford

  • Warwick K (2003) Cyborg morals, cyborg values, cyborg ethics. Ethics Inf Technol 5:131–137

  • Weld DS, Etzioni O (1994) The first law of robotics (a call to arms). Paper presented at the Twelfth National Conference on Artificial Intelligence (AAAI)

  • Wright R (2001) Nonzero: the logic of human destiny. Vintage, New York

  • Yampolskiy RV (2011a) AI-complete CAPTCHAs as zero knowledge proofs of access to an artificially intelligent system. ISRN Artificial Intelligence, 271878

  • Yampolskiy RV (2011b) Artificial intelligence safety engineering: why machine ethics is a wrong approach. Paper presented at the Philosophy and Theory of Artificial Intelligence (PT-AI2011), 3–4 Oct, Thessaloniki, Greece

  • Yampolskiy RV (2011c) What to do with the singularity paradox? Paper presented at the Philosophy and Theory of Artificial Intelligence (PT-AI2011), 3–4 Oct, Thessaloniki, Greece

  • Yampolskiy RV (2012a) Leakproofing singularity: the artificial intelligence confinement problem. J Conscious Stud 19(1–2):194–214

  • Yampolskiy RV (2012b) Turing test as a defining feature of AI-completeness. In: Yang X-S (ed) Artificial intelligence, evolutionary computation and metaheuristics—in the footsteps of Alan Turing. Springer, Berlin

  • Yampolskiy RV, Fox J (2012) Artificial intelligence and the human mental model. In: Eden A, Moor J, Soraker J, Steinhart E (eds) The singularity hypothesis: a scientific and philosophical assessment. Springer, Berlin (in press)

  • Yampolskiy R, Gavrilova M (2012) Artimetrics: biometrics for artificial entities. IEEE Robot Autom Mag (in press)

  • Yampolskiy RV, Govindaraju V (2008) Behavioral biometrics for verification and recognition of malicious software agents. In: Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security and Homeland Defense VII, SPIE Defense and Security Symposium, Orlando, FL, 16–20 Mar

  • Yudkowsky E (2002) The AI-box experiment. Retrieved 15 Jan 2012, from http://yudkowsky.net/singularity/aibox

  • Yudkowsky E (2007) The logical fallacy of generalization from fictional evidence. Less Wrong. Retrieved 20 Feb 2012, from http://lesswrong.com/lw/k9/the_logical_fallacy_of_generalization_from/

  • Yudkowsky E (2008) Artificial intelligence as a positive and negative factor in global risk. In: Bostrom N, Ćirković MM (eds) Global catastrophic risks. Oxford University Press, Oxford, pp 308–345

  • Yudkowsky E (2010) Timeless decision theory. Retrieved 15 Jan 2012, from http://singinst.org/upload/TDT-v01o.pdf

  • Yudkowsky E (2011a) Complex value systems are required to realize valuable futures. In: Schmidhuber J, Thórisson KR, Looks M (eds) Artificial general intelligence: 4th international conference, AGI 2011, mountain view, CA, USA, August 3–6, 2011, proceedings. Springer, Berlin, pp 388–393

  • Yudkowsky E (2011b) Open problems in friendly artificial intelligence. Paper presented at the Singularity Summit, New York

  • Yudkowsky E, Bostrom N (2011) The ethics of artificial intelligence. In: Ramsey W, Frankish K (eds) Cambridge handbook of artificial intelligence. Cambridge University Press, Cambridge

Acknowledgments

This article is an expanded version of the conference paper “Artificial intelligence safety engineering: why machine ethics is a wrong approach” (Yampolskiy 2011b). We would like to thank Brian Rabkin and Michael Anissimov for their comments.

Corresponding author

Correspondence to Roman Yampolskiy.

Cite this article

Yampolskiy, R., Fox, J. Safety Engineering for Artificial General Intelligence. Topoi 32, 217–226 (2013). https://doi.org/10.1007/s11245-012-9128-9
