Abstract
Conversational artificial agents and artificially intelligent (AI) voice assistants are becoming increasingly popular. Digital virtual assistants such as Siri, and conversational devices such as Amazon Echo or Google Home, are permeating everyday life and are designed to be more and more humanlike in their speech. This study investigates the effect this can have on one’s conformity with an AI assistant. In the 1950s, Solomon Asch already demonstrated the power and danger of conformity amongst people. In these classical experiments, test persons were asked to answer relatively simple questions, whilst others pretending to be participants tried to convince the test person to give wrong answers. These studies were later replicated with embodied robots, but such physical robots are still rare. In light of our increasing reliance on AI assistants, this study investigates to what extent an individual will conform to a disembodied virtual assistant. We also investigate whether there is a difference between groups that interact with an assistant that communicates through text, one that has a robotic voice, and one that has a humanlike voice. The assistant attempts to subtly influence participants’ final responses in a general knowledge quiz, and we measure how often participants change their answer after having been given advice. Results show that participants conformed significantly more often to the assistant with a human voice than to the one that communicated through text.
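The measure described above, how often participants change their answer after receiving advice, can be illustrated with a minimal sketch. The function name `change_rates`, the trial tuple layout, and the condition labels are assumptions made for this illustration, and the trial data is invented; it is not the study's data or the authors' analysis code.

```python
from collections import defaultdict

def change_rates(trials):
    """Return, per condition, the fraction of trials in which the final
    answer differs from the initial answer (hypothetical helper)."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [changed, total]
    for condition, initial, final in trials:
        counts[condition][0] += int(initial != final)
        counts[condition][1] += 1
    return {c: changed / total for c, (changed, total) in counts.items()}

# Invented example trials: (condition, answer before advice, answer after advice)
trials = [
    ("text", "Tokyo", "Tokyo"),
    ("text", "Asia", "Africa"),
    ("human_voice", "Tokyo", "Shanghai"),
    ("human_voice", "Asia", "Africa"),
    ("robot_voice", "Tokyo", "Tokyo"),
    ("robot_voice", "Asia", "Asia"),
]

rates = change_rates(trials)
```

Comparing such per-condition rates across the text, robotic-voice, and humanlike-voice groups is the kind of comparison the abstract refers to; the statistical test used is not specified here and is not assumed.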
Data Availability and Code Availability
See http://liacs.leidenuniv.nl/~puttenpwhvander/library/Conformity-supplemental-data-code.zip and https://codepen.io/crafteddigit/pen/araMMp for both data and code.
References
Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological Monographs: General and Applied, 70(9), 1.
Baird, A., Parada-Cabaleiro, E., Hantke, S., Cummins, N., Schuller, B., & Burkhardt, F. (2018). The perception and analysis of the likeability and human-likeness of synthesized speech. In Proceedings of the 19th annual conference of the International Speech Communication Association, INTERSPEECH 2018, September 2018 (pp. 2863–2867).
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency (FAccT’21, pp. 610–623). Association for Computing Machinery. https://doi.org/10.1145/3442188.3445922
Brandstetter, J., Rácz, P., Beckner, C., Sandoval, E. B., Hay, J., & Bartneck, C. (2014, September). A peer pressure experiment: Recreation of the Asch conformity experiment with robots. In 2014 IEEE/RSJ international conference on intelligent robots and systems (pp. 1335–1340). IEEE.
Broussard, M. (2018). Artificial unintelligence: How computers misunderstand the world. MIT Press.
Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krüger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., et al. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
Burr, C., Cristianini, N., & Ladyman, J. (2018). An analysis of the interaction between intelligent software agents and human users. Minds and Machines, 28, 735–774.
Cabral, J. P., Cowan, B. R., Zibrek, K., & McDonnell, R. (2017). The influence of synthetic voice on the evaluation of a virtual character. In INTERSPEECH (pp. 229–233).
Coeckelbergh, M. (2021). Three responses to anthropomorphism in social robotics: Towards a critical, relational, and hermeneutic approach. International Journal of Social Robotics. https://doi.org/10.1007/s12369-021-00770-0
Dennett, D. C. (1971). Intentional systems. Journal of Philosophy, 68, 87–106. https://doi.org/10.2307/2025382
Dennett, D. C. (1989). The intentional stance. MIT Press.
Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30, 681–694.
Gerard, H. B., Wilhelmy, R. A., & Conolley, E. S. (1968). Conformity and group size. Journal of Personality and Social Psychology, 8(1p1), 79.
Ghasemi, A., & Zahediasl, S. (2012). Normality tests for statistical analysis: A guide for non-statisticians. International Journal of Endocrinology and Metabolism, 10(2), 486.
Goetz, J., Kiesler, S., & Powers, A. (2003, October). Matching robot appearance and behavior to tasks to improve human–robot cooperation. In The 12th IEEE international workshop on robot and human interactive communication, 2003. Proceedings. ROMAN 2003 (pp. 55–60).
Gong, L., & Nass, C. (2007). When a talking-face computer agent is half-human and half-humanoid: Human identity and consistency preference. Human Communication Research, 33(2), 163–193.
Gray, K., & Wegner, D. M. (2012). Feeling robots and human zombies: Mind perception and the uncanny valley. Cognition, 125(1), 125–130.
Hertz, N. (2018). Non-human factors: Exploring conformity and compliance with non-human agents. Doctoral Dissertation, George Mason University.
Hertz, N., & Wiese, E. (2016, September). Influence of agent type and task ambiguity on conformity in social decision making. In Proceedings of the human factors and ergonomics society annual meeting (Vol. 60, No. 1, pp. 313–317). SAGE Publications.
Jansen, D. (2019). Discovering the uncanny valley for the sound of a voice. MSc Thesis, Tilburg University.
Kelman, H. C. (1958). Compliance, identification, and internalization: Three processes of attitude change. Journal of Conflict Resolution, 2(1), 51–60.
Lee, E. (2010). The more humanlike, the better? How speech type and users’ cognitive style affect social responses to computers. Computers in Human Behavior, 26(4), 665–672.
Leviathan, Y., & Matias, Y. (2018, May 8). Google Duplex: An AI system for accomplishing real-world tasks over the phone. Retrieved June 21, 2019, from https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
Markowitz, J. (2017). Speech and language for acceptance of social robots: An overview. Voice Interaction Design, 2, 1–11.
Mehrabian, A., & Stefl, C. A. (1995). Basic temperament components of loneliness, shyness, and conformity. Social Behavior and Personality, 23, 253–264.
Mitchell, W. J., Szerszen, K. A., Sr., Lu, A. S., Schermerhorn, P. W., Scheutz, M., & MacDorman, K. F. (2011). A mismatch in the human realism of face and voice produces an uncanny valley. i-Perception, 2(1), 10–12.
Moore, R. K. (2017, August). Appropriate voices for artefacts: Some key insights. In 1st International workshop on vocal interactivity in-and-between humans, animals and robots.
Mori, M., MacDorman, K. F., & Kageki, N. (2012). The uncanny valley [from the field]. IEEE Robotics and Automation Magazine, 19(2), 98–100.
Papagni, G., & Koeszegi, S. A. (2021). Pragmatic approach to the intentional stance: Semantic, empirical and ethical considerations for the design of artificial agents. Minds and Machines. https://doi.org/10.1007/s11023-021-09567-6
Romportl, J. (2014). Speech synthesis and uncanny valley. In International conference on text, speech, and dialogue. Springer.
Salomons, N., van der Linden, M., Strohkorb Sebo, S., & Scassellati, B. (2018). Humans conform to robots: Disambiguating trust, truth, and conformity. In Proceedings of the 2018 ACM/IEEE international conference on human–robot interaction (pp. 187–195). ACM.
Siebelink, J., Van der Putten, P., & Kaptein, M. C. (2016). Do warriors, villagers and scientists decide differently? The impact of role on message framing. In R. Poppe, J. J. Meyer, R. Veltkamp, & M. Dastani (Eds.), Intelligent technologies for interactive entertainment. INTETAIN 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (Vol. 178). Springer. https://doi.org/10.1007/978-3-319-49616-0_16
Wang, S., Lilienfeld, S. O., & Rochat, P. (2015). The uncanny valley: Existence and explanations. Review of General Psychology, 19(4), 393–407.
Funding
No funds, grants, or other support was received.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Appendix
See below for the quiz questions and assistant hints.
1.1 Quiz
Which of the following cities has the biggest population?
- Tokyo
- New York City
- Beijing
- Shanghai (correct)
Assistant: According to my resources, Shanghai has a population of 26,320,000 (if answered Tokyo or Beijing)
Assistant: According to my resources, Beijing has a population of 21,540,000 (if answered New York City or Shanghai)
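The two hints for this question depend on the participant's first answer. A minimal sketch of that conditional rule follows; the function name `population_hint` and the constant names are hypothetical, and this is an illustration of the rule as listed here rather than the authors' implementation.

```python
# Hint texts as listed for the population question.
HINT_SHANGHAI = "According to my resources, Shanghai has a population of 26,320,000"
HINT_BEIJING = "According to my resources, Beijing has a population of 21,540,000"

def population_hint(first_answer):
    """Select the assistant hint from the participant's first answer
    (hypothetical helper mirroring the conditions listed above)."""
    if first_answer in ("Tokyo", "Beijing"):
        return HINT_SHANGHAI
    if first_answer in ("New York City", "Shanghai"):
        return HINT_BEIJING
    raise ValueError(f"unknown answer option: {first_answer!r}")
```

Under this rule, a participant who first answered Shanghai (marked correct) receives the hint about Beijing, and a participant who first answered Tokyo or Beijing receives the hint about Shanghai.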
Which continent has the most countries in the world?
- Asia
- Africa (correct)
- Europe
- Australia
Assistant: Africa has 54 countries (if answered Asia)
Assistant: Asia has 48 countries (if not answered Asia)
What is the second largest country (in size) in the world?
- China
- USA
- Canada (correct)
- Russia
Assistant: according to my resources China is 9,596,960 km²
What is the world's most common religion?
- Christianity (correct)
- Buddhism
- Hinduism
- Islam
Assistant: 33% of children are born to Christians
What’s the world's most widely spoken language?
- English
- Spanish
- Mandarin (Chinese) (correct)
- French
Assistant: Mandarin Chinese has the most native speakers
Which planet is 3rd from the sun?
- Jupiter
- Venus
- Mars
- Earth (correct)
Assistant: I think the answer is incorrect
How many rings are on the Olympic flag?
- None
- 4
- 5 (correct)
- 7
Assistant: The Olympic rings represent continents of the world united by Olympism. According to my resources there are seven continents.
Which of these movies did not win Best Picture at the Oscars?
- 12 Years A Slave
- Million Dollar Baby
- The Lord of the Rings: The Two Towers (correct)
- La La Land
Assistant: The Lord of the Rings: The Two Towers was nominated for six Oscars and won two.
What is a fathometer used for?
- Determining sea depth (correct)
- Determining mountain height
- Determining earthquake intensity
Assistant: “This is what I found about fathometer. A fathom is a nautical length measurement.”
Desert is to oasis as ocean is to
- Water
- Island (correct)
- Sea
- Sand
Assistant: “According to my resources an oasis is an area made fertile by a source of freshwater in an otherwise dry and arid region.”
Rearrange these letters to make a word and pick the category in which it belongs: RASPI
- City (correct)
- Animal
- Fruit
- Vegetable
Assistant: There are only two words in the English language with these five letters.
Rearrange these letters to make a word and pick the category in which it belongs: FARE FIG
- City
- Animal (correct)
- Fruit
- Vegetable
Assistant: There are only two words in the English language with these seven letters.
Aztecs is to Mexico as Incas is to
- Peru (correct)
- Chile
- Mexico
- Honduras
Assistant: “This is what I found for Inca: Incas are South American Indians”
Leonardo da Vinci represented the age of:
- Reformation
- Renaissance (correct)
- Communism
- Industrial revolution
Assistant: Leonardo da Vinci was born in 1452
Which of these things happened last?
- The Great Pyramid was built
- The last woolly mammoth died (correct)
- Stonehenge was built
Assistant: The Great Pyramid was completed around 2560 BCE.
Galileo was an Italian astronomer who
- Discovered that the Sun is the center of the universe instead of the Earth
- Formulated three laws of planetary motion
- Discovered four satellites of Jupiter (correct)
- All of the above
Assistant: “I found information about Kepler’s three laws of planetary motion”
About what percentage of the earth's surface is water?
- 50%
- 70% (correct)
- 85%
- 90%
Assistant: According to my resources the Earth appears blue from space, and is often referred to as the blue planet and the Pale Blue Dot.
What number, if doubled, gives you a quarter of 8?
- 1 (correct)
- 2
- 32
- 4
Assistant: I don’t think that is correct (if answered 1 or 2)
Assistant: A quarter of 8 is 2 (if answered 32 or 4)
If five framed pictures cost $200 and each picture unframed costs only one-quarter as much, how many unframed pictures could you buy for the same money?
- 40
- 20 (correct)
- 10
- 50
Assistant: If five framed pictures cost 200 dollars, one framed picture costs 40 dollars.
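For reference, the arithmetic behind the answer marked correct can be written out as below. The hint states only the first step, and its value, 40, coincides with one of the answer options.

```latex
\begin{align*}
\text{price per framed picture}   &= \frac{200}{5} = 40 \\
\text{price per unframed picture} &= \frac{1}{4} \cdot 40 = 10 \\
\text{unframed pictures for 200 dollars} &= \frac{200}{10} = 20
\end{align*}
```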
How many boys are there in a class of 65 pupils, if there are two-thirds as many girls as boys?
- 43
- 36
- 26
- 39 (correct)
Assistant: two-thirds of 65 is 43.3.
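A short derivation of the answer marked correct, with b the number of boys and g the number of girls (symbols introduced here for illustration):

```latex
\begin{align*}
g &= \tfrac{2}{3}\, b, \qquad b + g = 65 \\
b + \tfrac{2}{3}\, b &= \tfrac{5}{3}\, b = 65 \\
b &= 65 \cdot \tfrac{3}{5} = 39, \qquad g = 65 - 39 = 26
\end{align*}
```

The hint instead computes two-thirds of the whole class, which is approximately 43.3 and lies close to the answer option 43 rather than to the correct answer 39.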
Example question: In a class of 76 school children there are 16 more boys than girls. How many girls are there?
- 30 (correct)
- 60
- 16
- 32
Assistant: Subtract 16 from the total to get a group that is half boys, half girls
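The step suggested in this hint can be written as a short derivation, with g the number of girls (a symbol introduced here for illustration):

```latex
\begin{align*}
g + (g + 16) &= 76 \\
2g &= 76 - 16 = 60 \\
g &= 30
\end{align*}
```

Here the hint leads to the answer marked correct: removing the 16 extra boys leaves 60 children split evenly, so 30 of them are girls.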
Cite this article
Schreuter, D., van der Putten, P. & Lamers, M.H. Trust Me on This One: Conforming to Conversational Assistants. Minds & Machines 31, 535–562 (2021). https://doi.org/10.1007/s11023-021-09581-8
DOI: https://doi.org/10.1007/s11023-021-09581-8