Being Trustworthy is Not Enough: How Untrustworthy Artificial Intelligence (AI) Can Deceive the End-Users and Gain Their Trust

Nikola Banovic, Zhuoran Yang, Aditya Ramesh, Alice Liu
DOI: https://doi.org/10.1145/3579460
2023-04-14
Proceedings of the ACM on Human-Computer Interaction
Abstract: Trustworthy Artificial Intelligence (AI) is characterized, among other things, by: 1) competence, 2) transparency, and 3) fairness. However, end-users may fail to recognize incompetent AI, allowing untrustworthy AI to exaggerate its competence under the guise of transparency to gain unfair advantage over other trustworthy AI. Here, we conducted an experiment with 120 participants to test if untrustworthy AI can deceive end-users to gain their trust. Participants interacted with two AI-based chess engines, trustworthy (competent, fair) and untrustworthy (incompetent, unfair), that coached participants by suggesting chess moves in three games against another engine opponent. We varied coaches' transparency about their competence (with the untrustworthy one always exaggerating its competence). We quantified and objectively measured participants' trust based on how often participants relied on coaches' move recommendations. Participants showed inability to assess AI competence by misplacing their trust with the untrustworthy AI, confirming its ability to deceive. Our work calls for design of interactions to help end-users assess AI trustworthiness.