Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

Hua Shen, Tiffany Knearem, Reshmi Ghosh, Kenan Alkiek, Kundan Krishna, Yachuan Liu, Ziqiao Ma, Savvas Petridis, Yi-Hao Peng, Li Qiwei, Sushrita Rakshit, Chenglei Si, Yutong Xie, Jeffrey P. Bigham, Frank Bentley, Joyce Chai, Zachary Lipton, Qiaozhu Mei, Rada Mihalcea, Michael Terry, Diyi Yang, Meredith Ringel Morris, Paul Resnick, David Jurgens
2024-08-11
Abstract: Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment. In particular, ML- and philosophy-oriented alignment research often views AI alignment as a static, unidirectional process (i.e., aiming to ensure that AI systems' objectives match humans) rather than an ongoing, mutual alignment problem. This perspective largely neglects the long-term interaction and dynamic changes of alignment. To understand these gaps, we introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), Machine Learning (ML). We characterize, define and scope human-AI alignment. From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective. This framework encompasses both 1) conventional studies of aligning AI to humans that ensures AI produces the intended outcomes determined by humans, and 2) a proposed concept of aligning humans to AI, which aims to help individuals and society adjust to AI advancements both cognitively and behaviorally. Additionally, we articulate the key findings derived from literature analysis, including literature gaps and trends, human values, and interaction techniques. To pave the way for future studies, we envision three key challenges and give recommendations for future research.
Human-Computer Interaction, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses how to achieve bidirectional human-AI alignment amid the rapid development of artificial intelligence (AI). Specifically, it focuses on the following aspects:

1. **Ambiguity in Definition and Scope**: The definition and scope of human-AI alignment remain unclear, which hinders collaboration across research fields and makes effective alignment harder to achieve.
2. **Limitations of One-Way Alignment**: Existing AI alignment research often treats alignment as a static, one-way process, that is, ensuring that the goals of AI systems match those of humans. This view ignores long-term interaction and dynamic change, and does not fully consider how human values and goals may evolve as AI technology develops.
3. **Proposal of the Concept of Bidirectional Alignment**: To address these shortcomings, the paper proposes the conceptual framework of "bidirectional human-AI alignment". The framework covers not only the traditional "aligning AI to humans", which ensures that AI produces the intended outcomes determined by humans, but also a new concept, "aligning humans to AI", which aims to help individuals and society adapt to AI advancements cognitively and behaviorally.
4. **Research Directions and Future Challenges**: Based on a systematic review of more than 400 papers, the paper identifies four key research questions (RQ1-RQ4) and offers recommendations for three main challenges facing future research, with the aim of promoting interdisciplinary cooperation and advancing research on bidirectional human-AI alignment.

Through this work, the paper aims to provide a comprehensive perspective on the complex and dynamic interaction between humans and AI and to guide future research and development.