Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI

Hadassah Harland, Richard Dazeley, Peter Vamplew, Hashini Senaratne, Bahareh Nakisa, Francisco Cruz
2024-10-31
Abstract: Emerging research in Pluralistic Artificial Intelligence (AI) alignment seeks to address how intelligent systems can be designed and deployed in accordance with diverse human needs and values. We contribute to this pursuit with a dynamic approach for aligning AI with diverse and shifting user preferences through Multi-Objective Reinforcement Learning (MORL), via post-learning policy selection adjustment. In this paper, we introduce the proposed framework for this approach, outline its anticipated advantages and assumptions, and discuss technical details about the implementation. We also examine the broader implications of adopting a retroactive alignment approach through the sociotechnical systems perspective.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how a Pluralistic AI system can stay aligned with user preferences that are both diverse and changing over time. The authors propose a dynamic approach based on Multi-Objective Reinforcement Learning (MORL), in which the system realigns with the user by adjusting its policy selection after learning has taken place. The approach targets a limitation of existing alignment methods, which usually assume that user preferences are static and therefore adapt poorly to the diversity and variability of user needs and values. The main contribution is a framework that lets the AI system adjust its behavior during operation to better match the user's current preferences, without requiring direct and specific feedback from the user. This improves the system's flexibility and reduces the need for frequent user interaction, which lightens the user's burden. In addition, through a continuous learning and self-review process, the system can refine its understanding of, and response to, user preferences over time, with the aim of a longer-lasting alignment. In short, the paper explores how AI systems can remain aligned with diverse and shifting user preferences, improving both their practicality and the user experience.
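As a rough illustration of what post-learning policy selection adjustment could look like, the sketch below picks one policy from a fixed set of already-learned policies and switches to another when the preference weights change. It is not the authors' implementation: the `Policy` class, the example return vectors, the linear scalarisation, and the `adjust_weights` update rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A policy learned in advance (hypothetical container for this sketch)."""
    name: str
    returns: tuple[float, ...]  # estimated expected return per objective


def scalarise(returns, weights):
    # Assumption: a linear weighted sum. It can only reach policies on the
    # convex hull of the Pareto front; other utility functions are possible.
    return sum(r * w for r, w in zip(returns, weights))


def select_policy(policies, weights):
    # No retraining happens here: only the choice among learned policies changes.
    return max(policies, key=lambda p: scalarise(p.returns, weights))


def adjust_weights(weights, objective, step=0.5):
    # Hypothetical update rule: raise one objective's weight, then renormalise.
    raised = [w + step if i == objective else w for i, w in enumerate(weights)]
    total = sum(raised)
    return [w / total for w in raised]


# Made-up return vectors for two objectives, e.g. (speed, caution).
policies = [
    Policy("fast", (0.9, 0.2)),
    Policy("balanced", (0.6, 0.6)),
    Policy("careful", (0.2, 0.9)),
]

weights = [0.5, 0.5]
before = select_policy(policies, weights)

# Suppose the system infers that the user now cares more about objective 1.
weights = adjust_weights(weights, objective=1)
after = select_policy(policies, weights)

print(before.name, "->", after.name)
```

The point of the design is that the expensive learning step is done once, across objectives, and alignment at run time reduces to re-weighting and re-selecting. How the weight change is inferred from the user is left open here.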