Deliberative Technology for Alignment

Andrew Konya, Deger Turan, Aviv Ovadya, Lina Qui, Daanish Masood, Flynn Devine, Lisa Schirch, Isabella Roberts, Deliberative Alignment Forum
2023-12-07
Abstract: For humanity to maintain and expand its agency into the future, the most powerful systems we create must be those which act to align the future with the will of humanity. The most powerful systems today are massive institutions like governments, firms, and NGOs. Deliberative technology is already being used across these institutions to help align governance and diplomacy with human will, and modern AI is poised to make this technology significantly better. At the same time, the race to superhuman AGI is already underway, and the AI systems it gives rise to may become the most powerful systems of the future. Failure to align the impact of such powerful AI with the will of humanity may lead to catastrophic consequences, while success may unleash abundance. Right now, there is a window of opportunity to use deliberative technology to align the impact of powerful AI with the will of humanity. Moreover, it may be possible to engineer a symbiotic coupling between powerful AI and deliberative alignment systems such that the quality of alignment improves as AI capabilities increase.
Computers and Society, Human-Computer Interaction
What problem does this paper attempt to address?
The core problem this paper addresses is how to ensure that the most powerful systems of the future, from governments to superhuman AI, act in line with human will, so that humanity maintains and expands its agency. Specifically, the paper covers the following points:

1. **Defining "human will"**: The paper defines "human will" as the set of all humans' intentional preference judgments over all possible futures, which determine their voluntary actions. The definition covers both individual wills and collective, well-considered preferences.
2. **The framework of the alignment system**: An alignment system takes a signal (such as human will) and acts to make the future conform to that signal as closely as possible. The paper argues that human will should be the alignment target, on the grounds that this is the only way to preserve human agency.
3. **The role of deliberative technology**: Deliberative technology is already used in many institutions to better reflect the public will. Modern AI has the potential to improve this technology significantly. The paper discusses how AI can enhance deliberative technology and thereby improve alignment.
4. **The possibility of symbiotic improvement**: The paper hypothesizes that powerful AI and deliberative alignment systems can be coupled symbiotically, so that the quality of alignment improves as AI capabilities increase.
5. **Imperatives for action**: To make future alignment with human will more likely, the paper proposes three specific actions:
   1. **Generate a generally legitimate human will signal** as an open public resource.
   2. **Integrate intelligent deliberative alignment systems into powerful institutions**.
   3. **Ensure that the most powerful AI systems are aligned with human will**.

Through these measures, the paper aims to keep the behavior of AI and other powerful systems in line with the overall will of humanity as they develop, avoiding potentially catastrophic consequences while opening up new possibilities.

### Formula summary

The paper does not involve complex mathematical formulas, but the following notation captures the key concepts it mentions:

- \( w_i^t \): the will of the \( i \)-th individual at time \( t \).
- \( w^t = \{w_1^t, \dots, w_i^t, \dots, w_N^t\} \): the overall human will at time \( t \), where \( N \) is the total number of people.
- \( W^t \): the will matrix with entries \( w_{ij}^t \), where each row corresponds to a person and each column corresponds to an item describing a future state.
- \( m_{jk}^\tau \): the degree of alignment between the \( j \)-th item and the \( k \)-th possible state of the universe at time \( \tau \).
- \( \sum_j w_{ij}^t m_{jk}^\tau \): the degree of alignment between the \( k \)-th state of the universe and the will of the \( i \)-th individual at time \( t \).

This notation makes precise how the paper represents "human will" and its alignment with the future.
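
As a rough illustration of the notation above, the per-person alignment score can be read as a sum over items \( j \), which makes it an ordinary matrix product of the will matrix with the item-to-state alignment matrix. The sketch below assumes that reading. The function name `alignment_scores` and all numeric values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
# Toy sketch (assumed reading): A[i][k] = sum over j of W[i][j] * M[j][k].
# All values below are made up for illustration; none come from the paper.

# W[i][j] stands in for w_ij^t: person i's will regarding item j at time t.
# Rows are people, columns are items describing a future state.
W = [
    [1.0, 0.0],  # person 0 only cares about item 0
    [0.5, 0.5],  # person 1 weighs both items equally
    [0.0, 1.0],  # person 2 only cares about item 1
]

# M[j][k] stands in for m_jk^tau: how well item j aligns with candidate state k.
M = [
    [1.0, 0.0, 0.5],  # item 0 versus states 0, 1, 2
    [0.0, 1.0, 0.5],  # item 1 versus states 0, 1, 2
]


def alignment_scores(will, item_state):
    """Return A where A[i][k] = sum_j will[i][j] * item_state[j][k]."""
    n_items = len(item_state)
    n_states = len(item_state[0])
    for row in will:
        if len(row) != n_items:
            raise ValueError("each row of the will matrix needs one entry per item")
    return [
        [sum(row[j] * item_state[j][k] for j in range(n_items)) for k in range(n_states)]
        for row in will
    ]


A = alignment_scores(W, M)
for i, scores in enumerate(A):
    print(f"person {i}: {scores}")
```

Each row of `A` gives one person's alignment with every candidate state. How these per-person scores should be combined into a single collective signal is not specified in the summary above, so the sketch stops at the per-person level.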