Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero

Lisa Schut, Nenad Tomasev, Tom McGrath, Demis Hassabis, Ulrich Paquet, Been Kim
2023-10-25
Abstract: Artificial Intelligence (AI) systems have made remarkable progress, attaining super-human performance across various domains. This presents us with an opportunity to further human knowledge and improve human expert performance by leveraging the hidden knowledge encoded within these highly performant AI systems. Yet, this knowledge is often hard to extract, and may be hard to understand or learn from. Here, we show that this is possible by proposing a new method that allows us to extract new chess concepts in AlphaZero, an AI system that mastered the game of chess via self-play without human supervision. Our analysis indicates that AlphaZero may encode knowledge that extends beyond the existing human knowledge, but knowledge that is ultimately not beyond human grasp, and can be successfully learned from. In a human study, we show that these concepts are learnable by top human experts, as four top chess grandmasters show improvements in solving the presented concept prototype positions. This marks an important first milestone in advancing the frontier of human knowledge by leveraging AI; a development that could bear profound implications and help us shape how we interact with AI systems across many AI applications.
Human-Computer Interaction, Machine Learning
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses how to use the hidden knowledge in artificial intelligence (AI) systems to expand the boundaries of human knowledge. Specifically, it proposes a new method to extract new chess concepts from AlphaZero (an AI system that mastered chess through self-play) and tests whether top human experts can learn and apply these concepts.

### Main Research Questions

1. **How to extract new knowledge from highly performant AI systems**: The paper proposes a new method to find vectors representing new concepts in the latent space of AlphaZero.
2. **Can these new concepts be understood and learned by humans**: Working with four top chess grandmasters, the paper tests whether human experts can learn and apply the new concepts.
3. **Can the superhuman knowledge of AI systems be translated into human-understandable knowledge**: The paper explores the gap between knowledge held only by the AI system (M−H) and existing human knowledge (H), and attempts to bridge it.

### Research Background

- **Superhuman capabilities of AI systems**: AI systems have reached superhuman levels in many fields; AlphaZero's performance in chess is one example.
- **Challenges of knowledge extraction**: Despite the strong performance of AI systems, their internal knowledge is often difficult to extract and understand.
- **Human-AI interaction**: Transferring knowledge from AI systems to humans could further improve human skill and understanding.

### Research Methods

1. **Concept discovery**: Using convex optimization to find vectors representing new concepts in the latent space of AlphaZero.
2. **Concept filtering**: Keeping only concepts that are novel, teachable, and contain information unique to the AI system.
3. **Human experiments**: Working with top chess grandmasters to test whether humans can learn and apply the new concepts.
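The concept-discovery step in the methods above can be illustrated with a small sketch. It is not the paper's implementation: the objective (a squared hinge loss with an L1 penalty), the proximal-gradient solver, the function names `find_concept_vector` and `soft_threshold`, and the synthetic data are all assumptions made for illustration. The idea shown is only the general one: given paired latent activations, where the first of each pair should express a concept more strongly than the second, a convex problem can be solved for a sparse direction `v` that separates them.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def find_concept_vector(z_pos, z_neg, l1=0.05, lr=0.05, steps=500):
    """Hypothetical helper: find a sparse direction v with v @ (z_pos - z_neg) >= 1.

    Minimizes mean(max(0, 1 - diffs @ v)**2) + l1 * ||v||_1, which is convex,
    using proximal gradient descent (ISTA). z_pos and z_neg are (n, d) arrays
    of latent activations paired row by row.
    """
    diffs = z_pos - z_neg
    n, d = diffs.shape
    v = np.zeros(d)
    for _ in range(steps):
        slack = np.maximum(1.0 - diffs @ v, 0.0)   # how far each pair misses the margin
        grad = -2.0 * (diffs.T @ slack) / n        # gradient of the squared hinge term
        v = soft_threshold(v - lr * grad, lr * l1) # gradient step, then L1 shrinkage
    return v


# Synthetic stand-in for latent activations (assumed data, NOT from AlphaZero):
# the "concept" is planted along coordinate 3 of a 16-dimensional latent space.
rng = np.random.default_rng(0)
n, d, planted = 200, 16, 3
z_neg = rng.normal(size=(n, d))
shift = np.zeros(d)
shift[planted] = 2.0
z_pos = z_neg + shift + 0.1 * rng.normal(size=(n, d))

v = find_concept_vector(z_pos, z_neg)
recovered = int(np.argmax(np.abs(v)))                 # coordinate with the largest weight
agreement = float(np.mean((z_pos - z_neg) @ v > 0))   # fraction of pairs ordered correctly
print(recovered, agreement)
```

The L1 penalty is used here because it favors sparse vectors, which are easier to inspect; whether the paper's objective and constraints match this form should be checked against the paper itself.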
### Experimental Results

- **Concept learning**: After studying the new concepts, the four top chess grandmasters showed improvements in solving the concept prototype positions.
- **Concept application**: The grandmasters reported that they understood and appreciated AlphaZero's new strategies, which often deviated from traditional chess principles.

### Significance

- **Expanding human knowledge**: Extracting new concepts from AI systems can expand the boundaries of human knowledge.
- **Promoting human-AI collaboration**: The method helps humans better understand and use the knowledge in AI systems, which could support progress in other fields.

### Conclusion

The paper proposes a new method to extract new chess concepts from AlphaZero and shows that top human experts can learn and apply them. This marks an important step in using AI systems to expand human knowledge, with potentially profound implications for how humans interact with AI systems.