Using Generative Agents to Create Tip Sheets for Investigative Data Reporting

Joris Veerbeek, Nicholas Diakopoulos
2024-09-11
Abstract: This paper introduces a system using generative AI agents to create tip sheets for investigative data reporting. Our system employs three specialized agents--an analyst, a reporter, and an editor--to collaboratively generate and refine tips from datasets. We validate this approach using real-world investigative stories, demonstrating that our agent-based system generally generates more newsworthy and accurate insights compared to a baseline model without agents, although some variability was noted between different stories. Our findings highlight the potential of generative AI to provide leads for investigative data reporting.
Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
This paper asks how generative AI agents can mine valuable news leads from datasets to assist investigative data reporting. Specifically, the authors designed, developed, and evaluated a system in which three specialized AI agents (an analyst, a reporter, and an editor) collaborate to generate and refine data-driven "tip sheets" that give reporters starting points for exploring a dataset further.

### Research Background

In investigative data reporting, reporters must find valuable information in large amounts of data, a process that is usually time-consuming and complex. With the development of generative AI, researchers hope to use large language models (LLMs) to automate part of the data analysis work, improving efficiency and surfacing more valuable news leads. However, most existing computational news discovery (CND) methods rely on predefined templates and lack flexibility and creativity.

### Solution

The authors propose a system based on generative AI agents that works in four steps:

1. **Question Generation**: The reporter agent generates a set of feasible questions based on the dataset.
2. **Analysis Plan**: The analyst agent formulates a detailed analysis plan for each question.
3. **Execution and Explanation**: The analyst agent executes the analysis plan and summarizes the results as key points. The editor and reporter agents provide feedback to help improve the analysis.
4. **Compilation and Presentation**: The reporter agent compiles the most valuable findings into tip sheets for users' reference.

### Main Contributions

- **Multi-agent Collaboration**: Three AI agents with different roles (analyst, reporter, and editor) make the system more flexible and creative.
- **Practical Verification**: The system is validated on real-world investigative data reporting projects, showing that it generally generates more newsworthy and accurate insights than a baseline model without agents, with some variability between stories.
- **Newsworthiness Improvement**: Experimental results show that generative AI agents significantly improve the newsworthiness of the generated leads, making them more likely to become valuable news material.

### Future Work

Although the system demonstrates the potential of generative AI in investigative data reporting, several aspects need further research and improvement, such as supporting more complex code execution, incorporating the data collection stage, and better understanding bias in AI-generated content.

In conclusion, this paper explores the potential of generative AI agents in investigative data reporting and aims to give reporters more valuable news leads through automated, intelligent means.
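
The four-step workflow described under Solution can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the `Tip` dataclass, the prompts, the single feedback round, and the 0-10 scoring scheme are all hypothetical, and `fake_llm` is a deterministic stand-in so the sketch runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical LLM interface: (role, prompt) -> text. Not the paper's API.
LLM = Callable[[str, str], str]


@dataclass
class Tip:
    question: str
    plan: str
    findings: str
    feedback: List[str] = field(default_factory=list)


def generate_questions(llm: LLM, dataset_description: str, n: int) -> List[str]:
    """Step 1: the reporter agent proposes questions about the dataset."""
    raw = llm("reporter", f"Propose {n} questions about: {dataset_description}")
    return [line.strip() for line in raw.splitlines() if line.strip()][:n]


def analyze(llm: LLM, question: str) -> Tip:
    """Steps 2-3: the analyst plans, executes, and summarizes;
    the editor and reporter then give feedback on the findings."""
    plan = llm("analyst", f"Write an analysis plan for: {question}")
    findings = llm("analyst", f"Execute this plan and summarize key points: {plan}")
    feedback = [
        llm("editor", f"Critique these findings: {findings}"),
        llm("reporter", f"Critique these findings: {findings}"),
    ]
    # One revision round; the number of rounds is an assumption.
    findings = llm("analyst", f"Revise findings '{findings}' using feedback: {feedback}")
    return Tip(question, plan, findings, feedback)


def compile_tip_sheet(llm: LLM, tips: List[Tip], top_k: int) -> List[Tip]:
    """Step 4: the reporter agent keeps the most valuable findings."""
    # Hypothetical scoring: ask the reporter for a 0-10 newsworthiness score.
    def score(tip: Tip) -> float:
        return float(llm("reporter", f"Score newsworthiness 0-10: {tip.findings}"))

    return sorted(tips, key=score, reverse=True)[:top_k]


def build_tip_sheet(llm: LLM, dataset_description: str,
                    n_questions: int = 3, top_k: int = 2) -> List[Tip]:
    """Run all four steps and return the selected tips."""
    questions = generate_questions(llm, dataset_description, n_questions)
    tips = [analyze(llm, q) for q in questions]
    return compile_tip_sheet(llm, tips, top_k)


def fake_llm(role: str, prompt: str) -> str:
    """Deterministic stand-in so the sketch runs without any model."""
    if prompt.startswith("Propose"):
        return "Q1\nQ2\nQ3"
    if prompt.startswith("Score"):
        return str(len(prompt) % 10)
    return f"[{role}] {prompt[:40]}"
```

Calling `build_tip_sheet(fake_llm, "city inspections")` returns a list of two `Tip` objects, each carrying its question, plan, revised findings, and the two feedback messages. In a real setting, the analyst step would also need a sandbox for executing analysis code, which this sketch omits.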