Open-Source Assessments of AI Capabilities: The Proliferation of AI Analysis Tools, Replicating Competitor Models, and the Zhousidun Dataset

Ritwik Gupta, Leah Walker, Eli Glickman, Raine Koizumi, Sarthak Bhatnagar, Andrew W. Reddie
2024-05-25
Abstract: The integration of artificial intelligence (AI) into military capabilities has become a norm for major military powers across the globe. Understanding how these AI models operate is essential for maintaining strategic advantages and ensuring security. This paper demonstrates an open-source methodology for analyzing military AI models through a detailed examination of the Zhousidun dataset, a Chinese-originated dataset that exhaustively labels critical components on American and Allied destroyers. By demonstrating the replication of a state-of-the-art computer vision model on this dataset, we illustrate how open-source tools can be leveraged to assess and understand key military AI capabilities. This methodology offers a robust framework for evaluating the performance and potential of AI-enabled military capabilities, thus enhancing the accuracy and reliability of strategic assessments.
Computers and Society
What problem does this paper attempt to address?
The paper asks how the capabilities of military artificial intelligence (AI) models can be evaluated and understood with open-source methods, using China's publicly available Zhousidun dataset as a case study. Specifically, the authors replicate a near-state-of-the-art computer vision model, then train and evaluate it on the Zhousidun dataset to demonstrate how open-source tools can be used to analyze key military AI capabilities. This improves understanding of the performance and potential of competitor AI models, and thereby the accuracy and reliability of strategic assessments.

### Core issues of the paper

1. **Evaluating the capabilities of military AI models**:
   - The authors aim to develop an open-source method for evaluating the capabilities of military AI models, especially those used to identify and classify military ships and their key components.
2. **Understanding the value of the Zhousidun dataset**:
   - The Zhousidun dataset is a publicly available dataset from China containing 608 oblique and satellite images of destroyers belonging to the US and its allies, with detailed annotations of key components of the Aegis combat system. The authors study the dataset to reveal its potential uses and value.
3. **Verifying the generalization ability of the model**:
   - The authors evaluate the model on the original dataset and also generate synthetic data to test how well it generalizes to other viewing perspectives, as a check on its reliability in practical applications.

### Methods and experiments

- **Dataset description**:
  - The Zhousidun dataset contains 608 images of military ships, mainly destroyers equipped with the Aegis combat system. The images are sourced from the public Internet, including platforms such as Google Earth. The positions of SPY radars and vertical launch systems are annotated in the images.
- **Model selection**:
  - The authors select YOLOv8 as the object detection model because it performs well in terms of speed, memory efficiency, and accuracy, and is suitable for deployment on edge devices such as small drones.
- **Experimental design**:
  - The YOLOv8-large model is trained on the Zhousidun dataset, with mean average precision (mAP) as the evaluation metric. mAP summarizes the model's performance across both recall and precision:
    \[ \text{mAP} = \frac{\sum_{i=1}^{N} \text{AP}_i}{N} \]
    where \(\text{AP}_i\) is the average precision of category \(i\) and \(N\) is the total number of categories.
- **Result analysis**:
  - On the Zhousidun test set, the model achieved an mAP of 0.926, but its performance on the synthetic dataset decreased significantly, especially on images simulating satellite perspectives. This indicates that the model has limited generalization ability in the real world.

### Conclusions and significance

- **Limitations and challenges**:
  - Although the model performs well on the original dataset, it performs poorly on datasets simulating real-world scenarios, especially from near-top-view perspectives. This indicates that models trained solely on publicly available data have difficulty coping with complex real-world environments.
- **Potential applications**:
  - The research shows how competitor AI models can be evaluated with open-source tools, which matters for understanding and responding to potential threats. It also highlights the crucial role of data quality in AI model performance.

In conclusion, through detailed experiments and analysis, this paper reveals the potential and challenges of open-source methods in evaluating the capabilities of military AI models, providing a useful reference for future research and applications.
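
The mAP formula given under "Experimental design" can be illustrated with a minimal Python sketch. The summary does not say how each per-category AP is computed, so the all-point interpolation used in `average_precision` is an assumption (it is one common convention in object detection), and both function names are hypothetical. The numbers in the usage lines are made up for illustration and are not results from the paper.

```python
from typing import Sequence


def average_precision(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """Area under a precision-recall curve (all-point interpolation).

    Assumes `recalls` is sorted in ascending order and that
    `precisions[i]` is the precision measured at `recalls[i]`.
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    # Pad the curve so it spans recall 0 to recall 1.
    r = [0.0, *recalls, 1.0]
    p = [0.0, *precisions, 0.0]
    # Replace each precision with the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class: Sequence[float]) -> float:
    """mAP = (sum of AP_i over all N categories) / N."""
    if not ap_per_class:
        raise ValueError("need at least one category")
    return sum(ap_per_class) / len(ap_per_class)


# Illustrative values only (not taken from the paper).
ap_example = average_precision([0.5, 1.0], [1.0, 0.5])
map_example = mean_average_precision([0.80, 0.90])
```

Because mAP is an unweighted mean over categories, a category with few annotated instances influences the score as much as a frequent one, which is worth keeping in mind when reading a single headline figure such as the one reported for the test set.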