Systematic analysis of 32,111 AI model cards characterizes documentation practice in AI

Weixin Liang, Nazneen Rajani, Xinyu Yang, Ezinwanne Ozoani, Eric Wu, Yiqun Chen, Daniel Scott Smith, James Zou
DOI: https://doi.org/10.1038/s42256-024-00857-z
IF: 23.8
2024-06-22
Nature Machine Intelligence
Abstract: The rapid proliferation of AI models has underscored the importance of thorough documentation, which enables users to understand, trust and effectively use these models in various applications. Although developers are encouraged to produce model cards, it's not clear how much or what information these cards contain. In this study we conduct a comprehensive analysis of 32,111 AI model documentations on Hugging Face, a leading platform for distributing and deploying AI models. Our investigation sheds light on the prevailing model card documentation practices. Most AI models with a substantial number of downloads provide model cards, although with uneven informativeness. We find that sections addressing environmental impact, limitations and evaluation exhibit the lowest filled-out rates, whereas the training section is the one most consistently filled-out. We analyse the content of each section to characterize practitioners' priorities. Interestingly, there are considerable discussions of data, sometimes with equal or even greater emphasis than the model itself. Our study provides a systematic assessment of community norms and practices surrounding model documentation through large-scale data science and linguistic analysis.
computer science, artificial intelligence, interdisciplinary applications