Experts fail to reliably detect AI-generated histological data

Jan Hartung, Stefanie Reuter, Vera Anna Kulow, Michael Fähling, Cord Spreckelsen, Ralf Mrowka
DOI: https://doi.org/10.1101/2024.01.23.576647
2024-01-25
Abstract: AI-based methods to generate images have seen unprecedented advances in recent years challenging both image forensic and human perceptual capabilities. Accordingly, they are expected to play an increasingly important role in the fraudulent fabrication of data. This includes images with complicated intrinsic structures like histological tissue samples, which are harder to forge manually. We use stable diffusion, one of the most recent generative algorithms, to create such a set of artificial histological samples and in a large study with over 800 participants, we study the ability of human subjects to discriminate between such artificial and genuine histological images. Although they perform better than naive participants, we find that even experts fail to reliably identify fabricated data. While participant performance depends on the amount of training data used, even low quantities result in convincing images, necessitating methods to detect fabricated data and technical standards such as C2PA to secure data integrity.
Scientific Communication and Education
What problem does this paper attempt to address?
This paper asks: **Can experts and ordinary participants reliably distinguish AI-generated histological images from real ones?**

AI-based image generation has advanced rapidly in recent years, challenging both image forensics techniques and human perception. In scientific publications in particular, falsified data, including complex histological sample images, has become increasingly difficult to detect. The researchers therefore used a state-of-the-art generative algorithm, Stable Diffusion, to create artificial histological samples and ran a large-scale study with more than 800 participants to evaluate how well humans, both experts and non-experts, can tell these artificial images from real ones.

The core issues of the paper can be summarized as follows:

1. **Can complex AI-generated images be reliably recognized by humans?** The researchers found that even experts with relevant field experience cannot reliably identify AI-generated images.
2. **The impact of the amount of training data on image generation quality.** Even with a small amount of training data (such as 3 or 15 images), the generated images are realistic enough that humans struggle to distinguish them from real ones.
3. **The performance difference between experts and non-experts.** Experts perform better than non-experts, but both groups perform poorly overall, which shows how difficult it is to distinguish real from AI-generated images.
4. **The relationship between response time and classification accuracy.** Participants typically responded faster when they classified an image correctly than when they misclassified it, suggesting that correct decisions came more easily.
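Whether a participant "reliably" identifies images is usually judged against chance-level guessing, which is 50% for a real-versus-artificial decision. The following is a minimal sketch of that idea using an exact one-sided binomial test. It is not the authors' analysis; the function name and the example counts (12 correct out of 16) are hypothetical and chosen only for illustration.

```python
from math import comb


def p_value_above_chance(correct: int, total: int, chance: float = 0.5) -> float:
    """Exact one-sided binomial p-value: P(X >= correct) if the participant
    were guessing with success probability `chance` on each image."""
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical participant: 12 of 16 images classified correctly.
correct, total = 12, 16
print(f"accuracy = {correct / total:.2f}")
print(f"p-value vs. guessing = {p_value_above_chance(correct, total):.4f}")
```

A small p-value indicates performance better than guessing, but it does not by itself mean detection is reliable enough for screening manuscripts: a participant can beat chance and still miss a substantial share of fabricated images.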
### Conclusions and Recommendations

The paper's conclusion emphasizes a challenge now facing the scientific community: **AI-generated images have reached a level that is difficult to distinguish with the naked eye**, which poses a threat to scientific integrity. To address this issue, the authors make the following recommendations:

- **Introduce automated detection tools**: Use technical means to scan images in scientific publications to improve detection efficiency and accuracy.
- **Implement data provenance standards**: For example, technical standards such as C2PA to ensure data integrity and traceability.
- **Strengthen the requirements for submitting original data**: Ensure that journals can access the original data during the review process, thereby reducing the occurrence of fraud.

In summary, this paper reveals the potential risks of AI-generated images in scientific publications and calls for technical and policy measures to maintain scientific integrity and transparency.