Abstract: Artificial Intelligence (AI) tools have become incredibly powerful in generating synthetic images. Of particular concern are generated images that resemble photographs as they aspire to represent real world events. Synthetic photographs may be used maliciously by a broad range of threat actors, from scammers to nation-state actors, to deceive, defraud, and mislead people. Mitigating this threat usually involves answering a basic analytic question: Is the photograph real or synthetic? To address this, we have examined the capabilities of recent generative diffusion models and have focused on their flaws: visible artifacts in generated images which reveal their synthetic origin to the trained eye. We categorize these artifacts, provide examples, discuss the challenges in detecting them, suggest practical applications of our work, and outline future research directions.
What problem does this paper attempt to address?
The paper addresses the problem of distinguishing AI-generated synthetic images from real photographs, so that synthetic images are harder to use for deceptive or malicious purposes. Specifically, it aims to improve people's ability to recognize synthetic images by identifying their visible flaws (i.e., "artifacts"), thereby reducing the harm done when people mistake false content for genuine photographs.
### Detailed description of the main problem
1. **Widespread use of synthetic images**:
- Synthetic images can be used for legitimate purposes such as advertising, creative works, or entertainment.
- However, they may also be used to deceive, for example by creating and selling fake content, spreading false product information, or influencing the opinions of individuals, groups, or society as a whole.
2. **Flaws in synthetic images**:
- Although generated images may look very realistic at first glance, many inconsistencies or artifacts can be found upon closer inspection.
- These artifacts include physical and geometric inconsistencies, errors in human anatomy, text errors, color distortion, and semantically implausible scene compositions.
3. **Classification of artifacts**:
- The paper defines an artifact classification system (taxonomy), which divides artifacts into six major categories: physical artifacts, geometric artifacts, human anatomical artifacts, semantic artifacts, distortion artifacts, and text artifacts.
- Each category is further subdivided into specific artifact types. For example, physical artifacts include optical reflection errors, implausible light source positions, and inconsistent object shadows; geometric artifacts include errors in shape, size, surface structure, and perspective.
4. **Application scenarios and future research directions**:
- The paper suggests practical applications, such as helping consumers identify fake images on social media.
- It also outlines future research directions, including improving generative models to reduce artifacts and developing more effective detection tools.
### Examples of artifacts
The paper illustrates each category with figures; typical examples include:
- **Physical artifacts**: The image may show phenomena that violate physical laws, such as light with no plausible source or objects floating in the air.
- **Geometric artifacts**: The image may contain geometric errors, such as table legs of inconsistent length or objects with implausible proportions.
- **Human anatomical artifacts**: Body parts may be anatomically wrong, such as asymmetric features or an abnormal number of fingers.
- **Semantic artifacts**: The scene may be illogical, for example a room crowded with tables and chairs in which there are fewer chairs than tables.
- **Distortion artifacts**: The image may contain visual distortions, such as inconsistent colors, noise, and blurred regions.
- **Text artifacts**: Text in the image may be flawed, with spelling mistakes or inconsistent fonts.
Through this analysis and classification, the paper aims to give readers concrete cues for recognizing synthetic images.
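
The six-category taxonomy summarized above can be captured as a small data structure, for instance as a checklist for a human reviewer. The following is a minimal sketch, not the paper's implementation: the names `ArtifactCategory`, `EXAMPLES`, and `summarize` are hypothetical, and the example entries are taken from this summary rather than from the paper's full list of artifact types.

```python
from collections import Counter
from enum import Enum


class ArtifactCategory(Enum):
    """The six top-level categories named in the summary (identifiers are hypothetical)."""
    PHYSICAL = "physical"
    GEOMETRIC = "geometric"
    ANATOMICAL = "human anatomical"
    SEMANTIC = "semantic"
    DISTORTION = "distortion"
    TEXT = "text"


# Example artifact types per category, copied from the summary text.
# This is an illustrative subset, not the paper's complete taxonomy.
EXAMPLES = {
    ArtifactCategory.PHYSICAL: ["optical reflection errors", "implausible light source",
                                "inconsistent object shadows", "floating objects"],
    ArtifactCategory.GEOMETRIC: ["shape", "size", "surface structure", "perspective"],
    ArtifactCategory.ANATOMICAL: ["asymmetry", "abnormal number of fingers"],
    ArtifactCategory.SEMANTIC: ["illogical scene composition"],
    ArtifactCategory.DISTORTION: ["inconsistent colors", "noise", "blurred regions"],
    ArtifactCategory.TEXT: ["spelling mistakes", "inconsistent fonts"],
}


def summarize(observations):
    """Count a reviewer's observations per category.

    `observations` is a list of (ArtifactCategory, free-text note) pairs.
    Categories with no observations are omitted from the result.
    """
    counts = Counter(category for category, _note in observations)
    return {c.value: counts[c] for c in ArtifactCategory if counts[c]}


if __name__ == "__main__":
    notes = [
        (ArtifactCategory.TEXT, "misspelled street sign"),
        (ArtifactCategory.TEXT, "mixed fonts on a poster"),
        (ArtifactCategory.PHYSICAL, "shadow points toward the light"),
    ]
    print(summarize(notes))
```

A tally like this only organizes what a human has already spotted; it does not detect artifacts on its own, and a count of zero says nothing about whether an image is real.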