Shape-Preserving Generation of Food Images for Automatic Dietary Assessment

Guangzong Chen, Zhi-Hong Mao, Mingui Sun, Kangni Liu, Wenyan Jia
2024-08-24
Abstract: Traditional dietary assessment methods rely heavily on self-reporting, which is time-consuming and prone to bias. Recent advancements in Artificial Intelligence (AI) have revealed new possibilities for dietary assessment, particularly through analysis of food images. Recognizing foods and estimating food volumes from images are the key procedures for automatic dietary assessment. However, both procedures require large amounts of training images labeled with food names and volumes, which are currently unavailable. Alternatively, recent studies have indicated that training images can be artificially generated using Generative Adversarial Networks (GANs). Nonetheless, convenient generation of large amounts of food images with known volumes remains a challenge with existing techniques. In this work, we present a simple GAN-based neural network architecture for conditional food image generation. The shapes of the food and container in the generated images closely resemble those in the reference input image. Our experiments demonstrate the realism of the generated images and the shape-preserving capabilities of the proposed framework.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper attempts to solve two key problems in automatic dietary assessment:

1. **Obtaining a large number of training images labeled with food names and volumes**: Traditional dietary assessment methods rely on self-reporting, which is time-consuming and prone to bias. Automating dietary assessment requires a large amount of labeled data to train deep-learning models that identify foods and estimate their volumes. However, such labeled data is currently very limited.
2. **Generating high-quality food images with consistent shapes**: Existing generative adversarial networks (GANs) can generate food images, but it is difficult to ensure both image quality and accurate food shapes at the same time. Inaccurate shapes reduce the accuracy of dietary assessment.

To solve these problems, the authors propose a GAN-based neural network architecture for conditional food image generation. The architecture generates realistic food images while keeping the shapes of the food and container similar to those in the reference input image. Specifically, the model addresses the problems in the following ways:

- **Generating realistic food images**: A shape encoder ensures that the generated images are not only realistic but also retain the shapes of the food and container in the reference image.
- **Increasing the amount of training data**: The generated images are used to expand the training set, with the aim of improving the performance of AI systems in food identification and volume estimation.
- **Controlling food categories and shapes**: Style and category variables make it convenient to control the food category of the generated images while keeping food shapes consistent.

Together, these properties are intended to ease the shortage of labeled training data in automatic dietary assessment and to provide a scalable way to produce such images.
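The training-set expansion described above can be illustrated with a minimal sketch. It is not the authors' implementation: `generate_image` is a hypothetical stand-in that only builds an identifier string where a real shape-preserving generator would produce pixels, and `LabeledImage`, `expand_training_set`, and the `volume_ml` field are names invented here. The sketch also assumes that, because the food and container shapes are preserved, a generated image can inherit the volume label of its reference image.

```python
from dataclasses import dataclass
from itertools import product
import random


@dataclass(frozen=True)
class LabeledImage:
    image_id: str      # identifier of the (real or synthetic) image
    category: str      # food name label
    volume_ml: float   # food volume label


def generate_image(reference_id: str, category: str, style_seed: int) -> str:
    """Hypothetical stand-in for a shape-preserving conditional generator.

    A real generator would take the reference image, a category variable,
    and a style variable, and return an image. Here we only return an ID.
    """
    return f"{reference_id}__{category}__s{style_seed}"


def expand_training_set(references, categories, styles_per_category, seed=0):
    """Create synthetic labeled images from reference images.

    Assumption: the generated image keeps the reference shape, so the
    reference's volume label is copied onto every synthetic image.
    """
    rng = random.Random(seed)
    synthetic = []
    for ref, category in product(references, categories):
        for _ in range(styles_per_category):
            style_seed = rng.randrange(10**6)  # stands in for a style variable
            synthetic.append(
                LabeledImage(
                    image_id=generate_image(ref.image_id, category, style_seed),
                    category=category,        # controlled by the category variable
                    volume_ml=ref.volume_ml,  # inherited from the reference
                )
            )
    return synthetic
```

For example, one reference image with a known volume, two target categories, and three style samples per category would yield six synthetic entries, all carrying the reference's volume label.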