NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images

Matthew Keller, Chi-en Amy Tai, Yuhao Chen, Pengcheng Xi, Alexander Wong

DOI: https://doi.org/10.48550/arXiv.2405.07814

2024-05-13

Abstract: Many aging individuals encounter challenges in effectively tracking their dietary intake, exacerbating their susceptibility to nutrition-related health complications. Self-reporting methods are often inaccurate and suffer from substantial bias; however, leveraging intelligent prediction methods can automate and enhance precision in this process. Recent work has explored using computer vision prediction systems to predict nutritional information from food images. Still, these methods are often tailored to specific situations, require other inputs in addition to a food image, or do not provide comprehensive nutritional information. This paper aims to enhance the efficacy of dietary intake estimation by leveraging various neural network architectures to directly predict a meal's nutritional content from its image. Through comprehensive experimentation and evaluation, we present NutritionVerse-Direct, a model utilizing a vision transformer base architecture with three fully connected layers that lead to five regression heads predicting calories (kcal), mass (g), protein (g), fat (g), and carbohydrates (g) present in a meal. NutritionVerse-Direct yields a combined mean average error score on the NutritionVerse-Real dataset of 412.6, an improvement of 25.5% over the Inception-ResNet model, demonstrating its potential for improving dietary intake estimation accuracy.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the difficulty that many aging individuals have in tracking their dietary intake, a difficulty that increases their susceptibility to nutrition-related health complications. Traditional self-reporting methods are often inaccurate and substantially biased, so an automated method is needed to estimate dietary intake more accurately. The authors use deep neural networks to predict nutritional content directly from a food image, without requiring additional inputs. Specifically, they compare different fully connected layer structures and feature extractors (such as vision transformers and masked autoencoders) to find the configuration that predicts nutritional content best. After experimentally evaluating several model architectures, the paper proposes NutritionVerse-Direct, a model built on a vision transformer followed by three fully connected layers and five regression heads. From a single food image, the model predicts a meal's calories (kcal), mass (g), protein (g), fat (g), and carbohydrates (g). On the NutritionVerse-Real dataset it achieves a combined mean absolute error (MAE) of 412.6, which is 25.5% lower than that of the Inception-ResNet model. This indicates that the model has the potential to improve the accuracy of dietary intake estimation.
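The structure described above (shared fully connected layers feeding five regression heads, scored by a combined MAE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input is a plain feature vector standing in for the vision transformer's output, the hidden sizes and random weights are hypothetical, and the combined MAE is assumed here to be the sum of the five per-task MAEs, which the summary does not define. All names (`MultitaskHead`, `combined_mae`, `relative_improvement`) are invented for the example.

```python
import random

# The five regression targets named in the abstract.
TASKS = ("calories_kcal", "mass_g", "protein_g", "fat_g", "carbohydrates_g")


def init_layer(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, bias, relu=True):
    """Apply one dense layer, optionally followed by ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


class MultitaskHead:
    """Three shared dense layers followed by one scalar head per task.

    Hidden sizes are hypothetical; the source does not state them.
    """

    def __init__(self, feature_dim, hidden_dims=(16, 16, 16), seed=0):
        rng = random.Random(seed)
        dims = (feature_dim, *hidden_dims)
        self.shared = [init_layer(a, b, rng) for a, b in zip(dims, dims[1:])]
        self.heads = {task: init_layer(dims[-1], 1, rng) for task in TASKS}

    def predict(self, features):
        """Map a backbone feature vector to one prediction per task."""
        hidden = features
        for weights, bias in self.shared:
            hidden = dense(hidden, weights, bias)
        return {
            task: dense(hidden, weights, bias, relu=False)[0]
            for task, (weights, bias) in self.heads.items()
        }


def combined_mae(predictions, targets):
    """Return (sum of per-task MAEs, per-task MAEs).

    Assumption: "combined" means the sum over the five tasks.
    """
    n = len(targets)
    per_task = {
        task: sum(abs(p[task] - y[task]) for p, y in zip(predictions, targets)) / n
        for task in TASKS
    }
    return sum(per_task.values()), per_task


def relative_improvement(baseline_error, new_error):
    """Fractional error reduction relative to a baseline (0.255 means 25.5%)."""
    return (baseline_error - new_error) / baseline_error
```

In use, `MultitaskHead(feature_dim=8).predict([0.5] * 8)` returns a dictionary with one value per task, and `combined_mae` takes two equal-length lists of such dictionaries (predictions and ground truth). Because the five targets are summed in their raw units under this assumption, the large-valued targets such as calories and mass would dominate the combined score.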
