Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes

Luis Rita, Josh Southern, Ivan Laponogov, Kyle Higgins, Kirill Veselkov
2024-09-13
Abstract: In the emerging field of computational gastronomy, aligning culinary practices with scientifically supported nutritional goals is increasingly important. This study explores how large language models (LLMs) can be applied to optimize ingredient substitutions in recipes, specifically to enhance the phytochemical content of meals. Phytochemicals are bioactive compounds found in plants, which, based on preclinical studies, may offer potential health benefits. We fine-tuned models, including OpenAI's GPT-3.5, DaVinci, and Meta's TinyLlama, using an ingredient substitution dataset. These models were used to predict substitutions that enhance phytochemical content and create a corresponding enriched recipe dataset. Our approach improved Hit@1 accuracy on ingredient substitution tasks, from the baseline 34.53 ± 0.10% to 38.03 ± 0.28% on the original GISMo dataset, and from 40.24 ± 0.36% to 54.46 ± 0.29% on a refined version of the same dataset. These substitutions led to the creation of 1,951 phytochemically enriched ingredient pairings and 1,639 unique recipes. While this approach demonstrates potential in optimizing ingredient substitutions, caution must be taken when drawing conclusions about health benefits, as the claims are based on preclinical evidence. Future work should include clinical validation and broader datasets to further evaluate the nutritional impact of these substitutions. This research represents a step forward in using AI to promote healthier eating practices, providing potential pathways for integrating computational methods with nutritional science.
Computation and Language
What problem does this paper attempt to address?
The paper aims to address the issue of enhancing the phytochemical content in recipes through optimized ingredient substitution. Specifically, the study explores how to utilize Large Language Models (LLMs) to perform ingredient substitutions to increase the phytochemical content in meals.

### Main Objectives:

1. **Improve Ingredient Substitution Accuracy**: Enhance the accuracy of ingredient substitution tasks by fine-tuning large language models, including GPT-3.5, DaVinci, and TinyLlama.
2. **Generate Phytochemical-Rich Recipes**: Create new recipes that contain more phytochemicals with potential health benefits to prevent or treat diseases such as cancer, Alzheimer's disease (AD), and COVID-19.
3. **Validate Method Effectiveness**: Benchmark against the existing GISMo model to demonstrate the superiority of the proposed method in ingredient substitution prediction.

### Research Background:

- Phytochemicals are bioactive compounds with antioxidant, anti-inflammatory, and anticancer properties.
- Previous methods include statistical approaches, co-occurrence networks, and natural language processing-based techniques, but these methods have limitations in capturing the full culinary context.
- Large Language Models (LLMs) are considered capable of overcoming the shortcomings of existing methods due to their powerful text understanding and generation capabilities, providing more accurate and context-sensitive ingredient substitution suggestions.

### Key Results:

- On the original dataset, the improved model increased the Hit@1 accuracy from a baseline of 34.53 ± 0.10% to 38.03 ± 0.28%, and on the filtered dataset from 40.24 ± 0.36% to 54.46 ± 0.29%.
- Generated 1,951 phytochemical-rich ingredient pairs and created 1,639 unique recipes.

### Future Work Directions:

- Further clinical validation of the actual nutritional effects of these ingredient substitutions.
- Expand the dataset to include more clinical and biochemical data to improve the accuracy and relevance of ingredient substitution suggestions.
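
### Illustration: Computing Hit@k

The Hit@1 figures reported under Key Results measure how often a model's top-ranked substitute matches the ground-truth substitute. The sketch below shows one common way such a metric is computed. It is not the authors' evaluation code: the function name `hit_at_k` and the toy ingredient lists are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hit_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 1,
) -> float:
    """Return the fraction of examples whose target appears in the top-k predictions."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("ranked_predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    hits = sum(
        1
        for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)


# Toy data (hypothetical): each inner list is a model's ranked substitutes
# for one source ingredient, best guess first.
predictions = [
    ["margarine", "olive oil", "ghee"],
    ["honey", "maple syrup", "agave"],
    ["yogurt", "sour cream", "milk"],
    ["kale", "spinach", "chard"],
]
# Ground-truth substitutes (hypothetical) for the same four examples.
targets = ["margarine", "maple syrup", "sour cream", "kale"]

hit1 = hit_at_k(predictions, targets, k=1)  # top guess correct in 2 of 4 cases
hit3 = hit_at_k(predictions, targets, k=3)  # target within top 3 in all 4 cases
print(f"Hit@1 = {hit1:.2%}, Hit@3 = {hit3:.2%}")
```

The ± values in the reported results presumably reflect variation across repeated runs; under that assumption, the same function would be applied once per run and the scores then summarized by their mean and standard deviation.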