Mansi Goel, Ayush Agarwal, Shubham Agrawal, Janak Kapuriya, Akhil Vamshi Konam, Rishabh Gupta, Shrey Rastogi, Niharika, Ganesh Bagler
Abstract: Food touches our lives through various endeavors, including flavor, nourishment, health, and sustainability. Recipes are cultural capsules transmitted across generations via unstructured text. Automated protocols for recognizing named entities, the building blocks of recipe text, are of immense value for various applications ranging from information extraction to novel recipe generation. Named entity recognition is a technique for extracting information from unstructured or semi-structured data with known labels. Starting with manually-annotated data of 6,611 ingredient phrases, we created an augmented dataset of 26,445 phrases cumulatively. Simultaneously, we systematically cleaned and analyzed ingredient phrases from RecipeDB, the gold-standard recipe data repository, and annotated them using the Stanford NER. Based on the analysis, we sampled a subset of 88,526 phrases using a clustering-based approach while preserving the diversity to create the machine-annotated dataset. A thorough investigation of NER approaches on these three datasets involving statistical, fine-tuning of deep learning-based language models and few-shot prompting on large language models (LLMs) provides deep insights. We conclude that few-shot prompting on LLMs has abysmal performance, whereas the fine-tuned spaCy-transformer emerges as the best model with macro-F1 scores of 95.9%, 96.04%, and 95.71% for the manually-annotated, augmented, and machine-annotated datasets, respectively.
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve
This paper aims to address the problem of Named Entity Recognition (NER) in recipe texts. Specifically, it focuses on extracting named entities from unstructured or semi-structured recipe texts, including information such as ingredient names, quantities, units, states (e.g., fresh/dry), sizes, and temperatures.
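To make the task concrete, here is a minimal sketch of how one annotated ingredient phrase might be represented as parallel token and label lists. The label names (`QUANTITY`, `UNIT`, `STATE`, `NAME`) and the helper `group_entities` are assumptions for illustration; the paper's actual tag set and data format may differ.

```python
# One ingredient phrase as parallel lists of tokens and entity labels.
# The label names are hypothetical, chosen to mirror the entity types listed above.
tokens = ["2", "cups", "fresh", "baby", "spinach"]
labels = ["QUANTITY", "UNIT", "STATE", "NAME", "NAME"]


def group_entities(tokens, labels):
    """Merge adjacent tokens that share a label into (label, text) spans."""
    spans = []
    for token, label in zip(tokens, labels):
        if spans and spans[-1][0] == label:
            # Same label as the previous token: extend the current span.
            spans[-1] = (label, spans[-1][1] + " " + token)
        else:
            spans.append((label, token))
    return spans


entities = group_entities(tokens, labels)
print(entities)
```

An NER model for this task predicts the `labels` list given the `tokens` list; grouping adjacent tokens then yields structured fields such as the ingredient name and its quantity.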
### Background and Motivation
1. **Importance and Diversity of Food**: Food plays a crucial role in our lives, touching on flavor, nourishment, health, and sustainability.
2. **Unstructured Nature of Recipe Texts**: Recipes are usually presented in unstructured text form, containing a large number of named entities. Automated NER technology is essential for extracting valuable information from these texts.
3. **Wide Range of Applications**: NER is useful for information extraction, novel recipe generation, dietary safety detection, restaurant operation optimization, food safety tracking, and cost and sustainability analysis.
### Research Objectives
1. **Creating Annotated Datasets**: The paper builds three datasets: a manually annotated dataset, an augmented dataset derived from it through data augmentation, and a machine-annotated dataset.
2. **Evaluating the Performance of Different Models**: The study compares statistical methods, fine-tuned deep learning-based language models, and few-shot prompting on large language models (LLMs).
3. **Identifying the Best Model**: The experiments determine which model performs best for NER on recipe text.
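The model comparison is reported as macro-F1, which averages the per-label F1 scores so that rare labels count as much as frequent ones. Below is a minimal sketch of that computation at the token level. Whether the paper scores tokens or whole entity spans is not stated in this summary, so token-level scoring is an assumption, as are the label names and the function `macro_f1`.

```python
def macro_f1(gold, pred):
    """Macro-averaged F1 over every label seen in gold or pred (token level)."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    # Unweighted mean: each label contributes equally, regardless of frequency.
    return sum(f1_scores) / len(f1_scores)


# Hypothetical gold and predicted labels for a four-token phrase.
gold = ["QUANTITY", "UNIT", "NAME", "NAME"]
pred = ["QUANTITY", "UNIT", "STATE", "NAME"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

In this toy case a single mislabeled token lowers the `NAME` score and introduces a zero score for the spurious `STATE` label, which is why macro-F1 is sensitive to errors on infrequent labels.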
### Main Contributions
1. **Creation of Datasets**:
- Manually Annotated Dataset: Contains 6,611 ingredient phrases.
- Augmented Dataset: 26,445 ingredient phrases in total, obtained from the manually annotated data through techniques such as label replacement, synonym replacement, and intra-segment shuffling.
- Machine-Annotated Dataset: 349,762 unique ingredient phrases were extracted from RecipeDB, from which 88,526 were selected using the Stratified Entity Frequency Sampling (SEFS) method; the annotations come from the Stanford NER.
2. **Model Evaluation**:
- Evaluated different models using macro F1 score, precision, and recall.
- The fine-tuned spaCy-transformer model is the best performer on all three datasets, with macro F1 scores of 95.9%, 96.04%, and 95.71% on the manually annotated, augmented, and machine-annotated datasets, respectively.
3. **Evaluation of Few-Shot Prompting**:
- Few-shot prompting on LLMs performed poorly, which suggests that these models lack domain-specific knowledge and would need fine-tuning on domain-specific data.
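Two of the augmentation techniques named in the dataset list, synonym replacement and intra-segment shuffling, can be sketched as follows. This is an illustrative reading, not the authors' implementation: the synonym table, the label names, and the definition of a segment as a maximal run of tokens with the same label are all assumptions.

```python
import random

# Hypothetical synonym table for illustration; not taken from the paper.
SYNONYMS = {"minced": ["chopped"], "large": ["big"]}


def synonym_replace(tokens, labels, rng):
    """Swap each token that has a listed synonym; labels stay aligned."""
    new_tokens = [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]
    return new_tokens, list(labels)


def shuffle_within_segments(tokens, labels, rng):
    """Shuffle tokens inside each maximal run of identical labels."""
    out, start = [], 0
    for i in range(1, len(tokens) + 1):
        # A segment ends at the end of the phrase or where the label changes.
        if i == len(tokens) or labels[i] != labels[start]:
            segment = tokens[start:i]
            rng.shuffle(segment)
            out.extend(segment)
            start = i
    return out, list(labels)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tokens = ["2", "large", "garlic", "cloves", "minced"]
labels = ["QUANTITY", "SIZE", "NAME", "NAME", "STATE"]

syn_tokens, syn_labels = synonym_replace(tokens, labels, rng)
shuf_tokens, shuf_labels = shuffle_within_segments(tokens, labels, rng)
print(syn_tokens, syn_labels)
print(shuf_tokens, shuf_labels)
```

Both transformations leave the label sequence unchanged, so every augmented phrase comes with annotations for free. That property is what allows a small manually annotated set to be expanded without further labeling effort.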
### Conclusion
By building three annotated datasets and evaluating statistical, fine-tuned, and few-shot approaches on them, the paper addresses NER in recipe text. The results indicate that the fine-tuned spaCy-transformer model is the strongest of the models tested, with macro F1 scores of about 96% on all three datasets. The study also highlights the limitations of few-shot prompting on LLMs in this domain, which points to directions for future research.