Abstract: With the rise of online education platforms, there is a growing abundance of educational content across various domains. It can be difficult to navigate the numerous available resources to find the most suitable training, especially in domains that include many interconnected areas, such as ICT. In this study, we propose a domain-specific chatbot application that requires limited resources and uses versions of the Phi language model to help learners with educational content. In the proposed method, the Phi-2 and Phi-3 models were fine-tuned using QLoRA. The data required for fine-tuning was obtained from the Huawei Talent Platform, where courses are available at different levels of expertise in the field of computer science. A RAG system was used to support the model, which was fine-tuned on 500 Q&A pairs. Additionally, a total of 420 Q&A pairs of content were extracted from different formats such as JSON, PPT, and DOC to create a vector database for the RAG system. By using the fine-tuned model and the RAG approach together, chatbots with different competencies were obtained. The questions posed to the resulting chatbots and the answers they gave were saved separately and evaluated using ROUGE, BERTScore, METEOR, and BLEU metrics. The precision value of the Phi-2 model supported by RAG was 0.84 and the F1 score was 0.82. Beyond the 13 evaluation metrics across 4 categories, the answers of each model were compared with the created content, and the most appropriate method was selected for real-life applications.
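
The abstract reports a precision and an F1 score without saying which of the four metrics they come from. As a rough illustration of how overlap-based precision, recall, and F1 relate to one another, the sketch below computes ROUGE-1-style unigram scores in plain Python. The function name `unigram_prf` and the two example sentences are invented for illustration; they are not taken from the paper, and the paper's own figures may well come from a different metric such as BERTScore.

```python
from collections import Counter


def unigram_prf(candidate, reference):
    """ROUGE-1-style precision, recall and F1 from clipped unigram overlap."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0, 0.0, 0.0
    # Counter intersection clips each token's count to the smaller of the two.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand)   # share of answer tokens found in the reference
    recall = overlap / len(ref)       # share of reference tokens found in the answer
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical chatbot answer and reference answer (not from the paper's dataset).
answer = "qlora fine-tunes a quantized model with low-rank adapters"
reference = "qlora fine-tunes a quantized base model using low-rank adapters"
p, r, f = unigram_prf(answer, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Precision penalizes extra tokens in the answer, recall penalizes missing ones, and F1 is their harmonic mean, which is why a model can score higher on precision than on F1 when its answers are accurate but incomplete.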

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to help learners efficiently find the training resources that best meet their needs in the context of the increasingly rich content on online education platforms, especially in fields like ICT which contain many inter-related areas. Specifically, the paper proposes a domain-specific chatbot application aimed at using improved language models (such as Phi-2 and Phi-3) to help learners retrieve educational content.

### Main problems:

1. **Information overload**: There are a large number of educational resources on online education platforms, and it is difficult for learners to screen out the most suitable training content from them.
2. **Domain complexity**: Especially in fields like ICT, which involve multiple inter-related sub-fields, making navigation and selection more difficult.
3. **Real-time and accuracy**: Traditional language models may generate inaccurate or fact-inconsistent content (i.e., "hallucinations") when generating answers, especially when dealing with domain-specific questions.

### Solutions:

To address the above challenges, the paper proposes the following solutions:

- **Combining RAG (Retrieval-Augmented Generation) and LLM (Large Language Model)**: By using the RAG system, combine external data sources with language models to improve the accuracy and relevance of generated answers.
- **Parameter-efficient fine-tuning**: Use the QLoRA (Quantized Low-Rank Adaptation) method to fine-tune the Phi-2 and Phi-3 models to reduce computational resource consumption and improve model performance.
- **Multi-source data extraction**: Obtain course content from platforms such as the Huawei Talent Platform and convert it into Q&A pairs for training and evaluating the model.
- **Evaluation metrics**: Evaluate the performance of different methods through metrics such as BLEU, ROUGE, METEOR, and BERTScore to ensure that the generated answers are both accurate and comprehensive.

### Specific steps:

1. **Dataset generation**: Extract 500 Q&A pairs from the course content of the Huawei Talent Platform for fine-tuning the model.
2. **Parameter-efficient fine-tuning**: Use the QLoRA method to fine-tune the Phi-2 and Phi-3 models to adapt to domain-specific tasks.
3. **Application of the RAG system**: Utilize the RAG system to enhance the model's generation ability by retrieving external data sources.
4. **Performance evaluation**: Compare the performance of different methods through multiple evaluation metrics and select the optimal solution for application in practical scenarios.

### Summary:

The main objective of this paper is to develop a chatbot that can efficiently retrieve and provide accurate educational content by combining RAG and fine-tuning techniques, thereby helping learners better select training resources suitable for them.
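
The retrieval half of a RAG pipeline like the one summarized here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the paper builds a vector database from extracted Q&A content, whereas this sketch swaps in a bag-of-words cosine similarity so that it runs with the standard library alone. The `QA_STORE` entries, the function names, and the prompt wording are all hypothetical, and the final generation call to a fine-tuned Phi model is deliberately left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical Q&A pairs standing in for the extracted course content.
QA_STORE = [
    ("What is a VLAN?",
     "A VLAN is a logical grouping of switch ports into a separate broadcast domain."),
    ("What does QLoRA do?",
     "QLoRA fine-tunes low-rank adapters on top of a quantized base model."),
    ("What is object storage?",
     "Object storage keeps data as objects with metadata in a flat namespace."),
]


def retrieve(question, store, k=1):
    """Return up to k stored Q&A pairs most similar to the question."""
    q_vec = Counter(tokenize(question))
    scored = [
        (cosine(q_vec, Counter(tokenize(q + " " + a))), q, a) for q, a in store
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop entries that share no tokens with the question.
    return [(q, a) for score, q, a in scored[:k] if score > 0]


def build_prompt(question, passages):
    """Place the retrieved pairs in front of the question as context."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


question = "How does QLoRA fine-tune a model?"
hits = retrieve(question, QA_STORE, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a fuller system the resulting prompt would be passed to the fine-tuned model, and the bag-of-words scoring would be replaced by dense embeddings stored in a vector database. Grounding the answer in retrieved text is what is meant to curb the hallucinations listed under the main problems.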

Efficient Learning Content Retrieval with Knowledge Injection
