From Large Language Models to Databases and Back: A discussion on research and education

Sihem Amer-Yahia, Angela Bonifati, Lei Chen, Guoliang Li, Kyuseok Shim, Jianliang Xu, Xiaochun Yang
2023-07-08
Abstract: This discussion took place during a panel at the 28th International Conference on Database Systems for Advanced Applications (DASFAA 2023), held April 17-20, 2023 in Tianjin, China. The title of the panel was "What does LLM (ChatGPT) Bring to Data Science Research and Education? Pros and Cons". It was moderated by Lei Chen and Xiaochun Yang. The discussion raised several questions about how large language models (LLMs) and database research and education can help each other, and about the potential risks of LLMs.
Databases
What problem does this paper attempt to address?
The paper primarily explores the relationship between Large Language Models (LLMs) and database research and education, and attempts to address the following core issues:

1. **Application of LLMs in Database Research**:
   - Explore how LLMs assist in data preparation and annotation tasks (such as text mining and sentiment analysis), as well as in feature extraction, feature selection, and parameter tuning.
   - Analyze the potential of LLMs in knowledge acquisition and analysis, while also pointing out issues with their accuracy.
2. **Support of Database Research for LLMs**:
   - Study how data cleaning, preprocessing, and related methods can support the development of LLMs.
   - Explore how to optimize prompt engineering to improve the effectiveness of LLMs.
3. **Application of LLMs in the Field of Education**:
   - Discuss how LLMs can be used to reform database education, helping students master techniques for handling dirty data.
   - Emphasize the need for cautious use of LLM-generated information to prevent plagiarism and inaccuracies.
4. **Research Assistant Function of LLMs**:
   - Explore the assistance LLMs can provide in scientific writing, such as proofreading, rewriting, and summary generation.
   - Analyze the potential of LLMs in data creation and data analysis, including tasks like code generation, data cleaning, and feature engineering.
5. **Specific Applications of LLMs in Education**:
   - Explore the possibility of LLMs as teaching tools, including how to teach about LLMs themselves and their applications.
   - Analyze the challenges and opportunities that arise when LLMs are used as learner models or as tools to support learners.

In summary, this paper aims to comprehensively explore the potential impacts and applications of LLMs in database research and education, while identifying and addressing the related technical, social, and ethical issues.