Abstract: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like GPT-4 excel in broader question-answering scenarios due to superior reasoning capabilities, our findings suggest that continual pre-training with limited resources can still enhance model performance on specialized topics. Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use. Comprehensive quantitative benchmarking is currently in progress and will be detailed in an upcoming full paper. The model, AstroLLaMA-Chat, is now available at

What problem does this paper attempt to address?

The paper addresses the following problem: general-purpose large language models such as GPT-4 and LLaMA-2 perform well across a wide range of tasks but have limitations when answering highly specialized astronomy questions. These limitations fall into three areas:

1. **Lack of detail and precision**: Existing models often lack in-depth understanding of complex details, and express them imprecisely, when answering astronomy questions.
2. **Data update lag**: Because training datasets are refreshed infrequently, these models are slow to reflect recent advances in astronomical research.
3. **Limited conversational ability**: Existing models perform poorly in multi-turn conversations and cannot sustain high-quality interactions.

To address these issues, the paper proposes a model named AstroLLaMA-Chat. The 7B-parameter LLaMA-2 model is first continually pre-trained on a curated astronomy corpus and then fine-tuned on a domain-specific conversational dataset, with the aim of improving both question-answering and conversation quality in astronomy. The main objectives are:

- **Enhancing understanding of specialized topics**: Training on the abstracts, introductions, and conclusions of astronomy papers is intended to strengthen the model's grasp of complex astronomical concepts.
- **Timely reflection of the latest research findings**: Using recent astronomy datasets is intended to let the model capture current research developments.
- **Improving conversational ability**: Fine-tuning on a domain-specific conversational dataset is intended to improve the model's performance in multi-turn conversations.

In summary, the paper aims to develop a high-performance conversational model for astronomy through continual pre-training and fine-tuning, in order to address the shortcomings of existing large language models in this domain.
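The two data-preparation steps described in this summary (keeping only abstracts, introductions, and conclusions for continual pre-training, and flattening multi-turn dialogues for conversational fine-tuning) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the section keys, the `Turn` type, the helper functions, and the `User:`/`Assistant:` template are hypothetical and are not taken from the paper or from any AstroLLaMA code.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed section keys; the source only states that abstracts,
# introductions, and conclusions are used.
KEPT_SECTIONS = ("abstract", "introduction", "conclusion")


@dataclass
class Turn:
    role: str  # "user" or "assistant" (hypothetical role names)
    text: str


def build_pretraining_text(paper: Dict[str, str]) -> str:
    """Join only the kept sections of one paper, in a fixed order.

    Sections that are missing or empty are skipped.
    """
    parts = [
        paper[name].strip()
        for name in KEPT_SECTIONS
        if paper.get(name, "").strip()
    ]
    return "\n\n".join(parts)


def format_conversation(turns: List[Turn]) -> str:
    """Flatten a multi-turn dialogue into a single training string.

    The "User:"/"Assistant:" labels are an illustrative template only.
    """
    labels = {"user": "User", "assistant": "Assistant"}
    return "\n".join(f"{labels[t.role]}: {t.text.strip()}" for t in turns)


if __name__ == "__main__":
    paper = {
        "abstract": "We study X.",
        "methods": "Details.",  # dropped: not one of the kept sections
        "introduction": "Intro.",
        "conclusion": "Done.",
    }
    print(build_pretraining_text(paper))
    print(format_conversation([
        Turn("user", "What is a pulsar?"),
        Turn("assistant", "A rotating neutron star."),
    ]))
```

In a real pipeline the resulting strings would be tokenized and passed to a training loop; that part is omitted because the summary gives no details about it.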

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
