AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD
2023-09-12
Abstract: Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Instrumentation and Methods for Astrophysics, Cosmology and Nongalactic Astrophysics, Astrophysics of Galaxies, High Energy Astrophysical Phenomena, Computation and Language, Machine Learning
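
The perplexity comparison in the abstract can be made concrete with a minimal sketch. It assumes perplexity is computed in the usual way for causal language models, as the exponential of the mean negative log-likelihood per token. The function names and the log-probability values below are hypothetical and purely illustrative; they are not taken from the paper or from either model.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(base_ppl, tuned_ppl):
    """Fractional drop in perplexity of the tuned model relative to the base model."""
    return (base_ppl - tuned_ppl) / base_ppl


# Hypothetical per-token log-probabilities on the same held-out text
# (illustrative numbers only, not measurements from any real model).
base_log_probs = [-2.0, -2.0, -2.0, -2.0]
tuned_log_probs = [-1.5, -1.5, -1.5, -1.5]

base_ppl = perplexity(base_log_probs)
tuned_ppl = perplexity(tuned_log_probs)
reduction = relative_reduction(base_ppl, tuned_ppl)

print(f"base perplexity:  {base_ppl:.3f}")
print(f"tuned perplexity: {tuned_ppl:.3f}")
print(f"relative reduction: {reduction:.1%}")
```

Because perplexity is an exponential of the mean loss, a relative reduction depends only on the difference in mean per-token loss: under this definition, a 30% lower perplexity corresponds to a drop of ln(1/0.7), roughly 0.36 nats per token.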