Yoel Zimmermann, Adib Bazgir, Zartashia Afzal, Fariha Agbere, Qianxiang Ai, Nawaf Alampara, Alexander Al-Feghali, Mehrad Ansari, Dmytro Antypov, Amro Aswad, Jiaru Bai, Viktoriia Baibakova, Devi Dutta Biswajeet, Erik Bitzek, Joshua D. Bocarsly, Anna Borisova, Andres M Bran, L. Catherine Brinson, Marcel Moran Calderon, Alessandro Canalicchio, Victor Chen, Yuan Chiang, Defne Circi, Benjamin Charmes, Vikrant Chaudhary, Zizhang Chen, Min-Hsueh Chiu, Judith Clymo, Kedar Dabhadkar, Nathan Daelman, Archit Datar, Matthew L. Evans, Maryam Ghazizade Fard, Giuseppe Fisicaro, Abhijeet Sadashiv Gangan, Janine George, Jose D. Cojal Gonzalez, Michael Götte, Ankur K. Gupta, Hassan Harb, Pengyu Hong, Abdelrahman Ibrahim, Ahmed Ilyas, Alishba Imran, Kevin Ishimwe, Ramsey Issa, Kevin Maik Jablonka, Colin Jones, Tyler R. Josephson, Greg Juhasz, Sarthak Kapoor, Rongda Kang, Ghazal Khalighinejad, Sartaaj Khan, Sascha Klawohn, Suneel Kuman, Alvin Noe Ladines, Sarom Leang, Magdalena Lederbauer, Sheng-Lun Mark Liao, Hao Liu, Xuefeng Liu, Stanley Lo, Sandeep Madireddy, Piyush Ranjan Maharana, Shagun Maheshwari, Soroush Mahjoubi, José A. Márquez, Rob Mills, Trupti Mohanty, Bernadette Mohr, Seyed Mohamad Moosavi, Alexander Moßhammer, Amirhossein D. Naghdi, Aakash Naik, Oleksandr Narykov, Hampus Näsström, Xuan Vu Nguyen, Xinyi Ni, Dana O'Connor, Teslim Olayiwola, Federico Ottomano, Aleyna Beste Ozhan, Sebastian Pagel, Chiku Parida, Jaehee Park, Vraj Patel, Elena Patyukova, Martin Hoffmann Petersen, Luis Pinto, José M. Pizarro, Dieter Plessers, Tapashree Pradhan, Utkarsh Pratiush, Charishma Puli, Andrew Qin, Mahyar Rajabi, Francesco Ricci, Elliot Risch, Martiño Ríos-García, et al. (41 additional authors not shown)

Abstract: Here, we present the outcomes from the second Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry, which engaged participants across global hybrid locations, resulting in 34 team submissions. The submissions spanned seven key application areas and demonstrated the diverse utility of LLMs for applications in (1) molecular and material property prediction; (2) molecular and material design; (3) automation and novel interfaces; (4) scientific communication and education; (5) research data management and automation; (6) hypothesis generation and evaluation; and (7) knowledge extraction and reasoning from scientific literature. Each team submission is presented in a summary table with links to the code and as brief papers in the appendix. Beyond team results, we discuss the hackathon event and its hybrid format, which included physical hubs in Toronto, Montreal, San Francisco, Berlin, Lausanne, and Tokyo, alongside a global online hub to enable local and virtual collaboration. Overall, the event highlighted significant improvements in LLM capabilities since the previous year's hackathon, suggesting continued expansion of LLMs for applications in materials science and chemistry research. These outcomes demonstrate the dual utility of LLMs as both multipurpose models for diverse machine learning tasks and platforms for rapid prototyping of custom applications in scientific research.

What problem does this paper attempt to address?

This paper explores the application potential and practical effectiveness of large language models (LLMs) in materials science and chemistry. Specifically, it showcases the outcomes of the 2024 LLM hackathon, covering seven key application areas:

1. **Molecular and Material Property Prediction**: Use LLMs to predict the chemical and physical properties of molecules and materials. LLMs perform particularly well in low-data settings and can combine structured and unstructured data.
2. **Molecular and Material Design**: Use LLMs to generate and optimize new molecules and materials, including peptides, metal-organic frameworks, and sustainable building materials.
3. **Automation and New Interfaces**: Develop natural-language interfaces and automated workflows that simplify complex scientific tasks and make advanced tools and techniques more accessible to researchers.
4. **Scientific Communication and Education**: Improve the efficiency of academic communication, automate the creation of educational content, and support learning in materials science and chemistry.
5. **Research Data Management and Automation**: Streamline the handling, organization, and processing of scientific data through LLM-driven tools and multimodal agents.
6. **Hypothesis Generation and Evaluation**: Use LLMs to generate, evaluate, and validate scientific hypotheses, often in combination with multiple AI agents and statistical methods.
7. **Knowledge Extraction and Reasoning**: Extract structured information from scientific literature and perform complex reasoning about chemistry and materials science concepts through knowledge graphs and multimodal methods.

Through these projects, the paper aims to demonstrate the versatility and rapid-prototyping capabilities of LLMs in these fields. It also emphasizes the significant improvement in LLM capabilities since the previous hackathon and the broad prospects for LLMs in materials science and chemistry research.

Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon

LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation

LLMatDesign: Autonomous Materials Discovery with Large Language Models

Materials science in the era of large language models: a perspective

Beyond designer's knowledge: Generating materials design hypotheses via large language models

Assessment of Fine-Tuned Large Language Models for Real-World Chemistry and Material Science Applications

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Evaluating the Performance and Robustness of LLMs in Materials Science Q&A and Property Predictions

Are LLMs Ready for Real-World Materials Discovery?

An Interdisciplinary Outlook on Large Language Models for Scientific Research

A Prompt-Engineered Large Language Model, Deep Learning Workflow for Materials Classification

LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction

Large Language Models are Catalyzing Chemistry Education

LLM360: Towards Fully Transparent Open-Source LLMs

From Text to Insight: Large Language Models for Materials Science Data Extraction

MaScQA: Investigating Materials Science Knowledge of Large Language Models

From Words to Molecules: A Survey of Large Language Models in Chemistry

Regression with Large Language Models for Materials and Molecular Property Prediction

Quantum Many-Body Physics Calculations with Large Language Models