Abstract: The advancement of Large Language Models (LLMs) for domain applications in fields such as materials science and engineering depends on the development of fine-tuning strategies that adapt models for specialized, technical capabilities. In this work, we explore the effects of Continued Pretraining (CPT), Supervised Fine-Tuning (SFT), and various preference-based optimization approaches, including Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO), on fine-tuned LLM performance. Our analysis shows how these strategies influence model outcomes and reveals that merging multiple fine-tuned models can lead to the emergence of capabilities that surpass the individual contributions of the parent models. We find that model merging yields new functionalities that neither parent model could achieve alone, improving performance in domain-specific assessments. Experiments with different model architectures are presented, including Llama 3.1 8B and Mistral 7B models, where similar behaviors are observed. To test whether the results also hold for much smaller models, we use a tiny LLM with 1.7 billion parameters and show that very small LLMs do not necessarily feature emergent capabilities under model merging, suggesting that model scaling may be a key component. In open-ended yet consistent chat conversations between a human and AI models, our assessment reveals detailed insights into how different model variants perform and shows that the smallest model achieves a high intelligence score across key criteria including reasoning depth, creativity, clarity, and quantitative precision. Other experiments include the development of image generation prompts based on disparate biological material design concepts, to create new microstructures, architectural concepts, and urban design based on biological materials-inspired construction principles.

Tracking Universal Features Through Fine-Tuning and Model Merging

Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities

How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning

Merging Text Transformer Models from Different Initializations

An Emulator for Fine-Tuning Large Language Models using Small Language Models

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Tuning Language Models by Mixture-of-Depths Ensemble

Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model

Talking Heads: Understanding Inter-layer Communication in Transformer Language Models

The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives

LM-Cocktail: Resilient Tuning of Language Models via Model Merging

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking

Making the Most of your Model: Methods for Finetuning and Applying Pretrained Transformers

From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain

Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model

Transformer Vision-Language Tracking via Proxy Token Guided Cross-Modal Fusion

Integrating Discrete and Neural Features via Mixed-Feature Trans-Dimensional Random Field Language Models

Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process

Unifying the Convergences in Multilingual Neural Machine Translation

Inheritune: Training Smaller Yet More Attentive Language Models

Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification