Abstract: The advancement of Large Language Models (LLMs) for domain applications in fields such as materials science and engineering depends on the development of fine-tuning strategies that adapt models for specialized, technical capabilities. In this work, we explore the effects of Continued Pretraining (CPT), Supervised Fine-Tuning (SFT), and various preference-based optimization approaches, including Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO), on fine-tuned LLM performance. Our analysis shows how these strategies influence model outcomes and reveals that merging multiple fine-tuned models can produce capabilities that surpass the individual contributions of the parent models. Model merging yields functionalities that neither parent model could achieve alone, which improves performance in domain-specific assessments. We present experiments with different model architectures, including Llama 3.1 8B and Mistral 7B models, and observe similar behaviors in both. To test whether the results also hold for much smaller models, we use a tiny LLM with 1.7 billion parameters and show that very small LLMs do not necessarily exhibit emergent capabilities under model merging, suggesting that model scale may be a key factor. In open-ended yet consistent chat conversations between a human and AI models, our assessment provides detailed insights into how different model variants perform and shows that the smallest model achieves a high intelligence score across key criteria including reasoning depth, creativity, clarity, and quantitative precision. Other experiments include the development of image generation prompts based on disparate biological material design concepts, used to create new microstructures, architectural concepts, and urban design based on construction principles inspired by biological materials.

Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging

Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction

Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models

HFT: Half Fine-Tuning for Large Language Models

Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement

PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance

Model Balancing Helps Low-data Training and Fine-tuning

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning

Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities

Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning

Parameter Competition Balancing for Model Merging

UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function