Abstract: Fine-tuning pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. Parameter-efficient fine-tuning (PEFT), a powerful and widely applied technique in natural language processing, could potentially enhance the performance of PLMs. However, direct transfer to life science tasks is non-trivial because of differences in training strategies and data forms. To address this gap, we introduce SES-Adapter, a simple, efficient, and scalable adapter method for enhancing the representation learning of PLMs. SES-Adapter combines PLM embeddings with structural sequence embeddings to create structure-aware representations. We show that the proposed method is compatible with different PLM architectures and with diverse tasks. Extensive evaluations are conducted on 2 types of folding structures with notable quality differences, 9 state-of-the-art baselines, and 9 benchmark datasets across distinct downstream tasks. Results show that, compared to vanilla PLMs, SES-Adapter improves downstream task performance by a maximum of 11% and an average of 3%, and significantly accelerates training, by a maximum of 1034% and an average of 362%; the convergence rate also improves approximately twofold. Moreover, positive optimization is observed even with low-quality predicted structures. The source code for SES-Adapter is available at

What problem does this paper attempt to address?

The paper mainly addresses the following issues:

1. **Enhancing the performance of Protein Language Models (PLMs) on downstream tasks**: By proposing a method called SES-Adapter, it aims to improve the performance of PLMs in various downstream prediction tasks, particularly by incorporating protein structure information to enhance representation learning.
2. **Efficient and lightweight fine-tuning strategy**: To address the lack of efficient fine-tuning methods in the protein domain, a simple, efficient, and scalable adapter method is proposed to improve the representation quality and training efficiency of PLMs.
3. **Effective integration of structural information**: It solves the potential negative optimization issues that may arise from directly adding structural information to PLMs by effectively integrating protein sequence and structure information through a novel method, thereby improving the performance on downstream tasks.

Specifically, SES-Adapter addresses these issues in the following ways:

- **Structure-aware representation**: This method combines the embeddings of PLMs with structural sequence embeddings to generate structure-aware representations, which help capture key information in protein structures, thereby improving the prediction performance of downstream tasks.
- **Compatibility and adaptability**: SES-Adapter is designed to be model-agnostic, applicable to different PLM architectures, and shows good compatibility and adaptability across various types of downstream tasks.
- **Fast convergence and accelerated training**: Experimental validation shows that this method not only significantly improves the performance of downstream tasks but also greatly accelerates training speed and improves convergence efficiency.
- **Error tolerance**: Even when using low-quality or predicted protein structures, SES-Adapter can effectively overcome inaccuracies in these structures, avoiding negative optimization.

In summary, the main contribution of the paper is the proposal of a novel structure-aware adapter method that significantly enhances the performance of PLMs in various downstream tasks. It is efficient and lightweight, making it an important advancement for the progress of protein research.
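
To make the idea of a structure-aware representation more concrete, the following is a minimal sketch of one way PLM embeddings could be fused with structural sequence embeddings. The summary above only states that the two are combined; the cross-attention with a residual connection used here, and every name in the code (`fuse`, `mean_pool`, `w_q`, `w_k`, `w_v`, the random stand-in embeddings), are illustrative assumptions and not the paper's confirmed implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def fuse(plm_emb, struct_emb, w_q, w_k, w_v):
    """Assumed fusion: PLM embeddings (queries) attend to structure embeddings
    (keys/values); the attended context is added back to the PLM embeddings."""
    q = matmul(plm_emb, w_q)
    k = matmul(struct_emb, w_k)
    v = matmul(struct_emb, w_v)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    attn = [softmax([s / scale for s in row]) for row in scores]
    context = matmul(attn, v)
    # Residual connection keeps the original sequence information intact.
    return [[p + c for p, c in zip(p_row, c_row)]
            for p_row, c_row in zip(plm_emb, context)]


def mean_pool(x):
    """Average per-residue vectors into one protein-level vector."""
    return [sum(col) / len(x) for col in zip(*x)]


rng = random.Random(0)
seq_len, dim = 5, 8
plm_emb = random_matrix(seq_len, dim, rng)     # stand-in for frozen PLM output
struct_emb = random_matrix(seq_len, dim, rng)  # stand-in for embedded structure tokens
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))

fused = fuse(plm_emb, struct_emb, w_q, w_k, w_v)
protein_vec = mean_pool(fused)  # would feed a small task-specific head
```

In a setup like this only the projection matrices and the task head would be trained while the PLM stays frozen, which is one plausible reason an adapter can be lightweight; whether SES-Adapter works exactly this way is not stated in the summary.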

Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models
