Abstract: Reinforcement learning from human feedback (RLHF) has proven effective in enhancing the instruction-following capabilities of large language models; however, it remains underexplored in the cross-modality domain. As the number of modalities increases, aligning all-modality models with human intentions, such as instruction following, becomes a pressing challenge. In this work, we make the first attempt to fine-tune all-modality models (i.e., models whose input and output can be of any modality, also called any-to-any models) using human preference data across all modalities (including text, image, audio, and video), ensuring that their behavior aligns with human intentions. This endeavor presents several challenges. First, existing open-source resources contain no large-scale all-modality human preference data, as most datasets are limited to specific modalities, predominantly text and image. Second, the effectiveness of binary preferences in RLHF for post-training alignment in complex all-modality scenarios remains unexplored. Third, there is no systematic framework for evaluating the capabilities of all-modality models, particularly regarding modality selection and synergy. To address these challenges, we propose the align-anything framework, which includes 200k meticulously annotated all-modality human preference data points. We then introduce an alignment method that learns from unified language feedback, effectively capturing complex modality-specific human preferences and enhancing the model's instruction-following capabilities. Furthermore, to assess performance improvements in all-modality models after post-training alignment, we construct a challenging all-modality capability evaluation framework, eval-anything. All data, models, and code frameworks have been open-sourced for the community.
For more details, please refer to https://github.com/PKU-Alignment/align-anything.

Decoding-Time Language Model Alignment with Multiple Objectives

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization

DeAL: Decoding-time Alignment for Large Language Models

Decoding-time Realignment of Language Models

Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts

MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time

It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization

PAD: Personalized Alignment of LLMs at Decoding-Time

Modality-Fair Preference Optimization for Trustworthy MLLM Alignment

Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference

MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples

On Diversified Preferences of Large Language Model Alignment

Aligning Large Language Models via Fine-grained Supervision

MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models

Language Model Decoding as Direct Metrics Optimization

Fast Best-of-N Decoding via Speculative Rejection

Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model

Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback