Abstract: Despite the superior capabilities of Multimodal Large Language Models (MLLMs) across diverse tasks, they still face significant trustworthiness challenges. Yet the current literature on assessing trustworthy MLLMs remains limited and lacks a holistic evaluation that could offer thorough insights for future improvements. In this work, we establish MultiTrust, the first comprehensive and unified benchmark on the trustworthiness of MLLMs across five primary aspects: truthfulness, safety, robustness, fairness, and privacy. Our benchmark employs a rigorous evaluation strategy that addresses both multimodal risks and cross-modal impacts, encompassing 32 diverse tasks with self-curated datasets. Extensive experiments with 21 modern MLLMs reveal previously unexplored trustworthiness issues and risks, highlighting the complexities introduced by multimodality and underscoring the need for advanced methodologies to enhance reliability. For instance, typical proprietary models still struggle with the perception of visually confusing images and are vulnerable to multimodal jailbreaking and adversarial attacks. MLLMs are also more inclined to disclose private information in text and to reveal ideological and cultural biases even when paired with irrelevant images at inference, indicating that multimodality amplifies the internal risks inherited from base LLMs. Additionally, we release a scalable toolbox for standardized trustworthiness research, aiming to facilitate future advancements in this important field. Code and resources are publicly available at https://multi-trust.github.io/.

MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models

MVKTrans: Multi-View Knowledge Transfer for Robust Multiomics Classification

A Co-Informative and Trustworthy Multi-Omics Integration Network for Diagnostic Prediction

TEMINET: A Co-Informative and Trustworthy Multi-Omics Integration Network for Diagnostic Prediction

MOINER: A Novel Multiomics Early Integration Framework for Biomedical Classification and Biomarker Discovery.

Integration of multi-omics data using adaptive graph learning and attention mechanism for patient classification and biomarker identification

MOTL: enhancing multi-omics matrix factorization with transfer learning

Integrating T-cell Receptor and Transcriptome for Large-Scale Single-Cell Immune Profiling Analysis

Multimodal CustOmics: A Unified and Interpretable Multi-Task Deep Learning Framework for Multimodal Integrative Data Analysis in Oncology

MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data

MOFA+: a probabilistic framework for comprehensive integration of structured single-cell data

MODILM: towards better complex diseases classification using a novel multi-omics data integration learning model

Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study

MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling

Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions

RethinkingTMSC: An Empirical Study for Target-Oriented Multimodal Sentiment Classification

Orthogonal multimodality integration and clustering in single-cell data

Dynamic Multimodal Information Bottleneck for Multimodality Classification

Multi-Omics Factor Analysis-a framework for unsupervised integration of multi-omics data sets