Benchmarking Self-Supervised Learning for Single-Cell Data

Philip Toma, Olga Ovcharenko, Imant Daunhawer, Julia Vogt, Florian Barkmann, Valentina Boeva
DOI: https://doi.org/10.1101/2024.11.04.620867
2024-11-06
Abstract: Self-supervised learning (SSL) has emerged as a powerful approach for learning biologically meaningful representations of single-cell data. To establish best practices in this domain, we present a comprehensive benchmark evaluating eight SSL methods across three downstream tasks and eight datasets, with various data augmentation strategies. Our results demonstrate that SimCLR and VICReg consistently outperform other methods across different tasks. Furthermore, we identify random masking as the most effective augmentation technique. This benchmark provides valuable insights into the application of SSL to single-cell data analysis, bridging the gap between SSL and single-cell biology.
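For readers unfamiliar with the augmentation named in the abstract, the following is a minimal sketch of random masking applied to a cell-by-gene expression matrix, producing the two augmented views that contrastive methods such as SimCLR or VICReg compare. It is an illustration under stated assumptions, not the authors' implementation: the function names `random_mask` and `two_views`, the default `mask_rate` of 0.2, and the choice to set masked entries to zero are all hypothetical and not taken from the paper.

```python
import numpy as np


def random_mask(x, mask_rate=0.2, rng=None):
    """Zero out a random subset of entries in an expression matrix.

    x         : array of shape (n_cells, n_genes)
    mask_rate : probability that an entry is masked (assumed value,
                not taken from the paper)
    rng       : optional numpy Generator for reproducibility
    """
    rng = np.random.default_rng() if rng is None else rng
    # Each entry is kept independently with probability 1 - mask_rate.
    keep = rng.random(x.shape) >= mask_rate
    return x * keep


def two_views(x, mask_rate=0.2, seed=0):
    """Return two independently masked views of the same cells."""
    rng = np.random.default_rng(seed)
    return random_mask(x, mask_rate, rng), random_mask(x, mask_rate, rng)


# Toy input: 4 cells, 50 genes, all counts equal to 1.
cells = np.ones((4, 50))
view_a, view_b = two_views(cells, mask_rate=0.5)
```

In an SSL pipeline, the two views of each cell would be passed through a shared encoder and the loss would encourage their embeddings to agree. The masking rate is a hyperparameter that would need tuning for a given dataset.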
Bioinformatics