Exploiting Generative Self-Supervised Learning For The Assessment of Biological Images With Lack of Annotations: A COVID-19 Case-Study

Alessio Mascolini, Dario Cardamone, Francesco Ponzio, Santa Di Cataldo, Elisa Ficarra
DOI: https://doi.org/10.21203/rs.3.rs-757777/v1
2021-08-12
Abstract: Computer-aided analysis of biological images typically requires extensive training on large-scale annotated datasets, which is not viable in many situations. In this paper, we present GAN-DL, a Discriminator Learner based on the StyleGAN2 architecture, which we employ for self-supervised image representation learning on fluorescent biological images. We show that Wasserstein Generative Adversarial Networks combined with linear Support Vector Machines enable high-throughput compound screening based on raw images. We demonstrate this by classifying active and inactive compounds tested for the inhibition of SARS-CoV-2 infection in VERO and HRCE cell lines. In contrast to previous methods, our deep learning-based approach does not require any annotation besides the one that is normally collected during the sample preparation process. We test our technique on the RxRx19a SARS-CoV-2 image collection. The dataset consists of fluorescent images that were generated to assess the ability of compounds that are regulatory-approved or in late-stage clinical trials to modulate in vitro infection by SARS-CoV-2 in both VERO and HRCE cell lines. We show that our technique can be exploited not only for classification tasks but also to effectively derive a dose-response curve for the tested treatments, in a self-supervised manner. Lastly, we demonstrate its generalization capabilities by successfully addressing a zero-shot learning task, consisting of the categorization of four different cell types from the RxRx1 fluorescent image collection.
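The classification stage described in the abstract, a linear Support Vector Machine applied to learned image representations, can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings are synthetic Gaussian vectors standing in for features from a trained discriminator, and the helper name make_synthetic_embeddings, the embedding dimension, the class shift, and the SVM settings are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC


def make_synthetic_embeddings(n_per_class=200, dim=64, seed=0):
    """Hypothetical stand-in for discriminator features (not real image data)."""
    rng = np.random.default_rng(seed)
    inactive = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    # Assumption: "active" samples are shifted by 1.0 in every dimension,
    # so the two classes are linearly separable up to noise.
    active = rng.normal(1.0, 1.0, size=(n_per_class, dim))
    X = np.vstack([inactive, active])
    y = np.concatenate([
        np.zeros(n_per_class, dtype=int),  # 0 = inactive
        np.ones(n_per_class, dtype=int),   # 1 = active
    ])
    return X, y


X, y = make_synthetic_embeddings()

# Hold out a quarter of the samples, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Linear SVM on the fixed embeddings; the feature extractor is not retrained.
clf = LinearSVC(C=1.0, max_iter=10000)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
# Signed distance of each held-out sample to the separating hyperplane.
scores = clf.decision_function(X_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

In this sketch only the linear classifier sees labels; the representation itself is assumed to come from a model trained without them, which is the division of labour the abstract describes.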