DOP40 Inflammatory Bowel Disease single cell atlas construction to enable cell type-specific target identification

G De Baets,J D'Rozario,U Gehrman,J Mattsson,D Castiblanco,I Kaymak,A L Moldoveanu,C Giorgio,F Prandi,C Jawale,P Honsa,K Ruppova,M Valny,E Macuchova,E Freinkman,J Sponarova,A Platt,T Ort,D Corridoni
DOI: https://doi.org/10.1093/ecco-jcc/jjad212.0080
2024-01-01
Journal of Crohn's and Colitis
Abstract:

Background: Advances in single-cell technologies enable the unbiased study of cellular heterogeneity. Recently, single-cell RNA sequencing (scRNA-seq) has been utilised on intestinal and blood samples from patients with inflammatory bowel disease (IBD) and healthy individuals, often in conjunction with cell surface proteome and TCR repertoire analyses. These individual studies revealed novel cell subpopulations of immune, mesenchymal, and epithelial cells in UC and CD, but are not sufficiently powered to consistently identify granular cell subsets or establish their association with disease status and response to treatment. Integration of these studies into a unified IBD single-cell atlas would provide a more robust data foundation for therapeutic target discovery. Here, we constructed a comprehensive, multi-modal single-cell atlas of human IBD tissue, combining a large breadth of meta-analysis with the depth of single-cell resolution.

Methods: Raw data from 20 public datasets were curated and reprocessed to generate a harmonised data foundation to support downstream discovery efforts. Low-quality cells and doublets were removed using thresholds for gene number, gene count, and percentage of counts originating from mitochondrial genes, and the resulting data were normalised and adjusted for batch effects. Published clinical metadata from each study were re-annotated with controlled vocabularies. Associations between clinical metadata and cellular/molecular features were discovered using computational and ML approaches.

Results: We generated a large-scale integrated single-cell atlas for IBD comprising >500 tissue samples from >200 IBD patients and relevant controls, with harmonised clinical metadata including treatment history and response. This tissue atlas comprises >990k high-quality cells with granular annotations of 129 cell types/states. IBD inflammation and non-response to anti-TNF treatment were associated with unique transcriptional signatures in specific mononuclear phagocyte, CD4 T cell, and fibroblast subpopulations. Further prioritisation and validation of genes comprising these signatures may yield future therapeutic targets for IBD.

Conclusion: This atlas integrates single-cell data across the largest available collection of IBD patient-derived tissue samples. Leveraging high-resolution cell type annotations and harmonised clinical metadata, meta-analyses of this data foundation will broaden the understanding of IBD biology to identify novel targets and pathways for drug discovery.
gastroenterology & hepatology
What problem does this paper attempt to address?
The paper aims to enable cell type-specific target identification by constructing a single-cell atlas of inflammatory bowel disease (IBD). Specifically, it integrates multiple single-cell datasets into a comprehensive, multi-modal IBD single-cell atlas.

### Background

Single-cell RNA sequencing (scRNA-seq) has recently been applied to intestinal and blood samples from IBD patients and healthy individuals, often in conjunction with cell-surface proteome and T-cell receptor (TCR) repertoire analyses. These studies revealed novel subpopulations of immune, mesenchymal, and epithelial cells in ulcerative colitis (UC) and Crohn's disease (CD). Individually, however, they are not sufficiently powered to consistently identify granular cell subsets or to establish their association with disease status and treatment response. Integrating them into a unified IBD single-cell atlas would provide a more robust data foundation for therapeutic target discovery.

### Methods

Raw data from 20 public datasets were curated and reprocessed. Low-quality cells and doublets were removed using thresholds on the number of genes, the gene count, and the percentage of counts originating from mitochondrial genes. The remaining data were then normalized and adjusted for batch effects. Published clinical metadata from each study were re-annotated with controlled vocabularies, and associations between clinical metadata and cellular/molecular features were identified using computational and machine-learning approaches.
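The threshold-based filtering described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cutoff values, the `QCThresholds` and `passes_qc` names, and the toy cells are all hypothetical, since the abstract reports no thresholds or tooling. Mitochondrial genes are identified by the `MT-` prefix used in human gene symbols, and doublet detection is not covered here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class QCThresholds:
    # Hypothetical cutoffs for illustration; the abstract does not report values.
    min_genes: int = 200
    min_counts: int = 500
    max_mito_fraction: float = 0.2


def cell_qc_metrics(counts: dict[str, int]) -> tuple[int, int, float]:
    """Return (genes detected, total counts, mitochondrial fraction) for one cell."""
    n_genes = sum(1 for c in counts.values() if c > 0)
    total = sum(counts.values())
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    mito_fraction = mito / total if total > 0 else 0.0
    return n_genes, total, mito_fraction


def passes_qc(counts: dict[str, int], t: QCThresholds) -> bool:
    """A cell is kept only if it satisfies all three threshold conditions."""
    n_genes, total, mito_fraction = cell_qc_metrics(counts)
    return (
        n_genes > t.min_genes
        and total > t.min_counts
        and mito_fraction < t.max_mito_fraction
    )


# Toy data with deliberately tiny thresholds so the example fits on screen.
cells = {
    "cell_a": {"ACTB": 5, "GAPDH": 3, "MT-CO1": 1},  # passes all three checks
    "cell_b": {"ACTB": 1, "MT-CO1": 9},              # mitochondrial fraction too high
    "cell_c": {"ACTB": 1},                           # too few genes and counts
}
toy_thresholds = QCThresholds(min_genes=1, min_counts=5, max_mito_fraction=0.5)
kept = [name for name, counts in cells.items() if passes_qc(counts, toy_thresholds)]
```

In practice such cutoffs are usually chosen per dataset after inspecting the metric distributions, which is one reason to keep them in a single configurable object.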
### Results

The study generated a large-scale integrated IBD single-cell atlas comprising more than 500 tissue samples from more than 200 IBD patients and relevant controls, with harmonized clinical metadata including treatment history and response. The atlas contains more than 990,000 high-quality cells with granular annotations of 129 cell types or states. IBD inflammation and non-response to anti-TNF treatment were associated with unique transcriptional signatures in specific mononuclear phagocyte, CD4 T cell, and fibroblast subpopulations. Further prioritization and validation of the genes in these signatures may yield future therapeutic targets for IBD.

### Conclusions

The atlas integrates single-cell data across the largest available collection of IBD patient-derived tissue samples. Using high-resolution cell-type annotations and harmonized clinical metadata, meta-analyses of this data foundation will broaden the understanding of IBD biology and help identify novel targets and pathways for drug discovery.

### Formula Examples (if any)

The abstract does not involve specific mathematical formulas, but its data-processing steps can be expressed schematically, with a separate threshold for each quality criterion:

- Removal of low-quality cells and doublets:
\[ \text{Filtering condition} = (\text{Number of genes} > \text{Threshold}_{\text{genes}}) \land (\text{Gene count} > \text{Threshold}_{\text{counts}}) \land (\text{Proportion of mitochondrial counts} < \text{Threshold}_{\text{mito}}) \]
- Data normalization:
\[ \text{Normalized data} = f(\text{Original data}) \]
- Batch effect adjustment:
\[ \text{Adjusted data} = g(\text{Normalized data}) \]

where \(f\) and \(g\) denote the normalization function and the batch effect adjustment function, respectively.
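To make \(f\) and \(g\) concrete, the sketch below uses two stand-ins that the abstract does not specify. For \(f\) it uses library-size scaling followed by `log(1 + x)`, a common choice for scRNA-seq counts. For \(g\) it uses per-batch mean centering of each gene, which is far simpler than the integration methods typically used for atlases. The function names, the `target_sum` parameter, and the toy values are assumptions made for illustration only.

```python
import math


def normalize(cell_counts, target_sum=10_000.0):
    """f: scale one cell's counts to a fixed total, then apply log(1 + x)."""
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]
    return [math.log1p(c * target_sum / total) for c in cell_counts]


def center_by_batch(matrix, batches):
    """g: subtract each batch's per-gene mean (a toy stand-in for batch correction).

    `matrix` is a list of cells (rows), each a list of per-gene values;
    `batches` gives the batch label of each row.
    """
    if not matrix:
        return []
    n_genes = len(matrix[0])
    sums, sizes = {}, {}
    for row, batch in zip(matrix, batches):
        acc = sums.setdefault(batch, [0.0] * n_genes)
        for j, value in enumerate(row):
            acc[j] += value
        sizes[batch] = sizes.get(batch, 0) + 1
    return [
        [value - sums[batch][j] / sizes[batch] for j, value in enumerate(row)]
        for row, batch in zip(matrix, batches)
    ]


# Toy usage: three cells, two genes, two batches.
raw = [[1, 1], [3, 1], [0, 4]]
normalized = [normalize(cell) for cell in raw]
adjusted = center_by_batch(normalized, ["study_1", "study_1", "study_2"])
```

Mean centering removes only additive per-gene shifts between batches. It is shown here because it is easy to verify by hand, not because it would be adequate for integrating 20 heterogeneous datasets.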