Abstract:

Objectives: To identify a novel, generalizable diagnostic for acute respiratory distress syndrome using whole-blood gene expression arrays from multiple acute respiratory distress syndrome cohorts of varying etiologies.

Data sources: We performed a systematic search for human whole-blood gene expression arrays of acute respiratory distress syndrome in National Institutes of Health Gene Expression Omnibus and ArrayExpress. We also included the Glue Grant gene expression cohorts.

Study selection: We included investigator-defined acute respiratory distress syndrome within 48 hours of diagnosis and compared these with relevant critically ill controls.

Data extraction: We used multicohort analysis of gene expression to identify genes significantly associated with acute respiratory distress syndrome, both with and without adjustment for clinical severity score. We performed gene ontology enrichment using Database for Annotation, Visualization and Integrated Discovery and cell type enrichment tests for both immune cells and pneumocyte gene expression. Finally, we selected a gene set optimized for diagnostic power across the datasets and used leave-one-dataset-out cross validation to assess robustness of the model.

Data synthesis: We identified datasets from three adult cohorts with sepsis, one pediatric cohort with acute respiratory failure, and two datasets of adult patients with trauma and burns, for a total of 148 acute respiratory distress syndrome cases and 268 critically ill controls. We identified 30 genes that were significantly associated with acute respiratory distress syndrome (false discovery rate < 20% and effect size > 1.3), many of which had been previously associated with sepsis. When metaregression was used to adjust for clinical severity scores, none of these genes remained significant. Cell type enrichment was notable for bands and neutrophils, suggesting that the gene expression signature is one of acute inflammation rather than lung injury per se. Finally, an attempt to develop a generalizable diagnostic gene set for acute respiratory distress syndrome showed a mean area under the receiver-operating characteristic curve of only 0.63 on leave-one-dataset-out cross validation.

Conclusions: The whole-blood gene expression signature across a wide clinical spectrum of acute respiratory distress syndrome is likely confounded by systemic inflammation, limiting the utility of whole-blood gene expression studies for uncovering a generalizable diagnostic gene signature.
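
As an illustration of the leave-one-dataset-out cross validation mentioned in the abstract, the following is a minimal sketch: each cohort is held out in turn, a model is fit on the remaining cohorts, and the area under the receiver-operating characteristic curve is computed on the held-out cohort. The scoring rule (a sum of gene values signed by their direction in the training cohorts), the cohort names, and the expression values are hypothetical stand-ins chosen for brevity; they are not the study's model or data.

```python
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_directions(samples):
    """Per gene, record +1 if mean expression is higher in cases, else -1."""
    n_genes = len(samples[0][0])
    signs = []
    for g in range(n_genes):
        case_mean = mean(x[g] for x, y in samples if y == 1)
        ctrl_mean = mean(x[g] for x, y in samples if y == 0)
        signs.append(1 if case_mean >= ctrl_mean else -1)
    return signs

def leave_one_dataset_out(cohorts):
    """Hold out each cohort in turn, train on the rest, return AUC per cohort."""
    results = {}
    for held_out, test in cohorts.items():
        # Pool every sample from the cohorts that are not held out.
        train = [s for name, data in cohorts.items() if name != held_out for s in data]
        signs = fit_directions(train)
        # Hypothetical score: sum of gene values signed by training direction.
        scores = [sum(w * v for w, v in zip(signs, x)) for x, _ in test]
        results[held_out] = auc(scores, [y for _, y in test])
    return results

# Synthetic toy data (not from the study): cohort -> [(expression vector, label)],
# where label 1 = case and 0 = critically ill control.
cohorts = {
    "cohort_a": [([2.0, 1.0], 1), ([1.8, 1.2], 1), ([1.0, 2.0], 0), ([1.1, 1.9], 0)],
    "cohort_b": [([2.2, 0.9], 1), ([1.2, 1.5], 1), ([0.9, 2.1], 0), ([1.3, 1.4], 0)],
    "cohort_c": [([1.9, 1.1], 1), ([1.0, 1.0], 0), ([1.2, 2.2], 0)],
}
per_cohort = leave_one_dataset_out(cohorts)
mean_auc = mean(per_cohort.values())
```

Because the held-out cohort never contributes to model fitting, the mean of the per-cohort values estimates how well a signature transfers to an unseen cohort, which is the sense in which the abstract reports its mean area under the curve.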

Disease-specific classification using deconvoluted whole blood gene expression

Disease Classification Via Gene Network Integrating Modules and Pathways

A Robust Statistical Procedure to Discover Expression Biomarkers Using Microarray Genomic Expression Data.

DNA-based molecular classifiers for the profiling of gene expression signatures

Principal component analysis of blood microRNA datasets facilitates diagnosis of diverse diseases

Deciphering Cell-Type-Specific Gene Expression Signatures of Cardiac Diseases Through Reconstruction of Bulk Transcriptomes

A novel workflow for cancer blood biomarker identification

Genome-wide whole-blood transcriptome profiling across inherited bone marrow failure subtypes

A Linear Programming Model for Identifying Non-Redundant Biomarkers Based on Gene Expression Profiles

A Computational Approach to Identify Blood Cell-Expressed Parkinson's Disease Biomarkers That Are Coordinately Expressed in Brain Tissue

An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era

Standard machine learning approaches outperform deep representation learning on phenotype prediction from transcriptomics data

Systematic evaluation of the two main blood-based RNA-seq approaches for Mendelian disease diagnosis

Explainable models using transcription factor binding and epigenome patterns at promoters reveal disease-associated genes and their regulators in the context of cell-types

Multicohort Analysis of Whole-Blood Gene Expression Data Does Not Form a Robust Diagnostic for Acute Respiratory Distress Syndrome

The synchronous upregulation of a specific protein cluster in the blood predicts both colorectal cancer risk and patient immune status

Identification of a gene-expression predictor for diagnosis and personalized stratification of lupus patients

DiSignAtlas: an atlas of human and mouse disease signatures based on bulk and single-cell transcriptomics

Investigation of Variation in Gene Expression Profiling of Human Blood by Extended Principle Component Analysis.

Predicting gene signature in breast cancer patients with multiple machine learning models

Identification of Co-diagnostic Genes for Heart Failure and Hepatocellular Carcinoma Through WGCNA and Machine Learning Algorithms