Abstract: Replicable signals from different yet conceptually related studies provide stronger scientific evidence and more powerful inference. We introduce STAREG, a statistical method for replicability analysis of high-throughput experiments, and apply it to spatial transcriptomic studies. STAREG uses summary statistics from multiple studies of high-throughput experiments and models the joint distribution of p-values, accounting for the heterogeneity of different studies. It effectively controls the false discovery rate (FDR) and gains power by borrowing information across studies. Moreover, it provides different rankings of important genes. By combining the EM algorithm with the pool-adjacent-violators algorithm (PAVA), STAREG scales to datasets with millions of genes and requires no tuning parameters. Analyzing two pairs of spatially resolved transcriptomic datasets, we make biological discoveries that cannot be obtained with existing methods.

Irreplicable research wastes time, money, and resources. An estimated $28 billion is spent every year in the United States alone on preclinical research that cannot be replicated. Possible causes of irreplicable research include experimental design, laboratory practices, and data analysis. We focus on data analysis. The past two decades have witnessed the expansion and increased availability of genomic data from high-throughput experiments. Because of privacy concerns or logistical reasons, raw data can be difficult to access, but summary data such as p-values are readily available. We introduce STAREG, which jointly analyzes p-values from multiple genomic datasets that target the same scientific question with different populations or different technologies. This yields more convincing and robust findings. STAREG is computationally scalable and rests on solid statistical analysis. Moreover, it is versatile, platform-independent, and requires only p-values as input. By analyzing datasets from spatially resolved transcriptomic studies, we make biological discoveries that cannot be obtained with existing methods.
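The abstract names PAVA but does not describe how it enters the EM iterations. As a generic illustration only, and not STAREG's own implementation, the sketch below shows what the pool-adjacent-violators algorithm computes: the weighted least-squares fit of a monotone sequence to noisy values. The function names `pava_nondecreasing` and `pava_nonincreasing` are hypothetical, and the weights are assumed to be strictly positive.

```python
from typing import List, Optional, Sequence


def pava_nondecreasing(
    y: Sequence[float], w: Optional[Sequence[float]] = None
) -> List[float]:
    """Weighted least-squares fit of a non-decreasing sequence to y.

    Generic PAVA sketch; weights are assumed strictly positive.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks: List[list] = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks for as long as they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each pooled block back to one fitted value per input point.
    fit: List[float] = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def pava_nonincreasing(
    y: Sequence[float], w: Optional[Sequence[float]] = None
) -> List[float]:
    """Non-increasing fit, obtained by negating the input and the output."""
    return [-v for v in pava_nondecreasing([-v for v in y], w)]


if __name__ == "__main__":
    print(pava_nondecreasing([1, 3, 2, 4]))  # pools the violating pair (3, 2)
    print(pava_nonincreasing([4, 2, 3, 1]))
```

Each input value is pushed onto a stack of blocks and merged backwards only while the ordering is violated, so the total work is linear in the number of values. That linear cost is consistent with the scalability the abstract claims, although the exact way STAREG applies the monotone constraint is not stated here.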

STAREG: Statistical replicability analysis of high throughput experiments with applications to spatial transcriptomic studies
