New Empirical Bayes Models to Jointly Analyze Multiple RNA-Sequencing Data in a Hypophosphatasia Disease Study

Dawson Kinsman, Jian Hu, Zhi Zhang, Gengxin Li
DOI: https://doi.org/10.3390/genes15040407
IF: 4.141
2024-03-27
Genes
Abstract: Hypophosphatasia is a rare inherited metabolic disorder caused by the deficiency of tissue-nonspecific alkaline phosphatase. More severe and early onset cases present symptoms of muscle weakness, diminished motor coordination, and epileptic seizures. These neurological manifestations are poorly characterized. Thus, it is urgent to discover novel differentially expressed genes for investigating the genetic mechanisms underlying the neurological manifestations of hypophosphatasia. RNA-sequencing data offer a high-resolution and highly accurate transcript profile. In this study, we apply an empirical Bayes model to RNA-sequencing data acquired from the spinal cord and neocortex tissues of a mouse model, individually, to more accurately estimate the genetic effects without bias. More importantly, we further develop two integration methods, weighted gene approach and weighted Z method, to incorporate two RNA-sequencing data into a model for enhancing the effects of genetic markers in the diagnostics of hypophosphatasia disease. The simulation and real data analysis have demonstrated the effectiveness of our proposed integration methods, which can maximize genetic signals identified from the spinal cord and neocortex tissues, minimize the prediction error, and largely improve the prediction accuracy in risk prediction.
genetics & heredity
What problem does this paper attempt to address?
This paper analyzes multiple RNA-sequencing datasets to uncover the genetic mechanisms of hypophosphatasia (HPP), in particular the gene expression changes associated with its neurological manifestations. Specifically, the authors aim to:

1. **Accurately estimate genetic effects**: Apply an empirical Bayes model to RNA-sequencing data from spinal cord and neocortex tissues to estimate genetic effects more accurately and without bias.
2. **Develop integration methods**: Propose two integration methods, the weighted gene approach and the weighted Z method, that combine RNA-sequencing data from the two tissues to enhance the effects of genetic markers for diagnosing HPP.
3. **Improve prediction performance**: By integrating multiple RNA-sequencing datasets, maximize the detected genetic signals, minimize the prediction error, and improve the accuracy of disease-risk prediction.

### Specific problem background

Hypophosphatasia (HPP) is a rare hereditary metabolic disorder caused by a deficiency of tissue-nonspecific alkaline phosphatase (TNAP). Severe and early-onset cases present with muscle weakness, diminished motor coordination, and epileptic seizures, and these neurological symptoms are poorly characterized. Discovering new differentially expressed genes is therefore crucial for studying the genetic mechanisms underlying HPP.

### Research methods

To achieve these goals, the authors used two RNA-sequencing datasets acquired from the spinal cord and neocortex tissues of an HPP mouse model and applied the following steps:

1. **Single-tissue analysis**: Apply the empirical Bayes model to the spinal cord and neocortex RNA-sequencing data separately to identify differentially expressed genes.
2. **Integration analysis**:
   - **Weighted gene method**: Compute a weight for each gene according to the importance of the gene effects detected in the spinal cord and neocortex data, then combine the gene effects from the two tissues.
   - **Weighted Z method**: Compute a weighted average of the Z-statistics from the spinal cord and neocortex data to enhance the genetic signal.
3. **Evaluation and verification**: Use simulation experiments and real-data analysis to evaluate the proposed integration methods and compare their performance with single-tissue analysis.

### Expected results

With these methods, the authors hope to:

- Capture the genetic signals related to HPP more strongly.
- Improve the accuracy of disease prediction and reduce prediction error.
- Provide new insights into the neurological manifestations of HPP.

In summary, the paper aims to mine the genetic information in RNA-sequencing data more thoroughly, using new statistical models and integration methods, in order to better understand the genetic mechanisms of HPP and improve its diagnosis.
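
To make the weighted Z idea concrete, the sketch below combines per-tissue Z-statistics for a single gene. This summary does not give the paper's exact weighting scheme, so the code assumes the classical Stouffer/Liptak form, in which the weighted sum is divided by the square root of the sum of squared weights so that the combined statistic is standard normal under the null when the inputs are independent. The function names, the weights, and the example values are all hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z(z_scores, weights):
    """Combine Z-statistics for one gene across tissues.

    Assumed form (Stouffer/Liptak), not necessarily the paper's:
        Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))
    """
    if len(z_scores) != len(weights):
        raise ValueError("z_scores and weights must have the same length")
    if not any(weights):
        raise ValueError("at least one weight must be non-zero")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a Z-statistic under a standard normal null."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical Z-statistics for one gene: spinal cord, then neocortex.
    z_by_tissue = [2.0, 1.5]
    # Hypothetical weights; equal weights treat both tissues alike.
    tissue_weights = [1.0, 1.0]
    z_combined = weighted_z(z_by_tissue, tissue_weights)
    print(z_combined, two_sided_p(z_combined))
```

With equal weights, two moderate signals that point in the same direction yield a combined statistic larger than either input, which is the sense in which integration can strengthen a signal shared by both tissues. If the two tissues come from the same animals, the independence assumption may not hold and the null distribution of the combined statistic would need adjusting.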