Expectile Neural Networks for Genetic Data Analysis of Complex Diseases
Jinghang Lin, Xiaoran Tong, Chenxi Li, Qing Lu
DOI: https://doi.org/10.1109/tcbb.2022.3146795
2022-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: The genetic etiologies of common diseases are highly complex and heterogeneous. Classic methods, such as linear regression, have successfully identified numerous variants associated with complex diseases. Nonetheless, for most diseases, the identified variants account for only a small proportion of heritability, and discovering additional variants that contribute to complex diseases remains a challenge. Expectile regression is a generalization of linear regression and provides complete information on the conditional distribution of a phenotype of interest. Although expectile regression has many attractive properties, it has rarely been used in genetic research. In this paper, we develop an expectile neural network (ENN) method for genetic data analyses of complex diseases. Similar to expectile regression, ENN provides a comprehensive view of the relationships between genetic variants and disease phenotypes, which can be used to discover variants predisposing particular sub-populations to disease. We further integrate the idea of neural networks into ENN, making it capable of capturing non-linear and non-additive genetic effects (e.g., gene-gene interactions). Through simulations, we show that the proposed method outperforms an existing expectile regression method when genotype-phenotype relationships are complex. We also apply the proposed method to data from the Study of Addiction: Genetics and Environment (SAGE), investigating the relationships of candidate genes with smoking quantity.
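To make the sense in which expectile regression generalizes linear regression concrete, the sketch below shows the standard asymmetric squared loss that defines an expectile: at level tau = 0.5 it reduces to ordinary squared error (so the fitted value is the mean), while other levels weight positive and negative residuals differently. This is an illustration only, not the authors' ENN implementation; the function names `expectile_loss` and `sample_expectile` and the toy values are assumptions introduced here, and no neural network is involved.

```python
def expectile_loss(residuals, tau):
    """Mean asymmetric squared loss at expectile level tau in (0, 1).

    Residuals >= 0 are weighted by tau, negative residuals by 1 - tau.
    """
    total = 0.0
    for r in residuals:
        weight = tau if r >= 0 else 1.0 - tau
        total += weight * r * r
    return total / len(residuals)


def sample_expectile(values, tau, n_iter=100, tol=1e-10):
    """Tau-expectile of a sample via iteratively reweighted averaging.

    Hypothetical helper for illustration: repeatedly recomputes a
    weighted mean until the estimate stops changing.
    """
    m = sum(values) / len(values)  # the 0.5-expectile is the plain mean
    for _ in range(n_iter):
        weights = [tau if v >= m else 1.0 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


# Toy phenotype values (made up for illustration, not SAGE data).
phenotype = [1.0, 2.0, 3.0, 4.0, 10.0]
lower = sample_expectile(phenotype, 0.1)
center = sample_expectile(phenotype, 0.5)
upper = sample_expectile(phenotype, 0.9)
```

Here `center` coincides with the sample mean, while `lower` and `upper` summarize the lower and upper parts of the distribution. Fitting a model at several levels of tau in this way is what gives the broader view of the conditional distribution described in the abstract.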
computer science, interdisciplinary applications, biochemical research methods, mathematics, statistics & probability