Predicting Schizophrenia Related Genes through Large-scale Data Integration

LIANG Yan, SHI Yi-bin, TIAN Wei-dong
DOI: https://doi.org/10.15943/j.cnki.fdxb-jns.2013.02.017
2013-01-01
Abstract: Schizophrenia, with its complex heredity and multiple genetic factors, has long been a frontier of disease-gene research and an open puzzle in genetics. With the growth of omics data of all kinds and the publication of more and more single nucleotide polymorphism (SNP) data from genome-wide association studies (GWAS), it has become possible to integrate these large-scale data and use bioinformatics models to predict candidate schizophrenia-related genes for further experimental validation. Here, a bioinformatics model that combines the random forest machine learning algorithm with diverse biological databases, and whose performance was assessed as satisfactory in previous predictive studies, was used to predict schizophrenia-related genes. The predicted genes were then validated against significant SNP loci from GWAS in order to prioritize the genes most relevant to schizophrenia. In total, 33 candidate genes from the random forest model were validated with two GWAS datasets, and 10 candidate genes were finally retained as the most likely schizophrenia-related genes, which were further supported by a literature search.
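The two-stage idea in the abstract (score genes with a classifier, then keep those supported by GWAS evidence) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' procedure: the function name `prioritize_candidates`, the score cutoff, the rule that counts how many GWAS gene sets contain a candidate, and the placeholder gene names and scores, which are not real results.

```python
from typing import Dict, List, Set


def prioritize_candidates(
    scores: Dict[str, float],
    gwas_sets: List[Set[str]],
    score_cutoff: float = 0.5,
    min_gwas_support: int = 1,
) -> List[str]:
    """Return genes that pass the model cutoff and appear in enough GWAS sets.

    Hypothetical helper: the cutoff and support rule are assumptions.
    """
    # Step 1: keep genes the classifier scores at or above the cutoff.
    candidates = [g for g, s in scores.items() if s >= score_cutoff]
    # Step 2: count how many GWAS gene sets contain each candidate.
    support = {g: sum(g in s for s in gwas_sets) for g in candidates}
    # Step 3: keep supported genes, ordered by descending score, then by name.
    kept = [g for g in candidates if support[g] >= min_gwas_support]
    return sorted(kept, key=lambda g: (-scores[g], g))


# Placeholder inputs (hypothetical gene names and scores, not real data).
scores = {"GENE_A": 0.91, "GENE_B": 0.74, "GENE_C": 0.32, "GENE_D": 0.66}
gwas_1 = {"GENE_A", "GENE_C", "GENE_D"}
gwas_2 = {"GENE_A", "GENE_B"}

# Require support from both GWAS gene sets.
shortlist = prioritize_candidates(scores, [gwas_1, gwas_2], min_gwas_support=2)
print(shortlist)
```

Lowering `min_gwas_support` to 1 keeps any candidate found in at least one GWAS gene set, which gives a longer but less stringent shortlist.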