Gene-Disease Association

E. E. Abdelbadeea, M. El-Dosuky, M. Rashad
DOI: https://doi.org/10.21608/mjcis.2020.321071
2020-12-01
Abstract: Disease susceptibility prediction is defined as follows. Given a training set S and a test case t ∉ S, expressed as a tuple (known SNP, unknown disease), predict the unknown disease with maximum accuracy. DisGeNET is a prominent dataset in disease susceptibility research. This paper reviews the comprehensive information in DisGeNET before introducing a proposed system that operates atop it. First, the dataset is vetted by consolidating it and removing genes with effects beyond a certain threshold. Second, the empirical cumulative distribution function is computed and used for plotting and printing gene associations for many diseases, including but not limited to Alzheimer, anemia, and brain and breast cancer. Proposed methods such as C4.5 and naïve Bayes give better accuracy than previous works.
Computer Science, Medicine, Biology
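
The two preprocessing steps named in the abstract (threshold filtering, then the empirical cumulative distribution function) can be illustrated with a minimal sketch. This is not the authors' implementation. The gene names, scores, threshold value, and the choice to drop scores above the threshold are all assumptions made for illustration, and the helper names `filter_by_threshold` and `ecdf` are hypothetical.

```python
from bisect import bisect_right


def filter_by_threshold(associations, max_score):
    """Keep (gene, score) pairs whose score does not exceed max_score.

    Assumption: "beyond a certain threshold" is read here as "above it".
    """
    return [(gene, score) for gene, score in associations if score <= max_score]


def ecdf(scores):
    """Return a function F where F(x) is the fraction of scores <= x."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("ecdf needs at least one score")

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F


# Hypothetical association scores, not taken from DisGeNET.
toy = [
    ("GENE_A", 0.10),
    ("GENE_B", 0.30),
    ("GENE_C", 0.30),
    ("GENE_D", 0.70),
    ("GENE_E", 0.95),
]

kept = filter_by_threshold(toy, max_score=0.90)
F = ecdf([score for _, score in kept])

# Print each retained gene with its score and ECDF value.
for gene, score in kept:
    print(f"{gene}\t{score:.2f}\t{F(score):.2f}")
```

With real data, the toy list would be replaced by the gene-disease pairs for one disease, and the ECDF values could be passed to a plotting library to draw the step curve.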
What problem does this paper attempt to address?