Towards better accuracy for missing value estimation of epistatic miniarray profiling data by a novel ensemble approach.

Xiao-Yong Pan, Ye Tian, Yan Huang, Hong-Bin Shen
DOI: https://doi.org/10.1016/j.ygeno.2011.03.001
IF: 4.31
2011-01-01
Genomics
Abstract: Epistatic miniarray profiling (E-MAP) is a powerful tool for analyzing gene functions and their biological relevance. However, E-MAP data suffers from a large proportion of missing values, which often leads to misleading and biased analysis results. Effective missing value estimation methods for E-MAP are therefore urgently needed. Although several independent algorithms can be applied to achieve this goal, their performance varies significantly across datasets, indicating that each algorithm has its own advantages and disadvantages. In this paper, we propose EMDI, a novel ensemble approach based on high-level diversity that imputes missing values using two global and four local base estimators. Experimental results on five E-MAP datasets show that EMDI outperforms every single base algorithm, demonstrating that an appropriate combination provides complementarity among different methods. Comparisons among several fusion strategies also demonstrate that the proposed high-level diversity scheme is superior to the others. EMDI is freely available at www.csbio.sjtu.edu.cn/bioinf/EMDI/.
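The general idea of ensemble imputation can be illustrated with a minimal sketch. This is not EMDI itself: the abstract does not name its six base estimators or specify the high-level diversity fusion scheme, so the example below assumes just two simple stand-ins (a row-mean filler as a "global"-style estimator and a nearest-rows filler as a "local"-style estimator) combined by a fixed weighted average. All function names, the weights, and the toy matrix are hypothetical.

```python
import numpy as np

def row_mean_impute(x):
    """Stand-in 'global'-style estimator (assumption): fill with the row mean."""
    filled = x.copy()
    row_means = np.nanmean(x, axis=1)  # assumes no row is entirely missing
    rows, cols = np.where(np.isnan(x))
    filled[rows, cols] = row_means[rows]
    return filled

def knn_impute(x, k=2):
    """Stand-in 'local'-style estimator (assumption): average the k nearest rows."""
    filled = x.copy()
    n_rows = x.shape[0]
    for i in range(n_rows):
        missing = np.isnan(x[i])
        if not missing.any():
            continue
        # Rank other rows by RMS distance over columns observed in both rows.
        dists = []
        for j in range(n_rows):
            if j == i:
                continue
            shared = ~np.isnan(x[i]) & ~np.isnan(x[j])
            if shared.any():
                d = np.sqrt(np.mean((x[i, shared] - x[j, shared]) ** 2))
                dists.append((d, j))
        dists.sort()
        for c in np.where(missing)[0]:
            vals = [x[j, c] for _, j in dists if not np.isnan(x[j, c])][:k]
            # Fall back to the row mean when no neighbour observes this column.
            filled[i, c] = np.mean(vals) if vals else np.nanmean(x[i])
    return filled

def ensemble_impute(x, weights=(0.5, 0.5)):
    """Weighted average of the base estimates, applied only to missing entries."""
    estimates = [row_mean_impute(x), knn_impute(x)]
    combined = sum(w * e for w, e in zip(weights, estimates))
    out = x.copy()
    mask = np.isnan(x)
    out[mask] = combined[mask]
    return out

# Toy matrix (made up for illustration); NaN marks a missing score.
scores = np.array([[1.0, 2.0, np.nan],
                   [1.0, 2.0, 3.0],
                   [5.0, 6.0, 7.0]])
imputed = ensemble_impute(scores)
```

Observed entries are left untouched and only the missing positions receive the combined estimate. A fixed average is the simplest possible fusion rule; the paper reports comparing several fusion strategies, which this sketch does not attempt to reproduce.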