Parallel Large Average Submatrices Biclustering Based on MapReduce.

Qin Lin, Yun Xue, Wen-Sheng Chen, Shu-qun Ye, Wan-li Li, Jing-jing Liu
DOI: https://doi.org/10.1109/cis.2015.40
2015-01-01
Abstract: Large Average Submatrices (LAS) is a biclustering algorithm that captures submatrices with large average values within a high-dimensional data matrix. It has gained popularity in fields such as biological data analysis and financial forecasting. However, large-scale data processing applications demand high performance, which makes parallel solutions for LAS biclustering highly desirable. In this paper, we propose an efficient parallel LAS biclustering algorithm based on MapReduce. Furthermore, we improve the search efficiency of LAS by using heap sort. Experimental results demonstrate that the proposed parallel algorithm achieves both high speedup and good scalability.
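The abstract does not spell out the search procedure, so the following is only a minimal single-machine sketch of the alternating row/column update commonly associated with LAS, with a heap used to pick the best rows or columns. It assumes a fixed submatrix size (k rows, l columns) and leaves out both the significance score and the MapReduce distribution. All function names (`top_k`, `submatrix_average`, `las_search`) are hypothetical, and `heapq.nlargest` performs heap-based selection rather than a full heap sort, so it may differ from what the authors implemented.

```python
import heapq


def top_k(matrix, fixed, k):
    """Return the sorted indices of the k rows of `matrix` whose sums
    over the index set `fixed` are largest (heap-based selection)."""
    scored = ((sum(row[j] for j in fixed), i) for i, row in enumerate(matrix))
    return sorted(i for _, i in heapq.nlargest(k, scored))


def submatrix_average(matrix, rows, cols):
    """Average of the entries in the submatrix rows x cols."""
    total = sum(matrix[i][j] for i in rows for j in cols)
    return total / (len(rows) * len(cols))


def las_search(matrix, k, l, init_cols, max_iter=100):
    """Alternate between choosing the best k rows for the current columns
    and the best l columns for the current rows until nothing changes.
    Assumption: k and l are fixed; the scoring step is omitted."""
    transposed = list(zip(*matrix))  # columns as rows, for column selection
    cols = sorted(init_cols)
    rows = top_k(matrix, cols, k)
    for _ in range(max_iter):
        new_cols = top_k(transposed, rows, l)
        new_rows = top_k(matrix, new_cols, k)
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols


# Toy data with a planted 2 x 2 block of large values in the top-left corner.
data = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
found_rows, found_cols = las_search(data, k=2, l=2, init_cols=[0, 2])
print(found_rows, found_cols, submatrix_average(data, found_rows, found_cols))
```

In a MapReduce setting, one plausible split would be to compute the row sums in the map phase and run the selection in the reduce phase, but that mapping is an assumption and is not taken from the paper.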