Hadoop Applications in Bioinformatics

Xubin Li, Wenrui Jiang, Yi Jiang, Quan Zou
DOI: https://doi.org/10.1109/OCS.2012.40
2012-01-01
Abstract: Bioinformatics faces a dilemma: traditional analysis tools struggle with the large-scale data produced by high-throughput sequencing. In recent years, the open-source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has given bioinformatics researchers an opportunity to obtain scalable, efficient, and reliable computing performance on Linux clusters and cloud computing services. In this paper, we present Hadoop-based applications employed in bioinformatics, covering next-generation sequencing and other biological domains. In addition, we discuss obstacles to, and future work on, Hadoop in bioinformatics.