A Scalable Data Mining Architecture for Bioinformation

R Li, Z Zhang, S Cao, Y Zhu, Y Li
DOI: https://doi.org/10.2495/data030561
2004-01-01
Abstract: Bioinformatics is a new application field for data mining research, and data mining is in turn a promising tool for bioinformatics. Research has been done on how to bridge these two fields. With rapid progress in genetics and high-throughput biotechnologies, biological data of all kinds have been produced and accumulated explosively. Now that global views of DNA sequences, gene expression levels, and similar data are available on the genomic scale, it is becoming possible to discover the nature of life and to further advance biological and medical research. Achieving these goals, however, requires accurate large-scale data processing technologies. Data mining is known to be a powerful tool for this purpose, but it is still difficult to develop and apply a data mining system that supports different analysis functions. In this paper, a new scalable data mining architecture for bioinformation analysis, "The Architecture of Bioinformation Data Mining Application Platform", is proposed. It is intended to facilitate biologists' use of complex data mining technology and to support building a professional data mining system by conveniently developing, customizing, or trimming a system for a specific area of bioresearch. Logically, the architecture is composed of three tiers: a data mining algorithm tier, an analysis logic tier, and a professional application tier, with the mapping from data mining algorithm to analysis model to special application as its backbone. The architecture can be implemented using the distributed client/server model. A gene expression data mining system, BioMiner, is designed and implemented to demonstrate the architecture and validate its efficiency.
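The three-tier mapping described in the abstract (algorithm, analysis model, special application) might be pictured with the minimal Python sketch below. It is an illustration only and is not taken from the paper or from BioMiner: the names `ALGORITHMS`, `AnalysisModel`, and `highly_expressed_genes`, the scoring functions, and the sample data are all hypothetical, and the distributed client/server aspect is left out.

```python
from typing import Callable, Dict, List

# Hypothetical data shape: gene name -> list of expression levels.
Profile = Dict[str, List[float]]


# Tier 1 (data mining algorithm tier): generic, domain-agnostic routines.
def mean_of(values: List[float]) -> float:
    return sum(values) / len(values)


# Hypothetical registry of available algorithms, keyed by name.
ALGORITHMS: Dict[str, Callable[[List[float]], float]] = {
    "mean": mean_of,
    "max": max,
}


# Tier 2 (analysis logic tier): an analysis model binds an algorithm
# to a concrete analysis task, here "select genes scoring above a cutoff".
class AnalysisModel:
    def __init__(self, algorithm: str, cutoff: float) -> None:
        self.score = ALGORITHMS[algorithm]
        self.cutoff = cutoff

    def select(self, data: Profile) -> List[str]:
        return sorted(g for g, v in data.items() if self.score(v) >= self.cutoff)


# Tier 3 (application tier): a domain-specific entry point that hides
# the choice of algorithm and model from the biologist.
def highly_expressed_genes(data: Profile, cutoff: float = 2.0) -> List[str]:
    return AnalysisModel("mean", cutoff).select(data)


# Made-up sample data, for illustration only.
sample: Profile = {
    "geneA": [1.0, 3.0, 5.0],
    "geneB": [0.5, 1.0, 1.5],
    "geneC": [2.0, 2.0, 2.0],
}
result = highly_expressed_genes(sample)
print(result)
```

In this sketch, customizing or trimming the system amounts to registering a different algorithm in tier 1 or composing a different model in tier 2, while the tier 3 entry point stays unchanged for the end user.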