Gene Sequence Alignment on a Public Computing Platform

Stephen Pellicer, Nova Ahmed, Yi Pan, Yao Zheng
DOI: https://doi.org/10.1109/ICPPW.2005.35
2005-01-01
Abstract: Public computing can potentially supply not only computational power but also memory and short-term storage resources to grid- and cluster-scale problems. Gene sequence alignment is a fundamental computational challenge in bioinformatics, with attributes such as moderate computational requirements, extensive memory requirements, and highly interdependent tasks. This study examines the performance of calculating the alignment of two 100,000-base sequences on a public computing platform utilizing the BOINC framework. Compared to the theoretical, optimal sequential implementation, the parallel implementation achieves a speedup factor of 1.4 at the point of maximum parallelism and ends with a speedup of 1.2. This speedup factor is based on extrapolating the sequential performance measured on a segment of the problem. The extrapolation assumes a theoretical sequential machine with approximately 37.3 GB of working memory; a machine with less would suffer performance degradation from using secondary storage during the calculation.
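The memory figure can be checked with a back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes a full dynamic-programming score matrix with one 4-byte cell per pair of positions plus one boundary row and column, and the function name `full_matrix_bytes` is hypothetical. Under those assumptions the estimate lands close to the 37.3 GB quoted in the abstract, which suggests, but does not confirm, that the authors counted memory in a similar way.

```python
GIB = 2 ** 30  # bytes per binary gigabyte


def full_matrix_bytes(len_a: int, len_b: int, bytes_per_cell: int = 4) -> int:
    """Bytes needed to hold a full (len_a + 1) x (len_b + 1) score matrix.

    The 4-byte cell size is an assumption, not a figure from the abstract.
    """
    return (len_a + 1) * (len_b + 1) * bytes_per_cell


# Two 100,000-base sequences, as in the abstract.
estimate = full_matrix_bytes(100_000, 100_000)
print(f"{estimate} bytes = {estimate / GIB:.1f} GiB")
```

The quadratic growth is the point here: doubling both sequence lengths roughly quadruples the matrix, which is why the problem is dominated by memory rather than by arithmetic.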