Construction and Application of a Large-Scale DNA Sequence Analysis System Based on PC Linux

CG Zhang, SG Ouyang, SW Zhang, XH Qu, YT Yu, GQ Zhou, SF Wu, FC He
2001-01-01
PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS
Abstract: Since the launch of the Human Genome Project, DNA sequences have been accumulating at an ever-increasing rate, and a powerful system for mining these data is badly needed. Using a personal computer running the Linux operating system, the Phred/Phrap/Consed and BLAST software packages were combined into a platform for batch analysis of sequences. The platform covers identification of raw DNA sequences from chromatogram files, vector sequence removal, contig analysis (sequence assembly), repeat sequence identification, and sequence similarity analysis. The results demonstrate that this robust platform can accelerate data analysis for large-scale DNA sequencing.
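The batch workflow listed in the abstract can be sketched as an ordered list of command lines. This is only an illustrative sketch, not the authors' scripts: the abstract names Phred, Phrap, Consed and BLAST, but the use of cross_match for vector removal and RepeatMasker for repeat identification is an assumption, as are the legacy `blastall` front end, all file names, and the helper `plan_pipeline`. The function only builds the commands; it does not run them.

```python
from pathlib import Path


def plan_pipeline(chromat_dir, vector_fasta, blast_db, workdir="work"):
    """Return ordered (stage, argv) pairs for one batch of chromatograms.

    All file names below are hypothetical; only the tool names Phred,
    Phrap and BLAST come from the abstract.
    """
    work = Path(workdir)
    reads = work / "reads.fasta"
    qual = work / "reads.fasta.qual"
    # cross_match -screen writes "<query>.screen" next to the query file.
    screened = work / "reads.fasta.screen"
    # phrap writes its consensus sequences to "<input>.contigs".
    contigs = work / "reads.fasta.screen.contigs"
    # RepeatMasker (assumed tool) writes "<input>.masked" by default.
    masked = work / "reads.fasta.screen.contigs.masked"
    hits = work / "contigs.blastn.out"

    return [
        # Base calling: chromatogram directory -> FASTA plus quality file.
        ("base calling",
         ["phred", "-id", str(chromat_dir), "-sa", str(reads), "-qa", str(qual)]),
        # Vector removal (assumed to use cross_match against a vector library).
        ("vector screening",
         ["cross_match", str(reads), str(vector_fasta),
          "-minmatch", "10", "-minscore", "20", "-screen"]),
        # Contig analysis; -new_ace produces an .ace file that Consed can open.
        ("assembly", ["phrap", str(screened), "-new_ace"]),
        # Repeat identification (assumed to use RepeatMasker).
        ("repeat masking", ["RepeatMasker", str(contigs)]),
        # Similarity analysis with the legacy NCBI BLAST front end.
        ("similarity search",
         ["blastall", "-p", "blastn", "-d", str(blast_db),
          "-i", str(masked), "-o", str(hits)]),
    ]


if __name__ == "__main__":
    for stage, argv in plan_pipeline("chromat_dir", "vector.seq", "nt"):
        print(f"{stage}: {' '.join(argv)}")
```

Each stage consumes the file produced by the previous one, which is what makes unattended batch processing possible. In a real run, each `argv` could be passed to `subprocess.run(argv, check=True)`, and the quality file would typically be copied alongside the screened reads so that Phrap can use it.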