gWord: A tool for genome-wide word search and count

Jianming Xie, Xiao Sun, ZhiYuan Lu, Weiyan Xue, Xianjun Dong, Zuhong Lu
DOI: https://doi.org/10.1109/ICBBE.2007.84
2007-01-01
Abstract: Word search and count in whole-genome DNA sequence are important for re-sequencing an organism's genomic DNA with microarray-based technology and for studying functional elements with genome-wide approaches. A stand-alone program named gWord (short for genome Word) has been developed for this task. It applies a fast algorithm that maps each n-mer word to an index in memory, and it fulfills two main functions: counting all possible n-mer short DNA sequences in a genome, and reporting the locations of a given motif or of the words that occur only once in the genomic DNA. In addition, the search hits of any word are annotated with gene information. Two examples on the human genome demonstrate the application of gWord in genomics research. © 2007 IEEE.
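The abstract does not say how gWord maps an n-mer to an index, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes a 2-bit encoding per base (A=0, C=1, G=2, T=3), a direct-address table of size 4^n, and 0-based positions; the names `count_words`, `decode`, and `unique_words` are hypothetical and chosen for illustration.

```python
# Assumed 2-bit base encoding; gWord's actual mapping is not given in the abstract.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_words(sequence, n):
    """Count every n-mer in `sequence` using its integer code as a table index.

    Returns (counts, first_pos): both are lists of length 4**n, where
    first_pos[code] is the 0-based start of the first occurrence, or -1.
    """
    size = 4 ** n
    counts = [0] * size
    first_pos = [-1] * size
    mask = size - 1   # keeps only the lowest 2*n bits of the rolling code
    code = 0
    valid = 0         # consecutive unambiguous bases seen so far
    for i, base in enumerate(sequence.upper()):
        value = BASE_CODE.get(base)
        if value is None:
            # Ambiguous base (e.g. N): restart the window after it.
            code = 0
            valid = 0
            continue
        code = ((code << 2) | value) & mask
        valid += 1
        if valid >= n:
            if counts[code] == 0:
                first_pos[code] = i - n + 1
            counts[code] += 1
    return counts, first_pos


def decode(code, n):
    """Convert an integer code back to its n-mer string."""
    letters = "ACGT"
    return "".join(letters[(code >> (2 * (n - 1 - k))) & 3] for k in range(n))


def unique_words(sequence, n):
    """Return {word: position} for n-mers that occur exactly once."""
    counts, first_pos = count_words(sequence, n)
    return {decode(c, n): first_pos[c] for c, k in enumerate(counts) if k == 1}
```

Because the table has 4^n entries, this direct-address approach is only practical for short words; for longer words a hash table or a sorted index would be the usual substitute. The sketch also scans a single strand only and omits the gene annotation of hits that the abstract mentions.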