TaSer (TabAnno and SeqMiner): a toolset for annotating and querying next-generation sequence data

Xiaowei Zhan, D. J. Liu
2013-01-01
Abstract: Summary: We develop TaSer (TabAnno and SeqMiner), a toolkit for annotating and querying next-generation sequence (NGS) datasets stored in tab-delimited files. TabAnno is an efficient command-line tool designed to pre-process sequence data, annotate variations, and generate an indexed, feature-enriched project file that can integrate multiple sources of information. Using the project file generated by TabAnno, complex queries on the sequence dataset can be performed with SeqMiner, an R package designed to access large datasets efficiently. Extracted information can be conveniently viewed and analyzed with tools in R. TaSer is optimized and computationally more efficient than software that relies on database systems. It enables annotating and querying NGS datasets with moderate computing resources. Availability and implementation: TabAnno can be downloaded from GitHub (zhanxw.github.io/anno/). SeqMiner is distributed on CRAN (cran.r-project.org/web/packages/seqminer). Contact: X.Z. (zhanxw@umich.edu), D.J.L. (dajiang@umich.edu)