Sensommatic: an efficient pipeline to mine and predict sensory receptor genes in the era of reference-quality genomes

Louise Ryan, Colleen Lawless, Graham M Hughes
DOI: https://doi.org/10.1093/bioinformatics/btae040
IF: 5.8
2024-01-01
Bioinformatics
Abstract: Summary: Sensory receptor gene families have undergone extensive expansion and loss across vertebrate evolution, leading to significant variation in receptor counts between species. However, due to their species-specific nature, conventional reference-based annotation tools often underestimate the true number of sensory receptors in a given species. While there has been an exponential increase in the taxonomic diversity of publicly available genome assemblies in recent years, only ∼30% of vertebrate species on the NCBI database are currently annotated. To overcome these limitations, we developed ‘Sensommatic’, an automated and accessible sensory receptor annotation pipeline. Sensommatic implements BLAST and AUGUSTUS to mine and predict sensory receptor genes from whole genome assemblies, adopting a one-to-many gene mapping approach. While designed for vertebrates, Sensommatic can be extended to run on non-vertebrate species by generating customized reference files, making it a scalable and generalizable tool. Availability and implementation: Source code and associated files are available at: https://github.com/GMHughes/Sensommatic
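The one-to-many mapping idea in the abstract (one reference receptor sequence matching several places in a genome) can be illustrated with a minimal sketch. This is not Sensommatic's actual code: the function names `parse_blast_tabular` and `candidate_loci`, the e-value cutoff, and the example hits are all hypothetical, and the input is assumed to be BLAST's standard 12-column tabular output (`-outfmt 6`). In a pipeline of this kind, each merged region would then be handed to a gene predictor such as AUGUSTUS; that step is not shown.

```python
from collections import defaultdict


def parse_blast_tabular(lines):
    """Yield (query, subject, start, end, strand, evalue) from BLAST outfmt 6 lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        sstart, send = int(fields[8]), int(fields[9])
        # In tabular output, a reverse-strand hit has sstart > send.
        strand = "+" if sstart <= send else "-"
        yield (fields[0], fields[1], min(sstart, send), max(sstart, send),
               strand, float(fields[10]))


def candidate_loci(lines, max_evalue=1e-10):
    """Map each reference query to a list of merged genomic regions.

    Hits of the same query on the same sequence and strand are merged
    when they overlap. The cutoff is illustrative, not a recommendation.
    """
    spans_by_key = defaultdict(list)
    for query, subject, start, end, strand, evalue in parse_blast_tabular(lines):
        if evalue <= max_evalue:
            spans_by_key[(query, subject, strand)].append((start, end))

    loci = defaultdict(list)
    for (query, subject, strand), spans in spans_by_key.items():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:          # overlapping hit: extend the region
                cur_end = max(cur_end, end)
            else:                         # disjoint hit: close the region
                loci[query].append((subject, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        loci[query].append((subject, cur_start, cur_end, strand))
    return dict(loci)


# Hypothetical hits for a single reference sequence "OR_ref".
example = [
    "OR_ref\tchr1\t95.0\t300\t15\t0\t1\t300\t1000\t1299\t1e-50\t400",
    "OR_ref\tchr1\t90.0\t200\t20\t0\t50\t250\t1200\t1399\t1e-30\t250",
    "OR_ref\tchr2\t88.0\t300\t36\t0\t1\t300\t5300\t5001\t1e-40\t300",
    "OR_ref\tchr3\t50.0\t40\t20\t0\t1\t40\t10\t49\t0.5\t20",
]
loci = candidate_loci(example)
for region in loci["OR_ref"]:
    print(region)
```

Here one reference query yields two candidate regions: the two overlapping chr1 hits collapse into a single region, the reverse-strand chr2 hit is kept separately, and the weak chr3 hit is filtered out.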
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology