Conserved 3' Stem-Loop Structures Enable Comprehensive Analysis of Bacterial Transcription Termination in Metagenomes, Regardless of Rho Factor Dependency

Yunfan Jin, Jiyun Cui, Hongli Ma, Fei Gan, Zhenjiang Zech Xu, Zhi John Lu
DOI: https://doi.org/10.1101/2023.10.02.560326
2024-12-08
Abstract: Bacterial transcription termination is a critical yet underexplored mechanism of gene regulation in microbial ecosystems. Existing computational tools, however, primarily focus on predicting Rho-independent terminators (RITs) in model species, leaving significant gaps in understanding Rho-dependent termination (RDT) and termination mechanisms in non-model species. To address these limitations, we developed BATTER (BActeria Transcript Three Prime End Recognizer), a comprehensive computational tool for bacterial transcript 3' termini prediction. BATTER builds on the observation that conserved stem-loop structures are frequently associated with 3' ends of primary transcripts generated by both RIT and RDT mechanisms across distantly related bacterial species. By leveraging Longformer (a transformer-based neural network model) with a CRF (Conditional Random Field) layer, BATTER demonstrated superior performance compared to existing tools. It enabled comprehensive analysis of 42,905 representative bacterial genomes, uncovering that stem-loop structures exhibit clade-specific properties, with greater variation between species than between gene families. Notably, BATTER uncovered that certain Cyanobacteria lineages, despite lacking Rho homologs, harbor Rho utilization (RUT) site-like sequences near 3' ends, with preliminary experimental validation in E. coli suggesting their partial functionality in transcription termination. Additionally, BATTER systematically identified pervasive premature termination events in antimicrobial resistance (AMR) genes, highlighting their regulatory roles in translation protection and drug efflux. This study advances our understanding of transcription termination across diverse bacterial lineages and provides a robust computational approach for exploring transcription regulation in complex microbial ecosystems.
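The abstract names a CRF layer on top of an encoder but does not describe how it is used. As a rough illustration only, the sketch below shows the generic Viterbi decoding step that a linear-chain CRF performs: given per-position label scores (here hand-written numbers standing in for encoder output) and label-to-label transition scores, it returns the single best label path. The two-label scheme (0 = background, 1 = terminator), the score values, and the function name `viterbi_decode` are assumptions made for this example; they are not taken from BATTER, whose label set and learned parameters are not given here.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label path for one sequence.

    emissions[t][k]   : score of label k at position t
    transitions[j][k] : score of moving from label j to label k
    """
    if not emissions:
        return []
    n_labels = len(emissions[0])
    score = list(emissions[0])          # best path score ending in each label
    backptr: List[List[int]] = []       # best previous label per position
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_prev = max(range(n_labels),
                            key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_prev] + transitions[best_prev][k]
                             + emissions[t][k])
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final label.
    best = max(range(n_labels), key=lambda k: score[k])
    path = [best]
    for pointers in reversed(backptr):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Hypothetical scores for 7 positions; label 0 = background, 1 = terminator.
emissions = [[2, 0], [2, 0], [0, 1.5], [1, 0.5], [0, 1.5], [2, 0], [2, 0]]
# Hypothetical transition scores: switching labels costs 1.
transitions = [[0, -1], [-1, 0]]

independent = [max(range(2), key=lambda k: e[k]) for e in emissions]
decoded = viterbi_decode(emissions, transitions)
print(independent)  # per-position argmax, ignoring neighbours
print(decoded)      # jointly decoded path
```

In this toy input the per-position argmax flips back to background for one position in the middle of the terminator region, while the jointly decoded path keeps a single contiguous region. That smoothing of label sequences is the usual motivation for pairing a sequence encoder with a CRF; in a trained model the transition scores would be learned rather than set by hand.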
Bioinformatics