ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs

Samuel Miravet-Verde, Rocco Mazzolini, Carolina Segura-Morales, Alicia Broto, Maria Lluch-Senar, Luis Serrano
DOI: https://doi.org/10.1038/s41467-024-46112-2
IF: 16.6
2024-03-07
Nature Communications
Abstract: Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectroscopy that can identify SEPs while providing other insights in a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the difficulty of identifying which open reading frames (ORFs) are actually translated in bacteria, in particular the small ORF-encoded proteins (SEPs) produced from small open reading frames (smORFs). It introduces ProTInSeq, a technique that identifies translated large and small ORFs by tracking transposon insertions with ultra-deep DNA sequencing. The method identifies annotated proteins and also discovers previously unannotated proteins and SEPs. It can further be used to detect translational noise, to obtain relative quantitative information on the proteome, and to estimate and verify the predicted transmembrane topology of fitness and non-essential membrane proteins. The main contribution is a high-throughput experimental method for verifying and characterizing SEPs, especially those that overlap with larger genes. The method complements Ribo-Seq and mass spectrometry, improves the identification rate of SEPs, and offers a tool for studying small proteins that may have important biological functions.
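The abstract states that the transposon expresses its selection marker only when it inserts in-frame within a protein-coding gene. The toy sketch below illustrates what "in-frame" means in terms of coordinates. It is not the authors' pipeline: the `Orf` class, the function names, and the convention that an insertion is described by a single 0-based genome position and a strand are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record (not taken from the paper)."""
    name: str
    start: int   # 0-based position of the lowest-coordinate base
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def insertion_frame(orf: Orf, position: int, strand: str) -> Optional[int]:
    """Return the frame offset (0, 1, or 2) of an insertion relative to the
    ORF's first codon, or None if the insertion is outside the ORF or on
    the opposite strand."""
    if strand != orf.strand or not (orf.start <= position < orf.end):
        return None
    if orf.strand == "+":
        offset = position - orf.start
    else:
        # On the minus strand the ORF is read from its highest coordinate.
        offset = (orf.end - 1) - position
    return offset % 3


def is_in_frame(orf: Orf, position: int, strand: str) -> bool:
    """True when the insertion shares the ORF's strand and reading frame."""
    return insertion_frame(orf, position, strand) == 0


toy = Orf(name="toy_smORF", start=100, end=190, strand="+")
print(is_in_frame(toy, 103, "+"))  # insertion one codon downstream of the start
print(is_in_frame(toy, 104, "+"))  # insertion shifted by one base
```

Under these assumptions, only about one in three same-strand insertions inside an ORF would be in-frame, which is one reason a very deep sequencing readout is useful for covering short ORFs.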