ProTInSeq: transposon insertion tracking by ultra-deep DNA sequencing to identify translated large and small ORFs

Samuel Miravet-Verde,Rocco Mazzolini,Carolina Segura-Morales,Alicia Broto,Maria Lluch-Senar,Luis Serrano

DOI: https://doi.org/10.1038/s41467-024-46112-2

IF: 16.6

2024-03-07

Nature Communications

Abstract: Identifying open reading frames (ORFs) being translated is not a trivial task. ProTInSeq is a technique designed to characterize proteomes by sequencing transposon insertions engineered to express a selection marker when they occur in-frame within a protein-coding gene. In the bacterium Mycoplasma pneumoniae, ProTInSeq identifies 83% of its annotated proteins, along with 5 proteins and 153 small ORF-encoded proteins (SEPs; ≤100 aa) that were not previously annotated. Moreover, ProTInSeq can be utilized for detecting translational noise, as well as for relative quantification and transmembrane topology estimation of fitness and non-essential proteins. By integrating various identification approaches, the number of initially annotated SEPs in this bacterium increases from 27 to 329, with a quarter of them predicted to possess antimicrobial potential. Herein, we describe a methodology complementary to Ribo-Seq and mass spectroscopy that can identify SEPs while providing other insights in a proteome with a flexible and cost-effective DNA ultra-deep sequencing approach.

multidisciplinary sciences

What problem does this paper attempt to address?

This paper addresses the problem of identifying translated open reading frames (ORFs) in bacteria, in particular the small ORF-encoded proteins (SEPs) produced from small ORFs (smORFs). It introduces ProTInSeq, a technique that tracks transposon insertions by ultra-deep DNA sequencing: the transposon is engineered to express a selection marker only when it lands in-frame within a protein-coding gene, so surviving insertions report on translation. The method recovers annotated proteins, discovers previously unannotated proteins and SEPs, and can also be used to detect translational noise, perform relative quantification, and estimate the transmembrane topology of fitness and non-essential proteins. The main contribution is a high-throughput experimental method for verifying and characterizing SEPs, especially those that overlap larger genes. It complements Ribo-Seq and mass spectrometry, raises the identification rate of SEPs, and offers a tool for studying small proteins that may have important biological functions.
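The in-frame principle described above can be illustrated with a minimal sketch. It is not the authors' pipeline, which this page does not describe; the function names, the 0-based half-open coordinates, and the toy insertion positions are all assumptions made for illustration. The idea is only that, under selection for the marker, insertions in a translated ORF would be expected to pile up in the codon frame that keeps the marker in-frame.

```python
from collections import Counter

def frame_of_insertion(pos, orf_start, orf_end, strand="+"):
    """Codon offset (0, 1 or 2) of an insertion inside an ORF, or None if outside.

    Assumed convention: 0-based coordinates, orf_end exclusive.
    Offset 0 means the insertion sits at the first base of a codon.
    """
    if not (orf_start <= pos < orf_end):
        return None
    if strand == "+":
        return (pos - orf_start) % 3
    # Minus strand: count from the last base of the interval backwards.
    return (orf_end - 1 - pos) % 3

def frame_counts(insertions, orf_start, orf_end, strand="+"):
    """Count insertions falling in each codon offset of one ORF."""
    counts = Counter({0: 0, 1: 0, 2: 0})
    for pos in insertions:
        frame = frame_of_insertion(pos, orf_start, orf_end, strand)
        if frame is not None:
            counts[frame] += 1
    return counts

def in_frame_fraction(counts):
    """Fraction of insertions in offset 0 (hypothetical 'in-frame' class)."""
    total = sum(counts.values())
    return counts[0] / total if total else 0.0

# Toy data (invented): five insertions inside a candidate ORF, one outside.
insertions = [100, 103, 106, 107, 112, 250]
counts = frame_counts(insertions, orf_start=100, orf_end=160)
print(dict(counts), in_frame_fraction(counts))
```

Which offset actually corresponds to an in-frame marker fusion depends on the transposon design, so treating offset 0 as "in-frame" here is a placeholder. A real analysis would also need a statistical comparison against unselected insertions rather than a raw fraction.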

Transposon sequencing: A powerful tool for the functional genomic study of food-borne pathogens

Cryopreservation of mouse gametes and embryos.

Identification of new genes on a whole genome scale using saturated reporter transposon mutagenesis

Real-time and programmable transcriptome sequencing with PROFIT-seq

Proteins à la carte: riboproteogenomic exploration of bacterial N-terminal proteoform expression

Ultra-Sensitive Detection of Transposon Insertions Across Multiple Families by Transposable Element Display Sequencing

Precision run-on sequencing (PRO-seq) for microbiome transcriptomics

A Proteogenomics Approach Integrating Proteomics and Ribosome Profiling Increases the Efficiency of Protein Identification and Enables the Discovery of Alternative Translation Start Sites.

Enhancing Transcriptome Mapping with Rapid PRO-seq Profiling of Nascent RNA

ProBac-seq, a bacterial single-cell RNA sequencing methodology using droplet microfluidics and large oligonucleotide probe sets

Genome‐Wide Fitness and Genetic Interactions Determined by Tn‐seq, a High‐Throughput Massively Parallel Sequencing Method for Microorganisms

The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments

Rp3: Ribosome profiling-assisted proteogenomics improves coverage and confidence during microprotein discovery

De novo Identification of Actively Translated Open Reading Frames with Ribosome Profiling Data

Inducible transposon mutagenesis for genome-scale forward genetics

Mapping Transposon Insertion Sites by Inverse Polymerase Chain Reaction and Sanger Sequencing.

Prokaryotic single-cell RNA sequencing by in situ combinatorial indexing

Amplifiable protein identification via residue-resolved barcoding and composition code counting

Droplet Tn-Seq combines microfluidics with Tn-Seq for identifying complex single-cell phenotypes

sPepFinder expedites genome-wide identification of small proteins in bacteria
