Abstract: Recently a binary/digital scheme based on the superspecificity property of transfer RNAs (tRNAs) was proposed for the identification of single amino acids (AAs) from binary-valued measurements (Eur. Phys. J. E 45, 94, 2022). The scheme has two formulations, which can be used to sequence short peptides and/or identify their parent proteins. In one of them an array of peptides is sequenced in 20 cycles by adding 20 different tRNAs carrying a fluorescent tag, optically recognizing the C-terminal residues, and cleaving the latter with a carboxypeptidase; the process is repeated over the peptides in parallel. Here this scheme is used to develop, in theory, a minimalist approach to protein identification that uses only two tRNAs and the carboxypeptidases A, B, and C. The latter form a complete and mutually exclusive set capable of cleaving all 20 AA types; this divides the 20 AAs into three classes. The sequences obtained are partial sequences in the reduced alphabet; their parent proteins can be found by searching a proteome database. The AA class of the terminal residue of every peptide in the array can be identified in a single cycle by using the three carboxypeptidases in the order C-B-A. With peptide lengths of ~20 and a cycle time of ~1 hour, the parent proteins of K peptides can be obtained in about 20 hours. This time is independent of K (within the limits imposed by the imaging method used) and of the dynamic range of the proteome; thus, in theory, a whole proteome can be processed in less than a day. Computational results suggest that the parent proteins of over 92% of peptides from the human proteome (UniProt ID UP000005640_9606) can be identified. When residues are skipped because a carboxypeptidase cleaves the second and later residues in delayed reactions, the identification rate is about 90% with 1 or 2 skips.
Full sequencing without skipped residues can be done by using all 20 tRNA types over 20 cycles in increasing order of cleavage time of the 20 AA types; a recursive procedure is given.
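The database-search step described above can be illustrated with a minimal sketch. It assumes a hypothetical three-class partition of the 20 AAs (the abstract does not list class membership, so the grouping below is a placeholder), a two-entry toy proteome with made-up sequences, and plain substring matching in place of whatever search procedure the paper actually uses. The function names `reduce_sequence` and `identify_parents` are likewise illustrative.

```python
# Hypothetical partition of the 20 amino acids into three classes, labeled
# after carboxypeptidases A, B, and C. Illustrative assumption only: the
# abstract does not specify which residues belong to which class.
ASSUMED_CLASSES = {
    "A": "FWYLIMVA",
    "B": "KR",
    "C": "CDEGHNPQST",
}
CLASS_OF = {aa: cls for cls, members in ASSUMED_CLASSES.items() for aa in members}

# Toy "proteome" with made-up sequences (not real proteins).
TOY_PROTEOME = {
    "P1": "MKTAYIAKQR",
    "P2": "GSHMLEDPKR",
}


def reduce_sequence(sequence):
    """Translate an amino-acid sequence into the three-letter class alphabet."""
    return "".join(CLASS_OF[aa] for aa in sequence)


def identify_parents(read_c_to_n, proteome):
    """Return IDs of proteins whose reduced sequence contains the read.

    read_c_to_n lists residue classes in the order they are observed, i.e.
    starting from the C-terminal residue, so it is reversed before matching
    against sequences written in the usual N-to-C direction.
    """
    code = read_c_to_n[::-1]
    return [pid for pid, seq in proteome.items() if code in reduce_sequence(seq)]


if __name__ == "__main__":
    # Classes read off the peptide AYIAK, C-terminal residue first.
    print(identify_parents("BAAAA", TOY_PROTEOME))
    # A single-residue read is ambiguous in this toy proteome.
    print(identify_parents("C", TOY_PROTEOME))
```

A read counts as identified when exactly one protein ID is returned. Short reads tend to match many proteins in the reduced alphabet, which is consistent with the abstract's use of peptides about 20 residues long.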

Single molecule identification and quantification of whole proteins without purification, proteolysis, or labeling: a computational model

Whole protein sequencing and quantification without proteolysis, terminal residue cleavage, or purification: A computational model

MoS2 nanopore identifies single amino acids with sub-1 Dalton resolution

Single-molecule protein identification by sub-nanopore sensors

A strategy to load, rethread and read protein sequences through a nanopore

Controlled Translocation of Proteins through a Biological Nanopore for Single-Protein Fingerprint Identification

Enzyme-less nanopore detection of post-translational modifications within long polypeptides

Ultra-high Sensitivity Mass Spectrometry Quantifies Single-Cell Proteome Changes Upon Perturbation

Multi-pass, single-molecule nanopore reading of long protein strands

Nanopore-Based Measurements of Protein Size, Fluctuations, and Conformational Changes

Light-Driven Conversion of Silicon Nitride Nanopore to Nanonet for Single-Protein Trapping Analysis

Full‐Length Single Protein Molecules Tracking and Counting in thin Silicon Channels

High-Resolution Probing of Conformational Changes in Proteins Through a Tug-of-War Between Electrokinetic Forces and Pressure-Driven Flow

Detection of Single Proteins with a General Nanopore Sensor

High Accuracy Protein Identification: Fusion of solid-state nanopore sensing and machine learning

Label-Free Optical Analysis of Biomolecules in Solid-State Nanopores: Toward Single-Molecule Protein Sequencing

A minimalist binary/digital approach to large-scale single molecule protein identification with optically labeled tRNAs and multiple carboxypeptidases and its extension to peptide sequencing

Multi-pass nanopore for single-molecule protein sequencing

Unclogged pores: designer channels for protein translocation

Paving the way to single-molecule protein sequencing

Amplifiable protein identification via residue-resolved barcoding and composition code counting