Evaluation of Time-of-Flight Secondary Ion Mass Spectrometry Spectra of Peptides by Random Forest with Amino Acid Labels: Results from a Versailles Project on Advanced Materials and Standards Interlaboratory Study
Satoka Aoyagi, Yukio Fujiwara, Akio Takano, Jean-Luc Vorng, Ian S. Gilmore, Yung-Chen Wang, Elke Tallarek, Birgit Hagenhoff, Shin-ichi Iida, Andreas Luch, Harald Jungnickel, Yusheng Lang, Hyun Kyong Shon, Tae Geol Lee, Zhanping Li, Kazuhiro Matsuda, Ichiro Mihara, Ako Miisho, Yohei Murayama, Takaharu Nagatomi, Reiko Ikeda, Masayuki Okamoto, Kunio Saiga, Toshihiko Tsuchiya, Shigeaki Uemura
DOI: https://doi.org/10.1021/acs.analchem.0c04577
IF: 7.4
2021-02-26
Analytical Chemistry
Abstract: We report the results of a VAMAS (Versailles Project on Advanced Materials and Standards) interlaboratory study on the identification of peptide sample TOF-SIMS spectra by machine learning. More than 1000 time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of six peptide model samples (one of which was a test sample) were collected using 27 TOF-SIMS instruments from 25 institutes in six countries: the U.S., the U.K., Germany, China, South Korea, and Japan. Peptides were selected as model samples because they have systematic and simple chemical structures. The peak intensities in every TOF-SIMS spectrum were extracted using the same peak list and normalized to the total ion count. The spectra of the test peptide sample were predicted by Random Forest with 20 amino acid labels. The accuracy of the prediction for the test spectra was 0.88. Although the prediction of an unknown peptide was not perfect, it was shown that all of the amino acids in an unknown peptide can be determined by Random Forest prediction and the TOF-SIMS spectra. Moreover, the prediction of peptides that are included in the training spectra was almost perfect. Among the important features, Random Forest also suggests specific fragment ions from the amino acid residue Q, for which fragment ions detected by TOF-SIMS have not previously been reported. This study indicates that analysis using Random Forest, which enables translation of mathematical relationships into chemical relationships, together with multiple labels representing monomer chemical structures, is useful for predicting the TOF-SIMS spectra of an unknown peptide.
Supporting Information: The Supporting Information is available free of charge at <a class="ext-link" href="/doi/10.1021/acs.analchem.0c04577?goto=supporting-info">https://pubs.acs.org/doi/10.1021/acs.analchem.0c04577</a>. Contents: (Note 1) VAMAS Technical Working Area 2, surface chemical analysis, A26 protocol; (Note 2) the peak list for the descriptor; (Note 3) main parameters for random forest (3.2.4.3.1. sklearn.ensemble.RandomForestClassifier); (Figure 1) typical TOF-SIMS spectra of the peptide samples obtained with Bi<sub>3</sub><sup>2+</sup>, (a) spectra obtained with TOF.SIMS5 (ION-TOF GmbH) and (b) spectra obtained with PHI nanoTOF II (Ulvac-Phi); (Figure 2) typical TOF-SIMS spectra of S3 peptide (Angiotensin II) by Ga<sup>+</sup>, Bi<sub>3</sub><sup>2+</sup>, C<sub>60</sub><sup>2+</sup>, and Ar<sub>1000</sub><sup>+</sup>, where the mass ranges of the upper and lower figures are 0–200 and 0–1100 u, respectively; (Figure 3) typical unknown peptide spectrum by Bi<sub>3</sub><sup>2+</sup>, (a) log-scale spectrum and (b) magnified spectrum in a low mass range (0–200 u); (Figure 4) typical spectra of S2 and S4 peptides by Bi<sub>3</sub><sup>2+</sup> in a low mass range (0–200 u); (Figure 5) workflow diagram of the identification process; (Table 1) list of instruments, peak list software, primary ion sources, primary ion energy, and raster size used by the participants in this study; (Table 2) example of the peak list; (Table 3) typical data format (labels and descriptors); (Table 4) the existence of the amino acids in the reference samples (S1, S2, S3, S4, and S5); (Table 5) wrongly predicted spectra (interpolation); (Table 6) known peptide (interpolation) prediction (gray: wrongly predicted labels); (Table 7) Random Forest prediction of unknown peptide (gray: wrongly predicted labels); and (Table 8) amino acid fragment ions observed in TOF-SIMS spectra [refs 5, 26–29] (<a class="ext-link" href="/doi/suppl/10.1021/acs.analchem.0c04577/suppl_file/ac0c04577_si_001.pdf">PDF</a>)
chemistry, analytical
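
The workflow summarized in the abstract (peak intensities taken from a shared peak list, normalization to the total ion count, then Random Forest prediction with 20 amino acid labels) can be illustrated with a minimal sketch. This is not the authors' code. The spectra are synthetic, the peptide sequences are arbitrary placeholders rather than the study's samples, the 40-entry peak list and the one-marker-peak-per-residue model are invented for illustration, and the classifier settings are scikit-learn choices made here, not the parameters listed in Note 3 of the Supporting Information.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")  # the 20 one-letter amino acid labels
N_PEAKS = 40  # hypothetical peak-list length, not the study's peak list

# Placeholder training sequences (assumption): NOT the peptides used in the study.
TRAIN_PEPTIDES = ["DRVYIHPF", "GGYR", "ACDEFGHIK", "LMNPQ", "STVWY"]


def normalize_to_total_ion_count(intensities):
    """Divide every spectrum (row) by the sum of its peak intensities."""
    intensities = np.asarray(intensities, dtype=float)
    return intensities / intensities.sum(axis=1, keepdims=True)


def peptide_to_labels(sequence):
    """Return a 0/1 vector marking which of the 20 amino acids occur."""
    return np.array([int(aa in sequence) for aa in AMINO_ACIDS])


def simulate_spectra(sequence, n_spectra, rng):
    """Synthetic stand-in data: Poisson background plus one invented
    marker peak for each amino acid type present in the sequence."""
    counts = rng.poisson(5.0, size=(n_spectra, N_PEAKS)).astype(float)
    for aa in set(sequence):
        counts[:, AMINO_ACIDS.index(aa)] += 200.0
    return counts


rng = np.random.default_rng(0)
X_train = np.vstack([simulate_spectra(p, 20, rng) for p in TRAIN_PEPTIDES])
Y_train = np.vstack(
    [np.tile(peptide_to_labels(p), (20, 1)) for p in TRAIN_PEPTIDES]
)

# A 2-D label matrix makes scikit-learn fit one output per amino acid label.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(normalize_to_total_ion_count(X_train), Y_train)

# Interpolation: new spectra of a peptide that is in the training set.
X_known = normalize_to_total_ion_count(simulate_spectra("GGYR", 5, rng))
known_pred = model.predict(X_known)
known_accuracy = (known_pred == peptide_to_labels("GGYR")).mean()

# Unknown peptide: a placeholder sequence absent from the training set.
X_unknown = normalize_to_total_ion_count(simulate_spectra("GAVLK", 5, rng))
unknown_pred = model.predict(X_unknown)
predicted_residues = [
    aa for aa, present in zip(AMINO_ACIDS, unknown_pred[0]) if present
]

# Peaks the forest relied on most, as indices into the peak list.
top_peaks = np.argsort(model.feature_importances_)[::-1][:5]

print("label-wise accuracy on known peptide:", known_accuracy)
print("residues predicted for unknown peptide:", predicted_residues)
print("most important peak indices:", top_peaks)
```

Encoding each spectrum with one binary label per amino acid, rather than one class per peptide, is what allows a prediction for a peptide that never appears in the training set. The `feature_importances_` ranking is the kind of output that can point to candidate fragment ions, although in this toy setup it only recovers the invented marker peaks.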