Machine learning recognition of protein secondary structures based on two-dimensional spectroscopic descriptors

Hao Ren, Qian Zhang, Zhengjie Wang, Guozhen Zhang, Hongzhang Liu, Wenyue Guo, Shaul Mukamel, Jun Jiang
DOI: https://doi.org/10.1073/pnas.2202713119
2022-01-01
Abstract: Discriminating the secondary structures of proteins is crucial for understanding their biological function. It is not generally possible to invert spectroscopic data to yield the structure. We present a machine learning protocol that uses two-dimensional UV (2DUV) spectra as pattern recognition descriptors, aiming at automated protein secondary structure determination from spectroscopic features. Accurate secondary structure recognition is obtained for homologous (97%) and nonhomologous (91%) protein segments, randomly selected from simulated model datasets. The advantage of 2DUV descriptors over one-dimensional linear absorption and circular dichroism spectra lies in the cross-peak information that reflects interactions between local regions of the protein. Thanks to their ultrafast (~200 fs) nature, 2DUV measurements can be used in the future to probe conformational variations in the course of protein dynamics.