The recent progress in proteochemometric modelling: focusing on target descriptors, cross-term descriptors and application scope.
Tianyi Qiu,Jingxuan Qiu,Jun Feng,Dingfeng Wu,Yiyan Yang,Kailin Tang,Zhiwei Cao,Ruixin Zhu
DOI: https://doi.org/10.1093/bib/bbw004
IF: 9.5
2017-01-01
Briefings in Bioinformatics
Abstract:As an extension of the conventional quantitative structure activity relationship models, proteochemometric (PCM) modelling is a computational method that can predict the bioactivity relations between multiple ligands and multiple targets. Traditional PCM modelling includes three essential elements: descriptors (including target descriptors, ligand descriptors and cross-term descriptors), bioactivity data and appropriate learning functions that link the descriptors to the bioactivity data. Since its appearance, PCM modelling has developed rapidly over the past decade by taking advantage of the progress of different descriptors and machine learning techniques, along with the increasing amounts of available bioactivity data. Specifically, the new emerging target descriptors and cross-term descriptors not only significantly increased the performance of PCM modelling but also expanded its application scope from traditional protein-ligand interaction to more abundant interactions, including protein-peptide, protein-DNA and even protein-protein interactions. In this review, target descriptors and cross-term descriptors, as well as the corresponding application scope, are intensively summarized. Additionally, we look forward to seeing PCM modelling extend into new application scopes, such as Target-Catalyst-Ligand systems, with the further development of descriptors, machine learning techniques and increasing amounts of available bioactivity data.