Investigating Active Learning and Meta-Learning for Iterative Peptide Design

Rainier Barrett, Andrew D. White
DOI: https://doi.org/10.1021/acs.jcim.0c00946
IF: 6.162
2020-12-22
Journal of Chemical Information and Modeling
Abstract: Often the development of novel functional peptides is not amenable to high-throughput or purely computational screening methods. Peptides must be synthesized one at a time in a process that does not generate large amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both peptide properties and predictive modeling accuracy. Here, we study the effectiveness of active learning (optimizing experiment order) and meta-learning (transferring knowledge between contexts) to reduce the number of experiments necessary to build a predictive model. We present a multitask benchmark database of peptides designed to advance these methods for experimental design. Each task is a binary classification of peptides represented as a sequence string. We find neither active learning method tested to be better than random choice. The meta-learning method Reptile was found to improve the average accuracy across data sets. Combining meta-learning with active learning offers inconsistent benefits.
Supporting Information: available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.0c00946. Contents: tables with exact values for accuracy and AUC results for each data set and method; statistical analysis box-and-whisker plot for AUC; training accuracy curves using beta calibration and uncertainty minimization (PDF).
chemistry, multidisciplinary, medicinal; computer science, interdisciplinary applications, information systems
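The abstract names Reptile as the meta-learning method that improved average accuracy. As a rough illustration of the general Reptile update (adapt a copy of the shared parameters on one task for a few gradient steps, then move the shared parameters a fraction of the way toward the adapted copy), here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration and is not taken from the paper: the amino-acid composition featurization, the logistic-regression model, the toy peptide sequences and labels, and all function names and hyperparameters.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def featurize(seq):
    """Hypothetical featurization: fraction of each residue in the sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)


def build_task(seqs, labels):
    """Turn sequence strings and binary labels into a (features, labels) pair."""
    X = np.stack([featurize(s) for s in seqs])
    y = np.array(labels, dtype=float)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sgd_steps(w, X, y, lr=0.5, steps=5):
    """Inner loop: a few logistic-regression gradient steps on one task."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w


def reptile(tasks, dim, meta_lr=0.1, meta_iters=100, seed=0):
    """Outer loop: nudge shared weights toward each task's adapted weights."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(meta_iters):
        X, y = tasks[rng.integers(len(tasks))]  # sample one task
        phi = sgd_steps(theta, X, y)            # task-adapted copy
        theta += meta_lr * (phi - theta)        # Reptile meta-update
    return theta


# Toy, made-up tasks (not from the paper's benchmark database).
task_a = build_task(["KKKK", "RRKK", "KRKR", "DDDD", "EEDD", "DEDE"],
                    [1, 1, 1, 0, 0, 0])
task_b = build_task(["LLLL", "IVLL", "SSSS", "TSST"],
                    [1, 1, 0, 0])
theta = reptile([task_a, task_b], dim=len(AMINO_ACIDS))
```

The resulting `theta` would serve as the starting point for fine-tuning on a new task with few labeled peptides, for example `sgd_steps(theta, X_new, y_new)`, in place of a zero or random initialization.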
What problem does this paper attempt to address?