Effect of selection of molecular descriptors on the prediction of blood-brain barrier penetrating and nonpenetrating agents by statistical learning methods.

Hu Li, Chun Wei Yap, Choong Yong Ung, Ying Xue, Zhi Wei Cao, Yu Zong Chen
DOI: https://doi.org/10.1021/ci050135u
IF: 6.162
2005-01-01
Journal of Chemical Information and Modeling
Abstract: The ability or inability of a drug to penetrate into the brain is a key consideration in drug design. Drugs for treating central nervous system (CNS) disorders need to be able to penetrate the blood-brain barrier (BBB). BBB nonpenetration is desirable for non-CNS-targeting drugs to minimize potential CNS-related side effects. Computational methods have been employed for the prediction of BBB-penetrating (BBB+) and BBB-nonpenetrating (BBB-) agents at impressive accuracies of 75–92% and 60–80%, respectively. However, the majority of these studies give a substantially lower BBB- accuracy, and thus a lower overall accuracy, than the BBB+ accuracy. This work examined whether proper selection of molecular descriptors can improve both the BBB- and the overall accuracies of statistical learning methods. The methods tested include logistic regression, linear discriminant analysis, k-nearest neighbor, C4.5 decision tree, probabilistic neural network, and support vector machine. Molecular descriptors were selected by using a feature selection method, recursive feature elimination (RFE). Results obtained with 415 BBB+ and BBB- agents show that RFE substantially improves both the BBB- and the overall accuracy for all of the methods studied. This suggests that statistical learning methods combined with proper feature selection are potentially useful for facilitating a more balanced and improved prediction of BBB+ and BBB- agents.
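As a rough illustration of the idea in the abstract, the sketch below runs recursive feature elimination around a small logistic regression and then reports the BBB+, BBB-, and overall accuracies separately. It is not the authors' implementation. The data are synthetic stand-ins for molecular descriptors, the ranking criterion (absolute weight of the fitted model) is an assumption, and the names `train_logistic`, `rfe`, and `per_class_accuracy` are hypothetical helpers introduced only for this example.

```python
import math
import random


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, drop the feature with the
    smallest |weight|, and repeat until n_keep features remain.
    Assumes features are on comparable scales (assumption of this sketch)."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_logistic(X_active, y)
        worst = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[worst]
    return active


def per_class_accuracy(y_true, y_pred):
    """Return (BBB+ accuracy, BBB- accuracy, overall accuracy); 1 = BBB+, 0 = BBB-."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    acc_pos = sum(p == 1 for p in pos) / len(pos)
    acc_neg = sum(p == 0 for p in neg) / len(neg)
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return acc_pos, acc_neg, overall


# Synthetic placeholder data: 6 standardized "descriptors", of which only
# the first two determine the label. These are NOT real BBB data.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(120)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]

selected = rfe(X, y, n_keep=2)
X_sel = [[row[j] for j in selected] for row in X]
w, b = train_logistic(X_sel, y)
y_pred = [1 if sum(wj * xj for wj, xj in zip(w, row)) + b > 0 else 0 for row in X_sel]
acc_pos, acc_neg, overall = per_class_accuracy(y, y_pred)
print("selected descriptors:", selected)
print("BBB+ acc: %.2f, BBB- acc: %.2f, overall: %.2f" % (acc_pos, acc_neg, overall))
```

Reporting the two class accuracies separately mirrors the abstract's concern that BBB- accuracy tends to lag behind BBB+ accuracy. For simplicity the sketch evaluates on its own training data; a real study would use held-out data or cross-validation. A library implementation of the same elimination loop is available as `sklearn.feature_selection.RFE` in scikit-learn.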