Widespread misinterpretation of pKa terminology and its consequences

Jonathan Zheng, Ivo Leito, William Green
DOI: https://doi.org/10.26434/chemrxiv-2024-msd0q
2024-08-07
Abstract: The acid dissociation constant (pKa), which quantifies the propensity of a solute to donate a proton to its solvent, is crucial for drug design and synthesis, environmental fate studies, chemical manufacturing, and many other fields. Unfortunately, the terminology used to describe acid-base phenomena is inconsistent, creating considerable potential for misinterpretation. In this work, we examine a systematic confusion underlying the definition of “acidic” and “basic” pKa values for zwitterionic compounds. Due to this confusion, some pKa data are misrepresented in data repositories, including the widely used and highly trusted ChEMBL Database. Such datasets are widely used to supply training data for pKa prediction models, and hence confusion and errors in the data make model performance worse. Herein, we discuss the intricacies of this issue. We make suggestions for describing acid-base phenomena, training pKa prediction models, and stewarding pKa datasets, given the high potential for confusion and the potentially high impact of accurately describing acid-base phenomena.
Chemistry