Predicting Gas-Particle Partitioning Coefficients of Atmospheric Molecules with Machine Learning

Emma Lumiaro, Milica Todorović, Theo Kurten, Hanna Vehkamäki, Patrick Rinke
DOI: https://doi.org/10.5194/acp-21-13227-2021
2020-10-27
Abstract: The formation, properties and lifetime of secondary organic aerosols in the atmosphere are largely determined by the gas-particle partitioning coefficients of the participating organic vapours. Since these coefficients are often difficult to measure or compute, we developed a machine learning (ML) model to predict them given molecular structure as input. Our data-driven approach is based on the dataset by Wang et al. (Atmos. Chem. Phys., 17, 7529 (2017)), who computed the partitioning coefficients and saturation vapour pressures of 3414 atmospheric oxidation products from the Master Chemical Mechanism using the COSMOtherm program. We train a kernel ridge regression (KRR) ML model on the saturation vapour pressure ($P_{sat}$) and on two equilibrium partitioning coefficients: between a water-insoluble organic matter phase and the gas phase ($K_{WIOM/G}$), and between an infinitely dilute solution with pure water and the gas phase ($K_{W/G}$). We test several descriptors for representing the atomic structure of each organic molecule as model input. Our best ML model predicts $P_{sat}$ and $K_{WIOM/G}$ to within 0.3 and $K_{W/G}$ to within 0.4 logarithmic units of the original COSMOtherm calculations. This is equal to or better than the typical accuracy of COSMOtherm predictions compared to experimental data. We then apply our ML model to a dataset of 35,383 molecules that we generated based on a carbon-10 backbone functionalized with 0 to 6 carboxyl, carbonyl or hydroxyl groups, in order to evaluate its performance for polyfunctional compounds with potentially low $P_{sat}$. The resulting $P_{sat}$ and partitioning coefficient distributions were physico-chemically reasonable, and the volatility predictions for the most highly oxidized compounds were in qualitative agreement with experimentally inferred volatilities of atmospheric oxidation products with similar elemental composition.
Chemical Physics, Atmospheric and Oceanic Physics
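
The kernel ridge regression step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the hyperparameters `gamma` and `alpha`, the helper names, and the random feature vectors standing in for molecular descriptors are all assumptions made for illustration, and the target is a synthetic placeholder for a logarithmic quantity such as $\log_{10} P_{sat}$.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def krr_fit(X, y, gamma, alpha):
    """Solve (K + alpha * I) w = y for the regression weights w."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def krr_predict(X_train, weights, X_new, gamma):
    """Predict targets as kernel-weighted sums over the training points."""
    return gaussian_kernel(X_new, X_train, gamma) @ weights


# Synthetic stand-in data: each row plays the role of a molecular descriptor,
# and the target mimics a log10-scaled property (hypothetical, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
log_target = np.sin(X[:, 0]) + 0.5 * X[:, 1]

X_train, X_test = X[:150], X[150:]
y_train, y_test = log_target[:150], log_target[150:]

gamma, alpha = 0.1, 1e-3  # assumed values; in practice tuned by cross-validation
weights = krr_fit(X_train, y_train, gamma, alpha)
pred = krr_predict(X_train, weights, X_test, gamma)

# Mean absolute error in logarithmic units, the kind of metric the abstract reports.
mae = float(np.mean(np.abs(pred - y_test)))
print(f"test MAE (log units): {mae:.3f}")
```

Training on logarithmic targets means the error is directly comparable to the "logarithmic units" quoted in the abstract; with real data, the descriptor choice and the hyperparameter search would matter far more than this toy setup suggests.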