Towards comprehensive coverage of chemical space: Quantum mechanical properties of 836k constitutional and conformational closed shell neutral isomers consisting of HCNOFSiPSClBr
Danish Khan, Anouar Benali, Scott Y. H. Kim, Guido Falk von Rudorff, O. Anatole von Lilienfeld
2024-09-20
Abstract: The Vector-QM24 (VQM24) dataset attempts to cover more comprehensively all possible neutral closed-shell small organic and inorganic molecules and their conformers at a state-of-the-art level of theory. We have used density functional theory ($\omega$B97X-D3/cc-pVDZ) to optimize 577k conformational isomers corresponding to 258k constitutional isomers. The isomers contain up to five heavy (non-hydrogen) atoms drawn from the $p$-block elements C, N, O, F, Si, P, S, Cl, and Br. Single-point diffusion quantum Monte Carlo (DMC@PBE0(ccECP/cc-pVQZ)) energies are reported for the subset of all lowest-energy conformers (10,793 molecules) with up to 4 heavy atoms. The dataset has been generated systematically by considering all combinatorially possible stoichiometries and graphs (according to Lewis rules as implemented in the {\tt SURGE} package), along with all stable conformers identified by GFN2-xTB. Apart from graphs, geometries, rotational constants, and vibrational normal modes, VQM24 includes internal, atomization, electron-electron repulsion, exchange-correlation, dispersion, zero-point vibrational (ZPV), and molecular orbital energies; vibrational frequencies; Gibbs free energies and enthalpies; as well as entropies and heat capacities. Electronic properties include multipole moments (dipole, quadrupole, octupole, hexadecapole), electrostatic potentials at nuclei (alchemical potentials), Mulliken charges, and molecular wavefunctions. Machine learning (ML) models of the 258k constitutional isomers indicate a benchmark up to $\sim$8 times more challenging than the commonly used QM9 dataset. VQM24 represents a highly accurate and unbiased dataset of molecules, ideal for testing and training transferable, scalable, and generative ML models of real quantum systems.
Chemical Physics