CLC-DB: an online open-source database of chiral ligands and catalysts

Gufeng Yu, Kaiwen Yu, Xi Wang, Xiaohong Huo, Yang Yang
DOI: https://doi.org/10.26434/chemrxiv-2024-h2rdl
2024-07-08
Abstract: Asymmetric catalysis is an important research area in organic chemistry that has contributed greatly to the development of chemistry and other fields, and chiral ligands/catalysts are at its core. However, traditional experimental methods still have some limitations, and machine learning (ML)-based computational methods suffer from a lack of sufficient and accurate data resources on chiral ligands/catalysts. To overcome this challenge, we developed the Chiral Ligand and Catalyst Database (CLC-DB). To the best of our knowledge, CLC-DB is the first open-source and the largest professional database for chiral ligands/catalysts, containing 1861 molecules of several basic chiral types that belong to 32 different chiral ligand/catalyst types. Each data record contains a total of 19 items of information, including the 2D and 3D chemical structure, ligand/catalyst category, chiral type, chemical and physical properties, and an artificial intelligence (AI)-generated description. Each molecular record is linked to authoritative chemical databases and validated by chemical experts. CLC-DB is a user-friendly database that supports two quick search methods and batch search. In addition, CLC-DB provides an efficient online molecular clustering tool for ML computational analyses. CLC-DB is accessible at https://compbio.sjtu.edu.cn/services/clc-db, and all the data can be downloaded for free.
Chemistry
What problem does this paper attempt to address?
This paper introduces CLC-DB, an online open-source database dedicated to chiral ligands and catalysts. Asymmetric catalysis is an important research field in organic chemistry, but traditional experimental methods have limitations, and machine learning (ML)-based computational methods lack sufficient and accurate data on chiral ligands and catalysts. To address this, the authors developed CLC-DB, which they describe as, to the best of their knowledge, the first open-source and the largest professional database of chiral ligands and catalysts. It contains 1861 molecules of several basic chiral types, belonging to 32 different ligand/catalyst types. Each data record holds 19 items of information in total, including 2D and 3D chemical structures, ligand/catalyst category, chiral type, chemical and physical properties, and an AI-generated description. Each record is linked to authoritative chemical databases and validated by chemical experts. The database supports two quick search methods and batch search, and provides an online molecular clustering tool for ML computational analyses. CLC-DB aims to support the research and design of chiral ligands and catalysts by giving chemists a user-friendly, freely downloadable resource.