DataMol Tutorials - Aligning Molecules
©️ Copyright 2023 @ Authors
作者:
王应泽 📨
日期:2023-06-14
共享协议:本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可。
快速开始:点击上方的 开始连接 按钮,选择 datamol:06-15镜像及任意节点配置,稍等片刻即可运行。
One of the most important things enabling us to have success in drug discovery is understanding structure-activity relationship (SAR). This is a foundation that explains how the structure of a molecule relates to its biological activity. Here is an excerpt from CDDVault that explains the importance of SAR in drug discovery:
”SAR depends on the recognition of which structural characteristics correlate with chemical and biological reactivity. Thus the ability to draw conclusions about an unknown compound depends upon both the structural features that can be characterized as well as the database of molecules against which they are compared. When combined with appropriate professional judgment, SAR can be a powerful tool for understanding functional implications when similarities are found. For example, in the case of risk assessment of uncharacterized compounds, data from the most sensitive toxicological endpoints should be included in the analysis, such as carcinogenicity or cardiotoxicity.”
This is where aligning molecules can be useful. Given the importance of structure, aligning molecules is generally useful for a few things:
- Aligning molecules makes it easier to visualize and highlight substructures
- It can help you identify and understand any shared structures between drug candidates
- It can help you identify any potential activity patterns
- For example, when analyzing activity cliffs - “defined as pairs or groups of structurally similar compounds that are active against the same target but have large differences in potency. Activity cliffs capture chemical modifications that strongly influence biological activity”. An example of some compounds with activity cliffs and their structural differences highlighted are shown below:
Note: The concept of an activity cliff seems simple in this example, however, in practice it is extremely complicated to represent them computationally and derive a systematic approach for identification. Read more about the evolving concept of activity cliffs here.
As an example to show you how alignment makes visualization easier, please see the image below. The alignment of this sample dataset makes it extremely easy to identify the core structure of 3 rings which are common in all compounds.
Tutorial
This tutorial will show you how to:
- Partition a list of molecules into clusters sharing a common scaffold of a common core, then align the molecules to that common core.
- Note: this function will compute the list of smiles/smarts representative of each cluster first.
- Aligning molecules according to a template molecule