Graph-based Latent Space Traversal for New Molecules Discovery.

Yang Zhang, Jie Li, Chao Xu
DOI: https://doi.org/10.1145/3615522.3615548
2023-01-01
Abstract: Generative models provide an efficient way to analyze and understand unlabeled data by creating a latent space for data modeling and generation. Because interpreting a latent space usually requires implicit expert knowledge, this human-centered characteristic makes visual analytics methods effective. In the field of computational chemistry, some studies have applied generative models to construct chemical spaces and have generated new molecules by sampling in the latent space. However, the latent space is typically high-dimensional and sparse, and it may contain a large number of “dead zones”, so points sampled from the latent space may decode to noisy or invalid results. It is therefore extremely challenging to search and traverse the latent space efficiently and to generate new molecules with the desired properties. This paper proposes a visualization system for interactive exploration of the latent space, intended to inspire researchers to design potential new molecules with desired properties. The main work of this paper is as follows. First, we survey the literature on the application of generative networks to drug design and synthesis, and interview experts with a computational chemistry background to summarize the requirements and tasks. Second, based on these requirements and tasks, we propose graph-based latent space traversal and interpolation algorithms and neighborhood sampling algorithms, which increase the number of potential molecules generated and the speed at which similar molecules are discovered. Third, we conduct comparison experiments to verify the effectiveness of the algorithms. Finally, we design a visualization system and conduct a case study and a user study to verify its effectiveness.