RaptGen: A variational autoencoder with profile hidden Markov model for generative aptamer discovery

Natsuki Iwano, Tatsuo Adachi, Kazuteru Aoki, Yoshikazu Nakamura, Michiaki Hamada
DOI: https://doi.org/10.1101/2021.02.17.431338
2021-02-17
Abstract: Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). The variety of candidate sequences is limited by the sequencing data actually obtained from an experiment. Here, we developed RaptGen, a variational autoencoder for in silico aptamer generation. RaptGen uses a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulated sequence data into a low-dimensional latent space according to motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in the high-throughput sequencing data. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen can be applied to activity-guided aptamer generation using Bayesian optimization. We conclude that the generative method and latent representation provided by RaptGen are useful for aptamer discovery. Code is available at https://github.com/hmdlab/raptgen.
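To illustrate what a profile hidden Markov model decoder provides, the following minimal sketch computes the likelihood of a sequence under a small profile HMM with the standard forward algorithm. It is not taken from the RaptGen repository: the function name `profile_hmm_likelihood`, the shared transition parameters (`p_mi`, `p_md`, `p_ii`, `p_dd`), the uniform insert emissions, and the omission of insert-to-delete and delete-to-insert transitions are all simplifying assumptions made here. In a variational autoencoder of this kind, the decoder would presumably output position-specific parameters for each latent point rather than use fixed values.

```python
def profile_hmm_likelihood(seq, match_emis, p_mi=0.05, p_md=0.05,
                           p_ii=0.3, p_dd=0.3, alphabet="ACGU"):
    """Return P(seq | profile HMM) via the forward algorithm.

    match_emis: list of dicts, one per match position, mapping each
    symbol to its emission probability (hypothetical input format).
    Transition probabilities are shared across positions for brevity.
    """
    L, n = len(match_emis), len(seq)
    p_mm = 1.0 - p_mi - p_md      # match -> next match
    p_im = 1.0 - p_ii             # insert -> next match
    p_dm = 1.0 - p_dd             # delete -> next match
    ins_emis = 1.0 / len(alphabet)  # uniform insert emissions (assumption)

    # f_X[j][i]: probability of emitting seq[:i] and being in state X_j.
    f_m = [[0.0] * (n + 1) for _ in range(L + 1)]
    f_i = [[0.0] * (n + 1) for _ in range(L + 1)]
    f_d = [[0.0] * (n + 1) for _ in range(L + 1)]
    f_m[0][0] = 1.0  # the begin state is treated as match state 0

    for j in range(L + 1):
        for i in range(n + 1):
            if j > 0 and i > 0:
                # Match state j emits seq[i-1], arriving from column j-1.
                prev = (f_m[j - 1][i - 1] * p_mm
                        + f_i[j - 1][i - 1] * p_im
                        + f_d[j - 1][i - 1] * p_dm)
                f_m[j][i] = match_emis[j - 1][seq[i - 1]] * prev
            if i > 0:
                # Insert state j emits seq[i-1] and stays in column j.
                f_i[j][i] = ins_emis * (f_m[j][i - 1] * p_mi
                                        + f_i[j][i - 1] * p_ii)
            if j > 0:
                # Delete state j is silent: it skips match position j.
                f_d[j][i] = f_m[j - 1][i] * p_md + f_d[j - 1][i] * p_dd

    # From the last column, match and delete states can only move to the
    # end state (or, for match, to the final insert state).
    return (f_m[L][n] * (p_mm + p_md)
            + f_i[L][n] * p_im
            + f_d[L][n])


# A two-position motif whose consensus is "AC".
motif = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "U": 0.1},
]
consensus_score = profile_hmm_likelihood("AC", motif)
mismatch_score = profile_hmm_likelihood("GG", motif)
```

Because the insert and delete states let the model align sequences of different lengths to the same motif, a sequence with an extra or missing base still receives non-zero likelihood, which is the property that makes profile HMMs a natural fit for motif-centred sequence data.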