Structural-model-based genome mining can efficiently discover novel non-canonical terpene synthases hidden in genomes of diverse species

Tohru Abe,Haruna Shiratori,Kosuke Kashiwazaki,Kazuma Hiasa,Daijiro Ueda,Tohru Taniguchi,Hajime Sato,Takashi Abe,Tsutomu Sato
DOI: https://doi.org/10.1039/d4sc01381f
IF: 8.4
2024-06-06
Chemical Science
Abstract: Non-canonical terpene synthases (TPSs) with primary sequences that are unrecognizable as canonical TPSs have evaded detection by conventional genome mining. This study aimed to prove that novel non-canonical TPSs can be efficiently discovered among proteins that are hidden in genome databases and predicted to have 3D structures similar to those of class I TPSs. Six types of non-canonical TPS candidates were detected using this search strategy from 268 genome sequences from actinomycetes. Functional analyses of these candidates revealed that at least three types were novel non-canonical TPSs. We propose classifying the non-canonical TPSs as classes ID, IE, and IF. A hypothetical protein MBB6373681 from Pseudonocardia eucalypti (PeuTPS) was selected as a representative example of class ID TPSs and characterized. PeuTPS was identified as a diterpene synthase that forms a 6/6/6-fused tricyclic gersemiane skeleton. Analyses of PeuTPS variants revealed that amino acid residues within new motifs [D(N/D), ND, RXXKD] located close to the class I active site in the 3D structure were essential for enzymatic activity. The homologs of non-canonical TPSs found in this study exist in bacteria as well as in fungi, protists, and plants, and the PeuTPS gene is not located near terpene biosynthetic genes in the genome. Therefore, structural-model-based genome mining is an efficient strategy to search for novel non-canonical TPSs that is independent of biological species and biosynthetic gene clusters, and will contribute to expanding the structural diversity of terpenoids.
chemistry, multidisciplinary
What problem does this paper attempt to address?
This paper addresses how to efficiently discover new non-canonical terpene synthases (TPSs) hidden in genome databases. New terpene synthases are traditionally found through sequence homology searches (such as BLAST and hidden Markov models), but these searches miss non-canonical TPSs, whose primary sequences do not resemble those of typical TPSs. The authors therefore propose a genome-mining strategy based on structural models: candidates are identified by how closely their predicted three-dimensional structures resemble the 3D structures of known class I TPSs. Specifically, the authors screened 268 actinomycete genomes for proteins predicted to have class I TPS-like 3D structures, detected six types of candidates, and confirmed through functional analysis that at least three types are novel non-canonical TPSs (proposed as classes ID, IE, and IF). Because the search does not depend on the biological species or on biosynthetic gene clusters, it can be applied across diverse genomes and should help expand the known structural diversity of terpenoids.
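The selection logic described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the summary does not state which structure-prediction or structure-comparison tools, scores, or cutoffs were used, so the `structure_score` field, the 0.5 threshold, and the accessions `prot_A` to `prot_D` are all hypothetical placeholders. The sketch only shows the idea of keeping proteins that look like class I TPSs in 3D but are not recognized by a sequence-homology search.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    accession: str          # placeholder identifier, not a real database accession
    structure_score: float  # hypothetical 0..1 similarity of the predicted 3D model to a class I TPS
    sequence_hit: bool      # True if a sequence-homology search already recognizes it as a TPS


def non_canonical_candidates(proteins, min_structure_score=0.5):
    """Keep proteins that are structurally TPS-like but missed by sequence search.

    The cutoff is an illustrative assumption, not a value from the paper.
    """
    keep = [
        p for p in proteins
        if p.structure_score >= min_structure_score and not p.sequence_hit
    ]
    # Rank the most structurally similar candidates first.
    return sorted(keep, key=lambda p: p.structure_score, reverse=True)


# Made-up example data.
proteins = [
    Protein("prot_A", 0.62, False),  # structurally similar, no sequence hit: candidate
    Protein("prot_B", 0.91, True),   # canonical TPS, already found by sequence search
    Protein("prot_C", 0.78, False),  # structurally similar, no sequence hit: candidate
    Protein("prot_D", 0.20, False),  # not structurally similar
]

candidates = non_canonical_candidates(proteins)
for p in candidates:
    print(p.accession, p.structure_score)
```

In the study itself, candidates selected on structural grounds were then tested by functional analysis, a step that no computational filter of this kind replaces.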