LogoMotif: a comprehensive database of transcription factor binding site profiles in Actinobacteria

Hannah E. Augustijn, Dimitris Karapliafis, Kristy Joosten, Sébastien Rigali, Gilles P. van Wezel, Marnix H. Medema
DOI: https://doi.org/10.1101/2024.02.28.582527
2024-03-03
Abstract: Actinobacteria undergo a complex multicellular life cycle and produce a wide range of specialized metabolites, including the majority of the antibiotics. These biological processes are controlled by intricate regulatory pathways, and to better understand how they are controlled we need to augment our insights into the transcription factor binding sites. Here, we present LogoMotif ( ), an open-source database for characterized and predicted transcription factor binding sites in Actinobacteria, along with their cognate position weight matrices and hidden Markov models. Genome-wide predictions of binding site locations in model organisms are supplied and visualized in interactive regulatory networks. In the web interface, users can freely access, download and investigate the underlying data. With this curated collection of actinobacterial regulatory interactions, LogoMotif serves as a basis for binding site predictions, thus providing users with clues on how to elicit the expression of genes of interest and guide genome mining efforts.
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the limited knowledge of transcription factor binding sites (TFBSs) in Actinobacteria. Actinobacteria are an important group of bacteria that produce a large number of natural products, including most of the known antibiotics. The underlying biological processes are controlled by complex regulatory networks, and understanding them requires better knowledge of where transcription factors bind. For many Actinobacteria, however, little is known about these binding sites, which limits both our understanding of gene expression regulation and the potential for drug discovery and biotechnological applications.
To address this, the paper introduces the **LogoMotif** database, an open-source database of verified and predicted transcription factor binding sites in Actinobacteria, together with their corresponding position weight matrices (PWMs) and hidden Markov models (HMMs). By providing these data and tools, LogoMotif aims to:
1. **Provide a comprehensive data set**: It contains about 400 verified and about 12,100 predicted regulatory interactions, presented as interactive networks.
2. **Serve as the basis for the gene cluster detection tool antiSMASH**: The data and algorithms of LogoMotif provide the basis for regulatory prediction in antiSMASH.
3. **Support research on gene expression and function inference**: Detailed regulatory information helps researchers understand the expression and function of genes.
4. **Promote the discovery of new chemical substances**: The approach is not limited to Actinobacteria and may also be extended to other microorganisms.
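To make the role of position weight matrices in binding site prediction more concrete, here is a minimal sketch of how a log-odds PWM can be built from aligned sites and used to scan a sequence. It is not LogoMotif's actual implementation: the function names (`build_pwm`, `score_window`, `scan`), the uniform background, the pseudocount, the threshold, and the toy binding sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites (uppercase ACGT only)."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # log2 of (smoothed base frequency / background frequency) per base
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window of the sequence."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy aligned sites, invented for illustration; not taken from LogoMotif.
toy_sites = ["TGACCA", "TGACCT", "TGTCCA", "TGACGA"]
pwm = build_pwm(toy_sites)
hits = scan(pwm, "AAATGACCAGGG", threshold=4.0)
for start, score in hits:
    print(f"candidate site at position {start}, score {score:.2f}")
```

In practice, a scan like this would be run over both strands of the upstream regions of a genome, and the threshold would be calibrated per transcription factor; both steps are omitted here for brevity.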
In summary, the goal of LogoMotif is to enhance the understanding of the regulatory networks in Actinobacteria by providing comprehensive transcription factor binding site data and prediction tools, thereby promoting drug discovery and the development of biotechnology.