Distinct genomic contexts predict gene presence-absence variation in different pathotypes of Magnaporthe oryzae

Pierre M Joubert, Ksenia V Krasileva
DOI: https://doi.org/10.1093/genetics/iyae012
IF: 4.402
2024-01-31
Genetics
Abstract: Fungi use the accessory gene content of their pangenomes to adapt to their environments. While gene presence-absence variation (PAV) contributes to shaping accessory gene reservoirs, the genomic contexts that shape these events remain unclear. Since pangenome studies are typically species-wide and do not analyze different populations separately, it is yet to be uncovered whether PAV patterns and mechanisms are consistent across populations. Fungal plant pathogens are useful models for studying PAV because they rely on it to adapt to their hosts, and members of a species often infect distinct hosts. We analyzed gene PAV in the blast fungus, Magnaporthe oryzae (syn. Pyricularia oryzae), and found that PAV genes involved in host-pathogen and microbe-microbe interactions may drive the adaptation of the fungus to its environment. We then analyzed genomic and epigenomic features of PAV and observed that proximity to transposable elements, gene GC content, gene length, expression level in the host, and histone H3K27me3 marks were different between PAV genes and conserved genes. We used these features to construct a model that was able to predict whether a gene is likely to experience PAV with high precision (86.06%) and recall (92.88%) in M. oryzae. Finally, we found that PAV genes in the rice and wheat pathotypes of M. oryzae differed in their number and their genomic context. Our results suggest that genomic and epigenomic features of gene PAV can be used to better understand and predict fungal pangenome evolution. We also show that substantial intra-species variation can exist in these features.
genetics & heredity
What problem does this paper attempt to address?
This paper aims to understand the patterns and mechanisms of gene presence-absence variation (PAV) in different pathotypes of *Magnaporthe oryzae*. Specifically, the researchers analyzed gene PAV in *M. oryzae*, found that PAV genes are involved in host-pathogen and microbe-microbe interactions, and explored the genomic and epigenomic features of these genes to predict whether a gene is likely to undergo PAV. The study also compared PAV between the rice and wheat pathotypes, and found that the two differ in the number of PAV genes and in their genomic context.

### Main research questions:

1. **Functions of PAV genes**: Which functional classes of genes are more likely to undergo PAV?
2. **Genomic and epigenomic characteristics of PAV genes**: How do PAV genes differ from conserved genes in their genomic and epigenomic features?
3. **PAV prediction model**: Can a model be constructed to predict whether a gene is likely to undergo PAV?
4. **PAV differences between pathotypes**: Do PAV patterns and mechanisms differ between the rice and wheat pathotypes?

### Research methods:

- **Genome annotation and orthogroup partitioning**: Use FunGAP for genome annotation and OrthoFinder for orthogroup partitioning.
- **Gene deletion verification**: Verify gene deletion through TBLASTN and BLASTP.
- **Effector annotation**: Use SignalP, TMHMM and EffectorP to predict effectors.
- **Principal component analysis**: Use PCA to identify PAV orthogroups that distinguish different lineages.
- **Gene ontology and protein family enrichment analysis**: Use topGO and the PFAM database for enrichment analysis.
- **Identification of large insertions and deletions**: Use Illumina sequencing data and multiple tools (such as smoove, wham, Delly, Manta) to identify large insertions and deletions.
- **Random forest classification and feature importance calculation**: Use scikit-learn to train a random forest model, evaluate the model performance and calculate the feature importance.

### Research results:

- **Functions of PAV genes**: PAV genes are mainly involved in host-pathogen and microbe-microbe interactions.
- **Characteristics of PAV genes**: Compared with conserved genes, PAV genes are usually closer to transposable elements (TEs), have a lower GC content, are shorter, have a lower expression level in the host, and are more likely to carry the H3K27me3 histone mark.
- **PAV prediction model**: The random forest model predicts whether a gene is likely to undergo PAV with high precision (86.06%) and recall (92.88%).
- **Differences between pathotypes**: The rice and wheat pathotypes differ in the number of PAV genes and in their genomic context, which may reflect different evolutionary histories and adaptation mechanisms.

Through this work, the authors aim to better understand the evolution of the fungal pangenome and to provide theoretical support for future fungal disease control.
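The random forest step named in the methods above can be illustrated with a minimal sketch. It is not the authors' pipeline: the feature names, the synthetic data generator, the PAV fraction, and the hyperparameters are all assumptions chosen for illustration. Only the general approach comes from the summary: a scikit-learn random forest evaluated by precision and recall, with feature importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical feature names, loosely following the features listed in the summary.
FEATURES = ["te_distance_bp", "gc_content", "gene_length_bp", "expression", "h3k27me3"]


def make_synthetic_genes(n_genes=2000, seed=0):
    """Generate synthetic (not real) gene features and PAV labels."""
    rng = np.random.default_rng(seed)
    is_pav = rng.random(n_genes) < 0.3  # assumed PAV fraction, illustration only
    # Shift each feature in the direction the summary describes for PAV genes.
    # All numeric values below are made up.
    te_distance = rng.exponential(np.where(is_pav, 2_000.0, 10_000.0))
    gc_content = rng.normal(np.where(is_pav, 0.52, 0.57), 0.03)
    gene_length = rng.lognormal(np.where(is_pav, 6.8, 7.3), 0.5)
    expression = rng.lognormal(np.where(is_pav, 1.0, 2.5), 1.0)
    h3k27me3 = rng.normal(np.where(is_pav, 1.0, 0.0), 1.0)
    X = np.column_stack([te_distance, gc_content, gene_length, expression, h3k27me3])
    return X, is_pav.astype(int)


def train_and_evaluate(X, y, seed=0):
    """Fit a random forest and report precision, recall, and feature importances."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hyperparameters are assumptions, not values taken from the paper.
    model = RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=seed
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    precision = precision_score(y_test, y_pred)  # TP / (TP + FP)
    recall = recall_score(y_test, y_pred)        # TP / (TP + FN)
    importances = dict(zip(FEATURES, model.feature_importances_))
    return precision, recall, importances


if __name__ == "__main__":
    X, y = make_synthetic_genes()
    precision, recall, importances = train_and_evaluate(X, y)
    print(f"precision={precision:.3f} recall={recall:.3f}")
    for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"{name:>16s}  {value:.3f}")
```

Because the data are synthetic, the printed scores say nothing about the reported 86.06% precision and 92.88% recall. The sketch only shows how those two metrics and the per-feature importances are obtained from a fitted model.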