Abstract: The recent assembly and annotation of the 26 maize nested association mapping population founder inbreds have enabled large-scale pan-genomic comparative studies. These studies have expanded our understanding of agronomically important traits by integrating pan-transcriptomic data with trait-specific gene candidates from previous association mapping results. In contrast to the availability of pan-transcriptomic data, obtaining reliable protein–protein interaction (PPI) data has remained a challenge due to its high cost and complexity. We generated predicted PPI networks for each of the 26 genomes using the established STRING database. The individual genome-interactomes were then integrated to generate core- and pan-interactomes. We deployed the PPI clustering algorithm ClusterONE to identify numerous PPI clusters that were functionally annotated using gene ontology (GO) functional enrichment, demonstrating a diverse range of enriched GO terms across different clusters. Additional cluster annotations were generated by integrating gene coexpression data and gene description annotations, providing additional useful information. We show that the functionally annotated PPI clusters establish a useful framework for protein function prediction and prioritization of candidate genes of interest. Our study not only provides a comprehensive resource of predicted PPI networks for 26 maize genomes but also offers annotated interactome clusters for predicting protein functions and prioritizing gene candidates. The source code for the Python implementation of the analysis workflow and a standalone web application for accessing the analysis results are available at https://github.com/eporetsky/PanPPI.

What problem does this paper attempt to address?

This paper addresses three main problems:

1. **Gene function prediction**: By generating and analyzing predicted protein-protein interaction (PPI) networks for 26 maize (*Zea mays*) genomes, the study aims to improve the prediction of maize gene functions. Because high-throughput protein-protein interaction experiments are costly and technically difficult, the researchers use an existing database (STRING) to predict these interactions and then construct the maize pan-interactome and core-interactome.
2. **Candidate gene prioritization**: Beyond providing predicted PPI networks for the 26 maize genomes, the study uses functionally annotated PPI clusters to predict protein functions and to prioritize candidate genes related to specific traits. This helps to identify causal genes that may affect important agronomic traits such as plant architecture, height, flowering time, grain weight, and the abundance of different metabolites.
3. **Integrating multi-omics data**: To better understand gene functions and their roles in complex biological systems, the study integrates gene co-expression data and gene description annotations as additional cluster annotations. This integration helps to predict protein functions and can also be used to verify the regulatory roles of proteins in complex signaling pathways.

The study addresses these problems through the following steps:

- **Generating predicted PPI networks**: The PPI networks of the 26 maize genomes are predicted using the STRING database.
- **Constructing the pan-interactome and core-interactome**: The PPI networks of the 26 individual genomes are mapped to unified pan-gene IDs to generate the pan-interactome and core-interactome.
- **PPI network clustering**: The PPI networks are clustered using the ClusterONE algorithm to generate PPI clusters.
- **Functional annotation and enrichment analysis**: The PPI clusters are functionally annotated through GO term enrichment analysis and gene co-expression data, improving the interpretability of the network.
- **Application examples**: A search for GO terms related to flowering time shows how the functional annotation of PPI clusters can be used to infer potential gene functions and prioritize candidate genes.

In conclusion, this study provides a comprehensive framework for generating and analyzing predicted PPI networks of maize, thereby improving gene function prediction and candidate gene prioritization and providing a tool for crop improvement and for understanding complex biological systems.
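The mapping and enrichment steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation (that lives in the PanPPI repository), and it does not reimplement ClusterONE. All gene IDs, pan-gene IDs, genome names, and function names below are hypothetical. The sketch assumes that the pan-interactome is the union of pan-gene pairs across genomes, that the core-interactome keeps only pairs found in every genome, and that cluster enrichment is scored with a one-sided hypergeometric test; the study may define or compute these differently.

```python
from math import comb

def to_pan_edges(edges, gene_to_pan):
    """Map gene-level PPI edges to unordered pan-gene pairs.

    Edges with an unmapped gene, or whose two genes share a pan-gene, are dropped.
    """
    pan_edges = set()
    for gene_a, gene_b in edges:
        pan_a, pan_b = gene_to_pan.get(gene_a), gene_to_pan.get(gene_b)
        if pan_a is None or pan_b is None or pan_a == pan_b:
            continue
        pan_edges.add(frozenset((pan_a, pan_b)))
    return pan_edges

def build_interactomes(per_genome_edges, gene_to_pan):
    """Return (pan, core): union and intersection of per-genome pan-gene edges.

    Assumption: 'core' means present in every genome.
    """
    mapped = [to_pan_edges(e, gene_to_pan) for e in per_genome_edges.values()]
    pan = set().union(*mapped)
    core = set.intersection(*mapped) if mapped else set()
    return pan, core

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a cluster of
    size n, when K of the N background genes carry the GO term."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical toy data: two genomes, gene IDs mapped to shared pan-gene IDs.
gene_to_pan = {
    "A_g1": "pan1", "A_g2": "pan2", "A_g3": "pan3",
    "B_g1": "pan1", "B_g2": "pan2", "B_g4": "pan4",
}
per_genome_edges = {
    "genomeA": [("A_g1", "A_g2"), ("A_g2", "A_g3")],
    "genomeB": [("B_g2", "B_g1"), ("B_g1", "B_g4")],
}

pan, core = build_interactomes(per_genome_edges, gene_to_pan)
print(len(pan), "pan-interactome edges;", len(core), "core-interactome edges")

# Enrichment of a hypothetical GO term: 3 of 3 cluster genes annotated,
# with 3 annotated genes among 10 background genes.
print(hypergeom_pvalue(k=3, n=3, K=3, N=10))
```

Storing each interaction as a `frozenset` makes the pair unordered, so the same interaction reported as (A, B) in one genome and (B, A) in another counts as a single pan-gene edge. A real analysis would also correct the enrichment p-values for multiple testing across GO terms and clusters.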

Harnessing the predicted maize pan-interactome for putative gene function prediction and prioritization of candidate genes for important traits
