Gene Regulatory Network Inference Methodology for Genomic and Transcriptomic Data Acquired in Genetically Related Heterozygote Individuals
Lise Pomiès, Céline Brouard, Harold Duruflé, Élise Maigné, Clément Carré, Louise Gody, Fulya Trösser, George Katsirelos, Brigitte Mangin, Nicolas B Langlade, Simon de Givry
DOI: https://doi.org/10.1093/bioinformatics/btac445
IF: 5.8
2022-07-07
Bioinformatics
Abstract: Inferring gene regulatory networks in non-independent, genetically related panels is a methodological challenge. This hampers evolutionary and biological studies using heterozygote individuals, such as wild sunflower populations or cultivated hybrids. First, we simulated 100 datasets of gene expressions and polymorphisms, displaying the same gene expression distributions, heterozygosities and heritabilities as in our dataset, which includes 173 genes and 353 genotypes measured in sunflower hybrids. Second, we performed a meta-analysis based on six inference methods (Lasso, Random Forests, Bayesian Networks, Markov Random Fields, Ordinary Least Squares and Findr) and selected the minimal-density networks for better accuracy, obtaining on average 64 edges connecting 79 genes and an AUPR score of 0.35. We identified that triangles and mutual edges are prone to errors in the inferred networks. Applied to classical datasets without heterozygotes, our strategy produced an AUPR score of 0.65 for one dataset of the DREAM5 Systems Genetics Challenge. Finally, we applied our method to an experimental dataset from sunflower hybrids. We successfully inferred a network composed of 105 genes connected by 106 putative regulations, with a major connected component. Our inference methodology dedicated to genomic and transcriptomic data is available at https://forgemia.inra.fr/sunrise/inference_methods. The data are available in Data INRAE at https://doi.org/10.15454/vrgwz2 (simulated datasets and the output of the meta-analysis) and https://doi.org/10.15454/HESVA0 (experimental sunflower dataset). Complete descriptions of the inference methods used in the meta-analysis and of the gene selection procedure related to drought and heterosis are available online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
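The abstract mentions combining several inference methods, scoring networks by AUPR, and treating mutual edges as error-prone. The sketch below illustrates those three ideas on a toy example. It is not the authors' implementation: the mean-rank aggregation rule, the use of average precision as the AUPR estimator, the function names, and the gene labels and scores are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_by_mean_rank(method_scores):
    """Order directed edges by their mean rank across methods (assumed rule).

    method_scores maps a method name to {(regulator, target): score},
    where a higher score means a more confident edge. An edge missing
    from a method is ranked after all edges that method did score.
    """
    edges = sorted(set().union(*[s.keys() for s in method_scores.values()]))
    rank_sum = defaultdict(float)
    for scores in method_scores.values():
        ordered = sorted(edges, key=lambda e: -scores.get(e, float("-inf")))
        for rank, edge in enumerate(ordered, start=1):
            rank_sum[edge] += rank
    n_methods = len(method_scores)
    # Ties in mean rank are broken by the edge tuple for determinism.
    return sorted(edges, key=lambda e: (rank_sum[e] / n_methods, e))


def average_precision(ranked_edges, true_edges):
    """Average precision of a ranked edge list, a common AUPR estimator."""
    hits, total = 0, 0.0
    for k, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / k  # precision at each recovered true edge
    return total / len(true_edges) if true_edges else 0.0


def mutual_edges(edges):
    """Return unordered gene pairs that are connected in both directions."""
    edge_set = set(edges)
    return sorted(
        {tuple(sorted(e)) for e in edge_set
         if e[0] != e[1] and (e[1], e[0]) in edge_set}
    )


# Toy scores from two hypothetical methods on three hypothetical genes.
method_scores = {
    "method_a": {("g1", "g2"): 0.9, ("g2", "g3"): 0.5,
                 ("g2", "g1"): 0.4, ("g3", "g1"): 0.1},
    "method_b": {("g1", "g2"): 0.8, ("g2", "g1"): 0.7,
                 ("g2", "g3"): 0.2, ("g3", "g1"): 0.3},
}
true_edges = {("g1", "g2"), ("g2", "g3")}  # assumed gold standard

ranked = aggregate_by_mean_rank(method_scores)
ap = average_precision(ranked, true_edges)
sparse = ranked[:3]           # keep only the top edges (a low-density network)
flagged = mutual_edges(sparse)  # pairs to inspect as potentially error-prone

print(ranked)
print(round(ap, 4))
print(flagged)
```

Keeping only the top-ranked edges mirrors the idea of preferring low-density networks for accuracy. The cutoff of three edges is arbitrary here and would need to be chosen from the data in practice.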