Deep Learning and GBLUP Integration: An Approach that Identifies Nonlinear Genetic Relationships Between Traits
Fatima Shokor, Pascal Croiseau, Hugo Gangloff, Romain Saintilan, Thierry Tribout, Tristan Mary-Huard, Beatriz C.D Cuyabano
DOI: https://doi.org/10.1101/2024.03.23.585208
2024-08-21
Abstract:
Background
Genomic prediction aims to predict the breeding values of multiple complex traits. The widely used statistical methods usually assume these traits to be normally distributed, which imposes linear genetic correlations between traits. While statistical methods are of great value for genomic prediction, they do not account for nonlinear genetic relationships between traits. If such relationships exist, statistical models provide a fair linear approximation, but their prediction accuracy is limited by the nonlinearity. Deep learning (DL) is a promising methodology for predicting multiple complex traits in scenarios where nonlinear genetic relationships are present, owing to its capacity to capture complex and nonlinear patterns in large data sets. We propose a novel hybrid model, DLGBLUP, which takes the output of the traditional genomic best linear unbiased prediction (GBLUP) and enhances its predicted genetic values (PGV) by using DL to account for nonlinear genetic relationships between traits. Using simulated data, we compared the accuracy of the PGV obtained with the proposed hybrid DLGBLUP model, a DL model, and the traditional GBLUP model, the latter serving as our baseline reference.
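To make the two-stage idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It computes single-trait GBLUP predicted genetic values (PGV) with NumPy, then passes them through a nonlinear second stage. The genomic relationship matrix follows VanRaden's first method, the variance ratio `lam` is assumed known, the toy data and the quadratic relationship between the two traits are invented for illustration, and a polynomial least-squares fit stands in for the deep learning network, whose architecture is not described here. All function names are hypothetical.

```python
import numpy as np

def vanraden_g(markers):
    """Genomic relationship matrix from 0/1/2 genotype codes (VanRaden method 1)."""
    p = markers.mean(axis=0) / 2.0              # allele frequency per marker
    centered = markers - 2.0 * p                # center each marker column
    return centered @ centered.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(g_matrix, y, train, lam):
    """Single-trait GBLUP: PGV for all individuals from training phenotypes.

    lam is the residual-to-genetic variance ratio (assumed known here).
    """
    mu = y[train].mean()
    g_tt = g_matrix[np.ix_(train, train)]
    alpha = np.linalg.solve(g_tt + lam * np.eye(len(train)), y[train] - mu)
    return g_matrix[:, train] @ alpha

def fit_nonlinear_stage(pgv_in, target, degree=2):
    """Stand-in for the DL stage: polynomial least squares on GBLUP output."""
    design = np.vander(pgv_in, degree + 1)      # columns x^degree, ..., x^0
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def apply_nonlinear_stage(coef, pgv_in):
    """Evaluate the fitted polynomial on new GBLUP PGV."""
    return np.vander(pgv_in, len(coef)) @ coef

# Toy data (illustrative assumption): trait 2 depends quadratically on trait 1.
rng = np.random.default_rng(0)
n, m = 300, 500
markers = rng.binomial(2, 0.3, size=(n, m)).astype(float)
g1 = markers @ rng.normal(0.0, 1.0, m)
g1 = (g1 - g1.mean()) / g1.std()                # true genetic value, trait 1
g2 = g1 ** 2                                    # true genetic value, trait 2
y1 = g1 + rng.normal(0.0, 1.0, n)               # phenotypes
y2 = g2 + rng.normal(0.0, 1.0, n)

train = np.arange(200)
G = vanraden_g(markers)

pgv1 = gblup(G, y1, train, lam=1.0)             # stage 1: linear GBLUP
coef = fit_nonlinear_stage(pgv1[train], y2[train])
pgv2_hybrid = apply_nonlinear_stage(coef, pgv1) # stage 2: nonlinear correction
```

The design point is only that the second stage receives GBLUP output rather than raw genotypes, so it can concentrate on the between-trait relationship that a linear model cannot represent.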
Results
We found that both the DL and DLGBLUP models either outperformed GBLUP or produced equally accurate PGV, with particularly greater accuracy for traits with a strongly characterized nonlinear genetic relationship. Overall, DLGBLUP achieved the highest prediction accuracy, up to 0.2 points higher than GBLUP, and the smallest mean squared error of the PGV for all traits. Additionally, we evolved a base population over seven generations and compared the genetic progress when selecting individuals based on the additive PGV obtained by DL, DLGBLUP, or GBLUP. For all traits with a nonlinear genetic relationship, after the fourth generation, the genetic gain observed when selection was based on the additive PGV from GBLUP was always lower than that achieved with either DL or DLGBLUP.
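The two comparison criteria named above can be sketched as follows. This assumes, as is common in simulation studies of genomic prediction, that accuracy is the Pearson correlation between the PGV and the simulated true genetic values; the abstract does not state the definition, so treat it as an assumption. The function names and the toy numbers are hypothetical and are not results from the study.

```python
import numpy as np

def prediction_accuracy(pgv, true_gv):
    """Pearson correlation between predicted and true genetic values."""
    return float(np.corrcoef(pgv, true_gv)[0, 1])

def pgv_mse(pgv, true_gv):
    """Mean squared error of the predicted genetic values."""
    diff = np.asarray(pgv, dtype=float) - np.asarray(true_gv, dtype=float)
    return float(np.mean(diff ** 2))

# Toy values for illustration only.
true_gv = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
pgv_a = np.array([0.5, 0.5, 2.5, 2.5, 4.5])   # hypothetical model A
pgv_b = np.array([0.1, 0.9, 2.1, 2.9, 4.1])   # hypothetical model B

acc_a = prediction_accuracy(pgv_a, true_gv)
acc_b = prediction_accuracy(pgv_b, true_gv)
mse_a = pgv_mse(pgv_a, true_gv)
mse_b = pgv_mse(pgv_b, true_gv)

# A difference in "points" is the plain difference between two correlations.
accuracy_gain = acc_b - acc_a
```

Under this reading, a gain of 0.2 points would mean, for example, a correlation of 0.7 for one model against 0.5 for another.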
Conclusions
The integration of DL into genomic prediction makes it possible to model nonlinear relationships between traits. By identifying these nonlinear genetic relationships, our DL and DLGBLUP models improved prediction accuracy compared with GBLUP. The possibility of nonlinear relationships between traits offers a different perspective on multi-trait evaluation and prediction, as well as on the evolution of traits over generations, with the potential to further improve selection strategies in commercial livestock breeding programs. Finally, DLGBLUP shows that DL can complement statistical methods by enhancing their performance.
Genomics