Feature engineering of environmental covariates improves plant genomic-enabled prediction

Osval A. Montesinos-López, Leonardo Crespo-Herrera, Carolina Saint Pierre, Bernabe Cano-Paez, Gloria Isabel Huerta-Prado, Brandon Alejandro Mosqueda-González, Sofia Ramos-Pulido, Guillermo Gerard, Khalid Alnowibet, Roberto Fritsche-Neto, Abelardo Montesinos-López, José Crossa
DOI: https://doi.org/10.3389/fpls.2024.1349569
IF: 5.6
2024-05-16
Frontiers in Plant Science
Abstract: Introduction: Because genomic selection (GS) is a predictive methodology, it must deliver high prediction accuracy to be useful in practice. However, since many factors affect its prediction performance, its practical implementation still needs to be improved in many breeding programs, and many strategies have been explored to that end. Methods: Incorporating environmental covariates as inputs in genomic prediction models only sometimes increases prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models. Results and discussion: We found that, across data sets, feature engineering reduced prediction error by 761.625% across predictors relative to including the environmental covariates without feature engineering. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy to incorporate the environmental covariates.
plant sciences