Abstract: Genomic evaluation relies on the assumption of linkage disequilibrium between dense genome-wide single-nucleotide polymorphism (SNP) markers and quantitative trait loci (QTL). The present study evaluated four frequentist methods (Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, and Genomic Best Linear Unbiased Prediction (GBLUP)) and five Bayesian methods (Bayes Ridge Regression (BRR), Bayes A, Bayesian LASSO, Bayes C, and Bayes B) for genomic selection using simulated data. Pairwise differences in prediction accuracy were assessed for statistical significance (p-values from the t test and the Mann-Whitney U test) and for practical significance (Cohen's d effect size). For this purpose, data were simulated under two scenarios that differed in marker density (4000 and 8000 SNPs across the whole genome). The simulated genome comprised four chromosomes of 1 Morgan each, carrying 100 randomly distributed QTL and evenly distributed SNPs at two densities (1000 and 2000 per chromosome, consistent with the genome-wide totals of 4000 and 8000), with heritability set to 0.4. For the frequentist methods other than GBLUP, the regularization parameter λ was chosen by five-fold cross-validation. In both scenarios, Ridge Regression and GBLUP achieved the highest prediction accuracy among the frequentist methods; Ridge Regression showed the lowest bias and GBLUP the highest. Among the Bayesian methods, Bayes B showed the highest prediction accuracy and BRR the lowest. Bayesian LASSO had the lowest bias in both scenarios, whereas the highest bias was shown by BRR in the first scenario and by Bayes B in the second. Across all methods and both scenarios, Bayes B was the most accurate, and LASSO and Elastic Net were the least accurate.
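The five-fold cross-validation used to choose λ can be illustrated with a minimal sketch for Ridge Regression. This is not the study's code: the helper names `ridge_fit` and `cv_select_lambda`, the λ grid, and the toy genotype data are all assumptions made for illustration, and the closed-form ridge solution stands in for whatever software the authors used.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_select_lambda(X, y, lambdas, k=5, seed=0):
    """Return the lambda with the lowest mean squared error over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    mean_mse = []
    for lam in lambdas:
        errs = []
        for i in range(k):
            test = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            beta = ridge_fit(X[train], y[train], lam)
            errs.append(np.mean((y[test] - X[test] @ beta) ** 2))
        mean_mse.append(float(np.mean(errs)))
    best = int(np.argmin(mean_mse))
    return lambdas[best], mean_mse

# Toy data (hypothetical): 200 individuals, 50 SNPs coded 0/1/2.
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)
X -= X.mean(axis=0)              # center markers so no intercept is needed
beta_true = np.zeros(p)
beta_true[:5] = rng.normal(size=5)   # only five markers have an effect
y = X @ beta_true + rng.normal(size=n)
y -= y.mean()                    # centering on all data is a simplification

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]   # assumed grid
best_lam, mse = cv_select_lambda(X, y, lambdas)
print("selected lambda:", best_lam)
```

The same loop applies to LASSO and Elastic Net once `ridge_fit` is replaced by the corresponding solver; only the fitting step changes, not the fold logic.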
As expected, the greatest similarity in performance was observed between GBLUP and BRR (d = 0.007 in the first scenario and d = 0.003 in the second). The parametric t test and the non-parametric Mann-Whitney U test gave similar results. Of the 36 pairwise t tests between methods in each scenario, 14 comparisons were significant (P < .001) in the first scenario and 2 (P < .05) in the second, which indicates that the differences in performance among methods decrease as the number of predictors increases. The Cohen's d effect sizes supported this: as model complexity increased, very large effect sizes were no longer observed. The regularization parameters of frequentist methods should be optimized by cross-validation before these methods are used in genomic evaluation.
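The pairwise comparison described above can be sketched as follows. With nine methods there are 9 × 8 / 2 = 36 unordered pairs, which matches the 36 tests per scenario. The accuracy values below are made up for illustration and are not results from the study; `cohens_d` is a hypothetical helper that uses the common pooled-standard-deviation form of Cohen's d, which may differ from the exact variant the authors applied.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

METHODS = [
    "Ridge Regression", "LASSO", "Elastic Net", "GBLUP",
    "BRR", "Bayes A", "Bayesian LASSO", "Bayes C", "Bayes B",
]

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (sample variances)."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                     / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

# Every unordered pair of methods is compared once.
pairs = list(combinations(METHODS, 2))
print(len(pairs))  # 36

# Made-up prediction accuracies over five replicates for two methods.
acc_a = [0.60, 0.62, 0.61, 0.63, 0.59]
acc_b = [0.58, 0.60, 0.59, 0.61, 0.57]
d = cohens_d(acc_a, acc_b)
print(round(d, 3))
```

A significance test on the same pair of samples could be run alongside the effect size, for example with `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu`; these are left out here to keep the sketch free of third-party dependencies.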

A Penalized Regression Method for Genomic Prediction Reduces Mismatch between Training and Testing Sets

A marker weighting approach for enhancing within-family accuracy in genomic prediction

Improving the Efficiency of Genomic Selection

Genomic Prediction Enhanced Sparse Testing for Multi-environment Trials

Investigating the Performance of Frequentist and Bayesian Techniques in Genomic Evaluation

Identification of Gene Pairs Through Penalized Regression Subject to Constraints

Genomic prediction using machine learning: a comparison of the performance of regularized regression, ensemble, instance-based and deep learning methods on synthetic and empirical data

A Multivariate Poisson Deep Learning Model for Genomic Prediction of Count Data

A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library

Residual network improves the prediction accuracy of genomic selection

Efficient penalized generalized linear mixed models for variable selection and genetic risk prediction in high-dimensional data

A New Deep Learning Calibration Method Enhances Genome-Based Prediction of Continuous Crop Traits

GA-GBLUP: leveraging the genetic algorithm to improve the predictability of genomic selection

Comparing gradient boosting machine and Bayesian threshold BLUP for genome‐based prediction of categorical traits in wheat breeding

SABO-ILSTSVR: a genomic prediction method based on improved least squares twin support vector regression

A Penalized Linear Mixed Model for Genomic Prediction Using Pedigree Structures

Residual Networks Without Pooling Layers Improve the Accuracy of Genomic Predictions

Penalized Regression Methods With Modified Cross‐Validation and Bootstrap Tuning Produce Better Prediction Models

Impact of selective genotyping in the training population on accuracy and bias of genomic selection

A Benchmarking Between Deep Learning, Support Vector Machine and Bayesian Threshold Best Linear Unbiased Prediction for Predicting Ordinal Traits in Plant Breeding

Computationally efficient whole-genome regression for quantitative and binary traits