Prediction of key milk biomarkers in dairy cows through milk mid-infrared spectra and international collaborations
C Grelet, T Larsen, M A Crowe, D C Wathes, C P Ferris, K L Ingvartsen, C Marchitelli, F Becker, A Vanlierde, J Leblois, U Schuler, F J Auer, A Köck, L Dale, J Sölkner, O Christophe, J Hummel, A Mensching, J A Fernández Pierna, H Soyeurt, M Calmels, R Reding, M Gelé, Y Chen, N Gengler, GplusE Consortium, F Dehareng
DOI: https://doi.org/10.3168/jds.2023-23843
Abstract: At the individual cow level, suboptimum fertility, mastitis, negative energy balance, and ketosis are major issues in dairy farming. These problems are widespread on dairy farms and have an important economic impact. The objectives of this study were (1) to assess the potential of milk mid-infrared (MIR) spectra to predict key biomarkers of energy deficit (citrate, isocitrate, glucose-6-phosphate [glucose-6P], free glucose), ketosis (β-hydroxybutyrate [BHB] and acetone), mastitis (N-acetyl-β-D-glucosaminidase activity [NAGase] and lactate dehydrogenase), and fertility (progesterone); (2) to test alternative methodologies to partial least squares (PLS) regression to better account for the specific asymmetric distribution of the biomarkers; and (3) to create robust models by merging large datasets from 5 international or national projects. Benefiting from this international collaboration, the dataset comprised a total of 9,143 milk samples from 3,758 cows located in 589 herds across 10 countries and represented 7 breeds. The samples were analyzed by reference chemistry for biomarker contents, whereas the MIR analyses were performed on 30 instruments from different models and brands, with spectra harmonized into a common format. Four quantitative methodologies were evaluated to address the strongly skewed distribution of some biomarkers. Partial least squares regression was used as the reference basis and compared with a random modification of distribution associated with PLS (random-downsampling-PLS), an optimized modification of distribution associated with PLS (KennardStone-downsampling-PLS), and support vector machine (SVM). When the ability of MIR to predict biomarkers was too low for quantification, different qualitative methodologies were tested to discriminate low versus high values of biomarkers. For each biomarker, 20% of the herds were randomly removed within all countries to be used as the validation dataset. The remaining 80% of herds were used as the calibration dataset. In calibration, the 3 alternative methodologies outperformed PLS for the majority of biomarkers. However, in the external herd validation, PLS provided the best results for isocitrate, glucose-6P, free glucose, and lactate dehydrogenase (coefficient of determination in external herd validation [R²v] = 0.48, 0.58, 0.28, and 0.24, respectively). For other molecules, random-downsampling-PLS and KennardStone-downsampling-PLS outperformed PLS in the majority of cases, but the best results were provided by SVM for citrate, BHB, acetone, NAGase, and progesterone (R²v = 0.94, 0.58, 0.76, 0.68, and 0.15, respectively). Hence, PLS and SVM based on the entire dataset provided the best results for normal and skewed distributions, respectively. Complementary to the quantitative methods, the qualitative discriminant models enabled the discrimination of high and low values for BHB, acetone, and NAGase with a global accuracy of around 90%, and for glucose-6P with an accuracy of 83%. In conclusion, MIR spectra of milk can enable quantitative screening of citrate as a biomarker of energy deficit and discrimination of low and high values of BHB, acetone, and NAGase as biomarkers of ketosis and mastitis. Finally, progesterone could not be predicted with sufficient accuracy from milk MIR spectra to be further considered. Consequently, MIR spectrometry can bring valuable information regarding the occurrence of energy deficit, ketosis, and mastitis in dairy cows, which in turn have major influences on their fertility and survival.
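The herd-level validation split and the random downsampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record fields (`country`, `herd`, `reference`), the function names, and the equal-width binning with a per-bin cap are all assumptions made for illustration, since the abstract does not specify how the distribution was modified.

```python
import random
from collections import defaultdict


def split_by_herd(samples, validation_fraction=0.2, seed=0):
    """Hold out whole herds within each country (hypothetical helper).

    Every sample of a held-out herd goes to validation, so no herd
    contributes to both the calibration and the validation set.
    """
    herds_by_country = defaultdict(set)
    for s in samples:
        herds_by_country[s["country"]].add(s["herd"])

    rng = random.Random(seed)
    validation_herds = set()
    for country in sorted(herds_by_country):
        herds = sorted(herds_by_country[country])
        rng.shuffle(herds)
        n_val = round(len(herds) * validation_fraction)
        validation_herds.update((country, h) for h in herds[:n_val])

    calibration, validation = [], []
    for s in samples:
        held_out = (s["country"], s["herd"]) in validation_herds
        (validation if held_out else calibration).append(s)
    return calibration, validation


def random_downsample(samples, key="reference", n_bins=10,
                      max_per_bin=50, seed=0):
    """Flatten a skewed distribution (assumed scheme, not from the paper).

    Reference values are grouped into equal-width bins, and bins holding
    more than max_per_bin samples are randomly thinned to that cap.
    """
    values = [s[key] for s in samples]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values equal

    bins = defaultdict(list)
    for s in samples:
        idx = min(int((s[key] - lo) / width), n_bins - 1)
        bins[idx].append(s)

    rng = random.Random(seed)
    kept = []
    for idx in sorted(bins):
        members = bins[idx]
        if len(members) > max_per_bin:
            members = rng.sample(members, max_per_bin)
        kept.extend(members)
    return kept
```

In such a workflow, the downsampling would be applied to the calibration set only, leaving the validation herds untouched so that validation statistics reflect the natural, skewed distribution of the biomarker.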