Abstract: Maize is a widely grown crop in China, and the relationships between agroclimatic parameters and maize yield are complicated, which makes accurate and timely yield prediction challenging. Here, climate data, satellite data, and meteorological indices were integrated to predict maize yield at the city level in China from 2000 to 2015 using four machine learning approaches: Cubist, random forest (RF), extreme gradient boosting (XGBoost), and support vector machine (SVM). The climate variables were the diffuse flux of photosynthetically active radiation (PDf), the diffuse flux of shortwave radiation (SDf), the direct flux of shortwave radiation (SDr), minimum temperature (Tmn), potential evapotranspiration (Pet), vapor pressure deficit (Vpd), vapor pressure (Vap), and wet day frequency (Wet). The satellite data were the enhanced vegetation index (EVI), normalized difference vegetation index (NDVI), and soil-adjusted vegetation index (SAVI) from the Moderate Resolution Imaging Spectroradiometer (MODIS). The meteorological indices were growing degree day (GDD), extreme degree day (EDD), and the Standardized Precipitation Evapotranspiration Index (SPEI). Integrating all climate data, satellite data, and meteorological indices achieved the highest accuracy. The highest estimated correlation coefficient (R) values for the Cubist, RF, SVM, and XGBoost methods were 0.828, 0.806, 0.742, and 0.758, respectively. Climate, satellite, and meteorological index inputs from all growth stages were essential for maize yield prediction, especially those from the late growth stages. With the four machine learning algorithms, R improved by about 0.126, 0.117, and 0.143 when climate data from the early, peak, and late stages, respectively, were added to satellite data and meteorological indices from all stages. R increased by 0.016, 0.016, and 0.017 when satellite data from the early, peak, and late stages, respectively, were added to climate data and meteorological indices from all stages. R increased by 0.003, 0.032, and 0.042 when meteorological indices from the early, peak, and late stages, respectively, were added to climate and satellite data from all stages. Spatial differences were large: in the Northwest region, R reached 0.942, 0.904, 0.934, and 0.850 for Cubist, RF, SVM, and XGBoost, respectively. This study highlights the advantages of combining climate data, satellite data, and meteorological indices for large-scale maize yield estimation with machine learning algorithms.
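As an illustration of two of the meteorological indices named above, the following minimal Python sketch accumulates GDD and EDD from daily temperatures. The abstract does not give the formulas or thresholds used in the study, so the daily-mean formulation, the 10 °C base temperature, the 30 °C upper threshold, and the function names are all assumptions chosen for illustration only.

```python
from typing import Sequence


def growing_degree_days(
    tmin: Sequence[float],
    tmax: Sequence[float],
    t_base: float = 10.0,   # assumed base temperature (deg C), not from the paper
    t_upper: float = 30.0,  # assumed upper threshold (deg C), not from the paper
) -> float:
    """Sum daily heat units between t_base and t_upper (one common GDD form)."""
    total = 0.0
    for lo, hi in zip(tmin, tmax):
        daily_mean = (lo + hi) / 2.0
        capped = min(daily_mean, t_upper)    # heat above t_upper does not count
        total += max(0.0, capped - t_base)   # days below t_base contribute zero
    return total


def extreme_degree_days(tmax: Sequence[float], t_upper: float = 30.0) -> float:
    """Sum the degrees by which daily maximum temperature exceeds t_upper."""
    return sum(max(0.0, hi - t_upper) for hi in tmax)


# Made-up daily temperatures (deg C) for a four-day window.
tmin = [8.0, 12.0, 18.0, 22.0]
tmax = [16.0, 24.0, 32.0, 36.0]

gdd = growing_degree_days(tmin, tmax)
edd = extreme_degree_days(tmax)
print(gdd, edd)
```

In a setup like the one described, such indices would be accumulated separately over the early, peak, and late growth stages, so that each stage contributes its own GDD and EDD predictor.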

Multi-omics assists genomic prediction of maize yield with machine learning approaches

Prediction and association mapping of agronomic traits in maize using multiple omic data

Identification of optimal prediction models using multi-omic data for selecting hybrid rice

Multi-trait and multi-environment genomic prediction for flowering traits in maize: a deep learning approach

Beyond Genomic Prediction: Combining Different Types of omics Data Can Improve Prediction of Hybrid Performance in Maize

Combining Optical, Fluorescence, Thermal Satellite, and Environmental Data to Predict County-Level Maize Yield in China Using Machine Learning Approaches.

Genomic Prediction Of Maize Microphenotypes Provides Insights For Optimizing Selection And Mining Diversity

Predicting Maize Yield at the Plot Scale of Different Fertilizer Systems by Multi-Source Data and Machine Learning Methods

Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials

Prediction of Maize Cultivar Yield Based on Machine Learning Algorithms for Precise Promotion and Planting

Investigating Maize Yield-Related Genes in Multiple Omics Interaction Network Data

High-dimensional multi-omics measured in controlled conditions are useful for maize platform and field trait predictions

Integrated UAV-Based Multi-Source Data for Predicting Maize Grain Yield Using Machine Learning Approaches

Classification of plant growth-promoting bacteria inoculation status and prediction of growth-related traits in tropical maize using hyperspectral image and genomic data

Incorporation of parental phenotypic data into multi‐omic models improves prediction of yield‐related traits in hybrid rice

Predicting China's Maize Yield Using Multi-Source Datasets and Machine Learning Algorithms

Near‐infrared reflectance spectroscopy phenomic prediction can perform similarly to genomic prediction of maize agronomic traits across environments

Genomic prediction of yield-related traits and genome-based establishment of heterotic pattern in maize hybrid breeding of Southwest China

Predicting Maize Yield in Northeast China by a Hybrid Approach Combining Biophysical Modelling and Machine Learning

Prediction of Maize Yield at the City Level in China Using Multi-Source Data

Gradient boosting for yield prediction of elite maize hybrid ZhengDan 958