Characterization and classification of fine-resolution soil profile for precision agriculture using random forest and self-organizing map

Ani A. Elias,Megha Sharma,Shailendra Goel

DOI: https://doi.org/10.1101/2024.04.02.587707

2024-04-03

Abstract: The availability of high-throughput soil profile information is an important component of precision agriculture, enabling efficient soil management for sustainable production. We collected 14 soil physicochemical features at fine resolution from Nagpur, Pune, and Haveri, which represent target environments of safflower cultivation, and from our experiment station at Delhi, and created graphical maps to depict the variability. Additionally, we evaluated the predictive ability of two statistical learning models, random forest (RF) and self-organizing maps (SOM), against multinomial regression models for correctly classifying the soil profile. Clustering was performed around the medoids produced from the dissimilarity matrices of these models using the partitioning around medoids (PAM) model. The robustness, versatility, and predictive ability of the models in correctly classifying the soil profile into clusters were then tested using cross-validation repeated 100 times. The evaluation used training data with proportionate size varying from 60 to 95%, and increased the unit area of observation up to nine times (equivalently, decreased the total number of observations to as little as a ninth). The RF model performed best, with average prediction accuracy above 85% in all settings and close to 100% in some. The predictive ability of all the models was maintained even when only the six most influential variables were used for classification. The optimal training population size for prediction was found to be 70-80%. Based on our study, it is recommended to i) collect fine-resolution edaphic features from a marginal farm before the crop season, ii) use the RF or SOM model to identify the most influential features distinguishing the soil samples, and iii) expand the area of sample collection, measure the most influential features, and use the RF model to predict the class to which the new set of soil samples belongs.
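The clustering step mentioned in the abstract, partitioning around medoids on a dissimilarity matrix, can be illustrated with a minimal sketch. This is a simplified swap-based k-medoids in the spirit of PAM, not the authors' implementation: the function names, the naive initialisation, and the toy one-dimensional data are assumptions for illustration, and the dissimilarities here are plain absolute differences rather than values derived from RF or SOM.

```python
def total_cost(dist, medoids):
    """Sum of each point's dissimilarity to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Greedy swap-based k-medoids on a precomputed dissimilarity matrix.

    Hypothetical helper: starts from the first k points (an assumption)
    and keeps swapping a medoid for a non-medoid while the cost drops.
    """
    n = len(dist)
    medoids = list(range(k))
    best_cost = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    medoids, best_cost, improved = trial, cost, True
    # Label each point with the position of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


# Toy data (assumed): two well-separated groups on a line.
values = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in values] for a in values]

medoids, labels = pam(dist, k=2)
print("medoid indices:", sorted(medoids))
print("cluster labels:", labels)
```

Because the function only needs a square dissimilarity matrix, any model that yields pairwise dissimilarities between samples could supply `dist` in place of the toy values.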

Plant Biology

What problem does this paper attempt to address?

The main aims of this paper are to address the following issues:

1. **Creating Soil Maps**: Researchers conducted fine-resolution soil profile sampling in different regions of India to create graphical maps that reflect variation in soil physical and chemical characteristics.
2. **Soil Sample Classification**: Using the collected data to classify soil samples from specific target environments (TE) suitable for growing safflower. Specifically, soil samples were classified using two statistical learning models: Random Forest (RF) and Self-Organizing Maps (SOM).
3. **Evaluating Model Predictive Ability**: Comparing the performance of Random Forest, Self-Organizing Maps, and multinomial regression models in correctly classifying soil samples and determining the best model. Additionally, the predictive ability of the models was evaluated under different training data proportions and different observation areas.
4. **Identifying Key Influencing Factors**: Identifying the most influential factors for soil sample classification. For example, phosphorus, potassium, sodium, calcium, carbon, and sand were found to have significant impacts on classification.
5. **Optimizing Training Data Volume**: Investigating the amount of training data required to achieve optimal predictive accuracy, with results indicating that a training set of 70-80% of the data is the best choice.

In summary, the focus of this study is on developing a methodological framework to support soil management practices in precision agriculture. By acquiring and analyzing high-throughput soil information, this research contributes to the goal of sustainable agricultural production.
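The evaluation protocol summarized above, repeating the train/test assessment 100 times at several training proportions, can be sketched as follows. This is only an illustration under stated assumptions: it uses repeated random splits, a stand-in nearest-centroid classifier rather than RF, SOM, or multinomial regression, and synthetic two-class data; all function names are hypothetical.

```python
import random


def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row):
    """Return the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )


def repeated_holdout(X, y, train_frac, repeats=100, seed=0):
    """Mean test accuracy over `repeats` random train/test splits."""
    rng = random.Random(seed)
    n = len(X)
    n_train = int(round(train_frac * n))
    accuracies = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)


# Synthetic data (assumed): two well-separated classes of 20 samples each.
X = [[i * 0.1, i * 0.1] for i in range(20)] + \
    [[10 + i * 0.1, 10 + i * 0.1] for i in range(20)]
y = ["A"] * 20 + ["B"] * 20

# Training proportions spanning the 60-95% range reported in the study.
results = {frac: repeated_holdout(X, y, frac) for frac in (0.60, 0.70, 0.80, 0.95)}
for frac, acc in results.items():
    print(f"train fraction {frac:.2f}: mean accuracy {acc:.3f}")
```

Swapping `fit_centroids` and `predict` for a real classifier would leave the repetition loop unchanged, which makes it straightforward to compare models across the same splits.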

An effective implementation and assessment of a random forest classifier as a soil spatial predictive model

Interaction of climate, topography and soil properties with cropland and cropping pattern using remote sensing data and machine learning methods

A Fine Digital Soil Mapping by Integrating Remote Sensing-Based Process Model and Deep Learning Method in Northeast China

Mapping Soil Organic Matter Using Different Modeling Techniques in the Dryland Agroecosystem of Huang-Huai-Hai Plain, Eastern China

Developing Parsimonious Model for Digital Soil Mapping Using Forward Recursive Feature Selection

High-Precision Mapping of Soil Organic Matter Based on UAV Imagery Using Machine Learning Algorithms

Improving model parsimony and accuracy by modified greedy feature selection in digital soil mapping

Soil total and organic carbon mapping and uncertainty analysis using machine learning techniques

Digital Mapping of Soil pH Based on Machine Learning Combined with Feature Selection Methods in East China

Improved soil carbon stock spatial prediction in a Mediterranean soil erosion site through robust machine learning techniques

Predictive Mapping of Soil Properties for Precision Agriculture Using Geographic Information System (GIS) Based Geostatistics Models

Improving Model Performance in Mapping Cropland Soil Organic Matter Using Time-Series Remote Sensing Data

Soil Nutrient Estimation and Mapping in Farmland Based on UAV Imaging Spectrometry

High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models

Machine learning algorithms realized soil stoichiometry prediction and its driver identification in intensive agroecosystems across a north-south transect of eastern China

Tree-based algorithms for spatial modeling of soil particle distribution in arid and semi-arid region

A novel framework for improving soil organic matter prediction accuracy in cropland by integrating soil, vegetation and human activity information

Prediction of soil fertility parameters using USB-microscope imagery and portable X-ray fluorescence spectrometry

Crop yield prediction in cotton for regional level using random forest approach

Machine Learning and Feature Selection for soil spectroscopy. An evaluation of Random Forest wrappers to predict soil organic matter, clay, and carbonates