Cotton Yield Prediction Using Random Forest

Alakananda Mitra, Sahila Beegum, David Fleisher, Vangimalla R. Reddy, Wenguang Sun, Chittaranjan Ray, Dennis Timlin, Arindam Malakar
2023-12-05
Abstract: The cotton industry in the United States is committed to sustainable production practices that minimize water, land, and energy use while improving soil health and cotton output. Climate-smart agricultural technologies are being developed to boost yields while decreasing operating expenses. Crop yield prediction, however, is difficult because of the complex and nonlinear impacts of cultivar, soil type, management, pests and diseases, climate, and weather patterns on crops. To address this issue, we employ machine learning (ML) to forecast production while considering climate change, soil diversity, cultivar, and inorganic nitrogen levels. Field data were gathered across the southern cotton belt of the United States from the 1980s to the 1990s. To capture the most recent effects of climate change over the previous six years, a second data source was produced using the process-based crop model GOSSYM. We concentrated our efforts on three distinct areas within each of three southern states: Texas, Mississippi, and Georgia. To reduce the amount of computation, accumulated heat units (AHU) for each set of experimental data were used as a proxy for time-series weather data. The Random Forest Regressor yielded a 97.75% accuracy rate, with a root mean square error of 55.05 kg/ha and an R² of around 0.98. These findings demonstrate how an ML technique may be developed and applied as a reliable and easy-to-use model to support the cotton climate-smart initiative.
Machine Learning, Computers and Society, Applications
What problem does this paper attempt to address?
The paper addresses how to predict cotton yield accurately under climate change. Specifically, it applies machine learning to forecast yield while accounting for climate change, soil diversity, cultivar, and inorganic nitrogen levels. Building an accurate yield prediction model is challenging because cultivar, soil type, management, pests and diseases, climate, and weather patterns affect crops in complex, nonlinear ways. To meet this challenge, the authors use a Random Forest Regressor as the main model and substitute accumulated heat units (AHU) for time-series weather data to reduce computation, reporting a root mean square error of 55.05 kg/ha and an R² of around 0.98.
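To make the AHU idea concrete, the sketch below collapses a series of daily temperatures into a single accumulated value using the common degree-day formula: the daily mean temperature minus a base temperature, floored at zero and summed over the season. The abstract does not state the exact formula or base temperature the authors used, so the 15.6 °C base (the 60 °F threshold often used for cotton), the function names, and the sample temperatures are all assumptions for illustration.

```python
from typing import Iterable, Tuple


def daily_heat_units(t_max: float, t_min: float, t_base: float = 15.6) -> float:
    """Heat units for one day: mean temperature above the base, never negative.

    t_base = 15.6 degrees C is an assumed cotton threshold, not a value
    taken from the paper.
    """
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def accumulated_heat_units(
    daily_temps: Iterable[Tuple[float, float]], t_base: float = 15.6
) -> float:
    """Sum daily heat units over a sequence of (t_max, t_min) pairs in degrees C."""
    return sum(daily_heat_units(t_max, t_min, t_base) for t_max, t_min in daily_temps)


# Hypothetical four-day temperature record (t_max, t_min), in degrees C.
season = [(30.0, 18.0), (28.0, 16.0), (20.0, 8.0), (33.0, 21.0)]

# One scalar now stands in for the whole daily series.
ahu = accumulated_heat_units(season)
print(f"AHU over {len(season)} days: {ahu:.1f}")
```

A single AHU value per experiment can then sit alongside soil, cultivar, and nitrogen variables as one column in a tabular feature set, which is the kind of input a random forest regressor works with directly.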