Evaluation of a Random Forest Model to Identify Invasive Carp Eggs Based on Morphometric Features

Katherine Goode, Michael J. Weber, Aaron Matthews, Clay L. Pierce
DOI: https://doi.org/10.1002/nafm.10616
2021-05-24
North American Journal of Fisheries Management
Abstract: Three species of invasive carp, Grass Carp (Ctenopharyngodon idella), Silver Carp (Hypophthalmichthys molitrix), and Bighead Carp (H. nobilis), are rapidly spreading throughout North America. Monitoring reproduction can help determine establishment in new areas but is difficult due to challenges associated with identifying fish eggs. Recently, random forest models provided accurate identification of eggs based on morphological traits, but the models have not been validated using independent data. Our objective was to evaluate the predictive performance of egg identification models developed by Camacho et al. (2019) for classifying invasive carp eggs using an independent dataset. When invasive carp were grouped as one category, predictive accuracy was high at the family (89%), genus (90%), species (91%), and species with reduced predictor variables (94%) levels. Invasive carp predictive accuracy decreased when only considering observations from newly sampled locations [family (9%), genus (22%), species (30%), and species with reduced predictor variables (70%)], suggesting potential differences in egg characteristics among locations. Random forest models using a combination of previous and new data resulted in high predictive accuracy for invasive carp (96% to 98%) when invasive carp were grouped as one class for all models at the family, genus, and species levels. The two most influential predictor variables were average membrane diameter and average embryo diameter, where the probability of predicting an invasive carp egg increased with these metrics. High predictive accuracy metrics suggest that these trained and validated random forest models can be used to identify invasive carp eggs based on morphometric variables. However, decreased performance at new locations suggests more research would be beneficial to determine the applicability of the models to a larger spatial region.
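The abstract reports predictive accuracy with the three invasive carp species grouped as one class. The sketch below shows one way such a grouped accuracy could be computed from species-level predictions. It is only an illustration under assumptions: the function names, the "Other" label, and the toy labels are hypothetical, and the metric is assumed to be the fraction of true invasive carp eggs predicted as any invasive carp species, which may differ from the exact definition used in the paper.

```python
# Species treated as invasive carp (the three species named in the abstract).
INVASIVE_CARP = {"Grass Carp", "Silver Carp", "Bighead Carp"}


def group_label(species):
    """Collapse the three invasive carp species into one class; keep other labels as-is."""
    return "Invasive Carp" if species in INVASIVE_CARP else species


def grouped_class_accuracy(true_labels, predicted_labels, target="Invasive Carp"):
    """Fraction of eggs whose true grouped class is `target` that were also predicted as `target`.

    Assumption: this is one plausible reading of "predictive accuracy" for a grouped class.
    """
    pairs = [(group_label(t), group_label(p)) for t, p in zip(true_labels, predicted_labels)]
    target_pairs = [(t, p) for t, p in pairs if t == target]
    if not target_pairs:
        raise ValueError("no observations of the target class")
    return sum(p == target for _, p in target_pairs) / len(target_pairs)


# Made-up toy labels purely for illustration (not data from the study).
true_labels = ["Silver Carp", "Bighead Carp", "Grass Carp", "Other", "Silver Carp"]
pred_labels = ["Bighead Carp", "Silver Carp", "Other", "Other", "Silver Carp"]

accuracy = grouped_class_accuracy(true_labels, pred_labels)
print(accuracy)
```

In this toy example, a Silver Carp egg predicted as Bighead Carp counts as correct after grouping, because both labels fall in the invasive carp class. That is why a grouped accuracy can be higher than a strict species-level match rate.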
fisheries