Deep autoencoder-based behavioral pattern recognition outperforms standard statistical methods in high-dimensional zebrafish studies
Adrian J. Green,Lisa Truong,Preethi Thunga,Connor Leong,Melody Hancock,Robyn L. Tanguay,David M. Reif
DOI: https://doi.org/10.1371/journal.pcbi.1012423
2024-09-11
PLoS Computational Biology
Abstract: Zebrafish have become an essential model organism in screening for developmental neurotoxic chemicals and their molecular targets. The success of zebrafish as a screening model is partially due to their physical characteristics, including a relatively simple nervous system, rapid development, experimental tractability, and genetic diversity, combined with technical advantages that allow for the generation of large amounts of high-dimensional behavioral data. These data are complex and require advanced machine learning and statistical techniques to comprehensively analyze and capture spatiotemporal responses. To accomplish this goal, we have trained semi-supervised deep autoencoders using behavior data from unexposed larval zebrafish to extract quintessential "normal" behavior. Following training, our network was evaluated using data from larvae shown to have significant changes in behavior (using a traditional statistical framework) following exposure to toxicants that include nanomaterials, aromatics, per- and polyfluoroalkyl substances (PFAS), and other environmental contaminants. Further, our model identified new chemicals (Perfluoro-n-octadecanoic acid, 8-Chloroperfluorooctylphosphonic acid, and Nonafluoropentanamide) as capable of inducing abnormal behavior at multiple chemical-concentration pairs not captured using distance moved alone. Leveraging this deep learning model will allow for better characterization of the different exposure-induced behavioral phenotypes, facilitate improved genetic and neurobehavioral analysis in mechanistic determination studies, and provide a robust framework for analyzing complex behaviors found in higher-order model systems. We demonstrate that a deep autoencoder using raw behavioral tracking data from zebrafish toxicity screens outperforms conventional statistical methods, resulting in a comprehensive evaluation of behavioral data.
Our models can accurately distinguish between normal and abnormal behavior with near-complete overlap with existing statistical approaches, with many chemicals detectable at lower concentrations than with conventional statistical tests; this is a crucial finding for the protection of public health, as exposure can lead to a range of neurodevelopmental disorders, including cognitive and other behavioral deficits. Our deep learning models enable the identification of new substances capable of inducing aberrant behavior, and we generated new data to demonstrate the reproducibility of these results. Thus, neurodevelopmentally active chemicals identified by our deep autoencoder models may represent previously undetectable signals of subtle individual response differences. Our method accounts for the high degree of behavioral variability associated with the genetic diversity found in a highly outbred population, as is typical for zebrafish research, thereby making it applicable to multiple laboratories generating similar data. Utilizing the vast quantities of control data generated during high-throughput screening is one of the most innovative aspects of this study, and to our knowledge this is the first study to explicitly develop a deep autoencoder model for anomaly detection in large-scale toxicological behavior studies.
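The general idea described above, learning "normal" behavior from control animals only and flagging animals whose behavior is poorly reconstructed, can be illustrated with a minimal sketch. This is not the authors' model: a linear autoencoder (equivalent to PCA) stands in for the deep autoencoder, the data are synthetic, and the 95th-percentile threshold, the array shapes, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np

def fit_linear_autoencoder(controls, n_latent):
    """Fit a linear autoencoder (equivalent to PCA) on control data only."""
    mean = controls.mean(axis=0)
    # The leading right singular vectors are the optimal linear encoder/decoder.
    _, _, vt = np.linalg.svd(controls - mean, full_matrices=False)
    return mean, vt[:n_latent]

def reconstruction_error(x, mean, components):
    """Per-row mean squared error between the input and its reconstruction."""
    centered = x - mean
    latent = centered @ components.T   # encode to the latent space
    recon = latent @ components        # decode back to the input space
    return np.mean((centered - recon) ** 2, axis=1)

# Synthetic stand-in for behavioral time series (assumption: 50 time bins).
rng = np.random.default_rng(0)
n_timepoints = 50
basis = rng.normal(size=(3, n_timepoints))  # shared "normal" structure

def simulate(n_animals, noise_scale):
    signal = rng.normal(size=(n_animals, 3)) @ basis
    return signal + noise_scale * rng.normal(size=(n_animals, n_timepoints))

controls = simulate(500, noise_scale=0.1)
train, holdout = controls[:400], controls[400:]
# "Exposed" animals deviate from the structure learned from controls.
exposed = simulate(100, noise_scale=1.0)

mean, components = fit_linear_autoencoder(train, n_latent=3)

# Assumed decision rule: flag anything above the 95th percentile
# of reconstruction error among the training controls.
threshold = np.percentile(reconstruction_error(train, mean, components), 95)

flagged_holdout = reconstruction_error(holdout, mean, components) > threshold
flagged_exposed = reconstruction_error(exposed, mean, components) > threshold

print("held-out controls flagged:", flagged_holdout.mean())
print("exposed animals flagged:", flagged_exposed.mean())
```

Because the threshold is set from control data alone, no exposure labels are needed for fitting; in this toy setting held-out controls are rarely flagged while the off-structure animals almost always are.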
biochemical research methods,mathematical & computational biology