Discovering interpretable models of scientific image data with deep learning

Christopher J. Soelistyo, Alan R. Lowe
2024-02-05
Abstract: How can we find interpretable, domain-appropriate models of natural phenomena given some complex, raw data such as images? Can we use such models to derive scientific insight from the data? In this paper, we propose some methods for achieving this. In particular, we implement disentangled representation learning, sparse deep neural network training and symbolic regression, and assess their usefulness in forming interpretable models of complex image data. We demonstrate their relevance to the field of bioimaging using a well-studied test problem of classifying cell states in microscopy data. We find that such methods can produce highly parsimonious models that achieve $\sim98\%$ of the accuracy of black-box benchmark models, with a tiny fraction of the complexity. We explore the utility of such interpretable models in producing scientific explanations of the underlying biological phenomenon.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how to find interpretable, domain-appropriate models of natural phenomena from complex raw data such as images, and how to use those models to derive scientific insight from the data. The authors implement three techniques (disentangled representation learning, sparse deep neural network training, and symbolic regression) and assess how useful each is for forming interpretable models of complex image data. They demonstrate the relevance of these methods to bioimaging on a well-studied test problem: classifying cell states in microscopy data. The resulting models are highly parsimonious, reaching about 98% of the accuracy of black-box benchmark models at a tiny fraction of their complexity. The paper also explores how such interpretable models can help produce scientific explanations of the underlying biological phenomenon.