Towards More Reliable Unsupervised Tissue Segmentation Via Integrating Mass Spectrometry Imaging and Hematoxylin-Eosin Stained Histopathological Image

Ang Guo, Zhiyu Chen, Fang Li, Wenbo Li, Qian Luo
DOI: https://doi.org/10.1101/2020.07.17.208025
2020-01-01
Abstract: Mass Spectrometry Imaging (MSI) provides a useful tool for dividing a tissue section into sub-regions with similar molecular profiles, a task known as tissue segmentation. However, owing to the lack of ground truth, there is no reliable approach for assessing the validity of unsupervised segmentation outcomes of MSI. We propose a novel solution based on the presumption that a segmentation is reliable if it can be reproduced using distinct bio-information extracted from independent sources. Specifically, besides molecular information from MSI data, we also obtain morphological information about a tissue section from its Hematoxylin-Eosin (H&E) stained histopathological image. Whereas MSI has high molecular specificity but low spatial resolving power, the H&E image has no molecular specificity but captures microscopic details of the tissue at a spatial resolution two orders of magnitude higher than MSI. The whole H&E image is split into an array of small patches, each of which corresponds to a spatial pixel of MSI. A range of informative morphological features is computed for each patch in turn, and a spatial segmentation is generated by clustering the patches according to their morphological similarity. We use the Adjusted Mutual Information (AMI) score, which measures the degree of agreement between the MSI-based and H&E image-based segmentation outcomes, as an objective and quantitative metric of segmentation validity. We investigated various candidate morphological features; a combination of Deep Convolutional Neural Network (DCNN) features and handcrafted Threshold Adjacency Statistics (TAS) features performed best. The most appropriate number of tissue segments was also determined according to the AMI score. Moreover, we applied a co-clustering algorithm to the MSI data to group m/z variables and spatial pixels simultaneously, so that potential biomarkers associated with each sub-region were discovered without the need for further analysis. Finally, by integrating the segmentation outcomes based on MSI and H&E image data, we display the confidence level of the segment assignment for each pixel, which offers a much more informative and compelling way to present the segmentation results.
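To make the agreement metric concrete, the following is a minimal sketch of how an AMI score between two label maps could be computed. It is not the authors' implementation. The function name `adjusted_mutual_info`, the toy label lists, and the choice of normalising by the arithmetic mean of the two entropies (the default in scikit-learn's `adjusted_mutual_info_score`) are all assumptions; the abstract does not state which AMI variant was used.

```python
from collections import Counter
from math import exp, lgamma, log


def _entropy(counts, n):
    """Shannon entropy (natural log) of a clustering given its cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def _log_fact(k):
    """log(k!) via the log-gamma function, to avoid huge factorials."""
    return lgamma(k + 1)


def adjusted_mutual_info(labels_a, labels_b):
    """AMI between two flat label sequences (hypothetical helper).

    Assumption: normalisation by the arithmetic mean of the two entropies.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    a = Counter(labels_a)                # cluster sizes, first segmentation
    b = Counter(labels_b)                # cluster sizes, second segmentation
    joint = Counter(zip(labels_a, labels_b))  # contingency table

    # Observed mutual information.
    mi = sum((nij / n) * log(n * nij / (a[u] * b[v]))
             for (u, v), nij in joint.items())

    # Expected mutual information under random labelings with fixed
    # cluster sizes (hypergeometric model).
    emi = 0.0
    for ai in a.values():
        for bj in b.values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                log_p = (_log_fact(ai) + _log_fact(bj)
                         + _log_fact(n - ai) + _log_fact(n - bj)
                         - _log_fact(n) - _log_fact(nij)
                         - _log_fact(ai - nij) - _log_fact(bj - nij)
                         - _log_fact(n - ai - bj + nij))
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_p)

    denom = 0.5 * (_entropy(a.values(), n) + _entropy(b.values(), n)) - emi
    if abs(denom) < 1e-12:
        return 1.0  # degenerate case: both segmentations are a single cluster
    return (mi - emi) / denom


# Toy example: one label per MSI pixel / H&E patch, flattened row by row.
msi_labels = [0, 0, 1, 1, 2, 2, 2, 0]
he_labels = [1, 1, 0, 0, 2, 2, 0, 1]
score = adjusted_mutual_info(msi_labels, he_labels)
```

Because AMI is invariant to how the clusters are numbered, the MSI-based and H&E-based segmentations do not need matching label IDs. Scores near 1 indicate strong agreement, while scores near 0 indicate agreement no better than chance.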