Discovering distinctive elements of biomedical datasets for high-performance exploration

Md Tauhidul Islam,Lei Xing
2024-10-08
Abstract:The human brain represents an object by small elements and distinguishes two objects based on the difference in elements. Discovering the distinctive elements of high-dimensional datasets is therefore critical in numerous perception-driven biomedical and clinical studies. However, currently there is no available method for reliable extraction of distinctive elements of high-dimensional biomedical and clinical datasets. Here we present an unsupervised deep learning technique namely distinctive element analysis (DEA), which extracts the distinctive data elements using high-dimensional correlative information of the datasets. DEA at first computes a large number of distinctive parts of the data, then filters and condenses the parts into DEA elements by employing a unique kernel-driven triple-optimization network. DEA has been found to improve the accuracy by up to 45% in comparison to the traditional techniques in applications such as disease detection from medical images, gene ranking and cell recognition from single cell RNA sequence (scRNA-seq) datasets. Moreover, DEA allows user-guided manipulation of the intermediate calculation process and thus offers intermediate results with better interpretability.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How to reliably extract distinctive elements from high - dimensional biomedical and clinical datasets**. Specifically, the author points out that there is currently a lack of effective methods to extract distinctive elements from high - dimensional biomedical and clinical datasets, which is crucial in many perception - based biomedical studies. ### Problem Background 1. **Cognitive Modes of the Human Brain** - The human brain distinguishes different objects by recognizing small elements of the objects and according to the differences between these elements. - This cognitive mode has inspired researchers to develop a technique that can automatically discover distinctive elements in datasets. 2. **Limitations of Existing Methods** - Existing analysis techniques (such as PCA, NNMF, etc.) cannot effectively distinguish important parts from unimportant parts in data, resulting in learning unnecessary information or misclassification. - Although deep - learning methods (such as CapsNet, SIMCLR, etc.) are powerful, they have problems such as poor interpretability and the need for a large amount of data in high - dimensional data analysis, and are not suitable for the biomedical field. ### Proposed Solution To overcome the above problems, the author proposes a new unsupervised deep - learning technique - **Distinctive Element Analysis (DEA)**. DEA is implemented through the following steps: 1. **Calculating High - Dimensional Correlation Information** - DEA first calculates the high - dimensional correlation distance between data points, thereby finding a large number of distinctive parts. 2. **Filtering and Condensing** - Using a unique kernel - driven triple - optimization network, these parts are further filtered and condensed into the final DEA elements. 3. **Improving Accuracy and Interpretability** - DEA not only improves accuracy (up to 45%) but also allows users to operate on the intermediate calculation process, thus providing better interpretability. ### Application Effects DEA performs well in multiple applications, including but not limited to: - **Disease Detection**: Detecting diseases from medical images, such as diabetic retinopathy image classification. - **Gene Ranking**: Identifying the most important genes from single - cell RNA - sequencing data. - **Cell Identification**: Performing cell classification from single - cell RNA - sequencing data. In summary, DEA aims to solve the problem of reliable extraction of distinctive elements in high - dimensional biomedical datasets and improves the accuracy of tasks and the interpretability of results through its unique design.