NeuroDAVIS: A neural network model for data visualization
Chayan Maitra, Dibyendu B. Seal, Rajat K. De
DOI: https://doi.org/10.1016/j.neucom.2023.127182
IF: 6
2024-01-09
Neurocomputing
Abstract: Dimensionality reduction and visualization of high-dimensional datasets have long been challenging problems. Modern high-throughput technologies produce large datasets having multiple views with relatively new data types. Visualizing these datasets requires efficient algorithms that can uncover hidden patterns in the data without distorting the local and global structures, and that bring out the inherent non-linearity within the data. Very few existing methodologies, however, can accomplish this task. In this work, we introduce a novel unsupervised deep neural network model, called NeuroDAVIS, for data visualization. NeuroDAVIS is capable of extracting important features from the data without assuming any data distribution, visualizing the data effectively in low dimension, and preserving both local and global structures simultaneously. It has been shown theoretically that the neighbourhood relationships of the data in high dimension are preserved in lower dimension. The performance of NeuroDAVIS has been evaluated on a wide variety of synthetic and real high-dimensional datasets, including numeric, textual, image and biological data. NeuroDAVIS has been highly competitive against t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), Principal Component Analysis (PCA) and Random Projection (RP) with respect to visualization quality and preservation of data size, shape, and both local and global structure. It has outperformed Fast interpolation-based t-SNE (Fit-SNE), a variant of t-SNE, on most of the datasets. For the biological datasets, besides t-SNE, UMAP, Fit-SNE, PCA and RP, NeuroDAVIS has also performed well compared with other state-of-the-art algorithms, such as Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE) and the siamese neural network-based method IVIS. Downstream classification and clustering analyses have also revealed favourable results for NeuroDAVIS-generated embeddings.
computer science, artificial intelligence
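
The abstract compares methods on preservation of local and global structure but does not name the metrics used. As a rough illustration only, the sketch below scores two of the baselines mentioned (PCA and RP) on a synthetic dataset. The choice of scikit-learn's trustworthiness score as the local measure, the Spearman correlation of pairwise distances as the global measure, the helper name `structure_scores`, and the toy data are all assumptions made here, not the paper's evaluation protocol. NeuroDAVIS itself is not implemented.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness
from sklearn.random_projection import GaussianRandomProjection


def structure_scores(X, Z, n_neighbors=10):
    """Return (local, global) structure-preservation scores for embedding Z of X.

    Hypothetical helper, not from the paper:
    local  = trustworthiness of the k-nearest-neighbour structure (0 to 1),
    global = Spearman rank correlation between all pairwise distances
             in the original space and in the embedding (-1 to 1).
    """
    local = trustworthiness(X, Z, n_neighbors=n_neighbors)
    global_, _ = spearmanr(pdist(X), pdist(Z))
    return float(local), float(global_)


# Toy high-dimensional data: 300 points in 50 dimensions, 5 clusters (assumed setup).
X, _ = make_blobs(n_samples=300, n_features=50, centers=5, random_state=0)

# Two of the baseline methods named in the abstract, each reduced to 2 dimensions.
embeddings = {
    "PCA": PCA(n_components=2, random_state=0).fit_transform(X),
    "RP": GaussianRandomProjection(n_components=2, random_state=0).fit_transform(X),
}

scores = {name: structure_scores(X, Z) for name, Z in embeddings.items()}
for name, (local, global_) in scores.items():
    print(f"{name}: local={local:.3f}, global={global_:.3f}")
```

An embedding produced by any other method, given as an array of shape `(n_samples, 2)`, can be passed to `structure_scores` in the same way, so the same two numbers can be compared across methods.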