Abstract: In the BCI field, introspection and interpretation of brain signals are desired for providing feedback or for guiding rapid paradigm prototyping, but they are challenging due to the high noise level and dimensionality of the signals. Deep neural networks are often introspected by transforming their learned feature representations into 2- or 3-dimensional subspace visualizations using projection algorithms like Uniform Manifold Approximation and Projection (UMAP). Unfortunately, these methods are computationally expensive, making the real-time projection of data streams a non-trivial task. In this study, we introduce a novel variant of UMAP, called approximate UMAP (aUMAP). It aims at generating rapid projections for real-time introspection. To study its suitability for real-time projection, we benchmark the method against standard UMAP and its neural network counterpart, parametric UMAP. Our results show that approximate UMAP delivers projections that replicate the projection space of standard UMAP while decreasing projection time by an order of magnitude and maintaining the same training time.

What problem does this paper attempt to address?

The main problem this paper attempts to solve is the real-time visualization and interpretation of high-dimensional, noisy brain signal data streams in the field of brain-computer interfaces (BCI). Existing projection algorithms such as standard UMAP (Uniform Manifold Approximation and Projection) can generate high-quality low-dimensional representations, but their computational cost is high, which makes real-time processing difficult. The paper therefore proposes a new UMAP variant, approximate UMAP (aUMAP), which aims to achieve fast real-time visualization of high-dimensional data streams by reducing projection time while keeping training time and projection quality comparable to standard UMAP.

### Background and Problem Description of the Paper

1. **Characteristics of Brain Signal Data**
   - High-dimensional: Brain signal data usually has a high dimension.
   - Noisy: Brain signal data is easily affected by noise, which increases the difficulty of analysis.
2. **Limitations of Existing Methods**
   - **Standard UMAP**: Although it can generate high-quality low-dimensional representations, it has a high computational cost and is not suitable for real-time processing.
   - **PCA**: It is fast to compute, but it cannot handle data with complex nonlinear structures.
   - **ISOMAP**: It works well on noisy data, but its computational complexity is high and it is not suitable for large-scale data sets.
   - **Parametric UMAP (pUMAP)**: It accelerates projection through neural networks, but the model is heavy and may require specific hardware support.
3. **Research Objectives**
   - Propose a new UMAP variant, aUMAP, which can significantly reduce projection time while preserving projection quality.
   - Evaluate the performance of aUMAP on different data sets and verify whether it is suitable for real-time online projection.

### Method Overview

1. **Working Principle of aUMAP**
   - **Model Training**: The training process of aUMAP is the same as that of standard UMAP, fitting the data by optimizing a well-defined objective function.
   - **Projection of New Data Points**: aUMAP approximates the projection of new data points through the k-NN (k-Nearest Neighbor) method instead of recalculating the entire projection space. The formula is as follows:

     \[ u = \frac{\sum_{i=1}^{k} \frac{1}{d_{i}} u_{i}}{\sum_{j=1}^{k} \frac{1}{d_{j}}} \]

     where \(u\) is the projection of the new data point \(x\), \(k\) is the number of neighbors considered, \(u_{1}, u_{2}, \ldots, u_{k}\) are the UMAP projections of the \(k\) nearest neighbors \(x_{1}, x_{2}, \ldots, x_{k}\) of \(x\) in the input space, and \(d_{i} = \text{distance}(x, x_{i})\) is the distance between \(x\) and its \(i\)-th nearest neighbor.
2. **Experimental Setup**
   - **Data Sets**: Three standard data sets (Iris plants, handwritten digits, Wisconsin breast cancer) were used to evaluate the performance of aUMAP.
   - **Benchmark Tests**: The performance of aUMAP was compared with that of standard UMAP and pUMAP in terms of training time and projection time.
   - **Hardware Configuration**: The experiments were carried out on an AMD Ryzen 7 5800X 8-core processor and an NVIDIA GeForce RTX 3060 Ti, using Windows Subsystem for Linux (WSL) v.2.0.9.0 to support TensorFlow GPU.

### Experimental Results

1. **Accuracy of aUMAP**
   - The projections generated by aUMAP are very close to those of standard UMAP, with an average Euclidean distance between 0.1 and 0.25 standard deviations.
   - Although aUMAP sometimes generates outliers, it still maintains a clustering effect similar to that of standard UMAP overall.
2. **Training Time**
   - aUMAP and...
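
The inverse-distance-weighted k-NN projection described under "Working Principle of aUMAP" can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `approx_project`, the brute-force Euclidean neighbor search, the `eps` guard against zero distances, and the toy arrays `x_train` and `u_train` (standing in for training inputs and their already-computed UMAP projections) are all hypothetical.

```python
import numpy as np


def approx_project(x_new, x_train, u_train, k=3, eps=1e-12):
    """Approximate the projection of new points as the inverse-distance-weighted
    mean of the projections of their k nearest training points.

    x_new:   (n_new, n_features) or (n_features,) new input points
    x_train: (n_train, n_features) inputs the projection was fitted on
    u_train: (n_train, n_components) projections of x_train (assumed precomputed)
    eps:     lower bound on distances to avoid division by zero
             (an assumption of this sketch, not part of the formula)
    """
    x_new = np.atleast_2d(np.asarray(x_new, dtype=float))
    x_train = np.asarray(x_train, dtype=float)
    u_train = np.asarray(u_train, dtype=float)

    # Brute-force pairwise Euclidean distances, shape (n_new, n_train).
    dists = np.linalg.norm(x_new[:, None, :] - x_train[None, :, :], axis=-1)

    # Indices and distances d_i of the k nearest training points per new point.
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)

    # Normalized weights (1/d_i) / sum_j (1/d_j).
    weights = 1.0 / np.maximum(nn_dist, eps)
    weights /= weights.sum(axis=1, keepdims=True)

    # u = sum_i w_i * u_i, computed for every new point at once.
    return np.einsum("nk,nkd->nd", weights, u_train[nn_idx])


# Toy data: four training points and their (made-up) 2-D projections.
x_train = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
u_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])

# The new point is equidistant from its two nearest neighbors, so with k=2
# its approximate projection is the midpoint of their projections.
u_new = approx_project([1.0, 0.0], x_train, u_train, k=2)
print(u_new)
```

For streaming data, the brute-force distance computation would likely be replaced by a prebuilt nearest-neighbor index, since the neighbor search is the only per-sample cost in this scheme; which index fits best depends on the data size and is not specified in the summary above.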

Approximate UMAP allows for high-rate online visualization of high-dimensional data streams
