Abstract: We propose an interactive editing method that allows humans to help deep neural networks (DNNs) learn a latent space more consistent with human knowledge, thereby improving classification accuracy on ambiguous, hard-to-distinguish data. First, we visualize high-dimensional data features through dimensionality reduction and design an interactive system, SpaceEditing, to display the visualized data. SpaceEditing provides a 2D workspace based on the idea of spatial layout, in which the user can move the projected data according to system guidance. SpaceEditing then finds the high-dimensional features corresponding to the projected data moved by the user and feeds them back to the network for retraining, thereby letting the user interactively modify the high-dimensional latent space. Second, to incorporate human knowledge into the training process of neural networks more rationally, we design a new loss function that enables the network to learn the user-modified information. Finally, we demonstrate how SpaceEditing meets user needs through three case studies while evaluating our proposed method, and the results confirm its effectiveness.

What problem does this paper attempt to address?

### What problem does this paper attempt to solve?

This paper aims to solve the problem of poor classification performance of deep neural networks (DNNs) on similar and ambiguous data. Specifically, the authors point out that current deep-learning networks have the following deficiencies:

1. **Difficulty in distinguishing similar ambiguous data**: Although DNNs perform well in many classification tasks, their performance is unsatisfactory on ambiguous data such as abstract concepts or shapes.
2. **Lack of integration of domain knowledge**: On some domain-specific datasets (such as archaeology-related data), network performance is not ideal because these datasets require the corresponding domain knowledge to achieve better results.
3. **Uncontrollable training process**: The current deep-learning training process is a "black box", and users cannot directly intervene in or control how high-dimensional features are learned.

To solve these problems, the authors propose a new interactive editing method that allows humans to help DNNs learn feature representations more in line with human knowledge by modifying the latent space, thereby improving classification accuracy. The method includes the following key points:

- **Visualizing high-dimensional features**: Project high-dimensional features onto a two-dimensional workspace through a dimensionality reduction method, allowing users to observe the data distribution intuitively.
- **Interactive editing**: Users can manually adjust the position of the projected data in the two-dimensional workspace, and the system retrains the network according to the user's edits.
- **Designing a new loss function**: To integrate human knowledge into the network training process in a reasonable way, the authors design a new loss function that enables the network to learn the information modified by the user.

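The visualize-and-edit steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the summary does not say which dimensionality reduction method SpaceEditing uses, so a seeded random linear projection stands in for it, and the names `make_projection`, `project`, `Workspace`, `move`, and `edited_features` are hypothetical. The point it shows is that keeping 2D points index-aligned with their source features is enough to recover the high-dimensional feature behind any point the user moves.

```python
import random


def make_projection(dim, seed=0):
    """Return a dim x 2 random matrix (assumed stand-in for the real reducer)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(dim)]


def project(features, matrix):
    """Project each high-dimensional feature vector to a 2D point."""
    return [
        tuple(sum(f[k] * matrix[k][j] for k in range(len(f))) for j in range(2))
        for f in features
    ]


class Workspace:
    """Hypothetical 2D workspace; points[i] is the projection of features[i]."""

    def __init__(self, features, matrix):
        self.features = features
        self.points = project(features, matrix)
        self.moves = {}  # point index -> new 2D position chosen by the user

    def move(self, index, new_xy):
        """Record that the user dragged point `index` to `new_xy`."""
        self.moves[index] = tuple(new_xy)

    def edited_features(self):
        """Return (index, high-dim feature, new 2D position) per moved point."""
        return [(i, self.features[i], xy) for i, xy in sorted(self.moves.items())]


features = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 3.0, 0.0], [2.0, 2.0, 0.0, 1.0]]
ws = Workspace(features, make_projection(dim=4))
ws.move(1, (0.5, -0.25))
print(ws.edited_features())
```

The list returned by `edited_features` is the kind of record that could be handed to a retraining step; how SpaceEditing actually encodes the edit for the network is not described in this summary.
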
Through this method, users can not only better understand the network training process but also effectively improve network performance, especially on ambiguous data. The method is delivered through an interactive system named SpaceEditing, which supports multiple interactive functions, such as zooming, visual volume adjustment, interactive movement, movement guidance, and history recording, to enhance the user experience and ease of operation.

### Main contributions of the paper

1. A novel and effective method that enables users to interactively edit the latent space based on their own knowledge, thereby guiding the learning process of the network. This not only improves network performance but also makes the latent space more interpretable.
2. A new interactive system, SpaceEditing, which synchronizes manual edits made in the two-dimensional space to the high-dimensional space and provides multiple interactive functions.
3. An evaluation of SpaceEditing on different types of machine-learning tasks through three case studies, verifying the effectiveness and flexibility of the method.

### Formula presentation

To ensure that human knowledge can be effectively integrated into the network training process, the authors design a new loss function. It consists of two parts: a classification loss and a distance difference loss.

#### Classification loss

The classification loss \( \text{loss}_{\text{cls}} \) is the cross-entropy between the predicted and true labels:

\[ \text{loss}_{\text{cls}} = -\sum_{i} y_{i} \log(\hat{y}_{i}) \]

where \( y_{i} \) is the true label for class \( i \) and \( \hat{y}_{i} \) is the predicted probability for that class.

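As a worked instance of the cross-entropy formula above, the following minimal sketch evaluates it for one sample. The one-hot label, the two probability vectors, and the `eps` clamp that guards against `log(0)` are assumptions chosen for illustration, not values from the paper.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """loss_cls = -sum_i y_i * log(y_hat_i) for a single sample."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]        # the sample belongs to class 1
confident = [0.05, 0.90, 0.05]  # prediction concentrated on the true class
ambiguous = [0.40, 0.35, 0.25]  # prediction spread across classes

print(cross_entropy(y_true, confident))  # about 0.105
print(cross_entropy(y_true, ambiguous))  # about 1.050
```

The spread-out prediction is penalized roughly ten times more, which is the behaviour one would expect on the ambiguous samples the paper targets.
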
#### Distance difference loss

The distance difference loss \( \text{loss}_{\text{dis}} \) is computed over the points moved by the user:

\[ \text{loss}_{\text{dis}} = \sum_{i=1}^{D} \max\left( \lVert m_{i} - P_{i} \rVert_{2}^{2} - \lVert m_{i} - N_{i} \rVert_{2}^{2} + \delta,\ 0 \right) \]

where \( D \) is the number of points moved by the user; \( m_{i} \), \( P_{i} \), and \( N_{i} \) are the corresponding features in the high-dimensional space; and \( \delta \) is the margin used to control the distance between \( P_{i} \) and \( N_{i} \). Each term is zero once \( m_{i} \) is closer to \( P_{i} \) than to \( N_{i} \) by at least \( \delta \) in squared distance.

#### Total loss function

The total loss function \( \text{Loss} \) is the sum of the classification loss and the distance difference loss:

\[ \text{Loss} = \text{loss}_{\text{cls}} + \text{loss}_{\text{dis}} \]
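
The distance difference loss and the total loss above can be checked numerically with a small sketch. The margin `delta=1.0`, the 2D toy features, and the function names `sq_dist`, `distance_loss`, and `total_loss` are assumptions for illustration; the summary does not give the margin value the authors use.

```python
def sq_dist(a, b):
    """Squared Euclidean distance ||a - b||_2^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distance_loss(triplets, delta=1.0):
    """loss_dis = sum_i max(||m_i - P_i||^2 - ||m_i - N_i||^2 + delta, 0)."""
    return sum(
        max(sq_dist(m, p) - sq_dist(m, n) + delta, 0.0) for m, p, n in triplets
    )


def total_loss(loss_cls, loss_dis):
    """Loss = loss_cls + loss_dis (plain sum, as stated in the text)."""
    return loss_cls + loss_dis


# Each triplet is (m_i, P_i, N_i); toy values, not data from the paper.
triplets = [
    ([0.0, 0.0], [1.0, 0.0], [3.0, 0.0]),  # 1 - 9 + 1 = -7, clipped to 0
    ([0.0, 0.0], [2.0, 0.0], [1.0, 0.0]),  # 4 - 1 + 1 = 4
]

loss_dis = distance_loss(triplets)
print(loss_dis)                   # 4.0
print(total_loss(0.5, loss_dis))  # 4.5
```

Only the second triplet contributes: its moved point is still closer to \( N_{i} \) than to \( P_{i} \), so the loss stays positive until training pulls the feature across the margin.
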

SpaceEditing: Integrating Human Knowledge into Deep Neural Networks via Interactive Latent Space Editing
