Abstract: When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification.

What problem does this paper attempt to address?

This paper addresses several problems in 3D data processing, particularly shape modeling and object classification. It explores voxel representations for these tasks and proposes methods to overcome the challenges they bring, such as high computational cost and the curse of dimensionality, which limits the usable resolution.

### Main problems to be solved include:

1. **Shape Modeling and Object Classification**:
   - The paper proposes a voxel-based variational autoencoder (VAE) method for high-quality 3D shape interpolation and random sample generation.
   - The paper also proposes a deep convolutional neural network (ConvNet) architecture for 3D object classification.
2. **Challenges of Voxel Representation**:
   - Voxel representations are well suited to convolutional neural networks, but they suffer from high computational cost and the curse of dimensionality. The paper addresses these challenges by designing efficient network architectures and training methods.
   - In particular, the paper proposes a modified binary cross-entropy (BCE) loss function that mitigates vanishing gradients during training and balances the weights of positive and negative samples.
3. **Performance Improvement**:
   - The paper runs experiments on the ModelNet benchmark and reports a significant improvement in object classification. Compared with existing methods, it achieves relative improvements of 51.5% and 53.2% on the ModelNet40 and ModelNet10 datasets respectively.

### Specific Contributions:

1. **Generative Model**:
   - Proposes a method for training voxel-based variational autoencoders that supports high-quality shape interpolation and random sample generation.
   - Designs a user interface that lets users explore the latent space learned by the autoencoder.
2. **Discriminative Model**:
   - Proposes a deep convolutional neural network architecture for 3D object classification.
   - Introduces Voxception and Voxception-ResNet modules, which combine the advantages of Inception and ResNet to improve the model's expressive power and classification performance.
3. **Experimental Verification**:
   - Conducts extensive experiments on the ModelNet benchmark to verify the effectiveness of the proposed methods.

Through these methods and contributions, the paper demonstrates the feasibility and potential of voxel representations in 3D shape modeling and object classification.
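The summary mentions a BCE loss that balances positive and negative samples but does not give its exact form. As a rough illustration only, the sketch below shows a generic class-weighted BCE over a flattened occupancy grid. The function name `weighted_bce`, the weight parameter `gamma`, and the values used in the usage lines are assumptions made for this example, not details taken from the paper, and the sketch does not attempt to reproduce whatever the paper does about vanishing gradients.

```python
import math


def weighted_bce(targets, outputs, gamma, eps=1e-7):
    """Class-weighted binary cross-entropy, averaged over voxels.

    targets: sequence of 0/1 occupancy labels (flattened voxel grid)
    outputs: sequence of predicted occupancy probabilities in (0, 1)
    gamma:   weight on occupied voxels (hypothetical parameter);
             empty voxels are weighted by 1 - gamma
    """
    total = 0.0
    count = 0
    for t, o in zip(targets, outputs):
        o = min(max(o, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -gamma * t * math.log(o)
        total += -(1.0 - gamma) * (1 - t) * math.log(1.0 - o)
        count += 1
    return total / count


# Illustrative values only: with gamma = 0.9, missing an occupied voxel
# is penalized more heavily than a false positive on an empty voxel.
miss = weighted_bce([1], [0.1], gamma=0.9)
false_positive = weighted_bce([0], [0.9], gamma=0.9)
print(miss > false_positive)
```

Voxel grids of 3D shapes are typically mostly empty, so a weight above 0.5 on occupied voxels keeps a model from scoring well by predicting empty space everywhere. Setting `gamma=0.5` recovers the ordinary BCE up to a constant factor of one half.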

Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

Voxel-based three-view hybrid parallel network for 3D object classification

Voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data

VV-Net: Voxel VAE Net with Group Convolutions for Point Cloud Segmentation

VoxResNet: Deep Voxelwise Residual Networks for Volumetric Brain Segmentation

VoxNet: A 3D Convolutional Neural Network for real-time object recognition

Voxel2Mesh: 3D Mesh Model Generation from Volumetric Data

Two-stage Generative Models of Simulating Training Data at the Voxel Level for Large-Scale Microscopy Bioimage Segmentation

AVFP-MVX: Multimodal VoxelNet with Attention Mechanism and Voxel Feature Pyramid

A Variational Autoencoder Cascade Generative Adversarial Network for Scalable 3D Object Generation and Reconstruction

MSG-Voxel-GAN: Multi-Scale Gradient Voxel GAN for 3D Object Generation

Voxelized Facial Reconstruction Using Deep Neural Network

Multi Point-Voxel Convolution (MPVConv) for Deep Learning on Point Clouds

Voxel-Based 3D Object Reconstruction from Single 2D Image Using Variational Autoencoders

Multi Voxel-Point Neurons Convolution (MVPConv) for Fast and Accurate 3D Deep Learning

Neural Volumetric Mesh Generator

SparseVoxNet: 3-D Object Recognition With Sparsely Aggregation of 3-D Dense Blocks

MPVNN: Multi-resolution Point-Voxel Non-parametric Network for 3D Point Cloud Processing

Using Simulated Training Data of Voxel-Level Generative Models to Improve 3D Neuron Reconstruction

Voxel-Based 3D Shape Segmentation Using Deep Volumetric Convolutional Neural Networks