Abstract: The design, analysis and application of a volumetric convolutional neural network (VCNN) are studied in this work. Although many CNNs have been proposed in the literature, their design is empirical. In the design of the VCNN, we propose a feed-forward K-means clustering algorithm to determine the filter number and size at each convolutional layer systematically. For the analysis of the VCNN, the cause of confusing classes in the output of the VCNN is explained by analyzing the relationship between the filter weights (also known as anchor vectors) from the last fully-connected layer to the output. Furthermore, a hierarchical clustering method followed by a random forest classification method is proposed to boost the classification performance among confusing classes. For the application of the VCNN, we examine the 3D shape classification problem and conduct experiments on a popular ModelNet40 dataset. The proposed VCNN offers the state-of-the-art performance among all volume-based CNN methods.

What problem does this paper attempt to address?

The paper addresses two main problems:

1. **Narrowing the performance gap between volume-based CNNs (VCNNs) and view-based CNNs in 3D shape classification**: Volume-based CNNs process 3D data directly and can in theory retain more 3D structural information. However, computing and memory constraints usually force their input resolution below that of view-based methods, which results in inferior classification performance. The paper proposes a new design method for volume-based CNNs that aims to improve their performance by optimizing network parameter selection.
2. **Resolving confusing classes in 3D shape classification**: Shapes from some classes are easily misclassified because they look alike. The paper proposes a method based on Shape Anchor Vectors (SAVs) to identify these confusing classes, then improves classification among them through hierarchical clustering and random forest classifiers.

### Specific solutions

- **Network parameter selection**: The paper proposes a feed-forward K-means clustering algorithm that systematically determines the number and size of filters in each convolutional layer, thereby optimizing the network structure. This improves the classification performance of the network and reduces the dependence on empirical parameter selection.
- **Confusing class identification and re-classification**:
  - **Confusing class identification**: The filter weights from the last fully-connected layer to the output layer (i.e., the shape anchor vectors) are analyzed to identify which classes are likely to be confused. Specifically, if the angle between the shape anchor vectors of two classes is small and the internal variation of one of the classes is large, the two classes may be easily confused.
  - **Re-classification**: For the identified confusing classes, the paper proposes a hierarchical clustering method that divides the samples into multiple subsets, then uses a random forest classifier to re-classify these subsets and improve the classification accuracy.

### Experimental results

- **Effect of network parameter selection**: With optimized network parameter selection, the proposed VCNN improves the average classification accuracy (ACA) and the average instance accuracy (AIA) by 2.65% and 1.94%, respectively, compared to VoxNet.
- **Effect of re-classification of confusing classes**: Adding the confusing-class re-classification module improves the VCNN further, raising ACA and AIA by another 0.57% and 0.44%, respectively.

In conclusion, the paper significantly improves the performance of volume-based CNNs in 3D shape classification by optimizing network design and resolving confusing classes.
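
The angle criterion used for confusing class identification can be illustrated with a minimal sketch. It assumes the anchor vectors are available as the rows of a weight matrix (one row per class) and flags class pairs whose anchor vectors lie within a chosen angle of each other. The function names, the 30-degree threshold, and the toy vectors are illustrative assumptions rather than values from the paper, and the second criterion (large internal variation within a class) is not modeled here.

```python
import numpy as np

def pairwise_angles_deg(anchors):
    """Return the matrix of angles (in degrees) between all pairs of row vectors."""
    anchors = np.asarray(anchors, dtype=float)
    # Normalize each anchor vector to unit length so dot products become cosines.
    unit = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    # Clip to [-1, 1] to guard against tiny floating-point overshoots.
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def candidate_confusing_pairs(anchors, max_angle_deg):
    """List class-index pairs (i, j), i < j, whose anchor vectors are closer than max_angle_deg."""
    angles = pairwise_angles_deg(anchors)
    i_idx, j_idx = np.triu_indices(len(angles), k=1)  # upper triangle, no diagonal
    return [(int(i), int(j)) for i, j in zip(i_idx, j_idx)
            if angles[i, j] < max_angle_deg]

# Toy anchor vectors for three hypothetical classes (not data from the paper).
toy_anchors = np.array([
    [1.0, 0.0],   # class 0
    [1.0, 0.1],   # class 1, nearly parallel to class 0
    [0.0, 1.0],   # class 2, orthogonal to class 0
])

# Assumed threshold of 30 degrees; the paper's actual criterion may differ.
pairs = candidate_confusing_pairs(toy_anchors, max_angle_deg=30.0)
print(pairs)
```

In this toy setup only classes 0 and 1 are flagged, since their anchor vectors are nearly parallel. In a trained network, the rows would come from the weights of the last fully-connected layer.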

Design, Analysis and Application of A Volumetric Convolutional Neural Network

Continuous Volumetric Convolution Network with Self-Learning Kernels for Point Clouds

A Calibration-Informed Deep Learning Model for Three-Dimensional Particle Reconstruction of Volumetric Particle Image Velocimetry

Volume Upscaling with Convolutional Neural Networks

3D LVCN: A Lightweight Volumetric ConvNet

The Analysis Between Traditional Convolution Neural Network and CapsuleNet

V-CNN: Data Visualizing Based Convolutional Neural Network

A 3D Convolutional Neural Network for Volumetric Image Semantic Segmentation

Binary Volumetric Convolutional Neural Networks for 3-D Object Recognition

VolumeNet: A Lightweight Parallel Network for Super-Resolution of MR and CT Volumetric Data

3D Dense Separated Convolution Module for Volumetric Medical Image Analysis

Automatic Classification of Volumetric Optical Coherence Tomography Images via Recurrent Neural Network

LVNet: A lightweight volumetric convolutional neural network for real-time and high-performance recognition of 3D objects

V-CNN: when Convolutional Neural Network Encounters Data Visualization.

SparseVoxNet: 3-D Object Recognition With Sparsely Aggregation of 3-D Dense Blocks

Enhancement and evaluation for deep learning-based classification of volumetric neuroimaging with 3D-to-2D knowledge distillation

Toward Real-Time 3D Object Recognition: A Lightweight Volumetric CNN Framework Using Multitask Learning

3D shape classification based on convolutional neural networks fusing multi-view information

VC-Net: Deep Volume-Composition Networks for Segmentation and Visualization of Highly Sparse and Noisy Image Data

GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition

Voxel-based three-view hybrid parallel network for 3D object classification