Abstract: In July 2020, the Joint Video Experts Team published the Versatile Video Coding (VVC) standard. The VVC encoder improves coding efficiency over its predecessor, the High Efficiency Video Coding (HEVC) encoder, thanks to improved coding modules and newly introduced techniques such as the block partitioning structure called quadtree with nested multi-type tree (QTMT). However, QTMT significantly increases encoding time, mainly at the rate-distortion optimization (RDO) level, which results in enormous computational complexity. Instead of the RDO-based QTMT partition process, a deep-QTMT partition approach based on a fast convolutional neural network-ternary tree (CNN-TT) is proposed to predict the best intra-QTMT decision tree and thereby reduce the encoding time. A database is first established containing coding unit (CU)-based TT partition depths for varied video content. Then, a CNN-TT model is developed over the three levels provided by the TT structure to determine the QTMT partition early at the 32 × 32 CU size. Different threshold values are fixed for each level according to the CNN-TT predicted probabilities to balance encoding complexity against coding efficiency. The experimental results show that the deep-QTMT partition approach saves between 23% and 58% of the encoding time on average, with acceptable RD performance.
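
To make the QTMT structure named in the abstract concrete, the following minimal sketch computes the sub-block geometry produced by each split type on a CU. The split shapes follow the VVC design (a quadtree split into four quadrants, binary splits into halves, ternary splits in a 1:2:1 ratio). The function name `split_block` and the mode labels `QT`, `BT_H`, `BT_V`, `TT_H`, `TT_V` are illustrative assumptions, not identifiers from the paper or from any encoder.

```python
from typing import List, Tuple

# A block is (x, y, width, height) in luma samples.
Block = Tuple[int, int, int, int]


def split_block(x: int, y: int, w: int, h: int, mode: str) -> List[Block]:
    """Return the sub-blocks produced by one QTMT split of a block.

    Mode labels are hypothetical names chosen for this sketch.
    """
    if mode == "QT":  # quadtree: four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_H":  # horizontal binary split: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "BT_V":  # vertical binary split: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "TT_H":  # horizontal ternary split: heights in ratio 1:2:1
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if mode == "TT_V":  # vertical ternary split: widths in ratio 1:2:1
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError(f"unknown split mode: {mode}")


if __name__ == "__main__":
    for mode in ("QT", "BT_H", "BT_V", "TT_H", "TT_V"):
        print(mode, split_block(0, 0, 32, 32, mode))
```

Because every sub-block can be split again, the number of candidate partitions the RDO search must evaluate grows quickly with depth, which is the complexity the abstract refers to.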

### What problems does this paper attempt to solve?

This paper aims to reduce the significantly increased encoding time and computational complexity that the VVC (Versatile Video Coding) standard incurs by adopting the new block partitioning structure QTMT (quadtree with nested multi-type tree). Specifically:

1. **Challenges brought by the improvement of VVC encoding efficiency**:
   - The VVC encoder is significantly more efficient than its predecessor, the HEVC (High Efficiency Video Coding) encoder. This is mainly due to improved encoding modules and new techniques, such as the QTMT block partitioning structure.
   - However, the QTMT structure causes very high computational complexity at the rate-distortion optimization (RDO) level, which significantly increases the encoding time.
2. **Limitations of existing methods**:
   - Although various machine-learning-based and deep-learning-based methods accelerate the CU (Coding Unit) partitioning decision, they have not fully studied the TT (Ternary Tree) structure.
3. **The proposed new method**:
   - The paper proposes a fast QTMT partitioning method based on a convolutional neural network (CNN), called CNN-TT (Convolutional Neural Network Ternary Tree), to predict the optimal intra-QTMT decision tree and thereby reduce the encoding time.
   - By building a database of CU-based TT partitioning depths for different video contents and developing a three-level CNN-TT model that determines the QTMT partitioning early, the method reduces the encoding time while maintaining acceptable rate-distortion performance.

### Main contributions

- **Database generation**: A large database containing the TT partitioning depths of 32 × 32 CUs was constructed.
- **CNN-TT model development**: A CNN-TT classifier based on the TT structure was developed and trained to predict the TT partitioning of 32 × 32 CUs.
- **Model application**: The trained CNN-TT model was applied to the VVC encoder to predict the QTMT decision tree.

### Experimental results

The experimental results show that this method saves between 23% and 58% of the encoding time on average while maintaining acceptable rate-distortion performance, which supports its effectiveness in reducing encoding complexity.

In summary, this paper addresses the high computational complexity and long encoding time that the QTMT structure causes in VVC encoding by introducing the CNN-TT model, which offers a new way to speed up video encoding.
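
The abstract states that a separate threshold is fixed for each of the three levels and applied to the CNN-TT predicted probabilities, but it does not give the decision rule or the threshold values. The sketch below shows one plausible rule, purely as an assumption: a confident prediction (probability at or above the threshold, or at or below one minus the threshold) decides the split directly, and an uncertain one falls back to the full RDO search. The function names, the three-way outcome, and the example numbers are all hypothetical.

```python
from typing import List, Sequence


def decide(prob_split: float, threshold: float) -> str:
    """Map one predicted split probability to a decision.

    Assumed rule (not taken from the paper):
      prob >= threshold      -> "split"    (skip the RDO check)
      prob <= 1 - threshold  -> "no_split" (skip the RDO check)
      otherwise              -> "rdo"      (fall back to the full search)
    """
    if not 0.5 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0.5, 1.0]")
    if prob_split >= threshold:
        return "split"
    if prob_split <= 1.0 - threshold:
        return "no_split"
    return "rdo"


def decide_levels(probs: Sequence[float], thresholds: Sequence[float]) -> List[str]:
    """Apply a per-level threshold, stopping once a level decides not to split."""
    decisions: List[str] = []
    for prob, threshold in zip(probs, thresholds):
        decision = decide(prob, threshold)
        decisions.append(decision)
        if decision == "no_split":
            break  # deeper levels are unreachable without a split
    return decisions


if __name__ == "__main__":
    # Hypothetical probabilities and thresholds for the three levels.
    print(decide_levels([0.95, 0.5, 0.1], [0.9, 0.8, 0.7]))
```

Under this assumed rule, a higher threshold sends more CUs to the full RDO search, which preserves coding efficiency at the cost of time, while a lower threshold skips more searches. That trade-off is consistent with the range of time savings the paper reports, although the actual settings behind the 23% and 58% figures are not given here.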

CNN-based ternary tree partition approach for VVC intra-QTMT coding
