Abstract: Deep convolutional neural networks (CNNs) have achieved great success in pattern recognition, such as recognizing text in images. However, existing CNN-based frameworks still have several drawbacks: 1) the traditional pooling operation may lose important feature information and is unlearnable; 2) the traditional convolution operation optimizes slowly, and the hierarchical features from different layers are not fully utilized. In this work, we address these problems by developing a novel deep network model called the Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet). Specifically, we design a further dense block, called the intensive block, to extract feature information, in which the original inputs and two dense blocks are tightly connected. To encode data appropriately, we present the concepts of the dense fusion block and further dense fusion operations for our new intensive block. By adding short connections to different layers, the feature flow and the coupling between layers are enhanced. We also replace the traditional convolution with depthwise separable convolution to make the operation efficient. To limit the loss of important feature information, we use a convolution operation with stride 2 to replace the original pooling operation in the customary transition layers. The recognition results on large-scale Chinese string and MNIST datasets show that IntensiveNet delivers better recognition results than other related deep models.
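The dense blocks mentioned in the abstract rely on concatenating earlier feature maps onto later ones. The following is a minimal sketch of the channel bookkeeping for DenseNet-style concatenation only. The function name, the growth rate, and the layer count are hypothetical, and the exact wiring of the paper's intensive block is not given here, so it is not modeled.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Channel bookkeeping for a DenseNet-style block (illustrative only).

    Each layer receives the concatenation of the block input and the
    outputs of all earlier layers, and contributes `growth_rate` new
    channels of its own.
    """
    # Layer i sees the block input plus i earlier outputs.
    per_layer_inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    # The block output concatenates the input with every layer's output.
    out_channels = in_channels + num_layers * growth_rate
    return per_layer_inputs, out_channels


# Hypothetical sizes: 16 input channels, growth rate 12, 4 layers.
per_layer, out = dense_block_channels(16, 12, 4)
print(per_layer, out)
```

Because every layer sees all earlier features directly, the input width grows linearly with depth, which is why dense designs are usually paired with transition layers that reduce the feature-map size.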

What problem does this paper attempt to address?

This paper addresses several deficiencies in existing text recognition frameworks based on deep convolutional neural networks (CNNs):

1. **Defects of traditional pooling operations**: Traditional pooling operations may lose important feature information and are unlearnable. This can degrade model performance on images with complex backgrounds and contents.
2. **Inefficiency of traditional convolution operations**: Traditional convolution operations are slow to optimize, and the hierarchical features from different layers are not fully utilized. This limits the learning efficiency and feature extraction ability of the model.

To solve these problems, the paper proposes a new deep network model, the **Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet)**. The model enhances feature flow and inter-layer coupling by introducing intensive blocks, dense fusion blocks, and further dense fusion operations. In addition, it replaces the traditional pooling operation with a stride-2 convolution to limit the loss of important feature information, and it uses depthwise separable convolution to improve computational efficiency.

### Specific improvement points

1. **Intensive block**:
   - Two dense blocks are introduced, and the input features are tightly connected to them through short connections, which enhances feature flow and inter-layer coupling.
   - The dense fusion block and further dense fusion operations are proposed to strengthen feature representation learning.
2. **Replace pooling operations**:
   - A convolution with a stride of 2 replaces the traditional pooling operation, which reduces the loss of feature information and makes the parameters of the entire framework learnable.
3. **Improve model efficiency**:
   - Depthwise separable convolution replaces the standard convolution to reduce the computational cost while maintaining similar performance.

With these improvements, the experimental results on large-scale Chinese string and MNIST datasets show that IntensiveNet delivers better recognition results than other related deep models.

### Summary

This paper aims to solve the problems of feature information loss and low computational efficiency in existing text recognition frameworks by designing a new convolutional neural network structure, thereby improving both the accuracy and the efficiency of text recognition.
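The efficiency and downsampling claims above can be checked with simple arithmetic. The sketch below counts the weights of a standard convolution against a depthwise separable one, and compares the output size of 2×2 pooling with that of a stride-2 convolution. All concrete sizes (64 and 128 channels, a 3×3 kernel, a 32-pixel input, padding of 1) are assumptions chosen for illustration, not values taken from the paper.

```python
def conv_params(in_ch, out_ch, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * in_ch * out_ch


def separable_conv_params(in_ch, out_ch, k):
    """Weights in a depthwise separable convolution (bias omitted).

    One k x k filter per input channel (depthwise), followed by a
    1 x 1 convolution that mixes channels (pointwise).
    """
    depthwise = k * k * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise


def output_size(size, k, stride, padding):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - k) // stride + 1


# Hypothetical layer: 64 -> 128 channels with a 3 x 3 kernel.
standard = conv_params(64, 128, 3)
separable = separable_conv_params(64, 128, 3)

# Hypothetical 32-pixel-wide feature map, downsampled two ways.
pooled = output_size(32, k=2, stride=2, padding=0)   # 2 x 2 pooling
strided = output_size(32, k=3, stride=2, padding=1)  # stride-2 convolution

print(standard, separable, pooled, strided)
```

Under these assumed sizes, the separable layer needs roughly one eighth of the weights of the standard one, and both downsampling options halve the feature map. The difference is that the stride-2 convolution has trainable weights, whereas max or average pooling has none.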

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition

Database Systems for Advanced Applications

Dense Residual Network: Enhancing global dense feature flow for character recognition

Deep Neural Network with Attention Model for Scene Text Recognition.

Deep CovDenseSNN: A Hierarchical Event-Driven Dynamic Framework with Spiking Neurons in Noisy Environment

A Residual-Attention Offline Handwritten Chinese Text Recognition Based on Fully Convolutional Neural Networks.

Deep Texture Recognition Via Exploiting Cross-Layer Statistical Self-Similarity

Intelligent character recognition using fully convolutional neural networks

Text-Attentional Convolutional Neural Networks for Scene Text Detection

Text-Attentional Convolutional Neural Network for Scene Text Detection

Reconstruction Combined Training for Convolutional Neural Networks on Character Recognition

Densely Connected CNN with Multi-scale Feature Attention for Text Classification

Convolutional Neural Networks for Text Classification with Multi-size Convolution and Multi-type Pooling.

Deep Networks for Image-to-Image Prediction

Multitask learning and CNN for application of face recognition.

Facial Expression Recognition System Based On Deep Residual Fusion Neural Network

A Convolutional Neural Network Face Recognition Method Based on BiLSTM and Attention Mechanism

An Efficient Channel Attention CNN for Facial Expression Recognition

CSFF-Net: Scene Text Detection Based on Cross-Scale Feature Fusion

Facial Expression Recognition Based on Convolution Neural Network

Efficient Neural Network for Text Recognition in Natural Scenes Based on End-to-End Multi-Scale Attention Mechanism