Research on an Improved Neural Network Model for Film Text Image Segmentation in Film Internet of Things

Yangfan Tong, Shuo Feng, Ruiqing Zhang
DOI: https://doi.org/10.21203/rs.3.rs-106013/v1
2020-11-13
Abstract: To address the difficulty of recognizing and processing film text in the Film Internet of Things, this paper seeks a method that can effectively identify the content of film text. It uses the Mask R-CNN algorithm with ResNet101 as the backbone network to build a film document image segmentation model. The optimal hyperparameters are anchor box aspect ratios of [0.5, 1, 3], a non-maximum suppression threshold of 0.15, and a confidence threshold of 0.85; with these settings the F1 score is 0.8951. When the same hyperparameters are evaluated at an IoU threshold of 0.8, the F1 score is 0.7417. According to the results of the Pattern Recognition Laboratory of the Chinese Academy of Sciences, the model ranks first at an IoU threshold of 0.6. At an IoU threshold of 0.8 it ranks second, behind a single-task model that is not end-to-end. These rankings indicate that the hyperparameter tuning and model training were relatively successful. The experimental results show that Mask R-CNN can accurately identify all the formulas in the film text, and that it is significantly better at identifying small objects such as formulas in film text images than the traditional Fast R-CNN and Faster R-CNN.
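The abstract does not spell out how the confidence threshold, the non-maximum suppression threshold, and the IoU threshold interact, so the following is a minimal sketch rather than the authors' pipeline. It assumes axis-aligned bounding boxes in (x1, y1, x2, y2) form instead of masks, greedy one-to-one matching between predictions and ground truth, and hypothetical helper names (`iou`, `filter_detections`, `f1_at_iou`); the toy boxes and scores are invented for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes: List[Box], scores: List[float],
                      min_conf: float = 0.85, nms_thresh: float = 0.15) -> List[Box]:
    """Drop low-confidence boxes, then apply greedy non-maximum suppression.

    The defaults mirror the thresholds quoted in the abstract.
    """
    order = sorted((i for i, s in enumerate(scores) if s >= min_conf),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]


def f1_at_iou(pred: List[Box], truth: List[Box], iou_thresh: float) -> float:
    """F1 score where a prediction counts as correct if it matches an unused
    ground-truth box with IoU >= iou_thresh (greedy matching, an assumption)."""
    used = set()
    tp = 0
    for p in pred:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truth):
            if j not in used and iou(p, t) > best_iou:
                best_j, best_iou = j, iou(p, t)
        if best_j >= 0 and best_iou >= iou_thresh:
            used.add(best_j)
            tp += 1
    fp, fn = len(pred) - tp, len(truth) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, invented for illustration only.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 27), (50, 50, 60, 60)]
scores = [0.95, 0.90, 0.92, 0.50]

kept = filter_detections(boxes, scores)
print(kept)
print(f1_at_iou(kept, truth, 0.6), f1_at_iou(kept, truth, 0.8))
```

In this toy case the second predicted box overlaps its ground truth with an IoU of 0.7, so it counts as a hit at a threshold of 0.6 but as a miss at 0.8. That is the same mechanism by which a stricter IoU threshold lowers the F1 score for an unchanged set of detections.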