Abstract: Approximately 2.2 billion people worldwide live with some degree of visual impairment. Those with severe impairments rely predominantly on hearing and touch to gather information about their surroundings. Reading materials for visually impaired people remain limited and are mostly audio or text, which cannot meet their need to comprehend graphical content. Although many scholars have investigated methods for converting visual images into tactile graphics, tactile graphic translation still fails to meet the reading needs of visually impaired individuals because of the diversity of image types and the limitations of image recognition technology. The primary goal of this paper is to help visually impaired people gain a greater understanding of the natural sciences by transforming images of mathematical functions into an electronic format from which tactile graphics can be produced. To improve the accuracy and efficiency of element recognition and segmentation in function graphs, this paper proposes an MA Mask R-CNN model that uses MA ConvNeXt, a novel feature extraction network, as its backbone and MA BiFPN, a novel feature fusion network, for feature fusion. The model combines information on local relations, global relations, and different channels to form an attention mechanism capable of establishing multiple connections; by combining a variety of multi-scale features, it increases the detection capability of the original Mask R-CNN on slender and multi-type targets. Experimental results show that MA Mask R-CNN attains 89.6% mAP for target detection and 72.3% mAP for target segmentation in the instance segmentation of function graphs, a 9% mAP improvement for detection and a 12.8% mAP improvement for segmentation compared to the original Mask R-CNN.
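For readers unfamiliar with BiFPN-style feature fusion, the following minimal sketch illustrates the fast normalized weighted fusion used in the standard BiFPN, on which the MA BiFPN named above is presumably built. It is not the MA BiFPN proposed in this paper, whose attention components are not described in the abstract. The function name, the 1-D lists standing in for feature maps, and the example weights are all hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Illustrative sketch of standard BiFPN fusion only; the paper's MA BiFPN
    adds attention mechanisms that are not reproduced here.
    """
    # Clamp each learnable weight at zero (ReLU) so contributions stay non-negative.
    w = [max(0.0, wi) for wi in weights]
    # Normalize by the weight sum; eps avoids division by zero.
    total = sum(w) + eps
    n = len(features[0])
    return [sum(wi * f[k] for wi, f in zip(w, features)) / total for k in range(n)]


# Two hypothetical 1-D "feature maps", assumed already resized to the same length.
p_fine = [1.0, 2.0, 3.0]
p_coarse = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p_fine, p_coarse], weights=[1.0, 1.0])
```

With equal weights the fused output is close to the element-wise mean of the inputs; a negative weight is clamped to zero, so that input is ignored.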
