Two-layer Federated Learning for Scene Text Detection

Haolin Wang, Xiao Xiao, Yilong Hui, Zhisheng Yin, Nan Cheng
DOI: https://doi.org/10.1109/IPCCC51483.2021.9679372
2021-01-01
Abstract: Incident scene text detection, as the most crucial step of an incident scene text recognition system, has received increasing research attention. In this paper, a two-layer mobile federated learning model (TMFL) is proposed to protect data privacy and improve training efficiency. In particular, a fast scene text detector is proposed to detect multi-oriented and multi-scale text by using an asymmetric convolution based feature pyramid network (AC-FPN). Compared with the traditional feature pyramid, asymmetric convolutions can effectively extract rotation-invariant features, improving the model's robustness to oriented text. Moreover, to balance detection accuracy and efficiency, we modify the lightweight MobileNetV3 backbone and integrate it with the asymmetric convolution based feature pyramid. We evaluate the detector on three benchmark datasets, and the results show improvements in both accuracy and speed. Our detector achieves an F-measure of 87.8 on ICDAR2013, 80.5 on MSRA-TD500 and 84.1 on ICDAR2015, running at 32.5 FPS.
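For readers unfamiliar with the metric reported above, the F-measure in text detection is conventionally the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function names and the counts in the usage note are hypothetical, and the matching rule that decides which detections count as true positives (for example an IoU threshold) is dataset-specific and not stated in the abstract.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_scores(true_positives: int, num_detections: int, num_ground_truth: int):
    """Return (precision, recall, F-measure) from raw detection counts.

    The counts are assumed to come from some matching procedure between
    predicted and ground-truth text boxes; that procedure is not shown here.
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, f_measure(precision, recall)
```

As a usage example with made-up counts, `detection_scores(80, 100, 160)` gives a precision of 0.8 and a recall of 0.5. The abstract reports its scores on a 0 to 100 scale, so a value returned here would be multiplied by 100 for comparison.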