CroAno - A Crowd Annotation Platform for Improving Label Consistency of Chinese NER Dataset.

Baoli Zhang, Zhucong Li, Zhen Gan, Yubo Chen, Jing Wan, Kang Liu, Jun Zhao, Shengping Liu, Yafei Shi
DOI: https://doi.org/10.18653/v1/2021.emnlp-demo.32
2021-01-01
Abstract: In this paper, we introduce CroAno, a web-based crowd annotation platform for Chinese named entity recognition (NER). Besides basic crowd annotation features such as fast tagging and data management, CroAno provides a systematic solution for improving the label consistency of Chinese NER datasets. 1) Disagreement Adjudicator: CroAno uses a multi-dimensional highlight mode to visualize instance-level inconsistent entities and makes the revision process user-friendly. 2) Inconsistency Detector: CroAno employs a detector to locate corpus-level label inconsistency and provides users with an interface to correct inconsistent entities in batches. 3) Prediction Error Analyzer: We decompose the model's entity prediction errors into six fine-grained entity error types. Users can employ this error taxonomy to detect corpus-level inconsistency from a model perspective. To validate the effectiveness of our platform, we use CroAno to revise two public datasets. On the two revised datasets, model performance improves by +1.96% and +2.57% F1, respectively.
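To make the notion of corpus-level label inconsistency concrete, the following is a minimal Python sketch. It is not CroAno's detector, whose algorithm the abstract does not specify. It assumes one simple definition: the same mention string carries more than one entity label somewhere in the corpus. The function name `find_inconsistent_mentions`, the `(mention, label)` input format, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def find_inconsistent_mentions(corpus):
    """Return mentions that carry more than one label across the corpus.

    `corpus` is a list of sentences; each sentence is a list of
    (mention_text, label) pairs. This input format is an assumption
    made for illustration, not CroAno's data model.
    """
    label_counts = defaultdict(Counter)
    for sentence in corpus:
        for mention, label in sentence:
            label_counts[mention][label] += 1
    # Keep only mentions that were annotated with at least two distinct labels.
    return {
        mention: dict(counts)
        for mention, counts in label_counts.items()
        if len(counts) > 1
    }


# Toy corpus (hypothetical data): "北京" is labeled LOC once and ORG once.
corpus = [
    [("北京", "LOC"), ("张三", "PER")],
    [("北京", "ORG")],
    [("张三", "PER")],
]

inconsistent = find_inconsistent_mentions(corpus)
print(inconsistent)
```

Under this assumed definition, each flagged mention and its label counts could be shown to an annotator, who then decides whether the variation is a genuine ambiguity or an annotation error to correct in batch.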