Survey of Data Annotation

Li CAI, Shu-Ting WANG, Jun-Hui LIU, Yang-Yong ZHU
DOI: https://doi.org/10.13328/j.cnki.jos.005977
2020-01-01
Journal of Software
Abstract: Data annotation is key to the effective operation of most artificial intelligence algorithms. The more accurate and plentiful the annotations, the better the algorithm performs. The growth of the data annotation industry has boosted employment in many cities and towns in China, and China is gradually becoming the world center of data annotation. This paper summarizes the development of data annotation, including its origin, application scenarios, classifications, and tasks; lists commonly used annotated data sets, open-source annotation tools, and commercial annotation platforms; proposes a data annotation specification covering roles, standards, and processes; and gives an example of data annotation for sentiment analysis. It then describes the models and characteristics of state-of-the-art algorithms for evaluating annotation results and compares their advantages and disadvantages. Finally, it discusses research focuses and development trends of data annotation from four aspects: tasks, tools, annotation quality, and security.