Quality-aided Annotation Service Selection in MLaaS Market

Shanyang Jiang, Lan Zhang
DOI: https://doi.org/10.1109/iwqos54832.2022.9812877
2022-01-01
Abstract: Markets offering data annotation services are growing fast and play an important part in machine learning. While many multi-label prediction services are available, consumers find it challenging to decide which services to use for their own tasks and budgets because the services differ in labeling categories, labeling quality, and price. In this paper, we focus on the practical problem of obtaining high-quality multi-label annotation data from multiple services under a budget constraint. We propose a framework that first parameterizes label generation using a constructed Probabilistic Graph Model and then applies an Expectation-Maximization (EM)-based iterative algorithm to estimate each service's labeling quality and the distribution of each task's true labels. We then cast annotation service selection as an adaptive submodular maximization coverage problem, which motivates an adaptive random greedy algorithm with a constant approximation ratio of 1−1/e. We evaluate our design through both real-world experiments and a series of simulations on various machine learning models and real datasets. The experiments show that our method improves both accuracy and reliability.
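To make the EM-based quality estimation concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's algorithm: it treats a single binary label (rather than multi-label output), gives each service one accuracy parameter (a "one-coin" model), and fixes a uniform prior over the true label. The function name `em_one_coin` and its parameters are hypothetical and chosen only for illustration.

```python
def em_one_coin(labels, iters=50, eps=1e-6):
    """Estimate per-service accuracy and per-task truth posteriors.

    labels[i][j] is the 0/1 label that service j gave to task i.
    Assumption: one accuracy per service, uniform prior on the truth.
    """
    n, m = len(labels), len(labels[0])
    # Initialise P(truth_i = 1) with the fraction of services voting 1.
    q = [sum(row) / m for row in labels]
    acc = [0.5] * m
    for _ in range(iters):
        # M-step: accuracy = expected fraction of tasks labelled correctly.
        for j in range(m):
            agree = sum(q[i] if labels[i][j] == 1 else 1.0 - q[i]
                        for i in range(n))
            # Clamp away from 0 and 1 so the E-step never divides by zero.
            acc[j] = min(max(agree / n, eps), 1.0 - eps)
        # E-step: posterior of truth_i = 1 given all service labels.
        for i in range(n):
            like1 = like0 = 1.0
            for j in range(m):
                if labels[i][j] == 1:
                    like1 *= acc[j]
                    like0 *= 1.0 - acc[j]
                else:
                    like1 *= 1.0 - acc[j]
                    like0 *= acc[j]
            q[i] = like1 / (like1 + like0)
    return acc, q


# Toy data (made up): services 0 and 1 always agree, service 2 is noisy.
toy = [[1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0],
       [1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0]]
accuracy, posterior = em_one_coin(toy)
print([round(a, 2) for a in accuracy])
print([round(p) for p in posterior])
```

In a budgeted setting, accuracy estimates of this kind could feed a selection step that weighs each service's expected quality gain against its price; the sketch above covers only the estimation half.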