Collaborative and Interactive Detection and Repair of Activity Labels in Process Event Logs.

Sareh Sadeghianasl, Arthur H. M. ter Hofstede, Suriadi, Selen Turkay
DOI: https://doi.org/10.1109/icpm49681.2020.00017
2020-01-01
Abstract: Process mining uses computational techniques for process-oriented data analysis. As with other types of data analysis, poor-quality input data leads to unreliable analysis outcomes (garbage in, garbage out). Among the key inputs to process mining analyses are the activity labels in event logs, which represent tasks that have been performed. Activity labels are not immune to data quality issues. Fixing them is an important but challenging endeavour, which may require domain knowledge and can be computationally expensive. In this paper we propose to tackle this challenge from a novel angle by using a gamified crowdsourcing approach to the detection and repair of problematic activity labels, namely those with identical semantics but different syntax. Evaluation of the prototype with users and a real-life log showed promising results in terms of the quality improvements achieved.
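To make "identical semantics but different syntax" concrete, the following minimal sketch shows what such labels can look like in an event log and how a repair mapping could be applied once the variants have been identified. It does not reproduce the paper's gamified crowdsourcing method. The event log, the labels, the repair mapping, and the function name `repair_labels` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical event log as (case id, activity label) pairs.
# The labels are invented for illustration, not taken from the paper.
event_log = [
    ("c1", "Check Invoice"),
    ("c1", "Approve Payment"),
    ("c2", "Invoice Check"),
    ("c2", "Approve Payment"),
    ("c3", "check invoice"),
    ("c3", "Payment Approval"),
]

# Hypothetical repair mapping from variant label to canonical label,
# for example as agreed on by domain experts or crowd workers.
repair_map = {
    "Invoice Check": "Check Invoice",
    "check invoice": "Check Invoice",
    "Payment Approval": "Approve Payment",
}


def repair_labels(log, mapping):
    """Replace each variant label with its canonical form; keep other labels as-is."""
    return [(case, mapping.get(label, label)) for case, label in log]


repaired_log = repair_labels(event_log, repair_map)

# Count occurrences per label before and after the repair.
labels_before = Counter(label for _, label in event_log)
labels_after = Counter(label for _, label in repaired_log)

print(len(labels_before), "distinct labels before repair")
print(len(labels_after), "distinct labels after repair")
```

In this toy log, several syntactically different labels collapse onto one canonical label each, which is the kind of quality improvement the abstract refers to. Deciding which labels belong together is the hard part and is what the paper delegates to its crowd-based approach.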