Incomplete Multi-view Multi-label Active Learning

Chuanwei Qu, Kuangmeng Wang, Hong Zhang, Guoxian Yu, Carlotta Domeniconi
DOI: https://doi.org/10.1109/icdm51629.2021.00160
2021-01-01
Abstract: The label information of training data is crucial for effective machine learning in many domains, but having domain experts annotate data at a large scale is expensive. This problem is intensified by the multiplicity and incompleteness of multi-view multi-label (MVML) objects, which almost all existing multi-view multi-label active learning approaches ignore. In this paper, we propose an incomplete multi-view multi-label active learning (iMVMAL) approach to reduce the cost of querying MVML data. iMVMAL first extends the under-complete autoencoder to learn the shared/individual representations of samples across/within incomplete views, using an indicator matrix to mark the samples missing from each view. As such, the optimization of the autoencoder's parameters is not affected by the missing samples. Next, it uses the extracted shared/individual information to train multiple classifiers and quantifies the informativeness of sample-label pairs from these classifiers, as well as from label-wise and sample-wise information. After that, it selects the sample-label pairs with the highest informativeness for query. Empirical studies on benchmark datasets show that iMVMAL outperforms competitive baselines at the same query cost in the complete multi-view setting, and maintains its effectiveness in the incomplete multi-view setting as well.
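The two mechanisms named in the abstract, an indicator that keeps missing samples out of the autoencoder objective and the selection of the most informative sample-label pairs, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the squared-error loss, and the idea of passing in a precomputed informativeness matrix are all assumptions made for illustration, and the abstract does not specify how iMVMAL computes informativeness.

```python
import numpy as np


def masked_reconstruction_loss(X, X_hat, present):
    """Mean squared reconstruction error over the samples present in one view.

    X, X_hat: (n_samples, n_features) original and reconstructed view.
    present:  (n_samples,) 0/1 indicator; 0 marks a sample missing in this view.
    Hypothetical helper: the loss form is assumed, not taken from the paper.
    """
    mask = np.asarray(present, dtype=bool)
    if not mask.any():
        return 0.0
    # Rows of missing samples are dropped before the error is computed, so
    # placeholder values (even NaN) cannot influence the loss or its gradient.
    diff = X[mask] - X_hat[mask]
    return float((diff ** 2).sum(axis=1).mean())


def select_pairs(scores, queried, k):
    """Return up to k unqueried (sample, label) pairs with the highest scores.

    scores:  (n_samples, n_labels) informativeness values (assumed given).
    queried: (n_samples, n_labels) boolean, True where the label is known.
    """
    masked = np.where(queried, -np.inf, scores)
    order = np.argsort(masked, axis=None)[::-1]  # flat indices, best first
    order = [i for i in order if np.isfinite(masked.flat[i])][:k]
    rows, cols = np.unravel_index(order, scores.shape)
    return list(zip(rows.tolist(), cols.tolist()))


# Toy data: sample 1 is missing from this view and stored as NaN.
X = np.array([[1.0, 2.0], [np.nan, np.nan], [3.0, 4.0]])
X_hat = np.array([[1.0, 1.0], [0.0, 0.0], [3.0, 2.0]])
loss = masked_reconstruction_loss(X, X_hat, present=[1, 0, 1])

scores = np.array([[0.1, 0.9], [0.8, 0.3]])
queried = np.array([[False, True], [False, False]])
pairs = select_pairs(scores, queried, k=2)
```

In this sketch the missing row contributes nothing to the loss, and the already-queried pair (0, 1) is skipped even though it has the highest score.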