Probability Knowledge Acquisition from Unlabeled Instance Based on Dual Learning
Yuetan Zhao, Limin Wang, Xinyu Zhu, Taosheng Jin, Minghui Sun, Xiongfei Li
DOI: https://doi.org/10.1007/s10115-024-02238-9
IF: 2.7
2024-01-01
Knowledge and Information Systems
Abstract: The functionality of machine learning algorithms relies heavily on the abundance and quality of the available training data. However, the data may originate from diverse data subspaces, so a labeled training set typically offers only limited insight and inadequately represents the full range of potential scenarios. How to safely make use of unlabeled instances is an emerging and interesting problem for learning Bayesian network classifiers (BNCs), which graphically model the probabilistic relationships among variables in the form of a directed acyclic graph (DAG). In this paper, we introduce dual learning into the learning procedure to realize the safe exploitation of unlabeled instances. We elucidate the mapping between information metrics and the local DAG, as well as the distinction between informational (in)dependence and probabilistic (in)dependence. Building upon this foundation, we propose new metrics to accurately measure attribute dependencies within unlabeled instances. The proposed dual learning-based flexible selective k-dependence Bayesian network classifier (DL-FSKDB) employs eager learning to construct the initial model and incorporates lazy learning for personalized fine-tuning and optimization. Extensive experimental evaluations across 36 datasets spanning various domains with distinct properties reveal that the learned BNCs demonstrate competitive classification performance in comparison with state-of-the-art learners in terms of zero–one loss, bias and variance, as well as F1-measure.
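As a rough illustration of what measuring attribute dependencies without class labels can look like, the sketch below computes the empirical mutual information between two discrete attributes. This is the standard textbook quantity, not the new metrics proposed in the paper, which the abstract does not specify; the function name `mutual_information` and the toy data are assumptions made for this example only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete attributes.

    Hypothetical helper for illustration; it is not the paper's metric.
    xs and ys are equal-length sequences of attribute values, with no
    class labels involved.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    count_x = Counter(xs)          # marginal counts of x
    count_y = Counter(ys)          # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy unlabeled instances: attribute b copies a, attribute c is unrelated to a.
a = [0, 0, 1, 1]
b = [0, 0, 1, 1]
c = [0, 1, 0, 1]
dependent_score = mutual_information(a, b)
independent_score = mutual_information(a, c)
```

On this toy data the fully dependent pair scores higher than the independent pair, which is the kind of ordering a structure learner could use to decide which attribute arcs to keep in a local DAG.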