Model stability: a key factor in determining whether an algorithm produces an optimal model from a matching distribution

Kai Ming Ting, Regina Jing Ying Quek
DOI: https://doi.org/10.1109/ICDM.2003.1251000
2003-01-01
Abstract: We investigate the factors that lead to suboptimal models when the training and test class distributions (or misclassification costs) are matched. Our result shows that model stability plays a key role in determining whether an algorithm produces an optimal model from a matching distribution (cost). The performance difference between a model trained from the matching distribution (cost) and the optimal model generally increases as the degree of model stability decreases. The practical implication is that one should follow the conventional wisdom of training a classifier with a class distribution (cost) that matches the test class distribution (cost) only if the learning algorithm is known to be stable.