High Confidence Fragment-Based Classification Rule Mining for Imbalanced HIV Data

Bing Lv, Jianyong Wang, Lizhu Zhou
DOI: https://doi.org/10.1007/978-3-540-78849-2_8
2008-01-01
Abstract: In this paper, we study the problem of mining high-confidence fragment-based classification rules from imbalanced HIV data whose class distribution is extremely skewed. We propose an efficient approach to mining frequent fragments in different classes of compounds; these fragments provide the best hints about the characteristics of each class and can be used to build associative classification rules. We adopt the pattern-growth paradigm and define an efficient fragment enumeration scheme. Moreover, we introduce an improved instance-centric rule-generation strategy to mine high-confidence fragment-based classification rules, which are insightful and useful for differentiating one class from the others. Experiments show that our algorithm discovers more interesting rules than the previous method and can facilitate the detection of new compounds with the desired anti-HIV activity.
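
To make the notions of rule confidence and instance-centric rule selection concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes each compound has already been reduced to a set of fragment identifiers, considers only single-fragment rules of the form "fragment implies class", and keeps, for every training compound, the highest-confidence rule that covers it and predicts its own class. The function names, fragment strings, and toy data are hypothetical.

```python
from collections import Counter


def fragment_rules(compounds, labels, min_support=1):
    """Return {(fragment, label): (support, confidence)} for single-fragment rules.

    support    = number of compounds of class `label` containing `fragment`
    confidence = support / number of compounds (any class) containing `fragment`
    """
    fragment_count = Counter()
    joint_count = Counter()
    for fragments, label in zip(compounds, labels):
        for fragment in set(fragments):
            fragment_count[fragment] += 1
            joint_count[(fragment, label)] += 1
    return {
        (fragment, label): (n, n / fragment_count[fragment])
        for (fragment, label), n in joint_count.items()
        if n >= min_support
    }


def best_rule_per_instance(compounds, labels, rules):
    """Instance-centric selection (simplified): for each compound, keep the
    covering rule with the highest confidence, breaking ties by support."""
    selected = set()
    for fragments, label in zip(compounds, labels):
        covering = [
            (rules[(f, label)][1], rules[(f, label)][0], f)
            for f in set(fragments)
            if (f, label) in rules
        ]
        if covering:
            _, _, fragment = max(covering)
            selected.add((fragment, label))
    return selected


# Hypothetical, heavily skewed toy data: one active compound, four inactive.
compounds = [
    {"C-N", "N=N=N"},
    {"C-N", "C=O"},
    {"C=O"},
    {"C-N", "C=O"},
    {"C=O", "C-O"},
]
labels = ["active", "inactive", "inactive", "inactive", "inactive"]

rules = fragment_rules(compounds, labels)
selected = best_rule_per_instance(compounds, labels, rules)
```

Because selection is driven by each instance and not by a global support ranking, the single active compound still contributes a rule (here the one built on the hypothetical "N=N=N" fragment) even though its class is rare, which is the motivation for instance-centric strategies on skewed data.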