An improved ART 2-A model for mixed numeric and categorical data

Xiao Han, Yahui Yang, Qingni Shen, Min Xia
DOI: https://doi.org/10.1109/ICIECS.2009.5365746
2009-01-01
Abstract: Adaptive Resonance Theory (ART) architectures are an important class of neural networks for unsupervised clustering. ART 2-A is a member of the ART family capable of clustering both binary and numeric data. However, real-world problems usually involve categorical data, which ART 2-A cannot process directly. A simple solution is to preprocess categorical data with binary encoding. This approach is straightforward, but it suffers from two main drawbacks: increased dimensionality and lack of scalability. This paper therefore proposes ART 2a-M, an improved version of ART 2-A that can handle mixed numeric and categorical data. Experiments on the KDD Cup 99 data set compare ART 2a-M with ART 2-A. The results show that ART 2a-M not only overcomes the two drawbacks of binary encoding but also runs about 10% faster than the original model while maintaining the same accuracy. ©2009 IEEE.
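The dimensionality drawback of binary encoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function name `binary_encode`, the toy records, and the column layout are assumptions made for illustration, and the category values are only loosely modeled on network-traffic attributes.

```python
def binary_encode(records, categorical_idx):
    """One-hot (binary) encode the categorical columns; numeric columns pass through.

    Hypothetical helper for illustration only, not the preprocessing used in the paper.
    """
    # Collect the sorted set of distinct values seen in each categorical column.
    vocab = {i: sorted({r[i] for r in records}) for i in categorical_idx}
    encoded = []
    for r in records:
        row = []
        for i, value in enumerate(r):
            if i in vocab:
                # One binary input per distinct category value.
                row.extend(1.0 if value == v else 0.0 for v in vocab[i])
            else:
                row.append(float(value))
        encoded.append(row)
    return encoded


# Toy records: one numeric attribute followed by two categorical attributes.
records = [
    (0.2, "tcp", "http"),
    (0.5, "udp", "dns"),
    (0.1, "icmp", "echo"),
    (0.9, "tcp", "smtp"),
]
encoded = binary_encode(records, categorical_idx=[1, 2])
print(len(records[0]), "->", len(encoded[0]))
```

In this sketch the encoded width equals the number of numeric columns plus the number of distinct values across all categorical columns, so it grows with every new category value. A value that was not seen when the vocabulary was built would also change the input width, which is one way to read the scalability concern.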