A Large-Scale Data Clustering Algorithm Based on BIRCH and Artificial Immune Network.

Yangyang Li, Guangyuan Liu, Peidao Li, Licheng Jiao
DOI: https://doi.org/10.1007/978-3-319-93815-8_32
2018-01-01
Abstract: This paper describes a large-scale data clustering algorithm that combines the Balanced Iterative Reducing and Clustering using Hierarchies algorithm (BIRCH) with the Artificial Immune Network clustering algorithm (aiNet). Compared with traditional clustering algorithms, aiNet adapts better to non-convex datasets and does not require the number of clusters to be specified in advance. However, it is poorly suited to large-scale datasets because it takes a long time to evolve. In addition, the aiNet model is very sensitive to noise, which greatly restricts its application. BIRCH, by contrast, handles large-scale datasets well but, like traditional clustering algorithms, cannot deal with non-convex datasets, and it requires the number of clusters as input. Combining the two methods yields a new large-scale data clustering algorithm that inherits the advantages of both BIRCH and aiNet while overcoming their respective disadvantages.
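The abstract does not say how the two stages are connected, so the following is only a minimal sketch of one plausible reading: a cheap compression pass first reduces the data to a small set of centroids, and a slower, shape-aware stage then groups those centroids without being told the number of clusters. Everything here is an assumption for illustration, not the authors' method. `compress` is a flat, simplified stand-in for BIRCH's clustering-feature absorption (no CF tree), `link_components` is a connected-components stand-in for the aiNet stage (it is not an immune network), and the names `threshold` and `link_radius` are hypothetical parameters.

```python
import math


def compress(points, threshold):
    """Simplified, flat BIRCH-style pass (assumption: no CF tree).

    Each summary keeps a point count and a linear sum, so its centroid
    can be updated incrementally as points are absorbed.
    """
    summaries = []  # each entry: [count, linear_sum_vector]
    for p in points:
        best, best_d = None, math.inf
        for s in summaries:
            centroid = [v / s[0] for v in s[1]]
            d = math.dist(p, centroid)
            if d < best_d:
                best, best_d = s, d
        if best is not None and best_d <= threshold:
            # Absorb the point into the nearest summary.
            best[0] += 1
            best[1] = [a + b for a, b in zip(best[1], p)]
        else:
            # Too far from every summary: start a new one.
            summaries.append([1, list(p)])
    return [[v / n for v in ls] for n, ls in summaries]


def link_components(centroids, link_radius):
    """Stand-in for the second stage (NOT aiNet): connected components
    of the graph joining centroids closer than link_radius."""
    labels = [-1] * len(centroids)
    current = 0
    for i in range(len(centroids)):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            for k in range(len(centroids)):
                if labels[k] == -1 and math.dist(centroids[j], centroids[k]) <= link_radius:
                    labels[k] = current
                    stack.append(k)
        current += 1
    return labels


def two_stage_cluster(points, threshold, link_radius):
    """Compress first, group the centroids, then label every point
    by the group of its nearest centroid."""
    centroids = compress(points, threshold)
    centroid_labels = link_components(centroids, link_radius)
    labels = []
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        labels.append(centroid_labels[nearest])
    return labels, centroids
```

Calling `two_stage_cluster(points, threshold, link_radius)` returns one label per input point together with the stage-one centroids. The number of clusters is never supplied; it falls out of the connectivity of the centroids, which is what lets a sketch like this separate non-convex shapes such as concentric rings. Both parameters are illustrative and would need tuning per dataset.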