Detecting Arbitrarily Shaped Clusters Using Ant Colony Optimization
Tao Pei, You Wan, Yong Jiang, Chenxu Qu, Chenghu Zhou, Youlin Qiao
DOI: https://doi.org/10.1080/13658816.2010.533674
2011-01-01
International Journal of Geographical Information Science
Abstract: In a map of geo-referenced population and cases, detecting the most likely cluster (MLC), which is made up of many connected polygons (e.g., census tracts), may face two difficulties: the irregular shape of the cluster and its heterogeneity. A heterogeneous cluster is one that contains depression links. A polygon is a depression link if it satisfies two conditions: (1) the ratio of cases to population in the polygon is below the average ratio of the whole map; (2) removing the polygon disconnects the cluster. Previous studies have successfully solved the problem of detecting arbitrarily shaped clusters that contain no depression links. For a heterogeneous cluster, however, existing methods may make mistakes, for example, missing some parts of the cluster. In this article, a spatial scanning method based on ant colony optimization (AntScan) is proposed to improve the detection power. If each polygon is simplified to a node, the research area consisting of many polygons can be seen as a graph, and the detection of the MLC becomes a search for the best subgraph (the one with the largest likelihood value) in that graph. A comparison of AntScan, GAScan (the spatial scan method based on genetic optimization), and SAScan (the spatial scan method based on simulated annealing optimization) indicates that (1) the performance of GAScan and SAScan is significantly influenced by the fraction value parameter (the maximum allowed size of the detected cluster), which can only be estimated by multiple trials, while AntScan needs no such parameter; (2) AntScan shows superior power over GAScan and SAScan in detecting heterogeneous clusters.
The case study on esophageal cancer in North China demonstrates that the cluster identified by AntScan has a larger likelihood value than that detected by SAScan and covers all high-risk regions of esophageal cancer, whereas SAScan misses some high-risk regions (the region in the southwest of Shandong province, eastern China) because of a depression link.
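The two-part definition of a depression link given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`is_connected`, `depression_links`), the dictionary-based data layout, and the toy map are hypothetical, and the "average ratio of the whole map" is assumed here to be total cases divided by total population, which the abstract does not specify.

```python
from collections import deque


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search restricted to `nodes`
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def depression_links(cluster, adjacency, cases, population):
    """List polygons in `cluster` that satisfy both depression-link conditions."""
    # Assumption: the map-wide average ratio is total cases / total population.
    map_rate = sum(cases.values()) / sum(population.values())
    links = []
    for polygon in cluster:
        # Condition (1): local ratio below the map-wide average ratio.
        below_average = cases[polygon] / population[polygon] < map_rate
        # Condition (2): removing the polygon disconnects the cluster.
        remaining = set(cluster) - {polygon}
        disconnects = not is_connected(remaining, adjacency)
        if below_average and disconnects:
            links.append(polygon)
    return sorted(links)


# Hypothetical toy map: A - B - C form a chain, D is a neighbor outside the cluster.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
cases = {"A": 30, "B": 1, "C": 30, "D": 2}
population = {"A": 100, "B": 100, "C": 100, "D": 100}

print(depression_links({"A", "B", "C"}, adjacency, cases, population))
```

In this toy map, polygon B has a low ratio and is the only connection between the high-ratio polygons A and C, so it is the kind of polygon the abstract describes as a depression link. Condition (2) is the same as asking whether the polygon is an articulation point of the subgraph induced by the cluster.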