Spatially Constrained Spectral Clustering Algorithms for Region Delineation

Shuai Yuan, Pang-Ning Tan, Kendra Spence Cheruvelil, Sarah M. Collins, Patricia A. Soranno
DOI: https://doi.org/10.48550/arXiv.1905.08451
2019-05-21
Abstract: Regionalization is the task of dividing up a landscape into homogeneous patches with similar properties. Although this task has a wide range of applications, it has two notable challenges. First, it is assumed that the resulting regions are both homogeneous and spatially contiguous. Second, it is well-recognized that landscapes are hierarchical such that fine-scale regions are nested wholly within broader-scale regions. To address these two challenges, first, we develop a spatially constrained spectral clustering framework for region delineation that incorporates the tradeoff between region homogeneity and spatial contiguity. The framework uses a flexible, truncated exponential kernel to represent the spatial contiguity constraints, which is integrated with the landscape feature similarity matrix for region delineation. To address the second challenge, we extend the framework to create fine-scale regions that are nested within broader-scaled regions using a greedy, recursive bisection approach. We present a case study of a terrestrial ecology data set in the United States that compares the proposed framework with several baseline methods for regionalization. Experimental results suggest that the proposed framework for regionalization outperforms the baseline methods, especially in terms of balancing region contiguity and homogeneity, as well as creating regions of more similar size, which is often a desired trait of regions.
Machine Learning
What problem does this paper attempt to address?
The paper addresses two challenges in geographical regionalization:

1. **Balancing region homogeneity and spatial contiguity**: Regionalization aims to divide a landscape into regions that are homogeneous in their properties and also spatially contiguous. Existing methods may not satisfy both requirements at the same time.
2. **Handling hierarchical structure**: Natural landscapes are usually hierarchical, that is, fine-scale regions are nested wholly within broader-scale regions. Creating such nested regions effectively is a challenge in its own right.

To address these two challenges, the authors propose a **spatially constrained spectral clustering framework** that balances region homogeneity against spatial contiguity and is extended to hierarchical clustering through recursive bisection to create nested regions.

### Specific solutions

1. **Spatially constrained spectral clustering framework**:
   - **Spatial contiguity constraint**: A flexible truncated exponential kernel represents the spatial contiguity constraint and is combined with the landscape feature similarity matrix for region delineation.
   - **Feature similarity matrix**: The feature similarity matrix is computed with a Gaussian RBF kernel.
2. **Extension to hierarchical clustering**:
   - **Recursive bisection**: The most heterogeneous region is greedily selected and split in two, and the procedure repeats until each sub-region contains only one spatial unit. This yields fine-scale regions nested within broader-scale regions.

### Experimental verification

The authors evaluated the framework in a case study on a terrestrial ecology data set from the United States. The experimental results suggest that the framework outperforms several baseline methods, especially in the following respects:

- **Balance of homogeneity and contiguity**: The generated regions better balance within-region homogeneity against spatial contiguity.
- **Consistency of region size**: The generated regions are more similar in size, which is often a desired trait.

### Key formulas

- **Truncated exponential kernel**: \[ S_c(\delta)=\sum_{k=0}^{\delta}\frac{C^k}{k!} \] where \(C\) is the adjacency matrix and \(\delta\) controls the size of the ML neighborhood.
- **Combined similarity matrix (Hadamard product)**: \[ S_{\text{total}}(\delta)=S\circ S_c(\delta) \] where \(S\) is the feature similarity matrix and \(S_c(\delta)\) is the spatial constraint kernel matrix.
- **Generalized eigenvalue problem**: \[ \arg\min_r r^T L_{\text{total}}r\quad\text{s.t.}\quad r^T D_{\text{total}}r=\sum_i D_{\text{total},ii},\quad 1^T D_{\text{total}}r=0 \] where \(L_{\text{total}}=D_{\text{total}}-S\circ S_c(\delta)\) is the graph Laplacian and \(D_{\text{total}}\) is the diagonal degree matrix of \(S_{\text{total}}(\delta)\). The minimizer is the eigenvector of \(L_{\text{total}}r=\lambda D_{\text{total}}r\) with the second-smallest eigenvalue.

Together, these components give a regionalization method that handles both the homogeneity-contiguity trade-off and the nesting of fine-scale regions within broader ones.
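
The key formulas above can be tied together in a few lines of code. The following is a minimal sketch of a single bisection step, not the authors' implementation. The function names (`truncated_exp_kernel`, `rbf_similarity`, `bisect`), the bandwidth `sigma`, the toy path-graph data, and the rule of splitting the eigenvector at zero are all assumptions made for illustration; the summary does not specify them.

```python
import numpy as np
from scipy.linalg import eigh


def truncated_exp_kernel(C, delta):
    """Spatial constraint kernel S_c(delta) = sum_{k=0}^{delta} C^k / k!."""
    term = np.eye(C.shape[0])      # k = 0 term: C^0 / 0! = I
    total = term.copy()
    for k in range(1, delta + 1):
        term = term @ C / k        # turns C^(k-1)/(k-1)! into C^k/k!
        total += term
    return total


def rbf_similarity(X, sigma=1.0):
    """Gaussian RBF similarity between the feature rows of X (sigma is assumed)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))


def bisect(S, S_c):
    """Relaxed two-way cut on the Hadamard-combined similarity matrix."""
    S_total = S * S_c                      # elementwise (Hadamard) product
    d = S_total.sum(axis=1)                # diagonal entries of D_total
    L_total = np.diag(d) - S_total         # graph Laplacian
    # Solve L_total r = lambda * D_total r; eigenvalues come back ascending.
    _, vecs = eigh(L_total, np.diag(d))
    # Second-smallest eigenvector, rescaled so r^T D_total r = sum_i D_ii.
    r = vecs[:, 1] * np.sqrt(d.sum())
    return r, d


# Hypothetical toy data: six spatial units on a line, two feature groups.
C = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)   # path-graph adjacency
X = np.array([[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]])

r, d = bisect(rbf_similarity(X), truncated_exp_kernel(C, delta=2))
labels = r >= 0    # assumed splitting rule: sign of the eigenvector entry
print(labels)
```

Under these assumptions, repeating `bisect` on whichever region is currently most heterogeneous would give the nested hierarchy described above. A region whose combined similarity graph is disconnected would need extra handling, because the second eigenvector is then not unique.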