Abstract: Urban land use classification plays a significant role in urban studies and provides key guidance for urban development. However, existing methods predominantly rely on either raster structure deep features extracted by convolutional neural networks (CNNs) or topological structure deep features extracted by graph neural networks (GNNs), which makes it difficult to capture the rich semantic information in remote sensing images comprehensively. To address this limitation, we propose a novel urban land use classification model that integrates both raster and topological structure deep features to improve classification accuracy and robustness. First, we divide the urban area into block units based on road network data and further subdivide these units using the fractal network evolution algorithm (FNEA). Next, a K-nearest neighbors (KNN) graph construction method with adaptive fusion coefficients is employed to generate both global and local graphs of the blocks and sub-units. The spectral features and subgraph features are then constructed, and a graph convolutional network (GCN) is used to extract the node relational features from both the global and local graphs, forming the topological structure deep features while aggregating local features into global ones. Subsequently, VGG-16 (Visual Geometry Group 16) is used to extract the image convolutional features of the block units, yielding the raster structure deep features. Finally, a transformer is used to fuse the topological and raster structure deep features, and land use classification is completed using the softmax function. Experiments were conducted using high-resolution Google images and OpenStreetMap (OSM) data, with study areas on the third ring road of Shenyang and the fourth ring road of Chengdu. The results demonstrate that the proposed method improves the overall accuracy and Kappa coefficient by 9.32% and 0.17, respectively, compared to single deep learning models. Incorporating subgraph structure features further improves the overall accuracy and Kappa coefficient by 1.13% and 0.1, respectively. The adaptive KNN graph construction method achieves accuracy comparable to that of the empirical threshold method. This study enables accurate large-scale urban land use classification with reduced manual intervention, improving urban planning efficiency. The experimental results verify the effectiveness of the proposed method, particularly in terms of classification accuracy and feature representation completeness.
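
The graph branch described above (KNN graph construction, graph convolution, softmax classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a fixed k instead of the adaptive fusion coefficients, a single standard GCN propagation step, and random weights and features. The function names, the six blocks, the four spectral features, and the three classes are all hypothetical.

```python
import numpy as np

def knn_adjacency(features, k):
    """Symmetric 0/1 adjacency linking each node to its k nearest neighbours.

    Assumption: plain fixed-k KNN on Euclidean distance; the adaptive fusion
    coefficients mentioned in the abstract are not modelled here.
    """
    n = features.shape[0]
    # Pairwise Euclidean distances between node feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a node is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected

def gcn_layer(adj, x, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
block_features = rng.normal(size=(6, 4))  # 6 hypothetical blocks, 4 features each
adj = knn_adjacency(block_features, k=2)
hidden = gcn_layer(adj, block_features, rng.normal(size=(4, 8)))
probs = softmax(hidden @ rng.normal(size=(8, 3)))  # 3 hypothetical land use classes
```

In the full model, the node features produced by the graph branch would be fused with the VGG-16 image features by a transformer before the softmax step. The sketch omits that fusion and classifies directly from the graph features.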

Urban Land Use Classification Model Fusing Multimodal Deep Features
