M3LUC: Multi-modal Model for Urban Land-Use Classification

Sibo Li, Xin Zhang, Yuming Lin, Yong Li
DOI: https://doi.org/10.1145/3678717.3691278
2024-01-01
Abstract: Identifying urban land-use types is crucial for effective resource management, urban planning, and sustainable development. However, classifying land use is difficult because of the complexity of cities and the poor quality of data available in undeveloped areas. In this work, we present the Multi-modal Model for Land-use Classification (M3LUC). Our model is the first to leverage an advanced Vision-Language Model (VLM) to better capture urban functionality from remote sensing data and Points of Interest (POI). We also design specific mechanisms to robustly handle missing and conflicting modalities, which enhances transferability. Experiments conducted in four major cities in China demonstrate our model's superior performance in both transfer and non-transfer tasks, revealing its potential for broader applications.