Multi-level Urban Street Representation with Street-View Imagery and Hybrid Semantic

Yan Zhang, Yong Li, Fan Zhang
DOI: https://doi.org/10.1016/j.isprsjprs.2024.09.032
IF: 12.7
2024-01-01
ISPRS Journal of Photogrammetry and Remote Sensing
Abstract: Street-view imagery now densely covers cities and provides a close-up perspective of the urban physical environment, allowing a comprehensive perception and understanding of cities. Considerable effort has gone into representing the urban physical environment from street-view imagery, and these representations have been used to study the relationships between the physical environment, human dynamics, and socioeconomic environments. However, two key challenges remain in representing the physical environment of streets from street-view images for downstream tasks. First, current research mainly focuses on the proportions of visual elements within a scene and neglects the spatial adjacency between them. Second, the spatial dependency and spatial interaction between streets have not been adequately accounted for. These limitations hinder the effective representation and understanding of urban streets. To address these challenges, we propose a dynamic graph representation framework based on dual spatial semantics. At the intra-street level, we consider the spatial adjacency relationships of visual elements: our method dynamically parses the visual elements within a scene, yielding context-specific representations. At the inter-street level, we construct two spatial weight matrices by integrating the spatial dependency and spatial interaction relationships. This design accounts comprehensively for the hybrid spatial relationships between streets, enhancing the model's ability to represent human dynamics and socioeconomic status. In addition to these two modules, we provide a spatial interpretability analysis tool for downstream tasks. A case study shows that our method improves vehicle speed and flow estimation by 2.4% and 6.4%, respectively. This not only indicates that street-view imagery provides rich information about urban transportation but also offers a more accurate and reliable data-driven framework for urban studies. The code is available at https://github.com/yemanzhongting/HybridGraph.
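To make the inter-street idea concrete, the following is a minimal sketch of how two spatial weight matrices over a set of streets might be built, one for spatial dependency and one for spatial interaction. It is not taken from the paper or its repository. The Gaussian distance kernel, the use of pairwise flow counts as the interaction signal, the row normalization, and all function names and toy values are assumptions chosen for illustration.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1; rows that sum to 0 stay all zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def dependency_weights(coords, bandwidth):
    """Spatial dependency: Gaussian distance decay between street locations.

    The kernel and bandwidth are illustrative assumptions.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                weights[i][j] = math.exp(-((d / bandwidth) ** 2))
    return row_normalize(weights)


def interaction_weights(flows):
    """Spatial interaction: weights from pairwise flow counts between streets.

    Flows in both directions are summed so the raw matrix is symmetric
    (an assumption; directed weights would also be reasonable).
    """
    n = len(flows)
    sym = [
        [0.0 if i == j else float(flows[i][j] + flows[j][i]) for j in range(n)]
        for i in range(n)
    ]
    return row_normalize(sym)


def spatial_lag(weights, values):
    """Weighted average of neighbouring streets' values under one matrix."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Toy data (hypothetical): three streets on a line, with heavy flow
# between the two streets that are far apart.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
flows = [[0, 1, 9], [1, 0, 0], [9, 0, 0]]
greenery = [0.2, 0.4, 0.9]  # e.g. share of vegetation pixels per street

w_dep = dependency_weights(coords, bandwidth=2.0)
w_int = interaction_weights(flows)
lag_dep = spatial_lag(w_dep, greenery)
lag_int = spatial_lag(w_int, greenery)
```

In this toy setup the first street is tied mostly to its near neighbour under the dependency matrix and mostly to the distant third street under the interaction matrix, which is the kind of difference a hybrid representation would be able to capture. How the paper actually combines the two matrices inside its graph model is not specified in the abstract.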