From Historical OpenStreetMap data to customized training samples for geospatial machine learning

Zhaoyan Wu, Hao Li, A. Zipf
DOI: https://doi.org/10.5281/ZENODO.3923040
2020-07-04
Abstract: This abstract was accepted to the Academic Track of the State of the Map 2020 Online Conference after peer review. After more than a decade of rapid development, volunteered geographic information (VGI) has become one of the most important research topics in the GIScience community [1]. Over roughly the same period, geospatial machine learning technologies have grown rapidly, both to develop intelligent GIServices [2] and to address remote sensing tasks [3] such as land use/land cover classification, object detection, and change detection. Nevertheless, the lack of abundant training samples and of accurate semantic information has long been identified as a modelling bottleneck for such data-hungry machine learning applications. OpenStreetMap (OSM) shows great potential for tackling this bottleneck by providing massive and freely accessible geospatial training samples [4, 5]. More importantly, OSM makes its full historical data accessible [6], which can be further analyzed and employed to provide intrinsic data quality measurements of the training samples. A flexible framework for labeling customized geospatial objects using historical OSM data therefore allows more effective and efficient machine learning. This work approaches the labeling of geospatial machine learning samples by providing such a framework, which automatically generates customized training samples and provides intrinsic data quality measurements. In more detail, we explored the historical OSM data for two purposes: feature extraction and intrinsic assessment. For example, when training convolutional neural networks (CNNs) for building detection, the OSM features tagged building=residential or building=house are certainly of interest while the data quality
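The tag-based feature extraction mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' framework: the feature layout (dictionaries with an "id" and a "tags" mapping), the names BUILDING_FILTER, matches, and select_features, and the sample data are all hypothetical, chosen only to show how features tagged building=residential or building=house might be picked out as candidate training labels.

```python
# Hypothetical tag filter: OSM key -> set of accepted values.
# Only the two tags named in the abstract are included.
BUILDING_FILTER = {"building": {"residential", "house"}}


def matches(tags, tag_filter):
    """Return True if any filter key appears in `tags` with an accepted value."""
    return any(tags.get(key) in values for key, values in tag_filter.items())


def select_features(features, tag_filter):
    """Keep only the features whose tags satisfy the filter."""
    return [f for f in features if matches(f.get("tags", {}), tag_filter)]


# Made-up sample features, for illustration only (not real OSM data).
features = [
    {"id": 1, "tags": {"building": "residential"}},
    {"id": 2, "tags": {"building": "house", "name": "Example"}},
    {"id": 3, "tags": {"building": "garage"}},
    {"id": 4, "tags": {"highway": "residential"}},
]

selected = select_features(features, BUILDING_FILTER)
print([f["id"] for f in selected])  # → [1, 2]
```

Keeping the filter as plain data rather than hard-coded conditions is one way to make the selection customizable: a different object class would need only a different key/value mapping.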
Computer Science