An IR-based approach towards automated integration of geo-spatial datasets in map-based software systems

Nima Miryeganeh,Mehdi Amoui,Hadi Hemmati,Nima Miryeganeh,Mehdi Amoui,Hadi Hemmati
DOI: https://doi.org/10.1145/3338906.3340454
2019-08-12
Abstract:Data is arguably the most valuable asset of the modern world. In this era, the success of any data-intensive solution relies on the quality of data that drives it. Among vast amount of data that are captured, managed, and analyzed everyday, geospatial data are one of the most interesting class of data that hold geographical information of real-world phenomena and can be visualized as digital maps. Geo-spatial data is the source of many enterprise solutions that provide local information and insights. Companies often aggregate geospacial datasets from various sources in order to increase the quality of such solutions. However, a lack of a global standard model for geospatial datasets makes the task of merging and integrating datasets difficult and error prone. Traditionally, this aggregation was accomplished by domain experts manually validating the data integration process by merging new data sources and/or new versions of previous data against conflicts and other requirement violations. However, this manual approach is not scalable is a hinder toward rapid release when dealing with big datasets which change frequently. Thus more automated approaches with limited interaction with domain experts is required. As a first step to tackle this problem, we have leveraged Information Retrieval (IR) and geospatial search techniques to propose a systematic and automated conflict identification approach. To evaluate our approach, we conduct a case study in which we measure the accuracy of our approach in several real-world scenarios and followed by interviews with Localintel Inc. software developers to get their feedbacks.
What problem does this paper attempt to address?