Rule-based Approach to Semantic Resolution of Chinese Addresses

张雪英,闾国年,李伯秋,陈文君
DOI: https://doi.org/10.3724/sp.j.1047.2010.00009
2010-01-01
Geo-information Science
Abstract: A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. Addresses are among the most widely used geographic reference systems in natural language. Address geocoding is considered the most effective approach to bridging the gap between business data in management information systems (MIS) and GIS, and it supports geospatial information visualization and spatial analysis. Chinese address geocoding faces three significant problems, namely address models, address resolution, and address matching, because Chinese place names are not standardized and national address databases are lacking. Address resolution aims to automatically split natural-language address strings into semantically complete address units. It plays a fundamental role in address models and address matching. Previous research has focused on rule-based or gazetteer-based approaches, which are easy to implement but suffer from poor coverage and performance. In theory, Chinese address resolution is similar to word segmentation in Chinese natural language processing. Based on an investigation of large-scale Chinese place names and address syntactic patterns, this paper identifies primary and secondary general characters that represent a variety of address units. An address numerical representation method is then presented to induce syntactic rules of Chinese addresses. Finally, we develop an RBAI algorithm to implement Chinese address resolution and illustrate it with an example. The experimental results indicate that the proposed approach achieves satisfactory efficiency and effectiveness for large-scale data processing, with an accuracy over 92% and a processing rate over 2,800 items per second.
The proposed approach and system can be extended to fields such as land management, asset management, urban planning, public security, postal systems, taxation, public health management, and other location-based services.
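To make the idea of splitting an address string at general characters concrete, the following is a minimal Python sketch. It is not the RBAI algorithm described in the paper: the marker set `UNIT_MARKERS`, the function name `split_address`, and the single splitting rule are all assumptions chosen for illustration, since the abstract does not list the actual primary and secondary general characters or the induced syntactic rules.

```python
# Hypothetical set of "general characters" that close an address unit
# (province, city, district, county, town, township, village, road,
# street, lane, house number). The paper's real primary and secondary
# character lists are not given in the abstract.
UNIT_MARKERS = frozenset("省市区县镇乡村路街巷号")


def split_address(address: str) -> list:
    """Split an address into units, closing a unit at each marker character.

    Assumed rule: a marker only closes a unit when at least one character
    precedes it in the current unit, so a name that merely starts with a
    marker character (e.g. 市府路) is not cut after its first character.
    """
    units = []
    current = ""
    for ch in address:
        if ch in UNIT_MARKERS and current:
            units.append(current + ch)
            current = ""
        else:
            current += ch
    if current:
        # Keep any trailing text that did not end with a marker.
        units.append(current)
    return units


print(split_address("江苏省南京市栖霞区文苑路1号"))
# ['江苏省', '南京市', '栖霞区', '文苑路', '1号']
```

A single-pass rule like this mis-splits names that contain a marker character in the middle, which is presumably one reason the paper distinguishes primary from secondary general characters and induces syntactic rules rather than relying on one character set.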