CHINESE ADDRESS STANDARDISATION FOR PLAIN TEXT

Juan Xu, Ye Cao, Qi Zhang
DOI: https://doi.org/10.3969/j.issn.1000-386x.2015.08.005
2015-01-01
Abstract: With the development of Web 2.0, users no longer merely browse website content; they also create it. Information shared and uploaded by users is becoming a vital source of Internet content. Examples include Wikipedia, whose contributors come from all over the world; the modification and merchant-centre functions offered by Google Maps search; and the merchant information recording service of the "public comments" website (www.dianping.com). As users shift from Internet surfers to makers of Internet content, the standardisation and correctness of the information they upload and share must also be considered. In particular, standardising merchant address information is of utmost importance for websites that offer platforms for everyday consumer services. To this end, the paper presents a method for Chinese address standardisation based on cascaded conditional random fields. Experimental results indicate that the proposed method is effective, achieving an F-score of 81% in open testing on a real corpus.