Data Preprocessing Technique in Web Log Mining

李燕,冯博琴,鲁晓锋
DOI: https://doi.org/10.3969/j.issn.1000-3428.2009.22.015
2009-01-01
Abstract: Data preprocessing is an important step in Web log mining. It consists of four sub-steps: data cleaning, user identification, session identification, and path completion. A referer-based method is adopted for user session identification and path completion in order to avoid the problems introduced by proxy servers, firewalls, local caching, and so on. Experimental results show that the technique obtains user access paths efficiently when accurate referer information is available in the Web access log.
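The abstract does not give the algorithm itself, so the following is only a minimal sketch of how referer-based session identification and path completion are commonly done, not the authors' implementation. The function name `build_sessions`, the `(user, url, referer)` record layout, and the use of "-" for a missing referer are all assumptions made for illustration. A request joins the most recent session of the same user that already contains its referer; if the referer is an earlier page rather than the last one, the backward steps (presumably served from the local cache and so absent from the log) are re-inserted.

```python
from collections import defaultdict


def build_sessions(records):
    """Group cleaned log records into per-user sessions using the referer.

    records: iterable of (user, url, referer) tuples in time order.
             'user' is a hypothetical key, e.g. IP address plus user agent.
             A referer of "-" means the field was empty in the log.
    Returns a dict mapping each user to a list of paths (lists of URLs).
    """
    sessions = defaultdict(list)
    for user, url, referer in records:
        paths = sessions[user]
        target = None
        if referer != "-":
            # Most recent session of this user that already contains the referer.
            target = next((p for p in reversed(paths) if referer in p), None)
        if target is None:
            # No usable referer: treat the request as the start of a new session.
            paths.append([url])
            continue
        if target[-1] != referer:
            # Path completion: the user went back to an earlier page, so
            # re-insert the backward steps down to the referer.
            idx = len(target) - 1 - target[::-1].index(referer)
            target.extend(reversed(target[idx:-1]))
        target.append(url)
    return dict(sessions)


if __name__ == "__main__":
    log = [
        ("u1", "/a", "-"),
        ("u1", "/b", "/a"),
        ("u1", "/c", "/b"),
        ("u1", "/d", "/a"),  # referer is an earlier page: back steps are missing
    ]
    print(build_sessions(log))
```

In this sketch the request for `/d` arrives with referer `/a` while the session ends at `/c`, so the completed path becomes `/a, /b, /c, /b, /a, /d`. As the abstract notes, such a reconstruction is only as good as the referer field in the access log.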