Mining Access Patterns of Web Active User Based on Tree Structure

BEI Yi-jun, CHEN Gang, DONG Jin-xiang
DOI: https://doi.org/10.3785/j.issn.1008-973x.2009.06.007
2009-01-01
Abstract: Conventional Web mining approaches generally use the Web logs of all users when mining patterns. However, active and inactive users usually behave differently when visiting a Web site. Therefore, an approach to access pattern mining oriented to active users was introduced. A session-retrieval algorithm, named active user session miner (AUSM), was proposed to retrieve the sessions of active users in a single pass over the Web logs. Moreover, a tree-mining algorithm, named Web access pattern bottom up miner (WAPBUM), was presented to discover frequent access patterns from the retrieved sessions based on the topology of the Web site. Based on the characteristics of the Web logs, WAPBUM builds subtree equivalence classes and generates frequent subtrees from bottom to top. The performance of the two algorithms was evaluated on both synthetic and real datasets. Experimental results show that the proposed algorithms are efficient and effective. AUSM keeps memory usage stable, and its running time is linear in the log size. WAPBUM is not only more efficient than the previous algorithm FREQT, but also provides useful mining results for analyzing the Web structure.
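The abstract does not give the internals of AUSM, so the following is only a generic sketch of one-pass session retrieval restricted to active users, not the authors' algorithm. It assumes log records arrive sorted by timestamp, that a session ends after a fixed inactivity gap, and that a user counts as "active" once they have a minimum number of sessions. The function name, the 30-minute timeout, and the session threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical parameters: neither value comes from the paper.
SESSION_TIMEOUT = 30 * 60   # inactivity gap (seconds) that closes a session
MIN_SESSIONS = 2            # sessions needed for a user to count as "active"


def retrieve_active_sessions(log_records,
                             timeout=SESSION_TIMEOUT,
                             min_sessions=MIN_SESSIONS):
    """Group page requests into per-user sessions in a single pass.

    log_records: iterable of (user_id, timestamp_seconds, url) tuples,
    assumed to be sorted by timestamp.
    Returns {user_id: [session, ...]} for active users only, where each
    session is the list of URLs requested in order.
    """
    sessions = defaultdict(list)  # user_id -> list of sessions
    last_seen = {}                # user_id -> timestamp of last request

    for user, ts, url in log_records:
        # Open a new session on a user's first request or after a long gap.
        if user not in last_seen or ts - last_seen[user] > timeout:
            sessions[user].append([])
        sessions[user][-1].append(url)
        last_seen[user] = ts

    # Keep only users who meet the (assumed) activity threshold.
    return {u: s for u, s in sessions.items() if len(s) >= min_sessions}


log = [
    ("a", 0, "/"),
    ("b", 10, "/"),
    ("a", 100, "/x"),
    ("a", 5000, "/"),
    ("a", 5100, "/y"),
]
active = retrieve_active_sessions(log)
```

Unlike AUSM as described in the abstract, this sketch holds every user's sessions in memory until the end of the scan, so it does not reproduce the stable-memory property; it only illustrates the single-pass grouping idea.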