Swarm Intelligence Based Topic Identification for Sessions in Web Access Log

FANG Qi, LIU Yiqun, ZHANG Min, RUN Liyun, MA Shaoping
DOI: https://doi.org/10.3969/j.issn.1003-0077.2011.01.006
2011-01-01
Abstract: A session in a Web access log denotes a time-continuous sequence of a user's Web browsing behavior. A topic of a session represents a hidden browsing intent of the Web user. Identifying the topic-based log units within a session is a fundamental task. Existing work mainly focuses on detecting topic boundaries and does not consider the common situation in which different topics overlap within one session. In this paper, we first redefine the concepts of session and topic, and then propose the task of largest segmentation. We further design a session topic identification algorithm based on the crowd wisdom of Web users. The effectiveness of the algorithm is validated by experiments on large-scale real-world Web access logs.