Matrix Dimensionality Reduction for Mining Web Logs

JJ Lu, BW Xu, HJ Yang
DOI: https://doi.org/10.1109/wi.2003.1241222
2003-01-01
Abstract: Web-based logs contain potentially useful data with which designers can assess the usability and effectiveness of their design choices. Clustering techniques have recently been used to automatically discover typical user profiles from Web access logs. However, designing an effective similarity measure between session vectors, which are usually high-dimensional and sparse, is a challenging problem. Non-negative matrix factorisation is applied to reduce the dimensionality of the session-URL matrix, and the spherical k-means algorithm is used to partition the projected user session vectors into several clusters. Finally, two methods for discovering typical user session profiles from the clusters are presented. Experimental results show that our algorithms can mine interesting user profiles effectively.
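The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say which factorisation variant or initialisation the authors use, so everything below is an assumption: the Lee-Seung multiplicative updates for the Frobenius objective, a farthest-first centroid initialisation, the function names, and the toy `sessions` matrix (rows are sessions, columns are URLs) are all hypothetical and chosen only for illustration.

```python
import math
import random

def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Approximate V (sessions x URLs) by W * H with non-negative factors.

    Assumption: Lee-Seung multiplicative updates, not necessarily the
    variant used in the paper.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H

def spherical_kmeans(X, k, iters=50):
    """Cluster the rows of X by cosine similarity with unit-length centroids.

    Assumption: farthest-first initialisation, used here for determinism.
    """
    X = [normalize(x) for x in X]
    centroids = [list(X[0])]
    while len(centroids) < k:
        far = min(X, key=lambda x: max(dot(x, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(X)
    for _ in range(iters):
        # Assign each session to the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda c: dot(x, centroids[c])) for x in X]
        # Recompute each centroid as the normalized sum of its members.
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:
                centroids[c] = normalize([sum(col) for col in zip(*members)])
    return labels

# Hypothetical toy data: two groups of sessions visiting disjoint URL sets.
sessions = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
W, H = nmf(sessions, rank=2)      # each row of W is a reduced session vector
labels = spherical_kmeans(W, k=2)
print(labels)
```

Clustering the rows of `W` rather than the raw session vectors is the point of the reduction step: cosine similarity is computed in a low-dimensional, dense space instead of the sparse URL space. How the paper derives session profiles from the resulting clusters is not described in the abstract, so that step is left out here.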