TRACE: Transformer-based user Representations from Attributed Clickstream Event sequences

William Black, Alexander Manlove, Jack Pennington, Andrea Marchini, Ercument Ilhan, Vilda Markeviciute
2024-09-03
Abstract: For users navigating travel e-commerce websites, the process of researching products and making a purchase often results in intricate browsing patterns that span numerous sessions over an extended period of time. The resulting clickstream data chronicle these user journeys and present valuable opportunities to derive insights that can significantly enhance personalized recommendations. We introduce TRACE, a novel transformer-based approach tailored to generate rich user embeddings from live multi-session clickstreams for real-time recommendation applications. Prior works largely focus on single-session product sequences, whereas TRACE leverages site-wide page view sequences spanning multiple user sessions to model long-term engagement. Employing a multi-task learning framework, TRACE captures comprehensive user preferences and intents distilled into low-dimensional representations. We demonstrate TRACE's superior performance over vanilla transformer and LLM-style architectures through extensive experiments on a large-scale travel e-commerce dataset of real user journeys, where the challenges of long page-histories and sparse targets are particularly prevalent. Visualizations of the learned embeddings reveal meaningful clusters corresponding to latent user states and behaviors, highlighting TRACE's potential to enhance recommendation systems by capturing nuanced user interactions and preferences.
Information Retrieval, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses the problem of representing users of travel e-commerce websites, whose research and purchase journeys produce intricate browsing patterns spread across many sessions over an extended period. It proposes TRACE (Transformer-based user Representations from Attributed Clickstream Event sequences), a method that generates rich user embeddings from live multi-session clickstream data for real-time recommendation. Whereas prior work largely focuses on product sequences within a single session, TRACE uses site-wide page view sequences spanning multiple user sessions to model long-term engagement. A multi-task learning framework lets TRACE capture comprehensive user preferences and intents and distill them into low-dimensional representations. In experiments on a large-scale travel e-commerce dataset of real user journeys, where long page histories and sparse targets are particularly prevalent, TRACE outperforms vanilla transformer and LLM-style architectures. Visualizations of the learned embeddings also reveal meaningful clusters corresponding to latent user states and behaviors, highlighting TRACE's potential to enhance recommendation systems.
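To make the contrast between single-session and multi-session inputs concrete, the following is a minimal sketch of how page views from several sessions might be merged into one time-ordered sequence per user before being fed to a sequence model. The summary above does not describe TRACE's input format, so the `PageView` fields, the `build_user_sequences` helper, and the choice to keep only the most recent `max_len` page views are all illustrative assumptions rather than the authors' pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PageView:
    # Hypothetical event schema; the real attributes are not given in the source.
    user_id: str
    session_id: str
    timestamp: int   # seconds since some epoch
    page_type: str   # e.g. "search", "hotel_details"


def build_user_sequences(events, max_len=4):
    """Merge every session of a user into one chronological page-view sequence.

    Only the most recent `max_len` page views are kept (an assumed way of
    bounding long page histories).
    """
    by_user = defaultdict(list)
    for event in events:
        by_user[event.user_id].append(event)

    sequences = {}
    for user_id, views in by_user.items():
        # Order across sessions by time, not by arrival order.
        views.sort(key=lambda v: v.timestamp)
        recent = views[-max_len:]
        sequences[user_id] = [(v.session_id, v.page_type) for v in recent]
    return sequences


# Toy clickstream: user u1 browses in two sessions a day apart.
events = [
    PageView("u1", "s1", 100, "home"),
    PageView("u1", "s1", 160, "search"),
    PageView("u2", "s9", 120, "home"),
    PageView("u1", "s2", 90000, "hotel_details"),
    PageView("u1", "s2", 90100, "checkout"),
    PageView("u1", "s2", 90050, "hotel_details"),
]

sequences = build_user_sequences(events, max_len=4)
for user_id, seq in sequences.items():
    print(user_id, seq)
```

In this toy data the sequence for `u1` crosses the boundary between sessions `s1` and `s2`, which is the kind of long-term signal a single-session model would not see.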