Abstract: Retrieving temporal event sequences from textual descriptions is essential for applications such as analyzing e-commerce behavior, monitoring social media activities, and tracking criminal incidents. In this paper, we introduce TPP-LLM-Embedding, a unified model for efficiently embedding and retrieving event sequences based on natural language descriptions. Built on the TPP-LLM framework, which integrates large language models with temporal point processes, our model encodes both event types and times, generating a sequence-level representation through pooling. Textual descriptions are embedded using the same architecture, ensuring a shared embedding space for both sequences and descriptions. We optimize a contrastive loss based on similarity between these embeddings, bringing matching pairs closer and separating non-matching ones. TPP-LLM-Embedding enables efficient retrieval and demonstrates superior performance compared to baseline models across diverse datasets.
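
The pooling and contrastive objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: mean pooling, cosine similarity, in-batch negatives, the temperature value of 0.1, and the names `mean_pool`, `cosine`, and `contrastive_loss` are all assumptions made for illustration, and the toy vectors stand in for LLM hidden states.

```python
import math


def mean_pool(hidden_states):
    """Average per-event vectors into one fixed-length vector (assumed pooling)."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(vec[d] for vec in hidden_states) / n for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(desc_embs, seq_embs, temperature=0.1):
    """In-batch contrastive loss: desc_embs[i] matches seq_embs[i];
    every other sequence in the batch serves as a negative."""
    total = 0.0
    for i, desc in enumerate(desc_embs):
        logits = [cosine(desc, seq) / temperature for seq in seq_embs]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(desc_embs)


# Toy stand-ins for per-event hidden states of two sequences.
seq_hidden = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[0.0, 1.0], [0.2, 0.8]],
]
seq_embs = [mean_pool(h) for h in seq_hidden]
desc_embs = [[0.9, 0.1], [0.1, 0.9]]

aligned = contrastive_loss(desc_embs, seq_embs)
swapped = contrastive_loss(desc_embs, seq_embs[::-1])
print(f"aligned pairs: {aligned:.4f}, swapped pairs: {swapped:.4f}")
```

The loss is small when each description is closest to its own sequence and large when the pairing is swapped, which is the behavior the abstract describes as bringing matching pairs closer and separating non-matching ones.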

What problem does this paper attempt to address?

The paper addresses how to efficiently retrieve temporal event sequences from textual descriptions. It proposes a unified model, TPP-LLM-Embedding, that embeds and retrieves event sequences based on natural language descriptions. By integrating large language models with temporal point processes, the model encodes both event types and event times and pools the result into a fixed-length, sequence-level representation. It is trained with a contrastive loss that pulls matching sequence-description pairs together and pushes non-matching pairs apart, which enables efficient event-sequence retrieval.

### Main Problems

1. **Limitations of Traditional Models**:
   - Traditional language models perform well on general text retrieval but often perform poorly on event sequences, which carry temporal and structural complexity.
   - Existing models either treat event types as categorical inputs, which limits their ability to capture rich event semantics, or treat the entire sequence as text, which ignores its temporal dependence.
2. **The Need for Efficient Retrieval**:
   - In applications such as e-commerce user behavior analysis, social media monitoring, and crime tracking, efficient retrieval of temporal event sequences is crucial.
   - These applications require models that capture both time-sensitive dynamics and the structural relationships within a sequence.

### Solutions

- **TPP-LLM-Embedding Model**:
  - The model is built on the TPP-LLM framework, which combines temporal point processes with large language models.
  - Through temporal encoding and text embedding, the model captures the underlying patterns and dependencies of event sequences.
  - The model uses pooling to generate fixed-length representations and is optimized by contrastive learning, which brings matching pairs closer and separates non-matching pairs.

### Experimental Verification

- **Datasets**:
  - The paper uses five real-world datasets from different domains: Stack Overflow, Chicago Crime, NYC Taxi Trips, U.S. Earthquakes, and Amazon Reviews.
  - To produce the accompanying text descriptions, the authors use GPT-4 to generate objective summaries that focus on the order and timing of events, so that each description reflects the basic structure of its sequence.
- **Baseline Models and Evaluation Metrics**:
  - Baseline models include All-MiniLM-L12-v2, All-MPNet-Base-v2, BGE-Large-En-v1.5, and MxbAI-Embed-Large-v1.
  - Evaluation metrics are Mean Reciprocal Rank (MRR) and Recall@K, which measure retrieval quality.

### Experimental Results

- **Performance Comparison**:
  - TPP-Llama and TPP-Llama-Chat consistently outperform the baseline models on multiple datasets, especially in terms of MRR and Recall@5.
  - Multi-task experiments further support the effectiveness and flexibility of TPP-LLM-Embedding in handling multi-source event sequences.

### Conclusions

- **Main Contributions**:
  1. Proposed the TPP-LLM-Embedding model, which integrates temporal and event-type information to achieve accurate event-sequence retrieval.
  2. Verified the superior performance of the model through experiments on multiple datasets.
  3. Demonstrated the scalability of the method in multi-task experiments, indicating that it generalizes across event domains.

### Limitations and Ethical Considerations

- **Data Quality and Noise**:
  - Performance depends on high-quality temporal and event-type data; noisy or incomplete information in practical applications will degrade it.
- **Computing Resources**:
  - Using large language models may introduce computational latency, especially on extremely large datasets.
- **Privacy and Bias**:
  - Training and retrieval data must be anonymized and compliant to avoid potential privacy leaks.
  - Biases in the training data, such as unbalanced representation of event types, may lead to biased retrieval results.
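
The two evaluation metrics named in the experimental section, MRR and Recall@K, can be sketched as follows. This is a minimal illustration that assumes each query description has exactly one relevant sequence; the function names and the toy rankings are hypothetical and are not taken from the paper.

```python
def first_hit_rank(ranked_ids, relevant_id):
    """Return the 1-based rank of the relevant item, or None if it is absent."""
    for rank, item in enumerate(ranked_ids, start=1):
        if item == relevant_id:
            return rank
    return None


def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the relevant item over all queries (0 if missing)."""
    total = 0.0
    for ranked_ids, relevant_id in zip(rankings, relevant):
        rank = first_hit_rank(ranked_ids, relevant_id)
        if rank is not None:
            total += 1.0 / rank
    return total / len(rankings)


def recall_at_k(rankings, relevant, k):
    """Fraction of queries whose relevant item appears in the top k results."""
    hits = sum(
        1
        for ranked_ids, relevant_id in zip(rankings, relevant)
        if relevant_id in ranked_ids[:k]
    )
    return hits / len(rankings)


# Hypothetical retrieval output: one ranked list of sequence ids per query.
rankings = [
    ["s1", "s2", "s3"],  # relevant item s1 is at rank 1
    ["s3", "s2", "s1"],  # relevant item s2 is at rank 2
    ["s1", "s3", "s2"],  # relevant item s3 is at rank 2
]
relevant = ["s1", "s2", "s3"]

mrr = mean_reciprocal_rank(rankings, relevant)
print("MRR:", mrr)
print("Recall@1:", recall_at_k(rankings, relevant, 1))
print("Recall@2:", recall_at_k(rankings, relevant, 2))
```

MRR rewards placing the correct sequence near the top of the ranking, while Recall@K only checks whether it appears within the first K results, so the two are usually reported together.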

Efficient Retrieval of Temporal Event Sequences from Textual Descriptions
