Abstract: Advances in tracking technologies have spurred the rapid growth of large-scale trajectory data. Building a compact collection of pathlets, referred to as a trajectory pathlet dictionary, is essential for supporting mobility-related applications. Existing methods typically adopt a top-down approach, generating numerous candidate pathlets and selecting a subset, leading to high memory usage and redundant storage from overlapping pathlets. To overcome these limitations, we propose a bottom-up strategy that incrementally merges basic pathlets to build the dictionary, reducing memory requirements by up to 24,000 times compared to baseline methods. The approach begins with unit-length pathlets and iteratively merges them while optimizing utility, which is defined using newly introduced metrics of trajectory loss and representability. We develop a deep reinforcement learning framework, PathletRL, which utilizes Deep Q-Networks (DQN) to approximate the utility function, resulting in a compact and efficient pathlet dictionary. Experiments on both synthetic and real-world datasets demonstrate that our method outperforms state-of-the-art techniques, reducing the size of the constructed dictionary by up to 65.8%. Additionally, our results show that only half of the dictionary pathlets are needed to reconstruct 85% of the original trajectory data. Building on PathletRL, we introduce PathletRL++, which extends the original model by incorporating a richer state representation and an improved reward function to optimize decision-making during pathlet merging. These enhancements enable the agent to gain a more nuanced understanding of the environment, leading to higher-quality pathlet dictionaries. PathletRL++ achieves even greater dictionary size reduction, surpassing the performance of PathletRL, while maintaining high trajectory representability.

What problem does this paper attempt to address?

The paper addresses how to efficiently construct a compact, high-quality trajectory pathlet dictionary that supports mobility-related applications while significantly reducing memory requirements and redundant storage. Existing methods consume large amounts of memory on large-scale trajectory data and store overlapping pathlets redundantly. In response, the paper proposes a bottom-up strategy based on reinforcement learning to optimize trajectory pathlet extraction and dictionary formation.

### Problem Background

As tracking technology advances, large-scale trajectory data is growing rapidly. A compact trajectory pathlet dictionary is crucial for supporting mobility-related applications such as route planning, travel time prediction, and personalized destination prediction. However, existing methods usually take a top-down approach: they generate a large number of candidate pathlets and then select a subset, which leads to high memory usage and to redundant storage caused by pathlet overlap.

### Limitations of Existing Methods

1. **High Memory Consumption**: Existing methods need a large amount of memory to store the initial set of candidate pathlets, many of which are redundant.
2. **Redundant Storage**: Because existing methods allow pathlets to overlap, the same segments are stored more than once.
3. **High Computational Complexity**: Some methods require users to provide specific parameters (such as budget constraints) and run slowly in practical applications.

### Solutions Proposed in the Paper

To solve the above problems, the paper proposes the following:

1. **Bottom-up Strategy**: Start from unit-length pathlets and merge them step by step to construct the dictionary, reducing memory requirements by up to 24,000 times compared to baseline methods.
2. **Deep Reinforcement Learning Framework PathletRL**: Use a Deep Q-Network (DQN) to approximate the utility function and guide the pathlet merging process toward maximum utility, producing a compact and efficient pathlet dictionary.
3. **Introduction of New Metrics**: Define two new metrics, trajectory loss and trajectory representability, to evaluate the quality of pathlets more comprehensively.
4. **Optimization and Extension**: Building on PathletRL, introduce a richer state representation and an improved reward function to obtain the PathletRL++ model, which further improves the quality and stability of the dictionary.

### Experimental Results

Experiments show that the method outperforms existing techniques on both synthetic and real-world datasets. It reduces the dictionary size by up to 65.8%, and only half of the pathlets in the dictionary are needed to reconstruct 85% of the original trajectory data.

### Summary

The main contribution of the paper is a bottom-up strategy combined with deep reinforcement learning. Together they address the high memory consumption and redundant storage of existing methods on large-scale trajectory data, and they offer a new way to construct a compact, high-quality trajectory pathlet dictionary.
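### Illustrative Sketch

The bottom-up idea can be illustrated with a small Python sketch. It is not the paper's algorithm: PathletRL chooses merges with a learned DQN policy and a utility built on trajectory loss and representability, whereas this sketch assumes a much simpler greedy rule that merges two pathlets only when every trajectory traverses them back to back, so no trajectory is lost. The function names (`unit_pathlets`, `find_lossless_merge`, `merge`, `build_dictionary`) and the edge ids are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Pathlet = Tuple[str, ...]      # ordered tuple of road-edge ids
Trajectory = List[Pathlet]     # a trajectory as a sequence of pathlets


def unit_pathlets(trajectories: List[List[str]]) -> List[Trajectory]:
    """Start bottom-up: every edge is its own unit-length pathlet."""
    return [[(edge,) for edge in traj] for traj in trajectories]


def find_lossless_merge(trajs: List[Trajectory]) -> Optional[Tuple[Pathlet, Pathlet]]:
    """Find (a, b) where a is always followed by b and b always preceded by a.

    Assumed stand-in for the learned merge policy described in the paper.
    """
    followers: dict = {}      # pathlet -> set of successors (None = end)
    predecessors: dict = {}   # pathlet -> set of predecessors (None = start)
    for traj in trajs:
        for i, p in enumerate(traj):
            nxt = traj[i + 1] if i + 1 < len(traj) else None
            prv = traj[i - 1] if i > 0 else None
            followers.setdefault(p, set()).add(nxt)
            predecessors.setdefault(p, set()).add(prv)
    for a in sorted(followers):
        succ = followers[a]
        if len(succ) == 1:
            (b,) = succ
            if b is not None and b != a and predecessors[b] == {a}:
                return a, b
    return None


def merge(trajs: List[Trajectory], a: Pathlet, b: Pathlet) -> List[Trajectory]:
    """Replace every consecutive occurrence of a, b with the merged pathlet."""
    merged = a + b
    result = []
    for traj in trajs:
        new_traj, i = [], 0
        while i < len(traj):
            if i + 1 < len(traj) and traj[i] == a and traj[i + 1] == b:
                new_traj.append(merged)
                i += 2
            else:
                new_traj.append(traj[i])
                i += 1
        result.append(new_traj)
    return result


def build_dictionary(trajectories: List[List[str]]) -> Tuple[Set[Pathlet], List[Trajectory]]:
    """Merge greedily until no lossless merge remains."""
    trajs = unit_pathlets(trajectories)
    while (pair := find_lossless_merge(trajs)) is not None:
        trajs = merge(trajs, *pair)
    dictionary = {p for traj in trajs for p in traj}
    return dictionary, trajs


# Toy input: three trajectories over six road edges (made-up ids).
raw = [
    ["e1", "e2", "e3", "e4"],
    ["e1", "e2", "e3", "e5"],
    ["e6", "e3", "e4"],
]
dictionary, encoded = build_dictionary(raw)
print(len(dictionary), sorted(dictionary))
```

On this toy input the six unit pathlets shrink to five: `e1` and `e2` are always traversed together, so they become one pathlet, while `e3` stays separate because trajectories enter and leave it in different ways. A learned policy could additionally accept merges that drop a few trajectories in exchange for a smaller dictionary, which this lossless rule never does.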

PathletRL++: Optimizing Trajectory Pathlet Extraction and Dictionary Formation via Reinforcement Learning