Abstract: Driven by the advance of positioning technology and the popularity of location-sharing services, semantic-enriched trajectory data have become unprecedentedly available. The sequential patterns hidden in such data, when properly defined and extracted, can greatly benefit tasks like targeted advertising and urban planning. Unfortunately, classic sequential pattern mining algorithms developed for transactional data cannot effectively mine patterns in semantic trajectories, mainly because the places in the continuous space cannot be regarded as independent "items". Instead, similar places need to be grouped to collaboratively form frequent sequential patterns. That said, it remains a challenging task to mine what we call fine-grained sequential patterns, which must satisfy spatial compactness, semantic consistency and temporal continuity simultaneously. We propose Splitter to effectively mine such fine-grained sequential patterns in two steps. In the first step, it retrieves a set of spatially coarse patterns, each attached with a set of trajectory snippets that precisely record the pattern's occurrences in the database. In the second step, Splitter breaks each coarse pattern into fine-grained ones in a top-down manner, by progressively detecting dense and compact clusters in a higher-dimensional space spanned by the snippets. Splitter uses an effective algorithm called weighted snippet shift to detect such clusters, and leverages a divide-and-conquer strategy to speed up the top-down pattern splitting process. Our experiments on both real and synthetic data sets demonstrate the effectiveness and efficiency of Splitter.
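The abstract does not spell out how weighted snippet shift works, so the following is only a rough sketch of the general idea of finding dense clusters among snippets, using a generic weighted mean shift with a flat kernel as a stand-in. Everything here is an assumption made for illustration: the function names `weighted_shift` and `cluster_snippets`, the `bandwidth` parameter, the per-snippet weights, and the choice to represent a length-2 snippet as a flattened 4-D point (x1, y1, x2, y2). It should not be read as the paper's actual algorithm.

```python
import math

def weighted_shift(points, weights, start, bandwidth, tol=1e-6, max_iter=100):
    """Move `start` toward a local weighted density mode (flat kernel).

    Hypothetical helper: a generic weighted mean shift, not the paper's method.
    """
    y = list(start)
    for _ in range(max_iter):
        total = 0.0
        acc = [0.0] * len(y)
        # Accumulate the weighted sum of all points inside the window around y.
        for p, w in zip(points, weights):
            if math.dist(p, y) <= bandwidth:
                total += w
                for i, v in enumerate(p):
                    acc[i] += w * v
        if total == 0.0:
            break  # empty window: nothing to shift toward
        new_y = [a / total for a in acc]
        moved = math.dist(new_y, y)
        y = new_y
        if moved < tol:
            break  # converged to a mode
    return y

def cluster_snippets(points, weights, bandwidth):
    """Group points whose shifted positions converge to the same mode."""
    modes, labels = [], []
    for p in points:
        m = weighted_shift(points, weights, p, bandwidth)
        for idx, existing in enumerate(modes):
            # Modes closer than half a bandwidth are treated as identical.
            if math.dist(m, existing) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels

# Toy data (made up): five snippets of one coarse length-2 pattern,
# each flattened to (x1, y1, x2, y2).
snippets = [
    (0.0, 0.0, 10.0, 10.0),
    (0.1, 0.0, 10.0, 10.1),
    (0.0, 0.1, 10.1, 10.0),
    (5.0, 5.0, 20.0, 20.0),
    (5.1, 5.0, 20.0, 20.1),
]
weights = [1.0, 2.0, 1.0, 1.0, 1.0]  # assumed per-snippet weights
modes, labels = cluster_snippets(snippets, weights, bandwidth=1.0)
```

In this toy run the first three snippets share one mode and the last two share another, which mirrors the idea of splitting one coarse pattern into spatially compact fine-grained ones. The flat kernel and fixed bandwidth are chosen for brevity only.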

Automatically Segmenting Oral History Transcripts

Knowledge-Based Approaches to the Segmentation of Oral History Interviews

Splitter: mining fine-grained sequential patterns in semantic trajectories

Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies

Automating Easy Read Text Segmentation

TreeSeg: Hierarchical Topic Segmentation of Large Transcripts

OntoSeg: a Novel Approach to Text Segmentation using Ontological Similarity

Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

Generating And Evaluating Segmentations For Automatic Speech Recognition Of Conversational Telephone Speech

Text Line Segmentation of Historical Documents: a Survey

Sentence Segmentation in Narrative Transcripts from Neuropsychological Tests using Recurrent Convolutional Neural Networks

Problem-Oriented Segmentation and Retrieval: Case Study on Tutoring Conversations

An Efficient and Effective Online Sentence Segmenter for Simultaneous Interpretation

From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions

A comparative evaluation of interactive segmentation algorithms

NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns

Combining Morphological and Histogram based Text Line Segmentation in the OCR Context

LumberChunker: Long-Form Narrative Document Segmentation

Recent Trends in Linear Text Segmentation: a Survey

TopWORDS-Seg: Simultaneous Text Segmentation and Word Discovery for Open-Domain Chinese Texts via Bayesian Inference

An automatic approach for efficient text segmentation