Effective Temporal Dependence Discovery in Time Series Data

Qingchao Cai, Zhongle Xie, Meihui Zhang, Gang Chen, H. Jagadish, Beng Chin Ooi
DOI: https://doi.org/10.14778/3204028.3204033
IF: 2.5
2018-01-01
Proceedings of the VLDB Endowment
Abstract: To analyze user behavior over time, it is useful to group users into cohorts, giving rise to cohort analysis. We identify several crucial limitations of current cohort analysis, motivated by the unmet need for temporal dependence discovery. To address these limitations, we propose a generalization that we call recurrent cohort analysis. We introduce a set of operators for recurrent cohort analysis and design access methods specific to these operators in both single-node and distributed environments. Through extensive experiments, we show that recurrent cohort analysis, when implemented using the proposed access methods, is up to six orders of magnitude faster than one implemented as a layer on top of a database in a single-node setting, and two orders of magnitude faster than one implemented using Spark SQL in a distributed setting.