A Probabilistic Approach to Detect Local Dependencies in Streams.
Qiyang Duan, Mingxi Wu, Peng Wang, Wei Wang, Yu Cao
DOI: https://doi.org/10.1007/978-3-319-10085-2_10
2014-01-01
Abstract: Given m source streams (X_1, X_2, ..., X_m) and one target data stream Y, we want to find, for any time window w, which source stream has the strongest dependency on the current value of the target stream. Existing solutions fail in several important dependency cases, such as not-similar-but-frequent patterns, signals with multiple lags, and single-point dependencies. To reveal these hard-to-detect local patterns in streams, we develop a framework based on a statistical model, together with an incremental update algorithm. Using the framework, we define a new scoring function based on conditional probability that effectively captures the local dependencies between any source stream and the target stream. Immediate real-life applications include quickly identifying the causal streams with respect to a Key Performance Indicator (KPI) in a complex production system, and detecting locally correlated stocks for an event of interest in a financial system. We apply this framework to two real data sets to demonstrate its advantages over the Principal Component Analysis (PCA) based method [16] and the naive local Pearson implementation.
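To make the problem setting concrete, the following is a minimal sketch of what a naive local Pearson baseline could look like: for one window of width w, it ranks the source streams by the absolute Pearson correlation with the target over that window. This is an illustration under stated assumptions, not the implementation evaluated in the paper and not the paper's conditional-probability scoring function. The names `pearson` and `strongest_source`, the half-open window convention, and the choice to score constant windows as 0.0 are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (an assumption made
    here to avoid dividing by zero; the paper does not specify this).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def strongest_source(sources, target, end, w):
    """Return (index, score) of the source stream with the largest
    absolute Pearson correlation to the target over [end - w, end).
    """
    start = end - w
    if w < 2 or start < 0 or end > len(target):
        raise ValueError("window must have width >= 2 and lie inside the stream")
    y_win = target[start:end]
    scores = [abs(pearson(s[start:end], y_win)) for s in sources]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical toy data: source 0 moves with the target, source 1 oscillates.
sources = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 2, 1, 2, 1],
]
target = [2, 4, 6, 8, 10, 12]
best, score = strongest_source(sources, target, end=6, w=4)
```

Because this baseline compares values at the same time positions only, it has no way to account for lagged signals or single-point dependencies, which are among the cases the abstract says existing solutions miss.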