Barking dogs: A Fréchet distance variant for detour detection

Ivor van der Hoog, Fabian Klute, Irene Parada, Patrick Schnider
2024-02-21
Abstract: Imagine you are a dog behind a fence $Q$ and a hiker is passing by at constant speed along the hiking path $P$. In order to fulfil your duties as a watchdog, you desire to bark as long as possible at the human. However, your barks can only be heard in a fixed radius $\rho$ and, as a dog, you have bounded speed $s$. Can you optimize your route along the fence $Q$ in order to maximize the barking time with radius $\rho$, assuming you can run backwards and forward at speed at most $s$? We define the barking distance from a polyline $P$ on $n$ vertices to a polyline $Q$ on $m$ vertices as the time that the hiker stays in your barking radius if you run optimally along $Q$. This asymmetric similarity measure between two curves can be used to detect outliers in $Q$ compared to $P$ that other established measures like the Fréchet distance and Dynamic Time Warping fail to capture at times. We consider this measure in three different settings. In the discrete setting, the traversals of $P$ and $Q$ are both discrete. For this case we show that the barking distance from $P$ to $Q$ can be computed in $O(nm\log s)$ time. In the semi-discrete setting, the traversal of $Q$ is continuous while the one of $P$ is again discrete. Here, we show how to compute the barking distance in time $O(nm\log (nm))$. Finally, in the continuous setting in which both traversals are continuous, we show that the problem can be solved in polynomial time. For all the settings we show that, assuming SETH, no truly subquadratic algorithm can exist.
Computational Geometry
What problem does this paper attempt to address?
This paper addresses the detection of outliers, or detours, when measuring the similarity of two curves. Established similarity measures such as the Fréchet distance and Dynamic Time Warping (DTW) have weaknesses on data that contains outliers: they may fail to identify abnormal sections or deviating points in trajectory data, especially when the data contains measurement errors. For example, once the Fréchet distance has registered one deviation, it may not register further deviations, and continuous DTW may fail to distinguish a translated curve from the original curve, even under speed limits, if some part of the latter is long enough.
To overcome these problems, the authors propose a new curve similarity measure, the "Barking Distance", which handles outliers through a natural thresholding technique. The algorithm receives two curves \(P\) and \(Q\) and a threshold \(\rho\). If there is a traversal in which the distance between the dog (traversing \(Q\)) and the hiker (traversing \(P\)) never exceeds \(\rho\), the distance is 0; otherwise, the distance is the minimum total time, over all admissible traversals, during which the dog and the hiker are more than \(\rho\) apart.
Formally, a threshold function \(\theta_\rho(p, q)\) is defined with \(\theta_\rho(p, q) = 1\) when \(d(p, q)>\rho\) and \(\theta_\rho(p, q) = 0\) otherwise. Given a traversal \((f, g)\), its cost is \(\int_0^1\theta_\rho(f(t), g(t))dt\). To avoid degenerate behavior, the speeds of both are restricted: the hiker moves at constant speed, and the dog's speed is bounded by an input parameter \(s\).
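The threshold cost defined above can be illustrated numerically. The following Python sketch is only an illustration and is not taken from the paper: the names `theta` and `traversal_cost`, the midpoint-rule approximation of the integral, and the example traversals are all assumptions made here.

```python
import math

def theta(p, q, rho):
    """Threshold function: 1 if p and q are more than rho apart, else 0."""
    return 1 if math.dist(p, q) > rho else 0

def traversal_cost(f, g, rho, steps=1000):
    """Approximate the integral of theta(f(t), g(t)) over [0, 1].

    f and g map t in [0, 1] to points in the plane. The midpoint rule
    used here is an assumption of this sketch, not the paper's method.
    """
    total = 0
    for k in range(steps):
        t = (k + 0.5) / steps  # midpoint of the k-th subinterval
        total += theta(f(t), g(t), rho)
    return total / steps

# Hypothetical example: the hiker walks from (0, 0) to (4, 0) at constant
# speed while the dog stays put at (0, 0). With rho = 1 the hiker is out
# of range for t > 0.25, so the cost is about 0.75.
hiker = lambda t: (4.0 * t, 0.0)
dog = lambda t: (0.0, 0.0)
cost = traversal_cost(hiker, dog, rho=1.0)
```

This evaluates the cost of one fixed traversal only; the measure itself asks for the best traversal of the dog subject to its speed bound.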
In this way, the "Barking Distance" can measure the similarity of curves in the presence of outliers and thus serves as a tool for outlier detection. The paper also gives efficient algorithms for computing the "Barking Distance" in the discrete, semi-discrete, and continuous settings, and proves that, assuming the Strong Exponential Time Hypothesis (SETH), no truly subquadratic algorithm exists for any of these variants.
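To make the discrete setting more concrete, here is a naive dynamic-programming sketch in Python. It rests on a simplified model that is an assumption of this example and may differ from the paper's exact definition: the hiker is at vertex `P[i]` at time step `i`, the dog sits on a vertex of `Q` at every step, may start at any vertex, and may move by at most `s` vertex indices per step. The function name `discrete_barking_time` is hypothetical. The sketch runs in O(nms) time and is not the O(nm log s) algorithm from the paper.

```python
import math

def discrete_barking_time(P, Q, rho, s):
    """Maximum number of time steps at which the hiker hears the dog.

    Assumed model (not taken from the paper): hiker at P[i] at step i,
    dog at some vertex Q[j], index j changing by at most s per step,
    free choice of the dog's starting vertex.
    """
    m = len(Q)
    # best[j]: most in-range steps so far if the dog is currently at Q[j]
    best = [1 if math.dist(P[0], q) <= rho else 0 for q in Q]
    for p in P[1:]:
        nxt = []
        for j in range(m):
            lo, hi = max(0, j - s), min(m, j + s + 1)
            reach = max(best[lo:hi])  # best predecessor within speed bound
            nxt.append(reach + (1 if math.dist(p, Q[j]) <= rho else 0))
        best = nxt
    return max(best)

# Hypothetical example: the fence Q makes a detour at its two middle vertices.
P = [(0, 0), (1, 0), (2, 0), (3, 0)]
Q_close = [(0, 1), (1, 1), (2, 1), (3, 1)]
Q_detour = [(0, 1), (1, 5), (2, 5), (3, 1)]
no_detour = discrete_barking_time(P, Q_close, rho=1.0, s=1)
with_detour = discrete_barking_time(P, Q_detour, rho=1.0, s=1)
```

In this example the dog stays in range at all four steps on the fence without a detour, but only at the first and last step on the fence with one, which is the kind of gap the measure is meant to expose.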