The DFS Fused Lasso: Linear-Time Denoising over General Graphs

Oscar Hernan Madrid Padilla, James G. Scott, James Sharpnack, Ryan J. Tibshirani
DOI: https://doi.org/10.48550/arXiv.1608.03384
2017-03-02
Abstract: The fused lasso, also known as (anisotropic) total variation denoising, is widely used for piecewise constant signal estimation with respect to a given undirected graph. The fused lasso estimate is highly nontrivial to compute when the underlying graph is large and has an arbitrary structure. But for a special graph structure, namely, the chain graph, the fused lasso (or simply, the 1d fused lasso) can be computed in linear time. In this paper, we establish a surprising connection between the total variation of a generic signal defined over an arbitrary graph, and the total variation of this signal over a chain graph induced by running depth-first search (DFS) over the nodes of the graph. Specifically, we prove that for any signal, its total variation over the induced chain graph is no more than twice its total variation over the original graph. This connection leads to several interesting theoretical and computational conclusions. Denoting by $m$ and $n$ the number of edges and nodes, respectively, of the graph in question, our result implies that for an underlying signal with total variation $t$ over the graph, the fused lasso achieves a mean squared error rate of $t^{2/3} n^{-2/3}$. Moreover, precisely the same mean squared error rate is achieved by running the 1d fused lasso on the chain graph induced by DFS. Importantly, the latter estimator is simple and computationally cheap, requiring only $O(m)$ operations for constructing the DFS-induced chain and $O(n)$ operations for computing the 1d fused lasso solution over this chain. Further, for trees that have bounded max degree, the error rate of $t^{2/3} n^{-2/3}$ cannot be improved, in the sense that it is the minimax rate for signals that have total variation $t$ over the tree.
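The central inequality in the abstract (total variation over the DFS-induced chain is at most twice the total variation over the original graph) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes a connected undirected graph on nodes 0..n-1, and all function names (`build_adjacency`, `dfs_order`, `tv_graph`, `tv_chain`, `random_connected_graph`) are hypothetical names chosen for illustration. It builds only the chain and compares total variations; it does not solve the 1d fused lasso itself.

```python
import random


def build_adjacency(n, edges):
    """Adjacency lists for an undirected graph on nodes 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def dfs_order(adj, root=0):
    """Nodes in the order DFS first visits them (iterative, O(m) time)."""
    seen = [False] * len(adj)
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        if seen[v]:
            continue  # already visited via another stack entry
        seen[v] = True
        order.append(v)
        # Push in reverse so that earlier-listed neighbours are visited first.
        for w in reversed(adj[v]):
            if not seen[w]:
                stack.append(w)
    return order


def tv_graph(edges, x):
    """Total variation of signal x over the graph: sum of |x_u - x_v| over edges."""
    return sum(abs(x[u] - x[v]) for u, v in edges)


def tv_chain(order, x):
    """Total variation of x over the chain that links consecutive DFS nodes."""
    return sum(abs(x[a] - x[b]) for a, b in zip(order, order[1:]))


def random_connected_graph(n, extra, rng):
    """Random tree plus up to `extra` more edges (illustrative helper only)."""
    edges = {(rng.randrange(v), v) for v in range(1, n)}
    for _ in range(extra):
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 50
    edges = random_connected_graph(n, 30, rng)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    order = dfs_order(build_adjacency(n, edges))
    # The left-hand value should never exceed the right-hand one.
    print(tv_chain(order, x), "<=", 2 * tv_graph(edges, x))
```

Once `order` is available, any linear-time 1d fused lasso solver could be applied to the signal reindexed as `[y[v] for v in order]`, which is the computational shortcut the abstract describes.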
Statistics Theory