Time-Varying Graph Learning for Data with Heavy-Tailed Distribution

Amirhossein Javaheri, Jiaxi Ying, Daniel P. Palomar, Farokh Marvasti
2025-01-01
Abstract: Graph models provide efficient tools to capture the underlying structure of data defined over networks. Many real-world network topologies are subject to change over time. Learning to model the dynamic interactions between entities in such networks is known as time-varying graph learning. Current methodology for learning such models often lacks robustness to outliers in the data and fails to handle heavy-tailed distributions, a common feature in many real-world datasets (e.g., financial data). This paper addresses the problem of learning time-varying graph models capable of efficiently representing heavy-tailed data. Unlike traditional approaches, we incorporate graph structures with specific spectral properties to enhance data clustering in our model. Our proposed method, which can also deal with noise and missing values in the data, is based on a stochastic approach, where a non-negative vector auto-regressive (VAR) model captures the variations in the graph and a Student-t distribution models the signal originating from this underlying time-varying graph. We propose an iterative method to learn time-varying graph topologies within a semi-online framework where only a mini-batch of data is used to update the graph. Simulations with both synthetic and real datasets demonstrate the efficacy of our model in analyzing heavy-tailed data, particularly those found in financial markets.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of learning time-varying graph models that can effectively represent heavy-tailed data. Existing methods often lack robustness to outliers when the data follow a heavy-tailed distribution (as financial data often do), and they cannot effectively handle noise and missing values. They also do not fully capture graph topologies with specific structural and spectral properties.

### Main problem summary:

1. **Lack of robustness to outliers**: Existing methods perform poorly on data with heavy-tailed distributions.
2. **Inability to handle noise and missing values**: Many real-world datasets contain noise or missing values, and existing methods do not cope well with either.
3. **Failure to capture graph topologies with specific structural and spectral properties**: Existing methods do not fully account for the structure and spectral properties of graphs, such as k-component graphs (graphs with k connected components), which matter in tasks such as data clustering.

### Method proposed in the paper:

To address these problems, the authors propose a framework based on a stochastic model that:

- Uses a non-negative vector autoregressive (VAR) model to capture the temporal variation of the graph weights.
- Uses a Student-t distribution to model the signals generated by the time-varying graph, which handles heavy-tailed data better.
- Updates the graph from a mini-batch of data within a semi-online framework, which improves efficiency and reduces computational cost.
- Introduces spectral and structural constraints so that the learned graph has specific properties, such as having k connected components.
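The stochastic model described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`laplacian_from_weights`, `simulate_weights`, `sample_student_t`), the VAR order of 1, the exponential (non-negative) innovation noise, and the use of the Laplacian's pseudo-inverse as the Student-t scale matrix are all choices made here for illustration.

```python
import numpy as np

def laplacian_from_weights(w, p):
    """Build a p x p graph Laplacian from p*(p-1)/2 upper-triangular edge weights."""
    W = np.zeros((p, p))
    W[np.triu_indices(p, k=1)] = w
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W

def simulate_weights(w0, a, n_frames, noise_scale, rng):
    """Non-negative VAR(1): w_n = a * w_{n-1} + eps_n with eps_n >= 0.

    The exponential innovation is an assumption; it keeps every w_n
    non-negative whenever w0 >= 0 and a >= 0.
    """
    weights = [np.asarray(w0, dtype=float)]
    for _ in range(n_frames - 1):
        eps = rng.exponential(noise_scale, size=weights[-1].shape)
        weights.append(a * weights[-1] + eps)
    return np.stack(weights)

def sample_student_t(L, nu, n_samples, rng, tol=1e-10):
    """Draw Student-t signals whose scale matrix is the pseudo-inverse of L."""
    vals, vecs = np.linalg.eigh(L)
    keep = vals > tol  # drop the null space of the Laplacian
    # Gaussian with covariance pinv(L), built from the non-null eigenpairs
    z = rng.standard_normal((n_samples, int(keep.sum())))
    g = (z / np.sqrt(vals[keep])) @ vecs[:, keep].T
    # Dividing by sqrt(chi2_nu / nu) turns Gaussian draws into Student-t draws
    u = rng.chisquare(nu, size=n_samples) / nu
    return g / np.sqrt(u)[:, None]
```

Calling `simulate_weights` produces one weight vector per time frame, and passing the Laplacian of each frame to `sample_student_t` yields heavy-tailed signals for that frame. Smaller values of `nu` give heavier tails.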
### Formula representation:

The formulas involved in the paper include, but are not limited to:

- **Probability density function of the Student-t distribution**:

\[
p(x_t \mid w_n) \propto \left(\det^*(L_{w_n})\right)^{1/2} \left(1 + \frac{x_t^{\top} L_{w_n} x_t}{\nu}\right)^{-(\nu + p)/2}
\]

where \(\nu > 2\), \(t \in F_n\), and \(L_{w_n}\) is the Laplacian matrix of the time-varying graph.

- **Minimization objective function**:

\[
\begin{aligned}
\min_{w_n \geq 0,\; a \geq 0,\; X_n} \quad & \frac{1}{T_n \sigma_n^2} \left\| Y_n - M_n \odot X_n \right\|_F^2 - \log\det^*(L_{w_n}) + \frac{\nu + p}{T_n} \sum_{t \in F_n} \log\left(1 + \frac{x_t^{\top} L_{w_n} x_t}{\nu}\right) \\
& + \alpha \left\| w_n - a \odot \hat{w}_{n-1} \right\|_1 + \beta \left\| w_n \right\|_0 + \gamma\, a^{\top} 1
\end{aligned}
\]

With these elements, the method handles heavy-tailed data more robustly, deals effectively with noise and missing values, and learns graph topologies with specific structural and spectral properties.
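To make the terms of the objective above concrete, the following sketch evaluates it for given inputs. It is an illustration under assumptions rather than the paper's algorithm: the function names `log_pdet` and `objective` are hypothetical, the data matrices are assumed to be stored with one sample per row (shape \(T_n \times p\)), \(T_n\) is taken to be the number of samples in the frame, and \(\det^*\) is computed as the product of the eigenvalues above a small tolerance.

```python
import numpy as np

def log_pdet(L, tol=1e-10):
    """log det*(L): sum of the logs of the eigenvalues of L above tol."""
    vals = np.linalg.eigvalsh(L)
    return float(np.sum(np.log(vals[vals > tol])))

def objective(Y, M, X, L, w, a, w_prev, sigma2, nu, alpha, beta, gamma):
    """Evaluate the objective for one frame.

    Y, M, X : (T, p) arrays (observations, 0/1 mask, latent signals);
              the row-per-sample layout is an assumption.
    L       : (p, p) Laplacian built from the weight vector w.
    w_prev  : previous weight estimate (w-hat_{n-1} in the formula).
    """
    T, p = X.shape
    # Data fidelity on the observed entries only
    fidelity = np.sum((Y - M * X) ** 2) / (T * sigma2)
    # Quadratic forms x_t^T L x_t, one per sample
    quad = np.einsum("ti,ij,tj->t", X, L, X)
    heavy_tail = (nu + p) / T * np.sum(np.log1p(quad / nu))
    # Temporal consistency with the VAR prediction a * w_prev
    temporal = alpha * np.sum(np.abs(w - a * w_prev))
    # l0 "norm": number of non-zero edge weights
    sparsity = beta * np.count_nonzero(w)
    return (fidelity - log_pdet(L) + heavy_tail
            + temporal + sparsity + gamma * np.sum(a))
```

Because the logarithm grows slowly, a sample with a very large quadratic form \(x_t^{\top} L_{w_n} x_t\) contributes far less to the heavy-tail term than it would under a Gaussian (purely quadratic) penalty, which is the source of the robustness to outliers.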