Non-parametric regression for networks

Katie E. Severn,Ian L. Dryden,Simon P. Preston
DOI: https://doi.org/10.48550/arXiv.2010.00050
2020-10-01
Abstract:Network data are becoming increasingly available, and so there is a need to develop suitable methodology for statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold-valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example in dynamic networks where the covariate is time. We develop an adapted Nadaraya-Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application is given in corpus linguistics, which explores trends in an author's writing style over time based on word co-occurrence networks.
Methodology,Applications
What problem does this paper attempt to address?