Abstract: This material introduces the D-Subspace algorithm, derived from the centralized algorithm of [1], which originally addresses parameter estimation problems under a subspace constraint.
What problem does this paper attempt to address?
This paper addresses online learning over distributed networks, specifically parameter estimation under a low-rank subspace constraint. Its goal is to develop an algorithm, named "D-Subspace", for efficient parameter estimation and optimization in a distributed network.
### Problem Background
The paper assumes that in a connected network of \( N \) nodes, each node \( k\in\mathcal{N} \) has a strongly convex, real-valued and differentiable cost function \( J_k(w_k) \), defined as the expected value of a loss function \( G_k(w_k; s_{k,n}) \):
\[ J_k(w_k)\triangleq\mathbb{E}\{G_k(w_k; s_{k,n})\} \]
where \( \mathbb{E}\{\cdot\} \) denotes the expectation with respect to the distribution of the random data \( s_{k,n} \), and the subscripts \( k \) and \( n \) denote the node index and the time instant, respectively.
The true parameter vector \( w_k^*\in\mathbb{R}^L \) of each node \( k \) is the unique minimizer of \( J_k(w_k) \). Define the matrix \( W^* \) as follows:
\[ W^*\triangleq[w_1^*, w_2^*, \cdots, w_N^*]\in\mathbb{R}^{L\times N} \]
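To make the cost model concrete, the sketch below assumes a mean-square-error loss \( (d - u^{\top} w)^2 \) with data generated by a linear model plus noise. The loss, the data model and every name in the code are illustrative assumptions rather than details taken from the paper. Under these assumptions, the minimizer of the sample-average cost at each node should approach the corresponding column of \( W^* \).

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, T = 4, 3, 5000      # parameter length, nodes, samples per node (illustrative)
noise_std = 0.1           # assumed measurement-noise level

# Hypothetical true parameters w_k^*, stacked as the columns of W_star (L x N).
W_star = rng.standard_normal((L, N))

def empirical_cost(w, U, d):
    """Sample average of the assumed loss (d - u^T w)^2, approximating J_k(w)."""
    return np.mean((d - U @ w) ** 2)

W_hat = np.zeros((L, N))
costs_hat = np.zeros(N)
costs_true = np.zeros(N)
for k in range(N):
    U = rng.standard_normal((T, L))                              # regressors, one per row
    d = U @ W_star[:, k] + noise_std * rng.standard_normal(T)    # noisy measurements
    # Minimizer of the empirical cost at node k (ordinary least squares).
    W_hat[:, k] = np.linalg.lstsq(U, d, rcond=None)[0]
    costs_hat[k] = empirical_cost(W_hat[:, k], U, d)
    costs_true[k] = empirical_cost(W_star[:, k], U, d)

# Largest per-node estimation error ||w_hat_k - w_k^*||.
max_err = np.max(np.linalg.norm(W_hat - W_star, axis=0))
```

With more samples per node, `max_err` should shrink, which is consistent with \( w_k^* \) being the minimizer of the expected cost under this assumed model.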
### Low-Rank Assumption
The paper assumes that \( W^* \) is a low-rank matrix with rank \( r^* \), that is:
\[ w_k^*=\sum_{i = 1}^{r^*}\alpha_{k,i}^o c_i = C\cdot\alpha_k^o \]
where \( \{c_i\}_{i = 1}^{r^*} \) is a set of basis vectors, \( \{\alpha_{k,i}^o\}_{i = 1}^{r^*} \) are the corresponding weights, the matrix \( C\triangleq[c_1, c_2, \cdots, c_{r^*}]\in\mathbb{R}^{L\times r^*} \), and the vector \( \alpha_k^o\triangleq[\alpha_{k,1}^o, \alpha_{k,2}^o, \cdots, \alpha_{k,r^*}^o]^{\top} \). Assume that \( \alpha_k^o \) is known. Substituting the above expression into \( W^*\), we get:
\[ W^* = C\cdot\Theta^o \]
where the matrix \( \Theta^o\triangleq[\alpha_1^o, \alpha_2^o, \cdots, \alpha_N^o]\in\mathbb{R}^{r^* \times N} \) is also known.
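The factorization above can be checked numerically. The following is a minimal sketch with randomly generated \( C \) and \( \Theta^o \); the sizes and variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N, r = 6, 5, 2                      # illustrative sizes with r < min(L, N)

C = rng.standard_normal((L, r))        # basis vectors c_i as columns
Theta = rng.standard_normal((r, N))    # weight vectors alpha_k^o as columns

W_star = C @ Theta                     # W^* = C Theta^o, shape (L, N)

# Column k of W^* is the combination C alpha_k^o of the basis vectors.
k = 3
w_k = C @ Theta[:, k]

# For generic (random) C and Theta, the rank of W^* equals r.
rank_W = np.linalg.matrix_rank(W_star)
```

Although \( W^* \) has \( L \times N \) entries, it is determined by the \( L \times r^* \) entries of \( C \) once \( \Theta^o \) is known, which is what makes the subspace constraint useful for estimation.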
### Centralized Optimization Problem
Based on the above assumptions, the centralized optimization problem can be expressed as:
\[ \arg\min_{w_{\ell}:\ell\in\mathcal{N}}\sum_{\ell = 1}^N J_{\ell}(w_{\ell}) \]
\[ \text{s.t. }[W^{\top}](:, j)\in\mathcal{R}([\Theta^o]^{\top}),\quad\forall j \]
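where \( W\triangleq[w_{\ell}]_{\ell\in\mathcal{N}} \) is the estimate of \( W^* \), and \( \mathcal{R}(\cdot) \) denotes the range (column space) of its argument. The constraint requires every column of \( W^{\top} \), that is, every row of \( W \), to lie in the row space of \( \Theta^o \).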
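One standard way to test or enforce a range-space constraint of this kind is orthogonal projection. The sketch below builds the projector onto \( \mathcal{R}([\Theta^o]^{\top}) \) from the pseudoinverse and applies it to the rows of \( W \). It is an illustration of the constraint only, not the paper's algorithm, and the function names are hypothetical.

```python
import numpy as np

def row_space_projector(Theta):
    """Orthogonal projector (N x N) onto R(Theta^T), the row space of Theta."""
    return np.linalg.pinv(Theta) @ Theta

def project_rows(W, Theta):
    """Project every row of W onto R(Theta^T)."""
    return W @ row_space_projector(Theta)

def satisfies_constraint(W, Theta, tol=1e-8):
    """True if every row of W lies in R(Theta^T), up to a relative tolerance."""
    residual = np.linalg.norm(W - project_rows(W, Theta))
    return bool(residual <= tol * max(1.0, np.linalg.norm(W)))

rng = np.random.default_rng(2)
L, N, r = 6, 5, 2                                    # illustrative sizes
Theta = rng.standard_normal((r, N))                  # stands in for the known Theta^o

W_feasible = rng.standard_normal((L, r)) @ Theta     # of the form C Theta, so feasible
W_generic = rng.standard_normal((L, N))              # generic matrix, infeasible
W_projected = project_rows(W_generic, Theta)         # nearest feasible matrix (Frobenius norm)
```

Any matrix of the form \( C\,\Theta^o \) passes the check, while a generic \( L \times N \) matrix does not until its rows are projected.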
### Distributed Optimization Problem
Since the network is connected and only local data exchange is allowed in distributed processing, for each node \(