Abstract: System identification is a fundamental problem in reinforcement learning, control theory and signal processing, and the non-asymptotic analysis of the corresponding sample complexity is challenging and elusive, even for linear time-varying (LTV) systems. To tackle this challenge, we develop an episodic block model for the LTV system where the model parameters remain constant within each block but change from block to block. Based on the observation that the model parameters across different blocks are related, we treat each episodic block as a learning task and then run meta-learning over many blocks for system identification, using two steps, namely offline meta-learning and online adaptation. We carry out a comprehensive non-asymptotic analysis of the performance of meta-learning based system identification. To deal with the technical challenges rooted in the sample correlation and small sample sizes in each block, we devise a new two-scale martingale small-ball approach for offline meta-learning, for arbitrary model correlation structure across blocks. We then quantify the finite time error of online adaptation by leveraging recent advances in linear stochastic approximation with correlated samples.
What problem does this paper attempt to address?
The paper addresses the sample complexity of system identification for linear time-varying (LTV) systems. Specifically, it aims to improve the efficiency of parameter estimation for LTV systems through meta-learning, especially when only a limited number of samples is available. Non-asymptotic analysis has made progress for linear time-invariant (LTI) systems, but for LTV systems, especially when the environment changes rapidly, using a small amount of data to adapt quickly to new model parameters remains a challenge.
### Main contributions of the paper
1. **Proposed a meta-learning-based identification method for linear time-varying systems**:
- The authors propose an episodic block model in which the model parameters remain constant within each block but change randomly from block to block.
- Based on this model, meta-learning is run offline over many blocks to obtain a good model initialization, which is then adapted online using the data of a new block, enabling fast and effective parameter estimation.
2. **Provided a non-asymptotic performance analysis of meta-learning for LTV system identification**:
- To handle sample correlation and the small sample size in each block, the authors devise a new two-scale martingale small-ball approach for the analysis of offline meta-learning.
- The finite-time error of the online adaptation phase is quantified by leveraging results on linear stochastic approximation with correlated samples.
3. **Analyzed the model estimation error**:
- The multi-step gradient descent algorithm is reformulated as a linear stochastic approximation algorithm, from which an upper bound on the finite-time error is derived. The bound shows that the error between the model estimator and the true model decays exponentially over time.
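The reformulation in the last item can be illustrated on a generic least-squares loss. This is a sketch for intuition only: the symbols \(x_t\), \(y_t\), \(\alpha\), \(G_t\) and \(b_t\) are assumed here and are not taken from the paper, whose exact loss and update may differ.

```latex
\begin{aligned}
\ell_t(\phi) &= \tfrac{1}{2}\bigl(y_t - x_t^\top \phi\bigr)^2, \\
\phi_{t+1} &= \phi_t - \alpha \nabla \ell_t(\phi_t)
            = \phi_t - \alpha\, x_t \bigl(x_t^\top \phi_t - y_t\bigr) \\
           &= \phi_t + \alpha \bigl(b_t - G_t \phi_t\bigr),
\qquad G_t = x_t x_t^\top, \quad b_t = x_t y_t .
\end{aligned}
```

Each gradient step is affine in \(\phi_t\) with random coefficients \((G_t, b_t)\), which is the form of a linear stochastic approximation recursion. When the samples come from a single trajectory, consecutive pairs \((G_t, b_t)\) are correlated, which is why results for correlated samples are needed.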
### Key issues
- **Distance between the meta-learning initialization and the true model parameters**: The paper studies the distance \(\|\phi^*_\theta - \phi_j\|\) between the meta-learning initialization \(\phi^*_\theta\) and the true model parameters \(\phi_j\) of a given block, and analyzes how the training set size \(M\), the test set size \(L - M\), and the number of trajectories \(D\) affect this distance.
- **Influence of model similarity**: Assuming that the model parameters in different blocks are similar to some degree, the paper studies how this similarity affects the parameter estimation error.
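The two issues above can be explored numerically with a minimal sketch, under several assumptions that are not from the paper. Each block follows \(x_{t+1} = A_j x_t + w_t\) with \(A_j\) drawn around a hypothetical shared matrix `A_bar`, and `spread` controls how similar the blocks are. The offline stage is replaced by a plain average of per-block least-squares estimates, which is a simple stand-in and not the paper's meta-learning objective. All function names, the step size, and the noise level are illustrative choices.

```python
import numpy as np


def simulate_block(A, T, rng, noise_std=1.0):
    """Roll out x_{t+1} = A x_t + w_t for T steps, starting from x_0 = 0."""
    n = A.shape[0]
    X = np.zeros((T + 1, n))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(n)
    return X


def least_squares(X):
    """Least-squares estimate of A from a single trajectory X."""
    # Solve X[:-1] @ A.T ~= X[1:] in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
    return A_T.T


def adapt(A_init, X, step=0.01):
    """One pass of gradient steps on the squared one-step prediction error."""
    A_hat = A_init.copy()
    for t in range(len(X) - 1):
        residual = A_hat @ X[t] - X[t + 1]
        A_hat -= step * np.outer(residual, X[t])
    return A_hat


def run_experiment(spread=0.05, n=3, T=20, n_train=200, n_test=50, seed=0):
    """Return mean adaptation error from a shared init and from a zero init."""
    rng = np.random.default_rng(seed)
    A_bar = 0.5 * np.eye(n)  # assumed component shared by all blocks

    def draw_block_model():
        # Hypothetical similarity model: shared part plus a small perturbation.
        return A_bar + spread * rng.standard_normal((n, n))

    # Offline stage (stand-in): average the per-block least-squares estimates.
    estimates = [
        least_squares(simulate_block(draw_block_model(), T, rng))
        for _ in range(n_train)
    ]
    A_init = np.mean(estimates, axis=0)

    # Online stage: adapt on new short blocks and measure Frobenius error.
    err_shared, err_zero = [], []
    for _ in range(n_test):
        A_j = draw_block_model()
        X = simulate_block(A_j, T, rng)
        err_shared.append(np.linalg.norm(adapt(A_init, X) - A_j))
        err_zero.append(np.linalg.norm(adapt(np.zeros((n, n)), X) - A_j))
    return float(np.mean(err_shared)), float(np.mean(err_zero))


if __name__ == "__main__":
    for spread in (0.02, 0.05, 0.1):
        shared, zero = run_experiment(spread=spread)
        print(f"spread={spread}: shared init {shared:.3f}, zero init {zero:.3f}")
```

Varying `spread`, `T`, and `n_train` in `run_experiment` gives a rough feel for how block similarity, block length, and the number of offline blocks change the error after a short adaptation. The sketch does not reproduce the paper's bounds or its dependence on \(M\), \(L - M\), and \(D\).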
### Conclusion
With meta-learning, the parameters of linear time-varying systems can be estimated effectively even when only a limited number of samples is available. The paper not only provides a new system identification method but also gives a rigorous theoretical analysis of its performance, offering a theoretical basis for understanding how meta-learning applies to dynamical systems.