Abstract: State inference and parameter learning in sequential models can be successfully performed with approximation techniques that maximize the evidence lower bound to the marginal log-likelihood of the data distribution. These methods may be referred to as Dynamical Variational Autoencoders, and our specific focus lies on the deep Kalman filter. It has been shown that the ELBO objective can oversimplify data representations, potentially compromising estimation quality. Tighter Monte Carlo objectives have been proposed in the literature to enhance generative modeling performance. For instance, the IWAE objective uses importance weights to reduce the variance of marginal log-likelihood estimates. In this paper, importance sampling is applied to the DKF framework for learning deep Markov models, resulting in the IW-DKF, which shows an improvement in terms of log-likelihood estimates and KL divergence between the variational distribution and the transition model. The framework using the sampled DKF update rule is also accommodated to address sequential state and parameter estimation when working with highly non-linear physics-based models. An experiment with the 3-space Lorenz attractor shows an enhanced generative modeling performance and also a decrease in RMSE when estimating the model parameters and latent states, indicating that tighter MCOs lead to improved state inference performance.
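The claim that importance-weighted objectives are tighter than the ELBO can be illustrated with a minimal sketch on a toy model. This is not the IW-DKF itself: the one-dimensional Gaussian model, the deliberately mismatched proposal, and the names `log_normal` and `iwae_bound` are assumptions chosen only so that the exact marginal log-likelihood is available for comparison. The K-sample bound is the expected log of the average of K importance weights p(x, z_k) / q(z_k | x); K = 1 recovers the ELBO.

```python
import numpy as np

def log_normal(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def iwae_bound(x, k, n_rep=20000, q_mean=0.0, q_std=1.0, seed=0):
    """Monte Carlo estimate of the K-sample importance-weighted bound.

    Toy model (an assumption for illustration): z ~ N(0, 1), x | z ~ N(z, 1).
    Proposal q(z | x) = N(q_mean, q_std^2), deliberately not the true posterior.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(q_mean, q_std, size=(n_rep, k))
    # log w_k = log p(z_k) + log p(x | z_k) - log q(z_k | x)
    log_w = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, q_mean, q_std)
    # Numerically stable log of the mean weight over the K samples in each row
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    # Average over independent repetitions to approximate the outer expectation
    return float(log_mean_w.mean())

x_obs = 1.5
# For this toy model the marginal is available in closed form: x ~ N(0, 2)
exact = float(log_normal(x_obs, 0.0, np.sqrt(2.0)))
bounds = {k: iwae_bound(x_obs, k) for k in (1, 5, 50)}

for k, value in bounds.items():
    print(f"K={k:3d}  bound={value:.3f}  gap to exact={exact - value:.3f}")
```

With a poor proposal the gap at K = 1 is large and shrinks as K grows, which is the behavior the abstract attributes to tighter Monte Carlo objectives.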

What problem does this paper attempt to address?

This paper addresses the concern that training deep sequential state estimators with the standard variational autoencoder (VAE) objective, the evidence lower bound (ELBO), can yield an overly simplified data representation. It focuses on the deep Kalman filter (DKF) framework and introduces tighter Monte Carlo objectives (MCOs), in particular the importance-weighted autoencoder (IWAE) objective, to improve generative modeling performance and the quality of state estimation.

### Main research problems

1. **Simplified data representation**: The standard ELBO objective can cause the model to represent the data too simply, which may degrade the quality of state estimation.
2. **Improvement of generative modeling performance**: Tighter MCOs, such as the IWAE objective, are introduced to improve the performance of the generative model.
3. **State and parameter estimation of nonlinear physical models**: The paper evaluates how the IWAE objective affects state and parameter estimation for highly nonlinear physics-based models.

### Solutions

The paper proposes the importance-weighted deep Kalman filter (IW-DKF). The method applies importance sampling within the DKF framework and uses a K-sample importance-weighted estimate of the marginal log-likelihood to improve state estimation. Specific improvements include:

- **Generative modeling performance**: Experimental results show that IW-DKF improves generative modeling performance, both when learning deep Markov models (DMMs) and on the three-dimensional Lorenz attractor model.
- **State and parameter estimation**: On the three-dimensional Lorenz attractor model, IW-DKF estimates parameters and latent states more accurately, with a lower root-mean-square error (RMSE).

### Experimental verification

1. **DMM learning on the polyphonic music dataset**:
   - **Settings**: The polyphonic music dataset is used, with training, validation, and test sets of 220, 76, and 77 sequences respectively.
   - **Results**: As the number of samples K increases, the IW-DKF log-likelihood estimate increases and its standard deviation decreases markedly, indicating more stable estimates.
2. **State estimation of the three-dimensional Lorenz attractor model**:
   - **Settings**: The three-dimensional Lorenz attractor model, a nonlinear chaotic system, is used.
   - **Results**: IW-DKF reduces both the parameter estimation error and the RMSE of the state estimates.

### Conclusions

By introducing IW-DKF, the paper mitigates the problem of simplified data representation in deep sequential state estimation and reports improvements in both generative modeling and state estimation. Future research directions include further comparison of different MCOs for state estimation and methods for directly optimizing the variational distribution.
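For readers unfamiliar with the Lorenz experiment summarized above, the following minimal sketch simulates the three-dimensional Lorenz system and computes an RMSE against the true states. It is not the paper's experimental setup: the classic parameter values (sigma = 10, rho = 28, beta = 8/3), the step size, the observation noise level, the fourth-order Runge-Kutta integrator, and the helper names are all assumptions made for illustration. The RMSE shown is that of raw noisy observations, a baseline that a state estimator would aim to beat.

```python
import numpy as np

def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classic parameter values assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def simulate_lorenz(x0, n_steps, dt=0.01):
    """Integrate the Lorenz system with a fixed-step fourth-order Runge-Kutta scheme."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = x0
    for t in range(n_steps):
        s = traj[t]
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        traj[t + 1] = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj

def rmse(estimate, truth):
    """Root-mean-square error over all time steps and state dimensions."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

rng = np.random.default_rng(0)
states = simulate_lorenz(np.array([1.0, 1.0, 1.0]), n_steps=2000)
# Hypothetical observation model: the full state plus Gaussian noise with std 0.5
observations = states + rng.normal(0.0, 0.5, size=states.shape)
baseline_rmse = rmse(observations, states)
print(f"RMSE of raw noisy observations: {baseline_rmse:.3f}")
```

Any estimator of the latent states can be scored by passing its output to `rmse` in place of `observations`.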

On the Impact of Sampling on Deep Sequential State Estimation

Sequential Ensemble-Based Optimal Design for Parameter Estimation

A Stochastic Approximation-Langevinized Ensemble Kalman Filter Algorithm for State Space Models with Unknown Parameters

Nonlinear Assimilation with Score-based Sequential Langevin Sampling

Sample as You Infer: Predictive Coding With Langevin Dynamics

Importance sampling for online variational learning

Model error covariance estimation in particle and ensemble Kalman filters using an online expectation-maximization algorithm

Ensemble Kalman Variational Objectives: Nonlinear Latent Trajectory Inference with A Hybrid of Variational Inference and Ensemble Kalman Filter

Variationally Inferred Sampling through a Refined Bound

A Dynamical System View of Langevin-Based Non-Convex Sampling

Iterated INLA for State and Parameter Estimation in Nonlinear Dynamical Systems

Stochastic parameterization identification using ensemble Kalman filtering combined with expectation-maximization and Newton-Raphson maximum likelihood methods

Importance sampling for rare event tracking within the ensemble Kalman filtering framework

The discriminative Kalman filter for nonlinear and non-Gaussian sequential Bayesian filtering

Ensemble Kalman filter based sequential Monte Carlo sampler for sequential Bayesian inference

Streaming Variational Monte Carlo

Interacting Langevin Diffusions: Gradient Structure And Ensemble Kalman Sampler

Real-Time Variational Method for Learning Neural Trajectory and its Dynamics

Online Joint State Inference and Learning of Partially Unknown State-Space Models

A Reduced Basis Ensemble Kalman Method

Joint State Estimation and Noise Identification Based on Variational Optimization