Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging

Chung-En Tsai, Hao-Chung Cheng, Yen-Huan Li
2024-03-11
Abstract: Consider the problem of minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices. This problem includes tasks such as solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating positive semi-definite matrix permanents with the currently tightest approximation ratio. Although the optimization problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply due to the absence of Lipschitz continuity and smoothness in the loss function. In this work, we propose a stochastic first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^2/\varepsilon^2)$ time, matching the state of the art, where $d$ denotes the dimension. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\smash{\tilde{O}}(d^3/\varepsilon^2)$ time. This improves on the time complexities of existing stochastic first-order methods by a factor of $d^{\omega-2}$ and those of batch methods by a factor of $d^2$, where $\omega$ denotes the matrix multiplication exponent. Numerical experiments demonstrate that empirically, our algorithm outperforms existing methods with explicit complexity guarantees.
Optimization and Control, Machine Learning, Quantum Physics
What problem does this paper attempt to address?
The paper addresses the problem of minimizing an expected logarithmic loss over the probability simplex or the set of quantum density matrices. Instances of this problem include solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating the permanents of positive semi-definite matrices. Although the optimization problem is convex, the loss function is neither Lipschitz continuous nor smooth, so the standard iteration complexity guarantees for first-order methods do not directly apply.

### Main Contributions

1. **Algorithm Proposal**: The paper proposes a mini-batch stochastic first-order algorithm, \(B\)-sample stochastic dual averaging with the logarithmic barrier (LB-SDA), to solve the optimization problems above.
2. **Theoretical Analysis**: Using a new local-norm analysis together with the smoothness properties of self-concordant functions, the paper proves that \(B\)-sample LB-SDA has a time complexity of \(\tilde{O}(d^2/\epsilon^2)\) in the classical setting and \(\tilde{O}(d^3/\epsilon^2)\) in the quantum setting. In the quantum setting, this improves the dimension dependence of existing stochastic first-order methods by a factor of \(d^{\omega - 2}\), where \(\omega\) is the matrix multiplication exponent.
3. **Numerical Experiments**: The experiments show that 1-sample LB-SDA is the fastest method with explicit complexity guarantees for the Poisson inverse problem, and that \(d\)-sample LB-SDA outperforms all other methods in terms of fidelity when computing the maximum-likelihood estimate for quantum state tomography.

### Problems Solved

- **Poisson Inverse Problem**: Widely used in medical imaging and astronomical image denoising.
- **Maximum-Likelihood Estimation for Quantum State Tomography**: A fundamental task in verifying quantum devices.
- **Permanent Approximation for Positive Semi-Definite Matrices**: Used to estimate the output probabilities of boson-sampling experiments.

### Theoretical Challenges

- **Lack of Lipschitz Continuity and Smoothness**: The standard guarantees for first-order methods such as mirror descent and dual averaging do not directly apply.
- **Scalability to High Dimensions and Large Data Sets**: The time complexity of batch methods depends at least linearly on the sample size, and second-order methods scale poorly in high dimensions.

### Experimental Results

- **Poisson Inverse Problem**: 1-sample LB-SDA outperforms all methods with explicit complexity guarantees in terms of normalized estimation error. In terms of optimization error it is slightly worse than EMD and SPDHG, which guarantee only asymptotic convergence.
- **Quantum State Tomography**: \(d\)-sample LB-SDA outperforms all methods in terms of fidelity and also performs well in terms of optimization error.

In conclusion, the \(B\)-sample LB-SDA algorithm comes with improved time-complexity guarantees in theory and shows strong empirical performance in the numerical experiments.
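
To make the idea of stochastic dual averaging with a logarithmic barrier more concrete, here is a minimal sketch for the classical (probability simplex) setting only. It assumes the empirical objective \(f(x) = \frac{1}{n}\sum_{i} -\log\langle a_i, x\rangle\) with nonnegative vectors \(a_i\). The function names (`log_loss`, `barrier_step`, `lb_sda_sketch`), the constant step size `eta`, the bisection solver, and the choice to return the average iterate are all illustrative assumptions. They are not taken from the paper, and the sketch does not reproduce the paper's step-size schedule or its guarantees.

```python
import math
import random


def log_loss(A, x):
    """Empirical logarithmic loss: mean of -log <a_i, x> over the rows a_i of A."""
    return -sum(math.log(sum(a * xi for a, xi in zip(row, x))) for row in A) / len(A)


def barrier_step(grad_sum, eta, iters=60):
    """Minimize eta * <grad_sum, x> - sum_i log(x_i) over the probability simplex.

    Stationarity gives x_i = 1 / (eta * grad_sum_i + lam); bisection on the
    multiplier lam enforces sum_i x_i = 1.
    """
    d = len(grad_sum)
    g_min = min(grad_sum)
    # Shift so that every entry is nonnegative and the smallest entry is 0.
    shifted = [eta * (g - g_min) for g in grad_sum]
    # sum_i 1 / (shifted_i + lam) is >= 1 at lam = 1 and <= 1 at lam = d.
    lo, hi = 1.0, float(d)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(1.0 / (s + mid) for s in shifted) > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    x = [1.0 / (s + lam) for s in shifted]
    total = sum(x)
    return [xi / total for xi in x]  # remove the residual bisection error


def lb_sda_sketch(A, batch_size=1, steps=300, eta=0.1, seed=0):
    """Illustrative B-sample dual averaging loop (assumed constant step size eta)."""
    rng = random.Random(seed)
    d = len(A[0])
    x = [1.0 / d] * d          # start at the barycenter of the simplex
    grad_sum = [0.0] * d       # running sum of mini-batch gradients
    x_avg = [0.0] * d          # running average of the iterates
    for _ in range(steps):
        x_avg = [s + xi / steps for s, xi in zip(x_avg, x)]
        batch = [A[rng.randrange(len(A))] for _ in range(batch_size)]
        for row in batch:
            inner = sum(a * xi for a, xi in zip(row, x))
            # Gradient of -log <a, x> is -a / <a, x>; average it over the batch.
            for j in range(d):
                grad_sum[j] -= row[j] / (inner * batch_size)
        x = barrier_step(grad_sum, eta)
    return x_avg
```

Calling `lb_sda_sketch(A, batch_size=1)` on a list of nonnegative rows `A` returns a point in the interior of the simplex. The barrier keeps every coordinate strictly positive, so each inner product \(\langle a_i, x\rangle\) stays positive and the stochastic gradients remain finite even though the loss is not Lipschitz.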