Abstract:We consider the problem of sampling from a log-concave distribution $\pi(\theta) \propto e^{-f(\theta)}$ constrained to a polytope $K:=\{\theta \in \mathbb{R}^d: A\theta \leq b\}$, where $A\in \mathbb{R}^{m\times d}$ and $b \in \mathbb{R}^m$.The fastest-known algorithm \cite{mangoubi2022faster} for the setting when $f$ is $O(1)$-Lipschitz or $O(1)$-smooth runs in roughly $O(md \times md^{\omega -1})$ arithmetic operations, where the $md^{\omega -1}$ term arises because each Markov chain step requires computing a matrix inversion and determinant (here $\omega \approx 2.37$ is the matrix multiplication constant). We present a nearly-optimal implementation of this Markov chain with per-step complexity which is roughly the number of non-zero entries of $A$ while the number of Markov chain steps remains the same. The key technical ingredients are 1) to show that the matrices that arise in this Dikin walk change slowly, 2) to deploy efficient linear solvers that can leverage this slow change to speed up matrix inversion by using information computed in previous steps, and 3) to speed up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor series-based estimator.

What problem does this paper attempt to address?

### The problem the paper attempts to solve The paper aims to solve the problem of efficient sampling from a log - concave distribution defined on a polyhedron $K$. Specifically, given a polyhedron $K :=\{\theta\in\mathbb{R}^d:A\theta\leq b\}$, where $A\in\mathbb{R}^{m\times d}$ and $b\in\mathbb{R}^m$, and a convex function $f:K\rightarrow\mathbb{R}$, the goal is to sample a point $\theta\in K$ from the distribution $\pi(\theta)\propto e^{-f(\theta)}$. This problem is important in many applications, including Bayesian inference, differential privacy optimization, and integration. For example, in Bayesian Lasso logistic regression, $f(\theta)=\sum_{i = 1}^n\ell(\theta^{\top}x_i)$, where $\ell$ is the logistic loss function, $x_i$ are data points and $\|x_i\|_2\leq1$, and $K =\{\theta\in\mathbb{R}^d:\|\theta\|_1\leq O(1)\}$ is an $\ell_1$-ball (see references [32, 36, 17]). ### Main contributions The main contribution of the paper is to propose an almost optimal implementation method, reducing the per - step complexity to approximately the number of non - zero elements $\text{nnz}(A)$ of the matrix $A$, while keeping the number of Markov chain steps unchanged. The key technical components include: 1. **Prove that the matrix changes slowly in the Dikin walk**: Show that in each iteration, the matrix in the Dikin walk changes very slowly. 2. **Use an efficient linear solver**: Utilize this slow change to accelerate matrix inversion by using the information calculated in the previous step. 3. **Accelerate the determinant calculation in the Metropolis filtering step**: Accelerate the determinant calculation through an estimator based on the stochastic Taylor series. These improvements directly improve the running time of sampling from the Gibbs distribution constrained to a polyhedron in applications such as Bayesian statistics and privacy optimization. ### Specific results According to the main result (Theorem 2.1) of the paper, the proposed algorithm has a significant efficiency improvement in the following cases: - When $f$ is $L$-Lipschitz or $\beta$-smooth, the sampling complexity is respectively: - $\tilde{O}((md + dL^2R^2)\log(w/\delta))\times(\text{nnz}(A)+d^2+T_f)$ arithmetic operations for the $L$-Lipschitz function $f$; - $\tilde{O}((md + d\beta R^2)\log(w/\delta))\times(\text{nnz}(A)+d^2+T_f)$ arithmetic operations for the $\beta$-smooth function $f$. Compared with previous algorithms, the method in this paper significantly reduces the running time in many cases, especially when $T_f=\Omega(d^2)$, the improvement is at least $d^{\omega - 2}$. If $A$ is a sparse matrix (i.e., $\text{nnz}(A)=O(m)$), the improvement is more obvious, depending on the relationship between $m$ and $d$. ### Application examples In the example of Bayesian Lasso logistic regression, when $f$ is $\beta$-smooth and $L$-Lip

Faster Sampling from Log-Concave Densities over Polytopes via Efficient Linear Solvers

Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk

Gaussian Cooling and Dikin Walks: The Interior-Point Method for Logconcave Sampling

Sampling with Barriers: Faster Mixing via Lewis Weights

Faster high-accuracy log-concave sampling via algorithmic warm starts

Efficient Sampling from Time-Varying Log-Concave Distributions

Double Randomized Underdamped Langevin with Dimension-Independent Convergence Guarantee.

Query lower bounds for log-concave sampling

Linear-memory and Decomposition-invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes

Fast Convergence of Langevin Dynamics on Manifold: Geodesics meet Log-Sobolev

Adaptive complexity of log-concave sampling

Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling

Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel

Constrained Sampling with Primal-Dual Langevin Monte Carlo

PolytopeWalk: Sparse MCMC Sampling over Polytopes

Fast parallel sampling under isoperimetry

Sampling and Integration of Logconcave Functions by Algorithmic Diffusion

Statistical-Computational Trade-offs for Density Estimation

PolyCD: Optimization via Cycling through the Vertices of a Polytope

Log-Concave Sampling on Compact Supports: A Versatile Proximal Framework

Sampling-Based Methods for Multi-Block Optimization Problems over Transport Polytopes