Abstract: Oblivious dimension reduction, à la the Johnson-Lindenstrauss (JL) Lemma, is a fundamental approach for processing high-dimensional data. We study this approach for Uniform Facility Location (UFL) on a Euclidean input $X\subset\mathbb{R}^d$, where facilities can lie in the ambient space (not restricted to $X$). Our main result is that target dimension $m=\tilde{O}(\epsilon^{-2}\mathrm{ddim})$ suffices to $(1+\epsilon)$-approximate the optimal value of UFL on inputs whose doubling dimension is bounded by $\mathrm{ddim}$. This significantly improves over previous results, which could only achieve $O(1)$-approximation [Narayanan, Silwal, Indyk, and Zamir, ICML 2021] or dimension $m=O(\epsilon^{-2}\log n)$ for $n=|X|$, which follows from [Makarychev, Makarychev, and Razenshteyn, STOC 2019]. Our oblivious dimension reduction has immediate implications for streaming and offline algorithms, by employing known algorithms for low dimension. In dynamic geometric streams, it implies a $(1+\epsilon)$-approximation algorithm that uses $O(\epsilon^{-1}\log n)^{\tilde{O}(\mathrm{ddim}/\epsilon^{2})}$ bits of space, which is the first streaming algorithm for UFL to utilize the doubling dimension. In the offline setting, it implies a $(1+\epsilon)$-approximation algorithm, which we further refine to run in time $( (1/\epsilon)^{\tilde{O}(\mathrm{ddim})} d + 2^{(1/\epsilon)^{\tilde{O}(\mathrm{ddim})}}) \cdot \tilde{O}(n)$. Prior work has a similar running time but requires some restriction on the facilities [Cohen-Addad, Feldmann and Saulpic, JACM 2021]. Our main technical contribution is a fast procedure to decompose an input $X$ into several $k$-median instances for small $k$. This decomposition is inspired by, but has several significant differences from, [Czumaj, Lammersen, Monemizadeh and Sohler, SODA 2013], and is key to both our dimension reduction and our PTAS.

What problem does this paper attempt to address?

The paper asks how far the dimension of a Uniform Facility Location (UFL) instance in Euclidean space can be reduced, when the input has bounded doubling dimension, while still preserving the optimal value up to a $(1+\varepsilon)$ factor.

### Specific problem description

1. **Background and challenges**
   - **High-dimensional data processing**: High-dimensional data is usually handled by first reducing its dimension. The Johnson-Lindenstrauss (JL) lemma is the standard oblivious tool for this, but its generic guarantee needs target dimension $O(\varepsilon^{-2}\log n)$, which grows with the number of points $n$.
   - **UFL problem**: Given a set of data points $X\subset\mathbb{R}^d$ and an opening cost $f > 0$, the goal is to find a set of facilities $F\subset\mathbb{R}^d$ that minimizes the cost
     \[ \text{cost}(X,F):=f\cdot|F|+\sum_{x\in X}\text{dist}(x,F), \]
     where $\text{dist}(x,F)=\min_{y\in F}\|x - y\|_2$.
2. **Limitations of existing methods**
   - **Previous dimension-reduction results**: For UFL, previous results either achieve only an $O(1)$-approximation [NSIZ21] or require the relatively high target dimension $m = O(\varepsilon^{-2}\log n)$ [MMR19].
   - **Impact of doubling dimension**: When the input has low doubling dimension, one would like a target dimension that depends on $\text{ddim}(X)$ rather than on $n$ while still obtaining a $(1+\varepsilon)$-approximation. Previous results did not achieve both at once.

### Main contributions of the paper

1. **New dimension-reduction results**
   - The paper shows that an oblivious, JL-style random linear map with target dimension $m=\tilde{O}(\varepsilon^{-2}\text{ddim}(X))$ suffices to $(1 + \varepsilon)$-approximate the optimal UFL value. Here, $\text{ddim}(X)$ is the doubling dimension of the input data.
   - This is a significant improvement over previous results, especially when the doubling dimension is low.
2. **Theoretical and algorithmic implications**
   - **Offline algorithm**: Reducing the high-dimensional problem to a low-dimensional one yields a $(1 + \varepsilon)$-approximation algorithm, which the paper refines to run in time $\left((1/\varepsilon)^{\tilde{O}(\text{ddim})} d + 2^{(1/\varepsilon)^{\tilde{O}(\text{ddim})}}\right)\cdot\tilde{O}(n)$.
   - **Streaming algorithm**: In dynamic geometric streams, the result implies a $(1 + \varepsilon)$-approximation algorithm that uses $O(\varepsilon^{-1}\log n)^{\tilde{O}(\text{ddim}/\varepsilon^2)}$ bits of space.
3. **Technical contributions**
   - **New decomposition procedure**: The paper introduces a fast procedure that decomposes a UFL input $X$ into several $k$-median instances for small $k$. It is inspired by, but differs significantly from, the decomposition of [Czumaj, Lammersen, Monemizadeh and Sohler, SODA 2013], and underlies both the dimension reduction and the PTAS.
   - **Probability guarantee**: The analysis of the random linear map shows that the optimal value after dimension reduction stays within a $(1+\varepsilon)$ factor of the original with high probability.

### Summary

This paper addresses the challenges of solving the UFL problem in high-dimensional space with bounded doubling dimension by introducing new dimension-reduction techniques and metric decomposition methods, and provides more...
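
To make the objective and the notion of an oblivious linear map concrete, here is a minimal Python sketch. It evaluates $\text{cost}(X,F)$ exactly as defined above and applies a Gaussian random matrix, one standard JL-style construction; the abstract does not say which construction the paper analyzes, so this choice is an assumption. The function names (`ufl_cost`, `gaussian_map`, `project`) and the toy parameters are hypothetical and chosen only for illustration. The sketch compares the cost of one fixed facility set before and after projection. It does not compute the optimal UFL value, which is what the paper's guarantee concerns.

```python
import math
import random


def dist_to_set(x, facilities):
    """Euclidean distance from point x to its nearest facility."""
    return min(math.dist(x, y) for y in facilities)


def ufl_cost(points, facilities, opening_cost):
    """cost(X, F) = f * |F| + sum over x in X of dist(x, F)."""
    return opening_cost * len(facilities) + sum(
        dist_to_set(x, facilities) for x in points
    )


def gaussian_map(d, m, rng):
    """m x d matrix with i.i.d. N(0, 1/m) entries (assumed JL-style map)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]


def project(matrix, x):
    """Apply the linear map to a single point."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


if __name__ == "__main__":
    rng = random.Random(0)
    d, m, n, f = 200, 40, 60, 5.0  # toy sizes, not the paper's bounds

    # Toy input: two well-separated clusters with small Gaussian noise.
    centers = [[0.0] * d, [10.0] + [0.0] * (d - 1)]
    X = [[c + rng.gauss(0.0, 0.1) for c in centers[i % 2]] for i in range(n)]
    F = centers  # one fixed candidate facility set, not an optimal one

    G = gaussian_map(d, m, rng)  # drawn without looking at X (oblivious)
    before = ufl_cost(X, F, f)
    after = ufl_cost([project(G, x) for x in X], [project(G, y) for y in F], f)
    print("cost in R^d:", before)
    print("cost in R^m:", after)
```

The map is drawn independently of the data, which is what makes the reduction oblivious and is why it can be applied point by point in a stream.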

Near-Optimal Dimension Reduction for Facility Location

Moderate Dimension Reduction for $k$-Center Clustering

Parallel Approximation Algorithms for Facility-Location Problems

On Probabilistic Embeddings in Optimal Dimension Reduction

Local Feature Discriminant Projection

On Facility Location Problem in the Local Differential Privacy Model

Large-Scale Distributed Algorithms for Facility Location with Outliers

Facility Location Problem in Differential Privacy Model Revisited

Non-metric multicommodity and multilevel facility location

Super-Fast Distributed Algorithms for Metric Facility Location

Optimized Dimensionality Reduction for Moment-based Distributionally Robust Optimization

The Johnson-Lindenstrauss Lemma for Clustering and Subspace Approximation: From Coresets to Dimension Reduction

The Min-Dist Location Selection and Facility Replacement Queries

Sequential Competitive Facility Location: Exact and Approximate Algorithms

A Super-Fast Distributed Algorithm for Bipartite Metric Facility Location

A scalable solution for the extended multi-channel facility location problem

Robust sufficient dimension reduction via α-distance covariance

Improved Lower Bound for Differentially Private Facility Location

An O(loglog n)-Approximation for Submodular Facility Location

Optimality of the Johnson-Lindenstrauss Dimensionality Reduction for Practical Measures

Streaming Euclidean Max-Cut: Dimension Vs Data Reduction