Abstract: In this paper, we prove that with high probability, random Reed-Solomon codes approach the half-Singleton bound (the optimal rate versus error tradeoff for linear insdel codes) with linear-sized alphabets. More precisely, we prove that, for any $\varepsilon>0$ and positive integers $n$ and $k$, with high probability, random Reed-Solomon codes of length $n$ and dimension $k$ can correct $(1-\varepsilon)n-2k+1$ adversarial insdel errors over alphabets of size $n+2^{\mathsf{poly}(1/\varepsilon)}k$. This significantly improves upon the alphabet size demonstrated in the work of Con, Shpilka, and Tamo (IEEE TIT, 2023), who showed the existence of Reed-Solomon codes with exponential alphabet size $\widetilde O\left(\binom{n}{2k-1}^2\right)$ precisely achieving the half-Singleton bound. Our methods are inspired by recent works on list-decoding Reed-Solomon codes. Brakensiek-Gopi-Makam (STOC 2023) showed that random Reed-Solomon codes are list-decodable up to capacity with exponential-sized alphabets, and Guo-Zhang (FOCS 2023) and Alrabiah-Guruswami-Li (STOC 2024) improved the alphabet size to linear. We achieve a similar alphabet-size reduction by similarly establishing strong bounds on the probability that certain random rectangular matrices are full rank. To accomplish this in our insdel context, our proof combines the random matrix techniques from list-decoding with structural properties of Longest Common Subsequences.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is whether random Reed-Solomon codes can approach the half-Singleton bound under the insertion-deletion (insdel) error model, that is, whether these codes can correct a near-optimal number of insdel errors with high probability over a linear-sized alphabet. Specifically, the goal of the paper is to prove that for any $\varepsilon>0$ and positive integers $n$ and $k$, a random Reed-Solomon code of length $n$ and dimension $k$ can correct $(1-\varepsilon)n-2k+1$ adversarial insdel errors with high probability when the alphabet size is $n+2^{\mathsf{poly}(1/\varepsilon)}k$. This significantly improves on the work of Con, Shpilka, and Tamo (IEEE TIT, 2023), which shows that there exist Reed-Solomon codes that exactly reach the half-Singleton bound over an exponential-sized alphabet.

### Background and Motivation

- **Error-Correcting Codes**: Error-correcting codes (ECC) aim to recover the original information from corrupted data. Common error models include substitution errors and erasure errors, while synchronization errors (such as insertion and deletion errors) are more complex.
- **Reed-Solomon Codes**: Reed-Solomon codes are a widely used type of linear code, usually used to correct substitution errors. However, their performance in handling insdel errors is not clear.
- **Linear Codes vs. Non-linear Codes**: The performance of linear codes in handling insdel errors is usually not as good as that of non-linear codes, but linear codes have the advantages of compact representation and efficient encoding and decoding algorithms. Therefore, it is still very important to study the performance of linear codes under the insdel model.
### Main Contributions of the Paper

- **Main Result**: The paper proves that, with high probability, random Reed-Solomon codes over a linear-sized alphabet approach the half-Singleton bound and correct $(1-\varepsilon)n-2k+1$ insdel errors.
- **Method**: The paper draws on recent work on list-decoding Reed-Solomon codes and establishes strong bounds on the probability that certain random rectangular matrices are full rank. In addition, the paper uses structural properties of the longest common subsequence to adapt these techniques to the insdel error model.

### Technical Details

- **V-matrix**: The paper defines a V-matrix $V_{k,\ell,I,J}$. If this matrix is not full rank, then there are two different codewords whose longest common subsequence exceeds a certain length, so the code cannot correct the target number of insdel errors.
- **Probability Amplification**: By introducing the "relaxation" parameter $\varepsilon$, the paper proves that the probability that the V-matrix is not full rank can be significantly reduced, so that the entire code can correct more insdel errors with high probability.

### Future Research Directions

1. **Lower Bound of Field Size**: Further study the minimum field size required for linear codes (not just Reed-Solomon codes) to approach the half-Singleton bound.
2. **Explicit Construction**: Provide an explicit construction of Reed-Solomon codes so that they can efficiently correct insdel errors.
3. **Decoding Algorithm**: Design an efficient decoding algorithm so that Reed-Solomon codes can handle insdel errors.
4. **Affine Codes**: Study the performance of affine codes under the insdel error model, especially whether they can approach or reach the Singleton bound.

Through these studies, the paper not only promotes theoretical understanding but also provides an important foundation for practical applications.
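The link between longest common subsequences and insdel correction mentioned under Technical Details can be checked by brute force on toy parameters. The sketch below is illustrative only and is not the paper's proof technique: it relies on the standard fact that a length-$n$ code corrects $t$ insdel errors exactly when every pair of distinct codewords has a longest common subsequence shorter than $n-t$. It assumes a prime field size `p`, and all function names (`lcs_length`, `rs_encode`, `random_rs_points`, `insdel_errors_correctable`) are hypothetical helpers introduced here. The enumeration is exponential in $k$, so it is only usable for very small codes.

```python
import random
from itertools import product


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rs_encode(msg, alphas, p):
    """Evaluate the polynomial with coefficients msg (lowest degree first)
    at each evaluation point in alphas, over the prime field F_p."""
    return tuple(
        sum(c * pow(a, i, p) for i, c in enumerate(msg)) % p for a in alphas
    )


def random_rs_points(n, p, seed=None):
    """Pick n distinct evaluation points from F_p uniformly at random (assumes n <= p)."""
    return tuple(random.Random(seed).sample(range(p), n))


def insdel_errors_correctable(alphas, k, p):
    """Brute force: largest t such that every pair of distinct codewords
    has LCS < n - t, i.e. t = n - (max pairwise LCS) - 1."""
    n = len(alphas)
    codewords = [rs_encode(m, alphas, p) for m in product(range(p), repeat=k)]
    max_lcs = 0
    for i, c in enumerate(codewords):
        for d in codewords[i + 1:]:
            max_lcs = max(max_lcs, lcs_length(c, d))
    return n - max_lcs - 1


if __name__ == "__main__":
    p, n, k = 7, 4, 2  # toy parameters chosen for illustration
    alphas = random_rs_points(n, p, seed=0)
    t = insdel_errors_correctable(alphas, k, p)
    print(f"points={alphas}: corrects {t} insdel errors; "
          f"half-Singleton limit n-2k+1 = {n - 2 * k + 1}")
```

For $k=1$ every codeword is constant, so distinct codewords share no symbols and the code corrects $n-1 = n-2k+1$ insdel errors, meeting the half-Singleton bound. For larger $k$ the computed value depends on the evaluation points, which is the quantity the paper controls for random points over a linear-sized alphabet.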

Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets

Randomly punctured Reed--Solomon codes achieve list-decoding capacity over linear-sized fields

Randomly Punctured Reed-Solomon Codes Achieve the List Decoding Capacity over Polynomial-Size Alphabets

List Decoding Reed-Solomon, Algebraic-Geometric, and Gabidulin Subcodes Up to the Singleton Bound.

Random Reed-Solomon Codes and Random Linear Codes are Locally Equivalent

Optimal Two-Dimensional Reed--Solomon Codes Correcting Insertions and Deletions

Improved Singleton bound on insertion-deletion codes and optimal constructions

Bounds and Constructions for Insertion and Deletion Codes

Improved List-Decodability of Reed--Solomon Codes via Tree Packings

Generic Reed-Solomon Codes Achieve List-decoding Capacity

Explicit Constructions of Two-Dimensional Reed-Solomon Codes in High Insertion and Deletion Noise Regime

Folded Codes from Function Field Towers and Improved Optimal Rate List Decoding

Randomized Decoding of Linearized Reed-Solomon Codes Beyond the Unique Decoding Radius

AG codes have no list-decoding friends: Approaching the generalized Singleton bound requires exponential alphabets

Improved List-Decodability and List-Recoverability of Reed–Solomon Codes via Tree Packings

Strict Half-Singleton Bound, Strict Direct Upper Bound for Linear Insertion-Deletion Codes and Optimal Codes

Explicit Folded Reed-Solomon and Multiplicity Codes Achieve Relaxed Generalized Singleton Bounds

On Relaxed Locally Decodable Codes for Hamming and Insertion-Deletion Errors

Efficient List-decoding of Polynomial Ideal Codes with Optimal List Size

Combinatorial list-decoding of Reed-Solomon codes beyond the Johnson radius