Recovering a Message from an Incomplete Set of Noisy Fragments

Aditya Narayan Ravi,Alireza Vahid,Ilan Shomorony
2024-07-08
Abstract:We consider the problem of communicating over a channel that breaks the message block into fragments of random lengths, shuffles them out of order, and deletes a random fraction of the fragments. Such a channel is motivated by applications in molecular data storage and forensics, and we refer to it as the torn-paper channel. We characterize the capacity of this channel under arbitrary fragment length distributions and deletion probabilities. Precisely, we show that the capacity is given by a closed-form expression that can be interpreted as F - A, where F is the coverage fraction ,i.e., the fraction of the input codeword that is covered by output fragments, and A is an alignment cost incurred due to the lack of ordering in the output fragments. We then consider a noisy version of the problem, where the fragments are corrupted by binary symmetric noise. We derive upper and lower bounds to the capacity, both of which can be seen as F - A expressions. These bounds match for specific choices of fragment length distributions, and they are approximately tight in cases where there are not too many short fragments.
Information Theory
What problem does this paper attempt to address?