A dual-rule encoding DNA storage system using chaotic mapping to control GC content

Xuncai Zhang,Baonan Qi,Ying Niu
DOI: https://doi.org/10.1093/bioinformatics/btae113
IF: 5.8
2024-02-28
Bioinformatics
Abstract:Abstract Motivation DNA as a novel storage medium is considered an effective solution to the world's growing demand for information due to its high density and long-lasting reliability. However, early coding schemes ignored the biologically constrained nature of DNA sequences in pursuit of high density, leading to DNA synthesis and sequencing difficulties. This paper proposes a novel DNA storage coding scheme. The system encodes half of the binary data using each of the two GC-content complementary encoding rules to obtain a DNA sequence. Results After simulating the encoding of representative document and image file formats, a DNA sequence strictly conforming to biological constraints was obtained, reaching a coding potential of 1.66 bit/nt. In the decoding process, a mechanism to prevent error propagation was introduced. The simulation results demonstrate that by adding Reed-Solomon code, 90% of the data can still be recovered after introducing a 2% error, proving that the proposed DNA storage scheme has high robustness and reliability
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?