DBTRG: De Bruijn Trim Rotation Graph Encoding for Reliable DNA Storage

Yunzhu Zhao, Ben Cao, Penghao Wang, Kun Wang, Bin Wang
DOI: https://doi.org/10.1016/j.csbj.2023.09.004
IF: 6.155
2023-09-13
Computational and Structural Biotechnology Journal
Highlights:
• The XOR of a dynamic binary sequence with the original binary sequence is partitioned into k-mers, which construct the De Bruijn Trim graph.
• Deleting a repeating base pair in a connected node ensures base balance and diversity and reduces the occurrence probability of undesired motifs.
• The rotating tree algorithm is modified to satisfy a homopolymer length constraint of 2.
• Error-correcting capability is improved through the use of RS codes.
Abstract: DNA is a high-density, long-term stable, and scalable storage medium that can meet the increased demands on storage media resulting from the exponential growth of data. Existing DNA storage encoding schemes tend to achieve high-density storage but do not fully consider the local and global stability of DNA sequences or the read and write accuracy of the stored information. To address these problems, this article presents a graph-based De Bruijn Trim Rotation Graph (DBTRG) encoding scheme. Through XOR between the proposed dynamic binary sequence and the original binary sequence, k-mers can be divided into the De Bruijn Trim graph, and the stored information can be compressed according to the overlapping relationship. The simulated experimental results show that DBTRG ensures base balance and diversity, reduces the likelihood of undesired motifs, and improves the stability of DNA storage and data recovery. Furthermore, DBTRG maintains an encoding rate of 1.92 while storing 510 KB images, and it introduces novel approaches and concepts for DNA storage encoding methods.
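The XOR and k-mer steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the dynamic binary sequence is generated, so a seeded pseudo-random mask is assumed here as a stand-in, and the graph built is the textbook De Bruijn graph (nodes are (k-1)-mers, edges are k-mers), without the trimming or rotation steps of DBTRG. The names xor_with_mask, kmers, and de_bruijn_edges are hypothetical.

```python
import random
from collections import defaultdict


def xor_with_mask(bits: str, seed: int = 0) -> str:
    """XOR a bit string with a seeded pseudo-random mask.

    Assumption: the mask is only a stand-in for the paper's dynamic
    binary sequence, whose construction the abstract does not give.
    """
    rng = random.Random(seed)
    return "".join(str(int(b) ^ rng.getrandbits(1)) for b in bits)


def kmers(seq: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_edges(seq: str, k: int) -> dict[str, list[str]]:
    """Build a textbook De Bruijn graph as an adjacency list.

    Each k-mer contributes one edge from its (k-1)-prefix to its
    (k-1)-suffix, so consecutive k-mers share an overlapping node.
    """
    graph = defaultdict(list)
    for kmer in kmers(seq, k):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    payload = "0110100111"
    masked = xor_with_mask(payload, seed=7)
    # XOR with the same mask is its own inverse, so decoding reuses the seed.
    assert xor_with_mask(masked, seed=7) == payload
    print(de_bruijn_edges(masked, k=3))
```

Because XOR with a fixed mask is self-inverse, the decoder only needs the same seed to recover the original bits; the overlap between consecutive k-mers is what makes the compressed graph representation possible.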
biochemistry & molecular biology
What problem does this paper attempt to address?