D2Sim: A Computational Simulator for Nanopore Sequencing based DNA Data Storage

Subhasiny Sankar, Wang Yixin, Md. Noor-A-Rahim, Erry Gunawan, Yong Liang Guan, Chueh Loo Poh
DOI: https://doi.org/10.1101/2024.03.17.585393
2024-03-17
Abstract: DNA data storage has gained significant attention due to its high storage density and durability. However, errors introduced during storage and reading compromise data integrity, prompting research into error correction strategies. Researchers have explored physical redundancy (multiple copies of the data) and logical redundancy (redundancy added by error-correcting codes) to mitigate these errors. Evaluating such designs and their reconstruction methods typically involves time-consuming and costly trials. To streamline this process, we designed D2Sim, a computational channel simulator for Nanopore sequencing-based DNA data storage. The simulator mimics real experiments, generating data at the receiver with the corresponding distribution and errors. Integrated with DeepSimulator, D2Sim outputs signals that closely resemble actual signals from Nanopore-based DNA storage. Comparative analysis shows that signals from the proposed simulator have 16.7% to 88.7% lower sample difference deviations than signals from DeepSimulator alone. This cost-effective and time-efficient tool facilitates the assessment of physical and logical redundancy for data reconstruction in DNA data storage without the need for real-time experiments.
Bioengineering