Abstract:

Background: Next generation sequencing (NGS) technology offers ultra-high throughput, but its reads are remarkably short compared with those of conventional Sanger sequencing. Paired-end NGS can computationally extend the read length, but the inherent gaps between the paired ends cause considerable practical inconvenience. Now that Illumina paired-end sequencing can read both ends of 600 bp or even 800 bp DNA fragments, how to fill in the gaps between paired ends to produce accurate long reads is an intriguing but challenging problem.

Results: We have developed a new technology, referred to as pseudo-Sanger (PS) sequencing. It fills in the gaps between paired ends and can generate near error-free sequences that are equivalent in length to conventional Sanger reads while retaining the high throughput of NGS. The major novelty of the PS method is that gap filling is based on local assembly of paired-end reads that overlap with either end of the pair. As a result, gaps in repetitive genomic regions can be filled in correctly. PS sequencing starts with short reads from NGS platforms, using a series of paired-end libraries of stepwise decreasing insert sizes. A computational method then transforms these special paired-end reads into long and near error-free PS sequences, whose length corresponds to the largest insert size in the series. The PS construction has three advantages over untransformed reads: gap filling, error correction and heterozygote tolerance. Among the many applications of the PS construction is de novo genome assembly, which we tested in this study. Assembly of PS reads from a non-isogenic strain of Drosophila melanogaster yields an N50 contig of 190 kb, a 5-fold improvement over the existing de novo assembly methods and a 3-fold advantage over the assembly of long reads from 454 sequencing.

Conclusions: Our method generated near error-free long reads from NGS paired-end sequencing. We demonstrated that de novo assembly benefits substantially from these Sanger-like reads. In addition, the long reads could be applied to tasks such as structural variation detection and metagenomics.
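
The assembly result above is reported as an N50 contig length. For readers unfamiliar with the metric, the following is a minimal sketch of how N50 is conventionally computed: it is the contig length at which the contigs of that length or longer together cover at least half of the total assembled length. The function name `n50` and the example lengths are illustrative assumptions and do not come from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper, not code from the study: N50 is the length L such
    that contigs of length >= L account for at least half of the total
    assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    # Walk through the contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2

    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb: total 290, half 145.
# 80 + 70 = 150 >= 145, so the N50 is 70.
print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends on the total assembled length, comparisons such as the 5-fold and 3-fold improvements quoted above are most meaningful between assemblies of the same genome.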

Pseudo-Sanger Sequencing: Massively Parallel Production of Long and Near Error-Free Reads Using NGS Technology
