Abstract: Next-generation sequencing (NGS) technologies, such as Illumina/Solexa, ABI/SOLiD, and Roche/454 pyrosequencing, are revolutionizing the acquisition of genomic data at relatively low cost. They are rapidly changing the approach to complex genomic studies and opening the way to the development of personalized drugs and personalized medicine. NGS platforms use massive-throughput sequencing to obtain relatively short reads, and they generate enormous datasets: even small genomic projects may produce terabytes of data. New computational methods are therefore needed to analyze a wide range of genetic information and to assist data interpretation and downstream applications, including high-throughput polymorphism detection, comparative genomics, prediction of gene function and protein structure, transcriptome analysis, mutation detection and confirmation, genome mapping, and drug design. The creation of such large-scale datasets poses a great computational challenge, and it is imperative to improve software pipelines so that genome data can be analyzed more efficiently. Many new computational methods have already been proposed to cope with big biological data, especially NGS sequence data. Successful bioinformatics applications of these methods to NGS data have produced many scientific results, which encourages biologists to adopt novel computing technologies. The research papers selected for this special issue represent recent progress in several areas, including theoretical studies, novel algorithms, high-performance computing technologies, and improvements to existing methods and algorithms. These papers not only provide novel ideas and state-of-the-art technologies in the field but also stimulate future research on next-generation sequencing data analysis and its applications.

Massive Genomic Data Processing and Deep Analysis

Biological Big Bytes: Integrative Analysis of Large Biological Datasets

Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges

Computational Strategies for Scalable Genomics Analysis

A Fully Integrated End-to-End Genome Analysis Accelerator for Next-Generation Sequencing

Analyzing large-scale DNA Sequences on Multi-core Architectures

Big Data Technology Accelerate Genomics Precision Medicine

MUSIC: A Hybrid Computing Environment for Burrows-Wheeler Alignment for Massive Amount of Short Read Sequence Data

The BIG Data Center: from deposition to integration to translation

Analyzing Large Microbiome Datasets Using Machine Learning and Big Data

Big Data access and infrastructure for modern biology: case studies in data repository utility

Parallel-META 2.0: Enhanced Metagenomic Data Analysis with Functional Annotation, High Performance Computing and Advanced Visualization

Big Data, Big Challenges

The parallelism motifs of genomic data analysis

Bioinformatics Methods for High-Throughput DNA Sequencing Data

Compression of Structured High-Throughput Sequencing Data

Novel Computational Technologies for Next-Generation Sequencing Data Analysis and Their Applications

Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery

Biostatistical Aspects of Whole Genome Sequencing Studies: Preprocessing and Quality Control

Acceleration and automation of genomic data analysis to meet corporate compliance standards using advanced cloud components

A Divide-and-Conquer Approach to Large-Scale Evolutionary Analysis of Single-Cell DNA Data