Abstract: Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than occurring randomly across a genome, meiotic recombination takes place with higher frequencies in some genomic regions (the so-called 'hotspots') and with lower frequencies in others (the so-called 'coldspots'). Information about the hotspots and coldspots would therefore provide useful insights for in-depth study of both the mechanism of recombination and the process of genome evolution. So far, recombination regions have been determined mainly by experiments, which are both expensive and time-consuming. With the avalanche of genome sequences generated in the post-genomic age, automated methods for rapidly and effectively identifying recombination regions are highly desirable. In this study, a predictor called 'iRSpot-PseDNC' was developed for identifying recombination hotspots and coldspots. In the new predictor, DNA sequence samples are formulated by a novel feature vector, the so-called 'pseudo dinucleotide composition' (PseDNC), into which six local DNA structural properties are incorporated: three angular parameters (twist, tilt and roll) and three translational parameters (shift, slide and rise). The rigorous jackknife test showed that the overall success rate achieved by iRSpot-PseDNC was >82% in identifying recombination spots in Saccharomyces cerevisiae, indicating that the new predictor is promising or at least may become a complementary tool to the existing methods in this area. Although the benchmark data set used to train and test the current method was from S. cerevisiae, the basic approach can also be extended to other genomes. In particular, it has not escaped our notice that the PseDNC approach can also be used to study many other DNA-related problems.
As a user-friendly web-server, iRSpot-PseDNC is freely accessible at http://lin.uestc.edu.cn/server/iRSpot-PseDNC.
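
The abstract names the PseDNC feature vector but does not give its formula. The following Python sketch illustrates one way such a vector could be built, assuming the usual pseudo-component form: 16 normalized dinucleotide frequencies followed by `lam` weighted correlation factors computed from per-dinucleotide structural properties. The function names, the defaults `lam=3` and `w=0.05`, and especially the `TOY_PROPS` table are assumptions made for illustration; the table holds made-up numbers, not measured twist, tilt, roll, shift, slide or rise values.

```python
from itertools import product

BASES = "ACGT"
# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]

# PLACEHOLDER property table (hypothetical values, NOT real measurements):
# six made-up numbers per dinucleotide standing in for the six
# standardized structural properties named in the abstract.
TOY_PROPS = {
    d: tuple(float((i * (k + 1)) % 5) for k in range(6))
    for i, d in enumerate(DINUCLEOTIDES)
}


def dinucleotide_frequencies(seq):
    """Normalized occurrence frequency of each of the 16 dinucleotides."""
    counts = {d: 0 for d in DINUCLEOTIDES}
    n = len(seq) - 1  # number of overlapping dinucleotides
    for i in range(n):
        counts[seq[i:i + 2]] += 1
    return [counts[d] / n for d in DINUCLEOTIDES]


def correlation_factors(seq, props, lam):
    """Tier-1..tier-lam correlation factors along the sequence.

    Tier j averages, over all dinucleotide pairs that are j steps apart,
    the mean squared difference of their property values.
    """
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    thetas = []
    for j in range(1, lam + 1):
        n_pairs = len(dinucs) - j
        total = 0.0
        for i in range(n_pairs):
            a, b = props[dinucs[i]], props[dinucs[i + j]]
            total += sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        thetas.append(total / n_pairs)
    return thetas


def psednc(seq, props=TOY_PROPS, lam=3, w=0.05):
    """Return a (16 + lam)-dimensional PseDNC-style feature vector."""
    seq = seq.upper()
    if any(ch not in BASES for ch in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    if len(seq) < lam + 2:
        raise ValueError("sequence too short for the requested lam")
    freqs = dinucleotide_frequencies(seq)
    thetas = correlation_factors(seq, props, lam)
    denom = sum(freqs) + w * sum(thetas)
    # First 16 entries: local composition; last lam entries: sequence-order
    # information carried by the structural properties.
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


vector = psednc("ACGTACGTAC")
print(len(vector))
```

Under these assumptions the components of the vector sum to 1, and the weight `w` controls how much the structural correlation terms contribute relative to the plain dinucleotide composition. A vector of this kind could then be passed to any standard classifier and assessed by leave-one-out (jackknife) validation.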

Recombination spot identification Based on gapped k-mers

iRSpot-PseDNC: Identify Recombination Spots with Pseudo Dinucleotide Composition

Sequence Repetitiveness Quantification and De Novo Repeat Detection by Weighted K-Mer Coverage.

Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach

iRSpot-DACC: a computational predictor for recombination hot/cold spots identification based on dinucleotide-based auto-cross covariance

iRSpot-Pse6NC: Identifying Recombination Spots in Saccharomyces cerevisiae by Incorporating Hexamer Composition into General PseKNC

The prediction of Recombination Hotspot Based on Automated Machine Learning

Recombination Spots Prediction Using DNA Physical Properties in the Saccharomyces cerevisiae Genome

A Comparison and Assessment of Computational Method for Identifying Recombination Hotspots in Saccharomyces cerevisiae

A New Method for Detecting Human Recombination Hotspots and Its Applications to the HapMap ENCODE Data

Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae

Gene Prediction by the Noise-Assisted MEMD and Wavelet Transform for Identifying the Protein Coding Regions

iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou’s PseAAC to formulate DNA samples

Exploring the accuracy and limits of algorithms for localizing recombination breakpoints

A two-phase approach for detecting recombination in nucleotide sequences

Deep learning identifies and quantifies recombination hotspot determinants

RecombineX: A generalized computational framework for automatic high-throughput gamete genotyping and tetrad-based recombination analysis

Genetic K-Modes Based DNA Splice Site Adjacent Sequences Feature Analysis

Prediction of Trans-regulators of Recombination Hotspots in Mouse Genome

DeepGenGrep: a general deep learning-based predictor for multiple genomic signals and regions

iRO-3wPseKNC: identify DNA replication origins by three-window-based PseKNC