Abstract: A computer program has been developed for calculating clusters of binding sites of various transcription factors (TFs) from the genomic coordinates of ChIP-seq (chromatin immunoprecipitation sequencing) profile peaks. The statistical features of the distribution of transcription factor binding sites (TFBSs) in the mouse genome, obtained from ChIP-seq experiments in embryonic stem cells, are considered. Clusters containing at least four binding sites of various TFs are identified in the mouse genome, and their localization relative to the regulatory regions of genes is described. Two types of site colocalization are confirmed: clusters containing binding sites of Oct4, Nanog, and Sox2, located in distal regions, and clusters containing n-Myc and c-Myc binding sites, located mainly in the promoter regions of mouse genes. Analysis of new ChIP-seq data on the binding of the TFs Nr5a2, Tbx3, Cep, SRF, and USF1 in the same cell type confirmed the division of TFBS clusters into two types: those containing binding sites of the pluripotency regulators (Oct4, Nanog, and Sox2) and those lacking them. A computer program for the statistical processing of data on the location of sites in genes has also been developed; it uses experimental site localization data obtained by ChIP-seq in the mouse and human genomes. With this program, localization patterns of the binding sites of various TFs are detected. Within site clusters, the distances between the closest binding sites of the Oct4, Nanog, and Sox2 group and the binding sites of other factors are calculated; these distances serve as a basis for analyzing the joint binding of protein complexes to DNA. The fraction of ChIP-seq genomic regions that contain known TFBS nucleotide motifs is calculated, and the weight matrices for these motifs are recalculated. A correlation between the presence of motifs and ChIP-seq binding intensity is shown. The programs implementing these methods for assessing the clustering of binding sites of various TFs in new ChIP-seq data are available from the authors upon request.
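
The clustering and distance calculations summarized above can be illustrated with a minimal Python sketch. This is not the authors' program, which is available only on request. It assumes that each peak is reduced to a single coordinate, that neighboring sites are chained into one cluster when they lie within a hypothetical `max_gap` of 100 bp, and that "at least four binding sites of various TFs" means at least four distinct TFs. The function names and all coordinates are invented for illustration.

```python
from collections import defaultdict

# Pluripotency regulators named in the abstract.
PLURIPOTENCY = {"Oct4", "Nanog", "Sox2"}


def find_clusters(peaks, max_gap=100, min_tfs=4):
    """Chain sites on one chromosome that lie within max_gap bp of the
    previous site; keep chains with at least min_tfs distinct TFs.
    max_gap and min_tfs are assumptions, not values from the paper."""
    by_chrom = defaultdict(list)
    for chrom, pos, tf in peaks:
        by_chrom[chrom].append((pos, tf))

    clusters = []
    for chrom, sites in sorted(by_chrom.items()):
        sites.sort()
        current = [sites[0]]
        for site in sites[1:]:
            if site[0] - current[-1][0] <= max_gap:
                current.append(site)
            else:
                clusters.append((chrom, current))
                current = [site]
        clusters.append((chrom, current))

    return [
        (chrom, group)
        for chrom, group in clusters
        if len({tf for _, tf in group}) >= min_tfs
    ]


def has_pluripotency_site(group):
    """True if the cluster contains an Oct4, Nanog, or Sox2 site."""
    return any(tf in PLURIPOTENCY for _, tf in group)


def nearest_distances(group):
    """For each Oct4/Nanog/Sox2 site, the distance in bp to the closest
    site of any other factor in the same cluster."""
    core = [pos for pos, tf in group if tf in PLURIPOTENCY]
    other = [pos for pos, tf in group if tf not in PLURIPOTENCY]
    if not other:
        return []
    return [min(abs(p - q) for q in other) for p in core]


# Toy input: (chromosome, peak coordinate, TF). Coordinates are invented.
peaks = [
    ("chr1", 1000, "Oct4"), ("chr1", 1030, "Sox2"),
    ("chr1", 1060, "Nanog"), ("chr1", 1120, "Nr5a2"),
    ("chr1", 5000, "n-Myc"), ("chr1", 5040, "c-Myc"),
    ("chr1", 5080, "SRF"), ("chr1", 5100, "USF1"),
    ("chr2", 300, "Oct4"), ("chr2", 900, "Sox2"),
]

clusters = find_clusters(peaks)
for chrom, group in clusters:
    kind = "pluripotency" if has_pluripotency_site(group) else "other"
    print(chrom, group[0][0], group[-1][0], kind, nearest_distances(group))
```

Chaining by a fixed gap is only one possible definition of a cluster. A fixed-width window around each peak would be an equally reasonable assumption and would change which sites are grouped together.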
