Exploring Contact Distance Distributions with Google Colaboratory

Ryuichiro Nakato
DOI: https://doi.org/10.1007/978-1-0716-4136-1_10
Abstract: Hi-C and Micro-C are three-dimensional (3D) genome assays that use high-throughput sequencing. In the analysis, the sequenced paired-end reads are mapped to a reference genome to generate a two-dimensional contact matrix, which is used to identify topologically associating domains (TADs), chromatin loops, and chromosomal compartments. The distance distribution of the mapped paired-end reads also provides insight into the 3D genome structure by highlighting global contact frequency patterns at distances indicative of loops, TADs, and compartments. This chapter presents a basic workflow for visualizing and analyzing contact distance distributions from Hi-C data. The workflow can be run on Google Colaboratory, which provides a ready-to-use Python environment accessible through a web browser. The notebook that demonstrates the workflow is available in the GitHub repository at https://github.com/rnakato/Springer_contact_distance_plot.
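As a rough illustration of what a contact distance distribution is, the following minimal sketch bins the genomic distance between the two mapped ends of each same-chromosome read pair and normalizes the counts by bin width. It is not taken from the chapter's notebook: the function name `contact_distance_distribution`, its parameters, the log-spaced binning, and the default distance range are all assumptions chosen for illustration.

```python
import numpy as np


def contact_distance_distribution(pos1, pos2, min_dist=1_000,
                                  max_dist=100_000_000, bins_per_decade=10):
    """Hypothetical helper: density of read-pair distances in log-spaced bins.

    pos1, pos2: mapped positions (bp) of the two ends of each read pair,
    assumed to lie on the same chromosome.
    Returns (bin_centers, density), where density integrates to 1 over
    the range [min_dist, max_dist).
    """
    pos1 = np.asarray(pos1, dtype=np.int64)
    pos2 = np.asarray(pos2, dtype=np.int64)

    # Genomic separation between the two ends of each pair.
    dist = np.abs(pos2 - pos1)

    # Keep only distances inside the range being analyzed.
    dist = dist[(dist >= min_dist) & (dist < max_dist)]

    # Log-spaced bin edges (an assumption; contact frequency spans
    # several orders of magnitude, so linear bins would be too coarse).
    n_bins = int(round(np.log10(max_dist / min_dist) * bins_per_decade))
    edges = np.logspace(np.log10(min_dist), np.log10(max_dist), n_bins + 1)

    counts, _ = np.histogram(dist, bins=edges)
    widths = np.diff(edges)
    total = counts.sum()

    # Divide by bin width so that wide bins at large distances are not
    # inflated, and by the total so the result is a probability density.
    if total > 0:
        density = counts / (widths * total)
    else:
        density = np.zeros(n_bins)

    # Geometric mean of the edges is the natural center on a log axis.
    centers = np.sqrt(edges[:-1] * edges[1:])
    return centers, density
```

The returned arrays can be drawn on log-log axes, for example with matplotlib's `plt.loglog(centers, density)`, to compare how contact frequency decays with distance across samples.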