Capturing the complexity of topologically associating domains through multi-feature optimization

Natalie Sauerwald,Carl Kingsford
DOI: https://doi.org/10.1101/2021.01.04.425264
2021-01-05
Abstract:Abstract The three-dimensional structure of human chromosomes is tied to gene regulation and replication timing, but there is still a lack of consensus on the computational and biological definitions for chromosomal substructures such as topologically associating domains (TADs). TADs are described and identified by various computational properties leading to different TAD sets with varying compatibility with biological properties such as boundary occupancy of structural proteins. We unify many of these computational and biological targets into one algorithmic framework that jointly maximizes several computational TAD definitions and optimizes TAD selection for a quantifiable biological property. Using this framework, we explore the variability of TAD sets optimized for six different desirable properties of TAD sets: high occupancy of CTCF, RAD21, and H3K36me3 at boundaries, reproducibility between replicates, high intra- vs inter-TAD difference in contact frequencies, and many CTCF binding sites at boundaries. The compatibility of these biological targets varies by cell type, and our results suggest that these properties are better reflected as subpopulations or families of TADs rather than a singular TAD set fitting all TAD definitions and properties. We explore the properties that produce similar TAD sets (reproducibility and inter- vs intra-TAD difference, for example) and those that lead to very different TADs (such as CTCF binding sites and inter- vs intra-TAD contact frequency difference).
What problem does this paper attempt to address?