Comparative atlas of genome-wide chromatin-associated protein co-occupancy

Shannon M. White, Belle A. Moyers, Tao Wang, Mark Mackiewicz, Annika K. Weimer, Fabian Grubert, Vivekanandan Ramalingam, Jay X. J. Luo, Lixia Jiang, Minyi Shi, Xinqiong Yang, Tristan Chou, Jie Zhai, Konor Von Kraut, Jessika Adrian, E. Christopher Partridge, Kristina Paul, Anshul Kundaje, Eric M. Mendenhall, Richard M. Myers, Michael P. Snyder
DOI: https://doi.org/10.1101/2024.12.17.628199
2024-12-20
Abstract: Accurate transcriptional regulation and chromatin dynamics require the coordinated activity of chromatin-associated proteins (CAPs) at distinct loci. While the combinatorial activity of a select set of CAPs has been examined previously, these studies are limited by the small number of proteins and cell types explored, making it difficult to identify global associations and to assess whether these associations are conserved across cell types. Here, we performed an integrative analysis of 270 CAP chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) experiments conducted in both K562 and HepG2 cancer cell lines and explored the relationship between cell identity and CAP co-association using three distinct approaches. We employed a machine learning algorithm to organize the genome-wide binding profiles into 56 and 70 interpretable co-association modules for HepG2 and K562 cell lines, respectively. We found that CAP co-association modules present in both cell lines are composed largely of transcription factors (TFs) from a single TF family and anchor to unique loci via lineage-specific factors. While enhancer-associated co-binding modules were largely composed of cell type-specific CAPs, we found regulatory activity at promoter-enhancer module contacts to be enriched for chromatin remodeling proteins. Additionally, we used colocalization information derived from co-association models in conjunction with neural network models of TF activity to identify high-confidence candidate TF cooperative pairs. Finally, by comparing CAP enrichment in high occupancy target (HOT) regions in K562 and HepG2 cell lines, we found that cell type-specific HOT sites, but not common HOT sites, are selectively enriched at high copy number loci.
Overall, this study uncovers principles of sequence-level and large-scale CAP genomic organization and demonstrates how this contributes to cell type-specific regulatory mechanisms and cellular functions.
Genomics