Quantitative analysis of ZFY and CTCF reveals dependent recognition of tandem zinc finger proteins
Zheng Zuo, Timothy Billings, Michael Walker, Petko M. Petkov, Gary D. Stormo, Polly M. Fordyce
DOI: https://doi.org/10.1101/637298
2019-05-15
Abstract: The human genome contains around 800 C2H2 zinc finger proteins (ZFPs), many of which are composed of long tandem arrays of zinc fingers. Current motif prediction models assume that longer finger arrays correspond to longer DNA-binding motifs and higher specificity. However, recent experimental efforts to identify ZFP binding sites in vivo contradict this assumption, with many ZFPs having short motifs. Here, we systematically test how multiple zinc fingers contribute to binding for three model ZFPs: Zinc Finger Y (ZFY), CTCF, and ZNF343. For ZFY, which contains 13 fingers, we quantitatively characterize binding specificity with several methods, including Affinity-seq, HT-SELEX, Spec-seq, and fluorescence anisotropy, and find evidence for ‘dependent recognition’, in which downstream fingers can recognize some extended motifs only in the presence of an intact core site. For the genomic insulator CTCF, additional high-throughput affinity measurements reveal that its upstream specificity profile depends on the strength of the core, violating presumed additivity and position independence. Moreover, the effect of different epigenetic modifications within the core site depends on the strength of the flanking upstream site, providing new insight into how the previously identified intellectual disability-causing and cancer-related mutant R567W disrupts upstream recognition and deregulates CTCF’s methylation sensitivity. Lastly, we use ZNF343 as an example to show that a simple iterative motif analysis strategy based on a small set of prefixed cores can reveal the dependent relationship between cores and upstream motifs. These results establish that the current underestimation of ZFP motif lengths is due to our lack of understanding of the intrinsic properties of tandem zinc finger recognition, including irregular motif structure, variable spacing, and dependent recognition between sub-motifs.
These results also motivate a need for better recognition models, beyond additive position weight matrices, to predict ZFP specificities, occupancies, and the molecular mechanisms of disease mutations.