Contextualising transcription factor binding during embryogenesis using natural sequence variation

Olga M Sigalova, Mattia Forneris, Frosina Stojanovska, Bingqing Zhao, Rebecca R Viales, Adam Rabinowitz, Fayrouz Hamal, Benoit Ballester, Judith B Zaugg, Eileen EM Furlong
DOI: https://doi.org/10.1101/2024.10.24.619975
2024-10-24
Abstract: Understanding how genetic variation impacts transcription factor (TF) binding remains a major challenge, limiting our ability to model disease-associated variants. Here, we used a highly controlled system of F1 crosses with extensive genetic diversity to profile allele-specific binding of four TFs at several embryonic time-points, using Drosophila as a model. Using a combined haplotype test, we identified 9-18% of TF-bound regions as impacted by genetic variation. By expanding WASP (a tool for allele-specific read mapping) to examine INDELs, we increased detection of allele-imbalanced (AI) peaks by 30-50%. This fine-grained mutagenesis could reconstruct functionalized binding motifs of all factors. To prioritise potential causal variants, we trained a convolutional neural network (Basenji) to predict TF binding from DNA sequence. The model could accurately predict experimental AI for strong-effect variants, providing a mechanistic interpretation for how genetic variation impacted TF binding. This revealed unexpected relationships between TFs, including potential cooperative pairs, and mechanisms of tissue-specific recruitment of the ubiquitous factor CTCF.
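To illustrate what allelic imbalance at a single heterozygous site means, the sketch below applies an exact two-sided binomial test to reference and alternate read counts. This is a deliberately simplified stand-in and not the combined haplotype test used in the study, which models more than per-site allelic counts. The function names, the example counts, and the 0.5 null ratio are assumptions made for illustration only.

```python
from math import exp, lgamma, log


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele (hypothetical helper)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumes each read is an independent draw with probability p_ref of
    carrying the reference allele (0 < p_ref < 1). Real data are usually
    overdispersed, so this simple test tends to be anti-conservative.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def log_pmf(k: int) -> float:
        # Log of the binomial probability mass, computed via lgamma
        # so that large read depths do not overflow.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return log_choose + k * log(p_ref) + (n - k) * log(1.0 - p_ref)

    pmf = [exp(log_pmf(k)) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the
    # observed one; the small tolerance guards against rounding error.
    p_value = sum(x for x in pmf if x <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts at one heterozygous site inside a TF peak.
    ref, alt = 42, 18
    print(f"allelic ratio = {allelic_ratio(ref, alt):.2f}")
    print(f"p-value       = {binomial_two_sided_p(ref, alt):.4g}")
```

In practice, per-site p-values like this would still need correction for multiple testing and for reference mapping bias, which is the problem that tools such as WASP address.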
Genetics