The challenge of chromatin model comparison and validation: a project from the first international 4D Nucleome Hackathon
Jedrzej Kubica, Sevastianos Korsak, Ariana Brenner Clerkin, David Kouril, Dvir Schirman, Anurupa Devi Yadavalli, Krzysztof Banecki, Michal Kadlof, Ben Busby, Dariusz Plewczynski
DOI: https://doi.org/10.1101/2024.10.02.616241
2024-10-03
Abstract: Computational modeling of chromatin structure is challenging because chromatin is hierarchically organized, reflecting diverse biophysical principles, and is inherently dynamic. The variety of methods for chromatin structure modeling, which rest on different approaches, assumptions and scales, points to the need for a comprehensive benchmark. This inspired us to conduct a project at the NIH-funded 4D Nucleome Hackathon, held on March 18-21, 2024 at the University of Washington in Seattle, USA. The hackathon provided a valuable opportunity to gather an international, multi-institutional and unbiased group of experts to discuss, understand and take on the challenges of chromatin model comparison and validation. These tasks seem straightforward in theory; in practice, however, they are difficult and ambiguous. To address them, we developed a bioinformatics workflow for chromatin model comparison and validation in which chromatin models are represented as distance matrices, and Spearman correlation coefficients are calculated between pairs of matrices to estimate the correlation between models, as well as between models and experimental data. During the 4-day hackathon, we tested our workflow on several distinct software packages for chromatin structure modeling and identified several challenges: 1) models capture different aspects of chromatin biophysics at different scales, which complicates their comparison; 2) expertise in biology, bioinformatics and physics is necessary to conduct comprehensive research on chromatin structure; 3) bioinformatics software, which is often developed in academic settings, frequently lacks sufficient support and documentation. Our work therefore contributes to advancing the modeling of the 3D organization of the human genome, while emphasizing the importance of establishing guidelines for software development and standardization.
Biology
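
The comparison step described in the abstract (distance matrices compared by Spearman correlation) can be illustrated with a minimal sketch. This is not the authors' workflow: it assumes each model is a list of 3D bead coordinates, that both models cover the same genomic bins in the same order, and that only the upper triangle of each distance matrix is compared. All function names and the toy coordinates are hypothetical.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D bead coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def upper_triangle(matrix):
    """Flatten the entries above the diagonal (each bead pair once)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman_between_models(coords_a, coords_b):
    """Spearman correlation = Pearson correlation of the ranked distances."""
    if len(coords_a) != len(coords_b):
        raise ValueError("models must have the same number of beads")
    dist_a = upper_triangle(distance_matrix(coords_a))
    dist_b = upper_triangle(distance_matrix(coords_b))
    return pearson(average_ranks(dist_a), average_ranks(dist_b))


# Toy example (assumed data): model_b is model_a scaled by a factor of 2,
# so the rank order of all pairwise distances is unchanged.
model_a = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 1)]
model_b = [(2 * x, 2 * y, 2 * z) for x, y, z in model_a]
rho = spearman_between_models(model_a, model_b)
print(f"Spearman correlation between models: {rho:.3f}")
```

Because Spearman correlation depends only on ranks, it is insensitive to a uniform rescaling of a model, which is one reason a rank-based measure is convenient when models are expressed in different distance units. The same function could be applied to a model and an experimentally derived distance matrix, provided both are binned identically.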