Genomic and epigenomic maps of mouse centromeres and pericentromeres

Gitika Chaudhry, Jingyue Chen, Lucy Snipes, Smriti Bahl, Jenika Packiaraj, Xuan Lin, Jitendra Thakur
DOI: https://doi.org/10.1101/2024.10.09.617447
2024-10-10
Abstract: Satellite DNA makes up ~11% of the mouse genome, predominantly located in centromeric and pericentric regions, which are crucial for chromosome segregation. While comprehensive assemblies of these regions have been established in the human genome, they are still lacking in the mouse genome. In this study, we used PacBio long-read sequencing, CUT&RUN sequencing, DNA methylation analysis, and RNA sequencing to generate genomic and epigenomic maps of these regions. We find that centromeric regions are primarily occupied by 120-mer Minor satellites, with other Minor Satellite length variants, 112-mers and 112-64-dimers, localized at centromere-pericentric junctions. Pericentromeric regions are mainly composed of homogeneous Major satellites, while pericentric-chromosomal junctions contain a higher density of divergent satellites. Additionally, the density of non-satellite repeats increases progressively from centromeres to pericentromeres, and further toward chromosomal arm junctions. We found that 120-mer Minor satellites in the core centromere are highly enriched with CENP-A, while the 112-mers and 112-64-dimers show lower CENP-A levels. Homogeneous Major satellites are more enriched with H3K9me3 heterochromatin, whereas divergent Major satellites are preferentially associated with H3K27me3. Furthermore, DNA methylation levels are lower in centromeres compared to pericentric regions. We also observed that only a small subset of satellites is transcribed into RNA, particularly regions exhibiting lower DNA methylation density. Our comprehensive assembly and characterization of the genomic and epigenomic landscape of mouse centromeric and pericentric regions have major implications for satellite biology and ongoing mouse telomere-to-telomere (T2T) assembly efforts.
Genomics
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the lack of complete genomic and epigenomic maps of centromeres and pericentromeres in the mouse genome. Specifically:

1. **Genome assembly problems**:
   - Although these regions have been comprehensively assembled in the human genome, detailed assemblies of mouse centromeres and pericentromeres are still lacking. These regions are rich in highly repetitive satellite DNA, which traditional short-read sequencing cannot assemble accurately.
   - By using PacBio long-read sequencing, the researchers aim to generate more accurate and complete assemblies of these regions.

2. **Epigenetic features**:
   - The researchers also aim to build epigenetic maps of these regions using several experimental techniques (CUT&RUN sequencing, DNA methylation analysis, and RNA sequencing). The maps cover chromatin features (such as CENP-A occupancy and the histone marks H3K9me3 and H3K27me3) and transcriptional activity.
   - These epigenetic features are crucial for understanding the functions of these regions, such as their roles in chromosome segregation and gene expression regulation.

3. **Diversity and functions of satellite DNA**:
   - Although mouse centromeres and pericentromeres are considered more homogeneous than those in humans, the researchers find that they still show some sequence variation. This variation may affect genome function and stability.
   - Through detailed analysis of the different types of satellite DNA (such as 120-mer Minor satellites, 112-mers and 112-64-dimers, and Major satellites) and their distribution patterns, the researchers aim to reveal the complexity and functional diversity of these regions.

4. **Structural rearrangements and non-satellite repeat sequences**:
   - The study also examines structural rearrangement events (such as inversions, insertions, etc.) within these regions and the distribution of non-satellite repeat sequences (such as LTR retrotransposons, LINEs, SINEs, etc.). This information helps in understanding genome evolution and stability.

5. **Contribution to mouse telomere-to-telomere (T2T) assembly work**:
   - The results provide data to support the ongoing mouse telomere-to-telomere (T2T) assembly effort, which aims to construct a complete, gap-free mouse genome.

In summary, by integrating long-read sequencing, epigenomic profiling, and bioinformatics methods, this paper aims to fill the gaps in the assembly and epigenomic maps of mouse centromeres and pericentromeres, providing new insight into the biological functions and evolution of these regions.
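
The abstract reports that the density of non-satellite repeats rises from centromeres to pericentromeres and again toward the chromosomal arm junctions. A comparison of this kind could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the half-open coordinate convention, and all toy coordinates are assumptions made up for the example, and density is defined here simply as the fraction of bases in a region covered by at least one annotated repeat.

```python
from typing import Dict, List, Tuple

# Assumption: intervals are half-open [start, end) in contig coordinates.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(region: Interval, repeats: List[Interval]) -> float:
    """Fraction of bases in `region` covered by at least one repeat."""
    r_start, r_end = region
    if r_end <= r_start:
        raise ValueError("region must have positive length")
    covered = 0
    for start, end in merge_intervals(repeats):
        # Clip each merged repeat to the region before counting bases.
        overlap = min(end, r_end) - max(start, r_start)
        if overlap > 0:
            covered += overlap
    return covered / (r_end - r_start)


def repeat_density_by_region(
    regions: Dict[str, Interval], repeats: List[Interval]
) -> Dict[str, float]:
    """Compute the covered fraction for each named region."""
    return {name: covered_fraction(iv, repeats) for name, iv in regions.items()}


# Hypothetical toy data: region boundaries and non-satellite repeat
# annotations (for example LINE/SINE/LTR hits) on one assembled contig.
toy_regions = {
    "centromere": (0, 1000),
    "pericentromere": (1000, 3000),
    "arm_junction": (3000, 4000),
}
toy_repeats = [(100, 150), (1200, 1400), (1350, 1500), (3000, 3300), (3200, 3600)]

if __name__ == "__main__":
    for name, density in repeat_density_by_region(toy_regions, toy_repeats).items():
        print(f"{name}: {density:.2f}")
```

Merging intervals before counting matters because repeat annotations often overlap, and summing raw interval lengths would overstate coverage. With real data, the region boundaries and repeat annotations would come from the assembly and a repeat annotation tool rather than from hard-coded lists.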