The structural diversity of telomeres and centromeres across mouse subspecies revealed by complete assemblies

Bailey Francis, Landen Gozashti, Kevin Costello, Takaoki Kasahara, Olivia S. Harringmeyer, Jingtao Lilue, Mohab Helmy, Tadafumi Kato, Anne Czechanski, Michael Quail, Iraad Bonner, Emma Dawson, Anne Ferguson Smith, Laura Reinholdt, David J. Adams, Thomas M. Keane
DOI: https://doi.org/10.1101/2024.10.24.619615
2024-10-26
Abstract: It is over twenty years since the publication of the C57BL/6J mouse reference genome, which has been a key catalyst for understanding mammalian disease biology. However, the mouse reference genome still lacks telomeres and centromeres, contains 281 chromosomal sequence gaps, and only partially represents many biomedically relevant loci. We present the first telomere-to-telomere (T2T) mouse genomes for two key inbred strains, C57BL/6J and CAST/EiJ. These T2T genomes reveal significant variability in telomere and centromere sizes and structural organisation. We add 213 Mbp of novel sequence, containing 517 protein-coding genes, to the reference genome. We examined two important but incomplete loci in the mouse genome: the pseudoautosomal region (PAR) on the sex chromosomes and the KRAB zinc finger protein (KZFP) loci. We identified distant locations of the PAR boundary, differences in the copy number and size of segmental duplications, and a multitude of amino acid substitutions in PAR genes.
Genomics