The comprehensive detection of hemoglobinopathy variants via long-read sequencing

Jiale Xiang Sr., Jiguang Peng Sr., Xiangzhong Sun Sr., Chen Jiang Sr., Huiru Zhao, Yaya Guo, Haiyan Xu, Shanshan Gu, Haodong Ye, Long You, Xiaoyan Huang, Shiping Chen, Baosheng Zhu, Zhiyu Peng
DOI: https://doi.org/10.1101/2024.12.03.626522
2024-12-06
Abstract: BACKGROUND: The genetic complexity of hemoglobin genes, characterized by high GC content and homologous sequences, poses significant challenges for detecting hemoglobin variants in clinical settings. METHODS: A long-read indexed PCR method utilizing the novel CycloneSEQ nanopore sequencing platform was developed to detect all variant types, including single nucleotide variants (SNVs), deletions, and structural variants (SVs), in the HBA, HBB, HBD, and HBG genes. The method was validated using 507 clinical samples to assess its performance. RESULTS: The long-read indexed PCR system employed 13 primers targeting the hemoglobin gene clusters. This design enabled the detection of 37 types of HBA deletions, 5 SVs (3 multicopy alleles (alpha alpha alpha alpha alpha, alpha alpha alpha anti3.7, alpha alpha alpha anti4.2) and 2 fusion alleles (HK alpha alpha and anti-HK alpha alpha)), 37 HBB deletions, and all SNVs in the targeted regions. Validation across 507 samples (84 with HBA variants, 60 with HBB variants, 256 with both HBA and HBB variants, and 107 with no known variants) demonstrated 100.0% sensitivity and specificity. Additionally, the long-read sequencing enabled phasing of variants within hemoglobin genes, providing insights critical for clinical interpretation. CONCLUSIONS: The long-read indexed PCR method, combined with the CycloneSEQ nanopore sequencing platform, proved to be a robust and efficient solution for detecting hemoglobinopathy variants. The integration of indexed primers and barcoding enhances scalability, making this method well suited to large-scale population screening programs in the future.
Genomics
What problem does this paper attempt to address?
The main problem this paper addresses is that existing methods struggle to detect hemoglobinopathy variants comprehensively and accurately in clinical settings. Specifically, the genetic complexity of the hemoglobin genes (high GC content and homologous sequences) makes it hard to detect single-nucleotide variants (SNVs), deletions, and structural variants (SVs) with a single assay.

### Problem Background

Hemoglobinopathies are a group of genetic diseases that affect the structure or production of hemoglobin, including sickle cell disease and thalassemia. Their clinical manifestations range from mild anemia to severe, life-threatening conditions, depending on the specific gene mutations and their impact on hemoglobin function. Because hemoglobinopathies are common and impose a large health burden, early detection and diagnosis are crucial for effective management and treatment.

### Limitations of Existing Methods

Traditional screening methods (such as complete blood count and hemoglobin electrophoresis) and genetic tests for known mutations have the following limitations:

1. **Inability to detect new or complex mutations**: Traditional methods can usually detect only known, common mutations.
2. **Requirement for multiple tests**: A comprehensive diagnosis often requires several different tests.
3. **Lack of long-read sequencing ability**: Short-read sequencing struggles to resolve complex structural variants and to provide phasing information.

### Solutions Proposed in the Paper

To address these problems, the study developed a long-read indexed PCR method based on the CycloneSEQ nanopore sequencing platform. The method can simultaneously detect all types of variants in the α- and β-globin gene clusters (HBA, HBB, HBD, and HBG), including single-nucleotide variants (SNVs), deletions, and structural variants (SVs). Using 13 primers targeting the hemoglobin gene clusters, it can detect 37 types of HBA (α-globin) deletions, 5 structural variants (3 multicopy alleles and 2 fusion alleles), 37 types of HBB (β-globin) deletions, and all single-nucleotide variants within the targeted regions.

### Advantages of the Method

1. **High sensitivity and specificity**: Validated on 507 clinical samples, the method demonstrated 100% sensitivity and specificity.
2. **Phasing information**: Long-read sequencing can phase variants within the hemoglobin genes, which is important for clinical interpretation.
3. **Scalability**: Indexed primers and barcodes can significantly increase detection throughput and reduce the cost of large-scale population screening.

In conclusion, the method proposed in this study not only improves the accuracy of hemoglobinopathy variant detection but also provides an efficient and economical option for future large-scale population screening.
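
As a rough check on the reported performance, the sketch below recomputes sensitivity and specificity from the cohort sizes given in the abstract. It assumes sample-level counting, with every variant-carrying sample treated as a true positive and every variant-free sample as a true negative. The abstract does not state the unit of evaluation, so this is an illustration of the arithmetic, not the authors' analysis; the `Confusion` class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Sample-level confusion counts (hypothetical helper for illustration)."""
    tp: int  # variant-positive samples called positive
    fn: int  # variant-positive samples missed
    tn: int  # variant-free samples called negative
    fp: int  # variant-free samples called positive

    @property
    def sensitivity(self) -> float:
        # Fraction of true carriers that the assay detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of non-carriers that the assay correctly clears.
        return self.tn / (self.tn + self.fp)


# Cohort sizes as reported in the abstract.
cohort = {
    "HBA only": 84,
    "HBB only": 60,
    "HBA and HBB": 256,
    "no known variant": 107,
}

total = sum(cohort.values())
negatives = cohort["no known variant"]
positives = total - negatives

# Assumption: 100.0% sensitivity and specificity are read as zero false
# negatives and zero false positives at the sample level.
result = Confusion(tp=positives, fn=0, tn=negatives, fp=0)

print(f"samples: {total}, positives: {positives}, negatives: {negatives}")
print(f"sensitivity: {result.sensitivity:.1%}, specificity: {result.specificity:.1%}")
```

Under these assumptions the four subgroups sum to the 507 samples stated in the abstract, with 400 variant-positive and 107 variant-free samples. A single missed carrier would lower sensitivity to 399/400, or about 99.8%, which gives a sense of the resolution of a cohort this size.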