Many Human Immunoglobulin Heavy‐chain IGHV Gene Polymorphisms Have Been Reported in Error

Yan Wang, Katherine JL Jackson, William A. Sewell, Andrew M. Collins
DOI: https://doi.org/10.1038/sj.icb.7100144
2007-01-01
Immunology and Cell Biology
Abstract: The identification of the genes that make up rearranged immunoglobulin genes is critical to many studies. For example, the enumeration of mutations in immunoglobulin genes is important for the prognosis of chronic lymphocytic leukemia, and this requires the accurate identification of the germline genes from which a particular sequence is derived. The immunoglobulin heavy-chain variable (IGHV) gene repertoire is generally considered to be highly polymorphic. In this report, we describe a bioinformatic analysis of germline and rearranged immunoglobulin gene sequences which casts doubt on the existence of a substantial proportion of reported germline polymorphisms. We report a five-level classification system for IGHV genes, which indicates the likelihood that the genes have been reported accurately. The classification scheme also reflects the likelihood that germline genes could be incorrectly identified in mutated VDJ rearrangements, because of similarities to other alleles. Of the 226 IGHV alleles that have previously been reported, our analysis suggests that 104 of these alleles almost certainly include sequence errors, and should be removed from the available repertoire. The analysis also highlights the presence of common mismatches, with respect to the germline, in many rearranged heavy-chain sequences, suggesting the existence of twelve previously unreported alleles. Sequencing of IGHV genes from six individuals in this study confirmed the existence of three of these alleles, which we designate IGHV3-49*04, IGHV3-49*05 and IGHV4-39*07. We therefore present a revised repertoire of expressed IGHV genes, which should substantially improve the accuracy of immunoglobulin gene analysis.
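The idea of "common mismatches with respect to the germline" can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequences, and the `min_fraction` threshold are all hypothetical, and the sequences are assumed to be already aligned to the germline at equal length. The point is only that a mismatch shared by many independent rearrangements is more plausibly an unreported allele than a somatic mutation, which would tend to be scattered.

```python
from collections import Counter


def mismatches(germline: str, rearranged: str) -> list[tuple[int, str]]:
    """Return (position, observed base) for each mismatch against the germline.

    Assumption: both sequences are pre-aligned and of equal length.
    """
    if len(germline) != len(rearranged):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (i, r)
        for i, (g, r) in enumerate(zip(germline, rearranged))
        if g != r
    ]


def recurrent_mismatches(
    germline: str, rearranged_seqs: list[str], min_fraction: float = 0.5
) -> dict[tuple[int, str], int]:
    """Count mismatches shared across sequences and keep the recurrent ones.

    min_fraction is an illustrative cutoff, not a value from the paper.
    """
    if not rearranged_seqs:
        return {}
    counts: Counter = Counter()
    for seq in rearranged_seqs:
        counts.update(mismatches(germline, seq))
    total = len(rearranged_seqs)
    return {key: n for key, n in counts.items() if n / total >= min_fraction}


# Toy data (invented for illustration; not real IGHV sequence).
germline = "ACGTACGTAC"
reads = [
    "ACGTTCGTAC",  # shared mismatch at position 4
    "ACGTTCGTAC",  # shared mismatch at position 4
    "ACGTTCGAAC",  # shared mismatch at 4, private mismatch at 7
    "ACGAACGTAC",  # private mismatch at 3 only
]

candidates = recurrent_mismatches(germline, reads)
print(candidates)
```

On this toy input, only the substitution at position 4 passes the cutoff; the two private mismatches look like ordinary somatic mutations. A real analysis would additionally need proper alignment, evidence that the rearrangements are independent, and confirmation by germline sequencing, as the abstract describes.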