Complete characterization of the human immune cell transcriptome using accurate full-length cDNA sequencing

Charles Cole, Ashley Byrne, Matthew Adams, Roger Volden, Christopher Vollmers
DOI: https://doi.org/10.1101/761437
2019-09-12
Abstract: The human immune system relies on highly complex and diverse transcripts and the proteins they encode. These include transcripts for Human Leukocyte Antigen (HLA) class I and II receptors, which are essential for self/non-self discrimination by the immune system, as well as transcripts encoding B cell and T cell receptors (BCR and TCR), which recognize, bind, and help eliminate foreign antigens. HLA genes are highly diverse within the human population, with each individual possessing two of thousands of different alleles in each of the 9 major HLA genes. Determining which combination of alleles an individual possesses for each HLA gene (high-resolution HLA-typing) is essential to establish donor-recipient compatibility in organ and bone-marrow transplantations. BCR and TCR genes, in turn, are generated by recombining a diverse set of gene segments at the DNA level in each maturing B and T cell, respectively. This process generates adaptive immune receptor repertoires (AIRR) composed of unique transcripts expressed by each B and T cell. These repertoires carry a vast amount of health-relevant information. Both short-read RNA-seq-based HLA-typing [1] and adaptive immune receptor repertoire sequencing [2–5] currently rely heavily on our incomplete knowledge of the genetic diversity at HLA [6] and BCR/TCR loci [7,8]. Here we used our nanopore-sequencing-based Rolling Circle to Concatemeric Consensus (R2C2) protocol [9] to generate over 10,000,000 full-length cDNA sequences at a median accuracy of 97.9%. We used this dataset to demonstrate that deep and accurate full-length cDNA sequencing can, in addition to providing isoform-level transcriptome analysis for over 9,000 loci, be used to generate accurate sequences of HLA alleles for HLA allele typing and discovery, as well as detailed AIRR data for the analysis of the adaptive immune system, without requiring specific knowledge of the diversity at HLA and BCR/TCR loci.
What problem does this paper attempt to address?