Determination of the Evolutionary Pressure on Camellia oleifera on Hainan Island Using the Complete Chloroplast Genome Sequence

Wan Zhang, Yunlin Zhao, Guiyan Yang, Jiao Peng, Shuwen Chen, Zhenggang Xu
DOI: https://doi.org/10.7717/peerj.7210
IF: 3.061
2019-01-01
PeerJ
Abstract: Camellia oleifera is one of the four largest woody edible oil plants in the world and has high ecological and medicinal value. Frequent interspecific hybridization has made its genetics and evolutionary history difficult to study. This study used C. oleifera collected on Hainan Island. The unique island environment makes the quality of its tea oil higher than that of other species grown on the mainland, and long-term geographic isolation might also have affected gene structure. To better understand the molecular biology of this species, protect excellent germplasm resources, and support population genetic and phylogenetic studies of Camellia plants, high-throughput sequencing was used to obtain the chloroplast genome sequence of Hainan C. oleifera. The complete chloroplast genome was 156,995 bp in length and had a typical quadripartite structure: a large single copy (LSC) region of 86,648 bp, a small single copy (SSC) region of 18,297 bp, and a pair of inverted repeats (IRs) of 26,025 bp each. The genome encoded a total of 141 genes (115 different genes), including 88 protein-coding genes, 45 tRNA genes, and eight rRNA genes. Among these genes, nine contained one intron and two contained two introns; four overlapping genes were also detected. The total GC content of the chloroplast genome was 37.29%. The structural characteristics of the Hainan C. oleifera chloroplast genome were compared with those of mainland C. oleifera and eight other closely related Theaceae species; contractions and expansions at the IR/LSC and IR/SSC boundaries were found to affect the length of the chloroplast genome. The chloroplast genome sequences of these Theaceae species were highly similar, and the comparative analysis indicated that the Theaceae species were conserved in structure and evolution.
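The reported region lengths and gene counts can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures quoted in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 86_648        # large single copy region
SSC = 18_297        # small single copy region
IR = 26_025         # one inverted repeat; the genome carries two copies
REPORTED_TOTAL = 156_995


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Gene counts as quoted in the abstract: protein-coding + tRNA + rRNA.
gene_total = 88 + 45 + 8

if __name__ == "__main__":
    total = quadripartite_length(LSC, SSC, IR)
    print("assembled length:", total, "matches reported:", total == REPORTED_TOTAL)
    print("gene total:", gene_total)
```

Both sums agree with the abstract: the four regions add up to 156,995 bp, and the three gene classes add up to 141 genes.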
A total of 51 simple sequence repeat (SSR) loci were detected in the chloroplast genome of Hainan C. oleifera. None of the Camellia plants had pentanucleotide repeats, a feature that could serve as a useful marker in phylogenetic studies. Seven long repeats were also detected. The base composition of all repeats was biased toward A/T, which was consistent with the codon bias. Codon usage and phylogenetic analysis indicated that Hainan C. oleifera was closely related to C. crapnelliana. This study provides an effective genomic resource for research on the evolutionary history of the Theaceae family.
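As a rough illustration of how SSR loci of the kind counted above can be located, the following Python sketch scans a DNA string for perfect tandem repeats of 1 to 6 bp motifs. It is not the tool or the parameter set used in the paper: the minimum repeat counts in `MIN_REPEATS` are assumed values chosen for illustration, and `find_ssrs` is a hypothetical helper name.

```python
import re

# Assumed minimum number of repeats per motif length (illustrative only;
# the thresholds used in the paper are not given in the abstract).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves built from a shorter unit,
            # e.g. "AA" or "ATAT", so each locus is reported once.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CG"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Grouping the hits by motif length would give per-class counts (mono-, di-, tri-, and so on), which is how an absence of pentanucleotide repeats would show up: no hit with a 5 bp motif.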