Comprehensive Identification and Characterization of Simple Sequence Repeats Based on the Whole-Genome Sequences of 14 Forest and Fruit Trees

Xiaoming Song, Nan Li, Yuanyuan Guo, Yun Bai, Tong Wu, Tong Yu, Shuyan Feng, Yu Zhang, Zhiyuan Wang, Zhuo Liu, Hao Lin
DOI: https://doi.org/10.48130/fr-2021-0007
2021-01-01
Forestry Research
Abstract: Simple sequence repeats (SSRs) are popular and important molecular markers that exist widely in plants. Here, we conducted a comprehensive identification and comparative analysis of SSRs in 14 tree species. A total of 16,298 SSRs were identified from 429,449 genes, and primers were successfully designed for 99.44% of the identified SSRs. Our analysis indicated that tri-nucleotide SSRs were the most abundant, with an average of ~834 per species. Functional enrichment analysis of the combined SSR-containing genes from all species revealed 50 significantly enriched terms, most of which belong to transcription factor families associated with plant development and abiotic stresses, such as Myeloblastosis_DNA-bind_4 (Myb_DNA-bind_4), APETALA2 (AP2), and Fantastic Four meristem regulator (FAF). Further functional enrichment analysis of individual species showed that 48 terms related to abiotic stress regulation and floral development were significantly enriched in ten species, whereas no significantly enriched terms were found in the other four species. Interestingly, the largest number of enriched terms was detected in Citrus sinensis (L.) Osbeck, accounting for 54.17% of all significantly enriched functional terms. Finally, we analyzed the AP2 and trihelix (Myb_DNA-bind_4) gene families because of their significant enrichment among SSR-containing genes. The results indicated that whole-genome duplication (WGD) and whole-genome triplication (WGT) might have played major roles in the expansion of the AP2 gene family but only slightly affected the expansion of the trihelix gene family during evolution. In conclusion, the identification and comprehensive characterization of SSR markers will greatly facilitate future comparative genomics and functional genomics studies.
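To make the notion of SSR identification concrete, the following is a minimal sketch of how perfect SSRs could be located in a gene sequence with a regular expression. It is not the pipeline used in the paper: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, since the abstract does not state the detection thresholds.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
# These values are illustrative assumptions, not taken from the paper.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AT', not 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs in seq.

    start is a 0-based offset; results are sorted by start position.
    """
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One copy of a k-mer followed by at least n-1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip 'AA' as a di-nucleotide; it is a mono-nucleotide run
            repeats = (m.end() - m.start()) // k
            hits.append((m.start(), motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AAG" * 5 + "TTC" + "AT" * 6 + "C"
    for start, motif, repeats in find_ssrs(example):
        print(start, motif, repeats)
```

Tallying the motif lengths of the returned hits per species would give counts of the kind summarized above (for example, the number of tri-nucleotide SSRs). Compound or imperfect repeats are not handled by this sketch.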