Dosage-sensitivity of Human Transcription Factor Genes

Zhongfu Ni, Xiao-Yu Zhou, Sidra Aslam, Deng-Ke Niu
DOI: https://doi.org/10.1101/528554
2019-01-01
Abstract: Changes in the copy number of protein-coding genes can be detrimental if the resulting changes in protein concentration disrupt essential cellular functions. Large-scale genomic studies have identified thousands of dosage-sensitive genes in the human genome. We are interested in the dosage-sensitivity of transcription factor (TF) genes, whose products are essential for the growth, division and differentiation of cells because they regulate the expression of the genetic information encoded in the genome. We first surveyed the enrichment of human TF genes in four recently curated datasets of dosage-sensitive genes: the haploinsufficient genes identified by a large-scale genomic study, the haploinsufficient genes predicted by a machine learning approach, the genes with conserved copy number across mammals, and the ohnologs. We then took the genes present in all four datasets as the most reliable dosage-sensitive genes and the genes absent from all four datasets as the most reliable dosage-insensitive genes, and surveyed the enrichment of TF genes in these two derived datasets. A large number of TF genes were found to be dosage-insensitive, more than would be expected given the role of TFs. Nevertheless, the tendency of TF genes to be dosage-sensitive was supported by five of the six datasets, with the conserved-copy-number genes as the exception. The nuclear receptors are the only TF family whose dosage-sensitivity was consistently supported by all six datasets. In addition, we found that TF families with very few members are also more likely to be dosage-sensitive, while the largest TF family, C2H2-ZF, is most likely dosage-insensitive. The most extensively studied TFs, p53, are not exceptional in dosage-sensitivity: they are significantly enriched in only three datasets. We also confirmed that dosage-sensitive genes generally have long coding sequences and high expression levels and have experienced stronger selective pressure. Our results indicate that some TFs function in a dose-dependent manner while others do not. Gene dosage changes in some TF families, such as the nuclear receptors, would result in disease phenotypes, while the effects of such changes in other TFs, such as C2H2-ZF, would be mild.
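The dataset construction and enrichment survey described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the dosage-insensitive set is taken to be the genes found in none of the four datasets, and enrichment is scored with a one-sided hypergeometric test, which may not be the statistic the authors used. The function names `reliable_sets` and `enrichment_p` and the toy gene identifiers are hypothetical.

```python
from math import comb


def reliable_sets(datasets, all_genes):
    """Split genes into (sensitive, insensitive).

    sensitive:   genes present in every dataset
    insensitive: genes present in none of the datasets (assumed reading)
    """
    sensitive = set.intersection(*datasets) & all_genes
    insensitive = all_genes - set.union(*datasets)
    return sensitive, insensitive


def enrichment_p(tf_genes, gene_set, all_genes):
    """One-sided hypergeometric P(X >= k): the chance of seeing at least
    the observed number of TF genes in gene_set if genes were drawn at random.
    This test is an assumption, not necessarily the authors' method."""
    N = len(all_genes)                       # background size
    K = len(tf_genes & all_genes)            # TF genes in the background
    n = len(gene_set & all_genes)            # size of the tested set
    k = len(tf_genes & gene_set & all_genes) # TF genes in the tested set
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy data: integers stand in for gene identifiers.
all_genes = set(range(10))
datasets = [{0, 1, 2, 3}, {0, 1, 2, 4}, {0, 1, 5}, {0, 1, 2}]
tf_genes = {0, 1, 6}

sensitive, insensitive = reliable_sets(datasets, all_genes)
p_sensitive = enrichment_p(tf_genes, sensitive, all_genes)
p_insensitive = enrichment_p(tf_genes, insensitive, all_genes)
```

A small p-value for the sensitive set would suggest that TF genes are over-represented among the most reliable dosage-sensitive genes; the same call on each of the four source datasets would give the per-dataset survey.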