What don't we know about human genes?

LIU Shun, QU LiangHu
DOI: https://doi.org/10.1360/N972016-00761
2017-01-01
Chinese Science Bulletin
Abstract: Over the past fifty years, biologists have tried to estimate the protein-coding capacity of the human genome, and the estimated number of human genes has steadily shrunk, from two million to the 25000 recognized by the Human Genome Project. Recent studies lowered this number to 19000, suggesting that humans have even fewer genes than the nematode worm Caenorhabditis elegans. The complexity and flexibility of higher mammalian genomes have evidently been greatly underestimated and cannot be explained by protein-coding gene counts alone. Scientists now believe that the widening differences among higher organisms arise primarily from the regulation of gene expression at the molecular level, including transcriptional and post-transcriptional regulation. In the human genome, two major strategies underlie these processes. The first is alternative splicing of the exons and introns of pre-mRNAs transcribed from the genome: one gene may produce multiple protein isoforms, which greatly increases the complexity of the proteome. In recent years this phenomenon has clearly become one of the main explanations for how the human genome achieves such complexity with so few protein-coding genes. The second is the enormous number of active non-coding RNAs (ncRNAs) transcribed from the non-protein-coding regions that account for approximately 98% of the human genome; these form a highly intricate RNA regulatory network that makes the genome more complex still. With the implementation of the Encyclopedia of DNA Elements (ENCODE) project, biologists were surprised to find that ncRNA species are diverse, including snoRNAs, microRNAs, piRNAs, lncRNAs and circRNAs. They take part in maintaining genetic information as a whole, regulating gene expression, and forming functional complexes in cells. In addition, novel classes of ncRNAs and various cis-RNA elements are expected to be discovered and identified. Taken together, these findings indicate that the human genome can be divided into many DNA regions with the potential to be transcribed into multifunctional RNA products. How the transcription machinery determines which section of DNA sequence to read as multifunctional at a particular time point, however, remains a mystery. Although work on understanding the significance of the roles ncRNAs play is only beginning, further studies on the structure and function of these RNAs will advance our understanding of human genes. In summary, the human genome operates as a highly sophisticated protein- and RNA-producing machine that contains a huge number of ncRNA genes in addition to the protein-coding genes. To understand how the human genome operates, we need not only to clarify the variety and number of genes, but also to explore their functions and the regulation of their expression.