The Mystery of “junk” DNA
Chang Zhang, Xinwen Wang, Liang Wang, Shan Gao
DOI: https://doi.org/10.1360/n972016-00381
2016-01-01
Abstract: The basic genetic law of life rests on the DNA double helix model and the central dogma of molecular biology (i.e., bidirectional DNA-RNA transcription and RNA-to-protein translation), in which DNA and most RNA carry coding information, while proteins make up the structure of the body and carry out most biological functions. This framework describes the flow of genetic information within and between individuals and treats proteins as the functional molecules of life, so under these rules only DNA that codes for protein (the exons) is considered functional. In this review, we give an overview of the origin of “junk” DNA and discuss in depth how “junk” DNA functions. A glimpse at the landscape of the human genome shows that only a very small fraction of our book of life is protein-coding DNA. By contrast, the large majority is non-coding DNA, which cannot be translated into protein and was long assumed to contain no information and to have no function. It has also been found that the genome size of an organism does not correlate well with its complexity, suggesting that large amounts of non-coding DNA exist in lower organisms as well. Such non-coding DNA has been regarded as useless and is commonly referred to as “junk” DNA. However, the advent of next-generation sequencing technologies and improved data analysis capabilities has made it possible to study so-called “junk” DNA systematically. First, genome-wide association studies have successfully identified many single nucleotide polymorphisms (SNPs) underlying susceptibility to disease; however, the majority of these SNPs are located in non-coding regions of the genome. Moreover, parts of the non-coding DNA are highly conserved between humans and mice. Together, these findings suggest that non-coding DNA is functional in some way. Second, it has been revealed that about 75 percent of our genome is actually transcribed. Transcripts that do not code for any protein are termed non-coding RNAs (ncRNAs).
These ncRNAs, including the canonical transfer and ribosomal RNAs as well as the more recently identified microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and circular RNAs (circRNAs), have been shown to play important physiological roles in organisms. Deregulation of these ncRNAs has also been linked not only to tumorigenesis but also to neurological, cardiovascular, developmental, and other diseases. Here we discuss the rapidly advancing fields of miRNA, lncRNA, and circRNA in detail, summarizing their production, gene structure, genomic organization, and diverse functions. Although miRNAs have been well studied over the last decade, we are still at an early stage in understanding the nature and extent of the involvement of other ncRNAs in physiology and disease. A better understanding of the molecular mechanisms of ncRNAs should pave the way for advances in therapeutic strategies and diagnostic approaches.