Comprehensive Classification of the RNase H-like Domain-Containing Proteins in Plants

Shuai Li, Kunpeng Liu, Qianwen Sun
DOI: https://doi.org/10.1101/572842
2019-01-01
Abstract:
Background: An R-loop is a nucleic acid structure containing an RNA-DNA hybrid and a displaced single-stranded DNA. Accumulating evidence shows that R-loops are widely present in the genomes of various organisms and are involved in many physiological processes, including DNA replication, RNA transcription, and DNA repair. RNase H-like superfamily (RNHLS) domain-containing proteins, such as RNase H enzymes, are essential in restricting R-loop levels. However, little is known about the functions of other RNHLS proteins in R-loop regulation, or about the relationships among them, especially in plants.
Results: In this study, we characterized 6193 RNHLS proteins from 13 representative plant species and grouped these proteins into 27 clusters, among which reverse transcriptases and exonucleases are the two largest groups. Moreover, we found 691 RNHLS proteins in Arabidopsis with a conserved catalytic alpha-helix and beta-sheet motif. Interestingly, each of the Arabidopsis RNHLS proteins is composed of not only an RNHLS domain but also another, distinct protein domain. Additionally, the RNHLS genes are highly expressed in different meristems and metabolic tissues, which indicates that the RNHLS proteins might play important roles in the development and maintenance of these tissues.
Conclusions: In summary, we systematically analyzed RNHLS proteins in plants and found that they fall mainly into 27 subclusters. Most of these proteins might be implicated in DNA replication, RNA transcription, and nucleic acid degradation.
We classified and characterized the RNHLS proteins in plants, which may afford new insights into the investigation of novel regulatory mechanisms and functions of R-loops.

Abbreviations:
AGO: Argonaute
CAF1: putative CCR4-associated factor 1 homolog
Dis: disorder protein domain
DUF1744: domain of unknown function 1744
Exo: exonuclease
Gag-pol: putative gag-pol polyprotein
GTF2B: general transcription factor 2-related zinc finger protein
HBD: hybrid binding domain
non-LTR: putative non-LTR retroelement reverse transcriptase
Prp8: pre-mRNA-processing-splicing factor 8
RBRE3UT: RBR-type E3 ubiquitin transferase
RNase H: ribonuclease H
RNH2: ribonuclease H2
RNHLS: RNase H-like superfamily
RRP6L: RRP6-like protein
SDN: small RNA degrading nuclease
SSN: sequence similarity network
WRN: Werner syndrome ATP-dependent helicase
ZBED1: zinc finger BED domain-containing protein DAYSLEEPER (Transposase-like protein DAYSLEEPER)