CDD/SPARCLE: the conserved domain database in 2020

Shennan Lu, Jiyao Wang, Farideh Chitsaz, Myra K Derbyshire, Renata C Geer, Noreen R Gonzales, Marc Gwadz, David I Hurwitz, Gabriele H Marchler, James S Song, Narmada Thanki, Roxanne A Yamashita, Mingzhang Yang, Dachuan Zhang, Chanjuan Zheng, Christopher J Lanczycki, Aron Marchler-Bauer
DOI: https://doi.org/10.1093/nar/gkz991
IF: 14.9
2019-11-28
Nucleic Acids Research
Abstract: As NLM’s Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers both an archive of pre-computed domain annotations as well as live search services for both single protein or nucleotide queries and larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq. These architecture definitions are available via SPARCLE, the Subfamily Protein Architecture Labeling Engine. CDD can be accessed at https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.
biochemistry & molecular biology