Abstract: The Protein Data Bank (PDB) is the single global repository for three-dimensional structures of biological macromolecules and their complexes, and its more than 100 000 structures contain more than 20 000 distinct ligands or small molecules bound to proteins and nucleic acids. Information about these small molecules and their interactions with proteins and nucleic acids is crucial for our understanding of biochemical processes and vital for structure-based drug design. Small molecules present in a deposited structure may be attached to a polymer or may occur as a separate, non-covalently linked ligand. During curation of a newly deposited structure by wwPDB annotation staff, each molecule is cross-referenced to the PDB Chemical Component Dictionary (CCD). If the molecule is new to the PDB, a dictionary description is created for it. The information about all small-molecule components found in the PDB is distributed via the FTP archive as an external reference file. Small-molecule annotation in the PDB also includes information about ligand-binding sites and about covalent and other linkages between ligands and macromolecules. During the remediation of the peptide-like antibiotics and inhibitors present in the PDB archive in 2011, it became clear that additional annotation was required for consistent representation of these molecules, which are often composed of several sequential subcomponents, including modified amino acids and other chemical groups. The connectivity information of the modified amino acids is necessary for correct representation of these biologically interesting molecules. The combined information is made available via a new resource called the Biologically Interesting molecules Reference Dictionary, which is complementary to the CCD and is now routinely used for annotation of peptide-like antibiotics and inhibitors.

Small molecule annotation for the Protein Data Bank.
