Abstract: Twenty years ago the Human Genome Project was initiated with the aim of uncovering the genetic factors of human diseases and developing new strategies for diagnosis, treatment, and prevention. Despite the successful sequencing of the human genome and the discovery of many disease-related genes, our understanding of molecular mechanisms is still largely incomplete for the majority of diseases. In the KEGG database project, we have been organizing our knowledge on cellular functions and organism behaviors in computable forms, especially in the forms of molecular networks (KEGG pathway maps) and hierarchical lists (BRITE functional hierarchies). This computerized knowledge has been widely used as a reference for the biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies. Our efforts are now focused on human diseases and drugs. We consider diseases as perturbed states of the molecular system that operates the cell and the organism, and drugs as perturbants to that molecular system. Since the existing disease databases are mostly intended for humans to read and understand, we are developing a more computable disease information resource in which our knowledge on diseases is represented as molecular networks or gene/molecule lists. When the detail of the molecular system is relatively well characterized, we use the molecular network representation and draw KEGG pathway maps. The Human Diseases category of the KEGG PATHWAY database contains about 40 pathway maps for cancers, immune disorders, neurodegenerative diseases, etc. When the detail is not known but disease genes have been identified, we use the gene/molecule list representation and create a KEGG DISEASE entry. The entry contains a list of known disease genes and other relevant molecules, including environmental factors, diagnostic markers, and therapeutic drugs. The list simply defines membership in the underlying molecular system, but is still useful for computational analysis. In the KEGG DRUG database we capture knowledge on two types of molecular networks. One is the interaction network of drugs with target molecules, metabolizing enzymes, transporters, other drugs, and the pathways involving all these molecules. The other is the chemical structure transformation network, which represents the biosynthetic pathways of natural products in various organisms as well as the history of drug development, in which drug structures have been continuously modified by medicinal chemists. KEGG DRUG contains chemical structures and/or chemical components of all prescription and OTC drugs in Japan, including crude drugs and TCM (Traditional Chinese Medicine) formulas, as well as most prescription drugs in the USA and many prescription drugs in Europe. I will report on our strategy to analyze the chemical architecture of natural products derived from enzymatic reactions (and enzyme genes) and the chemical architecture of marketed drugs derived from human-made organic reactions in the history of drug development, towards drug discovery from the genomes of plants and microorganisms.
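The gene/molecule list representation described above can be illustrated with a minimal sketch. It assumes a disease entry is nothing more than named sets of identifiers and that "computational analysis" means a simple membership test against an observed dataset. The `DiseaseEntry` class, the `overlap` function, and all identifiers (`GENE_A`, `MARKER_X`, and so on) are hypothetical placeholders; they are not the actual KEGG DISEASE format or real KEGG data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DiseaseEntry:
    """Hypothetical stand-in for a gene/molecule-list disease entry."""
    name: str
    genes: frozenset = field(default_factory=frozenset)
    env_factors: frozenset = field(default_factory=frozenset)
    markers: frozenset = field(default_factory=frozenset)
    drugs: frozenset = field(default_factory=frozenset)

    def members(self) -> frozenset:
        # The list only defines membership: no edges, no ordering.
        return self.genes | self.env_factors | self.markers | self.drugs


def overlap(entry: DiseaseEntry, observed) -> list:
    """Return the observed identifiers that belong to the entry's list."""
    return sorted(entry.members() & set(observed))


# Placeholder identifiers, not real KEGG entries.
entry = DiseaseEntry(
    name="example disease",
    genes=frozenset({"GENE_A", "GENE_B"}),
    markers=frozenset({"MARKER_X"}),
    drugs=frozenset({"DRUG_1"}),
)

# Identifiers seen in a hypothetical high-throughput experiment.
hits = overlap(entry, ["GENE_B", "GENE_Z", "DRUG_1"])
print(hits)
```

Even without network structure, a membership list of this kind supports set-based operations such as intersecting it with the identifiers found in an experimental dataset.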

KEGG as a reference resource for gene and protein annotation

KEGG: Kyoto Encyclopedia of Genes and Genomes

KEGG: biological systems database as a model of the real world

KEGG for linking genomes to life and the environment

From genomics to chemical genomics: new developments in KEGG

Automated Genome Annotation and Pathway Identification Using the KEGG Orthology (KO) As a Controlled Vocabulary

KEGG Mapper for inferring cellular functions from protein sequences

Recent Progress and Application of KEGG Database in the Research of Bioinformatics

Toward understanding the origin and evolution of cellular organisms

Representation and analysis of molecular networks involving diseases and drugs

KOBAS Server: A Web-Based Platform for Automated Annotation and Pathway Identification

KOBAS 2.0: a Web Server for Annotation and Identification of Enriched Pathways and Diseases

kegg_pull: a software package for the RESTful access and pulling from the Kyoto Encyclopedia of Gene and Genomes

Gene Annotation Easy Viewer (GAEV): Integrating KEGG’s Gene Function Annotations and Associated Molecular Pathways

Metagen: A Promising Tool for Modeling Metabolic Networks from KEGG

KEGG-PATH: Kyoto encyclopedia of genes and genomes-based pathway analysis using a path analysis model

A Systematic Analysis of Gene Functions by the Metabolic Pathway Database

A Comprehensive App to Interpret and Visualize the Functional Analysis of KEGG Pathways and Gene Ontologies

gutMGene: a comprehensive database for target genes of gut microbes and microbial metabolites

MetagenomicKG: a knowledge graph for metagenomic applications

KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold