cancercelllines.org—a novel resource for genomic variants in cancer cell lines

Rahel Paloots, Michael Baudis
DOI: https://doi.org/10.1093/database/baae030
2024-01-01
Database
Abstract: Cancer cell lines are an important component in biological and medical research, enabling studies of cellular mechanisms as well as the development and testing of pharmaceuticals. Genomic alterations in cancer cell lines are widely studied as models for oncogenetic events and are represented in a wide range of primary resources. We have created a comprehensive, curated knowledge resource, cancercelllines.org, with the aim of enabling easy access to genomic profiling data in cancer cell lines, curated from a variety of resources and integrating both copy number and single nucleotide variant data. We have gathered over 5600 copy number profiles as well as single nucleotide variant annotations for 16 000 cell lines and provide these data with mappings to the GRCh38 reference genome. Both genomic variations and associated curated metadata can be queried through the GA4GH Beacon v2 Application Programming Interface (API) and a graphical user interface, with extensive data retrieval enabled using GA4GH data schemas under a permissive licensing scheme. Database URL: https://cancercelllines.org
mathematical & computational biology
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper introduces cancercelllines.org, a novel repository designed to integrate genomic variation data in cancer cell lines. Specifically, the main objectives of this repository include:

1. **Integrating data from multiple sources**: Collecting and organizing genomic variation data of cancer cell lines from multiple databases (such as ClinVar, CCLE, and Cellosaurus), including single nucleotide variants (SNVs) and copy number variations (CNVs).
2. **Providing comprehensive query functionality**: Enabling users to query and retrieve genomic variation information for cancer cell lines through the GA4GH Beacon v2 API and a graphical user interface.
3. **Standardizing and unifying data formats**: Using standard data models and ontologies (such as the Human Ancestry Ontology and the Sequence Ontology) to ensure data consistency and comparability.
4. **Supporting research on various cancer types**: Covering more than 400 different cancer types and providing detailed metadata to support analyses by researchers.

In summary, the goal of cancercelllines.org is to serve as a comprehensive repository that provides accessible and extensive genomic variation data for cancer research, thereby promoting the understanding of cancer mechanisms and drug development.
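
To make the Beacon v2 query route more concrete, the following is a minimal sketch of how a genomic range query URL might be assembled in Python. The base path `https://cancercelllines.org/beacon`, the helper name `build_variant_query`, and the example coordinates are assumptions made for illustration and are not taken from the paper. The `g_variants` endpoint and the `referenceName`, `start`, `end`, and `variantType` parameters follow the general Beacon v2 request model, but the exact values a given service accepts may differ. The sketch only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Beacon v2 service (not stated in the text above).
BEACON_BASE = "https://cancercelllines.org/beacon"


def build_variant_query(reference_name, start, end, variant_type, base=BEACON_BASE):
    """Return a GET URL for a Beacon v2-style genomic variant range query.

    Hypothetical helper: parameter names follow the Beacon v2 request model;
    the value formats accepted by a specific service may differ.
    """
    params = {
        "referenceName": reference_name,  # chromosome or sequence identifier
        "start": start,                   # start position of the queried range
        "end": end,                       # end position of the queried range
        "variantType": variant_type,      # e.g. a deletion-type variant
    }
    return f"{base.rstrip('/')}/g_variants?{urlencode(params)}"


# Illustrative values only: a 1 Mb window on chromosome 9, deletion-type variants.
url = build_variant_query("9", 21000000, 22000000, "DEL")
print(url)
```

The resulting URL could then be fetched with any HTTP client. A Beacon v2 service is expected to answer with a JSON document, whose structure should be checked against the service's own documentation.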