DORA: an interactive map for the visualization and analysis of ancient human DNA and associated data

Keith D. Harris, Gili Greenbaum
DOI: https://doi.org/10.1101/2024.01.15.575663
2024-01-16
Abstract: The ability to sequence ancient genomes has revolutionized the way in which we study evolutionary history by providing access to the most important aspect of evolution: time. Until recently, studying human demography, ecology, biology, and history using population genomic inference relied on contemporary genomic datasets. Over the past decade, the availability of human ancient DNA (aDNA) has increased rapidly, almost doubling every year, opening the way for spatiotemporal studies of ancient human populations. However, the multidimensionality of aDNA, with genotypes having temporal, spatial and genomic coordinates, and the need to integrate multiple sources of data, poses a challenge for developing meta-analysis pipelines. To address this challenge, we developed a publicly available interactive tool, DORA, which integrates multiple data types, genomic and non-genomic, in a unified interface. This web-based tool allows users to browse sample metadata along with additional layers of information, such as population structure, climatic data, and unpublished samples. Users can then perform analyses on genotypes of these samples, or export sample subsets for external analyses. DORA integrates analyses and visualizations in a single intuitive interface, resolving the technical issues of combining datasets from different sources and formats, and allowing researchers to focus on analysis and the scientific questions that can be addressed through analysis of aDNA datasets.
Genomics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper introduces DORA (Data Overlays for Research in Archaeogenomics), an interactive map tool for visualizing and analyzing ancient human DNA (aDNA) data. The main aim of the paper is to address the following issues:

1. **Multidimensional Data Integration**: Ancient human DNA data is multidimensional: genotypes have temporal, spatial, and genomic coordinates. Working with such data requires integrating multiple data sources, which poses challenges for developing meta-analysis pipelines.
2. **Data Integration and Visualization**: There is currently no unified interface that integrates the various data types (genomic and non-genomic), making it difficult for researchers to select and analyze data on the same platform.
3. **Intuitive Data Exploration**: Existing tools often require multiple steps and libraries to complete data analysis and visualization. DORA integrates these functions into a single interactive interface, allowing users to intuitively select samples from geographic, temporal, and genomic windows and perform subsequent analyses.

Through DORA, researchers can easily browse sample metadata and analyze it in conjunction with climate data and other unpublished sample information. DORA not only simplifies the data analysis process but also allows users to conduct various analyses in a unified environment, thereby focusing on the scientific research itself.
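
As a rough illustration of the window-based sample selection described above, the following Python sketch filters sample metadata by a geographic and a temporal window. It is not DORA's implementation or API: the `Sample` record, its field names, the `select_samples` function, and the three example samples are all hypothetical, and the genomic window is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical metadata record for one aDNA sample (not DORA's schema)."""
    sample_id: str
    lat: float    # latitude in decimal degrees
    lon: float    # longitude in decimal degrees
    age_bp: int   # estimated age in years before present


def in_window(value, window):
    """Return True if value lies inside the inclusive (low, high) window."""
    low, high = window
    return low <= value <= high


def select_samples(samples, lat_window, lon_window, age_window):
    """Keep samples that fall inside both the geographic and temporal windows."""
    return [
        s for s in samples
        if in_window(s.lat, lat_window)
        and in_window(s.lon, lon_window)
        and in_window(s.age_bp, age_window)
    ]


# Made-up samples, purely for illustration.
samples = [
    Sample("S1", 48.2, 16.4, 4500),
    Sample("S2", 31.8, 35.2, 3000),
    Sample("S3", 52.5, 13.4, 9000),
]

# Select samples from a latitude/longitude box dated 3000-6000 years BP.
selected = select_samples(
    samples,
    lat_window=(45.0, 55.0),
    lon_window=(10.0, 20.0),
    age_window=(3000, 6000),
)
print([s.sample_id for s in selected])
```

In an interactive tool, the windows would presumably come from the user's selections on the map and timeline, with the resulting subset passed on to downstream analyses or exported.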