CNVizard: a lightweight streamlit application for an interactive analysis of copy number variants

Jeremias Krause, Carlos Classen, Daniela Dey, Eva Lausberg, Luise Kessler, Thomas Eggermann, Ingo Kurth, Matthias Begemann, Florian Kraft
DOI: https://doi.org/10.1101/2024.06.14.598969
2024-06-17
Abstract: Methods to call, analyze and visualize copy number variations (CNVs) from massive parallel sequencing data have been widely adopted in clinical practice and genetic research. To enable a streamlined analysis of CNV data, comprehensive annotation and good visualization are indispensable. The ability to detect single exon CNVs is another important feature for genetic testing. Nonetheless, most available open-source tools come with limitations in at least one of these areas. One drawback is that available tools deliver data in an unstructured and static format which requires subsequent visualization and formatting efforts. Here we present CNVizard, a lightweight streamlit app which requires minimal computational knowledge, and which is compatible with widely used CNV processing tools (CNVkit and AnnotSV). CNVizard can process short- and long-read sequencing data and provides an intuitive webapp-like experience enabling an interactive visualization of CNV data.
Bioinformatics
What problem does this paper attempt to address?
The paper introduces CNVizard, a lightweight Streamlit application for interactive analysis of copy number variation (CNV) data. It attempts to address the following issues:

1. **Simplifying the CNV data analysis process**: Most existing open-source CNV tools deliver results in an unstructured, static format that requires further formatting and visualization work. CNVizard instead offers an intuitive web application interface that requires minimal computational knowledge.
2. **Supporting single-exon CNV detection**: Genetic testing requires reliable identification of single-exon variants. Some existing tools support single-exon CNV detection; CNVizard aims to make such calls easier to review through exon-level visualization.
3. **Improving data visualization**: Many existing tools lack comprehensive visualization, especially at the single-exon level. CNVizard generates box plots similar to those of MLPA/Coffalyser, so users can compare a sample directly against reference data.
4. **Providing comprehensive annotation**: To make CNV analysis faster and more reliable, CNVizard works with AnnotSV annotations, which help identify known pathogenic CNVs and CNVs already recorded in databases.
5. **Enabling family-based analysis**: Genetic testing and research often rely on family-based analysis, such as trio analysis. CNVizard supports these analyses to help identify new variants.

In summary, the paper presents CNVizard as a comprehensive CNV analysis tool. It aims to simplify CNV analysis through an easy-to-use interactive interface, support single-exon CNV review and visualization, provide comprehensive annotation, and support family-based analysis.
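The sample-versus-reference comparison behind an MLPA-style box plot can be illustrated with a short sketch. This is not CNVizard's implementation: the function name `flag_exons`, the input layout (per-exon log2 copy ratios held in plain dictionaries), and the 1.5 × IQR whisker rule are all assumptions chosen for illustration.

```python
from statistics import quantiles


def flag_exons(sample, reference, whisker=1.5):
    """Flag exons whose sample log2 ratio falls outside the reference whiskers.

    sample:    {exon_id: log2 ratio of the index sample}           (assumed layout)
    reference: {exon_id: [log2 ratios from the reference samples]} (assumed layout)
    whisker:   multiple of the interquartile range (assumed rule, as in a Tukey box plot)
    """
    flagged = {}
    for exon, value in sample.items():
        ref_values = reference.get(exon)
        # Skip exons with too few reference values to estimate quartiles.
        if not ref_values or len(ref_values) < 4:
            continue
        q1, _median, q3 = quantiles(ref_values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
        if value < lower:
            flagged[exon] = "loss"
        elif value > upper:
            flagged[exon] = "gain"
    return flagged


# Made-up values, purely for illustration.
reference = {
    "EX1": [-0.10, -0.05, 0.00, 0.05, 0.10],
    "EX2": [-0.10, -0.05, 0.00, 0.05, 0.10],
    "EX3": [-0.10, -0.05, 0.00, 0.05, 0.10],
}
sample = {"EX1": -1.0, "EX2": 0.02, "EX3": 0.6}
result = flag_exons(sample, reference)
print(result)
```

In this toy input, the reference whiskers span -0.2 to 0.2 for every exon, so only the exons whose sample value lies outside that range are reported. A real analysis would need thresholds validated for the assay in question.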