GeneSetCluster 2.0: a comprehensive toolset for summarizing and integrating gene-sets analysis
Asier Ortega-Legarreta, Alberto Maillo, Daniel Mouzo, Ana Rosa Lopez-Perez, Lara Kular, Majid Pahlevan Kakhki, Jesper Tegner, Maja Jagodic, Vincenzo Lagani, Ewoud Ewing, David Gomez-Cabrero
DOI: https://doi.org/10.1101/2024.12.18.629178
2024-12-22
Abstract:
Background
GeneSetCluster is an R package designed to summarize and integrate results from multiple gene-set analyses (GSA), regardless of the specific method used, as long as gene-sets are the primary focus. GeneSetCluster clusters gene-sets based on shared genes, providing a high-level overview of the biological pathways involved and thereby facilitating interpretation. However, the original version was limited in how it managed redundancies across multiple gene-sets. Additionally, programmatic tools like GeneSetCluster can be challenging for users with limited coding expertise.
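The idea of clustering gene-sets by shared genes can be illustrated with a minimal Python sketch. It is not the GeneSetCluster API or its actual algorithm: the function names (`jaccard`, `cluster_gene_sets`), the Jaccard similarity measure, the 0.3 threshold, and the toy gene-sets are all assumptions chosen for illustration. Gene-sets whose overlap reaches the threshold are placed in the same group.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two gene collections (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_gene_sets(gene_sets, threshold=0.3):
    """Group gene-sets whose pairwise Jaccard similarity reaches `threshold`.

    Illustrative single-linkage grouping with a small union-find structure;
    the similarity measure and threshold are assumptions, not package defaults.
    """
    names = list(gene_sets)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of gene-sets that share enough genes.
    for x, y in combinations(names, 2):
        if jaccard(gene_sets[x], gene_sets[y]) >= threshold:
            parent[find(x)] = find(y)

    # Collect members per representative and return them in a stable order.
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy input: gene-set name -> member genes.
example_sets = {
    "interferon_signaling": {"STAT1", "IRF7", "ISG15", "MX1"},
    "antiviral_response": {"STAT1", "IRF7", "ISG15", "OAS1"},
    "cell_cycle": {"CDK1", "CCNB1", "MKI67"},
    "mitosis": {"CDK1", "CCNB1", "PLK1"},
}
clusters = cluster_gene_sets(example_sets, threshold=0.3)
print(clusters)
```

In this toy input the two immune-related gene-sets share three of five genes and the two proliferation-related gene-sets share two of four, so each pair forms its own group, which gives the kind of high-level overview described above.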
Results
We introduce GeneSetCluster 2.0, which substantially improves upon its predecessor. This update presents a new methodology for handling duplicated gene-sets and incorporates a seriation-based clustering algorithm that reorders the data to reveal patterns. The updated version uses parallelization to reduce execution times. Version 2.0 also enhances cluster annotation by identifying relevant tissues and biological processes associated with each cluster. Finally, to improve accessibility and usability, we have developed a user-friendly web application, making GeneSetCluster 2.0 suitable for a broader audience, including those without programming expertise. We also ensured interoperability between the R package, aimed at users with programming knowledge, and the web application, aimed at all other users.
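To make the duplicated gene-set problem concrete, the following Python sketch shows one possible way to collapse gene-sets that appear under the same identifier in several analyses. The abstract does not describe the method used by GeneSetCluster 2.0, so this is only an assumption: the function name `merge_duplicated_gene_sets`, the union-of-genes rule, and the toy identifiers are hypothetical.

```python
def merge_duplicated_gene_sets(analyses):
    """Collapse gene-sets that share an identifier across analyses.

    Assumption for illustration only: duplicates are merged by taking the
    union of their genes and recording which analyses reported them.
    """
    merged = {}
    for analysis_name, gene_sets in analyses.items():
        for set_id, genes in gene_sets.items():
            entry = merged.setdefault(set_id, {"genes": set(), "sources": []})
            entry["genes"] |= set(genes)
            entry["sources"].append(analysis_name)
    return merged


# Hypothetical toy input: analysis name -> {gene-set id -> genes reported}.
example_analyses = {
    "analysis_A": {"GS_1": {"STAT1", "IRF7"}, "GS_2": {"CDK1", "CCNB1"}},
    "analysis_B": {"GS_1": {"STAT1", "ISG15"}},
}
merged_sets = merge_duplicated_gene_sets(example_analyses)
for set_id, entry in sorted(merged_sets.items()):
    print(set_id, sorted(entry["genes"]), entry["sources"])
```

Here `GS_1` is reported by both analyses and ends up as a single entry holding the genes from both, while `GS_2` is unchanged. Other merging rules, such as keeping only the intersection of genes, would be equally easy to express in the same structure.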
Conclusion
GeneSetCluster 2.0 significantly improves the original version by providing a more comprehensive, intuitive, and efficient exploration of GSA results. In addition, it bridges the gap between bioinformaticians and clinicians in multidisciplinary teams. Both the web version and the R package of GeneSetCluster 2.0, along with detailed installation and usage documentation, are available on GitHub at https://github.com/TranslationalBioinformaticsUnit/GeneSetCluster2.0. The web application can be accessed at https://translationalbio.shinyapps.io/genesetcluster/.
Bioinformatics