Autoprot: Processing, Analysis and Visualization of Proteomics Data in Python

Julian Bender, Wignand W. D. Mühlhäuser, Johannes P. Zimmerman, Friedel Drepper, Bettina Warscheid
DOI: https://doi.org/10.1101/2024.01.18.571429
2024-01-23
Abstract: The increasing numbers of complex quantitative mass spectrometry-based proteomics data sets demand a standardised and reliable analysis pipeline. For this purpose, Python-based analysis, particularly through Jupyter notebooks, serves as a simple yet powerful tool. Nevertheless, the availability of Python software for standardised and accessible MS data analysis is limited, and this software is often constrained to using analysis functions written in Python. This excludes existing and well-tested software, for example written in R. Despite this, Python offers several interactive data visualisation modules that greatly enhance exploratory research and facilitate result communication with collaboration partners. Consequently, there is a need for an integrated and Jupyter-compatible Python analysis pipeline that incorporates R algorithms and interactive visualization for proteomics data analysis.
Bioinformatics
What problem does this paper attempt to address?
Mass spectrometry-based proteomics data sets are growing in number and complexity, and they need a standardized, reliable analysis workflow. Python, especially in Jupyter notebooks, is a simple yet powerful environment for this kind of analysis, but the available Python software for standardized and accessible MS data analysis is limited. It is also usually restricted to analysis functions written in Python, which excludes established, well-tested software written in R. Python does offer several interactive data visualization modules, yet there has been no integrated, Jupyter-compatible analysis pipeline that combines R algorithms with interactive visualization. To fill this gap, the authors developed **autoprot**, a Python module that simplifies the analysis of quantitative mass spectrometry proteomics experiments processed with MaxQuant. It gives access to established functions written in both Python and R for statistical testing and data transformation, and it generates JavaScript-based interactive charts that can be embedded in interactive web applications. The aim is standardized, fast, and reliable data analysis that remains customizable enough to adapt the workflow to specific experimental needs.
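To make the kind of workflow described above concrete, the following is a minimal sketch of the first steps such a pipeline typically performs on a MaxQuant protein table: filtering decoy and contaminant entries, log2-transforming ratios, and computing a one-sample t statistic per protein. It does not use autoprot and does not reflect autoprot's API. The column names follow MaxQuant's `proteinGroups.txt` conventions, but the table contents, the `rep1` to `rep3` experiment names, and the helper functions `load_filtered_log_ratios` and `one_sample_t` are assumptions made for illustration. Only the Python standard library is used so that the example runs anywhere.

```python
import csv
import io
import math
import statistics

# Hypothetical excerpt of a tab-separated MaxQuant proteinGroups.txt table.
# Column names follow MaxQuant conventions; all values are invented.
RAW = (
    "Protein IDs\tReverse\tPotential contaminant\t"
    "Ratio H/L normalized rep1\tRatio H/L normalized rep2\t"
    "Ratio H/L normalized rep3\n"
    "P12345\t\t\t2.0\t4.0\t8.0\n"
    "P67890\t\t\t1.0\t0.5\t2.0\n"
    "REV__P11111\t+\t\t1.5\t1.4\t1.6\n"   # decoy hit, flagged with "+"
    "CON__P00761\t\t+\t0.1\t0.2\t0.1\n"   # contaminant, flagged with "+"
    "Q99999\t\t\t0.25\tNaN\t0.5\n"        # one missing quantification
)


def load_filtered_log_ratios(text, ratio_prefix="Ratio H/L normalized "):
    """Return {protein ID: [log2 ratios]} without decoys and contaminants."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    result = {}
    for row in reader:
        # MaxQuant marks decoy and contaminant entries with a "+" sign.
        if row["Reverse"] == "+" or row["Potential contaminant"] == "+":
            continue
        values = []
        for column, cell in row.items():
            if column.startswith(ratio_prefix) and cell:
                ratio = float(cell)  # "NaN" parses to a float NaN
                if not math.isnan(ratio) and ratio > 0:
                    values.append(math.log2(ratio))
        result[row["Protein IDs"]] = values
    return result


def one_sample_t(values):
    """One-sample t statistic against a mean log2 ratio of zero.

    Returns None when the statistic is undefined (fewer than two
    values, or zero variance).
    """
    if len(values) < 2:
        return None
    sd = statistics.stdev(values)
    if sd == 0:
        return None
    return statistics.mean(values) / (sd / math.sqrt(len(values)))


if __name__ == "__main__":
    for protein, log_ratios in load_filtered_log_ratios(RAW).items():
        print(protein, log_ratios, one_sample_t(log_ratios))
```

In a real analysis the table would normally be held in a DataFrame, a p-value would be derived from the t statistic and corrected for multiple testing, and the results would be shown in an interactive volcano plot. Those steps are omitted here to keep the sketch free of third-party dependencies.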