Abstract: Data quality in global metabolomics is of great importance for biomarker discovery and systems biology studies. However, comprehensive metrics and methods to evaluate and compare the data quality of global metabolomics data sets are lacking. In this work, we combine newly developed metrics with well-known measures to comprehensively and quantitatively characterize data quality across two similar liquid chromatography coupled with mass spectrometry (LC–MS) platforms, with the goal of providing a more efficient way to evaluate data quality in global metabolite profiling experiments. A pooled human serum sample was run 50 times on two high-resolution LC-QTOF-MS platforms to provide profile and centroid MS data. These data were processed using Progenesis QI software and then analyzed using five data quality measures: retention time drift, the number of compounds detected, missing values, and MS reproducibility (two measures). The distribution of detected compounds versus compound abundance was fit to a γ distribution, which was normalized to allow comparison of different platforms. To evaluate missing values, characteristic curves were obtained by plotting the compound detection percentage versus extraction frequency. To characterize reproducibility, the accumulative coefficient of variation (CV) versus the percentage of total compounds detected and the intraclass correlation coefficient (ICC) versus compound abundance were investigated. Key findings include significantly better performance using profile mode data compared to centroid mode data, as well as quantitatively better performance from the newer, higher-resolution instrument. A summary table gives a snapshot of the experimental results and provides a template to evaluate the global metabolite profiling workflow. In total, these measures give a good overall view of data quality in global profiling and allow comparisons of data acquisition strategies and platforms as well as optimization of parameters.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.analchem.0c01493. Flow chart to process the raw LC–MS data (Figure S1), detected compounds and missing values versus compound abundance (Figure S2), the Pearson correlation coefficient versus compound abundance (Figure S3), missing-value performance for the reduced number of samples (Figure S4), detected compounds and missing values versus compound abundance for five QCs (Figure S5), the percentage of compounds versus CV (Figure S6), the 3-D plot of same versus log10(abundance) for five QCs, ICC values versus the percentage of compounds, the 3-D plot of same versus CV for five QCs (Figure S7), detected compounds and missing values versus compound abundance for five QCs (Figure S8), and the Pearson correlation coefficient versus compound abundance for five QCs (Figure S9) (PDF).
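As a rough illustration of two of the measures named in the abstract (missing values and CV-based reproducibility), the following minimal Python sketch computes a per-compound detection percentage, a per-compound CV, and a cumulative CV curve from repeated injections of one sample. It is not the authors' implementation: the function names, the toy intensity table, the use of `None` for a missing value, and the choice to express the cumulative curve relative to compounds with a defined CV are all assumptions made for the example.

```python
from statistics import mean, stdev


def detection_percentage(intensities):
    """Percentage of injections in which the compound was detected (non-missing)."""
    detected = sum(1 for x in intensities if x is not None and x > 0)
    return 100.0 * detected / len(intensities)


def coefficient_of_variation(intensities):
    """CV in percent over detected values; None if fewer than two detections."""
    values = [x for x in intensities if x is not None and x > 0]
    if len(values) < 2:
        return None
    return 100.0 * stdev(values) / mean(values)


def cumulative_cv_curve(cvs, thresholds):
    """For each threshold, percentage of compounds (with a defined CV) at or below it."""
    valid = [c for c in cvs if c is not None]
    if not valid:
        return [(t, 0.0) for t in thresholds]
    return [(t, 100.0 * sum(1 for c in valid if c <= t) / len(valid))
            for t in thresholds]


# Hypothetical toy table: compound -> intensities across four repeat injections.
table = {
    "compound_A": [100.0, 102.0, 98.0, 100.0],
    "compound_B": [50.0, None, 70.0, 30.0],
    "compound_C": [10.0, None, None, None],
}

detection = {name: detection_percentage(v) for name, v in table.items()}
cvs = {name: coefficient_of_variation(v) for name, v in table.items()}
curve = cumulative_cv_curve(list(cvs.values()), thresholds=[5, 20, 50])

print(detection)  # per-compound detection percentage
print(cvs)        # per-compound CV in percent (None where undefined)
print(curve)      # (CV threshold, percent of compounds at or below it)
```

Plotting the curve for two platforms or acquisition modes on the same axes gives the kind of side-by-side reproducibility comparison the abstract describes; the ICC and γ-distribution fits are left out here because the abstract does not specify their exact form.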

NOREVA: normalization and evaluation of MS-based metabolomics data

NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data

Optimization of metabolomic data processing using NOREVA

Performance Evaluation and Online Realization of Data-driven Normalization Methods Used in LC/MS based Untargeted Metabolomics Analysis

Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics

OpenNAU: an Open-Source Platform for Normalizing, Analyzing, and Visualizing Untargeted Metabolomics Data

A Novel Bioinformatics Approach to Identify the Consistently Well-Performing Normalization Strategy for Current Metabolomic Studies

EigenRF: an improved metabolomics normalization method with scores for reproducibility evaluation on importance rankings of differential metabolites

Development and Validation of an Improved Probabilistic Quotient Normalization Method for LC/MS- and NMR-based Metabonomic Analysis

Normalization Approach by a Reference Material to Improve LC-MS-Based Metabolomic Data Comparability of Multibatch Samples.

OpenNAU: An open-source platform for normalizing, analyzing, and visualizing cancer untargeted metabolomics data

MetaboAnalystR 4.0: a unified LC-MS workflow for global metabolomics

Evaluation of normalization strategies for GC-based metabolomics

MetaNorm: incorporating meta-analytic priors into normalization of NanoString nCounter data

Evaluating Cross-Platform Normalization Methods for Integrated Microarray and RNA-seq Data Analysis

Optimal Normalization Method for GC-MS/MS-Based Large-Scale Targeted Metabolomics

Evaluating Protocols for Reproducible Targeted Metabolomics by NMR

NormAE: Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data.

metaX: a flexible and comprehensive software for processing metabolomics data

Five Easy Metrics of Data Quality for LC–MS-Based Global Metabolomics

MetaboAnalystR 3.0: Toward an Optimized Workflow for Global Metabolomics