Optimization of metabolomic data processing using NOREVA
Jianbo Fu, Ying Zhang, Yunxia Wang, Hongning Zhang, Jin Liu, Jing Tang, Qingxia Yang, Huaicheng Sun, Wenqi Qiu, Yinghui Ma, Zhaorong Li, Mingyue Zheng, Feng Zhu
DOI: https://doi.org/10.1038/s41596-021-00636-9
IF: 14.8
2021-12-24
Nature Protocols
Abstract: A typical output of a metabolomic experiment is a peak table corresponding to the intensity of measured signals. Peak table processing, an essential procedure in metabolomics, is characterized by its study dependency and combinatorial diversity. While various methods and tools have been developed to facilitate metabolomic data processing, it is challenging to determine which processing workflow will give good performance for a specific metabolomic study. NOREVA, an out-of-the-box protocol, was therefore developed to meet this challenge. First, the peak table is subjected to many processing workflows that consist of three to five defined calculations in combinatorially determined sequences. Second, the results of each workflow are judged against objective performance criteria. Third, various benchmarks are analyzed to highlight the uniqueness of this newly developed protocol in (1) evaluating the processing performance based on multiple criteria, (2) optimizing data processing by scanning thousands of workflows, and (3) allowing data processing for time-course and multiclass metabolomics. This protocol is implemented in an R package for convenient accessibility and to protect users’ data privacy. Prior experience with the R language would facilitate use of this protocol, and the execution time may vary from several minutes to a couple of hours depending on the size of the analyzed data.
biochemical research methods
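
The workflow-scanning idea in the abstract (run a peak table through many combinations of processing steps, then rank the results against a performance criterion) can be illustrated with a minimal Python sketch. This is not NOREVA's code or API: the toy peak table, the function names, the choice of three stages with two methods each, and the single criterion (mean coefficient of variation across features, treating all samples as replicates) are assumptions made purely for illustration. The protocol itself is an R package that scans thousands of workflows against multiple criteria.

```python
import itertools
import math
import statistics

# Hypothetical toy peak table: rows are samples, columns are metabolite
# features, and None marks a missing intensity. Not data from the paper.
PEAK_TABLE = [
    [120.0, 15.0, None, 980.0],
    [130.0, 14.0, 45.0, 1010.0],
    [110.0, None, 50.0, 950.0],
    [125.0, 16.0, 48.0, 1000.0],
]


def observed(table, j):
    """Return the non-missing values of column j."""
    return [row[j] for row in table if row[j] is not None]


def fill_missing(table, fills):
    """Replace each missing value with the fill value for its column."""
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in table]


def impute_half_min(table):
    fills = [min(observed(table, j)) / 2 for j in range(len(table[0]))]
    return fill_missing(table, fills)


def impute_mean(table):
    fills = [statistics.mean(observed(table, j)) for j in range(len(table[0]))]
    return fill_missing(table, fills)


def no_transform(table):
    return [row[:] for row in table]


def log2_transform(table):
    return [[math.log2(v) for v in row] for row in table]


def sum_normalize(table):
    return [[v / sum(row) for v in row] for row in table]


def median_normalize(table):
    return [[v / statistics.median(row) for v in row] for row in table]


# One dict per stage; a workflow picks one method from each stage, in order.
STAGES = [
    {"half_min": impute_half_min, "mean": impute_mean},
    {"none": no_transform, "log2": log2_transform},
    {"sum": sum_normalize, "median": median_normalize},
]


def mean_cv(table):
    """Stand-in criterion: mean coefficient of variation over features.

    Assumes every sample is a replicate, so lower spread counts as better.
    """
    cvs = []
    for j in range(len(table[0])):
        col = [row[j] for row in table]
        cvs.append(statistics.stdev(col) / abs(statistics.mean(col)))
    return statistics.mean(cvs)


def scan_workflows(table):
    """Run every combination of stage methods and rank by the criterion."""
    results = []
    for combo in itertools.product(*(stage.items() for stage in STAGES)):
        processed = table
        for _name, step in combo:
            processed = step(processed)
        label = " > ".join(name for name, _ in combo)
        results.append((label, mean_cv(processed)))
    return sorted(results, key=lambda item: item[1])


ranked = scan_workflows(PEAK_TABLE)
for label, score in ranked:
    print(f"{label}: {score:.4f}")
```

With two methods at each of three stages, the sketch evaluates 2 × 2 × 2 = 8 workflows. Adding stages or methods grows the count multiplicatively, which is why an exhaustive scan of realistic method sets reaches thousands of workflows.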