Imputation of plasma lipid species to facilitate integration of lipidomic datasets

Aleksandar Dakic, Jingqin Wu, Tingting Wang, Kevin Huynh, Natalie Mellett, Thy Duong, Habtamu B. Beyene, Dianna J. Magliano, Jonathan E. Shaw, Melinda J. Carrington, Michael Inouye, Jean Y. Yang, Gemma A. Figtree, Joanne E. Curran, John Blangero, John Simes, Corey Giles, Peter J. Meikle
DOI: https://doi.org/10.1038/s41467-024-45838-3
IF: 16.6
2024-02-21
Nature Communications
Abstract: Recent advancements in plasma lipidomic profiling methodology have significantly increased specificity and accuracy of lipid measurements. This evolution, driven by improved chromatographic and mass spectrometric resolution of newer platforms, has made it challenging to align datasets created at different times, or on different platforms. Here we present a framework for harmonising such plasma lipidomic datasets with different levels of granularity in their lipid measurements. Our method utilises elastic-net prediction models, constructed from high-resolution lipidomics reference datasets, to predict unmeasured lipid species in lower-resolution studies. The approach involves (1) constructing composite lipid measures in the reference dataset that map to less resolved lipids in the target dataset, (2) addressing discrepancies between aligned lipid species, (3) generating prediction models, (4) assessing their transferability into the target dataset, and (5) evaluating their prediction accuracy. To demonstrate our approach, we used the AusDiab population-based cohort (747 lipid species) as the reference to impute unmeasured lipid species into the LIPID study (342 lipid species). Furthermore, we compared measured and imputed lipids in terms of parameter estimation and predictive performance, and validated imputations in an independent study. Our method for harmonising plasma lipidomic datasets will facilitate model validation and data integration efforts.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the difficulty of integrating plasma lipidomics datasets generated at different times or on different platforms. As lipidomic profiling techniques have developed, newer platforms measure more lipid species with higher resolution and accuracy, so older and newer datasets differ substantially in measurement granularity. These differences make it hard to compare or integrate the datasets directly, especially when building risk prediction models or validating models across studies. The paper therefore proposes a framework that uses elastic-net prediction models, built from a high-resolution reference dataset, to predict lipid species that were not measured in a lower-resolution target dataset. This harmonizes the datasets and supports data integration and model validation. The method has five steps:

1. **Construct composite lipid measures**: Create composite lipid measures in the reference dataset that map to less-resolved lipids in the target dataset.
2. **Address discrepancies between lipid species**: Resolve differences between aligned lipid species.
3. **Generate prediction models**: Build prediction models from the high-resolution lipidomics reference dataset.
4. **Assess transferability**: Assess how well the models transfer into the target dataset.
5. **Evaluate prediction accuracy**: Evaluate the accuracy of the resulting predictions.

The authors demonstrate the approach by using the AusDiab cohort (747 lipid species) as the reference to impute unmeasured lipid species into the LIPID study (342 lipid species), and they validate the imputations in an independent study. In this way they aim to overcome the dataset incompatibility caused by technological progress, improving cross-study data integration and the generalizability of models.
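To make the steps above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the data are synthetic, the lipid names (`iso_a`, `iso_b`, `shared`, `unmeasured`) are hypothetical, and the penalty settings are arbitrary. It assumes that composites are formed by summing concentrations, that modelling happens on the log scale, and that a simple coordinate-descent elastic net stands in for whatever software the paper used. Discrepancy handling (step 2) and a formal transferability assessment (step 4) are left out.

```python
import math
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.01, alpha=0.5, n_iter=200):
    """Coordinate descent for the objective
    (1/2n)*||y - b0 - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2).
    X is a list of rows; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    # Centre predictors and response; the intercept is recovered at the end.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]
    residual = [yi - y_mean for yi in y]
    coef = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant predictor carries no information
            # Correlation of predictor j with the partial residual.
            rho = sum(v * (r + v * coef[j]) for v, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - v * delta for r, v in zip(residual, col)]
                coef[j] = new
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return intercept, coef


def predict(model, X):
    intercept, coef = model
    return [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


rng = random.Random(0)

# Synthetic high-resolution reference: two resolved isomers, one lipid shared
# with the target platform, and one lipid the target platform never measured.
reference = []
for _ in range(200):
    iso_a = math.exp(rng.gauss(0.0, 0.5))
    iso_b = math.exp(rng.gauss(0.0, 0.5))
    shared = math.exp(rng.gauss(0.0, 0.5))
    log_unmeasured = (0.8 * math.log(iso_a + iso_b)
                      + 0.5 * math.log(shared) + rng.gauss(0.0, 0.1))
    reference.append({"iso_a": iso_a, "iso_b": iso_b, "shared": shared,
                      "unmeasured": math.exp(log_unmeasured)})

# Step 1: sum the resolved isomers into the composite the target platform reports.
X_ref = [[math.log(r["iso_a"] + r["iso_b"]), math.log(r["shared"])] for r in reference]
y_ref = [math.log(r["unmeasured"]) for r in reference]

# Step 3: fit the prediction model on part of the reference data.
model = fit_elastic_net(X_ref[:150], y_ref[:150])

# Step 5: check accuracy on held-out reference samples.
held_out_r = pearson(predict(model, X_ref[150:]), y_ref[150:])
print("held-out correlation:", round(held_out_r, 3))

# Apply the model to a synthetic lower-resolution target dataset.
target = [{"composite": math.exp(rng.gauss(0.6, 0.4)),
           "shared": math.exp(rng.gauss(0.0, 0.5))} for _ in range(5)]
X_target = [[math.log(r["composite"]), math.log(r["shared"])] for r in target]
imputed = [math.exp(v) for v in predict(model, X_target)]
print("imputed concentrations:", [round(v, 3) for v in imputed])
```

The key design point is that the model is trained only on features the target platform can also supply: the composite is built in the reference data before fitting, so the fitted coefficients can be applied to the target data unchanged. In practice the penalty strength would be chosen by cross-validation rather than fixed as it is here.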