A directed learning strategy integrating multiple omic data improves genomic prediction

Xuehai Hu, Weibo Xie, Chengchao Wu, Shizhong Xu
DOI: https://doi.org/10.1111/pbi.13117
IF: 13.8
2019-04-14
Plant Biotechnology Journal
Abstract: Genomic prediction (GP) aims to construct a statistical model for predicting phenotypes using genome‐wide markers and is a promising strategy for accelerating molecular plant breeding. However, current progress of phenotype prediction using genomic data alone has reached a bottleneck, and previous studies on transcriptomic and metabolomic predictions ignored genomic information. Here we designed a novel strategy of GP called multi‐layered least absolute shrinkage and selection operator (MLLASSO) by integrating multiple omic data into a single model that iteratively learns three layers of genetic features (GFs) supervised by observed transcriptome and metabolome. Significantly, MLLASSO learns higher order information of gene interactions, which enables us to achieve a significant improvement of predictability of yield in rice from 0.1588 (GP alone) to 0.2451 (MLLASSO). In the prediction of the first two layers, some genes were found to be genetically predictable genes (GPGs) as their expressions were accurately predicted with genetic markers. Interestingly, we made three dramatic discoveries for the GPGs: (1) GPGs are good predictors for highly‐complex traits like yield; (2) GPGs are mostly eQTL genes (cis or trans); (3) trait‐related transcriptional factor families are enriched in GPGs. These findings support the notion that learned GFs not only are good predictors for traits but also have specific biological implications regarding regulation of gene expressions. To differentiate the new method from conventional GP models, we called MLLASSO a directed learning strategy supervised by intermediate omic data. This new prediction model appears to be more reliable and more robust than conventional GP models.
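The layered idea in the abstract can be illustrated with a minimal sketch. This is one possible reading, not the authors' implementation: the abstract does not specify how the three layers are wired, so the chain assumed here (markers to transcripts, predicted transcripts to metabolites, predicted metabolites to trait) is an assumption, as are all function names (`lasso_fit`, `directed_fit`, `directed_predict`, `pearson`), the single shared penalty `lam`, and the use of plain coordinate descent.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_fit(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1.

    No intercept is fitted, so inputs are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no signal
            # Covariance of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def columns(M):
    return [list(c) for c in zip(*M)]


def fit_layer(X, Y, lam):
    """Fit one LASSO per column of Y; returns a list of coefficient vectors."""
    return [lasso_fit(X, col, lam) for col in columns(Y)]


def apply_layer(X, betas):
    """Predicted features: rows are samples, columns are targets."""
    return [list(r) for r in zip(*[predict(X, b) for b in betas])]


def directed_fit(markers, transcripts, metabolites, trait, lam):
    # Layer 1 (assumed): markers -> transcripts, supervised by observed expression.
    b1 = fit_layer(markers, transcripts, lam)
    gf1 = apply_layer(markers, b1)
    # Layer 2 (assumed): predicted transcripts -> metabolites.
    b2 = fit_layer(gf1, metabolites, lam)
    gf2 = apply_layer(gf1, b2)
    # Layer 3 (assumed): predicted metabolites -> trait.
    b3 = lasso_fit(gf2, trait, lam)
    return b1, b2, b3


def directed_predict(markers, model):
    """Predict the trait from markers alone, passing through the learned layers."""
    b1, b2, b3 = model
    return predict(apply_layer(apply_layer(markers, b1), b2), b3)


def pearson(a, b):
    """Pearson correlation, used here as a simple agreement measure."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5
```

In this sketch the transcriptome and metabolome are needed only during fitting; `directed_predict` takes markers alone, which is what lets the intermediate omic layers act as supervision rather than as extra inputs at prediction time. In practice the penalty would be tuned per layer by cross-validation, and an established solver would replace the hand-written coordinate descent.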
biotechnology & applied microbiology, plant sciences
What problem does this paper attempt to address?