Comparative Transcriptome Analysis of Bovine, Porcine, and Sheep Muscle Using Interpretable Machine Learning Models

Yaqiang Guo, Shuai Li, Rigela Na, Lili Guo, Chenxi Huo, Lin Zhu, Caixia Shi, Risu Na, Mingjuan Gu, Wenguang Zhang
DOI: https://doi.org/10.3390/ani14202947
2024-10-12
Abstract: The growth and development of muscle tissue play a pivotal role in the economic value and quality of meat from agricultural animals and have drawn close attention from breeders and researchers. The quality and palatability of muscle tissue directly determine the market competitiveness of meat products and consumer satisfaction. A thorough understanding and management of muscle growth is therefore essential for improving the economic efficiency and product quality of the meat industry. Nevertheless, systematic research on muscle development-related genes across species remains limited. This study addresses this gap through extensive cross-species muscle transcriptome analysis combined with interpretable machine learning models. We used a dataset of 275 publicly available transcriptomes from porcine (n = 113), bovine (n = 94), and ovine (n = 68) muscle tissues, covering ten distinct muscle types such as the semimembranosus and longissimus dorsi. We employed nine machine learning models, such as Support Vector Classifier (SVC) and Support Vector Machine (SVM), and applied the SHapley Additive exPlanations (SHAP) method to the muscle transcriptome data of cattle, pigs, and sheep. The optimal model, adaptive boosting (AdaBoost), identified key genes potentially influencing muscle growth and development across the three species, termed SHAP genes. Among these, 41 genes (including NANOG, ADAMTS8, LHX3, and TLR9) were consistently expressed in all three species and were designated as homologous genes. Species-specific candidate genes were SLC47A1, IGSF1, IRF4, EIF3F, CGAS, ZSWIM9, RROB1, and ABHD18 for cattle; DRP2 and COL12A1 for pigs; and only COL10A1 for sheep.
Analysis of the SHAP genes using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways identified relevant pathways, such as ether lipid metabolism, cortisol synthesis and secretion, and calcium signaling, pointing to their pivotal roles in muscle growth and development.
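The SHAP method mentioned above attributes a model's prediction to individual input features (here, genes) using Shapley values. The following is a minimal sketch of that idea only, not the authors' pipeline: it computes exact Shapley values for a toy scoring function over three made-up features. The names `shapley_values`, `toy_score`, `GENE_A`, `GENE_B`, and `GENE_C`, as well as all numeric values, are hypothetical and chosen purely for illustration; a real analysis would use a trained classifier and an approximation suited to thousands of genes.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample relative to a baseline.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features, so this
    is only practical for a handful of them.
    """
    features = list(sample)
    n = len(features)
    values = {}
    for target in features:
        others = [f for f in features if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: (sample[f] if f in coalition else baseline[f])
                    for f in features
                }
                with_target = dict(without)
                with_target[target] = sample[target]
                # Marginal contribution of the target feature
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


# Hypothetical toy model standing in for a trained classifier's score.
def toy_score(x):
    return 3.0 * x["GENE_A"] + x["GENE_B"] * x["GENE_C"]


# Hypothetical expression values for one sample and a reference baseline.
sample = {"GENE_A": 3.0, "GENE_B": 2.0, "GENE_C": 4.0}
baseline = {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 0.0}

phi = shapley_values(toy_score, sample, baseline)

# Rank features by absolute attribution, largest first.
ranked = sorted(phi, key=lambda g: abs(phi[g]), reverse=True)
print(ranked, phi)
```

In this sketch the attributions sum to the difference between the score for the sample and the score for the baseline, which is the additivity property that makes such rankings interpretable.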