Machine learning based identification potential feature genes for prediction of drug efficacy in nonalcoholic steatohepatitis animal model

Marwa Matboli, Ibrahim Abdelbaky, Abdelrahman Khaled, Radwa Khaled, Shaimaa Hamady, Laila M Farid, Mariam B Abouelkhair, Noha E El-Attar, Mohamed Farag Fathallah, Manal S Abd El Hamid, Gena M Elmakromy, Marwa Ali
DOI: https://doi.org/10.1186/s12944-024-02231-9
2024-08-24
Abstract: Background: Nonalcoholic steatohepatitis (NASH) results from complex liver conditions involving metabolic, inflammatory, and fibrogenic processes. Despite its burden, no Food and Drug Administration (FDA)-approved therapy has been available to date.
Purpose: Using machine learning (ML) algorithms, the study aims to identify reliable candidate genes that accurately predict treatment response in a NASH animal model, using biochemical and molecular markers retrieved through bioinformatics techniques.
Methods: NASH-induced rat models were administered various microbiome-targeted therapies and herbal drugs for 12 weeks; these drugs reduced hepatic lipid accumulation, liver inflammation, and histopathological changes. The ML model was trained and tested on the histopathological NASH score (HPS), with an HPS of 0-4 considered improved NASH and an HPS of 5-8 considered non-improved, as confirmed by histopathological examination of the rats' livers. The model incorporates 34 features, comprising 20 molecular markers (mRNAs, microRNAs, and long non-coding RNAs) and 14 biochemical markers, that are highly enriched in NASH pathogenesis. Six different ML models were used to predict NASH improvement, with Gradient Boosting showing the highest accuracy (98%) in predicting NASH drug response.
Findings: After gradual feature reduction, the Random Forest classifier performed best, with an accuracy of 98.4%. The principal selected molecular features were YAP1, LATS1, NF2, SRD5A3-AS1, FOXA2, TEAD2, miR-650, MMP14, ITGB1, and miR-6881-5P, while the selected biochemical markers were triglycerides (TG), ALT, ALP, total bilirubin (T. Bilirubin), alpha-fetoprotein (AFP), and low-density lipoprotein cholesterol (LDL-C).
Conclusion: This study introduced an ML model incorporating 16 noninvasive features, including molecular and biochemical signatures, which achieved high performance and accuracy in detecting NASH improvement. This model could potentially be used as a diagnostic tool and to identify targeted therapies.
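The labeling rule stated in the Methods (HPS 0-4 counted as improved, 5-8 as non-improved) and the accuracy metric reported for the classifiers can be illustrated with a minimal sketch. This is not the authors' code: the function names `label_from_hps` and `accuracy`, the constant names, and the example scores are hypothetical and chosen only for illustration; only the 0-4 / 5-8 thresholds come from the abstract.

```python
from typing import List

# Thresholds taken from the abstract: HPS 0-4 = improved, HPS 5-8 = non-improved.
IMPROVED_MAX_HPS = 4
VALID_HPS = range(0, 9)  # integer scores 0..8 inclusive


def label_from_hps(hps: int) -> str:
    """Map a histopathological NASH score to the binary class label."""
    if hps not in VALID_HPS:
        raise ValueError(f"HPS must be an integer from 0 to 8, got {hps!r}")
    return "improved" if hps <= IMPROVED_MAX_HPS else "non-improved"


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Hypothetical scores, not data from the study.
    example_scores = [0, 4, 5, 8]
    labels = [label_from_hps(s) for s in example_scores]
    print(labels)
    # A hypothetical classifier output, used only to exercise accuracy().
    predicted = ["improved", "improved", "improved", "non-improved"]
    print(accuracy(labels, predicted))
```

In a setup like the one the abstract describes, labels produced this way would serve as the target variable, and a value such as the reported 98.4% would correspond to this accuracy computed on held-out animals.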