PredPSP: a novel computational tool to discover pathway-specific photosynthetic proteins in plants

Prabina Kumar Meher, Upendra Kumar Pradhan, Padma Lochan Sethi, Sanchita Naha, Ajit Gupta, Rajender Parsad
DOI: https://doi.org/10.1007/s11103-024-01500-6
2024-09-26
Plant Molecular Biology
Abstract: Photosynthetic proteins play a crucial role in agricultural productivity by harnessing light energy for plant growth. Understanding these proteins, especially within the C3 and C4 pathways, holds promise for improving crops in challenging environments. Despite existing models, a comprehensive computational framework specifically targeting plant photosynthetic proteins is lacking. Plant datasets remain underused in computational algorithms, a gap this study aims to fill by introducing a novel sequence-based computational method for identifying these proteins. The study covered diverse plant species to ensure representation across the C3 and C4 pathways. Using six deep learning models and seven shallow learning algorithms, paired with six sequence-derived feature sets and a feature selection strategy, this study developed a model for predicting plant-specific photosynthetic proteins. In 5-fold cross-validation, LightGBM with 65 and 90 LGBM-VIM-selected features emerged as the best model for C3 (auROC: 91.78%, auPRC: 92.55%) and C4 (auROC: 99.05%, auPRC: 99.18%) plants, respectively. Validation on an independent dataset confirmed the robustness of the proposed model for both the C3 (auROC: 87.23%, auPRC: 88.40%) and C4 (auROC: 92.83%, auPRC: 92.29%) categories. Comparison with existing methods showed that the proposed model is superior in predicting plant-specific photosynthetic proteins. This study also established a free online prediction server, PredPSP (https://iasri-sg.icar.gov.in/predpsp/), to support ongoing efforts to identify photosynthetic proteins in C3 and C4 plants. As the first study of its kind, it offers valuable insights into predicting plant-specific photosynthetic proteins, with significant implications for plant biology.
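To make two ideas in the abstract concrete, the sketch below shows one way a sequence-derived feature vector and the auROC metric could be computed. It is an illustration only, not the PredPSP implementation: the abstract does not name the six feature sets, so amino acid composition is assumed here as a typical example, and the function names `amino_acid_composition` and `au_roc` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid (20 values).

    Assumption: this is one plausible sequence-derived feature set; the
    abstract does not say which six feature sets the authors used.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def au_roc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 class labels; scores: predicted scores.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("auROC needs at least one positive and one negative example")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)
```

In a pipeline like the one the abstract describes, feature vectors of this kind would be passed to a classifier such as LightGBM, and the held-out scores from each cross-validation fold would be evaluated with a metric like `au_roc`.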
biochemistry & molecular biology, plant sciences