SEPDB: a database of secreted proteins

Ruiqing Wang,Chao Ren,Tian Gao,Hao Li,Xiaochen Bo,Dahai Zhu,Dan Zhang,Hebing Chen,Yong Zhang
DOI: https://doi.org/10.1093/database/baae007
2024-01-01
Database
Abstract: Detecting changes in the dynamics of secreted proteins in serum has been a challenge for proteomics. Enter secreted protein database (SEPDB), an integrated secretory proteomics database offering human, mouse and rat secretory proteomics datasets collected from serum, exosomes and cell culture media. SEPDB compiles secreted protein information from secreted protein database, UniProt and Human Protein Atlas databases to annotate secreted proteomics data based on protein subcellular localization and disease markers. SEPDB integrates the latest predictive modeling techniques to measure deviations in the distribution of signal peptide structures of secreted proteins, extends signal peptide sequence prediction by excluding transmembrane structural domain proteins and updates the validation analysis pipeline for secreted proteins. To establish tissue-specific profiles, we have also created secreted proteomics datasets associated with different human tissues. In addition, we provide information on heterogeneous receptor network organizational relationships, reflective of the complex functional information inherent in the molecular structures of secreted proteins that serve as ligands. Users can take advantage of the Refreshed Search, Analyze, Browse and Download functions of SEPDB, which is available online at https://sysomics.com/SEPDB/. Database URL: https://sysomics.com/SEPDB/
mathematical & computational biology
What problem does this paper attempt to address?
This paper introduces SEPDB (Secreted Protein Database), a database that integrates secreted protein data from many sources into a single platform for research on these proteins. Specifically, the paper addresses the following issues:

1. **Data Integration and Standardization**: Secreted protein data from different sources are inconsistent and follow different standards; SEPDB harmonizes them so that they can be compared.
2. **Diversity Handling**: Secreted proteins take many forms, including antibodies, digestive enzymes, exosomes, etc.; SEPDB identifies and classifies them using validation against literature data together with model prediction.
3. **Network Information Extraction**: Secreted proteins act as ligands that bind specifically to receptors on target cells, forming complex networks; the paper explores how to extract network-level information from their tissue-specific dynamics.
4. **Signal Peptide Prediction and Membrane Domain Identification**: Signal peptide prediction methods are updated, and signal peptides are distinguished from transmembrane domains to improve prediction accuracy.
5. **Cross-Species Data Integration**: Protein data from human, mouse and rat are handled together, reducing redundancy and supporting cross-referencing between species.

SEPDB addresses these challenges by collecting and integrating data from multiple sources (experimentally validated data, predicted data, and information from other databases) and by applying current prediction models. The database also provides information on how the expression of secreted proteins changes under different physiological states, including changes related to aging and exercise, and on how these changes relate to disease.
Users can access the data in the database through search, browse, and download functions, and utilize visualization tools to further analyze the functions and mechanisms of specific proteins.
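The filtering idea behind signal peptide prediction with transmembrane exclusion (keep proteins that have a predicted signal peptide, drop those that also carry a predicted transmembrane domain) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `ProteinCall` record, its field names, the accessions, and the prediction values are hypothetical placeholders, not SEPDB's actual schema, data, or pipeline, and the predictor outputs are simply taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinCall:
    """Hypothetical per-protein prediction record (not SEPDB's schema)."""
    accession: str            # placeholder identifier
    species: str              # "human", "mouse" or "rat"
    has_signal_peptide: bool  # assumed output of a signal peptide predictor
    tm_domain_count: int      # assumed output of a transmembrane predictor


def candidate_secreted(calls):
    """Keep proteins with a predicted signal peptide and no predicted
    transmembrane domain; everything else is excluded."""
    return [c for c in calls if c.has_signal_peptide and c.tm_domain_count == 0]


# Made-up example records, one per outcome of the filter.
calls = [
    ProteinCall("EX0001", "human", True, 0),   # kept
    ProteinCall("EX0002", "human", True, 2),   # dropped: transmembrane domains
    ProteinCall("EX0003", "mouse", False, 0),  # dropped: no signal peptide
    ProteinCall("EX0004", "rat", True, 0),     # kept
]

kept = candidate_secreted(calls)
print([c.accession for c in kept])
```

In this sketch the two predictions are combined with a simple logical AND; a real pipeline would likely also weigh prediction confidence and experimental evidence, which are omitted here.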