TCRosetta: an integrated analysis and annotation platform for T-cell receptor sequences

Tao Yue, Si-Yi Chen, Wen-Kang Shen, Zhan-Ye Zhang, Liming Cheng, An-Yuan Guo
DOI: https://doi.org/10.1093/gpbjnl/qzae013
2024-02-08
Abstract: T cells and T-cell receptors (TCRs) are essential components of the adaptive immune system. Characterization of the TCR repertoire offers a promising and highly informative source for understanding the functions of T cells in the immune response and immunotherapy. Although TCR repertoire studies have attracted much attention, there are few online servers available for TCR repertoire analysis, especially for TCR sequence annotation or advanced analyses. Therefore, we developed TCRosetta, a comprehensive online server that integrates analytical methods for TCR repertoire analysis and visualization. TCRosetta combines general feature analysis, large-scale sequence clustering, network construction, peptide–TCR binding prediction, generation probability calculation, and k-mer motif analysis for TCR sequences, making TCR data analysis as simple as possible. The TCRosetta server accepts multiple input data formats and can analyze ∼20,000 TCR sequences in less than three minutes. TCRosetta is the most comprehensive web server available for TCR repertoire analysis and is freely available at http://bioinfo.life.hust.edu.cn/TCRosetta/ or https://guolab.wchscu.cn/TCRosetta/.
genetics & heredity
What problem does this paper attempt to address?
The main objective of this paper is to introduce TCRosetta, an integrated online analysis and annotation platform specifically designed for the study of T-cell receptor (TCR) sequences. The platform aims to address several key issues in the current analysis of TCR sequence data:

1. **Lack of comprehensive online servers**: Although TCR sequence research has garnered widespread attention, the number of available online servers is limited, especially in terms of TCR sequence annotation or advanced analysis.
2. **Insufficient user-friendliness**: Existing tools often need to be run in a local environment and are difficult to operate for users without programming knowledge.
3. **Functional limitations**: Existing online platforms typically require user registration and can only analyze general characteristics of the TCR repertoire.

TCRosetta addresses these issues in the following ways:

- **Comprehensive analysis capabilities**: TCRosetta integrates various TCR repertoire analysis methods, including general characteristic analysis, large-scale sequence clustering, network construction, peptide–TCR binding prediction, generation probability calculation, and k-mer motif analysis.
- **User-friendly interface**: The platform offers intuitive file management features, supports multiple input data formats, and can quickly process large amounts of data (approximately 20,000 TCR sequences analyzed within 3 minutes).
- **Advanced analysis features**: In addition to basic TCR repertoire characteristic analysis, TCRosetta provides advanced analysis tools such as network construction based on sequence similarity to identify antigen-specific TCRs and the calculation of TCR sequence generation probabilities.
- **Batch search and annotation functionality**: This is the first platform with batch search and annotation capabilities, allowing users to search multiple CDR3 sequences in a reference population and annotate potential disease state information.
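To make two of the analyses named above more concrete, the following is a minimal Python sketch of k-mer counting and of a sequence-similarity network over CDR3 sequences. It is not TCRosetta's implementation, and the paper summary does not specify the algorithms used. The function names, the choice of k = 3, the use of Hamming distance with a threshold of 1, and the toy CDR3 strings are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(cdr3s, k=3):
    """Count every overlapping k-mer across a list of CDR3 amino-acid strings."""
    counts = Counter()
    for seq in cdr3s:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def hamming(a, b):
    """Hamming distance between equal-length strings; None if lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def similarity_edges(cdr3s, max_dist=1):
    """Link unique sequences whose Hamming distance is at most max_dist.

    Assumption: Hamming distance stands in for whatever similarity
    measure a real platform would use. Pairwise comparison is O(n^2),
    so this is only suitable for small inputs.
    """
    unique = sorted(set(cdr3s))
    edges = []
    for a, b in combinations(unique, 2):
        d = hamming(a, b)
        if d is not None and d <= max_dist:
            edges.append((a, b))
    return edges


def connected_components(cdr3s, edges):
    """Group sequences into clusters (connected components of the network)."""
    adjacency = {seq: set() for seq in set(cdr3s)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical CDR3 sequences, made up for this example.
toy_cdr3s = ["CASSLGTDTQYF", "CASSLGADTQYF", "CASSPGTDTQYF", "CASSIRSSYEQYF"]

top_kmers = kmer_counts(toy_cdr3s).most_common(3)
edges = similarity_edges(toy_cdr3s)
clusters = connected_components(toy_cdr3s, edges)
print(top_kmers)
print(clusters)
```

In this toy input, the three 12-residue sequences form one cluster because each differs from `CASSLGTDTQYF` by a single residue, while the 13-residue sequence stays on its own. Clusters of near-identical CDR3s are the kind of structure a similarity network is meant to surface when looking for candidate antigen-specific TCRs.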
In summary, this paper attempts to meet the needs for comprehensiveness, ease of use, and advanced analysis in TCR sequence research by developing the powerful online analysis platform TCRosetta.