Comparative study on differential expression analysis process of RNA-seq data
ZHAO Yanhui, CHEN Shaokang, ZHAI Liwei, SHI Yumei, YUAN Jiani, SHENG Xihui, QI Xiaolong, GUO Yong, WANG Chuduan, XING Kai
DOI: https://doi.org/10.11841/j.issn.1007-4333.2023.06.14
2023-01-01
Abstract: To identify a suitable workflow for differential gene expression analysis, this study compared three read alignment tools (TopHat2, HISAT2 and STAR) and three tools for screening differentially expressed genes (DESeq2, edgeR and limma) using adipose tissue transcriptome data from Songliao Black pigs and Landrace pigs, and evaluated them further using KEGG pathway enrichment results. The results show that: 1) HISAT2 ran fastest and STAR had the highest unique mapping rate. Taking both into account, the HISAT2 alignments were used for the subsequent screening and analysis of differentially expressed genes. 2) DESeq2, edgeR and limma identified 616, 890 and 829 differentially expressed genes, respectively, of which 246 were shared by all three tools. 3) The up-regulated differentially expressed genes from DESeq2, edgeR and limma were enriched in 110, 108 and 142 pathways, respectively, 72 of which were shared; the down-regulated differentially expressed genes were enriched in 190, 247 and 177 pathways, 158 of which were shared. HISAT2 is recommended for genome mapping. When no biological replicates are available, edgeR is recommended for screening differentially expressed genes. To reduce false positives, DESeq2 or the intersection of the differentially expressed genes reported by two or more tools can be used. These results should help researchers obtain better and more comprehensive biological insights from transcriptome data.
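The recommendation to take the intersection of the genes reported by two or more tools can be illustrated with a minimal Python sketch. It is not the authors' code: the helper `consensus_genes`, its `min_tools` parameter and the gene identifiers are hypothetical placeholders, and the sketch assumes each tool's significant genes have already been exported as a set of IDs.

```python
from collections import Counter


def consensus_genes(calls, min_tools=2):
    """Return the genes reported by at least `min_tools` of the tools.

    `calls` maps a tool name to the set of gene IDs it called significant.
    (Hypothetical helper for illustration; not part of any of the tools.)
    """
    # Count, for every gene, how many tools reported it.
    counts = Counter(gene for genes in calls.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_tools}


# Toy placeholder IDs only; these are not results from the study.
calls = {
    "DESeq2": {"geneA", "geneB", "geneC"},
    "edgeR": {"geneA", "geneB", "geneD", "geneE"},
    "limma": {"geneA", "geneC", "geneE", "geneF"},
}

# Genes shared by all three tools (the strictest consensus).
shared_by_all = set.intersection(*calls.values())

# Genes supported by any two or more tools (a looser consensus).
at_least_two = consensus_genes(calls, min_tools=2)

print(sorted(shared_by_all))
print(sorted(at_least_two))
```

Raising `min_tools` trades sensitivity for specificity: requiring all three tools keeps only the calls every method agrees on, while requiring two retains more candidates at a higher risk of false positives.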