Refining dual RNA-seq mapping: sequential and combined approaches in host-parasite plant dynamics

Carmine Fruggiero, Gaetano Aufiero, Davide D'Angelo, Edoardo Pasolli, Nunzio D'Agostino
DOI: https://doi.org/10.1101/2024.07.28.605052
2024-07-29
Abstract: Transcriptional profiling of 'host plant-parasitic plant' interactions is challenging because of the tight interface between host and parasitic plants and the proportion of homologous sequences they share. Dual RNA-seq offers a solution by enabling in silico separation of mixed transcripts from the interface region. However, it must contend with multi-mapping and cross-mapping of reads between the host and parasite genomes, particularly as evolutionary divergence decreases. In this paper, we evaluated the feasibility of this technique by simulating interactions between parasitic and host plants and refining the mapping process. More specifically, we merged host plant and parasitic plant transcriptomes and compared two alignment approaches: sequential mapping of reads to the two separate reference genomes and combined mapping of reads to a single concatenated genome. We considered Cuscuta campestris as the parasitic plant and Arabidopsis thaliana and Solanum lycopersicum as the two host plants of interest. Both approaches achieved a mapping rate of ~90%, with only about 1% of reads cross-mapping. This suggests that the method is effective at accurately separating mixed transcripts in silico. The combined approach proved slightly more accurate and less time-consuming than the sequential approach. The evolutionary distance between parasitic and host plants did not significantly affect the accuracy of read assignment to their respective genomes, since enough polymorphisms were present to ensure reliable differentiation. This study demonstrates the reliability of dual RNA-seq for studying host-parasite interactions within the same taxonomic kingdom, paving the way for further research into the key genes involved in plant parasitism.
Biology
What problem does this paper attempt to address?
The paper primarily explores how to effectively use dual RNA sequencing (dual RNA-seq) to separate and analyze mixed transcriptome data from host and parasitic plants in the study of their interactions. Specifically, the research evaluates two read mapping strategies, sequential mapping and combined mapping, through simulation experiments to address the challenge of accurately separating transcriptomes between highly homologous host and parasitic plants.

### Research Background

- **Interaction between parasitic and host plants**: Parasitic plants can obtain nutrients from host plants, and this interaction significantly affects the growth, reproduction, and physiological processes of the host plants.
- **Technical challenges**: Traditionally, to distinguish the transcriptomes of parasitic and host plants, expensive and time-consuming techniques like laser capture microdissection (LCM) are used to physically separate the tissue samples of the two plants. Dual RNA-seq, as a bioinformatics method, can separate mixed transcriptome data computationally, without the need for physical separation.

### Research Objectives

- **Assessing the feasibility of dual RNA-seq**: The study evaluates the effectiveness of dual RNA-seq by simulating the interaction between the parasitic plant *Cuscuta campestris* and its two host plants, *Arabidopsis thaliana* and *Solanum lycopersicum*.
- **Comparing two mapping strategies**: The study compares sequential mapping (mapping reads first to the host plant genome and then to the parasitic plant genome, or vice versa) and combined mapping (mapping reads to a single reference sequence that combines the genomes of both host and parasitic plants).
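The difference between the two mapping strategies can be illustrated with a toy read-assignment sketch. It is not the authors' pipeline: the `Read` class, the alignment scores, and the rule that sequential mapping forwards only unmapped reads to the second genome are all assumptions made for illustration, and no real aligner is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Read:
    name: str
    origin: str                     # true source organism: "host" or "parasite"
    host_score: Optional[int]       # hypothetical alignment score; None = no alignment
    parasite_score: Optional[int]


def sequential_assign(read: Read, first: str = "host") -> Optional[str]:
    """Assumed sequential rule: a read that aligns to the first genome is kept
    there; only reads that fail to align are tried against the second genome."""
    second = "parasite" if first == "host" else "host"
    scores = {"host": read.host_score, "parasite": read.parasite_score}
    if scores[first] is not None:
        return first
    if scores[second] is not None:
        return second
    return None  # unmapped in both genomes


def combined_assign(read: Read) -> Optional[str]:
    """Assumed combined rule: both genomes compete in one concatenated
    reference, so the best-scoring location wins; ties are left ambiguous."""
    h, p = read.host_score, read.parasite_score
    if h is None and p is None:
        return None
    if p is None or (h is not None and h > p):
        return "host"
    if h is None or p > h:
        return "parasite"
    return "ambiguous"


# Hypothetical reads: r2 comes from the parasite but also aligns, less well,
# to a homologous host region.
reads = [
    Read("r1", "host", 60, None),
    Read("r2", "parasite", 40, 55),
    Read("r3", "parasite", None, 50),
    Read("r4", "host", None, None),
]

seq_calls = [sequential_assign(r, first="host") for r in reads]
comb_calls = [combined_assign(r) for r in reads]

for r, s, c in zip(reads, seq_calls, comb_calls):
    print(f"{r.name}: origin={r.origin} sequential={s} combined={c}")
```

In this toy setup the host-first sequential pass captures `r2` in the host genome even though its better alignment is in the parasite genome, while the combined reference lets both candidate locations compete. This is one plausible way a combined approach could reduce cross-mapping; the actual behavior depends on the aligner and its settings.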
### Main Findings

- **Mapping efficiency**: Both sequential and combined mapping achieved approximately 90% mapping rates, with only about 1% cross-mapped reads, indicating that these methods can effectively separate mixed transcriptomes.
- **Accuracy comparison**: Combined mapping showed a slight advantage over sequential mapping, both in higher accuracy and in lower computational time.
- **Impact of evolutionary distance**: Although the evolutionary distances between *Cuscuta campestris* and the two host plants differ, this difference did not significantly affect the accuracy of read allocation to their respective genomes, as there was sufficient polymorphism to ensure reliable distinction.

In summary, this study demonstrates that dual RNA-seq is a reliable method that can effectively separate and analyze mixed transcriptome data from host and parasitic plants without the need for expensive physical separation steps. This is of great significance for further exploring key genes in plant parasitism.
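Because the reads in a simulation have a known origin, metrics such as mapping rate and cross-mapping rate can be computed by comparing each read's true source with the genome it was assigned to. The sketch below shows one way to do this. The function name `mapping_summary` and the choice of total reads as the denominator for both rates are assumptions, since the summary does not state how the percentages were defined.

```python
from typing import Dict, List, Optional, Tuple


def mapping_summary(pairs: List[Tuple[str, Optional[str]]]) -> Dict[str, float]:
    """Summarize simulated read assignments.

    Each pair is (true_origin, assigned_genome); assigned_genome is None
    for an unmapped read. Both rates use total reads as the denominator,
    which is an assumption made for this illustration.
    """
    total = len(pairs)
    if total == 0:
        raise ValueError("no reads to summarize")
    mapped = sum(1 for _, assigned in pairs if assigned is not None)
    cross = sum(
        1 for origin, assigned in pairs
        if assigned is not None and assigned != origin
    )
    return {
        "mapping_rate": mapped / total,
        "cross_mapping_rate": cross / total,
    }


# Hypothetical toy data: 10 simulated reads, 8 assigned correctly,
# 1 assigned to the wrong genome, 1 unmapped.
toy_pairs = (
    [("host", "host")] * 5
    + [("parasite", "parasite")] * 3
    + [("parasite", "host")]
    + [("host", None)]
)

summary = mapping_summary(toy_pairs)
print(summary)
```

The toy numbers are chosen only to exercise the function and do not correspond to the results reported in the study.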