Abstract: Poly(A) tails at the 3′ end of eukaryotic messenger RNAs control mRNA stability and translation efficiency. Facilitated by various NGS methods, alternative polyadenylation sites, which determine the 3′-UTR length of gene transcripts, have been extensively studied. However, poly(A) tail lengths, which are subject to dynamic and developmental regulation, remain largely unexplored. Recently developed NGS-based methods for genome-wide poly(A) profiling have promoted the study of genome-wide poly(A) dynamics. Here we present a straightforward NGS method for poly(A) profiling, which applies direct 3′-end adaptor ligation and template switching for 5′-end adaptor ligation in cDNA library construction. Poly(A) lengths are calculated directly from base-call data using a self-developed pipeline, pA-finder. The libraries were sequenced directly from the 3′-UTR regions into the following poly(A) tails, first on NextSeq 500 to produce single-end 300-nt reads. These runs demonstrated the feasibility of the method and showed that optimizing the size of the fragmented RNA used for cDNA library construction enables detection of longer poly(A) tails. We next sequenced Poly(A)-seq cDNA libraries containing spike-in RNAs with 40-nt and 120-nt poly(A) tails on HiSeq X Ten and NovaSeq 6000 to obtain 150-nt and 250-nt paired-end reads. The sequencing profiles of the spike-in RNAs demonstrated both high accuracy and high quality scores in reading poly(A) tails. We observed poly(A) signal bleeding into the 3′ adaptor sequence and a sharp decrease in quality score at the junction, which allowed us to modify pA-finder to remove homopolymeric signal bleeding. We hope that wide application of Poly(A)-seq will facilitate the study of development- and disease-related poly(A) dynamics and regulation, and of the recently emerging mixed-tailing regulation.
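The idea of reading a tail length from base calls and then correcting for signal bleeding at the adaptor junction can be illustrated with a minimal Python sketch. This is not the pA-finder implementation, whose details are not given here. The function names `longest_a_run` and `trim_bleed`, the quality-drop threshold `drop`, and the toy read are all hypothetical, and the sketch assumes that bleeding appears as extra A calls marked by a sharp fall in Phred quality.

```python
from typing import Sequence, Tuple


def longest_a_run(seq: str) -> Tuple[int, int]:
    """Return (start, length) of the longest uninterrupted run of 'A' in seq."""
    best_start, best_len = 0, 0
    i, n = 0, len(seq)
    while i < n:
        if seq[i] == "A":
            j = i
            while j < n and seq[j] == "A":
                j += 1
            if j - i > best_len:
                best_start, best_len = i, j - i
            i = j
        else:
            i += 1
    return best_start, best_len


def trim_bleed(run_start: int, run_len: int,
               quals: Sequence[int], drop: int = 10) -> int:
    """Shorten an A-run at the first base whose quality falls by at least
    `drop` relative to the previous base (assumed adaptor junction).

    `drop` is an illustrative threshold, not a value taken from the paper.
    """
    for pos in range(run_start + 1, run_start + run_len):
        if quals[pos - 1] - quals[pos] >= drop:
            return pos - run_start
    return run_len


# Toy read: 4 nt of 3'-UTR, a 12-nt tail, 3 adaptor bases miscalled as A
# (bleeding), then the remaining adaptor bases. Qualities drop at the junction.
read = "GCTT" + "A" * 15 + "TGGAATTC"
quals = [36] * 16 + [12] * 11

start, raw_len = longest_a_run(read)
print(start, raw_len)                       # 4 15
print(trim_bleed(start, raw_len, quals))    # 12
```

In this toy case the raw homopolymer run overestimates the tail by three bases, and the quality-based cut restores the 12-nt length. A real pipeline would also need to tolerate occasional non-A calls inside the tail, which this sketch does not attempt.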

Extraction of Poly(A) Sites from Large-Scale RNA-seq Data

Poly(A)-tag deep sequencing data processing to extract poly(A) sites.

Computational Analysis of Plant Polyadenylation Signals.

APAtrap: Identification and Quantification of Alternative Polyadenylation Sites from RNA-seq Data.

Genome-wide Identification and Predictive Modeling of Polyadenylation Sites in Eukaryotes

scAPAtrap: identification and quantification of alternative polyadenylation sites from single-cell RNA-seq data

QuantifyPoly(A): Reshaping Alternative Polyadenylation Landscapes of Eukaryotes with Weighted Density Peak Clustering.

Poly(A)-seq: A Method for Direct Sequencing and Analysis of the Transcriptomic Poly(A)-Tails.

PRAPI: Post-Transcriptional Regulation Analysis Pipeline for Iso-Seq

A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq

scAPAdb: a Comprehensive Database of Alternative Polyadenylation at Single-Cell Resolution.

Revealing the History and Mystery of RNA-Seq

Genome Level Analysis of Rice mRNA 3′-End Processing Signals and Alternative Polyadenylation

High throughput characterizations of poly(A) site choice in plants.

ARA: a flexible pipeline for automated exploration of NCBI SRA datasets

From Gigabyte to Kilobyte: A Bioinformatics Protocol for Mining Large RNA-Seq Transcriptomics Data

Novel Method of Full-Length RNA-seq That Expands the Identification of Non-Polyadenylated RNAs Using Nanopore Sequencing

Gene expression profiling of non-polyadenylated RNA-seq across species.

PlantAPA: A Portal for Visualization and Analysis of Alternative Polyadenylation in Plants

Prediction of plant mRNA polyadenylation sites.

Identification of Secondary Structures Surrounding Plant Polyadenylation Sites