Transposable elements (TEs) are DNA sequences that change position within the genome and play an important role in genome expansion. TEs are grouped into two classes based on their transposition mechanism. Class 1 elements, the retrotransposons, spread to new genomic locations via target-primed reverse transcription (RNA to DNA). Long interspersed element 1 (L1) is a class 1 retrotransposon that can move autonomously because it encodes the protein machinery, with endonuclease and reverse transcriptase activities, needed to insert itself back into the genome. L1s were the focus of this study because they have been implicated in creating alternative poly(A) sites in genes. To investigate whether L1s introduce polyadenylation sites, we analyzed 778,128 isoforms produced from 12 samples of long-read RNA sequencing (PacBio HiFi) data. Isoforms were filtered based on L1 location within the isoform's 3′ UTR, yielding roughly 3,000 isoforms spread across 757 genes. L1 subfamilies have arisen throughout evolutionary history due to species-specific substitutions. The L1 subfamilies in the dataset are mostly mammalian-specific, while only 43 contain primate-specific L1s. Most of the L1s studied were classified as L1M5 (329), L1ME4b (165), L1MB7 (105), or L1ME4c (105). These L1s contain canonical and noncanonical polyadenylation signals within their 3′ UTRs. Alternatively polyadenylated mRNA variants generated from the same gene are likely bound by different combinations of trans-acting factors that can affect mRNA localization, translation, stability, and decay. Understanding the role of L1s in alternative polyadenylation will shed light on the impact of TEs on mRNA processing efficiency and gene expression.
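
The filtering step described above, followed by a search for polyadenylation signals, could be sketched roughly as follows. This is an illustrative sketch only, not the pipeline used in the study. The `Interval` type, the function names, and the half-open coordinate convention are all hypothetical. The abstract does not say whether an L1 had to lie entirely within the 3′ UTR or merely overlap it, so full containment is assumed here. The abstract also does not list the signals that were searched, so the signal set is limited to the canonical AATAAA hexamer and the common ATTAAA variant.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval (hypothetical helper type)."""
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (
        outer.chrom == inner.chrom
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


def isoforms_with_l1_in_utr(
    utrs: dict[str, Interval], l1s: list[Interval]
) -> list[str]:
    """Return IDs of isoforms whose 3' UTR fully contains at least one L1.

    Full containment is an assumption; an overlap test may be closer
    to what the study actually did.
    """
    return [
        iso_id
        for iso_id, utr in utrs.items()
        if any(contains(utr, l1) for l1 in l1s)
    ]


# Canonical signal plus one common variant; the study's full set of
# noncanonical signals is not given in the abstract.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(seq: str) -> list[tuple[int, str]]:
    """Return (offset, hexamer) for every poly(A) signal found in `seq`.

    `seq` is assumed to be DNA in the sense orientation of the transcript.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in POLYA_SIGNALS:
            hits.append((i, hexamer))
    return hits
```

The pairwise containment check is quadratic in the number of isoforms and L1 annotations. For a dataset of this size, an interval tree or a sorted sweep would be the more practical choice.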