Yixin Zhao, Noah Dukler, Gilad Barshad, Shushan Toneyan, Charles G Danko, Adam Siepel
Bioinformatics (Oxford, England) 2021 Dec 11
Quantification of isoform abundance has been extensively studied at the mature RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing. We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA-sequencing data (DENR), which models nascent RNA-sequencing read-counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical 'shape profile' of read-counts along a transcription unit. We show that DENR outperforms simple read-count-based methods for estimating gene and isoform abundances, and that transcription of multiple pre-RNA isoforms per gene is widespread, with frequent differences between cell types. In addition, we provide evidence that a majority of human isoform diversity derives from primary transcription rather than from post-transcriptional processes.
DENR and nascentRNASim are freely available at https://github.com/CshlSiepelLab/DENR (version v1.0.0) and https://github.com/CshlSiepelLab/nascentRNASim (version v0.3.0). Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press.
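The idea of modeling binned read-counts as a mixture of isoforms can be illustrated with a minimal sketch. It assumes a plain non-negative least-squares objective solved by projected gradient descent, which may differ from the objective DENR actually uses. The function `deconvolve`, the two isoform masks, and the toy counts are all hypothetical and are not part of the DENR package (which is written in R). The sketch also omits the transcription start site predictions and the shape-profile adjustment mentioned in the abstract.

```python
def deconvolve(masks, counts, n_iter=5000):
    """Estimate non-negative isoform abundances from binned read counts.

    masks[k][j] is 1 if hypothetical isoform k covers bin j, else 0.
    counts[j] is the observed read count in bin j.
    Minimizes 0.5 * ||X b - y||^2 subject to b >= 0 (an assumed objective).
    """
    n_iso = len(masks)
    n_bins = len(counts)
    # Safe step size: the largest eigenvalue of X^T X is at most its trace,
    # which equals the sum of squared mask entries.
    step = 1.0 / sum(m * m for mask in masks for m in mask)
    abund = [0.0] * n_iso
    for _ in range(n_iter):
        # Predicted count per bin is the sum of abundances of covering isoforms.
        pred = [sum(abund[k] * masks[k][j] for k in range(n_iso))
                for j in range(n_bins)]
        resid = [pred[j] - counts[j] for j in range(n_bins)]
        # Gradient of the squared error with respect to each abundance.
        grad = [sum(resid[j] * masks[k][j] for j in range(n_bins))
                for k in range(n_iso)]
        # Gradient step followed by projection onto the non-negative orthant.
        abund = [max(0.0, abund[k] - step * grad[k]) for k in range(n_iso)]
    return abund


# Toy gene with six bins and two pre-RNA isoforms that differ in start site.
iso_long = [1, 1, 1, 1, 1, 1]   # starts at bin 0
iso_short = [0, 0, 1, 1, 1, 1]  # starts at bin 2
# Counts simulated without noise from abundances 3 (long) and 5 (short).
observed = [3, 3, 8, 8, 8, 8]

estimates = deconvolve([iso_long, iso_short], observed)
print([round(e, 3) for e in estimates])
```

In this toy case the step up in counts at bin 2 is what lets the two isoforms be separated. A simple read-count method that sums reads over the gene would not attribute them to individual isoforms.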
Yixin Zhao, Noah Dukler, Gilad Barshad, Shushan Toneyan, Charles G Danko, Adam Siepel. Deconvolution of expression for nascent RNA-sequencing data (DENR) highlights pre-RNA isoform diversity in human cells. Bioinformatics (Oxford, England). 2021 Dec 11;37(24):4727-4736
PMID: 34382072