UMMS Affiliation

RNA Therapeutics Institute

Publication Date

2018-04-27

Document Type

Article

Disciplines

Bioinformatics | Computational Biology | Genetic Phenomena | Statistics and Probability

Abstract

Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur at different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual-specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. BIISQ shows the largest gains for low-abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios.

Rights and Permissions

© The Author(s) 2018. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

DOI of Published Version

10.1038/s41467-018-03402-w

Source

Nat Commun. 2018 Apr 27;9(1):1681. doi: 10.1038/s41467-018-03402-w.

Journal/Book/Conference Title

Nature Communications

PubMed ID

29703885

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
