University of Massachusetts Medical School Faculty Publications

UMMS Affiliation

Program in Bioinformatics and Integrative Biology; Program in Molecular Medicine

Publication Date

11-11-2015

Document Type

Article Preprint

Disciplines

Amino Acids, Peptides, and Proteins | Computational Biology | Ecology and Evolutionary Biology | Genetic Phenomena | Nucleic Acids, Nucleotides, and Nucleosides

Abstract

BACKGROUND: Recent advances in transcriptome sequencing have enabled the discovery of thousands of long non-coding RNAs (lncRNAs) across multitudes of species. Though several lncRNAs have been shown to play important roles in diverse biological processes, the functions and mechanisms of most lncRNAs remain unknown. Two significant obstacles lie between transcriptome sequencing and functional characterization of lncRNAs: 1) identifying truly noncoding genes from de novo reconstructed transcriptomes, and 2) prioritizing hundreds of resulting putative lncRNAs from each sample for downstream experimental interrogation.

RESULTS: We present slncky, a computational lncRNA discovery tool that produces a high-quality set of lncRNAs from RNA-Sequencing data and further prioritizes lncRNAs by characterizing selective constraint as a proxy for function. Our filtering pipeline is comparable to manual curation efforts and more sensitive than previously published approaches. Further, we develop, for the first time, a sensitive pipeline for aligning lncRNA loci and propose new evolutionary metrics relevant for both sequence and transcript evolution. Our analysis reveals that selection acts in several distinct patterns, and uncovers two notable classes of lncRNAs: one showing strong purifying selection at the RNA sequence level and another where constraint is restricted to the regulation but not the sequence of the transcript.

CONCLUSION: Our novel comparative methods for lncRNAs reveal 233 constrained lncRNAs out of tens of thousands of currently annotated transcripts, which we believe should be prioritized for further interrogation. To aid in their analysis, we provide the slncky Evolution Browser as a resource for experimentalists.

Keywords

evolutionary biology, RNA, long non-coding RNAs, slncky, long noncoding RNAs, evolution, comparative genomics, molecular evolution, annotation, lincRNA, RNA-Seq, transcriptome

Rights and Permissions

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.

DOI of Published Version

10.1101/031385

Source

bioRxiv 031385; doi: https://doi.org/10.1101/031385.

Related Resources

Now published in Genome Biology, doi: 10.1186/s13059-016-0880-9.

Journal/Book/Conference Title

bioRxiv

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License
