Program in Bioinformatics and Integrative Biology; Program in Molecular Medicine
Amino Acids, Peptides, and Proteins | Computational Biology | Ecology and Evolutionary Biology | Genetic Phenomena | Nucleic Acids, Nucleotides, and Nucleosides
BACKGROUND: Recent advances in transcriptome sequencing have enabled the discovery of thousands of long non-coding RNAs (lncRNAs) across multitudes of species. Though several lncRNAs have been shown to play important roles in diverse biological processes, the functions and mechanisms of most lncRNAs remain unknown. Two significant obstacles lie between transcriptome sequencing and functional characterization of lncRNAs: 1) identifying truly noncoding genes from de novo reconstructed transcriptomes, and 2) prioritizing hundreds of resulting putative lncRNAs from each sample for downstream experimental interrogation.
RESULTS: We present slncky, a computational lncRNA discovery tool that produces a high-quality set of lncRNAs from RNA-Sequencing data and further prioritizes lncRNAs by characterizing selective constraint as a proxy for function. Our filtering pipeline is comparable to manual curation efforts and more sensitive than previously published approaches. Further, we develop, for the first time, a sensitive alignment pipeline for aligning lncRNA loci and propose new evolutionary metrics relevant for both sequence and transcript evolution. Our analysis reveals that selection acts in several distinct patterns and uncovers two notable classes of lncRNAs: one showing strong purifying selection on the RNA sequence, and another where constraint is restricted to the regulation but not the sequence of the transcript.
CONCLUSION: Our novel comparative methods for lncRNAs reveal 233 constrained lncRNAs out of tens of thousands of currently annotated transcripts, which we believe should be prioritized for further interrogation. To aid in their analysis, we provide the slncky Evolution Browser as a resource for experimentalists.
evolutionary biology, RNA, long non-coding RNAs, slncky, long noncoding RNAs, evolution, comparative genomics, molecular evolution, annotation, lincRNA, RNA-Seq, transcriptome
Rights and Permissions
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
DOI of Published Version
bioRxiv 031385; doi: https://doi.org/10.1101/031385.
Now published in Genome Biology doi: 10.1186/s13059-016-0880-9.
Chen J, Shishkin AA, Zhu X, Kadri S, Maza I, Hanna JH, Regev A, Garber M. (2015). Evolutionary analysis across mammals reveals distinct classes of long noncoding RNAs. University of Massachusetts Medical School Faculty Publications. https://doi.org/10.1101/031385. Retrieved from https://escholarship.umassmed.edu/faculty_pubs/1568
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License