UMass Chan Medical School Faculty Publications
Title
Progressive alignment with Cactus: a multiple-genome aligner for the thousand-genome era [preprint]
UMMS Affiliation
Program in Molecular Medicine
Publication Date
2019-08-26
Document Type
Article Preprint
Disciplines
Computational Biology | Genomics
Abstract
Cactus, a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequence. We describe progressive extensions to Cactus that enable reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We show that Cactus is capable of scaling to hundreds of genomes and beyond by describing results from an alignment of over 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment yet created. Further, we show improvements in orthology resolution leading to downstream improvements in annotation.
Keywords
Cactus, multiple genome alignment program, genomes, reference-free alignment
Rights and Permissions
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
DOI of Published Version
10.1101/730531
Source
bioRxiv 730531; doi: https://doi.org/10.1101/730531.
Journal/Book/Conference Title
bioRxiv
Repository Citation
Armstrong J, Karlsson EK, Paten B. (2019). Progressive alignment with Cactus: a multiple-genome aligner for the thousand-genome era [preprint]. UMass Chan Medical School Faculty Publications. https://doi.org/10.1101/730531. Retrieved from https://escholarship.umassmed.edu/faculty_pubs/1623
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Comments
Full author list omitted for brevity. For the full list of authors, see the paper.