UMMS Affiliation

Department of Molecular, Cell, and Cancer Biology; Program in Molecular Medicine

Publication Date

2020-11-06

Document Type

Article

Disciplines

Amino Acids, Peptides, and Proteins | Bioinformatics | Computational Biology | Molecular Biology

Abstract

Sequence logos have been widely used as graphical representations of conserved nucleic acid and protein motifs. Due to the complexity of the amino acid (AA) alphabet, rich post-translational modification, and diverse subcellular localization of proteins, few versatile tools are available for effective identification and visualization of protein motifs. In addition, various reduced AA alphabets based on physicochemical, structural, or functional properties have been valuable in the study of protein alignment, folding, structure prediction, and evolution. However, there is a lack of tools for applying reduced AA alphabets to the identification and visualization of statistically significant motifs. To fill this gap, we developed an R/Bioconductor package, dagLogo, which has several advantages over existing tools. First, dagLogo allows various formats for input sets and provides comprehensive options to build optimal background models. Second, it implements different reduced AA alphabets to group AAs of similar properties. Furthermore, dagLogo provides statistical and visual solutions for differential AA (or AA group) usage analysis of both large and small data sets. Case studies showed that dagLogo can better identify and visualize conserved protein sequence patterns from different types of inputs and can potentially reveal biological patterns that could be missed by other logo generators.

Keywords

amino acids, amino acid alphabet, protein motifs, sequence logos

Rights and Permissions

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

DOI of Published Version

10.1371/journal.pone.0242030

Source

Ou J, Liu H, Nirala NK, Stukalov A, Acharya U, Green MR, Zhu LJ. dagLogo: An R/Bioconductor package for identifying and visualizing differential amino acid group usage in proteomics data. PLoS One. 2020 Nov 6;15(11):e0242030. doi: 10.1371/journal.pone.0242030. PMID: 33156866; PMCID: PMC7647101.

Journal/Book/Conference Title

PLoS One

PubMed ID

33156866

Creative Commons License

This work is licensed under a Creative Commons 1.0 Public Domain Dedication.
