UMass Chan Medical School Faculty Publications

UMMS Affiliation

Program in Bioinformatics and Integrative Biology; Graduate School of Biomedical Sciences

Publication Date

2022-01-07

Document Type

Article

Disciplines

Amino Acids, Peptides, and Proteins | Bioinformatics | Genetics and Genomics | Integrative Biology | Nucleic Acids, Nucleotides, and Nucleosides

Abstract

The human genome contains approximately 2000 transcriptional regulatory proteins, including approximately 1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.

Keywords

DNA-binding transcription factors, binding sites, motifs, catalog

Rights and Permissions

Copyright © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

DOI of Published Version

10.1093/nar/gkab1039

Source

Pratt HE, Andrews GR, Phalke N, Purcaro MJ, van der Velde A, Moore JE, Weng Z. Factorbook: an updated catalog of transcription factor motifs and candidate regulatory motif sites. Nucleic Acids Res. 2022 Jan 7;50(D1):D141-D149. doi: 10.1093/nar/gkab1039. PMID: 34755879; PMCID: PMC8728199.

Journal/Book/Conference Title

Nucleic Acids Research

PubMed ID

34755879

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License.
