UMMS Affiliation

Department of Biochemistry and Molecular Pharmacology; Program in Bioinformatics and Integrative Biology

Publication Date


Document Type



Keywords

Base Composition; Binding Sites; Cell Line; Chromatin; Computational Biology; *Gene Expression Regulation; *Genomics; Histones; Humans; Models, Biological; Promoter Regions, Genetic; Protein Binding; Transcription Factors; Transcription Initiation Site; *Transcription, Genetic


Disciplines

Bioinformatics | Biostatistics | Computational Biology | Genetics and Genomics | Systems Biology


Abstract

Statistical models have been used to quantify the relationship between gene expression and transcription factor (TF) binding signals. Here we apply the models to the large-scale data generated by the ENCODE project to study transcriptional regulation by TFs. Our results reveal a notable difference in the prediction accuracy of expression levels of transcription start sites (TSSs) captured by different technologies and RNA extraction protocols. In general, the expression levels of TSSs with high CpG content are more predictable than those with low CpG content. For genes with alternative TSSs, the expression levels of downstream TSSs are more predictable than those of the upstream ones. Different TF categories and specific TFs vary substantially in their contributions to predicting expression. Between two cell lines, the differential expression of TSS can be precisely reflected by the difference of TF-binding signals in a quantitative manner, arguing against the conventional on-and-off model of TF binding. Finally, we explore the relationships between TF-binding signals and other chromatin features such as histone modifications and DNase hypersensitivity for determining expression. The models imply that these features regulate transcription in a highly coordinated manner.
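The abstract does not say which statistical models were used, so the following is only an illustrative sketch of the general idea: predicting a TSS expression level from TF-binding signals and scoring the prediction on held-out TSSs. It assumes ordinary least-squares regression as a stand-in model and uses purely synthetic data; the function names, the number of TFs, and the weights are all hypothetical and do not come from the paper or from ENCODE.

```python
import numpy as np


def fit_expression_model(binding, expression):
    """Fit expression ~ intercept + weighted sum of TF-binding signals (OLS)."""
    design = np.column_stack([np.ones(len(binding)), binding])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are per-TF weights


def predict_expression(coef, binding):
    """Predict expression levels from binding signals with fitted coefficients."""
    design = np.column_stack([np.ones(len(binding)), binding])
    return design @ coef


def r_squared(observed, predicted):
    """Fraction of expression variance explained by the predictions."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): 500 TSSs, 5 TFs, made-up weights.
rng = np.random.default_rng(0)
n_tss, n_tf = 500, 5
binding = rng.normal(size=(n_tss, n_tf))             # binding signal per TSS and TF
true_weights = np.array([1.5, -0.8, 0.0, 0.4, 2.0])  # hypothetical TF contributions
expression = 0.5 + binding @ true_weights + rng.normal(scale=0.3, size=n_tss)

# Train on the first 400 TSSs, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, None)
coef = fit_expression_model(binding[train], expression[train])
r2 = r_squared(expression[test], predict_expression(coef, binding[test]))
print(f"held-out R^2: {r2:.3f}")
print("fitted TF weights:", np.round(coef[1:], 2))
```

In a sketch like this, the held-out R^2 plays the role of the "prediction accuracy" mentioned in the abstract, and the size of each fitted weight is one simple way to compare the contributions of individual TFs.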

Rights and Permissions

© 2012, Published by Cold Spring Harbor Laboratory Press. This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date. After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License).

DOI of Published Version



Genome Res. 2012 Sep;22(9):1658-67. doi: 10.1101/gr.136838.111.

Journal/Book/Conference Title

Genome research

PubMed ID



