GSBS Dissertations and Theses

Approval Date

3-20-2009

Document Type

Doctoral Dissertation

Department

Graduate School of Biomedical Sciences; Department of Biochemistry and Molecular Pharmacology

Subjects

DNA; Drosophila Proteins; Homeodomain Proteins; Transcription Factors; Regulatory Elements, Transcriptional; Two-Hybrid System Techniques

Abstract

From the yeast genome completed in 1996 to the 12 Drosophila genomes published earlier this year, little more than a decade has provided an incredible amount of genomic data. Yet even with this mountain of genetic information, the regulatory networks that control gene expression remain relatively undefined. In part, this is due to the enormous amount of non-coding DNA, over 98% of the human genome, which remains to be interpreted. It is also due to the large number of transcription factors, potentially 2,000 in the human genome, which may contribute to any given network directly or indirectly. Certainly, one of the central limitations has been the paucity of transcription factor (TF) specificity data that would aid in the prediction of regulatory targets throughout a genome.

The general lack of specificity data has hindered the prediction of regulatory targets for individual TFs as well as for groups of factors that function within a common regulatory pathway. A large collection of factor specificities would allow for the combinatorial prediction of regulatory targets that considers all factors actively expressed in a given cell under a given condition. Herein we describe substantial improvements to a previous bacterial one-hybrid system; its increased sensitivity and dynamic range make it amenable to the high-throughput analysis of sequence-specific TFs. To date, we have characterized 108 (14.3%) of the predicted TFs in Drosophila, which fall into a broad range of DNA-binding domain families, demonstrating the feasibility of characterizing a large number of TFs using this technology.

To fully exploit our large database of binding specificities, we have created a GBrowse-based search tool that allows an end-user to examine the overrepresentation of binding sites for any number of individual factors as well as combinations of these factors in up to six Drosophila genomes (veda.cs.uiuc.edu/cgi-bin/gbrowse/gbrowse/Dmel4). We have used this tool to demonstrate that a collection of factor specificities within a common pathway will successfully predict previously validated cis-regulatory modules within a genome. Furthermore, within our database we provide a complete catalog of DNA-binding specificities for all 84 homeodomains in Drosophila. This catalog enabled us to propose and test a detailed set of recognition rules for homeodomains and use this information to predict the specificities of the majority of homeodomains in the human genome.

Rights and Permissions

Copyright is held by the author, with all rights reserved.
