GSBS Dissertations and Theses

Publication Date


Document Type

Doctoral Dissertation

Academic Program

Biochemistry and Molecular Pharmacology


Biochemistry and Molecular Pharmacology Program

First Thesis Advisor

C. Robert Matthews, Ph.D.


Protein Folding, Protein Structure, Secondary, Repetitive Sequences, Amino Acid


The most common structural platform in biology, the βα-repeat class of proteins, is represented by the (βα)8 TIM barrel topology and the α/β/α sandwich, CheY-like topology. Previous studies on the folding mechanisms of several members of this class have suggested that the initial event during refolding involves the formation of a kinetically trapped species that at least partially unfolds before the native conformation can be accessed. The simple topologies of these proteins are thought to permit access to locally folded regions that may coalesce in non-native ways to form stable interactions leading to misfolded intermediates. In a pair of TIM barrel proteins, αTS and sIGPS, it has been shown that the core of the off-pathway folding intermediates is composed of locally connected clusters of isoleucine, leucine and valine (ILV) residues. These clusters of Branched Aliphatic Side Chains (BASiC) have the unique ability to very effectively prevent the penetration of water to the underlying hydrogen bond networks. This property retards hydrogen exchange with solvent, strengthening main chain hydrogen bonds and linking tertiary and secondary structure in a cooperative network of interactions. This property would also promote the rapid formation of collapsed species during refolding. From this viewpoint, the locally connected topology and the appropriate distribution of ILV residues in the sequence can modulate the energy landscapes of TIM barrel proteins. Another sequence determinant that can significantly alter the structure and stability of TIM barrels is the long-range main chain-side chain hydrogen bond. Three of these interactions have been shown to form the molecular underpinnings for the cooperative access to the native state in αTS.

Global analysis results presented in Chapters II and III suggest that the off-pathway mechanism is common to three proteins of the CheY-like topology, namely CheY, NT-NtrC and Spo0F. These results are corroborated by Gō-simulations that are able to identify the minimal structure of kinetically trapped species during the refolding of CheY and Spo0F. The extent of transient, premature structure appears to correlate with the number of ILV side chains involved in a large sequence-local cluster that is formed between the central β-sheet and helices α2, α3 and α4. The failure of Gō-simulations to detect off-pathway species during the refolding of NT-NtrC may reflect the smaller number of ILV side chains in its corresponding hydrophobic cluster.

In Chapter IV, comparison of the location of large ILV clusters with the hydrogen exchange protected regions in 19 proteins suggests that clusters of BASiC residues are the primary determinants of the stability cores of globular proteins. Although the location of the ILV clusters is sufficient to determine a majority of the protected amides in a protein structure, the extent of protection is overpredicted by the ILV cluster method. The survey of 71 TIM barrel proteins presented in Chapter V suggests that a specific type of long-range main chain-side chain hydrogen bond, termed the “βα hairpin clamp,” is a common feature in the βα-repeat proteins. The location and sequence patterns observed demonstrate an evolutionary signature of the βαβ modules that are the building blocks of several βα-repeat protein families.

In summary, the work presented in this thesis recognizes the role of sequence in modulating the folding free energy landscapes of proteins. The formation of off-pathway folding intermediates in three CheY-like proteins, together with the differences in the proposed extent of structure formed in these intermediates, suggests that both topology and sequence play important and concerted roles in the folding of proteins. Locally connected ILV clusters can lead to off-pathway traps, whereas the formation of the productive folding path requires the development of long-range, native-like topological features to form the native state. The ability of ILV clusters to link secondary and tertiary structure formation enables them to be at the core of this cooperative folding process. Very good correlations between the locations of ILV clusters and both strong protection against exchange and the positions of folding nuclei for a variety of proteins reported in the literature support the generality of the BASiC hypothesis. Finally, the discovery of a novel pattern of H-bond interactions in the TIM barrel architecture, between the amide hydrogen of a core ILV residue and a polar side chain, bracketing βαβ modules, suggests a means for establishing cooperativity between different types of side chain interactions towards formation of the native structure.

See Additional Files for copies of the source code for the global analysis program and the cluster analysis program.



Rights and Permissions

Copyright is held by the author, with all rights reserved.

Global (71 kB)
Global analysis software source code Visual Basic files (36 files compressed to Zip file)

(25 kB)
Cluster analysis software source code Visual Basic files (20 files compressed to Zip file)


