posted on 2017-03-02, 23:24 authored by Porebski, Benjamin Thomas
At the molecular level, proteins embody a remarkable relationship between structure and function. They are the most versatile macromolecules in living systems and serve crucial functions in essentially all biological processes. Most proteins are only marginally stable under physiological conditions, with an overall thermodynamic stability, or Gibbs free energy of folding (ΔG), in the range of -5 to -15 kcal mol⁻¹. This marginal stability complicates the design and application of industrial enzymes and therapeutic drugs, whilst also leaving wild-type proteins susceptible to pathologically destabilizing mutations. Several approaches are currently employed to enhance protein stability. The rational approach to stabilization is challenging, as it is difficult to predict the energetic and structural response to mutations in proteins, whilst in vitro evolutionary approaches are often expensive and time-consuming. An alternative approach is to utilize statistical sequence analysis of an entire protein fold, motif or domain of interest. This is based on the hypothesis that, at a given position in a multiple sequence alignment (MSA) of homologous proteins, the consensus amino acid contributes more to the stability of the protein than non-conserved amino acids. This hypothesis can be applied as an engineering approach called consensus design. Here, either point mutations are made to a target protein, or a de novo sequence is calculated, which is known as “full sequence design”. Consensus design has produced many successful examples of stabilization; however, little is understood about how and why the method works, or about the effects of its design variables.
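As a rough illustration of the idea behind full sequence design, the sketch below takes the most frequent residue at each column of an MSA. It is a minimal sketch and not the procedure used in this thesis, which the abstract does not describe. The function name `consensus_sequence`, the `max_gap_fraction` threshold and the toy alignment are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-", max_gap_fraction=0.5):
    """Return the most frequent non-gap residue at each MSA column.

    Assumptions (illustrative only): sequences are pre-aligned strings of
    equal length, and columns where more than `max_gap_fraction` of the
    sequences have a gap are dropped from the consensus.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):  # iterate over alignment columns
        gap_fraction = column.count(gap) / len(column)
        if gap_fraction > max_gap_fraction:
            continue  # mostly gaps: treat as an insertion and skip it
        counts = Counter(res for res in column if res != gap)
        # Ties are not handled specially in this sketch.
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Toy alignment of four hypothetical homologues (not real protein data).
toy_msa = ["MKV-LT", "MKVALT", "MRV-LS", "MKI-LT"]
print(consensus_sequence(toy_msa))
```

A practical design would typically also weight sequences to reduce redundancy and phylogenetic bias in the alignment; those choices are among the design variables mentioned above.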
This thesis explores the application of full sequence consensus design to two protein folds, the fibronectin type III (FN3) domain and the serine protease inhibitor (serpin). A thorough biophysical characterization of the two resulting proteins, FN3con and conserpin, reveals remarkable thermodynamic and kinetic stabilities, with melting temperatures above 100°C, reversible folding and improved aggregation resistance. These results are exceptional achievements of protein engineering, with both FN3con and conserpin being the most stable variants (engineered or wild-type) of their respective protein families, and conserpin being the first serpin with true refoldability. In turn, this has allowed for the direct application of FN3con as a binding scaffold through rational loop grafting and directed evolution, while maintaining its biophysical properties. Further, conserpin provides key insights into the evolution, function and stability of the serpin superfamily, and a long sought-after model system for the elucidation of the serpin folding pathway. These results advance our understanding of consensus design, suggesting that full sequence design can smooth the protein energy landscape. This thesis therefore highlights the utility of the technique for engineering highly stable and robust proteins that may serve as model protein systems for future biophysical studies or as the basis of industrial enzymes and therapeutic drugs.