Improvement of an Existing Part
Overview
We improved the 4-coumarate:CoA ligase from Arabidopsis thaliana (At4CL, BBa_K1033001) by adding two point mutations, L57I and L460H, creating a new part (BBa_K4388002) intended to increase the yield of 4-coumaroyl-CoA, an intermediate in the pterostilbene pathway. Dr Zhi-Bo Yan informed us of the four enzymes needed to produce pterostilbene from an L-tyrosine precursor and discussed mutant variants shown to give higher yields. Yan et al. (2021) found that the L57I and L460H mutations increased 4-coumaroyl-CoA levels. To test whether this supported our choice, and to characterise the part further, we ran in silico docking simulations of the At4CL mutant and its wild type with the ligand p-coumaric acid to predict KD and ΔG values.
Molecular Docking
Introduction
Because our plasmid constructs carry the genes for the four enzymes of the pterostilbene synthesis pathway, it was important to use the most efficient enzymes available. The four genes we chose for our biosynthetic pathway were mutant versions of the wild-type enzymes RgTAL, At4CL, VvSTS and VvROMT reported by Yan et al. (2021). These mutations were reported to increase the pterostilbene production titre by a factor of 13.7 compared with the respective wild-type forms. The mutant At4CL1 in particular was reported to have greater catalytic efficiency than the wild type (Yan et al., 2021), making it a candidate variant for our plasmid constructs. The wild-type At4CL was already present in the iGEM Registry as BBa_K1033001, created by the 2013 Uppsala iGEM team. We therefore set out to further characterise both the wild type and the mutant variant (BBa_K4388002), to justify our choice of the mutant enzymes and to improve the part. Specifically, we aimed to determine computationally the differences in binding energy and affinity between the mutant and its wild type.
At4CL
4-Coumaroyl-CoA ligase (4CL) catalyses the conversion of p-coumaric acid to p-coumaroyl-CoA. The BioBrick At4CL1 part BBa_K1033001 has 100% local identity with GenBank accession number AAA82888.1 and UniProt accession code Q42524.
The mutant At4CL1 reported to have greater catalytic efficiency than the wild type carries two point mutations, L57I and L460H (Yan et al., 2021). Docking simulations were performed to obtain KD and ΔG for both the wild type and the mutant.
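KD and ΔG are two views of the same quantity, linked by the standard thermodynamic relation ΔG = RT ln(KD). The sketch below converts between them in Python. It assumes a temperature of 298.15 K and the convention that a more negative ΔG means tighter binding. The function names are illustrative, and docking packages may report binding energy with the opposite sign, so the convention should be checked before comparing values.

```python
import math

R_KCAL = 1.98720425864083e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal_mol, temperature_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol.

    Assumes delta_g is negative for favourable binding: dG = RT ln(KD).
    """
    return math.exp(delta_g_kcal_mol / (R_KCAL * temperature_k))


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)
```

Under this convention a more negative ΔG always maps to a smaller KD, so the two measures should agree on which variant binds more tightly.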
Obtaining PDB models for the Wild Type and Mutant
The closest template found through SWISS-MODEL was 3TSY, in which At4CL1 forms part of a fusion protein with stilbene synthase (Wang et al., 2011). The structure of At4CL1 within the fusion protein was reported not to differ drastically from that of At4CL1 alone (Wang et al., 2011), and the root-mean-square deviation (RMSD) between the 4CL section of 3TSY and the At4CL1 AlphaFold model was low (0.445), suggesting that the two models are structurally similar, as can be seen in Figure 1. Because we were investigating whether point mutations at only two positions affected binding energy, obtaining the most accurate PDB model was important. A model of a homolog would not have been accurate enough for docking simulations, given that catalytic specificity and efficiency for a given substrate can vary significantly even between At4CL isoforms. The AlphaFold model was chosen because it was of high confidence and more complete than the 3TSY structure (Figures 1 and 2). The mutant At4CL1 PDB was created by mutagenesis of the wild-type AlphaFold model in PyMOL to introduce the mutations L57I and L460H. Our new part, BBa_K4388002, contains these mutations.
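The RMSD comparison described above can be illustrated with a minimal sketch. It assumes the two structures have already been superposed by an alignment tool and that matched atoms (for example C-alpha atoms) are supplied in the same order. The function name is hypothetical, and no coordinates from our models are used here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superposed coordinate lists.

    Each argument is a list of (x, y, z) tuples for matched atoms, in the
    same order. No superposition is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

A value near zero means the matched atoms almost coincide, which is why a low RMSD was taken as evidence that the two models are structurally similar.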
Choosing a Docking Programme
Predicted binding energies from both AutoDock Vina and AutoDock4 have been found to correlate well with experimentally determined values, with those from AutoDock4 consistently closer to experiment. AutoDock4 has been found to be the superior option for estimating binding affinity (Nguyen et al., 2020), making it the preferred option for our purposes. YASARA Structure was therefore used to perform docking simulations with AutoDock4.
Using Yasara
Energy minimisation was run in YASARA for both enzyme and substrate to find the most energetically favourable conformations. The enzyme PDB structures had improved MolProbity results after energy minimisation. Each AutoDock4 docking simulation performed 25 runs and clustered highly similar results into distinct complex conformations. For both wild type and mutant, the results from the best-scoring distinct complex conformation according to AutoDock4 were selected for comparison.
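The idea of grouping similar runs and keeping the best-scoring conformation can be sketched as a simple greedy clustering. This is an illustration only, not YASARA's or AutoDock4's implementation: the function names, the tolerance of 2.0 and the convention that lower energy is more favourable are all assumptions, and each pose is represented as a plain list of ligand atom coordinates.

```python
import math


def pose_rmsd(pose_a, pose_b):
    """RMSD between two ligand poses given as equal-length lists of (x, y, z)."""
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(total / len(pose_a))


def cluster_poses(runs, tolerance=2.0):
    """Greedily cluster docking runs.

    runs: list of (energy, pose) tuples, where lower energy is assumed to be
    more favourable. Returns a list of clusters; each cluster is a list of
    runs whose first element is its best-scoring representative.
    """
    clusters = []
    for run in sorted(runs, key=lambda r: r[0]):
        for cluster in clusters:
            representative_pose = cluster[0][1]
            if pose_rmsd(run[1], representative_pose) <= tolerance:
                cluster.append(run)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([run])
    return clusters


def best_representative(runs, tolerance=2.0):
    """Return the best-scoring run of the best-scoring cluster."""
    return cluster_poses(runs, tolerance)[0][0]
```

Because runs are visited from most to least favourable, the first run placed in each cluster is that cluster's best score, and the first cluster holds the best score overall.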
Results and Conclusion
The KD predicted for the mutant At4CL1 was lower than that of the wild type, suggesting a better affinity for p-coumaric acid (Table 1). The mutant also had a more favourable binding energy change with p-coumaric acid than the wild type. This may help explain the greater catalytic efficiency of the mutant variant compared with its wild type (Yan et al., 2021).
The YASARA docking results for wild type and mutant were compared in PyMOL. The position of p-coumaric acid differs between the wild-type and mutant At4CL complexes, as can be seen in Figure 4.
The amino acid residues of At4CL involved in binding p-coumaric acid are shown in Figure 5. As seen in Table 2, most of these residues are predicted to be the same in the mutant and the wild type, but some differ.
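A comparison of binding-site residues like this one reduces to set operations. The sketch below is a hypothetical helper that splits two residue lists into shared and variant-specific groups. Residue labels are passed in as plain strings, and none of the actual residues from Table 2 are reproduced here.

```python
def compare_contacts(wild_type, mutant):
    """Split binding-site residues into shared and variant-specific groups.

    Both arguments are iterables of residue labels (any consistent string
    format). Returns a dict of sorted lists.
    """
    wt, mut = set(wild_type), set(mutant)
    return {
        "shared": sorted(wt & mut),
        "wild_type_only": sorted(wt - mut),
        "mutant_only": sorted(mut - wt),
    }
```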
Docking Simulations
Docking simulations were also performed for RgTAL, VvROMT and VvSTS. The simulations required PDB files for the enzymes and either SDF or PDB files for the ligands. For each enzyme, the closest SWISS-MODEL template was compared with an AlphaFold model. Because we were investigating whether point mutations at only one or two positions affected binding energy, obtaining the most accurate PDB model was important. For all enzymes, the AlphaFold models were of high confidence and more complete than the closest SWISS-MODEL templates, so they were chosen as the wild-type models. For the mutant variants, mutagenesis of the wild-type AlphaFold models was performed in PyMOL.
Before a PDB structure is used in YASARA, it should first be “cleaned” so that it contains only the protein or protein domain intended for modelling, with any other molecules that could interfere with the docking simulation removed.
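As an illustration of what such cleaning involves, the sketch below filters PDB-format text so that only protein ATOM records from one chain remain. It is a plain text filter based on the fixed-column PDB layout, not the cleaning commands built into YASARA, and the function name, chain identifier and residue range are hypothetical parameters.

```python
def clean_pdb_lines(lines, chain_id, residue_range=None):
    """Keep only protein ATOM records for one chain.

    lines: iterable of PDB-format text lines.
    chain_id: single-character chain identifier (PDB column 22).
    residue_range: optional (first, last) residue numbers to keep, inclusive.
    HETATM records (ligands, waters, ions) and other chains are dropped.
    """
    kept = []
    for line in lines:
        if not line.startswith("ATOM  "):
            continue
        if len(line) < 26:
            continue  # too short to hold chain and residue number fields
        if line[21] != chain_id:
            continue
        if residue_range is not None:
            res_num = int(line[22:26])  # PDB columns 23-26
            if not residue_range[0] <= res_num <= residue_range[1]:
                continue
        kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept
```

Passing a residue range is one way to extract a single domain, for example the 4CL portion of a fusion protein structure.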
As with At4CL, energy minimisation was run in YASARA for both enzyme and substrate to find the most energetically favourable conformations, and the enzyme PDB structures had improved MolProbity results after energy minimisation. Each AutoDock4 docking simulation performed 25 runs and clustered highly similar results into distinct complex conformations. The results from the best-scoring distinct complex conformation for each wild type and mutant can be seen in Table 3.
The lower KD values found for the mutant variants of all four enzymes suggest that they have better affinities than their respective wild types (Table 3). This may help explain the greater catalytic efficiencies of the mutant variants compared with their wild types (Yan et al., 2021).
RgTAL
The closest PDB match through SWISS-MODEL was 1Y2M, a phenylalanine ammonia-lyase structure that is shorter than our tyrosine ammonia-lyase amino acid sequence. As the AlphaFold model was more complete and of high confidence, it was chosen as the preferred model. The mutant RgTAL PDB was created by mutagenesis of the wild-type AlphaFold model in PyMOL to introduce the mutations S9N, A11T and E518V.
VvROMT
The closest SWISS-MODEL template, 1FP2, had a low percentage identity of just 50%. As the AlphaFold model was of high confidence, as seen in Figure 8, it was chosen as the preferred model. The mutant VvROMT PDB was created by mutagenesis of the wild-type AlphaFold model in PyMOL to introduce the mutation S29P.
VvSTS
The highest-ranked SWISS-MODEL template was 3TSY, the 4CL:STS fusion protein. As with At4CL, this model was not chosen for the docking simulation, because being part of a fusion protein was reported to have changed its conformation; even if the change is not drastic, it could make any results obtained less reliable. As the AlphaFold model was of high confidence, it was chosen in preference to the fusion protein.
The mutant VvSTS PDB was created by mutagenesis of the wild-type AlphaFold model in PyMOL to introduce the mutations T50I and V170A.
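The mutations listed for each enzyme use the notation wild-type residue, position, new residue (for example T50I). A minimal sketch of how such a list could be checked against a protein sequence before structural mutagenesis is given below. It works on one-letter sequences only and does not modify any structure; the function name is hypothetical and any sequence passed to it is the caller's own.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence, mutations):
    """Apply point mutations such as "L57I" to a one-letter protein sequence.

    Positions are 1-based. Raises ValueError if the stated wild-type residue
    does not match the sequence, which catches numbering mistakes early.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{mutation}: position outside sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"{mutation}: expected {old} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)
```

Checking the wild-type residue first guards against off-by-one numbering between a registry sequence and a structural model.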
Computational Results: Autodock4
References
Wang, Y., Yi, H., Wang, M., Yu, O., & Jez, J. M. (2011). Structural and kinetic analysis of the unnatural fusion protein 4-coumaroyl-CoA ligase::stilbene synthase. Journal of the American Chemical Society, 133(51), 20684–20687. https://doi.org/10.1021/ja2085993
Nguyen, N. T., Nguyen, T. H., Pham, T., Huy, N. T., Bay, M. V., Pham, M. Q., Nam, P. C., Vu, V. V., & Ngo, S. T. (2020). Autodock Vina Adopts More Accurate Binding Poses but Autodock4 Forms Better Binding Affinity. Journal of chemical information and modeling, 60(1), 204–211. https://doi.org/10.1021/acs.jcim.9b00778
Yan, Z. B., Liang, J. L., Niu, F. X., Shen, Y. P., & Liu, J. Z. (2021). Enhanced Production of Pterostilbene in Escherichia coli Through Directed Evolution and Host Strain Engineering. Frontiers in microbiology, 12, 710405. https://doi.org/10.3389/fmicb.2021.710405
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
Varadi, M., Anyango, S., Deshpande, M., Nair, S., Natassia, C., Yordanova, G., Yuan, D., Stroe, O., Wood, G., Laydon, A., Žídek, A., Green, T., Tunyasuvunakool, K., Petersen, S., Jumper, J., Clancy, E., Green, R., Vora, A., Lutfi, M., Figurnov, M., … Velankar, S. (2022). AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic acids research, 50(D1), D439–D444. https://doi.org/10.1093/nar/gkab1061
Krieger, E., & Vriend, G. (2014). YASARA View - molecular graphics for all devices - from smartphones to workstations. Bioinformatics (Oxford, England), 30(20), 2981–2982. https://doi.org/10.1093/bioinformatics/btu426
The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.