Overview

We improved 4-coumarate:CoA ligase (At4CL) from Arabidopsis thaliana (BBa_K1033001) by adding two mutations, L57I and L460H, to create BBa_K4388002, with the aim of increasing the yield of 4-coumaroyl-CoA (an intermediate in the pterostilbene pathway). Dr Zhi-Bo Yan informed us of the four enzymes needed to produce pterostilbene from an L-tyrosine precursor, and discussed mutant variants shown to produce higher yields. Yan et al. (2021) found that the L57I and L460H mutations increased 4-coumaroyl-CoA levels. We ran in silico docking simulations of this At4CL mutant enzyme and its wild type with the ligand p-Coumaric acid to predict KD and ΔG values, both to test whether the predictions supported our choice and to provide further characterisation.

Molecular Docking

Introduction

As part of engineering our plasmid constructs to include the genes for the four enzymes in the pterostilbene synthesis pathway, it was important to use the most efficient enzymes possible. The four genes we chose for our biosynthetic pathway were mutant versions of the wild-type enzymes RgTAL, At4CL, VvSTS and VvROMT reported by Yan et al. (2021). These mutations were reported to increase the pterostilbene production titre by a factor of 13.7 compared with the respective wild-type forms. As this specific mutant At4CL1 was reported to have greater catalytic efficiency than the wild type (Yan et al., 2021), it was a promising variant for our plasmid constructs. The wild-type form of At4CL was already present in the iGEM registry as BBa_K1033001, created by the 2013 Uppsala iGEM team. We therefore decided to further characterise both this wild type and the mutant variant (BBa_K4388002), to justify choosing the mutant forms of the enzymes and to improve the part. We aimed to determine computationally the differences in binding energy and affinity between the mutant and its wild type.

At4CL

4-Coumaroyl-CoA Ligase (4CL) catalyses the conversion of p-Coumaric acid to p-Coumaroyl-CoA. The BioBrick At4CL1 part BBa_K1033001 has 100% local identity with GenBank accession number AAA82888.1 and UniProt accession code Q42524.

The mutant At4CL1 found to have greater catalytic efficiency than the wild type carries the point mutations L57I and L460H (Yan et al., 2021). Docking simulations were performed to obtain KD and ΔG for both the wild type and the mutant.
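The mutation names follow the usual convention of wild-type residue, 1-based position, then replacement residue, so L57I means the leucine at position 57 becomes isoleucine. The Python sketch below illustrates that notation only: the function name and the five-residue toy sequence are hypothetical, and the sequence is not the At4CL1 sequence.

```python
import re


def apply_point_mutations(sequence, mutations):
    """Apply substitutions written as <wild-type residue><1-based position><new residue>."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"Unrecognised mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if position < 1 or position > len(residues):
            raise ValueError(f"Position out of range in {mutation}")
        # Guard against applying a mutation to the wrong sequence or numbering.
        if residues[position - 1] != wild_type:
            raise ValueError(f"Expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence for illustration only (NOT At4CL1).
toy_sequence = "MALKL"
toy_mutant = apply_point_mutations(toy_sequence, ["L3I", "L5H"])
print(toy_mutant)  # MAIKH
```

Checking the wild-type residue before substituting catches numbering offsets, which matter when a model and a registry sequence are numbered differently.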

Obtaining PDB models for the Wild Type and Mutant

The closest template found through Swiss Model was part of a fusion protein with Stilbene Synthase, 3TSY (Wang et al., 2011). The structure of At4CL1 within the fusion protein was reported not to differ drastically from that of At4CL1 alone (Wang et al., 2011), and the Root Mean Squared Deviation (RMSD) between the 4CL section of the 3TSY fusion protein and the At4CL1 Alphafold model was low (0.445), suggesting that the two structures are similar, as can be seen in Figure 1. As we were investigating whether point mutations at only two positions affected binding energy, obtaining the most accurate PDB model was important. A model of a homolog would not have been accurate enough for docking simulations, given that catalytic specificity and efficiency for a specific substrate can vary significantly even between the different At4CL isoforms. The Alphafold model was chosen because it was of high confidence and had a more complete structure than 3TSY could provide (Figures 1 and 2). The mutant At4CL1 PDB was created by mutagenesis in Pymol of the wild-type Alphafold model to introduce the mutations L57I and L460H. Our new part, BBa_K4388002, contains these mutations.
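For readers unfamiliar with RMSD, the Python sketch below shows how the quantity is defined for two equally sized sets of paired atom coordinates. It assumes the structures are already superposed and the atoms already matched. The value reported above was obtained in Pymol, which also handles the superposition, and this sketch does not reproduce that step. The function name and coordinates are hypothetical.

```python
import math


def rmsd_presuperposed(coords_a, coords_b):
    """RMSD between paired (x, y, z) coordinates that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("Coordinate lists must be non-empty and of equal length")
    squared_distance_sum = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared_distance_sum / len(coords_a))


# Hypothetical coordinates: the second set is the first shifted by 0.3 along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3), (0.0, 1.0, 0.3)]
print(round(rmsd_presuperposed(model_a, model_b), 3))  # 0.3
```

A lower RMSD means the paired atoms sit closer together on average, which is why a low value supports treating the two models as structurally similar.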

Figure 1. Alphafold model (purple), 4CL section from fusion protein 3TSY (blue). RMSD = 0.445. Made using Pymol. The Alphafold structure can be seen to be more complete than the 4CL section from the fusion protein 3TSY.
Figure 2. Alphafold model of the Wild-type 4-Coumaroyl-CoA Ligase from Arabidopsis thaliana. The legend indicates levels of confidence in structural accuracy. pLDDT is a per-residue metric of the structure’s confidence on a scale of 0 - 100.
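AlphaFold PDB files store the per-residue pLDDT in the B-factor column, so model confidence can be summarised directly from the file. The Python sketch below assumes standard fixed-column PDB records and the confidence bands used by the AlphaFold Protein Structure Database. The helper names and the three pLDDT values are hypothetical and are not taken from our At4CL1 model.

```python
def format_ca_line(serial, res_name, res_seq, plddt):
    """Build a fixed-column PDB ATOM record for a CA atom (toy coordinates)."""
    return "ATOM  {:>5d}  CA  {:>3s} A{:>4d}    {:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}".format(
        serial, res_name, res_seq, 0.0, 0.0, 0.0, 1.0, plddt
    )


def plddt_per_residue(pdb_text):
    """Map residue number to pLDDT, read from the B-factor field of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Classify a pLDDT value using the AlphaFold database bands."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


# Hypothetical three-residue example, not real model data.
toy_pdb = "\n".join([
    format_ca_line(1, "MET", 1, 95.12),
    format_ca_line(2, "ALA", 2, 62.50),
    format_ca_line(3, "LEU", 3, 41.00),
])
toy_scores = plddt_per_residue(toy_pdb)
toy_mean = sum(toy_scores.values()) / len(toy_scores)
for residue, score in toy_scores.items():
    print(residue, score, confidence_band(score))
```

Looking at per-residue values, not only the mean, is useful here because the confidence around the binding site matters more for docking than the confidence of flexible termini.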

Choosing a Docking programme

Predicted binding energies from both Autodock Vina and Autodock4 have been found to correlate well with experimentally determined binding energies, with values obtained using Autodock4 consistently closer to the experimental values. Autodock4 has been found to be the superior option for estimating binding affinity (Nguyen et al., 2020), making it the preferred option for our purposes. Yasara Structure was therefore used to perform docking simulations using Autodock4.

Using Yasara

Energy minimisation was run in Yasara for both enzyme and substrate to find the most energetically favourable conformations. The PDB structures for the enzymes had improved Molprobity results after energy minimisation. Each Autodock4 docking simulation performed 25 runs and clustered highly similar results into distinct complex conformations. The results from the best-scoring distinct complex conformation for the wild type and for the mutant, according to Autodock4, were selected for comparison.
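The idea of grouping docking runs into distinct conformations and keeping the best-scoring one can be sketched in Python as below. This is a simplified illustration, not the exact procedure Yasara or Autodock4 applies. It assumes energies follow the convention that more negative is more favourable (the sort would be reversed for a tool that reports higher values as better), and the RMSD tolerance, function names and poses are all hypothetical.

```python
import math


def pose_rmsd(coords_a, coords_b):
    """RMSD between two poses of the same ligand with identical atom ordering."""
    total = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(total / len(coords_a))


def cluster_poses(poses, tolerance):
    """Group (energy, coords) poses; each cluster is seeded by its best pose."""
    ranked = sorted(poses, key=lambda pose: pose[0])  # most negative energy first
    clusters = []
    for pose in ranked:
        for cluster in clusters:
            seed_coords = cluster[0][1]
            if pose_rmsd(pose[1], seed_coords) <= tolerance:
                cluster.append(pose)
                break
        else:
            # No existing cluster is close enough, so this pose seeds a new one.
            clusters.append([pose])
    return clusters


# Hypothetical two-atom poses from four runs (energies in kcal/mol).
toy_poses = [
    (-5.1, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-6.3, [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)]),
    (-4.0, [(8.0, 0.0, 0.0), (9.0, 0.0, 0.0)]),
    (-5.8, [(8.1, 0.0, 0.0), (9.1, 0.0, 0.0)]),
]
toy_clusters = cluster_poses(toy_poses, tolerance=2.0)
best_energy, best_coords = toy_clusters[0][0]
print(len(toy_clusters), best_energy)  # 2 -6.3
```

Because the poses are ranked before clustering, the first member of the first cluster is the best-scoring distinct conformation, which corresponds to the single result per enzyme that was carried forward for comparison.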

Results and Conclusion

The KD value found for the mutant At4CL1 was lower than that of the wild type, suggesting that the mutant has a higher affinity for p-Coumaric acid (Table 1). The At4CL mutant also had a more favourable binding energy change with p-Coumaric acid than the wild type. These results may help explain the greater catalytic efficiency of the mutant variant compared with its wild type (Yan et al., 2021).
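KD and ΔG are two views of the same prediction, linked by the standard relation ΔG = RT ln(KD), with KD expressed in mol/L. The Python sketch below shows the conversion in both directions. The temperature of 298.15 K is an assumption (this page does not state the temperature used), the function names are hypothetical, and the two energies are illustrative values, not the entries of Table 1.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def kd_from_delta_g(delta_g, temperature=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


def delta_g_from_kd(kd, temperature=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return GAS_CONSTANT * temperature * math.log(kd)


# Hypothetical energies for illustration only (NOT the Table 1 values).
example_wild_type_dg = -6.0
example_mutant_dg = -7.0
example_wild_type_kd = kd_from_delta_g(example_wild_type_dg)
example_mutant_kd = kd_from_delta_g(example_mutant_dg)

# A more negative free energy always corresponds to a lower KD.
print(example_mutant_kd < example_wild_type_kd)  # True
```

At this temperature a tenfold change in KD corresponds to roughly 1.4 kcal/mol, so small differences in predicted binding energy translate into sizeable differences in predicted affinity.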

Table 1.
Results of the top distinct complex conformation in each of the docking simulations run with Autodock4 (AD4) through Yasara Structure. These were performed for both the wild type and mutant At4CL to obtain KD and ΔG.
Figure 3. Yasara docking simulation result with Autodock4. Shown is the Wild-type 4-Coumaroyl-CoA Ligase enzyme from Arabidopsis thaliana in complex with the substrate p-Coumaric acid. Yasara automatically colour codes the secondary structure elements as follows: alpha helices (dark blue), inside of helix (grey), beta sheets (red), turn (light green), 3-10 helix (yellow), coil (light blue).

 

The Yasara docking results for the wild type and the mutant were compared in Pymol. The position of p-Coumaric acid differs between the wild-type and mutant At4CL complexes, as can be seen in Figure 4.

Figure 4. p-Coumaric acid (Yellow) in complex with wild-type At4CL (Light blue), overlaid onto p-Coumaric acid (Red) in complex with mutant At4CL (Dark blue). Created in Pymol using Yasara docking simulation results.

 

The At4CL amino acid residues involved in binding p-Coumaric acid are examined further in Figure 5. As seen in Table 2, most of the residues predicted to bind p-Coumaric acid are the same in the mutant and the wild type, but some differ between the two.
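A comparison of this kind amounts to set operations on the two lists of contacting residues. The Python sketch below shows the idea. The function name is hypothetical and the residue labels are placeholders for illustration, not the residues listed in Table 2.

```python
def compare_contacts(wild_type_contacts, mutant_contacts):
    """Split binding residues into shared, wild-type-only and mutant-only groups."""
    wild_type_set = set(wild_type_contacts)
    mutant_set = set(mutant_contacts)

    def by_position(residues):
        # Order residues by sequence position for readable output.
        return sorted(residues, key=lambda residue: residue[1])

    return {
        "shared": by_position(wild_type_set & mutant_set),
        "wild_type_only": by_position(wild_type_set - mutant_set),
        "mutant_only": by_position(mutant_set - wild_type_set),
    }


# Placeholder residues (name, position); NOT the Table 2 data.
toy_wild_type = [("GLY", 10), ("TYR", 20), ("LYS", 30)]
toy_mutant = [("GLY", 10), ("TYR", 20), ("SER", 40)]
toy_comparison = compare_contacts(toy_wild_type, toy_mutant)
print(toy_comparison["shared"])
print(toy_comparison["wild_type_only"])
print(toy_comparison["mutant_only"])
```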

Table 2.
At4CL amino acid residues involved in binding to p-Coumaric acid. Comparison of residues from wild-type and mutant At4CL, taken from Autodock4 results obtained using Yasara.
Figure 5. A) p-Coumaric acid (purple) in complex with the wild-type At4CL. B) p-Coumaric acid (Dark green) in complex with mutant At4CL. Created in Pymol.

 

Docking Simulations

Docking simulations were also performed for RgTAL, VvROMT and VvSTS. The docking simulations required PDB files for the enzymes and either SDF or PDB files for the ligands. For each enzyme, the closest Swiss Model template was compared with an Alphafold model. As we were investigating whether point mutations at only one or two positions affected binding energy, obtaining the most accurate PDB model was important. For all enzymes, the Alphafold models proved to be of high confidence and more complete than the closest Swiss Model templates, so they were chosen as the wild-type templates. For the mutant variants, mutagenesis of the wild-type Alphafold models was performed using Pymol.

When using a PDB structure in Yasara, it should first be “cleaned” to only contain the protein or protein domain intended for use in modelling, with other molecules which could interfere with the docking simulation removed.
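One way to picture this cleaning step outside Yasara is a small Python filter that keeps only the protein atoms of one chain and drops HETATM records such as waters and bound ligands. This is a minimal sketch assuming standard fixed-column PDB records. It is not the procedure Yasara itself applies, and the function name and toy records are hypothetical.

```python
def clean_pdb(pdb_text, chain_id):
    """Keep only ATOM records of one chain; drop HETATM and other records."""
    kept = []
    for line in pdb_text.splitlines():
        is_protein_atom = line[:6].strip() == "ATOM"
        # The chain identifier sits in column 22 (index 21) of a PDB record.
        if is_protein_atom and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"


# Toy records for illustration only: one protein atom in each of two chains,
# plus a water molecule that should be removed.
toy_structure = "\n".join([
    "ATOM      1  CA  MET A   1       0.000   0.000   0.000  1.00 20.00           C",
    "ATOM      2  CA  ALA B   1       5.000   0.000   0.000  1.00 20.00           C",
    "HETATM    3  O   HOH A 101       9.000   0.000   0.000  1.00 20.00           O",
])
toy_cleaned = clean_pdb(toy_structure, "A")
print(toy_cleaned)
```

Filtering by chain is what isolates a single domain from a multi-chain or fusion structure, while dropping HETATM records removes molecules that could otherwise occupy the binding site during docking.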

Energy minimisation was run in Yasara for both enzyme and substrate to find the most energetically favourable conformations. The PDB structures for the enzymes had improved Molprobity results after energy minimisation. Each Autodock4 docking simulation performed 25 runs and clustered highly similar results into distinct complex conformations. The results from the best-scoring distinct complex conformation for each wild type and mutant can be seen in Table 3.

The lower KD values found for the mutant variants of all four enzymes suggest that they have higher affinities than their respective wild types (Table 3). This may help explain the greater catalytic efficiencies of the mutant variants compared with their wild types (Yan et al., 2021).

RgTAL

The closest PDB match through Swiss Model was 1Y2M, which is a Phenylalanine ammonia-lyase structure and is shorter than our Tyrosine ammonia-lyase amino acid sequence. As the Alphafold model was more complete and of high confidence, it was chosen as the preferred model. The mutant RgTAL PDB was created by mutagenesis in Pymol of the wild-type Alphafold model to introduce the mutations S9N, A11T and E518V.

Figure 6. Alphafold model of the Wild-type Tyrosine ammonia lyase from Rhodotorula glutinis.
Figure 7. Yasara docking simulation result with Autodock4. Shown is the Wild-type Tyrosine ammonia lyase from Rhodotorula glutinis with L-Tyrosine.

VvROMT

The closest Swiss Model template, 1FP2, had a low percentage identity of just 50%. As the Alphafold model was of high confidence, as seen in Figure 8, it was chosen as the preferred model. The mutant VvROMT PDB was created by mutagenesis in Pymol of the wild-type Alphafold model to introduce the mutation S29P.

Figure 8. Alphafold model of the Wild-type Resveratrol O-methyltransferase from Vitis vinifera.
Figure 9. Yasara docking simulation result with Autodock4. Shown is the Wild-type Resveratrol O-methyltransferase from Vitis vinifera with Resveratrol.

VvSTS

The highest-ranking Swiss Model template was the 4CL:STS fusion protein, 3TSY. As with At4CL, this model was not chosen for the docking simulation, because being part of a fusion protein was reported to have changed its conformation. Even if the change is not drastic, it could make any results obtained less reliable. As the Alphafold model was of high confidence, it was chosen in favour of the fusion protein.

The mutant VvSTS PDB was created by mutagenesis in Pymol of the Wild-type Alphafold model to introduce the mutations T50I and V170A.

Figure 10. Alphafold model of the Wild-type Stilbene synthase from Vitis vinifera.
Figure 11. Yasara docking simulation result with Autodock4. Shown is the Wild-type Stilbene synthase from Vitis vinifera with 4-Coumaroyl CoA.

Computational Results: Autodock4

Table 3.
Computational results from docking simulations run with Autodock4 (AD4) through Yasara Structure.

References

Wang, Y., Yi, H., Wang, M., Yu, O., & Jez, J. M. (2011). Structural and kinetic analysis of the unnatural fusion protein 4-coumaroyl-CoA ligase::stilbene synthase. Journal of the American Chemical Society, 133(51), 20684–20687. https://doi.org/10.1021/ja2085993

Nguyen, N. T., Nguyen, T. H., Pham, T., Huy, N. T., Bay, M. V., Pham, M. Q., Nam, P. C., Vu, V. V., & Ngo, S. T. (2020). Autodock Vina Adopts More Accurate Binding Poses but Autodock4 Forms Better Binding Affinity. Journal of chemical information and modeling, 60(1), 204–211. https://doi.org/10.1021/acs.jcim.9b00778

Yan, Z. B., Liang, J. L., Niu, F. X., Shen, Y. P., & Liu, J. Z. (2021). Enhanced Production of Pterostilbene in Escherichia coli Through Directed Evolution and Host Strain Engineering. Frontiers in microbiology, 12, 710405. https://doi.org/10.3389/fmicb.2021.710405

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2

Varadi, M., Anyango, S., Deshpande, M., Nair, S., Natassia, C., Yordanova, G., Yuan, D., Stroe, O., Wood, G., Laydon, A., Žídek, A., Green, T., Tunyasuvunakool, K., Petersen, S., Jumper, J., Clancy, E., Green, R., Vora, A., Lutfi, M., Figurnov, M., … Velankar, S. (2022). AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic acids research, 50(D1), D439–D444. https://doi.org/10.1093/nar/gkab1061

Krieger, E., & Vriend, G. (2014). YASARA View - molecular graphics for all devices - from smartphones to workstations. Bioinformatics (Oxford, England), 30(20), 2981–2982. https://doi.org/10.1093/bioinformatics/btu426

The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.