Overview
Ahead of the wet lab, teammates from dry lab have constructed the model to simulate the actual situation and make accurate prediction. Each set of the model was schemed according to the experiment design and the physical requirement of our project. While the dry lab can lead the experiment schedule to a certain degree, it also receives feedbacks from the result and makes further validation.
Basically, the dry lab has made achievement in two parts. The molecule dynamics characteristic of the patchoulol synthase (PTS) was primarily concentrated, which corresponded to the experiment of directed evolution. Through structure prediction and molecule docking, we discovered the potential active site during the enzymatic reaction between PTS and farnesyl pyrophosphate (FPP, the precursor of patchoulol). Among all the sites, Lys-425 from PTS presented the highest binding activity, which might serve as a complement to previous study. The discovery further provided guidance for successive experiment with semi-rational and rational design. Moreover, the feedback from these results is also under expectation. Apart from above, the simulation for co-expression of patchoulol and lycopene was carried out as well. We characterized the activity of species in a single module, and tending to expand the model to a systematic scope. Based on the differential equation, we simulated the amount of both products responding to different light control conditions. The model matched with the experiment result to a certain degree. While constructing the model, we also gathered many precious experiences in this research fields, and some of our attempts are documented after the text.
Hopefully, the model could provide accurate simulation for this co-expression system, and work as a propagable toolkit for people to make optimization based on the physical requirement.
Fig.1 The workflow of the dry lab
The idea from dry lab was spread and examined while carrying out the human practice. So far, we have received many precious feedbacks, which will improve our model and further promote the project.
Molecular dynamics of patchoulol synthesis
The molecular dynamics analysis reveals the possible active site during the enzymatic reaction between PTS and FPP, thus providing guidance for further experiment with semi-rational or rational design.
With the antisepsis property from patchoulol, FUNCDYES can prevent the clothes from mildewing and rotting. In the mevalonate pathway, the patchoulol is directly formed from FPP by the catalysis of PTS. Current study has reported that the production of PTS remains in around 40~50 mg/L. Compared to lycopene, the patchoulol causes slight oscillation, thus requiring a much stronger synthetase to drive the regulation. Here we majorly performed the molecular dynamics analysis for the PTS, and presented the result through visualization. Compared to the previous study [1], we discovered a new active site with more significant difference than of others.
Fig.2 The workflow of this part
Structure prediction
The protein sequence of PTS was obtained from the National Centre for Biological Information (NCBI)[2], with the accession number as ALQ43502.1, and is identical to the one that we used in the experiment. According to NCBI, the residues from 304th to 457th is considered as the active pocket, which may exhibit the site for molecular docking.
Notably, according to the search result in Protein Data Bank (PDB), the crystal structure and relative data of PTS remains lacked. To obtain the 3D structure and conduct molecular docking, the Alpha-Fold[3] was applied to make prediction (here we took PTS as a single dimer). PyMOL[4], a widely-used visualization software, was then applied to visualize the 3D structure. As is mentioned above, the chain that possibly acts as active pocket was plotted as yellow, while the rest are blue:
Fig.3 The 3D structure of PTS (yellow stands for the possible active pocket while the rest of chain are blue)
Investigation into the active site
Molecular docking presents the possible interaction between a certain pair of receptor and ligand, thus revealing the potential active site in a certain degree. During the docking, the PTS is considered as the receptor while the FPP is considered as the ligand.
Through the ZINC database[5], we obtained the 3D structure of FPP with the code number as ZINC12494625. Autodock[6], a widely-used toolkit for predicting the interaction of ligands with macromolecular targets, was then applied. In this procedure, AutoDock uses a semi-empirical free energy force field to evaluate conformations. The force field was parameterized using a large number of protein-inhibitor complexes. The evaluation of binding starts in an unbound conformation between the ligand and protein. The intramolecular energetics are firstly estimated for the transition from the unbound states to the conformation of the ligand and protein in the bound state. Then the intermolecular energetics of combining the ligand-protein is evaluated. The force field includes six pair-wise evaluations (V) and an estimate of the conformational entropy lost upon binding (ΔSconf):
where L refers to the ligand and P refers to the protein. Then, each of the pair-wise energetic terms includes evaluations for dispersion/repulsion, hydrogen bonding, electrostatics, and desolation:
where the W is optimized weighting constants to calibrate the empirical free energy based on a set of experimentally determined binding constants; r stands for the physical distance between two species; A and B are parameters related to dispersion/repulsion interactions; C and D are assigned to give a maximal well depth of 5 kcal/mol at 1.9Å for hydrogen bonds with oxygen and nitrogen, and a well depth of 1 kcal/mol at 2.5Å for hydrogen bonds with sulfur; E(t) provides directionality based on the angle t from ideal H-bonding geometry; V represents the volume that surrounds a given atom and shelters it from solvent, and weighted by a solvation parameter (S) and an exponential term with distance-weighting factor σ=3.5Å.[6]
Autodock works under the PyMOL environment with other plugins including Autodock Vina and Mgltools. Here, we adopted the version of Autodock4, Autodock Vina v.1.2[7], and Mgltools v.1.5.7[8] to configure the environment. By genetic algorithm, multiple docking sites have been found, along with various energy parameters. Among them the Lys-425: HZ3 exhibited the highest potential, which might compile to the previous finding. The result is presented as follows:
Site | Binding Energy (kcal/mol) | Internal Energy (kcal/mol) | Electrostatic Energy (kcal/mol) | Torsional Energy (kcal/mol) | Ligand Efficiency (kcal/mol) |
---|---|---|---|---|---|
Lys-425: HZ3 | -2.22 | -5.5 | -1.21 | 3.28 | -0.09 |
Lys-425: HZ1 | -1.87 | -5.15 | -1.74 | 3.28 | -0.08 |
Lys-425: HZ3 | |||||
Lys-425: HZ1 | -1.72 | -5.0 | -1.67 | 3.28 | -0.07 |
Lys-425: HZ3 | |||||
Lys-425: HZ1 | 1.45 | -4.73 | -0.63 | 3.28 | -0.06 |
Lys-425: HZ1 | -1.43 | -4.71 | -1.69 | 3.28 | -0.06 |
Lys-425: HZ3 |
Table.1 The top 5 active sites and their parameters
Visualization was conducted through PyMOL, with the left representing for a whole view of the ligand-pair, and the right representing a zoom view:
Fig.4 The Lys425: HZ3 docking site
Simulation for co-expression system
Simulation was carried out for this co-expression system. Through the response to different light control conditions, the model will guide the user to optimize the property of the product.
FUNCDYES contains a light-induced element to regulate the product. In this system, the time of light-inducement, along with the carbon source and physical conditions were considered as the input signal. Meanwhile, the amount of lycopene and patchoulol were taken as the output. In a single module that we investigated, the time of light-inducement was the only input, and the amount of methanol was replaced by that of the FPP. Based on the differential equation, the mass accumulation responding to different light control conditions was simulated. By switching the condition, the user can predict and optimize the amount of patchoulol and lycopene, and their ratio as well. Further, we attempted to reinforce the complexity and extend the model to a comprehensive scope.
Abbreviation
Following are the abbreviations in this part:
Notation | Description | Value | Unit |
---|---|---|---|
t | Time | Variant | s |
[el222] | Concentration of EL222 DNA sequence | Variant | mM |
[el222 - mRNA] | Concentration of EL222 mRNA sequence | Variant | mM |
[EL222] | Concentration of EL222 protein | Variant | mM |
[EL222:pts] | Concentration of PTS DNA sequence binding with EL222 | Variant | mM |
[FPP] | Concentration of FPP | Variant | mM |
[pts] | Concentration of PTS DNA sequence | Variant | mM |
[pts - mRNA] | Concentration of PTS mRNA sequence | Variant | mM |
[PTS] | Concentration of PTS | Variant | mM |
[crtE] | Concentration of CrtE DNA sequence | Variant | mM |
[crtE - mRNA] | Concentration of CrtE mRNA sequence | Variant | mM |
[CrtE] | Concentration of CrtE protein | Variant | mM |
[GGPP] | Concentration of GGPP | Variant | mM |
[crtB] | Concentration of CrtB DNA sequence | Variant | mM |
[crtB - mRNA] | Concentration of CrtB mRNA sequence | Variant | mM |
[CrtB] | Concentration of CrtB protein | Variant | mM |
[Phytopene] | Concentration of Phytopene | Variant | mM |
[crtI] | Concentration of CrtI DNA sequence | Variant | mM |
[crtI - mRNA] | Concentration of CrtI mRNA sequence | Variant | mM |
[CrtI] | Concentration of CrtI protein | Variant | mM |
Lycopene | Concentration of lycopene | Variant | mM |
[kcrtE] | CrtE transcription | - | s-1 |
[kcrtB] | CrtB transcription | - | s-1 |
[kcrtI] | CrtI translation | - | s-1 |
[kel222-mRNA] | EL222 translation | 1.2*10-2 | s-1 |
[kpts-mrna] | PTS translation | 4*10-4 | s-1 |
[kcrtB-mrna] | CrtB translation | 1*10-2 | s-1 |
[kcrtE-mrna] | CrtE translation | 3.5*10-2 | s-1 |
[kcrtI-mrna] | CrtI translation | 2.8*10-2 | s-1 |
[kEL222] | Binding rate of EL222: pts complex | 2.2*104 | M-1s-1 |
[km PTS] | Michaelis constant of patchoulol expression | 8*10-3 | mM |
[kCrtE] | GGPP expression rate | - | s-1 |
[k-CrtE] | GGPP disassociation rate | - | M-1s-1 |
[km CrtB] | Michaelis constant of phytoene expression | 3.74*10-1 | mM |
[km CrtI] | Michaelis constant of lycopene expression | 3.33*10-1 | mM |
[Vm PTS] | Maximum reaction rate of patchoulol | 0.072 | Ms-1 |
[Vm CrtB] | Maximum reaction rate of phytoene | - | Ms-1 |
[Vm CrtI] | Maximum reaction rate of lycopene | - | Ms-1 |
Table.2 The abbreviation of this part
Ordinary differential equation for light-induced switch
EL222, a photosensitive protein is introduced to regulate the expression of patchoulol. The protein remains a monomer structure under the dark environment. When exposed under the blue light (normally 460~480 nm), the structure changes and binds to DNA sequence easily. The information of these parts has been recorded in our website.
Fig.5 The light-induced reaction by EL222[8]
Under the experimental condition with blue light of 475nm inducing, following the ODEs characterize the light induced reaction:
with initial conditions:
other notations have been declar in Table.2. According to a previous study[9], the reaction rate constant of the DNA binding step was independent of the DNA sequence, regardless of binding affinity. The binding efficiency shows no significant difference among various sequence. Hence, we estimated the kEL222 as 2.2*104 M-1s-1.
Simulation for the co-expression in a single module
In this single system containing the FPP, the patchoulol synthesis module, and the lycopene synthesis module, the inducement of light is considered as the input, while the response of lycopene and patchoulol synthesis can be taken as the output. We primarily inspected the kinetic characteristic in such a single module.
Fig.6 Gene circuit in the single module
Based on previous studies, we constructed the ODEs and examined the kinetic data. The ODEs for product synthesis are listed here:
with initial conditions:
where all the notifications and kinetic data are shown in Table.2.
With SimBiology, the ODEs were then simulated under various conditions. We first performed sensitivity analysis to examine the parameters (Fig.7). According to the result, except for crtE, the expression of other species is less sensitive to the variation of input signal. The errors from these papameters might cause only slight disturbance, regardless some kinetic data remains lacked.
The simulation result was corresponded to the control group, long-time induced group and short-time induced group. The simulation for control group, which means simply lycopene synthesis, is shown in Fig.7:
Fig.7 Sensitivity analysis (left) and the simulation of control group (right)
Compared to long-time inducement, the short-time group achieved a more balance ratio between two products, which matched well with the experiment and demonstrated the advantage of dynamic regulation:
Fig.8 The simulation of long-time induced group
Hopefully, through the response to different light control conditions, the model will guide the user to optimize the property of the product.
Extension for the metabolic pathway
To carry out the future work, we made an attempt to extend the model to a comprehensive scope. With more complexity, the model will take various input signals into account, including the carbon source, nitrogen source and other physical conditions. By combination with the genome-scale network, the model can further predict the growth rate of cells accurately. Multiple methods can be merged and applied, which recently comes into a trend in metabolism analysis. The combination of dynamic flux balance analysis (dFBA) with various kinds of data limited, may perform well when lacking the enzymatic data, meanwhile overcoming the disadvantage of static analysis. It is expected to apply these methods in our future work!
In this part, we have primarily constructed a genome-scale model with FBA method. During the construction, we also gathered a lot information and our experience are documented here. The file contains a genome-scale model of Pichia pastoris, which is the modification of a previous model Imt1026 (v3.0) for a better operation under multi environment, including COBRA and CellNetAnalyzer.
References
[1] Faraldos JA, Wu S, Chappell J, Coates RM. Doubly deuterium-labeled patchouli alcohol from cyclization of singly labeled [2-(2)H(1)]farnesyl diphosphate catalyzed by recombinant patchoulol synthase. J Am Chem Soc. 2010 Mar 10;132(9):2998-3008. doi: 10.1021/ja909251r.
[2] National Center for Biotechnology Information (NCBI)[Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; [1988] - [cited 2022 Oct 04]. Available from: https://www.ncbi.nlm.nih.gov/
[3] Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583-589. doi:10.1038/s41586-021-03819-2
[4] The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.
[5] Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG. ZINC: a free tool to discover chemistry for biology. J Chem Inf Model. 2012;52(7):1757-1768. doi:10.1021/ci3001277
[6] Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S. and Olson, A. J. (2009) Autodock4 and AutoDockTools4: automated docking with selective receptor flexiblity. J. Computational Chemistry 2009, 16: 2785-91.
[7] Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31(2):455-461. doi:10.1002/jcc.21334
[8] Takakado A, Nakasone Y, Terazima M. Sequential DNA Binding and Dimerization Processes of the Photosensory Protein EL222. Biochemistry. 2018;57(10):1603-1610. doi:10.1021/acs.biochem.7b01206