| Wego - iGEM 2022

Introduction

Overview

Regarding the paraoxon (PXN) hydrolysis experiment , we sought to build a model that can explain our result. First, we aimed to quantify the ability of our target protein organophosphate hydrolase (OPH_P) to hydrolyze PXN into p-nitrophenol (pNP) by fitting our model with the data from the experiments. We also sought to construct a model that is approachable for future high school iGEM teams by forming our model on simple yet useful foundations -- the Law of mass action and Enzyme kinetics. In addition, we proposed that the transportation of chemicals across the membrane is essential for simulating the in vivo system. Then, we sought the optimal condition, more specifically the optimal concentration of isopropyl-β-D-1- thiogalactopyranoside (IPTG), since a high concentration of IPTG exacerbates the expression of our target protein.

Brief Lac Operon Review

According to the work by Griffiths et al. (1999), a literature review of the mechanism of the lac operon, repressor (rep) protein blocks, not the binding of RNA polymerase (RP), but its movement. In other words, RP would still bind to the promoter even when the operator (lac O) is occupied by the repressor and vice versa. That is, the repressor and RP will not interfere with each other upon binding. IPTG would either bind to the repressor in the cytosol or bind to the repressor on the DNA. Bound to IPTG, the repressor's affinity to DNA decreases and thus allows RP to transcribe the message RNA (mRNA) encoding OPH (denoted as OPH_mRNA).

Fig. 1 Regulation of the lac operon (Griffiths et al.,1999, Fig. 14-5)

Preliminary Work

DNA Module

According to lac operon kinetics provided by Montalva-Medel et al. (2021), we observed that their approach to model lac operon using a boolean model. However, we sought that the ordinary differential equation (ODE) system defined by the Law of mass action would be suitable for our study. To model this mechanism, we categorized the plasmids in our engineered Escherichia coli into 6 types and identified five important reactions between these categories as provided below in Tables 1 and 2, respectively. The components involved in these reactions include repressor (rep) and RNA polymerase (RP) in the cytoplasm (fr_rep &fr_RP), IPTG, and the complex of IPTG and rep (C_I_rep).

Table 1: Types of DNA

Table 2: Reaction

To summarize above, we described the relationship between different types of DNA in a simple interaction diagram.

Fig. 2 Interactions of DNA

Module Evaluations

Reviewing the study, which took the same approach as ours, done by Dunaway et al. (1980), we found that the rate reaction (RR) constants of the repressor are very high. The binding and decomposing rate constants for fr_rep to lac O are 7200000 hr^-1μM^-1 and 144 hr^-1, respectively, while those for C_I_rep are 7200000 hr^-1μM^-1 and 144000 hr^-1, respectively. Plugging in these numbers, we stimulated the balance states for different IPTG concentrations, specifically 15.625 μM vs. 2000 μM, using the initial conditions that we will explain below (see Table 6 for a summary).

Fig. 3
Left: DNA under 15.625/2000 μM IPTG
Right: Extract DNA_nrep_RP (using logX axis)

From this simulation plotted in log X scale, we extracted DNA_nrep_RP specifically because this species was the one that can transcribe OPH_mRNA which translates OPH. Yet, the simulation deviated from our experimental data. From Table 3, we can see that, under 15.625 μM IPTG, the production of pNP is disproportional compared with that under 2000 μM IPTG, suggesting that the PXN hydrolysis process is far from completion. This delay in hydrolyzation suggests two possibilities. First, there are much fewer plasmids that can express OPH under 15.625 μM than under μM IPTG resulting in much less OPH to hydrolyze the PXN. Secondly, there should be an observable delay for DNA_nrep_RP to peak up and reach an equilibrium state such that the OPH concentration increases slowly. Yet, from the simulation above, the concentration of DNA_nrep_RP under 15.625 μM IPTG is only 4.68 times lower than that under 2000 μM IPTG. This difference can not account for the discrepancy between the pNP production under these two conditions. This leaves us with the possibility that there is a significant delay for plasmids to reach their equilibrium states. However, the simulation above shows a delay time, the interval marked between two vertical dotted lines in the right chart of Fig. 3, that is insignificant (only a 0.006-hour difference). Therefore, it implied that there were interactions in our system that we had not considered.

Table 3: pNP concentration at 6 hr under 15.625/2000 μM IPTG and 500μM PXN

We attributed the transportation of IPTG across the E. coli membrane to the delay of the equilibrium state. Thus, we added the reaction of IPTG transportation across the membrane. Furthermore, we incorporated the transportation of PXN and pNP for compliance. This improvement divided IPTG into IPTG inside the bacteria (IPTG_in) and IPTG outside the bacteria (IPTG_ex). This division also holds true for PXN and pNP resulting in PXN_in, PXN_ex, and pNP_in, pNP_ex.

Establishment of the Model

Assumptions

Here we made several assumptions to construct our model. First, the liquid within the eppendorf is uniform, therefore the concentration changes of PXN_ex, pNP_ex, and IPTG_ex caused by each bacteria are the same in the whole eppendorf. Second, each bacteria is synchronized to perform the same reactions under a given time, for example, plasmids in the bacteria reach the equilibrium state simultaneously. Finally, the number of bacteria is enough to divide up the space in the eppendorf such that each bacteria encounters the substance in the space that has a volume equivalent to the volume of a bacteria. Therefore, while a molecule transports into a bacteria, the concentration changes of this species inside the bacteria and the space with which it is interacting are the same. These assumptions simplified our model so that we can model the status of our experiment by simulating the interactions of a bacteria that can reflect the situation of the whole eppendorf.

Improved Model

Upon the completion of our DNA module, we obtained the important product of transcription -- mRNA encoding OPH. The two major reactions left are the translation of mRNA and the hydrolysis of PXN. To model the translation, we approximated it to enzyme kinetics in which the enzyme is the ribosome (fr_R), the substrate is the OPH_mRNA, the intermediate complex is the complex of mRNA and ribosome (C_mRNA_R), and the product is the OPH_P. The main difference between our model and enzyme kinetics is that the mRNA will not turn into the OPH_P but retain itself after the reactions. In this model, we assumed that the concentration of tRNA is consistent throughout the whole duration, therefore our translation rate constant (kTL) includes the constant concentration of tRNA.

Fig. 4 Translation

The remaining part of the reactions was modeled by standard enzyme kinetics with a single intermediate complex (C_OPH_PXN).

Fig. 5: Hydrolysis

To construct our model, we used the MATLAB SimBiology Model Builder to create the visible diagram for our model. Each circle represents a reaction with straight lines linking every reactant involved in the reaction and arrows pointing to the products. The two small arrows on top of the circle indicate that the reaction is reversible. Since the pNP concentration is our primary target to observe, we needed a species that can reflect the concentration of the total pNP. Therefore, we added the Monitor_pNP with an algebraic rule that gives its value according to the summation of pNP_in and pNP_ex. There are 20 species interacting, one monitor species, and 25 RR constants. 12 of these constants can be found in the literature review, and the values for the other three constants (kTX, kTL, and kdmRNA) are estimated with the assumption that all of these values are proportional to the length of the sequences. Thus, our model was composed of 20 differential equations. Due to the recently published nature of the OPH we used, there are limited resources regarding this variant of OPH. According to our literature review, there was no research that provides quantitative analysis of the reactions of OPH, such as its hydrolysis. Therefore we contributed to this OPH part by fitting our experimental data with our model to estimate the RR constants of this OPH quantitatively.

Fig. 6 Model overview

Table 4: Species Review

Table 5: Reaction Constants Review & Value

Simulation & Fitting

Initial Conditions

We used a pET expression vector in our engineered bacteria; thus our plasmids have a copy number between 15~20 (Morgan, 2020). In our model, we picked 20 to be our copy number which is equivalent to the number of available DNA to transcribe mRNA. Furthermore, according to Kubitschek and Friske (1986), one molecule in E. coli roughly equals 1nM, thus 20 plasmids equate to 0.02 μM. We further set our model to start without any complex; that is, the 20 plasmids are in the state of DNA_nrep_nRP which give the initial concentration of this species -- 0.02 μM. The initial concentration of fr_rep, fr_RP, and fr_r are 0.05, 3, and 73 respectively (Bintu et al., 2005; Muller-Hill, 1996, p.134; Bremer & Dennis, 2008). The table below provides a summary of the initial condition for our model simulation. IPTG and PXN are set according to the experimental conditions.

Table 6: General Initial conditions

Guessed Values

To test the accuracy of our model, we first arbitrarily yet reasonably assigned value for the constants that are not previously tested. We used the MATLAB SimBiology Model Analyzer to simulate. Since our model contains reaction constants, such as kRD_f, with high values, our model is stiff. Thus, we used the built-in differential equation solver ode23t and set the absolute tolerance to 10^-9. Using the guesses and initial conditions provided below in the table, we had the curves for each species throughout the hydrolysis process. Among all the elements, we were particularly interested in PXN_ex, PXN_in, OPH_P, C_OPH_PXN, and Moniter_pNP, so we plotted these specifically in Fig. 7 to have an overview of the hydrolysis process. This graph makes a reasonable prediction of the enzymes’ behaviors thus we continue to analyze those guesses and estimations of the reaction constants by fitting the model to our experimental data.

Table 7: Simulation setting

Fig. 7 Simulation Curves

Fitting

To perform the quantitative analysis of the OPH protein, we fit our model with the data collected from the experiments under 500 μM PXN and various concentrations of IPTG from 15.625~2000 μM, a total number of 8 sets of data as shown in the table below. We only fit the rate constants that we estimated and those without references, using the initial guesses provided above in Table 7.

Table 8: IDs and Concentrations

Identified Flaws & Solution

Using these guesses we yielded an array of estimated constants, yet we encountered a problem that several constants converge to be negative. However, this can not be true since none of the reactions in our system is zeroth-order, the only case in which the RR constant can be negative. Reviewing the differential equations we implemented for simulation, we found out that those observations were reasonable, rather than systematic error, during the fitting process. In quest of a fit value, the algorithm only searches a range of values around the initial guess and ends at the combination that creates the combination of the constants. This caused a problem we identified as the Equivalent Parameters Problem. By name, it means that there are multiple constants that have the same effect on the curves; while the algorithm computes the curve, they will take these parameters equally without considering their chemical significance, resulting in unrealistic estimations of the RR constants. For example, the estimation we reached using our initial guesses results in a negative kIPTG_out.

Fig. 8 Differential Equation

From the equation above, we can see that kIPTG_in and kIPTG_out are mathematically equivalent. Therefore, the negative kIPTG_out, in fact, acts as kIPTG_in but is proportional to IPTG_in instead of IPTG_ex, which means it contributes to the influx of IPTG into the cell. We denoted this kind of combination of RR constants that have the same mathematical effect as the equivalent set. Therefore, by heightening the value of the counterpart in the equivalence set, we can adjust the one with a negative value to a reasonable number. We devised this method to deal with the negative RR constant estimation systematically rather than randomly testing some other RR constants that only indirectly affect the one we intended to change.

Despite this systematic method and continuous effort, kPXN_out still remained negative, just below zero. This might be because the method we acquired to deal with the Equivalent Parameters Problem had its limitations. We were aware that a constant could be part of multiple equivalence sets, therefore, we can not change a RR constant unlimitedly to an unrealistically high value this would either make other RR constants become negative to balance the effect caused by the augmentation of one value, or would collapse the whole algorithm such that it could not converge to a tolerable estimation. Taking one more glance at our model, we saw that the OPH_P consistently hydrolyzes the PXN inside the bacterium; this suggests that the dynamic equilibrium between the PXN inside and outside the membrane is not reached, given that PXN_in is quickly bound to OPH_P. While the PXN indeed transports across the membrane outward, the presence of OPH_P that binds with PXN and hydrolyzes it into pNP makes our model difficult to estimate kPXN_out merely by the pNP data we collected. Thus, to take PXN influx outnumbering the outflux into consideration, we decided to set kPXN_out to 0 for our stimulation and left it undetermined for the future improvement of both our experiments and model.

Discussion

Results

The removal of kPXN_out from our fitting provides a shift in the value of estimated RR constants previously unseen in our fitting so far. Repeating the method several times we obtained a set of estimations as shown in the table below.

Table 9: Fit Results
(0* see the previous section for more information)

Verification

The enlargement of these RR constants made the curve turn more sharply than the previous simulation (see Fig. 7) did. We used these values to simulate the production of pNP under another experimental condition (100 μM IPTG and 500 μM PXN). The curve was plotted with the experimental data to see if the simulation can predict the experiment. As shown in Fig. 9 below, the simulation curve made a good estimation for the experiment; thus it suggests our fit results are a reasonable estimation.

Fig. 9 Simulation vs Experimental Data

Applicability and Implication

The fitting results showed that our assumption for three RR constants (kTX, kTL, and kdmRNA) that their values are proportional to the length of the sequences of the target protein is reasonable. Their proposed values are consistent with our initial estimations. Besides confirming our assumption, the values of kOPH_PXN_f, kOPH_PXN_r, khydro, and kdOPH we obtained via fitting are our contributions to this specific variant of engineered OPH . Furthermore, besides kPXN_out we can not determine with our fitting due to the limited experimental data and the fact that PXN influx occurs more readily than outflux, we successfully determine the other five RR constants for transportation of IPTG, PXN, and pNP across the cell membrane. The data we obtained shows that these are rather slow reactions compared to other reactions happening simultaneously in the cell. Thus, this also proved our idea of adding transportation into our model to account for an in vivo experiment.

With such a detailed model, we believe that our model has wide applicability. The DNA module we devised can be applied to all the inducible operons, such as gal operon and L-arabinose, just to name a few, by changing the RR constant values (Sanganeria & Bordoni, 2021). Also, this model only models the PXN hydrolysis experiment, yet since we design it to be as a module, this model is extended to simulate our future experiments such as the poly-P sensor, or the mCherry fluorescent protein. Our model is designed based on fundamental reactions we learned in high school, yet we have shown its ability to describe a complex system. This approach can be followed by future iGEM teams to develop a simple yet accurate model. Finally, the systematic method we used to deal with the Equivalent Parameters Problem can also assist future teams in their fitting process.

Limitation and Future Improvement

One thing that we were aware of is that due to our limited experimental data, there exist multiple estimations for the RR constant that can fit the data almost perfectly. Therefore, an improvement we can make to receive a more accurate value for the RR constants is to have a greater database that allows us to evaluate the fitness of our model with a higher standard. Furthermore, IPTG is known to suppress the expression of protein when its concentration exceeds a certain concentration. Since the reason for such suppression is still obscure, we did not include this biochemical interaction in our model. Even though in our fitting process, we saw peaks of kTX and kTL under 250 μM IPTG which is the optimal condition we identified via experiments, the differences of these RR constants between this concentration and others are not significant enough for us to determine whether this is caused by the suppression of high IPTG concentration or simply the instability of our stiff model. Therefore, we hope to add more reactions to our model that can account for such suppression; thus, we will be able to find out the optimal concentration of IPTG. Last but not least, we also hope to factor in the influence of the toxicity of PXN at high concentrations on the expression of protein and bacterial division. By having such consideration, we will be capable of simulating the experiment more accurately and even simulate the environment in which our bacteria are implemented.

References

Bintu, L., Buchler, N.E., Garcia, H.G., Gerland, U., Hwa, T., Kondev, J., Kuhlman, T., & Philips, R. (2005). Transcriptional regulation by the numbers: applications. Current Opinion in Genetics & Development, 15(2), 125-135. https://doi.org/10.1016/j.gde.2005.02.006.

Bremer, H., & Dennis, P. P. (2008). Modulation of chemical composition and other parameters of the cell at different exponential growth rates. EcoSal Plus, 3(1). https://doi.org/10.1128/ecosal.5.2.3.

Dunaway, M., Olson, J. S., Rosenberg, J. M., Kallai, O. B., Dickerson, R. E., & Matthews, K. S. (1980). Kinetic studies of inducer binding to lac repressor operator complex. The Journal of Biological Chemistry, 255. (10), 3. https://doi.org/10.1016/s0021-9258(19)70435-9.

Dvorak, P., Chrast, L., Nikel, P. I., Fedr, R., Soucek, K., Sedlackova, M., Chaloupkova, R., de Lorenzo, V., Prokop, Z., & Damborsky, J. (2015). Exacerbation of substrate toxicity by IPTG in Escherichia coli BL21(DE3) carrying a synthetic metabolic pathway. Microbial Cell Factories, 14(1). https://doi.org/10.1186/s12934-015-0393-3.

Griffiths, A. J. F., Gelbart, W. M., Miller, J. H., Lewontin, R. C. (1999). Modern Genetic Analysis. W. H. Freeman and Company.

Kubitschek, H. E., & Friske, J. A. (1986). Determination of bacterial cell volume with the Coulter Counter. Journal of Bacteriology, 168(3), 1466–1467. https://doi.org/10.1128/jb.168.3.1466-1467.1986.

Montalva-Medel M, Ledger T, Ruz GA, Goles E. Lac Operon Boolean Models: Dynamical Robustness and Alternative Improvements. Mathematics. 2021; 9(6):600. https://doi.org/10.3390/math9060600

Morgan, K. (2020, November 10). Plasmids 101: Origin of replication. Addgene blog. Retrieved September 28, 2022, from https://blog.addgene.org/plasmid-101-origin- of-replication.

Muller-Hill B. (1996) The Lac Operon: A Short History of a Genetic Paradigm. Berlin: Walter de Gruyter.

Sanganeria, T., & Bordoni, B. (2021, October 12). Genetics, Inducible Operon. National Library of Medicine. Retrieved October 2, 2022.