Introduction
Overview
Regarding the
paraoxon (PXN) hydrolysis experiment
, we sought to build a model that can explain our result. First, we aimed
to quantify the ability of our target protein organophosphate hydrolase
(OPH_P) to hydrolyze PXN into p-nitrophenol (pNP) by fitting our model
with the data from the experiments. We also sought to construct a model
that is approachable for future high school iGEM teams by forming our
model on simple yet useful foundations -- the Law of mass action and
Enzyme kinetics. In addition, we proposed that the transportation of
chemicals across the membrane is essential for simulating the in vivo
system. Then, we sought the optimal condition, more specifically the
optimal concentration of isopropyl-β-D-1- thiogalactopyranoside (IPTG),
since a high concentration of IPTG exacerbates the expression of our
target protein.
Brief Lac Operon Review
According to the work by Griffiths et al. (1999), a literature review of
the mechanism of the lac operon, repressor (rep) protein blocks, not the
binding of RNA polymerase (RP), but its movement. In other words, RP would
still bind to the promoter even when the operator (lac O) is occupied by
the repressor and vice versa. That is, the repressor and RP will not
interfere with each other upon binding. IPTG would either bind to the
repressor in the cytosol or bind to the repressor on the DNA. Bound to
IPTG, the repressor's affinity to DNA decreases and thus allows RP to
transcribe the message RNA (mRNA) encoding OPH (denoted as OPH_mRNA).
Fig. 1 Regulation of the lac operon (Griffiths et al.,1999, Fig. 14-5)
Preliminary Work
DNA Module
According to lac operon kinetics provided by Montalva-Medel et al. (2021),
we observed that their approach to model lac operon using a boolean model.
However, we sought that the ordinary differential equation (ODE) system
defined by the Law of mass action would be suitable for our study. To
model this mechanism, we categorized the plasmids in our engineered
Escherichia coli into 6 types and identified five important reactions
between these categories as provided below in Tables 1 and 2,
respectively. The components involved in these reactions include repressor
(rep) and RNA polymerase (RP) in the cytoplasm (fr_rep &fr_RP), IPTG, and
the complex of IPTG and rep (C_I_rep).
Table 1: Types of DNA
Table 2: Reaction
To summarize above, we described the relationship between different types
of DNA in a simple interaction diagram.
Fig. 2 Interactions of DNA
Module Evaluations
Reviewing the study, which took the same approach as ours, done by Dunaway
et al. (1980), we found that the rate reaction (RR) constants of the
repressor are very high. The binding and decomposing rate constants for
fr_rep to lac O are 7200000 hr^-1μM^-1 and 144 hr^-1, respectively, while
those for C_I_rep are 7200000 hr^-1μM^-1 and 144000 hr^-1, respectively.
Plugging in these numbers, we stimulated the balance states for different
IPTG concentrations, specifically 15.625 μM vs. 2000 μM, using the initial
conditions that we will explain below (see Table 6 for a summary).
Fig. 3
Left: DNA under 15.625/2000 μM IPTG
Right: Extract DNA_nrep_RP (using logX axis)
From this simulation plotted in log X scale, we extracted DNA_nrep_RP
specifically because this species was the one that can transcribe OPH_mRNA
which translates OPH. Yet, the simulation deviated from our experimental
data. From Table 3, we can see that, under 15.625 μM IPTG, the production
of pNP is disproportional compared with that under 2000 μM IPTG,
suggesting that the PXN hydrolysis process is far from completion. This
delay in hydrolyzation suggests two possibilities. First, there are much
fewer plasmids that can express OPH under 15.625 μM than under μM IPTG
resulting in much less OPH to hydrolyze the PXN. Secondly, there should be
an observable delay for DNA_nrep_RP to peak up and reach an equilibrium
state such that the OPH concentration increases slowly. Yet, from the
simulation above, the concentration of DNA_nrep_RP under 15.625 μM IPTG is
only 4.68 times lower than that under 2000 μM IPTG. This difference can
not account for the discrepancy between the pNP production under these two
conditions. This leaves us with the possibility that there is a
significant delay for plasmids to reach their equilibrium states. However,
the simulation above shows a delay time, the interval marked between two
vertical dotted lines in the right chart of Fig. 3, that is insignificant
(only a 0.006-hour difference). Therefore, it implied that there were
interactions in our system that we had not considered.
Table 3: pNP concentration at 6 hr under 15.625/2000 μM IPTG and 500μM PXN
We attributed the transportation of IPTG across the E. coli membrane to
the delay of the equilibrium state. Thus, we added the reaction of IPTG
transportation across the membrane. Furthermore, we incorporated the
transportation of PXN and pNP for compliance. This improvement divided
IPTG into IPTG inside the bacteria (IPTG_in) and IPTG outside the bacteria
(IPTG_ex). This division also holds true for PXN and pNP resulting in
PXN_in, PXN_ex, and pNP_in, pNP_ex.
Establishment of the Model
Assumptions
Here we made several assumptions to construct our model. First, the liquid
within the eppendorf is uniform, therefore the concentration changes of
PXN_ex, pNP_ex, and IPTG_ex caused by each bacteria are the same in the
whole eppendorf. Second, each bacteria is synchronized to perform the same
reactions under a given time, for example, plasmids in the bacteria reach
the equilibrium state simultaneously. Finally, the number of bacteria is
enough to divide up the space in the eppendorf such that each bacteria
encounters the substance in the space that has a volume equivalent to the
volume of a bacteria. Therefore, while a molecule transports into a
bacteria, the concentration changes of this species inside the bacteria
and the space with which it is interacting are the same. These assumptions
simplified our model so that we can model the status of our experiment by
simulating the interactions of a bacteria that can reflect the situation
of the whole eppendorf.
Improved Model
Upon the completion of our DNA module, we obtained the important product
of transcription -- mRNA encoding OPH. The two major reactions left are
the translation of mRNA and the hydrolysis of PXN. To model the
translation, we approximated it to enzyme kinetics in which the enzyme is
the ribosome (fr_R), the substrate is the OPH_mRNA, the intermediate
complex is the complex of mRNA and ribosome (C_mRNA_R), and the product is
the OPH_P. The main difference between our model and enzyme kinetics is
that the mRNA will not turn into the OPH_P but retain itself after the
reactions. In this model, we assumed that the concentration of tRNA is
consistent throughout the whole duration, therefore our translation rate
constant (kTL) includes the constant concentration of tRNA.
Fig. 4 Translation
The remaining part of the reactions was modeled by standard enzyme
kinetics with a single intermediate complex (C_OPH_PXN).
Fig. 5: Hydrolysis
To construct our model, we used the MATLAB SimBiology Model Builder to
create the visible diagram for our model. Each circle represents a
reaction with straight lines linking every reactant involved in the
reaction and arrows pointing to the products. The two small arrows on top
of the circle indicate that the reaction is reversible. Since the pNP
concentration is our primary target to observe, we needed a species that
can reflect the concentration of the total pNP. Therefore, we added the
Monitor_pNP with an algebraic rule that gives its value according to the
summation of pNP_in and pNP_ex. There are 20 species interacting, one
monitor species, and 25 RR constants. 12 of these constants can be found
in the literature review, and the values for the other three constants
(kTX, kTL, and kdmRNA) are estimated with the assumption that all of these
values are proportional to the length of the sequences. Thus, our model
was composed of 20 differential equations. Due to the recently published
nature of the OPH we used, there are limited resources regarding this
variant of OPH. According to our literature review, there was no research
that provides quantitative analysis of the reactions of OPH, such as its
hydrolysis. Therefore we contributed to this OPH part by fitting our
experimental data with our model to estimate the RR constants of this OPH
quantitatively.
Fig. 6 Model overview
Table 4: Species Review
Table 5: Reaction Constants Review & Value
Simulation & Fitting
Initial Conditions
We used a pET expression vector in our engineered bacteria; thus our
plasmids have a copy number between 15~20 (Morgan, 2020). In our model, we
picked 20 to be our copy number which is equivalent to the number of
available DNA to transcribe mRNA. Furthermore, according to Kubitschek and
Friske (1986), one molecule in E. coli roughly equals 1nM, thus 20
plasmids equate to 0.02 μM. We further set our model to start without any
complex; that is, the 20 plasmids are in the state of DNA_nrep_nRP which
give the initial concentration of this species -- 0.02 μM. The initial
concentration of fr_rep, fr_RP, and fr_r are 0.05, 3, and 73 respectively
(Bintu et al., 2005; Muller-Hill, 1996, p.134; Bremer & Dennis, 2008). The
table below provides a summary of the initial condition for our model
simulation. IPTG and PXN are set according to the experimental conditions.
Table 6: General Initial conditions
Guessed Values
To test the accuracy of our model, we first arbitrarily yet reasonably
assigned value for the constants that are not previously tested. We used
the MATLAB SimBiology Model Analyzer to simulate. Since our model contains
reaction constants, such as kRD_f, with high values, our model is stiff.
Thus, we used the built-in differential equation solver ode23t and set the
absolute tolerance to 10^-9. Using the guesses and initial conditions
provided below in the table, we had the curves for each species throughout
the hydrolysis process. Among all the elements, we were particularly
interested in PXN_ex, PXN_in, OPH_P, C_OPH_PXN, and Moniter_pNP, so we
plotted these specifically in Fig. 7 to have an overview of the hydrolysis
process. This graph makes a reasonable prediction of the enzymes’
behaviors thus we continue to analyze those guesses and estimations of the
reaction constants by fitting the model to our experimental data.
Table 7: Simulation setting
Fig. 7 Simulation Curves
Fitting
To perform the quantitative analysis of the OPH protein, we fit our model
with the data collected from the experiments under 500 μM PXN and various
concentrations of IPTG from 15.625~2000 μM, a total number of 8 sets of
data as shown in the table below. We only fit the rate constants that we
estimated and those without references, using the initial guesses provided
above in Table 7.
Table 8: IDs and Concentrations
Identified Flaws & Solution
Using these guesses we yielded an array of estimated constants, yet we
encountered a problem that several constants converge to be negative.
However, this can not be true since none of the reactions in our system is
zeroth-order, the only case in which the RR constant can be negative.
Reviewing the differential equations we implemented for simulation, we
found out that those observations were reasonable, rather than systematic
error, during the fitting process. In quest of a fit value, the algorithm
only searches a range of values around the initial guess and ends at the
combination that creates the combination of the constants. This caused a
problem we identified as the Equivalent Parameters Problem. By name, it
means that there are multiple constants that have the same effect on the
curves; while the algorithm computes the curve, they will take these
parameters equally without considering their chemical significance,
resulting in unrealistic estimations of the RR constants. For example, the
estimation we reached using our initial guesses results in a negative
kIPTG_out.
Fig. 8 Differential Equation
From the equation above, we can see that kIPTG_in and kIPTG_out are
mathematically equivalent. Therefore, the negative kIPTG_out, in fact,
acts as kIPTG_in but is proportional to IPTG_in instead of IPTG_ex, which
means it contributes to the influx of IPTG into the cell. We denoted this
kind of combination of RR constants that have the same mathematical effect
as the equivalent set. Therefore, by heightening the value of the
counterpart in the equivalence set, we can adjust the one with a negative
value to a reasonable number. We devised this method to deal with the
negative RR constant estimation systematically rather than randomly
testing some other RR constants that only indirectly affect the one we
intended to change.
Despite this systematic method and continuous effort, kPXN_out still
remained negative, just below zero. This might be because the method we
acquired to deal with the Equivalent Parameters Problem had its
limitations. We were aware that a constant could be part of multiple
equivalence sets, therefore, we can not change a RR constant unlimitedly
to an unrealistically high value this would either make other RR constants
become negative to balance the effect caused by the augmentation of one
value, or would collapse the whole algorithm such that it could not
converge to a tolerable estimation. Taking one more glance at our model,
we saw that the OPH_P consistently hydrolyzes the PXN inside the
bacterium; this suggests that the dynamic equilibrium between the PXN
inside and outside the membrane is not reached, given that PXN_in is
quickly bound to OPH_P. While the PXN indeed transports across the
membrane outward, the presence of OPH_P that binds with PXN and hydrolyzes
it into pNP makes our model difficult to estimate kPXN_out merely by the
pNP data we collected. Thus, to take PXN influx outnumbering the outflux
into consideration, we decided to set kPXN_out to 0 for our stimulation
and left it undetermined for the future improvement of both our
experiments and model.
Discussion
Results
The removal of kPXN_out from our fitting provides a shift in the value of
estimated RR constants previously unseen in our fitting so far. Repeating
the method several times we obtained a set of estimations as shown in the
table below.
Table 9: Fit Results
(0* see the previous section for more information)
Verification
The enlargement of these RR constants made the curve turn more sharply
than the previous simulation (see Fig. 7) did. We used these values to
simulate the production of pNP under another experimental condition (100
μM IPTG and 500 μM PXN). The curve was plotted with the experimental data
to see if the simulation can predict the experiment. As shown in Fig. 9
below, the simulation curve made a good estimation for the experiment;
thus it suggests our fit results are a reasonable estimation.
Fig. 9 Simulation vs Experimental Data
Applicability and Implication
The fitting results showed that our assumption for three RR constants
(kTX, kTL, and kdmRNA) that their values are proportional to the length of
the sequences of the target protein is reasonable. Their proposed values
are consistent with our initial estimations. Besides confirming our
assumption, the values of kOPH_PXN_f, kOPH_PXN_r, khydro, and kdOPH we
obtained via fitting are our contributions to
this specific variant of engineered OPH
. Furthermore, besides kPXN_out we can not determine with our fitting due
to the limited experimental data and the fact that PXN influx occurs more
readily than outflux, we successfully determine the other five RR
constants for transportation of IPTG, PXN, and pNP across the cell
membrane. The data we obtained shows that these are rather slow reactions
compared to other reactions happening simultaneously in the cell. Thus,
this also proved our idea of adding transportation into our model to
account for an in vivo experiment.
With such a detailed model, we believe that our model has wide
applicability. The DNA module we devised can be applied to all the
inducible operons, such as gal operon and L-arabinose, just to name a few,
by changing the RR constant values (Sanganeria & Bordoni, 2021). Also,
this model only models the PXN hydrolysis experiment, yet since we design
it to be as a module, this model is extended to simulate our future
experiments such as the poly-P sensor, or the mCherry fluorescent protein.
Our model is designed based on fundamental reactions we learned in high
school, yet we have shown its ability to describe a complex system. This
approach can be followed by future iGEM teams to develop a simple yet
accurate model. Finally, the systematic method we used to deal with the
Equivalent Parameters Problem can also assist future teams in their
fitting process.
Limitation and Future Improvement
One thing that we were aware of is that due to our limited experimental
data, there exist multiple estimations for the RR constant that can fit
the data almost perfectly. Therefore, an improvement we can make to
receive a more accurate value for the RR constants is to have a greater
database that allows us to evaluate the fitness of our model with a higher
standard. Furthermore, IPTG is known to suppress the expression of protein
when its concentration exceeds a certain concentration. Since the reason
for such suppression is still obscure, we did not include this biochemical
interaction in our model. Even though in our fitting process, we saw peaks
of kTX and kTL under 250 μM IPTG which is the optimal condition we
identified via experiments, the differences of these RR constants between
this concentration and others are not significant enough for us to
determine whether this is caused by the suppression of high IPTG
concentration or simply the instability of our stiff model. Therefore, we
hope to add more reactions to our model that can account for such
suppression; thus, we will be able to find out the optimal concentration
of IPTG. Last but not least, we also hope to factor in the influence of
the toxicity of PXN at high concentrations on the expression of protein
and bacterial division. By having such consideration, we will be capable
of simulating the experiment more accurately and even simulate the
environment in which our bacteria are implemented.
References
Bintu, L., Buchler, N.E., Garcia, H.G., Gerland, U., Hwa, T., Kondev, J.,
Kuhlman, T., & Philips, R. (2005). Transcriptional regulation by the
numbers: applications. Current Opinion in Genetics & Development, 15(2),
125-135. https://doi.org/10.1016/j.gde.2005.02.006.
Bremer, H., & Dennis, P. P. (2008). Modulation of chemical composition and
other parameters of the cell at different exponential growth rates. EcoSal
Plus, 3(1). https://doi.org/10.1128/ecosal.5.2.3.
Dunaway, M., Olson, J. S., Rosenberg, J. M., Kallai, O. B., Dickerson, R.
E., & Matthews, K. S. (1980). Kinetic studies of inducer binding to lac
repressor operator complex. The Journal of Biological Chemistry, 255.
(10), 3. https://doi.org/10.1016/s0021-9258(19)70435-9.
Dvorak, P., Chrast, L., Nikel, P. I., Fedr, R., Soucek, K., Sedlackova,
M., Chaloupkova, R., de Lorenzo, V., Prokop, Z., & Damborsky, J. (2015).
Exacerbation of substrate toxicity by IPTG in Escherichia coli BL21(DE3)
carrying a synthetic metabolic pathway. Microbial Cell Factories, 14(1).
https://doi.org/10.1186/s12934-015-0393-3.
Griffiths, A. J. F., Gelbart, W. M., Miller, J. H., Lewontin, R. C.
(1999). Modern Genetic Analysis. W. H. Freeman and Company.
Kubitschek, H. E., & Friske, J. A. (1986). Determination of bacterial cell
volume with the Coulter Counter. Journal of Bacteriology, 168(3),
1466–1467. https://doi.org/10.1128/jb.168.3.1466-1467.1986.
Montalva-Medel M, Ledger T, Ruz GA, Goles E. Lac Operon Boolean Models:
Dynamical Robustness and Alternative Improvements. Mathematics. 2021;
9(6):600. https://doi.org/10.3390/math9060600
Morgan, K. (2020, November 10). Plasmids 101: Origin of replication.
Addgene blog. Retrieved September 28, 2022, from
https://blog.addgene.org/plasmid-101-origin- of-replication.
Muller-Hill B. (1996) The Lac Operon: A Short History of a Genetic
Paradigm. Berlin: Walter de Gruyter.
Sanganeria, T., & Bordoni, B. (2021, October 12). Genetics, Inducible
Operon. National Library of Medicine. Retrieved October 2, 2022.