Model | Thessaly - iGEM 2022

Overview

Our team's modeling effort is focused on providing a clear and detailed description of the function of our genetically engineered plants. Although modeling complex organisms such as eukaryotic cells is a challenging task, a comprehensive literature has been established for the formulation of simple deterministic models that provide an accurate depiction. Thus, we designed a continuous deterministic model for a genetic inverter in a eukaryotic cell with the goal of an in silico simulation of our system’s function, the prediction of its time response with its environment and the merging of wet & dry lab work to the development of our design.

Design

Anthropogenic eutrophication is highly linked with increased phosphate inflow in natural water bodies. Due to the overabundance of toxic algal blooms and, consequently, the increase in microcystin concentrations, the diversity and disposition of aquatic plants are disrupted (1,2).The wet lab members of our team designed a gene network, in which a riboswitch with a particular aptamer recognizes Microcystin-LR toxin. In the presence of Microcystin-LR, the riboswitch will suppress the Tet-Repressor (Tet-R) protein, which is the next part of our system. TetR's inhibition will lead to the optimal expression of the PHT1 protein family's Pi transporters, leading to increased Pi uptake by the roots.

States

In modeling terms, repressors and microcystins serve as our input and Pi transporters as our output. Their interaction is similar to a digital logic gate called IMPLY gate (3), where if A (the repressor) is true(is produced) then the output (Pi-transporter proteins) follows B (microcystins), otherwise output is always true. This means that our inducer (microcystins) controls the way our system behaves.

	A	B	A --> B
	tetR	MC-LR	Pi-transporter
State 1	1	0	0
State 2	1	1	1
State 3	0	1	1
State 4	0	0	1

Figure 1. NAND Logical Gate.

Apparently, our system advances in four possible states.

State one

The first state is when tetR protein levels are prevalent and microcystin absent. This state is simple and practically means that our plants are not in eutrophic water at the time. As a consequence the phosphate transporters are repressed.

Figure 2. State 1 phase, high signal is interpreted as 1 and the low signal as 0.The blue color signifies microcystin levels and the red one the repressor.

State two

The second state is when both inputs demonstrate high signaling. A sudden spike in microcystins occurs and both inputs have high signals momentarily at a very small interval time ΔΤ while a drop in tetR protein levels follows. This is the state that is probably going to arise when placing the plants in a eutrophic water body for the first time assuming the riboswitch activates instantly and effectively.

Figure 3. State 2 phase, high signal is interpreted as 1 and the low signal as 0. The blue color signifies microcystin levels and the red one the repressor

State three

The third state occurs when there is only presence of microcystins and no tetR protein levels. This state is fairly simple and the one that is expected for our system to be operating most of the time (ΔΤ). Maximum phosphorus uptake will be achieved during this state.

Figure 4. State 3 phase, high signal is interpreted as 1 and the low signal as 0.The blue color signifies microcystin levels and the red one the repressor.

State four

The last is the absence of both the repressor and microcystin which realistically would happen in a transitional stage where the signal of microcystin declines from a high to low signal and tetR protein levels respond to an increase after a time ( ΔΤ). State one occurs during Δτ. In that short time interval Pi-proteins are not repressed

Figure 5. Diagram 4: State 4 phase, high signal is interpreted as 1 and the low signal as 0.The blue color signifies microcystin levels and the red one the repressor.

It is of importance to understand that these states are of different order of magnitude in the dimension of time. For example, states 2 and 3 are in the span of hours or even days while 1 and 4 are in mere seconds or less.

Course of Action

A continuous set of states is considered a course of action. The manner in which our four states interact is governed by our application of the system. A typical course of action would entail the modified plants ready to use at our lab. Since there are no microcystins in our lab to our knowledge then we are in state 1 (tetR 1 /MC-LR 0). Next, after applying our innovative CFW to the selected eutrophic water body, the system adjusts to a new state momentarily, state 2(tetR 1/MC-LR 1). Then state 3 (tetR 0/MC-LR 1)) activates for a relatively longer period of time and finally when the eutrophic body is microcystin neutral, state 4(tetR 0/MC-LR 1) shortly manifests itself and is followed by state 1 again.This set of states forms a cycle of operations of 1-2-3-4-1.

Figure 6. Finite States Machine Diagram.

Figure 7. Course of Action Plots.

Concentration Estimate

A closer look at the above interactions reveals that microcystin levels control the states of our system. Up until now we examined the system in boolean terms of ones and zeroes, but in real world application there are ranges of values to take into consideration. In eutrophic lakes, for example, there are different spectrums of values affected by a variety of factors such as topography, climate, season and wastewater volume inflows. Evidently, there is no set of high and low microcystin values that we can tune our system to react with most eutrophic water bodies. Thus, each application is specifically estimated considering an upper and low limit. The mean microcystin concentration (μg/L) of the eutrophic water body we will remediate is our upper limit and the accepted microcystin concentration for human exposure is our low constraint. The high limit in lake Karla depends on multiple factors and as a result a varying upper limit. Regarding the US EPA drinking water health advisory for microcystins, an acceptable concentration is 1.6 μg/L for adults(4).

MCH: depends on application

MCL = 1.6 μg/L

Deterministic models

In order to understand biological processes, researchers introduced bio-synthetic models to describe the dynamic interactions between biomolecules. Through the application of analytical thought we are able to quantitatively assess populations of molecules in a cell. One such example is deterministic models. Deterministic models assume predictable outcomes independent of diverse inputs without any room for random variability. In detail, accepting that large population sizes of our interacting molecules are fit to derive a mean average and used in the model to deduce a definitive answer. Nonetheless, there are limitations of the deterministic approach when the reaction rates are too slow, typically in small molecular quantities. As a result our analysis is centered around large molecular quantities.

Dynamic analysis of biological systems is frequently achieved with the use of ordinary differential equations (ODEs) in detailing productions for each species represented in the model by different rate equations. In a rate equation shown below, vector x illustrates the production of proteins, mRNA transcripts and any other molecule we evaluate while fi is typically a nonlinear function.

\begin{equation} \frac{dxi}{dt} = fi(x), 1 < i < n, fi : R^n \to\ R \end{equation}

Module 2 consists of the developing interactions between tetR repressor and PHT1 transporter proteins. Both species have to go through the stages of transcription, mRNA maturation and translation.

Transcription occurs when a DNA sequence is copied to mRNA molecule and develops in three steps, transcription initiation where RNA polymerase (in our case RNA Pol Il) binds to a DNA molecule and scans until it recognizes the promoter, a regulatory region of DNA transcription, which helps initiate the transcription of a gene. Elongation which accounts for the addition of nucleotides to the RNA strand and the termination stage when RNA polymerase finalizes the process at the termination region.

In eukaryotic organisms, including plants, premature mRNA is unstable and has to be processed in order to remain in the demanding environment of the cytoplasm. Maturation of mRNA ensues also in three steps, 5’ capping where A 7-methylguanosine cap is added to the 5’ end of the pre-mRNA, 3’ polyadenylation where the 3’ end of pre-mRNA is cleaved and about 250 adenine residues are added to form a polyA tail and RNA splicing where introns are spliced from pre-mRNA and the remaining exons are linked directly to re-form a single continuous molecule, which later can be translated into a protein.

Translation is the process of producing proteins from a mature mRNA molecules in three stages: initiation where a methionine-bearing tRNA binds to the small subunit of the ribosome (divided into a small and large subunit) at same time the mRNA inserts and positions so as the initiation codon is paired with the anticodon of the initiator tRNA. Then the two subunits connect and the initiation complex ribosome-mRNA-tRNA is formed. The second stage of translation is chain elongation, where amino acids are added via the route of new tRNAs carrying the next amino acids and paired with their respective anticodons along the transcript. At this continuous process newly added amino acids form peptide bonds with the previous amino acids carried by tRNAs as the ribosome moves along the transcript and the peptide chain grows. The final stage is termination where a stop codon has no pair tRNA and by special enzymatic reactions the ribosome releases the peptide chain.

To conclude, the three aforementioned processes do not represent a unitary biological picture but a general outline of the main functions that we have to model. Using the dynamic analysis method proposed, we have to select the species involved in these processes to insert to our model so as to contain calculation overload. Since the analysis of all the molecules that take part in the biochemical interactions with ODEs would end up in a huge cluster of equations improbable to solve. Thankfully, selecting only the protein, premRNA and mRNA molecules as our species is enough to form a deterministic approach as shown in literature (5,6,7). Thus, we have six separate species we will introduce to our model premature, mature mRNA and protein for tetR and PHT1 likewise.

For simplicity sake we use the below abbreviations to limit the repetitiveness of these expressions.

$$ premRNA-tetR \to\ pmTR $$
$$ premRNA-PHT1 \to\ pmP1 $$
$$ mRNA-tetR \to\ mTR $$
$$ mRNA-PHT1 \to\ mP1 $$
$$ protein-tetR \to\ pTR $$
$$ protein-PHT1 \to\ pP1 $$

Regarding our propositions above we conclude that our ODEs generally form as follows (parameters will be explained in latter section),

Ordinary differential equation describing the rate production of premRNA species.

Ordinary differential equation describing the rate production of mRNA species.

Ordinary differential equation describing the rate production of protein species.

Hill functions

So far our model evaluates the production of our species but does not involve the ligand binding between the tetR protein at the promoter site of PHT1 DNA sequence. Hill functions interpret the number of ligands that simultaneously bind to a single receptor. In our case the receptor is the PHT1 promoter site and ligand tetR protein. If we consider the reaction: \begin{equation} P1_{promoter} + n \cdot TR_{protein} \rightleftharpoons P1nTR_{promoter - protein,complex} \end{equation} then the fraction of free P1 promoters is: \begin{equation} H(pTR(t)) = \frac{K^n}{K^n + pTR(t)^n} \end{equation} ,where \begin{equation}K^n\end{equation} is equal to the dissociation constant and n is the hill coefficient, also known as cooperativity.

Introducing this function by multiplying it to the Ki2 parameter of the PHT1 premRNA ordinary differential equation, $$ \frac{dpmTR(t)}{dt} = ki1 - ks1 \cdot pmTR(t) $$
$$ \frac{dpmP1(t)}{dt} = ki2 \cdot \frac{K^n}{K^n + pTR(t)^n} - ks2 \cdot pmP1(t) $$
$$ \frac{dmTR(t)}{dt} = ks1 \cdot pmTR(t) - kdm1 \cdot mTR(t) $$
$$ \frac{dmP1(t)}{dt} = ks2 \cdot pmP1(t) - kdm2 \cdot mP1(t) $$
$$ \frac{dpTR(t)}{dt} = kt1 \cdot pTR(t) - kdp1 \cdot mTR(t) $$
$$ \frac{dpP1(t)}{dt} = kt2 \cdot pP1(t) - kdp2 \cdot mP1(t) $$

Assumptions

Modeling a complex biological system is a daunting task and a purely deterministic approach is only sufficient with the proper set of assumptions for most of the processes involved. Our assumptions are as follows,

A single monotonous production rate for transcripts, although a transcript production rate depends on various proteins (transcription factors) and can occur monotonically or in bursts.
RNA polymerase and ribosome levels are not a limit factor even at high expression rates.
RNA polymerase translocation rate is substantially higher such that it does not limit the transcription initiation rate.
Transcription factors bind and unbind to promoter sequence at a faster rate than other processes.
The time required for the mRNA to exit the nucleus is negligible.
Considering cytoplasmic mRNA entry flow rates and nuclear transcriptional initiation rates as similar since we calculate them in molar terms (numbers of molecules) because premRNA size is several times larger than cytoplasmic mRNA.
The way splicing occurs (co- or post transcriptionally) does not affect our model.
The degradation of mRNA is proportional to the total mRNA rather than the cytoplasmic fraction.
The degradation of mRNA in the nucleus is minimal thus all premRNA lost is converted to mRNA.
The nutrients necessary for gene expression (tRNA, amino acids) are in infinite supply during the steady state.
The productions of our species act as continuous functions across time.

Parameter values

Our model demands an initial parameter estimation towards a result to be evaluated and a parameter tuning cycle to ensue, conducing to a realistic representation of our system. Our parameters are the premRNA transcription rates for tetR and PHT1 species respectively (ki1,ki2), defined as the molecules produced per unit of time, splicing coefficients (ks1,ks2), as in the time needed for one molecule of premRNA to be converted, mRNA degradation rates (kdm1,kdm2) which are the time of one molecule of mRNA to be degraded, protein production rates (kt1,kt2) and similarly are the time for one protein molecule to be synthesized and and their corresponding degradation rates (kdp1,kdp2). Lastly, parameters regarding the hill function are the $K^n$ element equal to the dissociation constant KD for the tetR_PHT1 promoter binding interaction and the hill coefficient n.

Parameter	Value	Description	Comments
ki1	34[M/hr]	TetR transcription rate	Ref [10],[13]
ki2	3.4[M/hr]	PHT1 transcription rate	Ref [10],[13]
ks1	23[1/hr]	TetR premRNA splicing coefficient	estimate
ks2	23[1/hr]	PHT1 premRNA splicing coefficient	estimate
kdm1	0.078[1/hr]	TetR mRNA degradation rate	calculated according to the mRNA half life
kdm2	0.06[1/hr]	PHT1 mRNA degradation rate	calculated according to the mRNA half life
kt1	138[1/hr]	TetR protein production rate	Ref [8]
kt2	92[1/hr]	PHT1 protein production rate	Ref [14]
kdp1	1.38[1/hr]	TetR protein degradation rate	Ref [9]
kdp2	0.65[1/hr]	PHT1 protein degradation rate	Ref [9]
KD	6[nM]	tetR dissociation constant	Ref [11]
n	3	tetR hill coefficient	estimation

Model simulation

After our model is set, assumptions defined and our parameters estimated, the last step is to initiate the simulation. We used the SimBiology tool in MATLAB from Mathworks. Initially, we defined our six species (pmTR,mTR,pTR,pmP1,mP1,pP1) and the respective reactions involved (transcription, translation, degradation) in a superficial root cell.

Figure 8. Mathworks SimBiology Root Cell.

Since our system revolves around four states (1 to 4, described in module 1) then we have 4 different plots to describe the tetR-PHT1 dynamic interplay. Essentially, microcystins regulate tetR expression which in turn regulates PHT1 expression, states one (tetR 1 /MC-LR 0), two (tetR 1 /MC-LR 1), three (tetR 0/MC-LR 1) and four (tetR 0/MC-LR 0) are interpreted to module 2 by the existence or not of tetR population and used as an initial condition to dictate four analogous plots. Thus, connecting module 1 and 2 we end up with four plots, two that describe steady states of our two main species and two that dynamically fluctuate tetR and PHT1 populations. The steady state graphs are plot 1, which has a steady state tetR equal to a non-zero and zero for PHT1 and plot 3, which is the inverse. The dynamic plots are plot 2 and plot 4, the first is described as a gradual decrease of tetR population through the inducing action of MC-LR and our riboswitch which negates the transcription rate of the tetR premRNA species and a corresponding increase of PHT1 at the moment tetR levels eliminate. The second depicts a typical repressor action where the repressor increases and reaches a steady state and the inhibited species reaches zero. We set 100 hours as an x axis plot limit ,although the duration is surely longer for our steady state plots and is dependent on a variety of factors. We set up our plots in the order described in module 1(course of action).

Similarly with the states described in module 1, when the plants are at our lab this is the condition that they will exist until we place them in eutrophic water.

Figure 9. Plot 1.

The Y axis is the number of molecules and the X axis the time The red curve represents tetR number of molecules across time,the green curve pht1 number of molecules across time.

At the moment our plants are placed in the eutrophic water body and the instantaneous activation of the riboswitch nullifies the transcription rate of the tetR gene and our system switches steady states. We observe that the system reacts and completely negates the tetR expression at approximately 6 hours while the pht1 expression activates ar 23 hours.

Figure 10. Plot 2.

The Y axis is the number of molecules and the X axis the time The red curve represents tetR number of molecules across time, the green curve pht1 number of molecules across time.

It is the reverse situation of plot 3 and its duration is as long as the phytoremediation needs to yield results.The Y axis is the number of molecules and the X axis the time The red curve represents tetR number of molecules across time,the green curve pht1 number of molecules across time.

Figure 11. Plot 3.

The Y axis is the number of molecules and the X axis the time The red curve represents tetR number of molecules across time, the green curve pht1 number of molecules across time.

Once our results are sufficient and microcystins decrease to an acceptable level then the riboswitch disengages and tetR expression peaks again while PHT1 Pi-transporter proteins zero again to end up to the steady states of plot 3.It is observed that PHT1 levels are negligible at around 9 hours after tetR activation. The red curve represents tetR number of molecules across time,the green curve pht1 number of molecules across time.

Figure 13. Plot 4.

The Y axis is the number of molecules and the X axis the time The red curve represents tetR number of molecules across time, the green curve pht1 number of molecules across time.

Results & Conclusion

To conclude we applied a logical gate to interpret the microcystin-tetR correlation into four possible states. Following this, we proceeded with a deterministic approach of the tetR-PHT1 dynamic interaction in a eukaryotic cell and the correlation of the possible states of module 1 with four respective plots from the deterministic model. Through the selective phosphorus accumulation of our system, we are able to efficiently phytoremediate eutrophic water bodies with increasing Pi-transporter carrier proteins and by effect increasing Pi uptake (12). Although the overexpression of such proteins enhances the ability of the plants to uptake nutrients, we do not know the exact way that the correlation occurs and it is a subject for further study, maybe by a future iGEM team.

References

Eutrophication,V.H.Smith,Reference Module in Earth systems and Environmental Sciences,Encyclopedia of inland Waters 2009, Pages 61-73
Eutrophication,A.J.Gold,J.T.Sims,Reference Module in Earth Systems and Environmental Sciences Encyclopedia of soils in the Environment 2005, Pages 486-494
Ron Weiss, et al., Genetic circuit building blocks for cellular computation, communications,and signal processing. Weiss Natural Computing, Dec 2002.
United States Environmental Protection Agency, Health and Ecological Criteria Division. Drinking Water Health for the Cyanobacterial Microcystin Toxins. Office of Water, June 2015.
Krzysztof Kuchta, et al., Predicting proteome dynamics using gene expression data.}, Scientific Reports, Sep 2018.
Amit Zeisel et al. Coupled pre-mRNA and mRNA dynamics unveil operational strategies underlying transcriptional responses to stimuli. Molecular Systems Biology 7, Sept 2011
Smadar Ben-Tabou de-Leon and Eric H. Davidson, Modeling the dynamics of transcriptional gene regulatory networks for animal development. Dev Biol, Nov 2008.
Kristian Moss Bendtsen, et al., The role of mRNA and protein stability in the function of coupled positive and negative feedback systems in eukaryotic cells. Science Reports, Sept 2015.
Lei Li, et al., Protein Degradation Rate in Arabidopsis thaliana Leaf Growth and Development. The plant cell, Feb 2017.
Kate Sidaway-Lee, et al., Direct measurement of transcription rates reveals multiple mechanisms for configuration of the Arabidopsis ambient temperature response. Genome Biology, Mar 2014.
Hsin-Ho Huang, et al., Analysis of the leakage of gene repression by an artificial TetR-regulated promoter in cyanobacteria. BMC Research Notes, Sep 2015.
Hayato Maruyama and Jun Wasaki, Transgenic approaches for improving phosphorus use efficiency in plants. Academic Press, 2017.
Hao Wei, et al., Transgenic sterility in Populus: expression properties of the poplar PTLF, Agrobacterium NOS and two minimal 35S promoters in vegetative tissues. Tree Physiology, Jan 2006.
Björn Schwanhäusser, et al., Global quantification of mammalian gene expression control. Nature, May 2011.