Modelling iGEM 2022 | Wageningen University & Research

Overview

Modelling the colon

Modelling lactate sensing

For our project we used the versatility of modelling in two distinct ways. Incorporating the feedback we got from the models into our experiments, we let the model inform the implementation and fine-tuning of our living diagnostic.

First, we used modelling to gain insight into how E. coli would behave in the colon by adapting an existing published model to include our engineered bacterium. This model was then used to determine the impact of introducing our living diagnostic on the gut microbiome, as well as to analyse the impact of genetic modifications to our living diagnostic such as adherence. Finally using this model, we devised a dosing strategy to provide a strong bacterial population in the colon for around six days, the time recommended by our stakeholders.

Second, we constructed a model to characterise our lactate sensing biosensor which we fit to experimental data from the lab. Additionally, we made a model for a previously published Clustered Regularly Interspaced Short Palindromic Repeats interference (CRISPRi) system which we hypothesized could be used to tune the lactate concentration at which we wanted to induce chromoprotein expression. After combining the models, we compared the system and recommended that it would be better to not include this system and suggested new experimental designs instead.

The difficulty of analysing the colon

Our living diagnostic must reside in the colon for enough time to be able to detect cancer. Therefore, it is crucial to gain an understanding of how our tool functions in, and interacts with, this environment.

The colon is an intricate environment. Every part of the colon is coated in mucus, which is a protective layer for the human colonic cells. The hollow part of the colon, where faeces move through, is called the lumen [1], see Figure 1. Furthermore, the colon can be divided into compartments along its length with different conditions [2,3]. The proximal colon is the entrance into the large intestine from the small intestine, which consists of the caecum and the ascending colon. It is followed by the transverse colon, which moves horizontally. Finally, the distal colon, which consists of the descending colon, sigmoid colon, and the rectum. Colorectal cancer tumours can be found across all colon sections [4]. This stresses the importance of our living diagnostic being able to detect cancer in all segments, meaning we should not specialise our microbe to only one of these environments.

Figure 1: Overview of the layers inside the colon (left) and the sections of the colon (right).

Unfortunately, developing an in vitro or in vivo model of the intestines is a time consuming and complex task [5]. Therefore, we decided to use an in silico model that would simulate a western diet and describe the concentrations of microbial communities in the colon [6]. We can then include our living diagnostic into the model and analyse its behaviour. This allows us to analyse a large number of situations, providing insights into the implementation of our product [7].

The base model

At the basic level, the model we used considers a western diet as the input and the concentration and residence times of microbes as output. The model, adapted from the work of Muñoz-Tamayo et al. (2010), describes a six compartment system simulating the six colon partitions shown in Figure 2. Five of these compartments have a constant volume and are modelled as continuous stirred tank reactors. However, the distal lumen is modelled as a semi-batch reactor to simulate the accumulation and excretion of faeces. Four major microbial groups are included in the model to simulate the microbiome: sugar-utilising bacteria, lactate-utilising bacteria, bacteria performing acetogenesis, and bacteria performing methanogenesis, as well as 13 metabolites including sugar and lactate. This resulted in a model that consisted of 108 differential equations in total, a full description and explanation of which is found at the bottom of the page .

Figure 2: Overview of the model. Continuous volumes are maintained in the proximal and transverse lumen compartments creating a continuous flow of the diet and produced components, while the distal colon accumulates and secretes volume periodically. From the lumen compartments the tracked variables are exchanged with the mucus, and the mucus exchanges with the lumen and host. The host absorbs short chain fatty acids and secretes mucin to the mucus. Each compartment is perfectly mixed. Figure adapted from [6].

All of the 108 different equations, describing concentrations of microbial groups and metabolites, can be generalised into 4 major groups: soluble liquid components, soluble gas components, microbes, and polysaccharides.

Equation 1 shows the general equation for soluble components, where $\frac{ds}{dt}$ stands for the change in soluble component over time. It is used to describe sugar, lactate, acetate, butyrate, propionate, methane, carbon dioxide, and hydrogen concentrations over time.

$$\begin{align*} \frac{ds}{dt} = \frac{q_{in}}{V} s_{in} - \frac{q_{out}}{V} s - \gamma s + \sum_{j=1}^{5} Y_j \rho_j - k_{L}a (s - K_{H}RTs_{g}) \tag{1} \end{align*}$$

The first term describes the inflow of the component to a specific compartment and the second term the outflow. In these terms $q_{in}$ and $q_{out}$ stands for the flux in an out the compartment, $V$ stands for the volume of the compartment, $s_{in}$ stands for the input concentration and $s$ stands for the concentration of the soluble compound in the studied compartment. The third term describes the absorption of the compound by the host, where $\gamma$ is the absorption rate. The fourth term describes the production and consumption of the compound by the different microbial groups, an elaboration of which can be seen in Muñoz-Tamayo et al., (2010). In this term $Y$ stands for the yield of the specific reaction and $\rho$ is the kinetic rate associated with that reaction. For example, sugar-utilising bacteria consume sugar and produce lactate with a specific yield and with their relevant positive and negative reaction rates. Finally, where applicable, the fifth term describes the liquid-gas exchange of the compound, where $s_{g}$ is the concentration of the compound in gas form, $k_{L}a$ is the liquid-gas transfer coefficient, $K_{H}$ is the Henry’s law coefficient, $R$ is the ideal gas coefficient, and $T$ is the absolute temperature.

The body produces mucins to create the mucosal layers. Mucins are polysaccharides which are degraded into monosaccharides, such as glucose. Due to this a different equation is used for concentration of polysaccharides, $z$, the general version of which is described by Equation 2, these are considered differently because there is an added component of host production of mucins influencing the concentration.

$$\begin{align*}\frac{dz}{dt} = \frac{q_{in}}{V} z_{in} - \frac{q_{out}}{V} z + \frac{q^{m}_{out}}{V} z^{m} - \rho_{1} \tag{2}\end{align*}$$

The first term describes the inflow of polysaccharides, and the outflow is described by the second term. The third term describes the flow of polysaccharides from the mucus in the form of mucin, and the fourth term is the hydrolysis of polysaccharides.

Finally, Equation 3 describes a generalisation of the change of a specific microbial group $x$ in this model.

$$\begin{align*}\frac{dx}{dt} = \frac{q_{in}}{V} x_{in} - \frac{1}{\tau + \frac{V}{q_{out}}} x - a x + b \frac{V^{m}}{V} x^{m} \sum_{j=2}^{11} Y_j \rho_j \tag{3}\end{align*}$$

The first and second terms describe the in- and outflow of the microbe in the compartment, where $x_{in}$ is the input concentration of the relevant microbial group. In the proximal lumen the model considers a constant microbial input from the diet. In the outflow term 𝜏 is added to represent an extra residence time for the microbes in the microbiome due to simulating their aggregation. The third term describes the adherence of microbes in the lumen to the mucus where $a$ is the adherence rate. The fourth term describes the microbes gained due to shear loss created by the faeces moving past the mucus wall where $b$ is the shear rate, $V^m$ is the volume in the corresponding mucosal compartment, and $x^m$ is the concentration of the microbial group in that mucus compartment. Finally, the last term describes microbial growth due to substrate consumption and cell degradation.

Implementation of our living diagnostic into the base model

The implementation of our living diagnostic follows the same general equation as the rest of the microbial groups, where specific parameters were adjusted to reflect literature data. A challenge was presented by the fact that most in vivo analysis of probiotics and the human microbiome is performed on the faeces, an amalgamation of every compartment. This restricted the data that could be used to validate the model.

First, as a basis, wild-type Escherichia coli Nissle (EcN) was implemented in the model. A standard adherence for regular E. coli Nissle to mucus was calculated at 0.038 [d -1] from in vitro experiments where the amount of cells that adhered to immobilised mucin was quantified [8]. When all other parameters were fixed to the default values, simulations resulted in unreasonably long residence time (more than 100 days). However, that standard E. coli Nissle residence time in adult humans and primates is 4 days [9]. Therefore, the degradation rate was adjusted to match experimental data. This resulted in a degradation rate of 1.4 [d-1], which was a factor 140 times higher than the degradation times of the other microbes in the model. While the standard degradation rates of the model are extremely low, the newly fitted rate lies in the feasible range of 0.05 to 1 [d-1] [10].

With these adjustments wild-type E. coli Nissle was implemented into the model. As our living diagnostic is heavily engineered, which impacts its ability to grow, we further modified the model to account for this burden. To estimate the toll that our engineering of the cell will take on its growth, we estimated the impact of chromoprotein production on the growth rate at 75% from literature data [11]. This was then used to estimate the specific consumption rate, which is part of the kinetic rate calculation $\rho$ in Equation 3, of our living diagnostic at 6.58 [mol sugar/mol biomass], or 83% of the full rate [12]. This results in two full general equations for E. coli Nissle: for the lumen compartments Equation 4 and for the mucus compartments Equation 5. These differ from the normal equation for bacteria in a number of ways. In the second term the residence time, $\tau$ is set to zero due to not being part of the established microbiome. In the third term the standard adherence, $a$, is set to 0.038. The fourth term has a factor 0.83 on the $\rho$ equation, and finally in the fifth term $k_{dec, ecn}$ is set to 1.4.

$$\begin{align*} \frac{dx_{ecn}}{dt}=\frac{q_{in}}{V}x_{ecn,in}-\frac{1}{\tau+\frac{V}{q_{out}}}x_{ecn}-a{\ x}_{ecn}+b\frac{V^m}{V}x_{ecn}^m+Y_{ecn}\ \rho_{ecn}\ – k_{dec,ecn xecn} \tag{4} \end{align*}$$
$$\begin{align*} \frac{dx_{ecn}^m}{dt}=a\ \frac{V}{V^m}\ x_{ecn}^m-b\ x_{ecn}^m+Y_{ecn}\ \rho_{ecn}–k_{dec,ecn}\ x_{ecn}^m \tag{5} \end{align*}$$

This results in a successful implementation of our living diagnostic into the model. A simulation assuming a normal intake of probiotic (0.2441 g/l) at 10 d can be seen in Figure 3.

Figure 3: Concentrations of our living diagnostic implementation in different colon compartments, after an input of 0.2441 g/l at time 10.

With the successful implementation of our living diagnostic, we determined a number of important questions to answer with this model. First, we analysed the effect it has on the microbiome, assessing that we can safely use it in vivo. Second, we assessed a number of different strategies to maintain a significant population in the mucus compartments for the desired time, key to ensuring successful cancer detection. Third, we studied the effect that our activatable kill switchwould have on the residence time of out microbe.

Our living diagnostic has a small impact on the gut microbiome

The microbiome is crucial to a healthy life, being considered an important endocrine organ [13]. This makes it critical to research the impact that our living diagnostic has on this system. This was analysed by simulating the original model and the model with the implementation of our living diagnostic. The impact was measured at the peak concentration of our living diagnostic, at around 0.7 days after its input, seen in Figure 3. The results of this analysis can be found in Figure 4.

Figure 4: Simulations on the impact of our living diagnostic on the microbiome at the highest peak of its concentration. Concentrations were normalised on their own compartments and percentage change was averaged. The error bars are due to averages taken over multiple compartments.

The error bars in this figure represent the variability of the impact over multiple compartments. As can be seen the microbiome is minimally affected, at a maximum of 0.5 +/- 3.5% for the acetogenic bacteria, by the presence of our microbe scenario.

Strategies to strengthen the mucosal residence time of our living diagnostic

The mucus residence time is an important factor for the functioning of our living diagnostic since this is where it needs to reside to detect a possible colorectal cancer tumour. For this purpose, we have considered three strategies: increasing the adherence of our living diagnostic, a prebiotic strategy, and different dosing strategies. Since no data is available at this time for the concentration or residence time that is necessary for the functioning of our living diagnostic, we assumed a required minimal concentration of 0.01 g/l for 6 days, following advice from our stakeholders, as a test case.

Adherence increases peak concentration of E. coli Nissle, but does not increase its residence time

To simulate the increase in adherence that could result from the modification and overexpression of fimH, the adherence rate, a in Equation 3, was evaluated over a range of values between 0 and 0.25 [d-1] generated by Latin hypercube sampling. Our living diagnostic showed similar behaviour in all mucus compartments. For simplification, three representative values have been taken and shown in the proximal mucus compartment (Figure 5).

Figure 5: An increase in adherence shows that a larger concentration of E. coli Nissle is able to stay in the mucus for longer, lengthening the critical population residence time but not affecting the total residence time a lot. Shown here is the proximal mucosal compartment, other mucosal compartments showed similar results.

Figure 5 showed that higher adherence results in a higher concentration and longer residence time in the mucus of our living diagnostic. The peaks get wider with more adherence, which indicated that a significant population is present for longer, but the test case is not reached requiring the other strategies to be explored. Moreover, reaching an adherence of a factor 2 more is already hard to engineer.

Prebiotic strategy is ineffective at increasing residence time

Prebiotics are metabolites that certain bacteria prefer to metabolise to support or increase their population in the microbiome [14]. We simulated a prebiotic strategy for our living diagnostic as a dose of extra sugar being input into the model at the same time as our living diagnostic. The input intensity was varied to analyse the amount of sugar needed to strengthen the population of our living diagnostic in the mucus. Unfortunately, the model indicated that the strategy had little to no effect on the concentration of our living diagnostic with an intensity of up to 2 kg/l of sugar.

Designing dosing strategy is a path to implementation

The dosing of our living diagnostic was evaluated in two ways. First, the dosing intensity was changed and the effect on residency time was analysed. Second, a dosing frequency strategy was evaluated. This means that instead of a single bigger dose, multiple smaller doses were used to strengthen the mucus population of our living diagnostic.

The effect of the intensity of the dose on the residence time was assessed extensively up to ten times the dose, in Figure 6 two representative results are displayed to simplify. From this data it can be seen that increasing the dosage intensity does not significantly increase the residence time, not meeting the test case requirements. This is in accordance with the literature, where a 40-fold increase over normal dose in a smaller primate, 4 kg vs 70 kg, only increased the residence time by 3 days, showed a low impact of the input concentration to the residence time [9,15].

The frequency dosing strategy was implemented into the model as a variable number of doses in which the time between dosages is variable. The results of the dosing frequency experiments can be seen in Figure 7.

From this data it can be seen that the frequency dosing strategy can significantly increase the residence time of the microbe. When the time between doses is too low, as seen in case Δt=0.3, the residence time is not significantly increased only the concentration, thus not meeting the test case requirements. In the second case, where Δt=1, the concentration remains above the test case concentration, but it cannot increase the residence time enough. In the third case, Δt=2, the dosages are too far apart, letting the concentration fall too low.

Therefore, another test was performed to tune a real-world scenario of one dose per day for three days. This is slightly lower than necessary to maintain a constant concentration of 0.01 g/l but was chosen for ease of use for the patients. The results of which can be seen in Figure 8. In this figure a dosage regiment can be seen that fulfils all the test case requirements, a concentration above 0.01 for 6 days, meaning a successful implementation was achieved. As soon as real-world values for the required concentration and residence time are available, the dosage requirements can be optimised to attain that goal.

Figure 6: Increasing the dosage intensity does not increase residence time by a lot. Shown is the proximal mucus.

Figure 7: The frequency dosing experiments at the standard input concentration, where $\Delta t$ stands for the time between dosages. Extra doses can significantly impact the residence time. Shown here for the proximal mucus.

Figure 8: Three doses, once per day, provide a strong population of our living diagnostic until around 6 days, shown here in the proximal mucus.

Kill switch optimisation recommended from stakeholder perspective

Finally, the kill switch was modelled to see the effect of a bacteriostatic activatable kill switch. The mechanism for the kill switch does not destroy the cell, leaving the passive adherence and the degradation rate unaffected. Therefore, it was implemented as setting the yield of our living diagnostic, $\gamma$ (Equation 3), to 0 at a certain point in time, as if the kill switch were triggered. The results are found in Figure 9.

Figure 9: Kill switch activation only visibly affects the residence time of our living diagnostic when activated immediately.

As can be seen in the figure, the activation of this kill switch has little to no effect on the residence time of our microbes. The cells are presumed to be dead, but intact due to the mechanism of killing. The only simulation in which it causes a noticeable effect was when it was immediately triggered.

Conclusions

In summary, the model was successfully adjusted and validated with literature to simulate the behaviour of our living diagnostic in the colon. We studied the effect of the implementation of our living diagnostic in the model, which showed a minimal effect on the concentrations of the other microbial groups in the microbiome. The kill switch was assessed by changing the yield to zero simulating stopping metabolic processes. The results of this showed that residence time of our living diagnostic was not significantly impacted, which from experience in our human practices research is less desirable for patients. Therefore, we would recommend adjusting the kill switch to include a more aggressive cell-destroying system to ensure our living diagnostic gets removed from the body as quickly as possible. Furthermore, strategies were used to improve the residence time of our living diagnostic to match our test case, a concentration of 0.01 g/l for 6 days. Increasing the intensity of the dosage and using a sugar prebiotic are not a viable strategy to increase the residence time of our living diagnostic. Although increasing the adherence can improve the residence time slightly, it did not meet the test case requirements. As well as, engineering our living diagnostic to have more adherence is more difficult than changing a dosing regimen. The frequency dosing was successful in meeting our test case requirements and showed a successful implementation path for our living diagnostic. As soon as experiments reveal a target concentration and required residence time, it can be adjusted and optimised to determine a real-world dosing regimen for the successful usage of our living diagnostic.

Open full explanation

The goal

The mechanism by which our living diagnostic detects cancer relies on the biomarkers L-lactate and MMP-9. L-lactate is present in concentrations from 10 to 25mM in cancerous environments. However, L-lactate can already be present in the healthy colon in concentrations of around 1.5 to 3 mM [1,2]. We tested a lactate biosensing circuit, which was not able to distinguish cancer cells from healthy cells by lactate. The circuit showed a high fluorescence output at both healthy (1-3 mM) and cancerous (10-25 mM) lactate concentrations. For more information about this circuit, check out our lactate sensing results.

For this reason, we aimed to design a threshold system using a mathematical model, whereby our chromoprotein is only expressed in environments with high levels of L-lactate.

Biosensor characterisation

A biosensor is often characterized by its dose-response curve, where the response of the biosensor is plotted against the concentration of the sensed compound . We use two terms to describe the performance of our biosensor: ‘operational range’ and ‘dynamic range’. The operational range is the range of concentrations where our biosensor response changes from its minimum to its maximum. The dynamic range (also known as the fold-change) is the difference between ‘OFF’ and ‘ON’ states. See Figure 1 for an example.

Figure 1: Example of a dose-response curve, where the operational range and dynamic range are shown in green and purple respectively.

We consider the beginning of the operational range the threshold at which the circuit activates. Therefore, for our system, this would ideally be exactly at the upper limit of healthy lactate concentrations (approx. 3 mM).

Biosensor tuning - Our hypothesis

The mechanism that we wanted to use for regulation of chromoprotein expression was based on Clustered Regularly Interspaced Short Palindromic Repeats interference (CRISPRi), where single guide RNA-bound deactivated CRISPR associated (dCas) constitutively represses the expression of our chromoprotein. To lift repression, we use an anti-sense RNA (asRNA) that is complementary to the guide RNA. The anti-sense strand can bind to the single guide RNA (sgRNA) preventing the CRISPRi-complex from binding to DNA, lifting repression and enabling transcription of chromoprotein [3]. We hypothesised that by tuning expression levels and binding affinities of antisense and guide RNA, the expression of chromoprotein can be controlled to activate at a certain threshold of lactate.

Tuning a biosensor to a specific operational range, in our case the range of lactate concentrations between healthy and cancerous levels, can be quite an experimental challenge. Therefore, we developed a mathematical model to characterise the system in two stages, first the lactate sensing-ability of the system was modelled using data from the lab, and then we combined it with a model based on data about a previously published CRISPR system and predicted how the system should be tuned in practice to our desired operational range.

Specifically, with our model we can perform sensitivity analysis and test what happens when we change the parameters describing antisense and guide RNA and make an informed decision on our genetic circuit design. Sensitivity analysis has already been proven in the past to be a useful tool for predicting genetic circuit improvements [4].

Model explanation

Our model is based on ordinary differential equations, which describe the components of the system as they change over time. To construct the model, we first made a schematic in which we include all reactions and interactions that we thought possible for the initial biosensor constructed in the lab, which can be seen in Figure 2. For the biosensor including the CRISPRi-asRNA system we made the schematic in Figure 3.

Figure 2: Schematic of the basic lactate sensing system. LldR represses GFP production but when LldR is bound to lactate, production of GFP is activated by the LldR-lactate complex instead. Purple lines represent production/degradation. The red line represents transcription repression, and the line represents transcription activation. Theta symbol represents the pool of amino acids and nucleotides, from which all species are produced and returned to once degraded or diluted.

Figure 3: Schematic of the system with the CRISPRi-asRNA system included. Purple lines represent production/degradation. The red line represents transcription repression, and the line represents transcription activation. Theta has been omitted for clarity.

To construct our model from this schematic we use the law of mass action, a general but effective way to approximate reaction kinetics, which has been used for a long time [5]. To describe the interactions between LldR and the lactate responsive promoter, we used a framework set out by Andrew D. Keller [6]. For this we made some assumptions based on literature on how we think LldR operates [7]. We assume that the promoter can be in three states:

One where free LldR binds to the first operator site of the promoter, enabling weak transcription.
One where free LldR dimerizes and binds to both the first and second operator site, repressing transcription.
LldR-lactate complex can bind to the first operator site, strongly activating transcription.

Considering that transcription and translation are affected by growth [8], we included the growth rate of our organism in terms that describe transcription and translation. This has been done before by Gutiérrez Mena et al. (2022).

Equations & Parameters

Model assumptions

All species are degraded and diluted by growth
Transcription/Translation is affected by the growth rate
To simplify, the Lactate concentration remains constant during the experiment.

List of equations

Equations for biosensor in the lab

$\begin{align*} \mu = \mu_{max}e^{-0.5\left(\frac{t-t_c}{\sigma}\right)^2}\left(1-\frac{[C_x]}{cap}\right) \end{align*}$

$\begin{align*} ALPaGA = \frac{k\_b_{alp} +(alp_{act1} +k\_b_{alp})*alp\_k_{act1}[lldR] +(alp_{act2} +k\_b_{alp})*alp\_k_{act2}[lldRcomplex]}{1 +alp\_k_{act1}[lldR] +alp\_k_{act2}[lldRcomplex] +alp_{k3}[lldR]^2} \end{align*}$

$\begin{align*} \frac{d[lldR_{mRNA}]}{dt} = \frac{\mu}{\mu_{max}}\left(k\_op_{succ} +copy\_nr*k\_b_{p3}\right) -[lldr_{mRNA}]\left(\mu+k\_d_{lldRmRNA}\right) \end{align*}$

$\begin{align*} \frac{d[lldR]}{dt} = \mu*k\_pt_{lldR}[lldR_{mRNA}] -k\_f_{lldRcomplex}[lldR][Lactate]^2 -[lldR]\left(\mu+k\_d_{lldR}\right) \end{align*}$

$\begin{align*} \frac{d[lldRcomplex]}{dt} = k\_f_{lldRcomplex}[lldR][Lactate]^2 -[lldRcomplex]\left(\mu+k\_d_{lldRcomplex}\right) \end{align*}$

$\begin{align*} \frac{d[GFP_{mRNA}]}{dt} = \frac{\mu}{\mu_{max}}*copy\_nr*ALPaGA -[GFP_{mRNA}]\left(\mu+k\_d_{gfpmRNA}\right)\end{align*}$

$\begin{align*} \frac{d[GFP]}{dt} = \mu*k\_pt_{gfp}[GFP_{mRNA}] -[GFP]\left(\mu+k\_d_{gfp}\right) \end{align*}$

$\begin{align*} \frac{d[Cx]}{dt} = \mu[Cx] \end{align*}$

Equations for CRISPRi-asRNA model fit to published data

$\begin{align*}\mu = 1.2h^{-1}\end{align*}$

$\begin{align*}\frac{d[dCas_{mRNA}]}{dt} = k\_b_{dCas} - [dCas_{mRNA}]\left(\mu+k\_d_{dcasmrna}\right)\end{align*}$

$\begin{align*}\frac{d[dCas]}{dt} = k\_pt_{dCas}[dCas_{mRNA}] + k\_f_{1_{rv}}[CasComplex] - k\_f_{1_{fw}}[sgRNA][dCas] + k\_f_{4_{rv}}[asCasComplex] - k\_f_{4_{fw}}[dCas][asComplex] - [dCas]\left(\mu+k\_d_{dcas}\right) \end{align*}$

$\begin{align*}\frac{d[sgRNA]}{dt} = k\_b_{sgrna} + k\_f_{1_{rv}}[CasComplex] - k\_f_{1_{fw}}[sgRNA][dCas] - k\_f_{2_{fw}}[asRNA][sgRNA] + k\_f_{2_{rv}}[asComplex]- [sgRNA]\left(\mu+k\_d_{sgrna}\right)\end{align*}$

$\begin{align*}\frac{d[CasComplex]}{dt} = k\_f_{1_{fw}}[sgRNA][dCas]- k\_f_{1_{rv}}[CasComplex] - k\_f_{3_{fw}}[asRNA][CasComplex] + k\_f_{3_{rv}}[asCasComplex] -[CasComplex]\left(\mu+k\_d_{cascomplex}\right)\end{align*}$

$\begin{align*}\frac{d[asRNA]}{dt} = \frac{I^{n_{inducer}}}{kd_{asrna}^{n_{inducer}}+I^{n_{inducer}}}k\_p_{asrna}+ k\_f_{2_{rv}}asComplex - k\_f_{2_{fw}}[asRNA][sgRNA]+ k\_f_{3_{rv}}[asCasComplex] - k\_f_{3_{fw}}[asRNA][CasComplex] - [asRNA]\left(\mu+k\_d_{asrna}\right) \end{align*}$

$\begin{align*}\frac{d[asComplex]}{dt} = k\_f_{2_{fw}}[asRNA][sgRNA] - k\_f_{2_{rv}}[asComplex] + k\_f_{4_{rv}}[asCasComplex] - k\_f_{4_{fw}}[dCas][asComplex] - [asComplex]\left(\mu+k\_d_{ascomplex}\right)\end{align*}$

$\begin{align*}\frac{d[asCasComplex]}{dt} = k\_f_{3_{fw}}[asRNA][CasComplex]- k\_f_{3_{rv}}[asCasComplex] - k\_f_{4_{rv}}[asCasComplex] + k\_f_{4_{fw}}[dCas][asComplex] - [asCasComplex]\left(\mu+k\_d_{ascascomplex}\right)\end{align*}$

$\begin{align*}\frac{d[GFP_{mRNA}]}{dt} = k\_p_{gfpmrna}\frac{kd_{dcas}^{n_{cascomplex}}}{kd_{dcas}^{n_{cascomplex}} + [CasComplex]^{n_{cascomplex}})} - [GFP_{mRNA}]\left(\mu+k\_d_{gfpmrna}\right)\end{align*}$

$\begin{align*}\frac{d[GFP]}{dt} = k\_pt_{gfp}[GFP_{mRNA}]-[GFP](\mu+k\_d_{gfp})\end{align*}$

Equations for the combined model

$\begin{align*} \mu = \mu_{max}e^{-0.5\left(\frac{t-t_c}{\sigma}\right)^2}\left(1-\frac{[C_x]}{cap}\right) \end{align*}$

$\begin{align*} \frac{d[dCas_{mRNA}]}{dt} = \frac{\mu}{\mu_{max}}\left(k\_b_{dcas}*copy\_nr\right) -[dCas_{mRNA}]\left(k\_d_{dcasmrna}+\mu\right) \end{align*}$

$\begin{align*} \frac{d[dCas]}{dt} = \mu*k\_pt_{dCas}[dCas_{mRNA}] +k\_f_{1_{rv}}[CasComplex] -k\_f_{1_{fw}}[sgRNA][dCas] +k\_f_{4_{rv}}[asCasComplex] -k\_f_{4_{fw}}[dCas][asComplex] -[dCas]\left(\mu+k\_d_{dcas}\right)\end{align*}$

$\begin{align*} \frac{d[sgRNA]}{dt} = \frac{\mu}{\mu_{max}}k\_b_{sgrna} +k\_f_{1_{rv}}[CasComplex] -k\_f_{1_{fw}}[sgRNA][dCas] -k\_f_{2_{fw}}[asRNA][sgRNA] + k\_f_{2_{rv}}[asComplex] -[sgRNA]\left(\mu+k\_d_{sgrna}\right)\end{align*}$

$\begin{align*} \frac{d[CasComplex]}{dt} = k\_f_{1_{fw}}[sgRNA][dCas] -k\_f_{1_{rv}}[CasComplex] -k\_f_{3_{fw}}[asRNA][CasComplex] +k\_f_{3_{rv}}[asCasComplex] -[CasComplex]\left(\mu+k\_d_{cascomplex}\right)\end{align*}$

$\begin{align*} \frac{d[asRNA]}{dt} = \frac{\mu}{\mu_{max}}*copy\_nr*ALPaGA+ k\_f_{2_{rv}}[asComplex] -k\_f_{2_{fw}}[asRNA][sgRNA] +k\_f_{3_{rv}}[asCasComplex] -k\_f_{3_{fw}}[asRNA][CasComplex] -[asRNA]\left(k\_d_{asrna}+\mu\right)\end{align*}$

$\begin{align*} \frac{d[asComplex]}{dt} = k\_f_{2_{fw}}[asRNA][sgRNA] -k\_f_{2_{rv}}[asComplex] +k\_f_{4_{rv}}[asCasComplex] -k\_f_{4_{fw}}[dCas][asComplex] -[asComplex]\left(\mu+k\_d_{ascomplex}\right)\end{align*}$

$\begin{align*} \frac{d[asCasComplex]}{dt} = k\_f_{3_{fw}}[asRNA][CasComplex] -k\_f_{3_{rv}}[asCasComplex] - k\_f_{4_{rv}}[asCasComplex] +k\_f_{4_{fw}}[dCas][asComplex] -[asCasComplex](\mu+k\_d_{ascascomplex})\end{align*}$

$\begin{align*} \frac{d[lldR_{mRNA}]}{dt} = \frac{\mu}{\mu_{max}}\left(k\_op_{succ} +copy\_nr*k\_b_{p3}\right) -[lldr_{mRNA}]\left(\mu+k\_d_{lldRmRNA}\right)\end{align*}$

$\begin{align*} \frac{d[lldR]}{dt} = \mu*k\_pt_{lldR}[lldR_{mRNA}] -k\_f_{lldRcomplex}[lldR][Lactate]^2 -[lldR]\left(\mu+k\_d_{lldR}\right)\end{align*}$

$\begin{align*} \frac{d[lldRcomplex]}{dt} = k\_f_{lldRcomplex}[lldR][Lactate]^2 -[lldRcomplex]\left(\mu+k\_d_{lldRcomplex}\right)\end{align*}$

$\begin{align*} \frac{d[GFP_{mRNA}]}{dt} = \frac{\mu}{\mu_{max}}*copy\_nr*k\_p_{gfpmrna}*\frac{kd_{dcas}^{n_{cascomplex}}}{kd_{dcas}^{n_{cascomplex}} +[CasComplex]^{n_{cascomplex}}} - [GFP_{mRNA}]\left(\mu+k\_d_{gfpmRNA}\right)\end{align*}$

$\begin{align*} \frac{d[GFP]}{dt} = \mu*k\_pt_{gfp}[GFP_{mRNA}] -[GFP]\left(\mu+k\_d_{gfp}\right)\end{align*}$

$\begin{align*} \frac{d[Cx]}{dt} = \mu[Cx] \end{align*}$

List of parameters in the model

Name	Description	Value	Unit
$alp_{act1}$	Activated production of pALPaGA when activated by monomeric LldR	0.875	mM/OD.h
$alp_{act2}$	Activated production of pALPaGA when activated by the LldR-lactate complex	4.969	mM/OD.h
$k\_b_{alp}$	Basal production by the pALPaGA promoter	1.748	mM/OD.h
$alp\_k_{act1}$	Binding affinity of LldR monomeric LldR	42.532	mM
$alp\_k_{act2}$	Binding affinity of LldR-lactate complex	67.022	mM
$alpaga_{k3}$	Repression strength of LldR-dimer	8.623	mM
$ cap $	Maximum cell density	0.234	OD
$ copy\_nr $	Copy number of the plasmid	5.199	1
$k\_b_{dcas}$	Constitutive production of dCas	7.422	mM/OD.h
$k\_b_{p3}$	Constitutive production of LldR on the plasmid	8.611	mM/OD.h
$k\_b_{sgrna}$	Constitutive production of sgRNA	4.49	mM/OD.h
$k\_d_{ascascomplex}$	Degradation rate of asCasComplex	5.331	1/h
$k\_d_{ascomplex}$	Degradation rate of asRNA-sgRNA complex	10.793	1/h
$k\_d_{asrna}$	Degradation rate of asRNA	5.751	1/h
$k\_d_{cascomplex}$	Degradation rate of sgRNA-Cas complex	1.151	1/h
$k\_d_{dcas}$	Degradation rate of dCas9 protein	5.076	1/h
$k\_d_{dcasmrna}$	Degradation rate of dCas9 mRNA	2.389	1/h
$k\_d_{gfp}$	Degradation rate of GFP protein	0.003	1/h
$k\_d_{gfpmrna}$	Degradation rate of GFP mRNA	0.067	1/h
$k\_d_{LldR}$	Degradation rate of LldR	3.758	1/h
$k\_d_{LldRcomplex}$	Degradation rate of LldR-lactate complex	0.768	1/h
$k\_d_{LldRmrna}$	Degradation rate of LldR mRNA	7.939	1/h
$k\_d_{sgrna}$	Degradation rate of sgRNA	1.389	1/h
$k\_f_{3_{fw}}$	Formation rate of asCasComplex	6.016	1/mM.OD.h
$k\_f_{3_{rv}}$	Dissociation rate of asCascomplex	1.1	1/h
$k\_f_{4_{fw}}$	Formation rate of asCasComplex but dCas comes last	6.426	1/mM.OD.h
$k\_f_{4_{rv}}$	Dissociation rate of asCasComplex but dCas comes last	1.76	1/h
$k\_f_{2_{fw}}$	Formation rate of asRNA-sgRNA complex	3.281	1/mM.OD.h
$k\_f_{2_{rv}}$	Dissociation rate of asRNA-sgRNA complex	7.799	1/h
$k\_f_{1_{fw}}$	Formation rate of dCas-sgRNA complex	9.631	1/mM.OD.h
$k\_f_{1_{rv}}$	Dissociation rate of CasComplex	0.855	1/h
$k\_f_{LldRcomplex}$	Formation rate of LldR-lactate complex	0.234	1/mM^2.OD.h
$k\_op_{succ}$	Production rate of LldR from the operon	0.624	mM/OD.h
$k\_p_{gfpmrna}$	Max production rate of GFP_mRNA	0.674	mM/OD.h
$k\_pt_{dcas}$	Proportionality constant of dCas9 translation	6.854	1
$k\_pt_{gfp}$	Proportionality constant of GFP translation	0.705	1
$k\_pt_{LldR}$	Proportionality constant of LldR translation	8.91	1
$kd_{dcas}$	Dissociation constant of casComplex with DNA	1.142	mM
$\mu_{max}$	Maximal specific growth rate	0.405	1/h
$n_{alpaga}$	Hill coefficient of ALPaGA promoter	2	1
$n_{cascomplex}$	Hill coefficient dCas repression	4	1
scale	Scaling factor, which incorporates unmodeled fluorescence production and dilution effects	327.984	1
$ \sigma $	Spread of the growth curve	4.292	h
$t_c$	Time where maximal growth is reached	10.423	h

Click here to show the equations, parameter descriptions and values

Experiment simulation

Initial system and combined model

We first simulated 24 hours of ‘overnight’ growth without lactate, then we diluted everything back with the same dilution factor until the starting OD was back to its initial value (0.05). Afterwards, we changed the concentration of lactate and simulated the system for 21 hours. This mimics what is done in the lab: First a pre-culture, which is then diluted and measured over time in, in our case, a plate reader. In the initial conditions for the pre-culture everything started at 0, except the cell population which was set to 0.05 OD.

CRISPRi-asRNA model

Since the data starts in what we assume a steady state where no dCas or sgRNA is present, we approximated the initial GFP condition by setting the equations to zero and solving for GFP, shown below:

$$\begin{align*} \frac{d[GFP_{mRNA}]}{dt} = k\_p_{gfpmRNA} -\left[GFP_mRNA\right]\left(\mu+k\_d_{gfpmrna}\right)= 0 \end{align*}$$ $$\begin{align*} \left[GFP_mRNA\right]=k\_p_{gfpmRNA}/\left(\mu+k\_d_{gfpmrna}\right)\end{align*}$$ $$\begin{align*} \frac{d[GFP]}{dt} = k\_pt_{gfp}*[GFP_mRNA] -[GFP]\left(\mu+k\_d_{gfpmrna}\right)= 0 \end{align*}$$ $$\begin{align*} k\_pt_{gfp}\ast k\_p_{gfpmRNA}/\left(\mu+k\_d_{gfpmrna}\right)-\left[GFP\right]\left(\mu+k\_d_{gfpmrna}\right)= 0\end{align*}$$ $$\begin{align*} \left[GFP\right]=k\_pt_{gfp}\ast k\_p_{gfpmRNA}/\left(\mu+k\_d_{gfpmrna}\right)/\left(\mu+k\_d_{gfpmrna}\right)\end{align*}$$ $$\begin{align*} [GFP] =k\_pt_{gfp}\ast k\_p_{gfpmRNA}/\left(\left(\mu+k\_d_{gfp}\right)\left(\mu+k\_d_{gfpmrna}\right)\right)\end{align*}$$

This set the initial concentration of GFP. We then simulated the system for a total of 18 hours, where at time of induction the inducer concentration was changed to 5mM.

Model fitting

Since there is very little data on what values the parameters should be, we used Latin hypercube sampling to sample 1 million parameter sets, and the top 1000 scoring parameter sets were selected for optimisation by the minimisation algorithm L-BFGS-B in Python. For the scoring function we calculated the weighted sum of squared error between the data and the simulation:

$$Score(j) = \sum_{i=t_0}^{t_f}\frac{\bigg(Fluorescence_{sim}(i,j)-Fluorescence_{data}(i)\bigg)^2}{\sigma_i^2}\nonumber$$

For a specific parameter set $j$, where for timepoint $t_0$ to the final timepoint $t_f$, the difference between simulated and experimental timepoints is calculated, squared, divided by the variance of the data ($\sigma^2$) and summed together.

Sensitivity analysis

To make predictions about how the system should be changed, we performed one-at-a-time parameter sensitivity analysis. As the name suggests, each parameter of the model was increased and decreased by a maximum of factor 10 one at a time and simulated to see what effect it would have on the dose-response curve.

Model fit to experimental data

After fitting the model, we compared it to data for every tested condition, which can be seen in Figure 4 below. Our model fits the plate-reader experiment reasonably well, except for the 20 mM condition where the predicted fluorescence is much lower than experimental values, as seen in Figure 5.

Figure 4: Comparison of the data (solid) and simulations (dashes). Orange: 0.05 mM L-lactate, Turquoise: 0.2 mM L-lactate, Red: 0.5 mM L-lactate, Green: 2 mM L-lactate, Purple: 5mM L-lactate, Yellow: 20 mM L-lactate.

Matching the CRISPRi-asRNA model to data

We fit our second model to time-series data from Lee et al. [3], where production of the antisense RNA is induced at various time-points, as shown in Figure 5.

Figure 5: Fit of the asRNA model to experimental data from Lee et al., 2016 [3], where the CRISPRi complex is induced at t=0 and asRNA is induced at 7 or 9h respectively. A time-delay was implemented at the start and at time of induction, to account for time cells take to acclimatise to new conditions.

The model showed a good enough qualitative fit to the data for us to then merge the models and simulate what would happen if we combined both systems.

Predicted dose-response when systems are combined

After fitting each model separately, we combined the two models, simulated the response to lactate and compared the two systems, which can be seen in Figure 6 below.

Figure 6: Comparison of the lactate sensing system models with and without CRISPRi-asRNA system, with the 5%, 50% and 95% response levels marked with red dots. Fluorescence was normalised by dividing each curve by its maximum.

For both systems, the EC5 (5% response) is predicted to be much lower than healthy lactate levels, indicating that this system would have already elevated expression levels in healthy colons which is undesired. The source of the background response at low lactate concentrations is likely due to the leakiness of the lactate promoter [9].

When dose response curves for our system with and without the asRNA extension were compared, the full asRNA system performed worse in both operational and dynamic range. The only upside to the asRNA system is that the slope of the response (sensitivity) is steeper than without the asRNA system, meaning that the operational range is shorter. This could be advantageous to see an already significant change in expression when developing cancer. This means that the system we are using now is inadequate to distinguish healthy lactate concentrations from cancerous concentrations. For proper lactate sensing to work we need to adapt the system to shift the operational range to higher concentrations, either by tuning components that are already there or adding and/or removing components. Which is why we performed sensitivity analysis.

Sensitivity analysis

We performed one-at-a-time sensitivity analysis for the system where asRNA is included. Our hypothesis was that by tuning the sgRNA expression and/or the asRNA expression, we could shift the operational range to the desired range of around "3 to 10" mM. The effects of changing every parameter on the EC5 can be seen in Figure 7 below.

Figure 7: Sensitivity analysis summary which show the increase or decrease in the lactate concentration where 5% of the dynamic range is reached, when changing a single parameter by a factor of 10. alp_act1 and alp_act2 were omitted because they skewed the chart due to wrong interpolation.

What we see is that most of the parameters corresponding to the asRNA system do not have much of an effect on the shift in operational range. For example, $k\_{f2_{rv}}$ and $k\_{f2_{fw}}$, which describe the rate at which asRNA and sgRNA bind together, has almost no effect on the operational range. Instead, we see that the parameters related to LldR have the most impact on the systems operational range.

From these results, we selected three parameters to investigate further:

$k\_b_{p3}$, describing the constitutive production of LldR. Increasing LldR has a positive effect on both the operational range and decreases the background production, shown in Figure 8.

Figure 8: Sensitivity analysis results showing the effect of LldR transcription on the system. Marked with red dots is when the system reaches 5%, 50% and 95% of the total response

$k\_f_{LldRcomplex}$, describing the rate at which the LldR-lactate complex can form. Changing this parameter only affects the operational range, shown in Figure 9.

Figure 9: Sensitivity analysis results showing the effect of the LldR-lactate complex formation rate on the system. Marked with red dots is when the system reaches 5%, 50% and 95% of the total response

$k\_b_{alp}$, describing the basal transcription rate of asRNA under the pALPaGA promoter. Reducing basal transcription reduces the background expression (shown in Figure 10), which also shifts the operational range to higher concentrations.

Figure 10: Sensitivity analysis results showing the effect of basal transcription of pALPaGA on the system. Marked with red dots is when the system reaches 5%, 50% and 95% of the total response

Suggestions for future experiments

From our sensitivity analysis we learned that there are parameters that could be changed to improve our system, but how would we implement this experimentally?

Increasing $k\_b_{p3}$, describing the rate at which LldR is produced from our plasmid, would decrease background production and improve the operational range by shifting the EC5 to higher concentrations. In practice, this should be relatively easy to achieve. We can do this by using a stronger promoter or including multiple copies of the gene. Since the goal of this strategy is to increase the total amount of LldR produced, we can also investigate improving the translation of the mRNA transcript with a better ribosome binding site or tune the codon bias [10]. But initially overexpression would be much simpler.

Decreasing $k\_f_{LldRcomplex}$, describing the rate at which the LldR-lactate complex is formed would shift the operational range to higher concentrations. Without complex protein engineering we cannot directly change this rate. However, in the LldR complex equation the production term for complex formation is ${k\_f_{LldRcomplex}* [LldR] * [lactate]^2}$,indicating that the rate of complex formation is most affected by the lactate concentration in the cells. In literature it is mentioned that the internal concentration can be tuned by increasing or decreasing the amount of analyte that is exported outside the cell [11].

However, the known transporters for lactate (LldP and GlcA) import lactate into the cell instead of exporting [12]. Therefore, we came up with another strategy for tuning the internal lactate concentrations in the cell: Lactate dehydrogenase. This enzyme converts lactate to pyruvate, lowering the internal concentration of lactate in the cell. Overexpression of this enzyme would be much easier than attempting to engineer the lactate transporters to export instead. The conversion of lactate to pyruvate also gives an advantage to our living diagnostic, enabling it to grow better in cancerous environments.

The final parameter that we chose for improving the system is $k\_b_{alp}$ , describing the basal transcription rate of the pALPaGA promoter. This is the second most important component next to LldR, because without it we would not be able to detect any lactate. While the engineered promoter we are using does seem to work better than the wild type in anoxic conditions, it is also about five times leakier in those conditions. So, there is still much room for improving this part. Another strategy of reducing the leakiness would be getting rid of this part altogether, but instead use a synthetic constitutive promoter which contains multiple LldR binding sites to still enable the sensing of lactate. More binding sites may decrease the background expression levels [13].

Conclusion

We suggested three parameters for improvement which can also be applied to the system without CRISPRi and asRNA, so we suggest continuing with improving the system without these components.

In conclusion, as it stands currently it might be hard to differentiate healthy levels of lactate from cancerous levels of lactate. The background production needs to be reduced and the operational range needs to be shifted to higher concentrations. We have come up with some hypotheses to further improve our system design thanks to our model.