For each protein of interest, we first searched the Protein Data Bank (PDB) for experimental 3D structures. For proteins with no structure in the PDB, we relied primarily on SWISS-MODEL, which is dedicated to the homology modeling of 3D protein structures. For small-molecule compounds, we obtained 3D structures mainly from PubChem, the world's largest collection of chemical molecules and their activities against biological assays. For molecular docking, we used AutoDock to dock small-molecule compounds to proteins and HDOCK to dock proteins to DNA.

Structural analysis of System 1

In System 1, we engineered bacteria that specifically express nicotinamidase (PncA), which converts nicotinamide (NAM) to niacin (NA). This increases the concentration of nicotinamide adenine dinucleotide (NAD+) and may in turn improve mitochondrial function in autistic children.

We focused mainly on the 3D structure of PncA, the protein encoded by the pncA gene. Since no 3D structure of PncA is available in the PDB, we turned to SWISS-MODEL for homology modeling to predict the 3D structure of PncA in Lactobacillus plantarum L168.

Predicted 3D structure of PncA

Structural analysis of System 2

In System 2, we designed engineered bacteria that respond to excess mercury or lead ions by expressing human metallothionein (MT), which is known to prevent heavy metal accumulation in children with ASD. Specifically, we selected the MerR and PbrR proteins to detect the heavy metal ions mercury (Hg) and lead (Pb). The principle is that the ions, once bound to MerR at the merR promoter (PmerR) or to PbrR at the pbrA promoter (PpbrA), activate transcription and trigger expression of the coding product, Lpp-OmpA-MT. Lpp-OmpA is a membrane protein that anchors to the outer membrane of the bacteria, while MT is a protein with considerable affinity for divalent metal ions, making this fusion protein an efficient heavy metal eliminator.

In accordance with the design of this system, we first used HDOCK to dock MerR to PmerR, obtaining the structure and evaluation scores shown below.

MerR and PmerR docking
| Rating | Numerical value |
| --- | --- |
| Docking Score | -240.76 |
| Confidence Score | 0.8600 |
| Ligand rmsd (Å) | 120.53 |
MerR and PmerR docking score table

We then docked PbrR to PpbrA with HDOCK, obtaining the following structure and score table.

PbrR and PpbrA docking
| Rating | Numerical value |
| --- | --- |
| Docking Score | -234.40 |
| Confidence Score | 0.8440 |
| Ligand rmsd (Å) | 88.56 |
PbrR and PpbrA docking score table

To help verify its function, we decided to build a structural model of Lpp-OmpA-MT. As before, we used SWISS-MODEL to predict its structure.

Predicted 3D structure of Lpp-OmpA-MT

Notice that the top half of the predicted model demonstrates the structure of the membranous protein Lpp-OmpA and the bottom half stands for human MT.

Structural analysis of System 4

Although we are on the therapeutic track this year, we also made an effort to develop a test strip to help screen for mitochondrial dysfunction in children with ASD who would respond well to our two engineered bacteria. We therefore selected L-lactate as a biomarker and designed a lactate test strip as well as a test kit. The lldPRD operon is present in Escherichia coli, and its lldR gene encodes the regulatory protein LldR, which regulates lactate metabolism. When L-lactate binds to LldR, the P11 promoter is activated, leading to increased expression of the lacZ gene. A chromogenic reaction then allows us to detect a rise in L-lactate levels.

Given the importance of the sensor in this system, we reasoned that it would be useful to understand the docking of LldR with L-lactate, as well as that of LldR with P11. Since the 3D structure of LldR in our chassis, Escherichia coli Nissle 1917, was not found in the PDB, we again used SWISS-MODEL for homology modeling to predict the 3D structure of LldR. For L-lactate, we obtained the 3D structure from PubChem.

Predicted 3D structure of LldR
3D structure of L-lactate

With the help of AutoDock, we simulated the docking of LldR and L-lactate. The resulting figure is presented below. The docking sites are colored yellow.

Docking of LldR and L-lactate

Finally, using HDOCK to dock LldR to P11, we obtained the following structure along with a score table.

LldR and P11 docking
| Rating | Numerical value |
| --- | --- |
| Docking Score | -186.97 |
| Confidence Score | 0.6769 |
| Ligand rmsd (Å) | 64.40 |
LldR and P11 docking score table
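
The confidence scores in the three HDOCK tables are consistent with the empirical mapping from docking score to confidence score that, as far as we are aware, the HDOCK server documentation gives: confidence = 1 / (1 + e^(0.02 × (docking score + 150))). The Python sketch below is only an illustration (the function name is ours, and the formula should be checked against the current HDOCK help page). It recomputes the confidence score from each reported docking score.

```python
import math


def hdock_confidence(docking_score: float) -> float:
    """Confidence score from a docking score (assumed HDOCK empirical formula)."""
    return 1.0 / (1.0 + math.exp(0.02 * (docking_score + 150.0)))


# (docking score, reported confidence score) taken from the three tables.
reported = {
    "MerR-PmerR": (-240.76, 0.8600),
    "PbrR-PpbrA": (-234.40, 0.8440),
    "LldR-P11": (-186.97, 0.6769),
}

for name, (score, confidence) in reported.items():
    computed = hdock_confidence(score)
    print(f"{name}: computed {computed:.4f}, reported {confidence:.4f}")
```

A more negative docking score maps to a higher confidence score, which is why the LldR and P11 pair ranks lowest of the three.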

Mathematical modeling of System 2

Based on the gene circuit of System 2, we built a mathematical model to simulate the dynamics and functional role of our engineered bacteria. The gene circuits that sense mercury ions and lead ions are connected by an "OR" gate, and their working mechanisms are essentially the same. We therefore focus only on the model for the first (mercury) circuit. In what follows, “Lpp-OmpA-MT” is written as “MT”.

Model Assumptions


  1. The number of gene copies depends on replicon and plasmid copies.

  2. Both RNA polymerase and ribosomes are sufficient.

  3. (1) The time required for transcription factors to bind to genes or to degrade is shorter than that required for gene transcription.

    (2) The time required for gene transcription is shorter than that for translation.

    (3) The time required for gene translation is shorter than the duration of the experiment.

    Therefore, we can use the Quasi-Steady-State Approximation (QSSA) for model simplification.

  4. Protein degradation also includes dilution due to bacterial growth.

  5. The concentration of mRNA and protein is zero at the beginning.

Variables and Parameters

| Name | Explanation |
| --- | --- |
| $[Hg']$ | The concentration of mercury ions that have diffused into cells |
| $[Hg]$ | The concentration of mercury ions in the environment |
| $k_a$ | The coefficient of the diffusion of mercury ions into cells |
| $[MerR]$ | The concentration of MerR |
| $[PmerR\cdot MerR]$ | The concentration of the PmerR and MerR complex |
| $[MerR\cdot PmerR\cdot Hg']$ | The concentration of the complex of PmerR, MerR, and cellular mercury ions |
| $k_{on}$ | The binding coefficient of the PmerR and MerR complex to mercury ions that have diffused into cells |
| $k_{off}$ | The separation coefficient of the PmerR and MerR complex from mercury ions that have diffused into cells |
| $C_N$ | The number of copies of the plasmid |
| $[mRNA]$ | The concentration of mRNA |
| $[MT]$ | The concentration of MT |
| $V_{mRNA}$ | The rate of mRNA transcription |
| $V_{MT}$ | The rate of MT translation |
| $k_1$ | The transcription rate constant of MT |
| $k_2$ | The translation rate constant of MT |
| $d_1$ | The degradation rate of the MT mRNA |
| $d_2$ | The degradation rate of MT |

Model Building

The ODE model for System 2

We built an ODE model for the second system to help verify its function. Notice that the binding of MerR to PmerR, as is marked with the blue rectangle in the equation, can be ignored because the MerR protein is constitutively expressed.

Before transcription, the MerR and PmerR complex and the mercury ions that have diffused into the bacteria interconvert with the ternary complex as follows:

$Hg' + MerR\cdot PmerR \rightleftharpoons MerR\cdot PmerR\cdot Hg'$

$(1)$

Meanwhile, the binding process of mercury ions that diffuse into the bacteria can be clearly explained by the following ordinary differential equations.

$ \dfrac{d[MerR\cdot PmerR\cdot Hg']}{dt}= -k_{off}[MerR\cdot PmerR\cdot Hg']+k_{on}[MerR\cdot PmerR][Hg']$

$(2)$

$\dfrac{d[Hg']}{dt}=k_{off}[MerR\cdot PmerR\cdot Hg']-k_{on}[MerR\cdot PmerR][Hg']$

$(3)$

$\dfrac{d[MerR\cdot PmerR]}{dt}=k_{off}[MerR\cdot PmerR\cdot Hg']-k_{on}[MerR\cdot PmerR][Hg']$

$(4)$

Inside the bacteria, the total number of gene copies, whether or not they are bound to the mercury-loaded transcription factor, is constant and equal to the plasmid copy number. Consequently, the rates of change of the bound and unbound forms must sum to zero. This leads us to the two equations below.

$[MerR\cdot PmerR]+[MerR\cdot PmerR\cdot Hg']=C_N$

$(5)$

$\dfrac{d[MerR\cdot PmerR]}{dt} + \dfrac{d[MerR\cdot PmerR\cdot Hg']}{dt} = 0$

$(6)$

Since the time needed for transcription factors to bind to genes or to degrade is shorter than that for gene transcription, the model can be further simplified using the Quasi-Steady-State Approximation (QSSA).

$\dfrac{d[MerR\cdot PmerR\cdot Hg']}{dt} = 0$

$(7)$

Substituting this condition into equation (2) and rearranging gives:

$k_{on}[MerR\cdot PmerR][Hg']=k_{off}[MerR\cdot PmerR\cdot Hg']$

$(8)$

Then, we can substitute equation (5) into the above equation, which gives us a further rearranged version:

$[MerR\cdot PmerR\cdot Hg']=C_N\dfrac{k_{on}[Hg']}{k_{on}[Hg']+k_{off}}=C_N\dfrac{[Hg']}{[Hg']+k_d}$

$(9)$

The $k_d$ in the denominator is the dissociation constant, that is, the ratio of the separation coefficient to the binding coefficient: $k_d = \dfrac{k_{off}}{k_{on}}$.

The equation for the transcription process goes as follows:

$\dfrac{d[mRNA]}{dt} = k_1C_N\dfrac{[Hg']}{k_d + [Hg']}-d_1[mRNA]$

$(10)$
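
As a quick sanity check on equations (5), (8), and (9), which together supply the production term of equation (10), the following Python sketch evaluates the bound-promoter concentration and confirms that binding and unbinding balance. The parameter values are placeholders chosen for easy arithmetic, not fitted or measured values, and the function name is ours.

```python
def bound_promoter(hg_in: float, k_on: float, k_off: float, c_n: float) -> float:
    """Equation (9): quasi-steady-state concentration of MerR.PmerR.Hg'."""
    return c_n * k_on * hg_in / (k_on * hg_in + k_off)


# Placeholder values in arbitrary units (illustrative only).
k_on, k_off, c_n, hg_in, k1 = 2.0, 0.5, 10.0, 0.75, 1.0

bound = bound_promoter(hg_in, k_on, k_off, c_n)
free = c_n - bound                       # conservation, equation (5)

binding_flux = k_on * free * hg_in       # left-hand side of equation (8)
unbinding_flux = k_off * bound           # right-hand side of equation (8)

k_d = k_off / k_on
production = k1 * c_n * hg_in / (k_d + hg_in)   # first term of equation (10)

print(bound, free, binding_flux, unbinding_flux, production)
```

With these numbers the two fluxes coincide, and the production term equals $k_1$ times the bound-promoter concentration, as the derivation requires.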

Similarly, since the time needed for transcription is shorter than that for translation, the equation can be simplified using the Quasi-Steady-State Approximation (QSSA), which comes to the following equation.

$\dfrac{d[mRNA]}{dt} = 0$

$(11)$

This can also be rearranged through substitution into equation (10).

$[mRNA] = \dfrac{k_1}{d_1}C_N\dfrac{[Hg']}{k_d+[Hg']}$

$(12)$

The equation for the translation process goes as follows:

$\dfrac{d[MT]}{dt}=k_2[mRNA]-d_2[MT]$

$(13)$

We can then rearrange the equation by substituting equation (12) into it, which gives the equation as listed below.

$\dfrac{d[MT]}{dt}=\alpha\dfrac{[Hg']}{[Hg'] + k_d}-d_2[MT]$

$(14)$

Here, $\alpha = \dfrac{k_1k_2}{d_1}C_N$.
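
As a worked step, if $[Hg']$ is treated as constant, equation (14) is a linear first-order ODE. Under Assumption 5, $[MT](0) = 0$, it can be integrated in closed form. The expression below is our own rearrangement rather than part of the original derivation. It shows that $d_2$ alone sets how quickly the plateau is approached, and that the long-time limit is $\dfrac{\alpha}{d_2}\dfrac{[Hg']}{[Hg'] + k_d}$.

```latex
[MT](t) = \dfrac{\alpha}{d_2}\,\dfrac{[Hg']}{[Hg'] + k_d}\,\left(1 - e^{-d_2 t}\right)
```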

Since the duration of gene translation is also shorter in comparison with experimental time, the model can be simplified using the Quasi-Steady-State Approximation (QSSA) as well.

$\dfrac{d[MT]}{dt} = 0$

$(15)$

Substituting this into equation (14) gives:

$[MT] = \dfrac{\alpha}{d_2}\dfrac{[Hg']}{[Hg'] + k_d}$

$(16)$

The concentration of mercury ions that have diffused into the cell is proportional to that in the environment:

$[Hg'] = k_a[Hg]$

$(17)$

Therefore, the relationship between the concentration of MT and that of mercury ions in the environment can be presented as follows.

$[MT] = \dfrac{\alpha}{d_2}\dfrac{k_a[Hg]}{k_a[Hg]+k_d}=\dfrac{\alpha}{d_2}\dfrac{[Hg]}{[Hg]+{k_d}'}$

$(18)$

Here ${k_d}'$ is the ratio of $k_d$ to $k_a$, that is, ${k_d}' = \dfrac{k_d}{k_a}$.
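
To make the shape of equation (18) concrete, the Python sketch below evaluates the steady-state MT level at a few environmental mercury concentrations. All parameter values are placeholders in arbitrary units, and `mt_steady_state` is a name introduced here for illustration. The point of the example is that the response is zero without mercury, reaches half of its plateau $\alpha / d_2$ when $[Hg] = {k_d}'$, and saturates at high $[Hg]$.

```python
def mt_steady_state(hg_env: float, alpha: float, d2: float, kd_eff: float) -> float:
    """Equation (18): steady-state [MT] as a function of environmental [Hg]."""
    return (alpha / d2) * hg_env / (hg_env + kd_eff)


# Placeholder parameters (illustrative only, arbitrary units).
alpha, d2, k_d, k_a = 4.0, 0.5, 2.0, 0.25
kd_eff = k_d / k_a        # k_d' = k_d / k_a
mt_max = alpha / d2       # plateau reached at saturating [Hg]

for hg in (0.0, kd_eff, 10.0 * kd_eff):
    print(hg, mt_steady_state(hg, alpha, d2, kd_eff))
```

Both forms of equation (18), the one written with $k_a[Hg]$ and the one written with ${k_d}'$, give the same value for any $[Hg]$, which is a convenient check when transcribing the formula.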

Solution

We solved this ordinary differential equation in MATLAB and got the prediction image.

Prediction of the Lpp-OmpA-MT expression
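
The MATLAB script is not reproduced here. As an illustration of how such a simulation can be set up, the following Python sketch integrates equations (10) and (13) directly, without applying the last two QSSA steps, using a hand-written fourth-order Runge-Kutta step and the zero initial conditions of Assumption 5. It then compares the end point with the steady state of equation (16). The parameter values, step size, and function name are assumptions made for this example only.

```python
def simulate_mt(hg_in, k1, k2, d1, d2, k_d, c_n, t_end=200.0, dt=0.01):
    """Integrate equations (10) and (13) from zero initial conditions with RK4."""
    occupancy = c_n * hg_in / (k_d + hg_in)       # equation (9)

    def rhs(mrna, mt):
        # Right-hand sides of equation (10) and equation (13).
        return k1 * occupancy - d1 * mrna, k2 * mrna - d2 * mt

    mrna, mt = 0.0, 0.0                           # Assumption 5
    for _ in range(int(round(t_end / dt))):
        a1, b1 = rhs(mrna, mt)
        a2, b2 = rhs(mrna + 0.5 * dt * a1, mt + 0.5 * dt * b1)
        a3, b3 = rhs(mrna + 0.5 * dt * a2, mt + 0.5 * dt * b2)
        a4, b4 = rhs(mrna + dt * a3, mt + dt * b3)
        mrna += dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        mt += dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return mrna, mt


# Placeholder parameters (illustrative only, arbitrary units).
params = dict(k1=1.0, k2=2.0, d1=0.5, d2=0.1, k_d=2.0, c_n=10.0)
hg_in = 1.0
mrna_end, mt_end = simulate_mt(hg_in, **params)

alpha = params["k1"] * params["k2"] / params["d1"] * params["c_n"]
mt_qssa = alpha / params["d2"] * hg_in / (hg_in + params["k_d"])   # equation (16)
print(mt_end, mt_qssa)
```

Running the same function over a range of `hg_in` values gives a family of time courses that can be plotted against the steady-state curve.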

Mathematical modeling of System 4

As for System 2, we built a mathematical model for System 4 that proceeds from assumptions to solution. For brevity, we write “lactate” for “L-lactate” and “Gal” for “β-galactosidase”.

Model Assumptions


  1. The number of gene copies depends on replicon and plasmid copies.

  2. Both RNA polymerase and ribosomes are sufficient.

  3. (1) The time required for transcription factors to bind to genes or to degrade is shorter than that required for gene transcription.

    (2) The time required for gene transcription is shorter than that for translation.

    (3) The time required for gene translation is shorter than the duration of the experiment.

    Therefore, we can use the Quasi-Steady-State Approximation (QSSA) for model simplification.

  4. Protein degradation also includes dilution due to bacterial growth.

  5. The concentration of mRNA and protein is zero at the beginning.

Variables and Parameters

| Name | Explanation |
| --- | --- |
| $[lactate']$ | The concentration of L-lactate that has diffused into cells |
| $[lactate]$ | The concentration of L-lactate in the environment |
| $k_a$ | The coefficient of the diffusion of L-lactate into cells |
| $[LldR]$ | The concentration of LldR |
| $[P11\cdot LldR]$ | The concentration of the P11 and LldR complex |
| $[P11\cdot LldR\cdot lactate']$ | The concentration of the complex of P11, LldR, and cellular L-lactate |
| $k_{on}$ | The binding coefficient of the P11 and LldR complex to L-lactate that has diffused into cells |
| $k_{off}$ | The separation coefficient of the P11 and LldR complex from L-lactate that has diffused into cells |
| $C_N$ | The number of copies of the plasmid |
| $[mRNA]$ | The concentration of the β-galactosidase mRNA |
| $[Gal]$ | The concentration of β-galactosidase |
| $V_{mRNA}$ | The rate of mRNA transcription |
| $V_{Gal}$ | The rate of β-galactosidase translation |
| $k_1$ | The transcription rate constant of β-galactosidase |
| $k_2$ | The translation rate constant of β-galactosidase |
| $d_1$ | The degradation rate of the β-galactosidase mRNA |
| $d_2$ | The degradation rate of β-galactosidase |

Model Building

The ODE model for System 4

Here we present the ODE model for System 4. As in System 2, we can ignore the binding of LldR to P11, as marked in the above diagram, since the LldR protein is expressed constitutively.

Before transcription, the LldR and P11 complex and the L-lactate that has diffused into the cells interconvert with the ternary complex as follows:

$lactate' + LldR\cdot P11 \rightleftharpoons LldR\cdot P11\cdot lactate'$

$(19)$

In the meantime, the binding process of L-lactate that diffuses into the bacteria can be represented by the following ordinary differential equations.

$\dfrac{d[LldR\cdot P11\cdot lactate']}{dt} = -k_{off}[LldR\cdot P11\cdot lactate'] + k_{on}[LldR\cdot P11][lactate']$

$(20)$

$\dfrac{d[lactate']}{dt} = k_{off}[LldR\cdot P11\cdot lactate'] - k_{on}[LldR\cdot P11][lactate']$

$(21)$

$\dfrac{d[LldR\cdot P11]}{dt} = k_{off}[LldR\cdot P11\cdot lactate'] - k_{on}[LldR\cdot P11][lactate']$

$(22)$

As explained for the former model, the total number of gene copies, whether or not they are bound to the lactate-loaded transcription factor, equals the plasmid copy number, and the rates of change of the bound and unbound forms sum to zero. Thus, we can write these two equations:

$[LldR\cdot P11] + [LldR\cdot P11\cdot lactate'] = C_N$

$(23)$

$\dfrac{d[LldR\cdot P11]}{dt} + \dfrac{d[LldR\cdot P11\cdot lactate']}{dt} = 0$

$(24)$

The model can be simplified with the Quasi-Steady-State Approximation (QSSA) because the time required for transcription factors to bind to genes or to degrade is shorter than that required for gene transcription.

$\dfrac{d[LldR\cdot P11\cdot lactate']}{dt} = 0$

$(25)$

Substituting this into equation (20) gives the following equation:

$k_{on}[LldR\cdot P11][lactate'] = k_{off}[LldR\cdot P11\cdot lactate']$

$(26)$

By substituting equation (23) into the above equation, we will be able to further rearrange it:

$[LldR\cdot P11\cdot lactate'] = C_N\dfrac{k_{on}[lactate']}{k_{on}[lactate'] + k_{off}} = C_N\dfrac{[lactate']}{[lactate']+k_d}$

$(27)$

Similarly, $k_d = \dfrac{k_{off}}{k_{on}}$

The equation of the transcription process goes as follows.

$\dfrac{d[mRNA]}{dt} = k_1C_N \dfrac{[lactate']}{k_d + [lactate']} - d_1[mRNA]$

$(28)$

We can also simplify the model by using the Quasi-Steady-State Approximation (QSSA) since the duration for gene transcription is shorter than that for translation.

$\dfrac{d[mRNA]}{dt} = 0$

$(29)$

This equation can be rearranged through substitution into equation (28).

$[mRNA] = \dfrac{k_1}{d_1}C_N\dfrac{[lactate']}{k_d + [lactate']}$

$(30)$

Here is the equation for the translation process:

$\dfrac{d[Gal]}{dt} = k_2[mRNA] - d_2[Gal]$

$(31)$

Substituting equation (30) into the above equation leads us to the following equation.

$\dfrac{d[Gal]}{dt} = \alpha\dfrac{[lactate']}{k_d + [lactate']} - d_2[Gal]$

$(32)$

Similarly, $\alpha = \dfrac{k_1k_2}{d_1}C_N$

The time required for gene translation is shorter than the duration of the experiment, so the model can also be simplified through the Quasi-Steady-State Approximation (QSSA).

$\dfrac{d[Gal]}{dt} = 0$

$(33)$

By substituting it into equation (32), we will get the following equation.

$[Gal] = \dfrac{\alpha}{d_2} \dfrac{[lactate']}{k_d + [lactate']}$

$(34)$

In addition, the concentration of L-lactate that has diffused into cells is proportional to that in the environment:

$[lactate'] = k_a[lactate]$

$(35)$

Therefore, the relationship between the concentration of β-galactosidase and that of L-lactate in the environment can be presented as follows.

$[Gal] = \dfrac{\alpha}{d_2} \dfrac{k_a[lactate]}{k_a[lactate] + k_d} = \dfrac{\alpha}{d_2} \dfrac{[lactate]}{[lactate] + {k_d}'}$

$(36)$

As before, ${k_d}' = \dfrac{k_d}{k_a}$.
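
Because a test strip is read in the opposite direction, from signal back to concentration, it can help to see equation (36) inverted. The Python sketch below is an illustrative rearrangement, not part of the original model. It assumes that the steady-state $[Gal]$ could be measured and that $\alpha$, $d_2$, and ${k_d}'$ are known. The function names and all numerical values are placeholders.

```python
def gal_steady_state(lactate_env: float, alpha: float, d2: float, kd_eff: float) -> float:
    """Equation (36): steady-state [Gal] for an environmental L-lactate level."""
    return (alpha / d2) * lactate_env / (lactate_env + kd_eff)


def lactate_from_gal(gal: float, alpha: float, d2: float, kd_eff: float) -> float:
    """Algebraic inverse of equation (36); valid only for 0 <= gal < alpha / d2."""
    gal_max = alpha / d2
    if not 0.0 <= gal < gal_max:
        raise ValueError("gal must lie in [0, alpha / d2)")
    return kd_eff * gal / (gal_max - gal)


# Placeholder parameters (illustrative only, arbitrary units).
alpha, d2, kd_eff = 6.0, 0.2, 1.5

for lactate in (0.5, 1.5, 6.0):
    gal = gal_steady_state(lactate, alpha, d2, kd_eff)
    print(lactate, gal, lactate_from_gal(gal, alpha, d2, kd_eff))
```

The inverse becomes very sensitive as $[Gal]$ approaches the plateau $\alpha / d_2$, so a readout of this kind would be most informative for L-lactate levels around ${k_d}'$.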

Solution

We solved this ordinary differential equation in MATLAB and got the prediction image.

Prediction of the lacZ expression
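
For a rough sense of the time scale of the response, equation (32) can be solved in closed form when $[lactate']$ is held constant and $[Gal](0) = 0$ (Assumption 5): $[Gal](t)$ equals the plateau of equation (34) multiplied by $1 - e^{-d_2 t}$. The Python sketch below uses that expression to tabulate a few time courses and to compute the time needed to reach 90% of the plateau. It is an illustration only. The parameter values are placeholders and the function names are ours.

```python
import math


def gal_time_course(t: float, lactate_in: float, alpha: float, d2: float, k_d: float) -> float:
    """Closed-form solution of equation (32) with [Gal](0) = 0 and constant [lactate']."""
    plateau = (alpha / d2) * lactate_in / (k_d + lactate_in)   # equation (34)
    return plateau * (1.0 - math.exp(-d2 * t))


def time_to_fraction(fraction: float, d2: float) -> float:
    """Time at which [Gal] reaches the given fraction (0 to 1) of its plateau."""
    return -math.log(1.0 - fraction) / d2


# Placeholder parameters (illustrative only, arbitrary units).
alpha, d2, k_d = 6.0, 0.2, 1.5
t90 = time_to_fraction(0.9, d2)

for lactate_in in (0.5, 1.5, 6.0):
    curve = [round(gal_time_course(t, lactate_in, alpha, d2, k_d), 3) for t in (0.0, 5.0, 10.0, 20.0)]
    print(lactate_in, curve)
print("time to 90% of plateau:", t90)
```

In this reduced model the time to reach a given fraction of the plateau depends only on $d_2$, not on the lactate level, so the lactate concentration changes the height of the curve but not its shape.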