To establish a more reasonable mathematical model, the following assumptions are made:
1. Binding and transcription reactions occur on a much faster timescale than translation.
2. The variation of the activity of the engineered bacteria with temperature and pH can be described by a Gaussian curve.
3. The growth of the bacteria conforms to the assumptions of the logistic equation.
4. The enzymatic reaction process conforms to the Michaelis–Menten equation.
5. The binding of the transcription factor to the environmental signal and the binding of the transcription factor to the promoter both reach equilibrium within a short time and can be treated as equilibrium processes in the subsequent analysis.
From the data and from theory, the response of the engineered strains can be described as a bell-shaped curve with respect to pH and temperature [1], with an optimal pH and an optimal temperature.
To find these optima, we use a two-dimensional Gaussian model to estimate the optimal pH and temperature of the reaction, as well as the theoretical maximum reaction yield.
Let x and y be two random variables. If they follow the standard two-dimensional normal distribution, their probability density function ρ(x,y) is [2]:
In general, we denote the reaction rate of the engineered strains as γ(x,y), where x and y are the pH and the temperature of the reaction system. Without loss of generality, the model γ(x,y) we propose can be expressed as:
where a is the theoretical maximum yield of the engineered strains, b is the scaling factor along the x-axis, c is the scaling factor along the y-axis, x0 is the optimal pH of the engineered strains, and y0 is the optimal temperature of the engineered strains.
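As a minimal sketch of a surface with these five parameters, the snippet below assumes the common form γ(x,y) = a·exp(−(b(x−x0)² + c(y−y0)²)). The exact functional form, the function name `gaussian_yield`, and the width values `B` and `C` are assumptions made for illustration; only a, x0, and y0 use the fitted values reported in this section.

```python
import math

def gaussian_yield(x, y, a, b, c, x0, y0):
    """Assumed bell-shaped model: a * exp(-(b*(x - x0)^2 + c*(y - y0)^2)).

    x is the pH, y is the temperature, a is the peak height,
    b and c control how fast the surface falls off along each axis.
    """
    return a * math.exp(-(b * (x - x0) ** 2 + c * (y - y0) ** 2))

# Peak height and optimum as reported in the text.
A, X0, Y0 = 11.7, 6.999, 39.03
# Hypothetical width parameters, NOT taken from the source.
B, C = 0.5, 0.01

peak = gaussian_yield(X0, Y0, A, B, C, X0, Y0)       # value at the optimum
off_peak = gaussian_yield(6.0, 37.0, A, B, C, X0, Y0)  # value away from it
print(peak, off_peak)
```

At the optimum the exponent vanishes, so the model returns a itself, which is why a can be read as the theoretical maximum yield.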
We fit the model to the measured data by minimizing the error between the data and the model, and obtain the following optimized parameters:
From the table above, the theoretical maximum yield of the engineered strains is 11.7 OD600 in 3 h, the optimal pH is 6.999, and the optimal temperature is 39.03 °C.
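One simple way to obtain such parameters can be sketched as follows. This is not necessarily the fitting procedure used here: it swaps in a log-linear least-squares fit, which works because the logarithm of the assumed model a·exp(−(b(x−x0)² + c(y−y0)²)) is a quadratic polynomial in x and y. The function names and the synthetic data grid are hypothetical, and the approach requires strictly positive measurements.

```python
import math

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

def fit_gaussian(points):
    """Fit (a, b, c, x0, y0) to (x, y, gamma) triples with gamma > 0."""
    pts = list(points)
    # Centre the inputs to keep the normal equations well conditioned.
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    rows = [[1.0, x - mx, (x - mx) ** 2, y - my, (y - my) ** 2]
            for x, y, _ in pts]
    z = [math.log(g) for _, _, g in pts]
    # Normal equations (A^T A) k = A^T z for the quadratic in (u, v).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(5)]
    k0, ku, kuu, kv, kvv = solve_linear(ata, atz)
    b, c = -kuu, -kvv
    u0, v0 = ku / (2 * b), kv / (2 * c)
    a = math.exp(k0 + b * u0 ** 2 + c * v0 ** 2)
    return a, b, c, mx + u0, my + v0

# Synthetic, noise-free data generated from hypothetical "true" parameters.
TRUE = (11.7, 0.5, 0.01, 6.999, 39.03)

def model(x, y, a, b, c, x0, y0):
    return a * math.exp(-(b * (x - x0) ** 2 + c * (y - y0) ** 2))

data = [(x, y, model(x, y, *TRUE)) for x in (5, 6, 7, 8, 9) for y in (30, 35, 40, 45)]
fitted = fit_gaussian(data)
print(fitted)
```

With noisy measurements the log transform over-weights small values, so a nonlinear least-squares fit on the untransformed data may be preferable; the sketch is mainly useful for producing a starting guess.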
Finally, we plot the fitted 2D Gaussian surface in the same spatial coordinate system as the actual data curve, and the result is shown in the following figure:
We developed an ordinary differential equation model to simulate how the number of bacteria changes over time at a given concentration of L-Arabinose.
Let r denote the growth rate of the E. coli population and N the number of E. coli cells. If we assume that r does not change over time, the simplest model is:
In practice, nutrients and living space in the culture environment are limited, so the population has an upper limit Nmax. As the population increases, its growth rate should decrease, so r depends on N. Assuming that the intrinsic growth rate is r0, we obtain:
When N=Nmax, the population reaches its maximum and its growth rate should be 0, so we obtain:
Therefore, we have:
Thus, a logistic equation that describes the growth of the population is obtained [3]:
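The behaviour of this growth law can be checked numerically. The sketch below assumes the standard logistic form dN/dt = r0·N·(1 − N/Nmax), which is what the derivation above leads to when r decreases linearly with N, and compares its known closed-form solution with a simple explicit Euler integration. The parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def logistic_closed_form(t, n0, r0, n_max):
    """Analytic solution of dN/dt = r0*N*(1 - N/n_max) with N(0) = n0."""
    return n_max / (1.0 + (n_max / n0 - 1.0) * math.exp(-r0 * t))

def logistic_euler(t_end, n0, r0, n_max, dt=1e-3):
    """Integrate the same equation with explicit Euler steps of size dt."""
    n = n0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        n += dt * r0 * n * (1.0 - n / n_max)
    return n

# Hypothetical values, not fitted to any data in this document.
N0, R0, N_MAX = 0.05, 1.2, 2.0

exact = logistic_closed_form(5.0, N0, R0, N_MAX)
approx = logistic_euler(5.0, N0, R0, N_MAX)
print(exact, approx)
```

For small N the curve grows almost exponentially at rate r0, and for large t it saturates at Nmax, matching the two limiting cases discussed above.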
In this engineered strain, the gene circuit uses pAraC as the promoter and BBa_B0034 as the ribosome-binding site, followed by GFP as the main gene fragment and BBa_B0015 as the terminator.
At a given concentration of L-Arabinose, transcription, translation, and the subsequent processes lead to lysis of the engineered strain.
The input environmental signal, L-Arabinose, binds the repressor AraC. The signal-bound repressor can no longer bind the L-Arabinose-specific promoter pAraC, so RNA polymerase is no longer blocked from binding the promoter, which results in lysis of the strain.
Let DT denote the total promoter concentration, D the concentration of promoter not bound by the repressor, D+ the concentration of promoter bound by the repressor, and P the concentration of repressor not bound to the environmental signal. From the law of conservation of mass, we have:
Let kon be the rate at which the repressor binds the promoter and koff the rate at which it dissociates from the promoter. We have:
In general, kon≫koff, so the reaction quickly reaches chemical equilibrium, at which point dD+/dt=0. We get:
where kT=koff/kon is the equilibrium constant of the binding reaction between the transcription factor and the promoter. Substituting D+=DT-D, we have:
For many repressors, the D+ complex dissociates in less than 1 s. Therefore, we can average over times much longer than 1 s and consider D/DT as the probability that site D is free, averaged over many binding and unbinding events.
Let γ0 denote the maximum transcription rate. Because the expressed gene is a lysis gene, γ0 also represents the maximum bacterial lysis rate, and the bacterial lysis rate γ is obtained as follows:
On the other hand, let PT denote the total repressor concentration, P the concentration of repressor not bound to the environmental signal, P+ the concentration of repressor bound to the environmental signal, and S the concentration of the environmental signal in the environment [4].
From the law of conservation of mass, we have:
Let kon denote the rate at which the environmental signal binds the repressor and koff the rate at which it dissociates from the repressor. We have:
In general, kon≫koff, so the reaction quickly reaches chemical equilibrium, at which point dP+/dt=0. We get:
where ks=koff/kon is the equilibrium constant of the binding reaction between the signal and the transcription factor. Substituting P+=PT-P, we have:
Thus, the rate of bacterial lysis under the action of environmental signals γ can be obtained as follows:
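The two equilibrium steps can be chained in a short sketch. It assumes the relations implied by the definitions above: kT·D+ = D·P with D+ = DT − D gives D/DT = 1/(1 + P/kT), ks·P+ = P·S with P+ = PT − P gives P = PT/(1 + S/ks), and γ = γ0·D/DT. It also assumes that S is not noticeably depleted by binding. The function names and numeric values are hypothetical.

```python
def free_repressor(p_total, s, k_s):
    """Repressor not bound to the signal: P = PT / (1 + S/ks)."""
    return p_total / (1.0 + s / k_s)

def lysis_rate(s, gamma0, p_total, k_t, k_s):
    """Lysis rate gamma = gamma0 * D/DT, with D/DT = 1 / (1 + P/kT)."""
    p_free = free_repressor(p_total, s, k_s)
    return gamma0 / (1.0 + p_free / k_t)

# Hypothetical parameter values for illustration only.
GAMMA0, P_TOTAL, K_T, K_S = 1.0, 10.0, 1.0, 0.5

# Signal concentrations spanning the range studied in the text (in umol/L).
rates = [lysis_rate(s, GAMMA0, P_TOTAL, K_T, K_S) for s in (0.0, 0.01, 0.1, 1.0, 10.0)]
print(rates)
```

Under these assumptions the lysis rate rises monotonically with S, from γ0/(1 + PT/kT) without signal toward γ0 when the signal saturates the repressor.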
Therefore, we can get:
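A minimal simulation sketch follows, under the assumption that the combined model subtracts a lysis term from logistic growth, dN/dt = r0·N·(1 − N/Nmax) − γ·N. This combined form is an assumption made for illustration, as are the function names and parameter values. The equation is integrated with a classical fourth-order Runge-Kutta scheme.

```python
def growth_with_lysis(n, r0, n_max, gamma):
    """Assumed right-hand side: logistic growth minus lysis at rate gamma."""
    return r0 * n * (1.0 - n / n_max) - gamma * n

def simulate(n0, r0, n_max, gamma, t_end, dt=0.01):
    """Integrate with classical RK4; returns a list of (t, N) pairs."""
    n = n0
    trajectory = [(0.0, n)]
    steps = int(round(t_end / dt))
    for i in range(steps):
        k1 = growth_with_lysis(n, r0, n_max, gamma)
        k2 = growth_with_lysis(n + 0.5 * dt * k1, r0, n_max, gamma)
        k3 = growth_with_lysis(n + 0.5 * dt * k2, r0, n_max, gamma)
        k4 = growth_with_lysis(n + dt * k3, r0, n_max, gamma)
        n += dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        trajectory.append(((i + 1) * dt, n))
    return trajectory

# Hypothetical parameter values, not the fitted ones.
N0, R0, N_MAX, GAMMA = 0.05, 1.2, 2.0, 0.3

trajectory = simulate(N0, R0, N_MAX, GAMMA, t_end=40.0)
final_n = trajectory[-1][1]
# For gamma < r0 the assumed model settles at n_max * (1 - gamma / r0).
steady_state = N_MAX * (1.0 - GAMMA / R0)
print(final_n, steady_state)
```

In this form a larger lysis rate lowers the plateau of the growth curve, and the population dies out when γ reaches r0.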
If we denote $N_j^i$ as the experimental measurement of the i-th parallel group at the j-th sampling time, and N as the number of samples in each parallel group, then the sampling times can be written as $\{t_j\}_{j=1}^{N}$. Since the experiment was run in triplicate, $i \in \{1,2,3\}$. Our goal is to choose the model parameters so as to minimize the error between the model curve and the experimental data.
Geometrically, the model curve should lie as close as possible to the experimental data curve. Mathematically, we need to find the parameters r0, Nmax, γ0, PT, kT, ks that minimize the error between the model and the experimental data and satisfy the following conditions:
This is an optimization problem. For the objective function $\sum_{j=1}^{N} (\tilde{N}_j - \bar{N}_j)^2$, where $\tilde{N}_j$ is the model value at $t_j$ and $\bar{N}_j$ is the mean of the three parallel groups at $t_j$, we apply a genetic algorithm (GA) [5] to search for the globally optimal parameters r0, Nmax, γ0, PT, kT, ks. The iteration process of the algorithm is shown in the following figure.
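The kind of search described here can be sketched with a very small genetic algorithm. To stay short, the sketch fits only r0 and Nmax of a plain logistic curve to synthetic data, whereas the full model has six parameters. The operators used (truncation selection, elitism, blend crossover, Gaussian mutation), the bounds, and all names are assumptions and not necessarily the settings used in this work.

```python
import math
import random

def logistic(t, n0, r0, n_max):
    """Closed-form logistic curve with N(0) = n0."""
    return n_max / (1.0 + (n_max / n0 - 1.0) * math.exp(-r0 * t))

def sse(params, times, data, n0):
    """Sum of squared errors between the model curve and the data."""
    r0, n_max = params
    return sum((logistic(t, n0, r0, n_max) - d) ** 2 for t, d in zip(times, data))

def genetic_fit(times, data, n0, bounds, pop_size=40, generations=60, seed=0):
    """Minimise sse over the box `bounds`; returns (best, best-error history)."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: sse(p, times, data, n0))
        history.append(sse(pop[0], times, data, n0))
        elite = pop[: pop_size // 4]          # truncation selection
        children = [p[:] for p in elite]      # elitism: keep the best unchanged
        while len(children) < pop_size:
            pa, pb = rng.sample(elite, 2)
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(pa, pb)]  # blend crossover
            child = [min(max(g + rng.gauss(0.0, 0.05 * (hi - lo)), lo), hi)
                     for g, (lo, hi) in zip(child, bounds)]          # mutation, clipped
            children.append(child)
        pop = children
    best = min(pop, key=lambda p: sse(p, times, data, n0))
    return best, history

# Synthetic data from hypothetical "true" parameters r0 = 1.2, Nmax = 2.0.
N0 = 0.05
times = [0.5 * j for j in range(13)]
data = [logistic(t, N0, 1.2, 2.0) for t in times]
BOUNDS = [(0.1, 3.0), (0.5, 5.0)]

best, history = genetic_fit(times, data, N0, BOUNDS)
print(best, history[0], history[-1])
```

Because the best individuals are copied unchanged into the next generation, the recorded best error never increases from one generation to the next, which is the behaviour an iteration plot of the objective would show.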
We then took the average OD600 of the three parallel groups as the data points, plotted the measured OD600 over time at different concentrations of L-Arabinose, and plotted the model curves together with the measured curves, as shown in the following figure:
The dotted lines in the figure above show the data curves, and the solid lines show the model that best fits the data.
Let S denote the concentration of L-Arabinose. The optimal parameters of the model at each concentration are as follows:
For S=10¹ μmol/L, the optimal parameters are
For S=1 μmol/L, the optimal parameters are
For S=10⁻¹ μmol/L, the optimal parameters are
For S=10⁻² μmol/L, the optimal parameters are
[1] Silvestre, M.P.C., Carreira, R.L., Silva, M.R. et al. Effect of pH and Temperature on the Activity of Enzymatic Extracts from Pineapple Peel. Food Bioprocess Technol 5, 1824–1831 (2012).
[2] Bodin, N. A., and V. A. Zalgaller. "Concavity of certain functions connected with the two-dimensional normal distribution." Litovsk. Mat. Sb. (1967): 389-393.
[3] Gang, X. U., H. Wen, and W. U. Kun. "Primal chaos data system and feedback control research of Logistic population increase model." Journal of Natural Science of Heilongjiang University (2003).
[4] Alon, U. An Introduction to Systems Biology: Design Principles of Biological Circuits 3–19 (Chapman & Hall/CRC, 2007).
[5] Zwickl, Derrick Joel. "Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion." Dissertations & Theses - Gradworks 3.5 (2008): 257-260.