Model

Overview

We set out to connect mathematical models with biological experiments, using the models to guide data collection and make the experiments more efficient. Our modelling focused mainly on the synthesis of the target substance in the blue light system. We ran experiments to obtain sampling data for developing and optimizing the models, in the hope that the models would in turn support later experiments and offer guidance for the project.

In this regard, our modelling section can be divided into the following three sections:

(1) In the section on colony growth prediction, we studied how E. coli carrying our blue light plasmids grows in our own laboratory environment, and looked for growth prediction models suited to these conditions. After comparing different fitting methods and common machine learning approaches, we settled on an improved logistic equation and a linear iterative growth prediction model, in the hope that it can guide subsequent plasmid culture and selection in our laboratory for other light systems, such as the red light system.

(2) In the section on phenylethanol (2-PE) yield, we explored how pH and temperature affect the product during the actual reaction, in order to optimise the experimental conditions. We measured Adh1 and KdcA enzyme activity over 20-45 °C and pH 5-10, and then quantified the 2-PE yield from the NADH consumed during the reaction. Finally, we used the sampling conditions and the corresponding production values to construct thin plate spline (TPS) and cubic spline interpolation surfaces to explore and predict the optimum conditions for 2-PE yield.

(3) In the section on the physicochemical properties of proteins, we analysed the proteins used in the experiments, such as Adh1 (which catalyses the production of 2-PE from PAD in the blue light release system) and KdcA (which catalyses the production of PAD from PPA in the blue light release system), in order to better guide the experimental process.

1. E. coli growth prediction model

Overall overview

To establish the project, we needed to culture the transformed, engineered bacteria effectively, which in turn required a reasonable prediction of E. coli growth. As a sample set, we used the growth of E. coli transformed with four blue light-associated plasmids designed in our own experiments (pSB1C3-stuffer-pro, pSB1C3-stuffer-LP, pET-30a-KdcA and pET-22b-Adh1; for brevity we refer to them below as Pro, LP, KdcA and Adh1), together with an existing predictive model dataset, and took the imported genome length, the cultured colonies and other growth conditions as variables. A machine learning regression algorithm was used to build the model under the existing culture conditions in our laboratory (37°C, 180 rpm, CO2). The model will be used to predict growth in our other light-source engineered bacteria systems, and to help choose culture and sampling times in future experiments.

1. Logistic growth fitting models and refinements:

As a standard model bacterium, E. coli is fundamental to our own experiments, and predicting its growth is also useful for other synthetic biology projects. For fitting S-shaped growth curves of this kind, the logistic model is the classical choice. It was first proposed as a differential equation by the Belgian mathematician Pierre Verhulst in 1838 and applied to population growth; microbiologists later adapted it to bacterial population growth curves and have continued to refine it. In 1948, for example, Hutchinson introduced a delay parameter. To this day, it remains a dominant approach for modelling microbial growth data.

1.1 Traditional Logistic model

Logistic equations are the most common way to model bacterial growth kinetics, usually fitted to growth curves of OD values. OD measurement and the experimental conditions are described in the sections that follow. One traditional form of the equation describing the population size N(t) as a function of time t is:

$$N(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\, e^{-rt}}$$

where N0 is the initial number or initial concentration, r is the growth rate and K is the maximum population size.

Figure 1: Approximate growth curve of microorganisms in a closed culture environment


E. coli replicates by binary fission and generally grows exponentially under favourable conditions. In a closed medium, however, limited resources and other factors make the growth curve essentially S-shaped, as shown in Fig. 1 above, with four main phases: lag phase, logarithmic phase, stationary phase and decline phase. The overall growth curve is influenced by the type of medium, temperature, pH, initial bacterial concentration and the innate features of the E. coli strain.

Reading the above equation together with Fig. 1: with a higher N0 (the initial concentration), the lag phase tends to be shorter and the logarithmic phase starts earlier. The parameter r reflects the growth rate during the logarithmic phase and can be approximated by the slope of the curve in that phase; the larger r is, the steeper the curve. As t grows, the second term of the denominator approaches 0, so N approaches K: the growth rate falls to 0 and a new equilibrium is reached.
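The roles of N0, r and K can be checked numerically. The sketch below assumes the standard closed-form logistic curve N(t) = K / (1 + ((K - N0) / N0) e^(-rt)); the parameter values are hypothetical and are not fitted to our data.

```python
import math


def logistic(t, n0, r, k):
    """Logistic curve N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def time_to_half_capacity(n0, r, k):
    """Time at which N(t) = K / 2, i.e. the inflection point of the curve."""
    return math.log((k - n0) / n0) / r


# Hypothetical parameters: K and r are fixed, only the inoculum N0 changes.
K, R = 2.0, 0.9
for n0 in (0.01, 0.05):
    print(
        f"N0={n0}: reaches K/2 after {time_to_half_capacity(n0, R, K):.2f} h, "
        f"N(30 h)={logistic(30.0, n0, R, K):.4f}"
    )
```

With the larger inoculum the curve reaches K/2 earlier, while both curves level off at the same K.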

1.2 Improved decay model

Combining the above equation with Figure 1, the initial growth concentration is given by N0. The higher N0 is, the shorter the lag phase and the earlier the logarithmic growth phase begins.


2. Determination and conversion of OD absorbance

2.1 Absorbance measurement method

To build a mathematical model, we also need to determine the number of bacteria in the culture medium accurately. We compared the advantages and disadvantages of optical density (OD) measurement and ATP quantification, and chose OD measurement.

2.2 Conversion of OD absorbance to standard colony counts.

Although optical density (OD) readings can be used to estimate cell density, their units are arbitrary and may vary from device to device and from experiment to experiment. Following the official iGEM recommendations, we therefore converted the OD values to absolute colony forming units (CFU), which makes them comparable between devices and laboratories. The following equation relates OD to CFU, where x is OD600 and y is the number of CFUs (in units of 10^11):

y=6.7671x-0.1866
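As a minimal sketch, the calibration line above can be applied to a series of readings as follows. The function name and the example OD600 values are hypothetical; only the slope and intercept come from the equation above.

```python
def od600_to_cfu(od600):
    """Apply the linear calibration y = 6.7671 * x - 0.1866.

    The result is in the same units as y in the text (multiples of 10^11 CFU).
    """
    return 6.7671 * od600 - 0.1866


# Hypothetical OD600 readings, for illustration only.
for od in (0.1, 0.5, 1.0):
    print(f"OD600={od}: y={od600_to_cfu(od):.4f}")

# The line crosses zero at OD600 = 0.1866 / 6.7671 (about 0.028), so readings
# below that give negative estimates and fall outside the usable range.
```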

3. Design for the determination of modelling data

To measure the growth of the four strains designed for our blue light system, and to use machine learning to make reasonable predictions of the growth of other strains carrying inserts of different base lengths under laboratory conditions, we ran all experiments in our own laboratory under the same culture conditions (pH 7, 36.7°C, 180 rpm, 5% CO2). These conditions also guided our subsequent experiments.

We transformed E. coli BL21 with four plasmids related to the blue light expression system: pSB1C3-stuffer-pro, pSB1C3-stuffer-LP, pET-30a-KdcA and pET-22b-Adh1.

The bacterial broth was placed in four 100-ml conical flasks, each containing 75 ml of LB liquid medium, and incubated in a shaker at 36.7 °C, 180 rpm and 5% CO2. Every hour, 500 μl of broth was taken from each flask and its OD was measured in a spectrophotometer. During the rapid growth period, the OD was measured every 20 min instead to keep the data accurate.

We used OD600 as the standard measure of bacterial growth, and the total measurement time was 15 h.

In total, we collected 3 x 4 x 15 sets of data, which were modelled and analysed to obtain E. coli growth curves. These raw growth curves could be used to calculate the growth rate and density of each strain under each growth condition to explore the conditions for subsequent experiments.

We selected four plasmids of different lengths from the blue light system as the data for the preliminary experiments and classified them by base sequence length. Three parallel experiments were run for each E. coli strain, with the OD recorded every hour until the S-curve reached the late stationary and decline stage. The experimental data were recorded as shown in the following table:


The converted OD absorbance values were recorded as follows:


4. Data processing and analysis

4.1 Data pre-processing for media selection:

At the start of culturing, we compared the general growth trends in LB, M63 and MAA media in the dataset against the distribution of the strain growth data. Growth in LB medium was much faster than in M63 or MAA medium because LB has the richest nutrient content. LB medium is therefore used for all our later cultures unless otherwise stated.

Given the difference in growth rates between the three media, we suggest sampling OD600 every half hour in LB medium for better accuracy, while in MAA and M63 media one-hour intervals are sufficient.

Figure 2: Schematic representation of the growth of bacteria in different cultures


4.2 Fitting treatment of growth data:

First, we collected the growth data of the four strains and used the three parallel experiments to check for large outliers to eliminate. For example, one initial OD600 value for LP was recorded as 0.1 (marked in red in the figure below), an order of magnitude above the average of the other replicates, so we excluded it from the mean fit in Figure 4. The parallel growth curves are plotted in the figure below:

Figure 3: Comparison of experimental data on the growth of the four species of colonies


Figure 4. Average growth curve of four types of bacteria (error points excluded)


As shown in Fig. 3 and Fig. 4, where each strain is marked by a different symbol, the growth curves of the four strains were fitted using the least squares method. The figures show that the LP strain started to multiply and surged after about 3 h. The other three strains followed much the same initial trend and diverged after 7 h. In the end, the Adh1 strain reached the highest equilibrium concentration of the four.

4.3 Evaluation of the Fitting

To compare the fit of the logistic model with that of the modified decaying logistic model while reducing the risk of chance errors in the evaluation, we chose the goodness of fit and the sum of squared residuals as evaluation metrics. Let $y_i$ be the original data, $\hat{y}_i$ the fitted values and $\bar{y}$ the mean of the original data. The goodness of fit $R^2$, the total sum of squares SST and the sum of squared residuals SSE are given by equations (3), (4) and (5):

$$R^2 = 1 - \frac{SSE}{SST} \qquad (3)$$

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \qquad (4)$$

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (5)$$

where SSE is the sum of squared differences between the original data and the fitted data, SST is the sum of squared differences between the original data and their mean, and $R^2$ is the goodness-of-fit value calculated from SSE and SST. The goodness-of-fit values fall in the [0,1] range; the closer the value is to 1, the better the model fits the growth curve.
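The three metrics can be computed in a few lines. This is a minimal sketch that assumes the usual definitions (SSE as the sum of squared residuals, SST as the sum of squared deviations from the mean, and goodness of fit as 1 - SSE/SST); the readings and fitted values are hypothetical.

```python
def goodness_of_fit(observed, fitted):
    """Return (SSE, SST, R^2) for paired observed and fitted values."""
    mean = sum(observed) / len(observed)
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - mean) ** 2 for y in observed)
    return sse, sst, 1.0 - sse / sst


# Hypothetical OD600 readings and two hypothetical sets of fitted values.
observed = [0.05, 0.12, 0.40, 0.95, 1.40, 1.55]
fits = {
    "logistic": [0.06, 0.15, 0.38, 0.90, 1.42, 1.60],
    "modified logistic": [0.05, 0.13, 0.40, 0.94, 1.41, 1.56],
}
for name, fitted in fits.items():
    sse, sst, r2 = goodness_of_fit(observed, fitted)
    print(f"{name}: SSE={sse:.5f}, SST={sst:.5f}, R^2={r2:.5f}")
```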

Figure 5. Comparison of predictions from two logistic models (community number and growth rate)


As shown above, the improved logistic model gave a better fit when we compared our own growth curves with the data in the open source dataset (700 in total). For the main growth parameters, the two fits differed only slightly in the maximum population capacity but differed considerably in the growth rate r.

4.4 Growth Prediction by ML Models

By comparing our growth data with the open source data, we showed that the improved logistic model gives the best mathematical fit and also describes the decline phase in a biologically sensible way. However, we still needed a predictive model that accounts for different plasmid lengths under the optimal culture conditions (36.7 °C, 180 rpm, 5% CO2, LB liquid medium) in order to guide the timing of bacterial cultures and the selection of optimal values. To this end, we modelled the data with a variety of ML-based methods; the results are shown below:

Several common regression models, such as support vector machines, decision trees and Gaussian process regression, were fitted to the S-curves measured during colony growth. The root mean square error (RMSE) between the predicted and actual values was the only criterion used to judge prediction performance. The RMSE is the square root of the sum of squared deviations of the predicted values from the true values divided by the number of observations n. The smaller the RMSE, the better the prediction.
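The RMSE criterion can be sketched as below. The model names and all numbers are hypothetical placeholders, not our measured data or our actual regression outputs.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between paired actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Hypothetical measurements and predictions from two hypothetical models.
actual = [0.10, 0.50, 1.20, 1.50]
predictions = {
    "model_a": [0.20, 0.70, 1.00, 1.40],
    "model_b": [0.12, 0.48, 1.22, 1.49],
}
scores = {name: rmse(actual, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)  # the smallest RMSE wins
print(scores, "best:", best)
```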


Figure 6. Comparison of bacteria growth fits by different prediction methods


In summary, after showing the advantages of the modified logistic equation through the SSE residual profile, we compared the fits of common ML regression models. Based on the RMSE comparison, we decided to combine the modified logistic equation with segmented linear regression to predict and analyse the growth of the red light-induced plasmid strains under the same culture conditions.

5. Conclusion

By fitting and modifying the logistic function, we explored how to fit and predict the growth of E. coli cultures under the conditions of our own laboratory. Using SSE, RMSE and related evaluation criteria, we selected the modified logistic equation combined with segmented linear regression to predict and analyse the future growth of red light plasmid strains in the same laboratory environment.

2. A model for exploring the optimum situation of multiple enzyme activities and products

1. Enzyme Activity and Michaelis-Menten Equation (MME)

The MME relates the initial rate of an enzymatic reaction to the substrate concentration:

$$v = \frac{V_{max}[S]}{K_m + [S]}$$

The MME assumes a steady state, where $K_m$ is the Michaelis constant, $V_{max}$ is the reaction rate when the enzyme is saturated with substrate, and $[S]$ is the substrate concentration.

The MME can be illustrated as follows:

Figure 7. Schematic representation of the Michaelis-Menten equation
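The saturation behaviour of the MME can be reproduced numerically. The sketch assumes the standard form v = Vmax [S] / (Km + [S]); the Vmax and Km values are hypothetical and are not measured constants for KdcA or Adh1.

```python
def michaelis_menten(s, vmax, km):
    """Initial reaction rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


# Hypothetical kinetic constants, for illustration only.
VMAX, KM = 10.0, 2.0
for s in (0.5, 2.0, 20.0, 200.0):
    print(f"[S]={s}: v={michaelis_menten(s, VMAX, KM):.3f}")
```

At [S] = Km the rate is exactly half of Vmax, and the rate approaches Vmax as the substrate concentration grows.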


2. Enzymatic Processes and Reaction Mechanisms in Blue Light Systems

After identifying the enzyme activity and extracting the product phenylethanol (2-PE), we found that the experimentally obtained 2-PE never reached the yield expected from the literature. We therefore asked whether the difference was caused by changes in enzyme activity under specific conditions, and measured the enzyme activity at different pH values and at different temperatures.

Upon blue light irradiation, the LOV domain of the photosensitive protein EL222 changes conformation and initiates expression of the downstream kdcA and adh1 genes, whose products catalyse the conversion of PPA to 2-PE in E. coli with the help of the cell's own genes.

Figure 8. Catalytic synthesis of 2-PE


The specific reaction principle of KdcA and Adh1 catalyzing the substrate PPA is as follows:

Figure 9: Reaction principle of KdcA and Adh1 catalyzed substrate PPA


2.1 KdcA reaction equation

2.2 Adh1 reaction equation

3. Indirect method for the determination of phenylethanol products.

For the two reactions above, we measured the absorbance of NADH after 5 min of reaction at different temperatures and pH values, and calculated the remaining NADH concentration from the standard curve. Subtracting this from the initial amount of NADH added gives the NADH consumption, and the yield of phenylethanol then follows indirectly from the stoichiometric relationship between NADH and the alcohol in the reaction equation.

3.1 NADH Standard Curve

The standard NADH values were measured and recorded at 340 nm as follows:


A linear fit of the above data gives the following graph.

Figure 10. NADH standard curve


The equation of the standard curve was calculated as y=0.115+4.456x, where y is the absorbance value and x is the NADH concentration.

3.2 NADH consumption

Based on the standard curve equation above, we obtain:

$$c = c_0 - \frac{Abs - 0.115}{4.456}$$

where c is the NADH consumption, c0 is the initial NADH concentration and Abs is the absorbance measured at a given temperature and pH.

3.3 Phenylethyl alcohol production

According to the reaction equations in 2.1 and 2.2, NADH consumption : phenylethanol production = 1:1, so the phenylethanol production is:

$$\rho = c \cdot M$$

where ρ is the phenylethanol yield, c is the NADH consumption and M is the molecular weight of phenylethanol.
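The chain from absorbance to yield can be written out as a short sketch. It inverts the standard curve y = 0.115 + 4.456x to get the remaining NADH, subtracts that from the initial concentration and multiplies by the molar mass. Several things here are assumptions rather than statements from the text: concentrations are taken to be in mM, the initial concentration of 1.0 mM is derived from 10 μL of 20 mM NADH in a 200 μL reaction, M = 122.16 g/mol is the molar mass of 2-phenylethanol, and the example absorbance is hypothetical.

```python
INTERCEPT, SLOPE = 0.115, 4.456  # standard curve: Abs = 0.115 + 4.456 * [NADH]
M_2PE = 122.16                   # g/mol, molar mass of 2-phenylethanol (assumed)
C0_MM = 20.0 * 10.0 / 200.0      # 1.0 mM initial NADH (assumed from the recipe)


def nadh_remaining(abs340):
    """Remaining NADH concentration (mM), read off the standard curve."""
    return (abs340 - INTERCEPT) / SLOPE


def nadh_consumed(abs340, c0=C0_MM):
    """NADH consumption c = c0 - (Abs - 0.115) / 4.456, in mM."""
    return c0 - nadh_remaining(abs340)


def pe_yield_mg_per_l(abs340, c0=C0_MM):
    """2-PE yield rho = c * M; mM times g/mol gives mg/L (1:1 stoichiometry)."""
    return nadh_consumed(abs340, c0) * M_2PE


# Hypothetical absorbance reading, for illustration only.
print(nadh_consumed(2.343), pe_yield_mg_per_l(2.343))
```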

4. Experimental process design and measurement

4.1 Experimental measurement plan.

KdcA: Based on supporting data from other laboratories in the literature, we centred the enzyme activity conditions around 30°C to determine the optimum conditions for KdcA. Six parallel sets of experiments were performed in the range 20°C-45°C at 5°C intervals. If the measured optimum deviated substantially (≥5°C) from the 30°C suggested by the literature, the temperature interval would be refined and measured more precisely.

For pH, we set up 6 sets of parallel experiments covering pH 5-10 at intervals of 1.

Adh1: The optimum pH of Adh1 lies between 7.0 and 10.0; the enzyme activity reaches its maximum at pH 7.0, where it is also more stable. The optimum temperature of ADH is 37°C. The enzyme activity is relatively stable at 30-40°C and drops sharply when the temperature exceeds 45°C.

Considering the stability of the enzyme activity conditions of Adh1, we set the temperature to 20-45°C with 5°C interval and the pH to 5-10 with 1 interval.

4.2 Experimental data measurement

We put the sample solution to be tested into a microplate and measured the enzyme activity.

Experimental steps:

  1. Prepare phenylethanol standards at different concentrations (diluted 100x, 200x, 300x, 400x, 600x, 800x, 1000x, 1200x, 1500x, 2000x).
  2. Take 200 μL of each standard in turn and measure the absorbance.
  3. Fit a standard curve for the concentration of phenylethanol.
  4. Take 200 μL of the extracted sample and measure the absorbance.
  5. Compare against the standard curve to obtain the concentration of phenylethanol in the sample.
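The standard-curve steps above can be sketched as an ordinary least-squares line fit followed by an inverse lookup. A linear absorbance-concentration relationship is assumed, and the dilution factors, absorbances and sample reading below are hypothetical values chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to read a concentration from an absorbance."""
    return (absorbance - intercept) / slope


# Hypothetical standards: relative concentration = 1 / dilution factor.
dilutions = [100, 200, 400, 800, 2000]
rel_conc = [1.0 / d for d in dilutions]
absorbances = [0.520, 0.272, 0.144, 0.083, 0.046]  # hypothetical readings

slope, intercept = fit_line(rel_conc, absorbances)
sample = concentration_from_absorbance(0.200, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.4f}, sample={sample:.5f}")
```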

Our team members are performing enzyme activity measurements and observing the results of absorbance values

Team members used the UV-2700 spectrometer to measure the content of 2-PE (configuring the measurement solution).

4.3 Experimental data recording

We recorded the experimental absorbance data in a table and applied a red-white-green colour scale, from the largest values to the smallest.

4.3.1 Adh1

4.3.2 KdcA, Adh1 complex enzymes

The OD measurements are positively correlated with the amount of reactant remaining. From the tables above, we can see at a glance that the darker red cells (the highest OD values) indicate a large residual, little conversion and a low final yield of phenylethanol. Similarly, the darker the green, the higher the corresponding phenylethanol yield. The calculation is developed step by step below.

5. Analysis and pre-processing of results

5.1 Absorbance of NADH at Different temperatures and pH Values
5.1.1 Adh1

Add reagents (200 μL in total):

10 mM Tris-HCl pH 7 (140 μL), LP-type bacteria solution (30 μL), 20 mM NADH (10 μL), 10% CH3CHO (20 μL)


5.1.2 KdcA, Adh1 Complex Enzymes

Add reagents (200 μL in total):

10 mM Tris-HCl pH 7 (140 μL), LP-type bacteria solution (30 μL), 1 M MgCl2 (10 μL), 20 mM NADH (10 μL), 1 M CH3COCOOH (10 μL)


5.2 NADH Consumption (mM)
5.2.1 Adh1

5.2.2 KdcA, Adh1 Complex Enzymes

5.3 2-PE Production (mg/L)
5.3.1 Adh1

1. We used the Thin Plate Spline (TPS) method to fit the surface where the temperature and pH sampling points are located as shown below:

Figure 11. TPS for Adh1


Over the entire surface, as can be seen in the bottom right, almost all of the sampling points fall on the fitted surface, with an SSE of 2.405e-25 (an essentially exact fit, as expected for an interpolating surface). Adh1 is less active at 20 ℃, and the optimal values form a plateau within 35-45 ℃ and pH 7-8. The yield changed abruptly when the temperature dropped below 35 ℃ or when the pH rose above 8.

We treated temperature and pH as essentially independent, uncorrelated influences on the enzyme activity. We therefore searched for the surface maxima along the two axis directions and obtained the results shown in the upper right corner of Fig. 11. The maximum is close to the optimum recommended in the literature (pH 7, 35 ℃), and the overall trend also matched well.

2. Cubic interpolation divides the known data into small intervals and fits a cubic polynomial on each one, giving a smooth piecewise function. We used cubic interpolation to fit the yield surface over temperature and pH, and to locate the optimum point Zmax along the x and y directions together with its conditions.


We compared the fitting results of the two methods and used each to cross-check the other. The best case was obtained in the interval pH 7-8 and 35-40 ℃, which pointed us towards roughly pH 7.5 and 36 ℃ as the starting point for exploring finer experimental conditions.
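The search for a maximum along one axis can be illustrated with a one-dimensional natural cubic spline. This is a simplified sketch, not the surface fit described above: it interpolates a single temperature slice at fixed pH, the yields are hypothetical, and the natural end conditions (zero second derivative at both ends) are an assumption.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; end values stay 0
    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
        rhs = [
            6.0 * ((ys[j + 2] - ys[j + 1]) / h[j + 1] - (ys[j + 1] - ys[j]) / h[j])
            for j in range(n - 1)
        ]
        for j in range(1, n - 1):  # forward elimination (Thomas algorithm)
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for j in range(n - 3, -1, -1):  # back substitution
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def evaluate(x):
        # Pick the interval [xs[i], xs[i + 1]] that contains x.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return (
            (m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b
        )

    return evaluate


# Hypothetical yields (mg/L) at the six sampled temperatures for one pH value.
temps = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0]
yields = [12.0, 30.0, 55.0, 90.0, 96.0, 88.0]
spline = natural_cubic_spline(temps, yields)

# Evaluate on a 0.1-degree grid and keep the temperature with the highest value.
steps = 250
grid = [temps[0] + (temps[-1] - temps[0]) * k / steps for k in range(steps + 1)]
best_t = max(grid, key=spline)
print(f"estimated optimum near {best_t:.1f} degrees, yield {spline(best_t):.2f} mg/L")
```

Repeating the same search along the pH axis gives the second coordinate of the optimum, in the spirit of the axis-by-axis search described above.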

5.3.2 KdcA, Adh1 (LP Complex Enzyme)

1. The results of fitting the temperature and pH surfaces using TPS interpolation are shown below.


The plateau of maximum activity (highest yield) among the sampling points is centred around 40 ℃ and pH 7, where the highest values were about 98.6 mg/L. Below 35 ℃ the overall yield plummeted, and the surface reached its minimum below 30 ℃.

2. We used cubic spline interpolation to fit the production yield surface over temperature and pH and to locate the optimum point Zmax along the x and y directions together with its conditions. Piecewise cubic interpolant:


Similarly, the yield of the LP complex enzymes (KdcA and Adh1) over pH and temperature was fitted using cubic spline interpolation, and the optimum point was approximately 98.5 mg/L at pH 7 and 38.5°C.

6. Summary

In summary, to investigate the relationship between phenylethanol, the final product of our blue light expression system, and the temperature and pH conditions, we measured the activity of Adh1 alone and of the Adh1 and KdcA complex enzymes during the reaction, with reference to the chemical mechanism of the reaction, and then fitted surfaces to the sampling points. From these we inferred how enzyme activity and product yield change under different conditions. By comparing the optimum values for the two enzyme systems, we obtained the reaction conditions to use in preparing and controlling the next experiments.

3. Physicochemical analysis of proteins

In order to better guide the experimental procedure, we analysed the physicochemical properties of the proteins used in the experiments, such as Adh1 (which catalyses the production of phenylethanol from PAD in the blue light release system) and KdcA (which catalyses the production of PAD from PPA in the blue light release system).

1. Hydrophilicity and hydrophobicity analysis

We used ProtScale for the hydrophilicity/hydrophobicity analysis of the selected proteins, with the Hydropath. / Kyte & Doolittle scale. The individual values for the 20 amino acids are:

Ala: 1.800 Arg: -4.500 Asn: -3.500 Asp: -3.500 Cys: 2.500 Gln: -3.500

Glu: -3.500 Gly: -0.400 His: -3.200 Ile: 4.500 Leu: 3.800 Lys: -3.900

Met: 1.900 Phe: 2.800 Pro: -1.600 Ser: -0.800 Thr: -0.700 Trp: -0.900

Tyr: -1.300 Val: 4.200 Asx: -3.500 Glx: -3.500 Xaa: -0.490

To generate the plot data, we scanned each protein sequence with a sliding window of a given size. At each position, the average scale value of the amino acids within the window is calculated and plotted at the midpoint of the window. The results are shown below:
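The sliding-window calculation can be sketched as follows, using the Kyte & Doolittle values listed above keyed by one-letter amino acid codes. The window size of 9 is an assumption (the text does not state the size used), and the example sequence is a hypothetical toy peptide, not one of our parts.

```python
# Kyte & Doolittle hydropathy values, keyed by one-letter amino acid code.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return (midpoint position, mean score) pairs for each full window.

    Positions are 1-based; an odd window size gives an exact midpoint residue.
    """
    scores = [KD[aa] for aa in sequence.upper()]
    half = window // 2
    return [
        (start + half + 1, sum(scores[start:start + window]) / window)
        for start in range(len(scores) - window + 1)
    ]


# Hypothetical toy sequence, for illustration only.
toy = "MKTLLVAGIVDERKSSLLAVIG"
for position, score in hydropathy_profile(toy):
    print(position, round(score, 3))
```

Positive averages mark hydrophobic stretches and negative averages mark hydrophilic ones.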

1.1 Adh1 BBa_K4427004

SEQUENCE LENGTH: 375


1.2 KdcA BBa_K4427005

SEQUENCE LENGTH: 547


1.3 EL222 BBa_K4427003

SEQUENCE LENGTH: 635


1.4 bphp1 BBa_K4427000

SEQUENCE LENGTH: 731


1.5 Ppsr2 BBa_K4427001

SEQUENCE LENGTH: 456


2. Physicochemical Characterization of Proteins

For further analysis, we used the ProtParam tool, which takes a user-entered protein sequence and calculates physicochemical parameters such as atomic composition, estimated half-life, molecular weight, amino acid composition, extinction coefficient and instability index.
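As a small consistency check on the values reported below, the total number of atoms should equal the sum of the element counts in each formula. The helper is a hypothetical sketch that assumes every element symbol in the formula is followed by an explicit count, as in the formulas listed in this section.

```python
import re


def total_atoms(formula):
    """Sum the element counts in a formula such as 'C1771H2859 N473 O516 S23'."""
    return sum(int(count) for _, count in re.findall(r"([A-Z][a-z]?)\s*(\d+)", formula))


# Formulas and totals as reported for the five proteins in this section.
reported = {
    "C1771H2859 N473 O516 S23": 5642,
    "C2739H4293 N701 O841 S12": 8586,
    "C3202H4981 N843 O954 S26": 10006,
    "C3531H5623 N1025 O1059 S25": 11263,
    "C2191H3601 N651 O683 S12": 7138,
}
for formula, total in reported.items():
    print(formula, total_atoms(formula), total_atoms(formula) == total)
```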

2.1 Adh1 BBa_K4427004

Number of amino acids: 375

Molecular weight: 39771.42

Theoretical pI: 8.44

Atomic composition:

Carbon C 1771

Hydrogen H 2859

Nitrogen N 473

Oxygen O 516

Sulfur S 23

Formula: C1771H2859 N473 O516 S23

Total number of atoms: 5642

Estimated half-life: 30 hours in mammalian reticulocytes (in vitro), >20 hours in yeast (in vivo), >10 hours in E. coli (in vivo).

Instability index: calculated as 32.76

Aliphatic index: 93.55

Grand average of hydropathicity (GRAVY): 0.203

2.2 KdcA BBa_K4427005

Number of amino acids: 547

Molecular weight: 60884.13

Theoretical pI: 4.95

Atomic composition:

Carbon C 2739

Hydrogen H 4293

Nitrogen N 701

Oxygen O 841

Sulfur S 12

Formula: C2739H4293 N701 O841 S12

Total number of atoms: 8586

Estimated half-life: The N-terminal of the sequence considered is M (Met). The estimated half-life is 30 hours in mammalian reticulocytes (in vitro), >20 hours in yeast (in vivo), >10 hours in E. coli (in vivo).

Instability index: The instability index (II) is computed to be 32.76, suggesting the stability of the protein.

Aliphatic index: 93.55

Grand average of hydropathicity (GRAVY): 0.203

2.3 EL222 BBa_K4427003

Number of amino acids: 635

Molecular weight: 71384.41

Theoretical pI: 6.10

Atomic composition:

Carbon C 3202

Hydrogen H 4981

Nitrogen N 843

Oxygen O 954

Sulfur S 26

Formula: C3202H4981 N843 O954 S26

Total number of atoms: 10006

Estimated half-life: 30 hours in mammalian reticulocytes (in vitro), >20 hours in yeast (in vivo), >10 hours in E. coli (in vivo).

Instability index: calculated as 33.61

Aliphatic index: 89.01

Grand average of hydropathicity (GRAVY): -0.216

2.4 bphp1 BBa_K4427000

Number of amino acids: 731

Molecular weight: 80180.22

Theoretical pI: 5.67

Atomic composition:

Carbon C 3531

Hydrogen H 5623

Nitrogen N 1025

Oxygen O 1059

Sulfur S 25

Formula: C3531H5623 N1025 O1059 S25

Total number of atoms: 11263

Estimated half-life: 30 hours in mammalian reticulocytes (in vitro), >20 hours in yeast (in vivo), >10 hours in E. coli (in vivo).

Instability index: calculated as 45.41

Aliphatic index: 94.12

Grand average of hydropathicity (GRAVY): -0.124

2.5 Ppsr2 BBa_K4427001

Number of amino acids: 456

Molecular weight: 50376.36

Theoretical pI: 5.38

Atomic composition:

Carbon C 2191

Hydrogen H 3601

Nitrogen N 651

Oxygen O 683

Sulfur S 12

Formula: C2191H3601 N651 O683 S12

Total number of atoms: 7138

Estimated half-life: 30 hours in mammalian reticulocytes (in vitro), >20 hours in yeast (in vivo), >10 hours in E. coli (in vivo).

Instability index: calculated as 46.65

Aliphatic index: 100.79

Grand average of hydropathicity (GRAVY): -0.281
