Sπthση 3.0
One of the most significant limiting factors for MaSp synthesis in E. coli is the translational difficulty associated with the poly-alanine regions the protein contains. Whilst interactions between these regions contribute to the useful macroscopic properties of spider silk fibres [1], the resultant alanine content is disproportionately high relative to the pool of alanyl-tRNAs (alanine bound to its corresponding tRNA) maintained in each E. coli cell by default, which can result in termination errors in translation [2].
Our group hypothesises that, because the available pool of alanyl-tRNAs is limited by the availability of both intracellular alanine and alanine tRNAs, increasing transport of the former into the cell and synthesis of the latter, together with expression of the alanyl-tRNA synthetase enzyme which catalyses their binding, may elevate MaSp yield. We therefore aim to develop mathematical models, coded in Python, which investigate the effect of co-expressing an alanine import channel protein (CycA), alanyl-tRNA synthetase and alanine tRNA on MaSp yield over time.
Results obtained indicate that whilst CycA or alanyl-tRNA synthetase overexpression alone has no effect on MaSp yield, tRNA overexpression increases it. Yield is further elevated by additional co-expression of the synthetase, with maximum yield obtained via co-expression of all three. From these simulations, our group concludes that engineering E. coli to express CycA, alanyl-tRNA synthetase and alanine tRNAs on a 'helper plasmid' will benefit expression of MaSp proteins, a conclusion that informs plasmid design.
Figure 1 - Schematic showing the interactions between cycA and alanine tRNAs in importing alanine and shuttling it to ribosomes for MaSp production
Quantities Modelled
- CycA - Inner membrane permease [3] which enables active transport based uptake of alanine from external growth medium via a proton symport mechanism [4]
- tRNA - Alanine tRNA responsible for shuttling intracellular alanine to ribosome complexes for incorporation into MaSp constructs
- Aintra - Intracellular alanine, available for incorporation into MaSp constructs
- A-tRNA - Alanine aminoacyl tRNA i.e. tRNA with alanine bound, ready for incorporation into proteins
- MaSp - Spider protein constructs, synthesised in E. coli and ultimately extracted, purified and precipitated out to form macroscopic fibres [1]
- Synth - Alanyl-tRNA synthetase which catalyses the attachment of L-alanine to its corresponding tRNA in an aminoacylation reaction [5]
Parameters
- Np (No. plasmids per cell) = 20
- - Derived from medium copy number of plasmid backbone
- Lk (Nucleotide base pair length of species k (or the gene encoding it))
- - LcycA = 1413 bp [4]
- - LtRNA = 76 bp [6]
- - LSynth = 2628 bp [5]
- k0 (base transcription rate) = 45 nucleotide bases/second [7]
- kt (Rate constant for transcription) = Np·k0/Lk
- - ktCycA = 0.6369 s⁻¹
- - kttRNA = 11.84 s⁻¹
- - ktSynth = 0.3425 s⁻¹
- kd (Rate constant for degradation) = ln(2)/τk (modelling degradation as following first order kinetics) where τk is the half-life for the degradation of species k
- - Protein
- - Mean τprotein = 20 hours[8]
- - kdCycA = kdMaSp = kdSynth = 9.6 × 10⁻⁶ s⁻¹
- - tRNA
- - tRNA degradation is only non-negligible under amino acid starvation conditions
- - Mean τtRNA = 10 minutes[9]
- - kdtRNA = 1.2 × 10⁻³ s⁻¹
- - Here we assume alanyl-tRNA complexes are used up sufficiently quickly that they are unlikely to be degraded (by a tRNA nuclease)
- Nk (no. alanine residues per molecule of a protein k)
- - NCycA = 47[4]
- - NMaSp = 51[10]
- - NSynth = 91[5]
- Other rate constants
- - kcat (rate constant for rate-limiting step of alanine import by CycA) = 0.9953 s⁻¹
- - No experimental data available for amino acid uptake pumps in E. coli
- - Estimated value obtained using ML tool DLKcat which models kcat values based on the structure of the substrate (alanine) and AA sequence of protein binding to it (CycA) [11]
- - kSynth (Rate constant for rate-determining aminoacylation step) = 2.0 s⁻¹ [12]
- U0 (base alanine uptake rate for E. coli)
- - U0 = 1.17 mmol/30 seconds/kg wet weight of E. coli[13]
- - E. coli wet cell weight = 10⁻¹² g/cell[14]
- - Therefore, U0 = 23500 molecules/second/cell
- Initial values of quantities (if non-zero)
- - Equilibrium population of alanine tRNAs per cell [tRNA]0 = 4000[15]
- - Equilibrium population of alanyl-tRNA synthetases per cell = [Synth]0 = 6695[15]
- - Uses known equilibrium rate constant for the alanyl-tRNA synthetase from [12]
- No. iterations for model = 57600 (one per second of simulated time)
- - Typical overnight incubation period of 16 hours
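The derived values in the parameter list can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are our own, and the Avogadro constant (6.022 × 10²³ per mole) is a value not stated in the list.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole (not stated in the parameter list)

# Values quoted in the parameter list
N_PLASMIDS = 20   # plasmids per cell
K0 = 45.0         # nucleotide bases transcribed per second
GENE_LENGTH_BP = {"CycA": 1413, "tRNA": 76, "Synth": 2628}
HALF_LIFE_S = {"protein": 20 * 3600, "tRNA": 10 * 60}


def transcription_rate(length_bp):
    """kt = Np * k0 / Lk, in s^-1."""
    return N_PLASMIDS * K0 / length_bp


def degradation_rate(half_life_s):
    """kd = ln(2) / half-life, in s^-1 (first-order decay)."""
    return math.log(2) / half_life_s


def base_uptake_rate():
    """Convert 1.17 mmol / 30 s / kg wet weight to molecules / s / cell."""
    mol_per_s_per_kg = 1.17e-3 / 30.0
    kg_per_cell = 1e-12 / 1000.0  # 10^-12 g per cell, expressed in kg
    return mol_per_s_per_kg * kg_per_cell * AVOGADRO


kt = {name: transcription_rate(bp) for name, bp in GENE_LENGTH_BP.items()}
kd = {name: degradation_rate(t) for name, t in HALF_LIFE_S.items()}
u0 = base_uptake_rate()

for name, value in kt.items():
    print(f"kt[{name}] = {value:.4f} s^-1")
for name, value in kd.items():
    print(f"kd[{name}] = {value:.2e} s^-1")
print(f"U0 = {u0:.0f} molecules/s/cell")
```

Computing the constants from their definitions, rather than hard-coding the rounded values, keeps the downstream equations consistent if the copy number or a half-life is later revised.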
Governing Equations
All equations are solved using a ‘time-stepping method’ whereby the change in the value of each quantity modelled, depending on the current values, is calculated, as per the differential equations below, and applied at each timestep. The resolution of the model (i.e. timestep length) is one second, which is negligibly short relative to the simulation duration.
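A minimal sketch of this time-stepping scheme (forward Euler) is given below for a single species. As an illustrative assumption, it combines constant synthesis at ktCycA with first-order degradation at kdCycA, which has a closed-form solution to compare against; it is not the full model, and the function names are our own.

```python
import math

K_SYNTHESIS = 0.6369    # s^-1, ktCycA from the parameter list
K_DEGRADATION = 9.6e-6  # s^-1, kdCycA from the parameter list
DURATION_S = 57600      # 16 hour incubation
DT = 1.0                # timestep length in seconds


def euler_simulate(duration_s, dt):
    """Step dx/dt = K_SYNTHESIS - K_DEGRADATION * x forward from x = 0."""
    x = 0.0
    for _ in range(int(duration_s / dt)):
        dx_dt = K_SYNTHESIS - K_DEGRADATION * x  # rate at the current value
        x += dx_dt * dt                          # change applied this timestep
    return x


def analytic_solution(t):
    """Closed-form solution of the same equation with x(0) = 0."""
    return (K_SYNTHESIS / K_DEGRADATION) * (1.0 - math.exp(-K_DEGRADATION * t))


numeric = euler_simulate(DURATION_S, DT)
exact = analytic_solution(DURATION_S)
relative_error = abs(numeric - exact) / exact
print(f"Euler: {numeric:.1f}, analytic: {exact:.1f}, "
      f"relative error: {relative_error:.2e}")
```

For this test equation the one-second timestep gives a very small relative error over the 16 hour run, which supports treating the timestep as negligibly short for the slow synthesis and degradation terms.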
Alanine import step
$$\frac{d[A_{intra}]}{dt} = U_0 + k_{cat}[CycA]$$
Alanine uptake is given by the sum of the base uptake rate U0 (accounting for uptake through routes other than import via CycA) and kcat[CycA]. This second term represents uptake of alanine through CycA, given by the Michaelis-Menten equation in the limit of a large substrate excess (which reduces to rate = Vmax = kcat[Enzyme]).
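For reference, the saturating limit used for the CycA term can be written out from the standard Michaelis-Menten rate law. The symbols [Aextra] (extracellular alanine concentration) and KM (Michaelis constant of CycA for alanine) are introduced here for illustration only and are not parameters of the model:

$$v_{CycA} = \frac{k_{cat}[CycA][A_{extra}]}{K_M + [A_{extra}]} \;\longrightarrow\; k_{cat}[CycA] = V_{max} \quad \text{as } [A_{extra}] \gg K_M$$

As a rough sense of scale, a hypothetical 1000 CycA molecules per cell would contribute 0.9953 × 1000 ≈ 995 molecules per second, compared with the base uptake rate U0 of 23500 molecules per second per cell.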
Aminoacylation step
If tRNA availability is rate-limiting:
$$\frac{d[A-tRNA]}{dt} = [tRNA] $$
$$\frac{d[A_{intra}]}{dt} = -[tRNA]$$
$$\frac{d[tRNA]}{dt} = -[tRNA]$$
If tRNA availability is the limiting factor for aminoacylation, all available tRNA is aminoacylated, using up an amount of intracellular alanine equal to tRNA concentration, and producing an amount of alanyl-tRNA equal to this concentration.
If intracellular alanine availability is rate-limiting:
$$\frac{d[A-tRNA]}{dt} = [A_{intra}] $$
$$\frac{d[tRNA]}{dt} = -[A_{intra}]$$
$$\frac{d[A_{intra}]}{dt} = -[A_{intra}]$$
If intracellular alanine availability is the limiting factor for aminoacylation, all available intracellular alanine is bound to tRNA, using up an amount of tRNA equal to alanine concentration, and producing an amount of alanyl-tRNA equal to this concentration.
If synthetase activity is rate-limiting:
$$\frac{d[A-tRNA]}{dt} = k_{Synth}[Synth] $$
$$\frac{d[A_{intra}]}{dt} = -k_{Synth}[Synth]$$
$$\frac{d[tRNA]}{dt} = -k_{Synth}[Synth]$$
If synthetase activity is the limiting factor for aminoacylation, aminoacylation occurs at its maximum rate defined by the concentration of synthetase molecules and their turnover rate kSynth, producing an amount of alanyl-tRNA equal to this value, and depleting intracellular alanine and tRNA concentrations to an equal extent.
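The three cases above can be read as a single rule: the amount of alanyl-tRNA formed in a timestep is the smallest of the free tRNA, the intracellular alanine and the synthetase capacity. The sketch below implements that reading for a one-second timestep; the function name and return layout are hypothetical, and taking the minimum is our interpretation of the case analysis rather than a quotation of the model code.

```python
K_SYNTH = 2.0  # s^-1, kSynth from the parameter list


def aminoacylation_step(trna, a_intra, synth):
    """Apply one 1 s aminoacylation step.

    Returns (trna_left, a_intra_left, a_trna_made, limiting_factor).
    """
    # Upper bound on alanyl-tRNA formed this step from each candidate limit
    capacities = {
        "tRNA": trna,                     # all free tRNA charged
        "alanine": a_intra,               # all intracellular alanine bound
        "synthetase": K_SYNTH * synth,    # maximum enzyme turnover in 1 s
    }
    limiting = min(capacities, key=capacities.get)
    charged = capacities[limiting]
    # tRNA and alanine are depleted by the same amount that is charged
    return trna - charged, a_intra - charged, charged, limiting


# Initial populations from the parameter list, with one second of base uptake
print(aminoacylation_step(trna=4000, a_intra=23500, synth=6695))
```

With the initial values from the parameter list, tRNA is the smallest of the three quantities, in line with the statement in the Results that tRNA availability is the default limiting factor.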
Transcription/translation step
Demand = Alanyl-tRNA demand for translation of CycA and alanyl-tRNA synthetase
$$Demand = k_t^{CycA}N_{CycA} + k_t^{Synth}N_{Synth}$$
If [A-tRNA] ≥ Demand:
$$\frac{d[CycA]}{dt} = k_t^{CycA}$$
$$\frac{d[Synth]}{dt} = k_t^{Synth}$$
$$\frac{d[MaSp]}{dt} = \frac{[A-tRNA] - Demand}{N_{MaSp}}$$
If the amount of alanyl-tRNA available exceeds the demand imposed by translation of CycA and alanyl-tRNA synthetase, both CycA and the synthetase are translated at their respective maximum rates, with the remainder of alanyl-tRNA being used to translate MaSp constructs.
If [A-tRNA] < Demand:
$$\frac{d[CycA]}{dt} = \frac{[A-tRNA]}{Demand}k_t^{CycA}$$
$$\frac{d[Synth]}{dt} = \frac{[A-tRNA]}{Demand}k_t^{Synth}$$
If insufficient alanyl-tRNA is available to translate both CycA and the synthetase at their respective maximum rates, all alanyl-tRNA is used to translate these proteins, yielding amounts proportional to the fraction of the total demand satisfied by the available alanyl-tRNA. In this case, no alanyl-tRNA is available for MaSp translation.
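The allocation rule for alanyl-tRNA can be sketched as a small function. This is a minimal illustration for a one-second timestep, assuming the rate constants are computed from Np, k0 and the gene lengths in the parameter list; the function name is hypothetical.

```python
# Transcription rate constants, kt = Np * k0 / Lk (s^-1)
KT_CYCA = 20 * 45 / 1413
KT_SYNTH = 20 * 45 / 2628

# Alanine residues per protein molecule
N_CYCA, N_SYNTH, N_MASP = 47, 91, 51

# Alanyl-tRNA needed per second to make CycA and synthetase at full rate
DEMAND = KT_CYCA * N_CYCA + KT_SYNTH * N_SYNTH


def translation_step(a_trna):
    """Return (new CycA, new synthetase, new MaSp) made in one 1 s step."""
    if a_trna >= DEMAND:
        # Helper proteins at maximum rate, remainder goes to MaSp
        return KT_CYCA, KT_SYNTH, (a_trna - DEMAND) / N_MASP
    # Shortfall: scale both helper proteins by the fraction of demand met
    fraction = a_trna / DEMAND
    return fraction * KT_CYCA, fraction * KT_SYNTH, 0.0


print(f"Demand = {DEMAND:.2f} alanyl-tRNA per second")
print(translation_step(4000.0))
print(translation_step(DEMAND / 2))
```

In both branches the alanine accounted for by the three products equals the alanyl-tRNA supplied, so the rule conserves alanine by construction.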
Degradation step
$$\frac{d[CycA]}{dt} = -k_d^{CycA}[CycA]$$
$$\frac{d[MaSp]}{dt} = -k_d^{MaSp}[MaSp]$$
$$\frac{d[Synth]}{dt} = -k_d^{Synth}[Synth]$$
If alanine availability limits aminoacylation:
$$\frac{d[tRNA]}{dt} = -k_d^{tRNA}[tRNA]$$
All proteins are degraded following first order reaction kinetics, with their rate constants determined using the average half-life for protein degradation. tRNA is similarly degraded but only under amino acid starvation conditions, arising when insufficient alanine is available for another factor to be rate-limiting for aminoacylation.
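A minimal sketch of the degradation step is shown below, using a forward Euler update over one timestep. The function name and the `alanine_limited` flag are hypothetical; the flag stands in for the condition that alanine availability limited aminoacylation in the current timestep.

```python
KD_PROTEIN = 9.6e-6  # s^-1, shared by CycA, MaSp and the synthetase
KD_TRNA = 1.2e-3     # s^-1, applied only under alanine starvation


def degradation_step(cyca, masp, synth, trna, alanine_limited, dt=1.0):
    """Apply first-order decay to each species for one timestep."""
    cyca -= KD_PROTEIN * cyca * dt
    masp -= KD_PROTEIN * masp * dt
    synth -= KD_PROTEIN * synth * dt
    if alanine_limited:
        # tRNA is only degraded when alanine limits aminoacylation
        trna -= KD_TRNA * trna * dt
    return cyca, masp, synth, trna


print(degradation_step(1000.0, 1000.0, 1000.0, 4000.0, alanine_limited=False))
print(degradation_step(1000.0, 1000.0, 1000.0, 4000.0, alanine_limited=True))
```

Because the protein rate constant is so small, each protein pool loses only about one part in 10⁵ per second, whereas the tRNA pool shrinks roughly a hundred times faster once starvation applies.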
Assumptions
- Extracellular alanine remains in large excess in the growth medium for the entire incubation period, so the rate of alanine uptake can be modelled using the Michaelis-Menten equation in the limit of large substrate concentration
- - The growth medium will contain a high concentration of alanine, with continuous supply during incubation to maintain a constant excess
- ATP availability does not limit transcription and translation of proteins and tRNAs, or pumping of alanine molecules through CycA
- - The growth medium will also contain a high concentration of glucose to supply ATP synthesis
- In the absence of other rate-limiting factors stemming from availability of relevant amino acids and tRNAs, transcription occurs at a constant, maximum rate for E. coli
- - Metabolic capacity of E. coli for protein synthesis is maximised by amino acid and glucose supply in the growth medium, as well as the overexpression of tRNAs[16] and removed expression of the Lon and OmpT proteases (which elevate protein breakdown) in the chassis used (Rosetta BL21 DE3) [17]
- Rate of transcription and translation are coupled, with both processes occurring almost simultaneously (with negligible lag-time between them relative to the timescale of the incubation period)
- - Literature shows the two processes are highly coupled in E. coli, justifying their combination into one process for the synthesis of CycA and MaSp [18].
- The limiting factor for MaSp synthesis is the rate at which alanine can be shuttled by alanine tRNAs for translation of MaSp mRNA templates
- - MaSp constructs have an extremely high alanine content, containing long poly-A regions, from which result some of the intermolecular interactions which confer the macroscopic properties of MaSp proteins (high tensile strength and elasticity) [1]. Therefore, alanine availability is highly likely to limit the rate of MaSp translation (in turn determining the coupled rate of MaSp transcription)
- The rate-limiting step for the incorporation of alanine into CycA, synthetase and MaSp proteins is always the production of alanyl-tRNA, rather than its diffusion to the ribosome or the process of translation itself
Results
Figure 2 - Graph showing MaSp yield over time with varied co-expression of the three genes of the helper plasmid (cycA, alanine tRNA and alanyl-tRNA synthetase) alongside MaSp expression, both with and without protein and tRNA degradation considered
As shown in Figure 2, with degradation neglected, MaSp yield increases over time across the entire incubation period. Alanine tRNA overexpression increases MaSp yield, indicating that tRNA availability is by default the limiting factor for MaSp synthesis. Additional co-expression of the alanyl-tRNA synthetase further increases MaSp yield, indicating that, with the alanine tRNA overexpressed, synthetase activity becomes the rate-limiting factor for MaSp production. Finally, further co-expression of CycA increases MaSp yield, but only from approximately t = 30000 seconds (indicated by the dashed line) into the incubation period, suggesting alanine availability is only rate-limiting from this point onward. With degradation considered, the same pattern of elevated MaSp yield is obtained for varied gene expression, with CycA expression only increasing MaSp production from approximately t = 42000 seconds (again indicated by the dashed line) into the incubation period. This is logical given that degradation reduces tRNA availability and overall synthetase activity, delaying the point at which alanine availability becomes the rate-limiting factor for MaSp production.
Figure 3 - Graph showing excess intracellular alanine over time with varied co-expression of the three genes of the helper plasmid (cycA, alanine tRNA and alanyl-tRNA synthetase) alongside MaSp expression, both with and without protein and tRNA degradation considered
As shown in Figure 3, both with and without the effects of degradation considered, intracellular alanine is in large excess across the entire incubation period in both the cases of no gene expression and tRNA overexpression. In the case of tRNA overexpression and synthetase expression, demand for alanine is sufficient that excess intracellular alanine is reduced to zero as tRNA availability and synthetase activity increase over time. It is at this point (denoted by the dashed line) that intracellular alanine availability becomes rate-limiting for MaSp production. The time taken for this point to be reached is greater if degradation is considered, as this reduces the rate at which tRNA availability and synthetase activity increase over time. If CycA is additionally co-expressed, supply of intracellular alanine is sufficient at all points for an excess to be present, increasing MaSp yield relative to tRNA and synthetase expression only from the time point at which alanine availability is rate-limiting for MaSp production in that case.
From these results, we conclude that overexpression of alanine tRNA in E. coli is likely to significantly elevate MaSp yield across an overnight incubation period. Additional overexpression of alanyl-tRNA synthetase and CycA will increase yield further, justifying their co-expression in E. coli with MaSp proteins to maximise production of the final protein fibres.
The primary limitation of the model is that the limited metabolic capacity of the chassis is ignored. With a limited ATP pool at any point in time, the rate of CycA, alanyl-tRNA synthetase and alanine tRNA production may not always be limited by the availability of alanyl-tRNA. Given the limited availability of both ATP and transcriptional/translational machinery, these factors may instead be rate-limiting for protein/tRNA production. Moreover, alanine pumping through CycA itself requires ATP indirectly, because alanine is transported via a proton symport mechanism, which requires a favourable electrochemical gradient established by ATP-requiring proton pumping. The more CycA molecules present, the less favourable the electrochemical gradient will be for each pump (only a finite amount of ATP can be invested in proton pumping) and thus the less efficient CycA alanine transport will be. Additional limitations of the model include the fact that diffusion time for alanine and alanyl-tRNA throughout the cell is taken to be negligible. The rate of alanine binding to, and shuttling by, tRNAs is also taken to be high enough that the concentration of alanyl-tRNAs is always negligible.
Design, Build, Test, Learn
Earlier Models -
The first generation model (Sπthση 1.0) predicted alanine import rate would be proportional to the concentrations of CycA in the cell membrane and extracellular alanine, treating the active transport step as having elementary reaction kinetics. Such models predicted an implausibly high rate of alanine uptake, given the potential for saturation of binding sites for CycA was ignored. Thus, later models modelled uptake using the Michaelis-Menten equation, in the limit of a large excess of the substrate.
This model adopted the same form as the final iteration, but separated the transcription and translation steps of MaSp synthesis, simulating the availability of MaSp mRNA templates for translation. Such models predicted complete independence of MaSp synthesis rate from alanine and tRNA availability, because the rate of transcription of the comparatively long mRNA template of MaSp constructs (relative to alanine tRNA or CycA) is the limiting factor at biologically relevant values of kcat and U0. Given the observed tight coupling of the rates of transcription and translation in E. coli [18] and the assumed scarcity of alanyl-tRNA available for MaSp translation, it seemed reasonable to assume translation is the rate-limiting step for MaSp production. Thus, transcription and translation were functionally combined in the final model, with the rate of MaSp production reflecting only the final rate-limiting step of translation.
The second generation model (Sπthση 2.0) simulated MaSp production using coupled transcription and translation processes but did not factor in degradation. Thus, the model did not bound maximum MaSp production and made the biologically implausible prediction that MaSp yield grows without limit over time. When consulted, Professor Ozgur Akman (an expert in the field of mathematical biology) suggested that including such degradation steps is vital for obtaining biologically relevant predictions of the yield of a target biomolecule over time. Thus, the third generation model (Sπthση 3.0) incorporated degradation processes, modelled as following first order reaction kinetics. Sπthση 3.0 also modelled synthetase activity as a potential limiting factor for alanyl-tRNA production, removing a necessary assumption of earlier models that enzyme-catalysed binding of alanine and tRNA is never the rate-limiting step of the overall process of alanyl-tRNA synthesis.
Sπthση 4.0 -
To eliminate the assumption that alanyl-tRNA availability is always the rate-limiting factor for MaSp production, a set of differential equations has been derived, modelling transcription and translation as decoupled processes. Whilst the literature suggests these processes are tightly associated in E. coli [18], modelling the processes as decoupled allows mRNA degradation to be factored in, considering the possibility of transcription being the rate-limiting factor for MaSp production. Moreover, this model accounts for the two-step mechanism of aminoacylation [19] using Michaelis-Menten kinetics in the limit of high intracellular alanine availability (as is the case for all expression states except for the case of tRNA and synthetase expression). Insufficient time was available to code this model; however, a complete mathematical description is available for download below.
References
- Stark, M et al. Macroscopic fibers self-assembled from recombinant miniature spider silk proteins. Biomacromolecules. 2007;8(5): 1695-1701. doi: 10.1021/bm070049y
- Rosenberg A, Goldman E, Dunn J, Studier F, Zubay G. Effects of consecutive AGG codons on translation in Escherichia coli, demonstrated with a versatile codon test system. J Bacteriol. 1993;175(3): 716-722. doi: 10.1128/jb.175.3.716-722.1993
- Hook C, Eremina N, Zaytsev P, Varlamova D, Stoynova N. The Escherichia coli Amino Acid Uptake Protein CycA: Regulation of Its Synthesis and Practical Application in l-Isoleucine Production. Microorganisms. 2022;10(3):647. doi: 10.3390/microorganisms10030647
- CycA. ECOCYC version 26.0. Available at: https://biocyc.org/gene?orgid=ECOLI&id=EG12504 [Accessed 19/7/22]
- UniProt. P00957. Available at: https://www.uniprot.org/uniprotkb/P00957/entry#sequences [Accessed 8/10/22]
- tRNA-Ala-GGC-1-2. GtRNAdb. Available at: http://gtrnadb.ucsc.edu/genomes/bacteria/Esch_coli_K_12_MG1655/genes/tRNA-Ala-GGC-1-2.html [Accessed 19/7/22]
- Yu J, Xiao J, Ren X, Lao K, Xie XS. Probing gene expression in live cells, one protein molecule at a time. Science. 2006;311(5767): 1600-1603. doi: 10.1126/science.1119623
- Moran M et al. Sizing up metatranscriptomics. ISME J 2013: 237-243. doi: 10.1038/ismej.2012.94
- Svenningsen S et al. Transfer RNA is highly unstable during early amino acid starvation in Escherichia coli. Nucleic Acids Research. 2017;45(2): 793-804. doi: 10.1093/nar/gkw1169
- Registry of Standard Biological Parts. Part:BBa_K4339002. http://parts.igem.org/Part:BBa_K4339002 [Accessed 10/10/22]
- Li, F., Yuan, L., Lu, H. et al. Deep learning-based kcat prediction enables improved enzyme-constrained model reconstruction. Nat Catal. 2022;5: 662-672. doi:10.1038/s41929-022-00798-z
- Zhang C-M et al. Distinct Kinetic Mechanisms of the Two Classes of Aminoacyl-tRNA Synthetases. Journal of Molecular Biology. 2006;361(2): 300-311. doi: 10.1016/j.jmb.2006.06.015
- Piperno JR, Oxender DL. Amino Acid Transport Systems in Escherichia coli K12. Journal of Biological Chemistry. 1968;243(22): 5914-5920. doi: 10.1016/S0021-9258(18)94507-2
- ECMDB. Escherichia coli Statistics. https://ecmdb.ca/e_coli_stats [Accessed 10/10/2022]
- Jakubowski H, Goldman E. Quantities of individual aminoacyl-tRNA families and their turnover in Escherichia coli. J Bacteriol. 1984;158(3): 769-776. doi: 10.1128/jb.158.3.769-776.1984
- Fu, W., Lin, J. & Cen, P. 5-Aminolevulinate production with recombinant Escherichia coli using a rare codon optimizer host strain. Appl Microbiol Biotechnol 75, 777–782 (2007). doi: 10.1007/s00253-007-0887-y
- The Wolfson Centre for Applied Structural Biology. Bacterial Strains for Protein Expression. http://wolfson.huji.ac.il/expression/bac-strains-prot-exp.html [Accessed 19/7/22]
- Proshkin S, Rahmouni AR, Mironov A, Nudler E. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science. 2010;328(5977): 504-508. doi: 10.1126/science.1184939
- Santra M and Bagchi B. Catalysis of tRNA Aminoacylation: Single Turnover to Steady-State Kinetics of tRNA Synthetases. J. Phys. Chem. B. 2012;116(39): 11809-11817. doi: 10.1021/jp305045w