Part Improvement


We successfully engineered two improved versions of the araBAD promoter, which are part of a new collection of pBAD_APs. All the designed pBADs have a truncated length of less than 160 base pairs, and some even outperform the 300 bp wild-type counterpart. We thus qualify for the Gold criterion: Part Improvement.


Preamble

Originally, we had hoped to reuse several components of the RPA circuit from Aoki et al (2019) in our design. The process species was chosen to be AraC, which controls the araBAD promoter (dubbed P_BAD) and is inducible by arabinose [1]. However, we were concerned that the common wild-type P_BAD is relatively large, at around 300 bp, which would increase the bulkiness of our LVL2 construct and reduce efficiency. In addition, in an interview (see Human Practice), Dr. Aoki shared that one caveat of her previous circuit design involved the araC-P_BAD system. P_BAD is by nature a relatively leaky, medium-strength promoter, so more araC is needed to reach maximal activation. However, she emphasised that overexpression of araC is toxic to cells and can therefore drastically impede the performance of the circuit.

We were faced with two options: choose an alternative regulatory system, or find a superior version of the araBAD promoter in the literature. Our initial plan was to pursue both paths and keep one as a back-up. Through part-mining, we saw that Meyer et al (2019) had designed several different P_BADs. One of them, called P_BADmin, is significantly shorter and less leaky than wild-type P_BAD owing to the rational macro-deletion of an upstream segment, but its maximal activity is lower by more than an order of magnitude [2]. After considering both this hard trade-off and the host-construct non-orthogonality (interference with the endogenous arabinose operon) [3], we concluded that this inducible system would not be suitable for a circuit implemented in E. coli strain DH5alpha, and switched to the orthogonal CinR-pCinR pair.

Still, as planned, we thought that redesigning the existing wild-type promoter would be an interesting side project for Part Improvement. Our initial goal was to engineer a P_BAD of minimal length that shows both lower leakiness and higher maximal activity (ambitious, we know!).

To achieve this, we carried out an intensive literature search for prior optimization strategies. To understand their rationales, we also gathered information on the regulatory mechanism of the promoter, which appeared to be covered mostly in papers from the 1970s. A brief explanation of the mechanism is given below.

In the end, we managed to create two araBAD promoters with significant improvements compared to the wild-type version. These promoters are advantageous in that they:

Information on the pBAD collection is stored in the BBa_K4491007 entry. A brief recap is also documented on the existing Part page, BBa_K2442101.

Regulatory Mechanism of araBAD promoter

The exact definition of the araBAD promoter varies between sources. Historically, P_BAD referred only to a short segment upstream of the +1 transcription start site (the core promoter), containing the -35 and -10 boxes [4]. However, as a complete promoter Part, the regulatory region further upstream is also included. In the following discussion, P_BAD refers to the whole sequence consisting of both the regulatory sequence and the core promoter.

The araBAD promoter regulates the araBAD operon and is controlled by araC, a regulatory protein known for its “love-hate” mechanism of action. The upstream regulatory region contains several protein binding sites: araI1, araI2, araO1 and araO2 can all be occupied by araC. Between araI1 and araO1 also lies a CAP binding site, which recruits the cAMP receptor protein (CRP, also known as catabolite activator protein, CAP) for transcription activation [5]. The spacing between araO1 and araO2 is noticeably large, reaching nearly 210 bp, and is unsurprisingly responsible for the overall bulkiness of the promoter. Interestingly, a region within this spacer contains the promoter of the araC gene, which runs in the opposite direction to the araBAD operon. This promoter is regulated by araO1 just downstream. The araC protein therefore not only regulates the araBAD operon but also controls its own production by binding to the araO1 region (negative autoregulation). However, our focus will be solely on the regulation of P_BAD.

Figure 1: Schematic representation of the araBAD operon and the upstream components of the core araBAD promoter.

In the absence of L-arabinose, transcription is repressed by the action of two araC molecules binding to the structure. One araC binds to araI1 and the other binds to araO2 further upstream. The dimerization domain confers high affinity, bringing the two protein molecules closer to dimerize, essentially creating a DNA loop as a result. This looping mechanism prevents any sigma factors, RNA Polymerase or CRP from being recruited, thus repressing transcription.

Figure 2: Schematic representation of the regulatory mechanism of araC-P_BAD with (bottom) or without (top) addition of arabinose.

In the presence of L-arabinose, binding of the sugar molecule to the arabinose-binding domain triggers a conformational change in the DNA-binding domain of araC, which reduces its affinity for the distal araO2 site. Binding to araI2, just downstream of araI1, then becomes more favourable. The dimerized complex therefore forms at the araI1-araI2 site instead of araO2-araI1, which breaks the loop and hence activates transcription. The araC dimer also “nudges” the RNA Polymerase for enhanced transcription.

Design Rationales

Understanding the overall mechanism of the araBAD promoter allowed us to pinpoint several aspects of its structure that can be rationally improved.

Rationale 1: Spacing between araI1 - araO2

To minimise the length of P_BAD, we examined whether the araO2 site and the 211 bp spacer between araI1 and araO2 (a region that includes the CAP binding site and araO1) are needed. Previous literature suggested that complete or even partial deletion of araO2 increases leaky expression from P_BAD, emphasising that this region is essential for normal repression [6]. Strangely, most commercially available P_BAD sequences (in the CIDAR MoClo kit, for example) omit the araO2 site, so its significance is still uncertain.

The spacing dictates the size of the loop, which in turn controls the level of repression in the absence of arabinose and so determines the promoter’s leakiness. Here, the spacing is defined as the distance between position -59 within araI1 and position -270 within araO2 (underlined and asterisked). No lower bound on loop size was found: a functional araBAD promoter was designed with a 34-bp loop after deleting a significant portion of the spacer [7]. While it maintained leakiness similar to that of the wild-type counterpart, the loss of the CAP binding site drastically reduced the maximal strength. Still, the finding demonstrated the great flexibility of the bulky araC proteins in mediating small loop formation.

Another important observation was that, as the spacing varied, the promoter activity oscillated with an 11.1 bp periodicity. Specifically, insertions or deletions of odd multiples of about 5 bp noticeably increased leakiness, while integer multiples of 11.1 bp retained the wild-type’s full repression [8]. This value is approximately equivalent to one helical repeat of DNA (10.5 bp). It was further explained that inserting 5 bp between araO2 and araI1 rotates one site halfway around the DNA double helix with respect to the other and impedes repression. Despite araC’s flexibility, the torsional stress on the DNA makes such looping much more energetically unfavourable.
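The helical-phase argument above can be made concrete with a small calculation. The sketch below is only an illustration, not part of the original analysis: it assumes that an insertion or deletion of a given length rotates one site relative to the other by the corresponding fraction of the 11.1 bp repeat, and the function name `phase_offset_degrees` is hypothetical.

```python
# Illustrative sketch: angular offset between araO2 and araI1 after an indel,
# assuming one full helical turn per 11.1 bp (the periodicity quoted above).
HELICAL_REPEAT_BP = 11.1


def phase_offset_degrees(indel_bp, helical_repeat_bp=HELICAL_REPEAT_BP):
    """Return the rotation (0 to 180 degrees) of one site relative to the other.

    indel_bp is positive for insertions and negative for deletions.
    0 degrees means both sites face the same side of the helix;
    180 degrees means they face opposite sides.
    """
    fraction_of_turn = (indel_bp / helical_repeat_bp) % 1.0
    angle = fraction_of_turn * 360.0
    return min(angle, 360.0 - angle)


if __name__ == "__main__":
    for indel in (5, -5, 11, 22, 33):
        print(f"{indel:+d} bp -> {phase_offset_degrees(indel):.1f} degrees out of phase")
```

Under this assumption, a 5 bp change leaves the two sites close to opposite faces of the helix, while changes near whole multiples of 11.1 bp keep them nearly in phase.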

Taking these results into account, our first design strategy was to reduce the spacer region to 56 bp while still maintaining the CAP binding site downstream. We removed the araO1 site completely because of its minor role in P_BAD activity. The schematic of our rationale is depicted below.

Figure 3: Schematic representation of rational modification of araO2-araI1 spacing.

Rationale 2: araI1 and araI2

We then investigated the 17-bp araI1 and araI2 regions, each containing two sites called the A-box and B-box, which serve as specific binding sites for araC. Niland et al (1996) showed that any single base-pair substitution in these two boxes drastically reduces binding of araC. Between the two boxes lie seven invariant nucleotides; selected single substitutions at these positions increased binding affinity for araC by roughly 140% compared to wild-type araI1 [4]. We therefore modified the araI1 site to contain all the substitutions that individually yielded tighter binding, while keeping the A- and B-boxes unchanged.

In another paper, Reeder (1993) found that the B-box of araI2 overlaps with four base pairs of the -35 consensus sequence [9]. Thus, any substitution in this box will negatively impact P_BAD’s activity, resulting in either very high leaky expression or lowered inducibility. However, the author remarked that araI1 has a much higher affinity for araC than araI2 does, especially when no arabinose is present. This makes sense, as araC prefers binding to the distal araO2 and araI1 sites over the nearby araI2. From this insight, we asked whether a duplicated araI1-I1 might confer higher maximal activity than wild-type araI1-I2. We therefore changed the last nucleotide of the A-box and the interbox sequence of araI2 to those of wild-type araI1, while leaving the araI2 B-box untouched. In a sense, we created a chimeric half-araI1-half-araI2 site in place of wild-type araI2. We did not duplicate the entire araI1, as this would affect the overlapping -35 consensus sequence and make the promoter extremely leaky.

We group the modifications for both araI1 and araI2 as our second strategy.

Figure 4: Schematic representation of rational base-pair substitutions within the araI1-araI2 region.

Rationale 3: -35 and -10

Our final design input comes from the work of the 2013 DTU iGEM team (see here). They created a synthetic promoter library (SPL) for the araBAD promoter by randomly mutating base pairs between the -35 and -10 boxes, right downstream of araI2. We decided to use the Col15 sequence, which showed a promisingly low level of leakiness and high induced strength.

Combinatorial Design

We finally opted for a combinatorial approach to the three strategies, and thus initially designed 2^3 = 8 different P_BADs, each with or without the preceding optimizations. After some thought, we introduced another design with a slightly larger spacing between araO2 and araI1 (AP5). These designs were tested against the wild-type, full-length araBAD promoter, corresponding to part BBa_K2442101 in the Registry (see here). We were aware that while an individual strategy may yield noticeable improvements, this might not hold when strategies are combined. In fact, some combinations can be antagonistic rather than synergistic. Still, we hoped that some good designs would emerge among the different combinations. We also thought that, given this variety of promoter expression, we should not restrict our aim to a single better P_BAD, but should also build a “family” of promoters with different strengths (weak, medium and strong) for various purposes, since in some cases a lower maximal activity might be desirable.

| Identifier* | Rationale 1** | Rationale 2 | Rationale 3 | Part sequence |
| --- | --- | --- | --- | --- |
| AP1 | + | - | - | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctttttatcgcaactctctactgtttctccatacccg |
| AP2 | + | + | - | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgctttttatcgcaactctctactgtttctccatacccg |
| AP3 | + | - | + | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
| AP4 | + | + | + | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
| AP5 | +(78)*** | + | + | agaaaccaattgtccatattgcatcagacattgccgtcacattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
| AP6 | - | + | - | acattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgctttttatcgcaactctctactgtttctccatacccg |
| AP7 | - | + | + | acattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
| AP8 | - | - | + | acattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
| AP9 | - | - | - | acattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctttttatcgcaactctctactgtttctccatacccg |

* AP refers to the initials of the principal designer of the araBAD promoters.
** (-) in Rationale 1 refers to complete removal of both araO2 and the spacing, while leaving the CAP binding site in place.
*** AP5 has a spacing of 78 bp instead of the common 56 bp.

Table 1: Sequences and brief description of nine rationally designed P_BAD.
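The combinatorial design space can be enumerated programmatically. The following sketch is illustrative only: it lists every on/off combination of the three rationales and checks that each one corresponds to exactly one identifier in Table 1. AP5 is left out because it is an extra spacing variant rather than a ninth combination. The dictionary `TABLE_1_DESIGNS` simply transcribes the +/- columns of the table; the variable names are our own.

```python
from itertools import product

RATIONALES = ("spacing (R1)", "araI1/araI2 (R2)", "-35/-10 SPL (R3)")

# (+/-) flags per identifier, transcribed from Table 1 (True = "+", False = "-").
TABLE_1_DESIGNS = {
    (True, False, False): "AP1",
    (True, True, False): "AP2",
    (True, False, True): "AP3",
    (True, True, True): "AP4",
    (False, True, False): "AP6",
    (False, True, True): "AP7",
    (False, False, True): "AP8",
    (False, False, False): "AP9",
}

# All 2^3 = 8 combinations of the three rationales.
ALL_COMBINATIONS = list(product((True, False), repeat=len(RATIONALES)))


def describe(flags):
    """Render a combination as a '+ - +' style string."""
    return " ".join("+" if flag else "-" for flag in flags)


if __name__ == "__main__":
    for flags in ALL_COMBINATIONS:
        print(f"{TABLE_1_DESIGNS[flags]}: {describe(flags)}")
```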

Experiment

Overview

To test the designed araBAD promoters, we cloned the nine Level 0 promoter parts into Level 1 JUMP constructs [10] using a standard Golden Gate Assembly protocol with BsaI-HFv2. Downstream of the promoter lie a B0032 medium RBS, an mVenus reporter and a DT5 double terminator. Our rationale for choosing mVenus, a relatively weak YFP, rather than superfolder GFP is discussed on the Design page. A medium-strength RBS was chosen to avoid burden caused by accidental overexpression of mVenus. All TUs were put into a pJUMP27-1A destination vector. Its low copy number helps avoid phototoxicity and aggregate bodies and reduces noise. A list of construct IDs and their constituent promoters is given below:

| LVL1 ID | Promoter |
| --- | --- |
| PB1 | AP1 |
| PB2 | AP2 |
| PB3 | AP3 |
| PB4 | AP4 |
| PB5 | AP5 |
| PB6 | AP6 |
| PB7 | AP7 |
| PB8 | AP8 |
| PB9 | AP9 |
| PB10 | BBa_K2442101 |
| PBneg | Dummy control |

Table 2: Eleven Level-1 PB constructs and their corresponding constituent promoters.

Figure 5: Schematic representation (in SBOL) of Level 1 PB constructs.

Golden Gate assembly mixtures were chemically transformed into DH5α cells and selected on kanamycin plates. Colony PCR (cPCR) was carried out on promising colonies to screen for bands of the correct size. Colonies with successful assemblies, as indicated by cPCR, were grown in liquid culture for next-day miniprep. The retrieved plasmids, once verified by Sanger sequencing, were transformed into a Marionette-Wild strain by electroporation. We chose Marionette as the testing strain because it is highly optimised as an arabinose sensor (for further details on host-construct orthogonality, see the Design page) [2]. Colonies of Marionette cells (selected on kanamycin plates) were inoculated into 2 mL Neidhardt EZ Rich Defined Medium (EZRDM) and incubated with shaking for 20 hours at 37 °C and 250 rpm.

Figure 6: Schematic representation of mechanism of action for LVL1 testing circuits in Marionette strain.

We conducted two experiments to validate the nine P_BADs against the wild-type counterpart and a negative control. The negative control (termed PBneg) was a dummy construct with similar length, containing mVenus CDS, but no araBAD promoters. Both the wild-type (PB10) and the PBneg constructs were cloned using the method described above.

Both experiments involved measuring the mVenus fluorescence intensity (FI) of transformed Marionette cells grown under varying arabinose concentrations. FI was measured using a ClarioStar plate reader, on Greiner F-bottom 96-well plates or standard Greiner 384-well plates.

Two-level Factorial Screen

We first investigated the leakiness and maximal strength of the nine designed P_BADs. We chose arabinose concentrations of 0 µM and 1000 µM as the testing conditions. For now, we do not take Hill dynamics into account and focus solely on the two concentration endpoints that best showcase these two characteristics. We also calculated the fold-change (or fold-induction), defined as the ratio of maximal expression to leakiness:

fold change = Max expression/Leakiness
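As a minimal sketch of this calculation, the snippet below computes the fold-change from replicate fluorescence readings. It assumes that replicates are averaged before the ratio is taken and that no background (for example PBneg) is subtracted; neither choice is stated in the text. The function name and the example readings are hypothetical placeholders, not measured data.

```python
from statistics import mean


def fold_change(induced_fi, uninduced_fi):
    """Ratio of mean maximal expression to mean leakiness.

    induced_fi:   replicate FI readings at the inducing concentration
    uninduced_fi: replicate FI readings without arabinose
    """
    leakiness = mean(uninduced_fi)
    if leakiness <= 0:
        raise ValueError("leakiness must be positive to compute a fold-change")
    return mean(induced_fi) / leakiness


if __name__ == "__main__":
    # Hypothetical readings (a.u.) for four replicates, for illustration only.
    induced = [5200.0, 5100.0, 4900.0, 4800.0]
    uninduced = [260.0, 240.0, 250.0, 250.0]
    print(f"fold change = {fold_change(induced, uninduced):.1f}")
```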

Each sample was measured in four replicates. The plate reader settings are shown below. The data are plotted in Figures 8, 9 and 10.

Figure 7: ClarioStar plate reader setting for fluorescence intensity measurement, using Greiner 96 F-bottom plate.


Figure 8: Time-dependent FI measurement of all PB constructs at 0 µM and 1000 µM arabinose concentrations.



Figure 9: Time-dependent OD600 measurement showing cell growth of transformed Marionettes at 0 µM and 1000 µM arabinose concentrations.



Figure 10: Performances (leakiness, maximal expression and fold-change) of Marionettes with different PB constructs, measured by fluorescence intensity (a.u.). The scale of the circles indicates the degree of fold-change.

Hill Induction Assay

We also hoped to characterise the classical parameters of inducible systems, namely the dissociation constant KD and the Hill coefficient n. We therefore carried out a combinatorial induction assay at arabinose concentrations of 0, 10, 25, 50, 75, 100, 125, 200, 500, 1000 and 2000 µM. For this assay, we used a 384-well plate so that all constructs could be tested together. We were very fortunate to have an Opentrons OT-2 liquid-handling robot in the lab. The robot allows fast, automated plate preparation, especially when the number of wells and testing conditions exceeds manual pipetting capacity, as in this case. The plate layout is generated before the experiment and then executed on the robot through a Jupyter Notebook pipeline written by our instructor, Camillo Moschner. We also took this opportunity to be trained in operating liquid-handling automation [11].
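To give a feel for why automation helps here, the sketch below generates one possible 384-well layout for this assay in plain Python. It is not the notebook used in the lab, which is not reproduced on this page. The number of replicates (three), the row-major filling order and all function names are assumptions made purely for illustration; only the construct IDs and concentrations come from the text.

```python
from itertools import product
from string import ascii_uppercase

CONSTRUCTS = [f"PB{i}" for i in range(1, 11)] + ["PBneg"]
CONCENTRATIONS_UM = [0, 10, 25, 50, 75, 100, 125, 200, 500, 1000, 2000]
REPLICATES = 3  # assumption: the replicate number for this assay is not stated


def wells_384():
    """Well names of a 384-well plate (rows A-P, columns 1-24), row-major."""
    return [f"{row}{col}" for row in ascii_uppercase[:16] for col in range(1, 25)]


def build_layout(replicates=REPLICATES):
    """Map well name -> (construct, arabinose in uM, replicate number)."""
    conditions = [
        (construct, conc, rep)
        for construct, conc in product(CONSTRUCTS, CONCENTRATIONS_UM)
        for rep in range(1, replicates + 1)
    ]
    wells = wells_384()
    if len(conditions) > len(wells):
        raise ValueError(f"{len(conditions)} conditions do not fit on a 384-well plate")
    return dict(zip(wells, conditions))


if __name__ == "__main__":
    layout = build_layout()
    print(f"{len(layout)} of 384 wells used")
    print("A1 ->", layout["A1"])
```

With eleven constructs and eleven concentrations, three replicates already occupy most of the plate, whereas four would not fit.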

Figure 11: Front view of the Opentrons OT-2 work station, showing the automated pipette arm, tips and tube holders.

The plate reader setting for Greiner 384 is outlined below.

Figure 12: Setting of ClarioStar plate reader for Fluorescence Intensity measurement - Gain 1500.
Figure 13: Setting of ClarioStar plate reader for absorbance (OD600) measurement


Figure 14: Time-dependent FI measurement of all PB constructs under varying arabinose concentrations.

Raw data are plotted in Figures 14 and 15.



Figure 15: Time-dependent OD600 measurement showing cell growth of transformed Marionettes under varying arabinose concentrations.

We selected PB1, PB2, PB3, PB8, PB9 and PB10 and used the template equation given in Meyer et al (2019) for least-squares regression [2]:



$$y = y_{min} + \left(y_{max}-y_{min}\right)\frac{x^n}{K^n + x^n}$$



where y is the relative promoter unit, y_min is the leakiness, y_max is the maximal expression, x is the concentration of arabinose, n is the Hill coefficient and K is the dissociation constant. An example plot of the fitted curve for PB10 is given in Figure 16, and the six fitted functions are collected in a single plot in Figure 17.
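To illustrate how the fitted parameters shape the response, the sketch below evaluates the equation above in normalised form (y_min = 0, y_max = 1) using the K and n values reported in Table 3. It does not reproduce the regression itself; the function names are our own, and the normalisation is an assumption made so that curves with different expression levels can be compared.

```python
def hill(x, y_min, y_max, K, n):
    """Hill function: y = y_min + (y_max - y_min) * x^n / (K^n + x^n)."""
    return y_min + (y_max - y_min) * x**n / (K**n + x**n)


def fraction_induced(x, K, n):
    """Normalised response between 0 (uninduced) and 1 (fully induced)."""
    return hill(x, 0.0, 1.0, K, n)


# Fitted (K in uM, n) pairs from Table 3.
FITTED = {
    "AP1": (29.6, 0.54),
    "AP2": (31.1, 0.56),
    "AP3": (32.9, 0.54),
    "AP8": (64.5, 0.63),
    "AP9": (140.0, 0.67),
    "wild-type (PB10)": (179.0, 1.25),
}

if __name__ == "__main__":
    for name, (K, n) in FITTED.items():
        at_10 = fraction_induced(10.0, K, n)
        at_1000 = fraction_induced(1000.0, K, n)
        print(f"{name}: {at_10:.2f} of maximum at 10 uM, {at_1000:.2f} at 1000 uM")
```

By construction, each curve reaches half of its maximum at x = K, so a smaller fitted K means the promoter responds at lower arabinose concentrations.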



Figure 16: Fitted Hill’s function of PB10 (containing wild-type P_BAD) under varying arabinose concentrations.



Figure 17: Fitted Hill’s functions of selected PB constructs (PB1, PB2, PB3, PB8, PB9 and PB10) under varying arabinose concentrations.

Discussion and Outlook

Results from the screen (Figure 10) show that the PB1, PB2, PB3 and PB8 constructs (corresponding to promoters AP1, AP2, AP3 and AP8) gave higher maximal expression while maintaining the same level of leakiness as PB10 (wild-type P_BAD). AP2 and AP3 are definite contenders for significant improvement, being only half the length of their wild-type counterpart. We do not count AP8 as an improvement in our project, as this single rationale is fully credited to the 2013 DTU team. Still, it is worth noting that in their characterization the SPL (-35)-to-(-10) segment was incorporated into the wild-type length. Here, we put the optimised segment into the truncated 130 bp length, which also showed a predictably high fold-change. Table 3 summarises the fitted parameters KD and n of the six selected constructs. Note that these fitted values should be treated as relative: they may vary between testing strains, construct constituents and experimental conditions.

| Promoter | KD (fitted) (µM) | n (fitted) |
| --- | --- | --- |
| AP1 | 29.6 | 0.54 |
| AP2 | 31.1 | 0.56 |
| AP3 | 32.9 | 0.54 |
| AP8 | 64.5 | 0.63 |
| AP9 | 140 | 0.67 |
| Wild-type (PB10) | 179 | 1.25 |

Table 3: Fitted parameters obtained from least-squares curve fitting of six selected constructs.

We had slight issues with the induction assay. The experimental design did not cover the arabinose concentration range between 0 and 10 µM, so for PB5, PB6 and PB7 we could not investigate the kinetics within this range. Our preliminary fitted curves for these constructs (not shown) indicated a very steep, ‘all-or-nothing’ behaviour, with maximal expression starting to plateau at concentrations as low as 10 µM. We hypothesised two possible explanations for these observations:

More careful testing must be done to verify these scenarios.

Perhaps not surprisingly, AP4 (containing all three optimizations) did not live up to expectations. We hypothesised that the modifications in the araI sites interfered with the SPL segment, severely reducing the maximal strength. We observed the same phenomenon between AP7 (- + +) and AP8 (- - +). In the time-dependent FI curves, cells with the PB4 construct (containing AP4) consistently showed a strong fluorescence burst followed by a quick drop. Furthermore, beyond 10 µM, PB4 behaved similarly to the three constructs above in that FI was relatively unchanged (possibly having already reached its maximum). Looking at the OD600 graphs, we noted that PB4 cell viability is comparable to the rest, with no sign of impeded growth. We hypothesised that the FI drop could be due to an intrinsic mechanism that degrades the mVenus protein or somehow interferes with transcription. It is also possible that the overengineered design causes a ‘congestion’ of araC at the araI1-araO2 site, which permanently blocks transcription. Though PB4 cannot serve as an improved part, future investigations could generate very interesting knowledge about the mechanism underlying the emergent behaviour of this P_BAD.

Moreover, AP5, which is similar to AP4 but instead with a 78 bp spacing, showed some promise for its towering maximal strength, but is unfortunately devalued by the higher leakiness compared to wild-type. It is interesting to note that AP5 did not suffer the burst-effect like AP4 did; therefore, it can be reasonably assumed that the spacing of 56 bp did have a drastic influence on AP4’s performance.

We are optimistic that AP9, despite lacking all three optimizations and thus being the most truncated, can be of use in certain circumstances where an inducible promoter with low expression is favoured. We therefore also made the sequence available on the Registry.

We envision that the significant reduction in the promoter’s size can be of economic advantage for future circuit designs. At less than 150 bp, the promoter can be synthesised by oligo-annealing, which is significantly faster and cheaper than ordering DNA parts. This could also greatly benefit future iGEM teams from regions where DNA synthesis and delivery are often slow, allowing them to speed up project timelines.

We still hope to take on a continuation of this side project to fully achieve our initial goal of making a minimised araBAD promoter that has both lower leakiness and higher fold-change.

References

  1. Aoki, S. K., Lillacci, G., Gupta, A., Baumschlager, A., Schweingruber, D., & Khammash, M. (2019, June 19). A universal biomolecular integral feedback controller for robust perfect adaptation. Nature.
  2. Meyer, A. J., Segall-Shapiro, T. H., Glassey, E., Zhang, J., & Voigt, C. A. (2018, November 26). Escherichia coli "Marionette" strains with 12 highly optimized small-molecule sensors. Nature Chemical Biology.
  3. Moschner, C., Wedd, C., & Bakshi, S. (2022). The context matrix: Navigating biological complexity for advanced biodesign. Frontiers in Bioengineering and Biotechnology, 10. doi:10.3389/fbioe.2022.954707
  4. Niland, P., Hühne, R., & Müller-Hill, B. (1996). How AraC interacts specifically with its target DNAs. Journal of Molecular Biology.
  5. Zhang, X., Reeder, T., & Schleif, R. (1996). Transcription activation parameters at ara pBAD. Journal of Molecular Biology.
  6. Dunn, T. M., Hahn, S., Ogden, S., & Schleif, R. F. (1984). An operator at -280 base pairs that is required for repression of araBAD operon promoter: Addition of DNA helical turns between the operator and promoter cyclically hinders repression. Proceedings of the National Academy of Sciences, 81(16), 5017–5020. https://doi.org/10.1073/pnas.81.16.5017
  7. Lee, D.-H., & Schleif, R. (1989). In vivo DNA loops in araCBAD: Size limits and helical repeat. Proceedings of the National Academy of Sciences, 86(2), 476–480. https://www.pnas.org/doi/10.1073/pnas.86.2.476
  8. Lee, N., Francklyn, C., & Hamilton, E. P. (1987). Arabinose-induced binding of AraC protein to araI2 activates the araBAD operon promoter. Proceedings of the National Academy of Sciences, 84(24), 8814–8818. https://doi.org/10.1073/pnas.84.24.8814
  9. Reeder, T., & Schleif, R. (1993). AraC protein can activate transcription from only one position and when pointed in only one direction. Journal of Molecular Biology.
  10. Valenzuela-Ortega, M., & French, C. (2021, February 2). Joint Universal Modular Plasmids (JUMP): A flexible vector platform for synthetic biology. Synthetic Biology.
  11. Moschner, C., Wedd, C., Hardo, G., & Bakshi, S. (2022). The iBioFoundry: Automated, Low-Cost, High-Throughput Experimentation (IWBDA manuscript accepted).