We successfully engineered two improved versions of the araBAD promoter, which form part of a new collection of pBAD_APs. All the designed pBADs are truncated to fewer than 160 base pairs, and some even outperform the 300 bp wild-type counterpart. We thus qualify for the Gold criterion - Part Improvement.
Originally, we had hoped to reuse several components of the RPA circuit from Aoki et al. (2019) in our design. The process species was chosen to be AraC, which controls the araBAD promoter (dubbed P_BAD) and is inducible by arabinose [1]. However, we were concerned that the common wild-type P_BAD is relatively large, at around 300 bp, which increases the bulkiness of our LVL2 construct and reduces efficiency. In addition, in an interview (see Human Practice), Dr. Aoki shared that one caveat of her previous circuit design involved the araC-P_BAD system. P_BAD is by nature a relatively leaky, medium-strength promoter, so more araC is needed to reach maximal activation. However, she emphasised that overexpression of araC is toxic to cells and can thus drastically impede the performance of the circuit.
We were faced with two options: choose an alternative regulatory system, or find a superior version of the araBAD promoter in the literature. Our initial hope was to pursue both paths and use one as a back-up. Through part-mining, we saw that Meyer et al. (2019) had designed several different P_BADs. One of these, called P_BADmin, is significantly shorter and shows lower leakiness than wild-type P_BAD thanks to the rational macro-deletion of an upstream segment, but its maximal activity is lower by more than an order of magnitude [2]. After considering both this hard trade-off and the host-construct non-orthogonality (interference with the endogenous arabinose operon) [3], we concluded that this inducible system would not be suitable for a circuit implemented in E. coli strain DH5alpha, and switched to the orthogonal CinR-pCinR pair.
Still, as planned, we thought that redesigning the existing wild-type promoter would be an interesting side project for Part Improvement. Our initial goal was to engineer a P_BAD of minimal length that shows both lower leakiness and higher maximal activity (ambitious, I know!).
To achieve this, we carried out an intensive literature search for prior optimization strategies. To understand their rationales, we also gathered information on the regulatory mechanism of the promoter, which appeared to be prevalent only in papers from the 1970s. A brief explanation of the mechanism is given below.
In the end, we managed to create two araBAD promoters with significant improvements compared to the wild-type version. These promoters are advantageous in that they:
Information on the pBAD collection is stored in the BBa_K4491007 entry. A brief recap is also documented on the existing Part page, BBa_K2442101.
The exact definition of the araBAD promoter varies ambiguously between sources. Historically, P_BAD only refers to a short segment upstream of the +1 transcription start site (referred to as the core promoter), containing the -35 and -10 boxes [4]. However, as a complete promoter Part, the regulatory region further upstream is also included. In the following discussion, our definition of P_BAD refers to the whole sequence consisting of both the regulatory sequence and the core promoter.
The araBAD promoter regulates the araBAD operon and is controlled by araC - a regulatory protein known for its “love-hate” mechanism of action. The upstream regulatory region consists of various protein binding sites - araI1, araI2, araO1 and araO2 can be occupied by araC. Between araI1 and araO1 also lies a CAP binding site, which recruits the cAMP receptor protein (CRP) for transcription activation [5]. The spacing between araO1 and O2 is noticeably large, reaching nearly 210 bp, which, unsurprisingly, is responsible for the overall bulkiness of the promoter. Interestingly, a region within this spacer contains the promoter of the araC gene, which runs in the opposite direction to the araBAD operon. This promoter is regulated by araO1 just downstream. The araC protein therefore not only regulates the araBAD operon but also controls its own production by binding to the araO1 region (negative autoregulation). However, our focus will be solely on the regulation of the P_BAD promoter.
In the absence of L-arabinose, transcription is repressed by the action of two araC molecules binding to the structure. One araC binds to araI1 and the other binds to araO2 further upstream. The dimerization domain confers high affinity, bringing the two protein molecules closer to dimerize, essentially creating a DNA loop as a result. This looping mechanism prevents any sigma factors, RNA Polymerase or CRP from being recruited, thus repressing transcription.
In the presence of L-arabinose, binding of the sugar molecule to the arabinose-binding domain triggers a conformational change in the DNA-binding domain of araC, which reduces its affinity for the distal araO2 site. This makes binding to araI2, just downstream of araI1, more favourable. The dimerized complex therefore forms at the araI1-araI2 site instead of araO2-araI1, which breaks the loop and hence activates transcription. The araC dimer also “nudges” the RNA polymerase for enhanced transcription.
Understanding the overall mechanism of araBAD promoter allowed us to pinpoint several aspects of the structure that can be rationally improved.
To minimise the length of P_BAD, we investigated the role of the araO2 site, as well as the 211 bp spacer between araI1 and araO2 (this region includes the CAP binding site and araO1). Previous literature suggested that complete or even partial deletion of araO2 would increase leaky expression from P_BAD, emphasising that this region is essential for normal repression [6]. Strangely, most commercially available P_BAD sequences (in the CIDAR MoClo kit, for example) omit the araO2 site, so its significance is still uncertain.
The spacing dictates the size of the loop, which in turn controls the level of repression in the absence of arabinose and determines the promoter’s leakiness. Here, the spacing is defined as the distance between position -59 within araI1 and position -270 within araO2 (underlined and asterisked). It was shown that there is no lower bound for loop size - a functional araBAD promoter was designed with a 34-bp loop, after deleting a significant portion of the spacer [7]. While it maintained leakiness similar to that of the wild-type counterpart, the loss of the CAP binding site drastically reduced the maximal strength. Still, the finding demonstrated the great flexibility of bulky araC proteins in mediating small loop formation.
Another important observation was that, as the spacing varied, the promoter activity oscillated with an 11.1 bp periodicity. Specifically, any insertion or deletion of an odd multiple of 5 bp noticeably increased leakiness, while integer multiples of 11.1 bp retained the wild type’s full repression [8]. This value was determined to be approximately equivalent to one helical repeat of DNA (10.5 bp). It was further explained that inserting 5 bp between araO2 and araI1 rotates one site roughly halfway around the DNA double helix with respect to the other and impedes repression. Despite araC’s flexibility, the torsional stress of DNA makes such looping much more energetically unfavourable.
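The phasing argument above can be made concrete with a small calculation. The following is only an illustrative sketch: it assumes that the rotation of one site relative to the other scales linearly with the number of inserted or deleted base pairs, using the 11.1 bp periodicity quoted in the text. The function names are our own and do not come from the cited papers.

```python
# Helical repeat (bp per turn) inferred from the 11.1 bp periodicity in the text.
HELICAL_REPEAT_BP = 11.1


def relative_rotation_deg(indel_bp, repeat_bp=HELICAL_REPEAT_BP):
    """Rotation (0-360 degrees) of one site relative to the other after
    inserting (+) or deleting (-) indel_bp base pairs between them."""
    return (indel_bp / repeat_bp * 360.0) % 360.0


def phase_offset_deg(indel_bp, repeat_bp=HELICAL_REPEAT_BP):
    """Smallest angular distance (0-180 degrees) from the original helix face."""
    angle = relative_rotation_deg(indel_bp, repeat_bp)
    return min(angle, 360.0 - angle)


# Compare a half-turn change (5 bp) with near-integer numbers of turns.
for indel in (5, 11, -11, 22, 33):
    print(f"{indel:+4d} bp -> {phase_offset_deg(indel):6.1f} degrees out of phase")
```

Under this simplification, a 5 bp change leaves the two sites about 162 degrees out of phase (close to opposite faces of the helix), whereas changes near whole multiples of 11.1 bp keep them almost in phase.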
Taking these results into account, our first design strategy is to reduce the spacer region to 56 bp while still maintaining the CAP binding site downstream. We removed the araO1 site completely because of its minor role in P_BAD activity. The schematic of our rationale is depicted below.
We then investigated the 17-bp araI1 and araI2 regions, both of which contain two unique sites called the A-box and B-box that serve as specific binding sites for araC. Niland et al. (1996) showed that any single base-pair substitution in these two boxes drastically reduces binding of araC. Between the two boxes lie seven invariant nucleotides that, upon selected single substitutions, demonstrated higher binding affinity for araC by about 140% compared to that of wild-type araI1 [4]. We therefore modified the araI1 site to contain all the substitutions that individually yielded tighter binding, while keeping the A- and B-boxes unchanged.
In another paper, Reeder (1993) found that the B-box of araI2 overlaps with four base pairs of the -35 consensus sequence [9]. Thus, any substitution in this box will negatively impact P_BAD’s activity, resulting in either very high leaky expression or lowered inducibility. However, the author remarked that araI1 has a much higher affinity for araC than araI2 does, especially when no arabinose is present. This makes sense, as araC prefers binding to the distal araO2 and araI1 rather than the nearby araI2. From this insight, we asked whether a duplicated araI1-I1 might confer higher maximal activity than wild-type araI1-I2. We therefore changed the last nucleotide of the A-box and the interbox sequence of araI2 to those of wild-type araI1, while leaving the araI2 B-box untouched. In a sense, we created a chimeric half-araI1-half-araI2 in place of wild-type araI2. We did not duplicate the entire araI1, as this would affect the overlapping -35 consensus sequence and make the promoter extremely leaky.
We group the modifications for both araI1 and araI2 as our second strategy.
Our final design input comes from the work of the 2013 DTU iGEM team (see here). They created a synthetic promoter library (SPL) for the araBAD promoter by randomly mutating base pairs between the -35 and -10 boxes, right downstream of araI2. We decided to use the Col15 sequence, which showed a promisingly low level of leakiness and high induced strength.
We finally opted for a combinatorial approach to the three strategies, and thus initially designed 2³ = 8 different P_BADs, each with or without the preceding optimizations. After some thought, we introduced another design with a slightly larger spacing between araO2 and araI1 (AP5). These designs were tested against the wild-type, full-length araBAD promoter, corresponding to part BBa_K2442101 in the Registry (see here). We were aware that while an individual strategy may yield noticeable improvements, this might not hold when the strategies are combined. In fact, some combinations can produce antagonistic effects rather than the desired synergy. Still, we hoped that some good designs would emerge from the different combinations. We also thought that, given this variety of promoter expression, we should not restrict our aim to creating a single better P_BAD, but should also build a “family” of promoters with different strengths, such as weak, medium and strong, for various purposes (in some cases, lower maximal activity might be necessary).
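The combinatorial design space described above can be enumerated in a few lines. This is a minimal sketch: the strategy labels are hypothetical shorthand for the three rationales, and the order in which the combinations are printed does not correspond to the AP numbering used in Table 1.

```python
from itertools import product

# Hypothetical shorthand names for the three design strategies in the text.
STRATEGIES = ("truncated_spacer", "araI_modifications", "spl_core")

# Every on/off combination of the three strategies: 2**3 = 8 designs.
designs = [
    dict(zip(STRATEGIES, flags))
    for flags in product((True, False), repeat=len(STRATEGIES))
]

for design in designs:
    pattern = " ".join("+" if design[name] else "-" for name in STRATEGIES)
    print(pattern)
```

The ninth design (AP5) is not part of this enumeration, since it changes the spacer length rather than toggling a strategy on or off.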
Identifier* | Rationale 1** | Rationale 2 | Rationale 3 | Part sequences |
---|---|---|---|---|
AP1 | + | - | - | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctttttatcgcaactctctactgtttctccatacccg |
AP2 | + | + | - | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgctttttatcgcaactctctactgtttctccatacccg |
AP3 | + | - | + | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
AP4 | + | + | + | agaaaccaattgtccataattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
AP5 | +(78)*** | + | + | agaaaccaattgtccatattgcatcagacattgccgtcacattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
AP6 | - | + | - | acattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgctttttatcgcaactctctactgtttctccatacccg |
AP7 | - | + | + | acattgattatttgcacggcgtcacactttgctatgccatagcaagatagtccataagattagcgtttttatcctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
AP8 | - | - | + | acattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgtgcgcctgccgtccaaagtaatatccttacatacccg |
AP9 | - | - | - | acattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctttttatcgcaactctctactgtttctccatacccg |
* AP refers to the initials of the principal designer of the araBAD promoters.
** (-) in Rationale 1 refers to complete removal of both araO2 and the spacing, but still leaves the CAP binding site.
*** AP5 has a spacing of 78 bp instead of the common 56 bp.
Table 1: Sequences and brief descriptions of the nine rationally designed P_BADs.
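When copying sequences out of Table 1, it can help to strip stray whitespace and sanity-check the result before ordering or annealing oligos. The snippet below is a small sketch using the AP9 sequence from the table. The helper names `clean` and `gc_fraction` are our own, and the checks (alphabet, length, GC content) are illustrative rather than part of the original workflow.

```python
def clean(seq):
    """Remove whitespace left over from line wrapping and lowercase the sequence."""
    return "".join(seq.split()).lower()


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = clean(seq)
    return (seq.count("g") + seq.count("c")) / len(seq)


# AP9 from Table 1, deliberately split over two lines to show the clean-up.
AP9 = clean("""
    acattgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctttttatc
    gcaactctctactgtttctccatacccg
""")

# Only the four DNA letters should remain after cleaning.
assert set(AP9) <= set("acgt")

print(f"AP9: {len(AP9)} bp, GC fraction {gc_fraction(AP9):.2f}")
```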
To test the designed araBAD promoters, we cloned the nine respective Level 0 promoter parts into Level 1 JUMP constructs [10], using a standard Golden Gate Assembly protocol with BsaI-HFv2. Downstream of the promoter lie a B0032 medium-strength RBS, an mVenus reporter and a DT5 double terminator. Our rationale for choosing mVenus (a relatively weak YFP) rather than superfolder GFP is discussed on the Design page. A medium-strength RBS was chosen to avoid the burden caused by accidental overexpression of mVenus. All TUs were put into a pJUMP27-1A destination vector. Its low copy number helps avoid phototoxicity and aggregate bodies, as well as reducing noise. A list of construct IDs and their respective constituent promoters is given below:
LVL1 ID | Promoter |
---|---|
PB1 | AP1 |
PB2 | AP2 |
PB3 | AP3 |
PB4 | AP4 |
PB5 | AP5 |
PB6 | AP6 |
PB7 | AP7 |
PB8 | AP8 |
PB9 | AP9 |
PB10 | BBa_K2442101 |
PBneg | Dummy control |
Table 2: Eleven Level-1 PB constructs and their corresponding constituent promoters.
Golden Gate-assembled mixtures were chemically transformed into DH5α cells for selection on kanamycin plates. Colony PCR (cPCR) was carried out on promising colonies to screen for bands of the correct size. Colonies with successful assemblies, as indicated by cPCR, were grown in liquid culture for next-day miniprep. The recovered plasmids, once verified by Sanger sequencing, were then transformed into a Marionette-Wild strain by electroporation. We chose Marionette as the testing strain because it is highly optimised as an arabinose sensor [2] (for further details on host-construct orthogonality, see the Design page). Colonies of Marionette cells (selected on kanamycin plates) were inoculated into 2 mL Neidhardt EZ Rich Defined Medium (EZRDM) and incubated with shaking overnight for 20 hours, at 37 °C and 250 rpm.
We conducted two experiments to validate the nine P_BADs against the wild-type counterpart and a negative control. The negative control (termed PBneg) was a dummy construct with similar length, containing mVenus CDS, but no araBAD promoters. Both the wild-type (PB10) and the PBneg constructs were cloned using the method described above.
Both experiments involved measuring the mVenus fluorescence intensity (FI) of transformed Marionette cells grown under varying arabinose concentrations. FI was measured using a ClarioStar plate reader, on Greiner F-bottom 96-well plates or standard Greiner 384-well plates.
We first investigated the leakiness and maximal strength of the designed P_BADs. We chose arabinose concentrations of 0 μM and 1000 μM as the testing conditions. For now, we do not take Hill dynamics into account, and focus solely on the two concentration endpoints that best showcase these two characteristics of the promoters. We also calculated the fold-change (or fold-induction), defined as the ratio of maximal expression to leakiness:
$$\text{fold change} = \frac{\text{maximal expression}}{\text{leakiness}}$$
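As a worked illustration of this ratio, the sketch below averages replicate wells for the induced and uninduced conditions before dividing. The fluorescence values are made up for the example and are not our measurements. The optional `blank` background subtraction is an assumption of this sketch, not a step described in the text.

```python
from statistics import mean


def fold_change(induced, uninduced, blank=0.0):
    """Mean induced signal divided by mean uninduced signal (leakiness).

    `blank` is an optional background value subtracted from both means;
    it defaults to 0, i.e. no background correction.
    """
    leakiness = mean(uninduced) - blank
    if leakiness <= 0:
        raise ValueError("leakiness must be positive after background subtraction")
    return (mean(induced) - blank) / leakiness


# Made-up fluorescence readings (a.u.) for four replicate wells per condition.
uninduced = [210.0, 190.0, 205.0, 195.0]    # 0 uM arabinose
induced = [4100.0, 3900.0, 4050.0, 3950.0]  # 1000 uM arabinose

print(f"fold change = {fold_change(induced, uninduced):.1f}")
```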
Each sample was measured in four replicates. The plate reader settings are given below. The raw data are plotted in Figures 8, 9 and 10.
Figure 8: Time-dependent FI measurement of all PB constructs at 0 μM and 1000 μM arabinose concentrations.
Figure 9: Time-dependent OD600 measurement showing cell growth of transformed Marionettes at 0μM and 1000μM arabinose concentrations.
Figure 10: Performances (leakiness, maximal expression and fold-change) of Marionettes with different PB constructs, measured by fluorescence intensity (a.u.). The scale of the circles indicates the degree of fold-change.
We also hoped to characterise the classical parameters of inducible systems, namely the dissociation constant KD and the Hill coefficient n. Therefore, we carried out a combinatorial induction assay with arabinose concentrations of 0, 10, 25, 50, 75, 100, 125, 200, 500, 1000 and 2000 μM. For this test, we used a 384-well plate so that all constructs could be tested together. We were very fortunate to have an Opentrons OT-2 liquid-handling robot in the lab. The robot allows fast and automated plate preparation, especially when the number of wells and testing conditions exceeds manual pipetting capacity (as in this case). Before each experiment, the plate layout is generated and passed to the robot through a Jupyter Notebook pipeline set up by our instructor, Camillo Moschner. We also took this opportunity to be trained in operating liquid-handling automation [11].
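To give a feel for how such a layout can be generated programmatically, here is a minimal sketch that assigns every construct and concentration combination to wells of a 384-well plate. It is not the notebook used in our lab and does not call the Opentrons API. The replicate count of three and the simple row-by-row filling order are assumptions made only for this example.

```python
from itertools import product
from string import ascii_uppercase

ROWS = ascii_uppercase[:16]   # rows A-P of a 384-well plate
COLS = range(1, 25)           # columns 1-24
WELLS = [f"{row}{col}" for row in ROWS for col in COLS]  # row-major order

CONSTRUCTS = [f"PB{i}" for i in range(1, 11)] + ["PBneg"]
ARABINOSE_UM = [0, 10, 25, 50, 75, 100, 125, 200, 500, 1000, 2000]
REPLICATES = 3  # assumption for illustration; not stated for this assay


def build_layout(constructs, concentrations, replicates, wells=WELLS):
    """Map well names to (construct, arabinose concentration, replicate)."""
    conditions = [
        (construct, conc, rep)
        for construct, conc in product(constructs, concentrations)
        for rep in range(1, replicates + 1)
    ]
    if len(conditions) > len(wells):
        raise ValueError(
            f"{len(conditions)} conditions do not fit into {len(wells)} wells"
        )
    return dict(zip(wells, conditions))


layout = build_layout(CONSTRUCTS, ARABINOSE_UM, REPLICATES)
print(len(layout), "wells used; A1 ->", layout["A1"])
```

A real layout would often randomise well positions or avoid edge wells to limit evaporation effects. This sketch fills the plate in order purely for readability.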
The plate reader settings for the Greiner 384-well plate are outlined below.
Figure 14: Time-dependent FI measurement of all PB constructs under varying arabinose concentrations.
Raw data are plotted in Figures 14 and 15.
Figure 15: Time-dependent OD600 measurement showing cell growth of transformed Marionettes under varying arabinose concentrations.
We selected PB1, PB2, PB3, PB8, PB9 and PB10 and fitted them by least-squares regression, using the template equation given in Meyer et al. (2019) [2]:
$$y = y_{min} + \left(y_{max}-y_{min}\right)\frac{x^n}{K^n + x^n}$$
where y is the relative promoter unit, y_min is the leakiness, y_max is the maximal expression, x is the concentration of arabinose, n is the Hill coefficient and K is the dissociation constant (KD). An example plot of the fitted curve for PB10 is given in Figure 16, and the six fitted functions are collected in a single plot in Figure 17.
Figure 16: Fitted Hill’s function of PB10 (containing wild-type P_BAD) under varying arabinose concentrations.
Figure 17: Fitted Hill’s functions of selected PB constructs (PB1, PB2, PB3, PB8, PB9 and PB10) under varying arabinose concentrations.
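The Hill-function fit described above can be sketched without any external libraries. This is not the fitting code used for our figures: it swaps in a simple grid search over K and n, solving y_min and y_max exactly at each grid point (the model is linear in those two parameters), and it is run here on synthetic, noise-free data generated from made-up parameters. In practice a non-linear least-squares routine from a scientific library would be the more usual choice.

```python
def hill(x, y_min, y_max, K, n):
    """Hill function from the text: y_min + (y_max - y_min) * x^n / (K^n + x^n)."""
    return y_min + (y_max - y_min) * x**n / (K**n + x**n)


def fit_hill(xs, ys):
    """Least-squares fit by grid search over (K, n).

    For fixed K and n the model is y_min * (1 - h) + y_max * h, which is
    linear in y_min and y_max, so both are obtained from 2x2 normal equations.
    """
    K_grid = [10 ** (i / 50) for i in range(201)]   # 1 to 10^4 uM, log-spaced
    n_grid = [0.30 + 0.05 * j for j in range(55)]   # 0.30 to 3.00
    best = None
    for K in K_grid:
        for n in n_grid:
            h = [x**n / (K**n + x**n) for x in xs]
            a = [1.0 - hi for hi in h]
            saa = sum(ai * ai for ai in a)
            sab = sum(ai * hi for ai, hi in zip(a, h))
            sbb = sum(hi * hi for hi in h)
            say = sum(ai * y for ai, y in zip(a, ys))
            sby = sum(hi * y for hi, y in zip(h, ys))
            det = saa * sbb - sab * sab
            if abs(det) < 1e-12:
                continue  # degenerate grid point, skip
            y_min = (say * sbb - sab * sby) / det
            y_max = (saa * sby - sab * say) / det
            sse = sum(
                (hill(x, y_min, y_max, K, n) - y) ** 2 for x, y in zip(xs, ys)
            )
            if best is None or sse < best[0]:
                best = (sse, y_min, y_max, K, n)
    sse, y_min, y_max, K, n = best
    return {"y_min": y_min, "y_max": y_max, "K": K, "n": n, "sse": sse}


# Arabinose concentrations (uM) used in the induction assay.
xs = [0, 10, 25, 50, 75, 100, 125, 200, 500, 1000, 2000]
# Synthetic data from made-up parameters (not measured values).
ys = [hill(x, 100.0, 5000.0, 50.0, 1.0) for x in xs]

fit = fit_hill(xs, ys)
print({key: round(value, 2) for key, value in fit.items()})
```

The grid resolution limits how precisely K and n are recovered, so the result is best treated as a starting point for a finer optimisation rather than a final estimate.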
Results from the screen (Figure 10) show that the PB1, PB2, PB3 and PB8 constructs (corresponding to promoters AP1, AP2, AP3 and AP8) gave higher maximal expression while maintaining the same level of leakiness as PB10 (wild-type P_BAD). AP2 and AP3 are definite contenders for significant improvements, being only half the length of their wild-type counterpart. We do not count AP8 as an improvement in our project, as this single rationale is fully credited to the 2013 DTU team. Still, it is worth noting that in their previous characterization, the SPL (-35)-to-(-10) segment was incorporated into the wild-type length. Here, we put the optimised segment into the truncated 130 bp length, which also showed a predictably high fold-change. Table 3 summarises the fitted parameters KD and n of the six selected constructs. Note that these fitted values should be treated as relative - they may vary between different testing strains, construct constituents and experimental conditions.
Promoter | KD (fitted) (μM) | n (fitted) |
---|---|---|
AP1 | 29.6 | 0.54 |
AP2 | 31.1 | 0.56 |
AP3 | 32.9 | 0.54 |
AP8 | 64.5 | 0.63 |
AP9 | 140 | 0.67 |
BBa_K2442101 (wild type, PB10) | 179 | 1.25 |
Table 3: Fitted parameters obtained from least-squares curve fitting of six selected constructs.
We had slight issues with the induction assay. The experimental design failed to cover the arabinose concentration range between 0 and 10 μM. Therefore, for PB5, PB6 and PB7, we could not investigate the kinetics within this range. Our preliminary fitted curves for these constructs (not shown) indicated a very steep, ‘all-or-nothing’ behaviour, with maximal expression starting to plateau at concentrations as low as 10 μM. We considered two possible explanations for these observations:
More careful testing must be done to verify these scenarios.
Perhaps unsurprisingly, AP4 (containing all three optimizations) did not live up to expectations. We hypothesised that the modifications in the araI sites interfered with the SPL segment, severely reducing the maximal strength. We observed the same phenomenon between AP7 (- + +) and AP8 (- - +). In the time-dependent FI curves, cells with the PB4 construct (containing AP4) consistently showed a strong fluorescence burst followed by a quick drop. Furthermore, beyond 10 μM, PB4 behaved similarly to the three promoters above in that FI was relatively unchanged (possibly having already reached its maximum). From the OD600 graphs, we noted that PB4 cell viability is comparable to the rest, with no sign of impeded growth. We hypothesised that the FI drop could be due to an intrinsic mechanism that degrades the mVenus protein or otherwise interferes with transcription. It is also possible that the over-engineered promoter causes a “congestion” of araC at the araI1-araO2 site, which permanently blocks transcription. Although PB4 cannot serve as an improved part, future investigations could generate very interesting knowledge about the underlying emergent mechanism of this P_BAD.
Moreover, AP5, which is similar to AP4 but with a 78 bp spacing, showed some promise thanks to its very high maximal strength, but is unfortunately let down by higher leakiness than the wild type. It is interesting to note that AP5 did not suffer from the burst effect seen with AP4; it can therefore be reasonably assumed that the 56 bp spacing had a drastic influence on AP4’s performance.
We are optimistic that AP9, despite lacking all three optimizations and thus being the most truncated, can be of use in certain circumstances where an inducible promoter with low expression is favoured. We therefore also made the sequence available on the Registry.
We envision that the significant reduction in the promoter’s size can be economically advantageous for future circuit designs. At less than 150 bp, the promoter can be synthesised by oligo annealing, which is significantly faster and cheaper than ordering DNA parts. This could also be of great benefit to future iGEM teams from regions where DNA synthesis and delivery are often slow, allowing them to speed up project timelines.
We still hope to take on a continuation of this side project to fully achieve our initial goal of making a minimised araBAD promoter that has both lower leakiness and higher fold-change.