And we go again… Design-Build-Test
For the next set of experiments, we ordered mutants of cspF (which is just nine nucleotides long) to see whether they could meaningfully upregulate or downregulate expression levels. For this purpose, we decided to make single-point substitution mutations in the cspF sequence. The ordered sequences are referred to as SDM (Site-Directed Mutant), and the parts are named in the same fashion.
Substituting each of the nine positions with each of the three alternative bases generated a library of 27 potential UTR sequences to test. However, we decided to pick only 10 sequences to start. To make this choice, we predicted the expression of all sequences in OSTIR with the pre-start codon cut-off set to 38 and the post-start codon cut-off set to 35. One of the 27 sequences was illegal, so we ranked the remaining 26 mutants in decreasing order of predicted expression and chose ten evenly spaced mutants from this set. In the earlier batch, we had ordered Promoter-UTR (pUTR) BioBricks. This time, we realised that it would be much faster and more convenient to order forward and reverse "mega-primers" with a single-nucleotide mismatch and create the entire plasmid by PCR from the existing cspF plasmids. In this iteration, we also took readings in both the log and stationary phases, as suggested by Prof Cameron.
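The library enumeration and the "ten evenly spaced" selection described above can be sketched in a few lines of Python. This is only an illustration: the function names `single_point_mutants` and `evenly_spaced` are hypothetical, the rounding-based spacing rule is an assumption (the text does not say how the ten were spaced), and the OSTIR scoring step is left out, so the selection is shown on a stand-in list of 26 ranks.

```python
BASES = "ACGT"
WILD_TYPE = "GGAATTTTT"  # cspF sequence as given in the text


def single_point_mutants(seq):
    """Return every sequence that differs from seq by exactly one substitution."""
    mutants = []
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                mutants.append(seq[:i] + base + seq[i + 1:])
    return mutants


def evenly_spaced(ranked, k):
    """Pick k items spread evenly over a ranked list, keeping both ends.

    Assumption: positions are spaced by rounding j * (n - 1) / (k - 1).
    """
    n = len(ranked)
    if k >= n:
        return list(ranked)
    if k == 1:
        return [ranked[0]]
    indices = [round(j * (n - 1) / (k - 1)) for j in range(k)]
    return [ranked[i] for i in indices]


library = single_point_mutants(WILD_TYPE)
print(len(library))  # 9 positions x 3 alternative bases

# Stand-in for the 26 legal mutants ranked by predicted expression:
# here the items are just rank positions 0..25.
picked = evenly_spaced(list(range(26)), 10)
print(picked)
```

In practice, the ranked list would hold the mutant sequences sorted by their OSTIR-predicted expression, with the illegal sequence removed before the selection.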
We also used neural-network modelling to try to correlate the expression values obtained in this iteration of the engineering cycle with our own model. We have already predicted the expression of these sequences with Salis' RBS Calculator version 2, with OSTIR, and with our neural network, which we trained on data available in the literature. The details are given on the Project Modelling page.
The cspF mutants that we designed, built oligos for, and ordered are listed below:
Sr. No. | Mutant No. | Original Seq | Mutated Seq | Mutation Positions | Mutations | Expression by OSTIR | % Original | dG_total |
1 | SDM9 | GGAATTTTT | GGAACTTTT | [5] | [('T', 'C')] | 785.1623 | 45 | 1.5332 |
2 | SDM26 | GGAATTTTT | GGACTTTTT | [4] | [('A', 'C')] | 1613.1365 | 92 | -0.2668 |
3 | SDM17 | GGAATTTTT | GGGATTTTT | [3] | [('A', 'G')] | 2504.7979 | 143 | -1.3668 |
4 | SDM10 | GGAATTTTT | GGAAGTTTT | [5] | [('T', 'G')] | 3059.4055 | 175 | -1.8668 |
5 | SDM23 | GGAATTTTT | GGAATTTTA | [9] | [('T', 'A')] | 4944.3759 | 283 | -3.0668 |
6 | SDM5 | GGAATTTTT | GGAATTTTC | [9] | [('T', 'C')] | 6809.1742 | 390 | -3.8668 |
7 | SDM2 | GGAATTTTT | GGAGTTTTT | [4] | [('A', 'G')] | 7087.0796 | 406 | -3.9668 |
8 | SDM3 | GGAATTTTT | GGAATATTT | [6] | [('T', 'A')] | 7990.7202 | 457 | -4.2668 |
9 | SDM19 | GGAATTTTT | GTAATTTTT | [2] | [('G', 'T')] | 9009.5798 | 516 | -4.5668 |
10 | SDM27 | GGAATTTTT | GGAATTTCT | [8] | [('T', 'C')] | 9760.0103 | 559 | -4.7668 |
We tested these sequences in two strains of E. coli: DH5alpha, a cloning strain, and BL21(DE3), an expression strain. The analysis suggests that there is not much variation in the F/OD values, so in some cases changing just one nucleotide probably does not change the expression. This may have structural correlations.
Alongside the NUPACK analysis, we calculated the performance of each mutant relative to cspF. We found that when a G/C in the original sequence GGAATTTTT was changed to an A/T, the F/OD value went down, whereas when an A/T was changed to a G/C, the F/OD value increased. We already have an experimentally validated library that gives a range of expression relative to just the RBS (negative control). Our further designs will build on the learnings and developments from this experiment.
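The G/C versus A/T classification used in this observation can be written down as a small helper. The sketch below is only illustrative: the function name `gc_shift` is hypothetical, and the sequences are copied from the table of ordered mutants. It reports the direction of each substitution only and contains no measured F/OD values.

```python
from collections import Counter

WILD_TYPE = "GGAATTTTT"  # cspF sequence as given in the text

# Mutated sequences copied from the table of ordered mutants.
MUTANTS = {
    "SDM9": "GGAACTTTT",
    "SDM26": "GGACTTTTT",
    "SDM17": "GGGATTTTT",
    "SDM10": "GGAAGTTTT",
    "SDM23": "GGAATTTTA",
    "SDM5": "GGAATTTTC",
    "SDM2": "GGAGTTTTT",
    "SDM3": "GGAATATTT",
    "SDM19": "GTAATTTTT",
    "SDM27": "GGAATTTCT",
}


def gc_shift(original, mutated):
    """Return the net change in G/C count going from original to mutated.

    +1 means an A/T became a G/C, -1 means a G/C became an A/T,
    and 0 means the substitution stayed within the same class.
    """
    if len(original) != len(mutated):
        raise ValueError("sequences must have the same length")
    return sum((m in "GC") - (o in "GC") for o, m in zip(original, mutated))


shifts = {name: gc_shift(WILD_TYPE, seq) for name, seq in MUTANTS.items()}
counts = Counter(shifts.values())

for name, shift in shifts.items():
    print(name, shift)
```

For the ten ordered mutants, this labels seven as A/T to G/C changes, two as neutral (T to A) and one as a G/C to A/T change (SDM19). These labels can then be compared against the measured F/OD values relative to cspF.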
In the third engineering cycle, we wish to modify the GC content of cspF by making multiple mutations in one sequence while maintaining its structural scaffold. We also wish to keep the structural scaffold of cspF intact while adding structural complexity of different levels to it through our sequence, and to see how gene expression changes.