Our project was divided into two wet-lab streams: Surface Display of Nanobodies and Construction of fuGFP-Cellulose Binding Domains fusion protein. These streams would come together to allow for the screening of novel nanobodies against GFPs generated through DNA shuffling.
The surface display side of our project focused on the expression and evolution of anti-sfGFP nanobodies to anti-fuGFP nanobodies through DNA shuffling.
Experiments on the surface display side of our project had the following main aims:
Experimentally, we were able to carry out all our aims through to the fourth step, and partway through the fifth step. These experiments are discussed in detail below:
Parts Design
The design for our final surface display construct followed Salema and Fernandez’s 2017 paper which detailed a process for displaying nanobodies on the surface of E. coli using Neae-intimin, a membrane protein found naturally in E. coli. We selected five anti-sfGFP nanobodies from the literature that we wanted to use in combination with Neae-intimin. Because of this, we wanted to create a system where we could first clone in the Neae-intimin, and then in a second phase clone in our nanobodies.
Surface display part
To design the Neae-intimin part, we adapted the Neae sequence from literature on Neae-Intimin for surface display (Salema & Fernandez 2017; Salema et al. 2013) and inserted a lacZ-alpha operon in place of the nanobody. We also inserted a flexible linker and a His tag in between the end of the Neae sequence and the lacZ-alpha sequence. On either side of the lacZ-alpha sequence, we put in Esp3I restriction sites that have matching overhangs with each of our nanobodies. This means that once the Neae part is inserted into our plasmid, we would be able to insert any one of our nanobody sequences using Golden Gate cloning. We removed any restriction sites from these sequences that would interfere with the BioBrick standard. Finally on either side of the Neae-intimin sequence, outside of the Esp3I sites, we put BsaI Golden Gate sites with overhangs that match the pUS250 plasmid backbone, as well as our standard adaptor sequences on the outside which allow us to use standard primers to amplify the ordered dsDNA.
Nanobody part
Through a literature search, we identified five different anti-GFP nanobodies to display on our E. coli. The five anti-GFP nanobodies were: AIF28002 (NB2), AIF28003 (NB3), AIF28006 (NB6), AIF28007 (NB7) - denoted by their NCBI codes - and 3OGOH (NBH) - denoted by its PDB code. The first four are all from the same paper (Twair et al. 2014) and the latter, 3OGOH, is from another (Kubala et al. 2010). To convert these sequences into parts, we removed restriction sites from the sequences, optimising any changes for expression in E. coli, and added Esp3I sites with overhangs compatible with our Neae part on either end. Finally, we added our standard adaptor sequences to either end.
Double blue-white screening
In the design of our Neae and nanobody parts, we utilised two layers of blue-white screening. The first layer of blue-white screening was from the amilCP gene present in our plasmid backbone, which produces a dark blue, almost purple colour. The second layer of blue-white screening was from the lacZ-alpha operon present in our Neae part, which, in combination with the chemical X-Gal, produces an aqua blue pigment. These two different kinds of blue-white screening allowed us to screen colonies at both stages of our cloning strategy.
Neae Cloning and Transformation
The first step in the lab-work side of creating this surface display construct was cloning our Neae-lacZalpha part into our plasmid backbone, pUS250. This part contains the Neae-intimin protein complex which will display other proteins on the surface of the cell. In place of the displayed protein, we have inserted a lacZ-alpha gene flanked by two Golden Gate sites, allowing us to clone in our nanobodies and our surface display proteins in two different stages.
In order to clone our Neae-lacZalpha surface display complex into our plasmid backbone pUS250, we used the Golden Gate enzyme BsaI. We set up a Golden Gate reaction following the basic protocol listed on the Protocols page of our wiki.
Once we had created the pUS250-Neae plasmid with Golden Gate, we used heat shock transformation to transform the new plasmid into TOP10 E. coli. We then spread the transformed cells onto LB agar plates that contained 50 μM kanamycin, which was the selective marker for pUS250.
Screening of Transformants
Our plasmid backbone, pUS250, contains an amilCP region between its two multi-cloning sites. This gene makes E. coli produce a dark blue, almost purple pigment. Any successful restriction cloning cuts out the amilCP region and replaces it with the insert. This means that any colonies on our transformation spread plates from the previous step that are amilCP blue do not have our Neae insert. Any colonies that are white either have pUS250-Neae or have no plasmid at all. We can be fairly certain that the former is the case for any white colonies, because if there was no plasmid present, there would be no conferred antibiotic resistance and the colony would not have grown. At this stage we had mistakenly plated onto X-Gal plates without IPTG. So while we had expected successful colonies to be lacZalpha blue, they were in fact white, due to the lack of IPTG. Some colonies with leaky expression turned out lacZalpha blue, but junction PCR (as below) confirmed that the lacZalpha blue and white colonies were identical. Once we realised our mistake, we re-plated our transformed cells on plates containing both X-Gal and IPTG. We tried a few different combinations of X-Gal and IPTG on the agar plates to find which concentrations worked best for our plasmid. We found that spreading 40 μL of 20 mg/mL X-Gal followed by 8 μL of 0.5 M IPTG produced the best blue-white screening results.
To further confirm which colonies contained our plasmid with the Neae insert, we performed a junction colony PCR on a few of the white colonies. To do this we used primers NVC15b and CFM23. Once we confirmed which colonies carried our successfully cloned plasmid, we made 5 mL broths of two of those colonies and purified the plasmid via the plasmid miniprep protocol.
Nanobody Cloning
The second stage of cloning for this construct relies on the Esp3I Golden Gate sites that we included in our Neae-lacZalpha part. These sites allow us to clone in any one of our nanobody parts, and be able to use X-Gal blue-white screening to see which colonies contain successful clones.
We set up 5 different Golden Gate cloning reactions using Esp3I to clone in each of our 5 nanobodies (note: when setting up Golden Gate cloning with Esp3I, it is important to add ATP, as per our basic protocol). This yielded 5 different versions of the same plasmid, one for each of the five anti-GFP nanobodies (NBH, NB2, NB3, NB6, and NB7).
After creating these 5 different plasmids through Golden Gate cloning, they were each put into TOP10 E. coli through heat shock transformation. These transformations were plated onto IPTG X-Gal plates according to the concentrations we had previously optimised (40 μL of 20 mg/mL X-Gal, 40 μL of 0.1 M IPTG) so that colonies where cloning was unsuccessful would appear lacZalpha blue.
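As a quick sanity check on the plating recipe, the amount of each reagent per plate can be worked out from the volumes and stock concentrations quoted on this page. The sketch below is illustrative arithmetic only, and the helper names are made up for this example. It shows that 8 μL of 0.5 M IPTG and 40 μL of 0.1 M IPTG deliver the same amount of IPTG per plate.

```python
def micromoles(volume_ul: float, molar: float) -> float:
    """Amount of solute in µmol (µL x mol/L = µmol)."""
    return volume_ul * molar


def milligrams(volume_ul: float, mg_per_ml: float) -> float:
    """Mass of solute in mg for a volume in µL of a stock in mg/mL."""
    return volume_ul / 1000 * mg_per_ml


iptg_concentrated = micromoles(8, 0.5)   # 8 µL of 0.5 M IPTG
iptg_dilute = micromoles(40, 0.1)        # 40 µL of 0.1 M IPTG
xgal_mg = milligrams(40, 20)             # 40 µL of 20 mg/mL X-Gal

print(f"IPTG: {iptg_concentrated:.1f} µmol vs {iptg_dilute:.1f} µmol per plate")
print(f"X-Gal: {xgal_mg:.1f} mg per plate")
```

Working in amounts per plate rather than stock concentrations makes it easier to compare recipes that use different stock solutions.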
Once we had identified a few potentially successful colonies for each of the five nanobodies via the X-Gal blue-white screening, we performed a junction colony PCR using the primers iG22-1 and iG22-4 to confirm with more certainty which colonies contained plasmids with our inserts. Once these colonies were identified, their plasmids were purified via miniprep.
Nanobody-GFP binding assay
In order to assess the success of our E. coli in expressing our construct, we devised a Nanobody-GFP binding assay (see Protocols for a granular description). 5 mL LB broths were inoculated with each nanobody-containing clone, and grown up overnight. ODs were measured and recorded for standardisation. 1 mL of cells was then pelleted and resuspended in an equimolar amount of sfGFP, fuGFP or eGFP. This was calculated assuming 7000 molecules of Neae expressed per cell, a number obtained from the literature (Salema & Fernandez 2017). Cells were then washed in PBS by pelleting and resuspending. 50 μL of cells resuspended in PBS after two rounds of washing were then loaded onto a plate for plate reader fluorescence analysis. Measurements were taken at 400 nm for the fuGFP (which absorbs blue wavelengths of light) and at 470 nm for the sfGFP and eGFP. Results showed that four out of the five nanobodies appeared to be binding sfGFP, as intended, and none appeared to bind fuGFP. Only Nanobody H was binding eGFP.
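The equimolar calculation can be sketched as follows. This is a minimal illustration, not our exact worksheet: the figure of 7000 Neae molecules per cell comes from the text above, but the conversion from OD to cell number (8e8 cells per mL at OD 1) is an assumed rule of thumb, and the function names and the example stock concentration are hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
NEAE_PER_CELL = 7000         # displayed molecules per cell, figure quoted in the text
CELLS_PER_ML_PER_OD = 8e8    # ASSUMPTION: rough E. coli rule of thumb, not from the protocol


def gfp_picomoles(od600: float, volume_ml: float = 1.0) -> float:
    """Picomoles of GFP giving one GFP per displayed Neae molecule."""
    cells = od600 * volume_ml * CELLS_PER_ML_PER_OD
    molecules = cells * NEAE_PER_CELL
    return molecules / AVOGADRO * 1e12   # mol -> pmol


def stock_volume_ul(picomoles: float, stock_um: float) -> float:
    """Volume in µL of a GFP stock (in µM) that contains `picomoles` of GFP."""
    return picomoles / stock_um          # 1 µM is 1 pmol per µL


for od in (1.0, 2.5):
    needed = gfp_picomoles(od)
    # 10 µM is a made-up stock concentration used only for this example.
    print(f"OD {od}: {needed:.1f} pmol GFP, {stock_volume_ul(needed, 10):.2f} µL of a 10 µM stock")
```

Because the required amount scales linearly with OD, recording the OD of each overnight broth is enough to standardise the GFP added to each sample.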
As nanobody H was binding most strongly to sfGFP, we proceeded with another round of the assay for this nanobody in particular to validate it as a positive control for later testing of shuffled nanobodies. pUS250-Neae was also verified as a valid negative control at the same time. This was done with five replicates. Induced versus uninduced broths were added as another negative control.
As this assay was an invention of our team, it went through various iterations and refining each time we performed it, in order to strengthen the assay. The first round was done with no replicates, but each subsequent assay was done at least in duplicate, if not in triplicate or with five replicates. Induced versus uninduced broths were added as another negative control, and later, no GFP added negative controls were performed as well. By the time it came to test the binding of our shuffled nanobodies (as detailed in a later section of the experiments), we were fairly confident in the assay’s efficacy and ability to qualitatively determine whether a nanobody was binding strongly to a GFP or not.
DNA Shuffling
DNA shuffling (a.k.a. sexual PCR) is a method in which fragments of similar pieces of DNA are mixed together, cut in many places, and then reassembled at random. The goal of this process is to create many new variations on a given gene or fragment via a kind of artificial meiosis. Our project aimed to use DNA shuffling to create many random variants of anti-sfGFP nanobodies, with the eventual goal of finding a variant that would bind to fuGFP instead of sfGFP. This method was chosen because DNA shuffling was the method used to originally create fuGFP from sfGFP (Small Things Considered 2019).
The protocol we used to perform DNA shuffling was published in 2014 by Meyer et al. This protocol was implemented with very few changes, but for clarity's sake you can find our entire detailed process written out on our Protocols page.
The first step in DNA shuffling is to acquire a few similar variants of the same gene. We did this in our parts design phase, where we found 5 different sequences for anti-sfGFP nanobodies, and ordered them as gBlocks. When designing these parts, we also created two sets of primers to be used in shuffling. One was the outer set of primers (CFM129 and CFM130, which bind in the standard adaptor region flanking our nanobody parts), which is used in the initial amplification of fragments pre-shuffling, and the other was the inner set of primers, used to amplify the post-shuffling fragments. The two sets of primers need to be distinct because the shuffling process can degrade the ends of the DNA fragments, meaning that the annealing sites of the outer primers may not still be present after shuffling.
Using the outer set of primers, we PCR amplified each of our five anti-sfGFP nanobodies from our ordered gBlocks. These nanobodies were all combined in equal parts to create the DNA mixture that was put through the shuffling protocol.
The detailed, reproducible version of the shuffling protocol can be found on our Protocols page, but for those who want a more general overview, here is a quick summary. First, the five nanobody sequences (after being mixed in a tube and amplified via PCR) are cut up into small pieces by a nonspecific DNase. Then these pieces are reassembled. In this process, pieces from one nanobody will be joined to pieces from another. This is similar to the crossing over that occurs during meiosis, and results in pieces of DNA that are a mixture of the input fragments.
Once the shuffling protocol was completed, we amplified the final mix by PCR using the inner set of primers (iG22-3, iG22-4). Because the annealing sites of these primers include the Esp3I type IIS restriction site, this PCR yields a large collection of different nanobodies which still have the relevant Golden Gate cloning sites and overhangs.
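To make the idea concrete, here is a toy in-silico model of what shuffling produces. It is not the wet-lab protocol: it assumes the parent sequences are already aligned and of equal length, and it copies each segment between random crossover points from a randomly chosen parent. The sequences and the function name are made up for illustration.

```python
import random


def shuffle_once(parents: list[str], n_crossovers: int, rng: random.Random) -> str:
    """Build one chimera by stitching segments taken from random parents."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model assumes aligned, equal-length parents")
    # Pick distinct crossover positions strictly inside the sequence.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0, *cuts, length]
    # Copy each segment from a randomly chosen parent.
    return "".join(
        rng.choice(parents)[start:end]
        for start, end in zip(bounds, bounds[1:])
    )


# Homopolymer "parents" make it easy to see which parent each segment came from.
toy_parents = ["A" * 20, "C" * 20, "G" * 20]
rng = random.Random(0)
for _ in range(3):
    print(shuffle_once(toy_parents, n_crossovers=4, rng=rng))
```

In a real library the crossover positions are biased towards regions where the parents share sequence, so this uniform model only illustrates the kind of mosaic sequences we were hoping to generate.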
Finally, this PCR product was purified via spin column purification to prepare it for cloning in the next phase of the project.
Shuffled NB Cloning
In much the same way as the cloning of the original five nanobodies into pUS250-Neae, the mixture of newly shuffled nanobodies (the purified PCR product from the last step of DNA shuffling) was inserted via Golden Gate cloning into pUS250-Neae using Esp3I. The resulting plasmids were then transformed via heat shock transformation into TOP10 E. coli. In order to generate a large enough library to screen for a colony that may bind fuGFP, 16 plates of transformants were plated, generating a library of approximately 10,000 colonies.
Out of these plates, plate number 12 was an IPTG X-Gal control plate so that we could judge the efficiency of the cloning via blue-white screening. Unfortunately, this batch of X-Gal IPTG plates was not functional and yielded no blue colonies at all.
Foolishly, we presumed that this meant that our transformation had just gone exceptionally well, and proceeded with sequencing and our screening protocol.
To prepare the samples for sequencing, we did a spanning PCR with primers iG22-1 and CFM4 on either side of the shuffled nanobody insert region. We then purified this PCR product and sent it for Sanger sequencing with the primer iG22-2, which would read across the nanobody region. When the sequencing results came back, we found that none of the colonies that we sequenced contained an insert at all, let alone a shuffled nanobody.
Shuffled NB Screening
In order to screen the shuffled nanobodies for binding to fuGFP, a protocol was developed by our team that harnessed the fuGFP-linker-cellulose binding domain fusion proteins the GFP-CBD side of our project had developed. See protocols for granular detail. The library was recovered and homogenised. Aliquots were then used to inoculate 50 mL broths of LB Km50, with 100 μM cumate to induce the expression of the constructs. This was then put through a column with a cotton wool plug with immobilised fuGFP-CBDs attached. 10 mL of cells were added to the column at a time. Once all cells had been put through the column, 10 mL of PBS was added to the column at a time, the idea being that cells expressing a nanobody that would be able to bind fuGFP would not be washed off, but those that did not bind would. Flow through was collected and measured for OD until the OD reached near enough to zero.
The cotton wool was then removed from the column, and placed in a 50 mL LB broth to grow up a number of cycles. Cells were then recovered from the broth and plated out onto LB Km50 plates, and incubated overnight. This resulted in 19 colonies that had made it through the screening. A pellet of cells was also generated and frozen.
These cells were then prepared for sequencing, using the same methods as above (primers iG22-1 and CFM4).
Simultaneously, the cells were put through the Nanobody-GFP binding assay, a few at a time. Only some of them had been screened when sequencing results returned, showing that all 19 colonies in fact carried the original pUS250-Neae plasmid, with no nanobody insert.
While this was disheartening, it was clear that, because of the failed blue-white screening, we had not noticed that our cloning efficiency was extremely low and the plasmid background therefore extremely high. To deal with this, the previously frozen pellet was resuspended and plasmid prepped, and an Esp3I digest was performed on the purified plasmid. As we had used Golden Gate cloning, successfully cloned plasmids would no longer contain the Esp3I sites, while the pUS250-Neae backbone retained them, so the digest should linearise only the background plasmid. The digested mixture was then transformed into TOP10 E. coli, resulting in a plate of nine colonies.
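The logic of this counter-selection can be checked in silico by counting Esp3I recognition sites (CGTCTC, on either strand) in a sequence. The sketch below treats the sequence as linear and uses short made-up sequences rather than our real plasmids; the function names are hypothetical.

```python
ESP3I_SITE = "CGTCTC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(seq: str, site: str = ESP3I_SITE) -> int:
    """Count occurrences of `site` on both strands of a linear sequence."""
    seq = seq.upper()
    site_rc = reverse_complement(site)   # GAGACG for Esp3I
    count = 0
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site or window == site_rc:
            count += 1
    return count


# Made-up toy sequences: a "backbone" that keeps both sites and a "clone" that has lost them.
toy_backbone = "AAACGTCTCATTTTGAGACGAAA"
toy_clone = "AAACTTGGGGGGCCATAAA"
print("backbone sites:", count_sites(toy_backbone))
print("clone sites:", count_sites(toy_clone))
```

A plasmid with no remaining sites stays circular after the digest and should transform efficiently, whereas a plasmid that still carries sites is cut and should transform poorly.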
A spanning PCR was performed on these nine colonies using the primers iG22-1 and MVS77 to prepare for sequencing. Spanning PCR results showed that only colony number 7 had an insert that was the right length to be a functional nanobody. In order to get some idea of the diversity of the sequences before sequencing results returned, an MspI digest was performed on the nine colonies and also the five original anti-GFP nanobodies, and run out on a gel. From this it was identified that colony number 7 appeared to have a unique digest pattern different to any of the original anti-GFP nanobodies. However, the unique band that differentiated it from nanobody three was very faint and only around 20-50 bp in length, so we could not be certain. Colony seven was named Allocamelus to differentiate it from the original nanobody number 7.
We proceeded to perform the Nanobody-GFP binding assay on Allocamelus in order to ascertain whether it bound to fuGFP or sfGFP. This assay was run with nanobody H as a positive control for binding to sfGFP, and with nanobody three as a comparison because, based on the MspI digest gel, there was a small chance that Allocamelus was unshuffled nanobody three. Induced versus uninduced and no GFP controls were also performed. While the assay did not show any signs of Allocamelus binding fuGFP, Allocamelus also did not bind sfGFP, whereas nanobody three does.
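The expected band pattern of a diagnostic digest like this one can be predicted from sequence. MspI recognises CCGG and cuts after the first C, so the fragment lengths of a linear PCR product follow directly from the positions of its CCGG sites. The sketch below uses made-up toy sequences, not our nanobody sequences, and the function name is hypothetical.

```python
def digest_lengths(seq: str, site: str = "CCGG", cut_offset: int = 1) -> list[int]:
    """Fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the cut position within the site on the top strand
    (1 for MspI, which cuts C^CGG). The site is palindromic, so searching
    one strand is enough.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0, *cuts, len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Two toy "inserts" that differ by a single CCGG site.
toy_parent = "AAAACCGGTTTTTTCCGGAA"
toy_chimera = "AAAACCGGTTTTTTCAGGAA"
print("parent fragments:", digest_lengths(toy_parent))
print("chimera fragments:", digest_lengths(toy_chimera))
```

Comparing the predicted fragment lists for each parent against the bands seen on the gel is one way to judge whether a clone could be an unshuffled parent, although very short fragments are hard to resolve on a gel.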
Sequencing results were able to confirm that colony 7, A.K.A. Allocamelus (see Results page for more information) was indeed a successfully shuffled colony.
The allocamelus or ass-camel is a mythical creature with the head of a donkey or mule and the body of a camel (Topsell 1658). This creature features in British heraldry (Smith 1928). We named our one successfully shuffled nanobody after this strange beast because our nanobody sequences come originally from camelids (of which the allocamelus is one) and because they are shuffled, much like the camel and donkey are shuffled together to create the allocamelus. Although it probably wasn’t necessary to give this shuffled nanobody such a fanciful name, we decided to have a bit of fun with it.
After sequencing our “allocamelus” nanobody, we realised that it fit better with the name than we ever anticipated! It turns out that this nanobody was a mix of just two of our original nanobodies (NB2 and NB3), just like how allocamelus is a mix between two animals.
Smith, R. F. (1928). An Early Map of Surrey. The British Museum Quarterly, 3(1), 16. doi:10.2307/4420919
Topsell, E. (1658). History of Four-footed Beasts and Serpents.
Initial Designs
We started our fusion protein design by searching the parts registry for existing work on cellulose-binding domains. We found that the 2014 Imperial iGEM team Aqualose had previously experimented with CBDs fused to sfGFP, and we chose 4 different CBDs (cenA, cipA, cex, and clos) to incorporate in our designs. The sequences for our initial fuGFP-CBDs contain the sequence for fuGFP upstream of a CBD sequence, connected by a 2 amino acid gap. Furthermore, we inserted recognition sites for BsaI and BamHI at the start of our fusion protein sequences, and XhoI and BsaI sites at the end of the sequence to allow for cloning into pUS250v3 and pET28c(+).
Cloning and transformation
Synthetic DNA of our fuGFP-CBD sequences was obtained, cloned into pUS250v3 via Golden Gate cloning with BsaI, and transformed into TOP10 E. coli using heat shock. We performed a colony PCR on transformed cells using primers MVS48 and CFM23, which bind in the pUS250v3 backbone on opposite ends of the insert sequence. The results from agarose gel electrophoresis showed that most of the products matched our expected insert size of ~1100 bp, indicating a high cloning and transformation success rate.
Initial growth and testing
After confirming TOP10 E. coli were successfully transformed, we tried small-scale expression of our fusion proteins using 5 mL cultures and tested whether they were fluorescent and binding cellulose. We used bead beating to extract cell lysate from E. coli expressing fuGFP-CBDs and added the lysate to filter paper disks as well as microcrystalline cellulose. We then measured changes in the fluorescence of the cellulose and performed SDS-PAGE to determine whether proteins were bound to cellulose, as well as the amount of fuGFP-CBDs being expressed by TOP10 E. coli. Overall, we found that only fuGFP-CBDcipA caused cellulose to gain any significant fluorescence.
Redesign and addition of a flexible linker
In an attempt to improve the functionalisation as well as expression of our fusion proteins, we redesigned the initial fuGFP-CBD sequences by adding a flexible (GGGGS)3 linker (BBa_K4488006) between fuGFP and each of the separate CBDs. The new fuGFP-linker-CBDs were cloned and transformed into TOP10 for initial testing. Colony PCR using primers MVS48 and MVS71 which bind in the plasmid backbone and fuGFP sequence respectively, again confirmed successful transformation into TOP10.
Growth and testing in BL21(DE3)
To improve the overexpression of fuGFP-linker-CBDs, we cloned the fusion protein sequences into pET28c(+) using BamHI and XhoI restriction enzymes and transformed BL21(DE3) using heat shock. BL21(DE3) is a common E. coli strain for recombinant protein expression because it has been engineered to produce the highly efficacious T7 polymerase and lacks proteases (Ratelade et al. 2009). Expression with 100 mL broths of BL21(DE3) instead of TOP10 E. coli massively increased our protein yield. Furthermore, we chose to focus on fuGFP-linker-CBDcipA since all our previous experiments suggested this was the most functional in terms of cellulose-binding and fluorescence. We were able to show that fuGFP-linker-CBDcipA binds to cellulose through our tests.
We were also interested in determining whether fuGFP-linker-CBDs could be selectively eluted after being bound to cellulose as this provides potential applications for using fuGFP-CBDs as a tag for protein purification. Previous studies have found that CBD fusion proteins can be purified by binding to cellulose and eluted with solutions low in ionic strength such as distilled water (Sugimoto et al. 2012). We hypothesised that molecules structurally similar to cellulose may be able to compete for binding to CBD domains and cause the fuGFP-CBDs to be eluted from cellulose.
Using our protocol with fuGFP-linker-CBD bound to microcrystalline cellulose, we screened elution conditions using cellobiose, maltose, glucose, glycerol, and distilled water. We found that glucose with a concentration of 1 M (or above) is most effective for eluting fuGFP-linker-CBDcipA resulting in fractions with high purity. Overall, our experiments have shown that fuGFP-CBDs bind to cellulose and are eluted with glucose which may be useful in protein screening or purification.
Creation of new free-use fluorescent proteins
Under the guidance of our advisor Mark Somerville, we explored the possibility that the fluorophore of fuGFP could be mutated to generate other fluorescent proteins with differing Excitation/Emission spectra. This aligned with feedback we’d received on our project (see Human Practices). These mutations had been characterised and were known in other GFPs (Tsien 1998).
With assistance from our advisor, we used the degenerate primer MVS70 (with redundancy at three loci of the fluorophore) to amplify plasmid preps of pUS252-fuGFP and our pET28-fuGFP-linker-CBDs and generate variant fluorescent proteins.
Primer: AAACGTCTCCCTTGRSCYATGGCGTGCAG
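The primer contains three degenerate positions written in IUPAC nucleotide codes (R = A or G, S = C or G, Y = C or T), so the synthesised primer is a mix of 2 x 2 x 2 = 8 distinct sequences. The sketch below enumerates them; the function name is made up for this example, and the lookup table covers only a subset of the IUPAC codes.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes; extend it if other symbols are needed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    options = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*options)]


MVS70 = "AAACGTCTCCCTTGRSCYATGGCGTGCAG"
variants = expand_degenerate(MVS70)
print(len(variants), "distinct primer sequences")
for variant in variants:
    print(variant)
```

Each variant also carries the Esp3I site CGTCTC near its 5' end, which is what allows the amplified plasmid to be digested and recircularised.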
We then performed a PCR cleanup (see Protocols page). The PCR products were then digested with DpnI to remove any unamplified plasmid, then with Esp3I restriction enzyme so that they could be recircularised with Hi-T4 Ligase. The resultant mixture of plasmids was then transformed into appropriate hosts - pUS252-fuGFPx into TOP10 E. coli and pET28-fuGFP-linker-CBD into BL21 E. coli. We varied the amount of ligation mix added to our transformations - either 2 or 4 µL - as we were not sure which would be the most efficient. This produced plates of transformants that could then be analysed through a plate-reader fluorescence assay and for binding to and elution from cellulose (see Protocols page for relevant protocols).
References
Kubala, M. H., Kovtun, O., Alexandrov, K., & Collins, B. M. 2010. 'Structural and thermodynamic analysis of the GFP:GFP-nanobody complex', Protein Science, 19, 2389-401.
Meyer, A. J., Ellefson, J. W., & Ellington, A. D. 2014. Library generation by gene shuffling. Current protocols in molecular biology, 105, Unit–15.12. https://doi.org/10.1002/0471142727.mb1512s105
Ratelade, J., Miot, M.-C., Johnson, E., Betton, J.-M., Mazodier, P., & Benaroudj, N. 2009. 'Production of Recombinant Proteins in the lon-Deficient BL21(DE3) Strain of Escherichia coli in the Absence of the DnaK Chaperone', Applied and Environmental Microbiology, 75(11), 3803-3807.
Salema, V., and Fernández LÁ. 2017. 'Escherichia coli surface display for the selection of nanobodies', Microb Biotechnol, 10, 1468-84.
Salema, V., Marín E., Martínez-Arteaga R., Ruano-Gallego D., Fraile S., Margolles Y., Teira X., Gutierrez C., Bodelón G., & Fernández LÁ. 2013. 'Selection of single domain antibodies from immune libraries displayed on the surface of E. coli cells with two β-domains of opposite topologies', PLoS One, 8: e75126.
Sugimoto, N., Igarashi, K., & Samejima, M. 2012. 'Cellulose affinity purification of fusion proteins tagged with fungal family 1 cellulose-binding domain', Protein Expression and Purification, 82(2), 290-296.
The Story of Free Use GFP (fuGFP). Small Things Considered. 2019. https://schaechter.asmblog.org/schaechter/2019/05/the-story-of-free-use-gfp-fugfp.html.
Tsien, R. Y. 1998. 'The green fluorescent protein', Annual Review of Biochemistry, 67, 509-44.
Twair, A., Al-Okla, S., Zarkawi, M., & Abbady, A. Q. 2014. Characterization of camel nanobodies specific for superfolder GFP fusion proteins. Molecular biology reports, 41(10), 6887–6898. https://doi.org/10.1007/s11033-014-3575-x