Overview
Colourectal consists of four pillars: Interaction, Detection, Signalling and Biosafety. These together enable early in vivo detection of colorectal cancer. On this page, an overview is given of all the results we accomplished to create Colourectal, our in vivo diagnostic tool for colorectal cancer. In-depth reports of the separate pillars can be found by clicking on the tabs at the top of the page. Additionally, for a more substantial overview of all the scientific highlights, go to our proof of concept page!
Interaction

In our first pillar we researched the interaction between Escherichia coli Nissle 1917 (EcN) and cancer cells. We performed a cytotoxicity assay, from which it appeared that Caco-2 cell proliferation was not affected by a change in concentration of the heat-inactivated EcN. However, there might be a time-dependent proliferative effect on the Caco-2 cells. The chromoproteins showed no toxic effect at concentrations up to 0.1 mg/ml. The next step would be to use a representative model for the colon to determine the cytotoxic effect of the different factors inside the human body. The effect of EcN on MMP-9 secretion was analysed by Western Blot, and no effect was found. Except for a change in expression of cell cycle related genes, the RNA sequencing results showed no significant effect on the gene expression of the Caco-2 cells after exposure to the heat-inactivated bacteria.

Detection

In our second pillar we successfully engineered EcN to sense our biomarker L-lactate in the colon environment. This was done by introducing a synthetic biosensor that is able to work in microoxic and glucose-rich conditions. We then tried to use this sensor to distinguish cancerous from healthy L-lactate levels. Experimentally this was not yet successful. However, modelling predicted that if we were able to control the internal concentration of L-lactate, we could control the operational range of the dose response. In the future, this hypothesis will be tested in a strain that cannot catabolise L-lactate.

Signalling

In our third pillar we achieved consistent expression of chromoproteins in E. coli Nissle 1917. However, we did not yet manage to secrete them using the native secretion systems of E. coli. In future experiments a Western Blot will be performed to further test and improve this secretion system. Further, with in silico modelling we designed a flip chromoprotein, but so far testing of the constructs in the lab has not shown that the modified chromoproteins retain their colour after being cut by our biomarker MMP-9. Additionally, we successfully produced MMP-9 in E. coli BL21, as confirmed by Western Blot, an important step for future testing of this cancer biomarker.

Biosafety

In our fourth pillar we successfully tested the three biosafety levels we envisioned to keep our living diagnostic tool safe. We managed to make EcN dependent on mucin, thanks to the chimeric two-component system that we constructed. In addition, a kill switch based on the temperature-sensitive promoter cpsA in an EnvZ knockout E. coli strain was shown to work in a temperature-sensitive survival assay. For the third safety level, which will be used to stop the diagnostic test, we showed that high concentrations of rhamnose (3 and 5 mM) effectively impaired the growth of EcN.

Modularity

To combine the four pillars into our living diagnostic, we have to induce several genetic circuits. Suitable inducers for our system are limited, and therefore we designed a pipeline with which we can turn on three separate genetic systems with one inducer metabolite. We tested this system and demonstrated that we can turn on the signal by increasing the amount of inducer metabolite. Additionally, we formed a hypothesis on how to further develop the system and create three distinct genetic steps. We found that rhamnose was the most suitable inducer metabolite for our system.

Binding to cancer cells
The first pillar of Colourectal, our living diagnostic, is binding to potential cancer cells. This allows Colourectal to get close to the cancer and stay there, giving it a higher chance of detecting our biomarkers. Some bacteria have natural adhesins, Type 1 fimbriae, which are structural pili. In Escherichia coli, these fimbriae are able to bind to α-D-mannose groups in glycoproteins on host cells [1]. Carcinoembryonic antigen cell adhesion molecules (CEACAMs) are one such class of glycoproteins. CEACAM6 is normally expressed in healthy colon tissue [2]. However, in early colonic adenomas and hyperplastic polyps, which can both grow into cancer, CEACAM6 expression is increased [2]. This makes it a good target for early detection of cancerous growths. Type 1 fimbriae are encoded by the fim genes and are constructed by assembling different subunits, FimA, C, D, F, G and H, where FimH is the binding domain at the end of the pilus [3]. A potential issue is that CEACAM6 is also expressed in healthy tissue, which makes the binding less specific. To increase specificity to colon cancer while keeping the normal binding capacity of FimH, we intended to fuse FimH with the nine-amino-acid peptide CPIEDRPMC (RPM), which binds specifically to integrin α5β1 in invasive colon cancer [4]. Since this fusion protein may not work or fold properly, we opted to fuse at both the C and N termini, with the N-terminal fusion placed behind the signal peptide sequence, and to test both options, resulting in parts (BBa_K4244065) and (BBa_K4244067). The 2015 iGEM team of Harvard made a fusion part with RPM on the N terminus (BBa_K1850011). However, they added both a His- and a SpyTag, and therefore we decided to design our own constructs.
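The two fusion layouts described above can be sketched at the amino acid level. This is only an illustration: the peptide CPIEDRPMC comes from the text, but the function names, the absence of a linker, and the toy protein used in the usage note are assumptions, not the actual design of the registry parts.

```python
# Minimal sketch of the two fusion layouts, assuming no linker between
# the peptide and FimH (the real parts may differ).
RPM = "CPIEDRPMC"  # integrin-binding peptide named in the text


def n_terminal_fusion(protein: str, signal_len: int, peptide: str = RPM) -> str:
    """Insert the peptide directly behind the signal peptide.

    signal_len is the (hypothetical) length of the signal peptide in residues.
    """
    if not 0 <= signal_len <= len(protein):
        raise ValueError("signal_len must lie within the protein")
    return protein[:signal_len] + peptide + protein[signal_len:]


def c_terminal_fusion(protein: str, peptide: str = RPM) -> str:
    """Append the peptide to the C-terminus of the protein."""
    return protein + peptide
```

For a toy protein `"MKKAAA"` with a three-residue signal peptide, `n_terminal_fusion("MKKAAA", 3)` places the peptide between `MKK` and `AAA`, so the signal peptide stays at the very start of the sequence, while `c_terminal_fusion("MKKAAA")` leaves the N-terminus untouched.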
Construction
These fusions were implemented on SEVA plasmid 258. This plasmid contains the XylS-Pm regulator/promoter system, originally from Pseudomonas putida. The promoter works with the titratable inducer m-toluate, which allows us to test different expression levels based on different inducer concentrations. By exploring different concentrations, we can identify expression levels at which the fused protein replaces standard FimH in the assembly of the fimbriae without placing a high burden on the cell. To ensure that potential effects on binding capacity are not only due to the overexpression of fimH, we also created a control plasmid with the native fimH only (BBa_K4244069). Furthermore, the plasmid contains the broad-host-range replication origin RSF1010 with its replication proteins RepA, B and C, and a kanamycin resistance gene.
Binding assay experiments
Bacterial cells of E. coli Nissle 1917 (EcN) were to be tested for binding ability using three different binding assays. In two of these, they would be combined with the immobilized proteins CEACAM6 and integrin α5β1. In the third, they would be combined with immobilized Caco-2 cells.
Cells containing either C- or N-fusion to FimH should have a higher binding capability on integrin α5β1 compared to wild-type EcN if RPM can reach its target. Hopefully, they are still able to bind to CEACAM6. In addition, the wild type of EcN would also be used as a control to determine normal adhesion levels and would be compared to wild-type E. coli K-12.
Unfortunately, we were not able to construct the fusion and the native fimH plasmids yet. Therefore, we were unable to perform the binding assays in time for the wiki freeze. In the upcoming weeks, we will continue to troubleshoot the integration of the C- and N-fusion in addition to the FimH overexpression circuit.
What is the effect of Escherichia coli Nissle 1917 on colorectal cancer cells?
Since we designed our EcN to attach to colorectal cancer cells, it is important to investigate the interaction between our living diagnostic and the cancer cells. A commonly used model for molecular studies of colorectal cancer is the Cancer coli-2 (Caco-2) cell line, which was originally isolated from a human colorectal adenocarcinoma [5].
Several members of the E. coli family are known to possess the pks genomic island that encodes colibactin, a toxin that can cause a genotoxic effect in eukaryotic cells [6]. Colibactin is activated during direct contact between the living bacteria and mammalian cells [7]. However, there is conflicting evidence on whether EcN has this genotoxic potential [8]-[10]. Fortunately, the toxic effect can be shut down, while keeping the probiotic properties of EcN, by substituting an amino acid in the clbP gene, which is responsible for colibactin activation [9]. Alternatively, the whole clbP gene could be knocked out. In the future we could consider using one of these two mutants to silence colibactin activity. During this project we only used the wild-type EcN, and its heat-inactivated form, to assess the interaction between EcN and Caco-2 cells, as genotoxicity is a slow process and will most likely not affect our experiments.
The potential toxic effect that EcN might have on the cancer cells was assessed with a cytotoxicity assay. This assay allowed us to determine the effect of potential cytotoxic compounds produced by EcN on the proliferation of the Caco-2 cells. From this experiment, as shown in Figure 1, it appeared that there was no relation between Caco-2 cell proliferation and the concentration of EcN. This suggests that an increased EcN dose does not affect the metabolic activity of these human cells.
Figure 1: The relative proliferation of Caco-2 cells that were exposed to different concentrations of heat-inactivated Escherichia coli Nissle 1917 at time t = 0, 4, 24, 48 hours. The relative proliferation was determined by the water-soluble tetrazolium salt (WST) cytotoxicity assay and corrected for the untreated control samples. Abbreviations: OD, optical density measured at a wavelength of 600 nm; h, hours.
The increase that is seen between 24 and 48 hours indicates a change in metabolic activity in the human cells. This does not necessarily mean that the cancer cells are growing, but it could suggest an increase in cellular processes. To obtain a conclusion on the change in physiological response over time we should perform a cytotoxicity assay on healthy, non-cancerous cells.
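The correction for untreated control samples mentioned in the Figure 1 caption can be illustrated with a short calculation. This is a minimal sketch that assumes relative proliferation is the blank-corrected mean absorbance of treated wells divided by that of untreated wells; the function name, the blank subtraction and the example numbers are assumptions, not our exact analysis script.

```python
from statistics import mean


def relative_proliferation(treated, untreated, blank=0.0):
    """Blank-corrected mean absorbance of treated wells relative to untreated wells.

    treated, untreated: lists of WST absorbance readings (replicate wells).
    blank: absorbance of medium without cells (assumed correction).
    A value of 1.0 means no difference from the untreated control.
    """
    control_signal = mean(untreated) - blank
    if control_signal <= 0:
        raise ValueError("untreated control signal must exceed the blank")
    return (mean(treated) - blank) / control_signal


# Hypothetical readings: treated wells are about 20% more active than control.
print(relative_proliferation([1.2, 1.4], [1.0, 1.2], blank=0.1))
```

Because the WST readout reflects metabolic activity, a value above 1.0 in such a calculation indicates more active cells, not necessarily more cells.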
Additionally, one of our chromoproteins (anm2CP) was also assessed for its potential toxic effect on human cells. We incubated the human cells with different concentrations of chromoprotein, ranging from 0.01 to 2 mg/ml for 24 hours. The chromoproteins did not affect the Caco-2 cells at concentrations of 0.1 mg/ml or lower, as shown in Figure 2. In contrast, from chromoprotein concentrations of 0.5 mg/ml and higher the proliferation decreased significantly.
Figure 2: The relative proliferation of Caco-2 cells that were exposed to different concentrations of the chromoprotein anm2CP, ranging from 0.01 to 2 mg/ml, at t = 24 hours. The relative proliferation was determined by the water-soluble tetrazolium salt (WST) cytotoxicity assay and corrected for the control samples. P<0.01 (**) and P<0.001 (***).
This experiment gives an indication of how chromoproteins affect the human cell proliferation. However, it is not a good representation for our real-life application. These results need to be confirmed in a more realistic model for the human colon. The modelled concentration could ultimately be used to validate if this is enough to colour the stool.
One of the CRC biomarkers we are using in this project is Matrix metalloproteinase 9 (MMP-9). Therefore, it is important to investigate whether EcN affects the expression of MMP-9. Fábrega et al. (2017) found that EcN can decrease the MMP-9 expression in colonic mice [11]. We incubated the Caco-2 cells with heat-inactivated EcN for t = 0, 4, 8, 12, and 24 hours and collected the supernatant from these cells, as MMP-9 is a secreted protein. The Western Blot in Figure 3 shows a clear band around 65 kDa. The expected size was 92 kDa for the inactive protein and 84 kDa for the active MMP-9 [12]. However, MMP-9 can be converted in vivo to a smaller active form of 64 kDa [13]. From this, and from the fact that the blot is very clean, it is probable that the band around 65 kDa represents MMP-9. In future experiments, a positive control, such as purified MMP-9, should be included to confirm this.
Over time, the cells secrete more MMP-9. The blots were stained with Ponceau staining to ensure that the samples were loaded with an equal concentration of proteins. When comparing the untreated cells with the Caco-2 cells that were exposed to heat inactivated bacteria at t = 12 and t = 24 hours, we can clearly see that the bands show the same intensity. This suggests that the secretion of MMP-9 is not affected by the presence of EcN.
Figure 3: Western blotting analysis of the supernatant of Caco-2 cells that were exposed to heat-inactivated Escherichia coli Nissle 1917. Samples were probed with an antibody against both active and inactive Matrix metalloproteinase 9. Left is the Precision Plus Protein™ ladder. Bands are shown around 65 kDa and 45 kDa. Abbreviations: kDa, kilodalton; M, marker; h, hours; NC, negative control; HIB, heat-inactivated bacteria.
To investigate the effect EcN had on the expression profile of the colorectal cancer cells, we used bulk RNA sequencing followed by differential expression analysis. We compared Caco-2 cells in normal conditions with Caco-2 cells that had been in contact with heat-inactivated EcN. Several time points, t = 4, 8, 12, 24 hours, were compared to assess the differential expression over time. In total 16518 genes were analysed, of which almost 600 showed altered temporal expression. We used a regression strategy to identify genes with significant temporal expression via maSigPro. The genes were divided into six different clusters, shown in Figure 4A. However, only cluster three was associated with a specific biological function, namely cell division (Figure 4B). The increased expression of genes related to the cell cycle might be correlated with the increase in proliferation we observed in Figure 1.
Figure 4: RNA-seq results of Caco-2 cells treated with heat-inactivated bacteria. (A) Plot of the median expression profiles of the 570 genes, divided into 6 clusters, with time (hours) on the x-axis and gene expression on the y-axis. The solid line joins the averages of each time group to show the temporal trend, while the dashed line represents the fitted cubic regression curve. (B) Gene ontology term enrichment analysis of genes present in cluster 3, based on biological process terms. The y-axis reports the biological term, circle sizes show the gene counts and the colours depict the Benjamini-Hochberg (BH) adjusted p-value (p>0.05).
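The Benjamini-Hochberg adjustment named in the Figure 4 caption can be sketched in a few lines. This is a generic implementation of the standard procedure for illustration only; it is not the code of the enrichment tool we used, and the example p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each raw p-value is scaled by m / rank (rank 1 = smallest p-value),
    then values are made monotone so that a smaller raw p-value never
    ends up with a larger adjusted p-value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest raw p-value down to the smallest.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four gene ontology terms.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The adjusted values are never smaller than the raw ones, which is why fewer terms pass a fixed threshold after correction.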
Additionally, the colorectal cancer cells could also affect the expression profile of EcN. Therefore, we performed transcriptomics on EcN that had been in contact with Caco-2 cells. This allows us to see whether EcN has genes that respond differently in the presence of Caco-2 cells. EcN was incubated on Caco-2 cells for 30 and 120 minutes and the expression profiles were compared to an untreated control. We could use this data to evaluate activated genes and potentially identify promoters that are affected by colorectal cancer. In the future, we could use this information to improve colorectal cancer sensing. Due to the turnaround time of the RNA sequencing, we were unfortunately not able to obtain the results before the wiki freeze.
Conclusion
Our current FimH constructs were sent for sequencing but do not align with expected results. To understand what went wrong, the whole plasmids will be sent for sequencing.
From the cytotoxicity assay it appeared that the Caco-2 cells’ proliferation is not affected by a change in concentration of the heat inactivated EcN. However, there might be a time-dependent proliferative effect on the Caco-2 cells. The chromoproteins showed no toxic effect at a concentration up to 0.1 mg/ml. The next step would be to use a representative model for the colon to determine the cytotoxic effect of the different factors inside the human body. The effect of EcN on the MMP-9 secretion was analysed by a Western Blot, and no effect was found. Except for the change in expression in cell cycle related genes, the RNA sequencing results showed no significant effect on the gene expression of the Caco-2 cells after exposure to the heat inactivated bacteria.
L-Lactate; A biomarker of cancer
To make our living diagnostic capable of detecting cancer, we first must understand some of the underlying concepts of cancer cells. Cancer is a disease where cells proliferate uncontrollably due to changes in signalling and metabolism [1]. Because of this, cancer cells produce more L-lactate than healthy cells [2]. This increase in lactate production is also found in colorectal cancer (CRC) cells. The microenvironment of healthy colon cells contains between 1.5 – 3 mM of L-lactate, while tumorous colon microenvironments contain between 10 – 25 mM of L-lactate [3]. By sensing lactate with our living diagnostic, we want to make it sensitive to CRC.
Sensing lactate with Escherichia coli
Some gut bacteria, such as Escherichia coli (E. coli), can use lactate as a substrate for growth. In E. coli, the operon that is responsible for this is called the lldPRD operon. This operon contains a transcription factor, LldR, which can bind to two different operator regions called O1 and O2. These operator regions are located up- and downstream of the operon’s promoter, PlldPRD, see Figure 1A. By binding to these regions, LldR represses the operon, see Figure 1B. However, if lactate is bound to LldR, it changes conformation, resulting in activation of gene expression, see Figure 1C [4]. This system has already been used in previous research, including previous iGEM competitions, to develop lactate biosensors [3,5,6] [7–9].
Figure 1: Lactate inducible promoter PlldPRD. 1A: The different components. 1B: In the absence of lactate, the promoter is repressed because LldR binds to the recognition sites O1 and O2. 1C: When lactate is present, it binds to LldR. LldR then changes conformation and induces gene expression by staying bound to O1 and releasing O2.
Problematic colon environment
By employing this same strategy, we decided to develop a synthetic biosensor in Escherichia coli Nissle 1917 (EcN) with the aim of sensing the L-lactate concentrations found in the microenvironment of colon tumours. In their current state, LldR and PlldPRD are unfit for this purpose for two reasons. First, the promoter is repressed by glucose, which is found in the colon [10,11]. Second, the lldPRD operon is regulated by a system native to E. coli that regulates the expression of several operons under aerobic and anaerobic conditions [12]. To address these challenges, Zúñiga et al. developed a lactate-sensitive promoter called ALPaGA (A Lactate Promoter Operating in Glucose and Anoxia), which is not repressed under the above-mentioned conditions [5], making it suitable for sensing cancer in the colon environment.
Escherichia coli Nissle 1917 lactate biosensor
At first, we constructed a lactate-sensing genetic circuit. In this circuit a reporter gene, the super folder green fluorescent protein (sfGFP) (BBa_I746916), was placed after the lactate inducible promoter ALPaGA (BBa_K4244000), while LldR (BBa_K822000) was constitutively expressed. A genetic construct like this should only show fluorescence when lactate is present. An overview of the circuit can be found in Figure 2.
Figure 2: Configuration of the lactate sensing genetic circuit. The circuit consists of a transcription factor LldR (red), a lactate inducible promoter ALPaGA (A Lactate Promoter Operating in Glucose and Anoxia) and super folder green fluorescent protein (sfGFP) (green). 2A: When lactate is absent, sfGFP is repressed. 2B: When lactate is present, sfGFP is expressed.
The circuit from Figure 2 was tested by transforming it into EcN. The strain was grown for 16 hours in microoxic conditions, in minimal medium (M9) containing glucose and induced with different concentrations of L-lactate. After 16 hours, the fluorescence was measured and plotted against the L-lactate concentration, see Figure 3. The goal of this experiment was to see if we could use this genetic circuit to differentiate healthy levels from cancerous levels of L-lactate, highlighted as the green and red vertical bars in Figure 3, respectively. In its current form, the genetic circuit is sensitive to L-lactate since there is an increase in fluorescence with increased L-lactate levels. However, the system is currently unable to distinguish healthy from cancerous levels of L-lactate, because in both cases significant fluorescence is observed. For this reason, we set out to find a way to change the operational range of the dose-response by moving it to the right. By doing so, we want to eliminate the promoter responding to healthy concentrations of lactate found in the colon.
Figure 3: The response of ALPaGA, expressed as OD600 corrected fluorescence output, at different L-lactate concentrations. The green and red areas signify the L-lactate concentrations in the microenvironment of healthy and cancerous colon cells, respectively.
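The OD600 corrected fluorescence output on the y-axis of Figure 3 can be illustrated as follows. This is a minimal sketch that assumes the correction is a blank-subtracted fluorescence divided by a blank-subtracted OD600; the function name, the blank handling and the numbers are assumptions rather than our exact plate reader workflow.

```python
def od_corrected_fluorescence(fluorescence, od600,
                              blank_fluorescence=0.0, blank_od600=0.0):
    """Fluorescence per unit of cell density.

    Dividing by OD600 separates a stronger promoter response from
    simply having more cells in the well.
    """
    density = od600 - blank_od600
    if density <= 0:
        raise ValueError("OD600 must exceed the blank")
    return (fluorescence - blank_fluorescence) / density


# Hypothetical well: 1200 a.u. fluorescence at OD600 0.5,
# medium-only blank of 200 a.u. and OD600 0.1.
print(od_corrected_fluorescence(1200.0, 0.5, 200.0, 0.1))
```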
Modifying the dose-response
There are many tools to regulate gene expression in bacteria, of which two examples are Clustered Regularly Interspaced Short Palindromic Repeats interference (CRISPRi) and antisense RNA (asRNA) [13–15]. We hypothesized that we could modify the operational range of the dose response of ALPaGA by combining these two tools. To test this hypothesis, we designed a CRISPRi-asRNA genetic circuit. This circuit is organized as follows: a constitutively expressed CRISPRi cassette represses sfGFP. The cassette consists of a deactivated CRISPR associated (dCas) protein and a single-guide RNA (sgRNA) complementary to sfGFP. Additionally, an asRNA complementary to the sgRNA is made inducible by lactate. This is achieved by placing the ALPaGA promoter in front of the asRNA and constitutively expressing LldR. In a genetic circuit designed this way, the reporter gene is repressed if lactate is absent but expressed when lactate is present; see Figure 4 for an overview. The initial hypothesis was that by changing the binding affinity between the asRNA and the sgRNA, the activation efficiency would change, resulting in the operational range shifting to the left or to the right.
Figure 4: Genetic construct of the CRISPRi-asRNA genetic circuit. The circuit consists of a transcription factor LldR (red), a lactate inducible promoter ALPaGA (A Lactate Promoter Operating in Glucose and Anoxia), antisense RNA (asRNA) (blue), a CRISPRi (Clustered Regularly Interspaced Short Palindromic Repeats interference) cassette consisting of a deactivated CRISPR associated protein (dCas) and single guide RNA (sgRNA) (purple), and super folder green fluorescent protein (sfGFP). 4A: If lactate is absent, asRNA is not expressed and the CRISPRi cassette silences the expression of sfGFP. 4B: If lactate is present, the asRNA is expressed, which represses the sgRNA, resulting in sfGFP expression.
Unfortunately, due to time constraints, we were unable to test the CRISPRi-asRNA genetic circuit in the lab before the iGEM Grand Jamboree deadline. However, we were able to model this construct in silico; the details can be found on the Lactate Model page. The model predicted that only the sensitivity, meaning the slope of the dose response curve, would change when adding the CRISPRi-asRNA genetic circuit. The added CRISPRi and asRNA genetic components were not predicted to affect the operational range of the dose response curve; see Figure 5 for the modelled dose response curves. Because of this, we set out to find different ways to change the operational range of the dose response curve.
Figure 5: Modelled dose responses for L-lactate of the genetic circuits with and without Clustered Regularly Interspaced Short Palindromic Repeats interference (CRISPRi) and antisense RNA (asRNA) in orange and teal, respectively. The green and red areas signify the concentration of L-lactate around healthy and cancerous colon cell microenvironments, respectively.
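The difference between changing the slope and shifting the operational range can be illustrated with a generic Hill function. This is a sketch under stated assumptions: the Hill equation is a common phenomenological description of dose responses and is not the Lactate Model itself, and the parameter values below are hypothetical. Only the 3 mM and 10 mM boundaries are taken from the concentration ranges given earlier on this page.

```python
def hill(lactate_mM, k_half_mM, n, basal=0.0, maximum=1.0):
    """Generic Hill dose response (illustrative, not the Lactate Model).

    k_half_mM: lactate concentration giving the half-maximal response;
               raising it shifts the operational range to the right.
    n:         Hill coefficient; raising it steepens the slope around
               k_half_mM without moving the midpoint.
    """
    if lactate_mM < 0 or k_half_mM <= 0 or n <= 0:
        raise ValueError("concentrations and coefficients must be positive")
    x = lactate_mM ** n
    return basal + (maximum - basal) * x / (k_half_mM ** n + x)


HEALTHY_MAX_MM = 3.0   # upper end of the healthy range from the text
CANCER_MIN_MM = 10.0   # lower end of the cancerous range from the text


def fold_separation(k_half_mM, n):
    """Response at the lowest cancerous level over the highest healthy level."""
    return hill(CANCER_MIN_MM, k_half_mM, n) / hill(HEALTHY_MAX_MM, k_half_mM, n)


# Hypothetical midpoints: a sensor saturating at low lactate versus one
# whose midpoint lies between the healthy and cancerous ranges.
print(fold_separation(1.0, 2), fold_separation(6.0, 2))
```

In this toy description, a midpoint well below the healthy range gives nearly the same output at 3 mM and 10 mM, whereas a midpoint between the two ranges separates them much better. This is the kind of rightward shift we were aiming for.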
Lactate dehydrogenase knockouts
One of the things we learned from the Lactate Model is that we could shift the operational range of the dose response by changing $k_{f,\mathrm{LldRcomplex}}$. This parameter describes the rate of activation of LldR. We hypothesized that, in biological terms, this rate is influenced by the intracellular L-lactate concentration. This means that, if we could control the internal concentration of L-lactate, we could control the operational range of the dose response. We wanted to test this hypothesis by generating a strain that cannot metabolise L-lactate. From literature we found that by knocking out lldD and ykgF we should be able to generate an E. coli strain that cannot catabolise L-lactate [16]. Furthermore, we found that EcN specifically contains an additional lactate dehydrogenase, nldH, which we also planned on knocking out [17]. Even though the knockouts were successfully generated before the iGEM deadline, there was not enough time to test our hypothesis [16].
Conclusion
During our experiments, we showed successful detection of L-lactate using a synthetic biosensor in EcN as a proof of principle. However, we were not successful in using the biosensor to differentiate healthy levels of L-lactate (1.5–3 mM) from cancerous levels of L-lactate (10–25 mM). We first devised an approach that combines CRISPRi and asRNA in the genetic circuit of the synthetic biosensor. We hypothesised that this would change the dose response so that it could differentiate healthy and cancerous levels of L-lactate. Modelling predicted that this would not be the case, but a sensitivity analysis of the model did give us new insights into how the system functions. The model predicted that, if we could somehow control the internal concentration of L-lactate, we could control the operational range of the dose response. In the future, we aim to test this hypothesis by transforming our L-lactate biosensor circuit into an EcN strain that cannot catabolise L-lactate.
Chromoproteins; signalling the cancer
Upon detection of colorectal cancer, our living diagnostic will secrete a coloured protein from the chromoprotein family, colouring the user’s stool. Chromoproteins (CP) are a class of small, coloured proteins that come in a variety of shades and are responsible for colouration in corals and sea anemones (Uppsala_Chromoproteins). Unlike fluorescent proteins (FP), CPs absorb light and are visibly coloured in ambient light, allowing for instrument-free detection. This feature makes them very useful as reporter proteins in non-laboratory conditions [1].
CPs are encoded by a single gene, making them relatively easy to clone and engineer. Thanks to the work of the iGEM 2011 Uppsala team, a collection of chromoproteins optimized for Escherichia coli was made available. In 2017, the SHSBNU China team mixed different CPs with curry to imitate stool samples. Based on their results, pink and blue chromoproteins produced the most significant change in stool colouration. This, combined with the fact that these colours are the least likely to be found in food, made us opt for blue and pink CPs. The blue CP will be used as the indicator of colorectal cancer, whereas the pink one will be used as a positive control to assess whether the living diagnostic is still active once ingested. To avoid confusion with blood stains in the stool, secretion of the pink CP will be controlled by the user through an inducible system. Thus, the multimeric proteins amilCP (BBa_K592009) and spisPink (BBa_K1033932) were selected as good candidates for our project. In addition, two monomeric CPs were also tested: the pink anm2CP (BBa_K2387001) and the blue Ultramarine (UM) (BBa_K4244001) [2].
Figure 1: Escherichia coli expressing chromoproteins on Luria Bertani Agar plates supplemented with kanamycin after 36 hours. From top-left to bottom-right: two multimeric chromoproteins, amilCP and spisPink, and two monomeric chromoproteins, Ultramarine and anm2CP. Dark blue colonies result from blue/white screening.
The first step of our project consisted of producing the CPs. As can be seen in Figure 1, of the four CPs tested, only spisPink did not exhibit any colouration. We decided to continue with only anm2CP for the pink variant and to keep UM and amilCP for the blue variant.
Secretion of chromoproteins
CPs have not been observed to be naturally secreted [3,4], and we found no research that focuses on CP secretion in E. coli. We assume that CPs generally do not spontaneously diffuse out of the bacterial cell, so a secretion system needs to be integrated into our design. By adding a signal peptide (SP) sequence to the C- or N-terminus of a protein, the protein can be secreted via the corresponding system [5]. In E. coli, type I and II secretory pathways are the most prevalent for recombinant protein translocation out of the six reported pathways for Gram-negative bacteria [6]. Type I secretion entails a single-step transfer of specific proteins to the extracellular space through both the inner and outer membranes, with the SP sequence remaining attached to the protein [7,8]. Type II secretion, on the other hand, is a two-step mechanism in which proteins are first directed to the periplasm, where they accumulate before entering the pseudopilus apparatus (Figure 2, orange complex) that transports cargo across the outer membrane [9,10]. The general secretory (Sec) or twin-arginine translocation (TAT) cellular machinery enables type II translocation in E. coli [11]. The Sec pathway exports unfolded proteins, whereas the TAT system exports fully folded proteins into the periplasm [12]. In both systems, the SP sequence is cleaved off the protein after translocation.
Another system that can be used for the secretion of CPs is the curli secretion pathway (Type VIII) [13]. This pathway is natively used in E. coli to secrete protein units of curli fibres (CsgA) which self-assemble into extracellular curli fibres during biofilm formation [14,15]. CsgA is secreted into the periplasm with the help of the SEC peptide and across the outer membrane with the help of the N22 peptide [16]. These two components are essential for the secretion of proteins through this pathway, and previous research has shown that they can be harnessed for the secretion of heterologous proteins too [17].
Figure 2: Illustration of the types of secretion system tested in E. coli Nissle 1917.
The signal peptides tested in our project are shown in Table 1. We decided to test the signal peptides for secretion using a medium strength constitutive rubisco promoter, rbcL (BBa_K4244057), with its terminator (BBa_K4244058) and super folder green fluorescent protein (sfGFP) (BBa_K3143689). rbcL was chosen to favour secretion in E. coli by avoiding high levels of expression. In addition, chloroplast promoters are eubacterial in origin and hence compatible with gene expression in bacteria [18]. sfGFP is more suitable for secretion assays as it matures faster than CPs, and its fluorescence can be quantified more easily. Even though this will give us an indication of whether secretion is possible in our strain, the experiment will have to be repeated with chromoproteins, as the amino acid sequence and tertiary structure of the cargo affect the secretion efficiency of the signal peptide [10].
Table 1: The signal peptides tested and their properties.
   
Signal peptide | iGEM registry | Type / Pathway | Protein state | Position
HlyA           | BBa_K554002   | Type I         | Unfolded      | C-terminus
OmpA           | BBa_K208003   | SEC (Type II)  | Unfolded      | N-terminus
PelB           | BBa_J32015    | SEC (Type II)  | Unfolded      | N-terminus
TorA           | BBa_K1012002  | TAT (Type II)  | Folded        | N-terminus
SEC-N22        | BBa_K2895007  | Type VIII      | Unfolded      | N-terminus
However, as observed in Figure 3, no fluorescence was detected in the supernatant over 8 hours of growth, indicating that secretion of sfGFP was not achieved with any of the constructs under a constitutive promoter. We hypothesised that the high expression level driven by the constitutive promoter led to inclusion bodies [19] that clogged the transporters, so the experiment was repeated using a rhamnose-inducible promoter. The plasmids were transformed into Escherichia coli Nissle 1917 (EcN) ΔrhaB ΔrhaT to allow for better titration [20]. An endpoint measurement was taken after 5 hours of growth induced with rhamnose (500 μM) (Figure 4), as we observed from Figure 3 that protein production reached a plateau after 5 hours. Other rhamnose concentrations (0, 25, 50, 100, 250 μM) were also tested and led to the same conclusion: no fluorescence was detected in the supernatant (data not shown).
Figure 3: Time series of total, pellet and supernatant fraction of super folder Green Fluorescent Protein (sfGFP) under different signal peptides expressed with a constitutive promoter. After 5 hours of growth, the fluorescence reached a plateau in the total and pellet fractions. No fluorescence was observed in the supernatant fraction for any of the signal peptides tested.
Figure 4: Endpoint measurement of total, pellet and supernatant fraction of super folder Green Fluorescent Protein (sfGFP) under different signal peptides expressed with rhamnose inducible promoter. Measurement was performed 5 hours after rhamnose induction at 500 μM. No fluorescence was detected in the supernatant fraction for any of the signal peptides tested.
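The comparison of total and supernatant fractions described above can be expressed as a simple ratio. The following is a minimal sketch, not the analysis script used in the project: the function name, the blank-subtraction step and all numerical readings are hypothetical and serve only to show how a "secreted fraction" could be computed from plate reader values.

```python
def secreted_fraction(total, supernatant, blank=0.0):
    """Share of blank-corrected fluorescence that is found in the supernatant.

    All arguments are raw fluorescence readings in arbitrary units.
    Negative values after blank subtraction are clipped to zero.
    """
    total_corr = max(total - blank, 0.0)
    super_corr = max(supernatant - blank, 0.0)
    if total_corr == 0.0:
        return 0.0  # nothing expressed, so nothing can be secreted
    return super_corr / total_corr


# Hypothetical readings (arbitrary units), NOT measured data from this project.
readings = {
    "no SP": {"total": 5200.0, "supernatant": 110.0},
    "PelB": {"total": 2100.0, "supernatant": 105.0},
}
blank = 100.0  # assumed medium-only background

for name, r in readings.items():
    frac = secreted_fraction(r["total"], r["supernatant"], blank)
    print(f"{name}: {frac:.4f}")
```

A ratio close to zero for every signal peptide, as in this made-up example, would correspond to the situation reported in Figures 3 and 4.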
A clear observation is that less fluorescence is emitted by the SP-fused sfGFP than by GFP without an SP. We attribute this to the SP affecting the growth of EcN. Regarding secretion activity, our results contrast with previous literature [21–23]. One reason could be that the type II secretion system translocates proteins into the periplasmic space, where they accumulate without being secreted into the extracellular medium [21]. One way to solve this would be to knock out lpp, which encodes a protein that contributes to outer membrane sturdiness. This can render the outer membrane permeable enough to allow extracellular protein production, as demonstrated by Shin & Chen (2008), who secreted twice the amount of active xylanase compared to the wild type [24]. Regarding the type I system, Linton et al. (2012) demonstrated uvGFP translocation using HlyA by performing a Western blot of concentrated supernatant from a 100 ml induced culture [21]. Our expression system differs only in the inducible promoter, so our assay might not be sensitive enough to detect such a small amount. Another hypothesis is that the protein is not able to fold properly in the extracellular space. A Western blot on these fractions could help us determine whether sfGFP is indeed present in the supernatant but in an inactive state. However, due to time constraints, we were not able to perform this experiment. Additionally, the same secretion experiment could be performed using a different strain of E. coli. Even though the signal peptides tested, except PelB, are native to E. coli species, the sequences were not amplified from the genome of EcN but taken from the iGEM registry. This could play a role in the non-secretion of sfGFP. Therefore, introducing these constructs into another strain (BL21 or K-12) could help us determine whether the lack of secretion is strain-specific.
Alternatively, we could also amplify a known signal peptide from EcN and fuse it to our gene of interest. Finally, additional screening can be performed with other SPs if none of the solutions above give positive results.
Activating chromoproteins
While the chromoproteins will be secreted when high lactate levels are detected, using a second cancer biomarker will ensure there is less risk of false positives. We have coupled the colour of our chromoproteins to interaction with a second cancer biomarker. For our living diagnostic, this second marker is Matrix Metalloprotease 9 (MMP-9). MMP-9 is a protease which has been shown to play a vital role in cancer cell invasion and is one of the most widely investigated MMPs [25]. MMP-9 is an extracellular protein that can degrade extracellular matrix proteins through proteolytic cleavage. This protease recognises and cleaves a specific amino acid sequence. The consensus sequence of MMP-9 is Pro-X-X-Hy-(Ser/Thr), with X being any amino acid and Hy being a hydrophobic amino acid [26].
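The consensus sequence Pro-X-X-Hy-(Ser/Thr) can be turned into a small pattern search, for example to check whether a candidate linker contains a potential MMP-9 site. This is a minimal sketch under stated assumptions: the set of residues counted as hydrophobic (`HYDROPHOBIC`) is our own choice for illustration, and the example peptide is hypothetical and is not one of the project's parts.

```python
import re

# Assumption: residues treated as "hydrophobic" (Hy) in this sketch.
HYDROPHOBIC = "AVLIMFWY"

# Pro-X-X-Hy-(Ser/Thr). The lookahead makes overlapping hits visible too.
MMP9_MOTIF = re.compile(rf"(?=(P..[{HYDROPHOBIC}][ST]))")


def find_mmp9_sites(protein):
    """Return (0-based start, matched peptide) for every consensus hit."""
    return [(m.start(), m.group(1)) for m in MMP9_MOTIF.finditer(protein.upper())]


# Hypothetical linker sequence used only as an example.
example = "GGSPLGLSGGS"
print(find_mmp9_sites(example))  # [(3, 'PLGLS')]
```

A match only indicates that the consensus is present; whether the site is actually cleaved depends on its structural context.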
We set out to make use of this cleaving property by engineering our chromoproteins in such a way that they are only coloured upon cleavage by MMP-9. For this design, we took inspiration from the previously reported flipGFP system [27]. This reporter system is based on a splitGFP variant, in which sfGFP is divided into two proteins that need to interact with each other to restore the beta-barrel structure of GFP and thus fluorescence. In the flipGFP variant, GFP was split into two parts: β-sheets 1-9 and β-sheets 10-11. The latter two sheets are usually anti-parallel, but in the flipGFP system the addition of dimerising linkers E5 and K5 forces them into a parallel configuration, disrupting fluorescence. In the original flipGFP system, cleavage by a specific protease could abolish dimerisation of E5 and K5, thus restoring the anti-parallel configuration of β-10-11 and fluorescence of the FP, as demonstrated in Figure 5.
Figure 5: The mechanism of chromoprotein (CP) inactivation. The E5/K5 domains introduced in the CP ensure the 10th and 11th β-sheets fold parallel, disrupting the β-barrel. The K5 domain is preceded by a Matrix Metalloproteinase 9 (MMP-9) specific cleavage site. Upon cleavage by MMP-9 the K5 domain is removed, allowing antiparallel orientation of the β-sheets and restoring CP activity. Adapted from [27].
We sought to mimic this system in our chromoproteins, as chromoproteins share a similar structure with the green fluorescent protein (GFP): a distinctive β-barrel composed of eleven β-sheets surrounding an α-helix core that restrains the chromophore group [28], as can be seen in Figure 6. Our MMP-9-sensitive flipCPs were designed alongside an MMP-9-sensitive flipGFP. In our design, the K5 domain was preceded by a cleavage sequence specific for either MMP-9 or the TEV protease, which serves as a positive control [29].
Figure 6: The 3D structures of chromoprotein Ultramarine (blue) and fluorescent protein GFP (green).
To get an indication of whether we could expect a disruption in the flipCPs similar to that in flipGFP, we performed in silico predictive modelling using AlphaFold [30]. The predicted structure for flipGFP is shown in Figure 7, whereas the results for the flipCPs can be found on their respective pages (flipUM, flip-anm2CP, flip-spisPink, flip-amilCP). The modelling results showed that the β-barrel structure of the CPs could indeed be disrupted, and that upon removal of the K5 region the normal chromoprotein structure would be restored. Moreover, they helped us determine the optimal insertion sites for the E5 and K5 helices with respect to the β-sheets. This served as an important starting point for designing and building the actual flipCPs.
Figure 7: Predicted folding of flipGFP, generated using AlphaFold. Left: flipGFP in its uncleaved, inactive state. Middle: the cleaved, active state. Right: normal GFP. The 10th and 11th β-sheets are marked in cyan, the E5 and K5 inserts in yellow, and the Matrix Metalloproteinase 9 (MMP-9) cleavage region in red.
Based on the predicted CP structures, construction was started for the four chromoprotein variants alongside GFP. However, initial tests with the four different CPs encountered difficulties due to inconsistent expression. E. coli DH5α was initially chosen as the test platform, but the low CP yields led to waiting times of days before colour could be discerned. This is consistent with previous reports showing that consistent expression of CPs can prove challenging [3]. Therefore, we decided to switch to the protein production strain E. coli BL21 (DE3) for testing the CP constructs.
Due to the time invested in learning how to work reliably with CPs, we decided to create a flipGFP variant sensitive to MMP-9 and to focus on only one of the four CPs to engineer our flipCP. Since anm2CP and Ultramarine produced clear colour and Ultramarine had previously been used to create a splitUltramarine [28], we selected Ultramarine as our CP of choice. For this, splitGFP and splitUltramarine were first generated by inserting a terminator-promoter-RBS block between the 9th and 10th β-sheet (BBa_K4244043, BBa_K4244044) of both proteins. Subsequently, the E5 insert was added to see if the colour would be retained in the absence of the K5 region. Unfortunately, no fluorescence or blue colour was observed for the modified proteins. This was unexpected, especially for the splitGFP construct, as splitGFP is well established in the literature. Further testing is required to discern whether the problem lies in the protein modification or the protein expression.
MMP-9 purification
To test whether our flipGFP and flipCP could be reactivated by cleavage, we attempted to express and purify a functional, truncated version of MMP-9 fused to a His-tag (BBa_K4244042) from an engineered E. coli BL21 (DE3) [31]. MMP-9 expression was assessed by inducing protein production and performing a Western Blot with anti-MMP-9 antibodies. Figure 8 shows distinct bands at the predicted size of our MMP-9 variant (45 kDa) [31], as well as an unidentified band at ~70 kDa. Due to time constraints, we have not yet managed to purify MMP-9 through the fused His-tag.
Figure 8: Western Blot of BL21 (DE3) expressing MMP-9. Left is the Precision Plus Protein™ ladder. 1x and 5x dilutions of sonicated whole cell fragment, soluble supernatant, and pellet were loaded.
Conclusion
Throughout the project, consistent expression of chromoproteins proved to be challenging and yielded the best results in E. coli BL21 (DE3). No secretion was observed using the native secretion systems of E. coli, although a Western Blot could still be performed to check for the presence of the proteins in the supernatant. In silico modelling of flipCPs showed that the design is promising, but so far testing of the constructs in the lab has not shown that the modified CPs and fluorescent protein retain their colours. MMP-9 was successfully expressed in E. coli as confirmed by Western Blot, an important step for future testing of this cancer biomarker.
Three levels of biosafety
Since Colourectal involves the administration of a genetically modified organism to humans, it is important to build in biosafety and biocontainment mechanisms. A multi-layered strategy encompassing different genetic safeguards makes our living diagnostic a safe tool for humans and the environment. First, Escherichia coli Nissle 1917 (EcN) should not be able to colonise parts of the body other than the colon, as this would represent an important health concern. To this end, our living diagnostic will be engineered to survive only in the presence of mucin, a glycoprotein found in the lining of the colon. Second, our living diagnostic should not be able to survive outside the human body, to avoid its spread into the environment. To achieve this, a kill switch that is activated at temperatures below that of the human body has been implemented. Third, another inducible kill switch has been integrated, this time triggered by the fourth pill of the Colourectal self-test, which is administered to the patient in case of an adverse response or once the diagnosis process finishes.
Colon biocontainment: a two-component chimeric system for mucin dependence
To mitigate the risks associated with the spread of the living diagnostic from the colon to other parts of the body, a mucin auxotrophy has been implemented. Our molecular system is based on a chimeric two-component system (TCS) activated by mucin, the main family of proteins found in mucus. Mucins are continuously produced by the cells lining the gastrointestinal tract [1,2] and are therefore always present in the colon. The TCS controls the expression of the guanylate kinase gene gmk, which is essential for the cell because of its role in nucleotide metabolism [3]. When mucin is present (permissive conditions), the essential gene is expressed, allowing cell survival. By contrast, in the absence of mucin (non-permissive conditions), gmk is not expressed, leading to cell death.
The TCS consists of a chimeric sensor formed by Dismed2, the extracellular domain of RetS, a sensor protein from Pseudomonas aeruginosa known to sense mucin [4], and the transmitter domain of EnvZ, part of the widely used E. coli-derived two-component system EnvZ-OmpR [5,6]. In the EnvZ-OmpR TCS, EnvZ, when activated, phosphorylates the transcription factor OmpR, which subsequently activates the OmpC promoter. In our system, the OmpC promoter regulates the transcription of the essential gene gmk.
Figure 1: The chimeric two-component system (TCS) is formed by combining the extracellular Dismed2 domain of the RetS protein from Pseudomonas aeruginosa with the intracellular HisKA and HATPase domains of EnvZ, part of the EnvZ-OmpR TCS. In this system, mucin activates the chimeric TCS, which in turn allows the expression of the essential gene gmk.
The chimeric sensor, named Dismed2-EnvZ, was constructed by cloning the two domains under the control of a constitutive promoter (BBa_J23100) and a strong RBS (BBa_J34801). The construct (BBa_K4244009) was inserted into a medium-copy-number plasmid to avoid any possible toxic effect in the host. After successful cloning of Dismed2-EnvZ as well as the OmpC promoter with the reporter gene (GFP) (BBa_K4244011), these constructs were transformed into an E. coli strain from the Keio collection lacking the native envZ gene [7] to test the mucin dependency. The results of our plate reader experiments with a fluorescent reporter show that increasing mucin concentrations result in higher activation of gene expression (Figure 2).
Figure 2: Green fluorescent protein (GFP) fluorescence of an Escherichia coli strain lacking the envZ gene after induction with varying concentrations of mucin in M9 medium, measured at the 14th hour of incubation. The first negative control is the naked strain, harbouring no plasmid. The second negative control carries the OmpC:GFP plasmid (BBa_K4244011). The chimeric two-component system (TCS) strain contains both the OmpC:GFP plasmid and the chimeric Dismed2-EnvZ (BBa_K4244009). The graph shows that with 5 % mucin the fluorescence of the chimeric TCS is significantly higher than with no mucin.
The results also suggest that the chimeric receptor shows leakiness, which is to be expected given that EnvZ-OmpR is a native osmotic-pressure-responsive system in E. coli Nissle 1917. This is shown by the levels of fluorescence in the cells containing only OmpC-GFP (control -) and in the cells containing the complete chimeric TCS in the absence of the ligand (mucin). Therefore, the main future goal for our system is to reduce this leakiness. As a potential solution, we propose the deletion of the two other components of the EnvZ-OmpR system: ompF and ompC. The next step would then be to clone gmk under the OmpC promoter and knock out the native gene in the EcN genome, in order to perform cell death assays with and without mucin.
Biocontainment: temperature kill switch
When our living diagnostic leaves the user’s body through the faeces, it could potentially spread into the wild, representing a risk for the environment. To prevent this, a temperature-dependent kill switch has been implemented as a biocontainment strategy, allowing cell survival only at the permissive condition of 37 °C, the body temperature.
Two temperature kill switches with temperature-dependent tunable promoters, which had previously been successfully demonstrated in the literature [8–10], were tested in EcN as part of our project. In the work of Piraner and colleagues, the temperature kill switch was constructed using the transcriptional auto-repressor TlpA (BBa_K2500004) from the virulence plasmid of Salmonella typhimurium [8]. This protein contains a coiled C-terminal domain that undergoes uncoiling between 37 °C and 45 °C and an N-terminal DNA-binding domain that acquires a dimeric state at lower temperatures and blocks transcription. If tlpA and its promoter are placed upstream of the antitoxin of a toxin-antitoxin system (BBa_K4244019) (Figure 3, left), they cause a decrease in antitoxin expression when the temperature drops, leading to the microorganism’s death [10]. The second kill switch tested was obtained from the work of Stirling and colleagues and contains the temperature-sensitive regulatory region of the cold shock protein A (PcspA) [9]. pcspA (BBa_K4244021) is a constitutive promoter with a high rate of transcription, but its transcript contains a long 5′ UTR of 159 bp that, at 37 °C, acquires an unstable secondary structure that leads to its degradation. At lower temperatures, on the other hand, it forms a stable configuration, which allows translation. In this case, a toxin (ccdB) under the cspA promoter (Figure 3, right) is expressed but counteracted by the antitoxin ccdA (BBa_K4244023), which in turn is expressed from the constitutive LacUV5 promoter (BBa_M36801) [10]. As indicated, at lower temperatures the cspA transcript is stabilised and ccdB expression outweighs that of ccdA, which leads to cell death.
Figure 3: The promoter pTlpA, regulated by TlpA, is upstream of the ccdA antitoxin. When the temperature drops below 37 °C, expression of the antitoxin decreases and the toxin ccdB causes the microorganism’s death (left). The promoter pcspA is upstream of the toxin (ccdB), while the antitoxin (ccdA) is placed after the constitutive LacUV5 promoter. When the temperature decreases, the expression of the toxin increases, leading to the microorganism’s death (right). Adapted from [10].
Both systems containing the temperature-sensitive kill switches were first tested with a reporter gene in EcN. This was done to test whether the results in our microorganism at different temperatures matched those of the literature: a decrease in fluorescence with decreasing temperature for the pTlpA system and an increase in fluorescence with decreasing temperature for the pcspA system. Since both systems showed the correct behaviour with the fluorescent reporter, this gene was replaced with either a toxin or an antitoxin, according to the rationale of each system, in order to perform temperature-sensitive survival assays. On the one hand, the pTlpA system did not show the expected results in our host, even after troubleshooting carried out through our partnership with NanoBuddy, as growth was also observed under non-permissive conditions. On the other hand, the pcspA system showed a dramatic decrease in CFU at temperatures lower than 37 °C, as shown in Figure 4.
Figure 4: The temperature-sensitive survival assay shows that, for the pcspA system with the toxin-antitoxin pair (ccdA-ccdB), no colonies are present on the plates incubated at temperatures below 37 °C, in contrast to the control (the pcspA system with a reporter gene).
Since the pcspA system was shown to work both with a reporter gene and with the toxin-antitoxin kill switch, the next step would be its integration into the EcN genome. This would provide an antibiotic resistance-free system that is not susceptible to plasmid loss, and would decrease the risk of mutants emerging that could also survive under non-permissive conditions. Having the construct in the chromosome of EcN increases the genetic stability of the kill switch, decreasing the possibility of escapers.
“Terminator” kill switch
What if our living diagnostic causes any discomfort to the person who ingests it? What if colorectal cancer has already been detected and it is no longer necessary to report its presence through coloured stool? And how do we make sure that any trace of the Colourectal microorganism abandons the user’s body after the self-test process?
The third biosafety mechanism of our project has been installed to address all these questions by means of another inducible kill switch with two clear states: ON in the presence of inducer, and OFF in its absence. Three components were identified as the most important for the performance of such a kill switch: an inducible promoter/regulator system, an effective killing mechanism, and a way to ensure zero basal levels of killing under the permissive conditions.
Inducible system
A great variety of inducible expression systems driven by different molecules can be found both in the literature and in the iGEM repository (e.g., pBad/araC, BBa_I0500). However, since we aimed to introduce this kill switch circuit in our living diagnostic, which will perform its function in the human colon environment, the search for a suitable system became tricky. The inducer molecule of choice could not be part of the colon, be produced by tumour cells, or be a food component. Moreover, it could not be toxic to humans, the environment, or the colon microbiome.
All these requirements drastically reduced the number of possibilities, but the inducible expression system NahR/Psal, activated by acetylsalicylic acid (aspirin), initially seemed to be a possible solution. Aspirin is not present in food or in the colon, and it is considered safe as it is used for medical purposes. Since it is generally administered as a painkiller, we thought it would be possible to use it as the inducer molecule for the kill switch and advise users of Colourectal to avoid taking aspirin during the self-test process. The part for this inducible expression system (BBa_J61051) was obtained from the iGEM plates to start testing its activity experimentally. However, gastroenterologist Markus Gwiggner strongly advised us not to use aspirin as our inducing molecule, since aspirin is often taken as a blood thinner by our target group.
“You are using it in a population over 55 of which many will have heart disease and be on aspirin [...]. You definitely need to change that”. - Markus Gwiggner, gastroenterologist
Therefore, our design needed to be changed and a different inducible expression system had to be tested. Because we developed a three-level rhamnose inducible system in our project, we decided to include this kill switch in the third step of this induction system. As a proof of concept, we used a rhamnose-inducible system to test the killing mechanism.
Killing mechanism
The method of choice to cause cell death was DNA degradation, controlled by our inducible expression system and elicited by a type-IC Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) device, which belongs to the Class I CRISPR-Cas systems. Unlike the more frequently used Class II systems, which consist of only one effector protein (e.g., Cas9 and Cas12a), the effector complex of Class I systems encompasses a CRISPR RNA (crRNA) and a number of different Cas proteins, and is commonly called Cascade. In type-IC, the Cascade complex has three subunits (Cas5, Cas7 and Cas8) that are responsible for recognising the target DNA and recruiting Cas3, an enzyme with single-strand DNA helicase-nuclease activity able to cleave and degrade target DNA in a processive way (Figure 5) [11,12]. For the Cascade complex to recognise the target sequence, a short 2-4 nucleotide sequence known as the Protospacer Adjacent Motif (PAM) must be found upstream of the sequence. In the case of Type-IC CRISPR-Cas, the most commonly used PAM sequence is 5'-TTC-3' [11].
Figure 5: Schematic of the Type-IC Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) system. One Cas5, one Cas8 and seven Cas7 (yellow, blue, and green respectively), together with the CRISPR RNA (crRNA) (purple), form the Cascade complex, recognise the target DNA and recruit Cas3 (pink). Cas3, thanks to its dual helicase-nuclease activity, is able to cleave and degrade DNA [12].
Type-I CRISPR-Cas systems have proven to be very effective mechanisms for achieving cell death in previous studies. For instance, Caliando and Voigt (2014) demonstrated a highly effective type-IE system for inducible cell killing. However, we decided to use the type-IC CRISPR system from P. aeruginosa since, although it is less well characterised, its mechanism is very similar and it is more compact (3 proteins instead of 5 in the Cascade complex), which would reduce the burden on the cell [11]. The advantage of this system over the more frequently used Cas9 or Cas12a is that the helicase-nuclease activity of Cas3 allows for large genomic deletions, which are assumed to be harder for the cell to repair than simple double-stranded breaks, and hence more lethal [12].
A good spacer targeting EcN’s genome had to be designed to achieve a system with high killing efficiency. To that end, we analysed different features of EcN’s genome and found Repetitive Extragenic Palindromic (REP) elements. REP elements are highly conserved sequences of around 35-40 bp that are found at around 500 positions in the E. coli genome, always in intergenic regions. Their role in the cell is not well known, but they are believed to be involved in translation regulation [14]. With that information, we designed a single spacer to target multiple points in the genome simultaneously (5’-TTGCCGGATGCGGCGTAAACGCCTTATCCGGCCT-3’), expecting to achieve a highly genotoxic response upon induction. The REP element we used as a target occurs 20 times in EcN’s genome with 100 % sequence identity, and in three of those occurrences it is preceded by the necessary PAM sequence (TTC), suggesting that our spacer would target the genome at three or more different loci.
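The kind of search described above (counting exact copies of a target sequence and checking which are preceded by the PAM) can be sketched in a few lines. This is an illustrative sketch, not the pipeline used in the project: the function names are hypothetical, the genome is treated as a linear string (matches spanning the origin of a circular chromosome are not found), and the toy sequences below are invented for the example.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string; non-ACGT symbols are left as they are."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(genome, protospacer, pam="TTC"):
    """List (strand, start, has_pam) for every exact protospacer match.

    `start` is 0-based and, for the "-" strand, refers to the
    reverse-complemented genome. `has_pam` is True when the bases
    directly 5' of the match equal `pam`.
    """
    genome, protospacer = genome.upper(), protospacer.upper()
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        start = seq.find(protospacer)
        while start != -1:
            upstream = seq[max(start - len(pam), 0):start]
            hits.append((strand, start, upstream == pam))
            start = seq.find(protospacer, start + 1)
    return hits


# Toy sequences for illustration only (not EcN genome data).
toy_genome = "AATTCGATTACAGGGGATTACATT"
toy_target = "GATTACA"

hits = find_targets(toy_genome, toy_target)
print(hits)                                   # all exact matches
print(sum(1 for h in hits if h[2]), "match(es) preceded by the PAM")
```

In the toy example the target occurs twice, but only one copy is preceded by TTC, mirroring the distinction made above between all copies of the REP element and the copies that carry the PAM.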
A plasmid, referred to as pCas3cRh, containing the CRISPR array with the designed spacer and the cas genes (cas3, cas5, cas8 and cas7) under the inducible expression system RhaRS/PrhaB was used to test the kill switch efficiency in EcN. We performed this test in M9 medium with 50 mM glucose and different rhamnose concentrations (0, 0.1, 0.5, 1, 3 and 5 mM) (Figure 6).
As can be observed, our inducer hindered cell growth in a dose-dependent manner. By inducing our kill switch with high concentrations of rhamnose (3 and 5 mM), we significantly impaired the growth of EcN, which was effectively inhibited for 15 h. The growth observed after that period most likely resulted from the emergence of escapers able to circumvent the CRISPR-Cas mechanism [12].
Figure 6: Plate reader growth experiment conducted over 24 h. Escherichia coli Nissle 1917 (EcN) containing pCas3cRh with the designed spacer targeting a Repetitive Extragenic Palindromic (REP) sequence was tested at different rhamnose concentrations (0, 0.1, 0.5, 1, 3 and 5 mM). The positive control consisted of wild-type EcN in M9 medium with 50 mM glucose.
Basal expression correction
Inducible expression systems are known to have some level of basal expression (leakiness) in the absence of an inducer molecule. In some cases this does not represent a problem, but in our case even a very low basal expression could result in genotoxicity, leading to undesired cell death and possible mutations in the Cas proteins, the spacer or the inducible expression system [12]. To overcome this obstacle, we are currently working on the implementation of a constitutively expressed anti-CRISPR (Acr) protein to block the effect of any basal Cas3 expression [15]. Our aim is to increase the stability of our diagnostic tool under permissive conditions, allowing Cas3 to carry out its activity only when the inducer is provided. Our efforts currently focus on testing AcrC1 and AcrE1, which we envision could prevent our Type-IC CRISPR-Cas system from cleaving the host’s genome when unwanted, thereby improving the stability of the kill switch and the characteristics of our product [16,17].
Three step inducible system

In our project there are three systems that should be induced by an external metabolite. However, inducers that are safe for human consumption, not present in the colon and compatible with Escherichia coli Nissle 1917 (EcN) are limited. Therefore, we have built a pipeline to control three separate genetic circuits at three different levels of the same inducer metabolite: the “three-step inducible system”. Here, we created the genetic circuit to enable this, and we tested several inducers for their suitability for this system.

Inducible genetic circuits

The first step in this system is the introduction of a positive control, which is needed to see whether our living diagnostic has colonised the colon. In our project we already make use of colouring the stool blue. Therefore, we introduced a genetic circuit in which we can induce the production of a pink chromoprotein to see if the bacteria are present during the diagnostic test. Further, we created an inducible kill switch based on a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) system. This circuit will be used to make sure that none of our modified bacteria are able to colonise the colon after the test is done, and to serve as an emergency kill switch to stop the test if needed. This circuit was placed on the third step of the three-step inducible system, because it should only be triggered after the living diagnostic has completed the test. Lastly, our initial idea contained a localisation circuit, a system that made use of the fact that the living diagnostic cells accumulate around tumour cells. These cells would be made visible on a scan by inducing the production of a visualisation protein. This would enable a medical professional to confirm the presence of a tumour and localise it. However, by talking to stakeholders, we concluded that a localisation system would not work with currently available techniques. If a suitable visualisation technique were to be developed, this system could nevertheless increase the sensing specificity of our living diagnostic. The second step in the three-step inducible system is therefore reserved for this purpose.

The three-step inducible system was intended to be used for the steps mentioned above. However, we made it as modular as possible so it can be coupled to different genetic circuits and introduced into other chassis.

Genetic design for three inducible steps
The system is based on the binding efficiency of deactivated Streptococcus pyogenes Cas9 (dSpyCas9). dSpyCas9 is based on SpyCas9, which is a protein that is part of the self-defence mechanism of bacteria against bacteriophages in nature. When a bacterium survives an infection by a bacteriophage, it will save part of the attacker’s genetic sequence in a library called CRISPR. From these small sequences guides are created, which are used by the Cas protein to recognise the genetic sequence of the bacteriophage in a future infection. It does this by checking the DNA constantly for PAM regions, a sequence characteristic for bacteriophage DNA. The protein then matches the sequence next to the PAM to the guide and if a match is made, cuts the bacteriophage DNA [1]. In genome editing these guides can be used to target specific sequences [2]. dSpyCas9 is a non-cutting version of SpyCas9 and will not cut the specific target sequence, but only bind there [3]. This way, the dSpyCas9 protein obstructs other proteins that would normally bind at that location, for example replicases or nucleases. The binding strength of dSpyCas9 can be influenced by changing the sequence of the guide. The guide contains a spacer, a sequence that binds specifically to the target sequence. In this project three spacers called B1, t5 and P1 [4,5] were found for which the decline in binding efficiency, due to decreased length of the spacer, was known (Table 1).
Table 1: Spacers with a known decrease in dSpyCas9 binding efficiency when made shorter.
Spacer binding efficiency    B1                      t5                      P1
Spacer sequence              tctagatttctcctctttaa    atggatacctataatggttc    ttgacagctagctcagtcct
100%                         20 matching bases       20 matching bases       20 matching bases
50%                          9 matching bases        10 matching bases       9 matching bases
20%                          8 matching bases        7 matching bases        7 matching bases
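For readers who want to work with Table 1 programmatically, the sketch below encodes it as a small lookup. The sequences and base counts are copied from Table 1. The names SPACERS, MATCHING_BASES and shortened_spacer are hypothetical. The choice to shorten a spacer by dropping bases from its 5' end, keeping the 3' (PAM-proximal) end, is an assumption made for illustration, because the text does not state which end was shortened.

```python
# Table 1 as data: full-length spacer sequences and the number of matching
# bases listed for each relative dSpyCas9 binding efficiency (in percent).
SPACERS = {
    "B1": "tctagatttctcctctttaa",
    "t5": "atggatacctataatggttc",
    "P1": "ttgacagctagctcagtcct",
}

MATCHING_BASES = {
    "B1": {100: 20, 50: 9, 20: 8},
    "t5": {100: 20, 50: 10, 20: 7},
    "P1": {100: 20, 50: 9, 20: 7},
}


def shortened_spacer(name: str, efficiency_percent: int) -> str:
    """Return the spacer shortened to the length listed in Table 1.

    ASSUMPTION: bases are removed from the 5' end, so the 3' end is kept.
    The source does not say which end was shortened.
    """
    length = MATCHING_BASES[name][efficiency_percent]
    return SPACERS[name][-length:]


if __name__ == "__main__":
    for name in SPACERS:
        for efficiency in (100, 50, 20):
            print(name, efficiency, shortened_spacer(name, efficiency))
```

Keeping the table in one place like this makes it harder for the spacer lengths used in later experiments to drift away from the values in Table 1.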
A three-step inducible system was created using these spacers. It is based on the hypothesis that more dSpyCas9 protein is needed to repress a gene with a weaker-binding spacer than with a stronger-binding spacer [3]. To create three steps, the amount of dSpyCas9 must therefore be controlled [6]. This was achieved by placing an inverter between an inducer-controlled promoter and dSpyCas9. For this inverter, the cI lambda repressor (BBa_C0051) was used, a protein that binds to the operator regions of the cI-regulated pR promoter (BBa_R0051) [7]. Because of this inverter, the amount of dSpyCas9 decreases when the concentration of inducer goes up (Figure 1). Additionally, the cI protein was fused to a degradation tag to prevent cI from accumulating.
Figure 1: Genetic construct for the three-step inducible system. The system consists of an inducible promoter, a cI repressor protein that represses a pR promoter, deactivated Streptococcus pyogenes Cas9 (dSpyCas9), a spacer plasmid with three single guide RNAs (sgRNAs) with different binding capacities, and three separate genetic circuits that are targeted by dSpyCas9. An increase in inducer metabolite leads to a decrease in dSpyCas9 through the cI repressor. dSpyCas9 represses each genetic circuit by using an sgRNA to bind to a specific target region introduced between the promoter and the first gene of each circuit. As dSpyCas9 itself is repressed, the silencing of each genetic circuit weakens until, at some point, the circuit is expressed. Additionally, the weaker the binding efficiency of the sgRNA used by dSpyCas9, the more dSpyCas9 protein is needed to repress that gene. Therefore, when the repression of dSpyCas9 increases, genetic circuits repressed by a weaker-binding sgRNA should be expressed earlier than those repressed by a stronger-binding sgRNA. In this way, three steps that are expressed at three distinct concentrations of one inducer metabolite were created [3,6].
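The logic of Figure 1 can be illustrated with a toy steady-state model. This is a minimal sketch and not the model used in the project. The functional forms (simple saturation and Hill-type repression), all function names and every parameter value are arbitrary assumptions chosen only to show how one inducer could switch on three circuits at three different concentrations.

```python
# Toy steady-state model of the double-inversion circuit:
# inducer -> cI -> dSpyCas9 -> target gene.
# ALL parameter values are arbitrary placeholders, not measurements.


def ci_level(inducer: float, k_ind: float = 100.0) -> float:
    """Relative cI level: rises with inducer and saturates."""
    return inducer / (k_ind + inducer)


def dcas9_level(ci: float, k_ci: float = 0.1, hill: float = 2.0) -> float:
    """Relative dSpyCas9 level: repressed by cI (the inverter)."""
    return 1.0 / (1.0 + (ci / k_ci) ** hill)


def target_expression(dcas9: float, k_bind: float) -> float:
    """Relative expression of a circuit repressed by dSpyCas9.

    A larger k_bind stands for a weaker-binding (shorter) spacer, so more
    dSpyCas9 is needed to reach the same repression.
    """
    return 1.0 / (1.0 + dcas9 / k_bind)


def switch_on_concentration(k_bind, inducer_grid, threshold=0.5):
    """First inducer value on the grid where expression reaches threshold."""
    for inducer in inducer_grid:
        expression = target_expression(dcas9_level(ci_level(inducer)), k_bind)
        if expression >= threshold:
            return inducer
    return None  # never switched on within the grid


if __name__ == "__main__":
    grid = [float(x) for x in range(0, 1001, 5)]
    # Hypothetical binding constants for a weak, intermediate and strong spacer.
    for label, k_bind in [("weak", 0.5), ("intermediate", 0.1), ("strong", 0.02)]:
        print(label, switch_on_concentration(k_bind, grid))
```

In this sketch the circuit behind the weakest-binding spacer switches on at the lowest inducer concentration and the one behind the strongest-binding spacer at the highest, which is the ordering the hypothesis predicts. Whether the real system separates the steps this clearly depends on parameters that the toy model does not capture.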
Rhamnose induction controlling dSpyCas9 production
This system was tested by fusing luciferase to dSpyCas9 (BBa_K4244031) and measuring the construct at several time points and at different concentrations of an inducer molecule. We hypothesised that at a higher inducer concentration the amount of cI would increase and the amount of dSpyCas9 would go down. This requires a titratable inducer system, in which more inducer leads to higher expression. We chose the inducible expression system RhaRS/PrhaB (BBa_K4244027), because two knockouts are known in E. coli that increase the titratability of this system [8]. These knockouts prevent the cells from transporting and catabolising rhamnose, so that all rhamnose that diffuses into the cells is available to the promoter system. The resulting strain is E. coli ∆rhaB ∆rhaT. Both knockouts were made in EcN, yielding EcN ∆rhaB ∆rhaT. This strain was tested by growing it in M9 medium with the rhamnose concentrations used in the paper [8], as well as with higher concentrations. Under these conditions, only the cells without the knockouts should grow. We observed that up to a rhamnose concentration of 25000 μM (25 mM), the knockout cells grew substantially less than wild-type EcN cells. From this result it can be concluded that the knockouts from the paper work in our strain and should result in a more titratable induction.
Figure 2: Growth experiment with the wild-type Escherichia coli Nissle 1917 (EcN) strain and the knockout EcN strain. Both were grown from an OD600 of 0.10 at increasing concentrations of rhamnose, up to 25 mM. Additionally, both strains were grown with 50 mM of glucose and no rhamnose.
After this positive result, the following experiments were performed in the knockout strain. First, the luciferase experiment was performed with this knockout EcN strain [8]. The amount of luciferase decreased with increasing amounts of rhamnose (Figure 3).

The experiment was done with rhamnose concentrations of 0 to 500 μM [8]. 90 minutes after induction a trend was observed in which luciferase production decreased as the rhamnose concentration increased (Figure 3a). Because of the time it takes to add and mix the substrate, differences in production levels can occur between biological triplicates. Nevertheless, a trend can be observed, especially from 0 to 100 μM of rhamnose. Therefore, the experiment was repeated with only the lower concentrations of rhamnose (0 to 125 μM). This experiment showed a trend similar to the first (Figure 3b). These results show that the amount of dSpyCas9 can be decreased by increasing the amount of rhamnose.

Figure 3: Rhamnose dose-response curve for the production of luciferase. The experiment was repeated on two days in the same machine. 3a: First day, the strain was exposed to increasing rhamnose concentrations up to 500 μM. 3b: Second day, the strain was exposed to increasing concentrations up to 125 μM, to examine the system further.
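A dose-response trend such as the one in Figure 3 can be summarised from triplicate readings with a few lines of code. The sketch below is one possible way to do this and is not the analysis used for the figure. The function names are hypothetical and the luminescence values in the example are made-up placeholders, not the measured data.

```python
from statistics import mean, stdev


def summarise(replicates_by_conc):
    """Return (concentration, mean, sample SD) tuples sorted by concentration."""
    return [
        (conc, mean(values), stdev(values))
        for conc, values in sorted(replicates_by_conc.items())
    ]


def is_monotonic_decrease(summary):
    """True if each mean is lower than the mean at the previous concentration."""
    means = [m for _, m, _ in summary]
    return all(later < earlier for earlier, later in zip(means, means[1:]))


if __name__ == "__main__":
    # PLACEHOLDER luminescence values (arbitrary units), NOT the measured data.
    # Keys are rhamnose concentrations in micromolar, values are triplicates.
    example = {
        0: [980.0, 1010.0, 1040.0],
        50: [700.0, 760.0, 730.0],
        100: [520.0, 480.0, 500.0],
    }
    table = summarise(example)
    for conc, avg, sd in table:
        print(f"{conc} uM: mean={avg:.1f}, sd={sd:.1f}")
    print("monotonic decrease:", is_monotonic_decrease(table))
```

Reporting the standard deviation next to each mean makes the replicate-to-replicate variation mentioned above visible alongside the trend.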
Fluorescent protein construct targeted by dSpyCas9
In the second part of the system the three spacers mentioned above play a role. The P1 spacer was used to test whether the binding efficiency of the spacers influences the system. Three plasmids were designed, each containing a monomeric red fluorescent protein (mRFP) (BBa_K4244035) and a P1 spacer of a different length. Every spacer targeted a specific sequence introduced between mRFP and its promoter. The goal was to produce mRFP at three different concentrations of an inducer metabolite, depending on the length of the spacer. As mentioned above, the length of the spacer influences the binding efficiency. We therefore hypothesised that more dSpyCas9 is needed to repress a gene targeted by a weakly binding guide, so that less repression of dSpyCas9 is needed before this mRFP is produced. Consequently, the longer the spacer, the more repression of dSpyCas9 by the inducer molecule is needed to produce the protein. To evaluate this system, a plate reader experiment was performed to measure the mRFP production at different rhamnose concentrations. The system was built on plasmids with a medium copy number, and the fluorescent proteins were expressed from medium-strength promoters and RBS regions. The strains were induced at an OD of 0.08 and grown over time with different concentrations of rhamnose (Figure 4).
Figure 4: Three separate knockout Escherichia coli Nissle 1917 strains were created, each containing the double repression plasmid and a plasmid containing monomeric red fluorescent protein and a P1 spacer of a different length. Furthermore, a positive control containing both plasmids but with a non-targeting spacer was added. This measurement was obtained in the same machine and with the same experimental procedure, but on a different day.
From Figure 4, it can be seen that with the sgRNA carrying the shortest spacer of 7 nucleotides (nt), there is no or very little repression of the fluorescent gene. The binding efficiency of this spacer is therefore negligible, and in future experiments with the P1 spacer a longer spacer should be used to create a threshold system. In the experiments with the spacers of 9 and 20 nt, however, very low production of mRFP can be seen, meaning that dSpyCas9 managed to repress these genes. Additionally, at a rhamnose concentration of 250 μM the fluorescence increases by around 400%, for both the 9 nt and the 20 nt spacer. At this point the amount of dSpyCas9 is so low, following the repression by the first part of the system, that the fluorescent protein can be produced. According to our hypothesis, the strain containing the 9 nt spacer should already produce the fluorescent protein at a lower rhamnose concentration. This is not yet the case: both strains start producing mRFP between 250 and 500 μM. A possible explanation is that dSpyCas9 is very stable, so that only a small amount of dSpyCas9 is needed to repress the system. The fluorescent protein will then only be produced at very low levels of dSpyCas9, which are reached beyond 250 μM of rhamnose. Below that concentration, there is so much dSpyCas9 that the repression by cI has almost no effect.
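As a reading aid for the "around 400%" figure: an increase of 400% means the signal becomes five times its starting value, not four times. The short sketch below shows the arithmetic. The function names are hypothetical and the input values are placeholders, not plate reader data.

```python
def percent_increase(before: float, after: float) -> float:
    """Increase of `after` relative to `before`, in percent."""
    return (after - before) / before * 100.0


def fold_change(before: float, after: float) -> float:
    """Ratio of `after` to `before`."""
    return after / before


if __name__ == "__main__":
    # PLACEHOLDER fluorescence values (arbitrary units), NOT measured data.
    baseline, induced = 100.0, 500.0
    print(percent_increase(baseline, induced))  # 400.0
    print(fold_change(baseline, induced))       # 5.0
```

Stating both numbers when reporting plate reader results avoids confusion between "increases by 400%" and "increases to 400%".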
Further regulation of intracellular dSpyCas9
A future approach to create a system that reacts at different concentrations of rhamnose could be to decrease the overall amount of dSpyCas9. A smaller degree of repression might then already lead to an effect. Two approaches were considered. One is to decrease the amount of protein produced, e.g. with a lower copy number plasmid. The other is to decrease the stability of dSpyCas9 by fusing it to a degradation tag, so that more dSpyCas9 has to be produced to achieve the level of repression observed in the current system. For the latter, the current design could be reused by replacing the luciferase that is fused to dSpyCas9 with a degradation tag. With these two approaches the two peaks from Figure 4 should shift away from each other, creating distinct steps on which the three separate genetic systems can be placed.
Additionally, in future research the two other spacers, t5 and B1, could be tested to find the most suitable one for each step. The same experiments could then be repeated with a library of randomly generated spacers targeting a specific sequence between the fluorescent protein and its promoter. These can contain mismatches and length differences. From these, the most suitable ones could be selected to increase the dynamic range of the system and create more distinct concentration levels at which the separate steps of the system are turned on.
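A spacer library of the kind proposed above could be generated in silico before ordering oligos. The following is a minimal sketch under stated assumptions: the function names, the default limits (minimum length of 7 nt, at most 3 mismatches) and the choice to shorten spacers from the 5' end are all hypothetical and not taken from the project. Only the P1 sequence comes from Table 1.

```python
import random

BASES = "acgt"
P1 = "ttgacagctagctcagtcct"  # full-length P1 spacer from Table 1


def truncate(spacer, length):
    """Keep the last `length` bases (ASSUMPTION: shorten from the 5' end)."""
    if not 1 <= length <= len(spacer):
        raise ValueError("length must be between 1 and the spacer length")
    return spacer[-length:]


def mutate(spacer, n_mismatches, rng):
    """Replace `n_mismatches` randomly chosen positions with a different base."""
    positions = rng.sample(range(len(spacer)), n_mismatches)
    bases = list(spacer)
    for pos in positions:
        bases[pos] = rng.choice([b for b in BASES if b != spacer[pos]])
    return "".join(bases)


def make_library(spacer, size, min_length=7, max_mismatches=3, seed=0):
    """Build `size` unique variants with random length and mismatch count.

    `size` should be small compared with the number of possible variants,
    otherwise the loop may take long to find enough unique sequences.
    """
    rng = random.Random(seed)  # fixed seed so the library is reproducible
    library = set()
    while len(library) < size:
        length = rng.randint(min_length, len(spacer))
        n_mismatches = rng.randint(0, min(max_mismatches, length))
        library.add(mutate(truncate(spacer, length), n_mismatches, rng))
    return sorted(library)


if __name__ == "__main__":
    for variant in make_library(P1, size=10):
        print(len(variant), variant)
```

The fixed seed keeps the library reproducible between runs, which helps when the same variants have to be matched to plate reader results later.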
Testing inducer systems for titratability
The three-step system described above should be inducible at different levels instead of being constitutively expressed. One reason is that the user of our living diagnostic should be able to activate each genetic circuit at a specific time, with all circuits responding to the same metabolite. In this way, different systems can be activated while the user only has to consume more of the same metabolite. Additionally, constant expression of any of the pathways would be too much of a burden to the bacterial cells. The metabolite used for activation should not be toxic to humans at the concentrations needed to activate the pathways. Several inducer systems were assessed, all with different inducers, allowing us to select the most suitable candidate for our purpose. As described before, rhamnose was selected as one of the inducer metabolites. The other metabolites, each inducing its own promoter system, were 3,4-dihydroxybenzoic acid (DHBA), choline chloride, naringenin, vanillic acid, p-isopropylbenzoate (cumate) and propionate. DHBA is a naturally occurring benzoic acid found in several foods such as fruits, nuts and vegetables, and has been found to have anti-carcinogenic properties [9]. Choline chloride is a dietary component that is important for many cellular processes and is widely used as an additive in animal feed. It has a low toxicity for humans, is not mutagenic and has no toxic effects on development [10]. Naringenin is a flavourless, colourless flavonoid found in a variety of fruits and herbs. It is being researched for several biological effects, including an anticancer effect through inhibition of tumour growth [11, 12]. Vanillic acid is a constituent of vanilla, commonly used as a flavouring agent. It has been found to improve the condition of patients with both cancer and obesity [13]. p-Isopropylbenzoate, also called cumate, is used as a preservative and is thus safe for consumption. Choi et al. developed a tightly regulated, titratable gene expression system using this inducer [14]. The same was achieved with a promoter system induced by propionate, which is also used as a food preservative [15].
Figure 5: The cumate inducible promoter system was tested at cumate concentrations ranging from 10 to 102 μM. The luminescence was measured in a plate reader 3 hours after induction.
Each of these inducers corresponds to a promoter system that was placed on a low copy number plasmid to mimic genomic production [14–16]. The plasmids were designed in such a way that the results could be compared with the results of the three-step system described above. These inducible promoters drive the production of the cI repressor, which then represses luciferase production. An increasing metabolite concentration should thus cause a lower luciferase signal.
In Figure 5, the result of one of the inducer experiments, with cumate, can be seen. EcN cells with the corresponding plasmid were induced with cumate and then measured in a plate reader. The figure shows the measurement 3 hours after induction. At first glance the luminescence seems to decrease when the cumate concentration increases, but given the error bars no significant conclusions can be drawn. Other inducer systems gave similar results, or did not show any luminescence. This means that either the tested inducer systems were not titratable, or that something in the plasmid design prevented them from being titratable. Either way, the rhamnose system tested before is the best option for now. More research should be done on this subject before the best titratable, inducible promoter system can be chosen for our living diagnostic.
Conclusion
With this project we created a pipeline that can potentially be used to express three separate genetic circuits at three different levels of one inducer metabolite. We demonstrated that we can turn on the system by increasing the inducer metabolite concentration in the medium. Additionally, we formed a hypothesis for future research on how to create three levels at which the system turns on. Rhamnose was chosen as the inducer metabolite because two known knockouts increase the titratability of the inducer system. These knockouts were successfully made and tested in EcN. We also tested other inducer systems for their titratability and thus their suitability for our system. The tested inducer systems did not show titratable behaviour in our experiments, and we therefore concluded that rhamnose is our best option for now. This system is not host dependent and the steps are not dependent on a specific genetic circuit. Therefore, it is modular at two levels and helps to increase the modularity of our living diagnostic.