Introduction
As the iGEM teams of DTU Denmark and TU Delft, we formed a partnership to benefit both projects by collaborating on a scientific level. We supported each other’s research by exchanging knowledge, skills and work, as well as frequently meeting to discuss the projects.
Team Fun4al (DTU-Denmark) is developing a strain of Aspergillus niger capable of detecting and degrading a toxic compound, furfural, to improve A. niger growth on lignocellulosic waste. To do this, the team had to develop synthetic transcription factors (TFs) to detect furfural and subsequently activate enzymes converting furfural to non-toxic compounds.
Team SPYKE (TU Delft) is developing a novel bioelectronic sensor for detecting GHB, a common date-rape drug, in drinks to warn the user as soon as possible that their drink has been spiked. The research involves assembling the sensor by immobilizing biomolecules on a gold electrode, testing that sensor, and optimizing the individual parts of the system using various approaches.
Figure 1. Overview of the projects Fun4al and SPYKE, from DTU and TU Delft respectively, their differences, and their overlap.
Although these projects are entirely different in terms of application, there is an important overlap that our partnership is based on: both projects involve a TF protein as part of the mechanism of action. Team Fun4al is using the synthetic TF, sBAD, to activate the expression of furfural-degrading genes in response to furfural. Team SPYKE uses the BlcR protein from Agrobacterium tumefaciens as the central part of their capacitive GHB sensor, in which it dissociates from immobilized strands of DNA when GHB is present, resulting in a change in capacitance. Both teams also aim to engineer their TF to optimize the affinity or specificity for their application. Because of this shared objective, we decided it would be valuable to support each other in TF-focused research.
This shared wiki page was created by both teams to document our partnership. The main reason for making a shared page was to allow each team to write up the parts and results it had worked on itself. In other words, the Fun4al team was better placed to explain the bioinformatic work behind the results produced for the SPYKE team, and the SPYKE team could better explain the laboratory work they offered the Fun4al team and the issues they encountered.
Timeline
July
First meeting
We discussed the projects of the two teams, the different strategies we use to modify our TFs and which skills and equipment we could provide for each other. Importantly, this meeting shaped the plan and work distribution between us, as outlined above.
Docking results
To verify whether SSA can be used as an adequate analog for GHB, team Fun4al docked both ligands to a crystal structure of BlcR (PDB ID: 3MQ0) using Rosetta software.
The binding site of BlcR was predicted using P2Rank, a tool based on a random forest algorithm trained on datasets such as CHEN11, JOINED, COACH420, and HOLO4K [2]. The result shows that residues T101, Y112, F120, T131, D183, I187, C193, A210, and S212 lie on the surface of the predicted BlcR binding pocket. Except for A210, all of these residues can potentially form ionic or hydrogen bonds with the ligands. This information was used as input for docking with Rosetta. For each ligand, the docking result with the lowest binding energy was chosen for visualization, and the two were compared with each other as shown in Figure 5. All side chains within a radius of 4 Ångström of the ligand that make polar contact with it are shown as sticks.
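The 4 Ångström selection can be illustrated with a minimal sketch. It assumes atom coordinates are already available as (x, y, z) tuples; the function name `residues_within` and the coordinates are hypothetical and are not taken from the 3MQ0 structure. It only covers the distance criterion, not the polar-contact check.

```python
import math

def residues_within(ligand_atoms, residue_atoms, cutoff=4.0):
    """Return labels of residues with at least one atom within `cutoff` Å of any ligand atom."""
    close = []
    for label, atoms in residue_atoms.items():
        if any(math.dist(atom, lig) <= cutoff for atom in atoms for lig in ligand_atoms):
            close.append(label)
    return sorted(close)

# Hypothetical coordinates in Å, for illustration only (not from PDB 3MQ0).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residues = {
    "T101": [(3.0, 0.0, 0.0)],     # 1.5 Å from the second ligand atom
    "Y112": [(0.0, 3.9, 0.0)],     # 3.9 Å from the first ligand atom
    "A210": [(10.0, 10.0, 10.0)],  # far from the ligand
}
print(residues_within(ligand, residues))
```

In practice a molecular viewer performs this selection directly; the sketch only makes the distance criterion explicit.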
Both SSA and GHB form a hydrogen bond with T133 at the carboxylic acid side. For SSA, the aldehyde group forms hydrogen bonds with Y112 and T101. For GHB, the alcohol group forms a hydrogen bond with S212. This difference arises because double bonds are shorter than single bonds, which in turn affects how the ligand binds.
Figure 5. Docking of SSA and GHB, respectively. The dotted yellow lines represent hydrogen bonds. Binding energy: -14.232 REU and -14.544 REU.
Figure 6. Violin plot of the binding energy of GHB and SSA.
In Figure 6, the violin plot visualizes the distribution of the binding energy term for each ligand. A thousand different predictions were generated for each ligand to compare their docking behaviour. Qualitatively, the distributions for SSA and GHB look very similar. Quantitatively, a two-sample t-test on the difference between GHB and SSA gives a p-value of 9.261 × 10^-4.
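A comparison of this kind can be sketched as follows. This is not the teams' actual analysis: it assumes Welch's unequal-variance form of the two-sample t-test, approximates the p-value with the normal distribution (reasonable for about a thousand samples per group), and uses simulated scores with hypothetical means and spread in place of the real Rosetta output.

```python
import math
import random
import statistics

def welch_t_test(a, b):
    """Welch's two-sample t statistic with a normal-approximation two-sided p-value.

    Returns (t, p, difference in means). The normal approximation is an
    assumption that is only appropriate for large samples.
    """
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    t = (mean_a - mean_b) / standard_error
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided tail probability
    return t, p, mean_a - mean_b

# Simulated binding energies in REU; the means and spread are hypothetical.
rng = random.Random(0)
ghb_scores = [rng.gauss(-12.0, 1.0) for _ in range(1000)]
ssa_scores = [rng.gauss(-11.9, 1.0) for _ in range(1000)]

t, p, diff = welch_t_test(ghb_scores, ssa_scores)
print(f"t = {t:.3f}, p = {p:.3g}, mean difference = {diff:.3f} REU")
```

With a thousand predictions per ligand, even a small difference in mean binding energy can yield a small p-value, so the sketch also returns the difference in means to allow the size of the effect to be judged alongside the p-value.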
Although the lowest-energy prediction for SSA does not have the same conformation as that for GHB, SSA can still adopt a conformation matching that of GHB, at a higher binding energy that is not substantially different from that of GHB. Therefore, SSA and GHB are likely to interact with BlcR in a similar way.
August
Discussing transcriptomics results
We got together to discuss the first results of the transcriptomics research, with the goal of seeing if there are any operators in A. tumefaciens that are affected by the addition of SSA and/or GHB. Team Fun4al presented their findings, after using their software to process transcriptomics data available from literature [1].
From the high-throughput transcriptomics data available for A. tumefaciens cultured in the presence of GHB, GABA, or succinic acid (as reference), the differentially expressed genes (DEGs) were extracted. The DEGs were mapped to determine which genes are significantly up- or downregulated by GHB and GABA. This analysis revealed that the genes that are significantly upregulated by GABA are limited to the ones in the blc operon, confirming our previous knowledge. We decided together to adjust the experiments and try again, in order to look for more relevant results. We decided to consider all DEGs for GHB, instead of limiting it to those occurring for both GHB and GABA.
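The two selection strategies described above can be sketched as follows. The cutoffs (an absolute log2 fold change of at least 1 and an adjusted p-value below 0.05), the function name `select_degs`, and the gene names and values are all hypothetical; they are not the thresholds or data used by team Fun4al.

```python
def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split genes into up- and downregulated sets.

    `results` maps gene name -> (log2 fold change vs. reference, adjusted p-value).
    Both cutoffs are illustrative assumptions.
    """
    up, down = set(), set()
    for gene, (log2_fc, padj) in results.items():
        if padj >= padj_cutoff:
            continue  # not significant
        if log2_fc >= lfc_cutoff:
            up.add(gene)
        elif log2_fc <= -lfc_cutoff:
            down.add(gene)
    return up, down

# Hypothetical results relative to the succinic acid reference.
ghb = {"gene_1": (3.2, 0.001), "gene_2": (1.8, 0.01),
       "gene_3": (-2.1, 0.03), "gene_4": (0.2, 0.6)}
gaba = {"gene_1": (2.9, 0.002), "gene_2": (0.3, 0.4),
        "gene_3": (-0.1, 0.9), "gene_4": (0.1, 0.8)}

ghb_up, ghb_down = select_degs(ghb)
gaba_up, gaba_down = select_degs(gaba)

# First strategy: only genes that respond to both GHB and GABA.
shared = (ghb_up | ghb_down) & (gaba_up | gaba_down)
# Adjusted strategy: every gene that responds to GHB.
all_ghb = ghb_up | ghb_down

print(sorted(shared))
print(sorted(all_ghb))
```

Dropping the intersection widens the candidate list, which is the point of the adjusted strategy.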
Solidifying our partnership
During the same meeting, we agreed it would be interesting if team SPYKE used ITC to characterize sBAD for team Fun4al. For this, team Fun4al would need to produce and purify sBAD and ship samples as well as synthetic oligonucleotides and a solution of furfural to Delft. We agreed to set up a meeting to discuss the protein expression and purification, so team SPYKE could offer advice based on their experience with protein expression and purification. We also discussed the possibility of forming an official partnership for the first time.
New transcriptomics results
Fun4al were still not able to find any other promoters that appeared to be up- or downregulated by GHB within the A. tumefaciens C58 genome. Following this meeting, the Fun4al team decided to change strategy before the next meeting.
September
NCBI Blast results
Since the Fun4al team did not find variants of the BlcR binding site within the A. tumefaciens C58 genome, they decided to test different NCBI nucleotide BLAST options to search for the binding site in other organisms. They found 25 sequences, mainly from the same species, that matched the BlcR binding site from A. tumefaciens C58 to a certain degree (Figure 7). All of these sequences are located near genes seemingly involved in succinic acid metabolism. Often, the first gene downstream of the sequences was annotated as NAD-dependent succinyl-semialdehyde dehydrogenase, which is the same type of gene that is regulated by BlcR. This suggests that the 25 sequences are evolutionary homologs of the blc operator.
Figure 7. 25 sequences similar to the BlcR operator from A. tumefaciens C58 (AE007872.2) were found with a nucleotide BLAST search on NCBI (searching ‘somewhat similar sequences’ with a word size of 7). Left: an unrooted tree of the sequences with the NCBI accession codes at the tips. Right: the multiple alignment of the sequences coloured by percentage identity, with a normalized logo plot and the consensus sequence underneath. The newly observed inverted repeats (IR1, IR2, IR3) are outlined. The sequences were aligned using MAFFT and visualized with Jalview. The tree was generated using TreeHugger 0.5 and visualized using FigTree v1.4.4.
The sequences seemed to divide into two different groups. Remarkably, a normalized logo plot combining all of the sequences revealed new inverted repeats that were different from what has been reported to be important for BlcR binding in literature [3] [4]. These groups of inverted repeats, mainly IR2 and IR3, seemed to be relatively conserved as opposed to the inverted repeats reported previously.
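How a consensus sequence is read off an alignment can be shown with a minimal sketch. It assumes a simple per-column majority vote that ignores gaps; Jalview's own consensus calculation may treat gaps and ties differently. The function name `consensus` and the aligned sequences are hypothetical and are not the operator sequences from Figure 7.

```python
from collections import Counter

def consensus(aligned, gap="-"):
    """Majority-vote consensus of equal-length aligned sequences, ignoring gaps.

    Ties are resolved in favour of the base seen first in the column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != gap)
        # A column consisting only of gaps stays a gap in the consensus.
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)

# Hypothetical alignment, for illustration only.
alignment = ["ACGT-A", "ACGTTA", "ACCTTA"]
print(consensus(alignment))
```

A logo plot carries more information than this, since it also shows how strongly each position is conserved.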
As DNA-binding proteins are commonly known to interact with inverted repeats, this result sparked our curiosity: could it be that the consensus sequence of the 25 sequences would provide stronger binding for BlcR?
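An inverted repeat is a stretch of DNA followed, after an optional spacer, by its own reverse complement. A minimal sketch of how such repeats can be located is given below; the arm length, the maximum spacer, the function names, and the example sequence are hypothetical choices and do not correspond to IR1, IR2, or IR3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_inverted_repeats(seq, arm=4, max_spacer=6):
    """Find arm pairs where the right arm is the reverse complement of the left arm.

    Returns (start of left arm, left arm, spacer length, right arm) tuples.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer  # start of the candidate right arm
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, left, spacer, seq[j:j + arm]))
    return hits

# Hypothetical sequence containing the inverted repeat AACT ... AGTT.
for hit in find_inverted_repeats("GGAACTTTTAGTTCC"):
    print(hit)
```

Exact matching is used here for simplicity; repeats in real operator sequences are often imperfect, so a practical search would typically allow mismatches.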
Following this meeting, Fun4al suggested five variations of the BlcR binding site based on the results presented above. The SPYKE team chose to investigate the consensus sequence and CP049218.1, a sequence belonging to the group of operators that seemed to differ from the WT operator from A. tumefaciens C58 (AE007872.2).
Dropping ITC
Due to trouble with protein expression on the side of team Fun4al, as well as trouble with ITC on the side of team SPYKE, we came to the shared conclusion that it wouldn’t be possible for team SPYKE to characterize sBAD using ITC. Even if it had been possible to produce, purify, and ship samples of sBAD in time, there was a chance that the ITC measurements would come back inconclusive, and team SPYKE would not have had the time or materials to troubleshoot. To use time and manpower more efficiently, we agreed to call off the protein production and ITC experiments.
October
Testing alternative BlcR binding sequences
Because of the same trouble with ITC, team SPYKE wasn’t able to use it to characterize binding between BlcR and the alternative DNA sequences found by team Fun4al. We therefore moved to an electrophoretic mobility shift assay (EMSA) as an alternative method for observing protein-DNA binding. A major advantage of EMSA over ITC is that multiple DNA sequences can be evaluated at the same time; the drawback is that the results are not quantitative enough to derive thermodynamic properties from. Still, we were able to compare different DNA sequences to each other and to the wildtype, and to estimate whether the binding is stronger or weaker.
The results showed that a number of variants appear to have an improved affinity for BlcR compared with the wildtype currently used, including the two sequences selected from the list of analogous operator sequences made by team Fun4al. By adding SSA, it was confirmed that there was no loss of efficiency in the ligand-dependent dissociation mechanism in these variants (see Figure 8).
Figure 8. Multiple sequence alignment of selected blc operator variants showing exceptionally high affinity to BlcR (top: 01-22) or exceptionally low affinity to BlcR (bottom: 00-14). Highlighted in the alignment are the highly conserved outer inverted repeat (pink) and the lesser conserved inner inverted repeat (blue). Wildtype operator sequence 01, with the previously reported inverted repeat regions highlighted, is aligned at the bottom for reference.
By aligning and comparing high-affinity and low-affinity variants, team SPYKE found that one part of the sequence was highly conserved among high-affinity variants and lacking in almost all low-affinity variants (Figure 8). This sequence coincided exactly with the inverted repeat ‘IR2’ identified earlier by team Fun4al, which does not line up with the binding sequence identified in literature. We concluded that IR2 must be essential for BlcR-DNA binding, which challenges the assumption that the inverted repeats previously reported in literature [4] are the binding element in the operator sequence.