Introduction

As the iGEM teams of DTU Denmark and TU Delft, we formed a partnership to benefit both projects by collaborating on a scientific level. We supported each other’s research by exchanging knowledge, skills and work, as well as frequently meeting to discuss the projects.

Team Fun4al (DTU-Denmark) is developing a strain of Aspergillus niger capable of detecting and degrading a toxic compound, furfural, to improve A. niger growth on lignocellulosic waste. To do this, the team had to develop synthetic transcription factors (TFs) to detect furfural and subsequently activate enzymes converting furfural to other, non-toxic compounds.

Team SPYKE (TU Delft) is developing a novel bioelectronic sensor for detecting GHB, a common date-rape drug, in drinks, so that users are warned as soon as possible that their drink has been spiked. The research involves assembling the sensor by immobilizing biomolecules on a gold electrode, testing that sensor, and optimizing the individual parts of the system using various approaches.

Stages in project development and research

Figure 1. Overview of the projects Fun4al and SPYKE, from DTU and TU Delft respectively, their differences, and their overlap.

Although these projects are entirely different in terms of application, there is an important overlap that our partnership is based on: both projects involve a TF protein as part of the mechanism of action. Team Fun4al is using the synthetic TF, sBAD, to activate the expression of furfural-degrading genes in response to furfural. Team SPYKE uses the BlcR protein from Agrobacterium tumefaciens as the central part of their capacitive GHB sensor, in which it dissociates from immobilized strands of DNA when GHB is present, resulting in a change in capacitance. Both teams also aim to engineer their TF to optimize the affinity or specificity for their application. Because of this shared objective, we decided it would be valuable to support each other in TF-focused research.

This shared wiki page was created by both teams to document our partnership. The main reason for making a shared wiki page was to allow each team to write up the parts and results it had worked on itself. In other words, the Fun4al team was better placed to explain the bioinformatic work behind the results produced for the SPYKE team, and the SPYKE team could better explain the laboratory work they offered the Fun4al team and the issues they encountered.

Description of the partnership

At our first meeting, it became clear which skills and equipment each team lacked or could provide for the other: Fun4al had knowledge of relevant bioinformatic tools, while team SPYKE had access to an Isothermal Titration Calorimetry (ITC) machine that could be used to measure TF-ligand and TF-DNA binding affinities. The distribution of work between us therefore seemed evident. Fun4al would provide bioinformatic support in two areas: 1) prediction of TF-ligand docking using Rosetta, and 2) exploration of alternative DNA-binding sequences for the TF. SPYKE would test TF-ligand and TF-DNA binding affinities with their ITC machine, using different TF variants and ligands sent by Fun4al.

Prediction of TF-ligand docking using Rosetta

As the SPYKE team was not allowed to use GHB in most of their experiments due to regulations, they used the analog succinic semialdehyde (SSA) when testing the characteristics of BlcR. They were therefore interested in investigating in silico whether SSA and GHB bind to BlcR similarly, as this could reveal whether they could rely on SSA as an adequate analog. The Fun4al team helped the SPYKE team by using molecular docking to compare the binding energies of SSA and GHB to BlcR.

Finding alternative operator sequences for the TF

One major challenge for the SPYKE team was to improve the BlcR-DNA binding affinity. A part of their project is to explore alternative operator sequences to find variants with increased affinity for BlcR. The Fun4al team set out to find or generate alternative operator sequences, using two different strategies.

First, team Fun4al investigated whether there were any other possible binding sequences for BlcR in the genome of Agrobacterium tumefaciens C58, the strain from which BlcR and its operator sequence were obtained. The operator sequence was BLASTed against the genome. Furthermore, we investigated a published transcriptomic dataset (Gonzalez‐Mula et al. (2019)) to see if some of the genes significantly up- or downregulated in response to GHB and SSA could be regulated by BlcR.

As a second approach, the operator sequence of BlcR was BLASTed in NCBI with relaxed search settings to look for possible binding site variants in other organisms.

Characterization of sBAD using isothermal titration calorimetry

The Fun4al team wanted to engineer sBAD to introduce or improve specificity for furfural. However, it is unknown whether the existing sBAD protein has any affinity for furfural, and knowing this would support their research strategy greatly. Team SPYKE was planning to use isothermal titration calorimetry (ITC) to quantitatively measure protein-DNA binding to characterize the binding and unbinding of their TF BlcR, and the same method could be applied to measure the binding and unbinding of sBAD.

With ITC, a solution of DNA is titrated into a solution of the protein (or vice versa) inside an isothermal cell. The machine measures any consumption or production of heat inside the cell over time, integrates the exothermic or endothermic signal from each injection, and plots the heat against the molar ratio. From this plot we can determine certain parameters, such as the molar ratio at which saturation is reached and the association or dissociation constant (KA or KD).
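The relationship between the two constants mentioned above can be illustrated with a short sketch. It assumes a simple one-to-one binding model and a standard state of 1 M; the function names and the example value of KA are hypothetical and are not measurements from either project.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def kd_from_ka(ka_per_molar):
    """Dissociation constant (M) as the reciprocal of the association constant (1/M)."""
    return 1.0 / ka_per_molar


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol, using dG = R*T*ln(KD / 1 M)."""
    return R * temperature_k * math.log(kd_molar) / 1000.0


def fraction_bound(free_ligand_molar, kd_molar):
    """Fraction of binding sites occupied for one-to-one binding (assumed model)."""
    return free_ligand_molar / (kd_molar + free_ligand_molar)


# Hypothetical example value, not a measured constant.
example_ka = 1.0e6  # 1/M
example_kd = kd_from_ka(example_ka)
print(example_kd, binding_free_energy(example_kd), fraction_bound(example_kd, example_kd))
```

A lower KD corresponds to a more negative binding free energy, which is why either constant can be used to compare binding strength between TF variants.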

Timeline

July

First meeting

We discussed the projects of the two teams, the different strategies we use to modify our TFs and which skills and equipment we could provide for each other. Importantly, this meeting shaped the plan and work distribution between us, as outlined above.

Docking results

To verify whether SSA can be used as an adequate analog for GHB, team Fun4al docked both ligands to a crystal structure of BlcR (PDB ID: 3MQ0) using Rosetta software.

The binding site of BlcR was predicted using P2Rank, a tool based on a random forest algorithm trained on datasets such as CHEN11, JOINED, COACH420, and HOLO4K (Krivák and Hoksza (2018)). The result shows that residues T101, Y112, F120, T131, D183, I187, C193, A210, and S212 are on the surface of the predicted BlcR binding pocket. Apart from A210, all of these residues can potentially form ionic or hydrogen bonds with the ligands. This information was used as input for docking with Rosetta. For each ligand, the docking result with the lowest binding energy was chosen for visualization, and the two results were compared with each other as shown in figure 2. All side chains that are within a radius of 4 Ångström of the ligand and have polar contact with it are shown as sticks.

Both SSA and GHB form a hydrogen bond with T133 at the carboxylic acid side. For SSA, the aldehyde group forms hydrogen bonds with Y112 and T101. For GHB, the alcohol group forms a hydrogen bond with S212. This difference is due to double bonds being shorter than single bonds, which in turn affects the binding ability of the ligand.
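The 4 Ångström selection described above can be sketched as a plain distance filter. This is an illustration only: the coordinates and residue labels below are invented, the function name is hypothetical, and the sketch checks distance alone, not whether a contact is polar.

```python
import math


def residues_within(ligand_atoms, residue_atoms, cutoff=4.0):
    """Return labels of residues with at least one atom within `cutoff` of any ligand atom.

    ligand_atoms: list of (x, y, z) tuples in Ångström.
    residue_atoms: dict mapping a residue label to a list of (x, y, z) tuples.
    """
    close = []
    for label, coords in residue_atoms.items():
        if any(math.dist(a, b) <= cutoff for a in coords for b in ligand_atoms):
            close.append(label)
    return sorted(close)


# Invented coordinates for illustration; not taken from the BlcR structure.
toy_ligand = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
toy_residues = {
    "res_near": [(3.0, 0.0, 0.0)],
    "res_far": [(9.0, 0.0, 0.0)],
}
print(residues_within(toy_ligand, toy_residues))
```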


Figure 2. Docking of SSA (left) and GHB (right). The dotted yellow lines represent hydrogen bonds. Binding energy: -14.232 REU and -14.544 REU, respectively.

Figure 3. Violin plot of the binding energy of GHB and SSA.

In figure 3, the violin plot shows the distribution of the binding energy term for each ligand. One thousand predictions were generated per ligand to compare their docking behaviour. Qualitatively, the two distributions look very similar. Quantitatively, a two-sample t-test on the binding energies of GHB and SSA gives a p-value of 9.261 × 10^-4. With 1,000 predictions per ligand, even a small difference in mean binding energy can produce a p-value this low, so this result should be read together with the size of the difference.

Although the lowest-energy prediction for SSA does not adopt the same conformation as that for GHB, SSA can still adopt a GHB-like conformation at a binding energy that is only slightly higher. Therefore, SSA and GHB are likely to interact with BlcR in a similar way.
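As an illustration of how such a comparison can be computed, the sketch below implements Welch's variant of the two-sample t-test together with an effect size. It is not the script used for figure 3: the p-value uses a normal approximation that is only reasonable for large samples, the effect size assumes equally sized groups, and the binding energies are randomly generated placeholders.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Return (t, approximate two-sided p-value) for two independent samples.

    The p-value uses a normal approximation to the t distribution,
    which is an assumption that only holds well for large samples.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


def cohens_d(a, b):
    """Effect size: mean difference in units of the averaged standard deviation."""
    pooled_sd = math.sqrt((variance(a) + variance(b)) / 2.0)
    return (mean(a) - mean(b)) / pooled_sd


# Placeholder scores drawn at random; not the docking results from this project.
rng = random.Random(0)
scores_a = [rng.gauss(-12.0, 1.0) for _ in range(1000)]
scores_b = [rng.gauss(-12.1, 1.0) for _ in range(1000)]
print(welch_t_test(scores_a, scores_b), cohens_d(scores_a, scores_b))
```

Reporting the effect size next to the p-value helps separate a difference that is statistically detectable from one that is large enough to matter in practice.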

August

Discussing transcriptomics results

We got together to discuss the first results of the transcriptomics research, with the goal of seeing if there are any operators in A. tumefaciens that are affected by the addition of SSA and/or GHB. Team Fun4al presented their findings, after using their software to process transcriptomics data available from literature (Gonzalez‐Mula et al. (2019)).

From the high-throughput transcriptomics data available for A. tumefaciens cultured in the presence of GHB, GABA, or succinic acid (as reference), the differentially expressed genes (DEGs) were extracted. The DEGs were mapped to determine which genes are significantly up- or downregulated by GHB and GABA. This analysis revealed that the genes that are significantly upregulated by GABA are limited to the ones in the blc operon, confirming our previous knowledge. Together, we decided to adjust the analysis and try again in order to look for more relevant results: all DEGs for GHB would be considered, instead of only those occurring for both GHB and GABA.
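The difference between the two selection strategies can be sketched with simple set operations. This is not the software team Fun4al used: the thresholds, gene identifiers, and values below are hypothetical and only show the idea of comparing DEGs shared by both conditions with all DEGs for GHB.

```python
def select_degs(results, max_padj=0.05, min_abs_log2fc=1.0):
    """Return gene IDs passing both thresholds (the threshold values are assumptions).

    results: dict mapping gene ID to (log2 fold change, adjusted p-value).
    """
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if padj <= max_padj and abs(log2fc) >= min_abs_log2fc
    }


# Hypothetical results versus the succinic acid reference; not real data.
results_ghb = {
    "gene_a": (2.3, 0.001),
    "gene_b": (0.4, 0.0005),
    "gene_c": (-1.8, 0.01),
    "gene_d": (3.0, 0.2),
}
results_gaba = {
    "gene_a": (1.9, 0.002),
    "gene_c": (-0.2, 0.6),
}

degs_ghb = select_degs(results_ghb)
degs_gaba = select_degs(results_gaba)

shared_degs = degs_ghb & degs_gaba  # first strategy: DEGs in both conditions
all_ghb_degs = degs_ghb             # adjusted strategy: every DEG for GHB
print(sorted(shared_degs), sorted(all_ghb_degs))
```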

Solidifying our partnership

During the same meeting, we agreed it would be interesting if team SPYKE used ITC to characterize sBAD for team Fun4al. For this, team Fun4al would need to produce and purify sBAD and ship samples as well as synthetic oligonucleotides and a solution of furfural to Delft. We agreed to set up a meeting to discuss the protein expression and purification, so team SPYKE could offer advice based on their experience with protein expression and purification. We also discussed the possibility of forming an official partnership for the first time.

New transcriptomics results

Fun4al was still not able to find any other promoters that appeared to be up- or downregulated by GHB within the A. tumefaciens C58 genome. Following this meeting, the Fun4al team decided to change strategy for the next meeting.

September

NCBI Blast results

Since the Fun4al team did not find variants of the BlcR binding site within the A. tumefaciens C58 genome, they decided to test different NCBI nucleotide BLAST options to search for the binding site in other organisms. They found 25 sequences, mainly from the same species, that matched the BlcR binding site from A. tumefaciens C58 to a certain degree (Figure 4). All of these sequences are located near genes seemingly involved in succinic acid metabolism. Often, the first gene downstream of the sequences was annotated as NAD-dependent succinyl-semialdehyde dehydrogenase, which is the same type of gene that is regulated by BlcR. This suggests that the 25 sequences are evolutionary homologs of the blc operator.

Figure 4. 25 sequences similar to the BlcR operator from A. tumefaciens C58 (AE007872.2) were found with an nBLAST on NCBI (searching ‘somewhat similar sequences’ with a word size of 7). Left: an unrooted tree of the sequences with the NCBI accession codes at the tips. Right: the multiple alignment of the sequences coloured by percentage identity, with a normalized logo plot and the consensus sequence underneath. The newly observed inverted repeats (IR1, IR2, IR3) are outlined. The sequences were aligned using MAFFT and visualized with Jalview. The tree was generated using TreeHugger 0.5 and visualized using FigTree v1.4.4.

The sequences seemed to divide into two different groups. Remarkably, a normalized logo plot combining all of the sequences revealed new inverted repeats that differ from those reported in the literature to be important for BlcR binding (Pan et al. (2013), Pan et al. (2011)). These groups of inverted repeats, mainly IR2 and IR3, seemed to be relatively conserved, as opposed to the inverted repeats reported previously.

As DNA-binding proteins are commonly known to interact with inverted repeats, this result sparked our curiosity: could the consensus sequence of the 25 sequences provide stronger binding for BlcR?
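To make the notion of an inverted repeat concrete, the sketch below scans a DNA sequence for pairs of arms where the second arm is the reverse complement of the first. It is a minimal illustration that only finds perfect repeats; the function names, arm length, gap limit, and toy sequence are assumptions and do not reproduce the analysis behind IR1, IR2, and IR3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len=5, max_gap=6):
    """Return (left_start, right_start, left_arm) for perfect inverted repeats.

    The two arms may be separated by at most `max_gap` bases.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - arm_len + 1):
        arm = seq[i:i + arm_len]
        target = reverse_complement(arm)
        first_right = i + arm_len
        last_right = min(i + arm_len + max_gap, len(seq) - arm_len)
        for j in range(first_right, last_right + 1):
            if seq[j:j + arm_len] == target:
                hits.append((i, j, arm))
    return hits


# Toy sequence for illustration; not the blc operator.
toy_sequence = "GGACTTTTAGTCC"
print(find_inverted_repeats(toy_sequence))
```

Real operator sites often contain imperfect repeats, so a practical search would also need to tolerate mismatches, which this sketch does not do.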

Following this meeting, Fun4al suggested five variations of the BlcR binding site based on the results presented above. The SPYKE team chose to investigate the consensus sequence and CP049218.1, a sequence belonging to the group of operators that seemed to differ from the WT operator from A. tumefaciens C58 (AE007872.2).
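A consensus sequence of the kind mentioned above can be derived from a multiple alignment by taking the most frequent base in each column. The sketch below is a simplified illustration with made-up sequences; the function name is hypothetical, gaps are ignored when counting, and ties are resolved in favour of the base seen first, which may differ from how Jalview reports its consensus.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the majority base per column of equally long aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        # Count bases only; a column consisting purely of gaps stays a gap.
        counts = Counter(base.upper() for base in column if base != gap)
        out.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(out)


# Made-up aligned sequences for illustration; not the 25 BLAST hits.
toy_alignment = ["ACGT", "ACGA", "TCGA"]
print(consensus(toy_alignment))
```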

Dropping ITC

Due to trouble with protein expression on the side of team Fun4al, as well as trouble with ITC on the side of team SPYKE, we came to the shared conclusion that it wouldn’t be possible for team SPYKE to characterize sBAD using ITC. Even if it had been possible to produce, purify, and ship samples of sBAD in time, there was a chance that the ITC measurements would come back inconclusive, and team SPYKE would not have had the time or materials to troubleshoot. To use time and manpower more efficiently, we agreed to call off the protein production and ITC experiments.

October

Testing alternative BlcR binding sequences

Because of the same trouble with ITC, team SPYKE wasn’t able to use ITC to characterize the binding between BlcR and the alternative DNA sequences found by team Fun4al. We therefore moved to an electrophoretic mobility shift assay (EMSA) as an alternative method for observing protein-DNA binding. A major advantage of EMSA over ITC is that multiple DNA sequences can be evaluated at the same time; the drawback is that the results are not quantitative enough to determine thermodynamic properties. Still, we were able to compare different DNA sequences with each other and with the wildtype, and to estimate whether the binding is stronger or weaker.

The results showed that a number of variants appear to have an improved affinity for BlcR compared with the wildtype currently used, including the two sequences selected from the list of analogous operator sequences made by team Fun4al. By adding SSA, it was confirmed that these variants show no loss of efficiency in the ligand-dependent dissociation mechanism.

Figure 5. Multiple sequence alignment of selected blc operator variants showing exceptionally high affinity to BlcR (top: 01-22) or exceptionally low affinity to BlcR (bottom: 00-14). Highlighted in the alignment are the highly conserved outer inverted repeat (pink) and the less conserved inner inverted repeat (blue). Wildtype operator sequence 01, with the previously reported inverted repeat regions highlighted, is aligned at the bottom for reference.

Conclusion of partnership

This SPYKE and Fun4al partnership revolved around modeling and modifying TFs, which has been a crucial part of both of our projects. Without good models and characterizations of our TFs, neither of our teams would be able to obtain a proof of concept. With the potential to provide or improve a proof of concept for each other, we regarded our frequent discussions and the work we did for each other as a valid foundation for a partnership.

Team Fun4al helped team SPYKE with bioinformatics research. They performed computational protein-ligand docking to predict and compare the binding between BlcR and the two ligands GHB and SSA. These results indicated that SSA is an adequate analog of GHB when testing the characteristics of BlcR in the lab. Furthermore, team Fun4al made a multiple alignment of sequences similar to the BlcR operator. They suggested alternative operator sequences based on an analysis of the inverted repeats appearing in the consensus sequence. The binding between BlcR and the consensus sequence itself turned out to be stronger than the binding between BlcR and the WT operator, and team Fun4al has therefore helped team SPYKE find a possible improvement to their system.

Team SPYKE offered the use of their ITC machine to team Fun4al with the intention of measuring whether the TF, sBAD, could bind to furfural and how strong this potential binding would be compared to benzoic acid, the original ligand of sBAD. They advised team Fun4al on how to express and purify their sBAD, which could then be sent to their facility at TU Delft to be tested. Unfortunately, this experiment was not carried out due to issues with the purification of the protein. However, the experiment had great potential to provide an additional proof of concept for team Fun4al, one that they would not have considered if it wasn’t for this partnership with team SPYKE.

References

  • Gonzalez‐Mula, A. et al. The biotroph Agrobacterium tumefaciens thrives in tumors by exploiting a wide spectrum of plant host metabolites. New Phytologist, 222(1), (2019), p.455-467.
  • Pan, Y. et al. In vivo analysis of DNA binding and ligand interaction of BlcR, an IclR-type repressor from Agrobacterium tumefaciens. Microbiology, 159(Pt 4), (2013), p.814.
  • Pan, Y. et al. The Agrobacterium tumefaciens transcription factor BlcR is regulated via oligomerization. Journal of Biological Chemistry, 286(23), (2011), p.20431-20
  • Krivák, R. and Hoksza, D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. Journal of Cheminformatics, 10(1), (2018), p.1-12.