Preliminary testing of our DhdR-DhdO system showed unexpectedly weak repression, which motivated us to model the protein's structure and interactions. To check that protein dimerization and DNA binding could occur as expected, we completed three computational modeling analyses: structure prediction of DhdR and its tagged dimer, identification of the DNA-binding domain, and docking of the D-2HG ligand. The results will allow us to move forward with targeted refinements in our wet lab work.
Because DhdR is a newly described transcription factor, we first performed structural predictions to visualize the shape of the protein. We used AlphaFold2 to generate predicted structures from the protein sequence [1]. These predictions were compared to the known structure of a related GntR transcription factor [2], as well as to homology-modeled structures generated using SWISS-MODEL (Figure 1) [3].
This comparison supported the AlphaFold2-generated structures, which were reasonably close to the expected structures (Figure 2).
This served as initial confirmation that AlphaFold2 could be used to probe the structure of DhdR, both with and without sequence modifications. Next, we examined whether adding the NLS and FLAG tags at the N- and C-termini of the protein would affect the dimerized structure of DhdR. Using AlphaFold2 and PyRosetta, predicted dimer structures were generated and relaxed (Figure 3).
The visualized structures show that adding the NLS-FLAG sequence to the N-terminal domain (NTD) substantially alters the conformation of this portion of the protein, which is problematic because the NTD is the DNA-interacting domain of the homodimer. Indeed, the predicted conformation appears to completely block the region where DNA would normally form its required binding interactions. This suggests that adding the NLS-FLAG sequence to the N-terminal end of the protein may prevent DNA binding and therefore prevent DhdR from functioning as a transcription factor. This is one potential reason why our proof-of-concept experiment did not function as expected.
AlphaFold2 was run using the publicly available ColabFold notebook [4]. Structure relaxation with PyRosetta was run in a Colab notebook that we modified from the original PyRosetta Jupyter notebooks [5].
Building on the structural changes identified in the previous part of our modeling study, we next sought to locate possible DNA binding sites within the protein structure. This information would help us determine how the changes caused by the FLAG tag may have affected the function of the transcription factor. We consulted existing literature on GntR transcription factors, which describes canonical helix-turn-helix (HTH) binding domains within the N-terminal portion of the protein [6]. We then confirmed the presence of this motif within our structure, shedding light on the potential DNA-binding interactions occurring in our DhdR system (Figure 4).
Given the conserved, positively charged arginine residues at positions 35, 49, and 64, and the agreement between our model and previously elucidated structures, we infer that this N-terminal domain is likely the DNA binding domain and is therefore less tolerant of engineering. It also provides a scaffold for engineering DhdR to bind novel sequences through mutational or directed evolution studies, both of which are beyond the current scope of the project.
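The residue check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure we ran: the sequence below is a hypothetical placeholder rather than the DhdR sequence, and the function names and the `offset` parameter are assumptions introduced here. The offset is included because prepending an N-terminal tag shifts residue numbering.

```python
def residues_at(sequence, positions, offset=0):
    """Return {position: residue} for 1-based positions in the untagged protein.

    `offset` is the number of residues prepended to the sequence (for example,
    by an N-terminal tag), so numbering still refers to the untagged protein.
    """
    return {pos: sequence[pos - 1 + offset] for pos in positions}


def all_arginine(sequence, positions, offset=0):
    """True if every listed position holds an arginine (one-letter code 'R')."""
    return all(res == "R" for res in residues_at(sequence, positions, offset).values())


# Hypothetical placeholder: 70 alanines with arginines at positions 35, 49, 64.
# This is NOT the real DhdR sequence.
positions = (35, 49, 64)
toy = ["A"] * 70
for pos in positions:
    toy[pos - 1] = "R"
toy_sequence = "".join(toy)

# Hypothetical 10-residue N-terminal tag, used only to show the numbering shift.
toy_tag = "M" + "G" * 9
tagged_sequence = toy_tag + toy_sequence

untagged_ok = all_arginine(toy_sequence, positions)
tagged_ok = all_arginine(tagged_sequence, positions, offset=len(toy_tag))
```

Running the same check on a tagged construct without setting `offset` would look at the wrong residues, which is an easy mistake to make when comparing tagged and untagged models.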
In order to visualize the possible binding conformation of the transcription factor, we found a GntR family transcription factor that had a DNA-bound conformation solved and deposited (PDB code: 1HW2) [7]. Then, we aligned and superimposed the two structures in ChimeraX, allowing us to create a simple predictive structure of how the DhdR transcription factor would bind to a DNA sequence (Figure 5).
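For readers who want to see what a rigid-body superposition involves, the following is a minimal NumPy sketch of the Kabsch algorithm. It is not ChimeraX's own procedure, which also chooses which residues to pair; here the atom pairing is assumed to be given, and the coordinates are randomly generated stand-ins rather than atoms from our model or from 1HW2.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Superimpose `mobile` onto `target` (both N x 3 arrays of paired atoms).

    Returns the fitted mobile coordinates and the RMSD after fitting.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    # Center both coordinate sets on their centroids.
    target_centroid = target.mean(axis=0)
    p = mobile - mobile.mean(axis=0)
    q = target - target_centroid
    # Covariance matrix between the two centered sets, then its SVD.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)
    # Flip one axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    # Rotate the centered mobile atoms and move them onto the target centroid.
    fitted = p @ rotation.T + target_centroid
    rmsd = float(np.sqrt(((fitted - target) ** 2).sum(axis=1).mean()))
    return fitted, rmsd


# Stand-in data: random "atoms", then the same atoms rotated and shifted.
rng = np.random.default_rng(0)
demo_target = rng.normal(size=(10, 3))
theta = 0.7
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
demo_mobile = demo_target @ rot_z.T + np.array([5.0, -2.0, 1.0])

demo_fitted, demo_rmsd = kabsch_superpose(demo_mobile, demo_target)
```

Because the stand-in mobile set is an exact rotated copy of the target, the RMSD after fitting is essentially zero. For two different proteins the residual RMSD over the paired atoms gives a rough measure of how similar the folds are.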
Further experiments are necessary to validate the specific DNA binding domain and residues responsible for DhdR binding to DhdO sequences. These may include evolution studies or binding assays using surface plasmon resonance (SPR) to determine dissociation constants. However, we have high confidence that N-terminal modifications to DhdR severely impact our system’s function.
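As a brief illustration of what an SPR-derived dissociation constant would tell us, the sketch below assumes a simple 1:1 binding model, in which K_D = k_off / k_on and the fraction of sites occupied at equilibrium is C / (C + K_D). The rate constants are hypothetical placeholders, not measurements of DhdR binding to DhdO.

```python
def dissociation_constant(k_on, k_off):
    """K_D in molar, from k_on (1/(M*s)) and k_off (1/s), assuming 1:1 binding."""
    return k_off / k_on


def fraction_bound(concentration, k_d):
    """Equilibrium fraction of binding sites occupied at a given concentration (M)."""
    return concentration / (concentration + k_d)


# Hypothetical rate constants, chosen only for illustration.
example_k_on = 1.0e5   # 1/(M*s)
example_k_off = 1.0e-3  # 1/s

example_k_d = dissociation_constant(example_k_on, example_k_off)
# At a protein concentration equal to K_D, half of the sites are occupied.
half_occupancy = fraction_bound(example_k_d, example_k_d)
```

Comparing K_D values for tagged and untagged DhdR under this model would give a direct, quantitative test of whether the N-terminal fusion weakens DNA binding.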
Finally, after characterizing the DNA binding domain of the DhdR homodimer, we wanted to determine how introducing D-2HG to the system could disrupt DhdR binding to DNA as part of its allosteric activity. To do this, we used the SwissDock tool, which generated the following likely binding configurations (Figure 6).
Overall, the predicted location of the D-2HG ligand in the DNA binding domain suggests that the ligand may inhibit DhdR binding by disrupting binding pocket interactions. However, further experimentation would be required to determine the exact mechanism, whether by preventing DhdR dimerization or by blocking favorable interactions between the transcription factor and the binding site. This provides further evidence that N-terminal fusions would significantly affect DhdR function, both in its DNA binding activity and in its allosteric binding of D-2HG, the oncometabolite of interest to our study.
The results of the different structural prediction analyses provide insight into the DhdR transcription factor that forms the backbone of our reporter system. We were able to better understand the mechanisms by which DhdR associates with and dissociates from its binding site. Furthermore, we gained insight into how our sequence modifications may have altered the structure of the DhdR homodimer, and how adding the NLS and FLAG tag to the N-terminal end could have eliminated its binding ability. Overall, the results show that modeling studies are a meaningful complement to traditional wet lab experimentation and can point to future avenues of exploration.