Modelling

Movement Protein Prediction

One of our priorities during the project was to determine the 3D structure of the TSWV movement protein. Since no structure of this protein was publicly available, we predicted it from its amino acid sequence using AlphaFold. The prediction with the highest pLDDT and pTM scores (66.66 and 0.57, respectively) was the following:

Our MSA coverage, alignment error and LDDT plots were the following. Combined with the scores above, they gave us adequate confidence in the overall structure of the protein.

Synthetic Construct Prediction

Next, we aimed to determine the structure of our synthetic construct. First, we ran our DNA sequence through Expasy’s Translate tool with the settings “Compact: M, -, no spaces”, “Standard genetic code” and “Forward strand” to obtain the amino acid sequence.
Of all the frames returned, we chose the first, since it is the only one that runs through the entire sequence without an internal stop, thus producing our desired aa sequence:

5'3' Frame 1

MKATKLVLGAVILGSTLLAGCSSNAKIDQGINPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGGMVWRADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGGGSGGGLKHLPDRHSNLVTDEEVVGFENKAEELIDYLIRGTNELDVVPIVGMGGQGKTTIARKLYNNDIIVSRFDVRAWCIISQTYNRRELLQDIFSQVTGSDDNGATVDVLADMLRRKLMGKRYLIVLDDMWDCMVWDDLRLSFPDDGIRSRIVVTTRLEEVGKQVKYHTDPYSLPFLTTEESCQLLQKKVFQKEDCPPELQDVSQAVAEKCKGLPLVVVLVAGIIKKRKMEESWWNEVKDALFDYLDSEFEEYSLATMQLSFDNLPHCLKPCLLYMGMFSEDARIPASTLISLWIAEGFVENTESGRLMEEEAEGYLMDLISSNLVMLSKRTYKGRVKYCQVHVVVHHFCLEKSREAKFMLAVKGQYIHFQPSDWKGTRVSFSFSEELSKFASLVSKTQKPFHQHLRSLITTNRAKSINDIFSCQISELRLLKVLDLSSYIVEFLSLATFKPLNQLKYLAVQAFEFYFDPGSHLPHIETFIVMNLPYYDILLPVSFWEMKKLRHAHFGKAEFDKQGLSEGSSKLENLRILKNIVGFDRVDVLSRRCPNLQQLQITYFGNNEEPFCPKLENLTQLQQLQLSFARPRTLSGLQLPSNLNKLVLEGIHIGCVIPFIAGLPSLEYLQLHDVCFPQSEEWCLGDITFHKLKLLKLVKLNISRWDVSEESFPLLETLVIKKCIDLEEIPLSFADIPTLEQIKLIGSWKVSLEDSAVRMKEEIKDTEGCDRLHLVKQRSD-
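The frame selection described above can be reproduced in a few lines of code. The following is a minimal sketch, not the Expasy implementation: it translates forward frame 1 with the standard genetic code, writes stop codons as "-" as in the Compact output, and checks that a frame has no internal stop. The function names are our own, and the test input is the 21-nt binding region of the OmpA forward primer listed further down this page.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))


def translate_frame1(dna: str, stop_symbol: str = "-") -> str:
    """Translate the forward strand starting at the first base (frame 1)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    residues = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def is_uninterrupted(protein: str, stop_symbol: str = "-") -> bool:
    """True if the only stop symbols are at the very end of the translation."""
    return stop_symbol not in protein.rstrip(stop_symbol)


# Binding region of the OmpA forward primer (start of the construct).
ompa_start = "ATGAAAGCTACTAAACTGGTA"
print(translate_frame1(ompa_start))  # MKATKLV
```

The output matches the first seven residues of the frame 1 sequence above. Running the same two functions on the full construct sequence would show whether frame 1 is the only frame without an internal stop.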

Afterwards we also ran the sequence through AlphaFold to obtain its predicted 3D structure. As the pictures below show, the results have an acceptable level of confidence, especially given the construct’s size (pLDDT: 77.09, pTM score: 0.68):

Interaction Prediction

To estimate the model confidence of our interaction, we used the following equation proposed by the AlphaFold researchers at DeepMind (Evans, Richard, Michael O’Neill, Alexander Pritzel, Natasha Antropova, Andrew Senior, Tim Green, Augustin Žídek, et al. “Protein Complex Prediction with AlphaFold-Multimer.” Cold Spring Harbor Laboratory, October 4, 2021. https://doi.org/10.1101/2021.10.04.463034):

model confidence = 0.8 · ipTM + 0.2 · pTM

Our prediction suggests that an interaction is plausible, although, according to its creators, all AlphaFold-Multimer results should be taken with a grain of salt since the algorithm is still under development. Nonetheless, the model confidence of our interaction is greater than 0.3, and at the time this bioinformatics experiment was enough for us to continue with our wet lab experiments.
The following model has a pLDDT score of 64.4, a pTM score of 0.608 and an ipTM score of 0.294, which gives a model confidence of 0.3568, or 35.68%.
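The arithmetic behind this figure is easy to verify. The sketch below applies the weighted sum from the equation above to the reported ipTM and pTM values; the function name is our own choice for illustration.

```python
def model_confidence(iptm: float, ptm: float) -> float:
    """Weighted sum used to rank multimer models: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm


# Scores reported for our interaction model.
confidence = model_confidence(iptm=0.294, ptm=0.608)
print(f"{confidence:.4f} ({confidence:.2%})")  # 0.3568 (35.68%)
```

Because ipTM carries 80% of the weight, the low interface score (0.294) dominates the result even though the pTM of the complex is moderate.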

One cannot overlook the fact that this structure looks odd at best and has extremely low confidence in the regions where that is somewhat expected (red color on the pLDDT diagram). We believe that by experimenting with the parameters of the AlphaFold runs, and given enough time, we can obtain a more accurate model of the interaction.

False Positives Prediction

In order to estimate the risk of false positives, we first had to pinpoint the part of the movement protein’s aa sequence that interacts with our synthetic protein. We did this manually in PyMOL. We then ran a DALI analysis of our target protein against the entire PDB to locate proteins with a similar tertiary structure, focusing especially on the interaction site. Our greatest concern was whether the plant itself has a protein of similar structure that would interact with our engineered protein with a higher affinity than our target would.
We were extremely happy to find that our target had a fairly unique structure compared with all known PDB structures, as you can see below:

Moreover, we examined all points of similarity in the following proteins and found no stretch of similarity longer than 4 consecutive aa inside our interaction site of interest. Of course, that does not mean that proteins that are not yet in the PDB, or that have largely different structures, cannot or will not interact with our construct. Nonetheless, our results point us in the right direction.
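We carried out this check by inspection, but it could also be automated. The sketch below shows one possible approach, under the simplifying assumption that "similarity" means identical residues in a row: it finds the longest contiguous stretch shared by an interaction-site segment and a hit sequence and compares its length with the 4 aa threshold. The two sequences are made-up placeholders, not our real data, and the function name is hypothetical.

```python
MAX_ALLOWED_RUN = 4  # threshold used in the text: no match longer than 4 aa


def longest_shared_stretch(site: str, hit: str) -> str:
    """Return the longest contiguous run of residues found in both sequences."""
    best = ""
    # prev[j] holds the length of the shared run ending at site[i-2], hit[j-1].
    prev = [0] * (len(hit) + 1)
    for i in range(1, len(site) + 1):
        curr = [0] * (len(hit) + 1)
        for j in range(1, len(hit) + 1):
            if site[i - 1] == hit[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > len(best):
                    best = site[i - curr[j]:i]
        prev = curr
    return best


# Placeholder sequences for illustration only.
interaction_site = "GAVLKDE"
dali_hit = "MMAVLKQQ"

shared = longest_shared_stretch(interaction_site, dali_hit)
print(shared, len(shared) > MAX_ALLOWED_RUN)  # AVLK False
```

A structure-based comparison would use the residue pairings from the DALI alignment instead of plain string matching; this version only illustrates the threshold logic.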

Primer Design

Throughout the year our team had to construct, edit or repurpose a number of primers, which are briefly presented below. All primers were designed with various programmes and also tested by in silico PCR with varying parameters for pre-experimental optimization.

OmpA

An iGEM construct was rebuilt to allow N-terminal fusion in a standard Golden Gate reaction. Primers were designed to amplify only the coding region of OmpA for N-terminal fusion by Golden Gate cloning. One extra adenine residue was added as an overhang to ensure the fusion partner remains in frame.

Fw: ttGGTCTCtAATGAAAGCTACTAAACTGGTA
Rev: ttGGTCTCtTCCTCCTCCAGAACCTCC
(Tm = 58 °C)

NB-LRR

The following primer sets were designed to amplify the NB-LRR domain of the Sw-5b resistance gene for C-terminal fusion by Golden Gate cloning.

Fw1: ttGGTCTCtAGGATTAAAACATCTGCCGGAT
Rev1: ttGGTCTCtAAGCTCAATCTGAGCGTTGTTTG
(Tm=57°C)

Fw2: ttGGTCTCtAGGACCTTCGGATTGGAAGGGA
Rev2: ttGGTCTCtAAGCTCAATCTGAGCGTTGTTT
(Tm=58°C)
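The Golden Gate primers above can be sanity-checked by reading off the 4-nt overhang each one leaves after digestion. The sketch below assumes the GGTCTC site in each primer is a BsaI site, which cuts one nucleotide downstream of its recognition sequence on the top strand and leaves a 4-nt 5' overhang. It then tests whether the OmpA reverse overhang is complementary to the NB-LRR forward overhang. The helper names are our own.

```python
BSAI_SITE = "GGTCTC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bsai_overhang(primer: str) -> str:
    """4-nt overhang left on the amplicon: skip the site plus one spacer base."""
    seq = primer.upper()
    start = seq.index(BSAI_SITE) + len(BSAI_SITE) + 1
    return seq[start:start + 4]


primers = {
    "OmpA Fw": "ttGGTCTCtAATGAAAGCTACTAAACTGGTA",
    "OmpA Rev": "ttGGTCTCtTCCTCCTCCAGAACCTCC",
    "NB-LRR Fw1": "ttGGTCTCtAGGATTAAAACATCTGCCGGAT",
    "NB-LRR Rev1": "ttGGTCTCtAAGCTCAATCTGAGCGTTGTTTG",
}

overhangs = {name: bsai_overhang(seq) for name, seq in primers.items()}
print(overhangs)

# The junction is compatible if the downstream end of OmpA, read on the top
# strand, equals the upstream overhang of NB-LRR.
junction_ok = reverse_complement(overhangs["OmpA Rev"]) == overhangs["NB-LRR Fw1"]
print("OmpA/NB-LRR junction compatible:", junction_ok)
```

For these primers the overhangs read AATG, TCCT, AGGA and AAGC, and the OmpA/NB-LRR junction check passes, consistent with a fusion of the two parts in that order.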

OmpA:NB-LRR and OmpA:LRR

The following primers were designed for the OmpA:NB-LRR and OmpA:LRR constructs and include NdeI and SacI recognition sites for cloning into the expression vector pET-5b.

Fw1: ggCATATGatgAAAGCTACTAAACTGGTACTGG
Rv1: ggGAGCTCtcaATCTGAGCGTTGTTTGACG
(Tm=63 °C)

Fw2: cgcCATATGAAAGCTACTAAACTGGTACTGG NdeI
Rev2: ccGAGCTCtcaATCTGAGCGTTGTTTGACG SacI
(Tm=63 °C)
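A quick script can confirm that each primer carries the intended restriction site and that the reverse primers encode a stop codon. The sketch below uses the NdeI (CATATG) and SacI (GAGCTC) recognition sequences, and assumes the three bases directly after the SacI site in each reverse primer are meant to be the reverse complement of the stop codon. The helper names are our own.

```python
SITES = {"NdeI": "CATATG", "SacI": "GAGCTC"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based position."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def bases_after_site(primer: str, enzyme: str) -> str:
    """Three primer bases that directly follow the recognition site."""
    seq = primer.upper()
    end = seq.index(SITES[enzyme]) + len(SITES[enzyme])
    return seq[end:end + 3]


forward = {
    "Fw1": "ggCATATGatgAAAGCTACTAAACTGGTACTGG",
    "Fw2": "cgcCATATGAAAGCTACTAAACTGGTACTGG",
}
reverse = {
    "Rv1": "ggGAGCTCtcaATCTGAGCGTTGTTTGACG",
    "Rev2": "ccGAGCTCtcaATCTGAGCGTTGTTTGACG",
}

for name, primer in {**forward, **reverse}.items():
    print(name, find_sites(primer))

for name, primer in reverse.items():
    stop = reverse_complement(bases_after_site(primer, "SacI"))
    print(name, "stop codon on the coding strand:", stop)
```

Both reverse primers yield TGA on the coding strand. Note that Fw1 carries an additional atg after the NdeI site, whereas Fw2 uses the ATG contained in the NdeI site itself as the start codon.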

In conclusion, the modelling work we carried out throughout the year, with varying data and results (only the most representative are presented here), strengthened our confidence in the project’s success and provided useful data, especially about our target’s structure, which would otherwise have remained unknown for some time. In due time, crystallographic structures and in vitro interaction assays can confirm or refute these results, giving us even more data with which to continue our bioinformatics work.