Stockholm - iGEM 2022

To address the challenging task of protein design, we performed thorough modelling to select the best candidates to be tested in the lab.

The first step in engineering an intrinsic factor (IF) that evades clearance mediated by autoantibodies is to identify the epitope on the IF. Based on previous knowledge [1], we chose to mutate four residues in the alpha-helix region adjacent to the cobalamin (vitamin B12) binding pocket: Ser256, Lys258, Tyr262 and Val265. Substitutions of these residues with other “similar” residues were then selected and engineered in silico.

Structure of IF + B12 + cubilin with amino acids highlighted — *Animation.* *Rotating model of IF (dark blue) + B12 (red) + cubilin (light blue) with key amino acids highlighted (orange)*

We then designed our mutants to preserve the structure as much as possible, because the antibody-binding region lies in close proximity to the B12-binding region. Mutagenesis should therefore prevent IF from binding to autoantibodies while preserving ligand and receptor binding affinity. For this purpose, we used the BLOSUM62 matrix [2] to identify closely related amino acids, that is, residues whose substitution for one another is well tolerated in evolution.
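The BLOSUM62 lookup described above can be sketched as follows. This is a minimal illustration, not the team's actual script: the four matrix rows are transcribed by hand from the standard BLOSUM62 matrix and should be checked against a canonical copy, and the function name and the `min_score` cut-off are assumptions introduced here.

```python
# Rows of the BLOSUM62 matrix for the four wild-type residues of interest.
# Transcribed by hand from the standard matrix (assumption: verify against
# a canonical copy before relying on these values).
COLUMNS = "ARNDCQEGHILKMFPSTWYV"
BLOSUM62_ROWS = {
    "S": [1, -1, 1, 0, -1, 0, 0, 0, -1, -2, -2, 0, -1, -2, -1, 4, 1, -3, -2, -2],
    "K": [-1, 2, 0, -1, -3, 1, 1, -2, -1, -3, -2, 5, -1, -3, -1, 0, -1, -3, -2, -2],
    "Y": [-2, -2, -2, -3, -2, -1, -2, -3, 2, -1, -1, -2, -1, 3, -3, -2, -2, 2, 7, -1],
    "V": [0, -3, -3, -3, -1, -2, -2, -3, -3, 3, 1, -2, 1, -1, -2, -2, 0, -3, -1, 4],
}


def conservative_substitutions(wild_type, min_score=1):
    """Return (residue, score) pairs scoring >= min_score, best first.

    min_score is a hypothetical cut-off chosen for illustration.
    """
    row = BLOSUM62_ROWS[wild_type]
    hits = [
        (aa, score)
        for aa, score in zip(COLUMNS, row)
        if aa != wild_type and score >= min_score
    ]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


for position, residue in [(256, "S"), (258, "K"), (262, "Y"), (265, "V")]:
    print(position, residue, conservative_substitutions(residue))
```

With these rows, the substitutions that appear in the docking table (S to A or T, K to R, Y to F, H or W, V to I) all fall among the positively scoring candidates.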

Mutant creation

Carousel. Ribbon structure of variants of the epitope with mutations highlighted (orange)
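The variant names listed in the docking table are consistent with a simple combinatorial scheme: one substitution per site combined in every possible way, plus a few alternative single-site variants. The sketch below reproduces those 18 names under that assumption. The grouping into `CORE` and `EXTRA_SINGLES` and both function names are introduced here for illustration, and the text mentions 21 predicted structures, so any further variants are not reconstructed.

```python
import re
from itertools import combinations

# Assumed grouping, inferred from the variant names in the docking table.
CORE = ["S256T", "K258R", "Y262F", "V265I"]   # one substitution per site
EXTRA_SINGLES = ["S256A", "Y262H", "Y262W"]   # alternative single-site variants


def enumerate_variants(core=CORE, extra=EXTRA_SINGLES):
    """List every non-empty combination of core substitutions, then the extras."""
    variants = [
        "".join(combo)
        for size in range(1, len(core) + 1)
        for combo in combinations(core, size)
    ]
    return variants + list(extra)


def parse_variant(name):
    """Split a name such as 'S256TK258R' into (wild type, position, mutant) tuples."""
    return [
        (wt, int(pos), mut)
        for wt, pos, mut in re.findall(r"([A-Z])(\d+)([A-Z])", name)
    ]


for name in enumerate_variants():
    print(name, parse_variant(name))
```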

We performed protein structure predictions to verify that the designed mutants would preserve the properties of the wild type intrinsic factor. We used AlphaFold2 [3] through its ColabFold implementation [4]. Structures of all 21 mutants were obtained as PDB files, visualised and structurally aligned to the crystal structure of the wild type IF (PDB ID: 2PMV) using PyMOL [5]. This enabled us to refine the list of tested mutants by excluding the improperly folded ones.
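One way to put a number on how closely a predicted structure follows the reference is the C-alpha RMSD. The sketch below is a stand-in for illustration only, not the PyMOL workflow used by the team: it assumes the two structures have already been superposed (for example in PyMOL), share residue numbering, and are supplied as PDB-format text. The function names are hypothetical.

```python
import math


def ca_coordinates(pdb_text, chain=None):
    """Map residue number -> (x, y, z) for C-alpha atoms in PDB-format text."""
    coords = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        if chain is not None and line[21] != chain:
            continue
        resnum = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep the first C-alpha seen for a residue (ignores alternate locations).
        coords.setdefault(resnum, xyz)
    return coords


def ca_rmsd(pdb_a, pdb_b, chain=None):
    """RMSD over C-alpha atoms present in both structures (no superposition)."""
    a = ca_coordinates(pdb_a, chain)
    b = ca_coordinates(pdb_b, chain)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no shared C-alpha positions")
    total = sum(
        sum((p - q) ** 2 for p, q in zip(a[r], b[r]))
        for r in shared
    )
    return math.sqrt(total / len(shared))
```

A variant whose RMSD to the wild type is far above that of the other variants would be a candidate for closer visual inspection. Any numeric cut-off would be a judgment call that the text does not specify.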

Immunogenicity predictions with discontinuous epitope — *Figure.* *Mutant S256TK258RY262FV265I-B12 complex (blue and red respectively) structurally aligned with the crystal structure of the wild type IF-B12 complex (cyan and beige respectively)*

Docking and MD

Docking

The final step of the modelling consisted of screening for potent therapeutic candidates. We performed docking studies to check that the binding affinity of each engineered protein is at least on par with the wild type IF. Docking was performed in two steps. First, we predicted the binding energy with AutoDock Vina [6], a well-established tool for docking simulations. Second, we refined the energy with the NNScore 2.0 neural network [7]. The NNScore 2.0 scoring function was used to estimate the binding energy and dissociation constant (Kd), as it is reported to be more accurate than the AutoDock Vina scoring function.
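For orientation, a dissociation constant and a binding free energy are linked by the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below implements that conversion under an assumed temperature of 298.15 K. It is not taken from the team's pipeline, the function names are introduced here, and the source does not state which scoring function produced each column of the table, so the tabulated energies need not satisfy this relation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_delta_g(delta_g, temperature_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature_k))


def pkd(kd_molar):
    """Negative base-10 logarithm of the dissociation constant."""
    return -math.log10(kd_molar)


kd_wild_type = 929.39e-12  # wild type Kd from the table, converted from pM to mol/L
print(delta_g_from_kd(kd_wild_type), pkd(kd_wild_type))
```

Under these assumptions, the wild type Kd of 929.39 pM corresponds to roughly -12.3 kcal/mol.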

Mutant	Binding energy (kcal/mol)	Kd (pM)
WT	-14.17	929.39
S256A	-14.64	270.53
S256T	-14.22	342.33
K258R	-16.41	111.74
Y262H	-14.13	66.03
Y262W	-14.20	116.91
Y262F	-15.08	112.11
V265I	-15.00	215.17
S256TK258R	-16.04	53.32
S256TY262F	-14.75	245.04
S256TV265I	-16.49	72.19
K258RY262F	-13.99	41.78
K258RV265I	-13.92	313.14
Y262FV265I	-15.26	144.10
S256TK258RY262F	-13.65	127.77
S256TK258RV265I	-14.22	100.04
S256TY262FV265I	-14.33	63.77
K258RY262FV265I	-16.20	102.57
S256TK258RY262FV265I	-14.44	50.69

In addition, we assessed the binding energies of the B12-IF complex with the cubilin receptor using MutaBind2 [8], since the strength of binding is a crucial factor for internalisation of the B12-IF complex. However, we later found evidence that MutaBind2 performs poorly [9], so these predictions should be treated with caution.

The most promising candidates were selected to be tested in wet lab experiments. We found that most proposed mutants have a predicted binding affinity to vitamin B12 that is similar to or better than that of the wild type. These results indicate that our mutant selection pipeline is adequate and applicable to the research question.
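The comparison with the wild type can be made explicit with a few lines of code. The sketch below copies the values from the docking table and ranks the variants by fold change in Kd relative to the wild type. The function and field names are introduced here for illustration and are not part of the original pipeline.

```python
# Docking results copied from the table: name -> (energy in kcal/mol, Kd in pM).
RESULTS = {
    "WT": (-14.17, 929.39),
    "S256A": (-14.64, 270.53),
    "S256T": (-14.22, 342.33),
    "K258R": (-16.41, 111.74),
    "Y262H": (-14.13, 66.03),
    "Y262W": (-14.20, 116.91),
    "Y262F": (-15.08, 112.11),
    "V265I": (-15.00, 215.17),
    "S256TK258R": (-16.04, 53.32),
    "S256TY262F": (-14.75, 245.04),
    "S256TV265I": (-16.49, 72.19),
    "K258RY262F": (-13.99, 41.78),
    "K258RV265I": (-13.92, 313.14),
    "Y262FV265I": (-15.26, 144.10),
    "S256TK258RY262F": (-13.65, 127.77),
    "S256TK258RV265I": (-14.22, 100.04),
    "S256TY262FV265I": (-14.33, 63.77),
    "K258RY262FV265I": (-16.20, 102.57),
    "S256TK258RY262FV265I": (-14.44, 50.69),
}


def compare_to_wild_type(results, reference="WT"):
    """Rank variants by how many times lower their Kd is than the reference."""
    ref_energy, ref_kd = results[reference]
    rows = []
    for name, (energy, kd) in results.items():
        if name == reference:
            continue
        rows.append({
            "name": name,
            # Positive when the variant's energy is more negative than the reference.
            "energy_gain": ref_energy - energy,
            # Greater than 1 when the variant's Kd is lower than the reference.
            "kd_fold": ref_kd / kd,
        })
    return sorted(rows, key=lambda row: row["kd_fold"], reverse=True)


for row in compare_to_wild_type(RESULTS):
    print(f'{row["name"]:<22}{row["energy_gain"]:+6.2f}{row["kd_fold"]:8.1f}')
```

On the tabulated values, all 18 variants have a lower predicted Kd than the wild type, and 14 of the 18 have a more negative binding energy.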

Molecular Dynamics

To take a closer look at the interactions at the molecular level, we attempted to model the IF-B12 protein-ligand complex for a molecular dynamics (MD) study using GROMACS [10]. Our initial tasks included pre-processing the available crystal structure and generating topologies. Because the available structure is not resolved for the entire protein, we used Swiss-PdbViewer [11] to fill in the missing atoms (structure prediction), and Chimera [12] to remove glycosylation and other unwanted atoms. For the MD simulations, our force field of choice was CHARMM36 [13], which is well established for protein studies and widely accepted in the scientific community. However, a ligand such as B12 is not a recognized entity in this force field. The topology files for IF and B12 therefore had to be processed separately, and this step became the bottleneck of our experiment.

One of the most challenging tasks in molecular simulation is the proper processing of ligands. Introducing new entities into the framework of a force field requires careful consideration and validation, often in the form of quantum mechanical calculations [14]. This can be a laborious and tedious process. To prepare the ligand topology, we used the automated tool CGenFF [15], with an input file generated in Avogadro [16]. An advantage of CGenFF is that it returns a penalty score for each parameter. Our penalty scores were larger than 50, a level that generally requires manual reparametrization. The scores were exceptionally high around the corrin ring and the cobalt ion.
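Screening a CGenFF result for problematic parameters can be automated. The sketch below assumes the stream-file layout commonly produced by CGenFF, in which `ATOM` lines carry the charge penalty after a `!` and bonded-parameter lines end with `penalty= <value>`. This layout, the function name, and the made-up sample lines (including the `COX` atom type) are assumptions for illustration and should be checked against a real output file.

```python
import re

# Assumed formats: "ATOM <name> <type> <charge> ! <penalty>" for charges,
# and "... ! <comment>, penalty= <value>" for bonded parameters.
ATOM_RE = re.compile(r"^ATOM\s+(\S+)\s+(\S+)\s+(-?\d+\.\d+)\s*!\s*(\d+(?:\.\d+)?)")
PARAM_RE = re.compile(r"penalty=\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def high_penalties(stream_text, threshold=50.0):
    """Return (label, penalty) pairs whose penalty exceeds the threshold."""
    flagged = []
    for raw in stream_text.splitlines():
        line = raw.strip()
        # Skip title lines and the residue summary line.
        if line.startswith(("*", "RESI")):
            continue
        atom = ATOM_RE.match(line)
        if atom:
            label, penalty = "charge " + atom.group(1), float(atom.group(4))
        else:
            param = PARAM_RE.search(line)
            if not param:
                continue
            definition = " ".join(line.split("!")[0].split())
            label, penalty = "param " + definition, float(param.group(1))
        if penalty > threshold:
            flagged.append((label, penalty))
    return flagged


# Made-up sample lines, not taken from the team's B12 parameterization.
SAMPLE = """\
ATOM CO     COX     1.200 !  215.000
ATOM C1     CG2R61 -0.115 !    0.000
CG2R61 CG321   230.00     1.4900 ! LIG , from CG2R61 CG311, penalty= 4
CG2R51 COX     300.00     1.9000 ! LIG , from CG2R51 CG2R51, penalty= 180.5
"""

for label, penalty in high_penalties(SAMPLE):
    print(f"{penalty:8.1f}  {label}")
```

The default threshold of 50 mirrors the level mentioned above. Grouping the flagged entries by atom would show whether they cluster around particular regions of the ligand.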

In the end, after trial and error, we were not able to produce a parameterization that would accurately reflect a realistic model, owing to a lack of prior knowledge and time constraints.

Immunogenicity

Since we are modifying the protein, there is a high chance that the artificial modifications themselves could trigger an additional immunogenic response. We therefore screened our candidates for immunogenicity to ensure that the produced mutants do not trigger such responses. Our focus lay primarily on possible responses from B lymphocytes [17].

We screened for continuous (linear) as well as discontinuous epitopes that antibodies could recognize in an immunogenic response. The difference between them is that discontinuous epitope prediction takes the folded structure into account, while linear epitope prediction considers only the order of the amino acids in the sequence [18].

After an extensive literature review, we proceeded with the online tools of the IEDB (Immune Epitope Database and Analysis Resource) [19], specifically ElliPro for discontinuous epitopes and Antigen Sequence Properties for linear epitopes. ElliPro combines a clustering algorithm with Thornton's method, which focuses on protruding amino acid residues, to make its predictions [20]. The Antigen Sequence Properties tool offers several methods that consider beta turns, surface accessibility probability, mobility of protein segments, physicochemical properties of residues and hydrophilicity. It also applies a hidden Markov model, the propensity scale method, and a random forest algorithm.

We hypothesised that the introduced mutations would lead to a detectable drop in predicted immunogenicity. However, this hypothesis was not supported by either the linear or the discontinuous epitope predictions. To make the prediction results comparable between mutants, we used the area under the per-residue immunogenicity score curve (AUC) as a single-value metric, where a lower score corresponds to a lower predicted chance of an autoimmune response. For the discontinuous predictions, the wild type reached an AUC of 202.1, and the mutant with the lowest predicted immunogenicity reached an AUC of 201.8. In conclusion, the mutants were not predicted to have significantly altered immunogenicity. There are several possible explanations:

There is no significant difference in autoantibody binding, meaning that the previously described epitope is not the actual one.
The previously described epitope is the actual one, but the rationale we chose for the mutant design is not applicable to this case. Mutations with a stronger effect on protein structure should then have been selected.
Single point mutations introduce only very minor changes in structure, so the selected tools cannot detect any significant difference between the variants. However, these changes could still be enough to disrupt antibody binding in vitro.
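The AUC metric used in this section can be sketched as follows. The text does not state which integration rule was used, so the trapezoidal rule with unit spacing between residues is an assumption, and the function names are introduced here for illustration.

```python
def score_auc(scores, spacing=1.0):
    """Trapezoidal area under a per-residue score curve with uniform spacing."""
    if len(scores) < 2:
        return 0.0
    return spacing * sum((a + b) / 2.0 for a, b in zip(scores, scores[1:]))


def relative_change(mutant_auc, wild_type_auc):
    """Fractional change of a mutant AUC relative to the wild type."""
    return (mutant_auc - wild_type_auc) / wild_type_auc


# Reported AUC values for the discontinuous predictions.
print(f"{relative_change(201.8, 202.1):.4%}")
```

With the reported values of 202.1 and 201.8, the lowest-scoring mutant differs from the wild type by roughly 0.15 %, which illustrates how small the predicted effect is.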

References

[1] J. L. Guéant, A. Safi, I. Aimone-Gastin, H. Rabesona, J. P. Bronowicki, F. Plénat, et al. Autoantibodies in Pernicious Anaemia type I patients recognize sequence 251-256 in human intrinsic factor. Proc Assoc Am Physicians, vol. 109, no. 5, pp. 462-469, 1997. PMID: 9285945
[2] S. Henikoff and J. G. Henikoff. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA, vol. 89, no. 22, pp. 10915-10919, 1992. DOI: 10.1073/pnas.89.22.10915
[3] J. Jumper, R. Evans, A. Pritzel, et al. Highly accurate protein structure prediction with AlphaFold. Nature, vol. 596, pp. 583-589, 2021. DOI: 10.1038/s41586-021-03819-2
[4] M. Mirdita, K. Schütze, Y. Moriwaki, et al. ColabFold: making protein folding accessible to all. Nat Methods, vol. 19, pp. 679-682, 2022. DOI: 10.1038/s41592-022-01488-1
[5] Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 2.5. Schrödinger, LLC, 2021.
[6] O. Trott and A. J. Olson. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem, vol. 31, no. 2, pp. 455-461, 2009. DOI: 10.1002/jcc.21334
[7] J. D. Durrant and J. A. McCammon. NNScore 2.0: A Neural-Network Receptor-Ligand Scoring Function. J Chem Inf Model, vol. 51, no. 11, pp. 2897-2903, 2011. DOI: 10.1021/ci2003889
[8] N. Zhang, Y. Chen, H. Lu, F. Zhao, R. V. Alvarez, A. Goncearenco, et al. MutaBind2: Predicting the Impacts of Single and Multiple Mutations on Protein-Protein Interactions. iScience, vol. 23, no. 3, 2020. DOI: 10.1016/j.isci.2020.100939
[9] C. Chen, V. S. Boorla, D. Banerjee, R. Chowdhury, V. S. Cavener, R. H. Nissly, et al. Computational prediction of the effect of amino acid changes on the binding affinity between SARS-CoV-2 spike RBD and human ACE2. PNAS, vol. 118, no. 42, 2021. DOI: 10.1073/pnas.2106480118
[10] P. Bauer, B. Hess, and E. Lindahl. GROMACS 2022.3 Manual. Zenodo, 2022. DOI: 10.5281/zenodo.7037337
[11] N. Guex and M. C. Peitsch. SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling. Electrophoresis, vol. 18, no. 15, pp. 2714-2723, 1997. DOI: 10.1002/elps.1150181505
[12] E. F. Pettersen, T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, and T. E. Ferrin. UCSF Chimera: a visualization system for exploratory research and analysis. J Comput Chem, vol. 25, no. 13, pp. 1605-1612, 2004. DOI: 10.1002/jcc.20084
[13] J. Huang, S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, et al. CHARMM36m: An Improved Force Field for Folded and Intrinsically Disordered Proteins. Nat Methods, vol. 14, pp. 71-73, 2016. DOI: 10.1038/nmeth.4067
[14] J. A. Lemkul. From Proteins to Perturbed Hamiltonians: A Suite of Tutorials for the GROMACS-2018 Molecular Simulation Package, v1.0. Living J Comp Mol Sci, vol. 1, no. 1, 2018. DOI: 10.33011/livecoms.1.1.5068
[15] K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. E. Shim, et al. CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J Comput Chem, vol. 31, no. 4, pp. 671-690, 2010. DOI: 10.1002/jcc.21367
[16] M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminform, vol. 4, no. 1, p. 17, 2012. DOI: 10.1186/1758-2946-4-17
[17] Johns Hopkins Medicine. The Immune System. Johns Hopkins Medicine.
[18] T. C. Liang. Epitopes. Encyclopedia of Immunology (Second Edition), pp. 825-827, 1998. DOI: 10.1006/rwei.1999.0219
[19] R. Vita, S. Mahajan, J. A. Overton, et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res, vol. 47, no. D1, pp. D339-D343, 2019. DOI: 10.1093/nar/gky1006
[20] J. Ponomarenko, H. H. Bui, W. Li, N. Fusseder, P. E. Bourne, A. Sette, and B. Peters. ElliPro: a new structure-based tool for the prediction of antibody epitopes. BMC Bioinformatics, vol. 9, art. 514, 2008. DOI: 10.1186/1471-2105-9-514
