To address the challenging task of protein design, we performed thorough modelling to select the best candidates to be tested in the lab.

The first step in engineering an intrinsic factor (IF) that evades autoantibody-mediated clearance is to identify the epitope on IF that the autoantibodies recognise. Based on previous knowledge [1], we chose to mutate four residues in the alpha-helix region adjacent to the cobalamin (vitamin B12) binding pocket: Ser256, Lys258, Tyr262 and Val265. Optimal substitutions with similar residues were then selected and introduced in silico.

Structure of IF + B12 + cubilin with amino acids highlighted

Animation. Rotating model of IF (dark blue) + B12 (red) + cubilin (light blue) with key amino acids highlighted (orange)

We then designed our mutants with the aim of preserving the structure as much as possible, because the antibody-binding region lies in close proximity to the B12-binding region. Mutagenesis should therefore prevent IF from binding autoantibodies while preserving ligand-receptor binding affinity. For this purpose, we used the BLOSUM62 matrix [2] to identify closely related amino acids, that is, those whose substitution for one another is well tolerated in evolution.
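As a minimal sketch of this selection step, the snippet below looks up conservative substitutions for the four targeted residues in a hand-copied excerpt of BLOSUM62. The excerpt covers only a few substitutions per residue, and the score cutoff `min_score` and the function name are illustrative assumptions; the exact criterion we applied is not stated above.

```python
# Excerpt of BLOSUM62 scores for the four wild-type residues targeted here.
# Only a handful of substitutions are listed; the full matrix is 20 x 20.
BLOSUM62_EXCERPT = {
    "S": {"A": 1, "T": 1, "N": 1, "G": 0, "W": -3},
    "K": {"R": 2, "E": 1, "Q": 1, "N": 0, "I": -3},
    "Y": {"F": 3, "H": 2, "W": 2, "V": -1, "D": -3},
    "V": {"I": 3, "L": 1, "M": 1, "A": 0, "G": -3},
}


def conservative_substitutions(wild_type, min_score=1):
    """Return substitutions scoring >= min_score, highest score first."""
    scores = BLOSUM62_EXCERPT[wild_type]
    hits = [aa for aa, score in scores.items() if score >= min_score]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(hits, key=lambda aa: (-scores[aa], aa))


for position, residue in [(256, "S"), (258, "K"), (262, "Y"), (265, "V")]:
    print(f"{residue}{position}:", conservative_substitutions(residue))
```

A positive score means the pair is exchanged more often than expected by chance in related proteins, which is why higher-scoring substitutions are treated as structure-preserving candidates.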

Mutant creation

Carousel. Ribbon structure of variants of the epitope with mutations highlighted (orange)
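The multi-site variants that appear in the docking results table further down are named by concatenating single substitutions in sequence order. A short sketch of how such a set can be enumerated is given here; it assumes the four single substitutions S256T, K258R, Y262F and V265I are the building blocks, which is inferred from the variant names rather than stated explicitly.

```python
from itertools import combinations

# Single substitutions used as building blocks, in sequence order.
SINGLES = ["S256T", "K258R", "Y262F", "V265I"]


def combination_variants(singles):
    """Names of all variants carrying two or more of the given substitutions."""
    names = []
    for size in range(2, len(singles) + 1):
        for combo in combinations(singles, size):
            names.append("".join(combo))
    return names


variants = combination_variants(SINGLES)
print(len(variants), variants[0], variants[-1])
```

With four building blocks this gives six double, four triple and one quadruple variant.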

We performed protein structure predictions to verify that the designed mutants would preserve the properties of wild-type intrinsic factor. We used the well-known tool AlphaFold2 [3], run through ColabFold [4]. Structures of all 21 mutants were obtained as PDB files, visualised and structurally aligned to the wild-type IF crystal structure (PDB ID: 2pmv) using PyMOL [5]. This enabled us to refine the list of tested mutants by excluding the improperly folded ones.
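Alongside visual inspection, a quick numerical check on predicted models is possible because AlphaFold2 writes its per-residue confidence (pLDDT) into the B-factor column of the output PDB file. The sketch below averages that column over C-alpha atoms. It is not the criterion we used to exclude mutants, and the cutoff of 70 is a hypothetical value chosen for illustration.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column over C-alpha atoms of a PDB-format model."""
    values = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-based): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha atoms found")
    return sum(values) / len(values)


PLDDT_CUTOFF = 70.0  # hypothetical threshold, not taken from our pipeline


def looks_confident(pdb_text, cutoff=PLDDT_CUTOFF):
    """True if the mean C-alpha pLDDT reaches the cutoff."""
    return mean_plddt(pdb_text) >= cutoff
```

Calling `mean_plddt(open("model.pdb").read())` on a ColabFold output (the file name is a placeholder) gives a single confidence value per mutant that can be compared with the wild-type prediction.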

Figure. Mutant S256TK258RY262FV265I-B12 complex (blue and red, respectively) structurally aligned with the crystal structure of the wild-type IF-B12 complex (cyan and beige, respectively)

Docking and MD

Docking

The final step of the modelling consisted of screening for potent therapeutic candidates. We performed docking studies to ensure that the binding affinity of the engineered protein is at least on par with that of wild-type IF. Docking was performed in two steps. First, we predicted the binding energy with AutoDock Vina [6], a well-established tool for docking simulations. Second, the energy was refined using the NNScore 2.0 neural network [7]. The NNScore 2.0 scoring function was used to estimate the binding energy and dissociation constant (Kd), as it is reported to be more accurate than the AutoDock Vina scoring function.
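For readers comparing dissociation constants with neural-network scores, the conversion between Kd and pKd can be sketched as follows. This assumes the score is expressed on a pKd scale (the negative base-10 logarithm of Kd in mol/L); the function names are illustrative.

```python
import math


def pkd_to_kd_pm(pkd):
    """Convert pKd (-log10 of Kd in mol/L) to Kd in picomolar."""
    return 10.0 ** (-pkd) * 1e12


def kd_pm_to_pkd(kd_pm):
    """Convert Kd in picomolar back to pKd."""
    return -math.log10(kd_pm * 1e-12)


print(round(kd_pm_to_pkd(929.39), 2))
```

On this scale one pKd unit is a tenfold change in Kd, so a Kd of 1000 pM corresponds to a pKd of 9.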

| Mutant | Energy, kcal/mol | Kd, pM |
| --- | --- | --- |
| WT | -14.17 | 929.39 |
| S256A | -14.64 | 270.53 |
| S256T | -14.22 | 342.33 |
| K258R | -16.41 | 111.74 |
| Y262H | -14.13 | 66.03 |
| Y262W | -14.2 | 116.91 |
| Y262F | -15.08 | 112.11 |
| V265I | -15 | 215.17 |
| S256TK258R | -16.04 | 53.32 |
| S256TY262F | -14.75 | 245.04 |
| S256TV265I | -16.49 | 72.19 |
| K258RY262F | -13.99 | 41.78 |
| K258RV265I | -13.92 | 313.14 |
| Y262FV265I | -15.26 | 144.1 |
| S256TK258RY262F | -13.65 | 127.77 |
| S256TK258RV265I | -14.22 | 100.04 |
| S256TY262FV265I | -14.33 | 63.77 |
| K258RY262FV265I | -16.2 | 102.57 |
| S256TK258RY262FV265I | -14.44 | 50.69 |



In addition, binding energies of the B12-IF complex with the cubilin receptor were assessed using MutaBind2 [8], since the strength of this binding is crucial for internalisation of the B12-IF complex. However, we subsequently found evidence that MutaBind2 performs poorly at predicting the effect of mutations on binding affinity [9].

The most promising candidates were selected for testing in wet lab experiments. We found that most of the proposed mutants are predicted to have similar or better binding affinity to vitamin B12 than the wild type. These results therefore indicate that our mutant selection pipeline is adequate and applicable to the research question.
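The comparison with the wild type can be reproduced directly from the docking table. The sketch below copies the table values and counts how many mutants match or beat the wild type on each measure; treating "at least as good" as less than or equal to the wild-type value is our reading for illustration, not a formal selection rule.

```python
# (energy in kcal/mol, Kd in pM), copied from the docking table.
DOCKING = {
    "WT": (-14.17, 929.39),
    "S256A": (-14.64, 270.53),
    "S256T": (-14.22, 342.33),
    "K258R": (-16.41, 111.74),
    "Y262H": (-14.13, 66.03),
    "Y262W": (-14.2, 116.91),
    "Y262F": (-15.08, 112.11),
    "V265I": (-15.0, 215.17),
    "S256TK258R": (-16.04, 53.32),
    "S256TY262F": (-14.75, 245.04),
    "S256TV265I": (-16.49, 72.19),
    "K258RY262F": (-13.99, 41.78),
    "K258RV265I": (-13.92, 313.14),
    "Y262FV265I": (-15.26, 144.1),
    "S256TK258RY262F": (-13.65, 127.77),
    "S256TK258RV265I": (-14.22, 100.04),
    "S256TY262FV265I": (-14.33, 63.77),
    "K258RY262FV265I": (-16.2, 102.57),
    "S256TK258RY262FV265I": (-14.44, 50.69),
}

wt_energy, wt_kd = DOCKING["WT"]
mutants = {name: vals for name, vals in DOCKING.items() if name != "WT"}

# More negative energy and lower Kd both indicate tighter predicted binding.
better_energy = [n for n, (energy, _) in mutants.items() if energy <= wt_energy]
better_kd = [n for n, (_, kd) in mutants.items() if kd <= wt_kd]
tightest = min(mutants, key=lambda n: mutants[n][1])

print(len(better_energy), len(better_kd), tightest)
```

Of the 18 mutants, 14 have a binding energy at least as favourable as the wild type and all 18 have a lower predicted Kd; K258RY262F has the lowest Kd in the table.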

Molecular Dynamics

To take a closer look at the interactions at the molecular level, we attempted to model the IF-B12 protein-ligand complex for a molecular dynamics (MD) study using GROMACS [10]. Our initial tasks included pre-processing the available crystal structure data and generating the topology. As the available structure is not resolved for the entire protein, Swiss-PdbViewer [11] was used to fill in the missing atoms (predict the structure), and Chimera [12] was used to remove glycosylation and other unwanted atoms. For the MD simulations, our force field of choice was CHARMM36 [13], which is well established for protein studies and widely accepted in the scientific community. However, a ligand such as B12 is not a recognised entity in this force field. It was therefore necessary to process the topology files for IF and B12 separately, and this step became the bottleneck of our experiment.

One of the most challenging tasks in molecular simulation is the proper processing of ligands. Introducing new entities into the framework of a force field requires careful consideration and validation, often in the form of various quantum mechanical calculations [14]. This can be a laborious and tedious process. To prepare the ligand topology, we used the automated tool CGenFF [15], with an input file generated in Avogadro [16]. An advantage of CGenFF is that it returns a penalty score for each parameter. Our penalty scores were larger than 50, which generally indicates that manual reparametrization is required. The scores were exceptionally high around the corrin ring and the cobalt ion.
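A minimal sketch of how penalty scores can be pulled out of CGenFF output and triaged is shown here. It assumes the parameter lines carry a trailing `penalty= <value>` annotation; the example lines, atom types and values are made up for illustration and do not come from the B12 run. The tiers follow the guidance commonly quoted for CGenFF: below 10 the analogy is fair, 10 to 50 calls for basic validation, and above 50 calls for extensive validation or optimization.

```python
import re

# Matches annotations such as "penalty= 38" or "penalty= 187.5".
PENALTY_RE = re.compile(r"penalty=\s*([0-9]+(?:\.[0-9]+)?)")


def extract_penalties(stream_text):
    """Collect every 'penalty= <value>' annotation found in the text."""
    return [float(m.group(1)) for m in PENALTY_RE.finditer(stream_text)]


def classify_penalty(penalty):
    """Map a penalty score to the validation tier commonly quoted for CGenFF."""
    if penalty < 10:
        return "fair analogy"
    if penalty <= 50:
        return "basic validation recommended"
    return "extensive validation or optimization required"


# Illustrative lines only; atom types and numbers are invented.
EXAMPLE = """\
CG321  CG311  2.00  1.50 ! LIG, penalty= 4.5
CG2R51 NG2R50 3.00  1.30 ! LIG, penalty= 38
CG2R51 NG2R50 CG2R51 40.0 110.0 ! LIG, penalty= 187.5
"""

worst = max(extract_penalties(EXAMPLE))
print(worst, classify_penalty(worst))
```

Sorting parameters by penalty in this way shows where manual work would have to start, which in our case was the corrin ring and the cobalt centre.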

In the end, after trial and error, we were unable to obtain a parameterization that would yield a realistic model, owing to a lack of prior knowledge and time constraints.

Immunogenicity

Immunogenicity predictions with discontinuous epitope

Figure. Graph showing immunogenicity predictions with discontinuous epitope with a focus on the region of interest

Since we are modifying the protein, there is a high chance that artificial modifications could trigger an additional immunogenic response. Thus, we decided to screen our candidates for immunogenicity to ensure that the produced mutants do not trigger immunogenic responses themselves. Our focus lies primarily on possible responses from B lymphocytes [17].

We screened for both continuous (linear) and discontinuous epitopes that could be recognised by antibodies in an immune response. The difference between them is that discontinuous epitope prediction takes protein folding into account, whereas linear epitope prediction considers only the order of the amino acids in the sequence [18].

After an extensive literature review, we chose the online resource IEDB (Immune Epitope Database and Analysis Resource) [19], specifically its tools ElliPro for discontinuous epitopes and Antigen Sequence Properties for linear epitopes. ElliPro combines a clustering algorithm with Thornton's method, which focuses on protruding amino acid residues, to make its predictions [20]. Antigen Sequence Properties offers several methods that consider beta turns, surface accessibility probability, mobility of protein segments, physicochemical properties of residues and hydrophilicity. These methods include a hidden Markov model, the propensity scale method and a random forest algorithm.

We assumed that the introduced mutations would lead to a detectable drop in immunogenicity. However, this hypothesis was not supported by either the linear or the discontinuous epitope predictions. To make the prediction results comparable between mutants, we used the area under the per-residue immunogenicity score curve (AUC) as a single-value metric; a lower score corresponds to a lower predicted chance of autoimmune response. For the discontinuous predictions, the wild type reached an AUC of 202.1, while the mutant with the lowest predicted immunogenicity reached an AUC of 201.8. In conclusion, the mutants were not predicted to have significantly altered immunogenicity. There are several possible explanations:

  1. There is no significant difference in autoantibody binding, meaning that the previously described epitope is not the actual one.
  2. The previously described epitope is the actual one, but the rationale behind our mutant design is not applicable to this case. Mutations with a stronger effect on protein structure should then have been selected.
  3. The mutations introduce only very minor changes in structure, so the selected tools cannot detect any significant difference between the variants. Such changes could nevertheless be enough to disrupt antibody binding in vitro.
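The AUC metric used for this comparison can be sketched as follows. The sketch assumes trapezoidal integration over residue positions, which is one common choice; the integration method is not specified above, and the function names are illustrative.

```python
def trapezoid_auc(scores, positions=None):
    """Area under a per-residue score curve using the trapezoidal rule."""
    if positions is None:
        positions = list(range(1, len(scores) + 1))  # residues numbered 1..N
    if len(scores) != len(positions):
        raise ValueError("scores and positions must have equal length")
    area = 0.0
    for i in range(len(scores) - 1):
        width = positions[i + 1] - positions[i]
        area += 0.5 * (scores[i] + scores[i + 1]) * width
    return area


def relative_change(auc_mutant, auc_wild_type):
    """Fractional change of the mutant AUC relative to the wild type."""
    return (auc_mutant - auc_wild_type) / auc_wild_type


print(f"{relative_change(201.8, 202.1):.4%}")
```

For the reported values, the lowest-scoring mutant differs from the wild type by roughly 0.15%, which illustrates how small the predicted effect is.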

References

  1. J. L. Guéant, A. Safi, I. Aimone-Gastin, H. Rabesona, J. P. Bronowicki, F. Plénat, et al.
    Autoantibodies in Pernicious Anaemia type I patients recognize sequence 251-256 in human intrinsic factor
    Proc Assoc Am Physicians, vol. 109, no. 5, pp. 462-469, 1997
    PMID: 9285945


  2. S. Henikoff and J. G. Henikoff
    Amino acid substitution matrices from protein blocks
    Proc Natl Acad Sci USA, vol. 89, no. 22, pp. 10915–10919, 1992
    DOI: 10.1073/pnas.89.22.10915


  3. J. Jumper, R. Evans, A. Pritzel, et al.
    Highly accurate protein structure prediction with AlphaFold
    Nature, vol. 596, pp. 583-589, 2021
    DOI: 10.1038/s41586-021-03819-2


  4. M. Mirdita, K. Schütze, Y. Moriwaki, et al.
    ColabFold: making protein folding accessible to all
    Nat Methods, vol. 19, pp. 679–682, 2022
    DOI: 10.1038/s41592-022-01488-1


  5. Schrödinger, LLC
    The PyMOL Molecular Graphics System, Version 2.5
    Schrödinger, LLC, 2021


  6. O. Trott and A. J. Olson
    AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
    J Comput Chem., vol. 31, no. 2, pp. 455-461, 2009
    DOI: 10.1002/jcc.21334


  7. J. D. Durrant and J. A. McCammon
    NNScore 2.0: A Neural-Network Receptor–Ligand Scoring Function
    J Chem Inf Model., vol. 51, no. 11, pp. 2897-2903, 2011
    DOI: 10.1021/ci2003889


  8. N. Zhang, Y. Chen, H. Lu, F. Zhao, R. V. Alvarez, A. Goncearenco, et al.
    MutaBind2: Predicting the Impacts of Single and Multiple Mutations on Protein-Protein Interactions
    iScience, vol. 23, no. 3, 2020
    DOI: 10.1016/j.isci.2020.100939


  9. C. Chen, V. S. Boorla, D. Banerjee, R. Chowdhury, V. S. Cavener, R. H. Nissly, et al.
    Computational prediction of the effect of amino acid changes on the binding affinity between SARS-CoV-2 spike RBD and human ACE2
    PNAS, vol. 118, no. 42, 2021
    DOI: 10.1073/pnas.2106480118


  10. P. Bauer, B. Hess, and E. Lindahl
    GROMACS 2022.3 Manual
    Zenodo, 2022
    DOI: 10.5281/zenodo.7037337


  11. N. Guex and M. C. Peitsch
    SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling
    Electrophoresis, vol. 18, no. 15, pp. 2714-2723, 1997
    DOI: 10.1002/elps.1150181505


  12. E. F. Pettersen, T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, and T. E. Ferrin
    UCSF Chimera--a visualization system for exploratory research and analysis
    J Comput Chem., vol. 25, no. 13, pp. 1605-1612, 2004
    DOI: 10.1002/jcc.20084


  13. J. Huang, S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, et al.
    CHARMM36m: An Improved Force Field for Folded and Intrinsically Disordered Proteins
    Nat Methods, vol. 14, pp. 71-73, 2016
    DOI: 10.1038/nmeth.4067


  14. J. A. Lemkul
    From Proteins to Perturbed Hamiltonians: A Suite of Tutorials for the GROMACS-2018 Molecular Simulation Package, v1.0
    Living J. Comp. Mol. Sci., vol. 1, no. 1, 2018
    DOI: 10.33011/livecoms.1.1.5068


  15. K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. E. Shim, et al.
    CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields
    J. Comput. Chem., vol. 31, no. 4, pp. 671-690, 2010
    DOI: 10.1002/jcc.21367


  16. M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison
    Avogadro: an advanced semantic chemical editor, visualization, and analysis platform
    J. Cheminform., vol. 4, no. 1, p. 17, 2012
    DOI: 10.1186/1758-2946-4-17


  17. Johns Hopkins Medicine
    The Immune System
    Johns Hopkins Medicine


  18. T. C. Liang
    Epitopes
    Encyclopedia of Immunology (Second Edition), pp. 825-827, 1998
    DOI: 10.1006/rwei.1999.0219


  19. R. Vita, S. Mahajan, J. A. Overton, et al.
    The Immune Epitope Database (IEDB): 2018 update
    Nucleic Acids Res., vol. 47, no. D1, pp. D339-D343, 2019
    DOI: 10.1093/nar/gky1006


  20. J. Ponomarenko, H. H. Bui, W. Li, N. Fusseder, P. E. Bourne, A. Sette, and B. Peters
    ElliPro: a new structure-based tool for the prediction of antibody epitopes
    BMC Bioinformatics, vol. 9, no. 514, 2008
    DOI: 10.1186/1471-2105-9-514