"Science is nothing but perception" Plato
Simulations and models have become very important in recent years. They provide a first approach to a project without using any biological material, which reduces costs and points to the right way forward. Models and simulations are not only used in the first steps of a project; they are also strong tools to support the data and the hypotheses, and even to give a visual representation of things that cannot be seen with the naked eye, such as protein structure.
The main objective of our project is to dock two proteins: CD19 and CD19-LIGAND (from now on, CD19L). The difficulty of this step is that CD19 (a B-lymphocyte marker), a real protein encoded in the human genome, has few known ligands, which is why we had trouble finding one. After searching the literature, we found an artificial ligand used in the development of a drug that was ultimately never released. Another problem was that we had no direct evidence of the interaction between CD19 and CD19L; moreover, we could not find the structure of CD19L. Nevertheless, with only the sequence of CD19L, we used structural biology tools to study whether this ligand was suitable for our aim.
First of all, we needed the structure of CD19L, because in the RCSB Protein Data Bank (RCSB PDB) we only found the structure of CD19 (6AL5). To get the structure we tried different approaches. The first was AlphaFold2, an AI system developed by DeepMind that predicts a protein's 3D structure from its amino acid sequence and achieved outstanding results in the CASP competition in 2020. However, the result was disappointing, as more than 50% of the predicted structure was disordered.
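As an illustration of how the disordered fraction of an AlphaFold2 model could be estimated, the sketch below reads the per-residue confidence score (pLDDT), which AlphaFold writes into the B-factor column of its PDB output, and counts the residues below a cutoff. The text does not say how disorder was assessed in this project, so the pLDDT criterion, the cutoff of 50, and the function names `plddt_per_residue`, `low_confidence_fraction` and `make_ca_line` are assumptions made for this example. The input is a synthetic toy model, not the CD19L prediction.

```python
def plddt_per_residue(pdb_text):
    """Return {(chain, residue number): pLDDT} read from the CA atoms of a PDB file."""
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16,
        # chain in column 22, residue number in 23-26, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resseq = int(line[22:26])
            scores[(chain, resseq)] = float(line[60:66])
    return scores


def low_confidence_fraction(scores, cutoff=50.0):
    """Fraction of residues whose pLDDT is below the cutoff (assumed criterion)."""
    if not scores:
        return 0.0
    low = sum(1 for value in scores.values() if value < cutoff)
    return low / len(scores)


def make_ca_line(resseq, plddt):
    """Hypothetical helper: build one correctly aligned CA ATOM record for the toy input."""
    return (
        f"ATOM  {resseq:5d}  CA  ALA A{resseq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}"
    )


# Synthetic toy values, chosen only to exercise the functions.
toy_plddt = [92.1, 85.0, 45.3, 30.2, 48.9, 71.5]
toy_pdb = "\n".join(make_ca_line(i + 1, v) for i, v in enumerate(toy_plddt))

scores = plddt_per_residue(toy_pdb)
fraction = low_confidence_fraction(scores)
print(f"{fraction:.0%} of residues fall below the pLDDT cutoff")
```

For a real model, `toy_pdb` would be replaced by the contents of the predicted PDB file. A low pLDDT suggests, but does not prove, that a region is disordered.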
Our second approach was homology modeling. Homology modeling looks for proteins of known, similar structure and uses them as templates to model our protein, applying restraints and energy minimization. It should be noted that the modeling was done only for the recognition domain, both to reduce the modeling time and because it is the only domain relevant to the protein-protein docking. A schema of the method is:
Once we have the structures of the receptor (CD19) and the ligand (CD19L), it is time to see whether they interact. There are different approaches to studying the interaction between two proteins. Here, we chose an exhaustive search (pyDock = ZDOCK + energy scoring) instead of stochastic sampling. This method uses geometry-based docking with a grid search based on the Fast Fourier Transform (FFT): the proteins are discretized onto small grids, and a correlation function evaluated with the FFT is used to search for the most stable interaction. This algorithm has a computational cost of O(N^3 ln N^3), where N is the number of grid points along each axis, which is why we only used the recognition domain. We generated 2000 interaction models and then scored them by energy using electrostatics, van der Waals, surface and hydrogen-bond terms. From the 10 best solutions we created the structures in PDB format and then chose the best one.
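The idea behind the FFT-based grid search can be sketched with NumPy. This is a minimal toy example, not the ZDOCK or pyDock implementation: both proteins are assumed to be already discretized onto cubic grids of the same size, the score is a plain overlap count, and the function names `fft_translation_scan` and `best_translation` are hypothetical. It shows how a single FFT correlation scores every translation of the ligand at once for one fixed orientation.

```python
import numpy as np


def fft_translation_scan(receptor_grid, ligand_grid):
    """Score all cyclic translations of the ligand grid against the receptor grid.

    Entry scores[t] equals sum_x receptor[x] * ligand[x - t], computed for
    every shift t at once through the correlation theorem.
    """
    receptor_ft = np.fft.fftn(receptor_grid)
    ligand_ft = np.fft.fftn(ligand_grid)
    return np.fft.ifftn(receptor_ft * np.conj(ligand_ft)).real


def best_translation(scores):
    """Return the grid shift (dx, dy, dz) with the highest correlation score."""
    index = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(i) for i in index)


# Toy shapes: a 3x3x3 block standing in for a receptor pocket region
# and an identical block at the origin standing in for the ligand.
n = 16
receptor = np.zeros((n, n, n))
receptor[5:8, 6:9, 7:10] = 1.0
ligand = np.zeros((n, n, n))
ligand[0:3, 0:3, 0:3] = 1.0

scores = fft_translation_scan(receptor, ligand)
shift = best_translation(scores)
print("best shift:", shift)
```

Evaluating the same scores directly would need one sum over the grid for each of the N^3 shifts, whereas the FFT route needs a few transforms of cost N^3 ln N^3. A full docking run repeats this scan for many ligand orientations, and real scoring grids weight the surface and the core of each protein differently so that contacts are rewarded and clashes are penalized.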