Modelling

Structure prediction of DNA aptamers and protein

Structure of FimH (Our target protein)

Our target protein is FimH, a fimbrial protein present on the surface of E. coli. FimH is a 30 kDa protein positioned at the tip of surface filaments called type 1 fimbriae, which mediate mannose-specific binding and are the most common adhesive organelles in E. coli and other enterobacteria. FimH consists of two immunoglobulin-like domains: an N-terminal lectin domain that binds the mannose ligand and a C-terminal pilin domain that anchors FimH to the fimbrial tip.

FimH as an adhesive protein:

Adhesive proteins of prokaryotic and eukaryotic cells are generally multidomain in nature, with different domains to bind the ligand on the target cell and to anchor the binding protein to the cell membrane or adhesive organelle.

The allosteric catch-bond model has been proposed, and is experimentally supported, for the mannose-specific fimbrial adhesin FimH. The anchoring (pilin) domain of FimH interacts with the mannose-binding (lectin) domain and causes a twist in the beta-sandwich fold of the latter.

This loosens the mannose-binding pocket on the opposite end of the lectin domain, resulting in an inactive, low-affinity state of the adhesin. The autoinhibitory effect of the pilin domain is removed when tensile force is applied across the bond: the force separates the domains and causes the lectin domain to untwist and clamp tightly around the ligand, like a finger-trap toy.

Image

AlphaFold2 model for protein structure prediction:

Structure prediction is an important step in studying protein-ligand and protein-protein interactions. We used the AlphaFold2 deep learning model to predict the 3D conformation of the FimH protein.

AlphaFold2 has been very successful at predicting the 3D structure of proteins. The system created a stir in November 2020 by winning the biennial CASP (Critical Assessment of Structure Prediction) contest, beating around 100 other software programs.

The program uses a form of attention network, a deep learning technique in which the algorithm identifies parts of a larger problem and then pieces them together to obtain the overall solution. Training was conducted on between 100 and 200 GPUs.

We used the AlphaFold2 Google Colab notebook to generate the 3D conformation of the protein sequence, which our wetlab team provided after a literature survey.

Image
Image

Figure: Results from the CASP competitions for the past few years.

Image
Image

Figure: Overview of the AlphaFold2 model’s workflow.

DNA aptamer structure prediction using RNAfold and RNAComposer:

To overcome the challenge of predicting the 3D structure of DNA aptamers, we came up with a protocol for our use case. The steps of the protocol are as follows:

  1. Conversion of the DNA sequence to an RNA sequence (since all of the prominent webservers predict 3D conformations for RNA sequences).
  2. Feeding the resulting sequence to the RNAfold webserver to generate the 2D conformation. Predicting the 2D structure is half the way to the 3D conformation.
  3. Obtaining the dot-bracket form of the aptamer's 2D structure from the RNAfold server and using it as input to RNAComposer.
  4. Submitting the job to the RNAComposer webserver, where it waits in the queue. Once the server has finished the computation, we receive the .pdb file of the aptamer's 3D structure.
  5. Using this .pdb file, which contains the 3D conformation of the sequence, in the main process, i.e. the in-silico screening of aptamers by molecular docking.
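The first and third steps of the protocol can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual script: it assumes that the DNA-to-RNA conversion is a simple replacement of T by U on the same strand, and the function names `dna_to_rna` and `base_pairs`, as well as the toy sequence and structure, are hypothetical. The structure shown is hand-written for illustration and is not RNAfold output.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (T -> U), same strand.

    Assumption: the conversion step is a plain T -> U substitution.
    """
    seq = dna.strip().upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    return seq.replace("T", "U")


def base_pairs(dot_bracket: str) -> list:
    """Return 0-based (i, j) index pairs for matching brackets.

    Raises ValueError if the string is not a balanced dot-bracket structure,
    which is a cheap sanity check before submitting it to a 3D server.
    """
    stack = []
    pairs = []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    toy_dna = "GGGATTTTCCC"          # hypothetical toy sequence
    toy_structure = "(((.....)))"    # hand-written, NOT an RNAfold result
    rna = dna_to_rna(toy_dna)
    assert len(rna) == len(toy_structure)
    print(rna)
    print(base_pairs(toy_structure))
```

Checking that the dot-bracket string is balanced and has the same length as the sequence catches copy-paste errors before a job is queued on the 3D prediction server.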

Image

Figures 1 and 2: Results of the 2D structure prediction of aptamer sequences using the RNAfold webserver.

References:

  1. Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-03819-2
  2. AlphaFold Wikipedia: https://en.wikipedia.org/wiki/AlphaFold
  3. RNAfold webserver: http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi
  4. RNAComposer webserver: https://rnacomposer.cs.put.poznan.pl/

In-silico screening of the aptamers:

Aptamers show high potential for applications in therapeutics and diagnostics. They are usually obtained through a rigorous in-vitro screening procedure known as Systematic Evolution of Ligands by EXponential enrichment (SELEX).

Screening aptamers through conventional methods such as SELEX is time-consuming and requires a high-end laboratory setup and access to a large DNA library. Screening aptamers through computational docking and molecular dynamics studies is therefore highly sought after.

In our drylab, we approach aptamer screening through computational methods such as molecular docking and molecular dynamics (MD) simulations. This approach is beneficial because it needs minimal chemicals and reagents, whereas SELEX is both time-consuming and resource-intensive (reagents, lab equipment, etc.).

However, computational cost proved to be a barrier: it limited how much of the aptamer library the drylab could screen to find the aptamer with the highest affinity for the target protein.

In-silico approaches can probe details of aptamer-ligand interactions that are hard to elucidate experimentally.

Molecular simulation and Docking:

Molecular simulation involves preparing our aptamer and target protein for the docking procedure. We used AutoDock Tools, Cygwin, and the Discovery Studio visualization software to simulate protein-ligand interactions. The best docked pose is the one with the lowest binding free energy. Docking scores will be used to select the best-ranked dockings. Simulations will be performed between each aptamer in the generated library and the target protein, producing a dataset of important features, such as the binding free energy of the formed complex, with which to screen the aptamer library.

“Our drylab work can act as a verification mechanism in parallel with the wetlab work, to make sure that the in-silico and in-vitro observations match.”
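The ranking rule described above (lowest binding free energy wins) can be sketched as follows. This is an illustrative sketch only: the function name `rank_aptamers`, the aptamer identifiers, and the energy values are hypothetical placeholders, not results from our docking runs. The sketch assumes that the binding free energies of each aptamer's poses have already been extracted from the docking output into plain Python lists, in kcal/mol.

```python
def rank_aptamers(docking_results, top_n=None):
    """Rank aptamers by the binding free energy of their best pose.

    docking_results maps an aptamer identifier to a list of binding free
    energies (kcal/mol), one per docked pose. The best pose of an aptamer is
    the one with the lowest (most negative) energy. Aptamers with no poses
    are skipped.
    """
    best = {
        name: min(energies)
        for name, energies in docking_results.items()
        if energies
    }
    # Sort ascending: the most negative energy comes first.
    ranked = sorted(best.items(), key=lambda item: item[1])
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    # Placeholder values for illustration only, not real docking scores.
    example = {
        "apt_001": [-6.2, -5.8],
        "apt_002": [-8.1, -7.9, -4.0],
        "apt_003": [-7.0],
    }
    for name, energy in rank_aptamers(example, top_n=2):
        print(name, energy)
```

Keeping only the best pose per aptamer makes the ranking independent of how many poses each docking run produced.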

Future work:

We plan to continue the in-silico screening with our drylab team: we will screen our preliminary library of 500 randomly generated aptamer sequences (generated with a constraint on the ratio of the different nucleotides in each sequence) and narrow it down to a small library for our wetlab team to screen experimentally.

We plan to automate the protocol in order to make it possible to screen large aptamer libraries, and to create a tool for further research into in-silico aptamer screening.
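One way to generate such a constrained random library is rejection sampling, sketched below. The exact constraint we applied is not specified here, so this sketch assumes a bound on GC content as a stand-in for the composition constraint. The function names, the sequence length, the GC bounds, and the seed are all hypothetical choices for illustration.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C nucleotides in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def generate_library(n_sequences, length, gc_min, gc_max, seed=0,
                     max_attempts=1_000_000):
    """Generate unique random DNA sequences whose GC fraction lies in
    [gc_min, gc_max], by rejection sampling.

    Assumption: the composition constraint is expressed as a GC-content
    range. A fixed seed makes the library reproducible.
    """
    rng = random.Random(seed)
    library = set()
    attempts = 0
    while len(library) < n_sequences:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not satisfy the constraint; relax the bounds")
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if gc_min <= gc_fraction(candidate) <= gc_max:
            library.add(candidate)
    return sorted(library)


if __name__ == "__main__":
    # 500 sequences as in the text; length and GC bounds are assumed values.
    library = generate_library(500, length=40, gc_min=0.4, gc_max=0.6, seed=42)
    print(len(library), library[0])
```

Rejection sampling is simple and keeps every accepted sequence uniformly random within the allowed composition range, at the cost of discarding some candidates.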

References:

  1. Mohamad Zulkeflee Sabri, Azzmer Azzar Abdul Hamid, Sharifah Mariam Sayed Hitam, Mohd. Zulkhairi Abdul Rahim. In-Silico Selection of Aptamer: A Review on the Revolutionary Approach to Understand the Aptamer Design and Interaction Through Computational Chemistry. https://www.sciencedirect.com/science/article/pii/S2214785319338714

Epidemiological modelling

We visualised UTI incidence, prevalence, and deaths across all countries by generating a world heat map of UTI spread.

We obtained the dataset from the Global Burden of Disease (GBD) database, which is available for public use. The GBD provides a tool to quantify health loss from hundreds of diseases, injuries, and risk factors, so that health systems can be improved and disparities eliminated.

We used the geopandas library to generate the world heat map. Visualising the data gave us insight into the prevalence of UTI cases and the number of deaths caused by UTI every year.

We normalised the data by each country's population in order to compare UTI deaths in India with those in other countries.
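The population normalisation can be illustrated with a short sketch. It assumes the normalised quantity is deaths per 100,000 inhabitants, which is a common convention but may differ from the scaling we actually used. The function names, country labels, and numbers are placeholders and are not GBD figures.

```python
def deaths_per_100k(deaths, population):
    """Deaths per 100,000 inhabitants (assumed normalisation convention)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


def normalise(records):
    """Map each country to its population-normalised death rate.

    records maps a country name to a (deaths, population) pair.
    """
    return {
        country: deaths_per_100k(deaths, population)
        for country, (deaths, population) in records.items()
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT real GBD data.
    placeholder = {
        "Country A": (500, 10_000_000),
        "Country B": (500, 1_000_000),
    }
    for country, rate in normalise(placeholder).items():
        print(country, rate)
```

The two placeholder countries report the same absolute number of deaths, but the normalised rates differ tenfold, which is why raw counts alone would be misleading on a world map.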

References:

  1. Global Burden of Disease Study 2019 (GBD 2019) Data Resources: https://ghdx.healthdata.org/gbd-2019
  2. Global Burden of Disease search tool: https://vizhub.healthdata.org/gbd-results/
  3. Jordahl, K., 2014. GeoPandas: Python tools for geographic data. URL: https://github.com/geopandas/geopandas.

Image
