The crucial steps in our project are biofilm attachment, production of recombinant proteins and reactor design. To optimise these elements, we built mathematical and computational models to answer the following questions:
Biofilm formation is the process by which microorganisms attach to and grow on a surface. Biocarriers provide the surface for biofilm attachment. Understanding the steps involved in biofilm formation is essential for deriving the model.
The stages of biofilm formation are as follows:
Detachment is one of the phases of the biofilm cycle. Because the biofilm is attached to the biocarrier, the detachment phase must not lead to run-off of the engineered bacteria. It is therefore critical to identify when the biofilm on the biocarriers needs replacement. The conditions that govern biofilm detachment are the yield stress parameter of the bacteria, biofilm strength, liquid shear and other properties of the bacteria.
C - Yield stress parameter of bacteria
MC - Fluidity of the bacteria
NC - Flow index of bacteria
-1 - Power-law index of the model
LA - Strain rate for Acc based on bacterial physical parameters
L0 - Initial stress tensor based on density
ax - attachment in X axis
ay - attachment in Y axis
gx - growth in X axis
gy - growth in Y axis
dx - detachment in X axis
dy - detachment in Y axis
Gro → Growth
Det → Detachment
Att → Attachment
The csgBAC operon encodes csgA, csgB and csgC. csgA is the major structural subunit of the curli protein and is mainly responsible for biofilm formation. csgB is the minor structural subunit; it binds to the extracellular matrix and facilitates the polymerization that produces curli proteins. csgC is required for the correct assembly of mature curli fimbriae. csgD is the positive transcriptional regulator of the csgBAC operon. The transition of the bacteria from the planktonic to the multicellular state is controlled by csgD through regulation of the curli genes. Upregulation of OmpR is required to activate the csgD gene and produce the curli proteins necessary for biofilm formation.
The concentration of each gene is considered to be constant.
Transcription and translation processes are not hindered due to any disturbance.
RNA polymerase is available in adequate quantity.
Ribosomes are present in sufficient amounts to aid in translation.
mRNA formed per unit time is given by:
d[mRNA] / dt = k₁[gene] - d₁[mRNA] (1)
Protein formed per unit time is given by:
d[protein] / dt = k₂[mRNA] - d₂ [protein] (2)
Where,
k₁ is transcription rate
d₁ is mRNA degradation rate
k₂ is translation rate
d₂ is protein degradation rate
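Equations (1) and (2) can be explored numerically. The sketch below integrates them with the forward Euler method and compares the result with the analytic steady state, [mRNA] = k₁[gene]/d₁ and [protein] = k₂[mRNA]/d₂. The rate constants, the gene concentration of 1.0, the time step and the function names are illustrative assumptions, not measured values from our project.

```python
def simulate_expression(k1, d1, k2, d2, gene=1.0, t_end=20000.0, dt=0.1):
    """Integrate equations (1) and (2) with the forward Euler method.

    Starts from zero mRNA and zero protein and returns the values
    reached at t_end (in seconds).
    """
    mrna, protein = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        d_mrna = k1 * gene - d1 * mrna        # equation (1)
        d_protein = k2 * mrna - d2 * protein  # equation (2)
        mrna += d_mrna * dt
        protein += d_protein * dt
    return mrna, protein


def steady_state(k1, d1, k2, d2, gene=1.0):
    """Analytic steady state obtained by setting both derivatives to zero."""
    mrna_ss = k1 * gene / d1
    protein_ss = k2 * mrna_ss / d2
    return mrna_ss, protein_ss


# Hypothetical rate constants in s^-1, chosen only for illustration.
K1, D1, K2, D2 = 0.02, 0.005, 0.05, 0.001

if __name__ == "__main__":
    print("simulated:", simulate_expression(K1, D1, K2, D2))
    print("analytic: ", steady_state(K1, D1, K2, D2))
```

With constant gene concentration, the simulated values approach the analytic steady state once the run is much longer than 1/d₁ and 1/d₂.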
OmpR protein production:
d[mRNA OmpR] / dt = k₁[gene OmpR] - d₁ [mRNA OmpR] (3)
d[protein OmpR] / dt = k₂ [mRNA OmpR] - d₂[protein OmpR] (4)
Once OmpR protein has been produced, OmpR upregulation activates csgD production:
d[mRNA csgD] / dt= k₃ [gene csgD] - d₃ [mRNA csgD] (5)
d[protein csgD] / dt = k₄ [mRNA csgD] - d₄ [protein csgD] (6)
csgD is a positive transcriptional regulator of the csgBAC operon and controls the transcription of the csgA, csgB and csgC genes.
d[mRNA csgA] / dt = k₅[gene csgA] - d₅ [mRNA csgA] (7)
d[mRNA csgB] / dt= k₆[gene csgB] - d₆ [mRNA csgB] (8)
d[mRNA csgC] / dt = k₇[gene csgC] - d₇ [mRNA csgC] (9)
Parameters involved:
Parameter | Description | Value | Unit | Reference |
---|---|---|---|---|
k₁ | Rate of transcription of OmpR | TBF | s⁻¹ | - |
k₂ | Rate of translation of OmpR | TBF | s⁻¹ | - |
k₃ | Rate of transcription of csgD | 0.0214 | s⁻¹ | Proshkin et al., 2010 |
k₄ | Rate of translation of csgD | TBF | s⁻¹ | - |
k₅ | Rate of transcription of csgA | 0.0921 | s⁻¹ | Proshkin et al., 2010 |
k₆ | Rate of transcription of csgB | 0.0214 | s⁻¹ | Proshkin et al., 2010 |
k₇ | Rate of transcription of csgC | 0.0214 | s⁻¹ | Proshkin et al., 2010 |
TBF: to be found
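As a rough illustration of how the tabulated transcription rates enter equations (5) and (7) to (9), the sketch below evaluates the steady-state mRNA level, k[gene]/d, for each curli gene. The mRNA degradation rate and the gene concentration are hypothetical placeholders (the degradation rates are not listed in the table), and using one shared degradation rate for all four transcripts is a simplifying assumption.

```python
# Transcription rates from the parameter table (s^-1).
TRANSCRIPTION_RATES = {
    "csgD": 0.0214,  # k3
    "csgA": 0.0921,  # k5
    "csgB": 0.0214,  # k6
    "csgC": 0.0214,  # k7
}

# Hypothetical values: not taken from the table or from the literature.
ASSUMED_MRNA_DEGRADATION = 0.005  # s^-1, shared by all four transcripts
ASSUMED_GENE_CONCENTRATION = 1.0  # arbitrary units


def steady_state_mrna(k, d, gene):
    """Steady state of d[mRNA]/dt = k[gene] - d[mRNA]."""
    if d <= 0:
        raise ValueError("degradation rate must be positive")
    return k * gene / d


levels = {
    name: steady_state_mrna(k, ASSUMED_MRNA_DEGRADATION, ASSUMED_GENE_CONCENTRATION)
    for name, k in TRANSCRIPTION_RATES.items()
}

if __name__ == "__main__":
    for name, level in levels.items():
        print(f"{name}: {level:.2f}")
```

Under these assumptions the ratio between two steady-state mRNA levels equals the ratio of their transcription rates, so the absolute numbers matter less than the comparison between genes.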
As the flowchart and equations above indicate, mRNA of csgA, csgB and csgC is produced in greater amounts when the transcriptional regulator, csgD protein, is formed effectively. An increase or decrease in csgD protein levels therefore controls the transcription of the csgA, csgB and csgC curli genes.
The gene is transcribed into mRNA, and the mRNA is translated into protein. Promoter strength, plasmid copy number, transcription factors and RNA polymerase are the important parameters for effective transcription. For effective translation, ribosomes, the strength of the ribosome binding site (RBS) and tRNA play a pivotal role.
A → Promoter(BBa_J23100/BBa_K896008)
B → Transcription factor
C → mRNA
D → Fusion protein(OmpA - aphA)
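kon is the rate constant for binding of A and B
koff is the rate constant for unbinding of A and B
d[AB] / dt = kon [A] [B] – koff [AB] (1)
d[A] / dt = – kon [A] [B] + koff [AB] (2)
d[B] / dt = – kon [A] [B] + koff [AB] (3)
[AB] + [A] = Cₙ (4)
Where Cₙ is plasmid copy number.
At equilibrium, d[AB] / dt = 0. Substituting [A] = Cₙ – [AB] from (4) into (1) gives:
[AB] = Cₙ kon [B] / (kon [B] + koff)
Dividing the numerator and denominator by kon, with kd = koff / kon:
[AB] = Cₙ [B] / (kd + [B])
mRNA (C) is transcribed from the bound promoter, and the fusion protein (D) is translated from the mRNA:
d[C] / dt = k₁ Cₙ [B] / (kd + [B]) – d₁ [C] (5)
d[D] / dt = k₂ [C] – d₂ [D] (6)
At steady state:
[C] = k₁ Cₙ [B] / (d₁ (kd + [B])) (7)
Replacing the binding term with a Hill function gives the steady-state protein level:
[D] = (k₂ k₁ Cₙ / (d₁ d₂)) (β₀ + (1 – β₀) [B]ⁿ / (kd + [B]ⁿ)) (8)
In the Hill function,
β₀ is the basal expression
n is the Hill coefficient
kd is the apparent dissociation constant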
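The steady-state expression for the fusion protein D can be evaluated directly. The following is a minimal sketch, assuming hypothetical parameter values; the function and argument names are ours and the numbers are not fitted to any promoter.

```python
def hill_activation(b, kd, n, beta0):
    """Hill term: basal expression plus the activated fraction.

    b is the transcription factor concentration [B], kd the apparent
    dissociation constant, n the Hill coefficient and beta0 the basal
    expression (between 0 and 1).
    """
    if b < 0:
        raise ValueError("concentration must be non-negative")
    return beta0 + (1.0 - beta0) * b**n / (kd + b**n)


def steady_state_fusion_protein(b, k1, k2, d1, d2, copy_number, kd, n, beta0):
    """Steady-state level of D: maximal expression scaled by the Hill term."""
    max_expression = (k2 * k1 * copy_number) / (d1 * d2)
    return max_expression * hill_activation(b, kd, n, beta0)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    params = dict(k1=0.02, k2=0.05, d1=0.005, d2=0.001,
                  copy_number=10, kd=2.0, n=1, beta0=0.1)
    for b in (0.0, 2.0, 20.0, 200.0):
        print(b, steady_state_fusion_protein(b, **params))
```

In this form the prefactor k₂k₁Cₙ/(d₁d₂) sets the maximal yield, so a higher plasmid copy number scales the whole curve, while the Hill term only moves the output between the basal level and that maximum.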
Transcription depends mainly on the promoter and the transcription factor, and it is one of the rate-determining steps of gene expression. Protein yield varies with plasmid copy number and promoter strength, so the expression of the fusion protein will differ between promoters according to their individual strengths.
Fusion proteins are produced when two genes coding for different proteins are joined by a linker sequence and expressed as a single entity. In our project, the OmpA gene attached to a linker is joined to the aphA gene. The role of the OmpA-linker is to bring the aphA protein to the cell surface. For efficient fusion protein yield, the binding affinity should be strong. To check the model confidence of the fusion protein, we used the I-TASSER server, which provides a C-score for the predicted model. We analysed the C-score to assess the quality of the predicted fusion protein model, and it helped us understand the model that we initially built. We will work on improving the C-score of the fusion protein until it reaches a high confidence value. I-TASSER also reports a normalized Z-score for the threading alignments; a normalized Z-score greater than 1 indicates a good alignment. The fusion protein sequence received a normalized Z-score of 2.56, and the Rank 1 PDB hit is 1n8nA.
To check the model confidence further, we also used AlphaFold. The confidence scores of the functional protein domains in our structure, aphA and OmpA, range from confident to very confident (blue to dark blue), with slight distortion and lower confidence around the linker between the two major protein domains. (In the model below, the blue ribbon represents the OmpA-linker and the green ribbon represents aphA.)
The per-residue pLDDT scores likewise show high confidence for the two protein domains, with some distortion predicted where the linker joins them.
Overall, considering the confidence scores, pLDDT and predicted aligned error, we concluded that our fusion protein is expected to function as intended, and we proceeded with construction.
The native acid phosphatase has an active site for substrate binding. The conformation of this active site should not change when aphA is present in fused form. To check for any change in the active site, we used the 'align' function in PyMOL, which performs a sequence alignment followed by structural superimposition. We aligned the OmpA-aphA model predicted by I-TASSER and the OmpA-aphA model predicted by AlphaFold Colab with the native aphA. The aphA domain of the OmpA-aphA fusion superimposed closely on the native aphA in both cases, which indicates that the predicted active site of aphA is not disturbed by the fusion.
I-TASSER fusion protein prediction and native acid phosphatase aligned using PyMOL. (Pink ribbon - native aphA; yellow ribbon - OmpA of fusion protein; blue ribbon - aphA of fusion protein.)
AlphaFold fusion protein prediction and native acid phosphatase aligned using PyMOL. (Blue ribbon - AlphaFold fusion protein; pink ribbon - native acid phosphatase.)
Picioreanu, C., van Loosdrecht, M. C. M., & Heijnen, J. J. (2001). Two-dimensional model of biofilm detachment caused by internal stress from liquid flow. Biotechnology and bioengineering, 72(2), 205-218. https://doi.org/10.1002/1097-0290(20000120)72:2%3C205::AID-BIT9%3E3.0.CO;2-L
Picioreanu, C., Van Loosdrecht, M. C., & Heijnen, J. J. (1998). Mathematical modeling of biofilm structure with a hybrid differential‐discrete cellular automaton approach. Biotechnology and bioengineering, 58(1), 101-116. https://doi.org/10.1002/(SICI)1097-0290(19980405)58:1%3C101::AID-BIT11%3E3.0.CO;2-M
Picioreanu, C., van Loosdrecht, M., & Heijnen, J. (1999). Multidimensional modeling of biofilm structure. Delft University of Technology, Faculty of Applied Sciences.
Rittmann, B. E. (1982). The effect of shear stress on biofilm loss rate. Biotechnology and bioengineering, 24(2), 501-506. https://doi.org/10.1002/bit.260240219
Proshkin, S., Rahmouni, A. R., Mironov, A., & Nudler, E. (2010). Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science, 328(5977), 504-508. https://doi.org/10.1126/science.1184939
Ledezma-Tejeida, D., Altamirano-Pacheco, L., Fajardo, V., & Collado-Vides, J. (2019). Limits to a classic paradigm: most transcription factors in E. coli regulate genes involved in multiple biological processes. Nucleic acids research, 47(13), 6656-6667. https://doi.org/10.1101/479857
Kim, H., & Gelenbe, E. (2011). Stochastic gene expression modeling with hill function for switch-like gene responses. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(4), 973-979. https://doi.org/10.1109/TCBB.2011.153
Likhoshvai, V., & Ratushny, A. (2007). Generalized Hill function method for modeling molecular processes. Journal of bioinformatics and computational biology, 5(02b), 521-531. https://doi.org/10.1142/S0219720007002837
Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., & Zhang, Y. (2015). The I-TASSER Suite: protein structure and function prediction. Nature methods, 12(1), 7-8. https://doi.org/10.1038/nmeth.3213
Roy, A., Kucukural, A., & Zhang, Y. (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols, 5(4), 725-738. https://doi.org/10.1038/nprot.2010.5
Zhang, Y. (2008). I-TASSER server for protein 3D structure prediction. BMC bioinformatics, 9(1), 1-8. https://doi.org/10.1186/1471-2105-9-40
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589. https://doi.org/10.1038/s41586-021-03819-2
Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold: making protein folding accessible to all. Nature Methods, 1-4. https://doi.org/10.1038/s41592-022-01488-1
Schrödinger, L., & DeLano, W. (2020). PyMOL. Retrieved from PyMOL