Modeling
Abstract
Modeling gave us better insight into how to improve the properties of the cellulose aerogel with a modular protein coating without wasting proteins in the process. The study was conducted in two parts. First, we adopted a geometrical approach to predict the quantity of protein needed. Next, we checked this estimate with a statistical approach using the chemical parameters of our different constructs.
We then extended the analysis towards a better characterisation of the aerogel and its behavior on a larger scale. Using defined parameters for the different materials, we modeled a full house and assessed the energy saved by the porous material. Finally, we assessed the aerogel thickness needed to meet the Swiss insulation requirements.
Protein Modeling
Research and background information
Model introduction
The proteins, selected for their properties, were designed to bind specifically to cellulose molecules, the major constituent of the aerogel, via a cellulose-binding domain (CBD). Using the AlphaFold [1] prediction of the proteins' 3D folding, we developed a simplified representation of the designed proteins to predict the quantity of molecules needed for the coating in the two main approaches.
The proteins to be modeled are shown in the schematic representation above (fig. 1, A, B). The individual domains were selected for their properties. The first domain, the Cellulose Binding Domain (CBD), attaches the protein to the cellulose. The second domain gives the protein its intended function, and the final sequence, a monomer of streptavidin (mSA), allows the attachment of different modular proteins, for example Avitag-SR.
Protein characterization
Knowing the structure of a recombinant protein is essential to foresee how it is placed on a cellulose surface. Since we could not resolve the protein structures experimentally, we assessed their structural behavior computationally. Most prediction tools use homology approaches to obtain the best prediction of a protein structure [2]. We turned to the AlphaFold deep learning solution, which let us exploit the multilayer neural network developed by the company. In short, the model captures features separately in its first layer and then builds more detailed representations in the subsequent convolution layers (fig. 2). We identified two main components of the AlphaFold script. The first block, called Evoformer, classifies the specific Multiple Sequence Alignment segments of the protein and gave us an early understanding of the structure of our proteins. The final block of the neural network refines the 3D structure.
The inputs used to characterise our proteins were the protein sequences listed on the Parts Page.
Amount of protein needed to coat our aerogel
Knowing the structure of our proteins was essential for estimating the quantity of protein needed to coat the aerogel. We made some major assumptions in this part, which are detailed later in the implementation.
In this part, the aerogel was first idealised as a cylinder with flat surfaces, and the surface of this cylinder was used as the coating area. The proteins' structure and affinity for cellulose were then used to cover the cylinder surface geometrically by placing the proteins adjacent to one another. We assumed that the proteins are elongated structures that attach perpendicular to the cellulose. In the model the placement of the proteins is ideal; in reality, a monolayer of proteins will not be placed ideally. We therefore applied a security factor to ensure total coverage of the aerogels. We chose to check the model against the affinity of the proteins for cellulose, using a basic statistical approach.
Several parameters had to be considered; most of them are listed in (Table 1).
Parameters | Definition | Chosen value | Source |
---|---|---|---|
Aerogel properties | | | |
Height (t) | Height of the aerogel to be coated (mm) | 4.5 | Measured using the Keyence VHX-7000 series microscope [3]. |
Diameter (d) | Diameter of the aerogel to be coated (mm) | 30 | Measured using the Keyence VHX-7000 series microscope [3]. |
Protein properties | | | |
01a Extinction Coefficient | Absorption constant associated with the 01a protein sequence (M⁻¹ cm⁻¹) | 83810 | Computed with Expasy [4] from the sequence designed by the company that constructed our plasmids. |
03a Extinction Coefficient | Absorption constant associated with the 03a protein sequence (M⁻¹ cm⁻¹) | 91010 | Computed with Expasy [4] from the sequence designed by the company that constructed our plasmids. |
01a Molar Weight | Molar weight of the protein 01a (g/mol) | 77798.84 | Obtained in the same way as the extinction coefficient. |
03a Molar Weight | Molar weight of the protein 03a (g/mol) | 71498.66 | Obtained in the same way as the extinction coefficient. |
Security Factor (Q) | Factor defined to ensure total coating of the aerogel surface | 2 | Accounts for the formation of one to two layers of proteins on the aerogel. |
The parameters in (Table 1) were entered into a Jupyter notebook to compute the outputs. The Jupyter notebook will be made accessible to all iGEM teams after the competition, to allow time to document the code.
Implementation and results
Protein characterization
To characterize the two recombinant proteins of interest, 01a and 03a, we studied their known domains and modeled them separately to assess their geometry. For this individual assessment, we took each domain of interest from SnapGene and ran it through the EMBL-EBI FASTA sequence similarity search, a method that compares the uploaded sequence with sequence databases through fast local alignment.
Both recombinant proteins presented the same attachment domains:
Cellulose Binding Domain (CBD)
This domain is known to have a very high affinity for cellulose molecules, which matters because the aerogel is mostly composed of cellulose and air. We submitted our CBD coding sequence to the FASTA search. The closest existing protein was the Mini Scaffolding protein, whose structure is available as model 4JO5 [5]. In (fig. 4), we identified the cylindrical shape of this protein segment. This specific shape enables the high affinity for cellulose. The chain has around 190 residues, and its structure is crucial for the recombinant proteins to adhere.
Monomer of streptavidin (mSA)
In our project, we wanted to give a certain modularity to our insulation material. Through an mSA-Avitag linkage, the aerogel can therefore be given different properties via mSA binding. The 3D configuration of this sequence must be well suited to binding biotinylated Avitags. We found that our sequence was closest to the Streptavidin-V1 protein. In (fig. 5), we distinguished a small helicoidal structure corresponding to the region with the avidin-like structure. This part has 122 residues, which is consistent with the length of the sequence chosen by our team.
To protect the cellulose aerogel, different proteins were selected to functionalize the material efficiently. The major elements of this functionalization are described below:
Silk Protein
As described on the Design Page, the silk protein makes the material hydrophobic. In the literature, its structure is usually elongated due to the repetition of short protein segments. When running the FASTA search on the precise segment used in our recombinant proteins, we saw (fig. 5) that the confidently predicted part was the very long twisted plane. The end linkage segments did not have a valid predicted structure.
The following structure (fig. 6) describes a protein similar to our sequence, with a strong sequence match and 70 % identity between the two. Even though the structure encoded by our plasmid has fewer residues, we were convinced that the final structure of this part would still be elongated and flat.
Green Fluorescent Protein (GFP)
We studied GFP for its fluorescent properties, which allow visual validation. GFP is well characterized, which made the analysis of this sequence the easiest. We saw in (fig. 7) below that the protein has a larger helicoidal structure with 238 residues. The chain of interest resembles specific variants of this protein; we take as an example 1EMA [6]. This specific structure contributes to the fluorescence function of GFP.
SR Protein with a short Avitag
To study the modularity of our coating, we chose fire retardancy as an additional functionalization of the material. For this reason, and building upon the work of another iGEM team (Parts Page), we chose to express the SR protein with an Avitag. The prediction was more confident for the structure of SR, as it is an already characterized protein. From (fig. 8), we noticed two small helicoidal structures that are characteristic of the SR protein geometry. The closest known protein, which can be used as a reference because it has exactly the same structure, is the Serine/arginine-rich splicing factor 1.
Benefiting from the powerful AlphaFold2 tool, we extended the modeling to the folding of our full recombinant proteins. Our initial supposition was that the model would recover the individual parts above very efficiently and would then refine the prediction in later iterations until it reached a structure that best represents reality. For those reasons, and given the need for very powerful servers, we used ColabFold resources. We ran our predictions on the Google Colab servers and obtained the results within a few hours. We were inspired by the ColabFold article [7], which describes how the algorithm combines the homology search of MMseqs2 with the structure prediction of AlphaFold2. The tool is free and accessible to anyone who wants to use it.
In the following part, both of our main proteins, 01a and 03a, are displayed and briefly analyzed:
01a recombinant protein (Design Page)
The above protein constituted our first layer of attachment to the aerogel, and its structure gave us crucial feedback on the project and on how to coat the aerogel efficiently. In (fig. 9, A), we easily identified the structures of the individual chains of interest: the blue, slightly smaller helicoidal structure is the mSA protein, which is attached to the silk protein, the elongated green structure, followed by the CBD sequence displayed in red. (fig. 9, B) confirmed the hypothesis that the three domains are predicted with high confidence while the linkers in between are hard to resolve, which is expected since we designed those linking segments to preserve the structure of the three main elements. The same graph showed that a few iterations of the prediction failed to characterize the silk segment, which was an issue in the experiments. However, both (fig. 9, C) and (fig. 9, D) confirmed that the proteins kept a structure preserving their initial aim (Design Page). In reality, the linkers give those chains more freedom of placement. The lDDT score estimates the percentage of the predicted structure that can be correctly superimposed on the true structure. In general, the higher this score, the better the model is considered.
03a recombinant protein (Design Page)
We also designed a control to prove the effective attachment of our recombinant proteins. In (fig. 10, A), we identified the three main parts of the mSA-GFP-CBD protein. Compared to the previous (fig. 9, A), the GFP domain, in green, was more easily characterized and showed a closer linkage to the mSA chain, in blue; the two domains were closely associated. For the CBD region, however, we identified a longer linker, which allows different attachment configurations. From (fig. 10, B, D), it was interesting to see that the five predictions yielded similar correlation measures and similar lDDT scores, which confirmed that the model was robust in determining the three chain structures. The results were more robust than in (fig. 9, B, D), where the silk domain differed between the five predictions. We noticed the same lack of structural definition in the linker regions.
The information extracted from (fig. 9) and (fig. 10) was crucial to troubleshoot some aspects of the purification and to predict how the proteins behave in a real environment. The visualization gave us a better understanding of the different chain sizes and confirmed the coherence of the proteins' structure for the manufacturing of the silk biofilm. This characterization was crucial for modeling the number of proteins needed for an efficient coating.
Amount of protein needed to coat our aerogel
A better understanding of the proteins helped us define the configuration of the recombinant proteins in the two different coatings that we developed during the project. We based our reasoning on the fact that the affinity of the CBD for cellulose is very high. According to the article [8], the CBD has a high affinity of 1.4 µM (µmol/L) for fibrous cellulose. We therefore assumed that both 01a and 03a align vertically on the surface.
From (fig. 11, A) and (fig. 11, C), this hypothesis appeared plausible, since the linkers between the main constitutive chains showed no particular structure. To simplify the computations, we represented the CBD region as a globule (fig. 11, B, D).
In the proposed model, we covered the aerogel surface geometrically with globular CBD entities. Since the CBD is the most probable anchoring point, each CBD globule stood for one entire protein when counting the proteins. We assumed that the proteins were placed adjacent to one another.
The first step was to consider the aerogel as a perfect cylinder coated by the proteins. In (fig. 12), the surface of the cylinder is simplified to two circles and one elongated rectangle. We then proceeded to fill this surface geometrically.
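The decomposition into two circles and one rectangle can be checked with a few lines of Python. This is a minimal sketch, assuming the dimensions from (Table 1); the function name `cylinder_surface_mm2` is ours and is not taken from the team's notebook.

```python
import math

def cylinder_surface_mm2(diameter_mm: float, height_mm: float) -> tuple[float, float]:
    """Return (area of the two end circles, area of the lateral rectangle) in mm^2."""
    radius_mm = diameter_mm / 2.0
    end_circles = 2.0 * math.pi * radius_mm ** 2
    # The unrolled lateral surface is a rectangle whose width is the circumference.
    lateral_rectangle = math.pi * diameter_mm * height_mm
    return end_circles, lateral_rectangle

# Aerogel dimensions from Table 1 (d = 30 mm, t = 4.5 mm).
circles, rectangle = cylinder_surface_mm2(30.0, 4.5)
total_mm2 = circles + rectangle
print(f"circles: {circles:.1f} mm^2, rectangle: {rectangle:.1f} mm^2, total: {total_mm2:.1f} mm^2")
```

For the idealised cylinder this gives a total of about 1838 mm², most of it on the two circular faces.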
The attaching CBD region was taken from the above model 4JO5 [5], and its hydrodynamic characteristics were computed with the HullRad algorithm [9]. This program provided the following information on the CBD protein:
Parameters | CBD Protein |
---|---|
Molecular Mass of the protein (g/mol) | 18587 |
Anhydrous Volume Sphere Radius (AVSR) (Å) | 17.43 |
Maximum Dimension (MD) (Å) | 48.38 |
Using the values from (Table 2), we populated the different surfaces of the aerogel with circular proteins using simple geometry. Both the AVSR and the MD values were studied. Considering the protein radius and the aerogel diameter, we came up with an iterative algorithm in three steps:
- Step 1: Populated the outer periphery of the larger circle. We computed the angle at which the first small protein circle is placed inside the larger aerogel circle.
- Step 2: Computed the number of small circles that can populate the periphery of the larger circle.
- Step 3: Defined a secondary, smaller circle that holds the next ring of small circles. We iterated on the new radius until no more small circles could fit in the area (fig. 12).
- Ending condition: the iteration stops when the remaining radius can no longer hold a small circle.
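The three steps and the ending condition can be sketched as follows. The angular formula used here (tangent circles of radius r whose centres sit on a ring of radius R − r), the optional central circle and the function name are assumptions made for illustration; they are not taken from the team's notebook.

```python
import math

def count_circles_in_disk(big_radius: float, small_radius: float) -> int:
    """Count small circles placed ring by ring inside a larger circle (same length unit)."""
    total = 0
    # Radius of the ring on which the centres of the outermost small circles sit.
    ring_radius = big_radius - small_radius
    while ring_radius >= small_radius:
        # Step 1: angle seen from the centre between two neighbouring tangent circles.
        theta = 2.0 * math.asin(small_radius / ring_radius)
        # Step 2: whole number of circles on this ring (epsilon guards against rounding).
        total += int(2.0 * math.pi / theta + 1e-9)
        # Step 3: move inwards to the next ring.
        ring_radius -= 2.0 * small_radius
    # Ending condition reached: one last circle may still fit at the centre.
    if ring_radius >= 0.0:
        total += 1
    return total

print(count_circles_in_disk(3.0, 1.0))  # 7: six on the outer ring plus one in the centre
```

For the aerogel, both radii must be given in the same unit, for example 15e6 nm for the circular face and 1.743 nm for the AVSR. The loop then runs once per ring, which means a few million iterations.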
In (fig. 13), we saw clearly how the circles were placed inside the allocated aerogel surface. We extended the reasoning to fill the rectangular lateral surface, as follows:
- Step 1: Divided the width of the rectangle by the diameter of a small protein circle to obtain the number of circles that fit in one line.
- Step 2: Multiplied this number by the number of lines that fit in the height of the rectangle.
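The two steps for the lateral surface amount to a square-grid count. The sketch below is an illustration under that assumption; the function name, the choice of nanometres as the working unit and the use of the AVSR as circle radius are ours.

```python
import math

def count_circles_on_rectangle(width: float, height: float, small_radius: float) -> int:
    """Count circles laid out on a square grid inside a rectangle (same length unit)."""
    diameter = 2.0 * small_radius
    per_line = math.floor(width / diameter)   # Step 1: circles in one line
    lines = math.floor(height / diameter)     # Step 2: lines stacked over the height
    return per_line * lines

# Lateral surface of the aerogel from Table 1, expressed in nm.
width_nm = math.pi * 30.0e6   # circumference of a 30 mm cylinder
height_nm = 4.5e6             # 4.5 mm
avsr_nm = 1.743               # 17.43 angstrom (Table 2)
lateral_count = count_circles_on_rectangle(width_nm, height_nm, avsr_nm)
print(f"{lateral_count:.3e} circles on the lateral surface")
```

A square grid is the simplest layout that matches the description; a hexagonal layout would fit more circles and would need a different row spacing.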
From (fig. 13, 14), we saw that simple geometrical equations allowed us to populate the surfaces of both the circle and the rectangle efficiently. We generalized this concept to different types of proteins. Taking the aerogel parameters from (Table 1) and the CBD parameters from (Table 2), we obtained the following output in number of molecules and in moles:
Chosen radius for globular representation | Protein molecules (millions) | Moles (mol) |
---|---|---|
Taking the Anhydrous Volume Sphere Radius as parameter | 5.002 × 10⁸ | 1.661 × 10⁻⁹ |
Taking the Maximum Dimension as parameter | 7.852 × 10⁸ | 1.304 × 10⁻⁹ |
The values in (Table 3) were used to define the volume of protein solution in which we soaked our hydrogels or aerogels.
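The conversion from a molecule count to moles is a division by Avogadro's number. The sketch below assumes that the counts in (Table 3) are given in millions and that the security factor Q = 2 from (Table 1) is applied before converting; both readings are assumptions, checked only against the AVSR row.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_to_moles(n_molecules: float, security_factor: float = 1.0) -> float:
    """Convert a number of molecules to moles, optionally scaled by a security factor."""
    return n_molecules * security_factor / AVOGADRO

# AVSR row of Table 3: 5.002e8 million molecules, security factor Q = 2 (assumed).
moles_avsr = molecules_to_moles(5.002e8 * 1e6, security_factor=2.0)
print(f"{moles_avsr:.3e} mol")  # 1.661e-09 mol
```

The Maximum Dimension row corresponds to the same conversion with a factor of 1.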
Implementation of the model to prepare the aerogel coating
From the protein purification and the final experiments, we measured the absorbance of our protein solutions and converted it into the corresponding concentration. For this we used the molecular mass and the extinction coefficient of our protein sequences, computed with the Expasy ProtParam tool and listed in (Table 1). The absorbance values obtained from the Nanodrop measurements for 01a and 03a (Coating Results) were 0.10 and 0.12, respectively.
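We then used the Beer-Lambert law, a simple linear relation, to get the molar concentration [10]: $c = A / (\varepsilon \, l)$, where $A$ is the measured absorbance, $\varepsilon$ the extinction coefficient (Table 1) and $l$ the optical path length (the values reported below correspond to $l = 1$ cm).
We obtained the final volume of protein solution for the experiment by dividing the required number of moles $n$ (Table 3) by this concentration: $V = n / c$.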
Using (Equation 6) and the molar weight of both proteins (Table 1), we obtained their mass concentrations, shown in (Table 4). The final volume needed to coat the aerogel, using the Anhydrous Volume Sphere Radius and the security factor, is also shown in (Table 4). Around 1.3 mL of solution to coat an aerogel with the chosen parameters seemed coherent for a total aerogel volume of about 3 mL. We used slightly different parameters in the coating experiments, due to the change of the aerogel geometry.
Quantity | 01a solution | 03a solution |
---|---|---|
Conc. (g/L) | 9.283 × 10⁻² | 9.427 × 10⁻² |
Volume of solution for coating (mL) | 1.392 | 1.260 |
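The values in this table can be reproduced from the absorbances, the extinction coefficients and the molar weights. The sketch below assumes the Beer-Lambert law with a 1 cm path length and takes the number of moles from the AVSR row of (Table 3); variable and function names are ours, and mass concentrations are expressed in g/L.

```python
def molar_concentration(absorbance: float, extinction_coeff: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration in mol/L from absorbance and extinction coefficient."""
    return absorbance / (extinction_coeff * path_cm)

# Absorbance (Nanodrop), extinction coefficient (1/(M cm)) and molar weight (g/mol).
PROTEINS = {
    "01a": {"absorbance": 0.10, "extinction": 83810.0, "molar_weight": 77798.84},
    "03a": {"absorbance": 0.12, "extinction": 91010.0, "molar_weight": 71498.66},
}
MOLES_NEEDED = 1.661e-9  # AVSR row of Table 3, security factor included

results = {}
for name, p in PROTEINS.items():
    conc_molar = molar_concentration(p["absorbance"], p["extinction"])
    mass_conc_g_per_l = conc_molar * p["molar_weight"]
    volume_ml = MOLES_NEEDED / conc_molar * 1000.0  # litres to millilitres
    results[name] = (mass_conc_g_per_l, volume_ml)
    print(f"{name}: {mass_conc_g_per_l:.3e} g/L, {volume_ml:.3f} mL")
```

Changing `MOLES_NEEDED` or the absorbances is enough to adapt the estimate to another aerogel geometry or another protein batch.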
Comparing the first model with a statistical approach
In this approach, we based our reasoning on the affinity of the CBD for cellulose to obtain the concentration needed to coat the surface of the aerogel. According to Bolam et al. [11], the observed affinity at saturation is around 14.5 µmol/g for CBD on cellulose. We considered the aerogel to be composed of fibrous cellulose. We wanted to determine the protein concentration to add for a perfect coating, meaning that every binding site on the surface of the cellulose aerogel has a corresponding CBD.
Using the simple binding equilibrium [8]:
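In its standard one-site form, this equilibrium and its dissociation constant can be written as below. The notation follows the symbols [P], [C] and [PC] defined in the text; the exact form used by the team is an assumption.

```latex
\mathrm{P} + \mathrm{C} \rightleftharpoons \mathrm{PC},
\qquad
K_d = \frac{[\mathrm{P}]\,[\mathrm{C}]}{[\mathrm{PC}]}
```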
Here [P] is the concentration of protein taking part in the reaction, [C] the concentration of free binding sites on the cellulose surface exposed to the buffer, and [PC] the concentration of binding sites occupied by a protein. If we take [P] = Kd, we reach the equilibrium where [C] = [PC], which we took as the reference point for the coating (strictly, half of the sites are occupied when [P] = Kd). The idea was to reach the dissociation constant Kd and exceed it by a specific security factor Q (Table 1):
This gave the specific protein concentration needed to saturate the surface of the aerogel. We next looked at the experimentally measured surface of the aerogel and determined the volume to coat.
Using the Keyence microscope, we obtained a better estimate of the aerogel surface, computed from 3D stitched imaging of our aerogel (fig. 15).
From the measurements displayed in (Fig. 14, B), we obtained a surface of 10.66 cm². We assumed that the coating soaks a thin layer at the surface of the aerogel and chose a thickness of 20 nm as an order of magnitude, since this is the thickness reported in the literature [12]. Knowing that the density of the cellulose aerogel is 0.149 g/cm³ (Results Page), we had to coat 3.17 × 10⁻⁶ g of sample, for a volume of 2.13 × 10⁻⁵ cm³. Using the affinity value and the security factor described in (Equation 8), we found a quantity of 9.21 × 10⁻¹⁰ mol of protein, which is coherent with the result of the first, geometrical model (Table 3).
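The chain of conversions in this paragraph can be laid out step by step. The sketch below is an illustration only: it assumes that the required amount is the saturation value (14.5 µmol/g) multiplied by the coated mass and by the security factor Q = 2, which is our reading of the text and not a confirmed formula.

```python
# Values quoted in the text; the unit handling is an assumption of this sketch.
SURFACE_CM2 = 10.66            # measured aerogel surface
THICKNESS_CM = 20e-7           # 20 nm coating depth expressed in cm
DENSITY_G_CM3 = 0.149          # aerogel density
CAPACITY_MOL_PER_G = 14.5e-6   # 14.5 umol of CBD per gram of cellulose at saturation
SECURITY_FACTOR = 2.0          # Q from Table 1

coated_volume_cm3 = SURFACE_CM2 * THICKNESS_CM
coated_mass_g = coated_volume_cm3 * DENSITY_G_CM3
protein_moles = SECURITY_FACTOR * CAPACITY_MOL_PER_G * coated_mass_g

print(f"volume: {coated_volume_cm3:.3e} cm^3")
print(f"mass:   {coated_mass_g:.3e} g")
print(f"moles:  {protein_moles:.3e} mol")
```

With these inputs the sketch reproduces the coated volume and mass and returns about 9.2e-11 mol. The leading digits agree with the quantity quoted above, while the power of ten depends on unit conventions that the text does not spell out, so the sketch should be read as a check of the method rather than of the final value.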
Conclusion
To summarize, our modeling gave us very useful information on the characteristics of our recombinant proteins 01a and 03a, their subdomains and their binding to cellulose. We used the folding predictions to assess the amount of protein needed to coat our aerogel. With simple computations and modeling, we found quantities that could be applied directly in the manufacturing of the insulative material. Notably, we verified the coherence of the geometrical model through a statistical approach, establishing good confidence in our results. The model could be improved with a finer characterisation of the aerogel surface and a study of the protein dispersion on a biofilm.
Aerogel Modeling
Building a model of a house allowed us to better assess the energy loss over 48 hours when the aerogel is used as an insulative material. For this part, we took advantage of the COMSOL Multiphysics thermal modules. We wanted to test the efficiency of the aerogel in a real-life situation. The simulation gave us both the energy saved by the insulation and the aerogel thickness that needs to be developed to meet the insulation policies.
Research and background information
Choice of parameters
For the environmental conditions, we used the weather conditions of Zurich on the 1st of December 2021. The duration of the simulation was fixed at 48 hours. We assumed that the temperature of the house was kept at 19 °C during the whole simulation. The goal of our simulation was to evaluate the quantity of heat transferred out of the house through the walls.
For the insulative aerogel representation in the model, we created a blank material and directly specified in the simulation parameters the important characteristics:
- The density, measured on the produced samples, was 146 kg/m³.
- The thermal conductivity of the aerogel was taken from the article of Lin-Yu Long et al. [13] as 0.025 W·m⁻¹·K⁻¹. We did not use the values obtained from our own test bench, since it still needed improvements before its measurements could be considered correct.
- The specific heat capacity was 1057 J/(kg·K), taken from the article of A. Lakatos et al. [14].
A Computer Aided Design (CAD) model of the house with standard parameters was used in this part. We added the insulative material layer to the full house model to look for an improvement in the energy saved.
Heat transfer theory
In this simulation, the software considered the three heat transfer mechanisms, conduction, convection and radiation, which are relevant to the study of the thermal conductivity of the aerogel [15].
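For reference, the three mechanisms are usually written in the textbook forms below. These are the standard expressions, not the exact formulation solved by the software, which is not detailed here. The symbols are assumptions of this summary: $k$ is the thermal conductivity, $h$ a convective heat transfer coefficient, $\varepsilon$ the surface emissivity, $\sigma$ the Stefan-Boltzmann constant, $T_s$ the surface temperature, and $T_\infty$ and $T_{\mathrm{amb}}$ the temperatures of the surrounding air and surroundings.

```latex
\begin{aligned}
\text{Conduction (Fourier):}\quad & \mathbf{q} = -k\,\nabla T \\
\text{Convection (Newton):}\quad & q = h\,(T_s - T_\infty) \\
\text{Radiation (Stefan-Boltzmann):}\quad & q = \varepsilon\,\sigma\,(T_s^4 - T_{\mathrm{amb}}^4)
\end{aligned}
```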
Mesh and Study
In a COMSOL simulation, the physical laws are applied to small discrete portions of the object (the mesh). The software then uses those portions of the solid to solve the heat transfer equation numerically (fig. 17).
Results and interpretation
In the following model, we compared the insulation performance of the aerogel with that of a known insulative material, expanded polystyrene (EPS) [16].
We noticed that the greater the thickness (fig. 18), the better the aerogel protects the house from heat dissipation. At a thickness of 100 mm, there was a difference of almost 50 Wh between the aerogel and EPS. We considered both insulative materials efficient at this thickness, even though the aerogel has a slight advantage. In the end, we chose to aim for an aerogel thickness of 150 mm in order to gain a clear advantage over EPS while keeping a stable mechanical structure.
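The effect of thickness can be illustrated with a much simpler calculation than the full simulation. The sketch below assumes one-dimensional, steady-state conduction through the aerogel layer alone, with the conductivity quoted earlier (0.025 W/(m·K)), and neglects surface resistances and the other wall layers. The target value passed to `thickness_for_u_value` is a hypothetical placeholder, not the Swiss requirement.

```python
def u_value(conductivity_w_mk: float, thickness_m: float) -> float:
    """Thermal transmittance of one homogeneous layer, in W/(m^2 K)."""
    return conductivity_w_mk / thickness_m

def thickness_for_u_value(conductivity_w_mk: float, target_u: float) -> float:
    """Layer thickness (m) needed to reach a target transmittance."""
    return conductivity_w_mk / target_u

K_AEROGEL = 0.025  # W/(m K), value used for the aerogel in the simulation

for thickness_mm in (50, 100, 150):
    u = u_value(K_AEROGEL, thickness_mm / 1000.0)
    print(f"{thickness_mm} mm -> U = {u:.3f} W/(m^2 K)")

# Hypothetical target, for illustration only.
required_m = thickness_for_u_value(K_AEROGEL, target_u=0.20)
print(f"thickness for U = 0.20 W/(m^2 K): {required_m * 1000:.0f} mm")
```

In this simplified picture the transmittance is inversely proportional to the thickness, so going from 100 mm to 150 mm reduces the conductive loss through the layer by one third.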
From the dissipated energy in (fig. 19), we saw a difference of 180 Wh between using the aerogel as insulative material and using the same thickness of wood. Knowing that the cost of electricity in Switzerland reached 0.27 CHF per kWh this year, we saved 48.6 CHF in 48 h, which would be a considerable amount of money over an entire year. This value showed us how important insulation materials are.
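The cost estimate is a single multiplication, and the sketch below makes the unit handling explicit. The 48.6 CHF figure is obtained when the energy difference is read as 180 kWh; reading it as 180 Wh would give about 0.05 CHF. Which reading applies to (fig. 19) is an assumption left open here, and the function name is ours.

```python
def energy_cost_chf(energy_kwh: float, price_chf_per_kwh: float = 0.27) -> float:
    """Cost of an amount of electrical energy at a given price per kWh."""
    return energy_kwh * price_chf_per_kwh

cost_if_kwh = energy_cost_chf(180.0)           # difference read as 180 kWh
cost_if_wh = energy_cost_chf(180.0 / 1000.0)   # difference read as 180 Wh
print(f"180 kWh -> {cost_if_kwh:.2f} CHF")
print(f"180 Wh  -> {cost_if_wh:.4f} CHF")
```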
Conclusion
A full house model gave us valuable information on the thickness to target for the manufactured insulative material, the macro-porous aerogel, and on how important insulation is for saving energy in a house. We could improve our model with a better characterization of the aerogel, by assessing its porous structure, and by defining the house parameters so that they are closer to reality.
References
1. Highly accurate protein structure prediction with AlphaFold. Nature, vol. 596, no. 7873, pp. 583-589.
2. Modelling three-dimensional protein structures for applications in drug design. Drug Discovery Today, vol. 19, no. 7, pp. 890-897.
3. Keyence Digital Microscope specifications for VHX-7000 series.
4. Computation tool for various physical and chemical properties of specific protein sequence.
5. CBM3a-L domain with flanking linkers from scaffoldin cipA of cellulosome of Clostridium thermocellum.
6. Green Fluorescent Protein from Aequorea victoria.
7. ColabFold: making protein folding accessible to all. Nature Methods, vol. 19, no. 6, pp. 679-682.
8. Characterization of the cellulose-binding domain of the Clostridium cellulovorans cellulose-binding protein A. Journal of Bacteriology, vol. 175, no. 18, pp. 5762-5768.
9. HullRad: Fast Calculations of Folded and Disordered Protein and Nucleic Acid Hydrodynamic Properties. Biophysical Journal, vol. 114, no. 4, pp. 856-869.
10. Spectrophotometric Determination of Protein Concentration. Current Protocols in Protein Science, vol. 33, no. 1.
11. Pseudomonas cellulose-binding domains mediate their effects by increasing enzyme substrate proximity. Biochemical Journal, vol. 331, no. 3, pp. 775-781.
12. Bioactive Silk Coatings Reduce the Adhesion of Staphylococcus aureus while Supporting Growth of Osteoblast-like Cells. ACS Applied Materials & Interfaces, vol. 11, no. 28, pp. 24999-25007.
13. Cellulose Aerogels: Synthesis, Applications, and Prospects. Polymers, vol. 10, no. 6, p. 623.
14. Experimental verification of thermal properties of the aerogel blanket. Case Studies in Thermal Engineering, vol. 25, p. 100966.
15. Parametric Model to Analyze the Components of the Thermal Conductivity of a Cellulose-Nanofibril Aerogel. Physical Review Applied, vol. 11, no. 2.
16. Thermal conductivity of expanded polystyrene (EPS) at 10°C and its conversion to temperatures within interval from 0 to 50°C. Energy and Buildings, vol. 52, pp. 107-111.