Comparison of predicted and experimental values in our DL-ecGEM model:
Loss function of DL-ecGEM
Weights obtained after training the DLkcat model. Alpha is the power of the protein embedding and beta is the power of the substrate embedding. Different alpha-to-beta ratios represent different relative weights of the substrate and the enzyme in the reaction, and point to key sequence sites that affect enzyme-substrate binding.
Because wet-lab work was limited by the epidemic, we used our optimized DL-ecGEM for flux analysis to simulate the wet-lab experiment, and obtained satisfactory results.
The acarbose exchange reaction was set as the objective function, and maltose was used as the sole carbon source. We then calculated flux control coefficients (FCC values) for the enzymes on the metabolic pathway and selected the top 20 key enzymes by FCC value.
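To make the FCC ranking concrete, here is a minimal sketch on a toy pathway, not the DL-ecGEM itself. It assumes a linear pathway whose enzymes share one protein pool, so the maximum flux has a closed form, and it estimates each FCC by increasing one Kcat by a small relative step and measuring the relative change in flux. The enzyme names, Kcat values, molecular weights, and pool size are all hypothetical.

```python
# Toy enzyme-constrained pathway (hypothetical numbers, not the real model).
# Every step carries the same flux v and needs v / kcat_i units of enzyme i,
# weighted by molecular weight mw_i. A shared protein pool caps the total,
# so the maximum flux is pool / sum(mw_i / kcat_i).

def max_flux(kcats, mws, pool):
    """Maximum pathway flux allowed by the shared protein pool."""
    cost_per_flux = sum(mws[e] / kcats[e] for e in kcats)
    return pool / cost_per_flux


def flux_control_coefficients(kcats, mws, pool, rel_step=1e-3):
    """Finite-difference FCC: relative flux change per relative kcat change."""
    base = max_flux(kcats, mws, pool)
    fccs = {}
    for enzyme in kcats:
        perturbed = dict(kcats)
        perturbed[enzyme] = kcats[enzyme] * (1.0 + rel_step)
        new = max_flux(perturbed, mws, pool)
        fccs[enzyme] = ((new - base) / base) / rel_step
    return fccs


def top_enzymes(fccs, n):
    """Names of the n enzymes with the largest FCC, highest first."""
    return sorted(fccs, key=fccs.get, reverse=True)[:n]


# Hypothetical three-enzyme pathway.
kcats = {"E1": 50.0, "E2": 2.0, "E3": 10.0}
mws = {"E1": 40.0, "E2": 60.0, "E3": 50.0}
fccs = flux_control_coefficients(kcats, mws, pool=100.0)
ranking = top_enzymes(fccs, n=2)
```

In this toy setting the slowest, heaviest enzyme (E2) dominates the protein cost and therefore gets the largest FCC. In a genome-scale model the same perturb-and-resimulate idea would be applied with a flux balance solver in place of the closed-form `max_flux`.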
We submitted the top ten enzymes to HotSpot Wizard to find mutation hotspots, then ranked all recommended amino acid sites and the high-frequency amino acid substitutions recommended for each protein. We re-predicted Kcat values for all possible new protein sequences and obtained five updated Kcat values that showed more significant increases.
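The enumerate-and-re-predict step can be sketched as follows. This is an illustration under stated assumptions, not our actual pipeline: `recommendations` is a hypothetical mapping from 1-based hotspot positions to suggested amino acids, and `predict_kcat` is a placeholder for whatever Kcat predictor is used. The `toy_predictor` in the usage lines is a stand-in with no biological meaning, included only so the sketch runs.

```python
from itertools import product


def enumerate_variants(sequence, recommendations):
    """All sequences reachable by combining the recommended substitutions.

    recommendations maps a 1-based position to an iterable of amino acids.
    The wild-type residue stays available at every hotspot, so single,
    double, ... mutants are all produced; the wild type itself is skipped.
    """
    positions = sorted(recommendations)
    choices = []
    for pos in positions:
        wild_type = sequence[pos - 1]
        options = [wild_type] + [aa for aa in recommendations[pos] if aa != wild_type]
        choices.append(options)

    variants = []
    for combo in product(*choices):
        residues = list(sequence)
        for pos, aa in zip(positions, combo):
            residues[pos - 1] = aa
        variant = "".join(residues)
        if variant != sequence:
            variants.append(variant)
    return variants


def rank_improved_variants(sequence, recommendations, predict_kcat, top_n=5):
    """Score every variant and keep the top_n that beat the wild type."""
    baseline = predict_kcat(sequence)
    scored = [(predict_kcat(v), v) for v in enumerate_variants(sequence, recommendations)]
    improved = sorted((item for item in scored if item[0] > baseline), reverse=True)
    return baseline, improved[:top_n]


# Usage with a placeholder predictor (NOT a real kcat model).
def toy_predictor(seq):
    return 1.0 + seq.count("A")


variants = enumerate_variants("MKTLV", {2: "AR", 4: "A"})
baseline, improved = rank_improved_variants("MKTLV", {2: "AR", 4: "A"}, toy_predictor)
```

The number of variants grows multiplicatively with the number of hotspots, so for real proteins it may be necessary to cap the number of simultaneous substitutions.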
We then updated these five Kcat values in our metabolic model and repeated the flux simulations, and observed an elevated flux through the acarbose exchange reaction. Our software model thus supports the rationality of the modification scheme.
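As a rough illustration of why raising Kcat values can raise the objective flux, the sketch below compares flux before and after an update. It assumes a simplified linear pathway limited by a shared protein pool, with hypothetical enzyme names and numbers. The real comparison was made by re-running the flux simulation on the full model.

```python
def pooled_flux(kcats, mws, pool):
    """Max flux of a linear pathway limited by a shared protein pool."""
    return pool / sum(mws[e] / kcats[e] for e in kcats)


def flux_fold_change(kcats, mws, pool, updated_kcats):
    """Return (flux before, flux after, fold change) for a kcat update."""
    unknown = set(updated_kcats) - set(kcats)
    if unknown:
        raise KeyError(f"enzymes not in model: {sorted(unknown)}")
    before = pooled_flux(kcats, mws, pool)
    merged = {**kcats, **updated_kcats}  # updated values override the old ones
    after = pooled_flux(merged, mws, pool)
    return before, after, after / before


# Hypothetical values: doubling the kcat of the costliest enzyme.
kcats = {"E1": 50.0, "E2": 2.0, "E3": 10.0}
mws = {"E1": 40.0, "E2": 60.0, "E3": 50.0}
before, after, fold = flux_fold_change(kcats, mws, 100.0, {"E2": 4.0})
```

Here the enzyme cost per unit flux drops from 35.8 to 20.8, so the achievable flux rises by the ratio of the two.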
To construct our model, we read a large amount of relevant literature on MACCS, GNN, CNN, and related methods, and ran a large number of simulation experiments in parallel. We obtained the final DL-ecGEM model after comparing the results of these experiments.
The references are as follows:
[1] Wang Y, Xu N, Ye C, Liu L, Shi Z and Wu J (2015) Reconstruction and in silico analysis of an Actinoplanes sp. SE50/110 genome-scale metabolic model for acarbose production. Front. Microbiol. 6:632. doi: 10.3389/fmicb.2015.00632
[2] Sánchez B, Zhang C, Nilsson A, Lahtvee P-J, Kerkhoven E, Nielsen J (2017) Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints. Molecular Systems Biology 13. https://doi.org/10.15252/msb.20167411
[3] Gu D, Jian X, Zhang C, Hua Q (2016) Reframed genome-scale metabolic model to facilitate genetic design and integration with expression data. IEEE/ACM Trans Comput Biol Bioinform https://doi.org/10.1109/TCBB.2016.2576456
[4] Domenzain, I., Sánchez, B., Anton, M. et al. Reconstruction of a catalogue of genome-scale metabolic models with enzymatic constraints using GECKO 2.0. Nat Commun 13, 3766 (2022). https://doi.org/10.1038/s41467-022-31421-1
[5] Schellenberger, J., Que, R., Fleming, R. et al. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc 6, 1290–1307 (2011). https://doi.org/10.1038/nprot.2011.308
[6] Lu, H., Li, F., Sánchez, B.J. et al. A consensus S. cerevisiae metabolic model Yeast8 and its ecosystem for comprehensively probing cellular metabolism. Nat Commun 10, 3586 (2019). https://doi.org/10.1038/s41467-019-11581-3
[7] Li, F., Yuan, L., Lu, H. et al. Deep learning-based kcat prediction enables improved enzyme-constrained model reconstruction. Nat Catal 5, 662–672 (2022). https://doi.org/10.1038/s41929-022-00798-z
[8] Bendl J, Stourac J, Sebestova E, Vavra O, Musil M, Brezovsky J, Damborsky J (2016) HotSpot Wizard 2.0: automated design of site-specific mutations and smart libraries in protein engineering. Nucleic Acids Res 44(W1):W479-87. https://doi.org/10.1093/nar/gkw416