Computational tools based on genome-scale metabolic models (GSMMs) have significantly advanced our capability to find non-obvious engineering targets for microbial cell factories. Most available software tools focus on predicting gene knockout effects and do not include the identification of putative gene amplification targets. Flux scanning based on enforced objective flux (FSEOF) is a prominent algorithm for identifying gene amplification targets and has been successfully used to optimize cell factories. However, FSEOF has not been available as a stand-alone tool, and, to the best of our knowledge, no code repository implementing FSEOF is publicly available. We therefore developed a user-friendly command line tool that applies FSEOF to any metabolite of interest in a GSMM. Our tool requires only three inputs from the user: an SBML file of a GSMM, the biomass-reaction ID of the GSMM, and the ID of the reaction that should be improved. The output is an Excel file with ranked reaction targets for overexpression. Our software thus identifies non-obvious genetic engineering targets for amplification that affect a metabolite of interest within one hour. The software is freely available at our team’s GitLab repository.
The goal of metabolic engineering for industrial applications is the overproduction of metabolites with the help of microorganisms. While early efforts relied on single modifications in metabolic pathways, modern metabolic engineering approaches take a much more systematic view of biological systems. This shift is primarily fueled by advances in computational methods (Woolston et al., 2013). One of the most prominent computational methods in metabolic engineering is flux balance analysis (FBA) for the analysis of genome-scale metabolic models (GSMMs). GSMMs are mathematical representations of all known chemical reactions within an organism. FBA calculates the fluxes through all reactions of a GSMM based on specific mathematical constraints (Orth et al., 2010). To learn more about the general background of FBA and GSMMs, take a look at our modeling page.
Many tools using FBA with GSMMs are available to find genetic targets for metabolic engineering. Most of them, however, focus on predicting gene knockout effects and do not include the identification of putative gene amplification targets. Flux scanning based on enforced objective flux (FSEOF) is a prominent algorithm for identifying gene amplification targets and has been successfully used to optimize cell factories (Choi et al., 2010; Park et al., 2012). As no stand-alone FSEOF software tool is currently available and, to the best of our knowledge, no public code repositories can be found online, we decided to develop a user-friendly command line tool that utilizes the FSEOF algorithm for the identification of genetic overexpression and downregulation targets.
When we started using FBA to analyze our MonChassis yeast strains, we quickly realized that available tools and methods can be challenging to use without prior experience with FBA and GSMMs. Many tools are available in the COBRA toolbox for MATLAB. However, using the COBRA toolbox requires a MATLAB license and knowledge of the MATLAB programming language. Even though iGEM teams had free access to MATLAB for the duration of their project in recent years, many members of the iGEM community would benefit from free access to software tools for metabolic engineering beyond their iGEM project. Hence, we wanted our software to be as accessible and easy to use as possible, especially for users with little experience with FBA and GSMMs. We aimed for software that can be used without programming knowledge and is built entirely on open-source libraries.
To keep our software easy to use, we decided that the basic usage of the FSEOF algorithm to identify metabolic engineering targets should require only three inputs from the user: an SBML file of a GSMM, the biomass-reaction ID of the GSMM, and the ID of the reaction that should be improved. SBML files for the most common chassis organisms can be easily downloaded from existing databases (e.g. BioModels). If the reaction of interest is an endogenous reaction of the organism, no further modification of the downloaded SBML file is needed. If the reaction of interest is part of a heterologously introduced pathway, the reactions of this pathway need to be added to the SBML file. This can be easily achieved with various web-based applications for modifying SBML files, like ESCHER-FBA or fluxer. Users with programming experience are advised to edit their SBML files with the COBRApy library. The biomass ID and the reaction ID can be quickly found by searching the SBML file for the term "biomass" and for the name of a metabolite involved in the reaction of interest. The FSEOF software returns an Excel file that contains the ranked overexpression and downregulation targets.
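The search for the biomass ID and the reaction ID described above can also be scripted. The following is a minimal sketch using only the Python standard library; it is not part of the FSEOF software. The helper name find_reactions and the tiny embedded SBML snippet (including the reaction names) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a real SBML file. The reaction names are invented
# for this illustration and do not come from a published model.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfReactions>
      <reaction id="r_0001" name="glucose transport" reversible="false"/>
      <reaction id="r_4041" name="biomass pseudoreaction" reversible="false"/>
      <reaction id="r_4269" name="example product exchange" reversible="false"/>
    </listOfReactions>
  </model>
</sbml>
"""


def find_reactions(sbml_text, term):
    """Return (id, name) pairs of reactions whose id or name contains `term`.

    Hypothetical helper: a case-insensitive substring search over the
    <reaction> elements of an SBML document.
    """
    root = ET.fromstring(sbml_text.strip())
    term = term.lower()
    hits = []
    for elem in root.iter():
        # Drop the XML namespace prefix so the check works for any SBML level.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag != "reaction":
            continue
        rid = elem.get("id", "")
        name = elem.get("name", "")
        if term in rid.lower() or term in name.lower():
            hits.append((rid, name))
    return hits


biomass_hits = find_reactions(SBML_TEXT, "biomass")
product_hits = find_reactions(SBML_TEXT, "product")
print(biomass_hits)
print(product_hits)
```

For a real model, the file content can be read with open(path, encoding="utf-8").read() and passed to the same function. The search term for the reaction of interest would be the name of one of its metabolites.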
To build our software on the foundation of open-source libraries, we decided to implement our FSEOF software in Python using the COBRApy library (Ebrahim et al., 2013). COBRApy has most of the functionalities of the COBRA toolbox, but does not require a MATLAB license. Therefore, users with programming experience can easily adjust the software to their needs or integrate the FSEOF algorithm into their own pipelines.
Our FSEOF software for finding genetic targets for overexpression and downregulation requires eight steps for the installation:
pip install -r requirements.txt
and press enter. (Note: This is only necessary if you use the FSEOF software for the first time.)
python run_FSEOF.py NameOfYourSBMLFile BiomassID reactionID
and press enter. The results are stored in an Excel file.
Example: python run_FSEOF.py yeast_gem.xml r_4041 r_4269
Users can adjust the parameters of the FSEOF algorithm according to their needs with command line flags. However, this is not necessary for the basic functionality of the FSEOF software:
--steps
Default: 30
Adjust the number of steps that the FSEOF algorithm uses to gradually increase the enforced flux from its minimum value to its theoretical maximum.
Example: python run_FSEOF.py yeast_gem.xml r_4041 r_4269 --steps 40
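To illustrate what the number of steps means, the sketch below computes a series of enforced flux values between a minimum and a theoretical maximum. It is an illustration only: the function name enforced_flux_levels is hypothetical, and the even spacing with both endpoints included is an assumption, not necessarily how the FSEOF software spaces its steps.

```python
def enforced_flux_levels(min_flux, max_flux, steps=30):
    """Return `steps` evenly spaced flux values from min_flux to max_flux.

    Hypothetical helper. Both endpoints are included here, which is an
    assumption about the scan and not taken from the FSEOF software.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (max_flux - min_flux) / (steps - 1)
    return [min_flux + i * width for i in range(steps)]


# Five levels between 0.0 and 2.0 (flux values are arbitrary examples).
levels = enforced_flux_levels(0.0, 2.0, steps=5)
print(levels)
```

In an FSEOF scan, each of these values would be enforced as the flux through the reaction of interest before the model is solved again, so a larger number of steps gives a finer scan at the cost of more optimizations.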
--constrainBiomass
Default: False
Add a new constraint during FBA that constrains the flux through the biomass reaction of the GSMM at each iteration of the FSEOF algorithm. Constraining the flux through the biomass reaction can improve the accuracy of the results, as biologically implausible solutions (e.g. no growth) are reduced. By default, the biomass flux is constrained to 95% of its maximal value. This value can be adjusted with the --changeBiomassConstrain flag.
Example: python run_FSEOF.py yeast_gem.xml r_4041 r_4269 --constrainBiomass
--changeBiomassConstrain
If the user decides to constrain the biomass, they can adjust the percentage of the maximal biomass flux they want to enforce. The value is passed as a fraction (e.g. 0.80 for 80%); without this flag, 95% is used.
Example: python run_FSEOF.py yeast_gem.xml r_4041 r_4269 --constrainBiomass --changeBiomassConstrain 0.80
--useFVA
Default: False
If this flag is used, flux variability analysis (FVA) instead of FBA is used to find the targets. This might improve the accuracy of the results, but will drastically increase the software's runtime. For more information on using FVA for the FSEOF algorithm, users are referred to Park et al. (2012).
Example: python run_FSEOF.py yeast_gem.xml r_4041 r_4269 --useFVA
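Users who want to wrap the tool in their own pipeline may find it helpful to see the documented command line interface as a parser definition. The following argparse sketch mirrors the three positional inputs and the four flags described above. It is a reconstruction from this description, not the actual source of run_FSEOF.py; the function name build_parser and the argument names sbml_file, biomass_id and reaction_id are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented interface (a sketch only)."""
    parser = argparse.ArgumentParser(prog="run_FSEOF.py")
    # The three required inputs, in the documented order.
    parser.add_argument("sbml_file", help="SBML file of the GSMM")
    parser.add_argument("biomass_id", help="ID of the biomass reaction")
    parser.add_argument("reaction_id", help="ID of the reaction to improve")
    # Optional flags with the documented defaults.
    parser.add_argument("--steps", type=int, default=30,
                        help="number of enforced-flux steps")
    parser.add_argument("--constrainBiomass", action="store_true",
                        help="constrain the biomass flux at each iteration")
    parser.add_argument("--changeBiomassConstrain", type=float, default=0.95,
                        help="fraction of the maximal biomass flux to enforce")
    parser.add_argument("--useFVA", action="store_true",
                        help="use FVA instead of FBA")
    return parser


# Parse the same arguments as in the --changeBiomassConstrain example.
args = build_parser().parse_args(
    ["yeast_gem.xml", "r_4041", "r_4269",
     "--constrainBiomass", "--changeBiomassConstrain", "0.80"]
)
print(args)
```

Flags that are not given keep their defaults, so in this call the number of steps stays at 30 and FVA stays disabled.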
Choi, H.S. et al. (2010) ‘In Silico Identification of Gene Amplification Targets for Improvement of Lycopene Production’, Applied and Environmental Microbiology, 76(10), pp. 3097–3105. Available at: https://doi.org/10.1128/AEM.00115-10.
Ebrahim, A. et al. (2013) ‘COBRApy: COnstraints-Based Reconstruction and Analysis for Python’, BMC Systems Biology, 7(1), p. 74. Available at: https://doi.org/10.1186/1752-0509-7-74.
Orth, J.D., Thiele, I. and Palsson, B.Ø. (2010) ‘What is flux balance analysis?’, Nature Biotechnology, 28(3), pp. 245–248. Available at: https://doi.org/10.1038/nbt.1614.
Park, J.M. et al. (2012) ‘Flux variability scanning based on enforced objective flux for identifying gene amplification targets’, BMC Systems Biology, 6(1), p. 106. Available at: https://doi.org/10.1186/1752-0509-6-106.
Woolston, B.M., Edgar, S. and Stephanopoulos, G. (2013) ‘Metabolic Engineering: Past and Future’, Annual Review of Chemical and Biomolecular Engineering, 4(1), pp. 259–288. Available at: https://doi.org/10.1146/annurev-chembioeng-061312-103312.