Germinant Receptors

Aim

To establish a sense-and-response system using B. subtilis spores, we need control over the germination cascade. Several ideas were initially considered for achieving this control, but we eventually focused on nutrient-triggered germination. Within the sporulating bacterial classes Bacilli and Clostridia, this is mediated by Ger receptors that recognise small-molecule nutrients and trigger a germination cascade in response [1]. The GerA receptor in B. subtilis was chosen as our target, as it is the only known receptor capable of inducing germination in the presence of a single germinant (L-alanine) and can function independently of other receptors. For example, GerB, which is thought to increase the sensitivity of GerA, only triggers germination in conjunction with other receptors such as GerK, in the presence of fructose, K⁺ and L-asparagine [2]. Our aim is to alter GerA receptor specificity so that germination is triggered in response to our fungal biomarker, N-acetylglucosamine (NAG). To assist in this aim, the dry lab sought to design a mutant library of germinant receptors, which could include a mutant with the desired phenotype, to be tested in the lab.

Method

Narrowing the Search Space
Problem with the library size
Initially, we considered conducting directed evolution using error-prone PCR to alter receptor specificity to NAG. However, due to logistical constraints, we only had the capacity to produce and test a mutant library on the order of 10⁵ combinations whereas the number of possible combinations produced by the PCR is on the order of 20¹²²⁰ (for the GerA receptor with a length of 1220 residues). This would leave us with a negligible chance of success. To address this, we identified key residues involved in receptor function and limited our mutations to those.
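The scale of this mismatch can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper `max_saturated_positions` is a hypothetical name, and "full saturation" (trying all 20 amino acids at every chosen position) is just one assumed way of spending the screening budget, not necessarily how the library was designed.

```python
import math

N_RESIDUES = 1220        # receptor length quoted above
N_AMINO_ACIDS = 20       # standard amino acid alphabet
LIBRARY_CAPACITY = 1e5   # approximate number of variants that could be screened

# log10 of 20**1220, computed in log space to avoid handling a huge integer
log10_full_space = N_RESIDUES * math.log10(N_AMINO_ACIDS)

def max_saturated_positions(capacity, alphabet=N_AMINO_ACIDS):
    """Largest k such that alphabet**k <= capacity (illustrative helper)."""
    k = 0
    while alphabet ** (k + 1) <= capacity:
        k += 1
    return k

print(f"Full sequence space: about 10^{log10_full_space:.0f} variants")
print(f"Positions that could be fully saturated: {max_saturated_positions(LIBRARY_CAPACITY)}")
```

Under these assumptions only a handful of positions can be explored exhaustively, which is why the choice of residues matters so much.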

Understanding the GerAB receptor
Receptors in the Ger family have 3 subunits: A, which is thought to mediate DPA release and cortex-lytic enzyme SleB [3]; B, which shares homology with the APC superfamily of amino acid transporters and is thought to act as a water and ion channel [2]; and C, which has an unknown function.

Putative binding sites for L-alanine in GerA have been identified in both the A and B subunits through mutagenesis studies [4][5], but we chose to focus on the B subunit, as this is where sugar binding sites have been found in sugar-specific Ger receptors [6]. This choice was also validated in discussion with spore germination experts Professor David Rudner and Dr. Graham Christie.

In our approach to limiting the residues we mutate we have made 2 assumptions:
1. That all Ger germinant receptors and receptor subunits function similarly.
2. That changes to one binding site in the germinant receptor are sufficient to cause the correct conformational change to induce germination.

Working from these assumptions, we aimed to limit our mutations to residues involved in a single binding site. Our assumptions were corroborated by Rudner and Christie, as well as by another expert, Professor Peter Setlow. We also communicated with structural biologist Dr. Luke Yates, who suggested also considering residues lining the binding pocket that may obstruct the approach of NAG, as it is considerably larger than L-alanine.

Both structural and sequence-based methods were used to better understand the process of ligand binding and the location of our binding site. As the crystal structure of GerA has not yet been determined, we used AlphaFold Multimer [7] to obtain a structure of the receptor complex. To our knowledge, this has not been done previously. AlphaFold structures were validated using the EVcouplings [8] software, which infers protein structure from evolutionary sequence covariation, and by structural alignment using TM-align [9] with the homologous protein GkApcT, which has previously been used in homology modelling of GerAB.

For a sequence-based understanding of receptor function, multiple sequence alignment (MSA) using Clustal Omega [10] was conducted on B subunits from across the family of germinant receptors to identify conserved and highly variable regions. The most strongly conserved regions are thought to preserve the structure of the receptor, whereas the variable regions are thought to be related to specificity.

Throughout the project we looked into using coarse-grained GROMACS [11] simulations as well as normal mode analysis to gain a better understanding of the interactions between different subunits and the conformational change induced by ligand binding. However, these simulations were not suited to the timeframe of our project and could be revisited in future research.

Selecting residues of interest
Residues of interest were selected through a combination of methods. Potential binding pocket residues were identified through flexible docking of L-alanine and NAG within GerAB, and of glucose within GerKB, using AutoDock Vina [12], and were compared to binding residues identified through mutagenesis studies in the literature [4]. The broader binding pocket was identified using P2Rank [13]. We identified potential bottleneck residues lining the channel using CaverDock [14]. Residues were weighted more strongly if they were predicted to be significant by multiple methods.
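The idea of weighting residues by agreement between methods can be sketched as a simple vote count. This is a minimal illustration, not the scheme actually used: the residue labels are placeholders rather than residues from our analysis, and giving every method an equal vote is an assumption.

```python
from collections import Counter

# Hypothetical residue sets per method (placeholder labels, not real results)
predictions = {
    "docking":    {"res_A", "res_B", "res_C"},
    "literature": {"res_A", "res_C"},
    "pocket":     {"res_A", "res_B", "res_D"},
    "tunnel":     {"res_A", "res_D"},
}

def consensus_weights(predictions):
    """Count how many methods flag each residue (one vote per method, assumed)."""
    counts = Counter()
    for residues in predictions.values():
        counts.update(residues)
    return counts

weights = consensus_weights(predictions)
# Rank by number of supporting methods, breaking ties alphabetically
ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
for residue, votes in ranked:
    print(residue, votes)
```

A weighted vote (for example, trusting mutagenesis data more than docking) would only require replacing the unit increment with a per-method weight.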

Designing chimeras

B. subtilis spores contain 3 germinant receptors: GerA, GerB and GerK. We also considered GerK as a target for mutations, as it germinates in response to glucose, which is structurally much more similar to NAG than L-alanine is. In searching for a means to link GerK specificity to GerA’s independent germination pathway, we came across the GerU receptor in B. megaterium. This receptor can form complexes with two different B subunits (UB and VB), which offer different specificities. Functional GerV receptors containing chimeras of the two subunits have been successfully created [15]. This interchangeability is possibly due to the high identity between the flanking regions of the UB and VB sequences, suggesting that it is important that the residues interacting with the A and C subunits are conserved. Based on this research, we hypothesised that we may also be able to create a functional GerA AB-KB chimera that germinates in response to glucose by conserving native GerA subunit interactions, with a KB centre for binding.

The similarity of the KB and AB subunits was assessed using MSA and structural alignment with TM-align. Key interacting residues between the subunits were identified using EVcouplings and the HADDOCK webserver [16][17]. These were compared with interactions between the AA-KB-AC subunits. With the interacting residues identified, we then created a sequence for the chimera to be tested in the lab.

Results

Our AlphaFold Multimer structure of the germinant receptor can be seen in Figure 1. Notably, the structure of the B subunit was slightly different in the complex than when modelled alone. Transmembrane helix 4 of the B subunit and transmembrane helix 6 of the A subunit display a high number of evolutionary covariance pairs, suggesting that these are an important site of interaction and validating our structure, which places the two regions next to each other. The placement of the C subunit is less clear. This is because it is much more disordered than the other two subunits, which makes it harder for AlphaFold to model, and because it is only anchored to the membrane by a short diacylglycerol anchor at its N terminus. Professor Mark Isalan suggested this could imply that the C subunit acts as a cap to mediate ligand access, as it has the freedom to move on its N-terminal hinge.

Figure 1: AlphaFold Multimer generated structure of the GerA complex. Subunits are colour-coded and labelled. Interacting residues between subunits were identified using EVcouplings and are highlighted.

Structural alignment between B. subtilis Ger receptors yielded high TM-align TM-scores ranging from 0.85 to 0.95, which is consistent with their similarity of function. This is despite sequence identities of only 21-32%. The structural and sequence similarity of the B. megaterium GerUB and GerVB subunits was noteworthy. They are almost identical in structure (TM-score 0.99) and share 80% sequence identity despite having different ligand specificities. This similarity is possibly key to forming stable and functional complexes with the same A and C subunits.

Figure 2: AlphaFold modelled structure of GerAB with conserved sequence regions coloured in green.

Residues of interest
MSA of Ger receptor B subunits showed that disordered regions at the end of alpha helices 1, 2, 6, and 8 were predicted to be highly conserved (Fig 2), suggesting functional significance. These correspond to literature suggesting that helices 2, 3, 6 and 8 are the most important for binding [2], and the location of a putative binding site [4].

Figure 3: Binding site identification. A) GerAB subunit with residues of interest coloured. B) MSA of B subunits of Ger receptors. C) NAG binding site identified in GerAB through AutoDock. D) Tunnel to binding site identified using CaverDock with bottleneck pictured.

In narrowing down the residues of interest (Fig 3), we found the binding site with the highest affinity for L-alanine in GerAB to be located in the putative binding site described in the literature (Fig 4). This is also supported by finding similarly high binding affinities for glucose and N-acetylglucosamine within the homologous region in the GerKB subunit. Extra channel residues located in clusters identified through other means were added to the list of residues of interest to increase the range of possible mutations (Fig 4b). All but one of the residues of interest are located on helices 1, 3, 6, and 8 (Fig 4a), supporting the idea that the conserved regions previously identified are involved in receptor function.

Figure 4: A) GerAB structure with the putative binding site from the literature coloured in dark blue and green. TM regions exposed in the binding pocket (1, 3, 6, and 8) are coloured cyan. B) Table of the residues of interest, stating the methods through which they were identified and the TM region in which they are located.

Chimera
HADDOCK docking of the GerAA and GerAB subunits produced a conformation almost identical to our original AlphaFold structure. As previously stated, TM4 of the B subunit and TM6 of the A subunit were identified as an important site of interaction, so a chimeric GerAB-KB sequence was created by replacing regions of GerKB in TM4 with the corresponding GerAB sequence (Fig 5).

Figure 5: A) AlphaFold Multimer modelled structure of the GerAB-KB chimera with GerAB TM4 highlighted. B) MSA of the B subunits of the three Ger receptors in B. subtilis. Residues present in at least two of the sequences are highlighted and transmembrane regions of the sequences are indicated.

Linker Design

Aim

Chitinase has previously been successfully incorporated into spore display [18], although with reduced bioactivity. This decrease in bioactivity is usually due to inhibitory interactions between the anchor protein and the enzyme, or to misfolding of the fusion protein [20]. From the literature, we found that a standard way of addressing this is to introduce a linker between the chitinase and the anchor protein to separate them. There is a range of standard linkers with varied properties available [20], but these are not necessarily optimised for our particular fusion protein. With this in mind, we designed and investigated a range of linkers for our fusion protein, optimising for variables such as anchor protein, flexibility and length.

Method

Deciding the anchor protein:
Two anchor proteins were considered, CotG and CotZ, as they are located on the outer crust of the spore [21] and are therefore accessible to the chitin polymer. The optimal anchor protein was chosen by predicting potential disulfide bonds between the anchor protein and chitinase with DiANNA [27] to assess stability, and by docking each anchor protein with the enzyme to assess their affinity. The anchor proteins and chitinase were taken from UniProt: CotG (ID: P39801), CotZ (ID: Q08312) and chitinase (ID: Q21017).

Establishing amino acid composition and linker length:
Two of the standard linker types used widely in the literature are rigid (generally repeats of EAAAK) and flexible (generally repeats of combinations of glycine and serine residues) [26]. Rigid linkers offer guaranteed separation between the two proteins, with the risk of impeding protein mobility, whereas flexible linkers offer mobility of the connected functional domains at the risk of allowing the domains to interact. The glycine:serine (G:S) ratio in flexible linkers can be varied to enhance flexibility or solubility. The greater the glycine content, the higher the flexibility, but also the greater the chance of the linker interacting with itself. By varying the number of repeated units present, linkers can be optimized to achieve appropriate separation of functional domains.

To determine the optimal linker length and amino acid composition we tested both standard rigid and flexible linkers, using one to five repeats of each type. Both structural and sequence-based approaches were used.

For the structure-based approach, the linker structure was generated with ColabFold [25] and input into CABS-flex [23] to assess the flexibility of the structure. CABS-flex generates root mean square fluctuation (RMSF) values, which measure how much each residue fluctuates, and protein contact maps, which suggest interactions. The sequence-based approach used ExPASy ProtParam [24] to identify the number of potential cleavage sites for each linker.
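For readers unfamiliar with RMSF, the quantity can be written down directly from its definition: for each residue, the square root of the mean squared deviation of its position from its time-averaged position. The sketch below illustrates that definition on a toy array; it is not the CABS-flex implementation, and it assumes the frames have already been superimposed.

```python
import numpy as np

def rmsf(trajectory):
    """Per-residue RMSF.

    trajectory: array of shape (n_frames, n_residues, 3) holding
    already-aligned coordinates (e.g. one C-alpha atom per residue).
    """
    mean_position = trajectory.mean(axis=0)                       # (n_residues, 3)
    squared_dev = ((trajectory - mean_position) ** 2).sum(axis=2) # (n_frames, n_residues)
    return np.sqrt(squared_dev.mean(axis=0))                      # (n_residues,)

# Toy example: residue 0 is fixed, residue 1 oscillates by +/-1 along x
toy = np.zeros((4, 2, 3))
toy[::2, 1, 0] = 1.0
toy[1::2, 1, 0] = -1.0
toy_rmsf = rmsf(toy)
print(toy_rmsf)
```

A linker whose residues show larger RMSF values than the neighbouring domains is behaving as a flexible tether, which is the property compared between linker designs here.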

Results

Deciding the anchor protein:
From the results we found that CotZ was predicted to form disulfide bonds with the chitinase at the positions shown in the table in Figure 6a. In contrast, CotG was predicted to have a low probability of disulfide bond formation with the chitinase. Additionally, when CotG and CotZ were docked with chitinase using HADDOCK [17] and scored using PRODIGY [28], CotG had a lower affinity for the chitinase (-11.9 kcal/mol) than CotZ (-16.4 kcal/mol). We therefore recommended CotG over CotZ as the anchor protein of choice to the wet lab.

Figure 6: Anchor protein selection. A) Structure of a fusion protein with the 4Gx3 linker (grey); potential disulfide bond formation is shown in red. The residues involved in disulfide bond formation are listed in the table. B) Active site of chitinase (UniProt ID: Q21017). The red colour indicates the potential binding site of the chitinase. Below are the docking and affinity scores of the anchor proteins with the active site of chitinase. 4G: GGGGS; the number (x2, x3 etc.) gives the number of repeats.

Establishing amino acid composition and linker length:
ExPASy ProtParam was run on the full fusion protein as well as on the linker alone. From our analysis we found that rigid linkers had more protease cleavage sites than flexible linkers (Fig. 7a). Thus, we narrowed our search to flexible linkers. Starting with a standard flexible linker, GGGGS (4G), we found that (GGGGS)₃ (4Gx3) produced a maximum RMSF of 7.13. Although more repeating units resulted in a higher RMSF (for example, 4Gx5 had a maximum RMSF of 12.5), they also produced more interactions with other residues, as shown in the contact maps (Fig. 7b). After determining the optimal length, we assessed several glycine:serine ratios. We concluded that a linker with an 80:20 glycine (G):serine (S) ratio and a length of 15 amino acids (4Gx3) was the optimal design, due to its high flexibility and low interaction with other residues.


Figure 7: Number of protease cleavage sites and flexibility of various types of linkers. A) Number of protease cleavage sites of each linker when attached to different anchor proteins (CotG and CotZ). B) Protein contact maps generated from CABS-flex for each 4G linker with different numbers of repeating units (2 to 5). The scale at the bottom indicates the strength of interaction, with yellow and black suggesting low and high interaction respectively. C) RMSF values of the different 4G linkers generated from CABS-flex. 4G: GGGGS; AH: EAAAK; the number (x2, x3 etc.) gives the number of repeats.

Soil Microbiome

Introduction

The plant microbiome consists of many microorganisms competing and coexisting in the soil and surrounding the stem, leaves and roots. It has been shown that the microbiome plays an important role in plant health [29]. For example, the rhizosphere surrounds the roots and can be directly involved in plant nutrient uptake, thus affecting the plant's growth, while the microbiome in the soil and above ground can protect against infection as well as abiotic stressors such as a change in soil pH.

One of the downsides of chemical fungicides is the potential for a loss of diversity in the soil microbiome, which can make a plant more vulnerable [30]. This raises the concern that our spore-based platform could also detrimentally affect the soil microbiome. Literature has shown that although B. subtilis is considered a net-positive addition to the microbiome, it could be responsible for a reduction in diversity [31]. To investigate this further, we decided to model the potential for loss of microbial diversity in soil as a result of B. subtilis spore application.

Why gLV?

There have been several approaches to modelling microbial communities, and we initially considered constructing a mechanistic model. In mechanistic modeling, significant reactions (such as the production or absorption of a lipopeptide) are parameterized in differential equations. This can lead to predictions of emergent behavior in the system; however, the number of species included is usually small and the model can become highly complex. Generalized Lotka-Volterra (gLV) models can simulate pairwise interactions between species. They rely on the assumption that the effects of each interaction on growth and death rate can be summed together. This allows complex mechanistic models to be reduced to gLV systems in certain cases (such as when the interactions are pairwise). We chose to focus on gLV models as they are simple enough to simulate large systems and have been used extensively to study microbial communities such as the human gut microbiome [32].

Mathematical prerequisites

The gLV equation is:

$$\frac{dx(t)}{dt}=D(x(t))(r + Ax(t))$$

where A is the community matrix, which encodes species interactions, r is the vector of intrinsic growth rates of each species, D(x(t)) is the diagonal matrix with the species abundances x(t) on its diagonal, and dx/dt is the vector of rates of change (Eq. 1). The diagonal elements of A represent self-limitation: with negative diagonal entries and no other interactions, each species would follow logistic growth. The off-diagonal elements parameterize how the growth rate of one species depends on the population of another.
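As a concrete illustration of Eq. 1, the following minimal sketch integrates a two-species gLV system with SciPy. The growth rates, community matrix and initial abundances are hypothetical values chosen only to give a stable, coexisting pair; they are not parameters from our analysis.

```python
import numpy as np
from scipy.integrate import solve_ivp

def glv_rhs(t, x, r, A):
    """Right-hand side of the gLV equation: dx/dt = D(x)(r + A x)."""
    return x * (r + A @ x)

# Hypothetical two-species competitive community
r = np.array([1.0, 1.0])                # intrinsic growth rates
A = np.array([[-1.0, -0.5],             # diagonal: self-limitation
              [-0.5, -1.0]])            # off-diagonal: pairwise interactions
x0 = np.array([0.1, 0.2])               # initial abundances

sol = solve_ivp(glv_rhs, (0.0, 50.0), x0, args=(r, A), rtol=1e-8, atol=1e-10)
x_final = sol.y[:, -1]
print(x_final)
```

Multiplying elementwise by `x` plays the role of the diagonal matrix D(x(t)), so a species at zero abundance stays at zero, which is why invasions must start from a small positive population.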

We aim to investigate the invasion of a community by another species in gLV systems and have simplified the problem based on literature to make the invasion analysis tractable [33].

The 3 assumptions are:

1. Invasions are rare events, such that equilibrium is reached before each subsequent invasion.
2. Invasions occur at low abundance.
3. The system obeys fixed-point dynamics (i.e. the system has a feasible and asymptotically stable equilibrium).

Systems with no feasible equilibrium will either reach a boundary value or go to infinity. This is non-physical, so we ignore those solutions as well. The equilibrium x* of a gLV system is the solution to Eq. 2. This implies that the community matrix must be invertible for an equilibrium to exist. The system must also be locally asymptotically stable. We use May's stability criterion (Eq. 3) to determine whether a system is stable, where n is the number of species (the number of columns of the community matrix), C is the probability that two species interact, σ² is the variance of the interaction strengths and d is the magnitude of the self-limitation terms on the diagonal [34].

$$x^{*} = -A^{-1}r$$

$$\sqrt{nC\sigma^{2}} < d$$
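Eq. 2 and Eq. 3 translate directly into a few NumPy helpers. The sketch below is illustrative: the function names are hypothetical, the example matrix is not from our analysis, and the eigenvalue check (using the Jacobian D(x*)A at a feasible equilibrium) is included as a direct test of local stability alongside May's statistical criterion.

```python
import numpy as np

def equilibrium(A, r):
    """Eq. 2: x* = -A^{-1} r, obtained by solving A x* = -r."""
    return np.linalg.solve(A, -r)

def is_feasible(x_star):
    """A feasible equilibrium has strictly positive abundances."""
    return bool(np.all(x_star > 0))

def may_stable(n, C, sigma2, d):
    """Eq. 3: May's criterion sqrt(n * C * sigma^2) < d."""
    return bool(np.sqrt(n * C * sigma2) < d)

def locally_stable(A, x_star):
    """All eigenvalues of the Jacobian D(x*) A have negative real part."""
    jacobian = np.diag(x_star) @ A
    return bool(np.all(np.linalg.eigvals(jacobian).real < 0))

# Hypothetical two-species example
A = np.array([[-1.0, -0.5],
              [-0.5, -1.0]])
r = np.ones(2)
x_star = equilibrium(A, r)
print(x_star, is_feasible(x_star), locally_stable(A, x_star))
print(may_stable(n=2, C=1.0, sigma2=0.25, d=1.0))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is numerically safer, and it raises an error for a singular community matrix, matching the requirement that A be invertible.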

Results

To understand how the introduction of a new species into a gLV system can impact which species can coexist, we generated random matrices using the method from May's 1972 paper [34]. A 10-species community matrix was generated with values taken from a Gaussian distribution with mean 0 and variance 0.3. The diagonal values and the growth rates were taken to be -1 and 1 respectively. There was also a probability of 0.3 that one species would not be affected by another. We calculated the equilibrium for the 9x9 matrix (the community before invasion) and again for the full-sized system. Populations with negative or zero equilibrium values were deemed to have collapsed, because when the dynamics were simulated with an initial value above 0, these populations tended to zero. Sometimes the 9x9 matrix is generated with some collapsed species to begin with, so we used the difference in the number of collapsed species in the community before and after the invasion.

We wanted to see the effect that increasing the strength of interactions of a species would have on the number of collapsed populations. To do this, each value of the last row of the community matrix was increased by 0.1 twenty times, and the population loss was calculated at each step. The resulting graphs of population loss for individual matrices were inconclusive as to the effect of increasing the strength of the invading species on population collapse: some increased with interaction strength while others plateaued or decreased. To determine the overall effect, we took the number of collapsed populations for each incremental increase in interaction strength and calculated the mean over 10000 community matrices. This produced the resulting graph (Fig 8).

Figure 8: Increase in the proportion of collapsed species as the invading species' interactions are made stronger.
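The invasion experiment described above can be sketched as follows. This is a reconstruction under stated assumptions rather than our original script: the function names are hypothetical, the increment is applied only to the off-diagonal entries of the last row (so self-limitation is unchanged), collapses are counted among the nine resident species only, and the default of 200 matrices is far smaller than the 10000 used for Fig 8 so that the example runs quickly.

```python
import numpy as np

def random_community(n, variance, p_zero, rng):
    """Random community matrix in the style of May (1972)."""
    A = rng.normal(0.0, np.sqrt(variance), size=(n, n))
    A[rng.random((n, n)) < p_zero] = 0.0   # pairs with no interaction
    np.fill_diagonal(A, -1.0)              # self-limitation
    return A

def collapsed(A):
    """Mask of species with equilibrium abundance <= 0 (all growth rates 1)."""
    x_star = np.linalg.solve(A, -np.ones(A.shape[0]))
    return x_star <= 0

def invasion_loss_curve(A, step=0.1, n_steps=20):
    """Extra resident collapses after invasion as the last row is strengthened."""
    before = collapsed(A[:-1, :-1]).sum()  # community without the invader
    B = A.copy()
    losses = []
    for _ in range(n_steps + 1):           # baseline plus n_steps increments
        after = collapsed(B)[:-1].sum()    # residents only (assumption)
        losses.append(after - before)
        B[-1, :-1] += step                 # off-diagonal entries of the last row
    return np.array(losses)

def mean_loss_curve(n_matrices=200, n=10, variance=0.3, p_zero=0.3, seed=0):
    rng = np.random.default_rng(seed)
    curves = [invasion_loss_curve(random_community(n, variance, p_zero, rng))
              for _ in range(n_matrices)]
    return np.mean(curves, axis=0)

curve = mean_loss_curve()
print(curve)
```

Averaging over many matrices is what turns the noisy per-matrix curves into a readable trend; with few matrices the mean curve should be expected to fluctuate.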

There is a positive correlation between increasing the strength of the invading species and the number of populations which collapse. By analogy we could predict that the better B. subtilis works as a fungicide relative to other interactions in the microbiome, the more species collapse is likely to occur. Assuming that B. subtilis can invade the microbiome it could result in a loss of diversity. For a relatively similar strength of interactions to the rest of the microbiome, species loss is mitigated. However, when a threshold is reached, species collapse becomes much more likely. This could be an important prediction if we had engineered B. subtilis to overexpress lipopeptides as the interactions could become disproportionate.

In real microbial communities there can also be interdependence between species such as in the case where one species consumes a carbon source produced by another. These hijacking species have been observed in real soil microbiomes and can provide certain benefits [35]. As demonstrated in the graph below, killing off a species which others depend on can cause a greater loss of species diversity (Fig 9).

Figure 9: The top graph shows a system of three species reaching equilibrium. The lower image shows that when another species invades and causes a species to die out it can in turn cause other species to collapse as well because the growth of species 1 and 2 is dependent on species 3.

Bet hedging is a mechanism by which spores do not all germinate at once; instead, some remain dormant even in the presence of sufficient nutrition. As a result, we expect B. subtilis to remain on the plant and in the soil for an extended period, as even in the presence of the requisite biomarkers not all spores will germinate immediately. To simulate this, we set a minimum threshold for the invading species’ population, below which its growth rate was increased. We first simulated 500 matrices and collected only those where the invading species was predicted to collapse. For those systems, the proportion of additional species lost as a result of invasion was calculated, with species loss defined as a population below 0.05. This was repeated as the growth rate increment was increased. The results (Fig 10) show a positive correlation between the growth rate increment and the number of species that collapse, demonstrating that the constant presence of B. subtilis spores can have a destabilizing effect on the microbiome and could potentially cause more species loss than if there were no spores. However, we would not expect a large increase in population every time the threshold is reached, so in practice the decrease in stability should be relatively minimal.

Figure 10: Shows the increase in population collapse as the invading species is kept at a minimal level.
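The thresholded growth rate used to mimic bet hedging can be illustrated with a small simulation. This is a sketch under assumptions, not our original code: the three-species system, the boost value and the function names are hypothetical, and the invader is taken to be the last species.

```python
import numpy as np
from scipy.integrate import solve_ivp

def glv_bet_hedging(t, x, r, A, threshold, boost):
    """gLV right-hand side; the invader (last index) grows faster when scarce."""
    r_eff = r.copy()
    if x[-1] < threshold:
        r_eff[-1] += boost      # stands in for dormant spores germinating later
    return x * (r_eff + A @ x)

def final_abundances(r, A, x0, threshold, boost, t_end=60.0):
    sol = solve_ivp(glv_bet_hedging, (0.0, t_end), x0,
                    args=(r, A, threshold, boost))
    return sol.y[:, -1]

# Hypothetical community: two residents and an invader that is outcompeted
r = np.ones(3)
A = np.array([[-1.0, -0.5, -0.5],
              [-0.5, -1.0, -0.5],
              [-1.0, -1.0, -1.0]])
x0 = np.array([2 / 3, 2 / 3, 0.01])

without_boost = final_abundances(r, A, x0, threshold=0.05, boost=0.0)
with_boost = final_abundances(r, A, x0, threshold=0.05, boost=0.5)
n_lost = int(np.sum(with_boost[:-1] < 0.05))   # residents below the loss level
print(without_boost, with_boost, n_lost)
```

The switch in growth rate makes the right-hand side discontinuous, so the adaptive integrator takes many small steps while the invader hovers around the threshold; a smoothed switch would be a reasonable alternative for larger batches of matrices.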

Limitations

There are issues with the way we have produced random matrices. It has been shown that the matrices utilized in May's research can differ in stability by orders of magnitude from real community matrices [36]. We have therefore not relied on stability values directly and have instead used random matrices to observe general trends in the dynamics. To obtain predictive stability values, we would need time-series sequencing data from a crop field, preferably before and after the introduction of B. subtilis spores. We could then fit our model to this data to obtain an empirical community matrix.

There are also several limitations of gLV models which are important to consider when contextualizing our results. First, only pairwise interactions are possible in gLV equations, and cases such as the mediation of an interaction between two species by a third are excluded. This can make the equations fit empirical data incorrectly and lead to dynamics which are not predictive [36]. However, gLV models have already been used successfully to model the human gut microbiome before and after a course of antibiotics [32]. There, the stability of the system before and after antibiotics was a more important factor than having predictive dynamics. If we could use this methodology to analyze data from the field, the exact trajectories of the populations would be less important.

Another limitation is that gLV models are space invariant. It has been shown from metacommunity models that space can increase the stability of a population, because invasions from nearby communities can prevent collapse [37]. In our case we should keep in mind that the loss of a species in a gLV equation does not equate to homogeneous loss of a species throughout a field; however, we can treat it as a signal of instability.

The graphs were all produced using Python and the SciPy module.

References

[1] Ross C, Abel-Santos E. The Ger receptor family from sporulating bacteria. Curr Issues Mol Biol. 2010;12(3):147-158.

[2] Blinker S, Vreede J, Setlow P, Brul S. (2021) Predicting the Structure and Dynamics of Membrane Protein GerAB from Bacillus subtilis. Int J Mol Sci. 2021 Apr 6;22(7):3793. doi: 10.3390/ijms22073793. PMID: 33917581; PMCID: PMC8038838.

[3] J. D. Amon, L. Artzi, D. Z. Rudner, Genetic Evidence for Signal Transduction within the Bacillus subtilis GerA Germinant Receptor. Journal of Bacteriology. 2022 Feb 15. doi: 10.1128/jb.00470-21

[4] Artzi L, Alon A, Brock KP, Green AG, Tam A, Ramírez-Guadiana FH, Marks D, Kruse A, Rudner DZ. (2021) Dormant spores sense amino acids through the B subunits of their germination receptors. Nat Commun. 2021 Nov 25;12(1):6842. doi: 10.1038/s41467-021-27235-2. PMID: 34824238; PMCID: PMC8617281.

[5] Li, Y., Jin, K., Perez-Valdespino, A., Federkiewicz, K., Davis, A., Maciejewski, M.W., Setlow, P. and Hao, B. (2019). Structural and functional analyses of the N-terminal domain of the A subunit of a Bacillus megaterium spore germinant receptor. Proceedings of the National Academy of Sciences, 116(23), pp.11470–11479. doi:10.1073/pnas.1903675116.

[6] Christie, G. and Setlow, P. (2020). Bacillus spore germination: Knowns, unknowns and what we need to learn. Cellular Signalling, 74, p.109729. doi:10.1016/j.cellsig.2020.109729.

[7] Evans, Richard, et al. (2021) Protein Complex Prediction with AlphaFold-Multimer. doi:10.1101/2021.10.04.463034.

[8] Hopf T. A., Green A. G., Schubert B., et al. The EVcouplings Python framework for coevolutionary sequence analysis. Bioinformatics 35, 1582–1584 (2019)

[9] Y. Zhang, J. Skolnick, TM-align: A protein structure alignment algorithm based on TM-score, Nucleic Acids Research, 33: 2302-2309 (2005)

[10] Madeira, F., Pearce, M., Tivey, A.R.N., Basutkar, P., Lee, J., Edbali, O., Madhusoodanan, N., Kolesnikov, A. and Lopez, R. (2022). Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Research. doi:10.1093/nar/gkac240.

[11] Bauer, P., Hess, B. and Lindahl, E. (2022). GROMACS 2022.3 Manual. Zenodo. [online] Available at: https://zenodo.org/record/7037337#.Y0XYvnbMJPY [Accessed 11 Oct. 2022].

[12] O. Trott, A. J. Olson, AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461

[13] Dávid Jakubec, Petr Škoda, Radoslav Krivák, Marian Novotný and David Hoksza. PrankWeb 3: accelerated ligand-binding site predictions for experimental and modelled protein structures. Nucleic Acids Research. May 2022

[14] Vavra, O., Filipovic, J., Plhak, J., Bednar, D., Marques, S. M., Brezovsky, J., Stourac, J., Matyska, L., Damborsky, J., 2019: CaverDock: A Molecular Docking-Based Tool to Analyse Ligand Transport through Protein Tunnels and Channels. Bioinformatics 35: 4986-4993. DOI: 10.1093/bioinformatics/btz386.

[15] Christie, G. and Lowe, C.R. (2007). Role of Chromosomal and Plasmid-Borne Receptor Homologues in the Response of Bacillus megaterium QM B1551 Spores to Germinants. Journal of Bacteriology, 189(12), pp.4375–4383. doi:10.1128/jb.00110-07.

[16] R.V. Honorato, P.I. Koukos, B. Jimenez-Garcia, A. Tsaregorodtsev, M. Verlato, A. Giachetti, A. Rosato and A.M.J.J. Bonvin (2021). "Structural biology in the clouds: The WeNMR-EOSC Ecosystem." Frontiers Mol. Biosci., 8, fmolb.2021.729513.

[17] G.C.P van Zundert, J.P.G.L.M. Rodrigues, M. Trellet, C. Schmitz, P.L. Kastritis, E. Karaca, A.S.J. Melquiond, M. van Dijk, S.J. de Vries and A.M.J.J. Bonvin (2016). "The HADDOCK2.2 webserver: User-friendly integrative modeling of biomolecular complexes." J. Mol. Biol., 428, 720-725.

[18] Rostami, A., Hinc, K., Goshadrou, F., Shali, A., Bayat, M., Hassanzadeh, M., Amanlou, M., Eslahi, N. and Ahmadian, G., 2017. Display of B. pumilus chitinase on the surface of B. subtilis spore as a potential biopesticide. Pesticide Biochemistry and Physiology, 140, pp.17-23.

[19] Swiontek Brzezinska, M., Jankiewicz, U., Burkowska, A. & Walczak, M. (2014) Chitinolytic Microorganisms and Their Possible Application in Environmental Protection. Current Microbiology. 68 (1), 71–81. doi:10.1007/s00284-013-0440-4.

[20] Chen, H., Wu, B., Zhang, T., Jia, J., Lu, J., Chen, Z., Ni, Z. & Tan, T. (2017) Effect of Linker Length and Flexibility on the Clostridium thermocellum Esterase Displayed on Bacillus subtilis Spores. Applied Biochemistry and Biotechnology. 182 (1), 168–180. doi:10.1007/s12010-016-2318-y.

[21] Wang, H., Wang, Y. & Yang, R. (2017) Recent progress in Bacillus subtilis spore-surface display: concept, progress, and future. Applied Microbiology and Biotechnology. 101 (3), 933–949. doi:10.1007/s00253-016-8080-9.

[22] Stoykov, Y.M., Pavlov, A.I. & Krastanov, A.I. (2015) Chitinase biotechnology: Production, purification, and application. Engineering in Life Sciences. 15 (1), 30–38. doi:10.1002/elsc.201400173.

[23] Kuriata, A., Gierut, A.M., Oleniecki, T., Ciemny, M.P., Kolinski, A., Kurcinski, M. & Kmiecik, S. (2018) CABS-flex 2.0: a web server for fast simulations of flexibility of protein structures. Nucleic Acids Research. 46 (W1), W338–W343. doi:10.1093/nar/gky356.

[24] Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M.R., Appel, R.D. & Bairoch, A. (2005) Protein Identification and Analysis Tools on the ExPASy Server. In: J.M. Walker (ed.). The Proteomics Protocols Handbook. Totowa, NJ, Humana Press. pp. 571–607. doi:10.1385/1-59259-890-0:571.

[25] Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S. & Steinegger, M. (2022) ColabFold: making protein folding accessible to all. Nature Methods. 19 (6), 679–682. doi:10.1038/s41592-022-01488-1.

[26] Chen, X., Zaro, J.L. & Shen, W.-C. (2013) Fusion protein linkers: Property, design and functionality. Advanced Drug Delivery Reviews. 65 (10), 1357–1369. doi:10.1016/j.addr.2012.09.039.

[27] Ferrè, F. & Clote, P. (2005) DiANNA: a web server for disulfide connectivity prediction. Nucleic Acids Research. 33 (suppl_2), W230–W232. doi:10.1093/nar/gki412.

[28] Xue, L.C., Rodrigues, J.P., Kastritis, P.L., Bonvin, A.M. & Vangone, A. (2016) PRODIGY: a web server for predicting the binding affinity of protein–protein complexes. Bioinformatics. 32 (23), 3676–3678. doi:10.1093/bioinformatics/btw514.

[29] Chen, J., Sharifi, R., Khan, M.S.S., Islam, F., Bhat, J.A., Kui, L. and Majeed, A. (2022). Wheat Microbiome: Structure, Dynamics, and Role in Improving Performance Under Stress Environments. Frontiers in Microbiology, 12. doi:10.3389/fmicb.2021.821546.

[30] Lloyd, A.W., Percival, D. and Yurgel, S.N. (2021). Effect of Fungicide Application on Lowbush Blueberries Soil Microbiome. Microorganisms, 9(7), p.1366. doi:10.3390/microorganisms9071366.

[31] Serván, C. A., and S. Allesina. 2020. Tractable models of ecological assembly. bioRxiv.

[32] Stein, R.R., Bucci, V., Toussaint, N.C., Buffie, C.G., Rätsch, G., Pamer, E.G., Sander, C. and Xavier, J.B. (2013). Ecological Modeling from Time-Series Inference: Insight into Dynamics and Stability of Intestinal Microbiota. PLoS Computational Biology, 9(12), p.e1003388. doi:10.1371/journal.pcbi.1003388.

[33] James, A., Plank, M.J., Rossberg, A.G., Beecham, J., Emmerson, M. and Pitchford, J.W. (2015). Constructing Random Matrices to Represent Real Ecosystems. The American Naturalist, 185(5), pp.680–692. doi:10.1086/680496.

[34] Lozano, G.L., Bravo, J.I., Garavito Diago, M.F., Park, H.B., Hurley, A., Peterson, S.B., Stabb, E.V., Crawford, J.M., Broderick, N.A. and Handelsman, J. (2019). Introducing THOR, a Model Microbiome for Genetic Dissection of Community Behavior. mBio, [online] 10(2), pp.e02846-18. doi:10.1128/mBio.02846-18.

[35] James, A., Plank, M.J., Rossberg, A.G., Beecham, J., Emmerson, M. and Pitchford, J.W. (2015). Constructing Random Matrices to Represent Real Ecosystems. The American Naturalist, 185(5), pp.680–692. doi:10.1086/680496.

[36] Momeni, B., Xie, L. and Shou, W. (2017). Lotka-Volterra pairwise modeling fails to capture diverse pairwise microbial interactions. eLife, 6. doi:10.7554/elife.25051.

[37] Obadia, B., Güvener, Z.T., Zhang, V., Ceja-Navarro, J.A., Brodie, E.L., Ja, W.W. and Ludington, W.B. (2017). Probabilistic Invasion Underlies Natural Gut Microbiome Stability. Current Biology, [online] 27(13), pp.1999-2006.e8. doi:10.1016/j.cub.2017.05.034.