Model

Analysis of physicochemical properties of proteins and preliminary analysis of PNA shielding effect

Overall overview: Before deciding to use the Cas13a and Csm6 proteins for experiments, we analyzed their physicochemical properties, hydrophilicity, and structural stability, and performed a preliminary analysis of the shielding effect of PNA.

1. Hydrophilicity analysis of proteins

The amino acid sequence of a protein determines its hydrophilicity/hydrophobicity. Hydrophilicity determines a protein's solubility in water, while hydrophobicity drives the protein to fold inward to form secondary structures, and further structural domains, tertiary structure, and so on; strong hydrophobicity also favors the formation of α-helices, increasing stability. Here we used the open-source ProtScale program to predict the hydropathy of Cas13a and Csm6 from their amino acid composition.

Amino acid hydrophilicity/hydrophobicity score:

Ala: 1.800   Arg: -4.500   Asn: -3.500   Asp: -3.500   Cys: 2.500
Gln: -3.500  Glu: -3.500   Gly: -0.400   His: -3.200   Ile: 4.500
Leu: 3.800   Lys: -3.900   Met: 1.900    Phe: 2.800    Pro: -1.600
Ser: -0.800  Thr: -0.700   Trp: -0.900   Tyr: -1.300   Val: 4.200
Asx (B): -3.500   Glx (Z): -3.500   Xaa (X): -0.490
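For illustration, a hydropathy profile like the ones below can be computed by averaging these Kyte-Doolittle scores over a sliding window along the sequence. A minimal Python sketch follows (the window size of 9 and the short example peptide are assumptions for demonstration):

# Minimal sketch of a Kyte-Doolittle hydropathy profile.
# Window size 9 is an assumption (ProtScale's common default).
KD = {'A': 1.8, 'R': -4.5, 'N': -3.5, 'D': -3.5, 'C': 2.5,
      'Q': -3.5, 'E': -3.5, 'G': -0.4, 'H': -3.2, 'I': 4.5,
      'L': 3.8, 'K': -3.9, 'M': 1.9, 'F': 2.8, 'P': -1.6,
      'S': -0.8, 'T': -0.7, 'W': -0.9, 'Y': -1.3, 'V': 4.2}

def hydropathy_profile(seq, window=9):
    """Average Kyte-Doolittle score over a sliding window along the sequence."""
    half = window // 2
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i - half:i + half + 1]) / window
            for i in range(half, len(seq) - half)]

# Example with a short hypothetical peptide:
print(hydropathy_profile("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))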

The predicted hydrophobicity of Cas13a is as follows:

[Figure: predicted hydropathy profile of Cas13a]

In the plot, hydrophobic residues score positive and hydrophilic residues score negative (Kyte-Doolittle scale).

The predicted hydrophobicity of Csm6 is as follows:

[Figure: predicted hydropathy profile of Csm6]

In the plot, hydrophobic residues score positive and hydrophilic residues score negative (Kyte-Doolittle scale).

2. Analysis of physicochemical properties of proteins

Cas13a protein:

Number of amino acids: 449

Molecular weight: 52595.25

Theoretical pI: 5.56

Amino acid composition:
Ala (A) 16 3.6% Arg (R) 12 2.7%
Asn (N) 42 9.4% Asp (D) 29 6.5%
Cys (C) 1 0.2% Gln (Q) 3 0.7%
Glu (E) 45 10.0% Gly (G) 24 5.3%
His (H) 4 0.9% Ile (I) 59 13.1%
Leu (L) 28 6.2% Lys (K) 55 12.2%
Met (M) 13 2.9% Phe (F) 27 6.0%
Pro (P) 4 0.9% Ser (S) 24 5.3%
Thr (T) 19 4.2% Trp (W) 4 0.9%
Tyr (Y) 24 5.3% Val (V) 16 3.6%
Pyl (O) 0 0.0% Sec (U) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 74
Total number of positively charged residues (Arg + Lys): 67

Atomic composition:
Carbon C 2387
Hydrogen H 3725
Nitrogen N 597
Oxygen O 710
Sulfur S 14

Formula: C(2387)H(3725)N(597)O(710)S(14)
Total number of atoms: 7433

Csm6 protein:

Number of amino acids: 422

Molecular weight: 49126.42

Theoretical pI: 6.54

Amino acid composition:
Ala (A) 16 3.8% Arg (R) 17 4.0%
Asn (N) 41 9.7% Asp (D) 23 5.5%
Cys (C) 2 0.5% Gln (Q) 9 2.1%
Glu (E) 35 8.3% Gly (G) 13 3.1%
His (H) 9 2.1% Ile (I) 36 8.5%
Leu (L) 48 11.4% Lys (K) 39 9.2%
Met (M) 10 2.4% Phe (F) 18 4.3%
Pro (P) 16 3.8% Ser (S) 28 6.6%
Thr (T) 16 3.8% Trp (W) 4 0.9%
Tyr (Y) 18 4.3% Val (V) 24 5.7%
Pyl (O) 0 0.0% Sec (U) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 58
Total number of positively charged residues (Arg + Lys): 56

Atomic composition:
Carbon C 2216
Hydrogen H 3502
Nitrogen N 584
Oxygen O 651
Sulfur S 12

Formula: C(2216)H(3502)N(584)O(651)S(12)
Total number of atoms: 6965
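For reference, statistics like these can be reproduced with Biopython's ProtParam module. A minimal sketch, assuming Biopython is installed (the short sequence is a stand-in; substitute the full Cas13a or Csm6 amino acid sequence):

# Sketch using Biopython's ProtParam; the statistics above come from an
# ExPASy-style tool, and Bio.SeqUtils.ProtParam performs the same kind of analysis.
from Bio.SeqUtils.ProtParam import ProteinAnalysis

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # stand-in for the full protein sequence
pa = ProteinAnalysis(seq)

print("Number of amino acids:", len(seq))
print("Molecular weight:", pa.molecular_weight())
print("Theoretical pI:", pa.isoelectric_point())
print("Amino acid composition:", pa.count_amino_acids())
print("GRAVY (mean hydropathy):", pa.gravy())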

3. Preliminary fit analysis of the fluorescence signal data measured in the laboratory

Taking Cas13a as an example, we measured in the laboratory the fluorescence signal intensity generated by CRISPR/Cas13a recognition of its target under different reaction times and PNA concentrations:

[Figure: measured fluorescence intensity at different reaction times and PNA concentrations]

Fitting result with the three quantities (reaction time, relative PNA concentration, fluorescence) as variables:

[Figure: fitted surface]

Because PNA is expensive, we fixed the PNA concentration and varied the substrate concentration, using it as a relative measure of PNA concentration. It can be seen that, for a given reaction time and substrate concentration, the fluorescence signal becomes stronger as the reaction time increases, while increasing the PNA concentration weakens the fluorescence signal, though not markedly. This indicates that as the PNA concentration increases, it also begins to interfere with the nucleic acid to be tested.
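As an illustration of how such a fit can be performed, here is a sketch with scipy.optimize.curve_fit. The saturating functional form and the data points below are assumptions for demonstration only, not our measured values or the form used for the plot above:

# Hypothetical surface fit of fluorescence vs. time and relative PNA concentration.
import numpy as np
from scipy.optimize import curve_fit

def model(X, A, k, beta):
    t, c = X                        # t: reaction time, c: relative PNA concentration
    return A * (1 - np.exp(-k * t)) / (1 + beta * c)

# Hypothetical measurements: (time, concentration) -> fluorescence
t = np.array([5, 10, 20, 40, 60, 5, 10, 20, 40, 60], dtype=float)
c = np.array([0.5, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0])
F = np.array([120, 210, 340, 430, 460, 100, 180, 290, 370, 400], dtype=float)

popt, pcov = curve_fit(model, (t, c), F, p0=[500, 0.05, 0.1])
print("A, k, beta =", popt)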

Kinetic simulation of the entire reaction

Overall overview: The CRISPR/Cas system is a technique for modifying target genes with RNA-guided Cas proteins, derived from bacterial acquired immunity. CRISPR/Cas systems can be classified into several types; subtype A of type III (type III-A) has an effector complex called Csm, which consists of multiple Cas proteins together with crRNA. The Csm complex not only cleaves target RNA complementary to the crRNA; binding of target RNA also activates two further enzymatic activities of the complex, namely cleavage of ssDNA during transcription and synthesis of cyclic oligoadenylate (cOA), which acts as a second messenger to activate Csm6 for non-specific RNA degradation. Type III CRISPR systems can generate two types of cOA, cA4 and cA6, which bind to the CARF domain of the protein, in turn activating the HEPN domain for non-specific cleavage that cuts the fluorophore away from the quencher on a pre-designed reporter and releases the fluorescence signal.

Due to time constraints and the impact of the pandemic, we only completed the proof of concept for Cas13a and Cas14a under the PNA shielding effect in the laboratory, and did not perform the validation experiments for Csm6 in tandem with Cas13a. Therefore, in our model we simulated the process of Csm6 reacting in tandem with Cas13a and releasing the fluorescence signal, and simulated the effect of shielding the mutant strand with shackled PNA.

[Figure: reaction system diagram]

1. Shielding of mismatched-strand RNA by shackled PNA

In CRISPR/Cas detection, Csm6-mediated cleavage can be triggered by misidentification of sequences similar to the target, and we designed the shackled PNA to shield the mismatched strand and solve this problem. Here we briefly simulated the kinetics of PNA binding to the target and mismatched strands after they enter the reaction system.

[Figure: kinetics of PNA binding to the target and mismatched strands]
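For reference, the same kinetics can be reproduced in Python with SciPy; this is a direct port of the f_PNA function and rate constants from the MATLAB code in Annex 3:

# Python port of f_PNA from Annex 3 (same equations and rate constants).
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

kr1, kr2 = 0.96, 0.001     # PNA consumption rates
ka, da = 0.004, 0.0001     # target RNA binding / dissociation
ka2, da2 = 0.106, 0.00001  # mutate RNA binding / dissociation

def f_PNA(t, y):
    PNA, target, mutate = y
    return [-kr2 * target - kr1 * PNA,   # d[PNA]/dt
            -(ka + da) * target,         # d[Target RNA]/dt
            -(ka2 + da2) * mutate]       # d[Mutate RNA]/dt

sol = solve_ivp(f_PNA, [0, 30], [2.11, 2.11, 2.11], rtol=1e-10, atol=1e-10)
for label, row in zip(["PNA", "Target RNA", "Mutate RNA"], sol.y):
    plt.plot(sol.t, row, label=label)
plt.xlabel("time"); plt.ylabel("concentration"); plt.legend(); plt.show()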

2. Csm6 recognizes RNA and triggers the synthesis of cyclic tetra- and hexa-adenylates, releasing cA4 and cA6 to activate the CARF domain

In this process, we denote inactive Csm6 as Csm6_off and activated Csm6, which releases cA4 and cA6, as Csm6_on. The reaction equation is as follows:

$$\mathrm{Csm6_{off}} + n\,\mathrm{RNA}\ \underset{k_{da}}{\overset{k_{a}}{\rightleftharpoons}}\ \mathrm{Csm6_{on}}$$

In turn, we obtain three ordinary differential equations:

$$\frac{d[\mathrm{Csm6_{off}}]}{dt} = -k_{a}[\mathrm{RNA}]^{n}[\mathrm{Csm6_{off}}] + k_{da}[\mathrm{Csm6_{on}}]$$
$$\frac{d[\mathrm{Csm6_{on}}]}{dt} = k_{a}[\mathrm{RNA}]^{n}[\mathrm{Csm6_{off}}] - k_{da}[\mathrm{Csm6_{on}}]$$
$$\frac{d[\mathrm{RNA}]}{dt} = n\left(-k_{a}[\mathrm{RNA}]^{n}[\mathrm{Csm6_{off}}] + k_{da}[\mathrm{Csm6_{on}}]\right)$$

3. Csm6 triggers the synthesis of cOA, which binds to the CRISPR-associated CARF domain and activates the ribonuclease activity of the HEPN domain to cleave the pre-designed fluorophore-quencher linker and release the fluorescence signal

The classical enzymatic reaction scheme proposed by Michaelis:

$$X + E\ \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}}\ C\ \overset{k_{cat}}{\longrightarrow}\ E + P$$

where X is the substrate, E is the enzyme, C is the reaction intermediate, P is the reaction product, and $k_{cat}$ is the rate constant for the conversion of C into free E and the product P.

So we get:

$$\frac{d[X]}{dt} = -k_{1}[X][E] + k_{-1}[C]$$
$$\frac{d[E]}{dt} = -k_{1}[X][E] + (k_{-1} + k_{cat})[C]$$
$$\frac{d[C]}{dt} = k_{1}[X][E] - (k_{-1} + k_{cat})[C]$$
$$\frac{d[P]}{dt} = k_{cat}[C] = \frac{V_{max}[X]}{K_{m} + [X]},\qquad K_{m} = \frac{k_{-1} + k_{cat}}{k_{1}}$$
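The whole cascade can likewise be simulated in Python; the sketch below ports f_RNA2 (RNA stoichiometric coefficient n = 2) and the rate constants from the MATLAB code in Annex 3:

# Python port of f_RNA2 from Annex 3 (RNA coefficient n = 2).
import numpy as np
from scipy.integrate import solve_ivp

k_a, k_da = 1.0e9, 2.6e-2                        # Csm6 activation / deactivation
k_cat, v_off, K_m = 2.0e-2, 1.9547e-5, 1.27e-3   # Michaelis-Menten constants

def f_RNA2(t, x):
    csm6_off, csm6_on, rna, carf, hepn = x
    act = k_a * rna**2 * csm6_off - k_da * csm6_on  # net activation flux
    mm_on = k_cat * csm6_on * carf / (K_m + carf)   # CARF -> HEPN conversion
    mm_off = v_off * hepn / (K_m + hepn)            # HEPN decay
    return [-act, act, -2 * act, -mm_on - mm_off, mm_on - mm_off]

x0 = [2.04667e-5, 0, 6.80008e-5, 2.40213e-5, 0]  # initial concentrations
sol = solve_ivp(f_RNA2, [0, 120], x0, method="LSODA")
print(sol.y[4, -1])  # final HEPN signal, proportional to fluorescence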

Results:

After PNA is added to the reaction system, the concentrations of PNA, target RNA, and mutate RNA in solution change as follows.

It can be seen that, over time, the interfering mutate RNA is gradually shielded by PNA, the target RNA is only partially depleted, and PNA is depleted the most, because PNA binds the target RNA and the mismatched RNA and also binds to itself.

The graph also shows that the extent of binding follows mutate RNA > target RNA > PNA. The target RNA is overwhelmingly retained and proceeds to the next stage of the reaction together with the few remaining mutate RNAs.

[Figure: concentration changes of PNA, target RNA, and mutate RNA over time]
RNA is recognized by Csm6, releasing cA4 and cA6 to activate the CARF domain:

[Figure: simulation of Csm6 activation]
The cOA binds to the CARF domain and activates the HEPN domain to non-specifically cleave the designed fluorophore-quencher linker and release the fluorescence signal:

[Figure: fluorescence signal over time for n = 1, 2, 3]

Summary

Here we tried several values of the RNA stoichiometric coefficient n. When n = 1 the prediction does not match the actual situation, while n = 2 and n = 3 match the real experimental behavior. From the predicted results, the fluorescence signal increases with time along a concave curve whose growth rate gradually decreases. Even after adding the shackled PNA, part of the single-base-mismatched strand cannot be completely shielded and is still recognized by Csm6, which interferes with detection. This kinetic model helps us better understand the reaction process of detection when Csm6 and Cas13a are used in tandem.

Melting temperature prediction model based on the nearest-neighbor model and a DNN neural network

Overall overview: NATer (Nucleic Acid Tracker) achieves single-base specificity by blocking the mismatched strand with PNA. Unlike a DNA yoke, the backbone of PNA is a peptide: 1. when PNA binds complementarily to a nucleic acid strand, the melting temperature is greatly elevated relative to DNA/DNA binding; 2. when base mismatches are present, a single base-pair mismatch can lower the Tm by 8-20 °C (15 °C on average), far more than the corresponding drop for DNA/DNA binding. Therefore, we also need a modeling approach to predict the optimal melting temperature when shielding with PNA. In our model, you can see our method for achieving this.

1. Nearest-Neighbor Model

Currently, the standard enthalpy change ΔH° and entropy change ΔS° of the DNA hybridization reaction are calculated with the nearest-neighbor model proposed by Borer et al. in 1974. Its main idea is to convert the calculation of ΔH° and ΔS° of the hybridization reaction into the cumulative sum of the standard enthalpy and entropy changes of the 10 dimers (duplexes) formed by the four bases {A, G, C, T}, plus corrections for a starting base pair of GC or AT, an ending base pair of TA, and the symmetry of the DNA sequence. The following table shows the improved duplex parameters given by SantaLucia et al. in 1996.

[Table: improved nearest-neighbor duplex parameters (SantaLucia et al., 1996)]
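To make the summation concrete, here is a minimal Python sketch of a nearest-neighbor Tm calculation. For brevity it uses the unified DNA/DNA parameters published by SantaLucia in 1998 rather than the 1996 table above, and it omits salt correction; treat it as an illustrative sketch, not our laboratory method:

# Minimal nearest-neighbor Tm sketch (unified SantaLucia 1998 DNA/DNA parameters,
# used here for illustration; no salt correction).
import math

# dH (kcal/mol), dS (cal/(mol*K)) per nearest-neighbor dimer
NN = {"AA": (-7.9, -22.2), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
      "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
      "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
      "GG": (-8.0, -19.9)}
COMP = str.maketrans("ACGT", "TGCA")

def tm_nn(seq, conc=1e-4):
    """Tm in Celsius for a non-self-complementary duplex at total strand conc (M)."""
    dH, dS = 0.0, 0.0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair not in NN:  # look up the reverse complement of the dimer
            pair = pair.translate(COMP)[::-1]
        h, s = NN[pair]
        dH += h
        dS += s
    # terminal corrections: A/T end +2.3/+4.1, G/C end +0.1/-2.8
    for end in (seq[0], seq[-1]):
        if end in "AT":
            dH += 2.3; dS += 4.1
        else:
            dH += 0.1; dS += -2.8
    R = 1.987  # gas constant, cal/(mol*K)
    return dH * 1000 / (dS + R * math.log(conc / 4)) - 273.15

print(round(tm_nn("CGTTGA"), 1))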

2. Validation analysis of the DEA-SBM super-efficiency model for the nearest-neighbor method

We used a non-radial SBM model based on the DEA model to evaluate efficiency, taking the real melting temperatures measured in the laboratory as the dependent variable and the numbers of CG/GG/GA nearest-neighbor dimers as the independent variables. The model is constructed as follows:

Assume that the double-stranded nucleic acids form n decision-making units, each with m inputs, s1 desirable outputs, and s2 undesirable outputs, represented respectively by the vectors:

$$x_{j} \in \mathbb{R}^{m},\quad y_{j}^{g} \in \mathbb{R}^{s_{1}},\quad y_{j}^{b} \in \mathbb{R}^{s_{2}},\qquad j = 1, \dots, n$$

Define the matrices:

$$X = [x_{1}, \dots, x_{n}] \in \mathbb{R}^{m \times n},\quad Y^{g} = [y_{1}^{g}, \dots, y_{n}^{g}] \in \mathbb{R}^{s_{1} \times n},\quad Y^{b} = [y_{1}^{b}, \dots, y_{n}^{b}] \in \mathbb{R}^{s_{2} \times n}$$

Taking the slack variables into account, we further obtain:

$$\rho^{*} = \min \frac{1 - \frac{1}{m}\sum_{i=1}^{m} s_{i}^{-}/x_{i0}}{1 + \frac{1}{s_{1} + s_{2}}\left(\sum_{r=1}^{s_{1}} s_{r}^{g}/y_{r0}^{g} + \sum_{r=1}^{s_{2}} s_{r}^{b}/y_{r0}^{b}\right)}$$
$$\text{s.t.}\quad x_{0} = X\lambda + s^{-},\quad y_{0}^{g} = Y^{g}\lambda - s^{g},\quad y_{0}^{b} = Y^{b}\lambda + s^{b},\quad s^{-}, s^{g}, s^{b}, \lambda \ge 0$$
The output obtained by SBM is given in (SBM-output-data.pdf). The average efficiency is 0.987, indicating that the nearest-neighbor model predicts very well and that the melting temperature prediction is performed at very high efficiency.

3. Neural-network prediction of the nucleic acid melting temperature based on the nearest-neighbor model

1 Establishment of the fully connected neural network

As shown in Figure 1, we construct a four-layer fully connected neural network with 10 neurons in each hidden layer.

[Figure 1: Sketch of the fully connected neural network]

2 Activation function: Mish

Our focus is the relationship between base sequence and melting temperature. The separation of base pairs is not deterministic: the probability of strand separation increases with temperature, yet some base pairs separate with small probability before a specific temperature is reached. We therefore cannot use the conventional ReLU function for activation, because its dead zone below zero does not match the behavior we expect from the model, and we use the Mish function instead:

f(x) = x · tanh(ς(x))

where ς(x) = ln(1 + e^x) is the softplus function.

[Figure 2: Sketch of the Mish function]

We chose Mish considering its low cost and its various properties (e.g., smoothness, non-monotonicity, being unbounded above and bounded below), which improve its performance compared with other commonly used activations such as ReLU (rectified linear unit). The properties of Mish are described in detail below:

1. Unbounded above and bounded below: being unbounded above is an ideal property for any activation function because it avoids saturation, which sharply slows down training, thus speeding up the training process. Being bounded below helps achieve a strong regularization effect (fitting the model correctly). (The range of Mish is approximately [-0.31, ∞), similar in this respect to ReLU and Swish.)

2. Non-monotonic function: this property helps retain small negative values and thus stabilize the gradient flow of the network. The most commonly used activation, ReLU [f(x) = max(0, x)], cannot retain negative values because its derivative there is 0, so most of those neurons do not update.

3. Infinite continuity and smoothness: Mish is a smooth function, which improves results through better generalization and more effective optimization. In the figure, it can be seen that with random initialization of the neural network, the smoothness of the loss landscape differs dramatically between ReLU and Mish, whereas the landscapes of Swish and Mish are roughly similar.

4. High computational cost but better performance: Mish is expensive compared with ReLU, but in deep neural networks it gives better results than ReLU.

[Figure 3: Iteration comparison of ReLU, Mish, and Swish]

Reference code for a Python implementation with PyTorch:

import torch
import torch.nn as nn
import torch.nn.functional as F

def swish(x, beta=1):
    # Swish activation: x * sigmoid(beta * x)
    return x * torch.sigmoid(beta * x)

def mish(x):
    # Mish activation: x * tanh(softplus(x))
    return x * torch.tanh(F.softplus(x))

class Mish(nn.Module):
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

3 Loss function: Huber loss

After comparing mean squared error (MSE), mean absolute error (MAE), and Huber loss (smoothed absolute loss), the variance of the first two is high. Since our concern is the output melting temperature, which thermodynamics tells us changes slowly around its transition point rather than abruptly, we chose Huber loss. Huber loss is less sensitive to outliers in the data and is differentiable at 0. It is essentially an absolute-value loss that becomes quadratic when the error is small; how small an error must be to be squared depends on an adjustable hyperparameter δ.
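In standard form, with residual a = y_true - y_pred:

$$L_{\delta}(a) = \begin{cases} \frac{1}{2}a^{2}, & |a| \le \delta \\ \delta\left(|a| - \frac{1}{2}\delta\right), & |a| > \delta \end{cases}$$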

When δ → 0, the Huber loss tends to MAE; when δ → ∞, it tends to MSE.

The choice of δ is critical because it determines how you treat outliers. Residuals larger than δ are minimized with L1 (which is less sensitive to very large outliers), while residuals smaller than δ are minimized "appropriately" with L2. One big problem with using MAE to train neural networks is its constantly large gradient, which can cause minima to be missed at the end of gradient-descent training; with MSE, the gradient decreases gradually as the loss approaches its minimum, making the result more accurate. The Huber loss is really helpful in these cases, because it shrinks the gradient around the minimum and is more robust to outliers than MSE, thus combining the advantages of MSE and MAE. However, the Huber loss also has the problem that the hyperparameter δ may need to be trained, which requires constant iteration. Since the Tm we compute has a certain predicted mean value, we train δ so that predictions are weighted toward this mean value.

The Python code for this function, using NumPy, is:

import numpy as np

def huber(true, pred, delta):
    # quadratic for small errors, linear for large errors
    loss = np.where(np.abs(true - pred) < delta,
                    0.5 * (true - pred) ** 2,
                    delta * np.abs(true - pred) - 0.5 * delta ** 2)
    return np.sum(loss)

4 Regularization method: Dropout

One of the main challenges in training deep learning models is co-adaptation: neurons become interdependent, influence each other considerably, and are not sufficiently independent of their inputs. We also often find that some neurons acquire more predictive power than others; in other words, the model can become overly dependent on the output of individual neurons.

These effects must be avoided, and the weights must keep a certain distribution to prevent overfitting. The co-adaptation and excessive predictive power of certain neurons can be tuned by different regularization methods; one of the most commonly used is Dropout. The application of the Dropout method is illustrated schematically as follows:

[Figure 4: Dropout application scenario]

To prevent overfitting, neurons are randomly removed during the training phase. In a dense (fully connected) network, we assign each layer a dropout probability p, and in each iteration each neuron is removed with probability p. The paper by Hinton et al. suggests a dropout probability of p = 0.2 for the input layer and p = 0.5 for the hidden layers. Since we care about the output layer, which carries our prediction, we apply dropout only in the hidden layers and not in the input layer.

Code implementing the Dropout method in Python with PyTorch:

import torch
from torch import nn

def dropout_layer(X, dropout):
    # Zero each element with probability `dropout`, scale the rest by 1/(1-dropout)
    assert 0 <= dropout <= 1
    if dropout == 1:
        return torch.zeros_like(X)
    if dropout == 0:
        return X
    mask = (torch.rand(X.shape) > dropout).float()
    return mask * X / (1.0 - dropout)

dropout1, dropout2, dropout3, dropout4 = 0.2, 0.5, 0.6, 0.8  # test values

class Net(nn.Module):
    def __init__(self, num_inputs, num_outputs, num_hiddens1, num_hiddens2,
                 num_hiddens3, num_hiddens4, is_training=True):
        super().__init__()
        self.num_inputs = num_inputs
        self.training = is_training
        self.lin1 = nn.Linear(num_inputs, num_hiddens1)
        self.lin2 = nn.Linear(num_hiddens1, num_hiddens2)
        self.lin3 = nn.Linear(num_hiddens2, num_hiddens3)
        self.lin4 = nn.Linear(num_hiddens3, num_hiddens4)
        self.lin5 = nn.Linear(num_hiddens4, num_outputs)
        self.mish = nn.Mish()

    def forward(self, X):
        H1 = self.mish(self.lin1(X.reshape((-1, self.num_inputs))))
        if self.training:
            H1 = dropout_layer(H1, dropout1)
        H2 = self.mish(self.lin2(H1))
        if self.training:
            H2 = dropout_layer(H2, dropout2)
        H3 = self.mish(self.lin3(H2))
        if self.training:
            H3 = dropout_layer(H3, dropout3)
        H4 = self.mish(self.lin4(H3))
        if self.training:
            H4 = dropout_layer(H4, dropout4)
        return self.lin5(H4)

net = Net(num_inputs, num_outputs, num_hiddens1, num_hiddens2,
          num_hiddens3, num_hiddens4)
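Putting the pieces together, here is a minimal training-loop sketch; the tensors X_train and y_train are hypothetical stand-ins for our 72-sample Tm data set, and PyTorch's built-in nn.HuberLoss plays the role of the NumPy Huber function above:

# Minimal training sketch combining the network, Mish, Huber loss, and dropout.
# X_train / y_train below are hypothetical stand-ins, not our measured data.
import torch
from torch import nn

X_train = torch.randn(72, 3)           # e.g. counts of CG/GG/GA nearest-neighbor dimers
y_train = torch.randn(72, 1) * 5 + 60  # hypothetical Tm values (Celsius)

net = Net(num_inputs=3, num_outputs=1, num_hiddens1=10, num_hiddens2=10,
          num_hiddens3=10, num_hiddens4=10)
loss_fn = nn.HuberLoss(delta=1.0)      # delta is a tunable hyperparameter
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(500):
    optimizer.zero_grad()
    loss = loss_fn(net(X_train), y_train)
    loss.backward()
    optimizer.step()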

5 Training results

As shown in Form 1, training on a total of 72 data sets yielded an R value of 0.98483, which basically meets our prediction needs, and a standard deviation of 2.34277, giving a high degree of robustness in Tm judgment.

[Form 1: Training statistics]

The relationship between true and predicted values is shown in Figure 5.

[Figure 5: Fitted plot & regression plot]

As shown in Figure 6, the residuals of our network are relatively scattered, and the somewhat high-variance residuals in the low-value region may affect predictions for low-Tm data; this still needs to be improved in the future.

[Figure 6: Residual plot]

Here, we evaluated the nearest-neighbor model with DEA-SBM and implemented the prediction of the nucleic acid melting temperature with a neural network based on the nearest-neighbor model. Future teams can determine the parameters of PNA/DNA nearest-neighbor dimers such as CG/GG/GA in the laboratory, and then use the neural network to predict the optimal annealing temperature.

Annex 1 Network Factor Table

(Network-Factor-Table.pdf)

Annex 2 Test data set

(Test-data-set.pdf)

Annex 3 Dynamics model code

function [dy] = f_PNA(t,y,kr1,kr2,ka,da,ka2,da2)
    dy = zeros(3,1);
    dy(1) = -kr2*y(2) - kr1*y(1);
    dy(2) = -(ka+da)*y(2);
    dy(3) = -(ka2+da2)*y(3);
end
% y(1)=[PNA]; y(2)=[Target RNA]; y(3)=[Mutate RNA];

function [dx] = f_RNA1(t,x,k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on)
    dx = zeros(5,1);
    dx(1) = -k_Csm6_a*x(3).*x(1) + k_Csm6_da*x(2);
    dx(2) =  k_Csm6_a*x(3).*x(1) - k_Csm6_da*x(2);
    dx(3) = -k_Csm6_a*x(3).*x(1) + k_Csm6_da*x(2);
    dx(4) = -((k_ccs6mon*x(2).*x(4))./(k_mcsm6on+x(4))) - ((v_mcsm6off*x(5))./(k_mcsm6on+x(5)));
    dx(5) =  ((k_ccs6mon*x(2).*x(4))./(k_mcsm6on+x(4))) - ((v_mcsm6off*x(5))./(k_mcsm6on+x(5)));
end
% x(1)=[Csm6_off]; x(2)=[Csm6_on]; x(3)=[RNA]; x(4)=[CARF]; x(5)=[HEPN];
% RNA coefficient n = 1

function [dx] = f_RNA2(t,x2,k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on)
    dx = zeros(5,1);
    dx(1) = -k_Csm6_a*x2(3).^2.*x2(1) + k_Csm6_da*x2(2);
    dx(2) =  k_Csm6_a*x2(3).^2.*x2(1) - k_Csm6_da*x2(2);
    dx(3) = 2*(-k_Csm6_a*x2(3).^2.*x2(1) + k_Csm6_da*x2(2));
    dx(4) = -((k_ccs6mon*x2(2).*x2(4))./(k_mcsm6on+x2(4))) - ((v_mcsm6off*x2(5))./(k_mcsm6on+x2(5)));
    dx(5) =  ((k_ccs6mon*x2(2).*x2(4))./(k_mcsm6on+x2(4))) - ((v_mcsm6off*x2(5))./(k_mcsm6on+x2(5)));
end
% x(1)=[Csm6_off]; x(2)=[Csm6_on]; x(3)=[RNA]; x(4)=[CARF]; x(5)=[HEPN];
% RNA coefficient n = 2

function [dx] = f_RNA3(t,x3,k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on)
    dx = zeros(5,1);
    dx(1) = -k_Csm6_a*x3(3).^3.*x3(1) + k_Csm6_da*x3(2);
    dx(2) =  k_Csm6_a*x3(3).^3.*x3(1) - k_Csm6_da*x3(2);
    dx(3) = 3*(-k_Csm6_a*x3(3).^3.*x3(1) + k_Csm6_da*x3(2));
    dx(4) = -((k_ccs6mon*x3(2).*x3(4))./(k_mcsm6on+x3(4))) - ((v_mcsm6off*x3(5))./(k_mcsm6on+x3(5)));
    dx(5) =  ((k_ccs6mon*x3(2).*x3(4))./(k_mcsm6on+x3(4))) - ((v_mcsm6off*x3(5))./(k_mcsm6on+x3(5)));
end
% x(1)=[Csm6_off]; x(2)=[Csm6_on]; x(3)=[RNA]; x(4)=[CARF]; x(5)=[HEPN];
% RNA coefficient n = 3

% Driver script: Csm6 cascade for n = 1, 2, 3
clear;
tspan = [0,120];
x0 = [2.04667e-5; 0; 6.80008e-5; 2.40213e-5; 0]; % initial concentrations (4.30667e-5)
k_Csm6_a   = 1.0e9;
k_Csm6_da  = 2.6e-2;
k_ccs6mon  = 2.0e-2;
v_mcsm6off = 1.9547e-5;
k_mcsm6on  = 1.27e-3;
[t1,x1] = ode45(@f_RNA1,tspan,x0,[],k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on);
[t2,x2] = ode45(@f_RNA2,tspan,x0,[],k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on);
[t3,x3] = ode45(@f_RNA3,tspan,x0,[],k_Csm6_a,k_Csm6_da,k_ccs6mon,v_mcsm6off,k_mcsm6on);
plot(t1,x1(:,5),'r',t2,x2(:,5),'g',t3,x3(:,5),'b');
axis on;
xlabel('time(min)');
ylabel('Fluorescence');
legend('n=1','n=2','n=3')

% Driver script: PNA shielding kinetics
clear;
tspan = [0,30];
y0 = [2.11; 2.11; 2.11]; % initial concentrations
kr1 = 0.96;  kr2 = 0.001;
ka  = 0.004; da  = 0.0001;
ka2 = 0.106; da2 = 0.00001;
options = odeset('RelTol',1e-10,'AbsTol',[1e-10 1e-10 1e-10]);
[t,y] = ode45(@f_PNA,tspan,y0,[],kr1,kr2,ka,da,ka2,da2);
plot(t,y(:,1),'r',t,y(:,2),'g',t,y(:,3),'b');
axis on;
xlabel('time');
ylabel('concentration');
legend('PNA','Target RNA','Mutate RNA')
% y(1)=[PNA]; y(2)=[Target RNA]; y(3)=[Mutate RNA];

References

[1] Liu Wenbin, Zhu Xiangou, Liu Xiangrong. DNA melting temperature prediction model based on BP neural network [J]. Computer Engineering and Applications, 2006(10):1-4.

[2]Liu TY, Knott GJ, Smock DCJ, Desmarais JJ, Son S, Bhuiya A, Jakhanwal S, Prywes N, Agrawal S, Díaz de León Derby M, Switz NA, Armstrong M, Harris AR, Charles EJ, Thornton BW, Fozouni P, Shu J, Stephens SI, Kumar GR, Zhao C, Mok A, Iavarone AT, Escajeda AM, McIntosh R, Kim S, Dugan EJ; IGI Testing Consortium, Pollard KS, Tan MX, Ott M, Fletcher DA, Lareau LF, Hsu PD, Savage DF, Doudna JA. Accelerated RNA detection using tandem CRISPR nucleases. Nat Chem Biol. 2021 Sep;17(9):982-988. doi: 10.1038/s41589-021-00842-2. Epub 2021 Aug 5. Erratum in: Nat Chem Biol. 2021 Nov;17(11):1210. PMID: 34354262.

[3]Lopatkin AJ, Collins JJ. Predictive biology: modelling, understanding and harnessing microbial complexity. Nat Rev Microbiol. 2020 Sep;18(9):507-520. doi: 10.1038/s41579-020-0372-5. Epub 2020 May 29. PMID: 32472051.