Background

In the 2022 BIT project, we decided to identify the pain type of patients as neuropathic pain (Np) or nociceptive pain (No) by detecting the concentration of two miRNAs in pain patients and combining the classification model of machine learning. In the whole project, how to select appropriate markers from a large number of miRNAs for classification is a problem that needs to be solved first. In addition, in the whole miRNA detection process, because the concentration of miRNAs is extremely low, in addition to the signal amplification method (HCR), miRNAs also need to be processed by CRISPR to improve the specificity of detection, The CRISPR process applied to miRNAs also lacks a suitable model to describe the reaction process and guide experimental optimization. Because the field of pain & miRNA is innovative, there is almost no research on the correlation between pain and miRNA, so we need to make more use of existing models to develop our own model, and use it to guide our experimental assessment model.

At the beginning of the project, we decided to select appropriate miRNAs as the detection biomarker based on the literature related to pain and miRNA, so we further optimized the miRNA screening model from the literature screening process. After selecting miRNAs, we need to design the optimal HCR primer and optimize CRISPR experimental conditions. We used NUPACK nucleic acid meter analysis software to optimize the sequence and concentration ratio of the primer chain designed in the HCR reaction. After specific values are substituted into the software, the optimal values of each parameter are obtained by controlling variables. When we use experiments for comparison, if the experiment results agree well, we can additionally consider using this model to guide the optimization of experimental parameters, and supplement some extreme situation data that cannot be achieved by actual experiments through the model. The modeling work focuses on describing the CRISPR process and establishing an evaluation model by using differential equations and the associated enzyme kinetics model. Based on the existing Michaelis equation evaluation model in the literature and combining differential equations to describe the CRISPR process, the CRISPR differential equation model can provide optimization guidance for the experiment, and the reaction rate that affects the model can be obtained by solving the equation, According to the variables that affect the reaction rate, a series of optimal experimental guidance can be derived. In addition, the modeling also established a classification model, which can judge the pain type of patients through the miRNA concentration input.

miRNA Model

——Screening and Classification

Screening

Fig.1 miRNA Screening process

Since literature and works that directly figure out the relationship between miRNAs and pain are relatively avant-garde, and there’s a lack of data and experiments to support our extensive miRNA screening, that is, we cannot screen pain-related miRNAs from all known miRNA populations, but based on the work of Dayer et al[1], We decided to inherit from their completed screening process and develop last two steps of our own to obtain the best two types of miRNAs for us, and apply them to the pain classification model. In the classification model, a machine learning classification algorithm is used. The concentrations of two miRNAs are put into the model respectively, and results are output to determine the type of pain of patients. Finally, we use our self-designed algorithm to give the final type of pain: No/Np;

Dayer et al. targeted 184 miRNAs with low CQ numbers and characterized the potential detection limit of various miRNAs with the CQ value in the PCR process. After screening these miRNAs, 12 kinds of miRNAs that can classify pain were obtained. Based on these 12 miRNAs and 119 detectable miRNAs in the blood[2], we used a variety of machine learning algorithms to verify and screen. The accuracy of classification using the binary regression model in the original text can reach 70%.

Fig.2 The classification results of 14 miRNAs on the training data set, 0 and 1 represent No and Np respectively. It is obvious that all miRNAs have a significant overlap in pain classification, which means that the classification results are unreliable, which is consistent with the positive rate of about 70% in the literature[1]

Taking different concentrations of each miRNA as classification thresholds into the ROC model to calculate its classification sensitivity and specificity, we can get the ROC curve varying with the threshold concentration, and the area under the ROC curve (AUC) can characterize the ability of miRNA as a classification marker.

The AUC value range is 0 ≤ AUC ≤ 1. When the AUC is greater than 0.5, the closer A is to 1, the higher the diagnostic accuracy; When AUC=0.5, the diagnosis is completely ineffective; AUC<0.5 is not in line with the actual situation. Generally, 0.5<AUC ≤ 0.7 indicates a low diagnostic value, 0.7<AUC ≤ 0.9 indicates a medium diagnostic value, and AUC>0.9 indicates a high diagnostic value.

Fig.3 A. By drawing ROC curves of 14 miRNAs; B. Then calculate AUC on the ROC curve, and it can be seen that the AUC of hsa-miR-98-5p and hsa-miR-7d-5p is the largest, with the highest diagnostic value[3]

Figure.4 In addition to the ROC curve, we also used logistic regression, random forest, chi-square test, ANOVA, and other methods to evaluate the importance of the 14 miRNAs mentioned above. In the figure, we used logistic regression to get the most important miRNA 9, namely hsa-miR-95-5p, and the second most important miRNA 1, hsa-miR-7d-5p. Although the most important miRNAs obtained from different algorithms are different, the results of several methods show that hsa-miR-98-5p and hsa-miR-7d-5p are the top two in importance.

Finally, based on our screening model, we determined hsa-miR-98-5p and hsa-miR-7d-5p as biomarkers. The following pain classification models based on hsa-miR-98-5p and hsa-miR-7d-5p are constructed through the decision tree:

Classification

It can be seen from the literature that since the classification accuracy of the binary regression model can only reach 70%, we have considered other classification models. Because the decision tree can classify under the premise of unknown data characteristics, we have chosen the decision tree. Therefore, combined with the data provided by the literature, we have reconsidered the number of input signals of the model, and separately considered single, two, and three miRNA input models for classification, It was comprehensively judged that the optimal miRNA was used as a pain classification marker. The classification results showed that the classification effect of the multi-miRNA input model was not significantly better than that of single miRNA input. In addition, considering that hsa-miR-98-5p and hsa-miR-7d-5p were ranked in the top two in terms of importance during screening, and the single miRNA input model had a high uncertainty, we chose to input the two miRNAs separately to get the results. The final classification result is obtained by combining the reliability (importance) of the two markers.

The basic principles of the decision tree: by establishing an attribute tree, each branch of the decision tree can represent a feature of the data. The decision tree model increases the classification basis by continuously adding branches. When the decision tree has enough branches, the training data can be completely classified. When dealing with unknown data, the trained decision tree model will judge the attribute division of input data from the initial data node, and finally divide it into leaf nodes and get the classification results.

Parameter name	Value
Training time	0.003s
Data segmentation	0.7
Data shuffle	Yes
Cross validation	No
Node splitting evaluation criteria	gini
Feature division point selection standard	best
The maximum feature scale considered in the partition	None
Minimum number of samples split by internal node	2
Minimum sample number of leaf node	1
Minimum weight of samples in leaf node	0
Maximum number of leaf nodes	3
Maximum depth of tree	3
Threshold value of node division impurity	0

Table 1. Decision tree model obtained by training literature data. The advantage of the decision tree model is that it does not need to predict any characteristics of the data, but summarizes the characteristics of the data through the training model. As the classification basis of the model, the trained decision tree model will judge the characteristics of the input data and give the classification result .

Fig.5 A. The structure example of a simple decision tree. Each node represents a feature of the dataset. The following nodes can partition the dataset into features and finally divide the data into leaf nodes. The leaf nodes must correspond to a classification result. In our classification model, the leaf nodes only correspond to No or Np; B. Based on the results of the screening model, we decided to use the concentrations of hsa-miR-98-5p and hsa-miR-7d-5p as the training set to generate a decision tree model. However, for our model, the importance of the two types of miRNAs for pain classification is not equal, which means that we need to calculate and quantify the importance of the two, as shown in the figure, hsa-miR-98-5p is far more important than hsa-miR-7d-5p in classification

Fig.6 The coordinates in the figure are (1,1) and (2,2) data and the total number of data indicating correct classification. According to the calculation method of classification accuracy, the accuracy rate can be about 77%

After the construction of the decision tree model, we need to evaluate the model. For supervised machine learning, sensitivity and specificity are classic indicators for evaluating the classification criteria, while for the decision tree model, we can show the classification effect of the model in the form of a confusion matrix heatmap (Figure 5); In addition, by comparing the classification performance of training set and test set, combined with the model parameters obtained (the deepest is only 3 layers), it can be seen that about 70% of the accuracy rate of the model implementation is mainly due to insufficient data volume, resulting in insufficient feature acquisition

	Accuracy	recall	precision	F1
Training set	0.743	0.743	0.758	0.724
Test set	0.767	0.767	0.768	0.759

Table 2. The above table shows the classification evaluation indicators of the training set and test set, and measures the classification effect of the decision tree on training and test data through quantitative indicators; a. Accuracy: predict the proportion of correct samples in the total samples. The greater the accuracy, the better; b. Recall rate: among the results that are actually positive samples, the predicted proportion is positive samples. The larger the recall rate, the better;c. Accuracy rate: the predicted results are positive samples, and the actual results are positive samples. The greater the accuracy rate, the better; d. F1: The harmonic average of the accuracy rate and the recall rate. The accuracy rate and the recall rate affect each other. Although both are high, it is an expected ideal situation. However, in practice, the accuracy rate is often high and the recall rate is low, or the recall rate is low but the accuracy rate is high. If you need to consider both, you can use the F1 indicator.

Results and Discussions

Through the construction of a screening and classification model, we found the most suitable miRNAs for pain classification. Our project took two miRNAs (hsa-miR-98-5p and hsa-miR-7d-5p) obtained from the screening model as biomarkers. The two types of miRNAs will be used as the input of the HCR+CRISPR reaction system, with the goal of blood miRNA detection. We collect and process blood in the detection system equipped with hardware, and use the HCR+CRISPR system to detect the fluorescence intensity of two miRNAs, convert the fluorescence intensity into the concentration value of miRNAs, and put it into the classification model to judge the patient's pain type.

NUPACK Model

——Design and Analyze

Purpose

In order to ensure the smooth progress of the HCR reaction, we used the NUPACK webpage to guide the sequence design of the probe before the experiment, so as to reduce the number of trials and the experiment cost and promote the experiment.

Design

nupack

NUPACK is a nucleic acid analysis and design software designed and developed by the University of California, which mainly has two functions： analysis and design. It can be run locally by downloading the source code, or online on the NUPACK website. The NUPACK webpage has a powerful nucleic acid sequence analysis function: after presetting the nucleic acid type and reaction temperature of the analyte, by placing the base sequence of each reactant (such as the probe, miRNA), and setting the concentration of each reactant and the maximum number of holding chains, press the analysis button, and you can get the type of all reaction products under the optimal thermodynamic conditions. Click on each product to observe the sequence structure, minimum free energy, base pairing heat map, etc.

Fig.1 Working processes and user guide for NUPACK

In the HCR experiment, the base type, stem-loop length, base mismatch, and other factors of the H1 sequence will all affect the degree of reaction and binding. Therefore, before the formal experiment, we put the different H1 sequences designed into the NUPACK webpage for simulation to guide the following experiment.

Fig.2 procedures for us to utilize NUPACK optimize our experiment

Guide the experiment

The structure of the H1 sequence is one of the important factors affecting the HCR experiment, so we use the NUPACK web page to optimize and improve the H1 sequence, which is embodied in three aspects: H1 base sequence, probe base mismatch, and H1 stem-loop length.

[Base sequence]

Since the H1 probe reaction effect was poor at the beginning of the experiment, we optimized the base sequence of H1 and evaluated the optimization results with NUPACK.

H1
Original H1:AGTCTAGGAAACTGCGTGGGTTAAAACAATACAACTTACTACCTCATTAACCCA
Optimized H1:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAC

Fig.3 Equilibrium concentration plot of NUPACK system before and after optimization

It can be seen from the NUPACK web simulation that the system with optimized base sequence has stronger reaction amplification and higher concentration of target product. At the same time, there are fewer false positives compared to the old system. We applied the optimized H1 sequence to subsequent HCR experiments and designed experiments to verify the correctness of this selection.

Fig.4 H1 base sequence optimization experimental comparison plot

From (2)(3) and (5)(6), it can be seen that the corresponding bands of the optimized H1 sequence are darker in color and longer in length, so the reaction is easier to carry out and the degree of reaction is higher. Due to the experimental operation and the low content of false-positive products, the reduction of false positives in the new system is not significantly reflected in the left panel, but we can still conclude that the optimized H1 is more conducive to the HCR reaction, which proves the correctness of the NUPACK-guided experiment.

[Probe base pair mismatch]

In our references, the probes used in HCR experiments are designed with base mismatches, so we also designed base mismatches for each probe sequence, and used NUPACK to simulate the reaction after each probe mismatch to see if the base mismatch is conducive to HCR experiments. The base mismatch sequence is designed as follows:

H1 base mismatch sequence
Original H1:ATCAGACGAAATGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAC
[1]ATCAGACGAAATGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAG
[2]ATCAGACGAAATGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAA
[3]ATCAGACGAAATGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAT

H2 base mismatch sequence
Original H2: ACTTTGTCAACATCTTTCAGTCTGATCAGTGTATCAGACTGAAAGATGTTGA
[1]ACTTTGTCAACATCATTTCGTCTGATCAGTGTATCAGACGAAATGATGTTGT
[2]ACTTTGTCAACATCATTTCGTCTGATCAGTGTATCAGACGAAATGATGTTGG
[3]ACTTTGTCAACATCATTTCGTCTGATCAGTGTATCAGACGAAATGATGTTGC

H3 base mismatch sequence
Original H3: ATCAGACTGAAAGATGTTGACAAAGTTCAACATCTTTCAGTCTGATACACTG
[1]TTCAGACGAAATGATGTTGACAAAGTTCAACATCATTTCGTCTGATACACTG
[2]GTCAGACGAAATGATGTTGACAAAGTTCAACATCATTTCGTCTGATACACTG
[3]CTCAGACGAAATGATGTTGACAAAGTTCAACATCATTTCGTCTGATACACTG

Due to the excessive mismatch of each probe, we hereby excerpt some of the NUPACK simulation results (H1[1], H2[2], H3[1]) for demonstration:

Fig.5 Original equilibrium concentration plot of the system

Fig.6 A. In the case of H1[1] base mismatch, the target product concentration decreases, and the false-positive product concentration increases, so the impact of base mismatch on HCR experiments is negative.

Fig.7 B. In the case of H2[2] base mismatch, the probe cannot be paired properly and cannot be extended, and the HCR reaction cannot be carried out normally.

Fig.8 C. In the case of H3[1] base mismatching, the probe cannot be paired properly and cannot be extended, and the HCR reaction cannot be carried out normally.

Based on the above NUPACK modeling results, we can conclude that base mismatch is not conducive to the HCR reaction, so in the biological experiments, we do not use base mismatch probes, which greatly reduces the number and cost of the experiments, effectively guiding the experiment.

[Stem-loop length]

In the previous preliminary modeling of NUPACK, we optimized the base sequence of H1, but the stem-loop length of H1 is also an important factor affecting the HCR experiment, so we continue to use NUPACK to build a model to optimize the stem-loop length of H1.

We designed H1 with different stem-loop lengths, namely:

H1 with different lengths
H1-98-7:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGT
H1-98-9:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCA
H1-98-11:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAAC
H1-98-13:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAACA
H1-98-15:ATCAGACTGAAAGATGTTGACAAAGTAACAATACAACTTACTACCTCAACTTTGTCAACA

Put the above sequence on the NUPACK webpage for simulation, and the simulation result is as follows:

Fig.9 the concentration of the target product in the H1 simulation system of each stem-loop length

From figure 9 we can observe that when the stem-loop length is 11, the target product concentration is the highest and the system amplification is the strongest. The left column shows the concentration of the target product in the simulated system of each H1 stem-loop length. It can be observed that when the stem-loop length is 11, the target product concentration is the highest and the system amplification is the strongest. The right column shows the concentration of false-positive products in the simulated system of each H1 stem-loop length. It can be observed that when the stem-loop length is 11, the concentration of false-positive products is the lowest. Therefore, from the NUPACK simulation, we can preliminarily conclude that when the length of the H1 probe stem-loop is 11, the experiment system has the best reaction effect.

After reaching this conclusion in the NUPACK simulation, we began to conduct biological experiments.

Fig.10 the biological experiments completed under H1 probes with different stem-loop lengths, of which 2, 4, 6, 8 and 10 are negative controls, and 3, 5, 7, 9 and 11 are H1 probe systems with stem-loop lengths 7, 9, 11, 13 and 15, respectively.

Tips： NC = Negtive Control

From the bands, we can see that the slot under the H1-11 system has the lowest brightness and the longest overall band, which proves that the H1-11 system has the most complete reaction and the strongest amplification, which is consistent with the NUPACK modeling results.

Results and Discussions

The above is the probe optimization design modeling work we completed using NUPACK. Through NUPACK modeling work, we can conclude that the HCR reaction is best facilitated by using an H1 probe with a stem-loop length of 11 and without base mismatching of the probe.

Overall, due to the enormous impact of H1 probes on HCR experiments, NUPACK played a pivotal role in probe design and experiment guidance.

CRISPR Model

——Evaluation and Optimization

Introduction

Under the additional condition that the reporter gene concentration in CRISPR experiments is generally much lower than the Michaelis-Menten constant, the classical Michaelis-Menten equation is modified to obtain the analytical solution of the reaction rate as a function of time. Based on the analytical solution, we proposed three evaluation indexes to evaluate the self-consistency and correctness of CRISPR experiments and proposed a method to optimize the experimental conditions. In the end, we propose other applications of the model.

Notations

Symbols	Implications
E	Enzyme complexes activated by target molecules
S	Substrate concentration
C	Intermediate reactant
P	Cleaved reporter molecule
$k_{f}$	Positive reaction coefficient
$k_{r}$	Reverse reaction coefficient
$k_{c a t}$	Catalytic constant of enzyme
$E_{0}$	Amount of enzyme activated by target molecule
$S_{0}$	Initial amount of substrate
t	A quantity that depends on time
$t_{l i n}$	Duration of linear increase of fluorescence with time
$V_{m a x}$	Theoretical maximum value of reaction rate
$K_{M}$	the Michaelis-Menten constant of enzyme

Model building

Model Derivation

In the CRISPR system adopted in this project, crRNA can specifically recognize the amplified sequence of target miRNAs and then activate the side-chain cleavage activity of the Cas12a protein. The Cas12a protein can specifically recognize and cleave fluorescent probes, thus realizing specific signal detection and transduction.

Hypothesis

- [H1] The initial enzyme concentration $E_{0}$ is approximately equal to the concentration of the target nucleic acid, that is, all the target molecules have reacted with a large number of available enzyme molecules;

- [H2] According to the quasi-steady-state hypothesis, the concentration of the intermediate almost does not change.

- [H3] In the classical Michaelis-Menten dynamical system $E_{0}$ <<S, therefore, at the beginning of the reaction [P]<< $S_{0}$ ,[S]≈ $S_{0}$ ;

- [H4] In most CRISPR experiments [S]<< $K_{M}$ .

Building

E + S \overset{k_{f}, k_{r}}{⟷} C \overset{k_{cat}}{⟶} E + P

The trans-cleavage process can be modeled as above.

Where E represents the enzyme complex activated by the target molecule, S represents the uncleaved reporter gene, C represents the intermediate reactant, and P represents the cleaved reporter gene.

To establish a system of ordinary differential equations:

\begin{matrix} \frac{d [E]}{d t} = - k_{f} [E] [S] + k_{r} [C] + k_{cat} [C] (1) \\ \frac{d [S]}{d t} = - k_{f} [E] [S] + k_{r} [C] (2) \\ \frac{d [C]}{d t} = k_{f} [E] [S] - k_{r} [C] - k_{cat} [C] (3) \\ \frac{d [P]}{d t} = k_{cat} [C] (4) \end{matrix}

Initial value condition:

\begin{array}{l} [E] (0) = E_{0} (5) \\ [S] (0) = S_{0} (6) \\ [C] (0) = 0 (7) \\ [P] (0) = 0 (8) \end{array}

Based on the conditions of conservation of enzyme and substrate, the following constraints hold:

\begin{matrix} [E] (t) + [C] (t) = E_{0} (9) \\ [C] (t) + [S] (t) + [P] (t) = S_{0} (10) \end{matrix}

From assumption H2, we can write:

\frac{d [C]}{d t} \approx 0 \to [C] = \frac{k_{f} [E] [S]}{k_{r} + k_{c a t}} (11)

Combined with hypothesis H3, equations (4), (9), and (11) are simultaneously solved, and the product generation rate is controlled by the Michaelis-Menten equation:

v = \frac{d [P]}{d t} = k_{cat} E_{0} \frac{[S]}{K_{M} + [S]} (12)

By combining hypothesis H4 and Equation (10), the equation is improved:

\frac{d [P]}{d t} \approx \frac{k_{cat} E_{0}}{K_{M}} (S_{0} - [P]) (13)

Solving this differential equation, we can get:

\frac{[P] (t)}{S_{0}} \approx (1 - \exp (- \frac{t}{τ})) (14)

Where

τ = K_{M} / k_{cat} E_{0}

Obviously, τ decreases with increasing $k_{c a t}$ , increasing $E_{0}$ and decreasing $K_{M}$ , and is closely related to the value of $k_{c a t} / K_{M}$ .

Parameter calculation

In Equation (14), there are three parameters, $K_{M}$ , $k_{c a t}$ , and $E_{0}$ , which can be approximately replaced by the concentration of target nucleic acid according to hypothesis H1.

For the same enzyme, multiple sets of fluorescence curves were obtained by changing the concentration of the reporter gene. The initial linear part of $t_{l i n}$ was used to calculate the initial velocity of fluorescence generation ( $t_{l i n}$ should not be too large to avoid nonlinear reaction), and then the initial reaction velocity was obtained by converting fluorescence units to molecular concentration.

Equivalently, equation (12) is transformed into:

\frac{1}{V} = \frac{K_{M} + [S]}{k_{c a t} E_{0} [S]} = \frac{K_{M}}{k_{c a t} E_{0} [S]} + \frac{1}{k_{c a t} E_{0}} (15)

For the concentration and initial velocity of multiple reporter genes, the formula was used to carry out the least square method to fit, that is, to obtain the $k_{c a t}$ and $K_{M}$ of the enzyme. The flow chart of $k_{c a t}$ and $K_{M}$ calculation is as follows:

Results and Discussions

Experiment Evaluation

Three indicators are proposed to evaluate the self-consistency and accuracy of experimental data:

a) The amount of product produced during the initial linear portion of $t_{l i n}$ does not exceed the initial amount of substrate:

α = \frac{v t_{lin}}{S_{0}} < 1

b) According to the Mie equation, the maximum rate of product generation is reached when [ $S$ ]>> $K_{M}$ , and the corresponding $V_{m a x}$ is equal to $k_{c a t}$ $E_{0}$ :

β = \frac{v}{v_{max}} = \frac{v}{k_{cat} E_{0}} \leq 1

c) $t_{l i n}$ and τ are of the same order of magnitude:

γ = \frac{t_{lin}}{τ} = \frac{t_{lin} k_{cat} E_{0}}{K_{M}} \leq O (1)

Rules 1 and 2 are required for all data, and an additional rule 3 is required for the reaction whose $[S]$ << $K_{M}$ .

Experiment improvement

According to Equation (14), the higher the value of $k_{c a t} / K_{M}$ , the faster the product formation rate, which means the better working conditions of the nuclease. Therefore, the gradient of reaction temperature and reaction pH can be set, and the reaction condition with the maximum ratio can be selected as the best reaction condition for the enzyme.

Experimental prediction

In Equation (14), t/τ is the horizontal axis, and $P / S_{0}$ is the vertical axis, as shown below:

Image of the function using normalized variables

Since normalized variables were used in both horizontal and vertical axes, the curve (y=1-exp (-x)) would be satisfied for the product changes of any reaction. Therefore, the reaction time of each experiment could be predicted. For example, when t=τ, almost 63% of the reporter genes were involved in the reaction.

Conclusion

In order to solve the problem of markers for detection items, we established a screening model. Based on 119 miRNAs found in blood by predecessors and 12 of them clinically found to be associated with pain, we used a variety of validation models to prove that hsa-miR-98-5p and hsa-miR-7d-5p are our ideal pain markers. From the logic of marker detection classification, we also need to develop a classification model to transform the miRNA concentration values obtained from experiments. According to the current research status of miRNA&pain, we decided to use the decision tree model to train classification, and found that more clinical research investment is needed to promote data support.

In order to optimize our detection process, NUPACK should be used to guide the optimization of HCR reaction and mathematical model should be used to guide CRISPR optimization in biological part design. First of all, we used NUPACK to design the probe H1, H2, H3 base sequence in HCR reaction, and repeatedly used NUPACK to guide and optimize the structure and length of HCR process. For CRISPR process, we hope to provide a criterion for the success of the experiment with the help of mathematical models, develop a differential equation model from the algebraic equation model and form evaluation parameters, in which the process also provides us with reference values for key coefficients such as substrate concentration and enzyme concentration.

Dayer, Camille Florine, et al. "Differences in the miRNA signatures of chronic musculoskeletal pain patients from neuropathic or nociceptive origins." PloS one 14.7 (2019): e0219311.
Blondal, Thorarinn, et al. "Assessing sample and miRNA profile quality in serum and plasma or other biofluids." Methods 59.1 (2013): S1-S6.
Qin Liao, Zhifeng Hao, and Zhihong Chen. Data Mining and Mathematical Modeling. Guo fang gong ye chu ban she, 2010.
Nouri, Reza, et al. "Figure of Merit for CRISPR-Based Nucleic Acid-Sensing Systems: Improvement Strategies and Performance Comparison." ACS sensors 7.3 (2022): 900-911.
Nelson, David L., Albert L. Lehninger, and Michael M. Cox. Lehninger principles of biochemistry. Macmillan, 2008.
Ramachandran, Ashwin, and Juan G. Santiago. "CRISPR enzyme kinetics for molecular diagnostics." Analytical Chemistry 93.20 (2021): 7456-7464.
Chen, William W., Mario Niepel, and Peter K. Sorger. "Classic and contemporary approaches to modeling biochemical reactions." Genes & development 24.17

# Background

# miRNA Model

# Screening

# Classification

# Results and Discussions

# NUPACK Model

# Purpose

# Design

# Guide the experiment

# [Base sequence]

# [Probe base pair mismatch]

# [Stem-loop length]

# Results and Discussions

# CRISPR Model

# Introduction

# Model building

# Model Derivation

# Hypothesis

# Building

# Parameter calculation

# Results and Discussions

# Experiment Evaluation

# Experiment improvement

# Experimental prediction

# Conclusion

Background

miRNA Model

Screening

Classification

Results and Discussions

NUPACK Model

Purpose

Design

Guide the experiment

[Base sequence]

[Probe base pair mismatch]

[Stem-loop length]

Results and Discussions

CRISPR Model

Introduction

Model building

Model Derivation

Hypothesis

Building

Parameter calculation

Results and Discussions

Experiment Evaluation

Experiment improvement

Experimental prediction

Conclusion