Model

Introduction

    Nano luciferase (NanoLuc), introduced by Promega in 2012, is a new member of the luciferase reporter gene/protein family. It converts furimazine (FMZ) and oxygen into furimamide and carbon dioxide, generating a glow-type signal as output (Fig. 1). NanoLuc has a smaller molecular weight (19.1 kDa), a brighter signal (about 150-fold brighter than conventional luciferases), a longer signal half-life (about 120 min), and a higher signal-to-noise ratio (1). These advantages give NanoLuc excellent application prospects in bioluminescence imaging, in exploring gene regulation and cell signaling, and in investigating protein-protein interactions. For the same reasons, NanoLuc is chosen as one of the output signals of the Ribozyme-Enabled Detection of RNA (RENDR) system in the detection part of our project.
Fig. 1 Reaction mechanism of Nano Luciferase.
    According to the standard protocol provided by Promega, a mixed reagent containing Nano-Glo® Luciferase Assay Buffer and Nano-Glo® Luciferase Assay Substrate is added to the test system. After the light intensity of the sample is measured, the NanoLuc concentration in the test system can be calculated from a calibration curve. The standard protocol guarantees strong linearity of luminescence for NanoLuc concentrations between $10^{-2}$ and $10^{4}$ pM, so dilution is required during experiments (Fig. 2). Although some researchers worry about the non-linearity that may occur at higher NanoLuc concentrations, such a situation is unlikely because of the saturation limit of common laboratory luminometers. However, the light detectors used in laboratory luminometers are mostly single-photon detectors such as photomultiplier tubes (PMTs) or avalanche photodiodes (APDs), which require a high-voltage configuration that is unrealistic for low-cost devices designed for our target environment. Instead, cameras and photoresistors are the light detectors most used in low-cost devices. Because both have a higher limit of detection (LOD) than PMTs and APDs, a higher NanoLuc concentration is required. Another shortcoming of a low-cost device is that the operation of adding reagents to the test sample is error-prone. The reaction between NanoLuc and its substrate in our system is swift, and the light intensity decays rapidly, so measuring the transient peak luminescence requires a luminometer equipped with an injector. Since our device users are most likely not laboratory-trained, the situation is even more serious. All in all, the standard protocol's calibration-curve method is unsuitable for our project's hardware design, and a new method for estimating NanoLuc concentrations has to be built.
Fig. 2 Linear dynamic range of NanoLuc.
    The original intention of this model is to mathematically describe the relationship between the luminescence decay kinetics recorded by the hardware and the NanoLuc concentration, so that it can be used to estimate the likelihood of acute hepatopancreatic necrosis disease (AHPND) in shrimp, the pursuit of the RENDR system. Such a modeling process is essential for implementing the OMEGA project and its measurement.

Model Derivation

    Before starting modeling, we first describe the new protocol for measuring NanoLuc concentrations and how the hardware collects the luminescence decay kinetics of a test sample (Fig. 3). Assume a sample (such as a reaction-finished cell-free system or purified NanoLuc) is waiting for measurement. The sample is mixed with the Nano-Glo® Luciferase Assay Substrate (FMZ) on a paper chip placed in the monitoring device, and the camera in the upper cover records the whole luminescence decay once the inside of the device is dark (Fig. 3a). The recorded video is then processed in the software part using the back-end algorithms for paper-chip positioning (Fig. 3b) and data filtering (Fig. 3c). After this processing, the output data is ready for modeling and analysis (Fig. 3d). Notice that our new protocol intentionally omits the Nano-Glo® Luciferase Assay Buffer to keep the luminescence recordable by a low-cost camera, which also shortens the luminescence decay for faster detection.
Fig. 3 The process of data collection. a The monitoring device captures the luminescence decay kinetics using a camera. b The paper-chip positioning algorithm determines the position of the samples. c The filter reduces the noise in the collected data. d Data modeling and analysis mine the information hidden in the data.
    To derive a model explaining the relationship between the luminescence decay kinetics and the NanoLuc concentration, Video A was created as the learning data using purified NanoLuc (147.12 μg/mL, i.e., 7.7 μM NanoLuc) at three relative concentrations (100%, 50%, and 10%). Each sample in Video A contains 0.2 μL FMZ and 10 μL NanoLuc solution of the defined concentration. Video A is attached as follows.
The video is a GIF; you may need to refresh the page to view it.
    The data extracted using the back-end algorithm in the software is shown as follows (Fig. 4). The data in Video A is denoted ${D_A} = \{ {t_i},{x_i},{u_i}\}, i = 1,2, \cdots, n$, where ${t_i}$ represents the time, ${x_i}$ denotes the NanoLuc concentration, and ${u_i}$ is the light intensity at ${t_i}$.
Fig. 4 The data extracted from Video A.
    Here we pay attention to two critical features in the luminescence decay kinetics data:
    i. The data collected by the hardware is scarce and noisy. Deriving governing equations from such imperfect data is not easy, especially for a system that has not been studied before. Given that, obtaining physical laws from rigorous first principles seems impossible.
    ii. Changing the NanoLuc concentration changes the initial light intensity, showing a strong positive correlation. Meanwhile, the luminescence decay time (in fact, the time until the luminescence falls below the camera's LOD) is similar across groups, which indicates that the decay speed is related to the NanoLuc concentration.
    Considering the scarce and noisy data, data-driven equation discovery was adopted to analyze the data in Video A. Our strategy is based on work published in Nature Communications, which developed PINN-SR (physics-informed neural networks with sparse regression), a method with salient interpretability and generalizability for discovering the governing PDEs of nonlinear spatiotemporal systems from scarce and noisy data. However, PINN-SR requires pre-specifying a library of candidate PDE terms for later deletion, which restricts its application to free-form governing-equation discovery (2). Our new strategy (PINN-GP-SR) keeps almost the same implementation as PINN-SR but introduces genetic programming (GP) to generate a candidate library in the first rounds (Fig. 5). GP is itself a method for mining physical laws from data (3): it encodes a mathematical formula as a tree structure so that different trees can evolve, exchanging sub-trees or mutating themselves, to generate varied trees for evaluation. GP easily overfits the dataset, which makes it convenient for producing a large library for PINN-SR. In turn, the Occam's-razor sparsity of SR can later shrink the large library to the correct size, and PINN can embed the physical law described by the governing equation into the data-learning process.
Fig. 5 The new strategy (PINN-GP-SR) discovers the free-form governing equation.
    Let us now use PINN-GP-SR to derive the governing equation of the luminescence decay.
    (1) A neural network is trained as a universal function approximator of the luminescence decay process. Since we have found that the decay speed is related to the NanoLuc concentration, a neural network is employed to predict the light intensity from the time and the NanoLuc concentration, written as $u = NN1\left( {x,t;\theta } \right)$, where $\theta$ denotes the parameters learned from the data. Video A was recorded at a 30 fps camera frame rate, so each of the three traces has 5484 points. All the data points are normalized and split into a training set (70% of the data) and a test set (30%). NN1 has eight hidden layers with 100 nodes each and a ReLU activation function in each layer (Fig. 6a). $\theta$ is learned from the training set with the Adam algorithm, with the mean squared error (MSE) as the loss function: $$MSE_u = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left[ {{u_i} - NN\left( {{x_i},{t_i};\theta } \right)} \right]}^2}}$$ The best network is evaluated and selected using the same metric on the test set. NN1 converged with good performance on both the training set (MSE: 2.59e-05, R2: 0.99905) and the test set (MSE: 2.70e-05, R2: 0.99904) after 22503 training steps (Fig. 6b, c). The converged NN1 then serves as the GP input for constructing the governing equation in the next step.
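The wiki does not publish the network code, so as a rough illustration, here is a minimal NumPy sketch of the NN1 architecture described above (inputs $(x, t)$, eight hidden ReLU layers of 100 nodes, scalar output $u$) together with the $MSE_u$ loss and a 70/30 split. The initialization scheme, the toy data, and the split logic are our own assumptions; the actual NN1 is trained with Adam, which is omitted here.

```python
import numpy as np

def init_nn1(seed=0, n_hidden=8, width=100):
    """He-initialized weights for NN1: (x, t) -> 8 ReLU layers x 100 nodes -> u."""
    rng = np.random.default_rng(seed)
    sizes = [2] + [width] * n_hidden + [1]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def nn1(params, x, t):
    """Forward pass u = NN1(x, t; theta)."""
    h = np.stack([x, t], axis=1)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)      # ReLU hidden layers
    W, b = params[-1]
    return (h @ W + b).ravel()              # linear output layer

def mse_u(params, x, t, u):
    """MSE_u = (1/n) * sum_i [u_i - NN1(x_i, t_i; theta)]^2."""
    return float(np.mean((u - nn1(params, x, t)) ** 2))

# toy normalized data and a 70/30 train/test split, mirroring the text
rng = np.random.default_rng(1)
x, t = rng.random(300), rng.random(300)
u = 1.0 / (1.0 + 5.0 * t)                   # decay-shaped stand-in target
idx = rng.permutation(300)
train, test = idx[:210], idx[210:]
params = init_nn1()
loss = mse_u(params, x[train], t[train], u[train])
```

Training would repeatedly update `params` to lower this loss on the training split while monitoring the same metric on the test split.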
Fig. 6 Training of NN1. a The architecture of NN1. b The loss over the training process. c The R2 over the training process.
    (2) A modified genetic programming algorithm is constructed to obtain the baseline PDE. The essential mathematical operations are designed as tree structures so that different formulas can be represented by stacking the base trees (Fig. 7).
Fig. 7 The basic GP trees used to represent essential mathematical operations.
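As an illustration of the tree encoding (a generic sketch, not the team's actual implementation), the snippet below represents a formula as a small expression tree whose leaves are the light intensity `u` or a constant, and applies sub-tree mutation by swapping in a freshly grown random tree. The operator set, depth, and probabilities are arbitrary choices for the example.

```python
import random

# operator set for internal nodes (an arbitrary choice for this sketch)
OPS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}

class Node:
    """One node of a GP tree: either a leaf ('u' or a constant) or an operator."""
    def __init__(self, op=None, children=(), value=None):
        self.op, self.children, self.value = op, list(children), value

    def eval(self, u):
        if self.op is None:
            return u if self.value == 'u' else self.value
        return OPS[self.op](*(c.eval(u) for c in self.children))

    def __repr__(self):
        if self.op is None:
            return str(self.value)
        return '(' + f' {self.op} '.join(map(repr, self.children)) + ')'

def random_tree(depth, rng):
    """Grow a random tree; leaves are 'u' or a random constant."""
    if depth == 0 or rng.random() < 0.3:
        return Node(value=rng.choice(['u', round(rng.uniform(-1, 1), 2)]))
    op = rng.choice(list(OPS))
    return Node(op=op, children=[random_tree(depth - 1, rng) for _ in range(2)])

def mutate(tree, rng, depth=2):
    """Sub-tree mutation: replace a randomly chosen sub-tree with a fresh one."""
    if tree.op is None or rng.random() < 0.5:
        return random_tree(depth, rng)
    i = rng.randrange(len(tree.children))
    tree.children[i] = mutate(tree.children[i], rng, depth)
    return tree

rng = random.Random(0)
tree = Node(op='*', children=[Node(value='u'), Node(value='u')])  # encodes u*u
```

Crossover works the same way, except the replacement sub-tree is cut from another tree in the population rather than grown at random.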
    During tree evolution, each tree can mutate in one of its sub-trees or cross over with other trees. After mutation and crossover, the tree is transformed back into a mathematical formula and expanded as a weighted polynomial. The weights are then recalculated using a sequential threshold ridge regression (STRidge) algorithm, which guarantees sparsity; we call this pruning the GP tree. After pruning, each GP tree is evaluated with the following MSE loss: $$\frac{1}{n}\sum\limits_{i = 1}^n {{{\left\{ {\frac{{\partial NN\left( {{x_i},{t_i};\theta } \right)}}{{\partial t}} - N\left[ {NN\left( {{x_i},{t_i};\theta } \right)} \right]} \right\}}^2}}$$ where $N[u]$ is the nonlinear differential operator represented by the GP tree. The evaluation result is also used in the crossover operation to imitate the natural biological process of evolution. Thanks to automatic differentiation implemented via the chain rule, $N[u]$ can be computed exactly from the neural network, without numerical-differentiation error. In this part, we obtain an ODE library with only one term: $$\alpha u^{1+\beta}$$ where $\alpha$ is a learnable parameter and $\beta$ is the normalized NanoLuc concentration.
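STRidge itself can be stated in a few lines. The sketch below is a generic re-implementation (not the team's code): it ridge-solves for the weights of each candidate term, zeroes out coefficients below a threshold, and re-solves on the survivors. On a synthetic trace obeying du/dt = -0.5·u², it keeps only the u² column of a [u, u², u³] library; the threshold, ridge penalty, and toy data are our assumptions.

```python
import numpy as np

def stridge(theta, dudt, lam=1e-5, tol=0.05, iters=10):
    """Sequential threshold ridge regression: ridge-solve for the weights of
    each candidate term, zero small coefficients, and re-solve on survivors."""
    n_terms = theta.shape[1]
    w = np.linalg.solve(theta.T @ theta + lam * np.eye(n_terms), theta.T @ dudt)
    for _ in range(iters):
        small = np.abs(w) < tol
        w[small] = 0.0
        big = ~small
        if big.any():
            w[big] = np.linalg.solve(
                theta[:, big].T @ theta[:, big] + lam * np.eye(int(big.sum())),
                theta[:, big].T @ dudt)
    return w

# synthetic decay obeying du/dt = -0.5 * u^2 (so u(t) = 1 / (1 + 0.5 t))
t = np.linspace(0.0, 10.0, 500)
u = 1.0 / (1.0 + 0.5 * t)
dudt = -0.5 * u ** 2
library = np.column_stack([u, u ** 2, u ** 3])   # candidate terms
w = stridge(library, dudt)                        # only the u^2 weight survives
```

The surviving sparse weight vector is exactly the "pruned polynomial" attached to a GP tree.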
    (3) Post-learning using PINN to gain a more precise PDE. Since the previous step yields an ODE library with only one term, sparse regression is no longer required to shrink the library. PINN is used to embed the governing equation into the neural network by redesigning the loss function. In our case, the new loss adds together the MSE of the data and that of the best GP tree: $$MSE_u = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left[ {{u_i} - NN\left( {{x_i},{t_i};\theta } \right)} \right]}^2}}$$ $$MSE_f = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left\{ {\frac{{\partial NN\left( {{x_i},{t_i};\theta } \right)}}{{\partial t}} - N\left[ {NN\left( {{x_i},{t_i};\theta } \right)} \right]} \right\}}^2}}$$ $$loss = MSE_u + MSE_f$$ The converged NN1 is then fine-tuned with the new loss function. After another 17724 learning steps, NN1 reaches higher performance on the training set (MSE: 2.06e-05, R2: 0.99924) and the test set (MSE: 2.30e-05, R2: 0.99922), which indicates that PINN successfully embeds the physical law into the network. The final luminescence decay kinetics of Video A is represented as: $$\frac{{du}}{{dt}} = \alpha {u^{1 + \beta }},\quad \alpha = 0.0016$$ Fig. 8 shows the simulation using the ODE above and its gap from the experimental data. Considering that data collected with the hardware is noisy, the performance is quite good.
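The discovered ODE is separable, so its closed form is worth writing down. Assuming the decay sign convention (the intensity decreases, i.e. the right-hand side is $-\alpha u^{1+\beta}$ with $\alpha = 0.0016 > 0$; the sign is our reading, since the fitted $\alpha$ is reported as positive), integration gives:

```latex
\frac{du}{dt} = -\alpha u^{1+\beta}
\quad\Longrightarrow\quad
u(t) = \left(u_0^{-\beta} + \alpha\beta t\right)^{-1/\beta}
     = u_0\left(1 + \alpha\beta\, u_0^{\beta}\, t\right)^{-1/\beta},
\qquad
t_{\mathrm{LOD}} = \frac{L^{-\beta} - u_0^{-\beta}}{\alpha\beta}
\approx \frac{1}{\alpha\beta L^{\beta}} \quad (u_0 \gg L).
```

Here $u_0$ is the initial intensity and $L$ the camera's LOD. Notably, $t_{\mathrm{LOD}}$ is almost independent of $u_0$ when $u_0 \gg L$, which matches feature ii above: the decay time is similar across concentrations even though the initial intensities differ.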
Fig. 8 Luminescence decay kinetics. a ODE simulation result. b The difference between experimental data and simulation data.

Model Validation

    To test the accuracy of the equation, two new videos were prepared simultaneously with Video A. Video B has four NanoLuc concentrations (100%, 50%, 25%, and 10%), while Video C has two (100% and 30%). Each sample contains 0.2 μL FMZ and 10 μL NanoLuc solution of the defined concentration, the same conditions as Video A. After processing in the software, the datasets ${D_B}$ and ${D_C}$ were created (Fig. 9).
Fig. 9 Data collected using hardware. a Data of Video B. b Data of Video C.
    The data with 100% NanoLuc concentration was used to calibrate the data-filter hyperparameters in the software, while the other samples were used for the test. By solving the following optimization problem, we can obtain the unknown NanoLuc concentration: $$\mathop {\min }\limits_\beta \sum\limits_{i = 1}^n {{{\left[ {{u_i} - DataSim\left( {\frac{{du}}{{dt}} = 0.0016\,{u^{1 + \beta }}} \right)} \right]}^2}}$$ where the data simulation with the ODE is implemented using the Euler method, and the optimization problem is solved by a genetic algorithm (GA). The inferred NanoLuc concentrations are shown in the following table. All the test samples have an error within five percentage points, which means the ODE is reliable.
        Video Index    Ground truth    Prediction
        B              50%             54%
        B              25%             23%
        B              10%             10%
        C              30%             28%
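The inference step above can be sketched as follows (our own minimal reconstruction, not the team's code): `data_sim` integrates the discovered ODE with forward Euler, and the unknown $\beta$ is the value whose simulated trace best matches the observations. The wiki solves this with a genetic algorithm; a plain grid search stands in here, and the time grid, decay sign convention, and candidate range are our assumptions.

```python
import numpy as np

ALPHA = 0.0016  # decay-rate parameter fitted from Video A

def data_sim(u0, beta, t, alpha=ALPHA):
    """Forward-Euler integration of du/dt = -alpha * u**(1 + beta)."""
    u = np.empty_like(t)
    u[0] = u0
    for i in range(1, len(t)):
        u[i] = u[i - 1] - alpha * u[i - 1] ** (1.0 + beta) * (t[i] - t[i - 1])
    return u

def estimate_beta(t, u_obs, betas):
    """Return the beta minimizing sum_i [u_i - DataSim(beta)]^2."""
    losses = [np.sum((u_obs - data_sim(u_obs[0], b, t)) ** 2) for b in betas]
    return float(betas[int(np.argmin(losses))])

# sanity check: recover a known beta from a synthetic trace
t = np.arange(0.0, 5000.0)
u_obs = data_sim(1.0, 0.5, t)
beta_hat = estimate_beta(t, u_obs, np.linspace(0.1, 1.0, 10))
```

A GA would explore the same loss landscape but with mutation and crossover over a population of candidate $\beta$ values instead of a fixed grid.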
    To understand the learnable parameter in the equation, three more videos were prepared asynchronously with Video A. Video D, Video E, and Video F were recorded 14, 24, and 32 days after Video A, and the datasets ${D_D}$, ${D_E}$, and ${D_F}$ were generated accordingly (Fig. 10).
Fig. 10 Data collected in Video D, Video E, Video F.
    From the form of the ODE, it is easy to infer that the learnable parameter may relate to the enzyme activity. So three new datasets were created with the same enzyme used in the old datasets but recorded at different times, generating a gradient of enzyme activity. All samples use 100% NanoLuc concentration, so the learnable parameter can be calculated by solving the following optimization problem: $$\mathop {\min }\limits_\alpha \sum\limits_{i = 1}^n {{{\left[ {{u_i} - DataSim\left( {\frac{{du}}{{dt}} = \alpha {u^2}} \right)} \right]}^2}}$$ The learnable parameter $\alpha$ decreases as the enzyme activity decreases (Fig. 11). The luminescence decay kinetics is controlled by two factors: $\alpha$ and the light intensity. A higher NanoLuc concentration, which corresponds to higher total enzyme activity, results in a higher initial light intensity. To maintain the same luminescence decay time, $\alpha$ changes dynamically with the enzyme activity.
Fig. 11 The dynamics of $\alpha$ with the change of enzyme activity.

Model Extrapolation

    Up to this point, all tests of the ODE used NanoLuc concentrations at the μM level (100% NanoLuc = 7.7 μM, with at most a 10-fold concentration difference) without changing the FMZ concentration. Next, we want to test the model's generalizability with concentration changes larger than 10-fold for both NanoLuc and FMZ.
    The purified NanoLuc was diluted from $10^3$ to $10^8$ times to obtain NanoLuc concentrations from the nM to the pM level. The same buffer-free protocol was used, but the monitoring device was changed to a microplate reader, since the NanoLuc concentration is too low to be detected by the camera. All the collected data is shown in Fig. 12.
Fig. 12 The data collected in the microplate reader with different NanoLuc concentrations from nM to pM.
    The data in Fig. 12 shows that the luminescence decay kinetics persists at NanoLuc concentrations from 7.7 nM down to 0.77 pM. To check whether the luminescence decay obeys the ODE discovered by PINN-GP-SR, the linearity between the numerical differentiation of the light intensity (du/dt) and the squared light intensity (u^2) is checked, since each trace can be identified as 100% NanoLuc (also called a standard sample in the following text). The result is shown in Fig. 13. As the NanoLuc concentration decreases, linearity is no longer guaranteed, possibly because: (1) when the light intensity is close to the LOD of the microplate reader, the collected data oscillates, as seen in Fig. 13d, e; (2) numerical differentiation is unreliable, especially for oscillating data. A data filter is recommended in future data processing to handle the oscillation.
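The linearity check can be reproduced generically: take the numerical derivative du/dt, regress it on u^2, and report R2. The sketch below is our own code with toy clean and noisy traces (not the wiki's data); it also illustrates failure mode (2) above, since small sensor noise near the LOD sharply degrades the R2 of the numerical gradient.

```python
import numpy as np

def linearity_r2(t, u):
    """R2 of the linear regression du/dt ~ u^2 (the Fig. 13-style check:
    for a standard sample the ODE predicts du/dt proportional to u^2)."""
    dudt = np.gradient(u, t)                 # central-difference derivative
    a, b = np.polyfit(u ** 2, dudt, 1)       # fitted line: a * u^2 + b
    resid = dudt - (a * u ** 2 + b)
    return 1.0 - np.sum(resid ** 2) / np.sum((dudt - dudt.mean()) ** 2)

# clean trace obeying du/dt = -0.5 u^2, vs. the same trace with sensor noise
t = np.linspace(0.0, 20.0, 400)
u_clean = 1.0 / (1.0 + 0.5 * t)
u_noisy = u_clean + np.random.default_rng(0).normal(0.0, 0.01, t.size)
r2_clean, r2_noisy = linearity_r2(t, u_clean), linearity_r2(t, u_noisy)
```

Smoothing `u` before differentiating (the data filter recommended above) would recover much of the lost linearity for noisy traces.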
Fig. 13 The linearity between the numerical differentiation of the light intensity (du/dt) and the squared light intensity (u^2) at different NanoLuc concentrations. a 7.7 nM. b 0.77 nM. c 77 pM. d 7.7 pM. e 0.77 pM. f R2 table.
    The same exploration was carried out for FMZ. For a detection protocol, keeping the substrate in excess is an important condition, so the lowest workable FMZ concentration is one of the key points of our new protocol. The FMZ was diluted with PBS at different ratios: 1:1, 1:3, 1:7, 1:15, and 1:31. Using the same protocol and testing in the microplate reader, the data is plotted in Fig. 14. As the FMZ decreases, the reaction rate also decreases, and a peak becomes observable.
Fig. 14 The data collected in the microplate reader with different FMZ concentrations.
    The linearity between the numerical differentiation of the light intensity (du/dt) and the squared light intensity (u^2) is checked and shown in Fig. 15. Linearity is not maintained at high dilution ratios due to the lack of FMZ.
Fig. 15 The linearity between the numerical differentiation of the light intensity (du/dt) and the squared light intensity (u^2) at different FMZ dilution ratios. a 1:1. b 1:3. c 1:7. d 1:15. e 1:31. f R2 table.

Discussion

    NanoLuc, a new member of the luciferase reporter gene/protein family, has great potential in imaging and labeling research. However, according to the standard protocol provided by Promega, it can only be measured on standard devices such as microplate readers, which limits its use in rapid detection and special environments.
    In our project, low-cost hardware with a camera was created to detect AHPND in situ. Considering the skill level of potential users and the actual test conditions, a new protocol was designed, and a new luminescence decay kinetics model was built to estimate the NanoLuc concentration. The new kinetics is suitable not only for a low-cost device such as a camera but also for the microplate readers assumed by Promega's protocol, which shows the high generalizability of our model (Fig. 16).
Fig. 16 The new luminescence decay kinetics has high generalizability.
    In our modeling process, we successfully distilled the kinetics from sparse and noisy data with the help of physics-informed neural networks and genetic programming. Extensive experimental work was then used to explain the meaning of the parameters in the new ODE and its generalizability, which opens the black box of the neural network. To our knowledge, our work is the first AI-for-Science effort in the iGEM community that connects data-driven and mechanism-based modeling, and we expect this framework to become increasingly popular for governing-equation mining.
    In conclusion, the whole modeling process sets a good example for model development. The systematic implementation of the NanoLuc luminescence decay kinetics and proper modeling in our project will also contribute to the iGEM community and inspire other teams who choose NanoLuc as an imaging and labeling system in their projects.

References

      1. M. P. Hall et al., Engineered luciferase reporter from a deep sea shrimp utilizing a novel imidazopyrazinone substrate. ACS Chem. Biol. 7, 1848-1857 (2012).
      2. Z. Chen, Y. Liu, H. Sun, Physics-informed learning of governing equations from scarce data. Nat. Commun. 12, 6136 (2021).
      3. M. Schmidt, H. Lipson, Distilling Free-Form Natural Laws from Experimental Data. Science 324, 81-85 (2009).