Cultivating E. coli in the laboratory requires at least 8 hours of incubation in a culture dish, after which the culture must be taken out at fixed times to measure the OD value and track the growth cycle of the bacteria. This is costly in both time and labor. Our device combines a camera platform, deep learning, a Google spreadsheet and a LineBot to automatically detect and record the state of the E. coli culture dishes, and it reminds the experimenter to collect the bacteria when they reach the stage of the growth cycle that the experiment requires.
The database consists of 250 photos taken by phone, split into a training set, a test set and a validation set of 220, 15 and 15 photos respectively (roughly 88:6:6 rather than an exact 8:1:1 split). Labelme is used to label the colonies in the data set: colonies that can be used are labeled colony, while colonies that have fused or are touching each other cannot be used and are labeled wcolony.
Fig. 1) Separate colonies into two categories
Fig. 2) Picture set
Fig. 3) Labeling with LabelImg
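The split of the photo set described above can be reproduced with a short script. The following is a minimal Python sketch, not the team's actual tooling: the file names and the fixed random seed are assumptions made for illustration, and only the set sizes (220, 15 and 15 out of 250 photos) come from the text.

```python
import random


def split_dataset(items, n_train, n_test, n_val, seed=0):
    """Shuffle items reproducibly and cut them into three disjoint sets."""
    if n_train + n_test + n_val != len(items):
        raise ValueError("set sizes must add up to the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed: same split every run
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


# Hypothetical file names standing in for the 250 phone photos.
photos = [f"photo_{i:03d}.jpg" for i in range(250)]
train, test, val = split_dataset(photos, n_train=220, n_test=15, n_val=15)
print(len(train), len(test), len(val))
```

Fixing the seed keeps the three sets identical between runs, so that results from different training runs stay comparable.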
We chose YOLOv5m because the smaller YOLOv5s model could not exceed 85% accuracy no matter how we adjusted the parameters, while a larger model such as YOLOv5x would bring problems of overfitting and slow running speed. YOLOv5m, with its moderate model size, sits between the two. The number of epochs was set to 200 because, after several tests, we found that the accuracy and loss converged at around this point and that further training did not increase the accuracy. We also found that increasing the image resolution did not increase the accuracy, so we lowered the resolution to speed up training.
Fig. 4) The accuracy of the model
Fig. 5) Demonstration of the model in the validation set
Training is performed with the model and parameters described above, and the resulting weights are transferred to a notebook computer for use.
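For reference, a training run with these settings could be launched roughly as follows. This is a sketch that only assembles the command line for the train.py script of the YOLOv5 repository. The data file name colony.yaml, the image size of 416 and the batch size of 16 are placeholders chosen for illustration, since the text only fixes the model (YOLOv5m) and the number of epochs (200).

```python
import shlex


def yolov5_train_command(data_yaml, img_size, epochs=200,
                         weights="yolov5m.pt", batch_size=16):
    """Build the argument list for YOLOv5's train.py (run from the repo root)."""
    return [
        "python", "train.py",
        "--weights", weights,        # start from the pretrained YOLOv5m checkpoint
        "--data", data_yaml,         # dataset description (paths and class names)
        "--epochs", str(epochs),
        "--imgsz", str(img_size),    # training resolution in pixels
        "--batch-size", str(batch_size),
    ]


# colony.yaml and 416 are assumed values, not taken from the project.
cmd = yolov5_train_command("colony.yaml", img_size=416)
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems when paths contain spaces.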
The camera automatically takes a photo every 15 seconds, after which the Arduino rotating platform turns 60 degrees. The photos are saved to a designated folder, and the colonies on the plate are detected by our trained model. After processing, the data are uploaded to Google Sheets and can be returned to users through the LineBot.
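The timing of this capture cycle can be worked out with a few lines of arithmetic. The sketch below assumes that the platform stops at one dish position per 60 degree step, which gives six positions per full turn. That layout is an assumption for illustration; the text only states the 15 second interval and the 60 degree step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    index: int      # running number of the photo
    time_s: int     # seconds since the first photo
    angle_deg: int  # platform angle when the photo is taken
    position: int   # stop on the platform (0-based)


def capture_schedule(n_shots, interval_s=15, step_deg=60):
    """List when and at which platform angle each photo is taken."""
    positions_per_turn = 360 // step_deg
    return [
        Shot(i, i * interval_s, (i * step_deg) % 360, i % positions_per_turn)
        for i in range(n_shots)
    ]


schedule = capture_schedule(7)
for shot in schedule:
    print(shot)
```

Under these assumptions the platform returns to its starting position on the seventh photo, so each position is photographed once every 90 seconds.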
Currently, our LineBot presents experimental data and the URLs of our wiki and community websites to users, and we are still developing more advanced functions.
From the PR curve, we can see that the mAP of this model is 0.669, and from the P_curve graph, we can see that its best accuracy is around 90%. In the confusion_matrix graph, the prediction accuracy for colony is higher, while that for wcolony is lower, at only 0.58. This is probably due to the lack of wcolony data and to the fact that the test set contains photos of a whole plate full of colonies, which may lower the accuracy.
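To make the per-class figures easier to interpret, the sketch below shows how such a value can be derived from a confusion matrix. It assumes a matrix with predicted classes as rows and true classes as columns, normalized per true class, and it ignores any background class. The counts are toy numbers for illustration only and are not the results of this project.

```python
def per_class_recall(matrix, class_names):
    """matrix[p][t] = number of objects of true class t predicted as class p."""
    recalls = {}
    for t, name in enumerate(class_names):
        total = sum(row[t] for row in matrix)  # all objects of true class t
        recalls[name] = matrix[t][t] / total if total else 0.0
    return recalls


# Toy counts (NOT project data): rows are predictions, columns are ground truth.
toy_counts = [
    [8, 1],  # predicted colony
    [2, 3],  # predicted wcolony
]
recalls = per_class_recall(toy_counts, ["colony", "wcolony"])
print(recalls)
```

A class with few labeled examples has a small column total, so a handful of mistakes moves its value a lot, which is consistent with the explanation given for wcolony.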
In the actual use scenario, we tested the whole system and it works well. When the coordinates recorded in the spreadsheet are converted into a scatter diagram and compared with the original photo, the colony locations are quite accurate, although the results are poor when there are too many colonies. The recorded RGB values do not differ much from the actual colors. The Arduino controls the turntable normally, which makes it possible to monitor several dishes at the same time. The colony size is determined from the size of the bounding box, so there may be some error. The LineBot can alert users normally, and its additional functions, such as the links to the iGEM team website and the iGEM website, also work normally, but there is a formatting problem in the returned data that needs further adjustment.
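The coordinates and sizes discussed here can be derived from the label lines that YOLOv5 writes when detections are saved as text, where each line holds a class index followed by the normalized box center, width and height. The following is a minimal sketch, assuming that class index 0 is colony and 1 is wcolony and that the colony diameter is estimated as the mean of the box width and height. Both are assumptions for illustration, since the text only says that the size is taken from the box.

```python
def parse_yolo_line(line, img_w, img_h, class_names=("colony", "wcolony")):
    """Convert one 'class xc yc w h' label line to pixel coordinates."""
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:5])
    return {
        "label": class_names[cls],
        "x_px": xc * img_w,  # box center, horizontal
        "y_px": yc * img_h,  # box center, vertical
        # Assumed size estimate: mean of box width and height in pixels.
        "diameter_px": (w * img_w + h * img_h) / 2,
    }


# Example line for an 800 x 400 pixel photo (made-up values).
row = parse_yolo_line("0 0.5 0.25 0.125 0.25", img_w=800, img_h=400)
print(row)
```

Because a bounding box is rectangular while colonies are roughly round, this estimate overstates the size of irregular or tilted colonies, which matches the error noted above.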
Our device verified that it is feasible to speed up experiments and reduce the manpower needed for cultivating E. coli in a laboratory environment, but many areas still need to be optimized, such as:
Based on the above points, we hope to optimize this system in the future and make it into a complete system that can be used in a laboratory environment to speed up laboratory experiments and simplify the experimental process.