Software

This year’s dry lab team established a partnership with the San Francisco startup LatchBio.

In 2021, the founders set out to solve an integral problem in traditional bioinformatics: the inaccessibility of bioinformatic tools, which are usable only by bioinformaticians with experience in coding, primarily in shell scripting, Perl, and Python. The goal of the Latch Project™ is to build and disseminate the data infrastructure of the biocomputing revolution by building an intuitive code-free platform.

Our software this year is not related to our main probiotic project, but it nonetheless advances the tools available to synthetic biologists around the world.

Bioinformatics, but without the code.

The partnership involved members of the McGill iGEM team developing workflows using the Latch SDK.

A workflow is an analysis that takes an input, processes it using specified commands, and produces an output. For example, a workflow can be a bioinformatic pipeline that processes a FASTQ file into a BAM file.
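The input, process, output pattern described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use the Latch SDK, the function names (parse_fastq, to_fasta, toy_workflow) are hypothetical, and the processing step is a simple length filter that writes FASTA text rather than a real FASTQ-to-BAM alignment.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from FASTQ text (4 lines per record)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    for i in range(0, len(lines) - 3, 4):
        header, sequence = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        yield header[1:], sequence


def to_fasta(records: Iterable[Tuple[str, str]]) -> str:
    """Render (read_id, sequence) pairs as FASTA text."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in records)


def toy_workflow(fastq_text: str, min_length: int = 5) -> str:
    """Hypothetical workflow: input (FASTQ) -> process (filter) -> output (FASTA)."""
    records = parse_fastq(fastq_text)
    kept = [(rid, seq) for rid, seq in records if len(seq) >= min_length]
    return to_fasta(kept)
```

Calling toy_workflow on FASTQ text returns FASTA text containing only the reads that pass the length filter. A real pipeline would replace the filter with calls to external tools, but the shape (one input, a fixed series of steps, one output) stays the same.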

The goal of the partnership was to come up with bioinformatic workflows that have a practical use in biological studies and to implement them using the Python-based SDK.


The two workflows developed by the team were motif calling pipelines meant to process an input BED or BAM file and output a motif HTML file that can be opened in any browser. Sequence motif discovery is an important part of a computational biologist’s toolkit. The purpose of motif discovery is to find patterns in biopolymer sequences in order to better understand the structure and function of the molecules those sequences represent. The two tools implemented in the workflows to extract sequence motifs from sequence files were HOMER Motif Analysis and MEME Suite.

HOMER’s novel motif discovery algorithm was designed for regulatory element analysis in genomic applications (DNA only). It is a differential motif discovery algorithm that uses zero or one occurrence per sequence (ZOOPS) scoring coupled with hypergeometric enrichment calculations to determine motif enrichment. The MEME Suite provides a variety of motif-based analyses; AME (Analysis of Motif Enrichment) was used in the workflow. The algorithm finds known DNA, RNA, or protein motifs that are relatively enriched in the input sequences compared to a shuffled version of those sequences or to control sequences, or that are enriched in sequences with small values of scores that you can specify with your input sequences.
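To make the hypergeometric enrichment calculation concrete, here is a minimal sketch in Python. It is not HOMER's implementation; the function name and parameters are hypothetical. It assumes zero-or-one-occurrence-per-sequence counting, so each sequence contributes at most one hit, and it computes the probability of seeing at least the observed number of motif-containing target sequences if hits were spread at random across target and background sequences.

```python
from math import comb


def hypergeom_enrichment_p(
    n_target: int,
    n_background: int,
    target_hits: int,
    background_hits: int,
) -> float:
    """Upper-tail hypergeometric p-value for motif enrichment in the target set.

    Each sequence counts at most once (it either contains the motif or not).
    """
    total = n_target + n_background          # all sequences
    hits = target_hits + background_hits     # all sequences containing the motif
    denom = comb(total, n_target)            # ways to choose the target set
    upper = min(n_target, hits)              # largest possible number of target hits
    tail = sum(
        comb(hits, k) * comb(total - hits, n_target - k)
        for k in range(target_hits, upper + 1)
    )
    return tail / denom
```

For example, hypergeom_enrichment_p(10, 10, 5, 0) asks how surprising it is that all five motif-containing sequences fall in the target set when target and background each hold ten sequences. Smaller values indicate stronger enrichment.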

The purpose of developing two motif callers that perform the same task is to allow scientists to compare and contrast the results from two of the most popular sequence motif calling tools in order to verify scientific findings. With these workflows implemented on the Latch platform, anyone can upload a BAM or BED file into the required field, hit run, and let the file be processed to obtain the motif files. This harmonized bioinformatics platform between wet lab and dry lab helps teams accelerate R&D and lowers the learning curve.