We built the software INCalculate to calculate the molecular weight and monoisotopic mass of peptide sequences containing ncAAs, based on our database INCLUSIVE (Incorporation of Non-CanonicaL amino acids to Utilise SIde chains with VErsatility). You can access the software here.

Description

Once we decided to work with non-canonical amino acids (ncAAs), we were highly motivated to find interesting ones and incorporate them into our compartments. However, finding precise information was harder than it should have been. As we discovered this year, working with ncAAs is not easy. More than 300 ncAAs can already be incorporated into the proteins of organisms via stop codon suppression and other methods. Some of them are relatively easy to work with because plasmids for their incorporation are freely available, e.g. on Addgene. Even these plasmids are not always fully annotated: information such as where exactly the tRNA was inserted on the plasmid or which promoters were used is sometimes missing. A plasmid that fits your ncAA perfectly is also rather rare, so for other ncAAs you either have to mutate the synthetase yourself or have a suitable construct synthesised.
That is why we built a software tool this year to complement the database we created. With the INCLUSIVE database we want to make it easier to work with non-canonical amino acids, their aminoacyl-tRNA synthetases, and the corresponding tRNAs. The database is hosted directly on our iGEM Wiki, where you will also find a downloadable Excel file with all entries. The software is a prototype that calculates the molecular weight and monoisotopic mass of peptide sequences. This is useful if you want to detect your incorporated ncAA by mass spectrometry: you can directly calculate the monoisotopic mass of the peptide that contains your ncAA instead of working it out by hand.
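To illustrate the difference between the two values, here is a minimal sketch that is not taken from INCalculate itself. It assumes the composition is already available as a dictionary of element counts. The names `MONOISOTOPIC`, `AVERAGE`, and `mass_from_composition` are hypothetical, and the atomic masses are rounded standard values.

```python
# Illustrative sketch only; names and structure are assumptions, not INCalculate's code.

# Mass of the most abundant isotope of each element (used for the monoisotopic mass).
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915, "S": 31.972071}
# Abundance-weighted average atomic masses (used for the molecular weight).
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def mass_from_composition(composition, atomic_masses):
    """Sum atomic masses over a dict such as {"C": 2, "H": 5, "N": 1, "O": 2}."""
    return sum(atomic_masses[element] * count for element, count in composition.items())


glycine = {"C": 2, "H": 5, "N": 1, "O": 2}  # free glycine, C2H5NO2
print(f"monoisotopic mass: {mass_from_composition(glycine, MONOISOTOPIC):.4f}")
print(f"molecular weight:  {mass_from_composition(glycine, AVERAGE):.3f}")
```

Even for a molecule as small as glycine the two values differ by about 0.035 Da, which is why the monoisotopic mass is the one to compare against a mass spectrometry peak.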

Installation

We have tried to make running this software as easy as possible. You can run the INC_calculate.py script from your terminal or an IDE, open the Jupyter notebook INC_calculate.ipynb, or use the Google Colab notebook. All three contain exactly the same code; only the way you execute it differs. Suppose you are interested in a specific ncAA listed in our database, have incorporated it into your protein, have carried out the trypsin digest and now have the peptide sequence. You can then copy this sequence into our programme, mark the position of the ncAA with an X, and in a second step enter the abbreviation of your ncAA.

Usage

If you have problems executing this file, don't worry, we are here for you. In the following we show you how to get started with our software.
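The workflow described in the Installation section (a peptide sequence with an X, then the ncAA abbreviation) can be sketched roughly as follows. This is not the interface of INC_calculate.py: the function name `peptide_monoisotopic_mass`, the table `NCAA_RESIDUE_MASS`, and the abbreviation `AzF` are assumptions for illustration. The AzF entry is the residue mass of p-azido-L-phenylalanine, derived from its formula C9H10N4O2 minus one water.

```python
# Illustrative sketch only; not the actual INC_calculate.py interface.

# Monoisotopic residue masses (amino acid minus H2O) of the 20 canonical amino acids, in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O accounts for the free N- and C-terminus

# Hypothetical lookup table keyed by ncAA abbreviation (assumed, not from the database).
NCAA_RESIDUE_MASS = {"AzF": 188.06981}


def peptide_monoisotopic_mass(sequence, ncaa):
    """Return the neutral monoisotopic mass of a peptide; every X is read as the ncAA."""
    total = WATER
    for residue in sequence.upper():
        if residue == "X":
            total += NCAA_RESIDUE_MASS[ncaa]
        else:
            total += RESIDUE_MASS[residue]
    return total


print(round(peptide_monoisotopic_mass("GXK", "AzF"), 4))
```

The sketch returns the mass of the neutral peptide; to compare it with an observed m/z value you would still need to account for the charge state.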

Contributing

If you like any of our code, feel free to use it! It is not the most groundbreaking or interesting code, but a few snippets may be useful, such as reading a chemical formula given as a string into a dictionary. If you would like to extend the code, for example to add trypsin digestion, feel free!
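As an illustration of the formula snippet mentioned above, a chemical formula string can be turned into a dictionary of element counts with a short regular expression. This is a minimal sketch, not the code from INC_calculate.py. The function name `parse_formula` is hypothetical, and the sketch assumes simple formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a formula such as "C9H10N4O2" into {"C": 9, "H": 10, "N": 4, "O": 2}."""
    counts = Counter()
    # An element symbol is one uppercase letter plus an optional lowercase letter,
    # followed by an optional count (a missing count means 1).
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


print(parse_formula("C9H10N4O2"))
```

Using a `Counter` means an element that appears twice in the string, as in `CH3COOH`, is summed rather than overwritten.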

Authors and acknowledgment

The code was written by Leon-Samuel Icking.
The database was created by Leon-Samuel Icking.
Contact: ickingsa@googlemail.com.