Project Design

This page describes the rationale and bioinformatics work that laid the foundation for our project.

Phage Selection

Because cyanophages differ in host specificity, we began selecting a cyanophage on which to model our ghost phage by looking at the types of Cyanobacteria that are frequently used in synthetic biology. Cyanobacteria are a hugely diverse group, with both freshwater and saltwater species. During our research, we learned that synthetic biologists are interested in developing tools for strains of the saltwater genus Synechococcus and in using them as chassis (1-3). In addition, we met virtually with Team MSP-Maastricht, who were working with the strain Synechococcus sp. PCC 11901, which is becoming a popular chassis for cyanobacterial engineering. Based on this, we decided to focus our efforts on building a ghost phage that would recognize one or more strains of this genus. To learn more about our partnership with MSP-Maastricht, go to our Partnership page.

Synechococcus sp. PCC 11901
Figure 1. Synechococcus sp. PCC 11901 is one of the most promising strains currently available for cyanobacterial biotechnology. TEM image from (14).


For simplicity, we first searched the literature for a cyanophage known to infect Synechococcus sp. PCC 11901 or another lab strain. Unfortunately, we found no reports of a cyanophage that infects PCC 11901 or other well-characterized lab strains of Synechococcus, such as PCC 7002, UTEX 2973, or PCC 7942.

We next looked for a wild marine strain of Synechococcus that was closely related to the lab strains. We reasoned that if we could find a phage that infects the wild strain, it might also infect a closely related lab strain. Our review of the literature revealed that the phylogeny of Synechococcus is complicated, incomplete, and marked by conflicting results, owing to the diversity of wild marine Cyanobacteria and the abundance of horizontal gene transfer between species (4,5). We eventually found Synechococcus WH8102 and WH8109, marine strains that share similarities with Synechococcus sp. PCC 11901.

Using previously published work, we identified several cyanophages that display host specificity for these wild marine strains. We selected two candidate cyanophages that were the most promising: Syn9 and S-TIP37 (6-8).

Although these two phages have similar host specificity, they are otherwise quite different. Syn9 is a T4-like phage, meaning that it shares a core set of homologous genes with the well-characterized E. coli phage T4, is structurally similar to it, and replicates in a similar way (9,10). Syn9 has a 177 kbp dsDNA genome coding for 232 genes. It replicates using a lytic life cycle and has a capsid structure homologous to that of T4. S-TIP37 is a T7-like phage, with genetic similarity to the E. coli phage T7 (6). Its genome is 46.21 kbp in size and codes for only 61 genes. This virus appears to use a lytic life cycle but is also capable of short-term integration, although it has not been found to form a stable lysogen.

T7-like phage virions
Figure 2. T7-like phage virions are less complex than T4-like virions. Figure from (15).


Although less is known about the structure of S-TIP37, its smaller genome and lower gene count make it preferable to Syn9 as the virus on which to base the phagemid. In addition, T7-like viruses tend to have simpler capsids than T4-like viruses, meaning that the phagemid would require fewer genes if based on S-TIP37 rather than Syn9 (Figure 2). The overall structure of the phage was another deciding factor. T4-like virions typically have a long, slender tail joined to the head by a neck region. They also have a baseplate, which anchors the phage to its host cell, and a contractile tail, which delivers the phage DNA into the cell. T7-like phages have neither of these structures, which reduces the complexity of the ghost phage machinery and the number of genes needed to construct the phagemid. Thus, we elected to use S-TIP37 as the blueprint for construction of the ghost phage.
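The size argument can be made concrete with a little arithmetic. The following is a minimal sketch that uses only the genome sizes and gene counts quoted above; the class and function names are illustrative and are not part of any tool we used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhageGenome:
    """Genome summary for one phage (values taken from the text above)."""
    name: str
    size_kbp: float
    gene_count: int


SYN9 = PhageGenome("Syn9", 177.0, 232)
S_TIP37 = PhageGenome("S-TIP37", 46.21, 61)


def fold_difference(larger: PhageGenome, smaller: PhageGenome) -> dict:
    """Return how many times bigger `larger` is than `smaller`."""
    return {
        "size": larger.size_kbp / smaller.size_kbp,
        "genes": larger.gene_count / smaller.gene_count,
    }


if __name__ == "__main__":
    fold = fold_difference(SYN9, S_TIP37)
    print(f"Syn9 genome is {fold['size']:.2f}x larger than S-TIP37")
    print(f"Syn9 encodes {fold['genes']:.2f}x as many genes as S-TIP37")
```

By both measures Syn9 is a little under four times the size of S-TIP37, which is the gap that motivated our choice.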

Identification of Capsid Components Needed for the Phagemid

Since S-TIP37 was first identified through metagenomics, it has a published genome (NCBI Reference Sequence: NC_048026.1) and a putative set of open reading frames (ORFs). However, there was no published structure of the S-TIP37 capsid. We decided that the best way to determine which genes are needed to construct the capsid would be to use the basic structure of T7 as a guide (Figure 3), and then to compare the S-TIP37 ORFs against a better-characterized T7-like cyanophage to check that we were selecting the correct genes for the various capsid components.

Structure of the T7 Capsid
Figure 3. Structure of the T7 capsid. Image taken from (11).


We compared S-TIP37 to the known structure of T7 and to the better-characterized T7-like cyanophage Syn5. The Syn5 genome is similar in size to that of S-TIP37: it is 45.48 kbp long and contains 56 genes in total (12). We found that the two model viruses, T7 and Syn5, each contain 6-8 distinct capsid proteins (plus one truncated protein produced by a ribosomal frameshift). By name, we identified six capsid proteins and one capsid assembly protein in S-TIP37 that might be homologous to those of the model viruses.

To confirm homology between the capsid proteins, we performed pairwise alignments between the S-TIP37 and Syn5 sequences. We began by identifying the corresponding genes in the NCBI database (16). These are listed in Table 1.

Table 1: Homologous gene NCBI ID numbers of suspected capsid proteins in Syn5 and S-TIP37.
Structural Component       Syn5 gene   S-TIP37 gene
Capsid Protein             5220210     54998410
Head-Tail Connector        5220167     54998403
Capsid Assembly            5220200     54998404
Tail Tubular Protein A     5220212     54998411
Tail Tubular Protein B     5220174     54998412
Internal Core Protein      5220164     54998416
Tail Fiber-like Protein    5220153     54998417
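For anyone wishing to retrace this step, the gene pairs in Table 1 can be kept in a small lookup table. The sketch below assumes that the ID numbers are NCBI Gene database identifiers and that each record is reachable at the usual `https://www.ncbi.nlm.nih.gov/gene/<id>` address; the variable and function names are illustrative only.

```python
# Gene pairs from Table 1: component -> (Syn5 gene ID, S-TIP37 gene ID).
# Assumption: these are NCBI Gene database identifiers.
CAPSID_GENE_IDS = {
    "Capsid Protein": ("5220210", "54998410"),
    "Head-Tail Connector": ("5220167", "54998403"),
    "Capsid Assembly": ("5220200", "54998404"),
    "Tail Tubular Protein A": ("5220212", "54998411"),
    "Tail Tubular Protein B": ("5220174", "54998412"),
    "Internal Core Protein": ("5220164", "54998416"),
    "Tail Fiber-like Protein": ("5220153", "54998417"),
}


def gene_page_url(gene_id: str) -> str:
    """Build the NCBI Gene record URL for one gene ID (assumed URL pattern)."""
    return f"https://www.ncbi.nlm.nih.gov/gene/{gene_id}"


if __name__ == "__main__":
    for component, (syn5_id, stip37_id) in CAPSID_GENE_IDS.items():
        print(component)
        print("  Syn5:   ", gene_page_url(syn5_id))
        print("  S-TIP37:", gene_page_url(stip37_id))
```

The script only prints links; it does not download anything, so the protein sequences still have to be retrieved from each record by hand or with a separate tool.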

After identifying these genes, we downloaded the corresponding protein sequences and ran pairwise alignments to compare the protein products. We initially used NCBI BLASTp to perform the alignments but, based on feedback from Dr. Andrew Millard, we repeated the analysis using HMMER (17), a program that uses hidden Markov models to identify homologous protein or nucleotide sequences. BLASTp alignment results are shown in Figure 4.

Figure 4. Pair-wise analysis of putative S-TIP37 capsid genes with those of Syn5. Identical amino acids are shown in red, similar amino acids are shown in blue and amino acids with no homology are shown in grey. Image prepared using NCBI BLASTp (16).
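The percent identity reported for each pair is the fraction of alignment columns in which the two proteins carry the same amino acid. The following is a minimal sketch of that calculation for two sequences that have already been aligned; it is not the BLASTp implementation, and the short sequences in the usage example are made up for illustration rather than taken from S-TIP37 or Syn5.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns with identical residues.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use `gap` to mark insertions and deletions.
    Columns that contain a gap count towards the alignment length but
    never count as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap (not a real column).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 identical columns out of 6.
    print(f"{percent_identity('MKT-AY', 'MRTLAY'):.2f}%")
```

In Figure 4 the red positions correspond to the columns counted as matches here; the blue "similar" positions are not counted towards identity.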


We evaluated the homology between the pairs of viral genes using the Expect (E) values obtained from both HMMER and BLASTp (results shown in Table 2). An E value describes the number of hits with an equal or better score that one would expect to see by chance when searching the database. The closer the E value is to 0, the less likely it is that a hit arose by chance. All alignments returned low E values, indicating that the matches are not random (13). In addition, we obtained high identity scores for most of the pairs. Based on these scores, we were confident that the S-TIP37 structural genes we identified were likely to be the correct genes needed for assembly of the ghost phage.
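To give a feel for how E values behave, the sketch below uses the standard BLAST relation between a normalized bit score S' and the expected number of chance hits, E = m * n * 2^(-S'), where m is the query length and n is the total length of the database. The function name and the numbers in the usage example are illustrative assumptions, not values from our searches, and HMMER calculates its E values in its own way.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """BLAST-style E value from a bit score: E = m * n * 2**(-S').

    query_len (m) and db_len (n) are lengths in residues. Real BLAST
    uses edge-corrected "effective" lengths, which this sketch ignores.
    """
    if query_len <= 0 or db_len <= 0:
        raise ValueError("lengths must be positive")
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical search: 300-residue query against 1,000,000 residues.
    for score in (20, 50, 100, 250):
        e_value = expected_chance_hits(score, 300, 1_000_000)
        print(f"bit score {score:>3}: E = {e_value:.2e}")
```

Every additional bit of score halves the E value, which is why strongly conserved proteins such as the head-tail connector reach values far below 1e-100, while a larger database raises the E value for the same score.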

Table 2: Results of the pair-wise alignments of the indicated protein sequences of suspected capsid proteins of S-TIP37 and Syn5.
Structural Component       E Value (HMMER)   E Value (BLAST)   % Identity
Capsid Protein             1.4e-70           7e-73             42.87%
Head-Tail Connector        1.80e-165         5e-170            53.36%
Capsid Assembly            1.2e-23           4e-23             38.79%
Tail Tubular Protein A     5.4e-38           1e-35             40.72%
Tail Tubular Protein B     1.6e-87           2e-81             28.39%
Internal Core Protein      9.2e-30           8e-26             24.89%
Tail Fiber-like Protein    1.2e-23           1e-14             53.00%
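The values in Table 2 can also be screened programmatically. The following is a minimal sketch that encodes the table and applies two thresholds; both thresholds (an E value cutoff of 1e-5 and a 30% identity line) are illustrative assumptions chosen for this example, not criteria we used in the project.

```python
# Rows from Table 2: (component, HMMER E value, BLAST E value, % identity).
ALIGNMENTS = [
    ("Capsid Protein", 1.4e-70, 7e-73, 42.87),
    ("Head-Tail Connector", 1.80e-165, 5e-170, 53.36),
    ("Capsid Assembly", 1.2e-23, 4e-23, 38.79),
    ("Tail Tubular Protein A", 5.4e-38, 1e-35, 40.72),
    ("Tail Tubular Protein B", 1.6e-87, 2e-81, 28.39),
    ("Internal Core Protein", 9.2e-30, 8e-26, 24.89),
    ("Tail Fiber-like Protein", 1.2e-23, 1e-14, 53.00),
]

E_VALUE_CUTOFF = 1e-5   # assumed significance cutoff, for illustration
IDENTITY_FLOOR = 30.0   # assumed identity line, for illustration


def supported_by_both(rows, cutoff=E_VALUE_CUTOFF):
    """Components whose HMMER and BLAST E values both fall below cutoff."""
    return [name for name, e_hmmer, e_blast, _ in rows
            if e_hmmer < cutoff and e_blast < cutoff]


def below_identity(rows, floor=IDENTITY_FLOOR):
    """Components whose percent identity is under the given floor."""
    return [name for name, _, _, identity in rows if identity < floor]


if __name__ == "__main__":
    print("Significant in both tools:", supported_by_both(ALIGNMENTS))
    print("Identity below floor:", below_identity(ALIGNMENTS))
```

Under these assumed thresholds, all seven pairs pass the E value cutoff in both tools, while Tail Tubular Protein B and the Internal Core Protein fall below the identity line despite their very low E values. This shows why we considered E values together with identity rather than relying on identity alone.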


References

  1. Santos-Merino, M., Singh, A. K., & Ducat, D. C. (2019). New Applications of Synthetic Biology Tools for Cyanobacterial Metabolic Engineering. Frontiers in bioengineering and biotechnology, 7, 33. https://doi.org/10.3389/fbioe.2019.00033
  2. Markley, A. L., Begemann, M. B., Clarke, R. E., Gordon, G. C., & Pfleger, B. F. (2015). Synthetic biology toolbox for controlling gene expression in the cyanobacterium Synechococcus sp. strain PCC 7002. ACS synthetic biology, 4(5), 595–603. https://doi.org/10.1021/sb500260k
  3. Mills, L. A., Moreno-Cabezuelo, J. Á., Włodarczyk, A., Victoria, A. J., Mejías, R., Nenninger, A., Moxon, S., Bombelli, P., Selão, T. T., McCormick, A. J., & Lea-Smith, D. J. (2022). Development of a Biotechnology Platform for the Fast-Growing Cyanobacterium Synechococcus sp. PCC 11901. Biomolecules, 12(7), 872. https://doi.org/10.3390/biom12070872
  4. Ahlgren, N. A., & Rocap, G. (2012). Diversity and Distribution of Marine Synechococcus: Multiple Gene Phylogenies for Consensus Classification and Development of qPCR Assays for Sensitive Measurement of Clades in the Ocean. Frontiers in microbiology, 3, 213. https://doi.org/10.3389/fmicb.2012.00213
  5. Salazar, V. W., Thompson, C. C., Tschoeke, D. A., Swings, J., Mattoso, M., & Thompson, F. L. (preprint). Insights on the taxonomy and ecogenomics of the Synechococcus collective. bioRxiv 2020.03.20.999532; doi: https://doi.org/10.1101/2020.03.20.999532
  6. Shitrit, D., Hackl, T., Laurenceau, R., Raho, N., Carlson, M., Sabehi, G., Schwartz, D. A., Chisholm, S. W., & Lindell, D. (2022). Genetic engineering of marine cyanophages reveals integration but not lysogeny in T7-like cyanophages. The ISME journal, 16(2), 488–499. https://doi.org/10.1038/s41396-021-01085-8
  7. Fedida, A., & Lindell, D. (2017). Two Synechococcus genes, Two Different Effects on Cyanophage Infection. Viruses, 9(6), 136. https://doi.org/10.3390/v9060136
  8. Zborowsky, S., & Lindell, D. (2019). Resistance in marine cyanobacteria differs against specialist and generalist cyanophages. Proceedings of the National Academy of Sciences of the United States of America, 116(34), 16899–16908. https://doi.org/10.1073/pnas.1906897116
  9. Fedida, A., & Lindell, D. (2017). Two Synechococcus genes, Two Different Effects on Cyanophage Infection. Viruses, 9(6), 136. https://doi.org/10.3390/v9060136
  10. Weigele, P. R., Pope, W. H., Pedulla, M. L., Houtz, J. M., Smith, A. L., Conway, J. F., King, J., Hatfull, G. F., Lawrence, J. G., & Hendrix, R. W. (2007). Genomic and structural analysis of Syn9, a cyanophage infecting marine Prochlorococcus and Synechococcus. Environmental microbiology, 9(7), 1675–1695. https://doi.org/10.1111/j.1462-2920.2007.01285.x
  11. Yue, H., Li, Y., Yang, M., & Mao, C. (2022). T7 Phage as an Emerging Nanobiomaterial with Genetically Tunable Target Specificity. Advanced science (Weinheim, Baden-Wurttemberg, Germany), 9(4), e2103645. https://doi.org/10.1002/advs.202103645
  12. Pope, W. H., Weigele, P. R., Chang, J., Pedulla, M. L., Ford, M. E., Houtz, J. M., Jiang, W., Chiu, W., Hatfull, G. F., Hendrix, R. W., & King, J. (2007). Genome sequence, structural proteins, and capsid organization of the cyanophage Syn5: a "horned" bacteriophage of marine synechococcus. Journal of molecular biology, 368(4), 966–981. https://doi.org/10.1016/j.jmb.2007.02.046
  13. Pearson W. R. (2013). An introduction to sequence similarity ("homology") searching. Current protocols in bioinformatics, Chapter 3, Unit3.1. https://doi.org/10.1002/0471250953.bi0301s42
  14. Włodarczyk, A., Selão, T. T., Norling, B., & Nixon, P. J. (2020). Newly discovered Synechococcus sp. PCC 11901 is a robust cyanobacterial strain for high biomass production. Communications biology, 3(1), 215. https://doi.org/10.1038/s42003-020-0910-8
  15. Tan, Y., Tian, T., Liu, W., Zhu, Z., & J Yang, C. (2016). Advance in phage display technology for bioanalysis. Biotechnology journal, 11(6), 732–745. https://doi.org/10.1002/biot.201500458
  16. NCBI BLAST. https://blast.ncbi.nlm.nih.gov/Blast.cgi
  17. Potter, S. C., Luciani, A., Eddy, S. R., Park, Y., Lopez, R., & Finn, R. D. (2018). HMMER web server: 2018 update. Nucleic acids research, 46(W1), W200–W204. https://doi.org/10.1093/nar/gky448