Several methods for the production of synthetic siRNAs suitable for in-vitro and in-vivo studies have been established. They range from solid-phase synthesis (Beaucage et al., 2008) and in-vitro transcription followed by RNase III digestion (Yang et al., 2008) to RNA polymerase II mediated expression of long dsRNA followed by RNase III digestion in-vivo and RNA polymerase III driven expression of shRNAs in-vivo (Snyder et al., 2009).
In brief, solid-phase synthesis of oligoribonucleotides is a cyclic procedure based on the phosphoramidite method utilizing orthogonal protection groups. This procedure enables the production of nearly error-free sequences with high yield. Additionally, it allows the insertion of modifications such as 2’-O-Me, and even of deoxyribonucleotides (Beaucage et al., 2008). For industrial production and custom synthesis, this method is regarded as the “gold standard”. To enable sequence-specific synthesis, siRNAs have to be designed in-silico if a natural siRNA for a specific target has not yet been identified. Several sets of rules have been proposed for the in-silico design of siRNAs, e.g. by Elbashir et al. (2001), Reynolds et al. (2004), Ui-Tei et al. (2004) and Jagla et al. (2005), to name a few. These sets of rules mainly focus on specific sequence motifs and further elaborate the principles of siRNA function proposed by Schwarz et al. (2003) and Khvorova et al. (2003), which state that incorporation of the antisense strand into the RNA-induced silencing complex depends on weaker base pairing at its 5’-end. Nowadays, several algorithms such as siDirect 2.0 are available for the selection of siRNA target sequences.
Generally, in-vitro systems are based on the expression of single-stranded RNAs from promoters specific to bacteriophage RNA polymerases such as T7 RNA polymerase. To generate the dsRNA fragments that RNase III requires as substrate for siRNA production, the expression cassette is flanked by two promoters. One promoter drives expression of the leading strand and the other drives expression of the complementary strand in the opposite direction. This yields two single-stranded RNAs with complementary sequences, which then anneal to form the double-stranded RNA. Another possible approach is the insertion of a linker designed in such a way that a stem-loop is formed, leading to a dsRNA (Taxman et al., 2013).
In-vivo production methods for siRNA have been reported in the context of eukaryotic expression vectors. The expression of shRNAs, which can act as siRNAs in-vivo, can be driven by the U6 promoter. Such a construct can be inserted into a eukaryotic expression vector and then be either stably or transiently transfected into cell cultures. For in-vivo applications, such constructs are usually delivered via vehicles, e.g. adeno-associated viruses, allowing continuous episomal presence of the construct and therefore continuous expression of the siRNA.
In our project, we adapted a method published by Huang et al. in 2013. This method is based on the expression of a 500 to 1000 nt long recombinant single-stranded RNA by T7 RNA polymerase. This single-stranded RNA consists of the target area for the siRNAs in sense direction, a linker sequence, and the target area for the siRNAs in antisense direction.
Upon expression, this long single-stranded RNA folds into a dsRNA as described above, resulting in a 250 to 500 bp long dsRNA that comprises the same sequence as the target area. E. coli expresses RNase III-like enzymes which recognize and digest this dsRNA into 18 to 23 bp long dsRNAs. Additionally, the plasmid containing the RNA expression cassette carries the gene for the p19 protein of the tomato bushy stunt virus. This protein naturally acts as an RNAi inhibitor by binding short dsRNAs of 18 to 25 bp in a sequence-independent manner, thereby depriving the plant cell of its defense mechanism against the viral infection (Lakatos et al., 2004).
Huang and colleagues utilized this property to extract siRNAs in high purity and yield from an E. coli lysate by adding a His-tag to the C-terminus of the p19 protein, thus enabling clean-up by nickel-affinity chromatography. Native siRNAs can then be released from the complex by denaturing the protein, followed by a clean-up using either native PAGE or HPLC.
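The cassette layout described above (target area in sense direction, linker, target area in antisense direction) can be illustrated with a minimal sketch. The target and linker sequences below are short hypothetical placeholders chosen for readability; they are not the sequences used in this project or by Huang et al., and the function names are our own.

```python
# Sketch of the hairpin cassette layout: sense target + linker + antisense target.
# All sequences are hypothetical placeholders, not the project's real sequences.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may only contain A, C, G and T")
    return dna.translate(COMPLEMENT)[::-1]


def build_hairpin_cassette(target: str, linker: str) -> str:
    """Concatenate the target (sense), the linker and the target (antisense)."""
    target = target.upper()
    return target + linker.upper() + reverse_complement(target)


target = "ATGGCTAGCTTAGGCA"  # hypothetical 16 nt target area
linker = "TTCAAGAGA"         # hypothetical loop-forming linker
cassette = build_hairpin_cassette(target, linker)

print(cassette)
print(len(cassette), "nt in total;", len(target), "bp stem after folding")
```

Because the second arm is the reverse complement of the first, the transcript can fold back on itself, so the stem length equals the length of the target area and the full transcript is roughly twice as long.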
Whilst rational design and solid-phase synthesis result in a single sequence-specific siRNA whose efficiency has to be examined unless it has already been published, the approach using recombinant expression in E. coli yields a multitude of siRNAs against a single target. Even though we do not know which siRNAs will work with this approach, the probability that some of them work is very high because a large target area is covered. This is expected to yield an even higher efficiency than rational design: even if no single siRNA achieves the knock-down efficiency of 70 to 80 percent typically obtained by rational design (Henrik J. Ditzel et al., 2021), the efficiencies of the individual siRNAs are expected to add up. Furthermore, such a broad target area enables the knock-down of genes with a high mutation rate, such as viral genes, e.g. of SARS-CoV-2. As this has been proven to be a viable approach, we decided to produce our siRNAs using this method. While the design of such an expression cassette and its expression might seem time-consuming and thus daunting, once the cassette is designed and cloned, siRNAs can be expressed and cleaned up within three days. This promises to be a more efficient and cheaper way than having each siRNA synthesized separately by a commercial supplier using solid-phase synthesis if no in-house oligonucleotide synthesizer is available. This method can also be used to screen for areas rich in siRNA targets in order to design the expression cassette for said siRNAs.
Agarose gel electrophoresis was performed to check transformation success. Plasmids isolated from E. coli after transformation with pEGFP-C1 and pRK5 showed sizes congruent with the proposed plasmid sizes (compare Table 1, see Figure 1A).
Plasmid | Proposed size [bp] | Fragment size [bp] |
pEGFP-C1 | 4731 | 4000 - 5000 |
pRK5 | 4661 | 4000 - 5000 |
Next Generation Sequencing (NGS) of pEGFP-C1 with the SV40A primer verified the presence of the EGFP gene, which confirmed the results obtained by electrophoresis. For pRK5, the presence of the MCS was proven, but sequencing with CMV forward and SV40A reverse revealed missing restriction sites in the MCS. For pEX-A258 HSV UL19 AA816:1148-6xHis, no electrophoresis was conducted. Instead, the plasmid was directly sequenced with the AmpR primer, which proved the presence of the insert and its correct orientation.
Here sequence alignments can be found for pEGFP-C1, pRK5 and pEX-A258.
PCR for Kozak sequence insertion and insert amplification is expected to yield a fragment of 1038 bp. We identified an amplicon of ~1000 bp by agarose gel electrophoresis on a 1.2 % gel. We concluded that the PCR was successful and used the obtained product for further processing.
Plasmid | Theoretical No. of fragments | Theoretical size of obtained fragments [bp] | Obtained No. of fragments | Obtained fragment sizes [bp] |
pRK5-HSV UL19 AA816:1148-6xHis | 2 | 4626 + 1032 | 2 | ~4500 + ~1000 |
The amplicon was cloned into the pRK5 backbone, which should result in a plasmid with a size of 5658 bp. Subsequently, transformation into E. coli was performed, the plasmid was isolated, and digestion with EcoRI and HindIII was conducted in order to confirm insert presence. Visualisation by ethidium bromide staining on a 1.2 % agarose gel showed a distinct pattern of two fragments: one at ~4500 bp and one at ~1000 bp (see Figure 1C). These sizes are comparable to those obtained by in-silico digestion of the plasmid sequence using NEB-Cutter v3 (see Table 2). The obtained plasmid was sequenced with CMV forward and SV40A reverse in order to confirm the correct sequence and orientation. Sequencing confirmed the results of the restriction digest. The proposed plasmid map is shown in Figure 1D.
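The theoretical fragment sizes of such a double digest follow directly from the positions of the two cut sites on the circular plasmid. The following is a minimal sketch of that arithmetic, not a replacement for NEB-Cutter: the cut positions used here are hypothetical values chosen so that the result matches the sizes in Table 2, and the function name is our own.

```python
def digest_circular(plasmid_length: int, cut_positions: list[int]) -> list[int]:
    """Return fragment sizes (bp, largest first) of a circular plasmid cut at the given positions."""
    if not cut_positions:
        return [plasmid_length]  # an uncut plasmid stays one circular molecule
    cuts = sorted(set(cut_positions))
    if cuts[0] < 0 or cuts[-1] >= plasmid_length:
        raise ValueError("cut positions must lie within the plasmid")
    # Fragments between neighbouring cut sites ...
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    # ... plus the fragment that spans the origin of the circular map.
    fragments.append(plasmid_length - cuts[-1] + cuts[0])
    return sorted(fragments, reverse=True)


# Hypothetical EcoRI and HindIII cut positions on the 5658 bp plasmid.
print(digest_circular(5658, [900, 1932]))
```

With a single cut site the function returns one linear fragment of full plasmid length, which is the expected behaviour for a linearised plasmid.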
Here sequence alignments can be found for pRK5-HSV UL19 with CMV and pRK5-HSV UL19 with SV40A.
We initially simulated the folding of the chosen loop element in the dsRNA expression cassette with UNAFold (Markham & Zuker, 2008) by selecting a sequence spanning from the +1 site of the T7 promoter to the 20th base of the T7 terminator. Two possible folding structures with ΔGs of -31.40 kcal/mol (Structure 1, Figure 2A) and -30.60 kcal/mol (Structure 2, Figure 2B) were identified.
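To put the small difference between the two predicted ΔG values into perspective, the relative population of the two structures can be estimated from a Boltzmann factor. This is a rough sketch under assumptions that are ours and not part of the folding simulation: only these two structures are considered, they are in thermal equilibrium, and the folding temperature is 37 °C.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def population_ratio(dg_a: float, dg_b: float, temp_celsius: float = 37.0) -> float:
    """Boltzmann ratio of structure A to structure B for free energies given in kcal/mol."""
    rt = R_KCAL * (temp_celsius + 273.15)
    return math.exp(-(dg_a - dg_b) / rt)


ratio = population_ratio(-31.40, -30.60)  # Structure 1 versus Structure 2
fraction_structure_1 = ratio / (1.0 + ratio)
print(f"Structure 1 : Structure 2 = {ratio:.2f} : 1")
print(f"Estimated fraction of Structure 1: {fraction_structure_1:.2f}")
```

Under these assumptions a difference of 0.8 kcal/mol corresponds to a ratio of roughly 3.7 to 1, so both structures would be expected to be populated to a relevant degree.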
pUC19-p19-siRNA empty was engineered by restriction-based cloning of a PCR-amplified backbone and an insert obtained by solid-phase synthesis. Primers for pUC19 backbone amplification were analysed on a temperature gradient ranging from 64 °C to 71 °C. The annealing temperature proposed by the NEB Tm Calculator was 72 °C. For all temperatures, we obtained amplicons with a size ranging from 1500 to 2000 bp as well as a second fragment between 600 and 700 bp (Figure 3A). An annealing temperature of 69 °C was chosen and the PCR was repeated (Figure 3B). Plasmids obtained by plasmid preparation were sequenced with p19 forward and AmpR reverse. Insert presence was proven, as well as the integrity of the loop structure.
Here sequence alignments can be found for pUC19-p19-empty p19F and pUC19-p19-empty AmpR.
siRNA target areas were amplified by PCR with SacI/XhoI or SalI/NotI sequence extensions. We obtained amplicons with a size ranging from 200 to 300 bp (Figure 3C), which was congruent with the expected sizes of 266 bp for the SacI/XhoI primer set and 268 bp for the SalI/NotI primer set. After performing a two-step cloning and transformation into E. coli, we analysed the obtained plasmids by restriction digest with SacI and NotI to confirm the presence of the insert in both restriction cassettes. Results were visualised on a 1.2 % agarose gel stained with ethidium bromide (Figure 3D). We compared the resulting fragment sizes to the theoretical sizes obtained by in-silico digest in NEB Cutter v3 (Table 4).
Plasmid | Theoretical No. of fragments | Theoretical size of obtained fragments [bp] | Obtained No. of fragments | Obtained fragment sizes [bp] |
pUC19-p19-siRNA UL19 | 2 | 2758 + 547 | 2 | ~2500 + ~550 |
Plasmids were sequenced with p19F and AmpR to confirm correct insertion and conservation of the loop structure as well as the integrity of the p19 gene.
Here sequence alignments can be found for pUC19-p19-siRNA UL19 p19F and pUC19-p19-siRNA AmpR.
Subsequently, we checked whether the formation of a loop structure is predicted for the obtained sequence of the dsRNA expression cassette. We chose a sequence from the +1 site of the T7 promoter to the 20th base of the T7 terminator to include the full expression cassette. Folding simulation resulted in one proposed structure with a ΔG of -605.90 kcal/mol.
We furthermore introduced a 7 bp long fragment between the tac promoter and the start codon of p19 by primer extension PCR.
Sequence alignment for pUC19-p19-siRNA tac extension can be found here.
After the successful cloning of pUC19-p19-siRNA UL19, we extended the insertion by a lac operator sequence and a Shine-Dalgarno sequence using the same principle. The resulting plasmids were analysed by restriction digest with ClaI and HindIII on a 1.2 % agarose gel stained with ethidium bromide (Figure 3E). We compared the obtained fragment sizes to the theoretical sizes acquired by in-silico digest in NEB Cutter v3 (Table 5).
Plasmid | Theoretical No. of fragments | Theoretical size of obtained fragments [bp] | Obtained No. of fragments | Obtained fragment sizes [bp] |
pUC19-p19-siRNA UL19 tacO-LacO | 2 | 3005 + 341 | 2 | ~3000 + ~350 |
Afterwards, the plasmid was sequenced with p19F and AmpR to confirm correct insertion of both inserts and conservation of the loop structure as well as the integrity of the p19 gene. Furthermore, the plasmid was sequenced with pUC19-pBR322ori-fwd to confirm complete presence of the tac-lacO-Shine-Dalgarno structural element. The proposed plasmid structure is displayed in Figure 3F.
Sequence alignment for pUC19-p19-UL19 siRNA Tac-LacO-SD pBR322 fwd can be found here.
We assessed the functionality of our plasmid in two different ways. First, we checked for expression of p19 by SDS-PAGE and Western blot. To determine the protein concentration, a Bradford assay was conducted (for results see Table 6, Figure 4A). 30 µg of protein from wash fraction 1C and a linear mass dilution from 3 µg to 3 ng of the Ni-NTA bead-bound p19 fraction 1 were analysed by SDS-PAGE. Ponceau staining of the blot revealed an accumulation of a protein with a mass of around 28 kDa in each fraction. This protein is most dominantly present in wash fraction 1C, together with other proteins of different sizes. In the linear mass gradient, only a small band of a ~28 kDa protein is visible, with decreasing intensity at lower concentrations. In comparison to the loaded BSA masses (1 µg and 500 ng), only wash fraction 1C contains comparable amounts of protein. The linear gradient shows high purity of the fraction, with only one protein visible (Figure 4C).
Functionality of the T7 expression cassette was assessed by in-vitro transcription. Our self-designed and cloned plasmid was compared to a cassette obtained by solid-phase synthesis and to a full-length PCR amplicon of this synthesized cassette. Neither of the two cassettes derived from solid-phase synthesis yielded any visible dsRNA, while our cassette produced a dsRNA with a size of ~300 bp (see Figure 4B).
Sample | OD1 1:10 [610nm] | OD2 1:10 [610nm] | Mean | Standard deviation | c (diluted) [µg/mL] | c (undiluted) [µg/mL] |
1a | 0.109 | 0.099 | 0.104 | 0.007 | 74.3 | 742.9 |
1b | 0.033 | 0.034 | 0.034 | 0.001 | 23.9 | 239.3 |
1c | 0.363 | 0.335 | 0.349 | 0.020 | 249.3 | 2492.9 |
p19 1 | 0.336 | 0.335 | 0.336 | 0.001 | 239.6 | 2396.4 |
2a | 0.123 | 0.127 | 0.125 | 0.003 | 89.3 | 892.9 |
2b | 0.024 | 0.023 | 0.024 | 0.001 | 16.8 | 167.9 |
2c | 0.365 | 0.360 | 0.363 | 0.004 | 258.9 | 2589.3 |
p19 2 | 0.368 | 0.369 | 0.369 | 0.001 | 263.2 | 2632.1 |
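The columns of Table 6 are related by simple arithmetic, which can be sketched as follows. The calibration slope is an assumption on our part: it is back-calculated from the table (concentration divided by mean OD, about 714.3 µg/mL per OD unit) and presumes a linear standard curve through the origin, since the actual BSA standard curve is shown only in Figure 4A. The function name is our own.

```python
from statistics import mean, stdev

# Assumption: slope back-calculated from Table 6, linear standard curve through the origin.
SLOPE_UG_PER_ML_PER_OD = 714.3
DILUTION_FACTOR = 10  # samples were measured at a 1:10 dilution


def bradford_concentration(od_replicates, slope=SLOPE_UG_PER_ML_PER_OD, dilution=DILUTION_FACTOR):
    """Return mean OD, sample standard deviation, diluted and undiluted concentration in µg/mL."""
    od_mean = mean(od_replicates)
    od_sd = stdev(od_replicates)
    c_diluted = od_mean * slope
    return od_mean, od_sd, c_diluted, c_diluted * dilution


od_mean, od_sd, c_diluted, c_undiluted = bradford_concentration([0.109, 0.099])  # sample 1a
print(f"mean OD = {od_mean:.3f}, SD = {od_sd:.3f}")
print(f"c (diluted) = {c_diluted:.1f} µg/mL, c (undiluted) = {c_undiluted:.1f} µg/mL")
```

Recomputing each row this way is a quick consistency check between the replicate readings, their mean and the reported concentrations.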
Production of pro-siRNA was analysed by native PAGE followed by methylene blue staining of the gel (Figure 5). For each fraction, 500 ng were loaded. Two different fragment patterns can be observed: one above the positive control (siRNA, 21 nt) and one slightly below the positive control. Fragments running at roughly the same height as the positive control are therefore between 18 and 21 nt in size. These were cut from the gel for further processing.
During the development of our plasmids, we verified the success of our experiments after each step. The presence and size of the inserts were examined after each cloning step by restriction digest and agarose gel electrophoresis. Additionally, the inserted fragments were sequenced at all critical steps to ensure congruence with the proposed sequence. This enabled us to construct the plasmids needed for recombinant expression of the His-tagged UL19 gene fragment in HeLa cells (pRK5-HSV UL19 AA816:1148-6xHis) and for production of pro-siRNAs (pUC19-p19-siRNA UL19). For quality control, we digested the plasmids with two restriction enzymes and compared the sizes of the resulting fragments with the expected sizes. This method is not fully accurate, as sizes can only be estimated to within about 50 bp for small fragments and 100 bp for larger fragments, but it allows first conclusions on whether cloning worked. It does not, however, indicate whether the insert is in the correct orientation or whether important regions are intact. In the case of pUC19-p19-siRNA UL19 (Table 4), the observed size of the larger of the two fragments was roughly 250 bp smaller than expected (~2500 bp versus 2758 bp). However, sequencing showed that the inserts were present and the important p19 gene was intact. This shows why double-checking with sequencing data is necessary. We speculate that the size difference results from the loop structure and possibly other secondary or tertiary structures that persist even after cutting. It is known that compact forms of DNA such as supercoiled circular DNA travel faster through agarose gels than linear DNA (Aaij & Borst, 1972).
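The comparison of expected and observed fragment sizes described above can be written down as a small check. This is a sketch under one assumption of ours: the boundary between "small" and "larger" fragments is set to 1000 bp, which the text does not specify. The size pairs are taken from Tables 4 and 5.

```python
def within_tolerance(expected_bp: int, observed_bp: int, small_limit: int = 1000) -> bool:
    """Check whether an observed gel band matches the expected size within the reading accuracy."""
    # Assumption: fragments below small_limit count as "small" (50 bp accuracy), others 100 bp.
    tolerance = 50 if expected_bp < small_limit else 100
    return abs(expected_bp - observed_bp) <= tolerance


digests = {
    "pUC19-p19-siRNA UL19 (Table 4)": [(2758, 2500), (547, 550)],
    "pUC19-p19-siRNA UL19 tacO-LacO (Table 5)": [(3005, 3000), (341, 350)],
}

for plasmid, fragments in digests.items():
    for expected, observed in fragments:
        verdict = "ok" if within_tolerance(expected, observed) else "deviates, verify by sequencing"
        print(f"{plasmid}: expected {expected} bp, observed ~{observed} bp -> {verdict}")
```

Only the larger fragment of the Table 4 digest falls outside the reading accuracy, which is the case that sequencing resolved.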
The expression of the HSV-UL19 mRNA was too low, which is why we added a Kozak sequence downstream of our enhanced CMV promoter. The Kozak sequence is the conserved consensus sequence surrounding the AUG start codon for the initiation of translation in eukaryotes. It plays a role in the binding of the ribosome complex to the AUG during initiation of translation (Alekhina & Vassilenko, 2012). Translation does not always start at the first AUG, but at the first AUG embedded in a Kozak sequence (Dunston et al., 2004). This sequence was initially missing in our plasmid. Either it was removed previously by the research group that provided us with this plasmid backbone, or this specific version did not have one to begin with. pRK5 does not exist in one single version; different versions of the plasmid have been published and are available for purchase. After cross-referencing with the supplier Addgene, we noticed that there are multiple proposed plasmid vector maps (see notes at Addgene plasmid 3944).
In our plasmid, the CMV promoter from the human cytomegalovirus is used. It is widely used for transient expression and provides a high level of expression and good long-term efficiency in most cell types (Damdindorj et al., 2014). It is therefore appropriate for proof-of-principle or proof-of-concept studies in cell culture in a simple but efficient manner.
The originally designed plasmid for siRNA production showed no expression of the p19 protein and therefore no pulldown of pro-siRNAs could be achieved. For this reason, we added a 7 bp consensus sequence of the tac promoter, a lac operator and a Shine-Dalgarno sequence. This solved our initial problem and added a regulatory element that can be tuned with the same inducer as the expression of the genomic T7 RNA polymerase. The tac promoter is a synthetic hybrid promoter derived from the trp and lac promoters and provides strong expression in E. coli (de Boer et al., 1983). The Shine-Dalgarno sequence in prokaryotes has a function analogous to that of the Kozak sequence in eukaryotes and is located near the start codon (Steitz & Jakes, 1975). To better control expression from our plasmid, we added terminator regions to it. Such regions were not included by Huang and Lieberman in their published sequence (Huang & Lieberman, 2013). Owing to the improved expression control, especially of the dsRNA cassette, the desired amount of correctly folded and sequence-matching dsRNA can be produced. Consequently, the yield of functional pro-siRNAs against the chosen target can be increased.
During Western blotting, we had problems with heat distribution in the gel. Too much heat caused the gel to expand and, as a result, the bands to run in a warped line instead of a straight one (Mahmood & Yang, 2012). Consequently, the p19 protein did not appear at the expected 22 kDa but at ~28 kDa. This is not ideal, but still acceptable. Furthermore, we observed a high amount of p19 protein as well as other proteins in our wash fraction. Several reasons could apply: overexpression of the p19 gene resulting in inclusion bodies (Gutiérrez-González et al., 2019); saturation of the Ni-NTA beads, which is unlikely but possible since we induced expression at 20 °C overnight; or the accumulation of unwanted proteins occupying the Ni-NTA beads because they show a higher affinity for the beads at the given pH and ion concentration. We therefore propose that, if protein expression is induced overnight, a different type of protein tag be used which is more specific and has a very high affinity for its ligand, such as a GST tag or FLAG tag (Lichty et al., 2005). The His-tag is well suited for short induction periods, for example one to two hours as proposed by Huang and Lieberman (Huang & Lieberman, 2013).
On the native PAGE used for siRNA analysis (Figure 5), we observed a wide band for the positive control because of overloading. This makes it impossible to read the exact position of the 21 nt marker. Ideally, the solutions loaded onto a native PAGE contain 0.1 to 10 μg/mL RNA (Woodson & Koculi, 2009). The 2 µg/µL that we loaded corresponds to 2000 µg/mL, i.e. 200-fold above the upper limit, which was far too much to obtain a clear band at the right position.
In conclusion, we were able to design both plasmids needed for our proof-of-concept studies. Furthermore, we were able to enhance the sequences provided to us by the Lieberman Lab and to demonstrate their functionality. The proposed expression system for pro-siRNAs is a simple workflow that can be carried out with basic techniques of molecular biology. Once established, it is therefore a simple and efficient way to test different sequences for the presence of possible siRNA sequences or to facilitate the knock-down of genes that are highly polymorphic or prone to mutation, such as viral genes. We are confident that the obtained pro-siRNAs are capable of initiating the knock-down of our target gene.