Authors: Kyung Duk Koh, Jay Hesselberth & Francesca Storici
Ribonucleotides are now widely considered the most frequently incorporated non-canonical nucleotides in genomic DNA. Here, we describe an approach, ribose-seq, to prepare DNA libraries for next-generation sequencing to map ribonucleotide incorporation into DNA. Ribose-seq libraries are constructed by specifically targeting and capturing the unique 2′,3′-cyclic monophosphate or 2′-monophosphate ends generated by alkali treatment of ribonucleotides embedded in DNA. Upon high-throughput sequencing and analysis, the distribution and identity of ribonucleotides in the genome can be determined, as recently reported using yeast as a model organism1. Ribose-seq could potentially be applied to any cell type of any organism to allow profiling of ribonucleotide incorporation into genomic DNA.
Ribonucleotides, also known as ribonucleoside 5′-monophosphates (rNMPs), which are normally monomers of RNA, have been found to be the most abundant non-canonical nucleotides incorporated into DNA2,3. Sources of ribonucleotide incorporation into DNA include incomplete Okazaki fragment maturation, oxidative damage, and numerous DNA polymerases. Quantitative measurements of rNMPs following alkali treatment of genomic DNA from RNase H2-deficient yeast cells estimated ~2,000 rNMPs per genome4,5, while a similar measurement on genomic DNA from RNase H2-deficient mouse embryonic fibroblasts revealed more than one million rNMPs in the mouse genome6. Ribonuclease H type 2 (RNase H2) is a major protein factor involved in the repair of rNMPs in DNA and initiates ribonucleotide excision repair2,3. If not removed, rNMPs have been shown to have both positive roles, such as acting as a strand-discrimination signal during mismatch repair (MMR), and negative consequences, including replication stress and genome instability2,3.
Despite the numerous studies showing the presence of rNMPs in DNA, information about the distribution and identity of these rNMPs in genomic DNA was unavailable until very recently. We developed ribose-seq to capture rNMPs in genomic DNA along with their upstream DNA sequence for library construction, next-generation sequencing, and analysis1. Application of ribose-seq to genomic DNA from RNase H2-deficient Saccharomyces cerevisiae cells revealed a widespread but non-random distribution of rNMPs in the yeast genome, with specific base preferences, neighboring DNA sequence preferences, leading- and lagging-strand bias, and hotspots. Ribose-seq allows us to explore rNMP incorporation into DNA potentially in any cell type of any organism and opens up a new direction to better comprehend the impact of rNMPs on genome integrity.
Here, we detail the steps of ribose-seq library construction as shown in Figure 1. Ribose-seq employs the unique capacity of Arabidopsis thaliana tRNA ligase (AtRNL) to ligate 2′,3′-cyclic monophosphate or 2′-monophosphate termini of DNA and RNA to 5′-monophosphate ends of DNA and RNA7,8. A double-stranded (ds) sequencing adaptor, which carries both primer sequences for the later PCR amplification step, is initially ligated to genomic DNA fragments, as AtRNL prefers self-ligation or circularization. Alkali treatment exposes the rNMP-specific 2′,3′-cyclic monophosphate or 2′-monophosphate termini. Upon ligation by AtRNL, the rNMP is captured along with its upstream DNA sequence, while unligated linear single-stranded (ss) DNA is degraded with T5 exonuclease. Yeast 2′-phosphotransferase Tpt1 then removes the 2′-monophosphate at the ligation junction, allowing subsequent PCR amplification of the library. The details given in this protocol correspond to an Illumina library to be sequenced on a MiSeq with 50-cycle single-end reads.
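The capture principle described above can be illustrated with a toy model. This is a minimal sketch and not part of the published analysis pipeline: it assumes a strand written 5′ to 3′ in which an rNMP is shown as a lowercase letter, and it models alkali cleavage as a break immediately 3′ of each rNMP. The function names and the example sequence are hypothetical.

```python
def alkali_cleave(strand):
    """Split a 5'->3' strand immediately 3' of every rNMP.

    rNMPs are written as lowercase letters (an assumption of this toy
    model); deoxyribonucleotides are uppercase.
    """
    fragments = []
    start = 0
    for i, base in enumerate(strand):
        if base.islower():
            # The fragment ends with the rNMP, which would carry the
            # 2',3'-cyclic monophosphate or 2'-monophosphate terminus.
            fragments.append(strand[start:i + 1])
            start = i + 1
    if start < len(strand):
        fragments.append(strand[start:])
    return fragments


def ends_with_rnmp(fragment):
    """True if the fragment's 3' terminal nucleotide is an rNMP."""
    return fragment[-1].islower()


# Hypothetical strand with a single embedded rNMP ("g").
strand = "ACGTTAGCgTTCGATC"
for fragment in alkali_cleave(strand):
    if ends_with_rnmp(fragment):
        label = "rNMP plus upstream sequence (substrate for AtRNL)"
    else:
        label = "no rNMP at the 3' end"
    print(fragment, "->", label)
```

In this picture, only the fragment that ends in the rNMP presents the terminus that AtRNL recognizes, which is why the rNMP is recovered together with its upstream sequence.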
A. Preparation of rNMPs-embedded genomic DNA
B. Preparation of ds sequencing adaptor
A 5-fold excess of Adaptor.S is added to the mixture to ensure that all Adaptor.L molecules are annealed to Adaptor.S. The remaining single-stranded Adaptor.S will be removed in subsequent purification steps.
3.Perform annealing by heating the mixture to 95–100 °C and gradually cooling to room temperature. The resulting ds Adaptor.L/Adaptor.S is at a concentration of 25 uM.
4.Desalt the mixture by using a spin column. Koh et al.1 used illustra MicroSpin G-25 Column.
5.Use NanoDrop to quantify the amount of desalted ds Adaptor.L/Adaptor.S. Typically, the resulting concentration ranges from 10 to 13 uM. A concentration of 10 uM will be assumed for subsequent steps of the protocol.
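The annealing arithmetic in this section (Adaptor.L as the limiting strand at 25 uM, Adaptor.S in 5-fold molar excess) can be sketched as a simple C1V1 = C2V2 calculation. The stock concentrations and total volume below are hypothetical placeholders, not values from the protocol, and the helper name is illustrative only.

```python
def anneal_volumes(total_ul, adaptor_l_final_um, excess_fold,
                   stock_l_um, stock_s_um):
    """Return the volumes (uL) of each oligo stock and of the remaining
    buffer/water needed for an annealing mix.

    Adaptor.L is limiting, so its final concentration equals the
    ds adaptor concentration; Adaptor.S is added in molar excess.
    """
    adaptor_s_final_um = adaptor_l_final_um * excess_fold
    vol_l = adaptor_l_final_um * total_ul / stock_l_um
    vol_s = adaptor_s_final_um * total_ul / stock_s_um
    remainder = total_ul - vol_l - vol_s
    if remainder < 0:
        raise ValueError("stocks are too dilute for the requested mix")
    return {"Adaptor.L": vol_l, "Adaptor.S": vol_s, "buffer+water": remainder}


# Hypothetical example: 40 uL mix from 200 uM stocks of each oligo.
mix = anneal_volumes(total_ul=40, adaptor_l_final_um=25, excess_fold=5,
                     stock_l_um=200, stock_s_um=200)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} uL")
```

The check on the remaining volume flags stock concentrations that cannot reach the 5-fold excess within the chosen reaction volume.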
C. Fragmentation of rNMPs-embedded genomic DNA
Genomic DNA could be digested with a different set of restriction enzymes that produce blunt ends. However, one needs to make sure that the restriction sites are well distributed in the genome and that the digestion yields a population of fragments with an average size of 800–1,500 bp.
2.Incubate at 37 °C overnight.
3.Purify the fragmented DNA by using a spin column. Koh et al.1 used the QIAGEN spin column from their PCR Purification Kit. Both reactions can be purified using a single column with elution volume of 30 uL.
4.Use Qubit 2.0 (dsDNA HS) to quantify the amount of fragmented DNA. Typically, the concentration of the resulting DNA is ~200 ng/uL, following the reaction conditions listed above. A concentration of 200 ng/uL will be assumed for subsequent steps of the protocol.
5.Check the size range of the fragmented DNA by using the Experion DNA 12K Analysis Kit. Typically, with the reaction conditions listed above, the fragmentation results in an average size of ~1,500 bp.
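For planning the downstream ligation, it can help to express the fragmented DNA as a molar concentration. The sketch below converts the typical values quoted above (~200 ng/uL, ~1,500 bp average) to nM. It assumes an average mass of 660 g/mol per base pair, a common approximation that is not stated in this protocol; the function name is illustrative.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair (approximation, an assumption here)


def dsdna_nanomolar(ng_per_ul, mean_bp, bp_mass=AVG_BP_MASS):
    """Convert a dsDNA mass concentration to a molar concentration.

    ng/uL is numerically equal to mg/L; dividing by the molar mass
    (g/mol) gives mmol/L, and multiplying by 1e6 gives nmol/L (nM).
    """
    return ng_per_ul / (mean_bp * bp_mass) * 1e6


conc_nm = dsdna_nanomolar(200, 1500)
print(f"200 ng/uL at 1,500 bp average: about {conc_nm:.0f} nM fragments")
# Each ds fragment has two ends available for adaptor ligation.
print(f"Fragment ends: about {2 * conc_nm:.0f} nM")
```

Because the fragment population is heterogeneous, the result is only an estimate based on the average size.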
D. dA-tailing and ds sequencing adaptor-ligation of fragmented rNMPs-embedded DNA
2.Incubate at 37 °C for 30 min.
3.Purify using a spin column. Koh et al.1 used the QIAGEN spin column from their PCR Purification Kit, with elution volume of 30 uL.
4.Set up a sequencing adaptor-ligation reaction as follows:
5.Incubate at 15 °C overnight.
6.Purify using Agencourt RNAClean XP with elution volume of 30 uL.
E. Alkali treatment of adaptor-ligated rNMPs-embedded DNA
2.Incubate at 55 °C for 2 h.
3.Neutralize with 2 M HCl to pH 7. Use pH Litmus Paper to check the pH. Typically, 7.5–8 uL is needed for neutralization.
4.Purify using Agencourt RNAClean XP with elution volume of 20 uL.
5.Heat the resulting solution at 95 °C for 3 min to ensure denaturation of dsDNA and immediately chill on ice.
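The neutralization volume in step 3 can be checked with simple stoichiometry. This sketch assumes a 1:1 reaction between HCl and hydroxide and ignores any buffering by other reaction components; the function names are illustrative. It shows that 7.5–8 uL of 2 M HCl corresponds to roughly 15–16 umol of acid.

```python
def acid_micromoles(volume_ul, molarity):
    """uL x mol/L = umol of acid delivered."""
    return volume_ul * molarity


def acid_volume_ul(base_umol, acid_molarity):
    """Volume of acid (uL) needed to neutralize a given amount of base,
    assuming 1:1 stoichiometry and no buffering."""
    return base_umol / acid_molarity


for volume in (7.5, 8.0):
    print(f"{volume} uL of 2 M HCl = {acid_micromoles(volume, 2.0):.1f} umol H+")

# Inverse use: acid volume for a known amount of hydroxide in the reaction.
print(f"15 umol base needs {acid_volume_ul(15.0, 2.0):.1f} uL of 2 M HCl")
```

The calculation only gives a starting point; the pH should still be confirmed with pH paper as described in step 3.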
F. Self-ligation (circularization) of rNMP-terminating DNA by AtRNL
Final reaction concentration of AtRNL is 1 uM.
2.Incubate all reactions at 30 °C for 1 h.
3.Purify each reaction using RNAClean XP with elution volume of 30 uL.
G. Removal of linear ssDNA
T5Exo– samples may not be necessary, as dA-tailing and adaptor-ligation reactions are standard steps. However, they do act as a positive control for the later PCR reaction.
2.Incubate all reactions at 37 °C for 2 h.
3.Purify each reaction using RNAClean XP with elution volume of 20 uL.
H. Removal of 2′-phosphate at the ligation junction
Final reaction concentration of Tpt1 is 1 uM. DNA indicates either AtRNL– T5Exo–, AtRNL– T5Exo+, AtRNL+ T5Exo–, or AtRNL+ T5Exo+ product.
2.Incubate all reactions at 30 °C for 1 h.
3.Purify each reaction using RNAClean XP with elution volume of 30 uL.
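The four sample combinations carried through the AtRNL, T5 exonuclease, and Tpt1 steps can be enumerated as a small bookkeeping sketch. The expected outcomes are taken from the notes in this protocol (T5Exo– samples serve as a positive control for PCR, AtRNL– T5Exo+ should show primer dimers only); the function name is illustrative, and the ASCII "-" is used in the code in place of the typographic minus sign.

```python
from itertools import product


def expected_pcr_outcome(atrnl, t5exo):
    """Expected PCR result for a sample, as described in the protocol notes."""
    if not t5exo:
        # Linear adaptor-ligated DNA was not degraded.
        return "non-specific amplification (positive control for PCR)"
    if atrnl:
        return "ribose-seq library"
    return "no amplification expected (primer dimers only)"


samples = []
for atrnl, t5exo in product([False, True], repeat=2):
    name = "AtRNL{} T5Exo{} Tpt1+".format("+" if atrnl else "-",
                                          "+" if t5exo else "-")
    samples.append((name, expected_pcr_outcome(atrnl, t5exo)))

for name, outcome in samples:
    print(f"{name}: {outcome}")
```

A Primers-only reaction is run alongside these four samples as a further negative control.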
I. PCR amplification and library verification
DNA indicates either AtRNL– T5Exo– Tpt1+, AtRNL– T5Exo+ Tpt1+, AtRNL+ T5Exo– Tpt1+, or AtRNL+ T5Exo+ Tpt1+ product. 5–30 uL of each product could be used as template for PCR. Koh et al.1 used 20 uL. For products which were not treated with T5 Exonuclease (AtRNL– T5Exo– Tpt1+ and AtRNL+ T5Exo– Tpt1+), 5 uL is sufficient to visualize non-specific amplification.
2.Run PCR with the following settings:
PCR could be run for 26–32 cycles. Koh et al. used 30 cycles.
3.Run 6% Non-denaturing PAGE with 10 uL aliquot of each sample. Koh et al. used 100 bp DNA Ladder (NEB) as the ladder.
4.Stain the gel in 1X SYBR Gold (Life Technologies) for 30–40 min.
5.Visualize under UV light. An exemplary gel image is shown in Figure 2. The AtRNL+ T5Exo+ Tpt1+ sample is your ribose-seq library, while the Primers-only and AtRNL– T5Exo+ Tpt1+ samples are your controls, in which no amplification should be observed (only primer dimers).
6.Purify PCR mixtures from Primers-only, AtRNL– T5Exo+ Tpt1+, and AtRNL+ T5Exo+ Tpt1+ using RNAClean XP with elution volume of 15 uL. Controls Primers-only and AtRNL– T5Exo+ Tpt1+ are also purified so that the amount of actual ribose-seq library can be determined and quantitatively confirmed.
7.Use Qubit 2.0 (dsDNA HS) to quantify the amount of ribose-seq library. Confirm that the amount of purified Primers-only product is similar to the amount of AtRNL– T5Exo+ Tpt1+. The amount of the actual ribose-seq library can be calculated by subtracting the amount of AtRNL– T5Exo+ Tpt1+ (which should be just primer dimers) from the amount of AtRNL+ T5Exo+ Tpt1+. Typically, the resulting ribose-seq library concentration is ~25 nM.
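The background subtraction in step 7 can be sketched as follows. All readings, the 20% agreement tolerance, and the mean library size are hypothetical placeholders chosen only to exercise the functions, not measured values. The conversion to nM assumes 660 g/mol per base pair, a common approximation that is not stated in this protocol.

```python
def controls_agree(primers_only, atrnl_minus, rel_tol=0.2):
    """Check that the two negative controls give similar Qubit readings.

    rel_tol is an arbitrary illustrative threshold (assumption).
    """
    return abs(primers_only - atrnl_minus) <= rel_tol * max(primers_only,
                                                            atrnl_minus)


def library_ng_per_ul(atrnl_plus, atrnl_minus):
    """Subtract the primer-dimer background (AtRNL- T5Exo+ Tpt1+)
    from the AtRNL+ T5Exo+ Tpt1+ reading; never return a negative value."""
    return max(0.0, atrnl_plus - atrnl_minus)


def to_nanomolar(ng_per_ul, mean_bp, bp_mass=660.0):
    """ng/uL -> nM for dsDNA of a given mean length (bp)."""
    return ng_per_ul / (mean_bp * bp_mass) * 1e6


# Hypothetical Qubit readings in ng/uL and a hypothetical mean size in bp.
primers_only, atrnl_minus, atrnl_plus = 0.9, 1.0, 4.0
mean_library_bp = 300

if not controls_agree(primers_only, atrnl_minus):
    print("Warning: negative controls differ; check for background products")
library = library_ng_per_ul(atrnl_plus, atrnl_minus)
print(f"Library: {library:.2f} ng/uL, "
      f"about {to_nanomolar(library, mean_library_bp):.1f} nM")
```

The mean library size would normally come from the PAGE or another sizing method, since the molar estimate scales inversely with it.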
This protocol takes ~3–4 working days.
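The timing estimate can be cross-checked against the incubations stated in the procedure. The sketch below sums only the incubation times given explicitly in the text; column and bead purifications, quantification, PCR cycling, and gel analysis are excluded because their durations are not specified here.

```python
# (step, minutes) for incubations with a stated duration.
TIMED_STEPS = [
    ("dA-tailing, 37 C", 30),
    ("alkali treatment, 55 C", 120),
    ("denaturation, 95 C", 3),
    ("AtRNL self-ligation, 30 C", 60),
    ("T5 exonuclease treatment, 37 C", 120),
    ("Tpt1 treatment, 30 C", 60),
]

# Incubations described only as "overnight".
OVERNIGHT_STEPS = [
    "restriction digestion, 37 C",
    "adaptor ligation, 15 C",
]

total_min = sum(minutes for _, minutes in TIMED_STEPS)
hours, minutes = divmod(total_min, 60)
print(f"Timed incubations: {hours} h {minutes} min")
print(f"Overnight incubations: {len(OVERNIGHT_STEPS)}")
```

The two overnight incubations, rather than the hands-on steps, set the minimum number of working days.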
Depending on the genotype of the cell, the cell type, and the organism, more genomic DNA and more reactions, leading to more template for PCR, may be necessary. The amount of template used and the number of cycles run during PCR can also vary. If using another DNA polymerase for PCR, different reaction conditions, such as concentrations of DMSO and primers and the annealing temperature, need to be tested, keeping both the cycle number and the background amplification for AtRNL– T5Exo+ Tpt1+ to a minimum.
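To see why the cycle number matters when balancing yield against background, the theoretical amplification across the 26–32 cycle range can be compared. This is a minimal sketch using the idealized model in which each cycle multiplies the product by (1 + efficiency); real reactions plateau, and the 0.9 efficiency value is a hypothetical illustration.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized PCR fold amplification: (1 + efficiency) ** cycles.

    efficiency=1.0 means perfect doubling in every cycle.
    """
    return (1.0 + efficiency) ** cycles


for cycles in (26, 30, 32):
    ideal = fold_amplification(cycles)
    reduced = fold_amplification(cycles, efficiency=0.9)  # hypothetical value
    print(f"{cycles} cycles: {ideal:.2e}-fold (ideal), "
          f"{reduced:.2e}-fold at 90% efficiency")

# Going from 26 to 32 cycles adds six doublings in the ideal case.
print(fold_amplification(32) / fold_amplification(26))
```

Under this model, any background template in the AtRNL– T5Exo+ Tpt1+ control is amplified by the same factor as the library, so extra cycles do not improve the ratio between the two.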
Application of ribose-seq to the yeast genome was recently reported by Koh et al.1 Following this protocol, ribose-seq libraries were constructed from genomic DNA of various S. cerevisiae cells. The libraries were then sequenced on an Illumina MiSeq, collecting 50-cycle single-end reads. Analysis of the sequencing data led to profiling of ribonucleotide incorporation in yeast genomic DNA.
We thank S. Garrey for AtRNL and Tpt1 protein purification, F. Stewart for technical support, and all members of the Storici laboratory for experimental suggestions. This work was supported by US National Science Foundation award number MCB-1021763 (to F.S.), Georgia Research Alliance award number R9028 (to F.S.), an American Cancer Society Research Scholar Grant (to J.R.H.), a Damon Runyon-Rachleff Innovation Award from the Damon Runyon Cancer Research Foundation (to J.R.H.) and the University of Colorado Golfers Against Cancer (to J.R.H.).
Figure 1: Scheme of ribose-seq
‘R’ in red denotes an rNMP while ‘P’ indicates a monophosphate group.
Figure 2: Ribose-seq library from genomic DNA of RNase H2-deficient S. cerevisiae cells
Appropriate PCR products were analyzed by PAGE. ‘P’ indicates Primers-only.
Ribose-seq: global mapping of ribonucleotides embedded in genomic DNA, Kyung Duk Koh, Sathya Balachander, Jay R Hesselberth, and Francesca Storici, Nature Methods 12 (3) 251 - 257 doi:10.1038/nmeth.3259
Kyung Duk Koh & Francesca Storici, School of Biology, Georgia Institute of Technology
Jay Hesselberth, Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical School
Source: Protocol Exchange (2015) doi:10.1038/protex.2015.044. Originally published online 19 May 2015.