

CORRESPONDENCE

Splinkerette PCR for more efficient characterization of gene trap events

To the editor:
A large-scale international mouse mutagenesis program was recently started in Europe, the US and Canada with the aim of knocking out every gene in the mouse genome using a combination of gene targeting and gene trapping [1]. Unlike gene targeting, gene trapping is performed with generic vectors that simultaneously mutate and report the expression of an endogenous gene at the site of insertion and provide a DNA tag for rapid identification of the disrupted gene. In most gene trap screens, genes are identified using 5′ RACE to amplify the cellular sequences appended to the gene trap fusion transcripts expressed at the insertion sites [2–4]. However, RACE tags have two major disadvantages. First, they cannot uncover the exact position of insertion sites, which are usually a considerable distance from the RACE tags and therefore cannot be used directly for mouse genotyping. Second, because 5′ RACE is largely dependent on gene expression, the major determinant for identifying a trapped gene in an embryonic stem cell (ESC) is its expression level. As most high-throughput trapping screens use highly sensitive G418 selection, more than 50% of the G418-resistant ESC lines are usually lost because the levels of trapped gene expression are below the gene identification threshold imposed by RACE [2,4,5]. To address these problems, we adapted a splinkerette-adaptor PCR (SPLK) [6] to the high-throughput amplification of genomic sequences flanking the gene trap integration sites (Supplementary Fig. 1 and Supplementary Table 1 online).

We validated the strategy by subjecting 3,782 mouse ESC lines trapped with the retroviral gene trap vectors Rosaβgeo (n = 1,824) [7] or FlipRosaβgeo (n = 1,958) [3] to three distinct high-throughput PCR amplification protocols: 5′ SPLK for genomic sequences adjacent to the provirus 5′ end, 3′ SPLK for genomic sequences adjacent to the provirus 3′ end and 5′ RACE for cellular sequences appended to gene trap fusion transcripts (Supplementary Methods online). Of 7,075 unambiguous tags (Supplementary Note online), 95% corresponded to annotated genes. At least one SPLK tag was obtained from about 90% of the trapped ESC lines (n = 3,326). However, only half of the ESC lines (n = 1,688) yielded RACE tags, which is consistent with success rates previously reported in gene trap screens [2,4,5]. The average SPLK tag was more than double the size of the average RACE tag (372 nt versus 172 nt), as was its average maximum BLAST score (450 versus 193), indicating that the SPLK approach is more robust. Thus, the SPLK technology outperformed 5′ RACE by nearly twofold in the high-throughput gene trap screens (Fig. 1a).
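The proportions quoted in this paragraph can be rechecked from the reported counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the original analysis.

```python
# Counts and averages as reported in the text for the first set of ESC lines.
total_lines = 1824 + 1958   # Rosaβgeo lines + FlipRosaβgeo lines
splk_lines = 3326           # lines yielding at least one SPLK tag
race_lines = 1688           # lines yielding a RACE tag

splk_rate = splk_lines / total_lines   # fraction of lines with an SPLK tag
race_rate = race_lines / total_lines   # fraction of lines with a RACE tag
fold = splk_lines / race_lines         # SPLK success relative to RACE

tag_length_ratio = 372 / 172           # average tag length, SPLK vs RACE (nt)
blast_score_ratio = 450 / 193          # average maximum BLAST score, SPLK vs RACE

print(f"total ESC lines:    {total_lines}")
print(f"SPLK success rate:  {splk_rate:.1%}")
print(f"RACE success rate:  {race_rate:.1%}")
print(f"SPLK / RACE lines:  {fold:.2f}-fold")
print(f"tag length ratio:   {tag_length_ratio:.2f}")
print(f"BLAST score ratio:  {blast_score_ratio:.2f}")
```

The two vector counts sum to the stated 3,782 lines. The computed rates come out near 88% for SPLK and 45% for RACE, which the text rounds to "about 90%" and "half", and the ratio of the two is just under 2, matching "nearly twofold".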

To investigate whether gene expression is a critical factor for a successful RACE, we analyzed a second set of 3,355 ESC lines trapped with the gene trap vector FlipRosaβgeo*. FlipRosaβgeo* is similar to FlipRosaβgeo, but the βgeo fusion gene (βgeo*) encodes a more active neomycin phosphotransferase that requires fewer molecules to confer G418 resistance [8,9]. Therefore, this vector is expected to capture genes beyond the βgeo limit of detection. Indeed, based on SPLK tags, FlipRosaβgeo* trapped twice as many new genes as compared with FlipRosaβgeo and Rosaβgeo combined (Fig. 1a). However, since over 70% of the FlipRosaβgeo*-trapped ESC lines yielded no RACE tag, βgeo* trapping rates based on RACE tags were significantly below βgeo trapping rates (Fig. 1a), suggesting that in most cell lines, βgeo* expression was below the RACE amplification threshold. Most importantly, for the FlipRosaβgeo*-trapped ESC lines, the

NATURE GENETICS | VOLUME 39 | NUMBER 8 | AUGUST 2007 933

[Figure 1 plots. Panel a: number of unique genes (y-axis, 0 to 2,000) versus number of ESC lines (x-axis, 0 to 3,500). Panel b: number of ‘hard-to-trap genes’ (y-axis, 0 to 70) versus number of unique target genes (x-axis, 0 to 400).]

Figure 1 Gene trapping efficiencies achieved with SPLK and RACE technologies. (a) Overall trapping efficiency. The number of new genes hit with accumulating ESC lines was estimated using either SPLK (solid lines) or RACE tags (dashed lines). (b) Trapping rate of ‘hard-to-trap genes’. Hard-to-trap genes are defined as those that have zero entries or only one entry in the International Gene Trap Consortium and Omnibank I gene trap libraries. The number of hard-to-trap genes hit among accumulating unique genes trapped was estimated for SPLK (solid lines) and RACE tags (dashed lines).
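The curves in panel a track how many distinct genes have been hit as more ESC lines are added. The source does not describe the estimation procedure, so the following is only a minimal sketch of one simple way to build such an accumulation curve. The function name, the gene identifiers and the convention that an untagged line is recorded as `None` are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def accumulation_curve(gene_per_line: Sequence[Optional[str]]) -> List[int]:
    """Return the running number of distinct genes after each ESC line.

    Each entry is the gene identified for one ESC line, or None when no
    tag was obtained for that line (assumed convention, not from the source).
    """
    seen = set()
    curve = []
    for gene in gene_per_line:
        if gene is not None:
            seen.add(gene)       # repeats of a known gene do not raise the count
        curve.append(len(seen))  # untagged lines still advance the x-axis
    return curve


# Hypothetical toy data: one entry per ESC line, in screening order.
toy_lines = ["GeneA", None, "GeneB", "GeneA", None, "GeneC"]
print(accumulation_curve(toy_lines))  # prints [1, 1, 2, 2, 2, 3]
```

Under this convention, a method that fails to tag many lines produces a flatter curve for the same number of ESC lines, which is the pattern the caption describes for the RACE tags relative to the SPLK tags.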

© 2007 Nature Publishing Group http://www.nature.com/naturegenetics


SPLK approach yielded almost six times as many mutant lines as compared with 5′ RACE (Fig. 1a), indicating that RACE technology substantially reduces gene trapping efficiency and should therefore be avoided, particularly in highly sensitive screens.

Finally, in addition to the increased overall efficiency of the SPLK approach, SPLK tags expose a larger fraction of insertions into genes that can be considered difficult to trap based on their poor representation in existing gene trap libraries presently covering over 66% of the mouse genome (Fig. 1b) [10].

We conclude that the SPLK approach makes it possible to mutate and identify genes that are poorly expressed in ESCs much more effectively and thereby expands the pool of genes accessible via trapping. This expansion may reduce the need for use of the more laborious gene targeting approach to inactivate such genes.

All sequence tags and the corresponding ESC lines referred to in this correspondence can be accessed on the German Gene Trap Consortium’s website (http://www.genetrap.de), which now shows the exact position of gene trap insertion sites in the genome.

Carsten Horn1,5,6, Jens Hansen2,5,6, Frank Schnütgen3,5, Claudia Seisenberger2,5, Thomas Floss2,5, Markus Irgang1,5, Silke De-Zolt3,5, Wolfgang Wurst2,5, Harald von Melchner3,5,7 & Patricia Ruiz Noppinger1,4,5,7

1Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany. 2Institute of Developmental Genetics, GSF-National Research Center for Environment and Health, 85764 Neuherberg, Germany. 3Department for Molecular Hematology, University of Frankfurt Medical School, 60590 Frankfurt am Main, Germany. 4Center for Cardiovascular Research, Charité–Universitätsmedizin, 10115 Berlin, Germany. 5The German Gene Trap Consortium. 6These authors contributed equally to this work. 7These authors contributed equally to this work.
e-mail: [email protected] or [email protected]

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTS
We thank T. Cox and L. von Melchner for helpful discussions and for review of the final manuscript. We also thank C. Werner, B. Thalke, M. Hollatz, D. German, A.-T. Tieu and S. Buchner for technical assistance. This work was supported by grants from the Bundesministerium für Bildung und Forschung (BMBF) to the German Gene Trap Consortium (GGTC) and by grants from the European Union to the European Conditional Mouse Mutagenesis (EUCOMM) program.

COMPETING INTERESTS STATEMENT
The authors declare no competing financial interests.

1. International Mouse Knockout Consortium, Collins, F.S., Rossant, J. & Wurst, W. Cell 128, 9–13 (2007).

2. Stryke, D. et al. Nucleic Acids Res. 31, 278–281 (2003).

3. Schnutgen, F. et al. Proc. Natl. Acad. Sci. USA 102, 7221–7226 (2005).

4. Hansen, J. et al. Proc. Natl. Acad. Sci. USA 100, 9918–9922 (2003).

5. Zambrowicz, B.P. et al. Proc. Natl. Acad. Sci. USA 100, 14109–14114 (2003).

6. Devon, R.S., Porteous, D.J. & Brookes, A.J. Nucleic Acids Res. 23, 1644–1645 (1995).

7. Friedrich, G. & Soriano, P. Genes Dev. 5, 1513–1523 (1991).

8. Skarnes, W.C., Moss, J.E., Hurtley, S.M. & Beddington, R.S. Proc. Natl. Acad. Sci. USA 92, 6592–6596 (1995).

9. Yenofsky, R.L., Fine, M. & Pellow, J.W. Proc. Natl. Acad. Sci. USA 87, 3435–3439 (1990).

10. Skarnes, W.C. et al. Nat. Genet. 36, 543–544 (2004).



ERRATA

Erratum: Splinkerette PCR for more efficient characterization of gene trap events
Carsten Horn, Jens Hansen, Frank Schnutgen, Claudia Seisenberger, Thomas Floss, Markus Irgang, Silke De-Zolt, Wolfgang Wurst, Harald von Melchner & Patricia Ruiz Noppinger
Nature Genetics 39, 933–934 (2007); published online 27 July 2007; corrected after print 30 October 2007

In the version of this article initially published, the second author (Jens Hansen) should have been listed as an equal contributor with the first author. The last two authors (Harald von Melchner and Patricia Ruiz Noppinger) should have been listed as corresponding authors. The error has been corrected in the HTML and PDF versions of the article.
