
CREATING A MACAQUE SPECIFIC OLIGONUCLEOTIDE MICROARRAY

Sean Proll¹, Matthew Fitzgibbon¹, Matthew Thomas¹, Michael Agy¹, Marcus Korth¹, Jim Wallace¹, Campion Fellin³, Douglas Miner³, Christina Scherer³, Shawn Iadonato³, Charles Magness³, and Michael Katze¹,².

¹Department of Microbiology and ²Washington National Primate Research Center, University of Washington, Seattle, WA 98195.

³Illumigen Biosciences Inc., 2203 Airport Way South, Seattle, WA 98134

ABSTRACT FUNCTIONAL GENOMIC RESOURCES

MATERIALS AND METHODS

[Figure: EST counts (0 to 40,000) by year, 2000 to 2004, for Pan troglodytes, Macaca mulatta, Macaca fascicularis, Macaca nemestrina, Gorilla gorilla, Pongo pygmaeus, and Other.]

Fig. 1 GenBank nonhuman primate EST growth from 2000 to 2004. Graph represents totals as of mid-June each year. As of 29 September 2004 GenBank contains over 46,000 macaque ESTs, 36,340 of them submitted by the Katze Lab.

Total RNA Isolation and mRNA Isolation

Tissue was extracted at the time of necropsy and immediately placed in RNAlater storage solution (Ambion, USA). Tissue was homogenized in Solution D using the Polytron tissue homogenizer. The total RNA was extracted with phenol:chloroform and further purified using Qiagen RNeasy purification columns (Qiagen, USA). Extraction of mRNA was done with the Invitrogen FastTrack 2.0 mRNA extraction kit (Invitrogen, USA). The quality and quantity of both the total and messenger RNA were confirmed by spectrophotometry and capillary electrophoresis on the Agilent BioAnalyzer 2100 (Agilent Technologies, USA).

cDNA Library Construction

cDNA libraries were constructed via two methods. The spleen mononuclear lymphocyte, both placental, brain, lung, and one activated PBMC library were constructed using 3-5 µg of high-quality mRNA with the Stratagene Uni-ZAP cDNA library construction kit (Stratagene, USA). Clones were isolated by ampicillin resistance and grown in 96-well plates with LB-ampicillin medium.

The liver, duodenum, ileum, jejunum, testes, ovarian, and another activated PBMC library were constructed using 3-5 µg of high-quality mRNA with the Invitrogen CloneMiner cDNA construction kit (Invitrogen, USA). Clones were isolated by kanamycin resistance and grown in 96-well plates with LB-kanamycin medium.

To check clone size and presence, PCR was performed using the following primers on an ABI 9700 thermal cycler (Applied Biosystems, USA).

For the Stratagene pBluescript SK (+/-) vector the primers are:
MCQrXho1: CACTATAGGGCGAATTGGGTA
MCQfEcoR1: CCCTCACTAAAGGGAACAAAA (sequencing primer)

For the Invitrogen pDONR222 vector the primers are:
pDONR222F1: GACGTTGTAAAACGACGGC (sequencing primer)
pDONR222R1: GCCAGGAAACAGCTATGACC

Microarray Probe Synthesis and Hybridization

Total RNA was isolated from macaque (M. mulatta) spleen mononuclear cells and brain tissue. The quality and concentration of the extracted total RNA were verified using the BioAnalyzer 2100 and a NanoDrop spectrophotometer. Labeled cRNA probes were generated using the Low Input RNA Probe Synthesis Kit (Agilent Biosciences, USA) as per the manufacturer’s protocol for 11k postage stamp oligonucleotide microarrays. The probes were hybridized in replicate to the custom-made M. mulatta oligonucleotide array (Agilent Biosciences, USA) as per the manufacturer’s protocol. Slides were scanned with an Agilent scanner and analyzed with Agilent Feature Extractor (Agilent Biosciences, USA), then loaded into our local database for analysis with Rosetta Resolver (Rosetta Biosoftware) and Spotfire DecisionSite.
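As a quick sanity check on the clone-check primers listed under cDNA Library Construction, the sketch below computes length, GC content, and a rough melting temperature for each one. The primer sequences are taken from the poster; the helper functions are illustrative, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a coarse approximation for short oligos, not the method used by the authors.

```python
# Primer sequences copied from the poster; everything else is an illustrative helper.
PRIMERS = {
    "MCQrXho1": "CACTATAGGGCGAATTGGGTA",
    "MCQfEcoR1": "CCCTCACTAAAGGGAACAAAA",
    "pDONR222F1": "GACGTTGTAAAACGACGGC",
    "pDONR222R1": "GCCAGGAAACAGCTATGACC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq: str) -> int:
    """Number of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C (Wallace rule); assumes only A, C, G, T."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_fraction = gc_count(seq) / len(seq)
        print(f"{name}\tlen={len(seq)}\tGC={gc_fraction:.2f}\tTm~{wallace_tm(seq)}C")
```

Primers within a pair that differ widely in estimated melting temperature would be worth a closer look before running the clone-check PCR.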

Fig. 2 The Macaque.org web portal provides updated information on our sequencing efforts and related resources.

Fig. 3 The sequence collection may be searched by gene symbol. Each search result is hyperlinked to related information and to a web-based system for requesting specific clones from our collection.

Fig. 4 The collection may also be searched by sequence similarity using MegaBLAST.

OLIGO ARRAY DESIGN AND CONSTRUCTION

OLIGO ARRAY VALIDATION AND PRELIMINARY EXPERIMENTS

Table 1 Data Summary.

GENBANK NONHUMAN PRIMATE EST GROWTH

EST Sequencing Summary

Acknowledgements

We thank Robert Norgren and Eliot Spindel for graciously providing sequences from their Targeted Sequencing of Human Orthologs to include in our collection. The sequencing project and website are funded by Public Health Service grants R24RR16354 and P51RR00166 from the National Center for Research Resources.

SEATTLE INTERNATIONAL CONFERENCE ON PRIMATE GENOMICS, MARCH 20th – 23rd 2005 SEATTLE, WA

Target Selection

Probe Selection

Array Specifications
10,807 Total Probes
9,103 Unique non-control
6,850 Macaque EST derived
1,123 Macaque targeted sequences
1,014 Agilent Human Catalog probes
96 Viral probes
20 Reserved for spike-in controls
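The arithmetic behind the array specifications can be checked with a short tally. In the sketch below the counts come from the poster, while the variable names are illustrative. That the five itemized categories add up to the 9,103 figure is an observation from the numbers, not something the poster states, and the poster does not itemize the remaining features.

```python
# Counts copied from the Array Specifications list; names are illustrative.
TOTAL_PROBES = 10_807
UNIQUE_NON_CONTROL = 9_103

CATEGORIES = {
    "Macaque EST derived": 6_850,
    "Macaque targeted sequences": 1_123,
    "Agilent Human Catalog probes": 1_014,
    "Viral probes": 96,
    "Reserved for spike-in controls": 20,
}

# Sum of the five itemized categories.
category_sum = sum(CATEGORIES.values())

# Features not accounted for by the itemized categories (not described on the poster).
remaining_features = TOTAL_PROBES - category_sum

if __name__ == "__main__":
    for name, count in CATEGORIES.items():
        print(f"{count:>6,}  {name}  ({count / category_sum:.1%} of itemized)")
    print(f"{category_sum:>6,}  itemized total")
    print(f"{remaining_features:>6,}  remaining features")
```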

Fig. 5 Pairs of probes derived from macaque sequence show strong agreement. Here the log(Ratio) from one member of each pair is plotted against the log(Ratio) of its partner when hybridized with macaque spleen vs. macaque brain mRNA.

Fig. 6 In the same hybridization, we see that many probes derived from macaque spleen sequence (orange) are upregulated relative to probes derived from macaque brain sequence (blue). There is some cross-over, since the probes were not selected for absolute tissue specificity.

In collaboration with Agilent Technologies we have leveraged our EST sequence resource to generate the first commercially available macaque-specific oligonucleotide microarray. Preliminary experiments show strong hybridization as well as high correlation (r = 0.95) between non-overlapping pairs of probes designed to the same target (Figure 5).
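The probe-pair agreement reported above is a correlation between the log(Ratio) of one probe and the log(Ratio) of its partner on the same target. A minimal sketch of that calculation follows, assuming a Pearson correlation (the poster does not say which coefficient was used). The function name and the example values are hypothetical and are not data from the experiment.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical log10 ratios (spleen vs. brain) for probe A and probe B of each pair.
    probe_a = [0.82, -0.45, 0.10, 1.20, -0.90, 0.05]
    probe_b = [0.75, -0.50, 0.18, 1.05, -0.80, -0.02]
    print(f"r = {pearson_r(probe_a, probe_b):.2f}")
```

Because the two probes of a pair do not overlap, a high correlation suggests that both are reporting the same transcript rather than sharing a sequence-specific artifact.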

We have probed the arrays with two tissues sampled in our sequencing efforts, macaque spleen vs. brain. A subset of differentially regulated probes is shown in Figure 6. Probes derived from our spleen library (orange) are up-regulated with respect to probes derived from brain (blue). Because none of the target sequences were selected to be tissue specific, some cross-over was expected.

EST reads with no similarity to currently available transcribed sequence (dbEST, UniGene) are being followed up with RT-PCR and microarray experiments as well as bioinformatic analysis.

ABSTRACT

We are building genomic resources to support studies in non-human primates, with particular emphasis on studies of virus-host interactions. This effort has produced cloned libraries from eleven tissues harvested from rhesus monkeys (Macaca mulatta) of Indian and Chinese origin. To date we have sequenced 48,462 clones and submitted 30,077 high-quality expressed sequence tags (ESTs) to GenBank. In addition, EST data are disseminated to the public through the macaque.org website, which offers additional search capabilities such as searching for ESTs by likely human ortholog. Building upon our sequencing and analysis of the transcriptome of the rhesus macaque, we have collaborated with Agilent Technologies to construct the first commercially available macaque-specific oligonucleotide microarray, delivered in fall 2004. Candidate target sequences for oligonucleotide design were selected by an automated pipeline of tools, including several from the University of Washington Department of Genome Sciences and TIGR. Target sequences from this pipeline were submitted to Agilent Technologies for oligonucleotide probe design. Our custom-built Probe Selection Pipeline then processed all the probes to manage the annotation, cross-hybridization potential, sequence composition, and position of each probe on each target sequence. All probes were mapped back to individual reads from our cDNA libraries. This first-generation macaque array contains 11K features with two probes designed for each macaque sequence, representing ~4K unique macaque genes. Gene lists and other material are made available through macaque.org. This array platform will serve as a foundational element in our ongoing studies using macaque infection models to study simian immunodeficiency virus and influenza.

PLANS FOR NEXT GENERATION OLIGO ARRAY

SUMMARY

Comparison of Probes with Rhesus Genome

Fig. 7 Rhesus oligonucleotide probes compared (using MegaBLAST) with the currently available rhesus genomic sequence. ~85% have an exact match, and ~94% have zero or one mismatch when compared to genomic sequence.
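The kind of tally behind Fig. 7 can be sketched as follows, assuming the MegaBLAST hits are available in the standard 12-column tabular format (query id, subject id, percent identity, alignment length, mismatches, gap opens, and so on). The function names are hypothetical, the poster does not state the probe length, so it is passed in as a parameter, and counting only full-length ungapped alignments is an assumption about how "exact match" was defined.

```python
from collections import Counter


def best_mismatches(tabular_lines, probe_length):
    """Map each probe id to its fewest mismatches over full-length, ungapped hits."""
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        probe_id = fields[0]
        align_length = int(fields[3])
        mismatches = int(fields[4])
        gap_opens = int(fields[5])
        if align_length != probe_length or gap_opens != 0:
            continue  # partial or gapped alignment: not counted here
        if probe_id not in best or mismatches < best[probe_id]:
            best[probe_id] = mismatches
    return best


def summarize(best, n_probes):
    """Return (fraction with an exact match, fraction with zero or one mismatch)."""
    counts = Counter(best.values())
    exact = counts[0] / n_probes
    zero_or_one = (counts[0] + counts[1]) / n_probes
    return exact, zero_or_one


if __name__ == "__main__":
    # Hypothetical hits for four probes of assumed length 60; probe4 has no hit.
    hits = [
        "probe1\tcontigA\t100.00\t60\t0\t0\t1\t60\t100\t159\t1e-25\t111",
        "probe2\tcontigB\t98.33\t60\t1\t0\t1\t60\t500\t559\t1e-22\t103",
        "probe2\tcontigC\t96.67\t60\t2\t0\t1\t60\t900\t959\t1e-19\t95.3",
        "probe3\tcontigA\t100.00\t40\t0\t0\t1\t40\t300\t339\t1e-14\t79.8",
    ]
    exact, zero_or_one = summarize(best_mismatches(hits, probe_length=60), n_probes=4)
    print(f"exact: {exact:.0%}, zero or one mismatch: {zero_or_one:.0%}")
```

Probes with no qualifying hit still count in the denominator, so the fractions describe the whole probe set rather than only the probes that aligned.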

GOAL

Incorporate multiple resources to increase coverage as well as improve probe quality.

PLAN

1. Incorporate more EST sequences, both internally and externally produced
2. Leverage publicly available rhesus genomic sequence to extend our coverage as well as to improve probe quality
3. Utilize the human RefSeq genes in addition to the rhesus genomic sequence to better understand gene structure and therefore improve probe position
4. Continue to utilize the rhesus Targeted Sequencing Effort being provided by Rob Norgren and Eliot Spindel
5. Utilize rhesus/human/chimp comparative mappings to better annotate our probes and create links out to multiple resources

#66

For Research Use Only. Not for use in diagnostic procedures.