An Atlas of Gene Expression in Mouse Development (www.mouseatlas.org): technology development, technology implementation, public access


Page 1: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

An Atlas of Gene Expression in Mouse Development

www.mouseatlas.org

•technology development

•technology implementation

•public access

Page 2: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Pipeline

– Known coding elements
– New genes (housekeeping and regulated)
– New transcripts
– New exons
– New regulatory RNAs

Tissues

RNAs

Tags

Transcribed Features

•Tag-to-gene mapping

•longSAGE
•longSAGE Lite

•Manual dissection
•Laser capture microdissection
•RNA purification

Bioinformatics

Page 3: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Major technical accomplishments

• Established a SAGE library construction pipeline aimed at constructing 150 libraries by March 31, 2005.

• Established methods for tissue acquisition and dissection (manual and LCM) that yield high-quality mRNA for SAGE.

• Established methods for construction of SAGE libraries from nanograms of total RNA.

• Established bioinformatics pipeline for extraction and analysis of 21mer SAGE tags.

Page 4: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

SAGE library production queue

88 samples in queue (57% complete)

61 libraries constructed

52 libraries passed QC

39 libraries complete (26% complete)

179,000 sequence reads
5.7 million tags
33 tags/read

Page 5: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

The majority of tags can be mapped to existing sequence datasets

[Chart: proportion of tags (y-axis, 0 to 1) by tag frequency class (x-axis: 1, 2, 3, 4, 5, 6-10, 11-50, 51-100, 101-1000, >1000)]

Page 6: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Most transcripts are “hit” by a SAGE tag

4,552,635 tags; 543,545 tag types

Tags mapped: XXXX

mgcmouse: 22419/24607 0.911

refmouse: 13871/18212 0.762

refmouseX: 17464/25362 0.689

refmouseGS: 19795/42393 0.467

Page 7: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Detection of 3’ end variants

Mouse Atlas SAGE meta-library: 4,552,635 tags, 543,545 tag types

61% of moderately abundant transcripts show multiple tag positions

18,775 transcripts (8,400 genes) >10

52% of highly abundant transcripts show multiple tag positions

6,888 transcripts (3,550 genes) >=100

Comparison of all tags to 27,026 transcripts from refseq, refseqX, refseqGS, and mgc:

• Approximately 1.6 variants per locus
• 1 variant / locus for 58% of genes
• 2.8 variants / locus for 42% of genes

Page 8: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tag-to-genome mapping

Location            All tags    > 7 tags
• Exons             25%         73%
• Introns           18%         1.4%
• 5kb from a UTR    11.5%       11.5%
• Intergenic        15%         4.4%
• Minus strand      29.4%       8.9%

The distribution of tags or their annotation varies with the level of expression

62 % of tag “types” map uniquely to the genome

Page 9: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Website Usage

Page 10: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Website Usage

Page 11: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Relationship Between Mouse Efforts

SAGE Mouse Atlas Project (Marra and Hoodless): 150 SAGE Libraries

MPSS Mouse Project (Chris Austin): 94 MPSS Libraries

CGAP SAGEgenie (Greg Riggins): Public Accessibility

244 Digital Libraries

5 RNAs to be directly compared

Transfer of data (34 libraries to date)

Other efforts to note: Australia, Czech website

Page 12: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Total co-funding

Page 13: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Supplemental Slides

Page 14: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Most transcripts are “hit” by a SAGE tag

4,552,635 total tags; 543,545 tag types

Total tags mapped to any transcript = XXXX

mgcmouse anywhere: 22419/24607 = 0.91108

mgcmouse position 1: 20981/24607 = 0.85264

refmouse anywhere: 13871/18212 = 0.76164

refmouse position 1: 12176/18212 = 0.66857

refmouseX anywhere: 17464/25362 = 0.68859

refmouseX position 1: 14664/25362 = 0.57819

refmouseGS anywhere: 19795/42393 = 0.4669

refmouseGS position 1: 15947/42393 = 0.37617
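
The fractions on this slide are simple ratios of transcripts hit to transcripts in each dataset. A minimal sketch that recomputes them from the counts shown above (the dictionary layout and function name are illustrative assumptions, not part of the project's pipeline):

```python
# Counts transcribed from the slide:
# (transcripts hit anywhere, transcripts hit at position 1, total transcripts).
COUNTS = {
    "mgcmouse": (22419, 20981, 24607),
    "refmouse": (13871, 12176, 18212),
    "refmouseX": (17464, 14664, 25362),
    "refmouseGS": (19795, 15947, 42393),
}


def coverage(counts):
    """Return the fraction of transcripts hit anywhere and at position 1."""
    return {
        name: {"anywhere": anywhere / total, "position_1": pos1 / total}
        for name, (anywhere, pos1, total) in counts.items()
    }


if __name__ == "__main__":
    for name, frac in coverage(COUNTS).items():
        print(f"{name}: anywhere={frac['anywhere']:.5f} "
              f"position 1={frac['position_1']:.5f}")
```

The gap between the "anywhere" and "position 1" fractions is the share of transcripts detected only through a tag upstream of the 3’-most position.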

Page 15: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

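Alternate 3’ ends: Multiple tags map to a gene

[Diagram: full-length transcript with tag positions 7 6 5 4 3 2 1 and a poly(A) tail (AAAAAAA); position 1 is closest to the tail]

Positions in the transcript are defined by NlaIII sites.

Alternate 3’ end formation (alternate splicing) can result in different tags identifying the same transcript.

[Diagram: two shorter variants of the same transcript, one retaining positions 7 6 5 4 3 2 and one retaining positions 7 6 5 4, each with its own poly(A) tail (AAAAAAA)]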
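
The position numbering above can be sketched in code. NlaIII cuts at CATG, and the deck describes 21mer tags, so this sketch assumes a tag is the 21 bases starting at each CATG and that positions count up from the 3’-most site. The function name and the toy sequence are hypothetical, and this is not the project's tag extraction code.

```python
def nlaiii_tag_positions(transcript, tag_len=21):
    """Map tag position (1 = 3'-most NlaIII site) to the tag at that site.

    Assumes the tag starts at the CATG site and includes it; sites too close
    to the 3' end to yield a full-length tag keep their position number but
    are left out of the result.
    """
    site = "CATG"
    seq = transcript.upper()
    starts = []
    i = seq.find(site)
    while i != -1:
        starts.append(i)
        i = seq.find(site, i + 1)

    tags = {}
    # Walk the sites from the 3' end so that position 1 is the 3'-most site.
    for position, start in enumerate(reversed(starts), start=1):
        tag = seq[start:start + tag_len]
        if len(tag) == tag_len:
            tags[position] = tag
    return tags


if __name__ == "__main__":
    toy = "AAACATG" + "A" * 17 + "CC" + "CATG" + "C" * 17 + "TT"
    for position, tag in sorted(nlaiii_tag_positions(toy).items()):
        print(position, tag)
```

A transcript variant that ends upstream of site 1 loses that site, so its 3’-most tag is the one found at position 2 or higher of the full-length form, which is how one gene can yield several tags.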

Page 16: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Contaminants/ artifacts

• hnRNA (unspliced mRNA) and genomic DNA are not likely to be major contaminants. 18% of all tags map to introns versus 1.4% of abundant tags (slide 8).

• Partial digestion is not likely to be a major artifact. The majority (58%) of transcripts show only a single variant. Of N manually inspected examples that show multiple variants, m looked like the picture in the next slide.

Page 17: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tag position in the transcript

Splice variants

236 tags detected for transcript nm144802

[Diagram: transcript with tag positions 7 6 5 4 3 2 1 and a poly(A) tail (AAAAAAA)]

Page 18: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Rate of tag generation and tag quality

[Chart: number of tags (y-axis, 0 to 1,000,000) for three series: All, 95%, 99%]

Required from May 04 – Mar 05: 1.176 million tags / mo (~36,000 reads)
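
The read requirement follows from the monthly tag target and the 33 tags/read yield reported on the production queue slide. A small worked check, assuming those two figures are the only inputs (the function name is illustrative):

```python
def reads_needed(tags_per_month, tags_per_read):
    """Sequence reads required per month to reach a monthly tag target."""
    return tags_per_month / tags_per_read


if __name__ == "__main__":
    reads = reads_needed(1_176_000, 33)
    print(f"{reads:,.0f} reads per month")
```

This gives a little under 36,000 reads per month, in line with the rounded figure on the slide.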

Page 19: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Process

Procedural Step    QC/QA

Tissues are dissected (LCM, manual)

RNA is extracted

SAGE libraries are built

Libraries are sequenced to >100,000 tags

Tags are mapped to genes
DiscoverySpace @ BCGSC
SageGenie @ CGAP

Tissue and library information is published

www.mouseatlas.org
cgap.nci.nih.gov

Page 20: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Frequency of variants detected by SAGE

For all transcripts with count >=20 (15915 transcripts) from refseq, refseqX, refseqGS, and mgc

SAGE tag variation:
• Approximately 1.6 variants per locus
• 1 variant per locus for 40% of genes
• 3.8 variants per locus for 60% of genes

• If there exist 30000 genes in the human genome
  • Predict 50400 “3’ UTR variants”
  • Total of 80400 variants

• This assumes the ratio holds for all gene expression
• SAGE tag variants are a subset (3’ UTR) of splice variants
  • 5’ SAGE may expand the subset
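
The predicted totals on this slide can be reproduced with a short calculation. This sketch assumes one reading of the slide: the predicted “3’ UTR variants” are the additional variants, beyond the first, contributed by the 60% of genes with 3.8 variants per locus. That reading is an inference from the numbers, and the function name is illustrative.

```python
def predicted_variants(genes, multi_fraction, variants_per_multi_gene):
    """Return (additional variants, total variants, mean variants per locus).

    Assumes every gene has at least one variant and that a fraction
    `multi_fraction` of genes has `variants_per_multi_gene` variants.
    """
    extra = genes * multi_fraction * (variants_per_multi_gene - 1)
    total = genes + extra
    return round(extra), round(total), total / genes


if __name__ == "__main__":
    extra, total, mean = predicted_variants(30000, 0.60, 3.8)
    print(extra, total, f"{mean:.2f}")
```

The third value is the mean number of variants per locus implied by the two fractions, returned so it can be compared with the per-locus figure quoted on the slide.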

Page 21: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Novel gene discovery via SAGE

• 3519 tags occurred only in predicted transcripts
  – Intron and 3’utr locations indicate alternate splicing
• 150K (28%) tag types occurred only in the genome, not in transcripts
  – 9087 (1.7%) with frequency > 10 and 24352 (4.5%) with frequency > 3
  – Would expect 2% polymorphism, except this is inbred line?
• 326K (60%) of tag types were unaccounted for
  – 3450 (0.6%) with frequency > 10, 13378 (2.5%) with frequency > 3
  – Sequence error, expectation is now known
  – Polymorphism, expect 2% of tag types, except this is inbred line?
  – Spliced tags in novel transcripts, up to 6% of tag types to be spliced

Many undiscovered transcripts exist, 2% at a moderate to high frequency, 10-15% at low frequency

Page 22: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Novel gene discovery via SAGE

Mouse Atlas 28 libraries after clustering tags

• 3047 tags occurred only in predicted transcripts
  – Intron and 3’utr locations indicate alternate splicing
• 124K (36%) tag types occurred only in the genome, not in transcripts
  – 7022 (2.0%) with frequency > 10 and 19155 (5.5%) with frequency > 3
  – Would expect 2% polymorphism, except this is inbred line?
• 162K (47%) of tag types were unaccounted for
  – 641 (0.2%) with frequency > 10 and 2384 (0.7%) with frequency > 3
  – Sequence error should be very low for non-singletons
  – Polymorphism, expect 2% of tag types, except this is inbred line?
  – Spliced tags in novel transcripts, up to 6% of tag types to be spliced

Many undiscovered transcripts exist, 2% at a moderate to high frequency, 10-15% at low frequency

Page 23: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Detection of coding features

• N million tags representing N transcripts and at least N genes. (complexity)

• Distribution of transcript abundance (graph)

• coverage of refseq, MGC and unigene (complexity and breadth)

• N differentially expressed between any two stages at p < 0.001 (regulated)

• N not differentially expressed at p < 0.001 (housekeeping)

• N candidate new genes identified

• Genes that map, genes that don’t map, etc

• Quality of the data

Page 24: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Co-funding spent

Page 25: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Library construction rates

• Mouse libraries needed (May 04 – Feb 05): 9.8 / mo

• Mouse libraries made (Oct 03 – Apr 04, excl. Dec 03): 6.2 / mo

• Mouse libraries made (Mar 04 – Apr 04): 7 / mo

• All libraries made (Mar 04 – Apr 04): 9 / mo

• Most libraries made (Oct 03, Mar 04): 10 / mo
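
The required rate can be checked against figures found elsewhere in the deck. This sketch assumes the 9.8 / mo figure comes from the 150-library target minus the 52 libraries already made, spread over the 10 months from May 04 to Feb 05. The slide does not state that derivation, so treat the inputs as an assumption.

```python
def required_rate(target, already_made, months_remaining):
    """Libraries per month needed to reach the target in the time remaining."""
    return (target - already_made) / months_remaining


if __name__ == "__main__":
    needed = required_rate(target=150, already_made=52, months_remaining=10)
    print(f"needed: {needed:.1f} / mo")
    # Compare with the recent mouse-library rate reported on the slide.
    print(f"shortfall vs 7 / mo: {needed - 7:.1f} / mo")
```

Under these assumptions the required rate is above the recent mouse-library rate of 7 / mo and close to the best monthly output of 10 / mo.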

Page 26: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tissue Acquisition Pipeline

[Chart: number of tissues (y-axis, 0 to 14) by month, June 2004 to January 2005 (J J A S O N D J)]

Libraries Made: 52
Libraries in Progress: 5
Tissues Waiting for Library Construction: 17
Tissues Collected (but not yet delivered): 12
Tissues To Be Collected: 64

Page 27: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

www.mouseatlas.org

cgap.nci.nih.gov

Page 28: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Training / Recruitment

Page 29: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Management

Page 30: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Detection and elimination of contamination

Page 31: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

101 SAGE libraries built

Page 32: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

101 SAGE libraries

Page 33: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tags sequenced

Page 34: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Library Construction Scale-Up

[Chart: number of libraries (y-axis, 0 to 12) by date (month-year), for two series: All libraries, Mouse]

Page 35: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Rationale and Goals

Systematic association of expressed genes with precisely defined tissues sampled throughout development will enhance dramatically the mouse as a tool for developmental biologists and those seeking to understand the genetic basis of disease in murine models.

•To construct and sequence 150 SAGE libraries representing a variety of tissues and developmental stages

•To place these data in the public domain

Page 36: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Progress

Objective 1: Define the normal state for many tissues by determining…the number and identity of genes expressed throughout development.

• Progress: longSAGE library construction pipeline established! LCM tissue harvesting explored. Technology development on small samples well advanced. SAGELite and PCRSAGE libraries constructed. N SAGELite libraries constructed and sequenced by March 31, 2005. Trans-NIH group completed tissue harvesting for 90 adult tissues. Projected MPSS data in public domain before Fall 2004. Tag-to-gene mapping (v.1) complete at Vancouver. Mouse SAGE Genie under development (Hopkins / CGAP; G. Riggins PI).
• Plan for completion: Continue at current rate with increasing emphasis on small, manually- and LCM-dissected samples.

Objective 2: Establish a data structure / curation strategy that will facilitate the ongoing collection of gene expression data….

• Progress: www.mouseatlas.org active and www.ncbi.nlm.nih.gov/ncicgap/ will soon host data (N libraries submitted to S. Greenhut and C. Schaeffer).
• Plan for completion: Essentially complete. Data and annotation from Mouse Atlas and NIH / LYNX will populate databases.

Objective 3: Assemble gene expression profiles [to] test hypotheses related to technologies, tumor models and models of abnormal development.

• Progress: N % complete. List models so far. Philosophy has been to focus on establishing pipeline for wild-type tissues as these may be most relevant to broader community. Increased focus on models over remainder of project.
• Plan for completion: Construct, sequence and analyze N libraries representing specific models, including….

Page 37: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Detailed Milestones

Year 1
• Establish the project management and communication system. COMPLETE
• Launch of project web site in public domain. The launch will include a registration on the website of all of the tissues we intend to include in the Atlas. COMPLETE
• Complete dissections of type A tissues (See Table 1). INCOMPLETE
• Complete experiments to compare the use of amplified and non-amplified RNA in SAGE library construction. COMPLETE
• Complete experiments to compare the use of RNA from tissues isolated by Laser Microdissection and by manual dissection in SAGE library construction. INCOMPLETE
• Implement SAGE Bioinformatic processing pipeline. This includes implementation of software to automatically perform quality control testing on the sequencing of the tags, and entry of the tags into SAGEdb for subsequent analysis. COMPLETE
• Expand SAGEdb to accommodate dissection procedures and digital images of mouse tissues used to generate the SAGE libraries. WEBSITE
• Construct and sequence 40 (30) SAGE libraries and enter them in the database. COMPLETE

Year 2
• Complete dissections of type B and C tissues (See Table 1). INCOMPLETE
• Construct and sequence 80 (60) SAGE libraries and enter them in the database. IN PROGRESS
• Web enabled data mining tool available for SAGE library comparisons. IN PROGRESS
• Use of bioinformatics to identify differentially expressed genes from SAGE libraries for further analysis and to assess the quality of the libraries generated. IN PROGRESS

Year 3
• Complete dissections of type D tissues and mouse models (See Table 1). INCOMPLETE
• Construct and sequence 80 (60) SAGE libraries and enter them in the database. INCOMPLETE
• Complete quantitative RT-PCR (QPCR) and in situ hybridization analysis for quality control and quality assurance. IN PROGRESS
• Complete SAGE library construction, sequencing and analysis on mouse models to test utility of the database. IN PROGRESS
• Demonstrate the potential uses of the Atlas through SAGE analysis of specific mouse models. IN PROGRESS
• Generate a spin-off project based on SAGE-based discoveries from mouse models (cancer models, early embryogenesis, or ‘fierce’ mice, see below). COMPLETE
• Identify candidate genes not previously reported in mouse databases. IN PROGRESS
• Target corporate partnership to build microarrays based on new candidate genes. INCOMPLETE
• Present research discoveries based on the Atlas at scientific conference (e.g. Gordon Conference). IN PROGRESS
• Publish dataset in peer-reviewed journals. IN PROGRESS

Established collaboration with NIH group to compare MPSS and SAGE and coordinate effort. LYNX efforts focused on adults; BC efforts focused on earlier developmental stages.

Page 38: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Mouse Atlas: SAGE Library Construction

Genome Sciences Centre
BC Cancer Agency

11th May 2004

Page 39: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

SAGE LIBRARY CONSTRUCTION

Page 40: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Agilent Bioanalyzer – RNA picochip, total RNA, 115 pg/uL

synthetic 25nt marker

Page 41: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

PCR Optimization on 12% Polyacrylamide Gel

My Network Places/ mapper.ro on Xena/ GeneExpLab/ Typhoon Data/ LongSAGE folder/ Library Folder/ gel name_date

[Gel image: PCR reactions at 23, 25, 27 and 35 cycles, run alongside 25 bp Ladder (20 ng/uL) lanes. Lane labels, repeated across the cycle numbers: Brew only (Brew control); 1/20 dil No Ligase (-’ve control); 1/10 dil LS Control template (+’ve control); 1/20 dil Ligation; 1/40 dil Ligation; 1/80 dil Ligation. Ladder bands marked at 25, 50, 75, 100, 125, 150, 175 and 200 bp; the 131 bp Ditag is indicated.]

Load 5uL of Ladder

Load 5uL of sample

Page 42: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

131bp Ditag on 12% Polyacrylamide Gel

My Network Places/ mapper.ro on Xena/ GeneExpLab/ Typhoon Data/ LongSAGE folder/ Library Folder/ gel name_date

[Gel image: sample lanes run alongside a 25bp Ladder (20 ng/uL). Ladder bands marked at 75, 100, 125, 150, 175 and 200 bp; the 131 bp Ditag is indicated.]

Load 5uL of Ladder

Load 6 – 8 uL of sample per well

Page 43: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

36bp Ditag on 15% Polyacrylamide Gel

My Network Places/ mapper.ro on Xena/ GeneExpLab/ Typhoon Data/ LongSAGE folder/ Library Folder/ gel name_date

[Gel image: sample lanes between two 25bp Ladder (20 ng/uL) lanes, with an empty lane on each side of the samples. Ladder bands marked at 25, 50, 75, 100, 125, 150, 175 and 200 bp. Bands indicated: 131 bp Uncut Ditag; 84 bp and 87 bp Partially cut Ditag; 44bp and 47bp Adaptor sequence; 36 bp Ditag.]

Load 5uL Ladder

Load 4 ul of sample per well

Page 44: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Concatemer on 8% Polyacrylamide Gel

My Network Places/ mapper.ro on Xena/ GeneExpLab/ Typhoon Data/ LongSAGE folder/ Library Folder/ gel name_date

[Gel image: concatemer lane between two 100 bp Ladder lanes. Ladder bands marked at 100 to 1000 bp in 100 bp steps, plus 1500 bp and 2072 bp on one ladder. Three regions of the concatemer lane are marked: Small size fraction, Medium size fraction, Large size fraction.]

100 bp Ladder (10 ng/uL): Load 10 uL

100 bp Ladder (20 ng/uL): Load 5 uL

Load all 10 uL of concatemer into 1 well

Page 45: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Colony PCR on 1.5% Agarose Gel

My Network Places/ mapper.ro on Xena/ GeneExpLab/ Typhoon Data/ LongSAGE folder/ Library Folder/ gel name_date

[Gel image: lane groups for Small size fraction, Medium size fraction, Large size fraction, and -’ve controls (No DNA and No Ligase)]

Load 1.5 uL of Sample per well

1 Kb+ Ladder (20 ng/uL): Load 1 uL

Page 46: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access
Page 47: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Library construction – future throughput & staffing

Page 48: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access
Page 49: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access
Page 50: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Bottom line: one 3’-most tag per transcript

Page 51: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tissue and stage selection

150 libraries, 22 stages of development, 24 tissues, and tissue subtypes

Page 52: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tissue selection, heart & lung

Theiler Stage: 1 3 4 9 11 13 15 17 19 20 21 22 23 24 25 26 27
Description: 1 cell, morula, blast, embryos, neonates
Days post coitum (dpc), embryos: 6.5 7.5 8.5 9.5 10.5 11.5 12.5 13.5 14.5 15.5 16.5 17.5 18.5
Time post birth, neonates: 1 2 3 4 5

Heart: whole, bulbus cordis, atrium, atrio-ventricular cushions, ventricle

Lung

90 adult libraries to be produced using MPSS (Lynx) technology by a group at NHGRI led by Chris Austin

Page 53: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Discovery and Analysis

• Data quality, error detection and correction
• Tag to gene mapping
• Functional mapping
• Discovery of novel features

Page 54: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Low frequency tags map poorly to genome and transcript resources

[Chart: proportion of tag types mapped (y-axis, 0 to 1) by tag frequency class (x-axis: 1, 2, 3, 4, 5, 6-10, 11-50, 51-100, 101-1000, >1000)]

Page 55: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Correction of sequence errors by clustering tags

• High frequency tags are correct
• Single base changes in lower frequency tags are errors
• A prediction of about 10% tags in error
• Extrapolation vs PHRED quality scores yields 4.6% error
• Library construction error (rt, pcr) was 5%
• Apply this model to the full data set

Colinge & Feger, 2001
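
The first two bullets describe the core idea: trust abundant tags, and treat rare tags that differ from an abundant tag by a single base as sequencing errors. The sketch below is a deliberately simplified illustration of that idea and is not the Colinge & Feger model or the project's implementation. The thresholds `min_parent` and `max_child`, the function names, and the toy 8-base tags are all hypothetical.

```python
def differs_by_one_base(a, b):
    """True when two equal-length tags differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def cluster_tags(counts, min_parent=10, max_child=2):
    """Fold rare tags into an abundant tag one base away.

    counts: dict mapping tag sequence to observed count.
    Tags with count >= min_parent are treated as correct ("parents").
    Tags with count <= max_child that are one substitution away from a
    parent are treated as errors and merged into the most abundant such
    parent. All other tags are left unchanged.
    """
    parents = [tag for tag, n in counts.items() if n >= min_parent]
    merged = dict(counts)
    for tag, n in counts.items():
        if n > max_child:
            continue
        candidates = [p for p in parents if differs_by_one_base(tag, p)]
        if candidates:
            best = max(candidates, key=lambda p: counts[p])
            merged[best] += n
            del merged[tag]
    return merged


if __name__ == "__main__":
    toy = {"CATGAAAA": 50, "CATGAAAT": 1, "CATGCCCC": 2, "CATGGGGG": 12}
    print(cluster_tags(toy))
```

The pairwise comparison is quadratic in the number of tags, which is fine for a toy example but would need an indexed neighbour lookup for a full meta-library.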

Page 56: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

More tags map to exons

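[Two pie charts of tag location relative to gene structure (5’ to 3’).
All tag types, slice values: 25%, 13%, 5.2%, 3.1%, 8.4%, 7.0%, 18%, 4.2%, 15%.
Highly expressed tag types, slice values: 73%, 3.4%, 1.6%, 1.5%, 9.9%, 0.8%, 1.4%, 3.1%, 4.4%.]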

Page 57: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Conclusions

The mouse atlas is available; please start using it.

Tag errors can be accounted for in several ways

Tags map to all areas of the genome, which will lead to new feature discovery

Only a small proportion of splice variants are currently annotated.

Page 58: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Funding

• Genome Canada

• NCI

• NHGRI

• BC Cancer Agency

• Applied Biosystems

Page 59: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Our Team

Steven Jones, Genome Sciences Centre: Asim Siddiqui, Scott Zuyderduyn, Richard Varhol, Derek Leung, Kevin Teague, Lisa Lee, Anita Landry

Elizabeth M. Simpson, CMMT: Robert Xie, Slavita Bohacec, Byron Kuo

Adrian Burke, GenomeBC

Caroline Astell, Project Manager

Pamela Hoodless, Terry Fox Laboratory: Jim Rupert, Mona Wu, Rebecca Cullum

Cheryl Helgason, Cancer Endocrinology: Brad Hoffman, Teresa Ruiz de Alagara, Ida Zhang

Marco Marra, Genome Sciences Centre: Jaswinder Khattra, Allen Delaney, Jennifer Asano, Susanna Chan

Greg Riggins, JHU
Daniela Gerhardt, NCI
Chris Austin, NHGRI

Page 60: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Tag location All 1 2 3 4 5 6 7 8 9 10 11-100 101-1000 >1000

exin 448 215 72 22 17 12 15 5 3 8 4 64 11 0

exin- 252 140 37 21 19 5 3 3 3 2 2 14 3 0

exon 32633 12825 4834 2354 1403 904 690 613 500 435 351 6202 1471 51

exon- 17629 8663 3199 1508 915 604 436 301 261 186 142 1342 70 2

inex 357 192 60 26 8 17 7 5 4 2 4 28 4 0

inex- 328 169 48 23 15 7 9 2 2 2 3 45 3 0

intergenic 20081 14139 2776 864 473 278 220 131 123 89 85 808 88 7

intron 24264 18085 3512 1029 484 273 163 122 95 63 49 355 29 5

intron- 9306 7253 1202 296 163 83 54 35 27 23 20 131 15 4

utr3 11175 4989 1658 798 448 372 267 185 162 143 127 1825 200 1

utr3- 6861 3711 1125 521 311 187 158 124 95 70 37 490 32 0

utr5 4190 2369 632 303 153 105 89 44 49 42 29 342 31 2

utr5- 5653 2924 883 377 241 158 115 84 67 49 53 639 62 1

Total: 133177 75674 20038 8142 4650 3005 2226 1654 1391 1114 906 12285 2019 73

Tag count All 1 2 3 4 5 6 7 8 9 10 11-100 101-1000 >1000

exin 0.34 0.28 0.36 0.27 0.37 0.40 0.67 0.30 0.22 0.72 0.44 0.52 0.54 0.00

exin- 0.19 0.19 0.18 0.26 0.41 0.17 0.13 0.18 0.22 0.18 0.22 0.11 0.15 0.00

exon 24.50 16.95 24.12 28.91 30.17 30.08 31.00 37.06 35.95 39.05 38.74 50.48 72.86 69.86

exon- 13.24 11.45 15.96 18.52 19.68 20.10 19.59 18.20 18.76 16.70 15.67 10.92 3.47 2.74

inex 0.27 0.25 0.30 0.32 0.17 0.57 0.31 0.30 0.29 0.18 0.44 0.23 0.20 0.00

inex- 0.25 0.22 0.24 0.28 0.32 0.23 0.40 0.12 0.14 0.18 0.33 0.37 0.15 0.00

intergenic 15.08 18.68 13.85 10.61 10.17 9.25 9.88 7.92 8.84 7.99 9.38 6.58 4.36 9.59

intron 18.22 23.90 17.53 12.64 10.41 9.08 7.32 7.38 6.83 5.66 5.41 2.89 1.44 6.85

intron- 6.99 9.58 6.00 3.64 3.51 2.76 2.43 2.12 1.94 2.06 2.21 1.07 0.74 5.48

utr3 8.39 6.59 8.27 9.80 9.63 12.38 11.99 11.19 11.65 12.84 14.02 14.86 9.91 1.37

utr3- 5.15 4.90 5.61 6.40 6.69 6.22 7.10 7.50 6.83 6.28 4.08 3.99 1.58 0.00

utr5 3.15 3.13 3.15 3.72 3.29 3.49 4.00 2.66 3.52 3.77 3.20 2.78 1.54 2.74

utr5- 4.24 3.86 4.41 4.63 5.18 5.26 5.17 5.08 4.82 4.40 5.85 5.20 3.07 1.37

Total: 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Proportions of tag types

Count of tag types, mouse atlas
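
The proportions table is the counts table with each column rescaled to sum to 100. A minimal sketch for the “All” column, using the counts shown above (the function name is illustrative):

```python
# "All" column of the tag-type count table, transcribed from the slide.
ALL_COUNTS = {
    "exin": 448, "exin-": 252, "exon": 32633, "exon-": 17629,
    "inex": 357, "inex-": 328, "intergenic": 20081, "intron": 24264,
    "intron-": 9306, "utr3": 11175, "utr3-": 6861, "utr5": 4190,
    "utr5-": 5653,
}


def to_percentages(counts):
    """Convert a column of counts to percentages of the column total."""
    total = sum(counts.values())
    return {location: 100 * n / total for location, n in counts.items()}


if __name__ == "__main__":
    for location, pct in to_percentages(ALL_COUNTS).items():
        print(f"{location:<11}{pct:6.2f}")
```

The same function applies to any of the frequency-class columns, which is how the shift from intron and intergenic locations toward exons at higher tag counts can be read off.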

Page 61: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

All tags and singletons: proportion by location

[Chart: proportion of tag types by location (y-axis, 0.00 to 30.00) for two series: All, 1]

Page 62: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Proportion by tag location: selected tag frequency classes

[Chart: proportion of tag types by location (y-axis, 0.00 to 60.00) for four tag frequency classes: 1, 2, 5, 11-100]

Page 63: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Mention sageLite and amplified libraries?

Page 64: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

9.5 dpc heart (> 5 tags)

E9.5 primitive ventricle SAGE library: 68,270 tags; 15,496 tag types

E9.5 atria SAGE library: 60,466 tags; 14,393 tag types

E9.5 bulbus cordis SAGE library: 77,826 tags; 18,980 tag types

[Venn diagram of the three libraries; region labels as extracted: 73, 2141, 141822]

* significantly more or less frequent at P<0.05
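
The slide does not say which test was used to flag tags as significantly more or less frequent between libraries. As an illustration only, the sketch below assumes a pooled two-proportion z-test, which is a swapped-in technique and not necessarily the project's method. The library sizes are taken from the slide, while the per-tag counts in the usage example are hypothetical.

```python
import math


def two_proportion_p(count_a, total_a, count_b, total_b):
    """Two-sided p-value for a difference in tag frequency between libraries.

    Uses a pooled two-proportion z-test with a normal approximation, which
    is only a rough guide for tags with very low counts.
    """
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 1.0  # tag absent (or universal) in both libraries
    z = (count_a / total_a - count_b / total_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical tag seen 30 times in the ventricle library (68,270 tags)
    # and 8 times in the atria library (60,466 tags).
    p = two_proportion_p(30, 68270, 8, 60466)
    print(f"p = {p:.4f}")
```

Normalising by library size matters here because the three heart libraries differ in depth by up to about 17,000 tags.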

Page 65: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

040531 ATLAS SITE VISIT

Page 66: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

LCM/BRAIN SPECIFIC OVERHEADS

Page 67: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Relationship of Technologies

Page 68: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

High-Quality RNA Sent to GSC

Page 69: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

33% Complete

Page 70: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Brain-Specific Advisory Team

Max Cynader, Ph.D., Director, Brain Research Centre

Shiv Prasad, Ph.D., Research Associate, Brain Research Centre

John O’Kusky, Ph.D., Core Leader & Assoc. Prof., Department of Pathology

Anthony G. Phillips, FRSC., Ph.D., Professor, Department of Psychiatry

Blair Leavitt, MD, CM, FRCP(C), Scientist & Assoc. Prof., CMMT, UBC

Page 71: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Process Underlying Library Choices

Page 72: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Striatum Collaboration

Page 73: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Cortical Neurogenesis

Page 74: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

50 Brain Biology Libraries

Page 75: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

?ADD TO PROJECT WIDE SLIDES?

Page 76: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Collaborators (additional to pg. 22)

Blair R. Leavitt MD, CM, FRCP(C) (CMMT, UBC)
Dr. Leavitt is a neurologist with both a clinical and research focus on degenerative disease. He is interested in expression profiling results for the striatum because of the key role of that region in the development of Huntington Disease. He is ready and able to follow up experimentally on striatum-specific genes we identify.

Page 77: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

Publications (additional to pg. 56)

Ongoing: Web publication during first 5.5 months of 2004. Ave. page views/month: 4090. Ave. downloads/library: 10

Abstract: Society for Neuroscience Meeting, San Diego, Oct. 23-27, 2004. SAGE libraries constructed from murine neural tissue harvested by laser capture microdissection (LCM). Xie, Y.-Y., Bohacec, S., Khattra, J., Lee, L., Delaney, A., Jones, S., Marra, M., and Simpson, E. M.

In preparation: ??

Planned:

Laser capture microdissection (LCM) adapted to construct SAGE-lite libraries from embryonic sites of neurogenesis. Xie, Y.-Y., Bohacec, S., Khattra, J., Lee, L., Delaney, A., et al., Jones, S., Marra, M., and Simpson, E. M. (200_). Journal of Neuroscience Methods.

Ocular dominance plasticity critical period expression profiled by SAGE, Affymetrix, and CodeLink. Cynader, M. S., Bohacec, S., Prasad, S. S., Dewell, S., Kuo, B., Kojic, L., Khattra, J., et al., Jones, S., Marra, M., Wasserman, W. W., and Simpson, E. M. (200_). Nature Neuroscience.

Page 78: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

What next? (ideas - not an overhead)

eQTL focused on Disease Modifiers

Couple large scale Affy chip technology and strain-specific sequencing to actually clone and identify the sequence differences of Quantitative Trait Loci (QTL) for traits or modifiers of mouse phenotypes (including such things as drug response, regulatory pathways, behaviour, and cancer susceptibility)

Subsection: Comparative SAGE libraries from mouse strains used to create the heterogeneous strains for QTL

Resources required at genomic level: mice, SAGE, Affy, sequencing, bioinformatics, genomic scanning

Aim: not just map but actually clone and find the specific bp change for ?# disease modifiers; and create a pipeline to do more

Background: The technology and throughput are finally here to achieve the identification of mouse QTLs. This has been a promising but unfruitful approach for years, but scientists and funding agencies (NIH, MRC) are recognizing that the time may be here for correctly positioned multidisciplinary groups. Key are three recent developments: sequence of mouse genome, eQTL, start of mouse haplotype map.

Examples of Literature
• Doerge, R. W. (2002). Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3, 43-52.
• Lemon, W. J., Bernert, H., Sun, H., Wang, Y., and You, M. (2002). Identification of candidate lung cancer susceptibility genes in mouse using oligonucleotide arrays. J Med Genet 39, 644-55.
• Prows, D. R., McDowell, S. A., Aronow, B. J., and Leikauf, G. D. (2003). Genetic susceptibility to nickel-induced acute lung injury. Chemosphere 51, 1139-48.

Page 79: An Atlas of Gene Expression in Mouse Development  technology development technology implementation public access

PROBABLY NOT MAKE OVERHEADS