Rearrangements and Duplications in Tumor Genomes

Page 1: Rearrangements and Duplications in Tumor Genomes

Rearrangements and Duplications in Tumor Genomes

Page 2: Rearrangements and Duplications in Tumor Genomes

Tumor Genomes

Compromised genome stability

Mutation and selection

• Chromosomal aberrations
  – Structural: translocations, inversions, fissions, fusions.
  – Copy number changes: gain and loss of chromosome arms, segmental duplications/deletions.

Page 3: Rearrangements and Duplications in Tumor Genomes

Rearrangements in Tumors

Change gene structure, create novel fusion genes

• Gleevec (Novartis 2001) targets BCR-ABL fusion

Page 4: Rearrangements and Duplications in Tumor Genomes

Rearrangements in Tumors

Alter gene regulation

Burkitt lymphoma translocation

IMAGE CREDIT: Gregory Schuler, NCBI, NIH, Bethesda, MD, USA

Regulatory fusion in prostate cancer (Tomlins et al., Science, Oct. 2005)

Page 5: Rearrangements and Duplications in Tumor Genomes

Complex Tumor Genomes

1) What are detailed architectures of tumor genomes?
2) What genes affected?
3) What processes produce these architectures?
4) Can we create custom treatments for tumors based on mutational spectrum? (e.g. Gleevec)

Page 6: Rearrangements and Duplications in Tumor Genomes

Common Alterations across Tumors

• Mutations activate/repress circuits.
• Multiple points of attack.
• “Master genes”: e.g. p53, Myc.
• Others probably tissue/tumor specific.

[Figure: gene circuit diagram marking repression, activation, duplicated genes, and deleted genes.]

Page 7: Rearrangements and Duplications in Tumor Genomes

Human Cancer Genome Project

• What tumors to sequence?
• What to sequence from each tumor?
  1. Whole genome: all alterations
  2. Specific genes: point mutations
  3. Hybrid approach: structural rearrangements, etc.

Page 9: Rearrangements and Duplications in Tumor Genomes

End Sequence Profiling (ESP)
C. Collins and S. Volik (UCSF Cancer Center)

1) Pieces of tumor genome: clones (100-250kb).
2) Sequence ends of clones (500bp).
3) Map end sequences to human genome.

Each clone corresponds to a pair of end sequences (ES pair) (x,y). Retain clones that correspond to a unique ES pair.

Valid ES pairs
• l ≤ y – x ≤ L, where l and L are the minimum and maximum clone sizes.
• Convergent orientation.

Invalid ES pairs
• Putative rearrangement in tumor.
• ES directions point toward breakpoints.

[Figure: a tumor DNA clone whose end sequences x and y are mapped onto human DNA; L marks the maximum clone size.]
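The valid/invalid test for ES pairs can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `EndSequence` record, the strand encoding, and the default thresholds (taken from the 100-250kb clone sizes quoted on the slide) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class EndSequence:
    chrom: str   # chromosome name in the human reference
    pos: int     # mapped coordinate in the human reference
    strand: int  # +1 if the read points toward larger coordinates, -1 otherwise


def is_valid_es_pair(a, b, min_clone=100_000, max_clone=250_000):
    """Return True if the pair satisfies the length and orientation constraints.

    Thresholds are assumed values based on the clone sizes quoted in the slides.
    """
    if a.chrom != b.chrom:
        return False  # ends on different chromosomes: putative translocation
    left, right = sorted((a, b), key=lambda e: e.pos)
    # Convergent orientation: the two reads point toward each other.
    convergent = left.strand == 1 and right.strand == -1
    return convergent and min_clone <= right.pos - left.pos <= max_clone
```

A pair that fails either test is a candidate rearrangement (or an experimental artifact) and is passed on to the later filtering steps.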

Page 12: Rearrangements and Duplications in Tumor Genomes

Outline

What does ESP reveal about tumor genomes?

1. Identify locations of rearrangements.
2. Reconstruct genome architecture, sequence of rearrangements.
3. In combination with other genome data (CGH).

Page 13: Rearrangements and Duplications in Tumor Genomes

ESP Data (Jan. 2006)

• Coverage of human genome: ≈ 0.34 for MCF7, BT474

Samples: breast cancer cell lines (BT474, MCF7, SKBR3), tumors (Brain, Breast1, Breast2, Ovary, Prostate), Normal

Clones: 9580, 19831, 9267, 1756, 4246, 9612, 7623, 5267, 5031

ES pairs: 7994, 12073, 7300, 1300, 3222, 6785, 5588, 3923, 3448

Page 14: Rearrangements and Duplications in Tumor Genomes

1. Rearrangement breakpoints

• Known cancer genes (e.g. ZNF217, BCAS3/4, STAT3)

• Novel candidates near breakpoints.

MCF7 breast cancer

• Small-scale scrambling of genome more extensive than expected.

Page 15: Rearrangements and Duplications in Tumor Genomes

Structural Polymorphisms

• Human genetic variation more than nucleotide substitutions

• Short indels/inversions present (Iafrate et al. 2004, Sebat et al. 2004, Tuzun et al. 2005, McCarroll et al. 2006, Conrad et al. 2006, etc.)

• ≈ 3% (53/1570) invalid ES pairs explained by known structural variants.

[Figure: a 1.6 Mb inversion between positions s and t. Reference human: A B C. Human variant: A -B C.]

Page 16: Rearrangements and Duplications in Tumor Genomes

2. Tumor Genome Architecture

1) What are detailed architectures of tumor genomes?

2) What sequence of rearrangements produce these architectures?

Page 17: Rearrangements and Duplications in Tumor Genomes

ESP Genome Reconstruction Problem

• Human genome (known): A B C D E
• Tumor genome (unknown): A -C -D B E, produced by an unknown sequence of rearrangements.
• Map ES pairs to human genome. The locations of the ES pairs in the human genome are known.
• Reconstruct tumor genome.

[Figure: ES pairs (x1,y1), …, (x5,y5) from the tumor genome mapped onto the human genome.]
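To make the mapping step concrete, the sketch below places clones on a toy tumor genome built from signed human blocks and reports where their ends land in human coordinates. Everything here is illustrative: the block order A -C -D B E follows the slide's example, but the block lengths, coordinates, and function names are assumptions.

```python
# Hypothetical human block coordinates (half-open intervals); lengths are made up.
HUMAN = {"A": (0, 100), "B": (100, 200), "C": (200, 300),
         "D": (300, 400), "E": (400, 500)}
# Tumor genome as an ordered list of (block, orientation).
TUMOR = [("A", 1), ("C", -1), ("D", -1), ("B", 1), ("E", 1)]


def tumor_to_human(pos, tumor=TUMOR, human=HUMAN):
    """Map a tumor coordinate to (human coordinate, block orientation)."""
    offset = 0
    for name, sign in tumor:
        start, end = human[name]
        length = end - start
        if pos < offset + length:
            within = pos - offset
            if sign > 0:
                return start + within, 1
            return end - 1 - within, -1  # inverted block: count from its far end
        offset += length
    raise ValueError("position outside tumor genome")


def es_pair(clone_start, clone_end):
    """Human coordinates and read directions of the two ends of a tumor clone."""
    x, sx = tumor_to_human(clone_start)
    y, sy = tumor_to_human(clone_end)
    # The left end reads forward along the tumor, the right end reads backward.
    return (x, sx), (y, -sy)
```

A clone lying inside block A, such as `es_pair(10, 60)`, maps to a convergent pair, while a clone crossing the A/-C junction, such as `es_pair(50, 150)`, maps to two ends pointing in the same direction, which is the signature of an inversion breakpoint.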

Page 19: Rearrangements and Duplications in Tumor Genomes

ESP Genome Reconstruction: Comparative Genomics

[Figure: dot plot of the human genome (A B C D E) against the tumor genome (A -C -D B E), with ES pairs (x1,y1), (x2,y2), (x3,y3), (x4,y4) marked on both genomes.]

Page 23: Rearrangements and Duplications in Tumor Genomes

2D Representation of ESP Data

• Each point is an ES pair.
• Can we reconstruct the tumor genome from the positions of the ES pairs?

[Figure: ESP plot with the human genome (A B C D E) on both axes; the points (x1,y1), (x2,y2), (x3,y3), (x4,y4) are ES pairs.]

ESP Plot → Tumor Genome

Reconstructed tumor genome: A -C -D B E

Page 28: Rearrangements and Duplications in Tumor Genomes

Real data noisy and incomplete!

Valid ES pairs
• satisfy length/direction constraints l ≤ y – x ≤ L

Invalid ES pairs
• indicate rearrangements
• experimental errors

Page 29: Rearrangements and Duplications in Tumor Genomes

Computational Approach

1. Use known genome rearrangement mechanisms.
2. Find simplest explanation for ESP data, given these mechanisms.
3. Motivation: Genome rearrangements studies in phylogeny.

[Figure: an inversion between s and t turns human A B C into tumor A -B C; a second panel illustrates a translocation involving blocks A, B, C, D.]

Page 30: Rearrangements and Duplications in Tumor Genomes

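ESP Sorting Problem

• $G = [0, M]$, unichromosomal genome.
• Reversal $\rho_{s,t}(x) = x$ if $x < s$ or $x > t$, and $\rho_{s,t}(x) = t - (x - s)$ otherwise.

Given: ES pairs $(x_1, y_1), \dots, (x_n, y_n)$.
Find: Minimum number of reversals $\rho_{s_1,t_1}, \dots, \rho_{s_n,t_n}$ such that if $\rho = \rho_{s_1,t_1} \cdots \rho_{s_n,t_n}$, then $(\rho(x_1), \rho(y_1)), \dots, (\rho(x_n), \rho(y_n))$ are valid ES pairs.

[Figure: genome $G$ = A B C and rearranged genome $G' = \rho G$ = A -B C, with reversal endpoints s, t and ES pairs (x1,y1), (x2,y2).]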
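The reversal map defined on this slide is easy to check numerically. The sketch below implements it directly and tests only the length constraint of a valid pair; orientation is ignored for brevity, and the helper name `length_ok` and the numeric values are assumptions chosen for illustration.

```python
def reversal(s, t, x):
    """Reversal of the interval [s, t]: fixes points outside, reflects points inside."""
    if x < s or x > t:
        return x
    return t - (x - s)


def length_ok(x, y, l, L):
    """Length part of the validity test: l <= |y - x| <= L."""
    return l <= abs(y - x) <= L


# An ES pair that looks too long in the reference becomes valid once the
# (hypothetical) inversion of [300, 600] is applied to both ends.
x, y = 250, 550
before = length_ok(x, y, 50, 150)
after = length_ok(reversal(300, 600, x), reversal(300, 600, y), 50, 150)
```

Here `before` is `False` (the ends are 300 apart) and `after` is `True` (the ends map to 250 and 350). Applying the same reversal twice returns every point to its original position.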

Page 31: Rearrangements and Duplications in Tumor Genomes

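Sequence of reversals.

All ES pairs valid.

[Figure: successive reversals with endpoints s and t transform A B C into A -C -B; after the reversals the ES pairs (x1,y1), (x2,y2), (x3,y3) are all valid.]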

Page 32: Rearrangements and Duplications in Tumor Genomes

Filtering Experimental Noise

1) Pieces of tumor genome: clones (100-250kb).
2) Sequence ends of clones (500bp).
3) Map end sequences to human genome.

• Rearrangement: cluster of invalid pairs.
• Chimeric clone: isolated invalid pair.

[Figure: tumor DNA clones with end sequences x and y mapped onto human DNA.]

Page 33: Rearrangements and Duplications in Tumor Genomes

Sparse Data Assumptions

1. Each cluster results from single inversion.
2. Each clone contains at most one breakpoint.

[Figure: ES pairs (x1,y1), (x2,y2), (x3,y3) shown on the human genome and on the tumor genome.]

Page 34: Rearrangements and Duplications in Tumor Genomes

ESP Genome Reconstruction: Discrete Approximation

1) Remove isolated invalid pairs (x,y).
2) Define segments from clusters.
3) ES orientations define links between segment ends.

[Figure: ESP plot with the human genome on both axes; the clustered ES pairs (x1,y1), (x2,y2), (x3,y3) define segment endpoints s and t.]
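Step 1 can be sketched as a clustering pass over the invalid pairs. The rule used below (two pairs belong together when both of their ends lie within one maximum clone length of each other) is an assumption chosen for illustration, not necessarily the criterion used in the study, and the function names are hypothetical.

```python
def cluster_invalid_pairs(pairs, max_clone):
    """Group invalid ES pairs (x, y) into connected clusters.

    Two pairs are linked when both |x1 - x2| and |y1 - y2| are at most
    max_clone (an assumed closeness rule). Uses a small union-find.
    """
    n = len(pairs)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (x1, y1), (x2, y2) = pairs[i], pairs[j]
            if abs(x1 - x2) <= max_clone and abs(y1 - y2) <= max_clone:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(pairs[i])
    return list(groups.values())


def drop_isolated(pairs, max_clone):
    """Keep only clusters supported by at least two invalid pairs."""
    return [c for c in cluster_invalid_pairs(pairs, max_clone) if len(c) >= 2]
```

Clusters that survive `drop_isolated` are the candidates for real rearrangements; singletons are treated as likely chimeric clones or mapping errors.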

Page 38: Rearrangements and Duplications in Tumor Genomes

ESP Graph

Edges:
1. Human genome segments
2. ES pairs

Paths in graph are tumor genome architectures.

Tumor genome (1 -3 -4 2 5) = signed permutation of (1 2 3 4 5)

[Figure: ESP plot over segments 1 to 5 and the corresponding ESP graph.]

Page 39: Rearrangements and Duplications in Tumor Genomes

Sorting permutations by reversals (Sankoff et al. 1990)

$\pi = \pi_1 \pi_2 \dots \pi_n$ signed permutation

Reversal $\rho(i,j)$ [inversion]: $\pi_1 \dots \pi_{i-1}\; {-\pi_j} \dots {-\pi_i}\; \pi_{j+1} \dots \pi_n$

Problem: Given $\pi$, find a sequence of reversals $\rho_1, \dots, \rho_t$ such that $\pi \cdot \rho_1 \cdot \rho_2 \cdots \rho_t = (1, 2, \dots, n)$ and $t$ is minimal.

Solution: Analysis of breakpoint graph ← ESP graph

Polynomial time algorithms:
• $O(n^4)$: Hannenhalli and Pevzner, 1995.
• $O(n^2)$: Kaplan, Shamir, Tarjan, 1997.
• $O(n)$ [distance $t$]: Bader, Moret, and Yan, 2001.
• $O(n^3)$: Bergeron, 2001.

Page 40: Rearrangements and Duplications in Tumor Genomes

Sorting Permutations

1 -3 -4 2 5 → 1 -3 -2 4 5 → 1 2 3 4 5
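The sorting example on this slide can be replayed with a small helper that applies a signed reversal to a permutation. This is a sketch of the operation only, using 1-based inclusive positions; the function name is an assumption, and finding a minimal sequence of reversals is a separate problem that this code does not solve.

```python
def apply_reversal(perm, i, j):
    """Reverse positions i..j (1-based, inclusive) and flip the signs inside."""
    return perm[:i - 1] + [-v for v in reversed(perm[i - 1:j])] + perm[j:]


tumor = [1, -3, -4, 2, 5]
step1 = apply_reversal(tumor, 3, 4)   # flips the segment (-4, 2) into (-2, 4)
step2 = apply_reversal(step1, 2, 3)   # flips the segment (-3, -2) into (2, 3)
```

After the two reversals, `step1` is `[1, -3, -2, 4, 5]` and `step2` is the identity `[1, 2, 3, 4, 5]`.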

Page 41: Rearrangements and Duplications in Tumor Genomes

Breakpoint Graph

Black edges: adjacent elements of $\pi$
Gray edges: adjacent elements of $i$ = 1 2 3 4 5

Key parameter: Black-gray cycles

Theorem: The minimum number of reversals $d(\pi)$ needed to transform $\pi$ to the identity permutation $i$ satisfies
$d(\pi) \ge n + 1 - c(\pi)$, where $c(\pi)$ = number of gray-black cycles.

ESP Graph → Tumor Permutation and Breakpoint Graph

[Figure: breakpoint graphs, drawn from start to end, for 1 -3 -4 2 5, 1 -3 -2 4 5, and 1 2 3 4 5.]
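The cycle count in the lower bound can be computed with the standard construction that replaces each signed element by two unsigned endpoints. The sketch below is one common way to do this, not a transcription of the authors' code; the function names are assumptions.

```python
def breakpoint_cycles(perm):
    """Count black-gray cycles in the breakpoint graph of a signed permutation."""
    n = len(perm)
    # Element +x becomes (2x-1, 2x); element -x becomes (2x, 2x-1).
    ext = [0]
    for v in perm:
        ext += [2 * v - 1, 2 * v] if v > 0 else [-2 * v, -2 * v - 1]
    ext.append(2 * n + 1)

    # Black edges join neighbors in the permutation: positions (2k, 2k+1).
    black = {}
    for k in range(0, 2 * n + 2, 2):
        a, b = ext[k], ext[k + 1]
        black[a], black[b] = b, a

    def gray(v):
        # Gray edges join neighbors in the identity: values (2k, 2k+1).
        return v + 1 if v % 2 == 0 else v - 1

    seen, cycles = set(), 0
    for start in ext:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:  # alternate black and gray edges until the cycle closes
            seen.add(v)
            w = black[v]
            seen.add(w)
            v = gray(w)
    return cycles


def reversal_lower_bound(perm):
    """Lower bound n + 1 - c on the reversal distance."""
    return len(perm) + 1 - breakpoint_cycles(perm)
```

For the tumor permutation (1 -3 -4 2 5) the graph has 4 cycles, so the bound is 5 + 1 - 4 = 2, which matches the two reversals used in the sorting example. For the identity permutation every black edge coincides with a gray edge, giving n + 1 cycles and a bound of 0.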

Page 43: Rearrangements and Duplications in Tumor Genomes

MCF7 Breast Cancer Cell Line
• Low-resolution chromosome painting suggests complex architecture.
• Many translocations, inversions.

Page 44: Rearrangements and Duplications in Tumor Genomes

ESP Data from MCF7 tumor genome

Each point (x,y) is ES pair.

Coordinate in human genome

• 6239 ES pairs (June 2003)
  • 5856 valid (black)
  • 383 invalid
    • 256 isolated (red)
    • 127 form 30 clusters (blue)

Page 45: Rearrangements and Duplications in Tumor Genomes

MCF7 Genome

Human chromosomes → MCF7 chromosomes: sequence of 5 inversions and 15 translocations.

Raphael, Volik, Collins, Pevzner. Bioinformatics 2003.

Page 46: Rearrangements and Duplications in Tumor Genomes

Array Comparative Genomic Hybridization (aCGH)

3. Combining ESP with other genome data

Page 47: Rearrangements and Duplications in Tumor Genomes

CGH Analysis
• Divide genome into segments of equal copy number.

Segmentation: numerous methods (e.g. clustering, Hidden Markov Model, Bayesian, etc.)

No information about:
• Structural rearrangements (inversions, translocations)
• Locations of duplicated material in tumor genome.

[Figure: copy number profile, plotting copy number against genome coordinate.]
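As a toy version of the segmentation step, the sketch below merges consecutive probes with the same copy number into segments. It assumes the copy numbers have already been called as integers; the methods named on the slide (clustering, HMMs, Bayesian approaches) exist precisely because real aCGH signals are noisy, and this code does none of that. The function name and input layout are assumptions.

```python
from itertools import groupby


def segments(profile):
    """Collapse a sorted list of (position, copy_number) into runs.

    Returns (first_position, last_position, copy_number) for each run of
    consecutive probes that share the same copy number.
    """
    out = []
    for copy_number, run in groupby(profile, key=lambda p: p[1]):
        run = list(run)
        out.append((run[0][0], run[-1][0], copy_number))
    return out


profile = [(0, 2), (10, 2), (20, 3), (30, 3), (40, 5)]
result = segments(profile)
```

With the toy profile above, `result` is `[(0, 10, 2), (20, 30, 3), (40, 40, 5)]`. The segment boundaries are the CGH breakpoints, but nothing in this output says how the extra copies are arranged in the tumor genome.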

Page 49: Rearrangements and Duplications in Tumor Genomes

CGH Segmentation

How are the copies of segments linked?

ES pairs link segments.

[Figure: copy number vs. genome coordinate with segments of copy number 2, 3, and 5, shown alongside the tumor genome.]

Page 50: Rearrangements and Duplications in Tumor Genomes

ESP + CGH

ES near segment boundaries

[Figure: copy number vs. genome coordinate with segments of copy number 2, 3, and 5; a CGH breakpoint and an ESP breakpoint are marked.]

Page 51: Rearrangements and Duplications in Tumor Genomes

ESP and CGH Breakpoints

[Figure: Venn diagrams of ESP breakpoints and CGH breakpoints for BT474 and MCF7. Values shown: 33 (P = 5.4 × 10^-7), 244, 426; 39 (P = 1.2 × 10^-4), 730, 256; 12/39 clusters; 8/33 clusters.]

Page 52: Rearrangements and Duplications in Tumor Genomes

Microdeletion in BT474

• ES pair spans ≈ 600kb; a valid ES pair spans < 250kb.
• “Interesting” genes in this region.

[Figure: copy number (axis values 0, 2, 3) across the region, with the ES pair drawn above it.]

Page 53: Rearrangements and Duplications in Tumor Genomes

Combining ESP and CGH

ES pairs link segments.
Copy number balance at each segment boundary: 5 = 2 + 3.

[Figure: copy number vs. genome coordinate with segments of copy number 2, 3, and 5.]

Page 54: Rearrangements and Duplications in Tumor Genomes

Combining ESP and CGH

• CGH copy number not exact.
• What genome architecture is “most consistent” with ESP and CGH data?

[Figure: copy number vs. genome coordinate with segments of copy number 2, 3, and 5; the segments carry the bounds 3 ≤ f(e) ≤ 5, 1 ≤ f(e) ≤ 3, and 1 ≤ f(e) ≤ 4.]

Page 55: Rearrangements and Duplications in Tumor Genomes

Combining ESP and CGH

Build graph
1. Edge for each CGH segment.
2. Edge for each ES pair consistent with segments.
3. Range of copy number values for each CGH edge.

[Figure: copy number vs. genome coordinate with segments of copy number 2, 3, and 5 and edge bounds 3 ≤ f(e) ≤ 5, 1 ≤ f(e) ≤ 3, 1 ≤ f(e) ≤ 4.]

Page 56: Rearrangements and Duplications in Tumor Genomes

Network Flow Problem

• Minimum Cost Circulation with Capacity Constraints (Sequencing by Hybridization, Sequence Assembly)

Flow constraints: $l(e) \le f(e) \le u(e)$
• CGH edge: $l(e)$ and $u(e)$ from CGH
• ESP edge: $l(e) = 1$, $u(e) = 1$

Costs: $\alpha(e) = 0$ if $e$ is an ESP or CGH edge; $\alpha(e) = 1$ if $e$ is incident to source/sink.

$\min \sum_e \alpha(e) f(e)$
Subject to:
• Flow constraint on each edge: $l(e) \le f(e) \le u(e) \quad \forall e$
• Flow in = flow out at each vertex: $\sum_{(u,v)} f((u,v)) = \sum_{(v,w)} f((v,w)) \quad \forall v$

[Figure: graph with flow f(e) on each edge and a source/sink vertex.]
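The constraints of the circulation problem can be checked on a candidate flow with a few lines of code. This is a simplified directed sketch under stated assumptions: the edge names, the tiny example graph, and the bounds are invented for illustration, and the code only verifies and scores a flow; it is not a solver.

```python
def is_feasible(edges, flow):
    """Check capacity bounds on every edge and flow conservation at every vertex.

    edges: dict edge_id -> (tail, head, lower, upper, cost)
    flow:  dict edge_id -> flow value
    """
    balance = {}
    for e, (u, v, lower, upper, _cost) in edges.items():
        f = flow[e]
        if not lower <= f <= upper:
            return False
        balance[u] = balance.get(u, 0) - f  # flow leaving u
        balance[v] = balance.get(v, 0) + f  # flow entering v
    return all(b == 0 for b in balance.values())


def total_cost(edges, flow):
    """Objective value: sum of cost(e) * f(e) over all edges."""
    return sum(edges[e][4] * flow[e] for e in edges)


# Hypothetical graph: one CGH segment a->b with copy number between 3 and 5,
# one ES pair linking b back to a, and cost-1 edges through the source/sink s.
edges = {
    "cgh":  ("a", "b", 3, 5, 0),
    "esp":  ("b", "a", 1, 1, 0),
    "s_in": ("s", "a", 0, 10, 1),
    "out":  ("b", "s", 0, 10, 1),
}
flow = {"cgh": 3, "esp": 1, "s_in": 2, "out": 2}
```

In this toy instance the flow is feasible with cost 4: the single ES pair accounts for one of the three copies of the segment, and the remaining two units must be routed through the source/sink, which is where the objective charges for them.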

Page 59: Rearrangements and Duplications in Tumor Genomes

Network Flow Results

• Unsatisfied flow marks putative locations of missing ESP data.
• Prioritize further sequencing.
• Targeted ESP by screening library with CGH probes.

[Figure: flow graph with flow f(e) on edges and a source/sink vertex.]

Page 60: Rearrangements and Duplications in Tumor Genomes

Network Flow Results

• Identify amplified translocations
  – 14 in MCF7
  – 5 in BT474
• Flow values → Edge multiplicities
• Eulerian cycle in combined graph gives tumor genome architecture.
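Once flow values are read as edge multiplicities, an Eulerian cycle can be extracted with Hierholzer's algorithm. The sketch below works on a plain directed multigraph, which is a simplification of the combined ESP/CGH graph; the vertex names and multiplicities are hypothetical, and the input is assumed to be balanced (equal in-degree and out-degree at every vertex) and connected.

```python
def eulerian_cycle(multiplicity, start):
    """Hierholzer's algorithm on a directed multigraph.

    multiplicity: dict (u, v) -> number of parallel copies of edge u->v
    Returns the list of vertices visited, beginning and ending at start.
    """
    adj = {}
    for (u, v), m in multiplicity.items():
        adj.setdefault(u, []).extend([v] * m)
        adj.setdefault(v, [])

    stack, cycle = [start], []
    while stack:
        u = stack[-1]
        if adj[u]:
            stack.append(adj[u].pop())  # follow an unused edge
        else:
            cycle.append(stack.pop())   # dead end: emit vertex and backtrack
    return cycle[::-1]


# Hypothetical multiplicities, e.g. taken from a feasible circulation.
mult = {("s", "a"): 2, ("a", "b"): 3, ("b", "a"): 1, ("b", "s"): 2}
walk = eulerian_cycle(mult, "s")
```

The returned walk uses every edge exactly as many times as its multiplicity, so reading off the segments along the walk gives one genome architecture consistent with the flow. Different traversal orders can give different Eulerian cycles, so the architecture is generally not unique.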

Page 61: Rearrangements and Duplications in Tumor Genomes

Human Cancer Genome Project

• What tumors to sequence?
• What to sequence from each tumor?
  1. Whole genome: all alterations
  2. Specific genes: point mutations
  3. Hybrid approach: structural rearrangements, etc.

Page 62: Rearrangements and Duplications in Tumor Genomes

Human Cancer Genome Project