Potentials and limits of haplotype trees in exploring population structure and pathogenicity of mutations Hans-Jürgen Bandelt (Hamburg) 17. Jahrestagung der Deutschen Gesellschaft für Humangenetik Heidelberg, 08.–11. März 2006


Potentials and limits of haplotype trees in exploring population structure and pathogenicity of mutations

Hans-Jürgen Bandelt (Hamburg)

17. Jahrestagung der Deutschen Gesellschaft für Humangenetik (17th Annual Meeting of the German Society of Human Genetics)

Heidelberg, 8-11 March 2006


Human mtDNA (from MITOMAP), with HVS-I (alias HVR1) marked


The perception of evolution as seen through the lenses of laboratories constitutes an overlay of two different processes:

Perceived evolution = Natural evolution (of the genome) + Artificial evolution (in the lab)


mtDNA and evolution

α: Natural evolution

Migrational processes (prehistory)


ML tree of basal African mtDNA haplogroups, coding-region variation displayed

Torroni et al. (TIG, June 2006)

[Figure: tree with coding-region mutations listed along the branches, sequences numbered 1 to 25 plus the rCRS, and a time axis in years (0; 50,000; 100,000; 150,000; 200,000). Branch labels include L0, L0a, L0a2, L1, L1c, L2, L3, L3a, L3b, L3c, L3d, L3e, L3f, L3h, L3i, L3x, L4, L5, L6, L7, M, M1, N, and R, together with alternative node names such as L3bd = L3bcd and L3ex = L3eix. Legend: dot symbol = Ethiopian samples.]


One of the first views of the East Asian mtDNA phylogeny (Ozawa, Herz 1994)

[Figure annotations: CRS; haplogroups R and M; "all mutations that distinguish haplogroups M and R (part of N)"; "incorrect rooting"]


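Star-burst of autochthonous mtDNA lineages in Eurasia (haplogroup N and its subhaplogroup R)

Palanichamy et al (Amer J Hum Genet, 2004)

[Figure: star-like tree of haplogroup N and its subhaplogroup R. Branch labels: U, pre-HV, JT, R1, R2, R5, R6, R7, R8, R9, R11, R30, R31, P, B, N1, N5, N9, A, S, O, W, X. Region labels: West Eurasia, South Asia, East Asia, Oceania. Mutations shown: 15607, 9140, 6755, 8404.]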


... and a massive burst in haplogroup M, as e.g. seen in India:

Sun et al (Mol Biol Evol, March 2006)


An Out-of-Africa model based on mtDNA analysis

Kivisild et al (Springer-Verlag, April 2006)


Sketch of the phylogeny of basal European mtDNA haplogroups

Torroni et al (TIG, June 2006)

[Figure: tree with nodes N, N1, N1a'I, N1a, I, N1b, N2, W, X, R, U, JT, R0 = pre-HV, R0a = (pre-HV)1, HV, HV0 = pre-V, HV0a, V, H, H1, H3]


Spatial frequency distributions of haplogroups H1, H3, V, and U5b reveal signature of post-LGM expansions

Torroni et al (TIG, June 2006)


mtDNA and evolution

β: Artificial evolution

Laboratory-specific processes (error and fraud)


Major sources of error in mtDNA sequence data

Artificial recombination: through contamination or sample mix-up (or targeting nuclear inserts of mtDNA)

Phantom mutations: sequencing errors at electrophoresis

Documentation errors: incurred by casual reading or writing


Impurifying selection is the driving force in artificial evolution, inasmuch as incorrect data are more flexible to interpret and can support sexy stories (seemingly told by DNA), which are then disseminated by high-impact-factor journals (e.g. Science and Nature).

Worst case: mtDNA in cancer research (Salas et al, PLoS Medicine 2005)


Case of mtDNA sample mix-up, misinterpreted as somatic mutations; data generated with MitoChip by Maitra et al (Genome Res, 2004)

Data re-analysis by Bandelt et al (J Med Genet, 2005)


A case of cross-over in the 672 human complete mtDNA sequences from Tanaka et al (2004)

[Figure: tree fragment relating the rCRS to haplogroups L3, N, R, R9, F, F1, F1a, F1a1, F1a1b, M7, and M7a, with sequences NDsq0015, NDsq0167, NDsq0168, and NDsq0178; beneath it, two genome bars running from position 1 to 16569 (ticks at 3000, 6000, 9000, 12000, 15000) with segments labeled A to F, shown together with the labels NDsq0168, NDsq0167, M7a, M7a2, and F1a1b.]


Prime example of a phantom mutation (Brandstätter et al, Electrophoresis 2005)


Electropherogram from Nasidze and Stoneking (2001), generated 1997/1998 and presented for the first time in Stoneking and Nasidze (Ann Hum Genet, 2006)

[Figure: electropherogram, with the rCRS labeled]


Phantom mutations can be found in excess in the HVS-I Caucasus data of Nasidze and Stoneking (2001).

In view of additional problems, this may be regarded as the worst data set ever published in the realm of molecular anthropology; see Bandelt and Kivisild (Ann Hum Genet 2006) for data re-analysis.


Sequences with phantom transitions at 16280-16281 in those Caucasus data

Code      Mutation (16000+)               Haplogroup
AR31      067 279G 280 281 355            HV1
AR483     069 126 145 280 281 367C        J
AZ2       280 281                         ?
AZ342     280 281 298                     pre-V
AZ6       154 168A 280 281 356 384        ?
CH444     111 214G 249 280 281 327 388    U1b
CH451     280 281 292                     ?
DAR23     129 223 278 280 281             ?
DAR36     258 280 281 384                 ?
KAB408    224 280 281 311                 K

This mutation pair has never been observed in >40,000 HVS-I sequences!
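
The check behind this slide can be sketched in a few lines of Python. This is only an illustration, assuming each haplotype is stored as a space-separated motif string as in the table; the names `CAUCASUS_ROWS`, `PHANTOM_PAIR`, and `carries_pair` are hypothetical and not taken from any published pipeline.

```python
# Haplotypes copied from the slide's table; positions are offsets from 16000.
CAUCASUS_ROWS = [
    ("AR31", "067 279G 280 281 355", "HV1"),
    ("AR483", "069 126 145 280 281 367C", "J"),
    ("AZ2", "280 281", "?"),
    ("AZ342", "280 281 298", "pre-V"),
    ("AZ6", "154 168A 280 281 356 384", "?"),
    ("CH444", "111 214G 249 280 281 327 388", "U1b"),
    ("CH451", "280 281 292", "?"),
    ("DAR23", "129 223 278 280 281", "?"),
    ("DAR36", "258 280 281 384", "?"),
    ("KAB408", "224 280 281 311", "K"),
]

# Transitions at 16280 and 16281 (hypothetical constant name).
PHANTOM_PAIR = {"280", "281"}


def carries_pair(motif, pair=PHANTOM_PAIR):
    """Return True if every site in `pair` is listed as a transition in the motif."""
    return pair.issubset(motif.split())


# Samples that carry both transitions.
flagged = [code for code, motif, _hg in CAUCASUS_ROWS if carries_pair(motif)]

# Distinct assigned haplogroups on which the pair shows up.
backgrounds = sorted({hg for _code, motif, hg in CAUCASUS_ROWS
                      if carries_pair(motif) and hg != "?"})

print(len(flagged), backgrounds)
```

The pair turning up on several unrelated haplogroup backgrounds within one data set, while absent from other collections, is the pattern the slide treats as a sign of artefact rather than of shared ancestry.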


Electropherogram presented by Stoneking and Nasidze (Ann Hum Genet, 2006)

rCRS


Phantom mutations in the HVS-I data of Plaza et al (Ann Hum Genet, 2003) (267 samples)

Sample       Mutation (16000+)                               Haplogroup
Algeria      279N 285N                                       ?
Andalusia    129 182C 183C 189 223 249 311 359 371           M1
Andalusia    129 281                                         ?
Andalusia    281                                             ?
Catalonia    093 192 270 281 290A 304 311                    U5b
Catalonia    224 281 311                                     K
Morocco      093 224 242 311 371                             K
Morocco      124 223 284C 285T 300 319 374T                  L2d
Morocco      126 187 189 223 264 270 278 293 311 371 374     L1b
Morocco      126 284C 292 294                                T2
Morocco      183C 189 223 278 382G                           X
Morocco      189 192 270 369T                                U5b
Saharawi     093 172 185 223 327 382G                        L3e1
Saharawi     172 281 311                                     U6?
Saharawi     189 382G                                        ?


Comparison with 1624 complete sequences stored in the mtDB database

Variation in 16279-16285: only 20 transitional variants at 16284

Variation in 16369-16389: only 1+1+6 transitional variants at 16371, 16380, and 16381
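
To set the Plaza et al haplotypes against these two windows, one can tally the listed variants by position. The following Python sketch assumes the motifs are transcribed from the Plaza et al table earlier in the talk; the helper names (`position`, `tally_by_window`) are hypothetical, and the mtDB counts themselves are not recomputed here.

```python
import re
from collections import Counter

# Motifs transcribed from the Plaza et al table (positions are 16000+).
PLAZA_ROWS = [
    ("Algeria", "279N 285N"),
    ("Andalusia", "129 182C 183C 189 223 249 311 359 371"),
    ("Andalusia", "129 281"),
    ("Andalusia", "281"),
    ("Catalonia", "093 192 270 281 290A 304 311"),
    ("Catalonia", "224 281 311"),
    ("Morocco", "093 224 242 311 371"),
    ("Morocco", "124 223 284C 285T 300 319 374T"),
    ("Morocco", "126 187 189 223 264 270 278 293 311 371 374"),
    ("Morocco", "126 284C 292 294"),
    ("Morocco", "183C 189 223 278 382G"),
    ("Morocco", "189 192 270 369T"),
    ("Saharawi", "093 172 185 223 327 382G"),
    ("Saharawi", "172 281 311"),
    ("Saharawi", "189 382G"),
]

# The two windows named on the slide, as inclusive 16000+ offsets.
WINDOWS = {"16279-16285": (279, 285), "16369-16389": (369, 389)}


def position(token):
    """Return the numeric site of a token such as '284C' or '093'."""
    return int(re.match(r"\d+", token).group())


def tally_by_window(rows, windows):
    """Count, per window, how often each site is reported as variant."""
    counts = {name: Counter() for name in windows}
    for _sample, motif in rows:
        for token in motif.split():
            pos = position(token)
            for name, (lo, hi) in windows.items():
                if lo <= pos <= hi:
                    counts[name][16000 + pos] += 1
    return counts


counts = tally_by_window(PLAZA_ROWS, WINDOWS)
for name, sites in counts.items():
    print(name, dict(sorted(sites.items())))
```

For the 15 listed haplotypes this gives 10 variant calls in the first window and 9 in the second, none of them at 16380 or 16381, which can then be read against the mtDB figures quoted above.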


Re-evaluation of the mtDNA data from the lab of Min-Xin Guan

Yao et al (Hum Genet, 2006)

[Figure: mtDNA tree drawn relative to the rCRS, with mutations listed along the branches of macrohaplogroups M, N, and R. Haplogroup labels include M8, CZ, C, D, D4, D4a, D4a1, D4b, D4b2, D4b2b, M10, M10a, M11, M11a, M11b, A, N9, N9a, N9a1, R9, F, F1, F3, F3b, B, B5, B5a, B5a2, pre-HV, HV, H, and H2. Samples numbered 1 to 18 are attributed to Qu 2005, Li 2004, Zhao 2004, Wang 2005, Yuan 2005, and Li 2005; other labeled sequences include BJ101 to BJ106, WZ4 to WZ6, WH6967, WH6980, QJ383, and Miao271. Legend: missing mutations; mis-scored mutations in red.]


Strategies of authors to deal with errors

1st: Publishing a corrigendum [rare event]

2nd: No correction, but avoiding similar errors in future work [common practice]

3rd: No action, and committing the same errors as before [e.g. as Min-Xin Guan and colleagues do]

4th: Fraudulent action: performing fake analyses and giving false statements [as done by Mark Stoneking and Ivane Nasidze in the Ann Hum Genet]


... only L strand, no H strand information shown!

Stoneking and Nasidze (2006)


Human Mitochondrial DNA and the Evolution of Homo sapiens

Series: Nucleic Acids and Molecular Biology, Vol. 18

Volume package: Human Mitochondrial DNA

Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)

2006, approx. 250 p., 31 illus., 2 in colour, hardcover

ISBN: 3-540-31788-0

Springer-Verlag

Due: April 2006