Il sequenziamento dei genomi sardi al CRS4 Francesco Cucca (University of Sassari and INN-CNR)

Francesco Cucca (University of Sassari and INN-CNR) presenting the Sardinian genome sequencing program at CRS4 (24 March 2010).

Page 1: Il sequenziamento dei genomi sardi al CRS4 Francesco Cucca (University of Sassari and INN-CNR)

Page 2:

• Humans and all other living organisms carry a digital blueprint: a linear sequence built from different combinations of four small chemical compounds, called nucleotides, which together make up their DNA.

• Particular combinations of nucleotides specify the key qualitative and quantitative instructions for synthesizing the essential structural and functional components of the cell, which are formed from different combinations of 20 molecules called amino acids.

• In turn, amino acids are linked to one another to form more complex molecules called proteins.

Page 3:
Page 4:
Page 5:

UUU Phe    UCU Ser    UAU Tyr     UGU Cys
UUC Phe    UCC Ser    UAC Tyr     UGC Cys
UUA Leu    UCA Ser    UAA STOP    UGA STOP
UUG Leu    UCG Ser    UAG STOP    UGG Trp

CUU Leu    CCU Pro    CAU His     CGU Arg
CUC Leu    CCC Pro    CAC His     CGC Arg
CUA Leu    CCA Pro    CAA Gln     CGA Arg
CUG Leu    CCG Pro    CAG Gln     CGG Arg

AUU Ile    ACU Thr    AAU Asn     AGU Ser
AUC Ile    ACC Thr    AAC Asn     AGC Ser
AUA Ile    ACA Thr    AAA Lys     AGA Arg
AUG Met    ACG Thr    AAG Lys     AGG Arg

GUU Val    GCU Ala    GAU Asp     GGU Gly
GUC Val    GCC Ala    GAC Asp     GGC Gly
GUA Val    GCA Ala    GAA Glu     GGA Gly
GUG Val    GCG Ala    GAG Glu     GGG Gly
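The codon table above is a lookup from three-nucleotide words to amino acids, so it can be sketched in a few lines of Python. This is only an illustration: the names `CODON_TABLE` and `translate` are made up for this example, and the short RNA string is hypothetical rather than a real gene.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the same order as the table
# (first, second, third base each cycling through U, C, A, G); '*' marks STOP.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "F": "Phe", "L": "Leu", "S": "Ser", "Y": "Tyr", "C": "Cys", "W": "Trp",
    "P": "Pro", "H": "His", "Q": "Gln", "R": "Arg", "I": "Ile", "M": "Met",
    "T": "Thr", "N": "Asn", "K": "Lys", "V": "Val", "A": "Ala", "D": "Asp",
    "E": "Glu", "G": "Gly", "*": "STOP",
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(rna):
    """Read an RNA string codon by codon until a STOP codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```

The 64 entries cover only 20 amino acids plus STOP, which is why several codons in the table map to the same amino acid.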

Page 6:

• While the basic composition of the DNA and protein building blocks, and the system that translates one chemical language into the other, are conserved, the order of these building blocks varies widely among organisms and individuals.

• This is because DNA and the proteins derived from it are not static entities. DNA is subject to a variety of heritable changes known as mutations.

• Mutations often arise as copying errors during DNA replication. Although the fidelity of DNA replication is strikingly high, misincorporation occurs at a certain frequency, known as the mutation rate.

Page 7:

• Modern humans originated ~100,000 years ago from pre-modern humans and form a relatively homogeneous species that has expanded dramatically during its recent evolutionary history.

• Any two unrelated human individuals are identical at about 99.9% of their DNA and thus differ at about 0.1%.

• This means that comparing the DNA of two unrelated individuals reveals approximately one difference every 1,000 nucleotides (our genome contains two copies of about 3.3 billion nucleotides each).
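As a quick check of the scale involved, the two figures quoted above can be combined. This is a back-of-the-envelope sketch, not a measured count, and the constant and function names are illustrative only.

```python
# Figures quoted on the slide.
GENOME_COPY_SIZE = 3_300_000_000    # ~3.3 billion nucleotides per genome copy
NUCLEOTIDES_PER_DIFFERENCE = 1_000  # ~0.1%, i.e. one difference every 1,000

def expected_differences(genome_size, spacing):
    """Expected number of differing positions between two genome copies."""
    return genome_size // spacing

per_copy = expected_differences(GENOME_COPY_SIZE, NUCLEOTIDES_PER_DIFFERENCE)
print(f"{per_copy:,} differences per genome copy")  # 3,300,000 differences per genome copy
```

So a difference of only 0.1% still corresponds to roughly three million differing positions per genome copy between two unrelated people.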

Page 8:

This genetic variation has important medical consequences:

In simple Mendelian traits, the relationship between the causal genetic variant (genotype) and the disease state is deterministic.

In a complex trait such as multiple sclerosis (MS), the disease state results from interactions between multiple genotypes and the environment. The influence of any individual causal allele tends to be modest, and the relationship between the causal variant and the disease state is probabilistic.

Page 9:

Figure panels: Quantitative trait; Qualitative trait

Page 10:

R. A. Fisher, 1890-1962

theoretical framework

Page 11:

POSSIBLE MEANINGS OF AN ASSOCIATION

PRIMARY ASSOCIATION WITH THE CAUSAL VARIANT

SECONDARY ASSOCIATION DUE TO PROXIMITY

SPURIOUS ASSOCIATION DUE TO POPULATION SUBSTRUCTURE
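The spurious case, association due to population substructure, can be illustrated with a small numeric sketch. All counts below are hypothetical and chosen so that the variant has no effect within either population; the function name `odds_ratio` is also just for this example.

```python
def odds_ratio(case_carriers, control_carriers, case_noncarriers, control_noncarriers):
    """Odds ratio of disease for carriers versus non-carriers of a variant."""
    return (case_carriers * control_noncarriers) / (control_carriers * case_noncarriers)

# Hypothetical counts, in the argument order of odds_ratio().
# Population A: variant common (80% carriers), disease common (20%), no true effect.
pop_a = (160, 640, 40, 160)
# Population B: variant rare (20% carriers), disease rare (5%), no true effect.
pop_b = (10, 190, 40, 760)
# Mixing the two populations without accounting for their origin.
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

print(odds_ratio(*pop_a))             # 1.0
print(odds_ratio(*pop_b))             # 1.0
print(round(odds_ratio(*pooled), 2))  # 2.36
```

Within each population the odds ratio is exactly 1, yet the pooled sample shows an apparent association, simply because the variant and the disease are both more frequent in the same population.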

Page 12:

Why a sequencing project?

Page 15:

The imperfect genome-wide search

• All gene-chip arrays contain SNPs chosen on the basis of the linkage disequilibrium (LD) observed in the HapMap populations, a catalogue of ~3 million SNPs genotyped in Europeans, Asians, and Africans.

• Studying a subset of 500,000 or 1 million SNPs is limiting.

Figure labels: Tested variant; Causative variant

Power to detect a disease association at a locus increases with the r² between the typed (tested) and untyped (causative) SNPs: the lower the r², the lower the power.
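The link between r² and power can be made concrete with a short sketch. It relies on a standard approximation: N individuals typed at the tested SNP carry roughly the information of N × r² individuals typed at the causative SNP. The haplotype counts and function names below are hypothetical.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic SNPs, computed from phased haplotypes.

    Each haplotype is a pair (a, b) of 0/1 alleles at the tested and the
    causative SNP. Assumes both SNPs are polymorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def effective_sample_size(n_samples, r2):
    """Approximate sample size 'seen' by the causative SNP through its tag."""
    return n_samples * r2

# Hypothetical sample of 100 haplotypes.
haps = [(1, 1)] * 40 + [(0, 0)] * 50 + [(1, 0)] * 10
r2 = r_squared(haps)
print(round(r2, 3), round(effective_sample_size(6000, r2)))
```

With these made-up counts r² is about two thirds, so a study of 6,000 people would behave roughly like one of 4,000 genotyped directly at the causative variant.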

Page 16:

Why a sequencing project in Sardinia?

Figure labels: Croatia, Ukraine, Hungary, Poland, Georgia, Sicily, Calabria, Turkey, Lebanon, Greece, Albania, North-Central Italy, Corsica, Andalusia, Basque Country, Catalonia, Sardinia

Page 17:

Why a sequencing project in Sardinia?

Figure labels: Morocco, Andalusian, Spanish Basques, French, Czech and Slovakian, Central-Northern Italian, Calabrian, Croatian, Greek, Macedonian, Polish, Ukrainian, Georgian, Turkish, Lebanese, Syrian, Saami, Mari, Udmurt, Dutch, Hungarian, Albanian

Page 18:

What samples to sequence in Sardinia?

•  ProgeNIA study

•  Case-Control studies

•  Future work

Page 19:

ProgeNIA

6,148 volunteers

Figure labels: Arzana, Ilbono, Elini, Lanusei

Page 20:

ProgeNIA/SardiNIA project

• 6,148 individuals, aged 14 to 102 years

• 95% are known to have all grandparents born in Sardinia

• 711 pedigrees, up to 5 generations deep

• Largest family: 625 phenotyped individuals

• >34,000 relative pairs

Pilia et al. PLoS Genet. 2006

Page 21:

> 150 quantitative traits

• Anthropometric measurements: height, weight, hip, waist, BMI

• Blood chemistry components: LDL, HDL, TG, insulin, RBC, MCH, MCV, bilirubin, hsCRP, MCP-1, IL-6, etc.

• Cardiovascular traits: HR, SBP, DBP, PP, PWV, IMT, QT, etc.

• Personality facets: Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness, etc.

• Cytokines: adiponectin, leptin, MCP-1, hsCRP, IL-6, V-CAM, AGE

• New traits will be added soon (immunological traits).

Page 22:

Case-control samples

• The special case of autoimmune diseases

Page 23:
Page 24:

Figure: numeric data labels only. *Adapted from EURODIAB

Page 25:
Page 26:

Figure: numeric data labels only. Pugliatti et al (EBC), Eur J Neurol 2006

Page 27:

Figure: three panels comparing patients and controls (axis scale 0 to 70)

Page 28:

How many samples to sequence?

• Is it necessary to sequence everyone who is analysed?

Page 29:

• Observed genotypes

• Inferred sharing of DNA stretches along the chromosome

• Missing genotypes inferred from chromosome sharing

Burdick et al. Nat. Genet. 2006

Chen and Abecasis AJHG 2008

Page 30:

1) Identify Match Among Reference

Individuals in study sample
. . A A . . . . . . . . A . . . . A . . .
. . G A . . . . . . . . C . . . . A . . .

Observed HapMap Chromosomes
C G A G A T C T C C T T C T T C T G T G C
C G A A A T C T C C C G A C C T C A T G G
C C A A G C T C T T T T C T T C T G T G C
C G A A G C T C T T T T C T T C T G T G C
C G A G A C T C T C C G A C C T T A T G C
T G G A A T C T C C C G A C C T C A T G G
C G A G A T C T C C C G A C C T T G T G C
C G A G A C T C T T T T C T T T T A T A C
C G A G A C T C T C C G A C C T C G T G C
C G G A G C T C T T T T C T T C T G T G C

Page 34:

2) Phase Chromosome

Individuals in study sample
. . A A . . . . . . . . A . . . . A . . .
. . G A . . . . . . . . C . . . . A . . .

Observed HapMap Chromosomes
C G A G A T C T C C T T C T T C T G T G C
C G A A A T C T C C C G A C C T C A T G G
C C A A G C T C T T T T C T T C T G T G C
C G A A G C T C T T T T C T T C T G T G C
C G A G A C T C T C C G A C C T T A T G C
T G G A A T C T C C C G A C C T C A T G G
C G A G A T C T C C C G A C C T T G T G C
C G A G A C T C T T T T C T T T T A T A C
C G A G A C T C T C C G A C C T C G T G C
C G G A G C T C T T T T C T T C T G T G C

Page 35:

3) Impute Missing Genotypes

Individuals in study sample
C G A A A T C T C C C G A C C T C A T G G
C G G A G C T C T T T T C T T T T A T G C

Observed HapMap Chromosomes
C G A G A T C T C C T T C T T C T G T G C
C G A A A T C T C C C G A C C T C A T G G
C C A A G C T C T T T T C T T C T G T G C
C G A A G C T C T T T T C T T C T G T G C
C G A G A C T C T C C G A C C T T A T G C
T G G A A T C T C C C G A C C T C A T G G
C G A G A T C T C C C G A C C T T G T G C
C G A G A C T C T T T T C T T T T A T A C
C G A G A C T C T C C G A C C T C G T G C
C G G A G C T C T T T T C T T C T G T G C
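The three steps shown on these slides (match, phase, impute) can be approximated with a deliberately simplified sketch. It assumes the study chromosome is already phased and fills its missing positions from the single reference haplotype that agrees with the most observed alleles. This whole-haplotype matching is a stand-in chosen for brevity, not the method used in the cited work, and the function name `impute_by_best_match` is hypothetical.

```python
# Reference haplotypes copied from the "Observed HapMap Chromosomes" panel.
REFERENCE = [
    "CGAGATCTCCTTCTTCTGTGC",
    "CGAAATCTCCCGACCTCATGG",
    "CCAAGCTCTTTTCTTCTGTGC",
    "CGAAGCTCTTTTCTTCTGTGC",
    "CGAGACTCTCCGACCTTATGC",
    "TGGAATCTCCCGACCTCATGG",
    "CGAGATCTCCCGACCTTGTGC",
    "CGAGACTCTTTTCTTTTATAC",
    "CGAGACTCTCCGACCTCGTGC",
    "CGGAGCTCTTTTCTTCTGTGC",
]

def impute_by_best_match(observed, reference):
    """Fill '.' positions from the best-matching reference haplotype.

    The best match is the reference row agreeing with the most observed
    alleles; ties go to the first such row. Observed alleles are kept.
    """
    def matches(ref):
        return sum(o == r for o, r in zip(observed, ref) if o != ".")

    best = max(reference, key=matches)
    return "".join(r if o == "." else o for o, r in zip(observed, best))

# First study chromosome from the slide ('.' marks an untyped position).
study = "..AA........A....A..."
print(impute_by_best_match(study, REFERENCE))  # CGAAATCTCCCGACCTCATGG
```

For the first study chromosome this reproduces the imputed sequence shown on the slide. The second imputed chromosome on the slide does not coincide with any single reference row, so a whole-haplotype match like this one cannot reproduce it; methods used in practice typically model each chromosome as a mosaic of reference segments.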

Page 36:

Recent updates

• We used whole-genome sequences of 52 Europeans available from the 1000 Genomes Project to infer ~6.6 million markers in the individuals typed with the higher-density chip...

• ...then, using imputation, we extended the 6.6 million markers to all individuals and performed a GWAS.

This:

• Provides fine mapping of previously discovered loci

• May reveal new loci that were poorly tagged by the previous set of SNPs

Page 37:

GWAS findings

Almost all loci detected by GWAS explain only a small fraction of the heritability.

The smaller the effect size, the larger the sample size required to maintain adequate power.

Trait     Heritability   So far explained
HbF       ~60%           ~17%
Height    ~80%           ~4%
BMI       ~40%           ~1%
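The relation between effect size and required sample size can be sketched with a common normal approximation for a quantitative-trait association test. Everything here is an assumption for illustration: the formula N ≈ (z_α + z_power)² (1 − q) / q for a variant explaining a fraction q of trait variance, the genome-wide threshold of 5e-8, the 80% power target, and the function name.

```python
import math
from statistics import NormalDist

def required_sample_size(variance_explained, alpha=5e-8, power=0.8):
    """Approximate N needed to detect a variant explaining a given
    fraction of trait variance (two-sided test, normal approximation)."""
    q = variance_explained
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired power
    return math.ceil((z_alpha + z_power) ** 2 * (1 - q) / q)

# Hypothetical effect sizes: 1%, 0.1% and 0.01% of trait variance.
for q in (0.01, 0.001, 0.0001):
    print(f"variance explained {q:.2%}: N ~ {required_sample_size(q):,}")
```

Under these assumptions the required sample size grows roughly tenfold each time the variance explained drops tenfold, which is the pattern the slide describes.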

Page 38:

Shankar Balasubramanian, David Klenerman

Page 39:

Page 40:

ProgeNIA Team Lanusei-Cagliari

Manuela Uda, Serena Sanna, Eleonora Porcu, Ilenia Zara, Carlo Sidore, Maristella Steri, Marco Masala, Gianmauro Cuccuru, Angelo Scuteri, Marco Orrù, Maria Grazia Pilia, Danilo Fois, Liana Ferreli, Francesco Loi

Monica Lai, Anna Cau, Barbara Deiana, Monica Balloi, Maria Grazia Piras, Gianluca Usala, Antonella Mulas, Andrea Maschio, Fabio Busonero, Sandra Lai, Mariano Dei

Laura Crisponi, Silvia Naitza, Caterina Flore, Simona Foddi

Giuseppe Pilia, creator and founder of the ProgeNIA project

Page 41:
Page 42:
Page 43:

Acknowledgements: Paolo Zanella, Chris Jones, Roman Tirler

Antonio Cao, Giuseppe Pilia

David Schlessinger, Goncalo Abecasis, John Todd