Genetic epidemiology principles relevant to epigenetics


  • Epigenetic Epidemiology 2012

    Genetic epidemiology – principles relevant to epigenetics

    Nic Timpson

  • Objectives:
    – To reconsider the measurement of complex biological systems
    – To consider epidemiology as a whole and the special nature of genetics within epidemiological analyses
    – To cover the epidemiological properties of genetic variation
    – To consider the principles in genetic epidemiology relevant to epigenetics in 2012

    You should be able to:
    – Place the study of epigenetics in the context of epidemiology broadly
    – Explain how the nature of epidemiological measurement/information varies
    – Describe the properties of genetic variation in particular
    – Transfer lessons learnt in the study of genetics to that of epigenetic data where relevant

  • For the human body is so designed by nature that the face, from the chin to the top of the forehead and the lowest roots of the hair, is a tenth part of the whole height; the open hand from the wrist to the tip of the middle finger is just the same; the head from the chin to the crown is an eighth, and with the neck and shoulder from the top of the breast to the lowest roots of the hair is a sixth; from the middle of the breast to the summit of the crown is a fourth. If we take the height of the face itself, the distance from the bottom of the chin to the under side of the nostrils is one third of it; the nose from the under side of the nostrils to a line between the eyebrows is the same; from there to the lowest roots of the hair is also a third, comprising the forehead. The length of the foot is one sixth of the height of the body; of the forearm, one fourth; and the breadth of the breast is also one fourth. The other members, too, have their own symmetrical proportions…. Then again, in the human body the central point is naturally the navel. For if a man be placed flat on his back, with his hands and feet extended, and a pair of compasses centred at his navel, the fingers and toes of his two hands and feet will touch the circumference of a circle described therefrom…. the outstretched arms, the breadth will be found to be the same as the height….

    “Vitruvian Man” Da Vinci c.1487

    Vitruvius De architectura c.15BC


    Measurements are proxies (most of the time…)

  • Epigenetics

    “epi”, from Greek: “above”. Historically, the word “epigenetics” was used to describe events that could not be explained by genetic principles. This was in the context of the “gene number” arguments and included seemingly “unrelated” processes, such as paramutation in maize, position effect variegation in the fruit fly, genomic imprinting and X-inactivation/Lyonisation…

    It is now a rapidly expanding field with the uncovering of common molecular mechanisms.

  • Epidemiology – “epi” & “demos” – should be considered in the general context of measurement…

    [Figure: schematic relating “epidemiology”, “genetic epidemiology” and “epigenetic epidemiology”. Labels: niche construction; natural selection; environmental proxy; genotype by history; environment; genotype; process regulation; intermediate; realised phenotype; proximal / distal; directly measured / measured by observation or best available tool; not easily translated / easily translated (clinical).]

  • Genetic Epidemiology: why special?

    – Properties of the measurement of genotype
    – Properties of the information contained within genotypes (leading on to Mendelian randomization, MR)
    – Requires some extra understanding of the mechanisms of variation and the genomic landscape of traits…


  • “Genetics is indeed in a peculiarly favoured condition in that Providence has shielded the geneticist from many of the difficulties of a reliably controlled comparison. The different genotypes possible from the same mating have been beautifully randomized by the meiotic process….. Generally speaking the geneticist, even if he foolishly wanted to, could not introduce systematic errors into the comparison of genotypes, because for most of the relevant time he has not recognized them.”

    Fisher RA. Statistical Methods in Genetics. Heredity (1952) 6, 1-12

    Importance of the apparent independence of heritable units within the human genome
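Fisher's point about meiotic randomization can be illustrated with a small simulation. The sketch below is not from the slides: the variable names, the effect size of 0.6, the allele frequency and the sample size are all assumptions chosen for illustration. It draws a hypothetical confounder, an environmental exposure that depends on that confounder, and a genotype drawn at random, then compares how strongly each is correlated with the confounder.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20000, allele_freq=0.3, seed=1):
    """Return (corr(exposure, confounder), corr(genotype, confounder))."""
    rng = random.Random(seed)
    confounder, exposure, genotype = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)            # hypothetical confounder (illustrative)
        e = 0.6 * c + rng.gauss(0, 1)  # exposure clusters with the confounder
        # Genotype as a count of 0, 1 or 2 alleles: two independent draws,
        # deliberately unrelated to the confounder (the "randomized" part).
        g = (rng.random() < allele_freq) + (rng.random() < allele_freq)
        confounder.append(c)
        exposure.append(e)
        genotype.append(g)
    return pearson(exposure, confounder), pearson(genotype, confounder)


r_exposure, r_genotype = simulate()
print(f"exposure vs confounder: r = {r_exposure:.3f}")
print(f"genotype vs confounder: r = {r_genotype:.3f}")
```

In this toy setting the exposure is clearly correlated with the confounder while the genotype is not, which is the property that later motivates using genotypes as proxies in MR. Real data can depart from this (for example through population structure), so the simulation shows the idealised case only.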


  • Genetic Epidemiology: as yet unsolved…

  • Principles from genetic epidemiology relevant for epigenetics

    (i) Candidate research versus genomewide approaches

    (ii) Array based methods and the importance of QC

    (iii) Properties of measurements – interpretation and use

    (iv) Sample sizes, power & replication

    (v) Collaboration, sharing of data & repositories

    (vi) Translation of effects & Integration of multiple data sources


  • (i) Candidate research versus genomewide approaches

    Hypothesis-driven approaches have been (or had been) the mainstay of both genetic epidemiology and epigenetic epidemiology. They are based on the measurement of specific genotypes (or methylation profiles) under reasonable hypotheses of gene effect.

    Criteria: biological plausibility, association strength, dose response, replication (Tabor, Risch & Myers, NRG 2002)

  • Loci reproducibly associated with type 2 diabetes (Oct. 2011)

    [Figure: loci plotted by year of discovery (1997, 1998-2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012??) against risk of diabetes (odds ratio, axis from 1.00 to 1.40). Legend: biologic candidate approach; positional cloning; hypothesis-free approach. Loci shown: SLC30A8, CDKAL1, IGF2BP2, CDKN2A, FTO, HHEX, WFS1, HNF1B, TCF7L2, KCNJ11, PPARG, MTNR1B, THADA, NOTCH2, CAMK1D, ADAM30, JAZF1, ADAMTS9, TSPAN8, KCNQ1, IRS1, HCCA2, SPRY2, HMGA2, TLE4, ZFAND6, RBMS1, C2CD4A, PRC1, BCL11A, HNF1A, GCK, ARAP1, KLF14, DGKB, GCKR, PROX1, ZBED3, UBE2E2, ADCY5, CDC123, PTPRD, SRR, TP53INP1.]

    Loci and effect sizes are from the DIAGRAM+ consortium. Slides courtesy of Paul Franks.

  • (ii) Array based methods and the importance of QC

    Sample based QC (DNA quality):
    – Overt and cryptic relatedness
    – Ethnicity
    – Missingness
    – Heterogeneity

    Variant based QC (e.g. SNPs):
    – Frequency
    – Basic expectations, e.g. Hardy Weinberg Equilibrium
    – Missingness (& possible biased patterns re. case/control status)
    – Data quality (as per plots!)

    Sample based QC (DNA quality):
    – Bisulfite conversion
    – Batch effects
    – Missingness

    Variant based QC (e.g. SNPs):
    – Noise/signal ratio
    – Probe concordance (although metrics to compare against??)
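Two of the variant-level checks listed for genotype data, Hardy Weinberg Equilibrium and missingness, are simple enough to sketch. The functions below are illustrative only: their names, the use of `None` to mark a missing call, and the example counts are assumptions, not something taken from the slides. The test shown is the textbook 1-degree-of-freedom chi-square comparison of observed and expected genotype counts.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) comparing observed genotype counts
    with the counts expected under Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        return 0.0                   # monomorphic variant: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def call_rate(genotypes):
    """Proportion of non-missing calls; None marks a missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


# Counts that match Hardy-Weinberg proportions exactly (allele frequency 0.6).
in_equilibrium = hwe_chi_square(360, 480, 160)

# Same allele frequency, but too few heterozygotes.
out_of_equilibrium = hwe_chi_square(400, 400, 200)

print(in_equilibrium, out_of_equilibrium)
print(call_rate([0, 1, None, 2]))
```

A statistic above 3.84 corresponds to p < 0.05 on 1 degree of freedom; in practice the exclusion threshold is a choice for the analyst, and array studies often use much stricter cut-offs because so many variants are tested.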

  • (iii) Properties of measurements – interpretation and use

    Genotypic data

    – Afforded the “luxury” of independent segregation and direct measurement
    – Beyond the natural properties of genotypic data, this allows for relatively simple statistical analysis and the possibility of informative inference (i.e. pathway dissection)
    – Some complications: linkage disequilibrium, genomic landscape, structure…

  • Novel Pathways

    [Figure: energy intake (kcal/day/BW). Church et al. Nature Genetics 2010;42:1086–1092]

  • Phenotypic heterogeneity


  • What happens if we look at the association between PC1 and our phenotype? Is this likely to be a problem (?)

    Other types of possible confounding ??

    Sampling frame
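The question about PC1 and the phenotype can be made concrete with a toy example of population stratification. This sketch is an illustration under stated assumptions, not the analysis behind the slide: a known subpopulation label stands in for PC1 (in real data PC1 would be estimated from genome-wide genotypes), and the two populations, their allele frequencies and their phenotype means are all invented. The phenotype never depends on genotype, yet the pooled data show an association.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_stratified(n_per_pop=10000, seed=7):
    """Return rows of (population label, genotype, phenotype)."""
    rng = random.Random(seed)
    # Hypothetical populations: (allele frequency, phenotype mean).
    pops = {"A": (0.2, 0.0), "B": (0.6, 1.0)}
    rows = []
    for label, (freq, mean) in pops.items():
        for _ in range(n_per_pop):
            g = (rng.random() < freq) + (rng.random() < freq)
            y = rng.gauss(mean, 1.0)  # phenotype is independent of genotype
            rows.append((label, g, y))
    return rows


rows = simulate_stratified()

# Ignoring structure: genotype and phenotype look associated.
pooled_r = pearson([g for _, g, _ in rows], [y for _, _, y in rows])

# Within each population the association disappears.
within_r = {
    label: pearson(
        [g for lab, g, _ in rows if lab == label],
        [y for lab, _, y in rows if lab == label],
    )
    for label in ("A", "B")
}

print(f"pooled r = {pooled_r:.3f}")
print({label: round(r, 3) for label, r in within_r.items()})
```

If a structure variable such as PC1 is associated with the phenotype, unadjusted genotype associations can reflect ancestry rather than biology; stratifying or adjusting for that variable is the usual response, as the within-population correlations suggest here.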


  • (iii) Properties of measurements – interpretation and use

    Assuming data are clean and reliable…

    Genotypic data
    – Afforded the “luxury” of independent segregation and direct measurement
    – Beyond the natural properties of genotypic data, this allows for relatively simple statistical analysis and the possibility of informative inference (i.e. pathway dissection)
    – Some complications: linkage disequilibrium, genomic landscape, structure…

    Epigenetic data
    [Diagram labels: Dietary folate, Methylation, Target expression, Phenotype; Diet; Socio-economic status; Smoking ???; Off target expression; Off target phenotype]

  • (iv) Sample sizes, power & replication


    Deducing “true numerical ratios” requires “the greatest possible number of individual values; and the greater the number of these the more effectively will mere chance be eliminated”.

    Gregor Mendel 1865/6

    Nature, June 7, 2007

    [Figure: y-axis % (0 to 20); x-axis odds ratio (1.05 to 1.95). Wan et al., HMG (2012)]
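Mendel's remark about needing "the greatest possible number of individual values" can be put in numbers with a rough power calculation. The sketch below is an assumption-laden illustration rather than a method from the slides: it uses a normal approximation to a two-sided allelic test, a per-allele odds ratio, and a significance threshold of 5e-8 (a commonly used genome-wide level, not stated in the source). The function names and example values are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def allelic_test_power(odds_ratio, control_freq, n_cases, n_controls,
                       alpha=5e-8):
    """Approximate power of a two-sided test comparing allele frequencies
    between cases and controls (normal approximation)."""
    std_normal = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(control_freq, odds_ratio)
    # Each person contributes two alleles, hence 2 * n in the variances.
    se = (p1 * (1 - p1) / (2 * n_cases)
          + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # The opposite tail is negligible and is ignored here.
    return std_normal.cdf(z_effect - z_crit)


# Same modest effect (odds ratio 1.1, allele frequency 0.3), two study sizes.
power_small = allelic_test_power(1.1, 0.3, n_cases=2000, n_controls=2000)
power_large = allelic_test_power(1.1, 0.3, n_cases=50000, n_controls=50000)

print(f"2,000 cases and controls:  power = {power_small:.4f}")
print(f"50,000 cases and controls: power = {power_large:.4f}")
```

With effects of the size seen for most common variants, a study of a few thousand people has almost no chance of reaching this threshold, while tens of thousands make detection very likely. That is the practical argument for large samples, replication and pooled analyses.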

  • (v) Collaboration, sharing of data & repositories


  • (vi) Translation of effects & Integration of multiple data sources

    Genomics, Transcriptomics, Metabolomics, Epigenomics

  • Overall…

    • Much to be learnt from genetic data, but there are clear differences in the properties of genetic and epigenetic data

    • A favoured standpoint is that of a united epi.demi.ology and recognition that these approaches are just different points of measurement on the same overarching scheme

    • Experiences from the handling of large scale genetic data will be valuable for the arrival of array based epigenetic data

    • Will need a convergence of techniques from:
      – Observational epidemiology
      – Array based, high throughput, molecular analysis


  • References
    - Baldwin, B. The date, identity and career of Vitruvius. Latomus 49, 425-434 (1990).
    - Zuk, O., Hechter, E., Sunyaev, S.R. & Lander, E.S. The mystery of missing heritability: Genetic interactions create phantom heritability. Proceedings of the National Academy of Sciences of the United States of America 109, 1193-8 (2012).
    - Maher, B. Personal genomes: The case of the missing heritability. Nature 456, 18-21 (2008).
    - Antequera, F. & Bird, A. Number of CpG islands and genes in human and mouse. Proceedings of the National Academy of Sciences of the United States of America 90, 11995-9 (1993).
    - McCarthy, M.I. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9, 356-69 (2008).
    - Fisher, R.A. Statistical methods in genetics. Heredity 6, 1-12 (1952).
    - Zeggini, E. et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nature Genetics 40, 638-645 (2008).
    - Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336-41 (2007).
    - WTCCC Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678 (2007).
    - Frayling, T.M. et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889-94 (2007).
    - Church, C. et al. A mouse model for the metabolic effects of the human fat mass and obesity associated FTO gene. PLoS Genetics 5, e1000599 (2009).
    - Gerken, T. et al. The obesity-associated FTO gene encodes a 2-oxoglutarate-dependent nucleic acid demethylase. Science 318, 1469-1472 (2007).
    - Timpson, N.J. et al. The FTO/obesity associated locus and dietary intake in children. American Journal of Clinical Nutrition 88, 971-978 (2008).
    - Cecil, J.E., Tavendale, R., Watt, P., Hetherington, M.M. & Palmer, C.N.A. An obesity-associated FTO gene variant and increased energy intake in children. New England Journal of Medicine 359, 2558-66 (2008).
    - Stratigopoulos, G. et al. Regulation of Fto/Ftm gene expression in mice and humans. AJP - Regulatory, Integrative and Comparative Physiology 294, R1185-1196 (2008).
    - Cauchi, S. et al. The genetic susceptibility to type 2 diabetes may be modulated by obesity status: implications for association studies. BMC Medical Genetics 9, 45 (2008).
    - Timpson, N.J. et al. Adiposity-related heterogeneity in patterns of type 2 diabetes susceptibility observed in genome-wide association data. Diabetes 58, 505-10 (2009).
    - Heath, S.C. et al. Investigation of the fine structure of European populations with applications to disease association studies. European Journal of Human Genetics 16, 1413-29 (2008).
    - Davey Smith, G. et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Medicine 4, e352 (2007).
    - Wan, E.S. et al. Cigarette smoking behaviors and time since quitting are associated with differential DNA methylation across the human genome. Human Molecular Genetics AOP (2012).
    - The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861 (2007).
    - Speliotes, E.K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genetics 42, 937-948 (2010).
    - Dupuis, J. et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nature Genetics 42, 105-16 (2010).
    - Tabor, H.K., Risch, N.J. & Myers, R.M. Candidate-gene approaches for studying complex genetic traits: practical considerations. Nature Reviews Genetics 3, 391-7 (2002).