Lecture 6 candidate gene association full

Lecture 7: Genetic Association - Candidate Gene Studies

Learning objectives
Primary
• Describe the type of statistical test used for association
• Explain population stratification & its effects
• Explain Hardy-Weinberg Equilibrium
Secondary
• Describe the differences between linkage and association
• Give one advantage & one disadvantage of case-control studies

Part 1: Different types of studies

• Linkage vs. Association
• Candidate gene vs. GWAS
• Case-control vs. Quantitative Trait

Linkage vs. Association
• Linkage
  – Localization is…?
  – Arise from…?
  – Sample requirements…?
• Association
  – Localization is…?
  – Arise from…?
  – Sample requirements…?

Linkage vs. Association… (ish)
• Linkage
  – Large effects
  – Low disease frequency
• Association
  – Small(er) effects
  – More common disease frequency

Needle vs. haystack

Linkage vs. Association… (ish)

[Figure: effect (y-axis) vs. frequency (x-axis), with regions labeled "Linkage analysis" and "Association studies"]

Common disease, common variant hypothesis

Candidate gene vs. GWAS
• Candidate gene
  – Small(er) number of variants
  – Selected a priori based on a hypothesis
  – Confirmatory
  – High(er) power
• GWAS
  – Large number of variants
  – Hypothesis-free or hypothesis-generating
  – Exploratory
  – Low(er) power

Candidate Gene studies
• Goal: characterize candidate genes and variants related to disease
• Not typically intended to “find genes,” generally begun after disease-related variants identified
• Assess generalizability of family-based observations (genetic heterogeneity)
• Assess importance of allelic variation at population level
• Identify modification of genetic association by environmental factors (GxE interaction)

Growth of the field of human genetics

[Figure: # variants studied (10s, 100s, 1000s, 1×10^5, 1×10^6, 10×10^6) by year (1980s, 1990s, 2000, 2007, 2010) for candidate genes, linkage, GWAS, and exome and whole-genome sequencing]

BUT….

Our genetic concepts evolve

Recognition of the problems with GWAS

Growth of the field of human genetics

[Figure: same plot of # variants by year as above]

BUT…. Candidate Genes

(More on) Study designs

1. Case-control
2. Quantitative Trait

Case-control studies
Advantages
• May be the only way to study rare diseases or those of long latency
• Existing records can occasionally be used if risk factor data collected independent of disease status
• Can study multiple etiologic factors simultaneously
• May be less time-consuming and expensive
• If assumptions met, inferences are reliable

Case-control studies
Disadvantages
• Relies on recall or records for information on past exposures; validation can be difficult or impossible
• Selection of appropriate comparison group may be difficult
• Multiple biases may give spurious evidence of association between risk factor and disease
• Usually cannot study rare exposures
• Temporal relationship between exposure and disease can be difficult to determine

“But,” they say, “This Is Genetics!” (you silly epidemiologist)

“This Is Different!”
• Genes are measured the same way in cases and controls
• Information on key exposure is easy to validate
• No recall or reporting involved
• Temporal relationship between genes and disease is a piece of cake

“BUT,” I SAY,
• Bias-free ascertainment of cases and controls is still a major concern; cases in most clinical series are unlikely to be representative
• Assessment of risk modifiers or gene-environment interactions is likely to be incomplete or flawed

A little more on quantitative traits…

Clinical Diagnosis of the disorder

[Figure: clinical diagnosis of the disorder, with symptom dimensions (hyperactivity, impulsivity, oppositionality, inattention), example items (“Does not follow instructions”, “Does not wait turn”), rater bias, and, in the second version of the slide, a biomarker]

Part 2: Conducting Genetic Association studies
• Analytic model
• Genotypic model
• Multiple testing

Analytic model: Case-control

• Does a particular allele show up more times in cases than controls?
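One common way to ask this question is a chi-square test on a 2×2 table of allele counts (cases vs. controls). The sketch below is only an illustration under that assumption: the function name and the counts are made up, and it uses the closed-form tail probability that holds for 1 degree of freedom.

```python
import math

def allelic_chi_square(case_a, case_A, ctrl_a, ctrl_A):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts."""
    table = [[case_a, case_A], [ctrl_a, ctrl_A]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + ctrl_a, case_A + ctrl_A]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: 100 cases and 100 controls, two alleles per person.
chi2, p = allelic_chi_square(case_a=90, case_A=110, ctrl_a=60, ctrl_A=140)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Here the 'a' allele makes up 45% of case alleles and 30% of control alleles, which is the kind of frequency difference the test is looking for.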

Analytic model: Quantitative trait

• Mean differences between genotypes

[Figure: plot with x-axis from -6 to 6 and y-axis from 0 to 1.2; labels: “More copies of ‘a’ allele” and “More copies of ‘A’ allele”]

Conducting genetic association

• Basically a “simple” regression:
$Y_i = \beta_0 + \beta_1 X_i + \dots + e_i$
where
$Y_i$ = trait value for individual i
$X_i$ = genotype for individual i
i.e., a test of mean differences, or a test of frequency differences
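As a rough illustration of the regression above, the sketch below fits an intercept and a genotype slope by ordinary least squares, with genotype coded as the number of copies of the 'a' allele. The function name and the six trait values are hypothetical, and no covariates are included.

```python
def simple_regression(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical sample: genotype = copies of the 'a' allele (0, 1, or 2).
genotypes = [0, 0, 1, 1, 2, 2]
trait = [9.0, 11.0, 19.0, 21.0, 29.0, 31.0]

intercept, slope = simple_regression(genotypes, trait)
print(f"intercept = {intercept:.1f}, slope per copy of 'a' = {slope:.1f}")
```

A slope different from zero means the trait mean shifts with each additional copy of the allele, which is the mean-difference test in regression form.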

Genotypic model

• Additive, or dominant / recessive predictor?
• Additive (most common?): AA vs. Aa vs. aa
  Phenotypic means: AA = 10, Aa = 20, aa = 30
• Recessive effect of the minor allele: aa vs. Aa / AA
  Phenotypic means: aa = 10, Aa = 15, AA = 15
• Dominant effect of the minor allele: aa / Aa vs. AA
  Phenotypic means: aa = 10, Aa = 10, AA = 15
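In practice, the choice of genotypic model comes down to how each genotype is turned into a number before testing. The sketch below shows one possible coding, assuming 'a' is the minor allele as in the examples above; the function name and the 0/1/2 coding scheme are illustrative choices, not something specified on the slides.

```python
def code_genotype(genotype, model, minor="a"):
    """Turn a two-letter genotype (e.g. 'Aa') into a numeric predictor."""
    copies = genotype.count(minor)  # number of minor alleles: 0, 1, or 2
    if model == "additive":
        return copies                    # AA=0, Aa=1, aa=2
    if model == "dominant":
        return 1 if copies >= 1 else 0   # aa / Aa vs. AA
    if model == "recessive":
        return 1 if copies == 2 else 0   # aa vs. Aa / AA
    raise ValueError(f"unknown model: {model}")

for g in ("AA", "Aa", "aa"):
    print(g, [code_genotype(g, m) for m in ("additive", "dominant", "recessive")])
```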

Multiple Testing

• Often test many SNPs in a gene, or many SNPs in many genes

• Bonferroni – very conservative, especially when tests are correlated (it treats them as if independent). Per-test threshold: p < α / (n tests)

• FDR – Correct for the expected number of false positives at your significance level.

FDR = expected (# false positive predictions/ # total positive predictions)
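To make the two corrections concrete, here is a minimal sketch. The slides do not name a specific FDR procedure, so the Benjamini-Hochberg step-up rule is used here as an assumption; the function names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold: alpha divided by the number of tests."""
    return alpha / n_tests

def benjamini_hochberg(p_values, q=0.05):
    """Indices of tests declared significant at FDR level q (BH step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its own threshold.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical p-values for five SNPs.
p_values = [0.001, 0.012, 0.025, 0.038, 0.20]

threshold = bonferroni_threshold(0.05, len(p_values))
bonferroni_hits = [i for i, p in enumerate(p_values) if p < threshold]
fdr_hits = benjamini_hochberg(p_values, q=0.05)
print("Bonferroni:", bonferroni_hits, "FDR:", fdr_hits)
```

With these made-up values Bonferroni keeps only the first SNP, while the FDR procedure keeps four, which illustrates why Bonferroni is described as conservative.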

Part 3: Interpreting your results

• Who do our results apply to?
• Is our SNP causal?
• Potential Problems:
  – Hardy-Weinberg Equilibrium
  – Population Stratification

Interpreting data

• What did we learn about heritability from our previous lectures?

Interpreting data
• What did we learn about SNPs from HapMap?

Indirect allelic association

[Figure: the disease variant has an underlying association with disease status; the genetic marker is associated with the disease variant (i.e., LD); the association between the genetic marker and disease status is due to both the underlying association and LD]

Population Stratification

Population stratification

Successful-Use-Of-Selected-Hand-Instruments gene (SUSHI)

Population stratification

Systematic differences in allele frequencies due to differences in sample ancestry, which can lead to either false positive or false negative findings.

Bouaziz et al, PLoS ONE, 2011
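A small worked example may help show how this produces a false positive. In the sketch below, two hypothetical populations differ in both allele frequency and disease prevalence, but within each population the allele has no relationship to disease. All numbers and names are invented for illustration.

```python
# Hypothetical populations: the allele is unrelated to disease within each one.
populations = [
    {"name": "pop1", "n": 1000, "allele_freq": 0.8, "prevalence": 0.20},
    {"name": "pop2", "n": 1000, "allele_freq": 0.2, "prevalence": 0.05},
]

def pooled_allele_freqs(pops):
    """Expected allele frequency among cases and controls when pops are pooled."""
    case_alleles = case_total = ctrl_alleles = ctrl_total = 0.0
    for pop in pops:
        cases = pop["n"] * pop["prevalence"]
        controls = pop["n"] - cases
        # Two alleles per person; same allele frequency in cases and controls.
        case_alleles += 2 * cases * pop["allele_freq"]
        case_total += 2 * cases
        ctrl_alleles += 2 * controls * pop["allele_freq"]
        ctrl_total += 2 * controls
    return case_alleles / case_total, ctrl_alleles / ctrl_total

case_freq, ctrl_freq = pooled_allele_freqs(populations)
print(f"pooled: cases = {case_freq:.3f}, controls = {ctrl_freq:.3f}")
```

The pooled sample shows a large case-control difference in allele frequency even though there is none inside either population, because cases are drawn disproportionately from the population where the allele is common.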

Real examples of population stratification
• Knowler et al.: a spurious inverse association between the immunoglobulin haplotype Gm and T2DM. The inverse association actually reflected the association between heritage and Gm.
• A1 allele at DRD2 and alcoholism. Ethnic differences in DRD2 allele frequencies: from 10% among Yemenite Jews to 80% among Cheyenne Indians; among the controls in the 11 studies reviewed by Gelernter et al., the frequencies ranged from 6 to 24% (10 to 37% among cases). The association disappears when population stratification is controlled for.

Controlling for population stratification
• Only use one population (remember Table 1?)
• Stratify by population
• TDT (within-family analysis; Allison et al., 1999): “the frequency with which the allele of interest is transmitted from a heterozygous parent to an affected child; significant deviation from the Mendelian expectation of 50% transmission, is taken as an indication of both association and linkage.”
• Difficult issue, no perfect solution.
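The TDT idea quoted above can be sketched as a test of whether transmissions from heterozygous parents depart from 50%. The version below uses the usual McNemar-style statistic, (b − c)² / (b + c) with 1 degree of freedom; the function name and the counts are hypothetical.

```python
import math

def tdt(transmitted, not_transmitted):
    """Transmission rate, chi-square (1 df), and p-value for the TDT.

    transmitted:     times a heterozygous parent passed the allele of interest
                     to an affected child
    not_transmitted: times the other allele was passed instead
    """
    total = transmitted + not_transmitted
    chi2 = (transmitted - not_transmitted) ** 2 / total
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return transmitted / total, chi2, p_value

# Hypothetical counts from 100 heterozygous parents.
rate, chi2, p = tdt(transmitted=65, not_transmitted=35)
print(f"transmission = {rate:.0%}, chi2 = {chi2:.1f}, p = {p:.4f}")
```

Because each comparison is made within a family, differences in ancestry between families do not enter the statistic.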

Hardy-Weinberg Equilibrium
• Hardy (a British mathematician) & Weinberg (a German physician)
• The frequency of alleles and genotypes in a population will remain constant from generation to generation
• Under 5 conditions:
  – A large breeding population
  – Random mating
  – No change in allelic frequency due to mutation
  – No immigration or emigration
  – No natural selection

• The allele for black coat is recessive to the allele for white coat. Can you count the number of recessive alleles in this population?

Hardy-Weinberg Equilibrium

• Definition: for 2 alleles with frequencies p and q,
  – p + q = 1
  – p = 1 - q
  – q = 1 - p

Hardy-Weinberg equilibrium

• Back to our Punnett square

        p      q
  p     p²     pq
  q     pq     q²

p² + 2pq + q² = 1

• The allele for black coat is recessive to the allele for white coat. Can you count the number of recessive alleles in this population?

• How many individuals are qq? What % of the population is this?
  N = 4, 25% (.25)
• What is the frequency of q?
  qq = .25, so q = √.25 = .5
• What is the frequency of p?
  q = .5, so p = 1 - .5 = .5
• What is the frequency of 2pq?
  2 * .5 * .5 = .5

1. In a certain population of 1000 fruit flies, 640 have red eyes while the remainder have sepia eyes. The sepia eye trait is recessive to red eyes. How many individuals would you expect to be homozygous for red eye color?

2. In the United States, one out of approximately 10,000 babies is born with PKU. Approximately what percent of the population are heterozygous carriers of the recessive PKU allele?

• In a certain population of 1000 fruit flies, 640 have red eyes while the remainder have sepia eyes. The sepia eye trait is recessive to red eyes. How many individuals would you expect to be homozygous for red eye color?

qq = 360 / 1000 = .36
q = .6
p = .4
pp = .4 * .4 = .16
.16 * 1000 = 160

• In the United States, one out of approximately 10,000 babies is born with PKU. Approximately what percent of the population are heterozygous carriers of the recessive PKU allele?

qq = .0001
q = .01
p = .99
2pq = 2 * .01 * .99 = .0198 = 1.98%
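The two practice problems follow the same recipe: take the recessive phenotype frequency as qq, take its square root to get q, then derive p and the genotype frequencies. A minimal sketch of that recipe, assuming Hardy-Weinberg equilibrium holds (the function and dictionary key names are illustrative):

```python
import math

def hwe_from_recessive(freq_recessive_phenotype):
    """Allele and genotype frequencies from the frequency of the qq phenotype."""
    q = math.sqrt(freq_recessive_phenotype)
    p = 1 - q
    return {"p": p, "q": q, "pp": p * p, "2pq": 2 * p * q, "qq": q * q}

# Problem 1: 360 of 1000 flies have sepia eyes (recessive).
flies = hwe_from_recessive(360 / 1000)
homozygous_red = flies["pp"] * 1000

# Problem 2: 1 in 10,000 babies has PKU (recessive).
pku = hwe_from_recessive(1 / 10000)
carrier_percent = pku["2pq"] * 100

print(f"homozygous red: {homozygous_red:.0f} flies")
print(f"PKU carriers: {carrier_percent:.2f}%")
```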

Hardy-Weinberg Disequilibrium

• Red flag for genotyping error
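One way to check for this red flag is to compare observed genotype counts with the counts expected under Hardy-Weinberg proportions, using a chi-square test with 1 degree of freedom. The sketch below assumes that approach; the function name and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of observed genotype counts against HWE."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of the A allele
    q = 1 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_AA, n_Aa, n_aa]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical SNP in equilibrium (p = 0.6): counts match p², 2pq, q².
chi2_ok, p_ok = hwe_chi_square(360, 480, 160)

# Hypothetical SNP with the same allele frequency but too few heterozygotes.
chi2_bad, p_bad = hwe_chi_square(460, 280, 260)

print(f"in equilibrium: chi2 = {chi2_ok:.2f}")
print(f"deficit of heterozygotes: chi2 = {chi2_bad:.1f}, p = {p_bad:.2e}")
```

A strong departure like the second case is worth checking against the genotyping assay before interpreting any association at that SNP.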

Learning objectives
Primary
• Describe the type of statistical test used for association
• Explain population stratification & its effects
• Explain Hardy-Weinberg Equilibrium
Secondary
• Describe the differences between linkage and association
• Give one advantage & one disadvantage of case-control studies

Part 4: Lab

• Lab: Are LRP-1 SNPs associated with obesity?
• Quantitative and a Qualitative analysis
