Genetic basis of disease and treatment
Jake Chen, Howard Edenberg, David Flockhart, Tatiana Foroud, Matthew Hahn, Jeesun Jung*, Dan Koller, Lang Li*, Sean D. Mooney*, Predrag Radivojac, Yuzhen Ye

Evaluation of Features for Catalytics Residue Prediction in Novel


Page 1: Evaluation of Features for Catalytics Residue Prediction in Novel

Genetic basis of disease and treatment

Jake Chen, Howard Edenberg, David Flockhart, Tatiana Foroud, Matthew Hahn, Jeesun Jung*, Dan Koller, Lang Li*, Sean D. Mooney*, Predrag Radivojac, Yuzhen Ye

Page 2: Evaluation of Features for Catalytics Residue Prediction in Novel

Introduction to Challenges in Modern Genetics

• Introductions
• Genome wide approaches
• Pharmacogenetics/genomics
• Genotype and phenotype databases
• Finding causative variation and understanding the biochemical function
• Translational approaches

Page 3: Evaluation of Features for Catalytics Residue Prediction in Novel

Introduction to Modern Genetics

• Although the era is post-genomic, genetic studies still use the fundamentals
– Genetic markers that co-segregate with disease are used to identify genetic elements associated with disease that are not present in unaffected individuals.
• Studies take advantage of either linkage or linkage disequilibrium to identify markers that are ‘close’ to the disease-causing genetic elements

Page 4: Evaluation of Features for Catalytics Residue Prediction in Novel

Linkage and Linkage Disequilibrium

• Genetic studies take advantage of two important (and different!) concepts:

– Genetic linkage occurs when alleles at loci that lie close together on the same chromosome are inherited jointly.

– Linkage disequilibrium is the non-random association of alleles at two or more loci in a population, not necessarily on the same chromosome

– Allele – one of the alternative forms of the sequence at a locus (position) on a chromosome

(source Wikipedia)
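
To make "non-random association of alleles" concrete, here is a minimal sketch of the usual pairwise LD measures for two biallelic loci. The function name and arguments are illustrative and do not come from the slides. It uses the common definitions D = p_AB − p_A·p_B, D′ = |D| / D_max, and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)), and assumes haplotype frequencies are already known (phased data).

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    # D is the excess of the AB haplotype over its expectation
    # under independence (linkage equilibrium).
    d = p_ab - p_a * p_b

    # D_max is the largest |D| possible given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared
```

With p_a = p_b = 0.5, a haplotype frequency of 0.25 gives D = 0 (equilibrium), while 0.5 gives D′ = 1 and r² = 1 (complete LD).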

Page 5: Evaluation of Features for Catalytics Residue Prediction in Novel

Genetic Study Design

[Figure: genetic study designs. Panels: families with multiple affected members; case and control (cases, affected; controls, unaffected); family-based association.]

Linkage: genetic markers co-segregate with disease

Association (Linkage Disequilibrium): association of alleles with disease status
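
As an illustration of "association of alleles with disease status" in a case-control design, the sketch below runs a Pearson chi-square test on a 2×2 table of allele counts. The slides do not prescribe this test; the function name and counts are hypothetical, and the code assumes every row and column total is non-zero.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are counts of allele A and allele B.
    Returns (chi2, p_value). Assumes all row and column totals are non-zero.
    """
    observed = ((case_a, case_b), (ctrl_a, ctrl_b))
    n = case_a + case_b + ctrl_a + ctrl_b
    row_totals = (case_a + case_b, ctrl_a + ctrl_b)
    col_totals = (case_a + ctrl_a, case_b + ctrl_b)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `allelic_chi_square(60, 40, 40, 60)` gives a statistic of 8 with a p-value a little under 0.005. In a genome-wide scan this test would be repeated for every SNP, which is why multiple-comparison control matters.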

Page 6: Evaluation of Features for Catalytics Residue Prediction in Novel

Complex disease genetic study

• Genome-wide linkage approaches
– Can detect effects of “unknown” genes (rare disease, rare variants)
– Family-based
– Modest statistical power
• Genome-wide association approaches
– Rare/common disease and rare/common variants
– Population-based or family-based
– High statistical power

Page 7: Evaluation of Features for Catalytics Residue Prediction in Novel

Complex disease genetic study Potential Problem

• Genome-screen linkage approaches
– After 10K SNPs, no additional information
– Markers in high LD can inflate the evidence of linkage.
– Linkage peak still relatively broad
• Genome-wide association approaches
– More powerful for genes with small or moderate effects
– Need large replication study to confirm effects
– False Discovery Rate for multiple comparisons
– Choice of SNPs (tag SNPs)
– Population stratification can cause false associations.
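
The "False Discovery Rate for multiple comparisons" item can be illustrated with the Benjamini-Hochberg step-up procedure, one common way to control the FDR. The slides do not say which procedure the authors have in mind, so treat this as a sketch with an illustrative function name. It assumes the p-values are independent or positively dependent.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of tests rejected at FDR level q (Benjamini-Hochberg)."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank

    # Reject every test ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])
```

Because the threshold grows with rank, the procedure is less conservative than a Bonferroni cut at q / m, which matters when hundreds of thousands of SNPs are tested at once.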

Page 8: Evaluation of Features for Catalytics Residue Prediction in Novel

Candidate vs. whole genome-wide association

• Candidate gene approaches

– Part of a biological system with suspected influence on disease

– Helps study gene pathways

• Whole genome approaches

– Whole genome using >100,000 SNPs

Page 9: Evaluation of Features for Catalytics Residue Prediction in Novel

Whole Genome-wide association

• In May 2007, 1M-SNP platforms will be distributed by Affymetrix
• Are we ready?
– Genotype-Phenotype Databases
– False Discovery Rate
– Population stratification – LD and allele frequency are different among populations.

Page 10: Evaluation of Features for Catalytics Residue Prediction in Novel

Challenge of genome-wide association

• Study Design
• False Discovery Rate
– Use of prior information (in the statistical sense): linkage signal
• Population-specific effect: LD and SNP frequency differ by population

[Figure: sequential (multi-stage) design versus simultaneous (single-stage) design. Labels: sample; 100K SNPs; steps 1, 2, 3.]

Page 11: Evaluation of Features for Catalytics Residue Prediction in Novel

Pharmacogenetics

• Pharmacogenetics (PG) is the study of how genetic variation causes variability in treatment response and outcome
• The NIH has funded a network of centers (PGRN) to study PG for many important drugs. Examples:
– Warfarin
– Tamoxifen
– Statins
– etc.

Page 12: Evaluation of Features for Catalytics Residue Prediction in Novel

Pharmacological concepts

• Pharmacokinetics (PK) – the study of the time course of substances and their relationship with an organism or system
– Informally, PK is what a body does to a drug
• Pharmacodynamics (PD) – the biochemical and physiological effects of drugs and the mechanisms of drug action and the relationship between drug concentration and effect
– Informally, PD is what a drug does to a body

(source wikipedia)
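
To show what "time course of a substance" means in practice, the sketch below uses a one-compartment model with first-order elimination after an intravenous bolus, C(t) = (dose / V)·exp(−k·t). This is a standard textbook simplification, not something taken from the slides, and the parameter names and values are hypothetical. In a pharmacogenetic setting, a variant in a metabolic enzyme could be thought of as changing the elimination rate constant k.

```python
import math


def concentration(dose_mg, volume_l, k_per_h, t_h):
    """Plasma concentration (mg/L) at time t_h after an IV bolus.

    One-compartment model with first-order elimination:
    C(t) = (dose / V) * exp(-k * t).
    """
    return (dose_mg / volume_l) * math.exp(-k_per_h * t_h)


def half_life(k_per_h):
    """Elimination half-life in hours for rate constant k (per hour)."""
    return math.log(2) / k_per_h
```

With a 100 mg dose, a 10 L volume of distribution, and k = 0.1 per hour, the initial concentration is 10 mg/L and it halves after `half_life(0.1)` hours (about 6.9 h). Halving k, as a slow-metabolizer genotype might, doubles the half-life.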

Page 13: Evaluation of Features for Catalytics Residue Prediction in Novel

Pharmacogenetics

[Figure: pharmacogenetics overview. Labels: drug concentration (PK); clinical endpoints; metabolic enzymes and transporters; receptors; candidate genes for disease predisposition; environmental effects (drug-drug interaction, food, …); disease status, co-medications, physiology status, ….]

Page 14: Evaluation of Features for Catalytics Residue Prediction in Novel

Cell Line Level PG Approach

• For example, CEPH lymphoma samples can be treated with different agents.

– Genetic data from HapMap data (genotypes)

– Individual protein and mRNA expression and activities can be assayed. They can serve as intermediate phenotypic data for evaluating PG effects.

Page 15: Evaluation of Features for Catalytics Residue Prediction in Novel

Animal Model PG Approach

• Informatic approach that uses knowledge of the genotypes of mouse strains.

• 40 Mouse Strain Sequences, i.e. genotypes (Roche).

• Phenotypic data can be measured at the drug target tissue level, and the association is done in silico

Page 16: Evaluation of Features for Catalytics Residue Prediction in Novel

Clinical PG Study

• Genotype Selection:
– validated in cell line or animal models
– pathways
– function
– LD
• Phenotype Selection:
– PK
– intermediate biomarkers
– clinical efficacy
– clinical side effects

Page 17: Evaluation of Features for Catalytics Residue Prediction in Novel

Challenges

• Functional mutation prediction (connecting association with causation)

• Sensible intermediate phenotype selection

• Gene-gene interaction

• Gene-environmental interaction

Page 18: Evaluation of Features for Catalytics Residue Prediction in Novel

This is creating genetic data faster than we are creating genetic databases

• The genetics community has not succeeded in coalescing around a database-driven informatics solution to genetic data

– There is no centralized database for genetic data!

– There is no defined approach for annotating and representing phenotype!

– There is no centralized database for genotype-phenotype associations

Page 19: Evaluation of Features for Catalytics Residue Prediction in Novel

Genotype and Genotype-Phenotype Databases Are Growing But Disparate

• Genotype:
– dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/)
• Genotype-Phenotype:
– OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM)
– HGVBase (http://hgvbase.cgb.ki.se/)
– HGMD (http://hgmd.org/)
– Locus Specific Databases – AR, BRCA1, P53
– PharmGKB (http://www.pharmgkb.org/) – Pharmacogenetics data
– Coming soon: GAIN, GWAS (GAPdb?)

Page 20: Evaluation of Features for Catalytics Residue Prediction in Novel

Database challenges

• Directly linking clinical and basic research data

• Informatic challenges:
– Representing phenotype and disease
– Data sharing, consent, and policy issues
– Managing extremely large datasets
– Using literature and ‘-omics’ data to identify promising candidate genes

Page 21: Evaluation of Features for Catalytics Residue Prediction in Novel

Connecting association to biochemical function

http://snp.ims.u-tokyo.ac.jp/samplesMethods.html#SNP

Page 22: Evaluation of Features for Catalytics Residue Prediction in Novel

Decision tree for polymorphism analysis

From: Bioinformatics for Geneticists, Barnes and Gray, eds. (Wiley)

Page 23: Evaluation of Features for Catalytics Residue Prediction in Novel

MutDB: Top genes since last October

• The top genes reflect what people are interested in:
– BRCA1
– CFTR
– AR
– TP53
– CYP2D6

Page 24: Evaluation of Features for Catalytics Residue Prediction in Novel

Canned Discussion Questions

• Genome-wide association studies are going to begin collecting data at an incredible rate (1M SNPs × N). How do we cope?

• Analysis approaches in gene/gene and gene/environmental interactions in either linkage based or random population based studies.

• Integration of SNP data and other -omics data.
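
A back-of-the-envelope calculation gives a feel for the scale behind the first two questions. The sketch below assumes a genotype can be packed into 2 bits (three genotype states plus a missing code) and takes N to be the number of genotyped individuals. Both are assumptions for illustration, as are the function names.

```python
import math


def genotype_matrix_bytes(n_snps, n_samples, bits_per_genotype=2):
    """Bytes needed to store a dense genotype matrix at the given packing."""
    total_bits = n_snps * n_samples * bits_per_genotype
    return (total_bits + 7) // 8  # round up to whole bytes


def pairwise_interaction_tests(n_snps):
    """Number of distinct SNP pairs in an exhaustive gene-gene scan."""
    return math.comb(n_snps, 2)
```

Under these assumptions, 1M SNPs on 10,000 individuals pack into 2.5 billion bytes (about 2.5 GB), which is manageable. An exhaustive pairwise interaction scan over 1M SNPs, however, involves roughly 5 × 10^11 tests, so the analysis burden grows much faster than the storage burden.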