Issues in Bayesian Tree Modeling of Clinical and Gene Expression Data


Page 1: Predictive sub-typing of subjects Retrospective and prospective studies Exploration of clinico-genomic data Identify relevant gene expression patterns

• Predictive sub-typing of subjects

• Retrospective and prospective studies

• Exploration of clinico-genomic data

• Identify relevant gene expression patterns

Issues in Bayesian Tree Modeling of Clinical and Gene Expression Data

Page 2

Current Areas of Application

Breast Cancer

• Lymph node status

• Disease recurrence

Ovarian Cancer

• Tumor location

Page 3

Lymph Node Involvement Is a Key Breast Cancer Risk Factor

But lymph node dissection also carries morbidity and inaccuracy

Page 4

Identifying Metagenes Associated With Lymph Node Status

[Figure; axis labels: Tumor Sample, Gene]

Page 5

Metagenes / Expression Signatures

• Dimension reduction: Signal improvement

• Clustering

• Singular value decomposition

• Empirical or model-based factor analysis

• Characterize patterns in data
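One way to read the clustering and singular value decomposition bullets together is that a metagene is the dominant singular factor of the expression values within one gene cluster. The sketch below shows that reading using only the Python standard library. The function name `metagene`, the row-centering step, and the power-iteration routine are illustrative assumptions and are not taken from the slides.

```python
import math
import random


def metagene(block, iters=100, seed=0):
    """Dominant right singular vector of a genes-by-samples block.

    `block` holds one row of expression values per gene in a single
    cluster; the result has one entry per tumor sample.
    """
    n_samples = len(block[0])
    # Assumption: center each gene so the factor tracks co-variation
    # across samples rather than each gene's mean level.
    centered = []
    for row in block:
        mean = sum(row) / n_samples
        centered.append([value - mean for value in row])

    # Power iteration on A^T A, started from a seeded random vector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(iters):
        # u = A v (one entry per gene), then w = A^T u (one per sample).
        u = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(centered[g][s] * u[g] for g in range(len(centered)))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n_samples  # no variation in this block
        v = [x / norm for x in w]

    # Singular vectors are defined up to sign; align with the cluster mean.
    mean_profile = [sum(col) / len(centered) for col in zip(*centered)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v


# Three genes sharing one rising pattern across four samples.
cluster = [[1, 2, 3, 4], [2, 4, 6, 8], [0.5, 1, 1.5, 2]]
signature = metagene(cluster)
```

The returned vector has unit length and gives one summary value per sample, which can then serve as a single candidate predictor in place of the individual genes.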

Page 6

Gene Clustering

Page 7

Gene Clustering (cont’d)

Page 8

Factor extraction (SVD)

Page 9

Differential Gene Expression

Page 10

Differential Gene Expression (Threshold 1)

Page 11

Differential Gene Expression (Threshold 2)

Page 12

Differential Gene Expression (Threshold 3)

Page 13

Nonlinear Expression

Page 14

Nonlinear Expression (Threshold 1)

Page 15

Nonlinear Expression (Threshold 2)

Page 16

Lymph Node Metastasis Metagenes

Page 17

Ovarian Tumor Site Genes

Page 18

Statistical Tree Models for Clinico-Genomic Prediction

• Regression trees: Non-linear, interactions

• Recursive partitioning

• Retrospective studies

• Many trees: Model uncertainty

• Predictions average across trees
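A small sketch of averaging predictions across trees may make the last two bullets concrete. It assumes each tree returns Pr(Y = 1) for a subject and that trees are weighted in proportion to their likelihoods. The helper names `stump` and `forest_predict`, the one-split trees, and all numeric values are hypothetical.

```python
import math


def stump(threshold, p_low, p_high):
    """A one-split tree: Pr(Y=1) depends on which side of the threshold x falls."""
    return lambda x: p_low if x <= threshold else p_high


def forest_predict(trees, log_likelihoods, x):
    """Average the trees' predictions with weights proportional to likelihood."""
    top = max(log_likelihoods)  # subtract the maximum for numerical stability
    weights = [math.exp(ll - top) for ll in log_likelihoods]
    total = sum(weights)
    return sum(w * tree(x) for w, tree in zip(weights, trees)) / total


# Two equally supported trees that split the same predictor at different points.
trees = [stump(0.0, 0.2, 0.8), stump(1.0, 0.2, 0.8)]
log_liks = [-10.0, -10.0]

low = forest_predict(trees, log_liks, -1.0)    # both trees say 0.2
middle = forest_predict(trees, log_liks, 0.5)  # trees disagree: 0.8 and 0.2
high = forest_predict(trees, log_liks, 2.0)    # both trees say 0.8
```

Between the two thresholds the averaged prediction sits halfway between the two leaf values, so the forest gives a graded boundary where a single tree would give a hard step.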

Page 19

Binary Outcomes: Retrospective Sampling

$\Pr(Y = 1 \mid x_1, \ldots, x_p)$

[Figure; panel labels: LN +, LN +]

Page 20

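Binary Outcomes: Prospective Inference from Retrospective Model

$\pi = \Pr(Y = 1 \mid x_1 \le \tau_1, \ldots, x_k \le \tau_k)$

$a_{1,y} = \Pr(x_1 \le \tau_1 \mid Y = y), \quad y = 0, 1$

$a_{2,y} = \Pr(x_2 \le \tau_2 \mid x_1 \le \tau_1, Y = y), \quad y = 0, 1$

$\dfrac{\pi}{1 - \pi} = \dfrac{a_{1,1}\, a_{2,1}}{a_{1,0}\, a_{2,0}} \cdot \dfrac{\Pr(Y = 1)}{\Pr(Y = 0)}$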
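The odds relation on this slide follows from Bayes' theorem: the prospective odds of Y = 1 in a leaf equal the prior odds times the ratio of the retrospective branch probabilities multiplied along the path. The sketch below shows that arithmetic. The function name `prospective_probability` and the numeric inputs are illustrative assumptions, not values from the study.

```python
def prospective_probability(a_case, a_control, prior_case):
    """Convert retrospective branch probabilities into Pr(Y=1 | leaf).

    a_case[i]    plays the role of a_{i+1,1}: Pr(branch i taken | earlier branches, Y=1)
    a_control[i] plays the role of a_{i+1,0}: the same probability given Y=0
    prior_case   is Pr(Y=1) in the population of interest
    """
    numerator = prior_case
    denominator = 1.0 - prior_case
    for a1, a0 in zip(a_case, a_control):
        numerator *= a1
        denominator *= a0
    if numerator + denominator == 0.0:
        raise ValueError("leaf has zero probability under both outcomes")
    return numerator / (numerator + denominator)


# Hypothetical two-split path: cases follow it more often than controls.
pi = prospective_probability(a_case=[0.8, 0.5], a_control=[0.4, 0.25], prior_case=0.3)
odds = pi / (1.0 - pi)
```

With these inputs the odds are (0.8 × 0.5 × 0.3) / (0.4 × 0.25 × 0.7) = 0.12 / 0.07, so `pi` is 12/19. The prior Pr(Y = 1) has to be supplied from outside the retrospective sample, since the case and control counts in that sample are fixed by design.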

Page 21

Binary Outcomes: Retrospective Model

• Model conditionals for predictors: $F(x_k \mid x_1 \le \tau_1, \ldots, x_{k-1} \le \tau_{k-1}, Y = y)$

• Nonparametric Bayes: Dirichlet model

• Modeling in x space – joint structure

• Implies Beta priors on $a_{i,y}$
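The step from a Dirichlet model to Beta priors can be illustrated as follows. If the distribution function F of a predictor has a Dirichlet process prior with total mass M and base distribution F0, then F evaluated at a threshold τ is Beta(M·F0(τ), M·(1 − F0(τ))), and observed counts on either side of τ update it conjugately. In this sketch the standard normal base distribution, the total mass, and the counts are assumptions chosen only for illustration.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Base distribution F0 (assumed standard normal for this sketch)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def beta_prior_from_dirichlet(tau, total_mass, base_cdf=normal_cdf):
    """Beta parameters of F(tau) when F has a Dirichlet process prior."""
    p0 = base_cdf(tau)
    return total_mass * p0, total_mass * (1.0 - p0)


def beta_update(prior, n_below, n_above):
    """Conjugate update with counts at or below and above the threshold."""
    alpha, beta = prior
    return alpha + n_below, beta + n_above


def beta_mean(params):
    alpha, beta = params
    return alpha / (alpha + beta)


# Threshold at the base median with total mass 2 gives a uniform Beta(1, 1) prior.
prior = beta_prior_from_dirichlet(tau=0.0, total_mass=2.0)
# Hypothetical node: among the Y = y samples, 7 fall at or below tau and 3 above.
posterior = beta_update(prior, n_below=7, n_above=3)
```

Here `posterior` is Beta(8, 4), with mean 2/3. One such Beta distribution is maintained per node and per outcome class y.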

Page 22

Growing Binary Trees

Node split:

• Each candidate predictor:threshold pair

• 2x2 table: 2 Bernoullis, fixed columns (Y=0/1)

• Assess and select split, or stop

• Conservative Bayesian tests

Multiple trees:

• Multiple splits at any node
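One possible form of the split assessment is a Bayes factor on the 2x2 table. It compares a model in which the two outcome columns have different probabilities of falling at or below the threshold against a model in which they share one probability. The sketch below assumes independent Beta(alpha, beta) priors, the cell convention n_{row, column} with row 0 meaning "at or below the threshold", and a cutoff of 3 on the Bayes factor. All three are assumptions, and the slides do not specify the exact test used.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_bayes_factor(n00, n01, n10, n11, alpha=1.0, beta=1.0):
    """Log Bayes factor for 'columns differ' versus 'columns share one rate'.

    Column Y=0 holds n00 (at or below threshold) and n10 (above);
    column Y=1 holds n01 and n11. Column totals are treated as fixed,
    so the binomial coefficients cancel between the two models.
    """
    prior = log_beta(alpha, beta)
    split = (log_beta(alpha + n00, beta + n10) - prior
             + log_beta(alpha + n01, beta + n11) - prior)
    no_split = log_beta(alpha + n00 + n01, beta + n10 + n11) - prior
    return split - no_split


def should_split(n00, n01, n10, n11, threshold=3.0):
    """Conservative rule: split only if the Bayes factor exceeds the cutoff."""
    return log_bayes_factor(n00, n01, n10, n11) > math.log(threshold)


separated = should_split(18, 3, 2, 17)     # columns behave very differently
balanced = should_split(10, 10, 10, 10)    # no evidence for a split
```

Each candidate predictor:threshold pair at a node would be scored this way. The node stops growing when no candidate passes the cutoff, and several passing candidates at one node give rise to multiple trees.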

Page 23

Inference with Many Binary Trees

Within-tree inference & prediction:

• Sequences of beta posteriors for $a_{i,y}$

• Simulate: Impute Pr(Y=1|leaf)

Multiple trees:

• Likelihood across trees

• Average predictions across trees

• Model (predictor:threshold) uncertainty

• “Smoothing” classification boundaries
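The within-tree simulation step could be sketched as follows: draw each branch probability from its Beta posterior along the path to a leaf, convert every draw to Pr(Y = 1 | leaf) through the odds relation, and summarize the draws. The function name, the uniform Beta(1, 1) priors, the counts, and the prior Pr(Y = 1) are hypothetical.

```python
import random


def simulate_leaf_probability(path_counts, prior_case, draws=5000, seed=0,
                              alpha=1.0, beta=1.0):
    """Monte Carlo draws of Pr(Y=1 | leaf) for one tree.

    path_counts lists, for each split on the path to the leaf,
    ((in_case, out_case), (in_control, out_control)): how many Y=1 and
    Y=0 samples at that node follow the path and how many leave it.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        numerator = prior_case
        denominator = 1.0 - prior_case
        for (in1, out1), (in0, out0) in path_counts:
            # One draw of the branch probability for each outcome class.
            numerator *= rng.betavariate(alpha + in1, beta + out1)
            denominator *= rng.betavariate(alpha + in0, beta + out0)
        samples.append(numerator / (numerator + denominator))
    return samples


# Hypothetical single-split path: 16 of 20 cases but only 5 of 20 controls follow it.
draws = simulate_leaf_probability([((16, 4), (5, 15))], prior_case=0.5)
estimate = sum(draws) / len(draws)
```

The spread of `draws` reflects uncertainty in the leaf probability within a single tree. Averaging such estimates across trees then adds the uncertainty about which predictor:threshold pairs define the tree.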

Page 24

Binary Outcome: Lymph Node Metastasis

[Figure; axis labels: Tumor Sample, Gene]

Predictive trees:

• Nonparametric Bayes

• Metagene expression

• Retrospective sampling

Lancet 2003 (Huang, West et al.)

Page 25

Predicting Lymph Node Status With Metagenes

[Figure: out-of-sample cross validation; axes: Sample, Probability of LN+; groups: LN+, LN-]

Page 26

Forests of Clinico-Genomic Trees

Select from potential clinical and genomic predictor variables

multiple trees

variable combination – co-occurrence

multiple subtypes

Page 27

… With Metagenes and Clinical Predictors

[Figure: out-of-sample cross validation; axes: Sample, Probability of LN+; groups: LN+, LN-]

Page 28

Lymph Node Clinico-Genomic Predictors

Page 29

Predicting Ovarian Tumor Site

[Figure: out-of-sample cross validation; axes: Sample, Probability of Omentum; groups: Omentum, Ovary]

Page 30

Gene Identification

• Implicated metagenes – gene subsets

• Genes correlated with key metagenes

Breast Cancer – nodal metastasis:

• Interferon pathway/inducible gene subset

• Interferons mediate anti-tumor response

Evidence of dysfunction of normal anti-tumor response?

Ovarian Cancer – tumor site:

• Growth regulatory pathway/inducible gene subset

Evidence of dysfunction of normal cell growth?

Page 31

Ongoing Research

• Stochastic search (sequential, annealing)

• Representation of tree ‘forest’

• Metagene definition/creation

• Cluster implementation of tree models

Page 32

Computational & Applied Genomics Program

Joseph Nevins, Mike West, Erich Huang, Ed Iversen, Holly Dressman

Duke University

Koo Foundation-Sun Yat Sen Cancer Center

Andrew Huang, Skye Cheng, Mei-Hua Tsou

http://www.isds.duke.edu/~jennifer/

Department of Obstetrics and Gynecology

John Lancaster, Andrew Berchuck

Page 33

Growing Binary Trees (2x2)

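$$\begin{array}{c|cc} & Y = 0 & Y = 1 \\ \hline x_k \le \tau_k & n_{00} & n_{01} \\ x_k > \tau_k & n_{10} & n_{11} \end{array}$$

Margin totals: $N_0$, $N_1$; overall total: $N$

$P(x_k \le \tau_k \mid x_1 \le \tau_1, \ldots, x_{k-1} \le \tau_{k-1}, Y = 0) \overset{?}{=} P(x_k \le \tau_k \mid x_1 \le \tau_1, \ldots, x_{k-1} \le \tau_{k-1}, Y = 1)$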