Causal Graphical Models in Systems Genetics

2013 Network Analysis Short Course - UCLA Human Genetics

Elias Chaibub Neto and Brian S Yandell

July 17, 2013

Motivation and basic concepts

Motivation

Suppose the expression of gene G is associated with a clinical phenotype C.

We want to know whether G → C, G ← C, or G ↔ C.

We cannot distinguish between these models using the data alone, since

f(G) f(C | G) = f(G, C) = f(C) f(G | C),

so their likelihood scores are identical.
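The factorization above can be checked numerically. The following is a minimal sketch, assuming Gaussian linear models fitted by least squares (an assumption made for illustration; the slide does not fix a model family). The simulated data, effect size and the helper name `gaussian_loglik` are hypothetical.

```python
import numpy as np

def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of the linear regression of y on X.
    X must already contain an intercept column."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n  # maximum likelihood variance estimate
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

rng = np.random.default_rng(0)
n = 200
G = rng.normal(size=n)              # hypothetical expression trait
C = 0.8 * G + rng.normal(size=n)    # hypothetical clinical trait

ones = np.ones(n)
intercept_only = ones[:, None]

# G -> C : f(G) f(C | G)
ll_G_to_C = gaussian_loglik(G, intercept_only) + gaussian_loglik(C, np.column_stack([ones, G]))
# G <- C : f(C) f(G | C)
ll_C_to_G = gaussian_loglik(C, intercept_only) + gaussian_loglik(G, np.column_stack([ones, C]))

print(ll_G_to_C, ll_C_to_G)  # the two scores agree up to rounding error
```

Under these assumptions both directions give the same maximized log-likelihood, so no likelihood-based score can prefer one over the other.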

Schadt et al. (2005)

However, if G and C map to the same QTL, we can use genetics to infer the causal ordering among the phenotypes.

[Figure: LOD profile (x-axis: Chromosome; y-axis: lod) with a QTL Q, shown alongside three candidate graphs over Q, G and C labeled causal, reactive and independent.]

Schadt et al. (2005)

For a drug company, it is important to determine which genes are causal and which genes are reactive, since:

- Causal genes have the potential to become drug targets.

- Reactive genes are of lesser interest.

Genetics and causal inference

The integration of genetics and phenotype data allows us to infer causal relations between phenotypes for two reasons:

1. In experimental crosses, the association of a QTL and a phenotype is causal.

2. A causal QTL can be used to determine the causal order between phenotypes using the concept of conditional independence.

Causal relations between QTLs and phenotypes

In experimental crosses, the association of a QTL and a phenotype is causal.

Why is this so?

QTL mapping is analogous to a randomized experiment (Li et al. 2006).

Randomization is considered the “gold standard” for causal inference.

Causality can be inferred from a randomized experiment since:

1. Application of a treatment to an experimental unit precedes the observation of the outcome (genotype precedes phenotype).

2. Because the treatment levels are randomized across the experimental units, the effects of confounding variables get averaged out (the Mendelian randomization of alleles during meiosis averages out the effects of other unlinked loci on the phenotype).

Causal relations between QTLs and phenotypes

[Figure: QTL A on chr1 and QTL B on chr2 both affect the phenotype. Panels: LOD profile (y-axis: lod) across chromosomes 1 and 2; "Effect of QTL A, after the effect of B is averaged out" (phenotype by genotype, Aa vs AA); "Effect of QTL B, after the effect of A is averaged out" (phenotype by genotype, Bb vs BB).]

Conditional independence as the key to causal ordering

Model: Q → G → C

[Figure: Marginal dependence: G and C each plotted by QTL genotype (Aa, AA). Conditional independence: C plotted against G, and res(C|G) plotted by QTL genotype (Aa, AA).]
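The marginal dependence and conditional independence under Q → G → C can be illustrated by simulation. This is a sketch under assumed settings: a two-genotype QTL coded 0/1, linear effects and unit-variance Gaussian noise, none of which are taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
Q = rng.integers(0, 2, size=n)        # genotype coded 0 (Aa) or 1 (AA); assumed coding
G = 2.0 * Q + rng.normal(size=n)      # Q -> G
C = 1.0 * G + rng.normal(size=n)      # G -> C

def residuals(y, x):
    """Residuals of the least-squares regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def mean_shift(values, genotype):
    """Difference in mean between the two genotype groups."""
    return values[genotype == 1].mean() - values[genotype == 0].mean()

shift_G = mean_shift(G, Q)                    # marginal dependence of G on Q
shift_C = mean_shift(C, Q)                    # marginal dependence of C on Q
shift_res = mean_shift(residuals(C, G), Q)    # res(C|G) no longer depends on Q

print(shift_G, shift_C, shift_res)
```

Both G and C shift with genotype, while the residuals of C given G do not, which is the pattern the causal model predicts.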

Causal ordering between phenotypes

[Figure: the causal, reactive and independent models over Q, G and C, each with panels showing res(C|G) and res(G|C) plotted by QTL genotype (Aa, AA).]

Causal ordering between phenotypes

In general (although it is not always true):

Models that share the same set of conditional independence relations (Markov equivalent models) cannot be distinguished using the data (they have equivalent likelihood functions).

Models with distinct sets of conditional independence relations, by contrast, can be distinguished.

Causality tests for pairs of phenotypes

[Figure: the causal, reactive and independent models over Q, G and C.]

Pairwise models as collapsed versions of more complex networks

[Figure: five networks, (a) to (e), in which Q, Y1 and Y2 are connected through additional phenotypes Y, each shown above the pairwise model over Q, Y1 and Y2 that it collapses to.]

A causal relation might be direct or mediated by other phenotypes.

Pairwise models are misspecified.

Schadt et al. (2005)

Using this approach, Schadt et al. (2005) identified and experimentally validated genes related to obesity in a mouse cross.

So, what is the issue then?

- Model selection via AIC or BIC scores does not provide a measure of uncertainty associated with the model selection call.

- With noisy data, model selection can lead to a large number of false positives.

The issue: an illustration

For each of the 1,000 simulations we:

- Generate noisy data from the model Q → Y1 → Y2.

- Fit models M1 : Q → Y1 → Y2 and M2 : Q → Y2 → Y1.

- Compute the log-likelihood ratio LR12.

- If LR12 > 0, select M1. If LR12 < 0, select M2.

[Figure: scatter plot of R2(Y2 = Q + ε) against R2(Y1 = Q + ε), both axes from 0.0 to 0.5, with each simulation marked as a false positive (318) or a true positive (682).]
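The simulation steps above can be sketched as follows. The sample size, effect sizes, noise level and genotype coding are assumptions chosen for illustration, so the counts will not match the 318/682 split reported on the slide, whose settings are not given.

```python
import numpy as np

def gaussian_loglik(y, x):
    """Maximized Gaussian log-likelihood of the regression of y on x (with intercept)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def simulate_selection(n_sim=1000, n=100, beta_q=1.0, beta_y=1.0, seed=0):
    """Count how often M1 (true) and M2 (false) are selected by the sign of LR12."""
    rng = np.random.default_rng(seed)
    picks_m1 = 0
    for _ in range(n_sim):
        q = rng.integers(0, 2, size=n).astype(float)   # assumed two-genotype QTL
        y1 = beta_q * q + rng.normal(size=n)           # Q -> Y1
        y2 = beta_y * y1 + rng.normal(size=n)          # Y1 -> Y2
        # The marginal term for Q is common to both models and is omitted.
        ll_m1 = gaussian_loglik(y1, q) + gaussian_loglik(y2, y1)   # Q -> Y1 -> Y2
        ll_m2 = gaussian_loglik(y2, q) + gaussian_loglik(y1, y2)   # Q -> Y2 -> Y1
        if ll_m1 - ll_m2 > 0:
            picks_m1 += 1
    return picks_m1, n_sim - picks_m1

true_pos, false_pos = simulate_selection()
print(true_pos, false_pos)
```

For these Gaussian models the sign of LR12 depends only on which of Y1 and Y2 has the larger sample R2 with Q, which is why the slide plots the two R2 values against each other.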

Issue: no measure of uncertainty for a model selection call

We want a statistical procedure that attaches a measure of uncertainty to a model selection call.

However, given the characteristics of our application problem, it:

1. Needs to handle misspecified models.

2. Needs to handle non-nested models, such as

M1 : Q → Y1 → Y2 and M2 : Q → Y2 → Y1.

3. Should, ideally, be fully analytical for the sake of computational efficiency.

Assessing the significance of a model selection call

Vuong’s model selection test (Vuong 1989) satisfies these three criteria.

[Figure: scatter plot of R2(Y2 = Q + ε) against R2(Y1 = Q + ε), both axes from 0.0 to 0.5, with each simulation marked as a false positive (1), a true positive (65) or a no call (934).]

Vuong’s model selection test (Vuong 1989)

Consider two competing models, M1 × M2.

Vuong’s test evaluates the hypotheses:

H0 : M1 is not closer to the true model than M2,

H1 : M1 is closer to the true model than M2,

where, under H0, the scaled log-likelihood-ratio test statistic satisfies

\[ Z_{12} = \frac{\widehat{LR}_{12}}{\sqrt{n\,\hat{\sigma}_{12.12}}} \xrightarrow{d} N(0, 1) , \]

with \widehat{LR}_{12} = \sum_{i=1}^{n} (\log \hat{f}_{1,i} - \log \hat{f}_{2,i}), and \hat{\sigma}_{12.12} the sample variance of the log-likelihood ratio scores.
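The statistic above can be computed directly from per-observation log-densities. The following is a minimal sketch, assuming Gaussian linear models and no penalty for model dimension; the function names and the simulated data are hypothetical, and this is not the R/qtlhot implementation.

```python
import math
import numpy as np

def pointwise_gaussian_loglik(y, x):
    """Per-observation log-densities log f_i of the fitted Gaussian regression of y on x."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n
    return -0.5 * np.log(2 * np.pi * sigma2) - resid**2 / (2 * sigma2)

def vuong_test(loglik1, loglik2):
    """Return (Z12, one-sided p-value) for H1: model 1 is closer to the truth than model 2."""
    scores = np.asarray(loglik1) - np.asarray(loglik2)   # log f1_i - log f2_i
    n = len(scores)
    lr = scores.sum()                                    # LR12
    var = scores.var()                                   # sample variance of the scores
    z = lr / math.sqrt(n * var)
    p = 0.5 * math.erfc(z / math.sqrt(2))                # upper tail of N(0, 1)
    return z, p

# Hypothetical data generated from Q -> Y1 -> Y2.
rng = np.random.default_rng(3)
n = 500
q = rng.integers(0, 2, size=n).astype(float)
y1 = 2.0 * q + rng.normal(size=n)
y2 = y1 + rng.normal(size=n)

ll_m1 = pointwise_gaussian_loglik(y1, q) + pointwise_gaussian_loglik(y2, y1)   # M1
ll_m2 = pointwise_gaussian_loglik(y2, q) + pointwise_gaussian_loglik(y1, y2)   # M2
z12, p12 = vuong_test(ll_m1, ll_m2)
print(z12, p12)
```

A "no call" corresponds to a p-value above the chosen significance level in both directions, which is what gives the test its measure of uncertainty.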

Causal Model Selection Tests (CMST)

Vuong’s test handles model selection for 2 models only.

However, we want to use data from experimental crosses to distinguish among 4 models:

M1 : Q → Y1 → Y2

M2 : Q → Y2 → Y1

M3 : Y1 ← Q → Y2

M4 : Y1 ← Q → Y2, with Y1 and Y2 also joined by edges in both directions (Y1 ⇄ Y2)

Likelihood equivalent models (each containing Y1 ← Q → Y2):

M4a : Y1 → Y2

M4b : Y1 ⇄ Y2

M4c : Y1 ← Y2

Causal Model Selection Tests (CMST)

Combine several separate Vuong’s tests into a single one. There are 3 versions:

1. Parametric CMST: intersection-union test of 3 Vuong’s tests, M1 × M2, M1 × M3, M1 × M4, testing:

H0 : M1 is not closer to the true model than M2, M3, or M4.

H1 : M1 is closer to the true model than M2, M3, and M4.

2. Non-parametric CMST: intersection-union test of 3 paired sign tests (Clarke’s test).

3. Joint-parametric CMST: extension of the parametric CMST test which accounts for the correlation among the test statistics of the Vuong’s tests.
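The intersection-union construction in versions 1 and 2 can be sketched as below: H0 is rejected only if every component test rejects, so the combined p-value is the largest component p-value. This is an illustrative sketch with hypothetical function names and made-up score vectors; it omits any penalty terms and does not cover the joint-parametric version.

```python
import math
import numpy as np

def vuong_pvalue(scores):
    """One-sided Vuong p-value from log-likelihood ratio scores (log f1_i - log fk_i)."""
    scores = np.asarray(scores, dtype=float)
    z = scores.sum() / math.sqrt(len(scores) * scores.var())
    return 0.5 * math.erfc(z / math.sqrt(2))

def sign_test_pvalue(scores):
    """One-sided paired sign test: P(Binomial(n, 1/2) >= number of positive scores)."""
    scores = np.asarray(scores, dtype=float)
    scores = scores[scores != 0]          # ties carry no sign information
    n = len(scores)
    k = int((scores > 0).sum())
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2**n

def intersection_union_pvalue(score_sets, pvalue_fn):
    """Combined p-value: the maximum over the component tests."""
    return max(pvalue_fn(s) for s in score_sets)

# Hypothetical score vectors for M1 x M2, M1 x M3 and M1 x M4.
rng = np.random.default_rng(4)
score_sets = [rng.normal(loc, 1.0, size=200) for loc in (0.3, 0.5, 0.1)]

p_parametric = intersection_union_pvalue(score_sets, vuong_pvalue)
p_nonparametric = intersection_union_pvalue(score_sets, sign_test_pvalue)
print(p_parametric, p_nonparametric)
```

Taking the maximum makes the combined test conservative, which is consistent with the drop in both true and false positives reported for the CMST tests.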

Yeast data analysis

Budding yeast genetical genomics data set (Brem and Kruglyak 2005).

Data on 112 strains with:

- Expression measurements on 5,740 transcripts.

- Dense genotype data on 2,956 markers.

Most importantly:

We evaluated the precision of the causal predictions using validated causal relationships extracted from a database of 247 knock-out experiments in yeast (Hughes et al. 2000, Zhu et al. 2008).

Knockout signatures

- In each experiment, one gene was knocked out, and the expression levels of the remaining genes in control and knocked-out strains were interrogated for differential expression.

- The set of differentially expressed genes forms the knock-out signature (ko-signature) of the knocked-out gene (ko-gene).

- The ko-signature represents a validated set of causal relations.

Validation using yeast knockout signatures

To leverage the ko information, we:

- Determined which of the 247 ko-genes also showed a significant QTL in our data set.

- For each ko-gene showing significant linkages, determined which other genes co-mapped to the ko-gene’s QTL, generating, in this way, a list of putative targets of the ko-gene.

- For each ko-gene/putative targets list, applied all methods using the ko-gene as the Y1 phenotype, the putative target genes as the Y2 phenotypes and the ko-gene’s QTL as the causal anchor.

- In total, 135 ko-genes showed significant linkages (both cis- and trans-).

- The number of genes in the target lists varied from ko-gene to ko-gene, but, in total, there were 31,936 targets.


Performance in terms of “biologically validated” TP, FP and precision:

- TP: a statistically significant causal relation between a ko-gene and a putative target gene when the putative target gene belongs to the ko-signature of the ko-gene.

- FP: a statistically significant causal relation between a ko-gene and a putative target gene when the target gene doesn’t belong to the ko-signature.

- The “validated precision” is computed as the ratio of true positives to the sum of true and false positives.
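The definitions above amount to simple set arithmetic for a single ko-gene. A minimal sketch follows; the function name and the gene labels are hypothetical placeholders, not values from the yeast data.

```python
def validated_precision(significant_targets, ko_signature):
    """Return (TP, FP, precision) for one ko-gene.

    significant_targets: putative targets with a significant causal call.
    ko_signature: genes differentially expressed in the knock-out experiment.
    """
    calls = set(significant_targets)
    signature = set(ko_signature)
    tp = len(calls & signature)     # significant calls inside the ko-signature
    fp = len(calls - signature)     # significant calls outside the ko-signature
    precision = tp / (tp + fp) if calls else float("nan")
    return tp, fp, precision

# Hypothetical gene labels, for illustration only.
tp, fp, precision = validated_precision({"g1", "g2", "g3", "g4"}, {"g2", "g4", "g9"})
print(tp, fp, precision)  # 2 2 0.5
```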

Results: cis and trans ko-genes

[Figure: three panels, True Positives (y-axis: number of true positives), False Positives (y-axis: number of false positives) and Precision, each plotted against the nominal significance level (ticks at 0.02, 0.06 and 0.10). Legend: black: BIC, blue: joint CMST BIC, green: par CMST BIC, red: non par CMST BIC.]

Results: cis ko-genes only

27 out of the 135 candidate regulator ko-genes mapped in cis.

[Figure: three panels, True Positives (y-axis: number of true positives), False Positives (y-axis: number of false positives) and Precision, each plotted against the nominal significance level (ticks at 0.02, 0.06 and 0.10). Legend: black: BIC, blue: joint CMST BIC, green: par CMST BIC, red: non par CMST BIC.]

Precision side by side

[Figure: two Precision panels, "Cis and trans" and "Cis only", with x-axis ticks at 0.02, 0.06 and 0.10 and precision from 0.0 to 0.5. Legend: black: BIC, blue: joint CMST BIC, green: par CMST BIC, red: non par CMST BIC.]

Cis-vs-trans case

Why is the cis-vs-trans case easier than the trans-vs-trans case?

In general, cis-linkages tend to be stronger than trans-linkages.

[Figure: scatter plot of R2(Y2 = Q + ε) against R2(Y1 = Q + ε), both axes from 0.0 to 0.5.]

Conclusions

- CMST tests trade statistical power for a lower rate of false positives.

- Whether a more powerful and less precise method or a less powerful and more precise method is more appropriate depends on the biologist’s research goals and resources.

- If the biologist can easily validate several genes, a larger list generated by more powerful and less precise methods might be more appealing.

- If follow-up studies are time consuming and expensive, and only a few candidates can be studied in detail, a more precise method that conservatively identifies candidates with high confidence can be more appealing.

Causal Bayesian networks and the QTLnet algorithm

[Figure: network of yeast transcripts, repeated later under "Yeast data analysis".]

Standard Bayesian networks

A graphical model is a multivariate probabilistic model whose conditional independence relations are represented by a graph.

Bayesian networks are directed acyclic graph (DAG) models, for example the DAG on nodes 1 to 6 with edges

1 → 3, 2 → 3, 3 → 5, 4 → 5, 5 → 6.

Assuming the Markov property, the joint distribution factors according to the conditional independence relations:

P(1, 2, 3, 4, 5, 6) = P(6 | 5) P(5 | 3, 4) P(4) P(3 | 1, 2) P(2) P(1)

6 ⊥⊥ {1, 2, 3, 4} | 5, 5 ⊥⊥ {1, 2} | 3, 4, and so on,

i.e., each node is independent of its non-descendants given its parents.
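The factorization and the local Markov statements can be read off mechanically from the parent sets. The sketch below encodes the six-node DAG implied by the factorization above; the function names are hypothetical.

```python
# Parent sets implied by P(6|5) P(5|3,4) P(4) P(3|1,2) P(2) P(1).
parents = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4], 6: [5]}

def descendants(node, parents):
    """All nodes reachable from `node` by following directed edges."""
    children = {v: [c for c, ps in parents.items() if v in ps] for v in parents}
    seen, stack = set(), [node]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def factorization(parents):
    """Joint distribution as a product of P(node | parents) terms."""
    terms = []
    for node in sorted(parents, reverse=True):
        ps = parents[node]
        terms.append(f"P({node} | {', '.join(map(str, ps))})" if ps else f"P({node})")
    return "".join(terms)

def local_markov(parents):
    """For each node: (non-descendants other than its parents, parents)."""
    statements = {}
    for node, ps in parents.items():
        others = set(parents) - {node} - descendants(node, parents) - set(ps)
        statements[node] = (sorted(others), sorted(ps))
    return statements

print(factorization(parents))
for node, (indep, given) in local_markov(parents).items():
    if indep:
        print(f"{node} is independent of {indep} given {given}")
```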

Standard Bayesian networks and causality

Even though the directed edges in a Bayes net are often interpreted as causal relations, in reality they only represent conditional dependencies.

Different phenotype networks, for instance,

Y1 → Y2 → Y3 , Y1 ← Y2 → Y3 , Y1 ← Y2 ← Y3 ,

can represent the same set of conditional independence relations (Y1 ⊥⊥ Y3 | Y2, in this example).

When this is the case, we say the networks are Markov equivalent.
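Markov equivalence of two DAGs can be checked with the standard graphical criterion, which is not stated on the slide: two DAGs are Markov equivalent when they have the same skeleton and the same v-structures (colliders a → c ← b with a and b not adjacent). A minimal sketch, with hypothetical function names:

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, child, b) with a -> child <- b and a, b not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    return skeleton(edges1) == skeleton(edges2) and v_structures(edges1) == v_structures(edges2)

chain = [("Y1", "Y2"), ("Y2", "Y3")]          # Y1 -> Y2 -> Y3
fork = [("Y2", "Y1"), ("Y2", "Y3")]           # Y1 <- Y2 -> Y3
reverse_chain = [("Y3", "Y2"), ("Y2", "Y1")]  # Y1 <- Y2 <- Y3
collider = [("Y1", "Y2"), ("Y3", "Y2")]       # Y1 -> Y2 <- Y3

print(markov_equivalent(chain, fork), markov_equivalent(chain, reverse_chain))
print(markov_equivalent(chain, collider))
```

The three networks on the slide come out equivalent, whereas the collider Y1 → Y2 ← Y3 does not, because it encodes a different independence relation.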

Standard Bayesian networks and causality

In general:

Markov equivalence ⇔ Distribution equivalence (equivalence of likelihood functions)

Hence, model selection criteria cannot distinguish between Markov equivalent networks.

The best we can do is to learn equivalence classes of likelihood equivalent phenotype networks from the data.

Genetics as a way to reduce the size of equivalence classes

The incorporation of genetic information can help distinguish between likelihood equivalent networks in two distinct ways:

1. By creating priors for the network structures, using the results of causality tests (Zhu et al. 2007).

2. By augmenting the phenotype network with QTL nodes, creating new sets of conditional independence relations (Chaibub Neto et al. 2010).

Genetic priors

Consider the networks

M1 : Y1 → Y2 → Y3 , M2 : Y1 ← Y2 ← Y3 .

These Markov equivalent networks have the same likelihood, i.e.,

P(D | M1) = P(D | M2) .

If the phenotypes are associated with QTLs, we can use the results of the causality tests to compute prior probabilities for the network structures. If

\[ \frac{P(M_1)}{P(M_2)} \neq 1 , \]

then

\[ \frac{P(M_1 \mid D)}{P(M_2 \mid D)} = \frac{P(D \mid M_1)\, P(M_1)}{P(D \mid M_2)\, P(M_2)} \neq 1 , \]

and we can use the posterior probability ratio to distinguish between the networks.

Augmenting the phenotype network with QTLs

Consider the Markov equivalent networks:

M1 : Y1 → Y2 → Y3 , M2 : Y1 ← Y2 ← Y3 .

By augmenting the phenotype network with a QTL node,

M1 : Q → Y1 → Y2 → Y3 , M2 : Q → Y1 ← Y2 ← Y3 ,

we have that M1 and M2 have distinct sets of conditional independence relations:

Y2 ⊥⊥ Q | Y1 , Y1 ⊥⊥ Y3 | Y2 , on M1

Y2 ⊥̸⊥ Q | Y1 , Y1 ⊥⊥ Y3 | Y2 , on M2

Hence, M1 and M2 are no longer likelihood equivalent.
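The difference between the two augmented networks shows up in the partial correlation between Y2 and Q given Y1. The sketch below simulates both networks under assumed linear effects and unit-variance Gaussian noise; all numeric settings and the helper name `partial_corr` are illustrative assumptions.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing both on z."""
    Z = np.column_stack([np.ones(len(z)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(2)
n = 5000
q = rng.integers(0, 2, size=n).astype(float)   # assumed two-genotype QTL

# M1 : Q -> Y1 -> Y2 -> Y3
y1 = q + rng.normal(size=n)
y2 = y1 + rng.normal(size=n)
y3 = y2 + rng.normal(size=n)
pc_m1 = partial_corr(y2, q, y1)     # close to zero: Y2 is independent of Q given Y1

# M2 : Q -> Y1 <- Y2 <- Y3
y3b = rng.normal(size=n)
y2b = y3b + rng.normal(size=n)
y1b = q + y2b + rng.normal(size=n)
pc_m2 = partial_corr(y2b, q, y1b)   # clearly non-zero: Y1 is a collider

print(pc_m1, pc_m2)
```

Conditioning on the collider Y1 in M2 induces a dependence between Q and Y2 that is absent in M1, which is what breaks the likelihood equivalence.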

Learning Bayesian Networks from Data

- Posterior probability of network Mk given the observed data, D:

\[ P(M_k \mid D) = \frac{P(D \mid M_k)\, P(M_k)}{\sum_{M_k} P(D \mid M_k)\, P(M_k)} . \]

- Prior predictive distribution of D given Mk:

\[ P(D \mid M_k) = \int P(D \mid \theta, M_k)\, P(\theta)\, d\theta . \]

- Prior distribution of network Mk: P(Mk).

- The marginal distribution of the data,

\[ P(D) = \sum_{M_k} P(D \mid M_k)\, P(M_k) , \]

cannot, generally, be computed analytically because the number of networks is too large.

Learning Bayesian Networks from Data

Complexity of the learning task:

# of nodes    # of networks
1             1
2             3
3             25
4             543
5             29,281
6             3,781,503
10            4.175099e+18
20            2.344880e+72
30            2.714854e+158

Hence, heuristic search algorithms are essential to traverse the network space efficiently.
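The counts in the table are the numbers of DAGs on labeled nodes. They can be computed with Robinson's recurrence, a standard result that the slide does not name; the sketch below is one way to do it.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of directed acyclic graphs on n labeled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in (1, 2, 3, 4, 5, 6):
    print(n, count_dags(n))
print(10, f"{count_dags(10):.6e}")
```

The growth is super-exponential, so enumerating all networks is already impractical at around 10 nodes.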

QTLnet algorithm

- Performs joint inference of the causal phenotype network and the associated genetic architecture.

- The genetic architecture is inferred conditional on the phenotype network.

- Because the phenotype network structure is itself unknown, the algorithm iterates between updating the network structure and the genetic architecture using a Markov chain Monte Carlo (MCMC) approach.

- QTLnet corresponds to a mixed Bayesian network with continuous and discrete nodes representing phenotypes and QTLs, respectively.

QTLnet algorithm - standard structure sampler

Bayesian model averaging

[Figure: bar plot of posterior probability (0.00 to 0.20) for models M_1 to M_10, with each model drawn as a network on nodes 1 to 4.]

Pr(Y1 → Y2) = Pr(M1) + Pr(M3) + Pr(M4) = 0.54

Pr(Y1 . . . Y2) = Pr(M2) + Pr(M5) + Pr(M7) = 0.34

Pr(Y1 ← Y2) = Pr(M6) + Pr(M8) + Pr(M9) + Pr(M10) = 0.12
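The averaging above sums posterior model probabilities over all models that share the same status for the Y1, Y2 pair. In the sketch below the grouping of models follows the slide, but the individual posterior probabilities are hypothetical values chosen only so that the group totals match the slide; the per-model values are not recoverable from the figure.

```python
from collections import defaultdict

# Hypothetical posterior probabilities (illustrative only; they sum to 1).
posterior = {
    "M1": 0.20, "M3": 0.18, "M4": 0.16,               # models containing Y1 -> Y2
    "M2": 0.14, "M5": 0.11, "M7": 0.09,               # models with no Y1, Y2 edge
    "M6": 0.04, "M8": 0.03, "M9": 0.03, "M10": 0.02,  # models containing Y1 <- Y2
}

# Status of the Y1, Y2 pair in each model, as grouped on the slide.
edge_status = {
    "M1": "Y1 -> Y2", "M3": "Y1 -> Y2", "M4": "Y1 -> Y2",
    "M2": "no edge", "M5": "no edge", "M7": "no edge",
    "M6": "Y1 <- Y2", "M8": "Y1 <- Y2", "M9": "Y1 <- Y2", "M10": "Y1 <- Y2",
}

def average_over_models(posterior, edge_status):
    """Posterior probability of each edge status, summed over models."""
    totals = defaultdict(float)
    for model, prob in posterior.items():
        totals[edge_status[model]] += prob
    return dict(totals)

edge_posterior = average_over_models(posterior, edge_status)
for status, prob in edge_posterior.items():
    print(status, round(prob, 2))
```

In an MCMC run the posterior probabilities would be estimated by the relative frequencies of the sampled structures rather than supplied by hand.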

Yeast data analysis

We build a causal phenotype network around PHM7.

PHM7 is physically located close to the hotspot QTL on chr 15.

[Figure: "chr 15 hotspot": counts (0 to 120) against map position (cM, 0 to 400).]

PHM7 is the cis-gene with the largest number of significant causal calls across all hotspots (23 significant calls at α = 0.001 for joint CMST).

Yeast data analysis

PHM7 (yellow) shows up at the top of the transcriptional network.

[Figure: network with nodes YOL084W, YAL061W, YDR032C, YEL011W, YFR043C, YHR016C, YHR104W, YIL113W, YIR016W, YJL111W, YJL161W, YJL210W, YJR096W, YKL085W, YKL091C, YLR178C, YMR170C, YNL160W, YNL195C, YNR014W, YOL097C, YOR028C, YPL154C and YPR160W, plus four nodes whose labels were lost in extraction.]

References

1. Chaibub Neto et al. (2013) Modeling causality for pairs of phenotypes in systems genetics. Genetics 193: 1003-1013.

2. Chaibub Neto et al. (2010) Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Annals of Applied Statistics 4: 320-339.

Software: R/qtlhot and R/qtlnet packages.

Further references:

1. Brem and Kruglyak (2005) PNAS 102: 1572-1577.

2. Clarke (2007) Political Analysis 15: 347-363.

3. Hughes et al. (2000) Cell 102: 109-116.

4. Kullback (1959) Information theory and statistics. John Wiley. New York.

5. Li et al. (2006) PLoS Genetics 2: e114.

6. Schadt et al. (2005) Nature Genetics 37: 710-717.

7. Vuong (1989) Econometrica 57: 307-333.

8. Zhu et al. (2008) Nature Genetics 40: 854-861.

Acknowledgments

Co-authors:

- Brian S Yandell (Statistics - UW-Madison)

- Mark P Keller (Biochemistry - UW-Madison)

- Alan D Attie (Biochemistry - UW-Madison)

- Bin Zhang (Genetics and Genomic Sciences - MSSM)

- Jun Zhu (Genetics and Genomic Sciences - MSSM)

- Aimee T Broman (Biochemistry - UW-Madison)

Thank you!
