Bayes, Data Mining and Pharmacovigilance Patrick Graham University of Otago, Christchurch

Acknowledgements

• Funders: MedSafe and HRC
• Collaborators: Ruth Savage, Janelle Ashton, Michael Tatley from New Zealand Pharmacovigilance Centre, University of Otago

Background (1)

• Pharmacovigilance – post-marketing surveillance of medicines.
• Seeks early detection of adverse drug reactions.
• Traditionally, clinical review of spontaneous reports
• Late 1990s? “Data mining” of databases of spontaneous reports – “signal detection”
• Two Bayesian methods prominent:
  - WHO: independent 2 x 2 tables, Multinomial-Dirichlet
  - FDA (DuMouchel): analyse the all drugs by all reactions table using a hierarchical Poisson, mixture of Gammas model

Background (2)

• Interest is now turning to the potential of longitudinal health care databases, record-linkage, electronic prescribing and other technologies
• MedSafe / HRC formed the “product-vigilance partnership” in 2007 to advance product-vigilance research in NZ.
• Feasibility studies in 2008
• Collaborative group funded late 2009, led by Dr. Michael Tatley of the NZ Pharmacovigilance Centre.
• Wider project includes work on: risk communication; ethics, acceptability and methods for accessing general practice data; epidemiological studies; signal detection in longitudinal databases.
• Signal detection work includes Bayesian methods (me) as well as investigation of machine-learning approaches, text mining and natural language processing.

Features of health care databases

• More representative than spontaneous reports
• Usually longitudinal
• Lots of time-stamped information
• Large size (makes analysis at the level of individual patients difficult)

Signal detection in health care databases

• Aim is exploratory analysis of:
• Multiple drugs
• Multiple outcomes, some of which will be rare
• Large datasets, potentially large number of comparisons, so computational efficiency will often be an issue.

One approach: Noren et al (2010)

Published in Data Mining and Knowledge Discovery, 2010, 20, 361-387.

• Simple Poisson-Gamma model for each combination of outcome, drug, and time period.
• Exact posterior immediately available
• No smoothing, no pooling, just shrinkage towards a prior mean
• Applied to a UK General Practice database, > 20 million scripts
• Analysed 2,445 drugs x 5,753 outcomes x 72 time periods
• Took ~8 hours on a server with 2 dual core, 2.4 GHz processors
• Nice graphs of temporal trends
• Software system developed but not publicly available
• A data mining approach?

Noren et al - Details

$Y_{odt}$ = # occurrences of outcome $o$ among people exposed to drug $d$ and at risk in period $t$

$N_{dt}$ = person-time at risk during period $t$ among people exposed to $d$

$$e_{odt} = \left( \sum_a Y_{oat} \Big/ \sum_a N_{at} \right) N_{dt}$$
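As a minimal sketch of the expected count, assume the pooled index a runs over all drugs for a fixed outcome and period (my reading of the notation, not something the slides state explicitly). The expected count for a drug is then its person-time multiplied by the pooled event rate. The function name and the toy numbers are hypothetical.

```python
def expected_counts(events, person_time):
    """Expected events per drug for one outcome and one period.

    events[d]      : observed events Y among people exposed to drug d
    person_time[d] : person-time at risk N among people exposed to drug d
    Assumption: the pooled rate sums over all drugs.
    """
    pooled_rate = sum(events.values()) / sum(person_time.values())
    return {d: person_time[d] * pooled_rate for d in person_time}


# Toy data, invented for illustration only
events = {"drug_a": 4, "drug_b": 1, "drug_c": 5}
person_time = {"drug_a": 100.0, "drug_b": 300.0, "drug_c": 600.0}

e = expected_counts(events, person_time)
print(e)
```

By construction the expected counts sum to the total number of observed events, so the ratio of observed to expected counts is centred around 1.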

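$$Y_{odt} \mid \mu_{odt}, e_{odt} \overset{\text{indep}}{\sim} \text{Poisson}(\mu_{odt} e_{odt})$$

$$\mu_{odt} \sim \text{Gamma}(0.5, 0.5)$$

$$\text{so } \mu_{odt} \mid Y_{odt}, e_{odt} \overset{\text{indep}}{\sim} \text{Gamma}(0.5 + Y_{odt}, 0.5 + e_{odt})$$

$$\text{and } E(\mu_{odt} \mid Y_{odt}, e_{odt}) = \frac{0.5 + Y_{odt}}{0.5 + e_{odt}} = (1 - w_{odt})(Y_{odt} / e_{odt}) + w_{odt} \times 1, \qquad w_{odt} = \frac{0.5}{e_{odt} + 0.5}$$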
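The conjugate update can be checked numerically. The sketch below assumes a Gamma(0.5, 0.5) prior (shape, rate) on the observed-to-expected ratio and a Poisson count with mean equal to that ratio times the expected count. It confirms that the posterior mean equals a weighted average of the crude ratio and the prior mean of 1. The function name and the example numbers are hypothetical.

```python
def gamma_poisson_update(y, e, prior_shape=0.5, prior_rate=0.5):
    """Posterior for a ratio with a Gamma(shape, rate) prior and y ~ Poisson(ratio * e)."""
    post_shape = prior_shape + y
    post_rate = prior_rate + e
    post_mean = post_shape / post_rate
    # Shrinkage form: weight w on the prior mean, 1 - w on the crude ratio y / e
    w = prior_rate / (prior_rate + e)
    prior_mean = prior_shape / prior_rate
    shrunk = (1.0 - w) * (y / e) + w * prior_mean
    return post_shape, post_rate, post_mean, w, shrunk


# Few expected events: strong shrinkage towards 1
print(gamma_poisson_update(y=4, e=1.0))
# Many expected events: the crude ratio dominates
print(gamma_poisson_update(y=400, e=100.0))
```

With y = 4 and e = 1 the crude ratio is 4 but the posterior mean is 3; with y = 400 and e = 100 the posterior mean stays very close to 4.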

Why did we not use Noren et al’s approach?

(i) A statistical modeller’s perspective
• Many similar parameters to be estimated
• Should we not be trying to learn from similar estimations when estimating each particular parameter?
• Old idea for both Bayesians and frequentists.

(ii) Our paradigm is on a smaller scale, e.g.
• Compare a new drug with drugs currently used for treating the same condition.
• 20-30 outcomes thought a priori likely to be associated with ADRs, e.g. (Trefiro et al 2009)

Hierarchical Bayesian Model

(similar notation to previously but specifically reference patient sub-groups by subscript g)

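$$Y_{godt} \mid \lambda_{godt}, N_{gdt} \overset{\text{indep}}{\sim} \text{Poisson}(\lambda_{godt} N_{gdt})$$

$$\lambda_{godt} \mid \beta_o, \alpha_o \overset{\text{indep}}{\sim} \text{Gamma}(\alpha_o, \alpha_o / \mu_{godt})$$

$$\log(\mu_{godt}) = X_{gdt} \beta_o \quad \text{[prior model]}$$

$$p(\beta_o, \alpha_o) = \left( \prod_i N(\beta_{oi} \mid 0, s_i^2) \right) \frac{z_0}{(z_0 + \alpha_o)^2}$$

Focus is on summaries of the $\lambda_{godt}$, e.g.

$$\lambda^{std}_{odt} = \sum_g (N_{gdt} \lambda_{godt}) \Big/ \sum_g N_{gdt}$$

for which inference follows from $p(\boldsymbol{\lambda} \mid \mathbf{Y}, \mathbf{N})$, the joint posterior for the $\lambda_{godt}$.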

(Separate model for each outcome, o)
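Because the standardised rate is a function of the group-specific rates, its posterior can be summarised by applying that function to each joint posterior draw. A minimal sketch, assuming the weights are the person-time totals of the patient sub-groups and that draws of the group-specific rates are already available; the function name and the numbers are invented for illustration.

```python
def standardised_rate(rates, person_time):
    """Person-time weighted average of group-specific rates for one outcome, drug and period."""
    total = sum(person_time)
    return sum(n * lam for n, lam in zip(person_time, rates)) / total


# Person-time in two hypothetical patient sub-groups
person_time = [300.0, 100.0]

# Three hypothetical joint posterior draws of the two group-specific rates
draws = [[0.010, 0.030], [0.012, 0.028], [0.008, 0.036]]

# One standardised rate per draw gives draws from its posterior
std_draws = [standardised_rate(d, person_time) for d in draws]
print(std_draws)
```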

Hierarchical Bayesian Model (cont’d)

Prior model permits full flexibility of statistical modelling, e.g.
- jump in event rate just after first prescription
- smooth but nonlinear changes elsewhere
- drug by time interactions
- drug by covariate interactions
- etc.

But hierarchical model structure provides some protection against model misspecification.

$$E(\lambda_{godt} \mid Y_{godt}, N_{gdt}, \beta_o, \alpha_o) = (1 - w_{godt})(Y_{godt} / N_{gdt}) + w_{godt} \mu_{godt}$$

$$\text{where } w_{godt} = \frac{\alpha_o}{\alpha_o + \mu_{godt} N_{gdt}}; \qquad \log(\mu_{godt}) = X_{gdt} \beta_o$$

Of course we don’t actually condition on $\beta_o, \alpha_o$ but integrate over the posterior to obtain $p(\boldsymbol{\lambda} \mid \mathbf{Y}, \mathbf{N})$.
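The protection against misspecification comes from the weight in the conditional posterior mean. The sketch below writes the Gamma shape as alpha and the regression-based prior mean as mu = exp(X beta), and compares the weighted-average form with the direct Gamma posterior mean (alpha + Y) / (alpha / mu + N). All numbers are hypothetical.

```python
import math


def conditional_rate_mean(y, n, x, beta, alpha):
    """Conditional posterior mean of a rate given regression coefficients and Gamma shape."""
    mu = math.exp(sum(xi * bi for xi, bi in zip(x, beta)))  # prior mean from the regression
    w = alpha / (alpha + mu * n)                            # weight on the prior mean
    shrunk = (1.0 - w) * (y / n) + w * mu                   # weighted-average form
    direct = (alpha + y) / (alpha / mu + n)                 # mean of the Gamma posterior
    return mu, w, shrunk, direct


beta = [math.log(0.02)]  # intercept-only model, prior mean rate 0.02

# Little person-time: the estimate stays close to the model-based mean
small = conditional_rate_mean(y=1, n=10.0, x=[1.0], beta=beta, alpha=2.0)
# Much person-time: the crude rate dominates even if the regression fits poorly
large = conditional_rate_mean(y=100, n=1000.0, x=[1.0], beta=beta, alpha=2.0)
print(small)
print(large)
```

Both cells have a crude rate of 0.1, five times the model-based mean. The cell with little person-time is pulled most of the way towards 0.02, while the cell with much person-time remains near 0.1.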

Hierarchical Bayesian Model - Computation

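$$p(\boldsymbol{\lambda}_o \mid \mathbf{Y}_o, \mathbf{N}) = \int p(\boldsymbol{\lambda}_o \mid \mathbf{Y}_o, \mathbf{N}, \beta_o, \alpha_o) \, p(\beta_o, \alpha_o \mid \mathbf{Y}_o, \mathbf{N}) \, d\beta_o \, d\alpha_o$$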

First part of the integrand is product of independent Gammas, second part is the posterior for parameters of a negative binomial model, suggesting Monte Carlo computation via

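(i) MCMC to obtain a sample from $p(\beta_o, \alpha_o \mid \mathbf{Y}_o, \mathbf{N})$

(ii) For each sampled value of $(\beta_o, \alpha_o)$ draw $\lambda_{godt}$ from

$$\text{Gamma}(\lambda_{godt} \mid Y_{godt} + \alpha_o, (\alpha_o / \mu_{godt}) + N_{gdt})$$

where $\mu_{godt} = \exp(X_{gdt} \beta_o)$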

Should be faster than Gibbs sampler
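The two-step scheme can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation used for the results: it uses a random-walk Metropolis sampler on (beta, log alpha) for step (i), and the prior scale `s`, the constant `z0`, the step size and the toy data are all invented. Step (ii) draws each rate from its conditional Gamma distribution.

```python
import math
import random


def log_posterior(beta, log_alpha, cells, s=10.0, z0=1.0):
    """Log posterior of (beta, log alpha) for one outcome, up to a constant.

    cells: list of (y, n, x) = (event count, person-time, covariate row).
    s and z0 are illustrative prior settings, not values from the slides.
    """
    alpha = math.exp(log_alpha)
    lp = sum(-0.5 * (b / s) ** 2 for b in beta)      # N(0, s^2) priors on beta
    lp += math.log(z0) - 2.0 * math.log(z0 + alpha)  # z0 / (z0 + alpha)^2 prior
    lp += log_alpha                                  # Jacobian for sampling log(alpha)
    for y, n, x in cells:
        m = n * math.exp(sum(xi * bi for xi, bi in zip(x, beta)))  # N * mu
        # Negative binomial log-likelihood (rate integrated out)
        lp += (math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
               + alpha * math.log(alpha / (alpha + m))
               + y * math.log(m / (alpha + m)))
    return lp


def sample_rates(cells, n_iter=6000, burn_in=4000, step=0.2, seed=1):
    """Step (i): Metropolis on (beta, log alpha). Step (ii): Gamma draws of the rates."""
    rng = random.Random(seed)
    beta = [0.0] * len(cells[0][2])
    log_alpha = 0.0
    current = log_posterior(beta, log_alpha, cells)
    draws = []
    for it in range(n_iter):
        prop_beta = [b + rng.gauss(0.0, step) for b in beta]
        prop_log_alpha = log_alpha + rng.gauss(0.0, step)
        proposed = log_posterior(prop_beta, prop_log_alpha, cells)
        if rng.random() < math.exp(min(0.0, proposed - current)):
            beta, log_alpha, current = prop_beta, prop_log_alpha, proposed
        if it >= burn_in:
            alpha = math.exp(log_alpha)
            lam = []
            for y, n, x in cells:
                mu = math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
                # random.gammavariate takes (shape, scale); scale = 1 / rate
                lam.append(rng.gammavariate(y + alpha, 1.0 / (alpha / mu + n)))
            draws.append(lam)
    return draws


# Toy data: two cells per drug, covariates = [intercept, drug indicator]
cells = [(2, 100.0, [1.0, 0.0]), (3, 120.0, [1.0, 0.0]),
         (10, 110.0, [1.0, 1.0]), (12, 130.0, [1.0, 1.0])]
draws = sample_rates(cells)
post_mean = [sum(d[j] for d in draws) / len(draws) for j in range(len(cells))]
print(post_mean)
```

Only the low-dimensional pair (beta, alpha) needs MCMC; the rates themselves are drawn independently given that pair, which is why the scheme avoids a Gibbs sampler over every rate.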

Hierarchical Bayes model computation performance

• Burn-in of 4000 for MCMC seems adequate
• Example 1: Cohort of 4531, prescribed one of 4 atypical antipsychotics, 23 outcomes, adjusting for age and sex, took 17 minutes
• Example 2: Cohort of 10,308 children receiving one of 3 vaccines, 12 outcomes, adjusting for age, sex and season, took 27.5 minutes
• Using 64 bit R on a laptop, 4 GB RAM, quad core 2.2 GHz, but with R not optimised for multi-core

Results: Atypical antipsychotics
Ischaemic Heart Disease (IHD)

Posterior probabilities for standardised IHD rates exceeding baseline rates, by drug and period

Drug          Period 1   Period 2   Period 3   Period 4
Clozapine     0.08       0.01       0.00       0.00
Olanzapine    0.05       0.03       0.03       0.03
Quetiapine    0.93       0.89       0.81       0.74
Risperidone   0.00       0.00       0.00       0.00

Posterior probability that each drug has the largest standardised IHD rate, by period

Drug          Period 0   Period 1   Period 2   Period 3
Clozapine     0.16       0.03       <0.01      <0.01
Olanzapine    0.31       0.02       0.03       0.03
Quetiapine    0.02       0.89       0.83       0.94
Risperidone   0.51       0.06       0.14       0.03
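Probabilities of this kind are simple proportions over joint posterior draws. A minimal sketch, assuming draws of the standardised rates are already available and that draws at the same position in each list come from the same joint draw. The function names and the tiny set of draws are hypothetical and are not the study data.

```python
def prob_exceeds(period_draws, baseline_draws):
    """Posterior probability that a rate exceeds its baseline, from paired draws."""
    hits = sum(1 for p, b in zip(period_draws, baseline_draws) if p > b)
    return hits / len(period_draws)


def prob_largest(draws_by_drug):
    """Posterior probability that each drug has the largest rate in a period."""
    names = list(draws_by_drug)
    n_draws = len(draws_by_drug[names[0]])
    wins = {name: 0 for name in names}
    for i in range(n_draws):
        # On an exact tie, max() returns the first drug listed
        best = max(names, key=lambda name: draws_by_drug[name][i])
        wins[best] += 1
    return {name: wins[name] / n_draws for name in names}


# Four hypothetical joint draws
p_exceed = prob_exceeds([0.30, 0.20, 0.50, 0.10], [0.20, 0.25, 0.40, 0.05])
p_largest = prob_largest({"drug_a": [0.1, 0.2, 0.3, 0.1],
                          "drug_b": [0.2, 0.1, 0.4, 0.3]})
print(p_exceed, p_largest)
```

Because exactly one drug is largest in every draw, the probabilities in each period column sum to 1 apart from rounding.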

Results. IVMP – local reactions

Summary

• Signal detection in Pharmacovigilance is an important and interesting area

• “Data mining” approach emphasises large scale computation, simple statistical model, independent analyses, no learning across groups, outcomes, drugs or time-periods.

• “Statistical modelling” approach emphasises a more complex statistical model which permits, for each outcome, learning across groups, drugs and time-periods (while still permitting departures from the model), but is designed for smaller scale computations.

• Can each approach learn from the other?
