
Automatic signal detection in pharmacovigilance
A statistical approach based on multiple comparisons

Ismaïl Ahmed

INSERM UMRS Centre de recherche en Épidémiologie et Santé des Populations - Villejuif

16 November 2011

Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion

Introduction (1)

Pharmacovigilance systems

- aim at the early detection of adverse reactions to marketed drugs
- are based on spontaneous reports

The French pharmacovigilance system

- was set up in 1979
- is currently based on 31 pharmacovigilance centres (CRPV)
- is coordinated by the pharmacovigilance unit of the « Agence Française de Sécurité Sanitaire des Produits de Santé » (Afssaps)


Introduction (2)

Large quantity of data

- In France
  - 2000-2010: 370 000 spontaneous reports
  - More than 30 000 reports per year
- FDA: 2.6 million, WHO: 3.7 million in 2005

Automatic signal detection methods

- Statistical associations within the pharmacovigilance database
- Potential adverse drug reactions


Introduction (3)

Existing methods

- Based on different probability models and decision rules
- Frequentist methods:
  - Reporting Odds Ratio (ROR, Netherlands)
  - Proportional Reporting Ratio (PRR, UK and EudraVigilance)
- Bayesian methods:
  - Gamma Poisson Shrinkage (GPS, USA)
  - Bayesian Confidence Propagation Neural Network (BCPNN, WHO)

France does not use any automatic signal detection method yet


Introduction (4)

Main limitation of the existing methods

- Detection rules are chosen arbitrarily

Main objective

- Decision rule derived from a statistical error criterion
- Accounting for the multiple comparisons
- False discovery rate (FDR)


Outline

Description of the current methods

Extension to the multiple comparison setting

Simulation study

Application to the French pharmacovigilance data

Discussion



Pharmacovigilance data structure and notations

Large contingency table crossing all the drugs and all the adverse events

- French database (ATC5-HLT)
  - 672 classes of drugs (D) × 820 classes of adverse events (AE)
  - 551 040 cells, of which 80% are empty

For a particular pair $(AE_i, D_j)$:

                        Drug j            Other drugs             Total
Adverse event i         $n_{ij}$          $n_{i\bar{j}}$          $n_{i.}$
Other adverse events    $n_{\bar{i}j}$    $n_{\bar{i}\bar{j}}$    $n_{\bar{i}.}$
Total                   $n_{.j}$          $n_{.\bar{j}}$          $n$

- $n_{ij}$: number of reports involving $AE_i$ and $D_j$
- $n_{i.}$: marginal count involving $AE_i$
- $n_{.j}$: marginal count involving $D_j$
- $n$: total count over all AE-D pairs
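As an illustration of the notation, the four cells of the 2 × 2 table for one pair can be recovered from the cell count and the three totals. This is a minimal sketch; the function name `two_by_two` and the example counts are hypothetical, not taken from the French database.

```python
def two_by_two(n_ij, n_i, n_j, n):
    """Return the four cells of the 2 x 2 table for the pair (AE_i, D_j).

    n_ij: reports involving AE_i and D_j
    n_i:  marginal count for AE_i (n_{i.})
    n_j:  marginal count for D_j (n_{.j})
    n:    total count
    """
    ae_other_drugs = n_i - n_ij             # AE_i with the other drugs
    other_ae_drug = n_j - n_ij              # other AEs with D_j
    other_ae_other_drugs = n - n_i - n_j + n_ij
    return n_ij, ae_other_drugs, other_ae_drug, other_ae_other_drugs


# Hypothetical counts, for illustration only.
cells = two_by_two(n_ij=20, n_i=1000, n_j=120, n=100000)
print(cells)
```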


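Reporting Odds Ratio (ROR), van Puijenbroek et al. (2002)

Association measure

- For the adverse event-drug pair $(i, j)$:

$$\psi_{ij} = \frac{n_{ij}\, n_{\bar{i}\bar{j}}}{n_{i\bar{j}}\, n_{\bar{i}j}}$$

Signal generation

- $\ln(\psi_{ij})$ is assumed to follow a normal distribution with variance

$$\mathrm{var}\{\ln(\psi_{ij})\} = \frac{1}{n_{ij}} + \frac{1}{n_{i\bar{j}}} + \frac{1}{n_{\bar{i}j}} + \frac{1}{n_{\bar{i}\bar{j}}}$$

- A signal is generated if

$$\ln(\psi_{ij}) - 1.96\, \mathrm{var}\{\ln(\psi_{ij})\}^{1/2} > 0$$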
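The ROR rule above can be sketched in a few lines. The function name `ror_signal`, its argument names and the example counts are assumptions made for illustration; the sketch also assumes that all four cells are strictly positive.

```python
import math


def ror_signal(n11, n10, n01, n00, z=1.96):
    """ROR, lower bound of ln(ROR), and the resulting signal decision.

    n11: AE i and drug j          n10: AE i and other drugs
    n01: other AEs and drug j     n00: other AEs and other drugs
    All four counts are assumed to be strictly positive.
    """
    log_ror = math.log(n11 * n00 / (n10 * n01))
    std_err = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    lower = log_ror - z * std_err
    return math.exp(log_ror), lower, lower > 0


# Hypothetical counts, for illustration only.
ror, lower, signal = ror_signal(20, 980, 100, 98900)
print(round(ror, 2), signal)
```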


Proportional Reporting Ratio (PRR), Evans et al. (2001)

Association measure

- For the adverse event-drug pair $(i, j)$:

$$\varphi_{ij} = \frac{n_{ij} / n_{i.}}{n_{\bar{i}j} / n_{\bar{i}.}}$$

- very close to $\psi_{ij}$

Signal generation

- Evans et al. (2001): $\varphi_{ij} > 2$, $n_{ij} \geq 3$, and a $\chi^2$ statistic with 1 df $\geq 4$
- van Puijenbroek et al. (2002): same as for the ROR method


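Gamma Poisson Shrinkage (GPS), DuMouchel (1999)

Poisson model with a mixture of two gamma distributions

- $n_{ij} \mid e_{ij}, \lambda_{ij} \sim \mathrm{Pn}(\lambda_{ij} e_{ij})$ with $e_{ij} = n_{i.}\, n_{.j} / n$
- $\lambda_{ij} \sim w\, \mathrm{Ga}(\alpha_1, \beta_1) + (1 - w)\, \mathrm{Ga}(\alpha_2, \beta_2)$, where $(w, \alpha_1, \alpha_2, \beta_1, \beta_2)$ maximizes the marginal likelihood

$$\prod_{ij} \left[ w\, f_{\mathrm{Bn}}\{n_{ij}; \alpha_1, \beta_1/(\beta_1 + e_{ij})\} + (1 - w)\, f_{\mathrm{Bn}}\{n_{ij}; \alpha_2, \beta_2/(\beta_2 + e_{ij})\} \right]$$

Association measure

$$\lambda^*_{ij} = \lambda_{ij} \mid n_{ij}, e_{ij}$$

- $\lambda^*_{ij} \sim w_{ij}\, \mathrm{Ga}(\alpha_1 + n_{ij}, \beta_1 + e_{ij}) + (1 - w_{ij})\, \mathrm{Ga}(\alpha_2 + n_{ij}, \beta_2 + e_{ij})$

Signal generation

- $E\{\log_2(\lambda^*_{ij})\}$: DuMouchel (1999)
- $Q_{0.05}(\lambda^*_{ij})$: DuMouchel et al. (2001)
- $Q_{0.05}(\lambda^*_{ij}) > 2$: Szarfman et al. (2002)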







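Bayesian Confidence Propagation Neural Network (BCPNN) (1)
Bate et al. (1998), Norén et al. (2006)

Multinomial-Dirichlet model

$$(n_{ij}, n_{i\bar{j}}, n_{\bar{i}j}, n_{\bar{i}\bar{j}}) \sim \mathrm{Mu}(n, p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$$

with $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}}) \sim \mathrm{Di}(\alpha_{ij}, \alpha_{i\bar{j}}, \alpha_{\bar{i}j}, \alpha_{\bar{i}\bar{j}})$

- The hyperparameters depend on the cell counts
- The posterior distribution of $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$ is also a Dirichlet:

$$(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})^* \sim \mathrm{Di}(\gamma_{ij}, \gamma_{i\bar{j}}, \gamma_{\bar{i}j}, \gamma_{\bar{i}\bar{j}}) \quad \text{with } \gamma_{kl} = \alpha_{kl} + n_{kl}$$

- In particular:

$$p^*_{ij} \sim \mathrm{Be}(\gamma_{ij},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$$

$$p^*_{i.} = p^*_{ij} + p^*_{i\bar{j}} \sim \mathrm{Be}(\gamma_{ij} + \gamma_{i\bar{j}},\ \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$$

$$p^*_{.j} = p^*_{ij} + p^*_{\bar{i}j} \sim \mathrm{Be}(\gamma_{ij} + \gamma_{\bar{i}j},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}\bar{j}})$$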






Association measure

$$IC^*_{ij} = \log_2 \left( \frac{p^*_{ij}}{p^*_{i.}\, p^*_{.j}} \right)$$

Ratio of beta distributions ⟹ no analytic form

Signal generation

$$Q_{0.025}(IC^*_{ij}) > 0$$

- Normal approximation
  - delta method: Bate et al. (1998)
  - exact moments: Gould (2003)
- Interpolation model built from Monte Carlo simulations: Norén et al. (2006)
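Since the posterior of $IC^*_{ij}$ has no analytic form, one simple way to approximate its quantiles is plain Monte Carlo sampling from the posterior Dirichlet. The sketch below is not the interpolation model of Norén et al. (2006); the function names and the posterior parameters used in the example are hypothetical.

```python
import math
import random


def ic_posterior_draws(g11, g10, g01, g00, n_draws=2000, seed=0):
    """Draw IC* = log2(p11 / (p1. p.1)) from a Dirichlet(g11, g10, g01, g00) posterior.

    A Dirichlet draw is obtained by normalising independent gamma variates.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x = [rng.gammavariate(g, 1.0) for g in (g11, g10, g01, g00)]
        total = sum(x)
        p11, p10, p01, _ = (v / total for v in x)
        draws.append(math.log2(p11 / ((p11 + p10) * (p11 + p01))))
    return draws


def quantile(values, q):
    """Crude empirical quantile (no interpolation)."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


# Hypothetical posterior parameters gamma_kl, for illustration only.
draws = ic_posterior_draws(21, 981, 101, 98901)
signal = quantile(draws, 0.025) > 0
print(signal)
```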






False Discovery Rate and Pharmacovigilance

- Automatic signal detection methods are data mining tools
- Extension to the hypothesis testing framework
  - relying on recent developments in the field of multiple comparisons
  - detection thresholds based on statistical criteria
- False Discovery Rate (Benjamini and Hochberg (1995))
  - E(proportion of false discoveries among the generated signals)
  - used in genomic data analysis
  - suited to massive comparisons and exploratory analysis


Frequentist methods : Proposed approach (1)

Use of the P-values as the statistic of interest

ROR

- For each cell, we want to test $H_{0ij}: \psi_{ij} \leq \psi_0$
- The corresponding P-values are

$$p_{ij} = 1 - \Phi\left( \frac{\ln(\psi_{ij}) - \ln(\psi_0)}{\mathrm{var}[\ln(\psi_{ij})]^{1/2}} \right)$$

  where $\Phi$ denotes the standard normal cdf
- The current decision rule corresponds to choosing $\psi_0 = 1$ and generating signals for cells with $p_{ij} \leq 0.025$

PRR

- $H_{0ij}: \varphi_{ij} \leq \varphi_0$
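The one-sided P-value for the ROR can be computed with the standard library alone. This is a sketch under the normal approximation stated above; the function name `ror_p_value` and the counts are hypothetical, and all four cells are assumed to be strictly positive.

```python
import math
from statistics import NormalDist


def ror_p_value(n11, n10, n01, n00, psi0=1.0):
    """One-sided P-value for H0: psi <= psi0, using the normal approximation of ln(ROR)."""
    log_ror = math.log(n11 * n00 / (n10 * n01))
    std_err = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    z = (log_ror - math.log(psi0)) / std_err
    return 1.0 - NormalDist().cdf(z)


# Hypothetical counts; a larger psi0 gives a larger P-value for the same cell.
for psi0 in (1.0, 2.0, 5.0):
    print(psi0, ror_p_value(20, 980, 100, 98900, psi0=psi0))
```

With $\psi_0 = 1$, flagging the cells with a P-value of at most 0.025 matches the 1.96 rule up to the rounding of the normal quantile.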



Fisher's exact test

- Simple and exact alternative to ROR and PRR
- Large proportion of cells with small counts
- P-values: $\Pr(N_{ij} \geq n_{ij} \mid n_{i.}, n_{.j}, n; \psi_0)$
- Fisher's exact test is known to be conservative
- mid-P-values: the P-value minus half the probability of the observed count,

$$\Pr(N_{ij} \geq n_{ij} \mid n_{i.}, n_{.j}, n; \psi_0) - \tfrac{1}{2} \Pr(N_{ij} = n_{ij} \mid n_{i.}, n_{.j}, n; \psi_0)$$
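For the special case $\psi_0 = 1$, the conditional distribution of $N_{ij}$ is the ordinary hypergeometric distribution, so both P-values can be sketched directly. Values $\psi_0 \neq 1$ would require Fisher's noncentral hypergeometric distribution, which this sketch does not cover. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_p_values(n_ij, n_i, n_j, n):
    """One-sided Fisher P-value and mid-P-value under psi0 = 1.

    n_ij: observed count; n_i: AE margin n_{i.}; n_j: drug margin n_{.j}; n: total.
    """
    denom = comb(n, n_j)

    def pmf(k):
        # Hypergeometric probability of k reports in the cell, given the margins.
        return comb(n_i, k) * comb(n - n_i, n_j - k) / denom

    upper = min(n_i, n_j)
    p_value = sum(pmf(k) for k in range(n_ij, upper + 1))
    mid_p = p_value - 0.5 * pmf(n_ij)
    return p_value, mid_p


# Hypothetical small table, for illustration only.
p_value, mid_p = fisher_p_values(n_ij=3, n_i=10, n_j=5, n=100)
print(p_value, mid_p)
```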



Marginal distribution of the P-values

P-values are assumed to follow a mixture of two distributions

$$f(p) = \pi_0 f_0(p) + (1 - \pi_0) f_1(p)$$

- $f_0(p)$ is the pdf of $p$ under the null hypothesis
- $f_1(p)$ is the pdf of $p$ under the alternative hypothesis

FDR, Storey et al. (2002)

For a P-value rejection region $[0, \gamma]$ with $\gamma \in\, ]0, 1]$

$$\mathrm{FDR}(\gamma) = \frac{\pi_0 F_0(\gamma)}{F(\gamma)}$$



FDR estimation: single null hypothesis (not our case)

- The P-values are uniformly distributed under the null hypothesis

$$f(p) = \pi_0 f_0(p) + (1 - \pi_0) f_1(p) = \pi_0 + (1 - \pi_0) f_1(p)$$

- For a P-value rejection region $[0, \gamma]$ with $\gamma \in\, ]0, 1]$

$$\mathrm{FDR}(\gamma) = \frac{\pi_0 F_0(\gamma)}{F(\gamma)} = \frac{\pi_0 \gamma}{F(\gamma)}$$

- $F$ is estimated by its empirical estimate: $\hat{F}(\gamma) = \frac{1}{m} \sum_{ij} 1(p_{ij} \leq \gamma)$
- The main difficulty is to estimate $\pi_0$
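With the empirical cdf plugged in, the estimate reduces to $\pi_0 \gamma m$ divided by the number of rejected cells. The sketch below assumes an estimate of $\pi_0$ is supplied ($\pi_0 = 1$ gives a conservative bound); returning 0 when nothing is rejected and capping the estimate at 1 are conventions chosen for this example, as are the function name and the P-values.

```python
def fdr_estimate(p_values, gamma, pi0=1.0):
    """Estimate FDR(gamma) = pi0 * gamma / F(gamma) with F the empirical cdf."""
    m = len(p_values)
    n_rejected = sum(p <= gamma for p in p_values)
    if n_rejected == 0:
        return 0.0  # convention: no signal, no false discovery
    return min(1.0, pi0 * gamma * m / n_rejected)


# Hypothetical P-values, for illustration only.
p_values = [0.001, 0.002, 0.01, 0.2, 0.5, 0.7, 0.9, 0.95]
print(fdr_estimate(p_values, gamma=0.01))
```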



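Location Based Estimator (LBE), Dalmasso et al. (2005)

$$E[\varphi(P)] = \pi_0 E_0[\varphi(P)] + (1 - \pi_0) E_1[\varphi(P)]$$

$$\frac{E[\varphi(P)]}{E_0[\varphi(P)]} = \pi_0 + (1 - \pi_0) \frac{E_1[\varphi(P)]}{E_0[\varphi(P)]}$$

where the last term is non-negative and constitutes the bias.

- $\varphi$ is chosen to minimize the non-negative term: $\varphi(p) = [-\ln(1 - p)]^a$

$$\hat{\pi}_0 = \frac{\hat{E}[\varphi(P)]}{E_0[\varphi(P)]} = \frac{\frac{1}{m} \sum_{ij} \varphi(p_{ij})}{E_0[\varphi(P)]} = \frac{\frac{1}{m} \sum_{ij} \{-\ln(1 - p_{ij})\}^a}{\Gamma(a + 1)}$$

- LBE only assumes that $f$ is a non-increasing function
- $\hat{\pi}_0$ is a biased estimator ⟹ upper bound for the FDR estimation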
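The LBE formula can be sketched as follows. The default `a=1.0`, the cap at 1 and the requirement that every P-value be strictly below 1 are assumptions of this example; the choice of $a$ is discussed in Dalmasso et al. (2005) and is not reproduced here.

```python
import math


def lbe_pi0(p_values, a=1.0):
    """LBE of pi0: mean of (-ln(1 - p))^a divided by Gamma(a + 1).

    Under a uniform null, -ln(1 - P) is Exp(1), so E0[phi(P)] = Gamma(a + 1).
    All P-values are assumed to be strictly below 1.
    """
    m = len(p_values)
    mean_phi = sum((-math.log(1.0 - p)) ** a for p in p_values) / m
    return min(1.0, mean_phi / math.gamma(a + 1.0))


# Illustration: half of the P-values near 0 (signals), half spread uniformly (nulls).
grid = [(k - 0.5) / 5000 for k in range(1, 5001)]
print(lbe_pi0([1e-6] * 5000 + grid))
```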



FDR estimation: one-sided null hypothesis

- The P-values are not calculated under $H_{0ij}: \psi_{ij} \leq \psi_0$ but under $H^*_{0ij}: \psi_{ij} = \psi_0$
- The P-values are not uniformly distributed under $H_{0ij}$
- $f_0$ is expressed as a mixture of a uniform density $f_{0^*}$ and a non-decreasing function $f_{1^*}$

[Figure: density $f(p)$ of the P-values plotted against $p$ on [0, 1]]

$$f(p) = \pi_0 f_0(p) + (1 - \pi_0) f_1(p) = \pi_0 \{\pi_{0^*} + (1 - \pi_{0^*}) f_{1^*}(p)\} + (1 - \pi_0) f_1(p)$$

- In practice, only small values of $\gamma$ are of interest:

$$\mathrm{FDR}(\gamma) = \frac{\pi_0 F_0(\gamma)}{F(\gamma)} = \frac{\pi_0 \{\pi_{0^*} \gamma + (1 - \pi_{0^*}) F_{1^*}(\gamma)\}}{F(\gamma)} \approx \frac{\pi_0 \pi_{0^*} \gamma}{F(\gamma)}$$

- The issue is thus to estimate $\pi_0 \pi_{0^*}$
- LBE is applied to $P^* = 1 - 2|P - \tfrac{1}{2}|$, which has a non-increasing pdf expressed as

$$f_{P^*}(p^*) = \pi_0 \pi_{0^*} + (1 - \pi_0 \pi_{0^*}) f_{1P^*}(p^*)$$
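The folding transformation and the approximate FDR for small $\gamma$ can be sketched as below. The estimate of $\pi_0 \pi_{0^*}$ is taken as an input here (in the approach above it would come from LBE applied to the folded P-values); function names and numbers are hypothetical.

```python
def fold_p_values(p_values):
    """P* = 1 - 2|P - 1/2|: P-values close to 0 or to 1 are both mapped close to 0."""
    return [1.0 - 2.0 * abs(p - 0.5) for p in p_values]


def one_sided_fdr(p_values, gamma, pi0_product):
    """Approximate FDR(gamma) by pi0_product * gamma / F(gamma), F being the empirical cdf.

    pi0_product stands for an estimate of pi0 * pi0*, supplied by the caller.
    """
    n_rejected = sum(p <= gamma for p in p_values)
    if n_rejected == 0:
        return 0.0  # convention: no signal, no false discovery
    return min(1.0, pi0_product * gamma * len(p_values) / n_rejected)


# Hypothetical values, for illustration only.
print(fold_p_values([0.1, 0.5, 0.9]))
print(one_sided_fdr([0.001, 0.004, 0.3, 0.6, 0.99], gamma=0.005, pi0_product=0.5))
```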


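Bayesian methods: Proposed approach (1)
Based on the Bayesian decision theory framework, Müller et al. (2004)

- Status: $z_{ij} \in \{0, 1\}$
- Decision: $d_{ij} \in \{0, 1\}$

FDR

$$\mathrm{FDP} = \frac{\sum_{ij} (1 - z_{ij})\, d_{ij}}{\sum_{ij} d_{ij}} \longrightarrow \mathrm{FDR} = E[\mathrm{FDP}]$$

Bayesian FDR estimation: $\mathrm{FDR}^*$

$$\mathrm{FDR}^* = E[\mathrm{FDP} \mid \text{data}] = \frac{\sum_{ij} u_{ij}\, d_{ij}}{\sum_{ij} d_{ij}}$$

where $u_{ij} = \Pr(z_{ij} = 0 \mid \text{data})$, i.e. the posterior probability of $H_{0ij}: z_{ij} = 0$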
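The Bayesian FDR estimate is the mean of the posterior null probabilities over the flagged cells. The sketch below also shows one possible way to choose the decisions for a target FDR, by flagging cells in increasing order of $u_{ij}$ while the running mean stays below the target; this selection rule and the names used are assumptions of the example.

```python
def bayesian_fdr(u, d):
    """FDR* = sum(u_ij * d_ij) / sum(d_ij) for posterior null probabilities u and decisions d."""
    n_signals = sum(d)
    if n_signals == 0:
        return 0.0  # convention: no signal, no false discovery
    return sum(ui * di for ui, di in zip(u, d)) / n_signals


def select_signals(u, target_fdr=0.05):
    """Flag cells in increasing order of u while the estimated FDR stays <= target_fdr."""
    order = sorted(range(len(u)), key=lambda k: u[k])
    d = [0] * len(u)
    running = 0.0
    for rank, k in enumerate(order, start=1):
        running += u[k]
        if running / rank > target_fdr:
            break  # the running mean is non-decreasing, so no later cell can qualify
        d[k] = 1
    return d


# Hypothetical posterior null probabilities, for illustration only.
u = [0.01, 0.5, 0.02, 0.9, 0.03]
d = select_signals(u)
print(d, bayesian_fdr(u, d))
```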


Bayesian methods: Proposed approach (2)

Status $z_{ij}$ and posterior probability $u_{ij}$

- For each cell, we want to test
  - GPS: $H_{0ij}: \lambda_{ij} \leq RR_0$
  - BCPNN: $H_{0ij}: \dfrac{p_{ij}}{p_{i.}\, p_{.j}} \leq RR_0$
- and thus to calculate $u_{ij}$, i.e. the posterior probability of $H_{0ij}$
  - GPS:

$$\Pr(\lambda^*_{ij} \leq RR_0) = w_{ij}\, F_{\mathrm{Ga}}(RR_0; \alpha_1 + n_{ij}, \beta_1 + e_{ij}) + (1 - w_{ij})\, F_{\mathrm{Ga}}(RR_0; \alpha_2 + n_{ij}, \beta_2 + e_{ij})$$

  - BCPNN: $\Pr(IC^*_{ij} \leq \log_2(RR_0))$ has no analytic form → Monte Carlo simulations
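For GPS, $u_{ij}$ is a weighted sum of two gamma cdfs. The sketch below uses only the standard library, with a series expansion of the regularised incomplete gamma function that is adequate for moderate values of the rate times $RR_0$. The posterior weight `w_post` and the hyperparameters are taken as inputs, and the numerical values are hypothetical.

```python
import math


def gamma_cdf(x, shape, rate):
    """Pr(X <= x) for X ~ Ga(shape, rate), via the series of the lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    t = rate * x
    term = 1.0 / shape
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= t / (shape + k)
        total += term
        k += 1
    return min(1.0, total * math.exp(shape * math.log(t) - t - math.lgamma(shape)))


def gps_null_probability(n_ij, e_ij, w_post, a1, b1, a2, b2, rr0=1.0):
    """u_ij = Pr(lambda* <= RR0) under the two-component gamma posterior mixture."""
    return (w_post * gamma_cdf(rr0, a1 + n_ij, b1 + e_ij)
            + (1.0 - w_post) * gamma_cdf(rr0, a2 + n_ij, b2 + e_ij))


# Hypothetical hyperparameters and posterior weight, for illustration only.
u = gps_null_probability(n_ij=10, e_ij=2.0, w_post=0.3,
                         a1=0.2, b1=0.1, a2=2.0, b2=4.0, rr0=2.0)
print(u)
```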


Bayesian methods: Proposed approach (3)

Decision rule $d_{ij}$

- GPS
  - DuMouchel (1999): $d_{ij} = 1[E\{\log_2(\lambda^*_{ij})\} > \tau]$
  - DuMouchel et al. (2001): $d_{ij} = 1[Q_{0.05}(\lambda^*_{ij}) > \tau]$
  - Our suggestion: $d_{ij} = 1[u_{ij} \leq \alpha]$
  - Szarfman et al. (2002): $RR_0 = 2$ and $\alpha = 0.05$
- BCPNN
  - Our suggestion: $d_{ij} = 1[u_{ij} \leq \alpha]$
  - Bate et al. (1998): $RR_0 = 1$ and $\alpha = 0.025$






Simulation study

Objectives

- To compare the performance of the methods
- To evaluate the quality of the FDR estimators

Methods

- Frequentist methods:
  - ROR
  - RFET (new)
  - midRFET (new)
- Bayesian methods:
  - GPS: $E\{\log_2(\lambda^*_{ij})\}$, $Q_{0.05}(\lambda^*_{ij})$, $\Pr(H^*_{0ij})$
  - BCPNN: $\Pr(H^*_{0ij})$
- 3 different tested hypotheses, based on $\{\psi_0, RR_0\}$ = 1, 2 and 5


Simulation study

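Data generation

- Model: $n_{ij} \sim \mathrm{Mu}(n, p_{ij})$, with margins taken from the French database
- $p_{ij}$:

$$p^w_{i.} \sim \mathrm{Di}(n_{i.}), \quad p^w_{.j} \sim \mathrm{Di}(n_{.j}) \implies p_{ij} = \frac{r^w_{ij}\, p^w_{i.}\, p^w_{.j}}{\sum_{ij} r^w_{ij}\, p^w_{i.}\, p^w_{.j}}$$

$$\log(r^w_{ij}) \sim \mathrm{Lo}(0, 0.5)$$

- From the $p_{ij}$:
  - the $n_{ij}$ and the true marginal probabilities $p_{i.} = \sum_j p_{ij}$ and $p_{.j} = \sum_i p_{ij}$
  - the real status of the cells, according to
    • $\psi_{ij}$ and $\psi_0$ for the frequentist methods
    • $RR_{ij} = \dfrac{p_{ij}}{p_{i.}\, p_{.j}}$ and $RR_0$ for the Bayesian methods

Simulation plan

- 500 simulated datasets
- « True » FDR estimated by the average of the FDPs over the 500 datasets
- $n_{ij} \geq 3$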
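The evaluation step of the simulation plan, averaging the false discovery proportion over simulated datasets, can be sketched as follows. The data generation itself is not reproduced; the true statuses `z` and decisions `d` are assumed to be available for each dataset, and the function names are hypothetical.

```python
def false_discovery_proportion(z, d):
    """FDP = number of flagged cells that are true nulls (z = 0) / number of flagged cells."""
    n_signals = sum(d)
    if n_signals == 0:
        return 0.0  # convention: no signal, no false discovery
    return sum((1 - zi) * di for zi, di in zip(z, d)) / n_signals


def average_fdp(datasets):
    """Average the FDP over a list of (z, d) pairs, one pair per simulated dataset."""
    fdps = [false_discovery_proportion(z, d) for z, d in datasets]
    return sum(fdps) / len(fdps)


# Two tiny hypothetical datasets, for illustration only.
datasets = [([1, 0, 1, 0], [1, 1, 0, 0]), ([1, 1], [1, 1])]
print(average_fdp(datasets))
```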


Simulation study: Comparison of the frequentist methods

FDR and estimation

[Figure: FDR and its estimate (y-axis, 0.00 to 0.20) against the average number of generated signals (x-axis) for ROR, RFET and midRFET. One row of panels per null value: $\psi_0 = 1$ (0 to 20 000 signals), $\psi_0 = 2$ (0 to 7 000 signals), $\psi_0 = 5$ (0 to 1 500 signals).]


Simulation study: Comparison of the Bayesian methods (1)

GPS and decision rules

[Figure: FDR (y-axis, 0.00 to 0.10) against the number of generated signals (x-axis) for the three GPS decision rules $E(\lambda^*)$, $Q_{0.05}(\lambda^*)$ and $\Pr(H^*_0)$. Panels: $RR_0 = 1$ (0 to 20 000 signals), $RR_0 = 2$ (0 to 7 000 signals), $RR_0 = 5$ (0 to 1 500 signals).]

- The proposed decision rule gives better results according to the FDR
- $Q_{0.05}(\lambda^*_{ij})$ and $\Pr(H^*_{0ij})$ perform similarly for small values of the FDR


GPS vs BCPNN with $\Pr(H^*_{0ij})$

FDR and estimation

[Figure: FDR and its estimate (y-axis, 0.00 to 0.10) against the number of generated signals (x-axis); solid line: GPS, dashed line: BCPNN. Panels: $RR_0 = 1$ (0 to 20 000 signals), $RR_0 = 2$ (0 to 7 000 signals), $RR_0 = 5$ (0 to 1 500 signals).]

- Very similar performance according to the FDR
- Large differences in the FDR estimation


Global Comparison

[Figure: FDR curves (y-axis, 0.00 to 0.10) against the number of generated signals (x-axis) for midRFET, GPS and BCPNN. Panels: $\psi_0 = 1$, $RR_0 = 1$ (0 to 20 000 signals); $\psi_0 = 2$, $RR_0 = 2$ (0 to 7 000 signals); $\psi_0 = 5$, $RR_0 = 5$ (0 to 1 500 signals).]

- Large differences in FDR estimation

Results in favor of the use of the GPS model with $\Pr(H^*_{0ij})$






Global analysis (1)

- 1984-2003
- $n_{ij} \geq 3$ → 47 520 cells

FDR estimation

[Figure: estimated FDR (y-axis, 0.00 to 0.10) against the number of generated signals (x-axis) for midRFET, ROR, BCPNN and GPS. Panels: $\psi_0 = 1$, $RR_0 = 1$ (0 to 20 000 signals); $\psi_0 = 2$, $RR_0 = 2$ (0 to 7 000 signals); $\psi_0 = 5$, $RR_0 = 5$ (0 to 2 000 signals).]

- Results close to those obtained in the simulation study


Feasibility Study

Afssaps is interested in implementing an automatic signal detection tool in France

Objective

A feasibility study was recently conducted to assess two points

- Relevance of the signals generated by the method
- Time necessary for the evaluation of the signals


Feasibility Study

Methodology

- Use of GPSpH0 with the FDR set at 5%
- A first analysis was performed on the data from 1 January 2000 to 31 December 2008
- A second analysis was performed three months later (1 January 2000 to 31 March 2009)
- Selection of the 1414 signals that were highlighted by the second analysis but not by the first one
- These signals were dispatched among the 31 regional pharmacovigilance centres and the pharmacovigilance department at Afssaps (about 40 signals per centre)


Feasibility Study

Methodology

The centres were given one month to

- categorize the signals as
  - known/nothing further
  - known/further follow up
  - unknown/further follow up
  - unknown/outlier
  - no time to assess
- evaluate the time necessary for the analysis


Feasibility Study

Results

- 28 centres participated (sent back the questionnaire) ⟹ 1294 signals
- 1170 signals were categorised:
  - 35.7% known/nothing further
  - 6.9% known/further follow up
  - 16.8% unknown/further follow up
  - 36.6% unknown/outlier
  - 4% no time to assess
- Large variability in the time necessary for the analysis: between 2 and 26 hours per centre, median 6 hours

Conclusion

- 277 signals worth following up
- At least 60% of relevant signals
- Lack of precision was often invoked for the outliers (signals too vague) ⟹ need to work on the coding dictionaries






Discussion (1)

Extension of the existing methods to the multiple comparison framework

- No modification of the model
- New decision rules

The GPS model with $\Pr(H^*_{0ij})$ provides the best results

- in the simulation study
- in the sequential evaluation study


Discussion (2)

Limitations of the GPS model

- Data represented as a contingency table
  - Common to all the methods currently used
  - co-prescription

Use of penalized regressions

- Logistic regression in which
  - one adverse event is studied at a time
  - several hundred predictors: the drugs (coded 1/0)
  - several hundred thousand (or millions of) individuals
- Lasso-type algorithm / Stability Selection (Meinshausen & Bühlmann, JRSS B 2010)
- Much more computationally intensive
- Should make it possible to better account for the correlation structure between the drugs
  - co-prescription
- Direct extension to the study of drug-drug interactions


References

- I Ahmed, C Dalmasso, F Haramburu, F Thiessard, P Broët, and P Tubert-Bitter. False discovery rate estimation for frequentist pharmacovigilance signal detection methods. Biometrics, 66(1):301-309, March 2010. PMID: 19432790.
- I Ahmed, F Thiessard, G Miremont-Salamé, B Bégaud, and P Tubert-Bitter. Pharmacovigilance data mining with methods based on false discovery rates: a comparative simulation study. Clinical Pharmacology and Therapeutics, 88(4):492-498, October 2010. PMID: 20811349.
- I Ahmed, F Haramburu, A Fourrier-Réglat, F Thiessard, C Kreft-Jais, G Miremont-Salamé, B Bégaud, and P Tubert-Bitter. Bayesian pharmacovigilance signal detection methods revisited in a multiple comparison setting. Statistics in Medicine, 28(13):1774-1792, June 2009. PMID: 19360795.
