Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Detection automatique de signaux en pharmacovigilanceApproche statistique fondee sur les comparaisons multiples
Ismaıl Ahmed
INSERM UMRS Centre de recherche en Epidemiologie et Sante des Populations - Villejuif
16 novembre 2011
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Introduction (1)
Pharmacovigilance systems
I aim at early detection of adverse drug reaction of marketed drugs
I based on spontaneous reports
The French pharmacovigilance system
I set up in 1979
I currently based on 31 pharmacovigilance centres (CRPV)
I coordinated by the pharmacovigilance unit of the « Agence Francaise de SecuriteSanitaire des Produits de Sante » (Afssaps)
2 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Introduction (2)
Large quantity of data
I In France
2000 - 2010 : 370 000 spontaneous reports
More than 30 000 reports per year
I FDA : 2.6 million, WHO : 3.7 million in 2005
Automatic signal detection methods
I Statistical associations within the pharmacovigilance database
I Potential adverse drug reactions
3 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Introduction (3)
Existing methods
I Based on different probability models and decision rules
I Frequentist methods :
Reporting Odds Ratio (ROR, Netherlands)
Proportional Reporting Ratio (PRR, UK and EudraVigilance)
I Bayesian methods :
Gamma Poisson Shrinkage (GPS, USA)
Bayesian Confidence Propagation Neural Network (BCPNN, WHO)
France does not use any automatic signal detection method yet
4 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Introduction (4)
Main limit of the existing methods
I Detection rules arbitrarily chosen
Main objective
I Decision rule derived from a statistical error criteria
I Accounting for the multiple comparisons
I False discovery rate (FDR)
5 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Outline
Description of the current methods
Extension to the multiple comparison setting
Simulation study
Application to the French pharmacovigilance data
Discussion
6 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Pharmacovigilance data structure and notations
Large contingency table crossing all the drugs and all the adverse events
I French database (ATC5-HLT)
672 classes of drugs (D) × 820 classes of adverse events (AE)
551 040 cells of which 80% are empty
For a particular couple (AEi , Dj )
Drug j Other DrugsAdverse event i nij ni j ni.
Other adverse events ni j ni j ni.
n.j n.j n
I nij : Number of reports involving AEi and Dj
I ni. : Marginal count involving AEi
I n.j : Marginal count involving Dj
I n : Total number of AE-D pairs counts
7 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Reporting Odds Ratio (ROR) van Puijenbroek et al. (2002)
Association measure
I For the adverse event-drug pair (i , j )
ψij =nijni j
ni jni j
Signal generation
I ln(ψij ) is assumed to follow a normal distribution with variance :
var{ln(ψij )} =1
nij+
1
ni j
+1
ni j
+1
ni j
I A signal is generated if
ln(ψij )− 1.96 var{ln(ψij )}1/2 > 0
8 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Proportional Reporting Ratio (PRR) Evans et al. (2001)
Association measure
I For the adverse event-drug pair (i , j )
ϕij =nij/ni.
ni j/ni.
I very close to ψij
Signal generation
I Evans et al. (2001)ϕij > 2nij ≥ 3A χ2 statistic with 1 df ≥ 4
I van Puijenbroek et al. (2002) : same as for the ROR method
9 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Gamma Poisson Shrinkage (GPS) DuMouchel (1999)
Poisson - 2 gamma mixture model
I nij |eij , λij ∼ Pn(λij eij ) avec eij =ni. n.j
n
I λij ∼ w Ga(α1, β1) + (1− w) Ga(α2, β2)
where (w , α1, α2, β1, β2) maximizes the marginal likelihood∏ij
[w fBn{nij ;α1, β1/(β1 + eij )}+ (1− w) fBn{nij ;α2, β2/(β2 + eij )}
]Association measure
λ∗ij = λij |nij , eij
I λ∗ij ∼ wij Ga(α1 + nij , β1 + eij ) + (1− wij ) Ga(α2 + nij , β2 + eij )
Signal generation
I E{log2(λ∗ij )} DuMouchel (1999)
I Q0.05(λ∗ij ) DuMouchel et al. (2001)
I Q0.05(λ∗ij ) > 2 Szarfman et al. (2002)
10 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Gamma Poisson Shrinkage (GPS) DuMouchel (1999)
Poisson - 2 gamma mixture model
I nij |eij , λij ∼ Pn(λij eij ) avec eij =ni. n.j
n
I λij ∼ w Ga(α1, β1) + (1− w) Ga(α2, β2)
where (w , α1, α2, β1, β2) maximizes the marginal likelihood∏ij
[w fBn{nij ;α1, β1/(β1 + eij )}+ (1− w) fBn{nij ;α2, β2/(β2 + eij )}
]Association measure
λ∗ij = λij |nij , eij
I λ∗ij ∼ wij Ga(α1 + nij , β1 + eij ) + (1− wij ) Ga(α2 + nij , β2 + eij )
Signal generation
I E{log2(λ∗ij )} DuMouchel (1999)
I Q0.05(λ∗ij ) DuMouchel et al. (2001)
I Q0.05(λ∗ij ) > 2 Szarfman et al. (2002)
10 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Gamma Poisson Shrinkage (GPS) DuMouchel (1999)
Poisson - 2 gamma mixture model
I nij |eij , λij ∼ Pn(λij eij ) avec eij =ni. n.j
n
I λij ∼ w Ga(α1, β1) + (1− w) Ga(α2, β2)
where (w , α1, α2, β1, β2) maximizes the marginal likelihood∏ij
[w fBn{nij ;α1, β1/(β1 + eij )}+ (1− w) fBn{nij ;α2, β2/(β2 + eij )}
]Association measure
λ∗ij = λij |nij , eij
I λ∗ij ∼ wij Ga(α1 + nij , β1 + eij ) + (1− wij ) Ga(α2 + nij , β2 + eij )
Signal generation
I E{log2(λ∗ij )} DuMouchel (1999)
I Q0.05(λ∗ij ) DuMouchel et al. (2001)
I Q0.05(λ∗ij ) > 2 Szarfman et al. (2002)
10 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Gamma Poisson Shrinkage (GPS) DuMouchel (1999)
Poisson - 2 gamma mixture model
I nij |eij , λij ∼ Pn(λij eij ) avec eij =ni. n.j
n
I λij ∼ w Ga(α1, β1) + (1− w) Ga(α2, β2)
where (w , α1, α2, β1, β2) maximizes the marginal likelihood∏ij
[w fBn{nij ;α1, β1/(β1 + eij )}+ (1− w) fBn{nij ;α2, β2/(β2 + eij )}
]Association measure
λ∗ij = λij |nij , eij
I λ∗ij ∼ wij Ga(α1 + nij , β1 + eij ) + (1− wij ) Ga(α2 + nij , β2 + eij )
Signal generation
I E{log2(λ∗ij )} DuMouchel (1999)
I Q0.05(λ∗ij ) DuMouchel et al. (2001)
I Q0.05(λ∗ij ) > 2 Szarfman et al. (2002)
10 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian Confidence Propagation Neural Network (BCPNN) (1)Bate et al. (1998), Noren et al. (2006)
Multinomial-Dirichlet model
(nij ,ni j ,ni j ,ni j ) ∼ Mu(n, pij , pi j , pi j , pi j )
with (pij , pi j , pi j , pi j ) ∼ Di(αij , αi j , αi j , αi j )
I The hyperparameters depend on the cell counts
I The posterior distribution of (pij , pi j , pi j , pi j ) is also a Dirichlet :
(pij , pi j , pi j , pi j )∗ ∼ Di(γij , γi j , γi j , γi j )
with γkl = αkl + nkl
I In particular
p∗ij ∼ Be(γij , γi j + γi j + γi j )
p∗i. = p∗
ij + p∗i j ∼ Be(γij + γi j , γi j + γi j )
p∗.j = p∗
ij + p∗i j ∼ Be(γij + γi j , γi j + γi j )
11 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian Confidence Propagation Neural Network (BCPNN) (1)Bate et al. (1998), Noren et al. (2006)
Multinomial-Dirichlet model
(nij ,ni j ,ni j ,ni j ) ∼ Mu(n, pij , pi j , pi j , pi j )
with (pij , pi j , pi j , pi j ) ∼ Di(αij , αi j , αi j , αi j )
I The hyperparameters depend on the cell counts
I The posterior distribution of (pij , pi j , pi j , pi j ) is also a Dirichlet :
(pij , pi j , pi j , pi j )∗ ∼ Di(γij , γi j , γi j , γi j )
with γkl = αkl + nkl
I In particular
p∗ij ∼ Be(γij , γi j + γi j + γi j )
p∗i. = p∗
ij + p∗i j ∼ Be(γij + γi j , γi j + γi j )
p∗.j = p∗
ij + p∗i j ∼ Be(γij + γi j , γi j + γi j )
11 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian Confidence Propagation Neural Network (BCPNN) (2)Bate et al. (1998), Noren et al. (2006)
Association measure
IC∗ij = log2
(p∗ij
p∗i. p
∗.j
)Ratio of beta distributions ⇒ No analytic form
Signal generation
Q0.025(IC∗ij ) > 0
I Normal approximationdelta method : Bate et al. (1998)exact moments : Gould (2003)
I Interpolation model built from Monte Carlo simulations : Noren et al. (2006)
12 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Description of the current methods
Extension to the multiple comparison setting
Simulation study
Application to the French pharmacovigilance data
Discussion
13 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
False Discovery Rate and Pharmacovigilance
I Automatic signal detection methods are data mining tools
I Extension to the hypothesis testing framework
relying on the recent developments in multiple comparison statistical field
detection thresholds rule based on statistical criteria
I False Discovery Rate (Benjamini and Hochberg (1995))
E(proportion of false discoveries among the generated signals)
used in the genomic data analysis
adapted to massive comparisons and exploratory analysis
14 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (1)
Use of the P-values as statistic of interest
ROR
I For each cell, we want to test H0ij : ψij ≤ ψ0
I The corresponding P-values
pij = 1− Φ
(ln(ψij )− ln(ψ0)
var[ln(ψij )]1/2
)where Φ denotes the standard normal cdf
I The current decision rule corresponds to
choose ψ0 = 1 andgenerate signals for cells with pij ≤ 0.025
PRR
I H0ij : ϕij ≤ ϕ0
15 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (2)
Fisher’s exact test
I Simple and exact alternative to ROR and PRR
I Large proportion of cells with small counts
I P-values : Pr(Nij ≥ nij |ni.,n.j ,n;ψ0)
I The Fisher’s exact test is known to be conservative
I mid-P-values : Pr(Nij ≥ nij |ni.,n.j ,n;ψ0)︸ ︷︷ ︸P-values
− 12
Pr(Nij = nij |ni.,n.j ,n;ψ0)
16 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (3)
Marginal distribution of the P-values
P-values are assumed to follow a mixture of two distributions
f (p) = π0 f0(p) + (1− π0) f1(p)
I f0(p) is the pdf of p under the null hypothesis
I f1(p) is the pdf of p under the alternative hypothesis
FDR Storey et al. (2002)
For a P-value rejection region [0, γ] with γ ∈]0, 1]
FDR(γ) =π0F0(γ)
F (γ)
17 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (4)
FDR estimation : Single null hypothesis (not our case)
I The P-values are uniformly distributed under the null hypothesis
f (p) = π0 f0(p) + (1− π0) f1(p)= π0 + (1− π0) f1(p)
I For a P-value rejection region [0, γ] with γ ∈]0, 1]
FDR(γ) =π0F0(γ)
F (γ)=
π0γ
F (γ)
I F is estimated by its empirical estimate : F (γ) = 1/m∑
ij 1(pij ≤ γ)
I The main difficulty is to estimate π0
18 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (5)
Location Based Estimator (LBE) Dalmasso et al. 2005
E[ϕ(P)] = π0E0[ϕ(P)] + (1− π0)E1[ϕ(P)]
E[ϕ(P)]
E0[ϕ(P)]= π0 + (1− π0)
E1[ϕ(P)]
E0[ϕ(P)]︸ ︷︷ ︸non negative term : Bias
I ϕ is chosen to minimize the non negative term : ϕ(p) = − ln(1− p)a
π0 =E[ϕ(P)]
E0[ϕ(P)]=
1/m∑
ij ϕ(pij )
E0[ϕ(P)]=
1/m∑
ij{− ln(1− pij )}a
Γ(a + 1)
I LBE only assumes that f is a non increasing function
I π0 is a biased estimator ⇒ upper bound for the FDR estimation
19 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Frequentist methods : Proposed approach (6)
FDR estimation : One-sided null hypothesis
I The P-values are not calculated under H0ij : ψij ≤ ψ0 butunder H0ij∗ : ψij =ψ0
I The P-values are not uniformly distributed under H0ij
I f0 expressed as a mixture of a uniform f0∗ and anon-decreasing f1∗ function
p
f((p))
0.0 0.2 0.4 0.6 0.8 1.0
02
46
810
12
f (p) = π0 f0(p) + (1− π0) f1(p)= π0 {π0∗ + (1− π0∗)f1∗(p)} + (1− π0) f1(p)
I In practice, only small values of γ are of interest :
FDR(γ) =π0F0(γ)
F (γ)=π0{π0∗γ + (1− π0∗)F1∗(γ)}
F (γ)=π0π0∗γ
F (γ)
I The issue is thus to estimate π0π0∗
I LBE on P∗ = 1− 2|P − 12| which has a non-increasing pdf expressed as
fP∗ (p∗) = π0π0∗ + (1− π0π0∗)f1P∗ (p∗)
20 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian methods : Proposed approach (1)Based on the bayesian decision theory framework - Muller et al. (2004)
Statuszij ∈ {0, 1}
Decisiondij ∈ {0, 1}
FDR
FDP =
∑ij (1− zij )dij∑
ij dij−→ FDR = E[FDP]
Bayesian FDR estimation : FDR∗
FDR∗ = E[FDP|data] =
∑ij uij dij∑
ij dij
where uij = Pr(zij = 0|data) i.e. the posterior Pr. of H0ij : zij = 0
21 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian methods : Proposed approach (2)
Status zij - uij
I For each cell, we want to test
GPS → H0ij : λij ≤ RR0
BCPNN → H0ij :pij
pi.p.j≤ RR0
I and thus to calculate uij i.e. the posterior probability of H0ij
GPS :
Pr(λ∗ij ≤ RR0)
= wij FGa(RR0; α1 + nij , β1 + eij ) + (1− wij ) FGa(RR0; α2 + nij , β2 + eij )
BCPNN :
Pr(IC∗ij ≤ log2(RR0))
No analytic form → Monte Carlo simulations
22 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Bayesian methods : Proposed approach (3)
Decision rule dij
I GPS
DuMouchel (1999) : dij = 1[E{log2(λ∗ij )} > τ ]
DuMouchel et al. (2001) : dij = 1[Q0.05(λ∗ij ) > τ ]
Our suggestion : dij = 1[uij ≤ α]Szarfman et al. (2002) : RR0 = 2 and α = 0.05
I BCPNN
Our suggestion : dij = 1[uij ≤ α]Bate et al. (1998) : RR0 = 1 and α = 0.025
23 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Description of the current methods
Extension to the multiple comparison setting
Simulation study
Application to the French pharmacovigilance data
Discussion
24 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Simulation study
Objectives
I To compare the performances of the methods
I To evaluate the quality of the FDR estimators
Methods
I Frequentist methods :
ROR « new »RFET
midRFET
I Bayesian methods :
GPS : E{log2(λ∗ij )}, Q0.05(λ∗ij ), Pr(H∗0ij
)
BCPNN : Pr(H∗0ij
)
I 3 different tested hypotheses based on {ψ0, RR0} = 1, 2 and 5
25 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Simulation study
Data generation
I Model : nij ∼Mu(n,pij) from the French databaseI pij
pi.w ∼ Di(ni.)
p.jw ∼ Di(n.j) =⇒ pij =
rwij pwi. p
w.j∑
ij rwij pw
i. pw.j
log(rwij ) ∼ Lo(0, 0.5)
I From pij
nij s and the true marginal probabilities : pi. =∑
j pij p.j =∑
i pijreal status of the cells according to
• ψij , and ψ0 for the frequentist methods
• RRij =pij
pi.p.jand RR0 for the bayesian methods
Simulation plan
I 500 simulated datasets
I « True »FDR estimated by the average of the FDPs over the 500 datasets
I nij ≥ 3
26 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Simulation study : Comparison of the frequentist methods
FDR
and estimation
RORRFETmidRFET
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
ψψ0 == 1
Average number of generated signals
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
ψψ0 == 2
Average number of generated signals
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
ψψ0 == 5
Average number of generated signals
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
27 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Simulation study : Comparison of the frequentist methods
FDR and estimation
RORRFETmidRFET
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
ψψ0 == 1
Average number of generated signals
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 5000 10000 15000 20000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
ψψ0 == 2
Average number of generated signals
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
ψψ0 == 5
Average number of generated signals
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
0 500 1000 1500
0.00
0.05
0.10
0.15
0.20
27 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Simulation study : Comparison of the Bayesian methods (1)
GPS and decision rules
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10 E((λλ*))
Q0.05((λλ*))Pr((H0*))
RR0 == 1
Average number of generated signals
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 2
Average number of generated signals
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 5
Average number of generated signals
I The proposed decision rule gives better results according to the FDR
I Close performances between Q0.05(λ∗ij ) and Pr(H∗
0ij) for small values of FDR
28 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
GPS vs BCPNN with Pr(H∗0ij )
FDR and estimation
— GPS– - BCPNN
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 1
Average number of generated signals
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 2
Average number of generated signals
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 5
Average number of generated signals
I Very close performances according to the FDR
I Large differences in the FDR estimation
29 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
GPS vs BCPNN with Pr(H∗0ij )
FDR and estimation
— GPS– - BCPNN
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 1
Average number of generated signals
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 2
Average number of generated signals
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
RR0 == 5
Average number of generated signals
I Very close performances according to the FDR
I Large differences in the FDR estimation
29 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Global Comparison
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 1 , RR0 == 1
Average number of generated signals
ψψ0 == 1 , RR0 == 1
Average number of generated signals
midFETGPSBCPNN
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 2 , RR0 == 2
Average number of generated signals
ψψ0 == 2 , RR0 == 2
Average number of generated signals
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 5 , RR0 == 5
Average number of generated signals
ψψ0 == 5 , RR0 == 5
Average number of generated signals
I Very close performances according to the FDR
I Large differences in FDR estimation
Results in favor of the use of the GPS model with Pr(H∗0ij )
30 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Global Comparison
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 1 , RR0 == 1
Average number of generated signals
midFETGPSBCPNN
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 2 , RR0 == 2
Average number of generated signals
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 5 , RR0 == 5
Average number of generated signals
I Very close performances according to the FDR
I Large differences in FDR estimation
Results in favor of the use of the GPS model with Pr(H∗0ij )
30 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Description of the current methods
Extension to the multiple comparison setting
Simulation study
Application to the French pharmacovigilance data
Discussion
31 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Global analysis (1)
I 1984-2003
I nij ≥ 3 → 47 520 cells
FDR estimation
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
0 5000 10000 15000 20000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 1 , RR0 == 1
Number of generated signals
midRFETRORBCPNNGPS
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
0 1000 2000 3000 4000 5000 6000 7000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 2 , RR0 == 2
Number of generated signals
0 500 1000 1500 2000
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500 2000
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500 2000
0.00
0.02
0.04
0.06
0.08
0.10
0 500 1000 1500 2000
0.00
0.02
0.04
0.06
0.08
0.10
ψψ0 == 5 , RR0 == 5
Number of generated signals
I Results close to that obtained in the simulation study
32 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Feasibility Study
Afssaps is interested in implementing an automatic signal detection tool in France
Objective
A feasibility study was recently conducted with two aims
I Relevance of the signals generated by the method
I Time necessary for the evaluation of the signals
33 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Feasibility Study
Methodology
I Use of GPSpH0 with FDR at 5%
I A first analysis was performed on the data (Jan 1 2000 - dec 31 2008)
I A second analysis was performed three months later (Jan 1 2000 - March 31 2009)
I Selection of the 1414 signals that were highlighted by the second analysis but not bythe first one
I These signals were dispatched among the 31 regional pharmacovigilance centres andthe pharmacovigilance department at Afssaps (about 40 signals per centre)
34 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Feasibility Study
Methodology
The centres were given one month toI categorize the signals
known/nothing further
known/further follow up
unknown/further follow up
unknown/outlier
no time to assess
I Evaluate the time necessary for the analysis
35 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Feasibility Study
Results
I 28 centres participated (sent back the questionnaire) =⇒ 1294 signalsI 1170 signals were categorised
35.7% known/nothing further6.9% known/further follow up16.8% unknown/further follow up36.6% unknown/outlier4% no time to assess
I Large variability in the time necessary for analysisbetween 2 and 26 hours per centremedian : 6 hours
Conclusion
I 277 signal worth following up
I At least 60% of relevant signals
I The lack of precision was often invoked for the outliers : too vague signals=⇒ need to work out the coding dictionaries
36 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Description of the current methods
Extension to the multiple comparison setting
Simulation study
Application to the French pharmacovigilance data
Discussion
37 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Discussion (1)
Extension of the existing methods to the multiple comparison framework
I No modification of the model
I New decision rules
The GPS model with Pr(H∗0ij ) provides the best results
I in the simulation study
I in the sequential evaluation study
38 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
Discussion (2)
Limits of the GPS model
I Data represented as a contingency tableCommon to all the methods currently usedco-prescription
Use of penalized regressions
I Logistic regression in which
One studies one adverse event at a timeSeveral hundred of predictors : the drugs (coded in 1/0)Several hundred thousand (or millions) of individuals
I Lasso type algorithm / Stability Selection (Meinshausen & Buhlmann JRSS B 2010)
I Much more intensive
I Should make it possible to better account for the correlation structure between thedrugs
co-prescription
I Direct extension to the study of the interaction drug-drug
39 / 40
Introduction Current methods Extension to the multiple comparison setting Simulation study Application Discussion
References
I Ahmed, C Dalmasso, F Haramburu, F Thiessard, P Broet, and P Tubert-Bitter.
False discovery rate estimation for frequentist pharmacovigilance signal detection methods.Biometrics, 66(1) :301–309, March 2010.PMID : 19432790.
I Ahmed, F Thiessard, G Miremont-Salame, B Begaud, and P Tubert-Bitter.
Pharmacovigilance data mining with methods based on false discovery rates : a comparative simulation study.Clinical Pharmacology and Therapeutics, 88(4) :492–498, October 2010.PMID : 20811349.
Ismaıl Ahmed, Francoise Haramburu, Annie Fourrier-Reglat, Frantz Thiessard, Carmen Kreft-Jais, Ghada
Miremont-Salame, Bernard Begaud, and Pascale Tubert-Bitter.Bayesian pharmacovigilance signal detection methods revisited in a multiple comparison setting.Statistics in Medicine, 28(13) :1774–1792, June 2009.PMID : 19360795.
40 / 40