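SAS Users Group Meeting 2016: Selection Model. [Planning session] Introduction of SAS programs for the analysis of data with missing values: data generation, DIA macros, and advances in procedures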

(3) Selection model approach under an MNAR assumption

Shingo Kuroda 1, 2

1 Missing data analysis team, Task Force 4, Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association

2 Takeda Pharmaceutical Company Limited

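Abstract:

This presentation explains the analysis method used in the DIA macro "%Selection_Model2", shows how to implement it in SAS, and describes how to use the macro and the points to note when doing so.

Keywords: analysis of data with missing values, MNAR, Selection Model, DIA macro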

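Outline

• Inference based on the Selection Model under the MNAR assumption
• Implementation example with the bivariate normal distribution
• How to use the macros and points to note
• Analysis of simulated data with the macros
• Summary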


Notation

Note: this presentation assumes monotone missingness (𝑙: time point of the last observation)

For more details on the theory that follows, see JPMA (2016), Appendix B.

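Selection Model

• Assume that the joint density function of (𝒀𝒊, 𝑹𝒊) factorizes into a model for the response and a model for the missingness indicator given the response (covariates are omitted from the notation hereafter)
• From this, the joint density function and the likelihood function of the observed data are obtained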
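For reference, the standard selection-model factorization can be sketched as follows. The notation is an assumption, because the slide's own formulas are not reproduced in this text: $\boldsymbol{y}_i^{o}$ and $\boldsymbol{y}_i^{m}$ denote the observed and missing parts of $\boldsymbol{Y}_i$, $\boldsymbol{\theta}$ the parameters of the response model, $\boldsymbol{\psi}$ the parameters of the missingness model, and $N$ the number of subjects.

```latex
\begin{align*}
f(\boldsymbol{y}_i, \boldsymbol{r}_i \mid \boldsymbol{\theta}, \boldsymbol{\psi})
  &= f(\boldsymbol{y}_i \mid \boldsymbol{\theta})\,
     f(\boldsymbol{r}_i \mid \boldsymbol{y}_i, \boldsymbol{\psi}), \\
f(\boldsymbol{y}_i^{o}, \boldsymbol{r}_i \mid \boldsymbol{\theta}, \boldsymbol{\psi})
  &= \int f(\boldsymbol{y}_i^{o}, \boldsymbol{y}_i^{m} \mid \boldsymbol{\theta})\,
          f(\boldsymbol{r}_i \mid \boldsymbol{y}_i^{o}, \boldsymbol{y}_i^{m}, \boldsymbol{\psi})\,
     d\boldsymbol{y}_i^{m}, \\
L(\boldsymbol{\theta}, \boldsymbol{\psi})
  &= \prod_{i=1}^{N}
     f(\boldsymbol{y}_i^{o}, \boldsymbol{r}_i \mid \boldsymbol{\theta}, \boldsymbol{\psi}).
\end{align*}
```

Under MNAR the second factor depends on $\boldsymbol{y}_i^{m}$, so the integral does not simplify and has to be evaluated numerically inside the likelihood.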

Selection Model under the MNAR assumption

• Likelihood function of the observed data
• Compute the maximum likelihood estimator of 𝜽
– Integration
– Optimization

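The method of Diggle and Kenward

• Following Diggle and Kenward (1994), a model is assumed for the dropout probability at each time point
  Note: the dropout probability is the probability that the response is missing from time point $j$ onward, given that it was observed through time point $j-1$
• Non-Future Dependence (NFD) (Kenward, Molenberghs and Thijs, 2003)

• Computing the integral
  Note: terms that do not involve 𝒀𝒊𝒎 are moved outside the integral
  Note: integrating over $Y_{i,j+1}, \dots, Y_{in}$ reduces the problem to a one-dimensional integral over $Y_{ij}$ only
  Note: the subject is observed through time point $j-1$ and missing from time point $j$ onward

• The maximum likelihood estimator is obtained by "numerical integration" + "optimization"
– In SAS, the parameters can be estimated with the QUAD call and an NLPXXX call in PROC IML (replace XXX according to the optimization method used)

• Likelihood of the observed data
  [Contribution of subject $i$ (when $R_{i,j-1} = 1$ and $R_{ij} = 0$)]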
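As a sketch of the likelihood contribution of a subject who drops out at time point $j$, the following uses the usual Diggle and Kenward form. The symbols are assumptions, since the slide's formula is not reproduced in this text: $p_k$ is the conditional dropout probability at time point $k$, and $\psi_0$, $\psi_p$, $\psi_c$ are placeholder names for the intercept and the coefficients of the previous and current response, not the macro's psi1 to psi6 numbering.

```latex
\begin{align*}
p_k(y_{i,k-1}, y_{ik})
  &= P(R_{ik}=0 \mid R_{i,k-1}=1,\; y_{i,k-1},\; y_{ik}), \qquad
\operatorname{logit} p_k = \psi_0 + \psi_p\, y_{i,k-1} + \psi_c\, y_{ik}, \\
L_i(\boldsymbol{\theta}, \boldsymbol{\psi})
  &= f(y_{i1}, \dots, y_{i,j-1} \mid \boldsymbol{\theta})
     \prod_{k=2}^{j-1} \bigl\{ 1 - p_k(y_{i,k-1}, y_{ik}) \bigr\}
     \int f(y_{ij} \mid y_{i1}, \dots, y_{i,j-1}, \boldsymbol{\theta})\,
          p_j(y_{i,j-1}, y_{ij})\, dy_{ij}.
\end{align*}
```

With $\psi_c = 0$ the dropout probability no longer depends on the unobserved $y_{ij}$, the integral reduces to $p_j$ itself, and the model corresponds to MAR.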


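Implementation example with the bivariate normal distribution

• Test data
– Generate data following a bivariate normal distribution
– 𝑋 is always observed
– Only 𝑌 may be missing; the probability that 𝑌 is missing is defined so that whether 𝑌𝑖 is missing depends on the value of 𝑌𝑖 itself (𝑖: subject) → the missingness mechanism is MNAR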
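A minimal sketch of how such test data could be generated in a DATA step. The data set name test_mnar_miss and the variables x, y_o, and miss follow the code shown later in the presentation; the variable id, the correlation rho, and the coefficients psi1 to psi3 are hypothetical values chosen for illustration, not the ones used by the presenter.

```sas
/* Sketch: bivariate normal data with MNAR missingness imposed on y.     */
/* rho and psi1-psi3 are assumed values, not those of the presentation.  */
data test_mnar_miss ;
  call streaminit(777) ;
  rho  = 0.5 ;                                    /* assumed corr(x, y)        */
  psi1 = -0.5 ; psi2 = 0 ; psi3 = 2 ;             /* assumed logit coefficients */
  do id = 1 to 100 ;
    x = rand("NORMAL") ;
    y = rho*x + sqrt(1 - rho**2)*rand("NORMAL") ; /* full data                 */
    eta    = psi1 + psi2*x + psi3*y ;
    p_miss = 1 / (1 + exp(-eta)) ;                /* P(y missing | x, y)       */
    miss   = rand("BERNOULLI", p_miss) ;          /* 1 = y is missing          */
    if miss = 1 then y_o = . ; else y_o = y ;     /* observed data             */
    output ;
  end ;
  keep id x y y_o miss ;
run ;

/* n, mean, and sd of the full and observed data */
proc means data=test_mnar_miss n mean std ;
  var x y y_o ;
run ;
```

Because psi3 is positive in this sketch, larger values of y are more likely to be missing, so the observed-case mean of y_o tends to fall below the full-data mean of y.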

– Variables in the test data: 1 − 𝑟 (missingness indicator), 𝑥 (never missing), 𝑦 (full data), 𝑦𝑜 (observed data)


– Full data (including the values that are treated as missing)

        n     mean       sd
  𝑥     100    0.00771   0.99725
  𝑦     100   -0.05392   0.97332

– Observed cases

        n     mean       sd
  𝑥     100    0.00771   0.99725
  𝑦𝑜     62   -0.55704   0.81112


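• Analysis using PROC IML
– Computing the log-likelihood (𝑌 observed)

proc iml ;
  ...
  start likelihood(param) global(x_obs, y_obs, x_mis) ;
    * param = (theta psi) = (mux muy sx sy rho psi1-psi3) ;
    * x_obs: values of x for subjects whose y is observed ;
    * y_obs: values of y for subjects whose y is observed ;
    * x_mis: values of x for subjects whose y is missing ;

    * both x and y observed ;
    ll_obs_1 = -log(theta[3]) -log(theta[4]) -1/2#log(1-theta[5]##2)
               -1/(2#(1-theta[5]##2)) #
               ( (x_obs-theta[1])##2/theta[3]##2
                 -2#theta[5]#(x_obs-theta[1])#(y_obs-theta[2])/(theta[3]#theta[4])
                 +(y_obs-theta[2])##2/theta[4]##2 ) ;
    ll_obs_2 = -log(1+exp(psi[1]+psi[2]#x_obs+psi[3]#y_obs)) ;
    ...

Note: ll_obs_1 is the log of the bivariate normal density, omitting the additive constant:

$$ f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left[ -\frac{1}{2(1-\rho^2)} \left\{ \frac{(x-\mu_x)^2}{\sigma_x^2} - 2\rho\,\frac{(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2} \right\} \right] $$

  theta      parameter
  theta[1]   $\mu_x$
  theta[2]   $\mu_y$
  theta[3]   $\sigma_x$
  theta[4]   $\sigma_y$
  theta[5]   $\rho$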

Note: ll_obs_2 in the likelihood module is the log of the probability that 𝑌 is observed under the logistic model:

$$ P(Y \text{ observed} \mid x, y^{o}) = \frac{1}{1+\exp(\psi_1+\psi_2 x+\psi_3 y^{o})} $$

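– Computing the log-likelihood (𝑌 missing)

proc iml ;
  ...
    * x observed, y missing ;
    ll_mis_1 = -log(theta[3]) -(x_mis-theta[1])##2 / (2#theta[3]##2) ;
    * conditional mean and variance of y given x ;
    ecurr = theta[2]+theta[5]#theta[4]/theta[3]#(x_mis-theta[1]) ;
    vcurr = theta[4]##2#(1-theta[5]##2) ;
    * integration range: conditional mean +/- 8 conditional sd ;
    range = (ecurr-8#sqrt(vcurr)) || (ecurr+8#sqrt(vcurr)) ;
    ll_mis_2 = j(n_mis,1,0) ;
    do i=1 to n_mis ;
      call quad(integral,"integral",range[i,]) ;
      ll_mis_2[i]=log(integral) ;
    end ;
    ...
  finish ;

Note: ll_mis_1 is the log of the univariate normal density of 𝑥, omitting the additive constant:

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\left\{ -\frac{(x-\mu_x)^2}{2\sigma_x^2} \right\} $$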

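proc iml ;
  ...
  start integral(y_m) global(x_mis, psi, ecurr, vcurr, i) ;
    pi=constant("PI") ;
    * probability that y is missing for subject i, evaluated at y_m ;
    ex=exp(psi[1]+psi[2]#x_mis[i]+psi[3]#y_m) ;
    p=ex/(1+ex) ;
    * missingness probability times the conditional normal density of y_m given x ;
    int = p#(1/(sqrt(2#pi#vcurr)))#exp(-(1/2)#(y_m-ecurr)##2/vcurr) ;
    integral = int[i] ;
    return (integral) ;
  finish integral ;
  ...

Note: the integrand is the product of the probability that 𝑌 is missing under the logistic model and the conditional density of $y^{m}$ given 𝑥:

$$ \frac{\exp(\psi_1+\psi_2 x+\psi_3 y^{m})}{1+\exp(\psi_1+\psi_2 x+\psi_3 y^{m})} \times \frac{1}{\sqrt{2\pi\sigma_{y|x}^2}} \exp\left\{ -\frac{(y^{m}-\mu_{y|x})^2}{2\sigma_{y|x}^2} \right\} $$

where ecurr and vcurr in the code are the conditional mean and variance:

$$ \mu_{y|x} = \mu_y + \rho\,\frac{\sigma_y}{\sigma_x}\,(x-\mu_x), \qquad \sigma_{y|x}^2 = \sigma_y^2\,(1-\rho^2) $$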


– Maximum likelihood estimation of the parameters

proc iml ;
  ...
  x0={0 0 1 1 0 0 0 0} ;                      * initial values ;
  opt1={1} ;                                  * maximize ;
  con={. . 1e-10 1e-10 -1 . . .,
       . . .     .      1 . . .} ;            * lower and upper bounds ;
  call nlpnrr(rc1,xr1,"likelihood",x0) opt=opt1 blc=con ;
  mux=xr1[1] ;
  muy=xr1[2] ;
  vx=xr1[3]##2 ;
  vy=xr1[4]##2 ;
  cov=xr1[5]#xr1[3]#xr1[4] ;
  psi1=xr1[6] ;
  psi2=xr1[7] ;
  psi3=xr1[8] ;
  print mux muy vx vy cov psi1 psi2 psi3 ;
quit ;

– Estimates


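• Results from the DIA macro (%Selection_Model2)
– The macro ends in an error unless the data contain at least two groups (because the variance of the parameters for the second group in the observation-probability model, such as psi6, becomes 0)
  mux = −0.0146 + 0.0223 = 0.0077, muy = −0.0146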







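Specification of %Selection_Model2

%Selection_Model2(
  inputds    =,  /* data set to analyze */
  covtype    =,  /* covariance structure (one of UN, TOEP, TOEPH, AR, ARH, CS, CSH) */
  response   =,  /* name of the response variable */
  modl       =,  /* mean structure of the model for the response variable */
  clasvar    =,  /* categorical variables specified in modl */
  mech       =,  /* missingness mechanism (one of MCAR, MAR, MNAR, MNARS) */
  psi5       =,  /* value of psi5 when mech=MNARS is specified */
  psi6       =,  /* value of psi6 when mech=MNARS is specified */
  const      =,  /* integration range (mu ± const*sigma; const between 3 and 8 is recommended) */
  derivative =,  /* 1: first derivatives of the log-likelihood computed analytically (recommended)
                    0: derivatives of the log-likelihood computed by finite differences */
  method     =,  /* optimization method (NR or QN) */
  out1       =,  /* output data set name (model parameter estimates) */
  out2       =,  /* output data set name (differences in LS means) */
  out3       =   /* output data set name (LS means) */
) ;

  mech     What is done
  MNAR     𝜓5 and 𝜓6 are also estimated
  MNARS    the values of 𝜓5 and 𝜓6 are specified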

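Specification of %SM_GridSearch

%SM_GridSearch uses mech=MNARS and accepts several values for each of 𝜓5 and 𝜓6 → a tipping point analysis is easy to run

%SM_GridSearch(
  psi5grid   =,  /* values of psi5, separated by spaces */
  psi6grid   =,  /* values of psi6, separated by spaces */
  const      =,  /* integration range (mu ± const*sigma; const between 3 and 8 is recommended) */
  inputds    =,  /* data set to analyze */
  covtype    =,  /* covariance structure (one of UN, TOEP, TOEPH, AR, ARH, CS, CSH) */
  response   =,  /* name of the response variable */
  modl       =,  /* mean structure of the model for the response variable */
  clasvar    =,  /* categorical variables specified in modl */
  mech       = MNARS,
  derivative =,  /* 1: first derivatives of the log-likelihood computed analytically (recommended)
                    0: derivatives of the log-likelihood computed by finite differences */
  method     =,  /* optimization method (NR or QN) */
  out1       =,  /* output data set name (model parameter estimates) */
  out2       =,  /* output data set name (differences in LS means) */
  out3       =   /* output data set name (LS means) */
) ;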
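As an illustration of the parameter list above, a tipping point analysis over a grid of psi5 values might be requested as follows. This is only a sketch: the data set name hamd_long, the response variable change, the output data set names, and const=6 are hypothetical, and the value syntax shown for modl and clasvar is an assumption, because the slides do not describe the expected format. The grid values are the ones reported in the results table of this presentation.

```sas
/* Assumes the DIA macros have already been loaded, e.g. with %include.  */
/* All data set and variable names below are hypothetical placeholders.  */
%SM_GridSearch(
  psi5grid   = 0 0.1 0.2 0.3,          /* grid for psi5 (space separated)  */
  psi6grid   = 0,                      /* psi6 held at 0                   */
  const      = 6,                      /* within the recommended 3 to 8    */
  inputds    = hamd_long,              /* one record per patient and seq   */
  covtype    = UN,
  response   = change,
  modl       = therapy seq therapy*seq, /* assumed syntax                  */
  clasvar    = therapy seq,             /* assumed syntax                  */
  mech       = MNARS,
  derivative = 1,
  method     = NR,
  out1       = gs_parms,
  out2       = gs_lsmdiff,
  out3       = gs_lsm
) ;
```

Comparing the p-values in the LS mean difference output across the grid then shows at which value of the sensitivity parameter the conclusion changes.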

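Points to note when using the macros

• The data set specified in inputds must have one record per subject and visit
• Variable names
– Variable for the time point: seq
– Variable for the treatment: therapy
– Variable for the subject: patient
• Number of groups
– The macro runs even with three or more groups
– However, the observation-probability model is programmed on the assumption of at most two groups, as the expression below shows, so results obtained with three or more groups need to be treated with caution

  = psi[1]*(1-therapy) + psi[2]*therapy + psi[3]*y_{i,j-1}*(1-therapy) + psi[4]*y_{i,j-1}*therapy + psi[5]*y_{ij}*(1-therapy) + psi[6]*y_{ij}*therapy

• For every subject, the record at the first time point must be non-missing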

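• Non-monotone missingness
– The macro can be run on data that contain non-monotone missingness
– When non-monotone missingness is present, the macro ignores the intermittently missing time points in the likelihood calculation (it simply drops them and computes the likelihood from the time points that are present)
– The likelihood of the observation-probability model then differs from the intended model (when the previous time point is missing, the data from two time points earlier are used), so the data should be reduced to monotone missingness beforehand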
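One possible way to prepare such data is sketched below: every value after a subject's first missing visit is set to missing, which turns intermittent missingness into dropout. The variable names patient, therapy, and seq follow the macro's requirements; the data set names, the response variable change, and the data values are hypothetical. Note that this approach discards observed values, which is a design choice rather than something prescribed by the presentation.

```sas
/* Hypothetical long-format data: one record per patient and visit (seq). */
data long ;
  input patient therapy seq change ;
  datalines ;
1 0 1 -3.0
1 0 2 .
1 0 3 -5.5
1 0 4 -6.0
2 1 1 -4.0
2 1 2 -6.5
2 1 3 .
2 1 4 .
;
run ;

proc sort data=long out=sorted ;
  by patient seq ;
run ;

/* Force a monotone pattern: once a visit is missing, treat all later */
/* visits of the same patient as missing too.                         */
data monotone ;
  set sorted ;
  by patient ;
  retain dropped ;
  if first.patient then dropped = 0 ;
  if missing(change) then dropped = 1 ;   /* first missing visit reached  */
  if dropped then change = . ;            /* later observations discarded */
  drop dropped ;
run ;
```

In this example patient 1 has an intermittent missing value at seq=2, so the values at seq=3 and seq=4 are set to missing; patient 2 already has a monotone pattern and is unchanged.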

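• Optimization uses the IML NLPNRR call or NLPQN call (specify one of the two)
• Estimation is based on maximum likelihood
• LS means are compared with a Z test
• The observation-probability model is a fixed model specified for each group, and no other factors can be included in it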

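Reference (use of a Bayesian approach)

• Liu et al. (2015)
– The parameters 𝜽 and 𝝍 of the joint density function of 𝒀 and 𝑹 are estimated by MCMC
– The dropout model follows the assumptions of Diggle and Kenward (1994)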

        n     mean       sd
  𝑥     100    0.00771   0.99725
  𝑦     100   -0.05392   0.97332
  𝑦𝑜     62   -0.55704   0.81112

proc mcmc data=test_mnar_miss nbi=5000 nmc=500000 thin=50 seed=777
          monitor=(mu cov psi) ;
  array yy{2} x y_o ;
  array mu{2} mu1-mu2 ;
  array pp{2} p1-p2 ;
  array psi{3} psi1-psi3 ;
  array cov[2,2] ;
  array s0[2,2] (0.0001 0 0 0.0001) ;
  parms mu1-mu2 0 ;
  parms psi1-psi3 0 ;
  parms cov ;
  prior mu1-mu2 ~ normal(0,var=10000) ;
  prior psi1-psi3 ~ normal(0,var=10000) ;
  prior cov ~ iwish(3,S0) ;
  p=exp(psi[1]+psi[2]*yy[1]+psi[3]*yy[2]) ;
  pp=p/(1+p) ;
  if miss=0 then ll=log(1-pp) ;
  else ll=log(pp) ;
  model yy ~ mvn(mu,cov) ;
  model general(ll) ;
run ;







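Data analyzed

Simulated data modeled on a phase III trial in depression
• Primary endpoint: HAM-D (a decrease in the score indicates improvement)
• Active drug group vs. placebo group, 120 subjects per group (at baseline)

  Group     Baseline         Time 1           Time 2           Time 3           Time 4
            n    Mean (SD)   n    Mean (SD)   n    Mean (SD)   n    Mean (SD)   n    Mean (SD)
  Active    120  20.0 (4.1)  100  14.9 (4.4)  92   12.8 (6.5)  86   10.1 (7.5)  83   8.1 (8.7)
  Placebo   120  20.2 (3.9)  106  15.9 (4.6)  100  13.3 (5.3)  94   12.1 (7.2)  88   11.3 (8.2)

[Figure: mean + SD by time point (baseline, time points 1 to 4) for the active drug group and the placebo group]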

Application to the simulated data

• Results of the analysis with %Selection_Model2

  Data           Mechanism*             Group    Time 1      Time 2      Time 3      Time 4       Difference at time 4   Estimates of 𝝍
  No missing**   ―                      Drug     -4.8 (0.3)  -6.7 (0.5)  -8.8 (0.6)  -10.0 (0.7)  -1.6 (p=0.126***)      －
                                        Placebo  -4.0 (0.3)  -6.6 (0.4)  -7.3 (0.6)  -8.4 (0.7)
  With missing   MAR                    Drug     -4.9 (0.3)  -7.0 (0.5)  -9.2 (0.7)  -11.2 (0.8)  -2.9 (p=0.015)         𝜓1 = −2.0, 𝜓3 = 0.1
                                        Placebo  -4.1 (0.3)  -6.6 (0.5)  -7.7 (0.6)  -8.3 (0.8)                          𝜓2 = −2.5, 𝜓4 = 0.0
  With missing   MNAR (𝜓5, 𝜓6 also      Drug     -4.9 (0.3)  -6.7 (0.5)  -8.6 (0.7)  -10.4 (0.9)  -1.4 (p=0.300)         𝜓1 = −1.9, 𝜓3 = −0.1, 𝜓5 = 0.3
                 estimated)             Placebo  -4.1 (0.3)  -6.7 (0.5)  -8.1 (0.7)  -9.0 (0.9)                          𝜓2 = −3.6, 𝜓4 = 0.2, 𝜓6 = −0.2

Values at each time point: LS mean (SE)
* Missingness mechanism assumed in the analysis
** Full data before values were set to missing
*** Two-sample t test

• Results of the analysis with %SM_GridSearch

  Data           Mechanism*           Sensitivity parameters   Group    Time 4       Difference at time 4
  No missing**   ―                    ―                        Drug     -10.0 (0.7)  -1.6 (p=0.126***)
                                                               Placebo  -8.4 (0.7)
  With missing   MNAR (𝜓5, 𝜓6         𝜓5 = 0.0, 𝜓6 = 0.0       Drug     -11.2 (0.8)  -2.9 (p=0.015)
                 specified)                                    Placebo  -8.3 (0.8)
                                      𝜓5 = 0.1, 𝜓6 = 0.0       Drug     -10.8 (0.9)  -2.5 (p=0.032)
                                                               Placebo  -8.3 (0.8)
                                      𝜓5 = 0.2, 𝜓6 = 0.0       Drug     -10.5 (0.9)  -2.2 (p=0.061)
                                                               Placebo  -8.3 (0.8)
                                      𝜓5 = 0.3, 𝜓6 = 0.0       Drug     -10.3 (0.9)  -2.0 (p=0.098)
                                                               Placebo  -8.3 (0.8)

Values at time 4: LS mean (SE)
* Missingness mechanism assumed in the analysis
** Full data before values were set to missing
*** Two-sample t test

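Summary

• In a Selection Model that assumes an MNAR missingness mechanism, models are specified for the response variable and for the missingness indicator, and the parameters can be estimated by maximum likelihood through numerical integration and maximization of the log-likelihood function
• The DIA macro "%Selection_Model2" performs an analysis based on the Selection Model under the assumptions of Diggle and Kenward (1994)
– A tipping point analysis using the sensitivity parameters is also possible (with %SM_GridSearch)
• The macros have several restrictions, so care is needed when using them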

References

• Diggle, P. and Kenward, M.G. (1994). Informative drop-out in longitudinal data analysis. Applied Statistics, 43, 49-93.
• Kenward, M.G., Molenberghs, G. and Thijs, H. (2003). Pattern-mixture models with proper time dependence. Biometrika, 90, 53-71.
• Liu, G.F., Han, B., Zhao, X. and Lin, Q. (2015). A comparison of frequentist and Bayesian model based approaches for missing data analysis: Case study with a schizophrenia clinical trial. Statistics in Biopharmaceutical Research, doi: 10.1080/19466315.2015.1077725
• SAS Institute Inc. (2015). SAS/IML® 14.1 User's Guide. http://support.sas.com/documentation/cdl/en/imlug/68150/PDF/default/imlug.pdf [Accessed 12 May 2016]
• Missing data analysis team, Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association (JPMA) (2016). 欠測のある連続量経時データに対する統計手法について (Ver2.0) [Statistical methods for continuous longitudinal data with missing values]. http://www.jpma.or.jp/information/evaluation/allotment/statistics.html [Accessed 12 May 2016]

Backup slides (to be published)

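Selection Model under the MCAR or MAR assumption

• Log-likelihood function of the observed data
• If, in addition to the above, the distinctness condition is assumed for 𝜽 and 𝝍, a likelihood-based analysis allows inference on 𝜽 from the observed data alone (no inference can be made on 𝝍)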
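The argument can be sketched as follows, in assumed notation ($\boldsymbol{y}_i^{o}$ and $\boldsymbol{y}_i^{m}$ for the observed and missing parts of $\boldsymbol{Y}_i$), since the slide's formula is not reproduced in this text. Under MAR the missingness model does not involve $\boldsymbol{y}_i^{m}$, so it can be taken outside the integral:

```latex
\begin{align*}
f(\boldsymbol{y}_i^{o}, \boldsymbol{r}_i \mid \boldsymbol{\theta}, \boldsymbol{\psi})
  &= \int f(\boldsymbol{y}_i^{o}, \boldsymbol{y}_i^{m} \mid \boldsymbol{\theta})\,
          f(\boldsymbol{r}_i \mid \boldsymbol{y}_i^{o}, \boldsymbol{\psi})\,
     d\boldsymbol{y}_i^{m}
   = f(\boldsymbol{y}_i^{o} \mid \boldsymbol{\theta})\,
     f(\boldsymbol{r}_i \mid \boldsymbol{y}_i^{o}, \boldsymbol{\psi}), \\
\ell(\boldsymbol{\theta}, \boldsymbol{\psi})
  &= \sum_{i} \log f(\boldsymbol{y}_i^{o} \mid \boldsymbol{\theta})
   + \sum_{i} \log f(\boldsymbol{r}_i \mid \boldsymbol{y}_i^{o}, \boldsymbol{\psi}).
\end{align*}
```

When $\boldsymbol{\theta}$ and $\boldsymbol{\psi}$ are distinct, maximizing the first sum alone gives the maximum likelihood estimate of $\boldsymbol{\theta}$, which is why the missingness model can be ignored in that case.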

Computing the observation probability

• Preparation for computing the integral

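Numerical integration in IML

• QUAD call
– Performs numerical integration of a one-dimensional real-valued function
– call quad(r, "fun", points) ;
  • r: vector containing the results of the integration
  • fun: name of the function module that defines the integrand
  • points: vector specifying the limits of integration (−∞ is specified as .M and ∞ as .P)

• Example:

proc iml ;
  result1=sqrt(constant("PI")/2) ;   * exact value of the integral ;
  start fun(x) ;
    y=exp(-x##2/2) ;
    return(y) ;
  finish fun ;
  r={0 .P} ;                          * integrate from 0 to +infinity ;
  call quad(result2,"fun",r) ;
  print result1 result2 ;
quit ;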

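Optimization in IML

Optimization subroutines used in the macros

• NLPQN call
– Optimization by a quasi-Newton method
• NLPNRR call
– Optimization by the Newton-Raphson method with ridge stabilization

call subroutine-name(rc, xr, "fun", x0) ;

• rc: return code indicating why the optimization terminated (rc>0 means success)
• xr: vector containing the parameter values when the optimization succeeds
• fun: name of the function module that defines the function to optimize
• x0: vector of initial values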

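• Other subroutines available in IML
– NLPCG: conjugate gradient method
– NLPDD: double dogleg method
– NLPNMS: Nelder-Mead simplex method
– NLPNRA: Newton-Raphson method with line search
– NLPTR: trust region method

Doi et al. (土居主尚ら, 2007), 「SASプロシジャにおける非線形最適化法の実例」 (Examples of nonlinear optimization methods in SAS procedures), gives a brief description of each method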

• Example:

proc iml ;
  start fun(x) ;
    y=sin(x) ;
    return(y) ;
  finish fun ;
  x0={0} ;
  opt1={1} ;                                   *--- maximization ;
  opt2={0} ;                                   *--- minimization ;
  con=-constant("PI")//constant("PI") ;        *--- bounds on x ;
  call nlpnrr(rc1,xr1,"fun",x0) opt=opt1 blc=con ;
  max=fun(xr1) ;
  print rc1 xr1 max ;
  call nlpnrr(rc2,xr2,"fun",x0) opt=opt2 blc=con ;
  min=fun(xr2) ;
  print rc2 xr2 min ;
quit ;

Options
• opt=: the first element selects maximization or minimization; the second element can be used to control the output
• blc=: specifies constraints on the parameters
• grd=: can additionally be used to supply the derivatives

• Choosing an optimization method (from the SAS/STAT User's Guide)
– Techniques that do not use the Hessian also tend to be less reliable. For example, they can more easily terminate at stationary points rather than at global optima.
– The second-derivative methods TRUREG, NEWRAP, and NRRIDG are best for small problems where the Hessian matrix is not expensive to compute. Sometimes the NRRIDG algorithm can be faster than the TRUREG algorithm, but TRUREG can be more stable

FOD: First Order Derivative
SOD: Second Order Derivative

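Optimization in NLMIXED

• Optimization with PROC NLMIXED
– Makes use of the fact that the objective function to optimize can be specified as a general log-likelihood function
– The following optimization methods are available (specified with the TECH= option of the PROC NLMIXED statement)
  • CONGRA: conjugate gradient method
  • DBLDOG: double dogleg method
  • NMSIMP: Nelder-Mead simplex method
  • NEWRAP: Newton-Raphson method with line search
  • NRRIDG: Newton-Raphson method with ridge stabilization
  • QUANEW: quasi-Newton method (default)
  • TRUREG: trust region method

• Example:

data dummy ; y=0 ; run ;
proc nlmixed data=dummy tech=nrridg ;
  parms x=0 ;   * initial value of the parameter ;
  bounds -1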

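Computing derivatives in IML

• NLPFDD call
– Approximates derivatives by finite differences
– Returns the function value f, the gradient vector g, and the Hessian matrix h at the point x0
– The Hessian matrix can be used to compute the variance of the maximum likelihood estimator
– call nlpfdd(f, g, h, "fun", x0) ;
  • f: value of the function "fun" at the point x0
  • g: value of the gradient vector of "fun" at the point x0
  • h: value of the Hessian matrix of "fun" at the point x0
  • fun: name of the function module
  • x0: point at which the function value and derivatives are computed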

Optimization and computation of derivatives in IML

• Example:

proc iml ;
  start fun(x) ;
    y1=10*(x[2]-x[1]*x[1]) ;
    y2=1-x[1] ;
    f=0.5*(y1*y1+y2*y2) ;
    return(f) ;
  finish fun ;
  opt={0} ;
  x0={0 0} ;
  call nlpnrr(rc1,xr1,"fun",x0) ;     * minimize fun starting from x0 ;
  call nlpfdd(f,g,h,"fun",xr1) ;      * value, gradient, Hessian at the optimum ;
  print xr1 f g h ;
quit ;
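The statement that the Hessian yields the variance of the maximum likelihood estimator can be illustrated with a small, self-contained sketch. The data values, the module name loglik, and the starting values are hypothetical; the example fits a normal mean and standard deviation by maximum likelihood and takes the inverse of the negative Hessian at the optimum as the estimated covariance matrix.

```sas
proc iml ;
  /* hypothetical sample of 8 observations */
  y = {4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4} ;

  /* normal log-likelihood (additive constant omitted); param = (mu sigma) */
  start loglik(param) global(y) ;
    mu    = param[1] ;
    sigma = param[2] ;
    ll = sum( -log(sigma) - (y-mu)##2 / (2#sigma##2) ) ;
    return(ll) ;
  finish loglik ;

  x0   = {4 1} ;                 /* starting values                    */
  opt1 = {1} ;                   /* maximize                           */
  con  = {. 1e-10,               /* lower bounds: sigma > 0            */
          . .} ;                 /* upper bounds: none                 */
  call nlpnrr(rc, est, "loglik", x0) opt=opt1 blc=con ;

  /* finite-difference Hessian at the estimate */
  call nlpfdd(f, g, h, "loglik", est) ;
  cov = inv(-h) ;                /* estimated covariance of (mu sigma) */
  se  = sqrt(vecdiag(cov)) ;
  print rc est, se ;
quit ;
```

The same pattern, NLPNRR followed by NLPFDD at the solution, is what the macro structure described in this presentation uses to obtain standard errors.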

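Example with the bivariate normal distribution (MAR)

– Generate data following a bivariate normal distribution (N=100)
– 𝑋 is always observed
– Only 𝑌 may be missing; the probability that 𝑌 is missing is defined so that whether 𝑌 is missing does not depend on the value of 𝑌 itself

– Full data (including the values that are treated as missing)

        n     mean       sd
  𝑥     100    0.00771   0.99725
  𝑦     100   -0.05392   0.97332

– Observed cases

        n     mean       sd
  𝑥     100    0.00771   0.99725
  𝑦      62   -0.20471   1.02698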



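– Computing the log-likelihood

proc iml ;
  ...
  start likelihood(theta) global(x_obs, y_obs, x_mis) ;
    * theta = (mux muy sx sy rho) ;
    * x_obs: values of x for subjects whose y is observed ;
    * y_obs: values of y for subjects whose y is observed ;
    * x_mis: values of x for subjects whose y is missing ;

    * both x and y observed ;
    ll_obs = -log(theta[3]) -log(theta[4]) -1/2#log(1-theta[5]##2)
             -1/(2#(1-theta[5]##2)) #
             ( (x_obs-theta[1])##2/theta[3]##2
               -2#theta[5]#(x_obs-theta[1])#(y_obs-theta[2])/(theta[3]#theta[4])
               +(y_obs-theta[2])##2/theta[4]##2 ) ;
    ...
  finish ;

Note: ll_obs is the log of the bivariate normal density, omitting the additive constant:

$$ f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left[ -\frac{1}{2(1-\rho^2)} \left\{ \frac{(x-\mu_x)^2}{\sigma_x^2} - 2\rho\,\frac{(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2} \right\} \right] $$

  theta      parameter
  theta[1]   $\mu_x$
  theta[2]   $\mu_y$
  theta[3]   $\sigma_x$
  theta[4]   $\sigma_y$
  theta[5]   $\rho$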


– Computing the log-likelihood (continued)

proc iml ;
  ...
  start likelihood(theta) global(x_obs, y_obs, x_mis) ;
    * theta = (mux muy sx sy rho) ;
    ...
    * x observed, y missing ;
    ll_mis = -log(theta[3]) -(x_mis-theta[1])##2 / (2#theta[3]##2) ;
    * stack both parts and sum over subjects ;
    ll_all=ll_obs//ll_mis ;
    ll=ll_all[+] ;
    return(ll) ;
  finish ;

Note: ll_mis is the log of the univariate normal density of 𝑥, omitting the additive constant:

$$ f(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\left\{ -\frac{(x-\mu_x)^2}{2\sigma_x^2} \right\} $$


– Maximum likelihood estimation of the parameters

proc iml ;
  ...
  x0={0 0 1 1 0} ;                 * initial values ;
  opt1={1} ;                       * maximize ;
  con={. . 1e-10 1e-10 -1,
       . . .     .      1} ;       * lower and upper bounds ;
  call nlpnrr(rc1,xr1,"likelihood",x0) opt=opt1 blc=con ;
  mux=xr1[1] ;
  muy=xr1[2] ;
  vx=xr1[3]##2 ;
  vy=xr1[4]##2 ;
  cov=xr1[5]#xr1[3]#xr1[4] ;
  print mux muy vx vy cov ;
quit ;

– Estimates
  The estimates agree with the results of PROC MIXED


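Structure of the macro programs (1)

• %initial
1. Stores the group information in the variable group and numbers the groups consecutively from 0 in ascending order
2. Creates a data set containing every combination of subject and time point (a data set that includes records for missing visits)
3. Runs an MMRM with PROC MIXED using the model given by the macro variables "modl", "clasvar", and "covtype", and computes the initial values used in the optimization
4. Uses the OUTDESIGN option of PROC LOGISTIC to create, for the data set specified in inputds, the design matrix corresponding to the model given by the macro variables "modl" and "clasvar"

Structure of the macro programs (2)

• %SeM2
1. Defines the integrand
2. Defines the derivatives of the integrand with respect to each parameter
3. Identifies the time point of dropout and, if non-monotone missingness is present, deletes the records that cause it
4. Computes the log-likelihood function separately for completers and for dropouts
5. Defines the derivatives of the log-likelihood function with respect to each parameter for completers and for dropouts
6. Performs the optimization with the NLPQN call or the NLPNRR call
7. Computes the Hessian matrix with the NLPFDD call
8. Computes the LS means and their SEs at each time point and overall, and the p-value of the Z test for the between-group difference in LS means
9. Creates data sets containing the results and outputs the results

Structure of the macro programs (3)

• %Selection_Model2
1. Calls %initial
2. Calls %SeM2
• %SM_GridSearch
1. Reads the values of psi5 and psi6 from &psi5grid and &psi6grid
2. Runs %Selection_Model2 for every combination of psi5 and psi6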

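Discussion of sensitivity analysis based on the SM (summary)

• NRC (2010)
– In the Selection Model, it is natural to assume a joint distribution for the response variable, 𝑓(𝒀|𝜽), including the missing data
– Results of a sensitivity analysis based on the Selection Model are easily affected by the assumption made about the distribution of the response variable, so care is needed (the validity of the assumption on 𝑓(𝒀|𝜽) cannot be checked)
– Disadvantages of sensitivity analysis based on the Selection Model include the difficulty of interpreting the sensitivity parameter (𝜓) and of setting an appropriate range for it
• O'Kelly and Ratitch (2014)
– Results obtained from a sensitivity analysis based on the Selection Model are often difficult to interpret and to explain to clinicians
– A sensitivity analysis based on the Selection Model should be carried out only when clinicians can interpret the parameters of the model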

Discussion of sensitivity analysis based on the SM

• Diggle and Kenward (1994)

– The method proposed relies on the correct description of the conditional distribution of a missing value given the history of the process. This suggests that the estimators may not be robust to misspecification of the joint distribution, particularly the covariance structure.

– Dawid, Laird, Little, Matthews, Rubin and Skinner all commented on the sensitivity of our analyses to model assumptions which are difficult to verify from the available data. We agree, and suspect that the same must be true of any attempt to make inferences in the presence of informative drop-outs ... this suggests that more attention should be paid to sensitivity analyses under different plausible assumptions about the drop-out process


• NRC (2010)

– it seems more natural to assume a model for the full-data response, as is done in selection models

– However, it may not be intuitive to specify the relationship between nonresponse probability and the outcome of interest, which typically has to be done in the logit or probit scale. Moreover, the predictive distribution of missing responses typically is intractable, so it can be difficult to understand in simple terms how the missing observations are being imputed under a given model

– "In parametric selection models fit under the MNAR assumption, identification of parameters and sensitivity to assumptions raises serious problems", "selection models are highly sensitive to parametric assumptions about the full data distribution"

– potential disadvantage of selection models relates to interpretation of the sensitivity parameter. Particularly for continuous measures, it may be difficult to interpret nonresponse rates on the odds scale and to specify reasonable ranges for the sensitivity parameter


• Molenberghs et al. (2007)

– while MNAR models are more general and explicitly incorporate the dropout mechanism, the inferences they produce are typically highly dependent on the untestable and often implicit assumptions built in regarding the distribution of the unobserved measurements given the observed ones.

– concern has been growing precisely about the fact that models often rest on strong assumptions and relatively little evidence from the data themselves

– we must be extremely careful with interpreting evidence for or against MNAR in a selection model context

– there are further important issues surrounding MNAR based models. Even when the measurement model (e.g., the multivariate normal model) would beyond any doubt be the choice of preference for describing the measurement process should the data be complete, then the analysis of the actually observed, incomplete version is, in addition, subject to further untestable modeling assumptions



• O'Kelly and Ratitch (2014)

– "Use selection model ... only if confident that the modeled parameters can be interpreted clearly by a non-statistician", "we find the results from selection models difficult to interpret and difficult to explain to clinicians"

– Regression coefficient for 𝑌𝑚 in the model for R can allow the model to test for dependence on unobserved data. However, importantly ... this test itself relies on untestable assumptions about the distribution of the data, and so is not a useful test of dependence – it cannot be used to prove that the data are or are not MAR

– scenarios of greater or lesser dependence ... can be played out by assigning larger or smaller values to the coefficient for 𝑌𝑚 in the model, and assessing the resulting estimates of treatment effect. ... we add a caveat that these estimates can be difficult to interpret in a clinically meaningful way



• Daniels and Hogan (2008)

– selection models make assumptions about the missingness mechanism in a way that is inextricably linked to assumptions about distribution of the outcome, and so are not even in theory suited be used as sensitivity analyses


• Mallinckrodt (2013)

– in addition to estimating the MNAR parameters, which is driven by assumption, it is useful to vary the parameters describing the MNAR mechanism across a plausible range

– For example, a selection model can be implemented in which from the dropout model an estimate of the association between the present, possibly missing efficacy outcome and the probability of dropout is obtained. However, it is hard to have confidence in this estimate. Therefore, subsequent selection models can be run wherein differing values of this parameter are input and fixed in the analysis, thereby facilitating assessment of changes in the magnitude of the primary treatment contrast across the various analyses.



Interpretation of the sensitivity parameter in the SM

• NRC (2010)

– Plots such as those shown in Figure 5-2 can be helpful in understanding how values of the sensitivity parameter correspond to imputed means for the missing outcomes.