View
0
Download
0
Category
Preview:
Citation preview
A Flexible Copula Specification Test under Censorship
Juan Lin∗ Ximing Wu†
Abstract
We propose a copula specification test for copula-based multivariate survival models under
censorship. This flexible test is applicable to both Archimedean and non-Arhcimedean copulas
and provides useful pointers for alternative copula construction when a null distribution is
rejected. It is shown to be consistent and asymptotically distribution free. We demonstrate
its good finite sample performance via Monte Carlo simulations. Two empirical applications
illustrate the usefulness of this test.
Keywords: copula specification; smooth test; censorship; survival analysis
∗Department of Finance, School of Economics & Wang Yanan Institute for Studies in Economics, Xiamen Uni-versity, China. Email: linjuan.bnu@gmail.com†Corresponding author. Department of Agricultural Economics, Texas A&M University, College Station, TX,
77843, U.S.A. Tel: (979) 458-1355, Fax: (979) 862-1563, Email: xwu@tamu.edu
1 Introduction
Copula has been an increasingly popular tool to model the dependence structure of multivariate
data. One major advantage of the copula approach is its flexibility in separately modeling the
marginal distributions of individual variables and their dependence structure; cf. Cherubini et al.
(2004, 2011), McNeil et al. (2005), Patton (2012) and references therein. It has been widely
appreciated that the specification of copula models is of great importance since different copulas
lead to multivariate models that may have very different dependence properties. For instance, the
Gaussian copula has been widely used to model the collateralized debt obligation (CDO) due to its
simplicity. However, it imposes the restriction that extreme events are asymptotically independent
and therefore does not accommodate tail dependence during times of financial turmoil. Due to this
drawback, it has been vilified as “the formula that killed Wall Street” (Salmon, 2009). A number
of studies have attempted to address the issue of copula specification. For general overviews of
copula specification tests for complete (i.e., uncensored) data, see among others Chen and Fan
(2005, 2006), Genest et al. (2009), Berg (2009) and Fermanian (2013).
This study concerns censored data, which are frequently encountered in many areas, such as
actuarial science, finance, economics and biology; cf. Frees et al. (1996), Klugman and Parsa (1999),
Denuit et al. (2006), Rigobon and Stoker (2007) among others. For example, in household economic
surveys, high earnings are typically topcoded (right censored) for confidentiality. Unemployment
duration data are often censored because an unemployment spell might continue beyond the end of
data collection period. In credit risk analysis, a firm’s lifetime is censored if either default does not
occur by the end of the observation period, or if the firm drops out of the sample for reasons other
than default such as merger, acquisition or liquidation. In actuarial science, the insurance losses are
usually bounded from above because the amount of claim cannot exceed the insured’s policy limit.
The lifetimes of annuitants are censored if they live beyond the end of their contractual guarantee
period.
There is a small body of work that extends copula specification tests to censored data, with
a focus on bivariate Archimedean copulas. Shih (1998) introduced a procedure to assess the fit
of the Clayton copula. Wang and Wells (2000) proposed a model selection procedure within a
family of Archimedean copulas based on the truncated version of the Kendall process introduced
by Genest and Rivest (1993). Lakhal-Chaieb (2010) further modified Wang and Wells’s (2000)
test, proposing a new inverse probability censoring weighted estimator for Kendall’s τ . Emura
et al. (2010) proposed a test that is based on the discrepancy between the unweighted and weighted
concordance estimators of the associated Archimedean copula parameter. Wang (2010) considered a
test based on the correlation between two random quantities that are shown to be independent under
Archimedean copulas by Genest and Rivest (1993). Yilmaz and Lawless (2011) proposed some
likelihood and pseudo-likelihood ratio procedures for parametric and semiparametric Archimedean
copula families.
1
Several recent studies considered non-Archidemean copulas for censored data. Andersen et al.
(2005) proposed a goodness-of-fit test based on the partition of the unit square and the comparison
of the estimated frequencies for a parametrically estimated copula and an empirical copula. This
test, however, may not be suitable for heavy censoring; see Yilmaz and Lawless (2011). Chen et al.
(2010) proposed a likelihood-based criterion of copula selection under general censorship. Gribkova
and Lopez (2015) considered non-parametric copula estimation under bivariate censoring.
In this paper we propose a general copula specification test for censored data along the lines
of Neyman’s (1937) smooth test of distributions. The smooth test has been developed in many
directions. For instance, Gray and Pierce (1985) extended it to censored data; Lin and Wu (2015)
applied it to copula specification for complete data. Ledwina (1994) introduced a data driven
smooth test of distribution, which is generalized to test independence by Kallenberg and Ledwina
(1999). This study extends data driven smooth test to multivariate models with censored data.
Unlike most of the existing tests, the proposed test accommodates both Archimedean and non-
Archimedean copulas. We show that the proposed test is asymptotically distribution free and
establish its large sample properties. In addition, it is diagnostic and conductive to improved
copula specification: when a null hypothesis is rejected under moment functions selected by some
data-driven principle, an alternative copula can be constructed by augmenting the null distribution
with these selected moment functions in the spirit of Efron and Tibshirani (1996).
The remainder of this paper is organized as follows. Section 2 briefly reviews Neyman’s smooth
test of distribution and several extensions pertinent to this study. Section 3 proposes a copula
smooth test for censored data under simple hypotheses without unknown parameters. In Section
4, we present a general copula smooth test with estimated marginal distributions and copula pa-
rameters and derive its large sample properties. Finite sample performances of the proposed test
are investigated in section 5 through Monte Carlo simulations. Section 6 provides two empirical
applications in actuarial science. All proofs are relegated to the appendix.
Throughout the paper, we use upper case letters to denote cumulative distribution functions
and corresponding lower case letters to denote density functions. We use subscript t to index
observations and subscript j to denote the coordinate of multivariate random vectors. We use Ik
to denote an k × k identity matrix and a prime symbol to indicate matrix transpose.
2 Background
Neyman (1937) introduced the smooth test of goodness of fit. The main thrust of smooth test is
to embed a hypothesized null density into a larger exponential family of alternatives, transforming
the hypothesis regarding a certain distribution into that of a general distribution against a nested
special case. Gray and Pierce (1985) extended Neyman’s smooth test to accommodate censored
data. Denote by T the survival time of interest from an unknown continuous distribution and
by D a censoring variable. We assume that D is independent of T . Suppose n independently
2
(but not necessarily identically) distributed observations (Xt, δt)nt=1 are available, where Xt =
Tt ∧ Dt and δt = I(Tt ≤ Dt), in which a ∧ b = min(a, b) and I(·) is the indicator function.
The hypothesis of interest is whether a random sample T1, ..., Tn are generated from a specific
distribution F (t; θ), where θ is a vector of finite-dimensional unknown parameters. Let f(t; θ) and
S(t; θ) be the corresponding density and survival function respectively. Under the null hypothesis,
S(Tt; θ)nt=1 are distributed according to the standard uniform distribution. Gray and Pierce
(1985) considered a family of alternative densities for T of the form
f(t; θ) exp(λ′ψ(S(t; θ))− λ0), (1)
where ψ = (ψ1, ..., ψk)′ is a vector of linearly independent, bounded real valued functions defined
on [0, 1], λ = (λ1, ..., λk)′ ∈ Rk, and λ0 is a normalization constant.1 Neyman’s smooth test of
distribution then amounts to the test of λ = 0.
Define ut = S(Xt; θ). The average log-likelihood of the observed data under alternative (1)
through a change of variable is given by
1
n
n∑t=1
δt[log f(Xt; θ) + λ′ψ(ut)] + (1− δt) log
∫ ut
0exp(λ′ψ(u))du
− λ0. (2)
Let θ be the maximum likelihood estimate for θ and ut = S(Xt; θ). The score of (2) with respect
to λ, evaluated at λ = 0, is given by
g =1
n
n∑t=1
g(ut, δt),
with
g(ut, δt) = δtψ(ut) + (1− δt)∫ ut
0ψ(u)du/ut −
∫ 1
0ψ(u)du
= δtψ(ut) + (1− δt)E[ψ(U)|U < ut]− E[ψ(U)],
where U is a standard uniform random variable. It is seen that the uncensored observations con-
tribute to the score function in the usual form of ψ(S(Xt; θ)), while contributions from the censored
observations take the form E[ψ(S(Tt; θ))|Tt > Xt]. Thus, g yields the conditional expectation of
the score function for the complete data Ttnt=1 given the observed data (Xt, δt)nt=1. This is
similar to the idea of the EM algorithm for censored data proposed by Dempster et al. (1977).
It is straightforward to show that under the null hypothesis, n−1∑n
t=1E[g(ut, δt)] = 0. Let Ω =
n−1∑n
t=1 var[g(ut, δt)]. It follows that under the null hypothesis, gp→ 0 and
√nΩ−1/2g
d→ N(0, Ik)
1In a similar spirit, Pena (1998) considered a smooth goodness-of-fit test for hazard functions, embedding a baselinehazard function in a larger family of hazard functions.
3
as n → ∞ under suitable regularity conditions. Let Ω be a consistent estimate of Ω. Neyman’s
smooth test of distribution for censored data is then given by
Nk = ng′Ω−1g.
Under the null hypothesis, Nk converges in distribution to the χ2 distribution with k degrees of
freedom as n→∞.
Compared to the omnibus tests (e.g. the Komogorov-Smirnov test or Cramer-von Mises test),
the smooth tests have attractive finite sample properties (see Rayner and Best (1990) for a compre-
hensive review of the smooth tests). In practice, caution should be exercised regarding the choice
of the number of basis functions k, which influences the power of smooth tests. A customary prac-
tice is to use a small k, usually less than 4. Alternatively Ledwina (1994) proposed a data driven
smooth test. She suggested first applying the Bayesian Information Criterion (BIC) to choose a
suitable distribution within family (1) and then constructing the test statistic based on the chosen
distribution. In an extension to Ledwina (1994), Inglot and Ledwina (2006) considered a hybrid
selection criterion, which combines the strength of the BIC and Akaike Information Criterion (AIC)
and is shown to improve the power.
Kallenberg and Ledwina (1999) and Kallenberg (2009) proposed smooth tests for bivariate
distributions. Let ψi1i2(u1, u2) = ψi1(u1)ψi2(u2) and Ψ be a non-empty subset of ψi1i2 : 1 ≤ i1, i2 ≤d(n). Denote the cardinality of Ψ by k = |Ψ| and write Ψ = Ψ1, ...,Ψk. They considered a
bivariate version of (1) for the joint distribution of the pseudo-observations of two random variables,
exp
(k∑i=1
λiΨi(u1, u2)− λ0
), (u1, u2) ∈ [0, 1]2, (3)
where λ0 is a normalization constant. The test of independence within the exponential family (3)
is equivalent to a test of the hypothesis: λ = (λ1, . . . , λk)′ = 0. A smooth test of independence can
then be constructed analogously to the test of univariate uniformity.
In their study of bivariate distributions, Kallenberg and Ledwina (1999) suggested two spec-
ifications of the candidate set Ψ. The first specification includes only ‘diagonal’ entries such as
ψ11, ψ22, ... and so on. The second specification is more flexible and allows ‘non-diagonal’ entries
ψi1i2 with i1 6= i2. The set Ψ is arranged such that Ψ1 = ψ11 and the remaining entries are ar-
ranged in the descending order according to the absolute value of sample average of ψi1i2 ’s. In
either case, the BIC truncation rule is then applied to the candidate set.2 Denote by K the number
of terms selected according to the BIC. Kallenberg and Ledwina (1999) showed that Kp→ 1 and
the corresponding test statistic converges in distribution to χ21 under independence as n→∞ .
2In a similar spirit, Kallenberg (2009) used a threshold rule to select significant elements of Ψ for copula specifi-cations.
4
3 Copula test for censored data under simple hypothesis
To fix idea, in this section we introduce the smooth copula specification test for censored data under
the simplifying assumption of no unknown parameters. The general test under unknown marginal
distributions and copula parameters is given in the next section.
A copula is a multivariate probability distribution with standard uniform marginal distributions.
Copulas describe the dependence between random variables. Suppose (T1, T2) is a paired survival
times with joint survival function S(t1, t2) and density function f(t1, t2). Let (S1, S2) and (f1, f2)
denote the corresponding continuous marginal survival functions and density functions for (T1, T2).
A straightforward application of Sklar’s (1959) Theorem yields that there exists a unique copula
function C(·, ·) with density function c(·, ·) on [0, 1]2, such that
S(t1, t2) = C(S1(t1), S2(t2)),
f(t1, t2) = c(S1(t1), S2(t2))f1(t1)f2(t2),
where C(u1, u2) is sometimes called the survival copula.
Let c0(u1, u2;α), where (u1, u2) ∈ [0, 1]2, be a class of copula density functions parametrized
by a finite p-dimensional coefficient α ∈ A ⊂ Rp. In this section we consider the simplest case
where the marginal distributions are known and the hypothesized copula distribution is completely
specified with α = α0. The corresponding simple null hypothesis then takes the form
H0 : Pr(c(u1, u2) = c0(u1, u2;α0)) = 1, (4)
against the alternative hypothesis
H1 : Pr(c(u1, u2) = c0(u1, u2;α0)) < 1.
for some α0 ∈ A ⊂ Rp.We propose a smooth test of copula specification in the presence of censored data. Let ψ =
(ψ1, . . . , ψk)′ be a vector of linearly independent bounded real valued functions defined on [0, 1]2
and λ = (λ1, ..., λk)′ ∈ Rk. Similarly to Neyman’s smooth test of distribution, our smooth test
of copula specification can be motivated by a smooth alternative. Denote U0j = Sj(Tj), j = 1, 2.
Consider a family of densities for (U01 , U
02 ) of the form,
cψ(u1, u2;α0, λ) = c0(u1, u2;α0) exp(λ′ψ(u1, u2)− λ0) (5)
with a normalization constant λ0 = log∫c0(u1, u2;α0) exp(λ′ψ(u1, u2))du1du2. We note that cψ
in (5) has an appealing information theoretic interpretation: it can be derived as the density that
minimizes the conditional Kullback-Leibler information criterion between the target density and
the null density subject to moment conditions associated with ψ (cf. Efron and Tibshirani (1996)).
5
Under the assumption that (U01 , U
02 ) are distributed according to a distribution in family (5),
hypothesis (4) is equivalent to that λ = 0. Define
µψ = E[ψ(U01 , U
02 )] =
∫[0,1]2
ψ(u1, u2)c0(u1, u2;α0)du1du2,
where the expectation is taken with respect to the hypothesized null copula distribution. For sim-
plicity, we shall omit the subscript [0, 1]2 in the double integration whenever there is no ambiguity.
In the presence of right censorship, suppose we have n independent (but not necessarily identically
distributed) observations (X1t, X2t, δ1t, δ2t)nt=1, where Tjt and Djt are survival and censoring vari-
ables, Xjt = Tjt ∧ Djt and δjt = I(Tjt ≤ Djt), j = 1, 2. One can then construct a score test on
λ = 0 based on the discrepancy between µψ, which is the population mean of ψ(U01 , U
02 ) under
the null, and the sample analog of the conditional mean of the ψ(U01 , U
02 ) given the observations
(X1t, X2t, δ1t, δ2t)nt=1.
Let Ujt = Sj(Xjt), j = 1, 2. Denote by Cψ(u1, u2;α0) the distribution function of the den-
sity cψ(u1, u2;α0). The average log-likelihood under the parametric copula family (5) is given by
n−1∑n
t=1 lψ(U1t, U2t;α0), where
lψ(u1t, u2t;α) = δ1tδ2t log cψ(u1t, u2t;α) + δ1t(1− δ2t) log∂Cψ(u1t, u2t;α)
∂u1
+ (1− δ1t)δ2t log∂Cψ(u1t, u2t;α)
∂u2+ (1− δ1t)(1− δ2t) logCψ(u1t, u2t;α). (6)
Note that we suppress the dependence of lψ(u1t, u2t;α) on δ1t, δ2t for notational simplicity. Such
simplifications will also be applied to other likelihood functions. Substituting (5) into (6) yields
the following log-likelihood
1
n
n∑t=1
δ1tδ2t[λ
′ψ(U1t, U2t) + log c0(U1t, U2t;α0)]
+ δ1t(1− δ2t) log
∫ U2t
0exp(λ′ψ(U1t, u2))c0(U1t, u2;α0)du2
+ (1− δ1t)δ2t log
∫ U1t
0exp(λ′ψ(u1, U2t))c0(u1, U2t;α0)du1
+ (1− δ1t)(1− δ2t) log
∫ U2t
0
∫ U1t
0exp(λ′ψ(u1, u2))c0(u1, u2;α0)du1du2 − λ0
. (7)
The derivative of (7) with respect to λ, evaluated at λ = 0, is given by
gψ =1
n
n∑t=1
gψ(U1t, U2t;α0), (8)
6
where
gψ(u1t, u2t;α) = δ1tδ2tψ(u1t, u2t) + δ1t(1− δ2t)
∫ u2t0 ψ(u1t, u2)c0(u1t, u2;α)du2
C1(u1t, u2t;α)
+ (1− δ1t)δ2t
∫ u1t0 ψ(u1, u2t)c0(u1, u2t;α)du1
C2(u1t, u2t;α)
+ (1− δ1t)(1− δ2t)
∫ u2t0
∫ u1t0 ψ(u1, u2)c0(u1, u2;α)du1du2
C0(u1t, u2t;α)
−∫ψ(u1, u2)c0(u1, u2;α)du1du2 (9)
with C1(u1t, u2t;α) =∫ u2t
0 c0(u1t, u2;α)du2 and C2(u1t, u2t;α) =∫ u1t
0 c0(u1, u2t;α)du1. Again we
suppress the dependence of gψ on δ1t and δ2t when there is no confusion.
Similar to g defined in Section 2, gψ estimates the conditional expectation of the score function
for the (unobserved) complete data given the observations (X1t, X2t, δ1t, δ2t)nt=1. Define
Ωψ =1
n
n∑t=1
var[gψ(U1t, U2t;α0)]. (10)
It follows that under the null hypothesis, gψp→ 0 and
√nΩ−1/2ψ gψ
d→ N(0, Ik) as n → ∞ under
suitable regularity conditions.
Next define Ωψ = n−1∑n
t=1 gψ(U1t, U2t;α0)gψ(U1t, U2t;α0)′. We construct a smooth test for
copula specification as follows:
Qk = ng′ψΩ−1ψ gψ. (11)
Under null hypothesis (4), Qk can be shown to converge in distribution to the χ2 distribution with
k degrees of freedom under suitable regularity conditions given in the next section.
4 Copula test for censored data under composite hypothesis
In practice, the marginal distributions and copula coefficients are usually unknown and have to be
estimated. In the presence of unknown parameters, we now face the following composite hypothesis:
H0 : Pr(c(u1, u2) = c0(u1, u2;α0)) = 1 for some α0 ∈ A, (12)
against the alternative hypothesis
H1 : Pr(c(u1, u2) = c0(u1, u2;α)) < 1 for all α ∈ A.
One can estimate the marginal distributions using parametric or nonparametric methods. In
7
general, tests with parametrically estimated marginal distributions are more efficient if the para-
metric margins are correctly specified. However, under incorrect distributional assumptions, the
tests can suffer persistent size distortions even if the sample size goes to infinity; see Yilmaz and
Lawless (2011). In contrast, tests with non-parametrically estimated marginals are immune from
misspecification of the marginal distributions. We therefore focus on tests with non-parametrically
estimated marginals in this study.
Recall that Ujt = Sj(Xjt), j = 1, 2. We estimate Sj(·) by the Kaplan-Meier estimator
Sj(x) =∏
Xj(t)≤x
(1− 1
n− t+ 1
)δj(t),
where for j = 1, 2, Xj(1) ≤ Xj(2) ≤ ... ≤ Xj(n) are the order statistics of Xjtnt=1, and δj(t)nt=1 are
similarly defined. Lai and Ying (1991) established the consistency of Sj(·) under the assumption
of independent censoring.
Define (U1t, U2t) = (S1(X1t), S2(X2t)). Given (U1t, U2t)nt=1, we proceed to estimate the copula
parameters via the maximum likelihood estimation,
αn = arg maxα
1
n
n∑t=1
l(U1t, U2t;α), (13)
where l(u1t, u2t;α) is the same as (6) with c0 taking the place of cψ and C0 taking the place of Cψ.
Let ΨS be a non-empty subset of basis functions ψi1i2 : 1 ≤ i1, i2 ≤ d, where d is a fixed
upper bound.3 Denote the cardinality of ΨS by KS = |ΨS |. Similarly to (5), we consider a smooth
alternative distribution given by
cS(u1, u2;α0, λS) = c0(u1, u2;α0) exp(∑KS
i=1 λS,iΨS,i(u1, u2)− λS,0), (14)
where λS = (λS,1, ..., λS,KS)′ and λS,0 is a normalization constant.
One can envision that a smooth test on hypothesis (12) can be constructed based on the
estimated score
gS(αn) =1
n
n∑t=1
gS(U1t, U2t; αn), (15)
where gS(u1t, u2t;α) is the same as (9) with ψ replaced by ΨS . However, the test statistics (11) for
simple hypotheses is not directly applicable here because of the presence of nuisance parameters in
(15). There are two sets of nuisance parameters: the finite dimensional copula parameter α and
the infinite-dimensional marginal survival function Sj , j = 1, 2.
3Note that Escanciano and Lobato (2009) and Escanciano et al. (2013) also considered a fixed upper bound in theselection criterion.
8
In order to construct a test based on (15), we need to account for the influences of nuisance
parameters. We start with the asymptotic distribution of αn, which is studied by Chen et al. (2010).
Let lα(u1, u2;α) and lj(u1, u2;α) denote the partial derivatives of l(u1, u2;α) with respect to α and
uj . Further denote lαα(u1, u2;α) = ∂2l(u1, u2;α)/∂α∂α′, lαj(u1, u2;α) = ∂2l(u1, u2;α)/∂α∂uj and
Wjt ≡Wj(Xjt, δjt;α0) = Elαj(U1s, U2s;α0)Ij(Xjt, δjt)(Xjs)|Xjt, δjt, (16)
Ij(Xjt, δjt)(Xjs) = −Sj(Xjs)
[∫ Xjs
−∞
dNjt(u)
Pn,j(u)−∫ Xjs
−∞
IXjt ≥ udHj(u)
Pn,j(u)
],
where Hj(x) ≡ − log(Sj(x)) is the cumulative hazard function of Xj , Njt(x) = δjtI(Xjt ≤ x),
dNjt(x) = Njt(x)−Njt(x−), and Pnj(x) = n−1∑n
k=1 P (Xjk ≥ x). Also define
Bn = − 1
n
n∑t=1
Elαα(U1t, U2t;α0),
Σn =1
n
n∑t=1
varlα(U1t, U2t;α0) +2∑j=1
Wjt.
The asymptotic distribution of αn is given by the following theorem.
Theorem 1. (a) Under conditions C1-C5 given in the Appendix, ||αn − α0|| = op(1). (b) Fur-
thermore, if conditions A1-A4 hold, then BnΣ−1/2n√n(αn − α0)
d→ N(0, Ip).
Under the null hypothesis (12), we also have Bn = n−1∑n
t=1 lα(U1t, U2t;α0)lα(U1t, U2t;α0)′ by
the information matrix equality. The asymptotic variance of√n(αn − α0), B−1
n ΣnB−1n , can be
simplified to
B−1n +
1
nB−1n
n∑t=1
varW1t +W2tB−1n .
If the marginal distributions are known, the asymptotic variance of αn is reduced to B−1n .
Next define
Bn =1
n
n∑t=1
lα(U1t, U2t; αn)lα(U1t, U2t; αn)′,
Wjt =1
n
n∑s=1,s 6=t
lαj(U1s, U2s; αn)Ij(Xjt, δjt)(Xjs),
9
where for j = 1, 2,
Ij(Xjt, δjt)(Xjs) = −Sj(Xjs)
[I(Xjt ≤ Xjs)δjt
n−1∑n
k=1 I(Xjk ≥ Xjt)− 1
n
n∑l=1
I(Xjs ≥ Xjl)I(Xjt ≥ Xjl)δjl[n−1
∑nk=1 I(Xjk ≥ Xjl)]2
].
(17)
As indicated in Chen et al. (2010), an alternative expression for Ij(Xjt, δjt)(Xjs) is
Ij(Xjt, δjt)(Xjs) = −Sj(Xjs)
IXjt ≤ Xjs, δjt = 1Pnj(Xjt)
−∑
Xjl≤Xjs
I(Xjt ≥ Xjl)∆Hj(Xjl)
Pnj(Xjl)
,where Pnj(x) = n−1
∑nk=1 I(Xjk ≥ x) and ∆Hj(x) =
IYj(x)>0Yj(x)
dNj(x), in which Yj(x) =∑n
k=1 IXjk ≥x and Nj(x) =
∑nk=1Njk(x). ∆Hj(x) is the so-called Nelson’s estimator. Chen et al. (2010) es-
tablished the equivalence between these two expressions.
The asymptotic variance of αn can be consistently estimated by
B−n + B−n
1
n
n∑t=1
(2∑j=1
Wjt)(
2∑j=1
Wjt)′
B−n ,
where B−n is the generalized inverse of Bn. Further define
GS,α =1
n
n∑t=1
E[gS,α(U1t, U2t;α0)],
Zjt ≡ Zj(Xjt, δjt;α0) = E[gS,j(U1s, U2s;α0)Ij(Xjt, δjt)(Xjs)|Xjt, δjt], j = 1, 2, (18)
where gS,α(u1, u2;α) and gS,j(u1, u2;α) are the partial derivatives of gS(u1, u2;α) with respect to
α and uj . We obtain the following asymptotic distribution of gS(αn).
Theorem 2. Suppose that all conditions of Theorem 1 are satisfied and conditions T1-T3 given
in the Appendix hold. Under null hypothesis (12), gS(αn)p→ 0 and
√nΩ−1/2S gS(αn)
d→ N(0, IKS),
where
ΩS =1
n
n∑t=1
var
gS(U1t, U2t;α0) +
2∑j=1
Zjt + G′S,αB−1n
lα(U1t, U2t;α0) +
2∑j=1
Wjt
. (19)
Remark 1. The asymptotic variance ΩS reflects the influence of nuisance parameters in gS(αn).
If the marginal distributions are known, the terms involving W ′s and Z ′s are not needed, re-
sulting in ΩS = n−1∑n
t=1 vargS(U1t, U2t;α0) + G′S,αB−1
n lα(U1t, U2t;α0)
. When the copula pa-
rameter α is known, the third term on the right hand side of (19) is not needed, resulting in
10
ΩS = n−1∑n
t=1 vargS(U1t, U2t;α0) +∑2
j=1 Zjt. If both the marginal distributions and copula
parameter are known, ΩS is further reduced to n−1∑n
t=1 vargS(U1t, U2t;α0), which is exactly the
variance (10) derived in the previous section for simple hypotheses.
Remark 2. Another interesting special case is the independent copula, under which the estimation
error of the marginal distribution is asymptotically ignorable; see Genest et al. (1995). The proposed
test can be accordingly simplified to test independence between T1 and T2 under censorship. Similarly
to (14), we consider a smooth alternative distribution
cS(u1, u2;λS) = exp(∑KS
i=1 λS,iΨS,i(u1, u2)− λS,0).
The corresponding score function is given by gI = n−1∑n
t=1 gI(U1t, U2t), where
gI(u1t, u2t) = δ1tδ2tψ(u1t, u2t) + δ1t(1− δ2t)
∫ u2t0 ψ(u1t, u2)du2
u2t
+ (1− δ1t)δ2t
∫ u1t0 ψ(u1, u2t)du1
u1t
+ (1− δ1t)(1− δ2t)
∫ u2t0
∫ u1t0 ψ(u1, u2)du1du2
u1tu2t
It follows readily that gIp→ 0 and
√nΩ−1/2I gI
d→ N(0, IKS), where
ΩI =1
n
n∑t=1
var
gI(U1t, U2t) +2∑j=1
Zjt
.This test extends the rank based independence test of Kallenberg and Ledwina (1999) to censored
data.
Next we present a consistent estimator for ΩS . Define
GS,α =1
n
n∑t=1
gS,α(U1t, U2t; αn), (20)
Zjt =1
n
n∑s=1,s 6=t
gS,j(U1t, U2t; αn)Ij(Xjt, δjt)(Xjs), j = 1, 2. (21)
Let
ϕt = gS(U1t, U2t; αn) +2∑j=1
Zjt + G′S,αB−n
lα(U1t, U2t; αn) +2∑j=1
Wjt
.
11
It follows that ΩS can be estimated consistently by
ΩS =1
n
n∑t=1
(ϕt − ϕ)(ϕt − ϕ)′, (22)
where ϕ = n−1∑n
t=1 ϕt.
We can now construct a semiparametric smooth test of copula specification as follows.
Theorem 3. Suppose all conditions of Theorem 1 and 2 are satisfied. The semiparametric smooth
test of copula specification for censored data is given by
QS = ngS(αn)′Ω−1S gS(αn). (23)
Under null hypothesis (12), QSd→ χ2
KS.
A critical component of data driven smooth tests is the selection of suitable set of functions
ΨS from a candidate set Ψ = ψi1i2 : 1 ≤ i1, i2 ≤ d. In the spirit of Kallenberg and Ledwina
(1999), we rearrange the candidate set to Ψs such that Ψs,1 = ψ11 and the remaining elements,
ψi1i2 : 1 ≤ i1, i2 ≤ d, (i1, i2) 6= (1, 1), are ordered in the descending order according to their
corresponding entries in the vector Ω−1/2Ψ |
√ngΨ|. The normalization of the moments is necessary
as these moments are generally correlated and differ in variance.4 Denote by |Ψs| the cardinality of
Ψs. Let Ψs,(k) = Ψs,1, ...,Ψs,k, k = 1, ..., |Ψs| and the corresponding gs,(k)(U1t, U2t;α0), Ωs,(k) and
Qs,(k), as given in (15), (22) and (23), are similarly defined; their sample analogs are denoted by
gs,(k)(αn), Ωs,(k) and Qs,(k), respectively. For each k, let us,(k) = Ω−1/2s,(k) gs,(k)(αn). Following Inglot
and Ledwina (2006), we use the following criterion to select a suitable ΨS , whose cardinality is
denoted by KS :
KS = mink : Qs,(k) − Γ(k, n, ζ) ≥ Qs,(r) − Γ(r, n, ζ), 1 ≤ k, r ≤ |Ψs|, (24)
where Γ(r, n, ζ) is a complexity penalty defined as:
Γ(r, n, ζ) =
r log n if max1≤i≤|Ψs| |√nus,i| ≤
√ζ log n,
2r if max1≤i≤|Ψs| |√nus,i| >
√ζ log n,
(25)
where us,i is the ith element of us,(k) and ζ = 2.4. Note that the penalty is ‘adaptive’ such that
either the AIC or BIC is adopted in a data driven manner, depending on the empirical evidence
pertinent to the magnitude of deviation from the null copula distribution.
Next we present the asymptotic properties of the proposed test QS based on a set of basis
functions ΨS selected according to the procedure described above.
4In contrast, Kallenberg and Ledwina (1999) use ranks to construct a test of independence. The moments basedon Legendre polynomials are free of nuisance parameters and orthonormal under the null hypothesis.
12
Theorem 4. Let KS be selected according to (24). Suppose all conditions of Theorem 1 and 2 hold.
(a) Suppose (U1t, U2t) follows the distribution C0(u1, u2, α0). Then Pr(KS = 1)p→ 1 and QS
d→ χ21.
(b) Let P be an alternative. Suppose that under P, αn converges to some pseudo-parameter α
such that n−1∑n
t=1EP [gs,Υ(S1(X1t), S2(X2t);α)] 6= 0 for some Υ in 1, ..., k, where gs,Υ(u1t, u2t;α)
is the Υth element of gs,(k)(u1t, u2t;α), k in 1, . . . , |Ψs|. Then QS →∞ as n→∞.
Since the proposed test is constructed in a data driven fashion, the number of functions selected
is random even under the null hypothesis. Consequently, the χ21 distribution does not provide
an adequate approximation to the distribution of the test statistic for moderate sample sizes.
Nevertheless since it is asymptotically distribution free, the asymptotic distribution of the test
under the null can be easily approximated via simulations. Below we present a simple simulation
procedure for this purpose.
• Step 1: Generate an n-by-2 i.i.d. random sample (U1t, U2t)nt=1 from the Uniform[0, 1]2.
• Step 2: Select a set of basis functions ΨS according to (24); Calculate the test statistic Q∗S .
• Step 3: Repeat steps 1 and 2 L times; denote the results by Q∗S,lLl=1.
• Step 4: Use the qth percentile of Q∗S,lLl=1 to approximate the qth percent critical value for
the distribution of QS .
We stress that unlike tests that rely on bootstrap procedures for inference, the simulation
procedure adopted in this test is particularly easy to implement. It samples from independent
standard uniform distributions and does not entail simulation of the unknown censoring process
nor repeated estimation of the hypothesized copula from bootstrapped samples.
5 Simulations
We conduct a series of Monte Carlo simulations to assess the finite-sample performance of the
proposed test. Following the design of Wang (2010), we consider two commonly used parametric
Archimedean copulas as the null copula distributions: the Clayton and Frank copula. They share
a common structure Cφ(u1, u2) = φ−1φ(u1) + φ(u2), 0 < u1, u2 < 1, where the function φ(·) is
often referred to as the generator of an Archimedean copula. Regarding the alternative copulas,
we include the same two Archimedean copulas and Gumbel copula. For each copula, we consider
three levels of dependence by setting its parameter such that the corresponding Kendall’s τ = 0.3,
0.5 and 0.7. We generate the marginal distributions of T1 and T2 from the standard exponential
distribution. The censoring variable D1 and D2 are generated from the exponential distributions
with mean 5, such that there are about 30% of pairs where at least one component is censored. We
set sample size n = 300 and repeat each experiment 1,000 times.
13
We select the basis functions from the normalized Legendre polynomials, which are orthonormal
with respect to the Lebesgue measure on [0, 1]. Our ψ functions consist of elements of the tensor
products of these basis functions. Let ψj denote the jth normalized Legendre polynomial and
ψij(u1, u2) = ψi(u1)ψj(u2). We consider the candidate set ψ11, ψ12, ψ22, ψ13, ψ23, ψ33, which is the
upper triangle of a 3 by 3 matrix of basis functions. We apply the information criterion prescribed
in the previous section to select a suitable set of basis functions, based on which the test statistic is
calculated. We have experimented with larger candidate sets and found little difference in terms of
size and power across experiment designs and sample sizes. The critical value is computed following
the prescribed simulation procedure.
Most existing copula tests for censored data rely on bootstrap for inference. One exception is
Wang (2010). Let U and V be uniform random variables with an Archimedean copula function Cφ
and Y1 = φ(U)/ φ(U) + φ(V ) and Y2 = Cφ(U, V ). Genest and Rivest (1993) show that Y1 and Y2
are independent. Wang (2010) designed an Archimedean copula specification test that examines
the correlation between estimated Y1 and Y2. For censored data, he applied a multiple imputation
procedure to recover the pseudo complete data and then construct the test based on multiply
imputed complete data sets. Both our test and Wang’s test do not rely on bootstrap critical
values and therefore are computationally efficient. Unlike Wang’s test, ours does not require data
imputation.
For each experiment, we estimate the marginal distributions nonparametrically and calculate
their corresponding test statistics of copula specification. The empirical sizes and powers of the
proposed tests QS are reported in Table 1. For comparison, we also include the reported results
by Wang (2010) in the columns signified by Tm. The bold entries reflect the empirical sizes and
the regular font entries show the power of the tests. Our test shows reasonable size around 5% and
Wang’s test tends to be undersized. The two tests have similar powers in the testing of Clayton
copula. When the null hypothesis is Frank copula, our test shows considerably higher power,
especially when τ = 0.3 and 0.5. We caution that the two tests are not completely comparable.
Wang’s test utilizes a relationship specific to Archimedean copulas while our test is a general one
that can be applied to all copulas. In addition, Wang’s test only examines one specific deviation from
the null (no-zero correlation between two independent random variables), whereas ours considers
many plausible deviations in a data-driven fashion.
14
Table 1: Empirical sizes and powers of the proposed tests (%)
Copula under the null hypothesis
Clayton Frank
True Copula τ QS Tm QS Tm
Clayton 0.3 7.7 2.0 51.3 10.7
0.5 7.6 3.0 91.7 43.5
0.7 6.4 5.3 96.1 73.4
Frank 0.3 69.5 76.8 4.1 2.0
0.5 97.1 100 4.7 2.2
0.7 98.4 100 7.8 3.2
Gumbel 0.3 95.2 84.6 47.6 19.6
0.5 100 99.6 84.5 67.0
0.7 100 100 97.1 94.2
Note: This table reports the finite sample performances of the proposed test (QS). The finite sample performances
of Wang’s (2010) test (Tm) are also reported. The bold entries reflect the empirical sizes. The nominal size is 5%.
The sample size is 300. Results are based on 1, 000 replications.
6 Empirical example
In this section, we apply the proposed tests to two insurance company data.5 In either example, we
first conduct the specification test. When the null hypothesis is rejected, we proceed to estimate
an alternative copula function using diagnostic information provided by the test.
6.1 LOSS and ALAE data
This dataset, collected by the US Insurance Services Office, consists of 1,500 observations of insur-
ance loss and Allocated Loss Adjustment Expenses (ALAE). Among them, 34 observations of the
loss are right censored because the amount of claim cannot exceed the policy limit. The ALAE
data are not censored. Frees and Valdez (1998), Klugman and Parsa (1999), Chen and Fan (2005),
Denuit et al. (2006) and Lin and Wu (2015) have analyzed the subset of 1,466 complete observa-
tions. Chen et al. (2010) and Yilmaz and Lawless (2011) have studied the full dataset with 1,500
observations. In the analysis of the full dataset, it is important to account of the censoring in the
loss data; for example, the mean loss of censored claims is much higher than the corresponding
mean for uncensored claims, see Frees and Valdez (1998).
5We thank the Society of Actuaries, through the courtesy of Edward W. (Jed) Frees and Emiliano A. Valdez, forallowing use of these two data sets in this paper.
15
We estimate the marginal survival functions by the Kaplan-Meier estimator. The scatterplots
of the loss versus ALAE data on the logarithm scale and the resulting pseudo-observations are
displayed in Figure 1a and 1b.6 The black and gray symbols represent uncensored and censored
observations respectively in Figure 1a and 1b. There exists apparent positive right-tail dependence
between Loss and ALAE: large losses tend to be associated with large ALAE’s. On the other hand,
there is relatively weaker left-tail dependence between them. Due to this asymmetric dependence,
Gumbel copula was chosen as the best fitting copula by the existing studies. We therefore conduct
the specification test under null hypotheses of Gumbel copula. The semiparametric estimator yields
αn = 1.4440. The smooth test statistic QS = 0.0151, with a data-driven dimension KS = 1. The
simulated critical value for the sample size n = 1, 500 at the 5% level is 5.6612. In accordance with
the previous studies, the proposed smooth test does not reject the hypothesis of Gumbel copula.
2 4 6 8 10 12 14
46
810
12
LOSS
ALA
E
* *
***
*
*
*
*
*
*
*
**
*
**
**
*
*
*
*
*
**
**
*********************
***
*
*
*
********
*
****
*
**
**********************************
**
*
*****
********
*****
*
**
*
***********************
*
*
**
**
*
*
*
*
**
*******
**
*
*
*
***
*********************************
**
***
*
*
*
*
**
*
*
*
**
**
**********************************************************
***
*
*
*
**
****
*
*
****************************
*
*
**
*
*
*
*
*
**
**
***************
***********
*
**
**
***********
*
***
**
**
**
******
*
******
*
*
*
*********************************************************************
****
*
*
*
*
*************
*
*
*
*
***************
*
*
***
*
*
*
********
***
*
***
*
********
******************************************
*
*
*
*
*
*
*****
***
*
*
*
****
*
*
********
*
*
*
*
**
*
*
*********
*
*
*
****
*
*******************************************************************
**
**
*
***********
*
**
*
*
**
*
*
********
********************
***
*
*
**
**
***
*
*
**
*
*
*
*
****
*
***
*
*
*
*
************************************************
*
*
*
*
*
*****
****
********
***
*
**
*********
*
**
*****
*
***
*
***
*
****
*********
**
*******************************
***
****
**
*
****
*
******
*
*
*
************
*****
***
*
**
**
****************************************
*
*
**
*
*
*
*
***
*
*
*
****
*
*
*
******
*
*
********************
*
*
**
*
*
*
*
****
*
*
******
**
*
*
*
***
*
**************
****
*
**
**
**
*
***********
*
**
****
*
**
*
**
**************
**
***
******
*
*
**
*
**
************************
*
****
*
*
**
*****
*
****
**
********
*
*
*
***
*
*
*******
*
***
****
*
*
***
*
***
*
**
*
***************
*
*
******
**
**
********
***
****
*
**
*
**
*********
**
**
*
**
*
*
*
*
*
*
**
***
*
*
********
**
*
**
*
****
******
*
*
*
*
*
*
*
****
*
*
*
*
*
*
*
*
********
*
***
**
****
*
*****
*
***
*
*******
***
****
*
*
*
*
***
**
*
**
**
(a) Loss and ALAE on a log scale
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
LOSS
ALA
E
*
*
***
*
*
*
*
**
*
**
*
*
*
**
*
*
*
*
*
**
*******************
****
***
*
*
*
****
***
*
*
***
*
*
*
*
*********************************
*
**
*
****
*
*****
***
**
*
**
*
**
************************
*
*
**
**
*
*
*
*
****
**
*
*
*
**
*
*
*
**
*
*******************************
*
*
**
*
**
*
*
*
*
*
*
*
*
*
**
**
**********************************************************
*
*
*
*
*
*
*
*
**
*
*
*
*
******************
**********
**
*
*
*
*
*
*
*
**
*
*
*****
******
***
*
*
****
****
**
*
****
***********
*
*
**
*
*
**
**
*****
*
*
**
*
*
*
*
*
*
**********************************************************************
**
*
*
*
*
*
*
***
***
**
*
*
*
*
*
*
*
*
*
*******
********
*
*
***
*
*
****
**
*
**
*
*
*
*
*
*
*
*
*
***
****
***********
*******************************
*
*
*
*
*
*
*****
***
*
*
*
*
*
*
*
*
*
**
***
*
*
*
*
*
*
*
**
*
*
****
*
*
**
*
*
*
*
**
**
*
*******************************************************************
**
**
*
**
******
*
**
*
**
*
*
**
*
*
*****
*
*
*
*****
*
**************
*
*
*
*
*
**
**
**
*
**
*
*
*
*
*
*
*
*
**
*
***
*
*
*
*************************************************
*
*
*
*
*
*
**
*
*
*
**
*
*
*****
**
*
*
*
*
*
*
***
**
****
*
**
**
***
*
**
*
*
*
**
*
***
*
**
*
*
*
****
**
************
*******************
**
*
*
***
*
*
*
*
***
*
**
*
**
*
*
*
*
****
****
***
*
**
*
**
*
*
*
*
**
*
*
***************************************
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
***
*
*
*
*
*
**
*
*
*
**
*
*******************
*
*
*
*
*
*
*
*
**
**
*
*
***
*
**
*
*
*
*
*
**
*
*
***
****
*
******
*
*
*
*
*
*
*
**
*
*
*
*
**
********
*
**
***
*
*
*
*
*
*
*
***
*
*
****
*****
*
*
**
*
**
*
**
*
*
*
*
*
*
**
********
****************
*
*
**
*
*
*
*
*
*****
*
***
*
**
*
*
*****
*
*
*
*
**
*
*
*
**
*
*
****
**
*
****
*
*
*
*
*
*
***
*
**
*
*
***
***********
*
*
**
**
*
*
**
*
*
*
*
****
**
***
**
*
*
*
*
*
*
**
***********
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
******
**
*
**
*
*
**
*
*
**
*
**
*
**
*
*
*
*
****
*
*
*
*
***
*
*
*******
*
***
*
*
****
*
***
**
*
**
**
******
*
***
*
***
*
*********
**
**
*
*
*
*
*
*
*
***
***
*
*
*
*
**** *
*
*
***
*
******
(b) Pseudo-observations
Figure 1: Scatterplots of Loss Versus ALAE
We also conduct the specification test under the null hypotheses of Gaussian copula due to its
popularity in practice 7. The semiparametric estimate of the Gaussian copula coefficient yields
αn = 0.4671. The smooth test statistic, with a data-driven dimension KS = 3, is calculated to
be QS = 39.5312, considerably larger than the critical value of 5.6612. Thus the proposed test
decisively rejected the hypothesis of Gaussian copula at the 5% level.
6In this section, the pseudo-observations correspond to the marginal distributions rather than the marginal survivalfunctions.
7One important application of Gaussian copula for censored data is to construct the conditional Spearman corre-lations. Ang and Chen (2002) used the truncated normal distribution to define the conditional Pearson correlationson joint downside movements or upside movements, which occur when both a U.S. equity portfolio and the U.S.market fall or rise. They find significant correlation asymmetry in the U.S. equity market. By applying the Gaussiancopula, it’s straightforward to construct the conditional Spearman’s correlations and capture the asymmetry in thefinancial market.
16
An advantage of the proposed tests is that they provide useful pointers in the construction of
alternative copulas. When a null hypothesis is rejected, we can augment the null copula distribution
with extra features suggested by the selected moment functions of our data-driven test. This
strategy is used by Kallenberg (2008) for copula estimation with complete data and here we extend
it to censored data. In the present example, given the hypothesized Gaussian copula and the
selected moment functions ψ11, ψ12, ψ23, the implied alternative copula takes the form
ca(u1, u2;α, λ) = cG(u1, u2;α) exp(λ1ψ11(u1, u2) + λ2ψ12(u1, u2) + λ3ψ23(u1, u2)− λ0),
where cG is the Gaussian copula density function, and λ0 is a normalization constant. We adopt a
two-stage estimation approach. In the first step, we estimate the parameter α by restricting λi = 0,
i = 1, 2, 3. In the second step, we fix the estimated αn and use the EM algorithm to evaluate
the log-likelihood function corresponding to the parametric copula family ca(u1, u2; αn, λ). At the
maximization step, we use a Newton-type algorithm (see Wu, 2010) to update λ’s.
Figure 2a and 2b report the contour plots of the Gaussian copula density cG(αn = 0.4671) and
the alternative copula density function ca. The Gaussian copula allows only symmetric dependence.
In contrast, the alternative copula density captures a salient feature of the data: the density at the
upper-right corner is clearly higher than that at the lower-left tail corner, reflecting the asymmetric
tail behaviors revealed in Figure 1b.
0
1
2
3
4
5
6
0.2 0.4 0.6 0.8
0.2
0.4
0.6
0.8
(a) Gaussian copula density cG
0
2
4
6
8
10
0.2 0.6
0.2
0.4
0.6
0.8
(b) Alternative copula density ca
Figure 2: Contour plots of copula density functions for Loss and ALAE data
17
6.2 Annuitants mortality data
Standard insurance industry practice assumes independence of lifetimes when pricing annuities
where the promise is based on more than one life. Frees et al. (1996) study a dataset on 14,947
joint and last-survivor annuity contracts of a large Canadian insurer over the period December 29,
1988, through December 31, 1993. They use a Frank copula to model joint-life mortality and find
strong positive dependence between joint lives. Their finding suggests that annuity values ought
to be reduced by approximately 5 percent when dependent mortality models are used. Similarly
Carriere (2000) found that a (generalized) Frank copula fits the data well.
The dataset contains 11,454 heterosexual non-duplicating joint life contracts. During the sam-
ple period, there were 195 contracts with both annuitants died, 1,303 contracts with one of two
annuitants died, and 9,956 contracts with both annuitants survived. Due to the extremely high
censoring rate (98.3%) in this data, we utilize a more balanced subset in this exercise. In particu-
lar, we include 195 uncensored observations and randomly draw 195 observations from the sample.
The resultant subsample has 29 contracts with one variable censored and 166 contracts with both
variables censored; the overall censoring rate is 50%. We estimate the marginal survival functions
by the Kaplan-Meier estimator. The scatterplots of the lifetimes of the primary annuitants versus
those of secondary annuitants, and their corresponding pseudo-observations are displayed in Figure
3a and 3b. The black and gray symbols represent uncensored and censored lifetimes respectively
in Figure 3a and 3b. We conduct the specification test under the null hypotheses of Frank copula.
The semiparametric estimator yields αn = 7.9965, implying a strong positive dependence between a
couple’s mortalities. The smooth test statistic QS = 5.0244, with a data-driven dimension KS = 1.
The simulated critical value for the sample size n = 390 at the 5% level is 7.6409. In accordance
with the previous studies, the proposed smooth test does not reject the hypothesis of Frank copula.
References
Andersen, P. K., Ekstrøm, C. T., Klein, J. P., Shu, Y., and Zhang, M.-J. (2005), “A Class of Good-
ness of Fit Tests for a Copula Based on Bivariate Right-Censored Data,” Biometrical Journal,
47, 815–824.
Ang, A. and Chen, J. (2002), “Asymmetric correlations of equity portfolios,” Journal of financial
Economics, 63, 443–494.
Berg, D. (2009), “Copula goodness-of-fit testing: an overview and power comparison,” European
Journal of Finance, 15, 675–701.
18
60 70 80 90 100
5060
7080
9010
0
Lifetimes of primary annuitants
Life
times
of s
econ
dary
ann
uita
nts
*
*
*******
**
***
****
*
**
*
**
*
***
*
*
*
*
*******
**
********
*******
*
*
*
**
*
*
**
****
**
***
*
*
**
**
**
*****
**
************
*
*
****
**
*
*
*
********
*
*
*
********
**
***
****
**
*******
***
***
*
****
*
****
*
**
****
*
********
**
***
*
*****
*
**
(a) Lifetimes of the annuitants
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Lifetimes of primary annuitants
Life
times
of s
econ
dary
ann
uita
nts
*
*
*******
*
*
*
**
*******
*
**
*
***
*
*
*
*
****
*
*
*
**
**
******
****
***
*
*
*
*
*
**
*
*
**
*
*
**
*
**
*
*
**
**
**
*
**
*
*
*
*
*
***
**
***
*
*
*
*
*
****
**
*
*
*
*
**
*
****
*
*
*
*
*
*
*
*
*
*
*
**
**
*
**
*
*
**
**
**
**
*
**
*
*
*
*
*
*
***
*
****
*
**
***
*
*
**
**
**
**
**
***
******
*
**
*
*
*
*
*
**
*
*
*
*
*
*
**
* *
**
*
*
*
*
**
*
*
* **
*
*
*
* *
*
*
*
*
*
*
*
*
*
**
*
**
*
*
*
*
*
*
*
*
***
*
*
**
*
*
*
*
*
*
*
*
***
*
*
*
*
*
*
*
*
**
*
*
*
*
*
*
*
**
*
*
*
*
*
*
**
* *
*
*
**
*
*
*
*
*
*
*
* *
* *
*
*
*
**
***
*
*
*
*
*
*
*
*
*
**
**
*
*
**
**
*
*
*
*
**
*
* *
*
*
*
*
*
*
**
*
*
*
*
**
*
*
*
*
*
*
*
*
*
*
*
*
*
**
*
*
**
*
**
*
*
*
*
(b) Pseudo-observations
Figure 3: Scatterplots of lifetimes of the primary and secondary annuitants
Carriere, J. F. (2000), “Bivariate Survival Models for Coupled Lives,” Scandinavian Actuarial
Journal, 2000, 17–32.
Chen, X. and Fan, Y. (2005), “Pseudo-likelihood ratio tests for semiparametric multivariate copula
model selection,” Canadian Journal of Statistics, 33, 389–414.
— (2006), “Estimation and model selection of semiparametric copula-based multivariate dynamic
models under copula misspecification,” Journal of Econometrics, 135, 125–154.
Chen, X., Fan, Y., Pouzo, D., and Ying, Z. (2010), “Estimation and model selection of semipara-
metric multivariate survival functions under general censorship,” Journal of Econometrics, 157,
129 – 142.
Cherubini, U., Luciano, E., and Vecchiato, W. (2004), Copula Methods in Finance, Wiley, England.
Cherubini, U., Mulinacci, S., Gobbi, F., and Romagnoli, S. (2011), Dynamic Copula methods in
finance, Wiley, England.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977), “Maximum likelihood from incomplete
data via the EM algorithm,” Journal of the Royal Statistical Society. Series B (Methodological),
1–38.
Denuit, M., Purcaruff, O., and Van Keilegorni, I. (2006), “Bivariate Archimedean copula models
for censored data in non-life insurance,” Journal of Actuarial Practice, 13, 5–32.
19
Efron, B. and Tibshirani, R. (1996), “Using specially designed exponential families for density
estimation,” The Annals of Statistics, 24, 2431–2461.
Emura, T., Lin, C.-W., and Wang, W. (2010), “A goodness-of-fit test for Archimedean copula
models in the presence of right censoring,” Computational Statistics & Data Analysis, 54, 3033–
3043.
Escanciano, J. C. and Lobato, I. N. (2009), “An automatic portmanteau test for serial correlation,”
Journal of Econometrics, 151, 140–149.
Escanciano, J. C., Lobato, I. N., and Zhu, L. (2013), “Automatic specification testing for vector
autoregressions and multivariate nonlinear time series models,” Journal of Business & Economic
Statistics, 31, 426–437.
Fermanian, J.-D. (2013), “An overview of the goodness-of-fit test problem for copulas,” in Copulae
in Mathematical and Quantitative Finance, Springer, pp. 61–89.
Frees, E. and Valdez, E. (1998), “Understanding relationships using copulas,” North American
actuarial journal, 2, 1–25.
Frees, E. W., Carriere, J., and Valdez, E. (1996), “Annuity valuation with dependent mortality,”
Journal of Risk and Insurance, 229–261.
Genest, C., Ghoudi, K., and Rivest, L. (1995), “A Semiparametric Estimation Procedure of De-
pendence Parameters in Multivariate Families of Distributions,” Biometrika, 82(3), 543–552.
Genest, C., Remillard, B., and Beaudoin, D. (2009), “Goodness-of-fit tests for copulas: A review
and a power study,” Insurance: Mathematics and Economics, 44, 199–213.
Genest, C. and Rivest, L.-P. (1993), “Statistical inference procedures for bivariate Archimedean
copulas,” Journal of the American statistical Association, 88, 1034–1043.
Gray, R. and Pierce, D. (1985), “Goodness-of-fit tests for censored survival data,” The Annals of
Statistics, 13, 552–563.
Gribkova, S. and Lopez, O. (2015), “Non-parametric Copula Estimation Under Bivariate Censor-
ing,” Scandinavian Journal of Statistics, 42, 925–946.
Inglot, T. and Ledwina, T. (2006), “Towards data driven selection of a penalty function for data
driven Neyman tests,” Linear algebra and its applications, 417, 124–133.
20
Jennrich, R. I. (1969), “Asymptotic properties of non-linear least squares estimators,” The Annals
of Mathematical Statistics, 40, 633–643.
Kallenberg, W. C. (2008), “Modelling dependence,” Insurance: Mathematics and Economics, 42,
127–146.
— (2009), “Estimating copula densities, using model selection techniques,” Insurance: mathematics
and economics, 45, 209–223.
Kallenberg, W. C. and Ledwina, T. (1999), “Data-driven rank tests for independence,” Journal of
the American Statistical Association, 94, 285–301.
Klugman, S. and Parsa, R. (1999), “Fitting bivariate loss distributions with copulas,” Insurance:
Mathematics and Economics, 24, 139–148.
Lai, T. L. and Ying, Z. (1991), “Estimating a distribution function with truncated and censored
data,” The Annals of Statistics, 19(1), 417–442.
Lakhal-Chaieb, M. (2010), “Copula inference under censoring,” Biometrika, 97, 505–512.
Ledwina, T. (1994), “Data driven version of the Neyman smooth test of fit,” Journal of American
Statistical Association, 89, 1000–1005.
Lin, J. and Wu, X. (2015), “Smooth tests of copula specifications,” Journal of Business & Economic
Statistics, 33, 128–143.
McNeil, A., Frey, R., and Embrechts, P. (2005), Quantitative risk management: concepts, tech-
niques, and tools, Princeton university press.
Neyman, J. (1937), “‘Smooth test’ for goodness of fit,” Skandinaviske Aktuarietidskrift, 20, 150–199.
Patton, A. (2012), “A review of copula models for economic time series,” Journal of Multivariate
Analysis, 110, 4–18.
Pena, E. A. (1998), “Smooth goodness-of-fit tests for composite hypothesis in hazard based models,”
The Annals of Statistics, 26, 1935–1971.
Rayner, J. and Best, D. (1990), “Smooth tests of goodness of fit: an overview,” International
Statistical Review, 58, 9–17.
Rigobon, R. and Stoker, T. M. (2007), “Estimation with censored regressors: basic issues,” Inter-
national Economic Review, 48, 1441–1467.
21
Salmon, F. (2009), “Recipe for Disaster: The Formula That Killed Wall Street,” Wired, 17, 74–79,
112.
Shih, J. H. (1998), “A goodness-of-fit test for association in a bivariate survival model,” Biometrika,
85, 189–200.
Shih, J. H. and Louis, T. A. (1995), “Inferences on the Association Parameter in Copula Models
for Bivariate Survival Data,” Biometrics, 51, 1384–1399.
Wang, A. (2010), “Goodness-of-fit tests for Archimedean copula models,” Statistica Sinica, 441–
453.
Wang, W. and Wells, M. T. (2000), “Model selection and semiparametric inference for bivariate
failure-time data,” Journal of the American Statistical Association, 95, 62–72.
Wu, X. (2010), “Exponential Series Estimator of multivariate densities,” Journal of Econometrics,
156, 354 – 366.
Yilmaz, Y. E. and Lawless, J. F. (2011), “Likelihood ratio procedures and tests of fit in parametric
and semiparametric copula models with censored data,” Lifetime data analysis, 17, 386–408.
Appendix
We first introduce some notations. For a vector a = (a1, ..., ak)′ and r > 0, define |a|r =
(∑k
i=1 |ai|r)1/r, |a| = |a|2 and |a|∞ = max1≤i≤k |ai|. For a matrix A = (aij)1≤i≤m,1≤j≤n, let ‖A‖denote the Euclidean norm and ‖A‖∞ = max1≤i≤m,1≤j≤n |aij |. For j = 1, 2, denote Ujt = Sj(Xjt).
To simplify notation, we let∑
t =∑n
t=1,∑
j =∑2
j=1. Some conditions are needed to establish the
large sample properties of the proposed tests.
Conditions C1 - C5 are sufficient to ensure the convergence of the maximum likelihood esti-
mators αn to α0 in Theorem 1.
C1 (i) The sequence of survival variables, (T1t, T2t)nt=1, is an i.i.d. sample from an unknown
survival function S(t1, t2) with continuous marginal survival functions Sj(·), j = 1, 2.
(ii) The sequence of censoring variables (D1t, D2t)nt=1, is an independent sample with joint
survival functions Gt(x1, x2)nt=1 = P (D1t > x1, D2t > x2)nt=1 and marginal survival
functions Gjt(·)nt=1, j = 1, 2.
22
(iii) The censoring variables (D1t, D2t) are independent of survival variables (T1t, T2t) and
there is no mass concentration at 0 in the sense that lim supn→∞ n−1∑
t(1−Gjt(η))→ 0 as
η → 0.
C2 The true (unknown) copula function C(u1, u2) has continuous partial derivatives.
C3 Let A be a compact subset of Rp. For every ε > 0,
lim infα∈A:|α−α0|≥ε
1
n
∑t
E[l(U1t, U2t;α0)]− E[l(U1t, U2t;α)] > 0.
C4 (i) For any (u1, u2) ∈ (0, 1)2, l(u1, u2;α) is a continuous function of α ∈ A.
(ii) Let Lt = supα∈A |l(U1t, U2t;α)| and Ltα = supα∈A |lα(U1t, U2t;α)|. Then
limM→∞
limn→∞
sup1
n
∑t
ELtI(Lt ≥M) + LtαI(Ltα ≥M) = 0.
(iii) For any η > 0, ε > 0, there is M > 0 such that |l(u1, u2;α)| ≤ M |l(u1, u2;α)| for all
α ∈ A and all uj ∈ [η, 1) such that 1− uj ≥ ε(1− uj), j = 1, 2.
C5 If Tjtnt=1 are subject to non-trivial censoring (i.e., Djt 6=∞ ), then Sj is truncated at the tail
in the sense that for some τj , Sj(xj) = Sj(τj) for all xj ≥ τj and
lim inf n−1∑n
t=1Gjt(τj)Sj(τj) > 0.
In contrast to the censoring mechanism in Shih and Louis (1995), Condition C1(ii) allows the
censoring variables (D1t, D2t)nt=1 to be non-identically distributed. In addition, no assumptions
are made on the joint survival function Gt(x1, x2) of the censoring variables (D1t, D2t). Therefore,
general censoring mechanism is allowed in this framework, for example, both variables are censored,
one censored and the other uncensored, or one random censoring and the other fixed censoring.
Conditions C3 is the identifiability condition for the tests. Condition C5 is imposed to handle the
tail instability of the Kaplan-Meier estimator, specifically for non-identically distributed censoring
times.
Conditions A1 - A4 are sufficient to ensure the asymptotic normality of αn in Theorem 1.
A1 (i) Bn = −n−1∑
tElαα(U1t, U2t;α0) is positive definite;
(i) Σn = n−1∑
t varlα(U1t, U2t;α0) +∑
jWjt is finite and positive definite, where Wjt is
defined in (16);
(ii) lα(U1t, U2t;α0) +∑
jWjtnt=1 satisfies Lindeberg condition.
23
A2 lαα(u1, u2;α) and lαj(u1, u2;α), j = 1, 2 are well-defined and continuous in (u1, u2;α) ∈(0, 1)2 ×A.
A3 (i) |lα(u1, u2;α0)| ≤ q∏2j=1uj(1 − uj)−aj for some q > 0 and aj ≥ 0 such that
lim supn−1∑
tE[∏2j=1Ujt(1− Ujt)−2aj ] <∞;
(ii)|lαj(u1, u2;α0)| ≤ constant × uj(1 − uj)−bjuk(1 − uk)−ak for some bj , ak and j 6= k
such that lim supn−1∑
tE[Ujt(1−Ujt)ξj−bjUkt(1−Ukt)−ak ] <∞ for some ξj ∈ (0, 1/2).
A4 (i)Let Ltαj = supα∈A |lαj(U1t, U2t;α)| and Ltαα = supα∈A |lαα(U1t, U2t;α)|.Then, limM→∞ lim supn→∞ n
−1∑
tELtαjI(Ltαj ≥M) + LtααI(Ltαα ≥M) = 0;
(ii) For any η > 0 and any ε > 0, there is M > 0, such that |lα(u1, u2;α)|+ |lαα(u1, u2;α)| ≤M|lα(u1, u2;α)|+|lαα(u1, u2;α)| for all α ∈ A and all uj ∈ [η, 1) such that 1−uj ≥ ε(1−uj),j = 1, 2.
Conditions A3 and A4 allow the score function and its partial derivatives with respect to uj ,
j = 1, 2 to blow up at the boundaries but require them be dominated by a weighted function which
satisfies certain moment conditions. These conditions characterize many popular copula functions
such as the Gaussian, t and Clayton copulas (see Chen et al. (2010) for details).
Proof of Theorem 1. Under conditions C1-C5, the consistence of αn is readily obtained according
to Proposition 3.1 of Chen et al. (2010). Next under conditions A1-A4, the asymptotic normality
of αn is given by Proposition 3.2 of Chen et al. (2010).
Conditions T1 - T3 are sufficient to ensure the asymptotic normality of the test statistics QS
in (23).
T1 (i) For any (u1, u2) ∈ (0, 1)2, gS(u1, u2;α) is a continuous function of α, gS(u1, u2;α) has
continuous partial derivatives with respect to u1, u2;
(ii) n−1∑
tE[supα∈A:|α−α0|=o(1) |gS(U1t, U2t;α)|] <∞.
T2 (i) For any (u1, u2) ∈ (0, 1)2, gS,α(u1, u2;α) is a continuous function of α in a neighborhood
of α0;
(ii) n−1∑
tE[supα∈A:|α−α0|=o(1) |gS,α(U1t, U2t;α)|] <∞.
T3 (i) ΩS defined in (19) is finite and positive definite; (ii) Ωs,(k) has all its eigenvalues bounded
below and above by some finite positive constants.
24
Proof of Theorem 2. In the proofs that follow we make use of Lemma A.1. of Chen et al. (2010).
We present the results below for the reader’s convenience.
Lemma (Chen et al., 2010) Suppose that Conditions C1 and C5 are satisfied. Then: (i) the
marginal Kaplan-Meier estimators are uniformly strongly consistent: supx≤τj |Sj(x) − Sj(x)| → 0
a.s. for j = 1, 2; (ii) they can be expressed as martingale integrals:
Sj(x)− Sj(x) = −Sj(x)
∫ x
−∞
Sj(u−)
Sj(u)
∑t dMjt(u)∑
t I(Xjt ≥ u)
where Mjt(x) = Njt(x)−∫ x−∞ Jjt(u)dHj(u), in which Njt(x) = δjtI(Xjt ≤ x), Jjt(x) = I(Xjt ≥ x),
and Hj(x) = − log(Sj(x)).
By Theorem 1, we have |αn − α0| = op(1). Applying Taylor’s expansion to g(αn) with respect
to αn at α0 yields
gS(αn) =1
n
∑t
gS(U1t, U2t;α0) + g′S,α(U1t, U2t; αn)(αn − α0)
, (A.1)
where αn is between α0 and αn.
Under Condition T2, we have
supα∈A:|α−α0|=o(1)
∣∣∣∣∣ 1n∑t
gS,α(U1t, U2t;α)− GS,α
∣∣∣∣∣ = o(1).
It follows that (A.1) can be written as,
gS(αn) =1
n
∑t
gS(U1t, U2t;α0) + G′S,α(αn − α0) + op(n−1/2). (A.2)
Next Theorem 1 indicates that αn can be expressed as an asymptotically linear estimator such that
αn − α0 = B−1n
1
n
∑t
lα(U1t, U2t;α0) + op(1). (A.3)
Plugging (A.3) into (A.2) yields,
gS(αn) =1
n
∑t
gS(U1t, U2t;α0) + G′S,αB−1n lα(U1t, U2t;α0)+ op(1) (A.4)
By the mean-value theorem, we expand (A.4) to obtain
gS(αn) =1
n
∑t
gS(U1t, U2t;α0) + G′S,αB−1n lα(U1t, U2t;α0)+ Jn + op(1),
25
where
Jn =1
n
∑j
∑t
[gS,j(U1t, U2t;α0) + G′S,αB−1n lαj(U1t, U2t;α0)](Ujt − Ujt),
in which (U1t, U2t) lie on the line segment between (U1t, U2t) and (U1t, U2t).
By Lemma 1, we have
Jn = − 1
n
∑j
∑t
gS,j(U1t, U2t;α0) + G′S,αB−1n lαj(U1t, U2t;α0)
Sj(Xjt)
∫ Xjt
−∞
Sj(u−)
Sj(u−)
∑s dMjs(u)∑
s I(Xjs ≥ u).
Then following the same logic of the proof in Proposition 3.2 of Chen et al. (2010), we can see
that gS(αn) is asymptotically a sum of independent zero-mean random vectors. Given Condition
T3 (i), Theorem 2 now follows from the standard multivariate central limit theorem for independent
but non-identically distributed random variables.
Proof of Theorem 3. By Theorem 1, we have αnp→ α0. Using the boundedness of ψi, 1 ≤ i ≤ d, it
follows that GS,αp→ GS,α and ΩS
p→ ΩS . The result of this theorem then follows readily from the
asymptotic normality of gS(αn) given in Theorem 2.
Proof of Theorem 4. Define the simplified BIC rule as follows
SKS = mink : Qs,(k) − k log n ≥ Qs,(r) − r log n, 1 ≤ k, r ≤ |Ψs|, (A.5)
In order to prove Theorem 4(a), we need to prove that, under the null hypothesis,
limn→∞
Pr(KS = SKS) = 1, (A.6)
and that
limn→∞
Pr(SKS = 1) = 1, (A.7)
We start by proving (A.6). Recall that us,(k) = Ω−1/2s,(k) gs,(k)(αn), which is a k× 1 vector. Define the
event
An(ζ) =√
n|us,(k)|∞ >√ζ log n
,
By the definition of Γ(r, n, ζ) defined in (25), in order to prove (A.6), it suffices to prove P (An(ζ))→
26
0. Define
us,(k) = Ω−1/2s,(k)
1
n
∑t
gs,(k)(U1t, U2t, α0).
We can rewrite us,(k) as,√nus,(k) =
√nΩ−1/2s,(k) Ω
1/2s,(k)us,(k) +Rn(k),
where
Rn(k) =√nΩ−1/2s,(k)
gs,(k)(αn)− 1
n
∑t
gs,(k)(U1t, U2t, α0)
. (A.8)
By (A.4) and ‖αn − α0‖ = Op(n−1/2), we have
gs,(k)(αn) =1
n
∑t
gs,(k)(U1t, U2t, α0) +Op(n−1/2).
By (A.8), we can obtain ‖Rn(k)‖∞ = Op(1). Therefore, for any ε > 0, we have P (|Rn(k)|∞ >
2−1√ζ log n) ≤ ε/2 for all large n.
Also define the event
AJ =‖Ω−1/2
s,(k) Ω1/2s,(k)‖ > J
,
where J is a positive constant to be defined later.
By the consistency of θn and uniform law of large number theorem (ULLN) of Jennrich (1969,
Theorem 2), we have,
‖Ωs,(k) − Ωs,(k)‖ = op(1)
Under Condition T3 (ii), it’s straightforward to show that there exists a positive constant J such
that P (AJ) < ε/2 for all ε > 0.
Notice that
P (An(ζ)) ≤ P (An(ζ) ∩AcJ) + P (AJ)
≤ P(√
n|Ω−1/2s,(k) Ω
1/2s,(k)us,(k)|∞ > 2−1
√ζ log n,AcJ
)+ P
(|Rn(k)|∞ > 2−1
√ζ log n
)+ ε/2
≤ P(√
n‖Ω−1/2s,(k) Ω
1/2s,(k)‖|us,(k)| > 2−1
√ζ log n,AcJ
)+ ε,
≤ P(√
n|us,(k)| > (2J)−1√ζ log n
)+ ε,
where AcJ denotes the complementary set of AJ .
27
Now since√n|us,(k)| = Op(1), it follows that
P(√
n|us,(k)| > (2J)−1√ζ log n
)→ 0,
as n→∞. Hence, since ε > 0 was arbitrary we conclude that (A.6) holds.
Next, we prove (A.7) also holds. In view of (A.6), we have Pr(SKS = 1) = 1−∑|Ψs|
k=2 Pr(SKS =
k). Because SKS = k implies that dimension k “beats ”dimension 1 and Qs,(1) ≥ 0, we obtain by
definition of SKS ,
Pr(SKS = k) ≤ Pr(Qs,(k) − k log n ≥ Qs,(1) − log n) = Pr(Qs,(k) ≥ (k − 1) log n).
We have
Pr(Qs,(k) ≥ (k − 1) log n) ≤ Q1n +Q2n,
where Q1n = Pr(Qs,(k) ≥ (k − 1) log n/2) and Q2n = Pr(|√nus,(k) −
√nus,(k)|2 ≥ (k − 1) log n/2).
Since under the null Qs,(k) converges to a non-degenerate (χ2k distributed) variable for any k,
it’s easy to prove that Q1n → 0 as n→∞. Similarly, because√n|us,(k) − us,(k)| = Op(1), Q2n → 0
as n→∞.
Further we have
Pr(QS ≤ x) = Pr(Qs,(1) ≤ x)− Pr(Qs,(1) ≤ x, SKS ≥ 2) + Pr(QS ≤ x, SKS ≥ 2).
Because Qs,(1)d→ χ2
1 under (12) and Pr(SKS ≥ 2) = 0 as n → ∞, it follows immediately that
QSd→ χ2
1.
We now prove Theorem 4(b). Define the simplified AIC rule as follows
AKS = mink : Qs,(k) − 2k ≥ Qs,(r) − 2r, 1 ≤ k, r ≤ |Ψs|. (A.9)
In order to prove Theorem 4(b), we need to prove that under the alternative distribution P,
limn→∞
Pr(KS = AKS) = 1, (A.10)
and that
Pr(AKS ≥ Υ)→ 1, (A.11)
We denote by An(ζ)c the complementary set of An(ζ). Under the alternative distribution P, there
28
exists Υ ≤ |Ψs| such that n−1∑
tEP [gs,Υ(S1(X1t), S2(X2t), α)] 6= 0. Since Υ ≤ |Ψs|, we then have
P (An(ζ)c) ≤ P(√
n|us,(Υ)| <√ζ log n
)→ 0,
since us,(Υ) = us,(Υ) + op(1) by the ULLN. Hence, (A.10) holds.
Now, according to the ULLN, for k = 1, ...,Υ−1, we have |us,(k)|2 → 0 and |us,(Υ)|2 → |us,Υ|2 >0. We can obtain,
P (AKS = k) ≤ P (Qs,(k) − 2k ≥ Qs,(Υ) − 2Υ)
≤ P (n|us,(k)|2 ≥ 2(k −Υ) + n|us,(Υ)|2)→ 0
Therefore, (A.11) also holds.
For a generic constant C > 0, we have
Pr(QS ≤ C) = Pr(QS ≤ C,AKS ≥ Υ) + o(1) ≤ P (n|us,(Υ)|2 ≤ C) + o(1) = o(1),
where the last equality follows the ULLN. Then, QS →∞ as n→∞.
29
Recommended