103
Panel Threshold Models with Interactive Fixed Effects * Ke Miao a , Kunpeng Li b and Liangjun Su a a School of Economics, Singapore Management University, Singapore b School of Economics and Management, Capital University of Economics and Business, China December 31, 2019 Abstract This paper studies estimation and inference in a panel threshold model in the presence of interactive fixed effects. We study the asymptotic properties of the least squares estimators of the regression parameters in the shrinking-threshold-effect framework. We find that under some regularity conditions, the threshold parameter estimator possesses super-consistency in the sense that its estimation error has an asymptotically negligible effect on the asymptotic properties of the slope coefficients. The inference on the threshold parameter can be conducted based on a likelihood ratio test statistic as in the cross- sectional or time series setup. We also propose a test for the presence of the threshold effect. Monte Carlo simulations suggest that our estimators and test statistics perform well in finite samples. We apply our method to study the effect of financial development on economic growth and find that there is indeed a turning point in the effect for all three measures of financial development when the cross- sectional dependence is properly accounted for. Key words: Cross sectional dependence, Dynamic panel, Economic growth, Financial development, Likelihood ratio test, Threshold regression. JEL Classification: C12, C13, C23, C24 * We would like to thank the editor Oliver Linton, an associate editor, and three anonymous referees for their constructive com- ments that greatly improve the presentation of this paper. Kunpeng Li’s research is financially supported by NSFC No.71722011 and 71571122. Su gratefully acknowledges the funding support provided by the Lee Kong Chian Fund for Excellence. Ad- dress correspondence to: Kunpeng Li, School of Economics and Management, Capital University of Economics and Business, Beijing, China, 100070. Email address: [email protected] (Ke Miao); [email protected] (Kunpeng Li); [email protected] (Liangjun Su). 1

Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Panel Threshold Models with Interactive Fixed Effects∗

Ke Miaoa, Kunpeng Lib and Liangjun Suaa School of Economics, Singapore Management University, Singapore

b School of Economics and Management, Capital University of Economics and Business, China

December 31, 2019

Abstract

This paper studies estimation and inference in a panel threshold model in the presence of interactivefixed effects. We study the asymptotic properties of the least squares estimators of the regressionparameters in the shrinking-threshold-effect framework. We find that under some regularity conditions,the threshold parameter estimator possesses super-consistency in the sense that its estimation error hasan asymptotically negligible effect on the asymptotic properties of the slope coefficients. The inferenceon the threshold parameter can be conducted based on a likelihood ratio test statistic as in the cross-sectional or time series setup. We also propose a test for the presence of the threshold effect. MonteCarlo simulations suggest that our estimators and test statistics perform well in finite samples. Weapply our method to study the effect of financial development on economic growth and find that thereis indeed a turning point in the effect for all three measures of financial development when the cross-sectional dependence is properly accounted for.

Key words: Cross sectional dependence, Dynamic panel, Economic growth, Financial development,Likelihood ratio test, Threshold regression.

JEL Classification: C12, C13, C23, C24

∗We would like to thank the editor Oliver Linton, an associate editor, and three anonymous referees for their constructive com-

ments that greatly improve the presentation of this paper. Kunpeng Li’s research is financially supported by NSFC No.71722011

and 71571122. Su gratefully acknowledges the funding support provided by the Lee Kong Chian Fund for Excellence. Ad-

dress correspondence to: Kunpeng Li, School of Economics and Management, Capital University of Economics and Business,

Beijing, China, 100070. Email address: [email protected] (Ke Miao); [email protected] (Kunpeng Li);

[email protected] (Liangjun Su).

1

Page 2: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

1 Introduction

Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-

siderable attentions in recent empirical studies. In this paper, we propose a panel threshold model with

IFEs, which includes both important effects in a model. The proposed model allows us to study thresh-

old effects, IFEs, or both in a unified way. The proposed model has a wide range of applications. In a

recent study, Chudik et al. (2017) investigate the debt-threshold effect on output. The threshold effect of

government debt on economic output has been well documented in the literature; see Reinhart and Rogof-

f (2010), Cecchetti, Mohanty and Zampolli (2011), and Checherita-Westphal and Rother (2012), among

others. However, as argued by Chudik et al. (2017), the existing studies on the debt-growth nexus fail

to incorporate cross-sectionally dependent errors that may exist across countries. This motivates them to

consider a panel threshold model with heterogeneous coefficients and cross-sectionally dependent errors

where the latter are modeled via the use of IFEs to deal with strong cross-sectional dependence. In this pa-

per, we will consider another empirical example, the nexus of financial depth and economic growth where

numerous studies have documented the presence of both threshold effects and unobserved heterogeneity

where the latter is controlled via the use of one-way individual fixed effects or two-way additive fixed

effects. In this paper, we propose to replace the two-way fixed effects by the IFEs, which allow for a larger

degree of unobserved heterogeneity and cross-sectional dependence. It is interesting to know whether one

can continue to find the evidence of threshold effects in the presence of IFEs. Using the World Develop-

ment Indicators (WDI) data across 50 countries ranging from 1971 to 2015, we find strong evidence of

threshold effects and IFEs, and the IFEs cannot be simplified into the two-way fixed effects. This confirms

the necessity of incorporating the two effects into one model.

Apparently, the proposed model is related to two distinct branches of the econometrics literature, name-

ly, the threshold models and the panel data models with IFEs. Threshold models can be traced back to Tong

(1978), and have experienced substantial advancements over the last four decades. Early developments of

threshold models focus much on fixed threshold effects. An undesirable consequence of this framework is

that it is very difficult to conduct inference on the threshold value. As shown in Theorem 2 of Chan (1993),

the limiting distribution of the least squares (LS) estimator of the threshold parameter is a functional of

compound Poisson process, which involves many nuisance parameters such as the marginal distribution of

the regressors and the regression coefficients. For this reason, most subsequent studies assume shrinking

threshold effects to facilitate the inference in the threshold parameter. For example, Hansen (2000) devel-

ops a full statistical theory of the LS estimator for a linear cross-sectional regression model with threshold

effects. Seo and Linton (2007) consider a smoothed LS estimation and establish the inferential theory for

their semiparametric estimators in the framework of both shrinking and fixed threshold effects. As regard

to panel threshold models, Hansen (1999) studies a static panel threshold model where the slope coefficient

estimator is subject to the celebrated incidental parameter issue of Neyman and Scott (1948). Dang, Kim

and Shin (2012) propose to apply the GMM technique to estimate a dynamic panel threshold model under

2

Page 3: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

the traditional large-N and short-T setup where N and T denote the number of cross-sectional units and

the number of time periods, respectively. Ramırez-Rondan (2015) considers a similar model and advocates

the use of maximum likelihood estimation. All the above studies assume that either the regressors or the

threshold variable or both are exogenous. This assumption is restrictive in some empirical applications. To

allow for endogenous regressors, in an empirical paper Kremer, Bick and Nautz (2013) estimate a dynamic

panel threshold model by combining the forward orthogonal deviation transformation with the instrumen-

tal variable technique. In a recent paper, Seo and Shin (2016) propose a GMM method by extending the

approaches of Hansen (1999, 2000) and Caner and Hansen (2004) to estimate a dynamic panel model with

endogenous threshold variable and regressors. They show that if the threshold variable is endogenous, the

estimator of the threshold parameter would lose super-consistency. It is worth mentioning that none of

the above papers emphasize the issue of cross-sectional dependence within the data. If the cross-sectional

dependence is not fully captured by model, the estimator would typically suffer inconsistency.

There is a rapidly growing literature on panel data models with IFEs; see Bai (2009), Bai and Liao

(2016), Moon and Weidner (2015, 2017), Li, Qian and Su (2016), Lu and Su (2016), among others. In a

typical panel data model with IFEs, the unobserved errors are specified to have a factor structure, in which

both factors and factor loadings can have arbitrary correlations with the regressors. This specification

generalizes the traditional two-way additive scale form to a two-way multiplicative vector form. As a result,

the IFEs model allows more richness of the unobserved heterogeneity that may vary across both time and

individuals. Since the allowance and control of unobserved heterogeneity is one of most attractive features

of panel data models, panel data model with IFEs becomes very appealing to empirical studies. Most of

the studies on IFEs so far are limited to linear models; see, e.g., Bai (2009), Bai and Liao (2016), Moon

and Weidner (2015, 2017). Other studies, such as Chen, Fernandez-Val and Weidner (2014), consider the

nonlinear models with IFEs but their analysis relies on the assumption of continuous differentiability of

the objective function. In the proposed model, the nonlinear part is not differentiable at the change point,

which greatly complicates the asymptotic analysis.

In this paper, we employ the least squares method to estimate the proposed model under the large-N

and large-T setup. We consider the framework of diminishing threshold effects in the spirit of Hansen

(2000). That is, the threshold effects shrink to zero as the sample size tends to infinity. We allow the re-

gressors to be arbitrarily correlated with the IFEs provided that they have enough variations after projecting

out the IFEs. We show that the threshold parameter can be estimated at a rate related to the magnitude of

the threshold effect and the asymptotic distribution of the threshold parameter estimator is asymptotically

pivotal up to a scale nuisance parameter. Under some regularity conditions, this rate, together with the

shrinking rate of threshold effects, ensures the estimation of the threshold parameter to have no asymptotic

effect on the estimation of slope coefficients. In other words, the slope coefficients can be estimated as if

one knew the true threshold value.

The most challenging part of our analysis is to conduct a nonstandard analysis on the threshold param-

eter estimator jointly with large dimensional incidental parameters estimators. Although a few previous

3

Page 4: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

studies, such as Hansen (1999) and Seo and Shin (2016), also have incidental parameters in their threshold

models, the analyses in these models essentially only involve with low dimensional estimators because

the incidental parameters can be concentrated out by the within-group transformation or first differencing,

and the resultant objective function eventually depends on the finite dimensional regression coefficients

and threshold parameter. This is in contrast with the current study in the high dimensional estimators are

always present and have to be addressed throughout the whole analysis. Due to the presence of large di-

mensional estimators, the analyses for consistency and convergence rate require some novel arguments that

are quite different from the existing literature. To see this point, we note that, in a standard threshold model

or a panel data model with IFEs, the consistency can be established by only working with the objective

function. But in the current model, the analysis is much complicated, and we take three steps to achieve

this goal. In the first step, we work with the objective function to obtain the consistency of the estimators

of all parameters but the threshold parameter, where the consistency of the large dimensional incidental

parameters estimators is defined under some chosen norm invariant to rotational indeterminacy. In the sec-

ond step, we work with the first order condition to derive some preliminary convergence rates for the slope

coefficient estimators. In the third step, we work with the rescaled objective function to derive the consis-

tency of the threshold-value estimator. Since the rescaled value is possibly large due to the fast shrinking

threshold effects, the objective function in this step needs to be carefully chosen to offset the impact of

the slow convergence rates of the incidental parameters estimators.¬ To the best of our knowledge, the

methodology to prove consistency in this paper is new in the econometrics literature and can be useful to

other discontinuous regression models in the presence of incidental parameters. For the convergence rate,

a primary tool to deal with large dimensional parameters in panel data models with IFEs is the Cauchy-

Schwarz inequality. Unfortunately, this tool does not provide useful bound when deriving the convergence

rate of the threshold parameter estimator because of the special property of indicator function. Some new

arguments are therefore developed in this paper to deal with this issue.

In view of the fact that the unknown parameter in the limiting distribution of the threshold parameter

estimator cannot be estimated accurately, we follow the lead of Hansen (2000) and propose a likelihood

ratio (LR) test to facilitate inference on the threshold parameter. Again, since the estimators of the large

dimensional incidental parameters would change under different threshold values, the analysis of the LR

statistic is quite different from that in Hansen (2000). We find that the LR statistic is asymptotically pivotal

in the case of conditional homoskedasticity. When conditional heteroskedasticity is present, the limiting

distribution involves an unknown parameter that can be consistently estimated nonparametrically. We also

consider the hypothesis testing on the presence of threshold effects, and propose a sup-Wald statistic. We

also propose a procedure to obtain asymptotically correct critical values via simulations.

¬In the M -estimation framework, if L(θ) is the objective function of the unknown parameters θ that is to be maximized or

minimized, then L(θ) − c is also a valid one for any constant c. In classical analysis, c is suggested to be L(θ0) where θ0 is the

true values. However, due to the presence of large dimensional parameters, L(θ)−L(θ0) is not a good objective function in this

paper.

4

Page 5: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

We run Monte Carlo simulations to examine the finite sample performance of the LS estimators. For

both dynamic and static models, the estimators are well-behaved in terms of asymptotic bias, standard

deviation and coverage probability of the 95% confidence interval. Our slope coefficient estimators behave

similarly to the infeasible estimators that are obtained when the threshold value is observed a priori. For

the test of threshold effect, our simulations indicate that the rejection rate is close to the nominal level

under the null hypothesis of the absence of threshold effects and the test has reasonable power under the

alternative. In a nutshell, the simulations indicate that the LS estimators perform well in various data

generating processes.

To illustrate the usefulness of the proposed model, we consider an empirical application on the re-

lationship between economic growth and financial development with the World Development Indicators

(WDI) data. We find that both threshold effects and IFEs are present in the model, and that the financial

development is beneficial to economic growth when it is below some threshold level and it harms growth

otherwise. The former finding justifies the study of panel threshold model with IFEs and the latter is

consistent with the conventional wisdom.

The outline of the paper is as follows. We introduce our model, discuss the estimation methods and

list some basic assumptions in Section 2. We derive the asymptotic properties of these estimators in

Section 3. We also study the likelihood ratio test on the threshold value and investigate several relevant

issues associated with our model such as the threshold effect in the error variance, the determination of

the number of factors, and the test of IFEs versus the two-way additive fixed effects in this section. We

consider the hypothesis testing on the presence of threshold effect in Section 4. Section 5 reports the Monte

Carlo simulation findings. Section 6 applies our method to study the relationship between economic growth

and financial development. Section 7 concludes. The proofs of the main results in the paper are given in

Appendix A. Additional materials can be found in the Online Supplemental Material.

Notation. Let Im denote an m ×m identity matrix. For a real m × n matrix A = (Aij), we use ‖A‖and ‖A‖sp to denote its Frobenius norm and spectral norm, respectively. Let A′ denote the transpose of A.

When A has rank n, let PA = A(A′A)−1A′ and MA = Im − PA. When A is symmetric, we use µr(A)

to denote its rth largest eigenvalue; µmax (A) and µmin (A) denote the largest and smallest eigenvalues of

A, respectively. Let 1· be the indicator function. The symbolp→ denotes convergence in probability, d→

convergence in distribution, and plim probability limit. We use (N,T )→∞ to signify that N and T pass

to infinity jointly.

2 Model, estimation method and assumptions

2.1 Model

Let N be the number of cross-sectional units and T the number of time periods. Consider the model

yit = β0′xit + δ0′xitdit(γ0) + λ0′

i f0t + eit, i = 1, ..., N, t = 1, ..., T, (2.1)

5

Page 6: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where xit is a K × 1 vector of observable regressors, β0 is a K × 1 vector of slope coefficients, δ0 is

a K × 1 vector of slope coefficients representing the threshold effect, γ0 is a scalar threshold coefficient,

dit(γ) ≡ 1qit ≤ γ, qit is a scalar threshold variable, λ0i is anR0×1 vector of unobserved factor loadings,

f0t is an R0 × 1 vector of unobserved common factors, and eit is the idiosyncratic error term. Throughout

the paper, we use the superscript zero to signify the true parameter value. We use f0tr and λ0

ir to denote the

rth component of f0t and λ0

i , respectively, where r = 1, . . . , R0. We assume γ0 ∈ Γ ≡ [γ, γ], where γ and

γ are two fixed constants. Following Hansen (2000), we consider the shrinking threshold effect framework

by assuming that δ0 ≡ δ0NT → 0 with the convergence rate specified below as (N,T )→∞.

Let Λ0 ≡ (λ01, . . . , λ

0N )′, F 0 ≡ (f0

1 , . . . , f0T )′, et = (e1t, . . . , eNt)

′, Yt ≡ (y1t, . . . , yNt)′, Xt ≡

(x1t, . . . , xNt)′ and Xt(γ) ≡ (x1td1t(γ), . . . , xNtdNt(γ))′. We can write the model in (2.1) in vector

form

Yt = Xtβ0 +Xt(γ

0)δ0 + Λ0f0t + et = Xt,γ0θ

0 + Λ0f0t + et, (2.2)

where t = 1, . . . , T, θ0 ≡ (β0′, δ0′)′ and Xt,γ ≡ (Xt, Xt(γ)).

For the moment we assume that the true number of factors R0 is known and given by R. In Section

3.6 below, we propose a way to consistently estimate the number of factors. In factor analysis, it is well

known that Λ and F can only be identified up to a rotation. We follow Bai and Ng (2002) and Bai (2003)

and consider the following set of identification restrictions:

(i) Λ′Λ/N = IR, and (ii) F ′F is a diagonal matrix with diagonal elements ordered in descending order.

2.2 Estimation method

Given R, we can concentrate out the T ×R matrix F and obtain the following Gaussian QML estimate of

(θ,Λ, γ) :

(θ, Λ, γ) ≡ argmin(θ,Λ,γ)∈R2K×L×ΓL(θ,Λ, γ),

where

L(θ,Λ, γ) ≡T∑t=1

(Yt −Xt,γθ)′MΛ(Yt −Xt,γθ), (2.3)

(θ,Λ, γ) ∈ R2K × L× Γ and L ≡ Λ ∈ RN×R : Λ′Λ/N = IR.The above minimization problem can be solved in two steps:

(i) In the first step, we keep γ fixed so that the objective function in (2.3) can be minimized as in Bai

(2009), Moon and Weidner (2015, 2017) and Lu and Su (2016) to obtain the estimate (θ(γ), Λ(γ)).

Let L∗(γ) ≡ L(θ(γ), Λ(γ), γ).

(ii) In the second step, one can search over the interval Γ ≡ [γ, γ] to minimize L∗(γ).

Because L∗(γ) is a step function that takes on less than NT distinct values, we can follow Hansen

(2000) and search for γ over Γn = Γ ∩ qit, 1 ≤ i ≤ N, 1 ≤ t ≤ T. When NT is large, we can

6

Page 7: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

approximate Γ by grid of n points for some n ≤ NT. For example, let q(j) denote the (η0 + j−1n−1(1−2η0))-

th quantile of the sample qit, 1 ≤ i ≤ N, 1 ≤ t ≤ T for j = 1, . . . , n and Γn = q(1), . . . , q(n).We then define γn = argminγ∈ΓnL

∗(γ), which will provide a good approximation to γ. Hansen (1999)

recommends choosing η0 = 1% or 5%. Given γ, the estimates of θ and Λ are calculated according to

θ ≡ θ(γ) and Λ ≡ Λ(γ), respectively. Once the estimates of θ, γ, and Λ are obtained, the estimate of F

can be constructed by the plug-in method (see Section 2.3).

Remark 2.1. This paper adopts the approach to concentrate out the factors first under the identification

restrictions stated at the end of Section 2.1. One can alternatively consider concentrating out the factor

loadings under the identification restrictions that F ′F = IR and Λ′Λ is a diagonal matrix with descending

diagonal elements. The estimates of θ and γ under the two sets of identification restrictions would be the

same, and so are their asymptotic distributions. In this paper, we consider the data scenario of N/T → κ,

as (N,T ) → ∞. Under this scenario, it does not make a big difference in computational time for the two

concentration strategies. In a more general case where N and T diverge at different rates, it is desirable to

concentrate out the larger dimension matrix.

Remark 2.2. As emphasized in the introduction, the objective function L(θ,Λ, γ) is non-differentiable

with respect to γ, which would have significant consequence on the asymptotic properties of the LS estima-

tors. Alternatively, one can adopt the idea of Seo and Linton (2007) and use a smoothed objective function

to estimate the model. Let K (·) be a bounded function such that lims→−∞

K (s) = 0 and lims→∞

K (s) = 1.

The smoothed objective function is defined as

Ls(θ,Λ, F, γ, h) =N∑i=1

T∑t=1

[yit − x′itβ − x′itδK (

γ − qith

)− λ′ift]2,

where h = o (1) is a bandwidth parameter. The estimators of θ and γ can be obtained by minimizing the

above function. This estimation method has the benefit that one can analyze the estimators in a unified

way, irrespective of the fixed threshold effects or shrinking threshold effects; see Seo and Linton (2007)

for details. But the computation burden becomes larger and one has to determine h.

Remark 2.3. In this paper, we do not consider the case in which the regressor xit is correlated with

the idiosyncratic error eit. When the correlation is present, we need to apply the instrumental variable

(IV) method to estimate the model. For simplicity, suppose that all the regressors are endogenous, but we

have dw-dimensional instrumental variables wit with dw ≥ K. Let Wt,γ be defined the same as Xt,γ . The

estimation consists of two steps. In this first step, for each given θ and γ, we estimate φ(θ, γ) and Λ(θ, γ)

according to(φ(θ, γ), Λ(θ, γ)

)= argmax

(φ,Λ)∈Φ×L

T∑t=1

(Yt −Xt,γθ −Wt,γφ

)′MΛ

(Yt −Xt,γθ −Wt,γφ

),

where Φ is some compact parameter space for φ. In the second step, we obtain the estimator θ and γ by

(θ, γ) = argmax(θ,γ)∈Θ×Γ

φ(θ, γ)′W−1NT φ(θ, γ).

7

Page 8: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where WNT is a weighting matrix that can be chosen as a consistent estimate of var(φ(θ0, γ0)) based on

some preliminary consistent estimate (θ, γ) of (θ, γ). Intuitively, if θ and γ are chosen at their true values,

the estimator φ in the first step would be very close to 0. So by minimizing the weighted norm of φ in the

second step, we would obtain the consistent estimators for θ0 and γ0. Such an estimation idea has been

used by Chernozhukov and Hansen (2008) and Su and Hoshino (2016) in the IV quantile model, and by

Lee, Moon and Weidner (2012) and Moon, Shum and Weidner (2018) in the IFEs framework.

We end this subsection with two equations, which serve as the bases for our asymptotic analyses. By

definition, (θ, Λ) = argmin(θ,Λ)∈R2K×L L(θ,Λ, γ). This implies that given γ, the estimates θ and Λ solve

the following system of equations

θ =

( T∑t=1

X ′t,γMΛXt,γ

)−1 T∑t=1

X ′t,γMΛYt (2.4)

and [1

NT

T∑t=1

(Yt −Xt,γ θ)(Yt −Xt,γ θ)′]Λ = ΛVNT , (2.5)

where VNT is a diagonal matrix whose diagonal elements are the R largest eigenvalues of the matrix in the

square brackets, arranged in decreasing order.

2.3 Assumptions

We first introduce some notations for ease of exposition. Let xk,it denote the kth element of xit. Let Xk

and Xk(γ) be two N × T matrices with (i, t)th entry being xk,it and xk,itdit(γ), respectively. Define

Xk,γ =

Xk if k ≤ K,Xk−K(γ) if K < k ≤ 2K

.

Define X ≡ (X1, . . . ,XK), X(γ) ≡ (X1(γ), . . . ,XK(γ)) and X∗γ ≡ (X,X(γ)). Then we can rewrite

the model (2.1) in matrix form:

Y = β0 X + δ0 X(γ0) + Λ0F 0′ + e,

where β0 X ≡K∑k=1

β0kXk, δ0 X(γ0) ≡

K∑k=1

δ0kXk(γ

0),

and Y and e are N × T matrices with the (i, t)th entry yit and eit, respectively. It can be readily shown

that Λ is√N times the first R left-singular vectors of the matrix Y − β X− δ X(γ). According the

PC method, the estimator F is given by F =[Y − β X− δ X(γ)

]′Λ/N .

With the above symbols, we further introduce the following notations that will be used in the subse-

8

Page 9: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

quent assumptions. Define NT × 2K matrix Z(Λ, γ) as

Z(Λ, γ) ≡

Z1(Λ, γ)

Z2(Λ, γ)...

ZT (Λ, γ)

= (MF 0 ⊗MΛ)

X1 X1(γ)

X2 X2(γ)...

...

XT XT (γ)

= (MF 0 ⊗MΛ) (vec(X1,γ), . . . , vec(X2K,γ)),

where Zk(Λ, γ), the k-th column of Z(Λ, γ), is equal to vec(MΛXk,γMF 0) by the identity vec(ABC) =

(C ′ ⊗A)vec(B) and “⊗” denotes the Kronecker product. From this result, it is obvious that Zk(Λ, γ) has

smaller variation compared to vec(Xk,γ). The matrix Z(Λ, γ) plays an important role in the identification

of θ0; see Assumption A.1 below. Let Xt(γa, γb) ≡ Xt(γa)−Xt(γb) for γa, γb ∈ Γ. Similar to Z(Λ, γ),

we define an NT ×K matrix:

Z(Λ, γ) ≡ (MF 0 ⊗MΛ)

X1(γ, γ0)

X2(γ, γ0)...

XT (γ, γ0)

.

The matrix Z(Λ, γ) is related to the identification of γ0.

Let D ≡ σ(F 0,Λ0), the minimal sigma-field generated from F 0 and Λ0, and PrD(·) ≡ Pr(·|D) and

ED(·) ≡ E(·|D). Let FNT,t ≡ σ((xit, qit, ei,t−1), (xi,t−1, qi,t−1, ei,t−2), . . . Ni=1). Let fit,D(γ) denote

the probability density function (PDF) of qit conditional on D, and ED(·|γ) ≡ ED(·|qit = γ). Let M

denote a generic positive constant that may vary across places. We make the following assumptions for the

asymptotic analysis.

Assumption A.1. There exists some constant τ > 0 such that, as (N,T )→∞,

(i) Pr(min(Λ,γ)∈L×Γ µmin [B(Λ, γ)] ≥ τ)→ 1, where B(Λ, γ) = 1NTZ(Λ, γ)′Z(Λ, γ);

(ii) Pr(minγ∈Γ µmin[I(γ)] ≥ τ min1, |γ−γ0|)→ 1, where I(γ) = 1NT Z(Λ0, γ)′MZ(Λ0,γ)Z(Λ0, γ)

with MZ(Λ0,γ) = INT −Z(Λ0, γ)[Z(Λ0, γ)′Z(Λ0, γ)]−1Z(Λ, γ)′.

Assumption A.2. (i) ED(e8+εit ) and ED(‖xit‖8+ε) are uniformly bounded by a non-random constant for

some constant ε > 0.

(ii) E∥∥f0

t

∥∥8 ≤M and 1T

∑Tt=1 f

0t f

0′t

p→ Σf > 0 for some R×R matrix Σf as T →∞;

(iii) E∥∥λ0

i

∥∥8 ≤M and 1N

∑Ni=1 λ

0iλ

0′i

p→ Σλ > 0 for some R×R matrix Σλ as N →∞;

(iv) ‖e‖sp = Op(√N +

√T ).

Assumption A.3. The threshold effect δ0 satisfies that δ0 = (NT )−αC0 for some α ∈ (0, 1/2), C0 ∈ RK

and C0 6= 0.

9

Page 10: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Assumption A.4. (i) For each i = 1, . . . , N , (xit, qit, eit) : t = 1, 2, . . . is conditional strong mixing

given D with the mixing coefficients αDNT,i(·); αm ≡ sup1≤i≤N αDNT,i(m) satisfies that αm = O(m−ζ)

for some ζ > 12p4p−1 and p > 4;

(ii) (xit, qit, eit), i = 1, . . . , N , are mutually independent of each other conditional on D;

(iii) For each i = 1, . . . , N , E(eit|D ∨ FNT,t) = 0 a.s.;

(iv) There exists a constant cf <∞ such that maxi,t supγ∈Γ fit,D(γ) < cf ;

(v) There exist D-dependent variables Mit,D such that supγ∈ΓED(‖xit‖4|qit = γ) ≤ Mit,D, supγ∈Γ

ED(‖xiteit‖4|qit = γ) ≤ Mit,D. We have Pr( 1NT

∑Ni=1

∑Tt=1M

2it,D < M2) → 1 for some

M <∞ as (N,T )→∞.

Assumption A.1 is an identification condition for θ and γ. Assumption A.1(i) extends Assumption A

in Bai (2009) to require the non-colinearity of the regressors uniformly over γ ∈ Γ. More specifically,

it requires that the residuals from the linear projections of any linear combination of the form α∗ X∗γ

on the column space of F 0 first and then on that of Λ0 be asymptotically nondegenerate with α∗ ∈ R2K .

Assumption A.1(ii) requires that Z(Λ0, γ) andZ(Λ0, γ) be not colinear uniformly. To gain more intuitions

of these two conditions, consider model (2.2), which is equivalent to

Y =K∑k=1

Xkβ0k +

K∑k=1

Xk(γ0)δ0 + Λ0F 0′ + e.

As pointed out in Bai (2009), the large dimensional nuisance parameters Λ0 and F 0 are eliminated through

projection matrices in the least squares estimation. Following this intuition, we therefore premultiply MΛ0

and postmultiply MF 0 on both sides to obtain

MΛ0YMF 0 =K∑k=1

MΛ0XkMF 0β0k +

K∑k=1

MΛ0Xk(γ0)MF 0δ0

k + MΛ0eMF 0 .

Taking vector operation on both sides and using the definition of Z(Λ, γ), we have

(MF 0 ⊗MΛ0)vec(Y) = (MF 0 ⊗MΛ0)

[K∑k=1

vec(Xk)β0k +

K∑k=1

vec(Xk(γ0))δ0

k + vec(e)

]= Z(Λ0, γ0)θ0 + (MF 0 ⊗MΛ0)vec(e). (2.6)

If we treat equation (2.6) as a linear regression model on θ0 = (β0′, δ0′)′, it is natural to impose the

full column rank assumption on Z(Λ0, γ0) to identify the parameter θ0, which is equivalent to assuming

Pr(µmin[B(Λ0, γ0)] ≥ τ) → 1. In our model, Λ0 and γ0 are both unobserved and simultaneously esti-

mated with θ0. So one may expect that Pr(µmin[B(Λ, γ)] ≥ τ) → 1 still holds. Since Λ and γ do not

have the consistency property so far, and can be any values in the parameter space, this motivates us to

impose Assumption A.1(i). To understand Assumption A.1(ii), we first consider Hansen (2000)’s model:

10

Page 11: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Y = Xβ0 +Xγ0δ0 + e where the definitions of notations are self-evident. We note that the identification

condition for γ0 in the Hansen’s model is that the function

z(γ) = plimN→∞

µmin

( 1

NX ′γ0MXγXγ0

)= plim

N→∞µmin

( 1

N(Xγ −Xγ0)′MXγ (Xγ −Xγ0)

)has the minimum value 0 at γ0. In the transformed model (2.6), the matrices Z(Λ0, γ) and Z(Λ0, γ)

play the same roles as Xγ − Xγ0 and Xγ in Hansen’s model. We therefore impose Assumption A.1(ii)

to guarantee the identification of γ0. Note that Assumption A.1(ii) is equivalent to the condition that

the matrix I(γ) has minimum value 0 at γ0. More specifically, if γ falls in some neighborhood of γ0, the

minimum eigenvalue behaves like the function f(γ) = |γ−γ0|which achieves 0 at γ0, and if γ falls outside

of this neighborhood, the minimum eigenvalue is always greater than some τ > 0. In Hansen (2000), he

gives some more primitive conditions to identify γ0. However, this is feasible because of the special

property that X ′γXγ0 = X ′γ0Xγ0 for γ ≥ γ0 and X ′γXγ0 = X ′γXγ for γ < γ0 which only holds in his

linear model. In model (2.6), we generally do not have Z†(Λ0, γ)′Z†(Λ0, γ0) = Z†(Λ0, γ0)′Z†(Λ0, γ0)

for γ ≥ γ0 due to the presence of large dimensional incidental parameters, where Z†(Λ, γ) = (MF 0 ⊗MΛ)

[vec(X1(γ)), . . . , vec(Xk(γ))

]. So it seems infeasible to give more primitive conditions like Hansen

(2000) in this paper.

Assumption A.2(i)-(iii) imposes some moment conditions on the regressors, error terms, factors and

factor loadings. Note that we only consider strong factors here. Assumption A.2 (iv) is frequently assumed

in the literature; see, e.g., Su and Chen (2013) and Moon and Weidner (2015). Assumption A.3 specifies

a diminishing threshold effect. The same assumption is also imposed in other studies; see Hansen (1999),

Caner and Hansen (2004), among others.

Assumption A.4 is similar to Assumption A.2 of Su and Chen (2013). We assume conditional strong

mixing across t in Assumption A.4(i) and conditional independence across i in Assumption A.4(ii). As

Su and Chen (2013) remark, the conditional strong mixing in Assumption A.4(i) can be replaced by the

unconditional strong mixing if we assume that the factor loadings are nonrandom. Assumption A.4(ii)

does not rule out the possibility of unconditional cross-sectional dependence among xit, qit, eit arising

from the common factors. The martingale difference sequence (m.d.s.) condition in Assumption A.4(iii)

simplifies the asymptotic analysis. It allows for conditional heteroskedasticity, skewness, or kurtosis of

unknown form in eit but rules out serial correlations. Note that Assumption A.4(iii) does allow for the

presence of lagged dependent or independent variables. If serial correlations are suspected to exist among

errors, we can add lagged dependent or independent variables into the model to remove them. So this

assumption is not restrictive. Assumption A.4 (iv)-(v) imposes some conditions on the conditional PDF

and moments of xit givenD. Assumption A.4(iv) assumes the conditional PDF of qit is uniformly bounded;

Assumption A.4(v) assumes that the fourth order conditional moments of xit and xiteit are well behaved.

Note that Mit,D is not assumed to be bounded uniformly over (i, t) but its sample second moment is well

behaved. Therefore Assumption A.4(v) is not restrictive.

11

Page 12: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

3 Asymptotic Property

In this section, we study the asymptotic properties of the estimators. We first establish the consistency

of (θ, Λ, γ) and their convergence rates, derive the asymptotic distributions of θ and γ, and consider the

statistical inference on γ based on a likelihood ratio test statistic. Then we investigate several relevant

issues associated with our model such as the threshold effect in the error variance, the determination of the

number of factors, and the test of IFEs versus the two-way additive fixed effects.

3.1 Consistency

This subsection establishes the consistency of the LS estimators defined in Section 2.2. We achieve this

goal in three steps. We first show the consistency of θ and Λ in Theorem 3.1. With consistency, we next

give a preliminary convergence rate of θ, which is given in Proposition A.2 in the appendix. Based on this

convergence rate, we finally establish the consistency of γ in Theorem 3.2.

Theorem 3.1 (Consistency of θ and Λ) Suppose that Assumptions A.1-A.4 hold. Then

(i) θ − θ0 p→ 0;

(ii) The matrix N−1Λ0′Λ is invertible and ‖PΛ− PΛ0‖ = op(1).

Theorem 3.1 establishes the consistency of the estimators θ and Λ, which is analogous to Proposition

1 in Bai (2009). It provides the basis for subsequent analyses. For example, with the consistency of θ, the

terms involved with ‖θ − θ0‖2 are of smaller order than ‖θ − θ0‖ and become asymptotically negligible.

The following theorem shows the consistency of γ. The proof of this theorem requires considerable

amount of work. To appreciate the difficulty, we note that the estimators of the large dimensional incidental

parameters (i.e., λi’s or ft’s) have slow convergence rates N−1/2 or T−1/2. However, Assumption A.3

specifies an (NT )−α threshold effect, so we have to multiply (NT )2α on the objective function L(θ,Λ, γ)

to obtain a non-shrinking threshold effect. As α is close to 1/2 from the below, (NT )2α is slightly smaller

than NT . This gives rise to a challenging issue: the estimation errors coming from incidental parameters

cause serious problems to our analyses because of their slow convergence rates. In contrast, under the fixed

threshold effect framework, the proof of consistency for γ is much easier as the normalization scale over

there is equal to 1.

Theorem 3.2 (Consistency of γ) Under Assumptions A.1-A.4, with N/T → κ > 0 as (N,T ) → ∞, we

have γ − γ0 = op(1).

The proof of Theorem 3.2 consists of two steps. Using the first order condition (2.4), together with the

consistency established in the previous theorem, we first show that the LS estimator θ has a preliminary

convergence rate (NT )−α. Next, we show that the rescaled objective function

L∗(γ) = (NT )2α[L(θ(γ), Λ(γ), γ

)− L

(θγ0 , Λγ0 , γ

0)]

12

Page 13: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

behaves like the one of a standard linear regression model (2.6), where θγ0 and Λγ0 are the LS estimator

when the threshold value γ0 is observed a priori. Then invoking Assumption A.1(ii), whose implications

are discussed in the linear regression model (2.6) in Section 2.3, we obtain the consistency of γ. Note that

the magnitude of ‖θγ0 − θ0‖ is Op( 1N + 1

T ), as documented in the previous studies such as Bai (2009) and

Moon and Weidner (2017), implying (NT )α‖θγ0 − θ0‖ = op(1) as N/T → κ. So the estimation error in

θγ0 is asymptotically negligible. We emphasize that the adjusting constant here should be L(θγ0 , Λγ0 , γ

0),

instead of L(θ0,Λ0, γ0) as the classical analysis suggests. The reason is that with (NT )2α rescaled value,

the estimation errors from Λ0 have an asymptotically non-negligible effect on the objective function. But

L(θγ0 , Λγ0 , γ

0)

contains the same estimation errors. So we use it as the adjusting constant to remove this

effect.

3.2 Convergence rates

Given the consistency results in Theorems 3.1 and 3.2, we next establish the convergence rates for θ and γ.

Because we do not have explicit expressions for these estimators, the derivations of these rates are tedious.

We need the following assumption for the theoretical analysis.

Assumption A.5. Let MD(γ) ≡ (NT )−1∑N

i=1

∑Tt=1ED(xitx

′it|γ)fit,D(γ), then the following state-

ments hold:

(i) MD(γ0) > 0 a.s. and MD(γ) is continuous at γ = γ0;

(ii) For all ε > 0, there exist constants N, T , B > 0 and τ1 > 0, such that for all N ≥ N and T ≥ T

Pr

(inf

|γ−γ0|<Bµmin(MD(γ)) > τ1

)> 1− ε.

Assumption A.5 (i)-(ii) assumes that the square matrix MD(γ) is well behaved in the neighborhood of

γ0.

The following theorem establishes the convergence rates of γ and θ.

Theorem 3.3 (Convergence rates of γ and θ) Suppose that Assumptions A.1-A.5 hold andN/T → κ > 0

as (N,T )→∞. Then

(i) (NT )1−2α(γ − γ0) = Op(1);

(ii)√NT (θ − θ0) = Op(1).

Theorem 3.3(i) shows that γ − γ0 = Op((NT )−1+2α), which hinges on the magnitude of threshold

effects. This result is quite intuitive. When α is close to zero, the threshold effects are large, we expect

a more precise estimation of γ0, leading to a faster convergence rate. On the other hand, when α is close

to 1/2, the threshold effects are small, we expect a less precise estimation, which corresponds to a slower

convergence rate. Also note that by equation (2.1),

eit = yit − x′itβ − x′itδdit(γ)− λ′ift

13

Page 14: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

= eit + (λ0′i f

0t − λ′ift)− x′it(β − β0)− x′itdit(γ0)(δ − δ0)− x′it[dit(γ)− dit(γ0)]δ.

Theorem 3.3(i) implies that for any ε > 0, there is a Cε such that Pr((NT )2α−1|γ−γ0| > Cε) < ε. On the

set (NT )2α−1|γ − γ0| ≤ Cε, we have E[‖xit(dit(γ)− dit(γ0))‖

]= O(|γ − γ0|) = O((NT )2α−1). This

result, in conjunction with the fact δ = Op((NT )−α) and the inequality Pr(A) ≤ Pr(A|B) + Pr(Bc),

implies that the last term in the last displayed equation is Op((NT )α−1) = op(1√NT

). Compared with the

third and fourth terms, which areOp( 1√NT

) due to the result in Theorem 3.3(ii), we see that the last term is

asymptotically negligible. However, when γ = γ0, the last term is gone. So the estimation error associated

with γ0 has asymptotically negligible effects on the residual eit. Since our least squares estimation aims

to minimize∑N

i=1

∑Tt=1 e

2it, we expect that the estimator of θ0 with an unknown γ0 is asymptotically

equivalent to that with a known γ0.

Theorem 3.3 is derived under the shrinking threshold effects assumption. When the threshold effects

are fixed, we conjecture that γ − γ0 = Op((NT )−1) and√NT (θ − θ0) = Op(1). The first result is a

natural extension of the result in Theorem 3.3(i) by letting α → 0. As regard to the second result, note

that the estimation of θ0 under the fixed threshold effects cannot be worse than the one under the shrinking

case because of the stronger signal for γ0 under the former, but cannot be better than the one when γ0 is

observed a priori. In both the shrinking threshold case and the observed γ0 case, the estimators are√NT -

consistent. This leads us to conjecture the second result. Furthermore, since the limiting distribution of

the slope coefficient estimators under an unknown γ0 in the shrinking case are the same as the one in the

observed-γ0 case, we conjecture that the limiting distribution of θ − θ0 are the same as the result given in

Theorem 3.4 below due to the same arguments.

In this paper, we assume that N and T pass to infinity at the same rate as in Moon and Weidner (2015,

2017). It is possible to allow N and T to diverge at different rates as in Bai (2009) and Lu and Su (2016).

In this case, the estimators for slope coefficients would have three asymptotic bias terms entailed by weak

exogeneity and heteroskedasticity, which are of order T−1 and N−1. Then the result in Theorem 3.3(ii)

should be changed to θ − θ0 = Op(N−1 + T−1).

3.3 Asymptotic distributions of θ and γ

Given the convergence rates of γ and θ, we next present the limiting distributions. Our analysis indicates

that θ has an asymptotically non-negligible bias. So the explicit expression of bias is also our target. For

ease of exposition, we introduce the following notations.

Let Zk,γ = MΛ0Xk,γMF 0 , where Xk,γ = Xk if k ≤ K, and Xk−K(γ) if K < k ≤ 2K. Let zk,it,γbe the (i, t)-th entry of the matrix Zk,γ and zit,γ = (z1,it,γ , . . . , z2K,it,γ)′ be the 2K-dimensional vector

composed of zk,it,γ . Define

ωNT (γ1, γ2) =1

NT

N∑i=1

T∑t=1

zit,γ1z′it,γ2 ,

14

Page 15: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

ΩNT (γ1, γ2) =1

NT

N∑i=1

T∑t=1

zit,γ1z′it,γ2e

2it,

B1,kNT (γ) =1

Ntr[PF 0ED(e′Xk,γ)

],

B2,kNT (γ) =1

Ttr[ED(ee′)MΛ0Xk,γF

0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′],

B3,kNT (γ) =1

Ntr[ED(e′e)MF 0X′k,γΛ0(Λ0′Λ0)−1(F 0′F 0)−1F 0′

].

We impose the following assumption on the above terms.

Assumption A.6. (i) The probability limits ω(γ1, γ2) ≡ plim(N,T )→∞ωNT (γ1, γ2) and Ω(γ1, γ2) ≡plim(N,T )→∞ΩNT (γ1, γ2) are present and non-random, and are finite uniformly over (γ1, γ2) ∈ Γ × Γ;

(ii) The probability limits B`,k(γ) ≡ plim(N,T )→∞B`,kNT (γ) for ` = 1, 2, 3 and k = 1, . . . , 2K are

present and non-random, and are finite uniformly over γ ∈ Γ.

Assumption A.6 imposes some high level conditions on the terms appearing in the limiting distribu-

tion of θ. The non-randomness of ω(γ1, γ2) and Ω(γ1, γ2) is crucial since it consists of the prerequisite

conditions for the martingale central limit theorem. It is desirable to specify some primitive conditions to

guarantee that the limits are fixed values. However, we emphasize that any effort on resorting to the prim-

itive conditions would inevitably specify the internal structure of ED(Xk), which, however, is generally

unknown in practice, and has no guidance from the economic theories. Following the treatment of IFEs in

the literature, we directly make high-level conditions. In our simulations, we generate xit in a particular

way. With this additional information on xit, we can verify Assumption A.6 directly.

Let B`(γ0) = (B`,1(γ0), . . . ,B`,2K(γ0))′ for ` = 1, 2, 3. The following theorem reports the asymptotic

distribution of θ.

Theorem 3.4 (Asymptotic normality of θ) Suppose that Assumptions A.1-A.6 hold and N/T → κ > 0 as

(N,T )→∞. Then √NT (θ − θ0)

d→ N(ω−10 B, ω−1

0 Ω0ω−10 ),

where ω0 ≡ ω(γ0, γ0), B ≡ −κ1/2B1(γ0)− κ−1/2B2(γ0)− κ1/2B3(γ0) and Ω0 ≡ Ω(γ0, γ0).

Theorem 3.4 indicates that θ has three asymptotically non-negligible bias terms associated with B1(γ0),

B2(γ0) and B3(γ0). B1(·) is related to the possible appearance of the lagged dependent variables and it

vanishes if the regressors are conditionally strictly exogenous in the sense that ED(e′Xk,γ) = 0 (i.e.,

ED(eitxis) = 0 and ED[eitxisdis(γ0)] = 0 for all t, s). In our setting both ED(e′e) and ED(ee′) are

diagonal matrices. The second term, B2(·), vanishes in the absence of cross sectional heterogeneity in

which case ED(ee′) is proportional to an identity matrix. Similarly, the third term, B3(·), vanishes in the

absence of heterogeneity along the time dimension in which case ED(e′e) is proportional to an identity

matrix. In the general case, we allow for weakly exogenous regressors and both cross-sectional and time

15

Page 16: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

series heteroskedasticity so that all three bias terms are present. In practice, one can follow Bai (2009) and

Moon and Weidner (2017) to construct the bias-corrected estimates. The procedure is standard and omitted

here for brevity.

To make inference on θ0,we also need to estimate the asymptotic variance ω−10 Ω0ω

−10 consistently. Let

λ′i be the ith row of Λ and f ′t be the tth row of F , where ft = (Λ′Λ)−1Λ′(Yt−Xt,γ θ) = Λ′(Yt−Xt,γ θ)/N .

A consistent estimator, by the plug-in method, is given by ω−1Ωω−1, where

ω ≡ 1

NT

N∑i=1

T∑t=1

zit,γ z′it,γ , Ω ≡ 1

NT

N∑i=1

T∑t=1

zit,γ z′it,γ e

2it,

zit,γ is defined as a 2K × 1 vector with kth entry equal to (i, t)th entry of MΛXk,γMF

and eit = yit −x′it,γ θ − λ

′ift.

Next, we establish the asymptotic distribution of γ. Let fit(·) denote the PDF of qit. Let

Df,NT (γ) =1

NT

N∑i=1

T∑t=1

E(xitx′it|qit = γ)fit(γ), and

Vf,NT (γ) =1

NT

N∑i=1

T∑t=1

E(xitx′ite

2it|qit = γ)fit(γ).

Then we add the following assumption.

Assumption A.7. (i) The limits Df (γ) ≡ lim(N,T )→∞

Df,NT (γ) and Vf (γ) ≡ lim(N,T )→∞

Vf,NT (γ) exist

uniformly over γ ∈ Γ and are continuous at γ = γ0;

(ii) There is a constant M such that

Var( 1√

NT

N∑i=1

T∑t=1

ED(‖git(γ1, γ2)‖2))≤M max

i,tE(‖git(γ1, γ2)‖4

),

where git(γ1, γ2) = xit |dit(γ1)− dit(γ2)| and xiteit |dit(γ1)− dit(γ2)|.

Assumption A.7(i) requires the limits Df (γ) and Vf (γ) to be well defined. It also assumes the conti-

nuity of Vf (γ) at γ0. The discontinuous case can be allowed and is discussed in Section 3.5 below. As-

sumption A.7(ii) requires the conditional expectations of ‖xit‖2|dit(γ1)− dit(γ2)| and ‖xiteit‖2|dit(γ1)−dit(γ2)| to be well behaved.

Let D0f ≡ Df (γ0) and V 0

f ≡ Vf (γ0). The following theorem gives the asymptotic distribution of γ.

Theorem 3.5 (Asymptotic distribution of γ) Suppose that Assumptions A.1-A.5 and A.7 hold, andN/T →κ > 0 as (N,T ) → ∞. Then (NT )1−2α(γ − γ0)

d→ φξ, where ξ = argmax−∞<r<∞[−12 |r| + W (r)],

φ = C0′V 0f C

0/(C0′D0fC

0)2, and W (·) is a two-sided standard Brownian motion on the real line.

A two-sided Brownian motion W (·) on the real line is defined as

W (r) = W1(−r)1 r ≤ 0+W2(r)1 r > 0 ,

16

Page 17: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where W1(·) and W2(·) are two independent standard Brownian motions on [0,∞). Theorem 3.5 implies

that the pseudo statistic (NT )1−2α(γ − γ0)/φ has an asymptotically pivotal distribution, ξ. This result is

similar to Theorem 1 in Hansen (2000).

The asymptotic result in Theorem 3.5 relies critically on the shrinking effect assumption. In the fixed

threshold effect framework (i.e., α = 0), it is possible to demonstrate NT (γ − γ0) = Op(1). But deriving

the asymptotic distribution is not an easy task. Based on Theorem 2 of Chan (1993), we conjecture that the

limiting distribution is the maximizer of some compound Poisson process, which may involve the marginal

distribution of xit and the regression coefficients.

3.4 The likelihood ratio test

To make inference on γ0, one may be tempted to apply the asymptotic distribution result in Theorem 3.5.

But the limiting distribution of γ depends on the scale parameter φ which is hard to estimate accurately.

Inferences based on Theorem 3.5 tend to be poor in finite samples. Following the lead of Hansen (2000),

we consider a likelihood ratio (LR) statistic in this subsection to test the null hypothesis H0 : γ = γ0. Let

(θ(γ), Λ(γ)) be the estimator when the threshold value γ is given and L∗(γ) = L(θ(γ), Λ(γ), γ). Define

LRNT (γ) = NTL∗(γ)− L∗(γ)

L∗(γ),

where L∗(γ) = L(θ(γ), Λ(γ), γ) = L(θ, Λ, γ).

The following theorem reports the asymptotic distribution of LRNT (γ0).

Theorem 3.6 (Likelihood ratio test) Suppose that Assumptions A.1-A.7 hold and N/T → κ > 0 as

(N,T )→∞. Then under H0 : γ = γ0, we have

LRNT (γ0)d−−→ η2Ξ,

where η2 = C0′V 0f C

0/(σ2C0′D0fC

0), σ2 = plim(N,T )→∞1NT

∑Ni=1

∑Tt=1 e

2it, and Ξ = max−∞<r<∞

[−|r|+ 2W (r)] has the distribution function characterized by Pr(Ξ ≤ z) = (1− e−z/2)2.

The result in Theorem 3.6 is essentially same to Theorem 2 in Hansen (2000). The major difference

lies in the appearance of Λ(γ) in our definition of L∗(γ), which is a large dimensional matrix estimator.

Bounding the change of the likelihood value arising from the change of such a large dimensional matrix

estimator is an indispensable step in our theoretical analysis and it requires the use of the celebrated Davis

and Kahan’s (1970) sin(Θ) theorem. Apparently, such a step is not needed in Hansen’s analysis.

As Theorem 3.6 suggests, we still have a nuisance parameter η2. In the special case that the errors are

homoskedastic over the cross-section and time series dimensions, this parameter is equal to 1. But for the

general case, one need to estimate η2 consistently. Noting that

η2 =plim(N,T )→∞

1NT

∑Ni=1

∑Tt=1E[(δ0′xiteit)

2|qit = γ0]fit(γ0)

σ2plim(N,T )→∞1NT

∑Ni=1

∑Tt=1E[(δ0′xit)2|qit = γ0]fit (γ0)

,

17

Page 18: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

we propose to estimate η2 by

η2 =

∑Ni=1

∑Tt=1Kh(γ − qit)(δ′xiteit)2

σ2∑N

i=1

∑Tt=1Kh(γ − qit)(δ′xit)2

,

where σ2 = 1NT L

∗(γ), Kh(u) = h−1K(u/h), h → 0 is bandwidth parameter and K(·) is a kernel

function. We can readily show that σ2 = σ2 +op(1) and η2 = η2 +op(1) under some regularity conditions

on h and K(·). Given the consistent estimate of η2, we can consider the normalized LR statistic

NLRNT (γ0) = LRNT (γ0)/η2.

We can easily tabulate the asymptotic critical value for NLRNT (γ0). We can also invert this statistic to

obtain the asymptotic 1− α confidence interval for γ :

CI1−α = γ ∈ Γ : NLRNT (γ) ≤ Ξ1−α

where Ξ1−α denotes the 1−α quantile of Ξ. For example Ξ1−α =5.94, 7.35, and 10.59 for α =0.10, 0.05,

and 0.01, respectively.

3.5 Threshold effect in the error variance

So far, the asymptotic results in the previous subsections are derived under the assumption that Vf (γ) is

continuous at γ0 (see Assumption A.7(i)). This entails a neat and symmetric asymptotic distribution of

γ. However, there are cases where this assumption is violated. For example, the conditional distribution

of the regressors given the threshold variable changes across the threshold value, or the distribution of the

error term changes across the threshold value, or both. Among these cases, a particularly interesting case is

that the variance of the error term has threshold effect; see, e.g., Seo and Linton (2007). In this subsection,

we consider the extension to allow threshold effects in the error variance.

A direct consequence of the presence of threshold effects in the error variance is that Vf (γ) has a jump

at γ0. We assume that the left and right limits exist with V 0f,− = limγ↑γ0Vf (γ) and V 0

f,+ = limγ↓γ0Vf (γ).

The results in Theorems 3.5-3.6 now need to be modified to accommodate the discontinuity fact. However,

we emphasize that the arguments and methods of deriving the results in Theorems 3.5-3.6 are only slightly

affected in this more general case. In fact, we now need to derive the asymptotic results separately for the

γ ≤ γ0 and γ > γ0 cases. But even under Assumption 7(i), the proofs of Theorems 3.5-3.6 are conducted

separately for the subcases that γ ≤ γ0 and γ > γ0. So the only difference in the general case is how to

unify the asymptotic results of the two subcases. Now, the asymptotic distribution of (NT )1−2α(γ − γ0)

changes to

argmax−∞<r<∞

[φL(− 1

2|r|+W1(−r)

)1r ≤ 0+ φR

(− 1

2|r|+W2(r)

)1r > 0

],

18

Page 19: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

with φL = C0′V 0f,−C

0/(C0′D0fC

0)2 and φR = C0′V 0f,+C

0/(C0′D0fC

0)2. Since φL and φR do not

average out as they do in model (2.1) without a threshold effect in the error variance, the leading term

of (NT )1−2α(γ − γ0) now is not asymptotically pivotal up to a scalar.

The LR test statistic LRNT (γ0) has the asymptotic distribution

max−∞<r<∞

[η2L

(− |r|+ 2W1(−r)

)1r ≤ 0+ η2

R

(− |r|+ 2W2(r)

)1r > 0

], (3.1)

where η2L = C0′V 0

f,−C0/(σ2C0′D0

fC0) and η2

R = C0′V 0f,+C

0/(σ2C0′D0fC

0). Again, the asymptotic dis-

tribution of LRNT (γ0) is not pivotal up to a scalar any more. We can construct a nonparametric estimator

to estimate η2L:

η2L =

NT∑N

i=1

∑Tt=1Kh(γ − qit)(δ′xiteit)21qit ≤ γ

σ2[∑N

i=1

∑Tt=1Kh(γ − qit)1qit ≤ γ][

∑Ni=1

∑Tt=1Kh(γ − qit)(δ′xit)2]

.

The estimator for η2R can be constructed analogously. Note that the expression in (3.1) is equivalent to

max[

max0<r<∞

η2L

(− r + 2W1(r)

), max

0<r<∞η2R

(− r + 2W2(r)

)].

Given the fact that Pr(max0<r<∞−12r + W (r) < z) = 1 − e−z for a standard Wiener process, the

relationship of the asymptotic p-value and LRNT (γ0) can be easily derived as

Asy. p-value = 1−[1− e

− 1

2η2L

LRNT (γ0)][1− e

− 1

2η2R

LRNT (γ0)].

Alternatively, we can calculate the critical value under nominal size α, Cα, by solving the following equa-

tion: [1− e

− Cα2η2L

][1− e

− Cα2η2R

]= 1− α.

The finite sample performance of the above discussed procedure is examined via Monte Carlo simulations

in Section F of the online supplement.

3.6 Determination of R0

When implementing the least squares estimation on the proposed model, one has to determine the number

of factors (R0). This is an intrinsic issue related to factor analyses. For approximate factor models, there

are various ways to determineR0; see Bai and Ng (2002), Onatski (2010), and Ahn and Horenstein (2013),

among others. However, it seems difficult to extend existing methods to the current model because of the

nonlinearity arising from the threshold effects. Recently, Moon and Weidner (2018) and Chernozhukov

et al. (2018) consider the nuclear norm regularized estimation of panel regression models and provide

methods to determine the number of factors by singular value thresholding (SVT). This subsection extends

these methods to determine the number of factors in our panel threshold model. Let sr(A) denote the rth

largest singular value of an N × T matrix A. Let ‖A‖∗ =∑min(N,T )

r=1 sr(A) denote the nuclear norm of A.

The procedure goes as follows:

19

Page 20: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

1. Conduct the nuclear norm regularized estimation:

(θψ, γψ, Θψ) ≡ argmin(θ,γ,Θ)∈R2K×Γ×RN×TLn,ψ(θ, γ,Θ), where

Ln,ψ(θ, γ,Θ) =1

2NT

∥∥∥Y − β X− δ X(γ)−Θ∥∥∥2

+ψ√NT‖Θ‖∗,

for some tuning parameter ψ (N−1/2 + T−1/2).

2. Estimate R0 by R =∑min(N,T )

r=1 1sr(Θψ) ≥ χNT for some singular value threshold χNT =

log(√N +

√T )(√NTψ‖Θψ‖sp)1/2.

The idea of the above proposed method is similar to that of Moon and Weidner (2018) and Cher-

nozhukov et al. (2018). The N × T matrix estimator Θψ estimates Λ0F 0′. Under some regularity con-

ditions, we can show that ||Θψ − Λ0F 0′||sp = Op(√N +

√T ). Suppose R0 > 0, the first R0 singular

values of Λ0F 0′ are of the order Op(√NT ) and the remaining ones are equal to zero, implying that

sr(Θψ) √NT for r ≤ R0 and sr(Θψ) = Op(

√N +

√T ) for r > R0. Then one can readily show that

Pr(R = R0) → 1 as (N,T ) → ∞ if χNT log(√N +

√T )(N1/2T 1/4 + N1/4T 1/2). Similar analysis

can be applied to the special case of R0 = 0. The factor log(√N +

√T ) in χNT helps us to handle the

case R0 = 0. If R0 > 0, this factor can be ignored by setting χNT = (√NTψ‖Θψ‖sp)1/2.

In Section D of the online supplement, we provide a rigorous proof for the consistency of R. There, we

allow N and T to diverge to infinity at different rates. We also allow the threshold effects to be either fixed

or shrink to zero. In Section F of the online supplement, we provide some simulation results to demonstrate

that the SVT method works fairly well in finite samples.

In practice, we need to choose the tuning parameter ψ. As discussed in section 2.5 of Chernozhukov

et al. (2018), a desired tuning parameter ψ equals c‖e‖sp for some constant c ∈ (0, 1). To quantify ψ, we

can follow Chernozhukov et al. (2018) to compute an appropriate tuning parameter via simulation under

the Gaussian assumption. In our framework, as eit’s do not have cross-sectional or serial correlation, we

can generate uit that is independent across both (i, t) and uit ∼ N(0, σ2

). Then the tuning parameter is

given by c‖u‖sp, where u is a stack of uit’s into an N × T matrix. In practice, we can replace σ2 with

an initial consistent estimator. As the least squares estimator in our model has similar robust properties as

shown in Moon and Weidner (2015) that as long as the number of factors is not underestimated, we can

first estimate the model with Rmax > R0 and obtain an estimate of σ2.

3.7 Two-way additive effects v.s. interactive effects

Our model is attractive in its flexibility to model the unobserved heterogeneity and cross-sectional depen-

dence in real data via the IFEs. In traditional panel data models, the unobserved heterogeneity is usually

addressed via the two-way additive fixed effects. This naturally gives rise to the question on which model

should be used in empirical applications. Let αi and νt be the individual and time fixed effects. Then

αi + νt = λ′ift

20

Page 21: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

with λi = (αi, 1)′ and ft = (1, νt)′. This implies that the two-way additive fixed effects is a very special

case of IFEs with the number of factors equal to 2 and a factor and a factor loading set to 1. As a result,

if the unobserved heterogeneity in the data is of two-way additive fixed effects form but one uses our

panel threshold model with IFEs to estimate it, the resultant estimators are still consistent but inefficient.

The inefficiency is due to the fact that the useful information of partial factors and factor loadings being

observed are not properly accounted in the IFEs estimation. On the other hand, if the heterogeneity is of the

IFEs form and cannot be simplified to the two-way additive form, but one uses the within-group method

to estimate the model, the resultant estimators would be generally inconsistent because the endogeneity

arising from the unobserved factors and factor loadings are not fully controlled.

Based on the above discussion, a plausible procedure is that we first invoke the LS estimation method

to estimate the panel threshold model with IFEs, and then we test whether the estimated heterogeneity can

be further reduced to the two-way additive form. If the test is passed, we turn to the within-group estimator

to achieve efficiency; otherwise we can continue to employ the LS estimator studied in this paper.

Motivated by the above discussions, we propose a formal statistic to test the null of two-way additive

fixed-effects against the alternative of more general IFEs in Section E of the online supplement. We focus

on the case of R0 = 2 there and propose a test statistic that is asymptotically standard normal under

the null. Alternatively, we can extend the sup-type test of Castagnetti, Rossi and Trapani (2015) to our

framework. In addition, we conduct some simulations in Section F of the online supplement to evaluate

the finite sample performance of the test.

4 Testing the existence of threshold effect

Testing the existence of threshold effect is non-standard because the threshold level γ0 is not identified

when δ0 = 0. This issue has been well documented in the econometrics literature; see Andrews (1993),

Hansen (1996) and references therein. In the current paper, we are interested in testing the null hypothesis

H0 : δ0 = 0 versus the alternative H1 : δ0 6= 0. To study the local power of our test, we consider the

sequence of Pitman local alternatives H1n : δ0 = c√NT

. Note that δ0 shrinks to zero at a faster rate than the

one specified in Assumption A.3. So under the Pitman local alternatives the parameters γ0 is not identified.

Apparently, the case of c = 0 corresponds to the null of no threshold effect.

For each γ ∈ Γ = [γ, γ], we can obtain the estimator θ(γ), Λ(γ), F (γ) and bias-corrected estimator

θ(γ). Then we construct the asymptotic variance estimators

ωNT (γ, γ) ≡ 1

NT

N∑i=1

T∑t=1

zit,γ z′it,γ , and ΩNT (γ, γ) ≡ 1

NT

N∑i=1

T∑t=1

zit,γ z′it,γ eit(γ)2,

where zit,γ is a 2K × 1 vector with kth entry equal to (i, t)th entry of MΛ(γ)

Xk,γMF (γ)and eit(γ) =

yit − x′it,γ θ(γ)− λi(γ)′ft(γ). The sup-Wald statistic is defined as

supWNT ≡ supγ∈Γ

WNT (γ),

21

Page 22: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where WNT (γ) = NT · θ(γ)′LK−1NT (γ)L′θ(γ) with L = [0K×K , IK ]′ a section matrix, and KNT (γ) =

L′ωNT (γ, γ)−1ΩNT (γ, γ)ωNT (γ, γ)−1L, the estimated covariance matrix for√NTL′θ(γ).

The asymptotic property of the supWNT (γ) statistic is presented in the following theorem.

Theorem 4.1 (Wald test) Suppose that Assumptions A.1-A.2, A.4 and A.6 hold and N/T → κ > 0 as

(N,T )→∞. Then under H1n : δ0 = c/√NT, we have

supWNTd→ supγ∈Γ

W c (γ)

where

W c(γ) =[S(γ) +Q(γ)c

]′K(γ, γ)−1

[S(γ) +Q(γ)c

], Q(γ) = L′ω(γ, γ)−1ω(γ, γ0)L,

and S(γ) = L′ω(γ, γ)−1S(γ) is a mean-zero Gaussian process with covariance kernel K(γ1, γ2) =

L′ω(γ1, γ1)−1 Ω(γ1, γ2)ω(γ2, γ2)−1L.

Under H0, c = 0 and supγ∈ΓW0(γ) = supγ∈Γ S(γ)′K(γ, γ)−1S(γ). Clearly, the limiting null

distribution of supWNT depends on the Gaussian process S(γ) and is not pivotal. We cannot tabulate the

asymptotic critical values for the sup-Wald statistic. Nevertheless, given the simple structure of S(γ), we

can follow the literature (e.g., Hansen (1996)) and simulate the critical values via the following procedure:

1. Generate vit, i = 1, . . . , N, t = 1, . . . , T independently from the standard normal distribution;

2. Calculate SNT (γ) = 1√NT

∑Ni=1

∑Tt=1 zit,γ eit(γ)vit;

3. Compute supW ∗NT ≡ supγ∈Γ SNT (γ)′ωNT (γ, γ)−1LK−1NT (γ)L′ωNT (γ, γ)−1SNT (γ);

4. Repeat Steps 1-3 B times and denote the resulting supW ∗NT test statistics as supW ∗NT,j for j =

1, . . . , B.

5. Calculate the simulated/bootstrap p-value for the supWNT test as p∗W = 1B

∑Bj=1 1supW ∗NT,j

≥supWNT and reject the null when p∗W is smaller than some prescribed nominal level of sig-

nificance.

The next theorem justifies the asymptotic validity of the above procedure.

Theorem 4.2 (Bootstrap validity) Suppose that Assumptions A.1-A.2, A.4-A.7 hold and N/T → κ > 0 as

(N,T )→∞. Then supW∗NT

d→ supγ∈ΓW0(γ).

Theorem 4.2 indicates that the bootstrap statistic can mimic the asymptotic null distribution of the

statistic supWNT . When B is sufficiently large, the asymptotic critical value of the level α test based

on supWNT is approximately given by the empirical upper α-quantile of supW ∗NT,j , j = 1, . . . , B.Therefore, we can reject the null hypothesis H0 : δ0 = 0 if the simulated p-value p∗W is smaller than α.

22

Page 23: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

5 Monte Carlo simulations

In this section, we conduct Monte Carlo simulations to evaluate the finite sample performance of our

estimators and test statistics.

5.1 Data generating processes

We consider four data generating processes:

DGP 1: yit = β0yi,t−1 + δ0yi,t−11yi,t−1 ≤ γ0+ λ0′i f

0t + eit;

DGP 2: yit = β0yi,t−1 + δ0yi,t−11qit ≤ γ0+ λ0′i f

0t + eit, where the threshold variable qit is indepen-

dently and identically distributed (i.i.d.) from N(2, 1);

DGP 3: yit = β0xit + δ0xit1xit ≤ γ0+ λ0′i f

0t + eit, where xit = 2 + vit + (λ0

i + λ∗i )′(f0

t + 0.5f0t−1),

vit is i.i.d. N(0, 1) and λ∗ir is i.i.d. N(1, 1) for r = 1, 2;

DGP 4: yit = β0xit + δ0xit1qit ≤ γ0+ λ0′i f

0t + eit, where xit is generated in the same way as in DGP

3 and qit is generated in the same way as in DGP 2.

We set β0 = 0.3, δ0 = (NT )−0.2 and γ0 = 2 for all four DGPs. In all the above four DGPs, both

λ0i and f0

t are 2 × 1 vectors with each entry of λ0i being i.i.d. N(1, 1) and each entry of f0

t being i.i.d.

0.7×N(1, 1). The factors and factor loadings are mutually independent of each other. We also generate the

idiosyncratic error terms eit independently from the student t-distribution with nine degrees of freedom.

For the dynamic DGPs 1 and 2, we throw away the first 1000 time periods of observations to get rid of the

start-up effect. For the static DGPs 3 and 4, we generate correlations between the regressors xit and the

factors and factor loadings.

DGP 1 is a dynamic panel where the lagged dependent variable yi,t−1 also serves as the threshold

variable. DGP 2 is a dynamic panel with a strictly exogenous variable qit being the threshold variable.

DGP 3 is a static panel with the exogenous regressor xit serving as the threshold variable and DGP 4

is a static panel where the threshold variable qit is different from the regressor xit. Note that the high

level assumption, Assumption A.6, is satisfied in our DGPs. Take DGP1 as an example. It is easy to see

that yit is independent with yjt condition on F 0 = (f01 , . . . , f

0T )′ and yit is weakly correlated yis with

exponential-decay correlations conditional on λ0i . Then we prove the law of large numbers simply by

directly showing that the variance converges to zero. With the partition arguments as used in the proof of

Theorem 2.4.1 in Van Der Vaart and Wellner (1996), the point convergence can be readily extended to the

uniform convergence.

We are interested in the performance of our estimators and test statistics in all the above four scenarios.

In addition, we also consider generating conditional heteroskedastic errors as in Su and Chen (2013). The

simulation performance is similar to that reported here.

23

Page 24: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

5.2 Implementation and estimation results

For each DGP, we consider the feasible and infeasible bias-corrected estimators, where the infeasible

estimators are obtained by using the information of true threshold parameter γ0. For all bias-corrected

estimators, we correct all three bias terms based on the formula in Section 3 by ignoring the fact that there

is no need to correct some bias terms in some DGPs. For example, the slope coefficient estimator has only

the bias term B1(γ0) in DGPs 1 and 2, and has no bias term in DGPs 3 and 4. For the slope coefficients

(β0, δ0) and the threshold parameter γ0, we report the empirical bias (Bias), standard deviation (Std), and

coverage probability (CP) of the 95% confidence interval for the corresponding true value. The number of

repetitions for each case is set to be 500.

Table 1 reports the estimation and inference results with the unknown R0 being estimated by the SVT

method in Section 3.6. The tuning parameterψ is chosen following the description at the end of Section 3.6.

Specifically, we chose c = 0.3 in ψ. The number of factors is estimated quite accurately. When bothN and

T are not less than 50 for all DGPs, the correct estimation rate is above 99%. For each combination of N

and T in each DGP, the table reports the estimation and inference results for the feasible estimates of β0 and

δ0 on a row, followed by those of the infeasible estimates in the next row. We summarize some important

findings from Table 1. First, as expected, the infeasible estimates of β0 and δ0 tend to outperform the

feasible estimates slightly in terms of bias and standard deviation, which is especially true for the estimate

of threshold effect δ0. In addition, we find that the two estimates behave similarly, which supports the

theoretical claim that they are asymptotically equivalent. Second, for both sets of estimates, the biases and

standard deviations decrease to zero as either N or T increases. In terms of inference, the 95% confidence

intervals for the slope coefficients in these DGPs tend to be under-covered, but their performance generally

improves as either N or T increases. Third, the biases are of asymptotically smaller order compared to

the standard deviations, indicating that the bias-corrected estimator performs as our theory predicts. As

the slope coefficients estimator θ has bias term in DGPs 1-2, we observe slightly larger biases of the bias-

corrected estimator compared to that in DGPs 3-4. Fourth, for the estimate of γ0, the biases and standard

deviations tend to decrease as N or T becomes large. The 95% confidence intervals tend to under-cover

slightly when N or T is small for DGPs 1, 3, and 4, but the coverage probability approaches the nominal

level (95%) quickly as both N and T increase.

5.3 Test for the threshold effect

We next consider the test for the presence of threshold effects. We also consider both dynamic and stat-

ic cases and all four DGPs. The main difference is that now we set δ0 = δ0NT = 0, 2(NT )−1/2 and

10(NT )−1/2 in order to evaluate both the size and local power performance of our test statistic. We con-

sider three frequently-used nominal levels in empirical studies, i.e., 1%, 5% and 10%. As before, we

employ the bias-corrected estimates with three potential bias terms corrected in all cases. The number of

repetitions is 500.

24

Page 25: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 1: Performance of the least square estimators

β δ γ

N T Bias Std CP Bias Std CP Bias Std CP1 25 50 -0.0049 0.0304 93.6% 0.0055 0.0455 91.2% 0.0006 0.2525 90.2%

-0.0039 0.0298 94.2% -0.0016 0.0427 92.0%50 25 -0.0153 0.0345 88.2% 0.0099 0.0452 92.6% -0.0194 0.2264 91.2%

-0.0141 0.0336 89.4% 0.0011 0.0408 93.6%50 50 -0.0036 0.0213 93.8% 0.0032 0.0301 92.8% -0.0015 0.1277 91.4%

-0.0028 0.0214 93.4% -0.0011 0.0292 93.6%50 100 -0.0024 0.0140 94.8% 0.0022 0.0196 95.4% 0.0001 0.0738 94.0%

-0.0019 0.0140 94.6% 0.0000 0.0196 95.2%100 50 -0.0037 0.0152 91.0% 0.0021 0.0205 92.2% 0.0025 0.0759 93.4%

-0.0033 0.0151 91.4% 0.0001 0.0197 94.0%100 100 -0.0016 0.0099 95.0% 0.0014 0.0144 94.0% 0.0008 0.0445 94.0%

-0.0013 0.0099 95.2% 0.0002 0.0145 93.0%2 25 50 -0.0059 0.0306 89.4% 0.0021 0.0209 93.2% 0.0019 0.0279 90.4%

-0.0054 0.0305 90.2% 0.0010 0.0210 92.2%50 25 -0.0140 0.0289 89.4% 0.0018 0.0196 93.6% -0.0003 0.0266 92.6%

-0.0140 0.0285 90.4% 0.0010 0.0192 93.8%50 50 -0.0045 0.0201 92.4% 0.0000 0.0139 93.4% -0.0001 0.0150 94.4%

-0.0044 0.0201 92.4% -0.0005 0.0138 93.6%50 100 -0.0021 0.0138 94.6% 0.0000 0.0103 92.8% 0.0006 0.0098 95.8%

-0.0019 0.0138 94.8% -0.0002 0.0102 93.4%100 50 -0.0043 0.0148 93.2% 0.0005 0.0097 93.0% 0.0006 0.0128 93.2%

-0.0040 0.0146 93.2% 0.0001 0.0097 93.2%100 100 -0.0011 0.0102 93.2% 0.0003 0.0074 92.8% 0.0002 0.0068 94.4%

-0.0010 0.0101 94.0% 0.0001 0.0074 93.2%3 25 50 0.0018 0.0232 92.2% 0.0064 0.0409 92.6% -0.0178 0.2124 91.4%

0.0019 0.0219 93.2% -0.0010 0.0399 94.8%50 25 0.0036 0.0247 89.2% 0.0089 0.0461 88.2% -0.0242 0.2751 91.2%

0.0034 0.0229 91.4% 0.0002 0.0428 91.4%50 50 -0.0012 0.0157 92.2% 0.0047 0.0282 92.2% -0.0099 0.1384 91.2%

-0.0007 0.0157 92.4% 0.0002 0.0272 93.4%50 100 -0.0003 0.0111 93.0% 0.0013 0.0202 91.6% -0.0089 0.1328 92.2%

0.0000 0.0110 93.2% -0.0012 0.0196 92.4%100 50 0.0001 0.0111 93.4% 0.0028 0.0207 92.0% -0.0094 0.0899 94.4%

0.0003 0.0110 93.6% 0.0004 0.0211 92.2%100 100 0.0000 0.0075 93.4% 0.0016 0.0132 94.0% 0.0004 0.0418 94.4%

0.0002 0.0075 93.6% 0.0005 0.0133 93.8%4 25 50 0.0023 0.0299 88.6% 0.0042 0.0355 91.4% -0.0103 0.2241 91.0%

0.0016 0.0265 90.0% 0.0005 0.0357 90.8%50 25 0.0021 0.0289 85.8% 0.0022 0.0353 91.6% -0.0063 0.1931 92.4%

0.0021 0.0268 86.8% -0.0013 0.0357 90.6%50 50 0.0010 0.0173 92.2% 0.0003 0.0231 93.8% -0.0015 0.0748 94.6%

0.0014 0.0167 92.4% -0.0015 0.0228 94.4%50 100 0.0006 0.0114 94.6% 0.0011 0.0153 95.6% -0.0029 0.0517 93.4%

0.0009 0.0113 95.2% -0.0002 0.0154 94.4%100 50 0.0004 0.0123 92.0% 0.0010 0.0160 94.4% 0.0022 0.0436 95.2%

0.0009 0.0122 92.2% -0.0001 0.0161 94.4%100 100 -0.0001 0.0083 93.4% 0.0004 0.0109 95.2% 0.0016 0.0292 96.4%

0.0002 0.0083 93.8% -0.0002 0.0109 95.0%

Note. We report the bias, std and CP for the feasible estimates followed those for the infeasible estimates with true γ0,where CP refers to the coverage probability for the 95% confidence intervals.

25

Page 26: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 2: Rejecting frequency for testing the threshold effect at 1%, 5% and 10% nominal levels

DGP N T δ0NT=0 δ0NT= 2/√NT δ0NT= 10/

√NT

1% 5% 10% 1% 5% 10% 1% 5% 10%1 50 50 0.018 0.074 0.106 0.100 0.290 0.406 1.000 1.000 1.000

50 100 0.018 0.068 0.142 0.100 0.268 0.358 1.000 1.000 1.000100 50 0.018 0.076 0.110 0.112 0.256 0.356 1.000 1.000 1.000100 100 0.006 0.064 0.122 0.094 0.236 0.372 1.000 1.000 1.000

2 50 50 0.024 0.086 0.156 0.460 0.698 0.794 1.000 1.000 1.00050 100 0.018 0.084 0.144 0.458 0.674 0.776 1.000 1.000 1.000100 50 0.016 0.068 0.118 0.476 0.666 0.750 1.000 1.000 1.000100 100 0.012 0.064 0.132 0.400 0.616 0.748 1.000 1.000 1.000

3 50 50 0.024 0.084 0.144 0.150 0.352 0.448 1.000 1.000 1.00050 100 0.014 0.052 0.104 0.138 0.332 0.424 1.000 1.000 1.000100 50 0.016 0.080 0.162 0.150 0.362 0.470 1.000 1.000 1.000100 100 0.010 0.066 0.120 0.140 0.324 0.458 1.000 1.000 1.000

4 50 50 0.026 0.076 0.140 0.136 0.334 0.436 1.000 1.000 1.00050 100 0.006 0.072 0.136 0.120 0.332 0.454 1.000 1.000 1.000100 50 0.026 0.098 0.158 0.146 0.276 0.384 1.000 1.000 1.000100 100 0.024 0.058 0.116 0.122 0.280 0.400 1.000 1.000 1.000

Table 2 reports the test results. First, under H0 : δ0 = 0, the rejection rates for all DGPs are close to the

the nominal levels with moderate deviations when N or T is not large enough. But as N and T becomes

large, the size of our test improves quickly. Second, the rejection rates increase fast as δ0 deviates from

0 further and further. Note that our asymptotic estimation theory considers the case where δ0 is of order

(NT )−α with α < 1/2. In order to see the local power to approach 1, we need to have a large constant

c for the threshold effect (c/√NT ). Third, the local power performance of our test for the static panel is

similar to that for the dynamic panel.

6 Empirical Application

In this section, we apply our method to study the relationship between financial depth and economic

growth.

6.1 Literature

An important research topic in the economic growth literature is about the relationship between financial

depth and economic growth. Recent empirical studies frequently suggest that there is a turning point in

the effect of financial development on economic growth; see Levine (2003, 2005), Law and Singh (2014)

and Arcand, Berkes and Panizza (2015), among others. Levine (2005) provides an extensive survey of the

theoretical literature that emphasizes how the services provided by the financial sector would contribute

to economic growth. Law and Singh (2014) construct a sample of 87 countries for the period 1980-2010

from several datasets including World Development Indicators (WDI), Penn World Table 6.3, International

26

Page 27: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Country Risk Guide (ICRG), and the Barro and Lee’s dataset. They consider the short panel framework

by averaging the time series observations for each country over five-year periods. They find the presence

of threshold effect in the finance–growth relationship. In particular, the level of financial development is

beneficial to growth only up to a certain level of its value; beyond the level further development of finance

tends to adversely affect growth. Similarly, Arcand, Berkes and Panizza (2015) find that financial depth

starts to have a negative effect on output growth for high-income countries when credit to the private sector

reaches 100% of GDP.

Although the effect of financial sector on economic growth has been studied in a wide range, a potential

common drawback of the previous studies, such as Law and Singh (2014) and Arcand, Berkes and Panizza

(2015), is that the cross-sectional dependence among the data that arises from the unobserved common

factors is largely ignored. In this section we revisit the relationship between financial depth and economic

growth by explicitly modeling the cross-sectional dependence with a factor structure.

6.2 Model

We extend the panel threshold model of Law and Singh (2014) to allow for the presence of IFEs. The

model is given by

GROWTHit = a · 1FINit ≤ γ+ β1FINit · 1FINit ≤ γ+ β2FINit · 1FINit > γ

+ ϕ′xit + λ′ift + eit, (6.1)

where GROWTHit denotes the economic growth for country i in year t, F INit denotes the level of

financial development for country i in year t, xit is a vector of control variables, and the remaining symbols

are the same as in the theory part of the paper.

The above model is slightly different from that given by equation (2.1). First, we do not consider the

threshold effect in the coefficient of control variables xit’s. Second, as we do not allow for the intercept in

the regressors, we only put 1FINit ≤ γ as a regressor. Third, we directly write β1 and β2 as the regime

one and the regime two coefficients of FINit.

We follow Law and Singh (2014) and collect annual data from the WDI database between 1971 and

2015. The dataset is a balanced panel with N = 50 and T = 45. We consider three measures of financial

development, namely, private sector credit (PSC), liquid liabilities (LL), and domestic credit (DC). All

these three banking sector development indicators are expressed as ratios to GDP. The control variables

include: initial per capita GDP, trade openness, government expenditure, and population growth. Table 3

reports the descriptive statistics of the variables used in our regression. It is seen that the economic growth

exhibits a large variation among the 50 countries over the 45 years period under our investigation. In

contrast, the three measures of financial development and control variables have relatively small variations.

27

Page 28: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 3: Descriptive statistics

Unit of measurement mean std dev median min maxGrowth % 3.604 4.520 3.892 -36.700 39.487

Private Sector Credit log(% of GDP) 3.304 0.900 3.266 0.329 5.570Liquid Liability log(% of GDP) 3.620 0.645 3.595 1.495 5.445Domestic Credit log(% of GDP) 3.407 0.912 3.363 0.432 5.743

Lag GDP Per Capita log(US$) 2010 constant price 8.358 1.552 8.227 5.390 11.425Population Growth % 1.781 1.084 1.871 -1.475 6.366

Trade openness log(% of GDP) 4.054 0.600 4.050 1.844 6.090Government expenditure log(% of GDP) 2.629 0.369 2.617 1.169 3.772

Table 4: Test for the presence of threshold effect

R Private Sector Credit Liquid Liability Domestic CreditsupWNT p-value supWNT p-value supWNT p-value

1 44.661 0.000 39.849 0.000 41.839 0.0002 74.643 0.000 63.684 0.000 66.610 0.0003 109.856 0.000 69.551 0.000 84.437 0.0004 36.041 0.000 22.737 0.000 41.788 0.0005 38.052 0.000 23.387 0.000 49.027 0.000

6.3 Test for the presence of threshold effects

To conduct the hypothesis testing for the presence of threshold effects, we first need to specify the number

of factors. Note that under the null hypothesis, the model reduces to a standard panel data model with IFEs.

So we have a large room of choices on the methods of determining the number of factors, such as Bai and

Ng (2002), Onatski (2010) and Ahn and Horenstein (2013). Nevertheless, we find that different methods

produce different estimates of R0 for our dataset. For this reason, we are conservative to consider the tests

under the various numbers of factors. Specifically, we consider R = 1, . . . , 5 and report the corresponding

test statistics and p-values for each case.

Table 4 reports the supWNT statistics and the associated bootstrap p-values based on 500 bootstrap

replications when all the three measures of financial development are considered. Apparently, all these

supWNT statistics suggest that we reject the null hypothesis of no threshold effect at the 1% level.

6.4 Number of factors and the test against the additive fixed effects

To estimate the model, we can also consider different choices of R. We apply the SVT method in Section

3.6 to determine the number of factors. We also try to modify the existing methods to determine the number

of factors in our framework. First, we estimate the model with Rmax = 5 and obtain the residuals. Then

we conduct the eigenvalue distribution (ED) of Onatski (2010), the growth rate (GR) and eigenvalue ratio

(ER) of Ahn and Horenstein (2013) and PC and IC of Bai and Ng (2002) to estimate the number of factors

in the residuals.

The results are summarized in Table 5. ED, GR and ER choose R = 1 for all three specifications.

28

Page 29: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 5: Number of factors determined by various methods

Bai and Ng Onatski-ReStat AH SVTFIN PCp1 ICp1 ED ER GRPSC 3 2 1 1 1 2LL 3 2 1 1 1 2DC 3 2 1 1 1 2

Note: Bai and Ng refers to Bai and Ng (2002), Onatski-ReStat refers to Onatski (2010), and AHrefers to Ahn and Horenstein (2013).

PC and IC of Bai and Ng (2002) choose R to be 3 and 2, respectively. Our SVT method chooses R = 2

for all three specifications. Moon and Weidner (2015) find that in the linear panel data models with IFEs,

we can still estimate the slope coefficients consistently when the number of factors is overspecified. To be

conservative, we focus on the case where R = 3.

As our SVT method chooses R = 2, there is some possibility that the IFEs can be captured by the

two-way additive fixed effects (AFEs). In Section E of the online supplement, we propose a method to test

AFEs against IFEs. We conduct the test for all three specifications here. The test statistics are 18.11, 17.73,

and 18.52 with PSC, LL, and DC serving as FINit respectively. Under the null of AFEs, the test statistic

is N(0, 1) asymptotically. Therefore, we find a strong evidence to support the IFEs. An implication of this

result is that the existing studies may have endogeneity issue because the unobserved heterogeneity are not

fully controlled by the traditional two-way additive fixed effects.

6.5 Estimation results

Table 6 reports the regression results with R = 3. The estimates of threshold coefficient γ are 4.2322,

4.304 and 4.559, respectively when private sector credit, liquid liabilities, and domestic credit are used as a

measure for the financial development. In terms of the original percentage scale, these numbers correspond

to 68.87%, 71.30% and 95.13%, respectively, where, e.g., the first percentage suggests the turning point

for the model with private sector credit occurs when the private sector credit over the GDP ratio reaches

68.87%, a number that is substantially smaller than 100%, and a number suggested by Arcand, Berkes

and Panizza (2015). For these three estimates of γ, we find there are about 84.3%, 85.1% and 87.5% of

observations in the data that are smaller than the corresponding estimate of γ. In some sense, the estimated

threshold coefficient is far apart from the tail of the distribution of the threshold variable. In Table 6, we

also report the 95% confidence intervals that are based on the likelihood-ratio test. The three confidence

intervals are quite narrow due to the fact that threshold effects are not so small.

The estimates of β1 and β2 in (6.1) suggest that the financial development is a positive but not statis-

tically significant determinant of economic growth if it is less than a certain threshold level, and its effect

becomes negative and statistically significant when it is higher than the threshold level. This empirical

finding is roughly in line with that in Law and Singh (2014) and supports the conventional wisdom that

more finance is definitely not always better and it tends to harm economic growth after a turning point.

29

Page 30: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 6: Estimates of the slope and threshold coefficients

Model 1 Model 2 Model 3Fin Development PSC LL DC

Threshold Estimate:Threshold level (γ) 4.232 (log68.87) 4.304 (log74.00) 4.559 (log95.45)

95% CI [4.106, 4.268] [4.225, 4.360] [4.542, 4.567]Sample Quantile 84.3% 85.1% 87.5%

Impact of finance:Regime one (β1) 0.274 (0.282) 0.298 (0.393) 0.078 (0.267)Regime two (β2) -3.101 (0.698) -3.820 (0.967) -3.014 (0.965)

Impact of Covariates:Lag GDP Per Capita -2.595 (0.490) -2.561 (0.507) -2.595 (0.497)Population Growth 1.353 (0.235) 1.188 (0.229) 1.344 (0.237)

Trade Openness 3.098 (0.350) 3.143 (0.351) 3.204 (0.356)Gov Expenditure -2.827 (0.423) -2.867 (0.436) -2.785 (0.430)

Intercept -13.618 (3.406) -17.031 (4.275) -12.784 (4.556)

Note: The values without parentheses (the left column) are the least square estimates and the valuesin parentheses (the right column) are the corresponding standard errors.

However, we emphasize that although the final results are changed not so much in comparison with Law

and Singh (2014), our model should be used in this type of empirical studies since we find strong evidence

that both threshold effects and IFEs are present in the data.

7 Conclusion

In this paper, we consider the least squares estimation of the panel threshold models with IFEs. We study

the asymptotic properties of the least squares estimators in the shrinking threshold effect framework and

propose a likelihood ratio test for the threshold parameter in the model. We also propose a test for the

presence of threshold effects. Our simulations suggest that our estimators and test statistics perform well

in finite samples. We apply our method to study the effect of financial development on economic growth

and find strong evidence to support the proposed model.

There are several interesting topics for further research. First, it is interesting to consider panel thresh-

old regressions with both IFEs and endogeneity. Endogeneity has become a serious concern in recent

cross-sectional threshold models; see, e.g., Yu and Phillips (2018). The extension to our framework will

be complicated by the presence of IFEs. Second, we only consider a panel threshold model where the

regression parameters exhibit homogeneity over both cross-sectional and time dimensions. In the large N

and large T setup, there is a possibility for unobserved parameter heterogeneity or latent group structure

over the cross-sectional dimensions (e.g., Ando and Bai (2016), Su, Shi and Phillips (2016) and Su and

Ju (2018)) and structural changes along the time dimension (e.g., Qian and Su (2016), Li, Qian and Su

(2016), and Okui and Wang (2017)). We leave these topics for future research.

30

Page 31: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

APPENDIX

In this appendix we prove the main results in the paper. The proof relies on some technical lemmas whose proofs

are given in the online supplement.

A Proof of the main results

Let πNT = min(√N,√T ). As we require N/T → κ as (N,T ) → ∞, we may use the property O(T ) = O(N) in

various places.

A.1 Proof of Theorem 3.1

To prove Theorem 3.1, we need the following two lemmas.

Lemma A.1 Suppose Assumptions A.2 and A.4 hold. Then

(i)1

NT‖ee′‖ = Op(π

−1NT ),

1

NT‖e′e‖ = Op(π

−1NT ),

1

NT‖e′eF 0‖ = Op(π

−1NT ),

1

NT‖Λ0′ee′‖ = Op(π

−1NT );

(ii)1√NT‖eF 0‖ = Op(1),

1√NT‖Λ0′e‖ = Op(1),

1√NT‖Λ0′eF 0‖ = Op(1);

(iii) supγ∈Γ

1

NT‖eX′k,γ‖ = Op(π

−1NT ) for k = 1, . . . , 2K;

(iv) supγ∈Γ

1√NT|tr(eX′k,γ)| = Op(1), sup

γ∈Γ

1√NT|tr(eX′k,γMΛ0)| = Op(1) for k = 1, . . . , 2K.

Lemma A.2 Under Assumptions A.2 and A.4,

(i) supΛ∈L

∥∥∥ 1

NT

T∑t=1

X ′tMΛet

∥∥∥ = Op(π−1NT ),

(ii) sup(Λ,γ)∈L×Γ

∥∥∥ 1

NT

T∑t=1

Xt(γ)′MΛet

∥∥∥ = Op(π−1NT ),

(iii) supΛ∈L

∥∥∥ 1

NT

T∑t=1

f0′t Λ0′MΛet

∥∥∥ = Op(π−1NT ),

(iv) supΛ∈L

∣∣∣ 1

NT

T∑t=1

e′tPΛet

∣∣∣ = Op(π−2NT ),

where L =

Λ ∈ RN×R : N−1Λ′Λ = IR.

Proof of Theorem 3.1. (i) Let Xt(γ1, γ2) = Xt(γ1)−Xt(γ2). Note that

Yt −Xtβ −Xt(γ)δ ≡ Λ0f0t + et + Ψt(θ, γ),

31

Page 32: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where Ψt(θ, γ) ≡ −Xt(β − β0) −Xt(γ)(δ − δ0) −Xt(γ, γ0)δ0 = −Xt,γ(θ − θ0) −Xt(γ, γ

0)δ0. The objectivefunction can be rewritten as

L(θ,Λ, γ) =

T∑t=1

[Λ0f0

t + et + Ψt(θ, γ)]′MΛ

[Λ0f0

t + et + Ψt(θ, γ)]

= L1(θ,Λ, γ) + L2(θ,Λ, γ) +

T∑t=1

e′tMΛet, (A.1)

where

L1(θ,Λ, γ) =

T∑t=1

Ψt(θ, γ)′MΛΨt(θ, γ) +

T∑t=1

f0′t Λ0′MΛΛ0f0

t + 2

T∑t=1

Ψt(θ, γ)′MΛΛ0f0t , and

L2(θ,Λ, γ) = 2

T∑t=1

f0′t Λ0′MΛet + 2

T∑t=1

Ψt(θ, γ)′MΛet.

It is easy to verify that L(θ0,Λ0, γ0) =∑Tt=1 e

′tMΛ0et. Then we have

L(θ,Λ, γ)− L(θ0,Λ0, γ0) = L1(θ,Λ, γ) + L2(θ,Λ, γ)−T∑t=1

e′t(PΛ − PΛ0)et. (A.2)

By Lemma A.2(i)-(iv), we have

sup‖θ‖≤M

sup(Λ,γ)∈L×Γ

∣∣∣ 1

NTL2(θ,Λ, γ)− 1

NT

T∑t=1

e′t(PΛ − PΛ0)et

∣∣∣ = op(1).

This result, together with the fact that L(θ, Λ, γ)− L(θ0,Λ0, γ0) ≤ 0, implies 1NT L1(θ, Λ, γ) + op(1) ≤ 0.

Let Ψ(θ, γ) = (Ψ1(θ, γ), . . . ,ΨT (θ, γ)), which is an N × T matrix. Noting that

T∑t=1

Ψt(θ, γ)′MΛΨt(θ, γ) = tr[Ψ(θ, γ)′MΛΨ(θ, γ)]

= tr[(MF 0 + PF 0)Ψ(θ, γ)′MΛΨ(θ, γ)]

= tr[MF 0Ψ(θ, γ)′MΛΨ(θ, γ)]

+ tr[(F 0′F 0)−1/2F 0′Ψ(θ, γ)′MΛΨ(θ, γ)F 0(F 0′F 0)−1/2

],

T∑t=1

f0′t Λ0′MΛΛ0f0

t = tr[(F 0′F 0)1/2Λ0′MΛΛ0(F 0′F 0)1/2], and

T∑t=1

Ψt(θ, γ)′MΛΛ0f0t = tr[(F 0′F 0)1/2Λ0′MΛΨ(θ, γ)F 0(F 0′F 0)−1/2],

we haveL1(θ,Λ, γ) = tr[MF 0Ψ(θ, γ)′MΛΨ(θ, γ)] + tr[Ξ(θ, γ)′MΛΞ(θ, γ)],

where Ξ(θ, γ) ≡ Λ0(F 0′F 0)1/2 + Ψ(θ, γ)F 0(F 0′F 0)−1/2. Let B(Λ, γ), Z(Λ, γ) and Z(Λ, γ) be the notationsintroduced in Section 2.3. Let

B(Λ, γ) ≡ 1

NTZ(Λ, γ)′Z(Λ, γ) and B(Λ, γ) ≡ 1

NTZ(Λ, γ)′Z(Λ, γ).

32

Page 33: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Using the properties that tr (B1B2B3) =vec(B1)′(B2 ⊗ I)vec(B′3) and tr (B1B2B3B4) =vec(B1)

′(B2 ⊗B′4)vec(B′3)

for any conformable matrices B1, B2, B3, B4 and an identity matrix I (see, e.g., Bernstein (2005, p.253)), we have

tr[MF 0Ψ(θ, γ)′MΛΨ(θ, γ)] = vec(Ψ(θ, γ)

)′(MF 0 ⊗MΛ) vec

(Ψ(θ, γ)

).

Noting that (MF 0 ⊗MΛ)vec(Ψ(θ, γ)) = −Z(Λ, γ)(θ− θ0)−Z(Λ, γ)δ0 and MF 0 ⊗MΛ is a projection matrix, wehave

1

NTtr [MF 0Ψ(θ, γ)′MΛΨ(θ, γ)] =

[θ − θ0 + B(Λ, γ)−1B(Λ, γ)δ0

]′B(Λ, γ)[θ − θ0 + B(Λ, γ)−1B(Λ, γ)δ0

]+ δ0′[B(Λ, γ)− B(Λ, γ)′B(Λ, γ)−1B(Λ, γ)

]δ0

≡ 1

NTL1,1(θ,Λ, γ) +

1

NTL1,2(Λ, γ), say, (A.3)

where the invertibility ofB(Λ, γ) is ensured by Assumption A.1(i). By quadratic form, we have (NT )−1L1,1(θ,Λ, γ) ≥0. Noting that (NT )−1L1,2(Λ, γ) can be written as δ0′Z(Λ, γ)′MZ(Λ,γ)Z(Λ, γ)δ0, we also have (NT )−1L1,2(Λ, γ) ≥0. It is easy to verify that B(Λ, γ), B(Λ, γ) and B(Λ, γ) are Op(1) uniformly in (Λ, γ). Noting that δ0 = o(1),

(NT )−1L1,2(Λ, γ) ≥ 0 and (NT )−1tr[Ξ(θ, γ)′MΛΞ(θ, γ)] ≥ 0 for all (γ,Λ), the fact that 1NT L1(θ, Λ, γ) +

op(1) ≤ 0 implies(θ − θ0)′B(Λ, γ)(θ − θ0) + op(1) < 0.

This implies that ‖θ − θ0‖ = op(1) by Assumption A.1(i).(ii) Given ‖θ − θ0‖ = op(1), we can readily show that

1

NTtr[Ψ(θ, γ)′MΛΨ(θ, γ)] = op(1) and

1

NTtr[Ψ(θ, γ)′MΛΛ0F 0′] = op(1).

These results, in conjunction with the fact that 1NT L1(θ, Λ, γ) = op(1), implies that 1

NT tr[Ξ′MΛΞ] = op(1) whereΞ ≡ Ξ(θ, γ). It follows that

1

NTtr(F 0Λ0′MΛΛ0F 0′) = tr

(Λ0′MΛΛ0

N

F 0′F 0

T

)= op(1).

Because T−1F 0′F 0 > 0, the above equation implies that

Λ0′MΛΛ0

N=

Λ0′Λ0

N− Λ0′Λ

N

Λ′Λ0

N= op(1).

Hence, Λ′Λ0/N is invertible and it follows that IR − Λ′PΛ0Λ = op(1). Consequently, we have∥∥PΛ − PΛ0

∥∥2= tr[

(PΛ − PΛ0

)2] = 2tr(IR − Λ′PΛ0Λ/N) = op (1) .

This completes the proof of (ii).

A.2 Proof of Theorem 3.2

To prove Theorem 3.2, we need two propositions. In proposition A.1 and A.2, we establish preliminary convergence

rates of Λ and θ respectively. Given the rates, we go back to equation (A.2) and show the consistency of γ.

Proposition A.1 Let H ≡ (F 0′F 0/T )(Λ0′Λ/N)V −1NT . Under Assumptions A.1-A.4,

33

Page 34: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

(i) VNTp→ V , where V is a diagonal matrix consisting of eigenvalues of ΣΛ0ΣF 0 ;

(ii) H is invertible and 1N ‖Λ− Λ0H‖2 = Op(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT ).

Proof of Proposition A.1. Let et ≡ Yt − Xtβ − Xt(γ)δ = et + Λ0f0t − Xt,γ0(θ − θ0) + Xt(γ

0, γ)δ. By theeigenvalue equation

(1NT

∑Tt=1 ete

′t

)Λ = ΛVNT , we can obtain the following decomposition

Λ− Λ0H =1

NT

ee′ + eF 0Λ0′ + Λ0F 0′e′ −

2K∑k=1

(θk − θ0k)eX′k,γ0

−2K∑k=1

(θk − θ0k)Xk,γ0e′ +

K∑k=1

δkeXk(γ0, γ)′ +

K∑k=1

δkXk(γ0, γ)e′

−2K∑k=1

(θk − θ0k)Λ0F 0′X′k,γ0 −

2K∑k=1

(θk − θ0k)Xk,γ0F 0Λ0′

+

K∑k=1

δkΛ0F 0′Xk(γ0, γ)′ +

K∑k=1

δkXk(γ0, γ)F 0Λ0′

+

2K∑k=1

2K∑k′=1

(θk − θ0k)(θk′ − θ0

k′)Xk,γ0X′k′,γ0 +

K∑k=1

K∑k′=1

δk δk′Xk(γ0, γ)Xk′(γ0, γ)′

−K∑k=1

2K∑k′=1

δk(θk′ − θ0k′)Xk(γ0, γ)X′k′,γ0 −

2K∑k=1

K∑k′=1

δk′(θk − θ0k′)Xk,γ0Xk′(γ

0, γ)′

Λ

≡ (I1 + · · ·+ I15)ΛV −1NT , (A.4)

where Xk(γ1, γ2) ≡ Xk(γ1)−Xk(γ2) for k = 1, . . . ,K.

For I1, we have ‖I1‖ = 1NT ‖ee

′‖ = Op(π−1NT ) by Lemma A.1(ii). For I2 and I3, we have

‖I2‖ = ‖I3‖ =1

NT‖eF 0Λ0′‖ ≤ 1√

T

‖eF 0‖√NT

‖Λ0‖√N

= Op(1√T

).

For I4 and I5, noting that ‖eX′k,γ0‖ = Op(N√T ) and |θk−θ0

k| ≤ ‖θ−θ0‖,we have ‖I4‖ = ‖I5‖ = Op(T−1/2‖θ−

θ0‖). For I5 and I6, noting that ‖eXk(γ0, γ)′‖ ≤ 2 supγ∈γ ‖eXk(γ)′‖ = Op(N√T ) by Lemma A.1(i), we have

‖I6‖ = ‖I7‖ = Op(T−1/2‖δ‖). For I8 and I9, we cam show that

1

NT‖Λ0F 0′X′k,γ0‖ ≤

‖Λ0‖√N

‖F 0‖√T

‖Xk,γ0‖√NT

= Op(1),

which implies that ‖I8‖ = ‖I9‖ = Op(‖θ − θ0‖). Similarly, ‖I10‖ = ‖I11‖ = Op(‖δ‖). For I12, . . . , I15, we

can bound their Frobenius norm by Op(‖θ − θ0‖ + ‖δ‖). By the triangle inequality, ‖δ‖ ≤ ‖δ0‖ + ‖δ − δ0‖ ≤‖δ0‖+ ‖θ − θ0‖. It follows that ‖I1 + · · ·+ I15‖ = Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ).

Premultiplying Λ0′/N on both sides of (A.4), we can obtain that

Λ0′Λ

NVNT =

Λ0′Λ0

N

F 0′F 0

T

Λ0′Λ

N+ op(1).

Noting that both Λ0′Λ0

NF 0′F 0

T and Λ0′ΛN are asymptotically nonsingular matrices, the above equality shows that the

columns of Λ0′ΛN are the (non-normalized) eigenvectors of the matrix Λ0′Λ0

NF 0′F 0

T , and VNT consists of the eigenval-

ues of the same matrix (in the limit). Thus, VNTp→ V where V is a diagonal matrix consisting of eigenvalues of

ΣΛ0ΣF 0 .

34

Page 35: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

(ii) Noting that VNT is invertible, we have

N−1/2‖Λ− Λ0H‖ = N−1/2∥∥(I1 + · · ·+ I15)ΛV −1

NT

∥∥Checking the terms one by one, we can readily show that N−1/2‖Λ − Λ0H‖ = Op(‖θ − θ0‖ + ‖δ0‖ + π−1

NT ), or

equivalently, 1N ‖Λ− Λ0H‖2 = Op(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT ).

Lemma A.3 Under Assumptions A.1-A.4,

(i)1

NΛ0′(Λ− Λ0H) = Op(‖θ − θ0‖+ ‖δ0‖+ π−2

NT );

(ii)1

NΛ′(Λ− Λ0H) = Op(‖θ − θ0‖+ ‖δ0‖+ π−2

NT );

(iii) HH ′ = (1

NΛ0′Λ0)−1 +Op(‖θ − θ0‖+ ‖δ0‖+ π−2

NT );

(iv) ‖PΛ − PΛ0‖2 = Op(‖θ − θ0‖+ ‖δ0‖+ π−2NT ).

Lemma A.4 Under Assumptions A.1-A.4, as N,T →∞ and N/T → κ > 0,

(i)1√NT‖e′(Λ− Λ0H)‖ = Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT );

(ii)1

(NT )1−2α

∣∣∣ T∑t=1

Ψt(θ, γ)′MΛet

∣∣∣ = op(1),

(iii)1

(NT )1−2α

∣∣∣ T∑t=1

Ψt(θ, γ)′MΛΛ0f0t −R

∣∣∣ = op(1),

(iv)1

(NT )1−2α

∣∣∣ T∑t=1

f0′t Λ0′MΛet + tr[e′MΛ0ePF 0 ]

∣∣∣ = op(1),

(v)1

(NT )1−2α

∣∣∣ T∑t=1

e′t(PΛ − PΛ0)et

∣∣∣ = op(1),

(vi)1

(NT )1−2α

∣∣∣ T∑t=1

f0′t Λ0′MΛΛ0f0

t − tr(e′MΛ0ePF 0)−R∣∣∣ = op(1),

where Ψ(θ, γ) = Xt,γ(θ − θ0) +Xt(γ, γ0)δ0 and

R =

2K∑k=1

2K∑k′=1

(θk − θ0k)(θk′ − θ0

k′)1

NTtr[Xk,γMΛXk′,γPF 0

]+

K∑k=1

K∑k′=1

δ0kδ

0k′

1

NTtr[Xk(γ, γ0)MΛXk′(γ, γ

0)PF 0

]+ 2

2K∑k=1

K∑k′=1

(θk − θ0k)δ0

k′1

NTtr[Xk,γMΛXk′(γ, γ

0)PF 0

].

Proposition A.2 Suppose that Assumptions A.1-A.4 hold and N/T → κ > 0. Then ‖θ − θ0‖ = Op((NT )−α).

Proof of Proposition A.2. By definition,

θ =

( T∑t=1

X ′t,γMΛXt,γ

)−1( T∑t=1

X ′t,γMΛYt

).

35

Page 36: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Substituting Yt = Xt,γθ0 +Xt(γ

0, γ)δ0 + Λ0f0t + et into the above equation yields

( 1

NT

T∑t=1

X ′t,γMΛXt,γ

)(θ− θ0) =

1

NT

T∑t=1

X ′t,γMΛet +1

NT

T∑t=1

X ′t,γMΛΛ0f0t +

1

NT

T∑t=1

X ′t,γMΛXt(γ0, γ)δ0.

(A.5)Consider the first term on the right hand side (RHS) of (A.5). By

PΛ − PΛ0 =1

N(Λ− Λ0H)(Λ− Λ0H)′ +

1

N(Λ− Λ0H)H ′0′

+1

NΛ0H(Λ− Λ0H)′ +

1

NΛ0[HH ′ − (

Λ′Λ

N)−1]Λ0′, (A.6)

the first term on the RHS of (A.5) is equal to

1

NT

T∑t=1

X ′t,γMΛ0et −1

NT

T∑t=1

X ′t,γ

[ 1

N(Λ− Λ0H)(Λ− Λ0H)′

]et −

1

NT

T∑t=1

X ′t,γ

[ 1

N(Λ− Λ0H)H ′0′

]et

− 1

NT

T∑t=1

X ′t,γ

[ 1

NΛ0H(Λ− Λ0H)′

]et −

1

NT

T∑t=1

X ′t,γ

[ 1

NΛ0(HH ′ − (

Λ′Λ

N)−1)

Λ0′]et = II1 − · · · − II5, say

Let IIl,k be the kth element of IIl for l = 1, . . . , 5 and k = 1, . . . , 2K. Because K is a finite value, it suffices toconsider IIl,k for each k. Term II1,k is Op( 1√

NT) since supγ∈Γ | 1√

NT

∑Tt=1X

′t,γMΛ0et| = Op(1). For II2,k,

|II2,k| ≤‖Λ− Λ0H‖2

N

‖eX′k,γ‖NT

=1√TOp(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT )

since supγ∈Γ1NT ‖eX

′k,γ‖ = Op(

1√T

). For II3,k,

|II3,k| ≤1√N

‖Λ− Λ0H‖√N

‖H ′‖‖Λ0′eX′k,γ‖

NT= Op(

1√NT

)Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT )

since supγ∈Γ1NT ‖Λ

0′eX′k,γ‖ = Op(1√T

). For II4,k,

|II4,k| ≤ ‖H‖‖eX′k,γ‖NT

‖Λ0‖√N

‖Λ− Λ0H‖√N

= Op(1√T

)Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT ).

For II5,k,

|II5,k| ≤1√N

‖Λ0‖√N

‖Λ0′eX′k,γ‖NT

‖HH ′ − (Λ′Λ

N)−1‖ = Op(

1√NT

)Op(‖θ − θ0‖+ ‖δ0‖+ π−2NT ).

Summarizing all the above results, we have

1

NT

T∑t=1

X ′t,γMΛet = op(‖θ − θ0‖+ ‖δ0‖) +Op(π−2NT ). (A.7)

Next, we consider the second term on the right hand side of (A.5). By the fact that MΛΛ0 = MΛ(Λ0 − ΛH−1) andequation (A.4), we can write the kth entry of 1

NT

∑Tt=1X

′t,γMΛΛ0f0

t as

1

NTtr[(Λ0 − ΛH−1)′MΛXk,γF

0]

= − 1

NTtr[(I1 + · · ·+ I15)′MΛXk,γF

0G′Λ′]

= −(J1,k + · · ·+ J15,k), say. (A.8)

36

Page 37: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where G = (N−1Λ0′Λ)−1(T−1F 0′F 0)−1 and J1,k, . . . , J15,k are implicitly defined in the above expression. ForJ1,k, we use Lemma A.1(ii) and the fact that Λ = Λ0H + (Λ− Λ0H) to obtain

|J1,k| =1

N2T 2

∣∣∣tr(Λ′ee′MΛXk,γF0G′)

∣∣∣≤

1√N‖H‖‖Λ

0′ee′‖NT

+‖Λ− Λ0H‖√

N

‖ee′‖NT

‖MΛXk,γ‖√NT

‖F 0‖√T‖G‖

= Op(1√

NπNT) +Op(π

−1NT ‖θ − θ

0‖+ π−1NT ‖δ

0‖+ π−2NT ).

For J2,k, we have

J2,k =1

N2T 2tr(e′MΛXk,γF

0G′Λ′Λ0F 0′)

=1

NTtr(e′MΛXk,γPF 0).

Then we have|J2,k| ≤

1

NT

∣∣∣tr(e′MΛ0Xk,γPF 0)∣∣∣+

1

NT

∣∣∣tr[e′(PΛ0 − PΛ)Xk,γPF 0 ]∣∣∣.

The first term is Op( 1√NT

) since supγ∈Γ | 1√NT

tr(e′MΛ0Xk,γPF 0)| = Op(1). For the second term, by the expres-sion of PΛ − PΛ0 , it is equal to

− 1

NTtr[ 1

N(Λ− Λ0H)(Λ− Λ0H)′Xk,γPF 0e′

]− 1

NTtr[ 1

NΛ0H(Λ− Λ0H)′Xk,γPF 0e′

]− 1

NTtr[ 1

N(Λ− Λ0H)H ′0′Xk,γPF 0e′

]− 1

NTtr[ 1

NΛ0(HH ′ − 1

N(Λ0′Λ0)−1)Λ0′Xk,γPF 0e′

]We use II6,k, . . . , II9,k to denote the above four terms. For II6,k,

|II6,k| ≤‖Λ− Λ0H‖2

N

‖Xk,γPF 0e′‖NT

=1√TOp(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT )

since supγ∈Γ1NT ‖Xk,γPF 0e′‖ = Op(

1√T

). For II7,k,

|II7,k| ≤1√NT‖H‖‖Xk,γF

0‖√NT

T‖(F 0′F 0)−1‖‖F0′e′0‖√NT

‖Λ− Λ0H‖√N

=1√NT

Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT )

by supγ∈Γ1√NT‖Xk,γF

0‖ ≤ ‖XkF0‖ = Op(1). For II8,k,

|II8,k| ≤1√T‖H‖‖Λ

0‖√N

‖Xk,γF0‖√

NTT‖(F 0′F 0)−1‖‖F

0′e′‖√NT

‖Λ− Λ0H‖√N

=1√TOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ).

For II9,k,

|II9,k| ≤1√T

‖Λ0‖√N

‖Xk,γF0‖√

NTT‖(F 0′F 0)−1‖‖F

0′e′0‖√NT

∥∥∥HH ′− 1

N(Λ0′Λ0)−1

∥∥∥ =1√TOp(‖θ−θ0‖+‖δ0‖+π−2

NT ).

Summarizing the above results, we have |J2,k| = op(‖θ − θ0‖+ ‖δ0‖) +Op(π−2NT ). For J3,k, we have

|J3,k| =∣∣∣∣ 1

N2T 2tr(

Λ′eF 0(Λ0 − ΛH−1)′MΛXk,γF0G′)∣∣∣∣

≤ 1√NT

‖Λ′eF 0‖√NT

‖Λ− Λ0H‖√N

‖H−1‖‖MΛXk,γ‖√

NT

‖F 0G′‖√T

=1√NT

Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT ) +

1√TOp(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT ),

37

Page 38: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where we use the fact that

‖Λ′eF 0‖√NT

≤ ‖H‖‖Λ0′eF 0‖√NT

+√N‖Λ− Λ0H‖√

N

‖eF 0‖√NT

= Op(1) +√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ).

Next, we consider J4,k. Note that

J4,k =

2K∑k′=1

(θk′ − θ0k′) ·

1

N2T 2tr(Xk′,γ0e′MΛXk,γF

0G′Λ′)≡

2K∑k′=1

(θk′ − θ0k′) · J4,k(k′).

We can show that |J4,k(k′)| is bounded by

1√T

‖Xk′,γ0e′‖N√T

‖Xk,γ‖√NT

‖F 0G′Λ′‖√NT

= Op(1√T

).

It follows that J4,k = op(‖θ − θ0‖). By the same token, we can show that J5,k = op(‖θ − θ0‖), J6,k = op(‖δ‖) =

op(‖δ0‖+ ‖θ − θ0‖) and J7,k = op(‖δ0‖+ ‖θ − θ0‖). Next, we consider J8,k. Note that

J8,k =

2K∑k′=1

(θk′ − θ0k′) ·

1

N2T 2tr[Xk′,γ0F 0(Λ0 − ΛH−1)′MΛXk,γF

0G′Λ′]≡

2K∑k′=1

(θk′ − θ0k′) · J8,k(k′),

where

|J8,k(k′)| ≤‖Xk′,γ0‖√

NT

‖F 0‖√T

‖Λ− Λ0H‖‖H−1‖√N

‖Xk,γ‖√NT

‖F 0G′Λ′‖√NT

= Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT ).

Then J8,k = op(‖θ − θ0‖). By the same token, we can show that J10,k = op(‖θ − θ0‖ + ‖δ0‖). For J9,k, we havethat

J9,k =1

N2T 2

2K∑k′=1

(θk − θ0k)tr[Λ0F 0X′k′,γ0MΛXk,γF

0G′Λ′] =1

NT

2K∑k′=1

(θk′ − θ0k′)tr

[X′k′,γ0MΛXk,γPF 0

].

It follows that J9,k = Op(‖θ − θ0‖). For J11,k, we have that

J11,k =1

NT

K∑k′=1

δk′tr[Xk′(γ

0, γ)MΛXk,γPF 0

].

It follows that J11,k = Op(‖θ − θ0‖ + ‖δ0‖). For the terms J12,k, . . . , J15,k, we can readily show that they are allop(‖θ − θ0‖+ ‖δ0‖). Given the above results, with some manipulations on the terms J9,k and J11,k, we have

1

NT

T∑t=1

X ′t,γMΛΛ0f0t =

( 1

NT

T∑t=1

X ′t,γMΛ

T∑s=1

Xs,γast

)(θ − θ) +

( 1

NT

T∑t=1

X ′t,γMΛ

T∑s=1

Xs(γ, γ0)ast

)δ0

+Op(π−2NT ) + op(‖θ − θ0‖+ ‖δ0‖). (A.9)

Substituting (A.7) and (A.9) into (A.5), we have

[B(Λ, γ) + op(1)](θ − θ0) = −[B(γ, Λ) + op(1)]δ0 +Op(π−2NT ). (A.10)

Multiplying (NT )α on both sides, by Assumptions A.1(i) and Assumption A.3, we have (NT )αθ = Op(1).

38

Page 39: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Proof of Theorem 3.2. We use Lemma A.4 (ii)-(vi) to prove this theorem. Consider the objective functionL(θ, Λ, γ).By definition, we have

1

(NT )1−2αL(θ, Λ, γ) =

1

(NT )1−2α

T∑t=1

[Λ0f0

t + et − Ψt(θ, γ)]′MΛ

[Λ0f0

t + et − Ψt(θ, γ)]

=1

(NT )1−2α

T∑t=1

Ψt(θ, γ)′MΛΨt(θ, γ) +1

(NT )1−2α

T∑t=1

e′tMΛet

+1

(NT )1−2α

T∑t=1

f0′t Λ0′MΛΛ0f0

t − 21

(NT )1−2α

T∑t=1

Ψt(θ, γ)′MΛΛ0f0t

− 21

(NT )1−2α

T∑t=1

Ψt(θ, γ)′MΛet + 21

(NT )1−2α

T∑t=1

f0′t Λ0′MΛet,

where Ψt(θ, γ) is defined in Lemma A.4. Using results (a)-(e), together with the fact that

1

(NT )1−2α

T∑t=1

Ψt(θ, γ)′MΛΨt(θ, γ) =

2K∑k=1

2K∑k′=1

(θk − θ0k)(θk′ − θ0

k′)1

NTtr[Xk,γMΛXk′,γ

]+

K∑k=1

K∑k′=1

δ0kδ

0k′

1

NTtr[Xk(γ, γ0)MΛXk′(γ, γ

0)]

+ 2

2K∑k=1

K∑k′=1

(θk − θ0k)δ0

k′1

NTtr[Xk,γMΛXk′(γ, γ

0)]

we have

1

(NT )1−2αL(θ, Λ, γ) =

2K∑k=1

2K∑k′=1

(θk − θ0k)(θk′ − θ0

k′)1

NTtr[Xk,γMΛXk′,γMF 0

]+

K∑k=1

K∑k′=1

δ0kδ

0k′

1

NTtr[Xk(γ, γ0)MΛXk′(γ, γ

0)MF 0

]+ 2

2K∑k=1

K∑k′=1

(θk − θ0k)δ0

k′1

NTtr[Xk,γMΛXk′(γ, γ

0)MF 0

]+

1

(NT )1−2αtr[e′MΛ0eMF 0

]+ op(1).

Let θγ0 and Λγ0 be the estimator defined by

(θγ0 , Λγ0) = argmax(θ,Λ)∈Θ×L

L(θ,Λ, γ0).

Note that θγ0 is the least squares estimator for a standard IFEs model with known γ0. According to Bai (2009) andMoon and Weidner (2017), θγ0 − θ0 = Op(π

−2NT ), or equivalently (NT )α(θγ0 − θ0) = op(1) under N/T → κ.

Given this, with the same arguments in deriving the above expression, we have

1

(NT )1−2αL(θγ0 , Λγ0 , γ0) =

1

(NT )1−2αtr[e′MΛ0eMF 0

]+ op(1).

However, we also have L(θ, Λ, γ) ≤ L(θγ0 , Λγ0 , γ0) due to the definition of (θ, Λ, γ). Given this result, with somealgebra manipulation on the first three terms of L(θ, Λ, γ), we have[

(NT )α(θ − θ0) + B(Λ, γ)−1B(Λ, γ)C0]′B(Λ, γ)

[(NT )α(θ − θ0) + B(Λ, γ)−1B(Λ, γ)C0

]39

Page 40: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

+ C0′[B(Λ, γ)− B(Λ, γ)′B(Λ, γ)−1B(Λ, γ)

]C0 ≤ op(1).

Noting that (l, k)th entry of B(γ, Λ) is 1NT tr[X′l,γMΛXk,γMF 0 ], we have

supγ∈Γ

∥∥∥B(γ, Λ)− B(γ,Λ0)∥∥∥ ≤ [ 2K∑

l,k=1

( 1

NTtr[MF 0X′l,γ(MΛ −MΛ0)Xk,γ ]

)2]1/2≤[ 2K∑l,k=1

( 1

NT‖MF 0X′l,γ‖‖Xk,γ‖‖PΛ − PΛ0‖

)2]1/2= Op(1) · ‖PΛ − PΛ0‖ = op(1),

since supγ∈Γ ‖Xk,γ‖/√NT ≤ ‖Xk‖/

√NT = Op(1). Similarly we have that

supγ∈Γ

∥∥∥B(γ, Λ)− B(γ,Λ0)∥∥∥ = op(1), and

∥∥∥B(γ, Λ)− B(γ,Λ0)∥∥∥ = op(1).

Hence we can getC0′[B(γ,Λ0)− B(γ,Λ0)′B(γ,Λ0)−1B(γ,Λ0)

]C0 = op(1).

By Assumption A.1(ii), we have

op(1) = C0′[B(γ,Λ0)− B(γ,Λ0)′B(γ,Λ0)−1B(γ,Λ0)

]C0 = C0′I(γ)C0 ≥ ‖C0‖2τ min[1, |γ − γ0|],

for some constant τ > 0 with probability approaching 1. Hence, we must have |γ − γ0| = op(1).

Proposition A.3 Suppose that Assumptions A.1-A.4 hold and N/T → κ. Then we have

(i) ‖θ − θ0‖ = op((NT )−α);

(ii) For Proposition A.1 and Lemmas A.3-A.4, we can strengthen Op(‖θ − θ0‖+ ‖δ0‖) to op((NT )−α).

Proof of Proposition A.3. (i) Revisit (A.10). We have shown above that supγ∈Γ ‖B(Λ, γ) − B(Λ0, γ)‖ = op(1)

and supγ∈Γ ‖B(Λ, γ) − B(Λ0, γ)‖ = op(1). However, we also have B(Λ0, γ)p−→ B(Λ0, γ0) and B(Λ0, γ)

p−→B(Λ0, γ0) = 0 due to the continuous mapping theorem and the definition of B. Given this, we have that µmin(B(Λ, γ)) ≥τ + op(1) and B(Λ, γ) = op(1). These two results, together with (A.10), give (i).

(ii) Note that in the previous analysis, all the terms involving δ0 or δ must include the symbol Xk(γ0, γ). Given

the consistency of γ, the “Op” terms now change to “op” terms. This result, together with (NT )α(δ − δ0) = op(1),

leads to (ii).

A.3 Proof of Theorem 3.3

Let hit(γ1, γ2) ≡ ‖xiteit‖|dit(γ1)−dit(γ2)|, kit(γ1, γ2) ≡ ‖xit‖|dit(γ1)−dit(γ2)|, and αNT ≡ (NT )1−2α. Define

JNT (γ) =1√NT

N∑i=1

T∑t=1

xiteitdit(γ),

40

Page 41: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

J∗NT (γ) =1√NT

T∑t=1

Xt(γ)′PΛ0et,

KNT (γ) =1

NT

N∑i=1

T∑t=1

kit(γ, γ0)2,

K∗NT (γ) =1

N2T

N∑i=1

N∑j=1

T∑t=1

‖xit‖ ‖xjt‖∥∥λ0

j

∥∥ ∥∥λ0i

∥∥ ∣∣dit(γ0)− dit(γ)∣∣ ,

GNT (γ) =1

NT

N∑i=1

T∑t=1

(C0′xit)2∣∣dit(γ)− dit(γ0)

∣∣ , and

G∗NT (γ) =1

T

T∑t=1

∥∥∥ 1

N

N∑i=1

xitλ0′i dit(γ, γ

0)∥∥∥2

.

To prove Theorem 3.3, we add the following proposition and three lemmas.

Lemma A.5 Suppose that Assumptions A.4-A.5 hold. For all η > 0 and ε > 0, there exists some v <∞ such thatfor any B <∞,

(i) Pr

(sup

vαNT

≤|γ−γ0|≤B

∥∥JNT (γ)− JNT (γ0)∥∥

√αNT |γ − γ0|

> η

)≤ ε;

(ii) Pr

(sup

vαNT

≤|γ−γ0|≤B

∥∥J∗NT (γ)− J∗NT (γ0)∥∥

√αNT |γ − γ0|

> η

)≤ ε.

Lemma A.6 Suppose Assumptions A.4-A.5 hold. There exist constants B > 0 and 0 < d, k <∞, such that for all1 > η > 0, ε > 0, and cNT → 0, cNTαNT →∞, there exists a v <∞ such that for large enough (N,T ),

Pr

(inf

vαNT

≤|γ−γ0|≤B

GNT (γ)

|γ − γ0|< (1− η)d

)≤ ε, Pr

(sup

vαNT

≤|γ−γ0|≤cNT

G∗NT (γ)

|γ − γ0|> η

)≤ ε,

Pr

(sup

vαNT

≤|γ−γ0|≤B

KNT (γ)

|γ − γ0|> (1 + η)k

)≤ ε, and Pr

(sup

vαNT

≤|γ−γ0|≤B

K∗NT (γ)

|γ − γ0|> (1 + η)k

)≤ ε.

Lemma A.7 Let

H1,NT (γ) ≡ 1

NT

T∑t=1

Xt(γ, γ0)′(PΛ − PΛ0)Xt,γ0 , H2,NT (γ) ≡ 1

(NT )1−α

T∑t=1

Xt(γ, γ0)′(PΛ − PΛ0)Λ0ft,

H3,NT (γ) ≡ 1

(NT )1−α

T∑t=1

Xt(γ, γ0)′(PΛ − PΛ0)et, and H4,NT (γ) ≡ 1

NT

T∑t=1

Xt(γ, γ0)′(PΛ − PΛ0)Xt(γ, γ

0).

Suppose Assumptions A.1-A.5 hold, and N/T → κ > 0 as (N,T )→∞. Then for arbitrary ε > 0 and η > 0, thereare constants B > 0 and v > 0 such that

Pr

(sup

vαNT

≤|γ−γ0|≤B

‖Hl,NT (γ)‖|γ − γ0|

> η

)≤ ε, for l = 1, 2, 3, 4.

Proof of Theorem 3.3. (i) Now we have that γ − γ0 = op(1), θ − θ0 = op((NT )−α) and 1NΛ0′Λ0 p→ Σλ > 0. Let

the constants B, d and k be as defined in Lemmas A.5-A.7, and m ≡ 2||Σ−1λ ||. Let M ≡ max(d, k,m, ||C0||, 1)

41

Page 42: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

and choose η and ν small enough such that max(η, ν) < M and d −M3(18ν + 22η + 20νη) > 0. Let the event

ENT be the joint event that

(1)∣∣γ − γ0

∣∣ ≤ B,

(2) ‖(N−1Λ0′Λ0)−1‖ ≤ m,

(3) (NT )α‖θ − θ0‖ ≤ ν,

(4) inf vαNT

≤|γ−γ0|≤BGNT (γ)|γ−γ0| > (1− η)d,

(5) sup vαNT

≤|γ−γ0|≤BKNT (γ)|γ−γ0| < (1 + η)k,

(6) sup vαNT

≤|γ−γ0|≤BK∗NT (γ)|γ−γ0| < (1 + η)k,

(7) sup vαNT

≤|γ−γ0|≤B‖J∗NT (γ)−J∗NT (γ0)‖√

αNT |γ−γ0| < η,

(8) sup vαNT

≤|γ−γ0|≤B‖JNT (γ)−JNT (γ0)‖√

αNT |γ−γ0| < η,

(9) sup vαNT

≤|γ−γ0|≤B‖Hl,NT (γ)‖|γ−γ0| < η, for l = 1, 2, 3, 4, and

(10) sup vαNT

≤|γ−γ0|≤BG∗NT (γ)|γ−γ0| < η,

Fix ε > 0, one can choose v for large enough (N,T ) such that Pr(ENT ) ≥ 1 − ε, by the Assumption A.2(ii) and

Lemmas A.5-A.7. Let δ = (NT )−αC, we have ‖C − C0‖ ≤ (NT )α‖θ − θ0‖ ≤ ν.It suffices to show that ENT implies

∣∣γ − γ0∣∣ ≤ v

αNT. Conditional on ENT , we consider v

αNT≤∣∣γ − γ0

∣∣ ≤cNT ≤ B and calculate

(NT )2α−1

|γ − γ0|

(L(θ, Λ, γ)− L(θ, Λ, γ0)

)=

(NT )2α

|γ − γ0|1

NT

T∑t=1

δ′Xt(γ, γ0)′MΛXt(γ, γ

0)δ

− 2(NT )2α

|γ − γ0|1

NT

T∑t=1

δ′Xt(γ, γ0)′MΛXt,γ0(θ − θ0)

− 2(NT )2α

|γ − γ0|1

NT

T∑t=1

δ′Xt(γ, γ0)′MΛΛ0f0

t

− 2(NT )2α

|γ − γ0|1

NT

T∑t=1

δ′Xt(γ, γ0)′MΛet

≡ L1 + · · ·+ L4. (A.11)

For L1, we have

L1 =(NT )2α

|γ − γ0|1

NT

T∑t=1

δ0′Xt(γ, γ0)′MΛ0Xt(γ, γ

0)δ0

+(NT )2α

|γ − γ0|1

NT

T∑t=1

(δ + δ0)′Xt(γ, γ0)′MΛ0Xt(γ, γ

0)(δ − δ0)

42

Page 43: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

− (NT )2α

|γ − γ0|1

NT

T∑t=1

δ′Xt(γ, γ0)′(PΛ − PΛ0)Xt(γ, γ

0)δ

≡ L11 + L12 + L13.

By events (10) and (4) we have

L11 ≥(NT )2α

|γ − γ0|1

NT

T∑t=1

δ0′Xt(γ, γ0)′Xt(γ, γ

0)δ0

− (NT )2α

|γ − γ0|1

N2T

T∑t=1

‖δ0′Xt(γ, γ0)′Λ0‖2

∥∥∥(1

NΛ0′Λ0)−1

∥∥∥>GNT (γ)

|γ − γ0|− ‖C0‖2 G

∗NT (γ)

‖γ − γ0‖

∥∥∥(1

NΛ0′Λ0)−1

∥∥∥> (1− η)d− ‖C0‖2mη > d− (M +M3)η.

For L12, we have

|L12| ≤ ‖C − C0‖‖C + C0‖ 1

|γ − γ0|NT

∥∥∥ T∑t=1

Xt(γ, γ0)′MΛ0Xt(γ, γ

0)∥∥∥

≤ ‖C − C0‖‖C + C0‖KNT (γ)

|γ − γ0|≤ ν(2‖C0‖+ ν)(1 + η)k ≤ 2M2(1 + η)ν,

by events (3) and (5). For L13, we have

|L13| ≤ (‖C0‖+ ν)2 ‖H4,NT (γ)‖|γ − γ0|

≤ 4M2η,

by events (3) and (9). For L2, we have

|L2| ≤ 2(NT )2α

|γ − γ0|

∥∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′Xt,γ0(θ − θ0)

∥∥∥∥+ 2

(NT )2α

|γ − γ0|

∥∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′PΛ0Xt,γ0(θ − θ0)

∥∥∥∥+ 2

(NT )2α

|γ − γ0|

∥∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′(PΛ − PΛ0)Xt,γ0(θ − θ0)

∥∥∥∥≡ L21 + L22 + L23.

For L21, noting that∥∥∥∥ 1

NT

T∑t=1

Xt(γ, γ0)′Xt,γ0

∥∥∥∥ ≤ ∥∥∥∥ 1

NT

T∑t=1

Xt(γ, γ0)′Xt

∥∥∥∥+

∥∥∥∥ 1

NT

T∑t=1

Xt(γ, γ0)′Xt(γ

0)

∥∥∥∥ ≤ 2KNT (γ),

we have

L21 ≤ 4‖C‖[(NT )α‖θ − θ0‖

]KNT (γ)

|γ − γ0|≤ 8M2(1 + η)ν,

43

Page 44: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

by events (3) and (5). Similarly, we have

L22 ≤ 4‖C‖(

(NT )α‖θ − θ0‖)∥∥∥(

Λ0′Λ0

N)−1∥∥∥K∗NT (γ)

|γ − γ0|≤ 8M3(1 + η)ν, and

L23 ≤ 2‖C‖(

(NT )α‖θ − θ0‖)H1,NT (γ)

|γ − γ0|≤ 2Mην,

by events (3), (6) and (9). Next, we consider L3. We have

|L3| ≤ 2(NT )2α

|γ − γ0|

∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′(PΛ − PΛ0)Λ0f0

t

∥∥∥≤ 2‖C‖

∥∥∥ 1

(NT )1−α

T∑t=1

Xt(γ, γ0)′(PΛ − PΛ0)Λ0ft

∥∥∥ 1

|γ − γ0|

= 2‖C‖H2,NT (γ)

|γ − γ0|≤ 4Mη,

by events (3) and (9). For L4, we have

|L4| = 2(NT )2α

|γ − γ0|

∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′MΛet

∥∥∥≤ 2

(NT )2α

|γ − γ0|

∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′et

∥∥∥+ 2(NT )2α−1/2

|γ − γ0|

∥∥∥ 1√NT

T∑t=1

δ′Xt(γ, γ0)′PΛ0et

∥∥∥+ 2

(NT )2α

|γ − γ0|

∥∥∥ 1

NT

T∑t=1

δ′Xt(γ, γ0)′(PΛ − PΛ0)et

∥∥∥≡ L41 + L42 + L43.

By events (3) and (7)-(9), we have

L41 ≤ 2‖C‖‖JNT (γ)− JNT (γ0)‖√αNT |γ − γ0|

≤ 4Mη,

L42 ≤ 2‖C‖‖J∗NT (γ)− J∗NT (γ0)‖√αNT |γ − γ0|

≤ 4Mη,

L43 ≤ 2‖C‖‖H3,NT (γ)‖|γ − γ0|

≤ 4Mη.

Therefore, we conclude that

(NT )2α

|γ − γ0|(L(θ, Λ, γ)− L(θ, Λ, γ0)) ≥ L1 − |L2| − |L3| − |L4|

≥ d−M3(18ν + 22η + 20νη) > 0

for v/αNT ≤ |γ − γ0| ≤ cNT . We can conclude that when ENT occurs, we have |γ − γ0| < v/αNT . Note thatENT happens with probability larger than 1− ε. Therefore, for any ε > 0, there is a constant v such that for (N,T )

sufficiently large, we have

Pr(|γ − γ0| ≥ v

αNT

)< ε.

This shows that |γ − γ0| = Op(α−1NT ).

44

Page 45: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

(ii) As mentioned in the proof of the second result of Proposition A.3, all the terms involving δ0 or δ have

Xk(γ, γ0). Given the obtained convergence rate of γ, these terms now are Op((NT )α−1) = op(1√NT

). Now revisit

the proof of Proposition A.2. Neglecting all the terms involving δ0, we immediately obtain√NT‖θ− θ0‖ = Op(1)

when N/T → κ. In addition, all the results in Lemma A.3 can be sharpened to Op(π−2NT ), and the second result of

Proposition A.1 can also be sharpened to Op(π−2NT ).

A.4 Proof of Theorem 3.4

To prove Theorem 3.4, we need the following two lemmas. Lemma A.8 finds the asymptotic bias terms and Lemma

A.9 establishes a central limit theorem (CLT). The main difficulty is to find the asymptotic distribution of CNT (γ)

= (NT )−1∑i,t zit,γeit. Because zit,γ is a variable depends on all entries of Xk,γ2Kk=1, it is difficult to show CLT

directly. To establish the CLT, we follow the same strategy as used by Moon and Weidner (2017).We define some notations. For each k = 1, . . . , 2K, we define N ×T matrices Xk,γ , Xk,γ and Zk,γ as follows:

Xk,γ = ED(Xk,γ), Xk,γ = Xk,γ −Xk,γ , and Zk,γ = MΛ0Xk,γMF 0 + Xk,γ .

Because Xk,γ is centered around zero conditional onD, one can verify that ‖Xk,γPF 0‖ and ‖PΛ0Xk,γ‖ are op(√NT ).

In the proof of Lemma A.9, we can see that eitzit,γ is an m.d.s., where zit,γ is defined analogously to zit,γ .

Lemma A.8 Suppose that Assumptions A.1-A.6 hold and N/T → κ > 0. Then for k = 1, . . . , 2K, we have

(i) (NT )−1/2tr(MΛ0Xk,γMF 0e′) = (NT )−1/2tr(e′Zk,γ)−√κB1,kNT (γ) + op(1), uniformly on γ

(ii) B2,NT (γ) = B2,NT (γ0) + op(1);

(iii) B3,NT (γ) = B3,NT (γ0) + op(1).

Lemma A.9 Suppose that Assumptions A.2, A.4 and A.6 hold. Then

1√NT

N∑i=1

T∑t=1

eitzit,γd−→ G(γ) in `∞(Γ),

where G(γ) is some Gaussian process with E(G(γ1)G(γ2)′) = Ω(γ1, γ2) and `∞(Γ) denotes the space of bounded

functions over the compact set Γ endowed with the uniform metric.

Proof of Theorem 3.4. Revisit equation (A.5) in the proof of Proposition A.2. The first term on the RHS is1NT

∑Tt=1X

′t,γMΛet, whose kth element is studied in details. According to the analysis there, only II1,k and II4,k

matter since the remaining terms are either op(‖θ − θ0‖), or op(π−3NT ), or involve δ0. For those terms involving δ0,

they are Op((NT )α−1) due to the arguments in the proof of result (ii) of Theorem 3.3. For II4,k, substituting (A.4)into it and some straightforward computation shows that

II4,k =1

Ntr[ED(e′e)X′0k,γ(Λ0′Λ0)−1(F 0′F 0)−1F 0′

]+Op(π

−3NT ) + op(‖θ − θ0‖) +Op((NT )α−1).

45

Page 46: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Given this, we have

1

NT

T∑t=1

X ′tk,γMΛet =1

NT

T∑t=1

X ′tk,γMΛ0et −1

Ntr[ED(e′e)X′0k,γ(Λ0′Λ0)−1(F 0′F 0)−1F 0′

]+ op(

1√NT

),

where Xtk,γ is the kth column of Xt,γ . Next, we consider the second term 1NT

∑Tt=1X

′t,γMΛΛ0f0

t , whose kthterm is analyzed according to (A.8). By the same arguments, i.e., neglecting all the terms that are op(‖θ − θ0‖), orOp(π

−3NT ), or involve δ0, we only keep J1,k, J2,k and J9,k. This gives

1

NT

T∑t=1

X ′tk,γMΛΛ0f0t =

1

NT 2

T∑t=1

T∑s=1

astX′tk,γMΛXs,γ0(θ − θ0)

− 1

NT 2

T∑t=1

T∑s=1

astX′tk,γMΛes − J1,k + op(

1√NT

).

For the second term on the RHS of the above equation, it can be decomposed into five terms, which are giv-en in the analysis of J2,k in the proof of Proposition A.2. The analysis there indicates that only the first term

1NT tr(e′MΛ0Xk,γPF 0) and II8,k matter. Substituting (A.4) into II8,k, we have

II8,k =1

Ntr[ED(e′e)PF 0X′0k,γ(Λ0′Λ0)−1(F 0′F 0)−1F 0′

]+Op(π

−3NT ) + op(‖θ − θ0‖) +Op((NT )α−1).

In addition, with the results in Lemma A.3, we can readily show that J1,k = B2,k,NT (γ)+op(1√NT

). With the aboveresults, we have

1

NT

T∑t=1

X ′tk,γMΛΛ0f0t =

1

NT 2

T∑t=1

T∑s=1

astX′tk,γMΛXs,γ0(θ − θ0)− 1

NT 2

T∑t=1

T∑s=1

astX′tk,γMΛ0es − B2,k,NT (γ)

+1

Ntr[ED(e′e)PF 0X′0k,γ(Λ0′Λ0)−1(F 0′F 0)−1F 0′

]+ op(

1√NT

).

Finally, we consider the last term:

1

NT

T∑t=1

Xtk,γMΛX(γ0, γ)δ0 = Op(|γ − γ0|)Op(‖δ0‖) = Op((NT )α−1) = op(1√NT

).

Given the above analysis, we have

[B(Λ0, γ) + op(1)]√NT (θ − θ0) =

√NTCNT (γ)−

√T

NB2,NT (γ)−

√N

TB3,NT (γ) + op(1).

Given the convergence rate of γ, one can verify that

B(Λ0, γ) = ωNT (γ0, γ0) + op(1),

√NTCNT (γ) =

1√NT

N∑i=1

T∑t=1

eitzit,γ −√N

TB1,NT (γ),

√NTB`,NT (γ) =

√NTB`,NT (γ0) + op(1), for ` = 1, 2, 3,

1√NT

N∑i=1

T∑t=1

eitzit,γ =1√NT

N∑i=1

T∑t=1

eitzit,γ0 + op(1),

46

Page 47: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where the last result is due to Lemma A.9. It follows that[ωNT (γ0, γ0)

]−1√NT (θ − θ0) =

1√NT

N∑i=1

T∑t=1

eitzit,γ0 −√N

TB1,NT (γ0)

−√T

NB2,NT (γ0)−

√N

TB3,NT (γ0) + op(1).

Then by Lemmas A.8 and A.9 as well as Assumption A.6, we have that as N/T → κ,

ω0

√NT (θ − θ0)− B d→ N(0,Ω0),

where B = −κ1/2B1(γ0)− κ−1/2B2(γ0)− κ1/2B3(γ0) and ω0 = ω(γ0, γ0).

A.5 Proof of Theorem 3.5

Let g ≡ C0′D0fC

0 and h ≡ C0′V 0f C

0. Let

RNT (v) ≡√αNT

[JNT (γ0 +

v

αNT)− JNT (γ0)

],

GNT (v) ≡ αNTGNT (γ0 +v

αNT), and KNT (v) ≡ αNTKNT (γ0 +

v

αNT).

DefineLNT (v) = L(θ, Λ, γ0)− L(θ, Λ, γ0 +

v

αNT).

To prove Theorem 3.5, we need the following three lemmas.

Lemma A.10 Suppose that Assumptions A.2 and A.4–A.5 and N/T → κ > 0. Then RNT (v) ⇒ B(v), where

B(v) is a Brownian motion with covariance matrix E [B(u)B(v)′] = V 0f min (u, v).

Lemma A.11 Suppose that Assumptions A.2 and A.4–A.5 hold and N/T → κ > 0. Then GNT (v)p→ g |v| and

KNT (v)p→ ‖Df‖ |v| uniformly in v ∈ Υ, where g = C0′D0

fC0 and Υ is a compact set on the real line that includes

0 as its interior point.

Lemma A.12 Suppose that Assumptions A.1-A.7 hold and N/T → κ > 0. Then

LNT (v)⇒ L(v) = −g |v|+ 2√hW (v),

where h = C0′V 0f C

0.

Proof of Theorem 3.5. Noting that γ = argminγL(θ(γ), Λ(γ), γ) = argminγL(θ, Λ, γ) where θ = θ(γ) andΛ = Λ(γ), we have αNT (γ − γ0) = argmaxvLNT (v). By Theorem 3.3, αNT (γ − γ0) = Op(1). So it sufficesto confine the analysis on some compact set K. By Lemma A.12, LNT (v) ⇒ L(v) in `∞(K), the space of allthe bounded functions over some compact set K endowed with the uniform metric. The limit functional L(v) iscontinuous, has a unique maximum, and lim|v|→∞ L(v) = −∞ almost surely. It therefore satisfies the conditions inTheorem 2.7 of Kim and Pollard (1990). By the argmax continuous mapping theorem (CMT), we have

αNT (γ − γ0)d→ argmaxv∈RL(v).

47

Page 48: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Following the proof of Theorem 1 in Hansen (2000), we have

argmaxv∈RL(v) = φargmaxr∈R

[−gφ |r|+ 2

√h√φW (r)

]= φargmaxr∈R[−gφ |r|+ 2gφW (r)]

= φargmaxr∈R[−|r|2

+W (r)],

where we apply the change of variables with v = φr, φ = h/g2 and the distributional equality W (a2r) = aW (r).

This completes the proof of the theorem.

A.6 Proof of Theorem 3.6

To prove theorem 3.6, we need the following lemma.

Lemma A.13 Suppose Assumptions A.1-A.5 hold and N/T → κ > 0. Then L(θγ0 , Λγ0 , γ0) − L(θ, Λ, γ0) =

op(1).

Proof of Theorem 3.6. Recall that σ2 = 1NT L(θ, Λ, γ). It is easy to show that σ2 p→ σ2. By the definitions of

LRNT (·) and LNT (·), we have

σ2LRNT (γ0)− LNT (v) =[L(θγ0 , Λγ0 , γ0)− L(θ, Λ, γ0 +

v

αNT)]−[L(θ, Λ, γ0)− L(θ, Λ, γ0 +

v

αNT)]

= L(θγ0 , Λγ0 , γ0)− L(θ, Λ, γ0) = op(1),

where the last result is due to Lemma A.13. By Lemma A.12 and the CMT,

LRNT (γ0) =LNT (v)

σ2+ op(1) =

supv LNT (v)

σ2+ op(1)

d→ supv L(v)

σ2.

The limit distribution is

1

σ2supv

[− g|v|+ 2

√hW (v)

]=

1

σ2supu

[− g∣∣∣ hg2u∣∣∣+ 2

√hW (

h

g2u)]

=h

gσ2supu

[− |u|+ 2W (u)

]= η2Ξ,

where the first equality holds by the change of variables (v = hg2u) and the second equality holds by the fact that

W (a2r) = aW (r).Note that we can write Ξ = 2 max(Ξ1,Ξ2), where Ξ1 = supr≤0[− 1

2 |r| + W (r)] and Ξ2 = supr≥0[− 12 |r| +

W (r)]. Ξ1 and Ξ2 are independent exponential random variables with distribution function Pr(Ξ1 ≤ x) = 1− e−x.It follows that

Pr(Ξ ≤ x) = Pr(2 max (Ξ1,Ξ2) ≤ x) = Pr(Ξ1 ≤ x/2) Pr(Ξ2 ≤ x/2) = (1− e−x/2)2..

This completes the proof of the theorem.

48

Page 49: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

A.7 Proof of Theorem 4.1

Proof of Theorem 4.1. Let θ(γ) be the bias corrected estimator that can be obtained as in Bai (2009) or Moon andWeidner (2017) by treating γ as known. Note that the model can be written as

yit = x′itθ0 + λ0′

i f0t +

(eit +

1√NT

x′itcdit(γ0)).

Treating the expression in the brackets as a new error term, by Bai (2009) or Moon and Weidner (2017),

√NT (θ(γ)− θ0) = ωNT (γ, γ)−1 1√

NT

N∑i=1

T∑t=1

zit,γeit

+ ωNT (γ, γ)−1 1

NT

N∑i=1

T∑t=1

zit,γ(zit,γ − zit,γ0)′Lc+ opγ(1).

where opγ(1) denotes the terms which are op(1) uniformly in γ. Recall that SNT (γ) ≡ 1√NT

∑Ni=1

∑Tt=1 zit,γeit

and let ωNT (γ1, γ2) ≡ 1NT

∑Ni=1

∑Tt=1 zit,γ1

z′it,γ2. By L′θ0 = δ0,

√NTδ0 = c and c = L′Lc, we have

√NTL′θ(γ) = L′Lc+ L′ωNT (γ, γ)−1SNT (γ) + L′ωNT (γ, γ)−1

[ωNT (γ, γ0)− ωNT (γ, γ)

]Lc+ opγ(1)

= L′ωNT (γ, γ)−1SNT (γ) + L′ωNT (γ, γ)−1ωNT (γ, γ0)Lc+ opγ(1)

⇒ L′ω(γ, γ)−1S(γ) + L′ω(γ, γ)−1ω(γ, γ0)Lc ≡ S(γ) +Q(γ)c,

where S(γ) ≡ L′ω(γ, γ)−1S(γ) and Q(γ) ≡ L′ω(γ, γ)−1ω(γ, γ0)L.

Next, it is standard to show that L′VNT (γ)Lp→ L′ω(γ, γ)−1Ω(γ, γ)ω(γ, γ)−1L = K(γ, γ) uniformly in γ.

Then by the CMT, we have WNT (γ)⇒W c(γ).

A.8 Proof of Theorem 4.2

Proof of Theorem 4.2. Let WNT = (yit, xit, qit) , 1 ≤ i ≤ N, 1 ≤ t ≤ T and S∗NT (γ) = 1√NT

∑Ni=1

∑Tt=1

zit,γeitvit. Let P ∗ (·) , E∗ (·) and Var∗ (·) denote the probability, expectation and variance conditional on the random

sample WNT . We say that ANT = op∗ (1) is Pw (‖ANT ‖ ≥ ε) = op (1) for any ε > 0. Note that ANT = op (1)

implies that ANT = op∗ (1) . It suffices to prove the theorem by showing that (i) S∗NT (γ)⇒ S(γ) conditional on w,

(ii) ωNT (γ, γ) = ω (γ, γ) + op∗ (1) uniformly in γ ∈ Γ, and (iii) SNT (γ) = S∗NT (γ) + op∗ (1) uniformly in γ ∈ Γ,

(iv) ΩNT (γ, γ) = Ω (γ, γ) + op∗ (1) uniformly in γ ∈ Γ.We first show (i). Note that by Assumption A.7(i)

E∗ [S∗NT (γ1)S∗NT (γ2)′] =1

NT

N∑i=1

T∑t=1

zit,γ1z′it,γ2

e2it = ΩNT (γ1, γ2) = Ω(γ1, γ2) + op∗ (1) .

So S∗NT (γ) is a zero-mean Gaussian process with covariance function kernel Ω(γ1, γ2) asymptotically. In addition,

it is standard to show that the finite dimensional distribution of S∗NT converges to that of SNT as (N,T )→∞. The

stochastic equicontinuity also holds by standard arguments. Then we have S∗NT ⇒ S conditional onWNT .To show (ii), it suffices to show 1

NT

∑i,t ‖zit,γ − zit,γ‖2 = op∗ (1) . We have

1

NT

∑i,t

‖zit,γ − zit,γ‖2 =1

NT

2K∑k=1

‖MΛ0Xk,γMF 0 −MΛ(γ)Xk,γMF (γ)‖2.

49

Page 50: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

We can readily show ‖PΛ0 − PΛ(γ)‖ = Op(π−1NT ) and ‖PF 0 − PF (γ)‖ = Op(π

−1NT ) uniformly. Hence, the result

follows.To show (iii), notice that

S∗NT (γ)− SNT (γ) =1√NT

∑i,t

zit,γ (eit − eit(γ)) vit +1√NT

∑i,t

(zit,γ − zit,γ) eit(γ)vit

≡ A1 (γ) +A2 (γ) . say.

For A1 (γ) , we have eit(γ) = yit − xit,γ ′θ(γ)− λ′ift = eit − x′it,γ(θ(γ)− θ0)− (λi(γ)′ft(γ)− λ0′i f

0t ), and that

A1 (γ) = − 1√NT

∑i,t

zit,γx′it,γ(θ(γ)− θ0)vit −

1√NT

∑i,t

zit,γ(λi(γ)′ft(γ)− λ0′i f

0t )vit

≡ −A11 (γ)−A12 (γ) , say.

The first term is bounded by ‖ 1√NT

∑i,t zit,γx

′it,γvit‖‖θ(γ)− θ0‖ = Op(‖θ− θ0‖ (lnN)

3) = op∗ (1) uniformly in

γ because one can show that supγ∈Γ ‖ 1√NT

∑i,t zit,γx

′it,γvit‖ = Op∗((lnN)

3) by using Bernstein-type inequality

for independent observations (see, e.g., the online supplement of Su, Shi and Phillips (2016)). Note that

‖Var∗(A12)‖ = ‖ 1

NT

∑i,t

zit,γz′it,γ(λi(γ)′ft(γ)− λ0′

i f0t )2‖

≤( 1

NT

∑i,t

‖zit,γ‖4)1/2( 1

NT

∑i,t

(λi(γ)′ft(γ)− λ0′i f

0t )4)1/2

= op∗ (1)

as we can readily show that 1NT

∑i,t(λi(γ)′ft(γ)− λ0′

i f0t )4 = op∗(1) uniformly in γ.

Consider A2 (γ), we have

A2 (γ) =1√NT

∑i,t

(zit,γ − zit,γ) eitvit −1√NT

∑i,t

(zit,γ − zit,γ)xit(γ)′(β(γ)− β)vit

− 1√NT

∑i,t

(zit,γ − zit,γ) (λi(γ)′ft(γ)− λ0′i f

0t )vit.

≡ A21 (γ)−A22 (γ)−A23 (γ) , say.

For A21 (γ), we can calculate the variance of A21 conditional onWNT :

‖Var∗(A21)‖ = ‖ 1

NT

∑i,t

(zit,γ − zit,γ) (zit,γ − zit,γ)′e2it‖

≤ supi,t

e2it ·

1

NT

∑i,t

‖zit,γ − zit,γ‖2

= Op((NT )2/(8+ε)) ·Op(π−1NT ) = op∗ (1) .

For A22(γ) and A23(γ), we can follow the arguments as used in the analysis of A11(γ) and A12(γ) and show they

are op∗ (1) uniformly in γ. Then SNT (γ)− S∗NT (γ) = op∗ (1) uniformly in γ. Thus we have SNT (γ)⇒ S(γ).

To show (iv), we can follow similar arguments to (iii). The analysis is tedious and omitted.

The final result follows from the CMT.

50

Page 51: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

References

Ahn, S. C., and Horenstein, A. R. 2013. Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203-

1227.

Ando, T., Bai, J., 2016. Panel data models with grouped factor structure under unknown group membership. Journal

of Applied Econometrics, 31, 163-191.

Andrews, D. W. (1993). Tests for parameter instability and structural change with unknown change point. Econo-

metrica, 821-856.

Arcand, J. L., Berkes, E., Panizza, U., 2015. Too much finance? Journal of Economic Growth, 20, 105-148.

Bai, J., 2003. Inferential theory for factor models of large dimensions. Econometrica, 71, 135-173.

Bai, J., 2009. Panel data models with interactive fixed effects. Econometrica, 77, 1229-1279.

Bai, J., Liao, Y. 2016. Efficient estimation of approximate factor models via penalized maximum likelihood. Journal

of econometrics, 191(1), 1-18.

Bai, J., Ng, S., 2002. Determining the number of factors in approximate factor models. Econometrica, 701, 191-221.

Bernstein, D. S., 2005. Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems The-

ory. Princeton University Press, Princeton.

Bonhomme, S., Manresa, E., 2015. Grouped patterns of heterogeneity in panel data. Econometrica, 83, 1147-1184.

Caner, M., Hansen, B.E., 2004. Instrumental variable estimation of a threshold model. Econometric Theory, 20,

813-843.

Castagnetti, C., Rossi, E., & Trapani, L. 2015. Inference on factor structures in heterogeneous panels. Journal of

econometrics, 184(1), 145-157.

Cecchetti, S. G., Mohanty, M. S. and Zampolli, F. 2011, Achieving growth amid fiscal imbalances: the real effects

of debt, Federal Reserve Bank of Kansas City Proceedings, 145-296.

Chan, K. S. (1993). Consistency and limiting distribution of the least squares estimator of a threshold autoregressive

model. The annals of statistics, 21(1), 520-533.

Checherita-Westphal, C., Rother, P. 2012. The impact of high government debt on economic growth and its channels:

An empirical investigation for the euro area. European economic review, 56(7), 1392-1405.

Chen, M., Fernandez-Val, I., and Weidner, M., 2014. Nonlinear panel models with interactive effects. Working paper,

Department of Economics, Boston University.

Chernozhukov, V., Hansen, C. 2008. Instrumental variable quantile regression: A robust inference approach. Journal

of Econometrics, 142(1), 379-398.

Chernozhukov, V., Hansen, C., Liao, Y. and Zhu, Y., 2018. Inference For Heterogeneous Effects Using Low-Rank

Estimations. arXiv preprint arXiv:1812.08089.

Chudik, A., Mohaddes, K., Pesaran, M. H., & Raissi, M. 2017. Is there a debt-threshold effect on output growth?.

Review of Economics and Statistics, 99(1), 135-150.

51

Page 52: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Dang, V.A., Kim, M., Shin, Y., 2012. Asymmetric capital structure adjustments: New evidence from dynamic panel

threshold models. Journal of Empirical Finance, 19, 465-482.

Davis, C., and Kahan, W. M., 1970. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical

Analysis, 7(1), 1-46.

Hansen, B. E., 1996. Inference when a nuisance parameter is not identified under the null hypothesis. Econometrica,

64, 413-430.

Hansen, B. E., 1999. Threshold effects in non-dynamic panels: estimation, testing, and inference. Journal of Econo-

metrics, 93, 345-368.

Hansen, B. E., 2000. Sample splitting and threshold estimation. Econometrica, 68, 575-603.

Kim, J., Pollard, D., 1990. Cube root asymptotics. Annals of Statistics, 18, 191-219.

Kremer, S., Bick, A., Nautz, D., 2013. Inflation and growth: New evidence from a dynamic panel threshold analysis.

Empirical Economics, 44, 861-878.

Law, S. H., Singh, N., 2014. Does too much finance harm economic growth?. Journal of Banking & Finance, 41,

36-44.

Lee, N., Moon, H. R., Weidner, M. 2012. Analysis of interactive fixed effects dynamic linear panel regression with

measurement error. Economics Letters, 117(1), 239-242.

Levine, R., 2003. More on finance and growth: more finance, more growth? Review-Federal Reserve Bank of Saint

Louis, 85, 31-46.

Levine, R. 2005. Finance and growth: theory and evidence. In Aghion, P. & Durlauf, S. (Eds.), Handbook of Eco-

nomic Growth, Ch. 12, pp. 865-934. Amsterdam: Elsevier.

Li, D., Qian, J., Su, L., 2016. Panel data models with interactive fixed effects and multiple structural breaks. Journal

of the American Statistical Association, 111, 1804-1819.

Lu, X., Su, L., 2016. Shrinkage estimation of dynamic panel data models with interactive fixed effects. Journal of

Econometrics, 190, 148-175.

Moon, H. R., Shum, M., Weidner, M. 2018. Estimation of random coefficients logit demand models with interactive

fixed effects. Journal of Econometrics, 206(2), 613-644.

Moon, H. R., Weidner, M., 2015. Linear regression for panel with unknown number of factors as interactive fixed

effects. Econometrica, 83, 1543-1579.

Moon, H. R., Weidner, M., 2017. Dynamic linear panel regression models with interactive fixed effects. Econometric

Theory, 33, 158-195.

Moon, H. R., Weidner, M. 2018. Nuclear norm regularized estimation of panel regression models. arXiv preprint

arXiv:1810.10987.

Neyman, J., Scott, E. L. 1948. Consistent estimates based on partially consistent observations. Econometrica, 16(1),

1-32.

Okui, R., Wang, W., 2017. Heterogeneous structural breaks in panel data models. Working Paper, Erasmus School

of Economics, Erasmus Universiteit Rotterdam.

52

Page 53: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Onatski, A., 2010. Determining the number of factors from empirical distribution of eigenvalues. Review of Eco-

nomics and Statistics, 92, 1004-1016.

Qian, J., Su, L. 2016. Shrinkage estimation of common breaks in panel data models via adaptive group fused lasso.

Journal of Econometrics, 191, 86-109.

Ramırez-Rondan, N., 2015. Maximum likelihood estimation of dynamic panel threshold models. Working paper,

Central Bank of Peru.

Reinhart, C. M., Rogoff, K. S. (2010). Growth in a Time of Debt. American economic review, 100(2), 573-78.

Seo, M. H., Linton, O., 2007. A smoothed least squares estimator for threshold regression models. Journal of Econo-

metrics, 141, 704-735.

Seo, M. H., Shin, Y., 2016. Dynamic panels with threshold effect and endogeneity. Journal of Econometrics, 195,

169-186.

Su, L., Chen, Q., 2013. Testing homogeneity in panel data models with interactive fixed effects. Econometric Theory,

29, 1079-1135.

Su, L., Hoshino, T. 2016. Sieve instrumental variable quantile regression estimation of functional coefficient models.

Journal of Econometrics 191, 231-254.

Su, L., Jin, S., Zhang, Y., 2015. Specification test for panel data models with interactive fixed effects. Journal of

Econometrics, 186, 222-244.

Su, L., Ju, G., 2018. Identifying latent grouped patterns in panel data models with interactive fixed effects. Journal

of Econometrics, 206, 554-573.

Su, L., Shi, Z., Phillips, P. C. B, 2016. Identifying latent structures in panel data. Econometrica, 84, 2215-2264.

Tong, H. (1978), On a Threshold Model in Pattern Recognition and Signal Processing, ed. C. H. Chen, Amsterdam:

Sijhoff & Noordhoff

Van Der Vaart A. W., Wellner J. A. 1996. Weak Convergence and Empirical Processes with Application to Statistics.

Springer.

Yu, P., Phillips, P. C. B, 2018. Threshold regression with endogeneity. Journal of Econometrics, 203, 50-68.

53

Page 54: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Supplementary Appendix to“Panel Threshold Model with Interactive Fixed Effect”

(NOT for Publication)

Ke Miaoa, Kunpeng Lib and Liangjun Sua

aSchool of Economics, Singapore Management University, SingaporebSchool of Economics and Management, Capital University of Economics and Business, Beijing, China

This supplement is composed of five parts. Section B contains the proofs of the technical lemmas in the above paper.

Section C contains several useful lemmas that are used in the proofs of the technical lemmas in Section B. Section

D studies the determination of the number of factors. Section E provides a consistent test for the two-way additive

fixed effects specification. Section F reports some additional Monte Carlo simulation results.

B Proof of the Technical Lemmas

Proof of Lemma A.1. (i) By Assumption A.2 (iv), (NT )−1‖ee′‖ ≤ (NT )−1‖e‖sp‖e‖ = Op(π−1NT ). Similarly,

(NT )−1‖e′e‖ = Op(π−1NT ). Next, (NT )−1‖e′eF 0‖ ≤ (NT )−1‖e‖sp‖eF 0‖ = (NT )−1Op(

√N+√T )Op(

√NT ) =

Op(π−1NT ), where we use the fact that ‖eF 0‖ = Op(

√NT ) to be shown below. Analogously, (NT )−1‖Λ0′ee′‖ =

Op(π−1NT ).

(ii) Noting that E∥∥eF 0

∥∥2= Etr

(eF 0′F 0e′

)=∑i

∑t,sE

(f0′t f

0s eiteis

)=∑i

∑tE(f0′t f

0t e

2it

)= O (NT ) ,

we have ‖eF 0‖ = Op(√NT ) by Chebyshev inequality. Similarly, we can show that ‖Λ0′e‖ = Op(

√NT ). For

‖Λ0′eF 0‖, we have

E‖Λ0′eF 0‖2 = E[tr(Λ0′eF 0F 0′e′Λ0

) ]=

R∑r=1

N∑i,j=1

T∑t,s=1

E[λ0′i,rf

0t f

0′s λ

0j,reitejs

]

=R∑r=1

N∑i=1

T∑t=1

E[λ0′i,rf

0t f

0′t λ

0i,re

2it

]= O (NT ) .

Then ‖Λ0′eF 0‖ = Op(√NT ).

(iii) The result follows from the inequality ||eX′k,γ || ≤ ||e||sp||X′k,γ ||, Assumption A.2(iv) and the fact that

supγ ||X′k,γ || = Op(√NT ) for k = 1, . . . , 2K.

(iv) We only study tr(eX′k,γ) as the analysis of tr(eX′k,γMΛ0) is similar. For k = 1, . . . ,K, the result is trivial.For k = K + 1, . . . , 2K, we show the joint result on JNT (γ) ≡ 1√

NT

∑Ni=1

∑Tt=1 xiteitdit(γ). By Lemma C.3, we

have

Pr

(supγ∈Γ||JNT (γ)− JNT (γ0)|| > η

)≤C2C

2D(γ − γ)2

η4,

where CD is a D-dependent variable bounded with probability approaching one, and C2 < ∞. For any ε > 0, we

can choose large enough η to ensure the probability is bounded by ε. Hence, the desired result follows.

1

Page 55: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Proof of Lemma A.2. (i) Note that the kth entry of 1NT

∑Tt=1X

′tMΛet is 1

NT

∑Tt=1X

′t,kMΛet = 1

NT tr(MΛXke′)

and1

NTtr(MΛXke

′) =1

NTtr(Xke

′)− 1

NTtr(PΛXke

′).

By Lemma A.1 (i), we have∣∣ 1NT tr(Xke

′)∣∣ ≤ 1

NT ‖Xke′‖ = Op

(π−1NT

). For the second term, we have

1

NTsupΛ∈L‖tr(PΛXke

′)‖ ≤ supΛ∈L‖PΛ‖ ·

1

NT‖Xke

′‖ = R1/2Op(π−1NT

),

where we use the fact that ‖PΛ‖2 =tr(PΛPΛ) = R. It follows that supΛ∈L

∥∥∥ 1NT

∑Tt=1X

′tMΛet

∥∥∥ = Op(π−1NT

).

(ii) Note that the kth entry of 1NT

∑Tt=1Xt (γ)

′MΛet is 1NT tr(MΛXK+k,γe

′) and

1

NTtr(MΛXK+k,γe

′) =1

NTtr(XK+k,γe

′)− 1

NTtr(PΛXK+k,γe

′).

By Lemma A.1 (i), we have supγ∣∣ 1NT tr(XK+k,γe

′)∣∣ ≤ supγ

1NT ‖XK+k,γe

′‖ = Op(π−1NT

). For the second term,

we have1

NTsupΛ∈L‖tr(PΛXK+k,γe

′)‖ ≤ supΛ∈L‖PΛ‖ ·

1

NT‖XK+k,γe

′‖ = R1/2Op(π−1NT

).

It follows that supΛ∈L supγ

∥∥∥ 1NT

∑Tt=1Xt (γ)

′MΛet

∥∥∥ = Op(π−1NT

).

(iii) Noting that ‖PΛ‖ = R1/2 and 1(NT )1/2

∥∥Λ0′eF 0∥∥ = Op (1) , we have

supΛ∈L

∥∥∥ 1

NT

T∑t=1

f0′t Λ0′MΛet

∥∥∥ = supΛ∈L

1

NT

∣∣tr (Λ0′MΛeF0)∣∣

≤ 1√N

‖Λ0‖√N

‖eF 0‖√NT

(1 + supΛ∈L‖PΛ‖) = Op(π

−1NT ).

(iv) supΛ∈L

∥∥∥ 1NT

∑Tt=1 e

′tPΛet

∥∥∥ = supΛ∈L∥∥ 1NT tr (PΛee

′)∥∥ ≤ supΛ∈L ‖PΛ‖ 1

NT ‖e‖2sp = Op

(π−2NT

).

Proof of Lemma A.3. (i) By (A.4) in the proof of Proposition A.1, we have

N−1Λ0′(Λ− Λ0H) = N−1Λ0′(I1 + · · ·+ I15)ΛV −1NT .

For N−1Λ0′I1Λ, we have

N−1‖Λ0′I1Λ‖ ≤ N−1‖Λ0′I1Λ0H‖+N−1‖Λ0′I1(Λ− Λ0H)‖.

The first term on the right hand side (RHS) is bounded by 1N

1NT ‖e

′Λ0‖2‖H‖ = Op(N−1). The second term on the

RHS is bounded by

1

N2T‖Λ0′ee′(Λ− Λ0H)‖ ≤ 1√

N

‖Λ0′e‖√NT

‖e‖sp√NT

‖(Λ− Λ0H)‖√N

= N−1/2π−1NTOp(‖θ − θ

0‖+ ‖δ0‖+ π−1NT ).

For N−1Λ0′I2Λ, we have by Lemma A.1(ii)

N−1‖Λ0′I2Λ‖ =1

N2T‖Λ0′eF 0Λ0′Λ‖ ≤ 1

NT‖Λ0′eF 0‖ 1

N‖Λ0′Λ‖ = Op((NT )−1/2).

2

Page 56: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

For N−1Λ0′I3Λ, we have by Lemma A.1(ii)

N−1‖Λ0′I3Λ‖ =1

N2T‖Λ0′Λ0F 0′e′Λ‖

≤ 1√NT

‖Λ0′Λ0‖N

‖F 0′e′Λ0‖√NT

‖H‖+1√T

‖Λ0′Λ0‖N

‖F 0′e′‖√NT

‖(Λ− Λ0H)‖√N

=1√NT

+1√TOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ).

For N−1Λ0′I4Λ, . . . , N−1Λ0′I15Λ, it is easy to verify them to be Op(‖δ0‖+ ‖θ − θ0‖) and the analysis is omitted.

To sum up, we have N−1Λ0′(Λ− Λ0H) = Op(|δ0|+ ‖θ − θ0‖+ π−2NT ).

(ii) Noting that N−1Λ′(Λ− Λ0H) = N−1(Λ− Λ0H)′(Λ− Λ0H) +N−1H ′Λ0′(Λ− Λ0H), we have by (i)

1

N‖Λ′(Λ− Λ0H)‖ ≤ 1

N‖Λ− Λ0H‖2 +

1

N‖Λ0′(Λ− Λ0H)‖‖H‖ = Op(‖δ0‖+ ‖θ − θ0‖+ π−2

NT ).

(iii) By (i)-(ii) and the fact that N−1Λ′Λ = IR, we have

Λ0′Λ

N− Λ0′Λ0

NH = Op(‖δ0‖+ ‖θ − θ0‖+ π−2

NT ),

IR −Λ′Λ0

NH = Op(‖δ0‖+ ‖θ − θ0‖+ π−2

NT ).

Premultiplying the first equation by H ′ and using transpose of the second equation to obtain

IR −H ′Λ0′Λ0

NH = Op(‖δ0‖+ ‖θ − θ0‖+ π−2

NT ).

We have already verified that H = Op(1) and is invertible in the proof of Proposition A.1. Premultiplying the lastequation by H ′−1 and postmultiplying it by H ′ yields

IR −Λ0′Λ0

NHH ′ = Op(‖δ0‖+ ‖θ − θ0‖+ π−2

NT ).

The desired result follows by premultiplying both sides of the last equation by (Λ0′Λ0/N)−1.

(iv) Note that ‖PΛ − PΛ0‖2 = 2tr(IR − 1N Λ′PΛ0Λ). The result follows from (i)-(iii).

Proof of Lemma A.4. (i) By (A.4) in the proof of Proposition A.1, we have

e′(Λ− Λ0H) = e′(I1 + · · ·+ I15)ΛV −1NT .

For e′I1Λ, we have by Lemma A.1(iii)

‖e′I1Λ‖ ≤√NT‖e‖2sp

NT

‖e′Λ‖√NT

≤√NT‖e‖2sp

NT

(‖e′Λ0‖√NT

+‖e′(Λ− Λ0H)‖√

NT

)=√NTOp(π

−2NT )

[Op(N

1/2π−1NT ) + (NT )−1/2

∥∥∥e′(Λ− Λ0H)‖]

=√NTOp(N

1/2π−3NT ) + op

(‖e′(Λ− Λ0H)‖

).

For e′I2Λ, we have by Lemma A.1(ii)

‖e′I2Λ‖ = N‖e′eF 0‖NT

‖Λ0′Λ‖N

= Op(Nπ−1NT ).

3

Page 57: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

For e′I3Λ, we have by Lemma A.1(ii)

‖e′I3Λ‖ ≤ ‖e′Λ0‖√NT

‖F 0′eΛ‖√NT

≤ ‖e′Λ0‖√NT

(‖F 0′eΛ0‖√

NT+‖F 0′‖√T

‖e′(Λ− Λ0H)‖√N

)= Op(1)

Op(1) +N−1/2

∥∥∥e′(Λ− Λ0H)∥∥∥ .

For e′I4Λ and e′I5Λ, we have by Lemma A.1(iii)

‖e′I4Λ‖ ≤ 1

NT

2K∑k′=1

|θk′ − θ0k′ |‖e′eX

′k′,γΛ‖

≤ N√T

2K∑k′=1

|θk′ − θ0k′ |‖e‖sp√NT

‖eX′k′,γ‖NT

‖Λ‖√N

= Op(N√Tπ−2

NT ‖θ − θ0‖),

and

‖e′I5Λ‖ ≤ 1

NT

2K∑k′=1

|θk′ − θ0k′ |‖e′Xk′,γe

′Λ‖ ≤√NT

2K∑k′=1

|θk′ − θ0k′ |‖e′Xk′,γ‖NT

‖e′Λ‖√NT

=√NTOp(‖θ − θ0‖)Op

(π−1NT

) [Op(1) + (NT )−1/2‖e′(Λ− Λ0H)‖

]= Op(

√NTπ−1

NT ‖θ − θ0‖) + π−1

NTOp

(‖e′(Λ− Λ0H)‖

).

Similarly, by Lemma A.1(ii)-(iii) and the fact that ‖e‖sp/√NT = Op

(π−1NT

), we have

‖e′I6Λ‖ ≤ N√T

K∑k′=1

|δk|‖e‖sp√NT

‖e′Xk

(γ0, γ

)‖

NT

‖Λ‖√N

= N√Tπ−2

NTOp(‖θ − θ0‖+ ‖δ0‖),

‖e′I7Λ‖ ≤√NT

K∑k′=1

|δk|‖e′Xk

(γ0, γ

)‖

NT

‖e′Λ‖√NT

=√NTOp(‖θ − θ0‖+ ‖δ0‖)

[Op (1) + (NT )

−1/2 ‖e′(Λ− Λ0H)‖],

and

‖e′I8Λ‖ ≤ 1

NT

2K∑k′=1

|θk′ − θ0k′ |∥∥∥e′Λ0F 0′X′k′,γΛ

∥∥∥≤√NT

2K∑k′=1

|θk′ − θ0k′ |‖e′Λ0‖√NT

‖F 0′‖√T

‖X′k′,γ‖√NT

‖Λ‖√N

= Op(√NT‖θ − θ0‖).

Analogously, we can show that ‖e′ (I9) Λ‖ = Op(√NT‖θ − θ0‖) and ‖e′ (Is) Λ‖ =

√NTOp(‖θ − θ0‖ + ‖δ0‖)

for s = 10, . . . , 15. Summarizing the above results yields ‖e′(Λ−Λ0H)‖/√NT = Op(‖θ− θ0‖+ ‖δ0‖+ π−1

NT ).(ii) Recalling that Ψt(θ, γ) = Xt,γ(θ − θ0) +Xt(γ, γ

0)δ0 and (NT )α‖θ − θ0‖ = Op(1), we can show that

1

(NT )1−2α

∣∣∣ T∑t=1

Ψt(θ, γ)′MΛet

∣∣∣ ≤ supγ∈Γ

∣∣∣∣ 1

(NT )1−2α

T∑t=1

Ψt(θ, γ)′MΛet

∣∣∣∣≤∥∥∥(NT )α(θ − θ0)

∥∥∥ supγ∈Γ

∥∥∥∥ 1

(NT )1−α

T∑t=1

X ′t,γMΛet

∥∥∥∥4

Page 58: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

+∥∥∥(NT )αδ0

∥∥∥ supγ∈Γ

∥∥∥∥ 1

(NT )1−α

T∑t=1

Xt(γ0, γ)′MΛet

∥∥∥∥.However, we actually have proved in (A.7) that

supγ∈Γ

∥∥∥∥ 1

NT

T∑t=1

X ′t,γMΛet

∥∥∥∥ = op(‖θ − θ0‖+ ‖δ0‖) +Op(π−2NT ).

Also note that

supγ∈Γ

∥∥∥∥ 1

NT

T∑t=1

Xt(γ0, γ)′MΛet

∥∥∥∥ ≤ 2 supγ∈Γ

∥∥∥∥ 1

NT

T∑t=1

X ′t,γMΛet

∥∥∥∥.Combining the above results yields the conclusion in (ii).

(iii) This is a direct consequence of (A.9).(iv) Note that

1

NT

T∑t=1

f0′t Λ0′MΛet = − 1

NT

T∑t=1

f0′t H

−1′(Λ− Λ0H)′MΛet

= − 1

NT

T∑t=1

f0′t H

−1′V −1NT Λ′(I1 + · · ·+ I15)MΛet = −(J1 + · · ·+ J15),

where J1, . . . , J15 are implicitly defined in the above expression. We investigate the above terms one by one. For J1,

J1 =1

N2T 2tr[H−1′V −1

NT Λ′ee′MΛeF0]

=1

NTtr[(F 0′F 0)−1(Λ′0)−1Λee′eF 0

]− 1

N2Ttr[(F 0′F 0)−1(Λ′0)−1Λee′ΛΛ′eF 0

].

Note that 1NT Λ′ee′eF 0 = 1

NT Λ′[ee′−E(ee′)]eF 0+ 1NT Λ′E(ee′)eF 0.Noting that ‖Λ−Λ0H‖/

√N = Op((NT )−α+

π−1NT ), we have

1

NT

∥∥∥Λ′[ee′ − E(ee′)]eF 0∥∥∥ ≤ √N ‖Λ− Λ0H‖√

N

‖[ee′ − E(ee′)]eF 0‖NT

+ ‖H ′‖‖Λ0′[ee′ − E(ee′)]eF 0‖

NT

=√N[Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )]

+Op(1),

and

1

NT‖Λ′E(ee′)eF 0‖ ≤

√T‖Λ− Λ0H‖√

N

‖E(ee′)eF 0‖T√NT

+

√T

N‖H ′‖‖Λ

0′E(ee′)eF 0‖T√NT

=√T[Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )]

+Op(1).

Then

1

N2T‖Λ′ee′ΛΛ′eF 0‖ ≤ 1√

N

‖Λ′[ee′ − E(ee′)]Λ‖N√T

‖Λ′eF 0‖√NT

+

√T

N

‖Λ′E(ee′)Λ‖NT

‖Λ′eF 0‖√NT

= N[Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )]3

+√T[Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )]

+Op(1).

Summarizing the above results yields

(NT )2αJ1 =(NT )2α

NT

(√N +

√T )[Op((NT )−α + π−1

NT )]

+Op(1) +N[Op((NT )−α + π−1

NT )]3

= op(1).

5

Page 59: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Next, we study J2. Note that

1

NT

T∑t=1

f0′t H

−1′V −1NT Λ′

1

NTΛ0F 0′e′MΛet =

1

NTtr[e′MΛePF 0 ] =

1

NTtr[e′MΛ0ePF 0 ]− 1

NTtr[e′(PΛ−PΛ0)ePF 0

].

By the expression of PΛ − PΛ0 , we have

1

NTtr[e′(PΛ − PΛ0)ePF 0

]=

1

N2Ttr[e′(Λ− Λ0H)(Λ− Λ0H)′ePF 0

]+ 2

1

N2Ttr[e′(Λ− Λ0H)H ′0′ePF 0

]+

1

N2Ttr[e′0[HH ′ − (

Λ0′Λ0

N)−1]Λ0′ePF 0

]We use II1, II2 and II3 to denote the three terms on the RHS of the last equation temporarily. Note that

|II1| ≤1

T

‖Λ− Λ0H‖2

N

‖eF 0‖2

NT

∥∥∥(F 0′F 0

T)−1∥∥∥ =

1

TOp(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT ),

|II2| ≤1√NT‖H‖‖Λ− Λ0H‖√

N

‖F 0′e′‖√NT

‖Λ0′eF 0‖√NT

∥∥∥(F 0′F 0

T)−1∥∥∥ =

1√NT

Op(‖θ − θ0‖+ ‖δ0‖+ π−1NT ),

and

|II3| ≤1

NT

‖Λ0′eF 0‖2

NT

∥∥∥(F 0′F 0

T)−1∥∥∥‖HH ′ − (

Λ0′Λ0

N)−1‖ =

1

NTOp(‖θ − θ0‖+ ‖δ0‖+ π−2

NT ).

Summarizing the above results, we have

(NT )2αJ2 =1

(NT )1−2αtr[e′MΛ0ePF 0 ] + op(1).

Next, we study J3. Note that

|J3| ≤1√NT‖H−1′‖‖V −1

NT ‖‖Λ′eF 0‖√

NT

‖Λ0′MΛeF0‖

NT.

So this term is of a smaller order relative to 1NT ‖Λ

0′MΛeF0‖ and is therefore asymptotically negligible. For terms

J4-J7 and J12-J15, we can readily show that

7∑l=4

‖Il‖+

15∑l′=12

‖Il′‖ = Op(1√T

)Op(‖θ − θ0‖+ ‖δ0‖) +Op(‖θ − θ0‖2 + ‖δ0‖2)

and

7∑l=4

|Jl|+15∑

l′=12

|Jl′ | ≤1√T‖H−1‖‖V −1

NT ‖‖Λ‖√N

( 7∑l=4

‖Il‖+

15∑l′=12

‖Il′‖)‖eF 0‖√

NT(1 + ‖PΛ‖)

= Op(1

T)Op(‖θ − θ0‖+ ‖δ0‖) +

1√TOp(‖θ − θ0‖2 + ‖δ0‖2).

Then it follows that (NT )2α(∑7

l=4 |Jl|+∑15l′=12 |Jl′ |

)= op(1). For J8, we have

J8 =

2K∑k=1

(θk − θ0k)

1

N2T 2tr[H−1′V −1

NT ΛXk,γ0F 0H−1′(Λ0H − Λ)′MΛeF0].

6

Page 60: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Then

|J8| ≤1√NT‖θ − θ0‖‖H−1‖2‖V −1

NT ‖‖Λ‖√N

( 2K∑k=1

‖Xk,γ0‖√NT

)‖F 0‖√T

‖Λ− Λ0H‖√N

‖eF 0‖√NT

(1 + ‖PΛ‖)

=1√NT‖θ − θ0‖Op(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ),

and (NT )2α|J8| = op(1). Analogously, we can show that (NT )2α|J10| = op(1). For J9, we have

J9 =

2K∑k=1

(θk − θ0k)

1

N2T 2tr[H−1′V −1

NT Λ′0F 0′X′k,γ0MΛeF0],

and

|J9| ≤1√NT‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ‖√N

‖Λ0‖√N

2K∑k=1

(‖F 0X′k,γ0eF 0‖T√NT

+‖F 0‖√T

‖Xk,γ0‖√NT

‖Λ‖√N

‖Λ′eF 0‖√NT

)=

1√NT‖θ − θ0‖

(Op(1) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )).

Then (NT )2α|J9| = op(1). By the same token, (NT )2α|J11| = op(1). Summarizing the above results, we have

(iv).(v) Note that

1

NTtr[e′(PΛ − PΛ0)e

]=

1

N2Ttr[e′(Λ− Λ0H)(Λ− Λ0H)′e

]+ 2

1

N2Ttr[e′(Λ− Λ0H)H ′0′e

]+

1

N2Ttr[e′0[HH ′ − (

Λ0′Λ0

N)−1]Λ0′e

].

By Lemma A.3 (iii) and Lemma A.4 (i), the first term on the RHS is bounded in norm by 1N2T ‖e

′(Λ − Λ0H)‖2 =1NOp(‖θ − θ

0‖2 + ‖δ0‖2 + π−2NT ), and the third term is bounded in norm by 1

N2T ‖Λ0′e‖2‖HH ′ − (Λ0′Λ0

N )−1‖ =1NOp(‖θ − θ

0‖2 + ‖δ0‖2 + π−2NT ). So the (NT )2α times of these two terms are both op(1). It remains to study the

second term via the decomposition:

1

N2Ttr[e′(Λ− Λ0H)H ′0′e

]=

1

N2Ttr[e′(I1 + · · ·+ I15)ΛV −1

NTH′0′e]

= J1 + · · ·+ J15.

Again, we consider the fifteen terms one by one. For J1,

|J1| ≤1

N‖H−1‖‖V −1

NT ‖‖Λ0′ee′‖NT

‖Λ′ee′‖NT

=1

N‖H−1‖‖V −1

NT ‖‖Λ0′ee′‖NT

(‖H ′‖‖Λ

0′ee′‖NT

+√N‖Λ− Λ0H‖√

N

‖ee′‖NT

)=

1

NOp(π

−1NT )

(Op(π

−1NT ) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )Op(π−1NT )

)=

1

NOp(‖θ − θ0‖+ ‖δ0‖+ π−2

NT ).

So (NT )2αJ1 = op(1). For J2,

|J2| ≤1

N√NT‖H−1‖‖V −1

NT ‖‖Λ0′ee′0‖NT

‖F 0e′Λ‖√NT

≤ 1

N√NT‖H−1‖‖V −1

NT ‖‖Λ0′Λ0‖‖e‖2sp

NT

‖F 0e′Λ‖√NT

=1

N√NT

Op(1 +N

T)√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ) = Op(1

NT) +

1

N√TOp(‖θ − θ0‖+ ‖δ0‖).

So (NT )2αJ2 = op(1). For J3, we have

|J3| ≤1

NT‖H−1‖‖V −1

NT ‖‖Λ0′ee′eF 0‖

NT

‖Λ0‖√N

‖Λ‖√N

= Op(1

NT),

7

Page 61: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where we use the fact that 1NT ‖Λ

0′ee′eF 0‖ = 1NT ‖Λ

0′(ee′ − E(ee′))eF 0‖+ 1NT ‖Λ

0′E(ee′)eF 0‖ = Op(1) dueto the results in J1. Then (NT )2αJ3 = op(1). By the same arguments on J4, . . . , J7 and J12, . . . , J15, we can showthat

7∑l=4

|Jl|+15∑

l′=12

|Jl′ | = Op(1

T)Op(‖θ − θ0‖+ ‖δ0‖) +

1√TOp(‖θ − θ0‖2 + ‖δ0‖2).

So we have (NT )2α(∑7l=4 |Jl|+

∑15l′=12 |Jl′ |) = op(1). It remains to study J8, . . . J11. For J8, we have

|J8| ≤1

N‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ0′ee′0‖NT

‖F 0‖√T

(

2K∑k=1

‖Xk,γ0‖√NT

)‖Λ‖√N

=1

NOp(‖θ − θ0‖),

and (NT )2αJ8 = op(1). Similarly, we can show that (NT )2αJ10 = op(1). Consider J9:

|J9| ≤1√N‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ0′ee′‖NT

‖Xk,γ0‖√NT

‖F 0‖√T

‖Λ0‖√N

‖Λ‖√N

=1√NOp(‖θ − θ‖)Op(π−1

NT ).

Then (NT )2αJ9 = op(1). Analogously, we have (NT )2αJ11 = op(1). Then the result in (v) follows.(vi) Note that

1

NT

T∑t=1

f0′t Λ0′MΛΛ0f0

t =1

NTtr[H−1′(Λ− Λ0H)′MΛ(Λ− Λ0H)H−1(F 0′F 0)

]=

1

NTtr[H−1′V −1

NT Λ′(I1 + · · ·+ I15)′MΛ(I1 + · · ·+ I15)ΛV −1NTH

−1F 0′F 0].

Define for l, l′= 1, ..., 15,

Jl,l′ =1

NTtr[H−1′V −1

NT Λ′IlMΛIl′ΛV−1NTH

−1F 0′F 0].

We partition the index set into three groups: G1 = 1, 2, 3, G2 = 9, 11 and G3 = 4, 5, 6, 7, 8, 10, 12, 13, 14, 15.First, it is easy to verify that∑

l∈G1,l′∈G3

|Jl,l′ | = Op

(π−2NT + π−1

NT (‖θ − θ0‖+ ‖δ0‖))Op(‖θ − θ0‖+ ‖δ0‖),

∑l∈G3,l′∈G1

|Jl,l′ | = Op

(π−2NT + π−1

NT (‖θ − θ0‖+ ‖δ0‖))Op(‖θ − θ0‖+ ‖δ0‖),

∑l∈G2,l′∈G3

|Jl,l′ | = op(‖θ − θ0‖2 + ‖δ0‖2),∑

l∈G3,l′∈G2

|Jl,l′ | = op(‖θ − θ0‖2 + ‖δ0‖2), and

∑l∈G3,l′∈G3

|Jl,l′ | = op(‖θ − θ0‖2 + ‖δ0‖2).

So (NT )2α times of these terms are all op(1). So it suffices to study the remaining terms. For J1,1, we have

|J1,1| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′ee′MΛee′ΛV −1

NTH−1F 0′F 0

]∣∣∣≤ 1

N‖H−1‖2‖V −1

NT ‖2(‖Λee′‖

NT

)2(‖F 0‖√T

)2

(1 + ‖PΛ‖)

=1

N

(Op(π

−2NT ) +NOp(‖θ − θ0‖2 + ‖δ0‖2 + π−2

NT )Op(π−2NT )

).

8

Page 62: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

So (NT )2αJ1,1 = op(1). For J2,2, we have

|J2,2| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′0F 0′e′MΛeF0Λ0′ΛV −1

NTH−1F 0′F 0

]∣∣∣=

1

NTtr(ePF 0e′)−

∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′0F 0′e′PΛeF0Λ0′ΛV −1

NTH−1F 0′F 0

]∣∣∣.Then (NT )2α|J2,2 − 1

NT tr(ePF 0e′)| = op(1) as the second term in the last line is bounded by

1

NT‖H−1‖2‖V −1

NT ‖2(‖F 0‖√

T

)2(‖Λ′0‖N

)2(‖F 0′e′Λ‖√NT

)2

=1

NT

(Op(1) +NOp(‖θ − θ‖2 + ‖δ0‖2)

).

For J3,3, we have

|J3,3| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′eF 0Λ0′MΛΛ0F 0′e′ΛV −1NTH

−1F 0′F 0]∣∣∣

=1

NT‖H−1‖2‖V −1

NT ‖2(‖F 0‖√

T

)2(‖F 0′e′Λ‖√NT

)2 ‖Λ0′MΛΛ0‖N

= op(1

NT).

So (NT )2α|J3,3| = op(1). For J1,2, we have

|J1,2| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′ee′MΛeF0Λ0′ΛV −1

NTH−1F 0′F 0

]∣∣∣≤ 1

NT‖H−1‖2‖V −1

NT ‖2(‖F 0‖√

T

)2 ‖Λ0′Λ‖N

(‖Λ′ee′eF 0‖NT

+

√T

N

‖Λ′ee′Λ‖NT

‖Λ′eF 0‖√NT

)=

1

NT

(Op(1) +

√T (‖θ − θ0‖+ ‖δ0‖+ π−1

NT ))

So (NT )2α|J1,2| = op(1). Since |J1,2| = |J2,1|, we have (NT )2α|J2,1| = op(1). For J1,3, we have

|J1,3| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′ee′MΛΛ0F 0′e′ΛV −1NTH

−1F 0′F 0]∣∣∣

≤ 1

N√T‖H−1‖2‖V −1

NT ‖2(‖F 0‖√

T

)2 ‖Λee′‖NT

‖MΛΛ0‖√N

‖F 0′e′Λ‖√NT

=1

N√T

(Op(1) +

√N(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ))(Op(π

−1NT ) +

√NOp(‖θ − θ0‖+ ‖δ0‖)

).

So (NT )2α|J1,3| = op(1). Since |J1,3| = |J3,1|, we have (NT )2α|J3,1| = op(1). For J2,3, we have

|J2,3| =∣∣∣ 1

N3T 3tr[H−1′V −1

NT Λ′0F 0′e′MΛΛ0F 0′e′ΛV −1NTH

−1F 0′F 0]∣∣∣

≤ 1

NT‖H−1‖2‖V −1

NT ‖2(‖F 0‖√

T

)2 ‖ΛΛ0‖N

‖F 0′e′Λ‖√NT

(‖F 0′e′0‖√NT

+‖F 0′e′Λ‖√

NT

‖ΛΛ0‖N

)=

1

NT

(Op(1) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT ))2

.

So (NT )2α|J2,3| = op(1). Since |J2,3| = |J3,2|, we have (NT )2α|J3,2| = op(1). Next, we consider J1,9.

|J1,9| =∣∣∣ 1

N2T 2tr[H−1′V −1

NT Λ′ee′MΛ

( 2K∑k=1

(θk − θ0k)Xk,γ0

)F 0]∣∣∣

≤ 1√N‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ′ee′‖NT

(

2K∑k=1

‖Xk,γ0‖√NT

)‖F 0‖√T

(1 + ‖PΛ‖)

9

Page 63: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

=1√N

(Op(π

−1NT ) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )Op(π−1NT )

)Op(‖θ − θ0‖).

Then (NT )2α|J1,9| = op(1). Since |J9,1| = |J1,9|, we have (NT )2α|J9,1| = op(1). Similarly, we can show that(NT )2α|J1,11| = op(1) and (NT )2α|J11,1| = op(1). For J2,9, we have

|J2,9| =∣∣∣ 1

N2T 2tr[H−1′V −1

NT Λ′0F 0′e′MΛ

( 2K∑k=1

(θk − θ0k)Xk,γ0

)F 0]∣∣∣

≤ 1√NT‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ′0‖N

( 2K∑k=1

‖F 0′e′Xk,γ0‖√NT

+‖F 0′e′Λ‖√

NT

‖Λ‖√N

‖Xk,γ0‖√NT

)‖F 0‖√T

=1√NT

(Op(1) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )))Op(‖θ − θ0‖).

Then (NT )2α|J2,9| = op(1). Since |J9,2| = |J2,9|, we have (NT )2α|J9,2| = op(1). Similarly, we can show that(NT )2α|J2,11| = op(1) and (NT )2α|J11,2| = op(1). For J3,9, we have

|J3,9| =∣∣∣ 1

N2T 2tr[H−1′V −1

NT Λ′eF 0Λ0′MΛ

( 2K∑k=1

(θk − θ0k)Xk,γ0

)F 0]∣∣∣

≤ 1√NT‖θ − θ0‖‖H−1‖‖V −1

NT ‖‖Λ′eF 0‖√

NT

‖Λ0‖√N

(

2K∑k=1

‖Xk,γ0‖√NT

)‖F 0‖√T

(1 + ‖PΛ‖)

=1√NT

(Op(1) +

√NOp(‖θ − θ0‖+ ‖δ0‖+ π−1

NT )))Op(‖θ − θ0‖).

Then (NT )2α|J3,9| = op(1). Since |J9,3| = |J3,9|, we have (NT )2α|J9,3| = op(1). Similarly, we can show that(NT )2α|J3,11| = op(1) and (NT )2α|J11,3| = op(1). Finally, we have¬

J9,9 + J9,11 + J11,9 + J11,11 =

2K∑k=1

2K∑k′=1

(θk − θ0k)(θk′ − θ0

k′)1

NTtr[Xk,γMΛXk′,γPF 0

]+ 2

2K∑k=1

K∑k′=1

(θk − θ0k)δ0

k′1

NTtr[Xk,γMΛXk′(γ, γ

0)PF 0

]+

K∑k=1

K∑k′=1

δ0kδ

0k′

1

NTtr[Xk(γ, γ0)MΛXk′(γ, γ

0)PF 0

].

Summarizing the above results, we have (vi).

Proof of Lemma A.5: (i) LetANT =

sup v

αNT≤|γ−γ0|≤B

‖JNT (γ)−JNT (γ0)‖√αNT |γ−γ0| > η

and B = CD ≤ C. We have

Pr(Bc) ≤ ε/2 for large enough (N,T ). We are left to prove Pr(ANT |B) ≤ ε/2. We focus on the case in which

γ > γ0 as the γ < γ0 case follows by symmetric arguments.Fix η > 0 and ε > 0. For j = 1, 2, . . . , set γj = γ0 + v2j−1/αNT , where v < ∞ will be determined later on.

We can show by direct calculation:

supv

αNT≤γ−γ0≤B

∥∥JNT (γ)− JNT (γ0)∥∥

√αNT |γ − γ0|

≤ supj>0

∥∥JNT (γj)− JNT (γ0)∥∥

√αNT |γj − γ0|

¬Note that the decomposition (A.4) is based on the expansion et = et + Λ0f0t −Xt,γ0(θ− θ0)−Xt(γ, γ0)δ. Alternatively,

we can derive an equivalent decomposition based on et = et + Λ0f0t − Xt,γ(θ − θ0) − Xt(γ, γ0)δ0. If we use the second

expression, we can obtain immediately the following result.

10

Page 64: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

+ supj>0

supγj≤γ≤γj+1

‖JNT (γ)− JNT (γj)‖√αNT |γj − γ0|

.

By Assumption A.4(iv)-(v) and by Lemma C.1, we have

ED∥∥JNT (γj)− JNT (γ0)

∥∥2=

1

NT

N∑i=1

T∑t=1

EDh2it(γj , γ

0) ≤ CD∣∣γj − γ0

∣∣ . (B.1)

By Assumption A.4(iv), Markov inequality, and (B.1) we have

PrD

(supj>0

∥∥JNT (γj)− JNT (γ0)∥∥

√αNT |γj − γ0|

> η

)≤ 1

η2

∞∑j=1

ED∥∥JNT (γj)− JNT (γ0)

∥∥2

αNT |γj − γ0|2

≤ CDη2

∞∑j=1

1

αNT |γj − γ0|=

2CDη2v

.

Set δj = γj+1 − γj and ηj =√αNT

∣∣γj − γ0∣∣ η. Then

PrD

(supj>0

supγj≤γ≤γj+1

‖JNT (γ)− JNT (γj)‖√αNT |γj − γ0|

> η

)≤

∞∑j=1

PrD

(sup

γj≤γ≤γj+1

‖JNT (γ)− JNT (γj)‖√αNT |γj − γ0|

> η

)

=

∞∑j=1

PrD

(sup

γj≤γ≤γj+δj‖JNT (γ)− JNT (γj)‖ > ηj

).

When v ≥ 1, we have δj ≥ α−1NT ≥ (NT )

−1 for all j > 0. Furthermore,√NT ≥ 6C/ηj is satisfied if v ≥

max[1, 6C/η]. Then by Lemma C.3 and conditional on B, there exists a constant C2 such that

∞∑j=1

PrD

(sup

γj≤γ≤γj+δj‖JNT (γ)− JNT (γj)‖ > ηj

∣∣∣∣∣B)≤∞∑j=1

C2C2δ2j

η4j

=4C2C

2

3v2η4.

By choosing a large enough v we can have Pr(ANT |B) ≤ ε/2.(ii) The kth entry of J∗NT (γ1)− J∗NT (γ2) can be written as

1√NT

tr[eX′k(γ1, γ2)PΛ0

]=

1√NT

tr[Λ0′eX′k(γ1, γ2)Λ0(Λ0′Λ0)−1

],

which is bounded by 1N√NT

∥∥Λ0′eX′k(γ1, γ2)Λ0∥∥ · ∥∥(N−1Λ0′Λ0)−1

∥∥ . Note that N−1Λ0′Λ0 p→ Σλ > 0 by

Assumption A.2. It suffices to bound 1N√NT

∥∥Λ0′eX′k(γ1, γ2)Λ0∥∥ . The (r, l)th entry of Λ0′eX′k(γ1, γ2)Λ0 is∑

t

∑i,j λ

0irλ

0jleitXjt,k[djt(γ2)− djt(γ1)]. We can do similar analysis as in Lemmas C.1-C.3 and part (i) to obtain

the desired result.

Proof of Lemma A.6. Without loss of generality (WLOG), we consider the case in which γ > γ0 as the γ < γ0 casecan be analyzed by symmetric arguments. Note thatEDGNT (γ) = 1

NT

∑Ni=1

∑Tt=1ED

[(C0′xit)

2(dit(γ)− dit(γ0))],

and when γ > γ0, we haved

dγEDGNT (γ) = C0′MD(γ)C0,

where MD(γ) = 1NT

∑Ni=1

∑Tt=1ED(xitx

′it|γ)fit,D(γ). For some B > 0 small enough, we let

dD = min|γ−γ0|≤B

C0′MD(γ)C0 > 0 a.s.

11

Page 65: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

by Assumption A.5(i). Then

min|γ−γ0|≤B

EDGNT (γ)

|γ − γ0|≥ dD. (B.2)

By Lemma C.2,

ED |GNT (γ)− EDGNT (γ)|2 ≤∥∥C0

∥∥4ED

∣∣∣∣∣ 1

NT

N∑i=1

T∑t=1

[k2it(γ, γ

0)− EDk2it(γ, γ

0)]∣∣∣∣∣

2

≤ (NT )−1∥∥C0

∥∥4C1CD

∣∣γ − γ0∣∣ , (B.3)

Fix ε, η > 0 and set d = τ1∥∥C0

∥∥2, where τ1 > 0 is as defined in Assumption A.5(ii). Noting that min|γ−γ0|≤B C

0′

×MD(γ)C0 ≥∥∥C0

∥∥2inf |γ−γ0|<B µmin(MD(γ)), by Assumption A.5(ii). we have Pr(dD ≥ d)→ 1, as (N,T )→

∞. Let

b =1− η/21− η

> 1 and v =8 ‖C0‖4 C1C

εη2d2(1− 1/b).

For j = 1, . . . ,m+ 1, set γj = γ0 + vbj−1/αNT such that γm − γ0 = vbm−1/αNT ≤ B and γm+1 − γ0 > B. ByMarkov inequality and (B.2)-(B.3)

PrD

(sup

1≤j≤m

∣∣∣∣GNT (γj)− EDGNT (γj)

EDGNT (γj)

∣∣∣∣ > η

2

)≤ 4

η2

m∑j=1

ED |GNT (γj)− EDGNT (γj)|2

|EDGNT (γj)|2

≤4∥∥C0

∥∥4C1CD

NTη2d2D

m∑j=1

1

|γj − γ0|

≤ CD

C

d2

d2D

ε

2.

For any γ ∈ [γ0 + v/αNT , γ0 +B], there is some 1 ≤ j ≤ m such that γj < γ < γj+1. Then

GNT (γ)

|γ − γ0|≥ GNT (γj)

|γj+1 − γ0|=

∣∣γj − γ0∣∣

|γj+1 − γ0|· EDGNT (γj)

|γj − γ0|· GNT (γj)

EDGNT (γj)

≥ 1

b· dD ·

GNT (γj)

EDGNT (γj).

Conditional on the event that sup

1≤j≤m

∣∣∣∣GNT (γj)− EDGNT (γj)

EDGNT (γj)

∣∣∣∣ < η

2

,

CD < C and

inf |γ−γ0|<B µmin(MD(γ)) > τ1), we have

GNT (γ)

|γ − γ0|>

1

b· d · (1− η/2) = d(1− η).

The probability of the joint events can be bounded below by 1− ε. Therefore, we have

Pr

(inf

vαNT

≤γ−γ0≤B

GNT (γ)

|γ − γ0|< (1− η)d

)≤ ε/2.

A symmetric argument applies to the case −B ≤ γ − γ0 ≤ − vαNT

.For the second result, by the triangle inequality,

1

T

T∑t=1

∥∥∥ 1

N

N∑i=1

xitλ0′i dit(γ, γ

0)∥∥∥2

≤ 1

T

T∑t=1

[1

N

N∑i=1

‖λ0i ‖ · ‖xit‖ · dit(γ, γ0)

]2

≡ G∗∗NT (γ).

12

Page 66: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

WLOG, assume γ > γ0. Let γj = γ0+ vαNT

2j . By definition, it is seen that there are at mostm = blog2(αNT cNT /v)c+2 . log2N points in the interval γ − γ0 ∈ [ v

αNT, cNT ]. Note that G∗∗NT (γ) is an increasing function of γ, we have,

for any γ ∈ [γk, γk+1],G∗∗NT (γ)

|γ − γ0|≤ G∗∗NT (γk+1)

|γk − γ0|.

Then

PrD

(sup

vαNT

≤γ−γ0≤cNT

G∗NT (γ)

|γ − γ0|> η

)≤ PrD

(sup

vαNT

≤γ−γ0≤cNT

G∗∗NT (γ)

|γ − γ0|> η

)

≤ PrD

(max

0≤k≤m

G∗∗NT (γk+1)

|γk − γ0|> η

)≤

m∑k=0

PrD

(G∗∗NT (γk+1)

|γk − γ0|> η

).

However, for each given γ, by the Cauchy-Schwarz inequality,

G∗∗NT (γ) ≤ 21

T

T∑t=1

[1

N

N∑i=1

‖λ0i ‖‖xit‖dit(γ, γ0)− ED

(‖λ0

i ‖‖xit‖dit(γ, γ0))]2

︸ ︷︷ ︸I1(γ)

+ 21

T

T∑t=1

[1

N

N∑i=1

ED‖λ0i ‖‖xit‖dit(γ, γ0)

]2

︸ ︷︷ ︸I2(γ)

.

For the second term, it is easy to verify that for each given γ,

I2(γ) =1

T

T∑t=1

[1

N

N∑i=1

ED‖λ0i ‖‖xit‖dit(γ, γ0)

]2

. |γ − γ0|2 a.s.

For the first term, by Assumption A.4(iv), we have that for each t,

ED

∥∥∥ 1√N

N∑i=1

‖λ0i ‖‖xit‖dit(γ, γ0)− ED

(‖λ0

i ‖‖xit‖dit(γ, γ0))∥∥∥4

≤ 1

N2

N∑i=1

ED

(‖λ0

i ‖4‖xit‖4dit(γ, γ0))

+ 3[ 1

N

N∑i=1

ED

(‖λ0

i ‖2‖xit‖2dit(γ, γ0))]2

.1

N|γ − γ0|+ |γ − γ0|2 a.s.

which, by the Cauchy-Schwarz inequality, implies

ED[I21 (γ)

]≤ 1

NT

T∑t=1

ED

∥∥∥ 1√N

N∑i=1

‖λ0i ‖‖xit‖dit(γ, γ0)− ED

(‖λ0

i ‖‖xit‖dit(γ, γ0))∥∥∥4

.1

N2|γ − γ0|+ 1

N|γ − γ0|2.

For I2(γk+1), we have that 2I2(γk+1)|γk−γ0| . |γk − γ0| . cNT . Since cNT shrinks to 0 as N and T become larger, we

conclude that for sufficiently large N and T , 2I2(γk+1)|γk−γ0| <

η2 . Given this, we have

PrD

(G∗∗NT (γk+1)

|γk − γ0|> η

)≤ PrD

(2I1(γk+1) + 2I2(γk+1)

|γk − γ0|≥ η

)

13

Page 67: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

≤ PrD

(I1(γk+1) ≥ 1

4η|γk − γ0|

)≤ 16

η2|γk − γ0|2ED(I2

1 (γk+1))

.16

η2|γk − γ0|2[ |γk+1 − γ0|

N2+|γk+1 − γ0|2

N

].

Thenm∑k=1

Pr

(G∗∗NT (γk+1)

|γk − γ0|> η

).

αNTη2N2v

+log2N

Nη2−→ 0,

since αNT /N2 → 0. The proofs for the other results are similar and less difficult and thus omitted.

Proof of Lemma A.7. We only provide the proof of the second result because that of the others are relatively easier.For H2,NT (γ), we use the identity PΛ = ΛΛ′/N and make the following decomposition

H2,NT (γ) =1

(NT )1−α

T∑t=1

Xt(γ, γ0)′Λ0

[HH ′ −

(N−1Λ0′Λ0

)−1] Λ0′Λ0

Nf0t

+1

(NT )1−α

T∑t=1

Xt(γ, γ0)′Λ0H

(Λ− Λ0H)′Λ0

Nf0t

+1

(NT )1−α

T∑t=1

Xt(γ, γ0)′(Λ− Λ0H)H ′

Λ0′Λ0

Nf0t

+1

(NT )1−α

T∑t=1

Xt(γ, γ0)′(Λ− Λ0H)

(Λ− Λ0H)′Λ0

Nf0t

≡ H21,NT (γ) +H22,NT (γ) +H23,NT (γ) +H24,NT (γ), say.

First, we consider H21,NT (γ). The kth element is equal to

[H21,NT (γ)]k =1

(NT )1−α tr

F 0′Xk(γ, γ0)′Λ0

[HH ′ −

(N−1Λ0′Λ0

)−1] Λ0′Λ0

N

∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT·∥∥∥∥Λ0′Λ0

N

∥∥∥∥ · (NT )α∥∥∥HH ′ − (N−1Λ0′Λ0

)−1∥∥∥

= op(1) ·∥∥F 0′Xk(γ, γ0)′Λ0

∥∥NT

by Lemma A.3 (iii) and Proposition A.3 (ii). By the similar analysis as used in the proof of Lemma A.6, for anyε > 0, there exists a constant v, B and K5 <∞ such that

Pr

supv

αNT≤|γ−γ0|≤B

∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT |γ − γ0|> K5

< ε.

By the above result, for any ε > 0 and η > 0, with the same v, B, we have

Pr

supv

αNT≤|γ−γ0|≤B

‖H21,NT (γ)‖|γ − γ0|

> η

< ε.

14

Page 68: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

By similar arguments and noting that (Λ− Λ0H)′Λ0/N = op((NT )−α), we can show

Pr

supv

αNT≤|γ−γ0|≤B

‖H22,NT (γ)‖|γ − γ0|

> η

< ε.

For H23,NT (γ), we expand Xt(γ, γ0)′(Λ− Λ0H). Using (A.4), we obtain the kth entry of H23,NT (γ) as

[H23,NT (γ)]k =1

(NT )1−α tr

F 0′Xk(γ, γ0)′(Λ− Λ0H)H ′

Λ0′Λ0

N

=

1

(NT )1−α tr

F 0′Xk(γ, γ0)′(I1 + · · ·+ I15)ΛV −1

NTH′Λ

0′Λ0

N

.

Then we have∣∣[H23,NT (γ)]k∣∣ ≤ 1

(NT )1−α

∥∥∥F 0′Xk(γ, γ0)′(I1 + · · ·+ I15)Λ∥∥∥∥∥∥∥V −1

NTH′Λ

0′Λ0

N

∥∥∥∥≤ 1

(NT )1−α

15∑l=1

∥∥∥F 0′Xk(γ, γ0)′(Il)Λ∥∥∥∥∥∥∥V −1

NTH′Λ

0′Λ0

N

∥∥∥∥≡ (L1,k + · · ·+ L15,k)

∥∥∥∥V −1NTH

′Λ0′Λ0

N

∥∥∥∥ , say.

For L1,k, we have

L1,k =1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′ee′Λ∥∥∥

≤ 1

(NT )2−α

∥∥F 0′Xk(γ, γ0)′ee′Λ0∥∥ · ||H||+ 1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′ee′(Λ− Λ0H)∥∥∥

≡ L1,k1 + L1,k2.

For L1,k1, one can follow same strategy of proof of Lemma A.5 to show

Pr

supv

αNT≤|γ−γ0|≤B

||F 0′Xk(γ, γ0)′ee′Λ0||(NT )2−α |γ − γ0|

> η

< ε.

For L1,k2, we have

L1,k2 =1

(NT )2−α

∥∥∥ N∑i=1

N∑j=1

T∑s=1

T∑t=1

f0t xk,it[dit(γ)− dit(γ0)]eisejs(λj −Hλ0

i )′∥∥∥

≤ 1

NT

N∑i=1

T∑t=1

||f0t xk,it[dit(γ)− dit(γ0)]|| · 1

(NT )1−α maxi≤N

∥∥∥ N∑j=1

T∑s=1

eisejs(λj −Hλ0i )′∥∥∥.

It suffices to show the second term is op(1). We have that

1

(NT )1−α maxi≤N

∥∥∥ N∑j=1

T∑s=1

eisejs(λj −Hλ0i )′∥∥∥ ≤ 1

(NT )1−α

∥∥∥ee′(Λ− Λ0H)∥∥∥

= (NT )α||e||sp√NT

||e′(Λ− Λ0H)||√NT

= op(1)

15

Page 69: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

by Lemma A.4 (i).Now, we consider L2,k :

L2,k =1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′eF 0Λ0′Λ∥∥∥ ≤ (NT )α√

NT

∥∥F 0′Xk(γ, γ0)′eF 0∥∥

T√NT |γ − γ0|

‖Λ0′Λ‖N

.

We can show

Pr

supv

αNT≤|γ−γ0|≤B

∥∥F 0′Xk(γ, γ0)′eF 0∥∥

T√NT

> η

< ε

by following the arguments used in the proof of Lemma A.5. Given this, it is easy to see that 1|γ−γ0|L2,k = op(1)

uniformly on∣∣γ − γ0

∣∣ ∈ [ vαNT

, B]. For L3,k, we have

L3,k =1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′Λ0F 0′e′Λ∥∥∥ ≤ (NT )α√

T

∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT

∥∥F 0′e′∥∥

√NT

‖Λ‖√N.

By the same arguments in the proof of [H21,NT (γ)]k, we have 1|γ−γ0|L3,k = op(1) uniformly on

∣∣γ − γ0∣∣ ∈

[ vαNT

, B]. For L4,k, we have

L4,k ≤2K∑k′=1

∣∣∣θk′ − θ0k′

∣∣∣ 1

(NT )2−α

∥∥∥∥F 0′Xk(γ, γ0)′eX′k′,γ0Λ

∥∥∥∥=

2K∑k′=1

(NT )α∣∣∣θk′ − θ0

k′

∣∣∣ 1

(NT )2

∥∥∥∥ N∑i=1

N∑j=1

T∑s=1

T∑t=1

f0t xk,it

[dit (γ)− dit

(γ0)]eisXk′,js,γ0 λ′j

∥∥∥∥≤

2K∑k′=1

(NT )α∣∣∣θk′ − θ0

k′

∣∣∣ 1

(NT )2

N∑i=1

T∑t=1

‖f0t ‖ |xk,it|

∣∣dit (γ)− dit(γ0)∣∣max

i≤N

∥∥∥∥ N∑j=1

T∑s=1

eisxk′,js,γ0 λ′j

∥∥∥∥.We can show

Pr

supv

αNT≤|γ−γ0|≤B

1

NT |γ − γ0|

N∑i=1

T∑t=1

‖f0t ‖ |xk,it|

∣∣dit (γ)− dit(γ0)∣∣ > K5

< ε

by similar arguments as used in the proof of Lemma A.5. Noting that (NT )α∣∣∣θk′ − θ0

k′

∣∣∣ = op(1), it suffices to show

maxi≤N ‖∑Nj=1

∑Ts=1 eisxk′,js,γ0 λ′j‖/(NT ) = Op(1) for k′ = 1, . . . , 2K. By Cauchy-Schwarz inequality, we

have

1

NT

∥∥∥∥ N∑j=1

T∑s=1

eisxk′,js,γ0 λ′j

∥∥∥∥ ≤(

1

NT 2

N∑j=1

∣∣∣∣ T∑s=1

eisxk′,js,γ0

∣∣∣∣2)1/2( N∑j=1

‖λj‖)1/2

=√R

(1

NT 2

N∑j=1

∣∣∣∣ T∑s=1

eisxk′,js,γ0

∣∣∣∣2)1/2

.

We have that

Pr

(maxi≤N

1

NT 2

N∑j=1

∣∣∣ T∑s=1

eisxk′,js,γ0

∣∣∣2 > η

)≤ 1

η2

1

N2T 4

N∑i=1

E( N∑j=1

∣∣∣ T∑s=1

eisxk′,js,γ0

∣∣∣2)2

≤ 1

NT 4

N∑i=1

N∑j=1

E∣∣∣ T∑s=1

eisxk′,js,γ0

∣∣∣4

16

Page 70: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

= O(N/T 2) = o(1).

The analysis of L5,k, L6,k and L7,k is similar to that of L4,k and we omit the details for brevity.Next, we consider L8,k. Note that

L8,k ≤2K∑k′=1

∣∣∣θk′ − θ0k′

∣∣∣ 1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′Λ0F 0′X′k′,γ0Λ∥∥∥

=

2K∑k′=1

(NT )α∣∣∣θk′ − θ0

k′

∣∣∣ ∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT

∥∥∥F 0′X′k′,γ0Λ∥∥∥

NT

= op(1)

∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT.

Then 1||γ−γ0||L8,k = op(1) uniformly on

∣∣γ − γ0∣∣ ∈ [ v

αNT, B]. For L9,k, we have

L9,k ≤2K∑k′=1

∣∣∣θk′ − θ0k′

∣∣∣ 1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′Xk′,γ0F 0Λ0′Λ∥∥∥

=

2K∑k′=1

(NT )α∣∣∣θk′ − θ0

k′

∣∣∣ ∥∥F 0′Xk(γ, γ0)′Xk′,γ0F 0∥∥

NT 2

∥∥∥Λ0′Λ∥∥∥

N.

We can follow similar arguments as used in the proof of Lemma A.6 to show that, for any ε > 0, there exists aconstant v, B and K5 <∞ such that

Pr

supv

αNT≤|γ−γ0|≤B

1

NT 2 |γ − γ0|

N∑i=1

T∑t=1

T∑s=1

∥∥f0s

∥∥∥∥f0t

∥∥ ‖xit‖ ‖xis‖ ∣∣dit(γ)− dit(γ0)∣∣ > K5

< ε.

Noting that∥∥F 0′Xk(γ, γ0)′Xk′,γ0F 0

∥∥ /(NT 2) ≤ 1NT 2

∑Ni=1

∑Tt=1

∑Ts=1

∥∥f0s

∥∥∥∥f0t

∥∥ ‖xit‖ ‖xis‖ ∣∣dit(γ)− dit(γ0)∣∣,

we have 1||γ−γ0||L9,k = op(1) uniformly on

∣∣γ − γ0∣∣ ∈ [ v

αNT, B]. For L10,k, we have

L10,k ≤K∑k′=1

δk′1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′Λ0F 0′X′k′(γ0, γ)Λ

∥∥∥≤

K∑k′=1

(NT )α‖δk′‖∥∥F 0′Xk(γ, γ0)′Λ0

∥∥NT

∥∥F 0′∥∥

√NT

∥∥Xk′(γ0, γ)

∥∥√NT

‖Λ‖√N

= op(1)

∥∥F 0′Xk(γ, γ0)′Λ0∥∥

NT,

where we used the fact∥∥Xk′(γ

0, γ)∥∥ /√NT = op(1). Then 1

||γ−γ0||L10,k = op(1) uniformly on∣∣γ − γ0

∣∣ ∈[ vαNT

, B]. For L11,k, we have

L11,k ≤K∑k′=1

δk′1

(NT )2−α

∥∥∥F 0′Xk(γ, γ0)′Xk′(γ0, γ)F 0Λ0′Λ

∥∥∥≤

K∑k′=1

(NT )α‖δk′‖∥∥F 0′Xk(γ, γ0)′Xk′(γ

0, γ)F 0∥∥

NT 2

‖Λ0′Λ‖N

17

Page 71: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

If suffices to show that ‖F0′Xk(γ,γ0)′Xk′ (γ

0,γ)F 0‖|γ−γ0|NT 2 = op(1) uniformly on

∣∣γ − γ0∣∣ ∈ [ v

αNT, B]. Denote dit(γ, γ′) =

|dit(γ)− dit(γ′)| . Conditional on∣∣γ − γ0

∣∣ ≤ cNT , we have∥∥F 0′Xk(γ, γ0)′Xk′(γ0, γ)F 0

∥∥NT 2

≤ 1

NT

N∑i=1

T∑t=1

∥∥f0t

∥∥ ‖xit‖ dit(γ, γ0)1

T

T∑s=1

∥∥f0s

∥∥ ‖xis‖ dis(γ, γ0)

≤ 1

NT

N∑i=1

T∑t=1

∥∥f0t

∥∥ ‖xit‖ dit(γ, γ0) ·maxi≤N

1

T

T∑s=1

∥∥f0s

∥∥ ‖xis‖ dis(γ, γ0).

It suffices to show the maxi≤N1T

∑Ts=1

∥∥f0s

∥∥ ‖xis‖ dis(γ, γ0) = op(1). Noting that we already have preliminary

rate |γ − γ0| = Op((NT )−α + (NT )1/2−α). We can establish this result by simple arguments.

ForL12,k−L15,k, the analysis is easier and omitted. Summarizing the above results, we have 1|γ−γ0|H23,NT (γ) =

op(1) uniformly on vαNT

≤∣∣γ − γ0

∣∣ ≤ B.

WithH24,NT (γ) = 1(NT )1−α tr

F 0′Xk(γ, γ0)′(Λ− Λ0H)H ′ (Λ−Λ0H)′Λ0

N

, we can easily show 1

|γ−γ0|H24,NT (γ)

= op(1) uniformly on vαNT

≤∣∣γ − γ0

∣∣ ≤ B using the analysis of L1,k, ..., L15,k and the fact that (Λ−Λ0H)′Λ0

N =

op(1). Combining the results on H21,NT (γ), ..., H24,NT (γ), we have the desired results.

Proof of Lemma A.8. (i) Noting that MΛ0Xk,γ0MF 0 = Zk,γ0 − Xk,γ0PF 0 + PΛ0Xk,γ0 + PΛ0Xk,γ0PF 0 , weprove the claim by showing that (i1) 1√

NTtr(e′PΛ0Xk,γ0) = op(1), (i2) 1√

NTtr(e′PΛ0Xk,γ0PF 0) = op(1), and (i3)

1√NT

trPF 0 [e′Xk,γ0 − ED(e′Xk,γ0)] = op(1). We first show (i1). Noting that

1

N2TE∥∥∥Λ0′Xk,γ0e′Λ0

∥∥∥2

=1

N2T

R∑l=1

R∑r=1

E[ED

( N∑i=1

N∑j=1

T∑t=1

λ0irλ

0jlXjt,k,γ0eit

)2]

=1

N2T

N∑i=1

N∑j=1

T∑t=1

R∑r=1

R∑l=1

E(λ0irλ

0jlXjt,k,γ0eit

)2

= O(1),

by Assumption A.4 and the fact that Xjt,k,γ0 and eit both have conditional mean zero. Then we have∣∣∣∣ 1√NT

tr(e′PΛ0Xk,γ0

)∣∣∣∣ ≤ 1

N1/2

1

NT 1/2

∥∥∥Λ0′Xk,γ0e′Λ0∥∥∥∥∥(N−1Λ0′Λ0)−1

∥∥ = Op(N−1/2).

For (i2), noting that ‖PF 0‖ = R1/2, we have

1√NT

tr(e′PΛ0Xk,γ0PF 0

)≤ R1/2

N1/2

1

NT 1/2

∥∥∥Λ0′Xk,γ0e′Λ0∥∥∥∥∥(N−1Λ0′Λ0)−1

∥∥ = Op(N−1/2).

For (i3), we first consider F 0′[e′Xk,γ0 − ED(e′Xk,γ0)]F 0. Letting ζits = eitXis,k,γ0 − ED(eitXis,k,γ0), we have

E∥∥∥F 0′[e′Xk,γ0 − ED(e′Xk,γ0)]F 0

∥∥∥2

=

R∑l=1

R∑r=1

E[ED

( N∑i=1

T∑s=1

T∑t=1

f0srf

0tlζits

)2]=

R∑l=1

R∑r=1

N∑i=1

E[ED

( T∑s=1

T∑t=1

f0srf

0tlζits

)2],

where we use the conditional independence across i. Then

ED

(T∑s=1

T∑t=1

f0srf

0tlζits

)2

≤(

maxt

∥∥f0t

∥∥)4 T∑s,t,u,v=1

∣∣∣CovD(eitXis,k,γ0 , eiuXiv,k,γ0)∣∣∣ .

18

Page 72: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

By the Example 2 of Moon and Weidner (2017) and Assumption A.5, we can show that

1

NT 2

N∑i=1

T∑s,t,u,v=1

∣∣∣CovD(eitXis,k,γ0 , eiuXiv,k,γ0)∣∣∣ = Op(1).

By Assumption A.2(ii) and Markov inequality, we have maxt∥∥f0t

∥∥ = Op(T1/8). Therefore

1

T√NT

∥∥∥F 0′[e′Xk,γ0 − ED(e′Xk,γ0)]F 0∥∥∥ = Op(T

−1/4) = op(1).

It follows that

1√NT

∣∣∣trPF 0 [e′Xk,γ0 − ED(e′Xk,γ0)]∣∣∣ ≤ 1

T√NT

∥∥∥F 0′[e′Xk,γ0 − ED(e′Xk,γ0)]F 0∥∥∥ ∥∥(T−1F 0′F 0)−1

∥∥ = op(1).

Noting that ED(e′Xk,γ) = ED(e′Xk,γ), we have the result desired.(ii) It suffices to show that (ii1) (NT )−1/2tr[(ee′−ED(ee′))MΛ0 Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′] = op(1),

and (ii2) (NT )−1/2tr[(ee′ − ED(ee′))MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′] = op(1). For (ii1) we have,∣∣∣∣ 1√NT

tr[(ee′ − ED(ee′))MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′

]∣∣∣∣=

∣∣∣∣ 1√NT

tr[Λ0′(ee′ − ED(ee′))MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1

]∣∣∣∣≤

∥∥Λ0′(ee′ − ED(ee′))∥∥

N√T

∥∥∥Xk,γ0PF 0

∥∥∥√NT

∥∥F 0∥∥

√T

∥∥∥∥∥(F 0′F 0

T

)−1∥∥∥∥∥∥∥∥∥∥(

Λ0′Λ0

N

)−1∥∥∥∥∥

= op(1),

where we used the fact∥∥∥Xk,γ0PF 0

∥∥∥ = op(√NT ). For (ii2), we observe that MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′

is determined byD. Therefore, ee′−ED(ee′) is independent of MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′ conditionalon D. We can easily verify that ‖MΛ0Xk,γ0 F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′‖ = Op(1). By the Lemma S.8.2 (b) ofMoon and Weidner (2017), there exists a finite non-random constant m such that

ED

∣∣∣tr[ee′ − ED (ee′)]MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′ /√NT ∣∣∣2≤ m · ED

(∥∥MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1Λ0′∥∥)2 /N.By the law of iterated expectations, we can show 1√

NTtr[Λ0′(ee′ − ED(ee′))MΛ0Xk,γ0F 0(F 0′F 0)−1(Λ0′Λ0)−1

]= op(1).

(iii) The proof is analogous to that of (ii) and thus omitted.

Proof of Lemma A.9. We need to show: (i) the finite dimensional weak convergence for any (γ1, γ2, . . . , γp); and

(ii) the stochastic equicontinuity of SNT (γ) defined below.(i) Let c be a 2K × 1 nonrandom vector such that ‖c‖ = 1. By Lemma 4 of Andrews (1993) that says that

pointwise convergence plus asymptotically independent increment implies finite dimensional convergence, it sufficesto show that, for any given γ,

1√NT

N∑i=1

T∑t=1

eitz′it,γc

d→ N(0, c′Ω(γ, γ)c), (B.4)

19

Page 73: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

since the asymptotically independent increment condition is implied by Assumption A.4. For any pair (i, t), letm = T (i− 1) + t and ξm = eitz

′it,γc. Define Fm−1 = D ∨ σ((xjs, qjs, ej,s−1) : T (j − 1) + s ≤ m), i.e.,

Fm−1 = D ∨ σ(x11, q11, x12, q12, e11, . . . , x1T , q1T , e1T−1, . . . , xj1, qj1, xj2, qj2, ej1, . . . , xjt, qjt, ejt−1

).

Define the partial sum SM =∑Ml=1

1√Mξl =

∑Ml=1 ξM,l with M = NT and ξM,l = 1√

Mξl. It is easy to see

E(ξm|Fm−1) = 0. So the sequence ξm,m = 1, . . . , M is a martingale difference sequence (m.d.s.) with respectto the filtration Fm. By Corollary 3.1 of Hall and Heyde (1980), we would have weak convergence (B.4) if we canshow (i1) for any ε > 0,

∑Ml=1E

[ξ2M,l

1(|ξM,l| > ε)|Fl−1

] p−→ 0, and (i2)∑Ml=1E(ξ2

M,l|Fl−1)

p−→ c′Ω(γ, γ)c. For

(i1), we first show∑Ml=1E

[ξ2M,l

1(|ξM,l| > ε)]→ 0. Note that

M∑l=1

E[ξ2M,l1(|ξM,l| > ε)

]=

M∑l=1

∫|ξM,l|>ε

ξ2M,ldP < ε2

M∑l=1

∫ξ4M,ldP ≤

1

Mε2maxlE(ξ4

l ).

However, E(ξ4l ) is uniformly bounded by Assumption A.2(i). This, along with the Markov inequality, implies (i1).

For (i2), noting that zl,γ∈Fl−1, we have

M∑l=1

E(ξ2M,l|Fl−1) =

1

M

M∑l=1

E(ξ2M,l|Fl−1) =

1

M

M∑l=1

(c′zl,γ z′l,γc)E(e2

l |Fl−1)

=1

M

M∑l=1

(c′zl,γ z′l,γc)e

2l −

1

M

M∑l=1

(c′zl,γ z′l,γc)

(e2l − E(e2

l |Fl−1)).

For the first term, by Lemma C.4(ii), we have 1M

∑Ml=1(c′zl,γ z

′l,γc)e

2l −c′ΩNT (γ, γ)c = op(1). For the second term,

we see that for any l 6= l′,

E[(c′zl,γ z

′l,γc)(e

2l − E(e2

l |Fl−1)(c′zl′,γ z′l′,γc)(e

2l′ − E(e2

l′ |Fl′−1)]

= 0,

which implies

E[ 1

M

M∑l=1

(c′zl,γ z′l,γc)

(e2l − E(e2

l |Fl−1))]2

= O(1

M).

So the second term is op(1). It follows that∑Ml=1E(ξ2

M,l|Fl−1) = c′ΩNT (γ, γ)c + op(1)

p−→ c′Ω(γ, γ)c by

Assumption A.6. This completes the proof of finite dimensional convergence.(ii) Let SNT (γ) ≡ 1√

NT

∑Ni=1

∑Tt=1 zit,γeit. The first K entries of SNT (γ) does not involve γ. So it suffices

to establish the stochastic equicontinuity for the last K entries. We only focus on the (K + k)th element of SNT (γ).The tightness of K-dimensional vector on the product space can be obtained by the tightness of its entries togetherwith the De Morgan’s laws. Let aij = λ0′

i (Λ0′Λ0)−1λ0j and bts = f0′

t (F 0′F 0)−1f0s . By definition, the (K + k)th

entry of SNT (γ) can be rewritten as

Sk,NT (γ) =1√NT

tr[MF 0Xk(γ)′MΛ0e

]+

1√NT

tr[(Xk(γ)−Xk(γ))′e

]=

1√NT

N∑i=1

T∑t=1

[xitdit(γ)− ED(xitdit(γ))

]eit

+1√NT

N∑i=1

T∑t=1

[ N∑j=1

T∑s=1

(1(i = j)− aij

)(1(t = s)− bts

)ED(xjsdjs(γ))

]eit,

20

Page 74: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Now we change the above summation over two indices into one so that the order under one index is the same as theNT -dimensional vector vec(e′). Now the above expression can be further written as

Sk,NT (γ) =1√M

M∑l=1

xldl(γ)el −1√M

M∑l=1

ED(xldl(γ))el +1√M

M∑l=1

( M∑l′=1

dl,l′ED(xl′dl′(γ)))el.

with dl,l′ =(

1(i = j) − aij

)(1(t = s) − bts

)for l = T (i − 1) + t and l′ = T (j − 1) + s. We will show

the tightness of the three terms on the RHS. Note that by conditions (i) and (ii) in Assumption A.4, the sequencexldl(γ)elMl=1 with M = NT is a α-mixing sequence conditional on D. It is also a martingale difference due toAssumption A.4(iii). By the same arguments as used in Hansen (2000), we obtain the tightness of the first term­.The second term is a special case of the third by setting aij = 0 and bts = 0. So we only need to consider thethird term. For the third term, first note that (i)

∑Ml′=1 dl,l′ED(xl′dl′(γ)) is D-measurable, (ii) the sequence el is

α-mixing and a martingale difference sequence. Let

S?k,NT (γ) =1√M

M∑l=1

( M∑l′=1

dl,l′ED(xl′dl′(γ)))

︸ ︷︷ ︸≡ψl(γ)

el =1√M

M∑l=1

ψl(γ)el.

By the Burkholder’s inequality (see, e.g., Theorem 2.10 of Hall and Heyde (1980)), we have that for any pair (γa, γb),

ED

∣∣∣S?k,NT (γa)− S?k,NT (γb)∣∣∣4 ≤ 1

M2ED

[ M∑l=1

[ψl(γa)− ψl(γb)]2e2l

]2≤ 2

1

M2ED

[ M∑l=1

[ψl(γa)− ψl(γb)]2e2l − ED

([ψl(γa)− ψl(γb)]2e2

it

)]2+ 2[ 1

M

M∑l=1

ED([ψl(γa)− ψl(γb)]2e2

it

)]2.

For the first term, since el is a D-conditional α-mixing process, by the mixing inequality and Assumption A.4(i),we have that there exists a uniform bounded constant C?D such that

ED

[ 1√M

M∑l=1

[ψl(γa)− ψl(γb)]2e2l − ED

([ψl(γa)− ψl(γb)]2e2

it

)]2≤ C?D

1

M

M∑l=1

[ψl(γa)− ψl(γb)]4ED(e4l )

≤ C?D[

maxlED(e4

l )] 1

M

M∑l=1

[ψl(γa)− ψl(γb)]4.

The term ED(e4l ) is uniformly bounded by Assumption A.2(i). By (a + b + c + d)4 ≤ 24(a4 + b4 + c4 + d4),

( 1N

∑k ak)4 ≤ 1

N

∑k a

4k and [ED(X)]4 ≤ ED(X4), we have

1

M

M∑l=1

[ψl(γa)− ψl(γb)]4 =1

NT

N∑i=1

T∑t=1

[ N∑j=1

T∑s=1

(1(i = j)− aij

)(1(t = s)− bts

)ED(xjsdjs(γa, γb))

]4­We note that there is a slight difference between the Hansen (2000)’s assumption and ours. Hansen assumes that the sequence

is ρ-mixing and use the results in Peligrad (1982) to derive the fourth moment of partial sum. In our study, we assume α-mixingprocesses. However, we can also derive the fourth moment according to Utev and Peligrad (2003). So the difference is notessential.

21

Page 75: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

≤ 161

NT

N∑i=1

T∑t=1

ED

(‖xit‖4

∣∣∣dit(γa, γb)∣∣∣)+ 15

( 1

N

N∑i=1

‖λ0i ‖4)∥∥∥(

Λ0′Λ0

N)−1∥∥∥4 1

NT

N∑j=1

T∑t=1

‖λ0j‖4ED

(‖xjt‖4

∣∣∣djt(γa, γb)∣∣∣)

+ 16( 1

T

T∑t=1

‖f0t ‖4)∥∥∥(

F 0′F 0

T)−1∥∥∥4 1

NT

N∑i=1

T∑s=1

‖f0s ‖4ED

(‖xis‖4

∣∣∣dis(γa, γb)∣∣∣)+ 16

( 1

N

N∑i=1

‖λ0i ‖4)( 1

T

T∑t=1

‖f0t ‖4)∥∥∥(

Λ0′Λ0

N)−1∥∥∥4∥∥∥(

F 0′F 0

T)−1∥∥∥4

× 1

NT

N∑j=1

N∑t=1

‖λ0jf

0′s ‖4ED

(‖xjs‖4

∣∣∣djs(γa, γb)∣∣∣).We only analyze the last term as the analyses for the first three terms are implicitly contained in that of the last one.By Assumption A.4 (iv) and (v), ED(‖xjs‖4|djs(γa, γb)|) ≤ Mjs,Dcf |γa − γb|, which, in conjunction with theCauchy-Schwarz inequality, implies that

1

NT

N∑j=1

N∑t=1

‖λ0jf

0′s ‖4ED

(‖xjs‖4

∣∣∣djs(γa, γb)∣∣∣) ≤ [ 1

NT

N∑j=1

N∑t=1

‖λ0jf

0′s ‖4Mjs,D

]cf |γa − γb|

≤[ 1

N

N∑j=1

‖λ0j‖8]1/2[ 1

T

T∑s=1

‖f0s ‖8]1/2[ 1

NT

N∑j=1

T∑s=1

M2js,D

]1/2cf |γa − γb|.

Then the last term is bounded by

16( 1

N

N∑i=1

‖λ0i ‖8)( 1

T

T∑t=1

‖f0t ‖8)∥∥∥(

Λ0′Λ0

N)−1∥∥∥4∥∥∥(

F 0′F 0

T)−1∥∥∥4( 1

NT

N∑i=1

T∑t=1

M2it,D

)1/2

cf |γa − γb|.

For any ε > 0, by Assumption A.2 (ii) and (iii) as well as Assumption A.4 (iv) and (v), we can find some constantMε (which depends on ε) such that Pr(E) > 1− ε where

E =

16

[( 1

N

N∑i=1

‖λ0i ‖8)∥∥∥(

Λ0′Λ0

N)−1∥∥∥4

∨ 1

]×[( 1

T

T∑t=1

‖f0t ‖8)∥∥∥(

F 0′F 0

T)−1∥∥∥4

∨ 1

]

×[

1

NT

N∑i=1

T∑t=1

M2it,D

]1/2

cf ≤Mε

,

So, on the set E , we have

ED

[ 1√M

M∑l=1

[ψl(γa)− ψl(γb)]2e2l − ED

([ψl(γa)− ψl(γb)]2e2

it

)]2. |γa − γb|.

Similar arguments show that, on the set E ,

1

M

M∑l=1

ED([ψl(γa)− ψl(γb)]2e2

it

). |γa − γb|.

It follows that on the set E ,

ED

∣∣∣S?k,NT (γa)− S?k,NT (γb)∣∣∣4 . 1

M|γa − γb|+ |γa − γb|2.

22

Page 76: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Taking expectation with respect to D on both sides, we have that on the set E ,

E∣∣∣S?k,NT (γa)− S?k,NT (γb)

∣∣∣4 . 1

M|γa − γb|+ |γa − γb|2.

So equation (18) in Hansen (2000), the prerequisite condition for tightness, continues to hold in the current paper.The remaining proof is almost the same as Hansen (2000) since it involves the partition of Γ and the chain arguments.Given this, together with the fact

Pr(

supγ∗≤γ≤γ+δ

|S?k,NT (γ)− S?k,NT (γ∗)| > η)≤ Pr

(sup

γ∗≤γ≤γ+δ|S?k,NT (γ)− S?k,NT (γ∗)| > η

∣∣∣E)+ Pr(Ec),

we obtain the tightness of S?k,NT (γ) by letting Mε grow very slowly to infinity. This completes the proof.

Proof of Lemma A.10. We first establish the convergence of the finite dimensional distribution of RNT (v) to that

of B(v) and then show that RNT (v) is tight.Fix v ∈ Υ, where 0 is an interior point in the compact set Υ. Let Vit(γ) ≡ E(xitx

′ite

2it|qit = γ), d∗it(v) ≡

dit(γ0 + v/αNT )− dit(γ0) and uit(v) ≡ √αNTxiteitd∗it(v). Then

RNT (v) =√αNT [JNT (γ0 +

v

αNT)− JNT (γ0)] = (NT )−1/2

N∑i=1

T∑t=1

uit(v).

Let VNT (v) = (NT )−1∑Ni=1

∑Tt=1 uit(v)uit(v)′. Under Assumption A.4, uit(v) is an m.d.s. By the martingale

central limit theorem, it suffices to prove RNT (v)d→ N(0, |v|V 0

f ) by showing that (i) VNT (v)p→ |v|V 0

f and (ii)(NT )−1/2 max1≤i≤N,1≤t≤T |uit(v)| p→ 0. We first show (i). Following the proof of Lemma C.1, we can show that∂E[xitX

′ite

2itdit(γ)]

∂γ = Vit(γ)fit(γ). Then by the continuity of Vit(γ)fit(γ) at γ0, if v > 0,

E [VNT (v)] =1

NT

N∑i=1

T∑t=1

αNTE[xitx

′ite

2itd∗it(v)

]

=v

NT

N∑i=1

T∑t=1

E[xitx

′ite

2itdit(γ

0 + vαNT

)]− E

[xitx

′ite

2itdit(γ

0)]

vαNT

→ v lim(N,T )→∞

1

NT

N∑i=1

T∑t=1

Vit(γ0)fit(γ

0) = vV 0f ,

and the sign of the last expression is reversed if v < 0. Next, we calculate the variance of VNT (v). By the variancedecomposition formula, we have

E ‖VNT (v)− E [VNT (v)]‖2 =α2NT

NTE

[VarD

(1√NT

N∑i=1

T∑t=1

‖xit‖2 e2itd∗it(v)

)]

+α2NT

NTVar

∥∥∥∥∥ 1√NT

N∑i=1

T∑t=1

ED

(‖xit‖2 e2

itd∗it(v)

)∥∥∥∥∥ .The first term is bounded by α2

NT

NT C1E(CD) |v|αNT→ 0. By Assumption A.8, there is a constant C ′ such

Var

(1√NT

N∑i=1

T∑t=1

ED

(‖xit‖2 e2

itd∗it(v)

))≤ C ′E

[‖xit‖4 e2

itd∗it(v)

]= O(1/αNT ).

23

Page 77: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Then E ‖VNT (v)− E [VNT (v)]‖2 = o (1) and (i) follows. Next, we show (ii). Note that

E

∥∥∥∥(NT )−1/2 maxi,t|uit(v)|

∥∥∥∥4

≤ 1

NTmaxi,t

E |uit(v)|4 =α2NT

NTmaxi,t

E(‖xit‖4 e4itd∗it(v))

=α2NT

NTmaxi,t

E(‖xit‖4 e4it|γ0)fit(γ

0)|v|αNT

→ 0.

Then (NT )−1/2 maxi,t |uit(v)| = op (1) . Consequently, we have RNT (v)d→ N(0, |v|V 0

f ). This argument can be

extended to include any finite collection v1, . . . , vk to yield the convergence of the finite dimensional distribution

of RNT (v) to that of B(v).To establish the tightness claim, we fix ε > 0 and η > 0. Let δ = εη4/(2C2C

2). Then there is a large enough

(N,T ) such that for (N,T ) ≥ (N,T ), Pr(CD > C) < δε/2,√NT ≥ 6C/η and

δ/αNT =(NT )2αδ

NT>

1

NT.

By Lemma C.3, we have

PrD

(sup

v1≤v≤v1+δ‖RNT (v)−RNT (v1)| > η) = PrD( sup

γ1≤γ≤γ1+δ/αNT

‖JNT (γ)− JNT (γ1)‖ > η√αNT

)

≤ C2C2D(δ/αNT )2

η4/α2NT

=C2C

2Dδ

2

η4,

if√NT ≥ 6CD/η, and δ/αNT ≥ (NT )−1. Therefore, conditional on the event CD ≤ C, we have that

supv1≤v≤v1+δ ‖RNT (v)−RNT (v1)‖ > η occurs with probability less than δε/2. To sum up, we can concludethat

Pr(

supv1≤v≤v1+δ

‖RNT (v)−RNT (v1)‖ > η)≤ δε.

This implies that RNT (v) is tight. Hence, we have RNT (v)⇒ B(v).

Proof of Lemma A.11. We show the first claim. Fix v ∈ Υ. Following the analysis of VNT (v) in the proof LemmaA.10, we can show that

E[GNT (v)

]=αNTNT

N∑i=1

T∑t=1

C0′E(

[xitx

′itdit(γ

0 +v

αNT)

]− E

[xitx

′itdit(γ

0)]

C0 → g |v|

as (N,T )→∞. For Var[GNT (v)], we have

Var[GNT (γ)

]= α2

NTVar(GNT (γ0 +v

αNT))

= α2NTE

[VarDGNT (γ0 +

v

αNT)

]+ α2

NTVar[EDGNT (γ0 +

v

αNT)

].

The first term on the RHS of the last equation is bounded by∥∥C0

∥∥4αNTC1E(CD)v/(NT )→ 0 by Lemma A.6. For

the second term, by the Assumption A.7, we have Var(√NTEDGNT (γ0 + v/αNT )) ≤

∥∥C0∥∥4ME[‖xit‖4 d∗it(v)]

for some M <∞. Therefore, we have GNT (v)p→ g |v| for each v ∈ Υ.

Now, we write Υ = Υ1 ∪Υ2, where Υ1 = [−v, 0] and Υ2 = [0, v]. Since GNT (v) is monotonically increasing

on Υ2 and the limit function is continuous, the convergence is uniform over Υ2. A symmetric argument yields

uniformity over Υ1. It follows that GNT (v)p→ g |v| uniformly over v ∈ Υ.

24

Page 78: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Analogously, we can show that KNT (v)p→ tr(Df ) |v| uniformly over v ∈ Υ.

Proof of Lemma A.12. By (A.2), we have

LNT (v) = L(θ, Λ, γ0)− L(θ, Λ, γ0 +v

αNT)

=[L1(θ, Λ, γ0)− L1(θ, Λ, γ0 +

v

αNT)]

+[L2(θ, Λ, γ0)− L2(θ, Λ, γ0 +

v

αNT)]

= −GNT (v) + 2C0′RNT (v)

+ 2(NT )α(δ − δ0)′RNT (v)− 2

T∑t=1

δ′Xt(γ0 +

v

αNT, γ0)′PΛet

− (δ − δ0)′T∑t=1

Xt(γ0 +

v

αNT, γ0)′MΛXt(γ

0 +v

αNT, γ0)(δ + δ0)

+ 2

T∑t=1

δ′Xt(γ0 +

v

αNT, γ0)′MΛXt,γ0(θ − θ0)

− 2

T∑t=1

δ′Xt(γ0 +

v

αNT, γ0)′(PΛ − PΛ0)Λ0ft

≡ −GNT (v) + 2C0′RNT (v) + L1 (v) + · · ·+ L5 (v) , say.

By Lemmas A.10 and A.11, we have

−GNT (v) + 2C0′RNT (v)⇒ 2C0′B (v)− g |v| = 2√hW (v)− g |v| ≡ Q(v),

where h = C0′V 0f C

0 and W (·) is the standard Brownian motion on the real line. It suffices to show supv |Ls(v)| =op(1) for s = 1, 2, . . . , 5. Noting that (NT )α(δ − δ0) = Op((NT )α−1/2) = op(1) and supv ‖RNT (v)‖ = Op (1),we have

supv|L1(v)| ≤ 2(NT )α‖δ − δ0‖ sup

v‖RNT (v)‖ = op(1).

For L2 (v) , we have

|L2 (v)| =

∣∣∣∣∣K∑k=1

δktr[Xk(γ0 +

v

αNT, γ0)′PΛe

]∣∣∣∣∣≤

K∑k=1

|δk|∣∣∣∣tr[Xk(γ0 +

v

αNT, γ0)PΛ0e′]

∣∣∣∣+

K∑k=1

|δk|∣∣∣∣tr[Xk(γ0 +

v

αNT, γ0)(PΛ0 − PΛ)e′]

∣∣∣∣ .By Lemma C.5(i)-(ii) and the fact that ‖δ‖ = Op((NT )−α), we have supv |L2 (v)| = op(1).

Next, we consider L3 (v). Note that

supv|L3 (v)| ≤ sup

v

∣∣∣∣∣(δ − δ0)′T∑t=1

Xt(γ0 +

v

αNT, γ0)′MΛXt(γ

0 +v

αNT, γ0)(δ + δ0)

∣∣∣∣∣≤ Op((NT )−1/2−α) sup

v

∥∥∥∥∥T∑t=1

Xt(γ0 +

v

αNT, γ0)′MΛXt(γ

0 +v

αNT, γ0)

∥∥∥∥∥≤ Op((NT )−1/2+α) sup

vKNT (v) = op(1)Op(1) = op(1),

25

Page 79: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

where recall that KNT (v) = αNTKNT (γ0 + v/αNT ) and KNT (·) is defined in the proof of Theorem 3.3. ForL4 (v), we have

L4 (v) =

T∑t=1

δ′Xt(γ0 +

v

αNT, γ0)′MΛXt,γ0(θ − θ0)

=

2K∑k=1

K∑k′=1

δk′(θk − θ0k)tr[Xk′(γ

0 +v

αNT, γ0)′MΛXk,γ0

].

By Lemma C.5(iii)-(iv) and the fact that (NT )1/2+αδk′(θk − θ0k) = Op (1), we can show supv |L4 (v)| = op(1).

Now, we consider L5 (v) . Note that

L5 (v) =

K∑k=1

δktr[F 0′X′k(γ0 +

v

αNT, γ0)(PΛ − PΛ0)Λ0

].

By Lemma C.5(v) and the fact that (NT )αδk′ = Op (1), we can show supv |L5 (v)| = op(1). After bounding all the

terms, we have the result desired.

Proof of Lemma A.13. Note that

L(θγ0 , Λγ0 , γ0)− L(θ, Λ, γ0) =[L(θγ0 , Λγ0 , γ0)− L(θγ0 , Λ, γ0)

]+[L(θγ0 , Λ, γ0)− L(θ, Λ, γ0)

].

We need to show (i) L(θγ0 , Λ, γ0)−L(θ, Λ, γ0) = op(1) and (ii) L(θγ0 , Λγ0 , γ0)−L(θγ0 , Λ, γ0) = op(1). We firstconsider (i). By definition,

L(θγ0 , Λ, γ0)− L(θ, Λ, γ0) = (θ − θγ0)′T∑t=1

X ′t,γ0MΛXt,γ0(θ − θγ0) + 2(θ − θγ0)′T∑t=1

X ′t,γ0MΛXt,γ0(θ0 − θ)

+ 2(θ − θγ0)′T∑t=1

Xt,γ0MΛΛ0f0t + 2(θ − θγ0)′

T∑t=1

Xt,γ0MΛet.

By Theorem 3.4 and the corresponding results in Bai (2009) and Moon and Weidner (2017), we have√NT (θγ0 −

θ) = op(1). Given this result, together with the analysis in Lemma A.4 (ii)-(vi), we can readily show that each of the

four terms on the RHS is op(1).Next, we consider (ii). We need to establish the asymptotic order of ||Λγ0 − Λ||. Recall that Λ and Λγ0 are the

first R0 eigenvectors of Res(θ, γ)Res(θ, γ)′ at (θ, γ) and (θγ0 , γ0) respectively, where Res(θ, γ) ≡ Y − β X−δ X(γ). Now it is seen that

1√NT‖Res(θγ0 , γ0)−Res(θ, γ)‖sp ≤

1√NT

∥∥∥(β− βγ0)X+ (δ− δγ0)Xγ0 − δX(γ, γ0)∥∥∥

sp= op(

1√NT

)

by the fact that θγ0 − θ = op((NT )−1/2), γ − γ0 = Op((NT )2α−1) and δ = Op((NT )−α). By Davis-Kahan’s(1970) sin(Θ) theorem, in combination with the fact that the R0-largest eigenvalue of 1

NTRes(θ, γ)′Res(θ, γ)′

converges to the minimum eigenvalue of ΣfΣλ, we have

‖PΛγ0− PΛ‖ = op(

1√NT

).

26

Page 80: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

By the Weyl’s inequality,

‖VNT − VNT,γ0‖ ≤ 1√NT‖Res(θγ0 , γ0)−Res(θ, γ)‖sp = op(

1√NT

).

where VNT,γ0 is the diagonal matrix whose diagonal elements are the R0 largest eigenvalues of the matrix (NT )−1

Res(θγ0 , γ0)Res(θγ0 , γ0)′. With the above results, we can show that H −Hγ0 = op((NT )−1/2), where Hγ0 is the

rotational matrix when γ is set to γ0.Then we use decomposition in (A.1) to obtain |L(θ, Λγ0 , γ0)− L(θγ0 , Λ, γ0)| ≤

∑6`=1 ∆L`, where

∆L1 ≡∣∣∣tr[Ψ(θγ0 , γ0)′(PΛγ0

− PΛ)Ψ(θγ0 , γ0)]∣∣∣, ∆L2 ≡

∣∣∣tr(F 0′F 0)Λ0′(MΛγ0−MΛ)Λ0

∣∣∣,∆L3 ≡ 2

∣∣∣tr[F 0Λ0′(PΛγ0− PΛ)Ψ(θ, γ)]

∣∣∣, ∆L4 ≡ 2∣∣∣tr[F 0Λ0′(MΛγ0

−MΛ)e]∣∣∣,

∆L5 ≡ 2∣∣∣tr[Ψ(θγ0 , γ0)′(PΛγ0

− PΛ)e]∣∣∣, ∆L6 ≡

∣∣∣tr[e′(PΛγ0− PΛ)e]

∣∣∣.Noting that ||Ψ(θγ0 , γ0)|| = Op(1) and by the fact θγ0 − θ0 = Op((NT )−1/2) and ||e||sp = Op(

√N +

√T ), we

can easily establish the claims that ∆L1, ∆L3. ∆L5 and ∆L6 are all op(1). For ∆L4, we have

∆L4 = 2|tr[F 0′e′MΛγ0(Λ0 − Λγ0H−1)− F 0′e′MΛ(Λ0 − ΛH−1)]|

≤ 2|tr[F 0′e′MΛ(Λγ0H−1 − ΛH−1)]|+ 2|tr[F 0′e′(PΛγ0− PΛ)(Λ0 − ΛH−1)]|

≤ 2||F 0′e′MΛγ0

||√NT

√NT ||Λγ0H−1 − ΛH−1||+ 2

||F 0′e′||√NT

||PΛγ0− PΛ|| · ||Λ

0 − ΛH−1||

= op(1).

By the same token, we can show that ∆L2 = op(1). This completes the proof.

C Some Additional Useful Lemmas

Lemma C.1 Suppose that Assumptions A.5 holds. Then there exists a variable Cit,D that depends only on (i, t)

and D such thatEDh

rit(γ1, γ2) ≤ Cit,D |γ2 − γ1| and EDkrit(γ1, γ2) ≤ Cit,D |γ2 − γ1| ,

and there exists a variable CD such that

1

NT

N∑i=1

T∑t=1

EDhrit(γ1, γ2) ≤ CD |γ2 − γ1| and

1

NT

N∑i=1

T∑t=1

EDkrit(γ1, γ2) ≤ CD |γ2 − γ1| .

In addition, there is a constant C <∞ such that Pr(CD < C)→ 1.

Proof of Lemma C.1: (i) For any random variable Z with finite first moment, we have

ED [Zdit(γ)] = ED [Z · 1qit ≤ γ] = ED [1qit ≤ γED(Z|qit)] =

∫ γ

−∞ED(Z|qit)dFit,D(qit),

where Fit,D (·) denotes the conditional distribution of qit given D with the corresponding PDF given by fit,D(·).Taking derivative with respect to γ on both sides yields

d

dγED [Zdit(γ)] = ED(Z|qit = γ)fit,D(γ).

27

Page 81: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Then under Assumption A.5, for any r ≤ 4

d

dγED [‖xiteit‖r dit(γ)] = ED [‖xiteit‖r |qit = γ] fit,D(γ)

≤ [ED(‖xiteit‖4 |qit = γ)]r/4fit,D(γ) ≤Mit,Dcf .

Let Cit,D = Mit,Dcf . One can immediately show

ED [hrit(γ1, γ2)] ≤ Cit,D |γ2 − γ1| .

Identical arguments gives ED [krit(γ1, γ2)] ≤ Cit,D |γ2 − γ1| .(ii) Let CD = 1

NT

∑Ni=1

∑Tt=1 Cit,D and C = cfM, as in Assumption A.5(ii). We have the desired result.

Lemma C.2 Suppose that Assumptions A.4-A.5 hold. Then there exists a constant C1 such that

ED

∣∣∣ 1√NT

N∑i=1

T∑t=1

[h2it(γ1, γ2)− EDh2

it(γ1, γ2)]∣∣∣2 ≤ C1CD|γ2 − γ1| and (C.1)

ED

∣∣∣ 1√NT

N∑i=1

T∑t=1

[k2it(γ1, γ2)− EDk2

it(γ1, γ2)]∣∣∣2 ≤ C1CD|γ2 − γ1|, (C.2)

where CD is defined in Lemma C.1.

Proof of Lemma C.2: By the conditional independence across i and strong mixing over t, there is a K1 such that

ED

∣∣∣ 1√NT

N∑i=1

T∑t=1

[h2it(γ1, γ2)− EDh2

it(γ1, γ2)]∣∣∣2 =

1

N

N∑i=1

ED

∣∣∣ 1√T

T∑t=1

[h2it(γ1, γ2)− EDh2

it(γ1, γ2)]∣∣∣2

≤ 1

N

N∑i=1

K11

T

T∑t=1

ED[h2it(γ1, γ2)− EDh2

it(γ1, γ2)]2

≤ K12

NT

N∑i=1

T∑t=1

EDh4it(γ1, γ2)

≤ 2K1CD |γ2 − γ1| .

Letting C1 = 2K1, the first result follows. The proof for the second result is similar.

Lemma C.3 Suppose that Assumptions A.4-A.5 hold. For δ ≥ (NT )−1, there is a finite constant C2 such that for

all γ1 ∈ Γ, η > 0, and δ ≥ (NT )−1, if

√NT ≥ 6CD/η, then

PrD

(sup

γ1≤γ≤γ1+δ‖JNT (γ)− JNT (γ1)‖ > η

)≤ C2C

2Dδ

2

η4.

Proof of Lemma C.3: Let m be an integer satisfying NTδ/2 ≤ m ≤ NTδ, which is possible since NTδ ≥ 1.Set δm = δ/m. Let hit(γ, γ′) = ‖xiteit‖ |dit(γ′)− dit(γ)|. For k = 1, . . . ,m + 1, set γk = γ1 + (k − 1)δm,hit,k = hit(γk, γk+1), and hit,jk = hit(γk, γj). Letting HNT,k = 1

NT

∑Ni=1

∑Tt=1 hit,k, we observe that for

γk ≤ γ ≤ γk+1,

‖JNT (γ)− JNT (γk)‖ =∥∥∥ 1√

NT

N∑i=1

T∑t=1

xiteit [dit(γ)− dit(γk)]∥∥∥

28

Page 82: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

≤ 1√NT

N∑i=1

T∑t=1

‖xiteit‖ |dit(γ)− dit(γk)|

≤ 1√NT

N∑i=1

T∑t=1

‖xiteit‖ |dit(γk+1)− dit(γk)|

=√NTHNT,k ≤

√NT |HNT,k − EDHNT,k|+

√NTEDHNT,k.

It follows that

supγ1≤γ≤γ1+δ

‖JNT (γ)− JNT (γ1)‖ ≤ max1≤k≤m

supγ∈[γk,γk+1]

‖JNT (γk)− JNT (γ1) + JNT (γ)− JNT (γk)‖

≤ max2≤k≤m+1

‖JNT (γk)− JNT (γ1)‖+ max1≤k≤m

√NTEDHNT,k

+ max1≤k≤m

√NT |HNT,k − EDHNT,k| . (C.3)

In the following analysis, we shall bound the three terms on the RHS of the above equation.First, we consider ED ‖JNT (γk)− JNT (γj)‖4. By the Burkholder’s inequality (e.g., Hall and Heyde (1980,

p.23)), there is some K2 <∞ :

ED ‖JNT (γk)− JNT (γj)‖4 = ED

∥∥∥∥∥ 1√NT

N∑i=1

T∑t=1

xiteit [dit(γk)− dit(γj)]

∥∥∥∥∥4

≤ K2ED

∣∣∣∣∣ 1

NT

N∑i=1

T∑t=1

‖xiteit‖2 |dit(γk)− dit(γj)|

∣∣∣∣∣2

= K2ED

∣∣∣∣∣ 1

NT

N∑i=1

T∑t=1

h2it,jk

∣∣∣∣∣2

= K2ED

∣∣∣∣∣ 1

NT

N∑i=1

T∑t=1

(h2it,jk − EDh2

it,jk

)+

1

NT

N∑i=1

T∑t=1

EDh2it,jk

∣∣∣∣∣2

.

By Cauchy inequality, the last expression is bounded by

2K2

ED∣∣∣∣∣ 1

NT

N∑i=1

T∑t=1

(h2it,jk − EDh2

it,jk

)∣∣∣∣∣2+ (CD |k − j| δm)

2

≤ 2K2

[(C1CD |k − j| δm

NT

)+ (CD |k − j| δm)

2

]≤ 2K2(

√C1CD + CD)2 |k − j|2 δ2

m

≤ 2K2(√C1 + 1)2C2

D |k − j|2δ2m,

where we use the fact CD > 1. By Markov’s inequality,

PrD(‖JNT (γk)− JNT (γj)‖ > η) ≤ 2K2(√C1 + 1)2C2

D (|k − j| δm)2

η4.

The above inequality satisfies the condition of Theorem 10.2 of Billingsley (1999) with α = β = 1 and ul =√K2(√C1 + 1)CDδm. It follows that there exists a constant K3 such that

PrD( max2≤k≤m+1

‖JNT (γk)− JNT (γ1)‖ > η/3) ≤ 81K3K2(√C1 + 1)2C2

Dδ2

η4,

29

Page 83: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

which bounds the first term on the RHS of (C.3).Next, we consider the second term on the RHS of (C.3). Using Lemma C.2, we have

√NTEDHNT,k =

√NT

1

NT

N∑i=1

T∑t=1

EDhit,k ≤√NTCDδm ≤ 2CD/

√NT,

where the last inequality is by the fact δm ≤ 2/(NT ).

Finally, we consider the last term on the RHS of (C.3). By Theorem 2.1 of Utev and Peligrad (2003), there is aK4 <∞ such that

ED

∣∣∣√NT (HNT,k − EDHNT,k)∣∣∣4 ≤ K4

(NT )2

N∑i=1

T∑t=1

EDh4it,k +

(N∑i=1

T∑t=1

EDh2it,k

)2

≤ K4

[CDδmNT

+ (CDδm)2

]≤ K4(CD + C2

D)δ2m ≤ 2K4C

2Dδ

2m,

where the third inequality is by the fact 1/(NT ) < δm. Then by Markov inequality

PrD

(max

1≤k≤m

√NT |HNT,k − EDHNT,k| > η/3

)≤ 81m

2K4C2Dδ

2m

η4≤ 81 · 2K4C

2Dδ

2

η4,

where we used the facts mδm = δ and δm < δ.To sum up, when 2CD/

√NT ≤ η/3, we have

PrD

(sup

γ1≤γ≤γ1+δ‖JNT (γ)− JNT (γ1)‖ > η

)≤

81[K3K2(

√C1 + 1)2 + 2K4

]C2Dδ

2

η4.

Letting C2 = 81[K3K2(

√C1 + 1)2 + 2K4

], we have the result desired.

Lemma C.4 Suppose Assumptions A.2 and A.4 hold. Then we have:

(i) 1NT

∑Ni=1

∑Tt=1[e2

itzit,γ z′it,γ − ED(e2

itzit,γ z′it,γ)] = op(1);

(ii) 1NT

∑Ni=1

∑Tt=1 e

2it(zit,γ z

′it,γ − zit,γz′it,γ) = op(1).

Proof of Lemma C.4: (i) Conditional on D, e2itzit,γ z

′it,γ − ED(e2

itzit,γ z′it,γ) has mean zero, is uncorrelated across

i, and strong mixing over t. Then we have

ED

1

NT

N∑i=1

T∑t=1

[e2itzk,it,γ zl,it,γ − ED(e2

itzk,it,γZl,it,γ)]

2

=1

N2

N∑i=1

ED

1

T

T∑t=1

[e2itzk,it,γ zl,it,γ − ED(e2

itzk,it,γ zl,it,γ)]

2

=1

N2T 2

N∑i=1

T∑t=1

T∑s=1

CovD(e2itzk,it,γ zl,it,γ , e

2iszk,is,γ zl,is,γ) = Op((NT )−1),

where the last equality is obtained by a simple application of the Davydov inequality for conditional mixing process-

es. The result follows from the conditional Chebyshev inequality.

30

Page 84: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

(ii) Define the K ×K matrix: A = 1NT

∑Ni=1

∑Tt=1 e

2it(zit,γ + zit,γ)(zit,γ − zit,γ)′. Then we have

1

NT

N∑i=1

T∑t=1

e2it(zit,γ z

′it,γ − zit,γz′it,γ) =

1

2(A+A′).

LetBk be theN×T matrix with (i, t)th elementBit,k = e2it(zk,it,γ+zk,it,γ) where zk,it,γ and zk,it,γ denote the kth

element of zit,γ and zit,γ , respectively. We have ‖Bk‖ = Op(√NT ) since maxi,tE ‖Bit,k‖2 is finite. The (l, k)th

entry of A can be written as Alk = 1NT trBl(Zk,γ − Zk,γ). Therefore, we have

|Alk| ≤1

NT‖Bl‖

∥∥∥Zk,γ − Zk,γ

∥∥∥ = Op((NT )−1/2)∥∥∥Zk,γ − Zk,γ

∥∥∥ .Now, we have

Zk,γ − Zk,γ = Xk,γ −MΛXk,γMF 0 = Xk,γ − (IN − PΛ)Xk,γ(IT − PF 0)

= PΛXk,γPF 0 − Xk,γPF 0 − PΛXk,γ .

Noting that ‖Xk,γPF 0‖ = op(√NT ) and ‖PΛXk,γ‖ = op(

√NT ), we have ‖Zk,γ − Zk,γ‖ = op(

√NT ) and we

have |Alk| = op(1).

Lemma C.5 Suppose Assumptions A.1-A.6 hold, and N/T → κ > 0. Then for k = 1, . . . ,K and k′ = 1, . . . , 2K,

we have

(i) supv(NT )−α∣∣tr[Xk(γ0 + v/αNT , γ

0)′(PΛ0 − PΛ)e]∣∣ = op(1);

(ii) supv(NT )−α∣∣tr[Xk(γ0 + v/αNT , γ

0)′PΛ0e]∣∣ = op(1);

(iii) supv(NT )−1/2−α∣∣tr[Xk(γ0 + v/αNT , γ

0)′MΛ0Xk′,γ0 ]∣∣ = op(1)

(iv) supv(NT )−1/2−α∣∣tr[Xk(γ0 + v/αNT , γ

0)′(PΛ0 − PΛ)Xk′,γ0 ]∣∣ = op(1)

(v) supv(NT )−αtr[F 0′X′k(γ0 + v/αNT , γ

0)(PΛ − PΛ0)Λ0]

= op(1)

Proof of Lemma C.5: (i) We can show that (NT )−2α∥∥Xk(γ0 + v/αNT , γ

0)∥∥2

= Op(1) uniformly for v. Thenwe have

(NT )−α∣∣∣∣tr [Xk(γ0 +

v

αNT, γ0)′(PΛ − PΛ0)e

]∣∣∣∣ ≤ (NT )−α∥∥Xk(γ0 + v/αNT , γ

0)∥∥ ||PΛ − PΛ0 ||||e||sp

= Op(1) ·Op(π−2NT ) ·Op(πNT ) = op(1),

uniformly for v.

(ii) The analysis is the same as that of Lemma A.1 (iv).(iii) By triangular inequality, we have∣∣∣∣tr[Xk(γ0 +

v

αNT, γ0)′MΛ0Xk′,γ0 ]

∣∣∣∣ ≤ ∣∣∣∣tr[Xk(γ0 +v

αNT, γ0)′Xk′,γ0 ]

∣∣∣∣+

∣∣∣∣tr[Xk(γ0 +v

αNT, γ0)′PΛ0Xk′,γ0 ]

∣∣∣∣ .31

Page 85: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

For the first term, we have ∣∣∣∣tr[Xk(γ0 +v

αNT, γ0)′Xk′,γ0 ]

∣∣∣∣ ≤ N∑i=1

T∑t=1

‖xit‖2 |d∗it(v)|

= Op((NT )2α),

uniformly. Then we have (NT )−1/2−α supv∣∣tr[Xk(γ0 + v/αNT , γ

0)′Xk′,γ0 ]∣∣ = Op((NT )−1/2+α) uniformly.

For the second term,∣∣∣tr[Xk(γ0 + v/αNT , γ0)′PΛ0Xk′,γ0 ]

∣∣∣ ≤ 1

N

∥∥Λ0′Xk′,γ0Xk(γ0 + v/αNT , γ0)′Λ0

∥∥∥∥∥(Λ0′Λ0

N

)−1∥∥∥.It is easy to show that supv N

−1∥∥Λ0′Xk′,γ0Xk(γ0 + v/αNT , γ

0)′Λ0∥∥ = Op((NT )2α) uniformly. Then we have

the second term bounded by Op((NT )−1/2+α) and completes the proof.

(iv) The proof is similar to that of (i) and omitted here.(v) By the decomposition (A.6), we have

(NT )−α∣∣∣tr[F 0′X′k(γ0 +

v

αNT, γ0)(PΛ − PΛ0)Λ0

]∣∣∣≤ (NT )−α

∣∣∣tr[F 0′X′k(γ0 +v

αNT, γ0)(Λ− Λ0H)

(Λ− Λ0H)′Λ0

N

]∣∣∣+(NT )−α

∣∣∣tr[F 0′X′k(γ0 +v

αNT, γ0)(Λ0H)

(Λ− Λ0H)′Λ0

N

]∣∣∣+(NT )−α

∣∣∣tr[F 0′X′k(γ0 +v

αNT, γ0)(Λ− Λ0H)

(Λ0H)′Λ0

N

]∣∣∣+(NT )−α

∣∣∣tr[F 0′X′k(γ0 +v

αNT, γ0)Λ0

[HH ′ −

(N−1Λ0′Λ0

)−1]′Λ0]∣∣∣.

For the first and third terms on the RHS, we can apply the decomposition in A.4 for (Λ − Λ0H). This analysis

is tedious and omitted here. It is easy to show that supv∥∥F 0′X′k(γ0 + v/αNT , γ

0)Λ0∥∥ = Op((NT )2α). Then

the second term on the RHS can be bounded by (NT )−α supv∥∥F 0′X′k(γ0 + v/αNT , γ

0)Λ0∥∥ ||(Λ−Λ0H)′Λ0/N ||

= Op((NT )απ−2NT ) = op(1). Similarly we have the fourth term bounded by some op(1) term uniformly.

D Determination of the number of factors

In this section, we study the determination of the number of factors discussed in Section 3.6. See Section F for some

simulation results on the performance of the proposed method.

D.1 Consistency of the method

Let Θψ and R be as defined in Section 3.6. We first study the asymptotic property of Θψ and then show that Rp→ R0

under some regularity conditions. As stated in section 4.1 of Moon and Weidner (2018), we can first show the

consistency of Θψ before establishing consistency of (θψ, γψ). This fact is beneficial in our panel threshold model

with the IFEs as the analysis of (θψ, γψ) is challenging. Moreover, we will show the method is uniformly valid in

the sense that it is robust to the size of the threshold effect.

32

Page 86: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Following the literature, we need to impose a restricted strong convexity condition. Define the following spaceof N × T matrices:

S = (β − β0)X+δ X(γ)− δ0 X(γ0) : θ = (β′, δ′)′ ∈ R2K , γ ∈ Γ.

Assumption A.1∗. (Restricted Strong Convexity) For some c > 0, define the restricted low-rank set to be

C(c) ≡ ∆ ∈ RN×T : ||∆−MΛ0∆MF 0 ||∗ ≤ c||MΛ0∆MF 0 ||∗.

If ∆1 ∈ C(c) for some c > 0, then there is a constant κc > 0 such that inf∆2∈S ||∆1 + ∆2||2 ≥ κc||∆1||2.The cone C(c) is a space of matrices that are close to Λ0F 0′. In the case where γ0 is known, we effectively

have a linear panel data model so that the above assumption will be the same as Assumption 1 of Moon and Weidner

(2018). We will briefly discuss the sufficient conditions to ensure the restricted strong convexity below.

Proposition D.1 Let ψ (N−1/2 + T−1/2). Suppose Assumption A.1∗, A.2 and A.4 hold. Then we have: (i)

||Θψ −Θ0|||sp = Op(√N +

√T ) and (ii) R

p→ R0.

Remark. As we know that the first R0 singular values of Θ0 are of the order Op(√NT ) and other singular values

are zero. Proposition D.1(i) implies that the first R0 singular values of Θψ are also Op(√NT ), and the others are

Op(√N +

√T ). Hence, one can find an appropriate singular value thresholding parameter χNT to determine R

consistently.Proof of Proposition D.1. (i) Define two N × T matrices

∆1,ψ ≡ (βψ − β0)X+δψ X(γψ)− δ0 X(γ0) and ∆2,ψ ≡ Θψ −Θ0.

The singular value decompositions of Θ0 and MΛ0∆2,ψMF 0 can be written as

Θ0 = Λ0F 0′ = Λ1Σ1F′1 and MΛ0∆2,ψMF 0 = Λ⊥1 Σ2F

⊥′1 ,

where Λ1 ∈ RN×R0

, F1 ∈ RT×R0

, Λ⊥1 ∈ RN×(min(N,T )−R0), F⊥1 ∈ RT×(min(N,T )−R0), Σ1 =diag(σ1, . . . , σR0)

and Σ2 =diag(σR0+1, . . . , σmin(N,T )). In addition, we have Λ′1Λ⊥1 = 0 and F ′1F⊥1 = 0. Hence, we have

||Θ0 + MΛ0∆2,ψMF 0 ||∗ = ||Θ0||∗ + ||MΛ0∆2,ψMF 0 ||∗. (D.1)

Then

||Θψ||∗ = ||Θ0 + MΛ0∆2,ψMF 0 + ∆2,ψ −MΛ0∆2,ψMF 0 ||∗≥ ||Θ0 + MΛ0∆2,ψMF 0 ||∗ − ||∆2,ψ −MΛ0∆2,ψMF 0 ||∗= ||Θ0||∗ + ||MΛ0∆2,ψMF 0 ||∗ − ||∆2,ψ −MΛ0∆2,ψMF 0 ||∗.

By definition, Ln,ψ(θψ, γψ, Θψ) ≤ Ln,ψ(θ0, γ0,Θ0). This implies that

0 ≥ 1

2NT||∆1,ψ + ∆2,ψ||2 −

1

NTtr[U ′(∆1,ψ + ∆2,ψ)] +

ψ√NT

(||Θψ||∗ − ||Θ0||∗). (D.2)

33

Page 87: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

It is easy to see that 1NT |tr(U

′∆1,ψ)| ≤ C/√NT and

1

NT|tr(U ′∆2,ψ)| ≤ ||U ||sp

1

NT||∆2,ψ||∗ ≤

ψ

2√NT||∆2,ψ||∗

with probability approaching 1 for some C <∞. This, along with (D.1)-(D.2), implies that

C√NT

+3ψ

2√NT||∆2,ψ −MΛ0∆2,ψMF 0 ||∗ ≥

1

2NT||∆1,ψ + ∆2,ψ||2 +

ψ

2√NT||MΛ0∆2,ψMF 0 ||∗ (D.3)

Next we discuss two cases: (i1) ||∆2,ψ − MΛ0∆2,ψMF 0 ||∗ + ||MΛ0∆2,ψMF 0 ||∗ ≤ 4C/ψ and (i2) ||∆2,ψ −MΛ0∆2,ψMF 0 ||∗ + ||MΛ0∆2,ψMF 0 ||∗ > 4C/ψ.

In the case (i1), one can obtain ||∆2,ψ||sp ≤ ||∆2,ψ||∗ ≤ 4C/ψ = O(√N +

√T ) by the triangular inequality.

In the case (i2), the inequality in (D.3) gives us

4√NT||∆2,ψ −MΛ0∆2,ψMF 0 ||∗ ≥

1

2NT||∆1,ψ + ∆2,ψ||2 +

ψ

4√NT||MΛ0∆2,ψMF 0 ||∗.

Then we have ∆2,ψ ∈ C(7). By Assumption A.1∗, we can obtain

7ψ√NT||∆2,ψ −MΛ0∆2,ψMF 0 ||∗ ≥

κ7

NT||∆2,ψ||2.

Because rank(∆2,ψ −MΛ0∆2,ψMF 0) ≤ 2R0, we have that

||∆2,ψ −MΛ0∆2,ψMF 0 ||∗ ≤√

2R0||∆2,ψ −MΛ0∆2,ψMF 0 || ≤√

2R0||∆2,ψ||.

It follows that

||∆2,ψ||sp ≤ ||∆2,ψ|| ≤7√

2R0

κ7ψ√NT = Op(

√N +

√T ).

Hence, in both cases, we have ||∆2,ψ||sp = Op(√N +

√T ).

(ii) By Weyl’s inequality and (i), we can conclude that

sk(Θψ) = sk(Θ0) +Op(√N +

√T ) for k = 1, . . . ,min(N,T ).

We consider two cases: (ii1) R0 = 0, and (ii2) R0 > 0. In case (ii1), we have Θ0 = 0, ||Θψ||sp = Op(√N +√

T ) and sr(Θψ) = Op(√N +

√T ) for r = 1, . . . ,min(N,T ). Noting that ψ ( N−1/2 + T−1/2), we

have (√NTψ||Θψ||sp)1/2

√N +

√T . Then with probability approaching one, we have s1(Θψ) < log(N +

T )(√NTψ||Θψ||sp)1/2 ≡ χNT . It follows that

Pr(R = R0 = 0)→ 1.

In case (ii2), we have that sr(Θ0) ≥ c√NT for r = 1, . . . , R0 and some c > 0, and sr(Θ0) = 0 for r > R0. Then

sR0(Θψ) √NT, sR0+1(Θψ)

√N +

√T and ||Θψ||sp

√NT.

It follows that (√NTψ||Θψ||sp)1/2 N1/2T 1/4 +N1/4T 1/2,

Pr(sR0(Θψ) ≥ log(N + T )(√NTψ||Θψ||sp)1/2)→ 1 and sR0+1(Θψ) < log(N + T )(

√NTψ||Θψ||sp)1/2)→ 1.

Then Pr(R = R0)→ 1.

34

Page 88: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

D.2 Sufficient condition for restricted strong convexity

In this subsection, we discuss the sufficient condition for the restricted strong convexity imposed in AssumptionA.1∗. In the linear panel regression case of Moon and Weidner (2018), they present a lemma (their Lemma A.8) toprovide sufficient condition for their version of restricted strong convexity. Here we present their Lemma A.8 withminor modification and then extend it to our framework.

Lemma D.1 (Lemma A.8 of Moon and Weidner (2018)) Let s1, . . . , smin(N,T ) denote the singular values ofMΛ0∆2MF 0 . Assume that there exists a sequence qNT ≥ 2 such that

(i) ||∆2|| = Op(1);

(ii)∑min(N,T )r=qNT

s2r ≥ κc > 0 with probability approaching one;

(iii)∑qNT−2r=1 (sr − qNT )

p→∞.Then inf∆1∈C(c)||∆1 + ∆2||2 ≥ κc with probability approaching one.

The above lemma is applicable to the ‘high-dimensional’ matrix ∆2,where many sr(∆2)’s are of orderO(√

max(N,T )).The detailed discussion can be found in Section A.6 of Moon and Weidner (2018).

Let S∗ ≡ a∆ ∈ S , a ∈ R be a set that contains scaled elements of S. We state our sufficient condition in thenext lemma.

Lemma D.2 Suppose for all ∆2 ∈ S∗with ||∆2|| ∈ [1 − √κc, 1 +√κc], condition (ii) and (iii) of Lemma D.1 are

satisfied. Then Assumption A.1∗ holds.

Proof of Lemma D.2. It suffices to show that inf∆1∈C(c)inf∆2∈S ||∆1+∆2||2

||∆1||2 is bounded below by κc. Noting thatS ⊂ S∗ and ∆ ∈ S∗ if and only if a∆ ∈ S∗ for all a ∈ R,

inf∆1∈C(c)inf∆2∈S ||∆1 + ∆2||2

||∆1||2≥ inf∆1∈C(c)

inf∆2∈S∗ ||∆1 + ∆2||2

||∆1||2

= inf∆1∈C(c),||∆1||=1inf∆2∈S∗ ||∆1 + ∆2||2.

By the triangular inequality, we have that

||∆1 + ∆2|| ≥ max||∆1|| − ||∆2||, ||∆2|| − ||∆1|| = max1− ||∆2||, ||∆2|| − 1.

Then we have with probability approaching one

inf∆1∈C(c),||∆1||=1inf∆2∈S∗ ||∆1 + ∆2||2 ≥ mininf∆1∈C(c),||∆1||=1inf∆2∈S∗,||∆2||∈[1−√κc,1+√κc]||∆1 + ∆2||2, κc

≥ mininf∆2∈S∗,||∆2||∈[1−√κc,1+√κc]inf∆1∈C(c)||∆1 + ∆2||2, κc

= κc,

where the last equality holds by Lemma D.1.

E Test IFEs Against Two-Way Additive Fixed Effects

In this section, we consider testing the IFEs against the two-way additive fixed effects. When the number of factors islarger than two, the additive fixed effects cannot capture the unobserved heterogeneity in the IFEs. When the numberof factors equals one, the additive fixed effects should be either individual or time fixed effects, and one can adoptthe sup-type tests of Castagnetti, Rossi and Trapani (2015) to perform the task. Notice that the latter sup-type test isnot applicable when the number of factors is 2. In fact, to the best of our knowledge, there exists no theoretical work

35

Page 89: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

that considers testing the hypothesis H0 : λ0′i f

0t = α0

i + ν0t , i.e., the two-way additive fixed effects specification

is sufficient to capture the unobserved heterogeneity captured by the IFEs. Below we introduce a new test statisticand establish its asymptotic null distribution by assuming that the true number of factors is known and given by 2(R0= 2). Apparently, the asymptotic theories developed here have independent interests, and are useful in otherpanel data models with IFEs. See Section F for some simulation results on the performance of the proposed test.

E.1 Hypotheses and test statistic

The null hypothesis isH0 : λ0′

i f0t = α0

i + ν0t for all (i, t),

and the alternative is the negation of H0. Under H0, the true data generating process is a threshold panel with two-way additive fixed effects (α0

i + ν0t ). For any pair of (i, t), we can write it as the product of a pair of factors and

factor loadings:α0i + ν0

t = (α0i , 1) · (1, ν0

t )′ ≡ λ0′i f

0t ,

where λ0i ≡ (α0

i , 1)′ and f0t ≡ (1, ν0

t )′. Stacking the λ0i ’s and f0

t ’s, we can obtain the matrix of Λ0 ≡ (Λ01, ιN ) and

F 0 ≡ (ιT , F02 ), where Λ0

1 ≡ (α01, . . . , α

0N )′, F 0

2 ≡ (ν01 , . . . , ν

0T )′, and e.g., ιN is a N -dimensional vector of ones.

Then we can write the null hypothesis as

H0 : Λ0 = (Λ01, ιN ) and F 0 = (ιT , F

02 ).

In general, F 0 and Λ0 can not be seperately identified so that the PC estimators F and Λ are consistent withF 0 and Λ0 respectively only up to certain rotation matrix H that is not estimatable for the data. Hence, one cannotdirectly compare columns of F and Λ with ιT and ιN . Nevertheless, noting that Λ0F 0′ = Λ0

1ι′T + ιNF

0′2 , the matrix

Λ0F 0′ can be transformed to zero by the ’between and within group transformation. That is

M0NΛ0F 0′M0T = M0NΛ01ι′TM0T + M0N ιNF

0′2 M0T = 0,

where M0N ≡ IN − ιN ι′N/N and M0T ≡ IT − ιT ι

′T /T . Motivated by the above observation, we propose to

calculate the mean squared residuals:

TNT = (NT )−1||M0N ΛF ′M0T ||2.

Intuitively, ΛF ′ is a consistent estimator of Λ0F 0′ so that TNT should be small under the null and large under thealternative. In addition, we introduce a normalized version of test statistic NT bc,NT in the later section. We showthatNT bc,NT

d→ N(0, 1), under the null hypothesis. One can directly make inference usingNT bc,NT . Alternative,we also propose a bootstrap procedure using TNT , which does not require the estimation of additional parameters.

The following subsection studies the asymptotic properties of TNT .We show that after being properly recenteredand rescaled, the test statistic TNT will have a well-behaved asymptotic distribution under the null.

E.2 The asymptotic null distribution

To proceed, we introduce some notations under the null. Let Λ• ≡ (λ•1, ..., λ•N )′ = M0NΛ0

1 = (α01−α, . . . , α0

N−α)′,where α ≡

∑Ni=1 α

0i /N . Similarly, let F • ≡ (f•1 , ..., f

•T )′ = M0TF

02 = (ν0

1 − ν, . . . , ν0T − ν)′, where ν ≡∑T

t=1 ν0t /T. Let σ2

α and σ2ν be the sample covariance of α0

i ’s and ν0t ’s respectively:

σ2α ≡

1

N

N∑i=1

(α0i − α)2, and σ2

ν ≡1

T

T∑t=1

(ν0t − ν)2.

36

Page 90: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

The following proposition establishes the asymptotic distribution of TNT after appropriate centering and normaliza-tion:

Proposition E.1 Suppose Assumption A.1-A.5 hold, σ2α → σ2

α, σ2ν → σ2

ν , and N/T → κ, as (N,T ) → ∞. Letc•f = plim

(N,T )→∞

1σ2νNT

∑i,t e

2itf•2t and c•λ = plim

(N,T )→∞

1σ2αNT

∑i,t λ

•2i e

2it. UnderH0, we have

N√T(TNT − c•f/T − c•λ/N

) d→ N(0, 4κσ•2f + 4σ•2λ ),

where σ•2f and σ•2λ are defined in Lemma E.2,.

To prove the above Proposition, we need the following Lemmas:

Lemma E.1 Suppose Assumption A.1-A.5 hold, and N/T → κ, as (N,T )→∞. UnderH0, we have

||M0N ΛF ′M0T ||2

NT=

1

NT‖PΛ•e‖2 +

1

NT‖ePF•‖2 +Op(π

−4NT ).

Proof of Lemma E.1: Note that F = (Y −∑2Kk=1 Xk,γ θk)′Λ/N . We have

F = F 0 Λ0′Λ

N+

e′Λ

N+ resf ,

where ||resf ||/√T = Op(π

−2NT ) and Λ0′e·resf/

√NT = Op(π

−2NT ). Hence,

ΛF ′ = ΛΛ′Λ0

NF 0′ + Λ

Λ′e

N+ Λ · res′f ≡ A1 +A2 +A3, say.

Consider A1. We can make the following decomposition:

A1 = ΛΛ′Λ0

NF 0′ = Λ0HH ′

Λ0′Λ0

NF 0′ + (Λ− Λ0H)H ′

Λ0′Λ0

NF 0′

= Λ0F 0′ +eF 0

T

(F 0′F 0

T

)−1

F 0′

+

(Λ0F

0′e′Λ

NT+

ee′Λ

NT

)V −1NTH

′Λ0′Λ0

NF 0′

+Λ0

(HH ′

Λ0′Λ0

N− I)F 0′ +

eF 0

T

(F 0′F 0

T

)−1(HH ′

Λ0′Λ0

N− I)F 0′

+A1,4

≡ Λ0F 0′ + ePF 0 +A1,1 +A1,2 +A1,3 +A1,4,

where ||A1,4||/√NT = Op(π

−2NT ), and

tr[A′2M0NA1,4M0T ]/(NT ) = Op(π−4NT ) and tr[PF 0e′M0NA1,4M0T ]/(NT 2) = Op(π

−4NT ).

Consider A2. We can make the following decomposition:

A2 = Λ0HH ′Λ0′e

N+(Λ− Λ0H)H ′

Λ0′e

N

37

Page 91: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

+Λ0H(Λ− Λ0H)′e

N+ (Λ− Λ0H)

(Λ− Λ0H)′e

N

= Λ0

(Λ0′Λ0

N

)−1Λ0′e

N+ Λ0

[HH ′ −

(Λ0′Λ0

N

)−1]

Λ0′e

N

+(Λ− Λ0H)H ′Λ0′e

N+ Λ0H

(Λ− Λ0H)′e

N+ (Λ− Λ0H)

(Λ− Λ0H)′e

N≡ PΛ0e +A2,1 +A2,2 +A2,3 +A2,4.

One can show that

tr(A′`1,`2M0NA3M0T )/(NT ) = Op(π−4NT ), tr(PF 0e′M0NA3M0T )/(NT 2) = Op(π

−4NT ),

tr(e′PΛ0M0NA3M0T )/(N2T ) = Op(π−4NT ), tr(A′`1,`2M0NPΛ0eM0T )/(N2T ) = Op(π

−4NT ),

tr(A′`1,`2M0NePF 0M0T )/(NT 2) = Op(π−4NT ), tr(A′`1,`2M0NA`3,`3M0T )/(NT ) = Op(π

−4NT ), and

tr(e′PΛ0M0NePF 0M0T )/(N2T 2) = Op(π−4NT ),

for `1, `3 = 1, 2 and `2, `4 = 1, 2, 3, 4. It follows that

||M0N ΛF ′M0T ||2

NT=

1

NT‖M0NePF 0M0T ‖2 +

1

NT‖M0NPΛ0eM0T ‖+Op(π

−4NT )

=1

NT‖M0NePF•‖2 +

1

NT‖PΛ•eM0T ‖+Op(π

−4NT )

=1

NT‖ePF•‖2 +

1

NT‖PΛ•e‖2 +Op(π

−4NT ),

where we used the fact that PF 0M0T = PF• and M0NPΛ0 = PΛ• under the null.

Lemma E.2 Under Assumptions A.2 and A.4, σ2α → σ2

α, σ2ν → σ2

ν and N/T → κ, as (N,T )→∞. Then we have

1

σ2ν

√NT

N∑i=1

T∑s=1

T∑t=s+1

eiteisf•s f•t

d−→ N(0, σ•2f ), and

2

σ2αN

2T

N∑i=1

N∑j=i+1

T∑t=1

eitejtλ•i λ•j

d−→ N(0, σ•2λ ),

where

σ•2f = plim(N,T )→∞

1

σ4νNT

2

N∑i=1

T∑t=1

T∑s=t+1

f•2s f•2t e2ise

2it, and σ•2λ = plim

(N,T )→∞

1

σ4αNT

2

N∑i=1

N∑j=i+1

T∑t=1

λ•2i λ•2j e

2ite

2jt.

Proof of Lemma E.2: We only prove the first result and the second result can be shown similarly. Let u•it ≡1√NT

f•t eit∑Ts=t+1 f

•s eis. We need to show

∑Ni=1

∑Tt=1 u

•it

d→ N(0, σ•2f σ4ν). By Assumption A.4 (ii)-(iii), we can

construct an m.d.s. and the corresponding filtration as in the proof of Lemma A.9. We also have

1

NT 2

N∑i=1

T∑t=1

(f•t eit

T∑s=t+1

f•s eis

)2

=1

NT 2

N∑i=1

T∑t=1

T∑s=t+1

f•2s f•2t e2ise

2it

+1

NT 2

N∑i=1

T∑t=1

f•2t e2it

T∑s>t

T∑v>t and v 6=s

f•s f•v eiseis

38

Page 92: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

=1

NT 2

N∑i=1

T∑t=1

T∑s=t+1

f•2s f•2t e2ise

2it +Op(π

−2NT )

p−→ σ•2f σ4ν .

Then by the martingale central limit theorem, we have∑Ni=1

∑Tt=1 u

•it

d→ N(0, σ•2f σ4ν). Using the assumption that

σ2α → σ2

α and Slutsky’s theorem, we have the desired result. This completes the proof.

Proof of Proposition E.1: By Lemma E.1, we have

TNT =1

NT‖ePF•‖2 +

1

NT‖PΛ•e‖2 +Op(π

−4NT )

=2

σ2νNT

2

N∑i=1

T∑s=1

T∑t=s+1

eiteisf•s f•t +

2

σ2αN

2T

N∑i=1

N∑j=i+1

T∑t=1

eitejtλ•i λ•j

+1

σ2νNT

2

∑i,t

e2itf•2t +

1

σ2αN

2T

∑i,t

λ•2i e2it +Op(π

−4NT ).

It is straightforward to see that

1

σ2νNT

2

∑i,t

e2itf•2t =

1

Tc•f +Op(π

−4NT ), and

1

σ2αN

2T

∑i,t

λ•2i e2it =

1

Nc•λ +Op(π

−4NT ).

Then by Lemma E.2, we have

N√T

(TNT −

1

Tc•f −

1

Nc•λ

)=

2√N√T

1

σ2ν

√NT

N∑i=1

T∑s=1

T∑t=s+1

eiteisf•s f•t

+2

σ2αN√T

N∑i=1

N∑j=i+1

T∑t=1

eitejtλ•i λ•j +Op(π

−1NT )

d−→ N(0, 4κσ•2f + 4σ•2λ ).

E.3 Inference

E.3.1 Estimating covariance and center parameters

In this section, we derive consistent estimators of the parameters that appear in Proposition E.1.(a) Estimators for λ•i and f•t . Noting that

M0NΛ0F 0′ιTT

=M0NΛ0

1ι′T ιT + M0N ιNF

0′2 ιT

T= M0NΛ0

1 = Λ•,

we propose to estimate Λ• by Λ• ≡M0N ΛF ′ιT /T. Similarly, we can estimate F • by F • ≡M0T F Λ′ιN/N. Let λ•idenote the ith entry of Λ• and f•t denote the tth entry of F •. One can show that they are consistent estimators for λ•iand f•t . The consistent estimators for σ2

α and σ2ν are given by Λ•′Λ•/N and F •′F •/T , respectively.

(b) Estimators for c•f and c•λ. A consistent estimator for eit is given by

eit = yit − x′itβ − x′itδ · 1qit ≤ γ − λ′ift.

The consistent estimators for c•f and c•λ are given by

c•f =

(F •′F •

T

)−11

NT

∑i,t

e2itf•2t and c•λ =

(Λ•′Λ•

N

)−11

NT

∑i,t

e2itλ•2t .

39

Page 93: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

(c) Variance σ•2f and σ•2λ . Let

σ•2f =

(F •′F •

T

)−21

NT 2

N∑i=1

T∑t=1

T∑s=t+1

f•2s f•2t e2ise

2it,

and σ•2λ =

(Λ•′Λ•

N

)−21

NT 2

N∑i=1

N∑j=i+1

T∑t=1

λ•2i λ•2j e

2ite

2jt

be the estimators of σ•2f and σ•2λ , respectively.

Given the above estimators, we can obtain the rescaled version of TNT by

NT NT ≡N√T(TNT − c•f/T − c•λ/N

)2√

(N/T ) · σ•2f + σ•2λ

.

Under the null hypothesis, NT NTd→ N(0, 1).

In finite sample experiment, we observe that NT NT has positive sample mean, which leads to over-rejectionunder the null hypothesis. This is mainly due to the fact that eit converges to eit at only πNT rate, leading to badfinite sample performance of c•f and c•λ. To enhance the finite sample performance, we need to revisit c•f and c•λand make some modifications to these two estimators. As the asymptotic analysis is quite tedious, we only presenta partial analysis. Specifically, we only look at a partial expansion of (NT )−1

∑i,t e

2itf•2t and (NT )−1

∑i,t e

2itλ•2i

and find out the small quadratic terms in them. By some tedious algebra, for any pair (i, t), we have that

eit = eit −1

N

N∑j=1

w1,ijejt −1

T

T∑s=1

w2,tseis +Op(π−2NT ) ≡ eit +Op(π

−2NT ),

wherew1,ij = λ0′i (Λ0′Λ0/N)−1λ0

j , andw2,st = f0′s (F 0′F 0/T )−1f0

t . We look at the expansion of (NT )−1∑i,t e

2itf•2t

and (NT )−1∑i,t e

2itλ•2i respectively. By direct calculation, the quadratic terms in (NT )−1

∑i,t(e

2it − e2

it)f•2t are

given by1

N3T

∑i,j,t

w21,ije

2jtf•2t +

1

NT 3

∑i,s,t

w22,tse

2isf•2t −

2

NT

∑i,t

e2itf•2t

(w1,ii

N+w2,tt

T

).

Under the null, w1,ij = 1 +λ•i λ•j/σ

2α and w2,st = 1 +f•s f

•t /σ

2ν . Inserting the two identities into the above quadratic

terms, we have that

1

N3T

∑i,j,t

w21,ije

2jtf•2t =

1

N

1

NT

∑i,t

e2itf•2t +

1

Nσ2α

1

NT

∑i,t

λ•2i e2itf•2t ,

1

NT 3

∑i,s,t

w22,tse

2isf•2t =

σ2ν

T

1

NT

∑i,t

e2it +

1

T

(1

T

T∑t=1

f•4tσ4ν

1

NT

∑i,t

e2itf•2t

+ op(π−2NT ),

− 2

NT

∑i,t

e2itf•2t

(w1,ii

N+w2,tt

T

)= −

(2

N+

2

T

)1

NT

∑i,t

e2itf•2t −

2

Nσ2α

1

N2T

∑i,t

λ•2i e2itf•2t

− 2

T σ2ν

1

NT

∑i,t

e2itf•4t .

The above terms sum up to Bf,NT :=

Bf,NT = −

(1

N+

2

T− 1

T 2

T∑t=1

f•4tσ4ν

)1

NT

∑i,t

e2itf•2t −

1

Nσ2α

1

NT

∑i,t

λ•2i e2itf•2t

40

Page 94: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

+σ2ν

T

1

NT

∑i,t

e2it −

2

T σ2ν

1

NT

∑i,t

e2itf•4t .

Similarly, we can obtain the quadratic terms in (NT )−1∑i,t e

2itλ•2i :

Bλ,NT = −

(2

N+

1

T− 1

N2

N∑i=1

λ•4iσ4α

)1

NT

∑i,t

λ•2i e2it −

1

T σ2ν

1

NT

∑i,t

λ•2i e2itf•2t

+σ2α

N

1

NT

∑i,t

e2it −

2

Nσ2α

1

NT

∑i,t

e2itλ•4i .

Replacing the terms by consistent estimators introduced in the last section, we can obtain good estimators Bf,NT andBλ,NT . Hence, the bias corrected estimator for c•f and c•λ are given by c•f,bc ≡ c•f − Bf,NT and c•λ,bc ≡ c•λ − Bλ,NT .The resulting bias corrected test statistic becomes

NT bc,NT =N√T(TNT − c•f,bc/T − c•λ,bc/N

)2√

(N/T ) · σ•2f + σ•2λ

.

E.3.2 Inference procedure and wild bootstrap

Note that both NT NT and NT bc,NT converge in distribution to standard normal distribution. We can directly usethem and conduct one sided test, i.e., rejecting the null at a significance level if and only if NT NT or NT bc,NT islarger than the (1− a)th quantile of standard normal distribution.

As the test statistic contains many smaller order terms, the size control can be bad. To improve the size control,we propose to follow Su and Chen (2013) to simulate the critical values via the following procedure:

1. Obtain the least squares estimator (θ, γ, Λ, F ) and the test statistic TNT .

2. Use two way additive fixed effect estimator to estimate the model:

yit = β′xit + δ′xitdit(γ) + αi + νt + eit,

and obtain (β, δ), αi, for i = 1, ..., N , νt, for t = 1, ..., T, and eit;

3. Generate vit, i = 1, . . . , N, t = 1, . . . , T independently from the standard normal distribution, and the cor-responding y∗it :

y∗it = β′xit + δ′xitdit(γ) + αi + νt + eitvit,

for all (i, t);

4. Estimate the modely∗it = β′xit + δ′xitdit(γ) + λ′ift + e∗it,

to obtain Λ∗, F ∗ and calculate T ∗NT = (NT )−1||M′0N Λ∗F ∗′M0T ||2;

5. Repeat Steps 3-4 B times and denote the resulting T ∗NT test statistics as T ∗NT,j for j = 1, . . . , B;

6. Calculate the simulated bootstrap p-value for the TNT test as p∗ = 1B

∑Bj=1 1T ∗NT,j ≥ TNT and reject the

null when p∗ is smaller than some prescribed nominal level of significance.

41

Page 95: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Remark: (i) In the proposed bootstrap procedure, we only need to calculate the original statistic TNT withoutrecentering and normalization. Therefore, it is simple to use.(ii) One can also calculate the normalized test statistics NT NT and NT bc,NT in the bootstrap procedure. Ouradditional simulation results show that the alternative approaches only make marginal difference.(iii) As the estimation error for γ does not affect the estimate of other variables. We propose to treat γ as true valueand not estimate the threshold parameter γ in the bootstrap replications.

F Additional Monte Carlo Simulation Results

In this section we consider some additional simulations.

F.1 DGPs in Section 5 with different values of α

We first consider the same set of DGPs as in Section 5 but with α = 0.3 and 0. Table 7 reports the results in the caseα = 0.3. We summarize the major findings from Table 7. First, the performance of the slope coefficient estimatesis similar to that in the case α = 0.2. Second, the estimator γ has a larger standard deviation (std) in this case thanin the case α = 0.2 as predicted by the theory. Third, despite the larger std, the bias is negligible, and the coverageprobability of the 95% confidence interval for γ0 performs quite well. In short, these findings are consistent with theprediction from the asymptotic theory.

Table 8 reports the results in the case α = 0 or fixed threshold effects. We summarize the major findingsfrom Table 8. First, the performance of the slope coefficient estimate is also similar to that in the case α = 0.2.Second, both the bias and standard deviation of γ are small as expected given the theoretical prediction that γ−γ0 =

Op((NT )−1). Third, the 95% confidence interval for γ0 is severely under-covered, which indicates that the inferenceprocedure is the shrinking-threshold-effect framework is not valid anymore when the threshold effect is fixed. Inshort, the asymptotic theory for slope coefficient estimators continue to hold in this case whereas that for γ inTheorem 3.5 breaks down now.

F.2 DGPs with the presence of threshold effect in the error variance

Following the theoretical discussion in Section 3.5, we now consider a set of DGPs with the presence of thresholdeffect in the error variance. The DGPs are the same as those in Section 5.1 except that we now generate the errorterm as follows

eit =uit√

2· 1qit ≤ γ0+

√2uit · 1qit > γ0,

where uit is i.i.d. generated from N(0, 1). Now, Vf (γ) is not continuous at γ0 and the procedure for the inferenceon γ0 has to be modified according to Section 3.5.

Table 9 reports the results with the presence of threshold effect in the error variance. The results in Table 9 arevery similar to that in Table 1. The difference is mainly on the coverage probability of the 95% confidence intervalsfor γ0. As we have to estimate the nuisance parameters nonparametrically, additional randomness is introduced inthe test statistic. As a result, the coverage probabilities tend to under-cover slightly in some cases.

F.3 A DGP that mimics the empirical data

In this section, we consider a DGP that mimics the data in the empirical application. We generate the simulated dataas follows:

42

Page 96: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 7: Performance of the least square estimators with α = 0.3

β δ γ

N T Bias Std CP Bias Std CP Bias Std CP1 25 50 -0.0053 0.0308 0.9180 0.0117 0.0416 0.9300 0.0013 0.1591 0.9680

-0.0024 0.0307 0.9220 -0.0027 0.0409 0.938050 25 -0.0121 0.0342 0.8980 0.0109 0.0430 0.9160 0.0131 0.1647 0.9760

-0.0090 0.0344 0.9020 -0.0034 0.0432 0.942050 50 -0.0050 0.0221 0.9260 0.0075 0.0281 0.9460 -0.0067 0.1454 0.9780

-0.0031 0.0222 0.9240 -0.0016 0.0294 0.946050 100 -0.0030 0.0149 0.9460 0.0061 0.0201 0.9340 -0.0027 0.1346 0.9760

-0.0018 0.0149 0.9440 0.0002 0.0203 0.9420100 50 -0.0039 0.0144 0.9280 0.0047 0.0211 0.9260 0.0151 0.1411 0.9560

-0.0027 0.0144 0.9320 -0.0013 0.0212 0.9280100 100 -0.0020 0.0104 0.9400 0.0043 0.0149 0.9260 0.0015 0.1324 0.9460

-0.0012 0.0104 0.9500 0.0004 0.0149 0.92202 25 50 -0.0055 0.0316 0.9080 0.0035 0.0213 0.9220 0.0023 0.0972 0.9320

-0.0029 0.0315 0.9060 -0.0016 0.0218 0.924050 25 -0.0117 0.0344 0.8900 0.0052 0.0201 0.9380 0.0026 0.0903 0.9200

-0.0092 0.0339 0.9000 0.0006 0.0200 0.948050 50 -0.0049 0.0225 0.9280 0.0020 0.0142 0.9580 -0.0015 0.0674 0.9500

-0.0036 0.0221 0.9300 -0.0008 0.0144 0.942050 100 -0.0014 0.0156 0.9200 0.0011 0.0108 0.9240 -0.0058 0.0619 0.9240

-0.0008 0.0156 0.9200 -0.0006 0.0110 0.9240100 50 -0.0047 0.0155 0.9320 0.0013 0.0104 0.9440 -0.0054 0.0594 0.9560

-0.0040 0.0153 0.9440 -0.0002 0.0107 0.9400100 100 -0.0013 0.0104 0.9420 0.0010 0.0072 0.9540 0.0003 0.0401 0.9400

-0.0008 0.0105 0.9400 0.0001 0.0073 0.95003 25 50 -0.0001 0.0232 0.9020 0.0151 0.0398 0.9080 -0.0008 0.1668 0.9780

0.0020 0.0231 0.9020 0.0013 0.0401 0.930050 25 0.0001 0.0222 0.9140 0.0116 0.0447 0.8920 0.0028 0.1633 0.9620

0.0023 0.0220 0.9140 -0.0017 0.0434 0.904050 50 -0.0003 0.0155 0.9320 0.0086 0.0277 0.9360 -0.0085 0.1516 0.9760

0.0009 0.0154 0.9280 0.0000 0.0271 0.950050 100 -0.0008 0.0106 0.9280 0.0056 0.0190 0.9300 0.0050 0.1474 0.9560

0.0001 0.0106 0.9320 -0.0001 0.0191 0.9340100 50 -0.0012 0.0110 0.9420 0.0069 0.0199 0.9200 -0.0079 0.1428 0.9580

-0.0003 0.0109 0.9420 0.0010 0.0198 0.9380100 100 -0.0005 0.0070 0.9600 0.0025 0.0136 0.9320 -0.0036 0.1350 0.9620

0.0001 0.0070 0.9500 -0.0010 0.0135 0.94404 25 50 0.0003 0.0271 0.8960 0.0073 0.0350 0.9000 -0.0006 0.1469 0.9800

0.0032 0.0269 0.8840 -0.0014 0.0357 0.900050 25 -0.0001 0.0263 0.9080 0.0080 0.0340 0.9120 -0.0023 0.1569 0.9780

0.0032 0.0259 0.8940 -0.0017 0.0343 0.924050 50 -0.0009 0.0172 0.9260 0.0044 0.0223 0.9360 -0.0040 0.1410 0.9720

0.0011 0.0170 0.9200 -0.0015 0.0227 0.946050 100 -0.0012 0.0121 0.9260 0.0045 0.0158 0.9320 -0.0065 0.1298 0.9600

0.0000 0.0120 0.9260 0.0007 0.0156 0.9280100 50 -0.0003 0.0120 0.9320 0.0049 0.0157 0.9300 -0.0079 0.1250 0.9640

0.0009 0.0119 0.9300 0.0009 0.0158 0.9280100 100 -0.0008 0.0076 0.9560 0.0025 0.0104 0.9380 -0.0080 0.1197 0.9600

0.0000 0.0077 0.9600 0.0001 0.0105 0.9520

Note. We report the bias, std and CP for the feasible estimates followed those for the infeasible estimates with true γ0,where CP refers to the 95% coverage probability.

43

Page 97: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 8: Performance of the least square estimators with fixed threshold effect

β δ γ

N T Bias Std CP Bias Std CP Bias Std CP1 25 50 -0.0035 0.0283 0.920 -0.0022 0.0367 0.932 -0.0008 0.0305 0.954

-0.0026 0.0282 0.918 -0.0048 0.0365 0.94250 25 -0.0121 0.0301 0.892 -0.0012 0.0365 0.922 0.0025 0.0283 0.940

-0.0112 0.0299 0.892 -0.0034 0.0363 0.93050 50 -0.0055 0.0202 0.916 0.0000 0.0246 0.936 -0.0004 0.0166 0.952

-0.0050 0.0201 0.906 -0.0012 0.0245 0.94450 100 -0.0016 0.0140 0.930 -0.0010 0.0165 0.952 -0.0006 0.0090 0.948

-0.0014 0.0139 0.934 -0.0016 0.0164 0.958100 50 -0.0051 0.0132 0.926 -0.0028 0.0178 0.934 0.0000 0.0089 0.914

-0.0049 0.0132 0.926 -0.0034 0.0177 0.928100 100 -0.0022 0.0092 0.954 -0.0017 0.0109 0.968 0.0001 0.0040 0.900

-0.0021 0.0092 0.954 -0.0019 0.0109 0.9622 25 50 -0.0049 0.0197 0.940 0.0002 0.0153 0.926 -0.0005 0.0059 0.888

-0.0048 0.0197 0.940 0.0000 0.0153 0.92650 25 -0.0116 0.0217 0.884 0.0015 0.0155 0.938 -0.0006 0.0065 0.866

-0.0114 0.0217 0.884 0.0012 0.0155 0.94250 50 -0.0046 0.0135 0.930 0.0007 0.0104 0.940 0.0000 0.0029 0.792

-0.0046 0.0135 0.930 0.0006 0.0104 0.94250 100 -0.0019 0.0096 0.926 0.0003 0.0075 0.942 -0.0001 0.0014 0.670

-0.0019 0.0096 0.926 0.0002 0.0075 0.942100 50 -0.0040 0.0095 0.916 0.0003 0.0074 0.930 0.0000 0.0013 0.612

-0.0040 0.0095 0.918 0.0002 0.0074 0.930100 100 -0.0017 0.0067 0.930 -0.0002 0.0053 0.944 0.0000 0.0007 0.358

-0.0017 0.0067 0.930 -0.0003 0.0053 0.9443 25 50 0.0009 0.0242 0.882 0.0011 0.0416 0.924 -0.0022 0.0111 0.920

0.0009 0.0241 0.890 -0.0001 0.0418 0.92250 25 0.0010 0.0241 0.892 0.0006 0.0424 0.906 -0.0009 0.0105 0.922

0.0012 0.0241 0.892 -0.0005 0.0424 0.91050 50 0.0002 0.0154 0.934 0.0009 0.0275 0.928 -0.0005 0.0052 0.902

0.0002 0.0154 0.938 0.0006 0.0276 0.92850 100 0.0006 0.0106 0.956 -0.0001 0.0196 0.924 -0.0005 0.0029 0.828

0.0007 0.0106 0.954 -0.0003 0.0196 0.924100 50 0.0003 0.0102 0.944 -0.0001 0.0187 0.950 -0.0002 0.0023 0.778

0.0004 0.0102 0.944 -0.0003 0.0187 0.950100 100 0.0005 0.0077 0.936 0.0000 0.0129 0.958 0.0000 0.0014 0.660

0.0006 0.0077 0.934 -0.0001 0.0129 0.9584 25 50 0.0018 0.0259 0.888 0.0000 0.0328 0.930 -0.0012 0.0128 0.902

0.0021 0.0259 0.884 -0.0007 0.0329 0.93250 25 0.0027 0.0253 0.908 -0.0023 0.0350 0.926 -0.0013 0.0118 0.936

0.0029 0.0253 0.904 -0.0028 0.0350 0.92050 50 0.0006 0.0170 0.940 -0.0015 0.0238 0.922 -0.0003 0.0059 0.848

0.0007 0.0170 0.940 -0.0018 0.0237 0.92450 100 0.0005 0.0120 0.930 -0.0004 0.0164 0.928 -0.0004 0.0028 0.830

0.0005 0.0120 0.930 -0.0005 0.0164 0.928100 50 0.0010 0.0120 0.914 -0.0026 0.0153 0.940 -0.0001 0.0027 0.762

0.0010 0.0120 0.918 -0.0027 0.0153 0.940100 100 0.0005 0.0083 0.932 -0.0003 0.0108 0.946 -0.0001 0.0013 0.608

0.0005 0.0083 0.932 -0.0003 0.0108 0.946

Note. We report the bias, std and CP for the feasible estimates followed those for the infeasible estimates with trueγ0, where CP refers to the 95% coverage probability.

44

Page 98: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 9: Performance of the least square estimators with threshold in error variance

β δ γ

DGP N T bias std CP bias std CP bias std CP1 25 50 -0.0051 0.0327 0.928 0.0078 0.0363 0.936 0.0422 0.1046 0.952

-0.0038 0.0325 0.926 0.0007 0.0362 0.93650 25 -0.0122 0.0375 0.888 0.0045 0.0380 0.940 0.0509 0.0963 0.942

-0.0109 0.0375 0.892 -0.0026 0.0381 0.93650 50 -0.0046 0.0223 0.938 0.0014 0.0244 0.962 0.0444 0.0898 0.942

-0.0040 0.0224 0.942 -0.0018 0.0250 0.95450 100 -0.0027 0.0160 0.940 0.0017 0.0167 0.960 0.0327 0.0641 0.946

-0.0024 0.0160 0.942 0.0006 0.0166 0.954100 50 -0.0031 0.0149 0.952 0.0012 0.0184 0.934 0.0323 0.0688 0.946

-0.0028 0.0149 0.954 -0.0004 0.0182 0.930100 100 -0.0020 0.0108 0.946 0.0015 0.0123 0.942 0.0224 0.0432 0.938

-0.0019 0.0107 0.946 0.0003 0.0121 0.9362 25 50 -0.0055 0.0330 0.898 0.0014 0.0220 0.918 0.0135 0.0330 0.922

-0.0032 0.0329 0.902 -0.0009 0.0218 0.93450 25 -0.0127 0.0335 0.892 0.0028 0.0211 0.938 0.0127 0.0332 0.888

-0.0104 0.0331 0.896 0.0007 0.0210 0.93450 50 -0.0052 0.0230 0.926 0.0000 0.0149 0.938 0.0082 0.0208 0.932

-0.0040 0.0229 0.930 -0.0010 0.0149 0.92650 100 -0.0020 0.0158 0.922 0.0001 0.0113 0.926 0.0056 0.0149 0.896

-0.0014 0.0159 0.920 -0.0004 0.0113 0.924100 50 -0.0048 0.0158 0.932 0.0004 0.0108 0.942 0.0050 0.0123 0.904

-0.0042 0.0158 0.934 -0.0001 0.0108 0.940100 100 -0.0024 0.0102 0.958 0.0009 0.0078 0.942 0.0030 0.0071 0.960

-0.0020 0.0101 0.954 0.0006 0.0078 0.9403 25 50 -0.0012 0.0293 0.914 0.0119 0.0369 0.900 0.0515 0.1092 0.940

0.0013 0.0292 0.914 0.0056 0.0377 0.90650 25 -0.0056 0.0289 0.908 0.0209 0.0401 0.876 0.0450 0.1151 0.914

-0.0031 0.0284 0.914 0.0142 0.0403 0.88850 50 -0.0026 0.0199 0.930 0.0078 0.0252 0.940 0.0383 0.0878 0.944

-0.0011 0.0197 0.936 0.0049 0.0249 0.94650 100 -0.0011 0.0132 0.936 0.0028 0.0175 0.942 0.0368 0.0734 0.940

-0.0001 0.0131 0.938 0.0014 0.0174 0.944100 50 -0.0029 0.0137 0.948 0.0072 0.0181 0.918 0.0281 0.0705 0.924

-0.0020 0.0136 0.950 0.0056 0.0179 0.930100 100 -0.0008 0.0095 0.942 0.0023 0.0117 0.958 0.0176 0.0421 0.948

-0.0003 0.0095 0.946 0.0014 0.0117 0.9604 25 50 -0.0010 0.0343 0.886 0.0071 0.0348 0.906 0.0396 0.0930 0.932

0.0023 0.0343 0.876 0.0024 0.0352 0.90250 25 -0.0084 0.0345 0.876 0.0153 0.0352 0.894 0.0450 0.0940 0.934

-0.0045 0.0340 0.890 0.0106 0.0349 0.89250 50 -0.0033 0.0220 0.922 0.0047 0.0226 0.926 0.0280 0.0721 0.946

-0.0014 0.0218 0.922 0.0025 0.0227 0.92250 100 -0.0017 0.0152 0.922 0.0030 0.0153 0.932 0.0235 0.0545 0.936

-0.0005 0.0151 0.918 0.0018 0.0154 0.930100 50 -0.0031 0.0155 0.914 0.0056 0.0152 0.920 0.0269 0.0571 0.926

-0.0018 0.0154 0.928 0.0044 0.0153 0.926100 100 -0.0014 0.0108 0.920 0.0017 0.0117 0.916 0.0162 0.0345 0.944

-0.0007 0.0108 0.924 0.0009 0.0118 0.922

Note. We report the bias, std and CP for the feasible estimates followed those for the infeasible estimates with trueγ0, where CP refers to the 95% coverage probability.

45

Page 99: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

1. Obtain the original estmate F , Λ, a, β1, β2, γ, ϕ, and eit’s from real data.

2. Let FINit be x1,it and other control variables be x−,it = (x2,it, . . . , x5,it)′. Stack control variables xk,it’s

to five N × T matrices of regressors Xk for k = 1, . . . , 5 from the real data. Obtain Zk = MΛXkMF andcalculate the sample covariance matrix Σz of zit’s.

3. For jth replication.

(a) Generate i.i.d. z(j)it ’s from N(0,Σz) and stack them to 5 matices Z(j)

k . Let X(j)k = Z

(j)k + (Xk − Zk)

and define x(j)1,it, x

(j)−,it accordingly.

(b) Generate e(j)it = eit · εit where εit’s are i.i.d. from N(0, 1).

(c) Let y(j)it = a · 1(x

(j)1,it ≤ γ) + β1x

(j)1,it1(x

(j)1,it ≤ γ) + β2x

(j)1,it1(x

(j)1,it > γ) + x

(j)′−,itϕ+ λ′ift + e

(j)it .

Table 10 reports the simulation results based on 1000 replications where the DGP mimics that in the empiricalapplication. The finite sample performance is reasonably good. The estimation of γ is quite accurate in terms of biasand standard deviation. The 95% confidence interval for γ covers the true value with probability close to the nominallevel. For the slope coefficients, the coverage probability is slightly lower than the nominal level.

Table 10: Performance of the least square estimators where the DGP mimics realdata

a β1 β2 φ1 φ2 φ3 φ4 γ

Bias 1.374 0.315 0.586 -0.144 -0.204 0.213 -0.152 -0.023Std 4.829 0.422 1.032 0.524 0.172 0.400 0.435 0.090CP 0.842 0.859 0.815 0.904 0.754 0.869 0.907 0.971

Note. We report the bias, std and CP for the estimates where the DGP mimics data in empirical application.CP refers to the 95% coverage probability.

F.4 Determination of the number of factors

In this subsection, we conduct some Monte Carlo simulations to study the finite sample performance of our SVTmethod introduced in Section 3.6. For the tuning parameter and singular value threshold, we use ψ = 0.3(N−1/2 +

T−1/2) and χNT = (√NTψ||Θψ||sp)1/2. The DGPs are essentially the same as those used in the simulation section

in the main text except that we now consider different magnitudes of threshold effects: α = 0, 0.2 and 0.5.The results are reported in Table 11. From the results in Table 11, we can see the proposed method works

reasonably well. There is some under-estimation when N and T are small. As N or T increases, the perfor-mance improves rapidly. The empirical probability of correctly determining the number of factors reaches one when(N,T ) = (100, 100) for all DGPs with all three values of α. Comparing the results with different values of α, weconclude that the proposed method is robust to the magnitude of threshold effects.

F.5 Test IFEs Against Two-Way Additive Fixed Effects

In this subsection, we conduct some Monte Carlo simulations to evaluate the finite sample performance of the teststatistic TNT introduced in Section E.

As in Section 5, we also consider four DGPs, where the regressors and idiosyncratic error terms are generated inthe same way as in Section 5. For the factor and factor loadings, we consider three cases:

46

Page 100: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 11: Frequency of under/over estimation offactor number

DGP N T α=0 α=0.2 α=0.5

1 50 50 0.062/0 0.056/0 0.074/050 100 0.002/0 0.002/0 0.016/0100 50 0.008/0 0.006/0 0.01/0100 100 0/0 0/0 0/0

2 50 50 0.084/0 0.058/0 0.062/050 100 0.004/0 0.008/0 0.004/0100 50 0.002/0 0.008/0 0.012/0100 100 0/0 0/0 0/0

3 50 50 0.054/0 0.066/0 0.07/050 100 0.002/0 0.004/0 0.004/0100 50 0.002/0 0.006/0 0.004/0100 100 0/0 0/0 0/0

4 50 50 0.07/0 0.066/0 0.058/050 100 0.006/0 0.006/0 0.004/0100 50 0.004/0 0.002/0 0.004/0100 100 0/0 0/0 0/0

47

Page 101: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 12: Rejection rate of the test against additive fixed effects

H0 H1NT,1 H1NT,2

N T NS=5% NS=10% NS=5% NS=10% NS=5% NS=10%

1 50 50 7.3% 12.5% 42.4% 53.8% 76.4% 84.7%50 100 7.3% 11.9% 35.3% 47.5% 81.2% 89.3%100 50 6.6% 11.2% 49.2% 61.3% 81.4% 88.8%100 100 5.3% 10.5% 43.8% 56.7% 85.2% 91.4%

2 50 50 7.3% 12.0% 38.2% 50.4% 78.5% 85.8%50 100 7.9% 12.6% 36.4% 46.7% 83.1% 89.2%100 50 4.7% 9.2% 47.8% 60.5% 78.5% 85.7%100 100 5.7% 11.3% 44.1% 58.0% 85.7% 90.6%

3 50 50 7.3% 11.7% 41.5% 53.6% 77.8% 87.2%50 100 5.9% 10.6% 35.1% 47.3% 82.3% 88.2%100 50 7.6% 12.3% 54.4% 65.5% 80.0% 88.9%100 100 6.0% 10.7% 43.4% 56.4% 84.3% 90.0%

4 50 50 7.6% 12.2% 43.1% 53.8% 79.4% 87.1%50 100 7.9% 12.0% 40.7% 52.3% 82.4% 89.7%100 50 5.9% 10.0% 52.9% 65.2% 79.2% 87.7%100 100 7.4% 10.5% 44.6% 57.3% 85.2% 90.5%

Note. We report the rejection rate of the test usingNT bc,NT at 5% and 10% significance level. NS refers to nominal size.

1. The null hypothesis case (H0): λ0i = (α0

i , 1)′ for i = 1, . . . , N and f0t = (1, ν0

t ) for t = 1, . . . , T, whereα0i is i.i.d. from 2 × N(1, 1) and ν0

t is i.i.d. from 2 × N(0, 1). The factors and factor loadings are mutuallyindependent of each other.

2. The first local alternative case (H1NT,1): λ0i = (α0

i , 1)′ for i = 1, . . . , N and f0t = (1 +N−1/4T−1/2νt, ν

0t )

for t = 1, . . . , T , where α0i and ν0

t are generated in the same way as in the H0 case, and νt is i.i.d. fromN(0, 1). The factors and factor loadings are mutually independent of each other.

3. The second local alternative case (H1NT,2): λ0i = (α0

i , 1 + N−1/2T−1/4αi)′ for i = 1, . . . , N and f0

t =

(1 + N−1/4T−1/2νt, ν0t ), where α0

i , ν0t and νt are generated in the same way as in H1NT,1 case, and αi is

i.i.d. from N(0, 1). The factors and factor loadings are mutually independent of each other.

In the first case, the null hypothesisH0 : λ0′i f

0t = α0

i+ν0t is satisfied and we study the asymptotic size behavior of

our test. In the second case, we consider the local alternativeH1NT,1 : λ0′i f

0t = α0

i +ν0t +N−1/4T−1/2α0

i′νt, and in

the third case we consider the local alternativeH1NT,2 : λ0′i f

0t = α0

i +ν0t +N−1/4T−1/2α0

i′νt+N

−1/2T−1/4α′iν0t .

Table 12 reports the performance of the test using NT bc,NT in the above three cases. Apparently, the test hasreasonably well-controlled size when nominal sizes (NS) are given by 0.05 and 0.1. As N and T increase, the finitesample rejection frequently typically improves and approaches the nominal size. Similarly, the test has certain powerto detect local deviations from the null and the power under H1NT,2 improves over that under H1NT,1. Table 13reports the performance of the bootstrap test as in Section E.3.2. The results are similar to that in table 12.

48

Page 102: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

Table 13: Rejection rate of the bootstrap test against additive fixed effects

H0 H1NT,1 H1NT,2

N T NS=5% NS=10% NS=5% NS=10% NS=5% NS=10%

1 50 50 6.1% 13.2% 42.2% 57.0% 74.8% 84.8%50 100 6.6% 14.1% 33.1% 46.3% 79.4% 89.7%100 50 7.2% 13.6% 50.7% 65.7% 81.9% 90.3%100 100 5.8% 12.3% 42.4% 57.5% 84.3% 91.4%

2 50 50 5.8% 11.7% 36.5% 51.2% 75.9% 85.8%50 100 7.5% 14.0% 34.0% 47.5% 81.0% 88.9%100 50 4.6% 11.1% 48.3% 64.3% 77.0% 87.6%100 100 6.3% 13.7% 44.7% 60.1% 84.7% 91.0%

3 50 50 5.4% 11.6% 36.5% 51.0% 72.7% 85.7%50 100 5.9% 11.8% 30.5% 46.9% 78.7% 87.4%100 50 6.6% 13.4% 50.6% 63.8% 77.8% 88.4%100 100 6.6% 13.1% 40.8% 55.6% 82.8% 89.9%

4 50 50 5.5% 11.4% 38.4% 52.4% 74.9% 85.3%50 100 6.5% 13.7% 35.6% 51.2% 79.3% 89.2%100 50 5.9% 11.8% 48.3% 64.7% 77.2% 86.8%100 100 7.2% 12.5% 41.2% 56.3% 83.2% 90.1%

Note. We report the rejection rate of the test using the bootstrap test as in Section E.3.2. NS refers to nominal size.

49

Page 103: Panel Threshold Models with Interactive Fixed Effects...1 Introduction Both threshold effects and interactive fixed effects (IFEs) are of practical relevance and have received con-siderable

References

Ahn, S. C., and Horenstein, A. R. 2013. Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203-1227.

Andrews, D. W. (1993). Tests for parameter instability and structural change with unknown change point. Econo-metrica, 821-856.

Bai, J., 2009. Panel data models with interactive fixed effects. Econometrica, 77, 1229-1279.

Bai, J., and Ng, S., 2002. Determining the number of factors in approximate factor models. Econometrica, 701,191-221.

Bernstein, D. S., 2005. Matrix Mathematics: Theory, Facts, and Formulas with Application to Linear Systems The-ory. Princeton University Press, Princeton.

Billingsley, P., 1999. Probability and Measure. John Wiley & Sons, New York.

Castagnetti, C., Rossi, E., and Trapani, L. 2015. Inference on factor structures in heterogeneous panels. Journal ofeconometrics, 184(1), 145-157.

Davis, C., and Kahan, W. M., 1970. The rotation of eigenvectors by a perturbation. III. SIAM Journal on NumericalAnalysis, 7(1), 1-46.

Hall, P., and C.C. Heyde, 1980. Martingale Limit Theory and Its Applications. Academic Press, New York.

Kim, J., and Pollard, D., 1990. Cube root asymptotics. Annals of Statistics, 18, 191-219.

Moon, H. R., and Weidner, M., 2017. Dynamic linear panel regression models with interactive fixed effects. Econo-metric Theory, 33, 158-195.

Moon, H. R., and Weidner, M. 2018. Nuclear norm regularized estimation of panel regression models. arXiv preprintarXiv:1810.10987.

Peligrad, M. 1982. Invariance principles for mixing sequences of random variables. The Annals of Probability, 968-981.

Su, L., and Chen, Q., 2013. Testing homogeneity in panel data models with interactive fixed effects. EconometricTheory, 29, 1079-1135.

Utev, S., and Peligrad, M., 2003. Maximal inequalities and an invariance principle for a class of weakly dependentrandom variables. Journal of Theoretical Probability 16, 101-115.

50