CHAPTER 5
Empirical Models of Market Entry
(Note: Distinguish between models without and with data on prices and quantities. For
the latter, discuss the problem of sample selection in the estimation of demand, production
functions, and/or variable costs due to endogenous entry. Include the mode)
Why do we estimate models of entry? In the models of demand, production, and competition in prices or quantities studied in chapters 2 to 4, we have taken the number of firms in the market and the product characteristics of each firm as exogenously given. A model of entry can be seen as the component of a model of market competition where we endogenize the number of firms and, in some cases, some characteristics of the firm such as product characteristics (in models of product differentiation) or firm capacity.
In a static model of market entry, the entry cost (or fixed cost) of a firm is a key parameter or function in the determination of the number of firms in the market, of the characteristics of the active firms, and of competition in general. Entry costs cannot be identified from the estimation of demand equations, marginal conditions of optimality for prices and quantities, or even production functions. In the literature on empirical entry models, we use the principle of revealed preference to identify these parameters. If a firm decides to be in the market, it is because the value of entering the market is greater than the value of staying out. Under some conditions, firms' entry decisions reveal information about firms' entry costs.
1. Some general ideas
1.1. What is a model of market entry? Models of market entry in IO can be characterized in terms of three main features. First, the key endogenous variable is a firm's decision to operate or not in a market. Entry in a market should be understood in a broad sense. The standard example is the decision of a firm to enter an industry for the first time, but opening a new store, introducing a new product, adopting a new technology, releasing a new movie, or a potential bidder's decision to bid in an auction are other examples. A second important feature is that there is an entry cost associated with being active in the market. And third, the payoff of being active in the market depends (negatively) on the number (and the characteristics) of other firms active in the market, i.e., the model is a game. Therefore, the typical structure of a model of entry is:
$$\{a_{im} = 1\} \Leftrightarrow \{\Pi_{im}(N_m, X_{im}, \varepsilon_{im}, \theta) > 0\}$$
NOTE: $a_{-im}$ better than $N_m$, MORE GENERAL
where $a_{im}$ is the binary indicator of the event "firm $i$ is active in market $m$"; $\Pi_{im}$ is the profit of being active in the market, which depends on the number of firms $N_m$, on exogenous firm and market characteristics that are observable to the researcher, $X_{im}$, on exogenous characteristics that are unobserved to the researcher, $\varepsilon_{im}$, and on structural parameters $\theta$.
[The principle of Revealed Preference] The estimation of structural models of market entry is based on the principle of Revealed Preference. In the context of these models, this principle establishes that if we observe a firm operating in a market, it is because its value in that market is greater than the value of shutting down and putting its assets to alternative uses. Under this principle, firms' entry decisions reveal information about the firm's underlying latent profit (or value).
[Static models] The first class of models that we study is static. There are many differences between static and dynamic models of market entry, but one simple difference is worth pointing out now. For static models of entry, we should understand "entry" as "being active in the market" and not as a transition from being "out" of the market to being "in" the market. That is, in these static models we ignore the fact that, when choosing whether to be active or not in the market, some firms are already active (incumbents) and other firms are not (potential entrants). In other words, we ignore that the choice of not being active in the market means "exit" for some firms and "stay out" for others.
1.2. Why do we estimate models of market entry? As explained in the Introduction to this chapter, the specification and estimation of models of market entry is motivated by the need to endogenize the number of firms in the market, as well as some firm characteristics that operate at the extensive margin. Endogenizing the number of firms is a key aspect of any model of IO where market structure is treated as endogenous. Once we endogenize the number of firms in the market, we need to identify entry cost parameters, and these parameters cannot be identified from demand equations, production functions, or marginal conditions of optimality for prices and quantities. We identify entry costs from the entry model itself. More generally, we can distinguish the following motives for the estimation of models of market entry.
1. Identification of entry cost parameters. Parameters such as fixed production costs, entry costs, or investment costs do not appear in demand or production equations but contribute to the market entry decision. These parameters can be important in the determination of market structure and market power in an industry.
2. Efficiency. The equilibrium entry conditions contain useful information for the identification of structural parameters. Using this information can significantly increase the precision of our estimates. In fact, when the sample variability in prices and quantities is small, the equilibrium entry conditions can contribute more to the identification of demand and cost parameters than demand equations or production functions.
3. Data on prices and quantities may not be available, or at least, these data are not available at the level we need, e.g., at the level of the individual firm, product, and market. Many countries have excellent surveys of manufacturers or retailers with information at the level of a specific industry (5- or 6-digit NAICS or SIC) and local market (census tracts) on the number of establishments and some measure of firm size such as aggregate revenue. Though we observe aggregate revenue at the industry-market level, we do not observe P and Q at that level. Under some assumptions, it is possible to identify structural parameters using these data and the structure of an entry model.
4. Controlling for endogeneity of firms' entry decisions in the estimation of demand and production functions. The estimation of a demand system or a production function may involve dealing, in one way or another, with the estimation of a model of market entry. Very often, in the estimation of a demand system or a production function, we have to deal with the endogeneity of firms' entry and exit decisions. Olley and Pakes (1996) show that ignoring the endogeneity of a firm's decision to exit from the market (i.e., firms with smaller values of unobserved productivity are more likely to exit) can generate significant biases in the estimation of production functions. Similarly, some product characteristics (not only price) may be endogenous in the estimation of demand systems. The choice of a product characteristic can be interpreted as an entry decision. Consider, for instance, the decision of a coffee shop to offer wireless internet access or not. The choice of including the product attribute may not be exogenous in the demand system, in the sense that it is correlated with unobserved demand factors. That is, we observe more demand in coffee shops with wireless internet not only because consumers like this service but also because the coffee shops that choose to include it are the ones with higher exogenous demand, such that it is profitable for them to pay the fixed cost of providing wireless internet. Dealing with this endogenous product attribute requires one to specify and estimate a model of market entry.
1.3. Road map. The type of data used, the assumptions about unobserved firm and market heterogeneity, and the assumptions about the information of the firms and of the researcher are very important for the estimation of entry models.
[Bresnahan and Reiss] We start with a simple and pioneering model in this literature: the model in Bresnahan and Reiss (JPE, 1991). This paper, together with Bresnahan and Reiss (REStud, 1990), was a significant contribution to the structural estimation of models of market entry and opened a new literature that has grown significantly during the last 20 years. In that paper, Bresnahan and Reiss show that given a cross-section of "isolated" local markets where we observe the number of active firms and some exogenous market characteristics, including market size, it is possible to identify fixed costs and the "degree of competition" or the "nature of competition" in the industry. By "nature of competition" these authors (and after them, this literature) mean a measure of how a firm's variable profit declines when the number of competitors in the market increases.
What is most remarkable about Bresnahan and Reiss's result is that, with quite limited information (e.g., no information about prices or quantities), the researcher can identify the degree of competition using an entry model.
[Relaxing the assumption of homogeneous firms] Bresnahan and Reiss's model is based on some important assumptions. In particular, firms are homogeneous and they have complete information. The assumption of firm homogeneity (both in demand and costs) is strong and can be clearly rejected in many industries. Perhaps more importantly, ignoring firm heterogeneity when it is present can lead to biased and misleading results about the degree of competition in an industry. Therefore, the first assumption that we relax is the one of homogeneous firms.
As shown originally in Bresnahan and Reiss's own work (Journal of Econometrics, 1991), relaxing the assumption of firm homogeneity implies two significant econometric challenges. First, the entry model becomes a system of simultaneous equations with endogenous binary choice variables. Dealing with endogeneity in a binary choice system of equations is not a simple econometric problem. In general, IV estimators are not available. Second, the model now has multiple equilibria. Dealing with both endogeneity and multiple equilibria in this class of nonlinear models is an interesting but challenging problem in econometrics.
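The multiplicity problem can be seen in a two-firm example. The sketch below assumes a hypothetical linear payoff (a firm that is active earns `base[i] - delta[i] * a_j`, and an inactive firm earns zero); this functional form and the numbers are illustrative assumptions, not the specification of any paper cited here. It enumerates all pure-strategy Nash equilibria of the simultaneous entry game.

```python
from itertools import product


def pure_nash_equilibria(base, delta):
    """List pure-strategy Nash equilibria of a two-firm entry game.

    Hypothetical payoff: firm i earns base[i] - delta[i] * a_j if active
    and 0 if inactive, where a_j is the rival's entry indicator.
    A firm that is indifferent is assumed to stay out.
    """
    equilibria = []
    for actions in product((0, 1), repeat=2):
        is_equilibrium = True
        for i in (0, 1):
            rival = actions[1 - i]
            payoff_if_active = base[i] - delta[i] * rival
            best_response = 1 if payoff_if_active > 0 else 0
            if actions[i] != best_response:
                is_equilibrium = False
        if is_equilibrium:
            equilibria.append(actions)
    return equilibria


# Monopoly is profitable, duopoly is not: two equilibria, (0, 1) and (1, 0).
print(pure_nash_equilibria(base=(1.0, 1.0), delta=(2.0, 2.0)))
# Duopoly is profitable for both firms: the unique equilibrium is (1, 1).
print(pure_nash_equilibria(base=(3.0, 3.0), delta=(2.0, 2.0)))
```

In the first case the model predicts that exactly one firm enters but not which one, which is why the likelihood of an individual firm's action is not well defined without further assumptions.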
[Approaches to deal with endogeneity/multiple equilibria in games of complete information] Then, we will go through different approaches that have been used in this literature to deal with the problems of endogeneity and multiple equilibria. It is worthwhile to distinguish two groups of approaches or methods.
The first group of methods is characterized by imposing restrictions that imply equilibrium uniqueness for any value of the exogenous variables. Of course, firm homogeneity is one assumption that implies equilibrium uniqueness. But there are other assumptions that imply uniqueness even when firms are heterogeneous, for instance, a triangular structure in the strategic interactions between firms (Heckman, Econometrica 1978), or sequential entry decisions (Berry, Econometrica 1992). Given these assumptions, these papers deal with the endogeneity problem by using a maximum likelihood approach.
The second group of methods does not impose equilibrium uniqueness. The early work of Jovanovic (Econometrica 1989) and the more recent work of Tamer (2003) were influential for this approach. These authors showed (Jovanovic at a general abstract level, and Tamer in the context of a binary choice game) that identification and multiple equilibria are two very different issues in econometric models. Models with multiple equilibria can be identified, and we do not need to impose equilibrium uniqueness as a way to get identification. Multiple equilibria can be a computational nuisance in the estimation of these models, but they are not an identification problem. This simple idea has generated a significant and growing literature on computationally simple methods to estimate models with multiple equilibria, and more specifically on the estimation of discrete games.
[Games of incomplete information] Our next step will be to relax the assumption of complete information by introducing some variables that are private information of each firm. We will see that the identification and estimation of these models can be significantly simpler than in the case of models of complete information.
2. Bresnahan and Reiss (JPE, 1991)
NOTE : NO TITLE BRSNAHAN & REISS: REPLACE IT WITH ENTRY MODEL
WITH HOMOGENEOUS FIRMS. TAKE SLIDES 2014.
They study several retail and professional industries in the US: doctors, dentists, pharmacies, plumbers, car dealers, etc. For each industry, say car dealers, the dataset consists of a cross-section of $M$ small, "isolated" markets. We index markets by $m$. For each market $m$, we observe the number of active firms ($N_m$), a measure of market size ($S_m$), and some exogenous market characteristics that may affect demand and/or costs ($X_m$).
$$\text{Data} = \{N_m, S_m, X_m : m = 1, 2, ..., M\}$$
There are several empirical questions that they want to answer. First, they want to estimate the "nature" or "degree" of competition for each of the industries: that is, how fast variable profits decline when the number of firms in the market increases. Second, and related to the estimation of the degree of competition, BR are also interested in estimating how many entrants are needed to achieve an equilibrium equivalent to the competitive equilibrium, i.e., the hypothesis of contestable markets.
[Model] Consider a market $m$. There is a number $N^*$ of potential entrants in the market. Each firm decides whether to be active or not in the market. Let $\Pi_m(N)$ be the profit of an active firm in market $m$ when there are $N$ active firms. The function $\Pi_m(N)$ is strictly decreasing in $N$. If $N_m$ is the equilibrium number of firms in market $m$, then it should satisfy the following conditions:
$$\Pi_m(N_m) \geq 0 \quad \text{and} \quad \Pi_m(N_m + 1) < 0$$
That is, every firm is making its best response given the actions of the others. For active firms, the best response is to be active, and for inactive firms the best response is not to enter the market.
To complete the model we have to specify the structure of the profit function $\Pi_m(N)$. Total profit is equal to variable profit, $V_m(N)$, minus fixed costs, $F_m(N)$:
$$\Pi_m(N) = V_m(N) - F_m(N)$$
In this model, where we do not observe prices or quantities, the key difference in the specification of variable profit and fixed cost is that variable profits increase with market size (in fact, they are proportional to market size) and fixed costs do not.
The variable profit function of an incumbent firm in market $m$ when there are $N$ active firms is:
$$V_m(N) = S_m \, v_m(N) = S_m \left( X^D_m \beta - \alpha(N) \right)$$
where $S_m$ represents market size; $v_m(N)$ is the variable profit per capita; $X^D_m$ is a vector of market characteristics that may affect the demand for the product, e.g., per capita income, age distribution; $\beta$ is a vector of parameters; and $\alpha(1), \alpha(2), ..., \alpha(N^*)$ are parameters that capture the degree of competition, such that we expect that $\alpha(1) \leq \alpha(2) \leq \alpha(3) \leq ... \leq \alpha(N^*)$. Given that there is no firm heterogeneity in the variable profit function, there is an implicit assumption of a homogeneous product or a symmetrically differentiated product (e.g., Salop circular city).
The specification of the fixed cost is:
$$F_m(N) = X^C_m \gamma + \delta(N) + \varepsilon_m$$
where $X^C_m$ is a vector of observable market characteristics that may affect the fixed cost, e.g., rental price; $\varepsilon_m$ is a market characteristic that is unobservable to the researcher but observable to the firms; and $\delta(1), \delta(2), ..., \delta(N^*)$ are parameters. The dependence of the fixed cost on the number of firms is very unconventional in IO. Bresnahan and Reiss allow for this possibility and provide several interpretations. However, the interpretation of the parameters $\delta(1), \delta(2), ..., \delta(N^*)$ is not completely clear. In some sense, BR allow the fixed cost to depend on the number of firms in the market for robustness reasons. There are several possible interpretations for why fixed costs may depend on the number of firms in the market: (a) entry deterrence: incumbents create barriers to entry; (b) a shortcut to allow for firm heterogeneity in fixed costs, in the sense that late entrants are less efficient in fixed costs; and (c) actual endogenous fixed costs, for instance rental prices or other components of the fixed costs, not included in $X^C_m$, may increase with the number of incumbents (e.g., a demand effect on rental prices). For any of these interpretations we expect $\delta(1) \leq \delta(2) \leq \delta(3) \leq ... \leq \delta(N^*)$.
Since both $\alpha(N)$ and $\delta(N)$ increase with $N$, it is clear that the profit function $\Pi_m(N)$ declines with $N$. Therefore, as we anticipated above, the equilibrium condition for the number of firms in the market can be represented as follows. For $N \in \{0, 1, ..., N^*\}$:
$$\{N_m = N\} \Leftrightarrow \{\Pi_m(N) \geq 0 \ \text{AND} \ \Pi_m(N + 1) < 0\}$$
It is simple to show that the model has a unique equilibrium for any value of the exogenous variables and structural parameters. This is a direct implication of the strict monotonicity of the profit function $\Pi_m(N)$.
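The equilibrium condition can be sketched in a few lines of code. The function below assumes a strictly decreasing profit function and simply counts how many firms can enter before the next entrant would make a loss; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
from typing import Callable


def equilibrium_n_firms(profit: Callable[[int], float], n_potential: int) -> int:
    """Return the N with profit(N) >= 0 and profit(N + 1) < 0.

    Assumes profit(N) is strictly decreasing in N, as in the text, so the
    answer is unique. n_potential is the number of potential entrants N*.
    """
    n = 0
    # Keep adding firms while the next entrant would still break even.
    while n < n_potential and profit(n + 1) >= 0:
        n += 1
    return n


# Hypothetical values, for illustration only (index 0 is unused).
S = 10.0                              # market size S_m
xd_beta = 1.0                         # demand index X^D_m * beta
alpha = [0.0, 0.0, 0.2, 0.4, 0.6]     # alpha(N) for N = 1..4
fixed = [0.0, 4.0, 5.0, 5.5, 7.0]     # X^C_m * gamma + delta(N) + eps_m


def profit(n: int) -> float:
    return S * (xd_beta - alpha[n]) - fixed[n]


print(equilibrium_n_firms(profit, n_potential=4))
```

With these numbers a third firm still earns a positive profit and a fourth would not, so the loop stops at three firms.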
We have a random sample $\{N_m, S_m, X^D_m, X^C_m : m = 1, 2, ..., M\}$ and we want to use this sample to estimate the vector of parameters:
$$\theta = \{\beta, \gamma, \sigma, \alpha(1), ..., \alpha(N^*), \delta(1), ..., \delta(N^*)\}$$
The unobserved component of the entry cost, $\varepsilon_m$, is assumed to be independent of $(S_m, X^D_m, X^C_m)$ and i.i.d. over markets with distribution $N(0, \sigma^2)$. As usual in discrete choice models, $\sigma$ is not identified. We normalize $\sigma = 1$, which means that we are really identifying the rest of the parameters up to scale. We should keep this in mind for the interpretation of the estimation results.
Given this model and sample, BR estimate $\theta$ by (conditional) ML:
$$\hat{\theta} = \arg\max_{\theta} \sum_{m=1}^{M} \log \Pr(N_m \mid \theta, S_m, X^D_m, X^C_m)$$
What is the form of the probabilities $\Pr(N_m \mid \theta, S_m, X^D_m, X^C_m)$ in the BR model? This entry model is equivalent to an Ordered Probit model for the number of firms. We can represent the condition $\{\Pi_m(n) \geq 0 \ \text{AND} \ \Pi_m(n+1) < 0\}$ in terms of thresholds for the unobservable variable $\varepsilon_m$:
$$\{N_m = n\} \Leftrightarrow \{T_m(n+1) < \varepsilon_m \leq T_m(n)\}$$
and for any $n \in \{1, 2, ..., N^*\}$ we have that
$$T_m(n) \equiv S_m X^D_m \beta - X^C_m \gamma - \alpha(n) S_m - \delta(n)$$
and $T_m(0) = +\infty$, $T_m(N^* + 1) = -\infty$. This is the structure of an ordered probit model. Therefore, the distribution of the number of firms conditional on the observed exogenous market characteristics is:
$$\begin{aligned} \Pr(N_m = n \mid S_m, X^D_m, X^C_m) &= \Phi(T_m(n)) - \Phi(T_m(n+1)) \\ &= \Phi\left(S_m X^D_m \beta - X^C_m \gamma - \alpha(n) S_m - \delta(n)\right) \\ &\quad - \Phi\left(S_m X^D_m \beta - X^C_m \gamma - \alpha(n+1) S_m - \delta(n+1)\right) \end{aligned}$$
The model is very simple to estimate. Almost any econometric software package includes a command for the estimation of the ordered probit.
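The ordered probit probabilities and the log-likelihood can be written out directly. The sketch below is a minimal illustration under simplifying assumptions: the indices X^D_m beta and X^C_m gamma are passed in as precomputed scalars, sigma is normalized to one, and all function names and numbers are hypothetical. It only evaluates the likelihood; a numerical optimizer would be needed to maximize it.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF; math.erf also handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def threshold(n, S, xd_beta, xc_gamma, alpha, delta):
    """T_m(n) = S*X^D beta - X^C gamma - alpha(n)*S - delta(n).

    alpha and delta are lists holding alpha(1..N*) and delta(1..N*);
    T_m(0) = +inf and T_m(N* + 1) = -inf.
    """
    if n == 0:
        return math.inf
    if n > len(alpha):
        return -math.inf
    return S * xd_beta - xc_gamma - alpha[n - 1] * S - delta[n - 1]


def prob_n_firms(n, S, xd_beta, xc_gamma, alpha, delta):
    """Pr(N_m = n | S_m, X_m) = Phi(T_m(n)) - Phi(T_m(n + 1))."""
    upper = threshold(n, S, xd_beta, xc_gamma, alpha, delta)
    lower = threshold(n + 1, S, xd_beta, xc_gamma, alpha, delta)
    return std_normal_cdf(upper) - std_normal_cdf(lower)


def log_likelihood(markets, alpha, delta):
    """Sum of log Pr(N_m | .) over markets.

    Each market is a tuple (N_m, S_m, xd_beta_m, xc_gamma_m). Assumes the
    thresholds are decreasing in n so every probability is positive.
    """
    return sum(
        math.log(prob_n_firms(n, S, xb, xg, alpha, delta))
        for n, S, xb, xg in markets
    )


# Hypothetical parameter values and data, for illustration only.
alpha = [0.0, 0.3, 0.5]
delta = [0.5, 0.8, 1.0]
markets = [(1, 2.0, 1.0, 0.2), (3, 6.0, 1.0, 0.2), (0, 0.5, 1.0, 0.2)]
print(log_likelihood(markets, alpha, delta))
```

Because the thresholds depend on market size through both the demand index and alpha(n), the cut points vary across markets, which is worth keeping in mind when mapping the model to a packaged ordered probit routine.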
[Application and Main Results] Data: 202 "isolated local markets". Why isolated local markets? It is very important to include in our definition of market all the firms that are actually competing in the market and no more. Otherwise, we can introduce significant biases in the estimated parameters. If our definition of market is too narrow, such that we do not include all the firms that are actually in a market, we will conclude that there is little entry either because fixed costs are too large or because the degree of competition is strong: i.e., we will overestimate the $\alpha$'s or the $\delta$'s or both. If our definition of market is too broad, such that we include firms that are not actually competing in the same market, we will conclude that there is significant entry, and to rationalize this we need fixed costs to be small or a low degree of competition between firms. Therefore, we will underestimate the $\alpha$'s or the $\delta$'s or both.
The most common mistake leading to a broad definition of market is to treat a large city as a single market. The most common mistake leading to a narrow definition of market is to use small towns that are close to each other, or close to a large town. To avoid these types of errors, BR construct "isolated local markets". The criteria to select isolated markets in the US are: (a) at least 20 miles from the nearest town of 1,000 people or more; (b) at least 100 miles from cities with 100,000 people or more.
Population sizes between 500 and 75,000 people [see Figure 2 in the ]. Industries (16):
several retail industries (auto dealers, movie theaters,...) and many professions (doctors,
dentists, plumbers, barbers, ...). The model is estimated for each industry separately.
Let $S(N)$ be the minimum market size needed to sustain $N$ firms in the market. The $S(N)$ are called "entry thresholds" and they can be obtained (estimated) using the estimated parameters. They do not depend on the normalization $\sigma = 1$. The main empirical results are: (a) For most industries, both $\alpha(N)$ and $\delta(N)$ increase with $N$. (b) There are very significant cross-industry differences in entry thresholds $S(N)$. (c) For most of the industries, per-firm entry thresholds $S(N)/N$ become constant for values of $N$ greater than 4 or 5. Contestable markets?
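One way to make the definition of the entry threshold concrete, assuming it is evaluated at $\varepsilon_m = 0$ and at given values of the market characteristics (an assumption made here for illustration), is to set the profit with $N$ firms equal to zero and solve for market size:

```latex
S(N)\,\bigl(X^{D}\beta - \alpha(N)\bigr) - X^{C}\gamma - \delta(N) = 0
\quad\Longrightarrow\quad
S(N) = \frac{X^{C}\gamma + \delta(N)}{X^{D}\beta - \alpha(N)}
```

Numerator and denominator are each identified only relative to $\sigma$, so the scale cancels in the ratio, which is consistent with the statement that the thresholds do not depend on the normalization.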
2.1. Some questions about the econometrics.
(1) How is the number of potential entrants chosen in the model?
If $N^*$ is constant across markets, it can be estimated consistently as $\hat{N}^* = \max\{n_m : m = 1, 2, ..., M\}$. If the number of potential entrants is not constant across markets, we need to make an assumption about the variables that determine $N^*_m$. For instance, if $N^*_m = \gamma S_m$, where $\gamma$ is a parameter, a consistent estimator of $N^*_m$ is $\hat{N}^*_m = S_m \max\{n_m / S_m : m = 1, 2, ..., M\}$.
(2) Are the estimates of the other parameters very sensitive to the value of $N^*$? Not in this model. Explain ...
(3) What if the model includes an unobservable in the variable profit?
(4) What is the intuition behind the identification of the effect of competition in this model? Are there no endogeneity problems?
(5) What if the industry under study is spatially concentrated in a few local markets, such that $n_m = 0$ for most of the markets? What if competition is not really at the level of local markets but at the level of global markets?
Today's and next week's lectures deal with several extensions of the Bresnahan-Reiss entry game:
- Nonparametric specification and identification;
- Introducing dynamics when panel data is available;
- Endogenous product characteristics;
- Heterogeneous firms;
- Relaxing the "isolated markets" assumption.
By the way, by "small" and "isolated" markets we do not mean that people living in these towns are "isolated from civilization" and don't have access to electricity or cable TV :-). By "small", what BR mean is that the population is smaller than 50,000 people; and by "isolated" they mean that these towns are at least 100 miles away from cities with 100,000 people or more. For the products under study, consumers have to travel to the store to buy the product. Therefore, the concept of "isolated market" introduced by BR tries to guarantee that consumers in a market in the sample are not buying the product in other towns, and that the group of consumers in these towns does not include people living in towns outside the sample. This concept of geographic isolation is not unrealistic for many products and markets. The problem with this assumption is not its realism but that it significantly limits the markets and the industries that we can incorporate in our analysis. For some industries, it eliminates the most interesting and profitable markets, which are typically located in urban areas. And for other industries, the number of isolated markets is so small that we cannot implement this empirical analysis. We will come back to this issue later in the course, and we will relax the assumption of isolated markets.
3. Nonparametric identification of Bresnahan-Reiss model
**** IN AN APPENDEX? MERGED WITH THE BASIC MODEL? ADD TO THE
PREVIOUS SECTION BUR MUCH REDUCED ******
There are many assumptions in the BR model. To distinguish the key identifying assumptions from the accessory assumptions, it is useful to present a nonparametric version of their model.
Assumption 1 [Homogeneous firms]: The profit of a firm active in the market depends on exogenous market characteristics, and on the (endogenous) number of firms active in the market. However, every firm active in the market obtains the same profit.
The profit function of an active firm in market $m$ when the number of active firms is $n$ is:
$$\Pi(n, X_m, \varepsilon_m)$$
where $X_m$ is a vector of exogenous market characteristics that are observed by the researcher, and $\varepsilon_m$ is a vector of exogenous market characteristics that are unobserved to the researcher. Note that here we use $n$ to represent a hypothetical value for the number of firms, not the actual realization, which we denote by $n_m$.
The assumption of homogeneous firms is made both because of data limitations (i.e., only market level data) and for convenience (i.e., equilibrium uniqueness and no econometric problems associated with the endogeneity of the decisions of "other firms"). However, it also importantly restricts the type of questions that we can analyze with these models. Therefore, this is the first assumption that we will relax in the BR model.
Assumption 2: $\Pi(n, X_m, \varepsilon_m)$ is a strictly decreasing function in $n$.
This is a very weak assumption in a model of competition. In fact, it should hold even if the active firms in the market collude to achieve the monopoly outcome and share the total profits under that outcome.
Under Assumptions 1 and 2, the model establishes that the observed number of firms in the market, $n_m$, is determined by the following conditions:
$$\Pi(n_m, X_m, \varepsilon_m) \geq 0 \quad \text{AND} \quad \Pi(n_m + 1, X_m, \varepsilon_m) < 0$$
Assumption 3: The unobserved $\varepsilon_m$ enters additively in the profit function, and its distribution function, $F_\varepsilon$, is independent of $(n, X_m)$.
$$\Pi(n, X_m, \varepsilon_m) = \pi(n, X_m) - \varepsilon_m$$
The additive separability of the unobservable, alone without the independence assumption, is not really an assumption because we can always define $\varepsilon_m$ as $\pi(n, X_m) - \Pi(n, X_m, \varepsilon_m)$. Therefore, the most important part of Assumption 3 is the independence between $\varepsilon_m$ and $X_m$.
This assumption plays an important role in identification. It can be relaxed to a certain extent. For instance, it is straightforward to allow for $\varepsilon_m$ and $X_m$ to be mean dependent. Define the conditional mean function $\mu(X_m) \equiv E(\varepsilon_m \mid X_m)$. Then, we can define $\pi^*(n, X_m) \equiv \pi(n, X_m) - \mu(X_m)$ and $\varepsilon^*_m \equiv \varepsilon_m - \mu(X_m)$, and therefore $\pi(n, X_m) - \varepsilon_m = \pi^*(n, X_m) - \varepsilon^*_m$. By construction, $\varepsilon^*_m$ is mean independent of $X_m$, and we can prove (see below) the identification of the function $\pi^*(n, X_m)$. The dependence of $\pi^*(n, X_m)$ with respect to $n$ is the same as the dependence of $\pi(n, X_m)$ with respect to $n$. So, if we are interested only in how the profit function depends on the number of firms (for different values of $X$), the functions $\pi^*$ and $\pi$ are equivalent.
Therefore, without loss of generality we can assume that $\varepsilon_m$ is mean independent of $X_m$ and has zero mean.
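According to this model, for any $n \in \{0, 1, ...\}$:
$$\begin{aligned} \{n_m = n\} &\Leftrightarrow \{\pi(n, X_m) - \varepsilon_m \geq 0 \ \text{AND} \ \pi(n+1, X_m) - \varepsilon_m < 0\} \\ &\Leftrightarrow \{\pi(n+1, X_m) < \varepsilon_m \leq \pi(n, X_m)\} \end{aligned}$$
Or,
$$\{n_m > n\} \Leftrightarrow \{\varepsilon_m \leq \pi(n+1, X_m)\}$$
Therefore, the predictions of the model can be summarized by the following expression for the conditional distribution of the number of firms: for any value $(n, X)$:
$$\Pr(n_m > n \mid X_m = X) = F_\varepsilon(\pi(n+1, X))$$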
Note that the probability $\Pr(n_m > n \mid X_m = X)$ can be identified from the data using a nonparametric estimator such as a frequency estimator (if $X$ includes only discrete variables) or a kernel estimator. For instance, a consistent kernel estimator of $\Pr(n_m > n \mid X_m = X)$ is:
$$\frac{\sum_{m=1}^{M} 1\{n_m > n\} \, K\left(\frac{X_m - X}{b}\right)}{\sum_{m=1}^{M} K\left(\frac{X_m - X}{b}\right)}$$
where $1\{.\}$ is the indicator function, $K(.)$ is a kernel function (e.g., the pdf of the standard normal distribution), and $b$ is a bandwidth.
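The kernel estimator above can be sketched as follows for a scalar market characteristic, using the standard normal pdf as the kernel. The function names and the small data set are hypothetical, and the bandwidth is taken as given rather than chosen by any data-driven rule.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Pdf of the standard normal distribution."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def prob_more_than(n, x, n_obs, x_obs, bandwidth):
    """Kernel estimate of Pr(n_m > n | X_m = x) for scalar X.

    n_obs and x_obs hold the observed number of firms and the market
    characteristic for each market. Assumes at least one observation
    receives non-negligible kernel weight, so the denominator is positive.
    """
    weights = [gaussian_kernel((xm - x) / bandwidth) for xm in x_obs]
    numerator = sum(w for w, nm in zip(weights, n_obs) if nm > n)
    denominator = sum(weights)
    return numerator / denominator


# Hypothetical data: four markets, with more firms in larger markets.
x_obs = [1.0, 2.0, 3.0, 4.0]
n_obs = [0, 1, 2, 3]
for n in range(3):
    print(n, prob_more_than(n, 2.5, n_obs, x_obs, bandwidth=1.0))
```

By construction the estimates are weakly decreasing in n at any fixed value of x, as a survival function should be.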
Given $\Pr(n_m > n \mid X_m = X)$, we want to identify the distribution function $F_\varepsilon$ and the profit function $\pi(n, X)$. Without further assumptions we cannot separately identify $F_\varepsilon$ and $\pi(n, X)$.
Assumption 4: Let $\sigma^2$ be the variance of $\varepsilon_m$. The probability distribution of $\varepsilon_m / \sigma$ is known to the researcher. For instance, $\varepsilon_m / \sigma$ is distributed $N(0, 1)$.
Under Assumption 4, we can identify $\pi(n, X)$ up to scale:
$$\frac{\pi(n, X)}{\sigma} = \Phi^{-1}\left(\Pr(n_m > n - 1 \mid X_m = X)\right)$$
such that $\frac{\pi(1, X)}{\sigma} = \Phi^{-1}(\Pr(n_m > 0 \mid X_m = X))$, $\frac{\pi(2, X)}{\sigma} = \Phi^{-1}(\Pr(n_m > 1 \mid X_m = X))$, and so on, where $\Phi^{-1}$ is the inverse of the CDF of the standard normal distribution.
Though the profit function is only identified up to scale, we can identify the percentage reduction in profits when the number of firms increases:
$$\begin{aligned} \Delta(n, X) &\equiv \ln(\pi(n+1, X)) - \ln(\pi(n, X)) \\ &= \ln\left(\frac{\pi(n+1, X)}{\sigma}\right) - \ln\left(\frac{\pi(n, X)}{\sigma}\right) \\ &= \ln\left(\Phi^{-1}\left(\Pr(n_m > n \mid X_m = X)\right)\right) - \ln\left(\Phi^{-1}\left(\Pr(n_m > n - 1 \mid X_m = X)\right)\right) \end{aligned}$$
$\Delta(n, X)$ is the percentage reduction in the profit of a firm in a market with characteristics $X$ when we exogenously change the number of competitors from $n$ to $n + 1$.
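The inversion step can be illustrated with the standard library's normal quantile function. The sketch below assumes standard normal errors and takes the two conditional probabilities as given inputs (in practice they would be estimated); the function names are hypothetical. It also makes explicit that the log difference is only defined when both scaled profits are positive, i.e., when both probabilities exceed one half.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def scaled_profit(prob_more_than_n_minus_1: float) -> float:
    """pi(n, X) / sigma = Phi^{-1}(Pr(n_m > n - 1 | X)).

    The probability must lie strictly between 0 and 1.
    """
    return _STD_NORMAL.inv_cdf(prob_more_than_n_minus_1)


def log_profit_change(prob_more_than_n: float, prob_more_than_n_minus_1: float) -> float:
    """ln(pi(n + 1, X)) - ln(pi(n, X)), which is free of sigma."""
    profit_next = scaled_profit(prob_more_than_n)
    profit_now = scaled_profit(prob_more_than_n_minus_1)
    if profit_next <= 0 or profit_now <= 0:
        raise ValueError("log change requires both scaled profits to be positive")
    return math.log(profit_next) - math.log(profit_now)


# Hypothetical probabilities: Pr(n_m > 0 | X) = 0.9 and Pr(n_m > 1 | X) = 0.7.
print(scaled_profit(0.9))
print(log_profit_change(0.7, 0.9))
```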
In fact, $\Delta(n, X)$ is a good measure of the % increase in total welfare associated with an exogenous increase in the number of firms. If $\Delta(n, X)$ is close to zero, then a market with characteristics $X$ and a number of firms $n$ is very close, in terms of firms' profits and consumer welfare, to a perfectly competitive market. This is an important parameter of interest, for instance, for a regulator who considers the implications of a policy that tries to encourage more entry, e.g., a policy that reduces entry costs.
Note that the value of the potential number of firms, $N^*$, does not play any role in the predictions or the identification of the model.
Given a known distribution for $\varepsilon_m / \sigma$, the assumption of homoskedasticity of $\varepsilon_m$ is testable. Also, under the assumption of homoskedasticity of $\varepsilon_m / \sigma$, and given a variable in $X$ that has a monotonic effect on $\pi(n, X)$ and a large range of variation (e.g., market size), the distribution of $\varepsilon_m / \sigma$ can be identified nonparametrically.
Therefore, the key identifying assumptions of the model are: (a) homogeneous profits for all firms; (b) free entry conditions; and (c) our definition of local market. We will relax these assumptions.
4. Empirical Models of Market Entry with Heterogeneous firms
The assumption that all potential entrants and incumbents are homogeneous in their variable profits and entry costs is very convenient and facilitates the estimation, but it is also very unrealistic in many applications. A potentially very important factor in the determination of market structure is that firms (potential entrants) are ex-ante heterogeneous. In many applications we want to take this heterogeneity into account. Allowing for firm heterogeneity introduces two important issues in these models: endogenous explanatory variables and multiple equilibria. We will comment on different approaches that have been used to deal with these issues.
4.1. Model. Consider an industry with $N$ potential entrants, for instance, the airline industry. These potential entrants decide whether to be active or not in a market. We observe $M$ different realizations of this entry game. These realizations can be different geographic markets (different routes or city pairs, e.g., Toronto-New York, Montreal-Washington, etc.) or different periods of time. For the sake of concreteness, we refer to these different realizations of the entry game as "local markets" or "submarkets". We index firms with $i \in \{1, 2, ..., N\}$ and submarkets with $m \in \{1, 2, ..., M\}$.
*****
Let $a_{im} \in \{0, 1\}$ be the binary indicator of the event "firm $i$ is active in market $m$". For a given market $m$, the $N$ firms choose simultaneously whether to be active or not in the market. When making its decision, a firm wants to maximize its profit.
Once firms have decided to be active or not in the market, active firms compete in prices or in quantities and firms' profits are realized. For the moment, we do not make explicit the specific form of competition in this second part of the game, or the structure of demand and variable costs. We take as given an "indirect profit function" that depends on exogenous market and firm characteristics and on the number and the identity of the active firms in the market. This indirect profit function comes from a model of price or quantity competition, but at this point we do not make that model explicit. Also, we consider that the researcher does not have access to data on firms' prices and quantities, such that demand and variable cost parameters in the profit function cannot be estimated from demand and/or Bertrand/Cournot best response functions.
The (indirect) profit function of an incumbent firm depends on market and firm characteristics affecting demand and costs, and on the entry decisions of the other potential entrants:
$$\pi_{im} = \begin{cases} \pi_i(x_{im}, \varepsilon_{im}, a_{-im}) & \text{if } a_{im} = 1 \\ 0 & \text{if } a_{im} = 0 \end{cases}$$
where $x_{im}$ and $\varepsilon_{im}$ are vectors of exogenous market and firm characteristics, and $a_{-im} \equiv \{a_{jm} : j \neq i\}$. The vector $x_{im}$ is observable to the researcher while $\varepsilon_{im}$ is unobserved to the researcher. For the moment we assume that $x_m \equiv \{x_{1m}, x_{2m}, \ldots, x_{Nm}\}$ and $\varepsilon_m \equiv \{\varepsilon_{1m}, \varepsilon_{2m}, \ldots, \varepsilon_{Nm}\}$ are common knowledge for all players.
For instance, in the example of the airline industry, the vector $x_{im}$ may include market characteristics such as population and socioeconomic characteristics in the two cities (that affect demand), characteristics of the airports such as measures of congestion (that affect costs), and firm characteristics such as the number of other connections that the airline has in the two airports (that affect operating costs due to economies of scale and scope).
The $N$ firms choose simultaneously $\{a_{1m}, a_{2m}, \ldots, a_{Nm}\}$ and the assumptions of Nash equilibrium hold. A Nash equilibrium in this entry game is an $N$-tuple $a_m^* = (a_{1m}^*, a_{2m}^*, \ldots, a_{Nm}^*)$ such that for any player $i$:
$$a_{im}^* = 1\{\pi_i(x_{im}, \varepsilon_{im}, a_{-im}^*) \geq 0\}$$
where $1\{\cdot\}$ is the indicator function.
Given a dataset with information on $\{a_{im}, x_{im}\}$ for every firm in the $M$ markets, we want to use this model to learn about the structure of the profit function $\pi_i(\cdot)$. In these applications, we are particularly interested in the effect of other firms' entry decisions on a firm's profit. For instance, how Southwest's entry in the Chicago-Boston submarket affects the profit of American Airlines.
For the sake of concreteness, consider the following specification of the profit function:
$$\pi_{im} = x_{im} \beta_i - \sum_{j \neq i} a_{jm} \delta_{ij} + \varepsilon_{im}$$
where $x_{im}$ is a $1 \times K$ vector of observable market and firm characteristics; $\beta_i$ is a $K \times 1$ vector of parameters; $\delta_i = \{\delta_{ij} : j \neq i\}$ is an $(N-1) \times 1$ vector of parameters, with $\delta_{ij}$ being the effect of firm $j$'s entry on firm $i$'s profit; and $\varepsilon_{im}$ is a zero mean random variable that is observable to the players but unobservable to the econometrician.
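The Nash condition above can be checked by brute force for a small number of firms. The following is a minimal sketch, not part of the original text: the function names and the numerical values of $x_{im}\beta_i$, $\delta_{ij}$ and $\varepsilon_{im}$ are hypothetical and chosen only for illustration. It enumerates all $2^N$ action profiles and keeps those in which every firm's action equals its best response under the linear profit specification.

```python
from itertools import product

def profit(i, actions, xb, delta, eps):
    """Profit of firm i if active: xb[i] - sum_{j != i} a_j * delta[i][j] + eps[i]."""
    rivals = sum(actions[j] * delta[i][j] for j in range(len(actions)) if j != i)
    return xb[i] - rivals + eps[i]

def pure_nash_equilibria(xb, delta, eps):
    """Return every profile a with a_i = 1{profit_i(a_{-i}) >= 0} for all i."""
    n = len(xb)
    equilibria = []
    for a in product((0, 1), repeat=n):
        if all(a[i] == int(profit(i, a, xb, delta, eps) >= 0) for i in range(n)):
            equilibria.append(a)
    return equilibria

# Hypothetical two-firm market: x_i * beta_i = 0.5 and delta_ij = 1 for both firms.
xb = [0.5, 0.5]
delta = [[0.0, 1.0], [1.0, 0.0]]
print(pure_nash_equilibria(xb, delta, eps=[0.0, 0.0]))
print(pure_nash_equilibria(xb, delta, eps=[1.0, 1.0]))
```

With these illustrative numbers, the first call falls in a region where a monopoly is profitable for either firm but a duopoly is not, so the enumeration returns more than one profile; the second call has larger shocks and a single equilibrium.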
We assume that $\varepsilon_{im}$ is independent of $x_m$, i.i.d. over $m$, and independent across $i$. If $x_{im}$ includes a constant term, then without loss of generality $E(\varepsilon_{im}) = 0$. Define $\sigma_i^2 \equiv Var(\varepsilon_{im})$. Then, we also assume that the probability distribution of $\varepsilon_{im}/\sigma_i$ is known to the researcher. For instance, $\varepsilon_{im}/\sigma_i$ has a standard normal distribution.
The econometric model can be described as a system of $N$ simultaneous equations where the endogenous variables are the entry dummy variables:
$$a_{im} = 1\Big\{ x_{im} \beta_i - \sum_{j \neq i} a_{jm} \delta_{ij} + \varepsilon_{im} \geq 0 \Big\}$$
We want to estimate the vector of parameters $\theta = \{\beta_i/\sigma_i, \ \delta_i/\sigma_i : i = 1, 2, \ldots, N\}$.
There are two main econometric issues in the estimation of this model: (1) endogenous explanatory variables, $a_{jm}$; and (2) multiple equilibria.
4.2. Endogeneity of other players' actions. In the structural (best response) equation
$$a_{im} = 1\Big\{ x_{im} \beta_i - \sum_{j \neq i} a_{jm} \delta_{ij} + \varepsilon_{im} \geq 0 \Big\}$$
the actions of the other players, $\{a_{jm} : j \neq i\}$, are endogenous in an econometric sense. That is, $a_{jm}$ is correlated with the unobserved term $\varepsilon_{im}$, and ignoring this correlation can lead to serious biases in our estimates of the parameters $\beta_i$ and $\delta_i$.
There are two sources of endogeneity, or correlation between $a_{jm}$ and $\varepsilon_{im}$: simultaneity, and common unobservables between $\varepsilon_{im}$ and $\varepsilon_{jm}$. It is interesting to distinguish between these two sources of endogeneity because they bias the parameter $\delta_{ij}$ in opposite directions.
Simultaneity. An equilibrium of the model is a reduced form equation where we represent the action of each player as a function only of the exogenous variables in $x_m$ and $\varepsilon_m$. In this reduced form, $a_{jm}$ depends on $\varepsilon_{im}$, and it is possible to show that this dependence is negative: keeping all the other exogenous variables constant, if $\varepsilon_{im}$ is large enough then firm $i$ enters and, facing a competitor, firm $j$ is less likely to be active; if $\varepsilon_{im}$ is small enough then firm $i$ stays out and firm $j$ is more likely to be active. Suppose that our estimator of $\delta_{ij}$ ignores this dependence. Then the negative dependence between $a_{jm}$ and $\varepsilon_{im}$ contributes an upward bias to the estimator of $\delta_{ij}$.
That is, we will spuriously over-estimate the negative effect of Southwest on the profit of American Airlines because Southwest tends to enter those markets where American Airlines has low values of $\varepsilon_{im}$.
Positively correlated unobservables. It is reasonable to expect that $\varepsilon_{im}$ and $\varepsilon_{jm}$ are positively correlated. This is because both $\varepsilon_{im}$ and $\varepsilon_{jm}$ contain unobserved market characteristics that affect all the firms in the same market in a similar way, or at least in the same direction. Some markets are more profitable than others for every firm, and part of this market heterogeneity is observable to firms but unobservable to us as researchers. The positive correlation between $\varepsilon_{im}$ and $\varepsilon_{jm}$ also generates a positive dependence between $a_{jm}$ and $\varepsilon_{im}$.
For instance, suppose that $\varepsilon_{im} = \omega_m + u_{im}$, where $\omega_m$ represents the common market effect and $u_{im}$ is independent across firms. Then, keeping $x_m$ and the unobserved $u$ variables constant, if $\omega_m$ is small enough then $\varepsilon_{im}$ is small and $a_{jm} = 0$, and if $\omega_m$ is large enough then $\varepsilon_{im}$ is large and $a_{jm} = 1$. Suppose that our estimator of $\delta_{ij}$ ignores this dependence. Then the positive dependence between $a_{jm}$ and $\varepsilon_{im}$ contributes a downward bias to the estimator of $\delta_{ij}$. In fact, the estimate of $\delta_{ij}$ could have the wrong sign, i.e., be negative instead of positive.
That is, we can spuriously find that American Airlines benefits from the operation of Continental in the same market because we tend to observe that these firms are active in the same markets. This positive correlation between $a_{im}$ and $a_{jm}$ can be completely driven by the positive correlation between $\varepsilon_{im}$ and $\varepsilon_{jm}$.
These two sources of endogeneity generate biases of opposite sign in $\delta_{ij}$. There is evidence from different empirical applications that the bias due to unobserved market effects is much more important than the simultaneity bias. Examples: Collard-Wexler (WP, 2007), US cement industry; Aguirregabiria and Mira (Econometrica, 2007), different retail industries in Chile; Aguirregabiria and Ho (WP, 2007), US airline industry; Ellickson and Misra (Marketing Science, 2008), US supermarket industry.
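The opposite signs of the two dependencies can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: a symmetric two-firm game with $x_{im}\beta_i = 0.5$ and $\delta_{ij} = 1$, standard normal $u_{im}$, a normal common effect $\omega_m$ with standard deviation `sd_omega`, and a coin flip to select between $(1,0)$ and $(0,1)$ when both are equilibria. All names are hypothetical.

```python
import random

def firm2_action(mono1, mono2, delta, rng):
    """Equilibrium action of firm 2 given both firms' monopoly profits."""
    if mono2 - delta >= 0:      # entering is dominant for firm 2
        return 1
    if mono2 < 0:               # staying out is dominant for firm 2
        return 0
    if mono1 - delta >= 0:      # firm 1 surely enters, so firm 2 stays out
        return 0
    if mono1 < 0:               # firm 1 surely stays out, so firm 2 enters
        return 1
    return rng.choice((0, 1))   # (1,0) and (0,1) are both equilibria

def cov_a2_eps1(sd_omega, n_markets=20000, xb=0.5, delta=1.0, seed=0):
    """Sample covariance between a_2m and eps_1m across simulated markets."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_markets):
        omega = sd_omega * rng.gauss(0.0, 1.0)   # common market effect
        eps1 = omega + rng.gauss(0.0, 1.0)
        eps2 = omega + rng.gauss(0.0, 1.0)
        a2 = firm2_action(xb + eps1, xb + eps2, delta, rng)
        pairs.append((a2, eps1))
    mean_a = sum(a for a, _ in pairs) / n_markets
    mean_e = sum(e for _, e in pairs) / n_markets
    return sum((a - mean_a) * (e - mean_e) for a, e in pairs) / n_markets

print("independent shocks:", cov_a2_eps1(sd_omega=0.0))
print("common market effect:", cov_a2_eps1(sd_omega=2.0))
```

With independent shocks only the simultaneity channel operates and the covariance is negative; with a sufficiently large common effect the covariance turns positive, which is the case the cited applications suggest is more relevant in practice.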
How do we deal with this endogeneity problem? The intuition for identification in this model is similar to identification using standard Instrumental Variables (IV) and Control Function (CF) approaches.
"IV approach": There are exogenous firm characteristics in $x_{jm}$ that affect the action of firm $j$ but do not have a direct effect on the action of firm $i$: i.e., observable characteristics with $\beta_j \neq 0$ but $\beta_i = 0$.
"CF approach": There is an observable variable $C_{im}$ that "proxies" or "controls for" the endogenous part of $\varepsilon_{im}$ such that, if we include $C_{im}$ in the equation for firm $i$, then the new error term in that equation and $a_{jm}$ become independent (conditional on $C_{im}$).
The method of instrumental variables is the most common approach to deal with endogeneity in linear models. However, standard IV or GMM estimators cannot be applied directly to discrete choice models with endogenous explanatory variables. Control function approaches have been proposed by Rivers and Vuong (1988) and Vytlacil and Yildiz (2006). These approaches have not yet been extended to deal with models with multiple equilibria or "multiple reduced forms".
An alternative approach is maximum likelihood. If we derive the probability distribution of the dummy endogenous variables conditional on the exogenous variables (i.e., the reduced form of the model), we can use these probabilities to estimate the model by maximum likelihood:
$$l(\theta) = \sum_{m=1}^{M} \ln \Pr(a_{1m}, a_{2m}, \ldots, a_{Nm} \mid x_m, \theta)$$
This is the approach that has been most commonly used in this literature. However, we will have to deal with the problem of multiple equilibria.
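4.3. Multiple equilibria. Consider the model with two players and assume that $\delta_1 \geq 0$ and $\delta_2 \geq 0$.
$$a_1 = 1\{x_1\beta_1 - \delta_1 a_2 + \varepsilon_1 \geq 0\}$$
$$a_2 = 1\{x_2\beta_2 - \delta_2 a_1 + \varepsilon_2 \geq 0\}$$
The reduced form of the model is a representation of the endogenous variables $(a_1, a_2)$ only in terms of exogenous variables and parameters. This is the reduced form of this model:
$$\begin{aligned}
\{x_1\beta_1 + \varepsilon_1 < 0\} \ \& \ \{x_2\beta_2 + \varepsilon_2 < 0\} &\Rightarrow (a_1, a_2) = (0, 0) \\
\{x_1\beta_1 - \delta_1 + \varepsilon_1 \geq 0\} \ \& \ \{x_2\beta_2 - \delta_2 + \varepsilon_2 \geq 0\} &\Rightarrow (a_1, a_2) = (1, 1) \\
\{x_1\beta_1 - \delta_1 + \varepsilon_1 < 0\} \ \& \ \{x_2\beta_2 + \varepsilon_2 \geq 0\} &\Rightarrow (a_1, a_2) = (0, 1) \\
\{x_1\beta_1 + \varepsilon_1 \geq 0\} \ \& \ \{x_2\beta_2 - \delta_2 + \varepsilon_2 < 0\} &\Rightarrow (a_1, a_2) = (1, 0)
\end{aligned}$$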
The graphical representation in the space $(\varepsilon_1, \varepsilon_2)$ is: [figure not reproduced]
Note that when:
$$\{0 \leq x_1\beta_1 + \varepsilon_1 < \delta_1\} \ \text{and} \ \{0 \leq x_2\beta_2 + \varepsilon_2 < \delta_2\}$$
we have two Nash equilibria: $(a_1, a_2) = (0, 1)$ and $(a_1, a_2) = (1, 0)$. For this range of values of $(\varepsilon_1, \varepsilon_2)$, the reduced form (i.e., the equilibrium) is not uniquely determined. Therefore, we cannot uniquely determine the probability $\Pr(a_{1m}, a_{2m} \mid x_m, \theta)$ that we need to estimate the model by ML. We know $\Pr(1, 1 \mid \theta)$ and $\Pr(0, 0 \mid \theta)$, but we only have lower and upper bounds for $\Pr(0, 1 \mid \theta)$ and $\Pr(1, 0 \mid \theta)$.
The problem of indeterminacy of the probabilities of different outcomes becomes even more serious in empirical games with more than two players and/or more than two choice alternatives.
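The known probabilities and the bounds can be written out explicitly in a special case. The sketch below assumes, for illustration only, that $\varepsilon_1$ and $\varepsilon_2$ are independent standard normal; the function name and the numerical inputs are hypothetical. The upper bound for an asymmetric outcome assigns the whole multiplicity region to it, and the lower bound assigns none of it.

```python
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF

def outcome_probabilities(xb1, xb2, d1, d2):
    """Pr(0,0), Pr(1,1) and (lower, upper) bounds for Pr(0,1) and Pr(1,0)."""
    p00 = Phi(-xb1) * Phi(-xb2)
    p11 = (1 - Phi(d1 - xb1)) * (1 - Phi(d2 - xb2))
    # Region where both (0,1) and (1,0) are Nash equilibria.
    p_multi = (Phi(d1 - xb1) - Phi(-xb1)) * (Phi(d2 - xb2) - Phi(-xb2))
    up01 = Phi(d1 - xb1) * (1 - Phi(-xb2))   # (0,1) is an equilibrium
    up10 = (1 - Phi(-xb1)) * Phi(d2 - xb2)   # (1,0) is an equilibrium
    return {
        "p00": p00,
        "p11": p11,
        "p01": (up01 - p_multi, up01),
        "p10": (up10 - p_multi, up10),
    }

probs = outcome_probabilities(xb1=0.5, xb2=0.5, d1=1.0, d2=1.0)
for outcome, value in probs.items():
    print(outcome, value)
```

Any equilibrium selection rule corresponds to splitting the probability of the multiplicity region between $(0,1)$ and $(1,0)$, so the lower bound of one outcome plus the upper bound of the other, together with $\Pr(0,0)$ and $\Pr(1,1)$, adds up to one.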
There have been different approaches to deal with this problem of multiple equilibria. Some authors have imposed additional structure on the model to guarantee equilibrium uniqueness, or at least uniqueness of some observable outcome (e.g., the number of entrants). A second group of studies does not impose additional structure and uses methods such as moment inequalities or pseudo maximum likelihood to estimate structural parameters. The main motivation of this second group of studies is that identification and multiple equilibria are different problems, and we do not need equilibrium uniqueness to identify parameters.
4.4. Identification and multiple equilibria. Tamer (2003) showed that all the parameters of the previous entry model with $N = 2$ are (point) identified under standard exclusion restrictions, and that multiple equilibria do not play any role in this identification result. Tamer's result can be extended to any number $N$ of players, as long as we have the appropriate exclusion restrictions.
More generally, equilibrium uniqueness is neither a necessary nor a sufficient condition for the identification of a model (Jovanovic, 1989). To see this, consider a model with a vector of structural parameters $\theta \in \Theta$, and define the mapping $C(\theta)$ from the set of parameters $\Theta$ to the set of measurable predictions of the model. For instance, $C(\theta)$ may contain the probability distribution of players' actions conditional on exogenous variables, $\Pr(a_1, a_2, \ldots, a_N \mid x, \theta)$. Multiple equilibria implies that the mapping $C(\cdot)$ is a correspondence. A model is not point-identified if, at the observed data (say $P^0 = \Pr(a_1, a_2, \ldots, a_N \mid x, \theta)$ for any vector of actions and $x$'s), the inverse mapping $C^{-1}$ is a correspondence. In general, $C$ being a function (i.e., equilibrium uniqueness) is neither a necessary nor a sufficient condition for $C^{-1}$ being a function (i.e., for point identification).
To illustrate the identification of a game with multiple equilibria, we start with a simple binary choice game with identical players, where the equilibrium probability $P$ is implicitly defined as the solution of the condition $P = \Phi(-1.8 + \theta P)$, where $\theta$ is a structural parameter and $\Phi(\cdot)$ is the CDF of the standard normal. Suppose that the true value $\theta_0$ is 3.5. It is possible to verify that the set of equilibria associated with $\theta_0$ is $C(\theta_0) = \{P^{(A)}(\theta_0) = 0.054, \ P^{(B)}(\theta_0) = 0.551, \ P^{(C)}(\theta_0) = 0.924\}$. The game has been played $M$ times and we observe players' actions for each realization of the game, $\{a_{im} : i, m\}$. Let $P_0$ be the population probability $\Pr(a_{im} = 1)$. Without further assumptions, the probability $P_0$ can be estimated consistently from the data. For instance, the simple frequency estimator $\hat{P}_0 = (NM)^{-1} \sum_{i,m} a_{im}$ is a consistent estimator of $P_0$. Without further assumptions, we do not know the relationship between the population probability $P_0$ and the equilibrium probabilities in $C(\theta_0)$. If all the sample observations come from the same equilibrium, then $P_0$ should be one of the points in $C(\theta_0)$. However, if the observations come from different equilibria in $C(\theta_0)$, then $P_0$ is a mixture of the elements in $C(\theta_0)$. To obtain identification, we can assume that every observation in the sample comes from the same equilibrium. Under this condition, since $P_0$ is an equilibrium associated with $\theta_0$, we know that $P_0 = \Phi(-1.8 + \theta_0 P_0)$. Given that $\Phi(\cdot)$ is an invertible function, we have that $\theta_0 = (\Phi^{-1}(P_0) + 1.8)/P_0$. Provided that $P_0$ is not zero, it is clear that $\theta_0$ is point identified regardless of the existence of multiple equilibria in the model.
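This example is easy to reproduce numerically. The sketch below is illustrative and uses hypothetical function names. It locates the fixed points of $P = \Phi(-1.8 + \theta P)$ on $[0, 1]$ by a grid search followed by bisection, and then applies the inversion $\theta_0 = (\Phi^{-1}(P_0) + 1.8)/P_0$ to each equilibrium.

```python
from statistics import NormalDist

norm = NormalDist()  # standard normal distribution

def excess(p, theta):
    """Phi(-1.8 + theta * p) - p, which is zero at an equilibrium probability."""
    return norm.cdf(-1.8 + theta * p) - p

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def equilibria(theta, grid=1000):
    """All equilibrium probabilities found by scanning [0, 1] for sign changes."""
    pts = [k / grid for k in range(grid + 1)]
    roots = []
    for lo, hi in zip(pts, pts[1:]):
        if excess(lo, theta) * excess(hi, theta) < 0:
            roots.append(bisect(lambda p: excess(p, theta), lo, hi))
    return roots

def theta_from_probability(p0):
    """Invert the equilibrium condition: theta = (Phi^{-1}(P0) + 1.8) / P0."""
    return (norm.inv_cdf(p0) + 1.8) / p0

roots = equilibria(3.5)
print(roots)
print([theta_from_probability(p) for p in roots])
```

The three roots are close to the values reported above, and whichever equilibrium generated the data, the inversion returns the same $\theta_0$. This is the sense in which multiplicity does not prevent point identification once all observations are assumed to come from one equilibrium.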
5. Introduction
Economists have devoted substantial efforts to understanding the determinants of market structure and the relationship between market structure, market competition and welfare. Why are some markets characterized by a small number of firms, while in others there are many competitors? Why does average firm size (and/or quality) vary across markets?
The work in IO between the 50s and 70s studied the relationship between market structure and market competition under the assumption (implicit or explicit) that market structure is exogenous (Bain, 1956). The typical relationship estimated in most of the empirical papers from these years is a regression of a market's average price-cost margin (e.g., Lerner index) on a measure of market concentration (e.g., Herfindahl index). During the 80s, different papers in theoretical IO focused on studying how market structure is determined endogenously in the equilibrium of strategic games. Since the 90s, the availability of firm level panel data from many different industries and the developments in structural econometrics have generated important applications that study market structure using structural models of market entry.
5.1. Structural model of market structure. What is a structural model of market structure?
It is an econometric model based on a game theoretic model for the determination of the number of firms in a market, as well as some firms' characteristics such as prices, quantity, capacity, quality, etc. The model makes assumptions about entry and market competition. The primitives of the model (i.e., the structural parameters) are the demand, firms' production technology, and institutional factors that affect entry costs, investment costs, etc.
Why structural models of market entry?
Many of the structural parameters in a structural model of entry are demand parameters and technological parameters which can be estimated from a demand system and a production function, respectively. Then, why are we concerned with the estimation of structural models of market entry? There are at least three reasons.
(1) Estimation of other parameters which do not appear in the demand system or in the production function, such as fixed operating costs, entry costs, sunk costs (exogenous and endogenous), investment costs, exit values, as well as firm heterogeneity. These parameters can play a key role in the determination of market structure.
(2) Data on prices and quantities may not be available. Unfortunately, this is the case in many markets/industries. We may have information on firms' revenues but not on prices and quantities of each product or on outputs and inputs.
(3) Controlling for endogenous entry/exit. Sometimes in the estimation of a demand system or of a production function we have to deal with the endogeneity of firms' entry and exit decisions. For instance, in the estimation of the demand of a differentiated product we may endogenize product characteristics. This is equivalent to considering a model of market entry.
The principle of Revealed Preference. The estimation of structural models of market entry is based on the principle of revealed preference. In the context of these models, this principle simply establishes that if we observe a firm operating in a market, this is because the value of being active is greater than the value of shutting down and putting its assets to alternative uses. Under this principle, firms' choices reveal something about the underlying latent profit (value).
The key endogeneity problem. Other firms' actions may be endogenous because they depend on the same unobserved market characteristics that affect a firm's entry decision.
Structural Discrete Choice games
Consider a population of individuals or agents (e.g., consumers, firms, workers, countries, etc.) that we can classify in groups. Examples of groups are: local markets (groups of firms); schools (groups of students); families (groups of individuals); firms (groups of workers); etc. The definition of the group is very important in a game because strategic or social interactions take place within groups but not between groups.
Let $i$ be the subindex for individuals within a group and $m$ the subindex for the group. Therefore, a pair $(i, m)$ identifies an individual. Let $a_{im}$ be an action of individual $(i, m)$. This action belongs to a set $A$. An individual's utility depends on his own action and on the actions of the members in his group:
$$U_{im} = U_{im}(a_{im}, a_{-im})$$
where $a_{-im}$ is the vector with the actions of the other members of the group. We define strategic (or social, group) interactions as the effect of $a_{-im}$ on the utility of individual $(i, m)$.
We consider here non-cooperative games. Under the assumption of Nash behavior and complete information, player $i$'s best response function is:
$$a_{im} = R_{im}(a_{-im}) \equiv \arg\max_{a \in A} U_{im}(a, a_{-im})$$
These best response functions are the main structural equations in our econometric games.
Examples in IO: Models of oligopolistic competition: price or quantity competition; entry and exit; adoption of new technologies; advertising; auctions; etc.
Examples in Education Economics: Peer student effects in school.
Examples in Health Economics: Smoking and social/group effects.
Examples in Labor: Labor supply within the family.
Examples in Consumption: Models of consumption with group externalities.
6. Identification
Consider the previous general model. An individual's utility may also depend on exogenous factors such as individuals' characteristics and characteristics of the group. More precisely,
$$U_{im} = U(a_{im}, a_{-im}, x_{im}, x_{-im}, \varepsilon_{im}, \varepsilon_{-im}, z_m, \omega_m)$$
where $x_{-im} \equiv \{x_{jm} : j \neq i\}$ and $\varepsilon_{-im} \equiv \{\varepsilon_{jm} : j \neq i\}$. The individual characteristics $x_{im}$ and the group characteristics $z_m$ are observable to the econometrician, but the characteristics $\varepsilon_{im}$ and $\omega_m$ are unobservable.
The best response function of an individual is a function:
$$a_{im} = R(a_{-im}, x_{im}, x_{-im}, \varepsilon_{im}, \varepsilon_{-im}, z_m, \omega_m)$$
The dependence of the function $R(\cdot)$ on $a_{-im}$ represents the existence of strategic interactions. Without that dependence the decision problem would be a single-agent decision problem.
Example 1: Price competition in a differentiated product market. The agents are firms and the group is the market where they compete. $a_{im}$ is the price of firm $i$ in market $m$. The utility of firm $i$ is its variable profit:
$$U_{im} = (a_{im} - c_{im}) \, q_{im}$$
where $c_{im}$ is a constant marginal cost and $q_{im}$ is the demand. Each firm produces a differentiated product and the demand for product $i$ is $q_{im} = H_m s_{im}$, where $H_m$ is the number of consumers in market $m$ and $s_{im}$ is the market share:
$$s_{im} = \frac{\exp\{x_{im}\alpha_x - \alpha_a a_{im} + z_m \alpha_z + \varepsilon_{im} + \omega_m\}}{1 + \sum_{j=1}^{N_m} \exp\{x_{jm}\alpha_x - \alpha_a a_{jm} + z_m \alpha_z + \varepsilon_{jm} + \omega_m\}}$$
$x_{im}$ and $\varepsilon_{im}$ are product characteristics; $z_m$ and $\omega_m$ are characteristics of the outside good; and $\alpha_x$, $\alpha_a$ and $\alpha_z$ are demand parameters. The first order conditions for the optimal response of firm $i$ imply the best response function:
$$a_{im} = c_{im} + \left(\alpha_a (1 - s_{im})\right)^{-1}$$
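As a numerical illustration of this example, equilibrium prices can be approximated by iterating on the first order condition. This is a sketch with hypothetical inputs: `quality` stands for the non-price part of the index inside the exponential for each firm, and the cost and parameter values are invented. The simple iteration converges for these values, but it is not claimed to converge for every parameter configuration.

```python
import math

def shares(prices, quality, alpha):
    """Logit market shares with an outside good normalized to exp(0) = 1."""
    expu = [math.exp(q - alpha * p) for q, p in zip(quality, prices)]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def foc_update(prices, costs, quality, alpha):
    """Price implied by the first order condition: c_i + 1 / (alpha * (1 - s_i))."""
    s = shares(prices, quality, alpha)
    return [c + 1.0 / (alpha * (1.0 - si)) for c, si in zip(costs, s)]

# Hypothetical three-firm market.
costs = [1.0, 1.0, 1.5]
quality = [2.0, 2.0, 2.5]
alpha = 1.0

prices = list(costs)            # start the iteration at marginal cost
for _ in range(500):
    prices = foc_update(prices, costs, quality, alpha)

print("prices:", prices)
print("shares:", shares(prices, quality, alpha))
```

At the resulting prices every firm's markup equals $1/(\alpha_a(1 - s_{im}))$, so firms with larger equilibrium shares charge larger markups.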
Example 2: Peer e¤ects in the school. TBW.
Under which conditions can we identify these strategic or social interactions? First, we will see that independence between unobservables and exogenous variables is not sufficient, i.e., $(\varepsilon_{im}, \varepsilon_{-im}, \omega_m) \perp (x_{im}, x_{-im}, z_m)$ is not sufficient to identify strategic interactions. To illustrate this in a simple framework, suppose that the best response functions are linear. Then, we can write these best response functions as:
$$a_{im} = \beta_x x_{im} + \beta_z z_m + \beta_a \bar{a}_m + \beta_{\bar{x}} \bar{x}_m + \varepsilon_{im} + \bar{\varepsilon}_m + \omega_m$$
where $\bar{a}_m \equiv \big(\sum_{j=1}^{N_m} a_{jm}\big)/N_m$, $\bar{x}_m \equiv \big(\sum_{j=1}^{N_m} x_{jm}\big)/N_m$ and $\bar{\varepsilon}_m \equiv \big(\sum_{j=1}^{N_m} \varepsilon_{jm}\big)/N_m$. The parameter $\beta_a$ captures the direct effect of strategic interactions. We first obtain the reduced form equations of the model, i.e., $a_{im}$ as a function only of exogenous variables. Note that deriving the reduced form equations is equivalent to obtaining the Nash equilibrium for each possible value of the exogenous factors. Given the linearity of the best response functions, the conditions for existence and uniqueness are straightforward (we need $\beta_a \neq 1$).
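First, averaging both sides of the best response equation within groups and solving for $\bar{a}_m$, we obtain:
$$\bar{a}_m = \left(\frac{\beta_x + \beta_{\bar{x}}}{1 - \beta_a}\right) \bar{x}_m + \left(\frac{\beta_z}{1 - \beta_a}\right) z_m + \left(\frac{2\bar{\varepsilon}_m + \omega_m}{1 - \beta_a}\right)$$
Now, we plug this expression into the structural equations (i.e., best response functions) to obtain the reduced form equations:
$$a_{im} = \beta_x x_{im} + \left(\frac{\beta_z}{1 - \beta_a}\right) z_m + \left(\frac{\beta_a \beta_x + \beta_{\bar{x}}}{1 - \beta_a}\right) \bar{x}_m + \varepsilon_{im} + \left(\frac{1 + \beta_a}{1 - \beta_a}\right) \bar{\varepsilon}_m + \left(\frac{\omega_m}{1 - \beta_a}\right)$$
Under the assumption that $(\varepsilon_{im}, \varepsilon_{-im}, \omega_m) \perp (x_{im}, x_{-im}, z_m)$, it is clear that we can identify the reduced form parameters:
$$\pi_x = \beta_x, \qquad \pi_z = \frac{\beta_z}{1 - \beta_a}, \qquad \pi_{\bar{x}} = \frac{\beta_a \beta_x + \beta_{\bar{x}}}{1 - \beta_a}$$
Therefore, $\beta_x$ is identified but $\beta_z$, $\beta_a$ and $\beta_{\bar{x}}$ are not identified: three reduced form parameters cannot pin down four structural parameters.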
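The lack of identification can be made concrete by constructing two different structural parameter vectors with the same reduced form. This is a small sketch with hypothetical function names and made-up parameter values; it only re-expresses the reduced form mapping derived above.

```python
def reduced_form(beta_x, beta_z, beta_a, beta_xbar):
    """Map structural parameters into (pi_x, pi_z, pi_xbar)."""
    pi_x = beta_x
    pi_z = beta_z / (1 - beta_a)
    pi_xbar = (beta_a * beta_x + beta_xbar) / (1 - beta_a)
    return pi_x, pi_z, pi_xbar

def equivalent_structure(pi, beta_a):
    """For any chosen beta_a != 1, build structural parameters that reproduce pi."""
    pi_x, pi_z, pi_xbar = pi
    beta_z = pi_z * (1 - beta_a)
    beta_xbar = pi_xbar * (1 - beta_a) - beta_a * pi_x
    return pi_x, beta_z, beta_a, beta_xbar

true_structure = (1.0, 2.0, 0.5, 0.3)          # beta_x, beta_z, beta_a, beta_xbar
pi = reduced_form(*true_structure)
alternative = equivalent_structure(pi, beta_a=0.2)

print("reduced form:", pi)
print("alternative structure:", alternative)
print("its reduced form:", reduced_form(*alternative))
```

Any value of $\beta_a$ other than one can be rationalized this way, which is why an additional restriction is needed.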
What type of additional restrictions can provide identification of the structural parameters? We consider here two different types of restrictions:
1. Exclusion restrictions (i.e., instrumental variables).
2. Additional outcome variables and outcome equations (i.e., control function).
We will also study how incomplete information games suffer from similar identification problems: incomplete information assumptions alone do not provide identification of structural parameters.
6.1. Exclusion restrictions. Suppose that there is a variable in the vector $x_{im}$, say variable $x_{1im}$, such that:
(a) $x_{1im}$ has an effect on the best response of individual $i$, i.e., $\beta_{x1} \neq 0$.
(b) $\bar{x}_{1m}$ does not have an effect on the best response of individual $i$, i.e., $\beta_{\bar{x}1} = 0$.
Under these conditions, the reduced form parameter associated with $\bar{x}_{1m}$ is:
$$\pi_{\bar{x}1} = \frac{\beta_a \beta_{x1}}{1 - \beta_a}$$
Given that $\pi_{\bar{x}1}$ and $\pi_{x1} = \beta_{x1}$ are identified, we can solve for $\beta_a$ to obtain:
$$\beta_a = \frac{\pi_{\bar{x}1}}{\pi_{x1} + \pi_{\bar{x}1}}$$
It is simple to verify that $\pi_{x1} + \pi_{\bar{x}1} = \beta_{x1}/(1 - \beta_a) \neq 0$ if and only if $\beta_{x1} \neq 0$. Therefore, $\beta_a$ is identified. Given the identification of $\beta_a$, it is simple to verify that the rest of the structural parameters are also identified.
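The inversion from reduced form to structural parameters under the exclusion restriction can be sketched as follows. The numerical values are hypothetical, and the code treats the reduced form parameters as known, which abstracts from sampling error.

```python
def beta_a_from_reduced_form(pi_x1, pi_xbar1):
    """beta_a = pi_xbar1 / (pi_x1 + pi_xbar1), valid when beta_xbar1 = 0."""
    return pi_xbar1 / (pi_x1 + pi_xbar1)

# Hypothetical structural values satisfying the exclusion restriction.
beta_x1, beta_z, beta_a = 2.0, 1.5, 0.4

# Reduced form parameters implied by the model when beta_xbar1 = 0.
pi_x1 = beta_x1
pi_xbar1 = beta_a * beta_x1 / (1 - beta_a)
pi_z = beta_z / (1 - beta_a)

beta_a_hat = beta_a_from_reduced_form(pi_x1, pi_xbar1)
beta_z_hat = pi_z * (1 - beta_a_hat)   # remaining parameters follow from beta_a

print(beta_a_hat, beta_z_hat)
```

Once $\beta_a$ is recovered, $\beta_z$ follows from $\pi_z (1 - \beta_a)$, and the coefficients on the averages of the other (non-excluded) characteristics follow from the corresponding reduced form equation in the same way.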
Example: School performance and peer effects. Suppose that $a_{im}$ is a standardized test score that measures the academic performance of student $i$ in class $m$. The vector $x_{im}$ contains characteristics of the student such as where he lives, family background, etc. The vector $z_m$ contains characteristics of the school. We are particularly interested in estimating $\beta_a$. Consider the following assumption: parental education has an effect on a kid's academic performance (i.e., $\beta_{x1} \neq 0$), but the education of other kids' parents does not have a direct effect on a kid's performance (i.e., $\beta_{\bar{x}1} = 0$). This assumption identifies $\beta_a$.
Of course, one might think that $\beta_{\bar{x}1}$ is not really zero, e.g., more educated parents can be more involved in the school and can improve the quality of the school in a way that is not captured by the variables in $z_m$. In that case, our estimator of $\beta_a$ under the wrong identification assumption is a consistent estimator of:
$$\frac{\pi_{\bar{x}1}}{\pi_{x1} + \pi_{\bar{x}1}} = \left(\frac{\beta_a \beta_{x1} + \beta_{\bar{x}1}}{1 - \beta_a}\right) \Big/ \left(\beta_{x1} + \frac{\beta_a \beta_{x1} + \beta_{\bar{x}1}}{1 - \beta_a}\right) = \frac{\beta_a + \beta_{\bar{x}1}/\beta_{x1}}{1 + \beta_{\bar{x}1}/\beta_{x1}}$$
If $\beta_a \in (0, 1)$ and $\beta_{\bar{x}1}/\beta_{x1} > 0$, then this estimator over-estimates the true value of $\beta_a$, i.e., we interpret as positive peer effects an effect that really comes from other parents' education.
This bias may have some important policy implications. Consider a policy that improves some variable $z_m$ exogenously by an amount $\Delta$ (leaving parental education constant). The effect of this policy is:
$$\frac{\beta_z \Delta}{1 - \beta_a}$$
If we over-estimate $\beta_a$, we over-estimate the effect of this policy.
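The size of this bias is easy to compute for given parameter values. The sketch below uses hypothetical numbers and function names; it evaluates the probability limit both from the reduced form ratio and from the closed form in terms of $\beta_{\bar{x}1}/\beta_{x1}$, so the two expressions can be compared.

```python
def plim_from_reduced_form(beta_a, beta_x1, beta_xbar1):
    """pi_xbar1 / (pi_x1 + pi_xbar1) when beta_xbar1 is wrongly assumed to be zero."""
    pi_x1 = beta_x1
    pi_xbar1 = (beta_a * beta_x1 + beta_xbar1) / (1 - beta_a)
    return pi_xbar1 / (pi_x1 + pi_xbar1)

def plim_closed_form(beta_a, ratio):
    """(beta_a + r) / (1 + r) with r = beta_xbar1 / beta_x1."""
    return (beta_a + ratio) / (1 + ratio)

beta_a, beta_x1, beta_xbar1 = 0.4, 2.0, 0.5     # hypothetical values, r = 0.25
limit = plim_from_reduced_form(beta_a, beta_x1, beta_xbar1)

print("true beta_a:", beta_a)
print("probability limit of the estimator:", limit)
print("closed form:", plim_closed_form(beta_a, beta_xbar1 / beta_x1))
```

In this illustration the estimator converges to a value above the true $\beta_a$, and the gap grows with the ratio $\beta_{\bar{x}1}/\beta_{x1}$.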
Example: Price competition with pure vertical product differentiation. In this example the previous type of exclusion restrictions is not always plausible. According to these restrictions, a firm's best response price depends on the characteristics of its own product but not on the characteristics of other products in the same market/industry. Only under very specific conditions on competition is this assumption consistent with the structure of the game itself. This is because, in general, the demand and therefore the profit of firm $i$ depends on the characteristics of all the products in the market.
However, in some models with vertical product differentiation, we can have that type of exclusion restriction. Consider a model of vertical product differentiation where the demand (and profits) of firm $i$ depends on the prices and qualities of firms $i-1$ and $i+1$, but not (directly) on the prices of the rest of the firms in the market. Then, we can use the characteristics of firms $i-2$ and $i+2$ to instrument the prices of firms $i-1$ and $i+1$, respectively, in the best response function of firm $i$.
6.2. Additional outcome variables. In the example of price competition in a differentiated product market, the previous type of exclusion restrictions is not always plausible. Do we have other conditions to identify structural parameters? Note that, so far, the only endogenous variables in the model are players' actions, i.e., in this case firms' prices. Suppose that we also observe some outcome variables, e.g., firms' quantities. In this case, the structure of the model can provide conditions that identify structural parameters under the assumption that $(\varepsilon_{im}, \varepsilon_{-im}, \omega_m) \perp (x_{im}, x_{-im}, z_m)$.
For instance, in our example of the logit model of product differentiation we have that:
$$\ln(s_{im}) - \ln(s_{0m}) = x_{im}\alpha_x - \alpha_a a_{im} + z_m \alpha_z + \varepsilon_{im} + \omega_m$$
where $s_{im} = q_{im}/H_m$ and $s_{0m} = 1 - \sum_{j=1}^{N_m} s_{jm}$. Note that, for this transformation of the demand equation that exploits our knowledge of firms' quantities, we have exclusion restrictions. In this equation, the characteristics of other firms' products do not appear on the right hand side (i.e., they do not affect the average utility of buying product $i$). Therefore, we can use the characteristics of other products $\{x_{jm} : j \neq i\}$ as instruments for the price $a_{im}$.
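The transformation used here can be verified numerically: computing logit shares from a vector of mean utilities and then taking $\ln(s_{im}) - \ln(s_{0m})$ returns the original mean utilities. The sketch below uses made-up values and hypothetical function names.

```python
import math

def logit_shares(mean_utils):
    """Market shares of the inside goods; the outside good has utility zero."""
    expu = [math.exp(d) for d in mean_utils]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def invert_shares(inside_shares):
    """Recover mean utilities as ln(s_i) - ln(s_0), with s_0 = 1 - sum of inside shares."""
    s0 = 1.0 - sum(inside_shares)
    return [math.log(s) - math.log(s0) for s in inside_shares]

mean_utils = [0.3, -0.5, 1.2]     # hypothetical values of the right hand side
observed = logit_shares(mean_utils)
recovered = invert_shares(observed)

print(observed)
print(recovered)
```

Because the left hand side is computed from observed quantities alone, the equation is linear in the parameters with price as the only endogenous regressor, which is what makes an instrumental variables approach applicable.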
6.3. Games of incomplete information. Remember our characterization of the game. We have agents with utilities:
$$U_{im} = U(a_{im}, a_{-im}, x_{im}, x_{-im}, \varepsilon_{im}, \varepsilon_{-im}, z_m, \omega_m)$$
This is a game of complete information. Suppose that we consider a game of incomplete information with the following assumptions:
(1) The only unobservables for the econometrician are the private information shocks $\{\varepsilon_{im}\}$.
(2) $\varepsilon_{-im}$ does not enter in the utility of player $i$.
(3) Private information is independently distributed over players.
Then,
$$U_{im} = U(a_{im}, a_{-im}, x_{im}, x_{-im}, z_m, \varepsilon_{im})$$
A player maximizes expected utility conditional on the common knowledge variables, $x_m$ and $z_m$, and on his own private information.
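For the linear case, we have that the best response function of player $i$ is:
$$a_{im} = \beta_x x_{im} + \beta_z z_m + \beta_a E(\bar{a}_m \mid z_m, x_m) + \beta_{\bar{x}} \bar{x}_m + \varepsilon_{im}$$
Solving for $E(\bar{a}_m \mid z_m, x_m)$ we get:
$$E(\bar{a}_m \mid z_m, x_m) = \left(\frac{\beta_x + \beta_{\bar{x}}}{1 - \beta_a}\right) \bar{x}_m + \left(\frac{\beta_z}{1 - \beta_a}\right) z_m$$
And substituting this expression into the structural equation:
$$a_{im} = \beta_x x_{im} + \left(\frac{\beta_z}{1 - \beta_a}\right) z_m + \left(\frac{\beta_a \beta_x + \beta_{\bar{x}}}{1 - \beta_a}\right) \bar{x}_m + \varepsilon_{im}$$
It is clear that we have the same identification problem as in the game of complete information.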
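Under the assumption that $(\varepsilon_{im}, \varepsilon_{-im}, \omega_m) \perp (x_{im}, x_{-im}, z_m)$, it is clear that we can identify the reduced form parameters:
$$\pi_x = \beta_x, \qquad \pi_z = \frac{\beta_z}{1 - \beta_a}, \qquad \pi_{\bar{x}} = \frac{\beta_a \beta_x + \beta_{\bar{x}}}{1 - \beta_a}$$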
7. Static models of market entry
Though most issues and techniques that we will consider here can be applied to any empirical discrete game, we will concentrate on models of market entry and exit. These are the empirical applications of discrete games that have received the most attention. These empirical studies analyze the determinants of market structure. The number of firms in a market, the degree of concentration, or the amount of turnover are related to consumer welfare and firms' price-cost margins. Therefore, it is important to understand how different exogenous factors contribute to the configuration of a particular market structure. Factors which are potentially important include: (a) sunk entry costs; (b) economies of scale and market size; (c) barriers to entry; (d) heterogeneity in firms' production costs; (e) product differentiation; (f) first-mover advantages; (g) predatory conduct of incumbent firms; (h) collusive behavior of incumbent firms; etc.
In the presentation of these models I have followed a classification based on three criteria. These three criteria play an important role in the specification and estimation of the model. The first criterion is whether the entry game is of complete or incomplete information, i.e., whether firms have some private information about some component of their profits. The second criterion is based on a particular aspect of the data which is related to competition: global potential entrants vs. local potential entrants. And the third criterion is whether firms are heterogeneous in some exogenous characteristics such as costs, productivity or some product attributes.
7.1. Data. Most empirical studies have considered cross-sectional data of independent ("isolated") markets. Therefore, these studies try to explain the determinants of cross-sectional differences in market structure. It is important to distinguish three types of data sets. The specification and the identification of the model are different for each of these three types of data.
(1) Only global potential entrants. The same $N$ firms are the potential entrants in every market. We know the identity of these "global" potential entrants. Therefore, we observe the decision of each of these firms in every independent market. We observe market characteristics, and sometimes firm characteristics which may or may not vary across markets. The data set is:
$$\{s_m, x_{im}, a_{im} : m = 1, 2, \ldots, M; \ i = 1, 2, \ldots, N\} \qquad (7.1)$$
where $m$ is the market index; $i$ is the firm index; $s_m$ is a vector of characteristics of market $m$ such as market size, average consumer income, or other demographic variables; $x_{im}$ is a vector of characteristics of firm $i$; and $a_{im}$ is the indicator of the event "firm $i$ is active in market $m$".
Let us use $x_m$ to represent the vector of (exogenous or predetermined) market and firm characteristics: $x_m \equiv \{s_m, x_{im} : i = 1, 2, \ldots, N\}$. And let $a_m$ be $\{a_{im} : i = 1, 2, \ldots, N\}$. With this type of data we can nonparametrically identify $\Pr(a_m \mid x_m)$, and therefore we will be able to identify models with very general forms of firm heterogeneity.
Example 1: Berry (Econometrica, 1992) considers entry in airline markets. A market is a city pair (e.g., Boston-Chicago). The set of markets consists of all the pairs of US cities with airports. Every airline company operating in the US is a potential entrant in each of these markets. $a_{im}$ is the indicator of the event "airline $i$ operates in city pair $m$".
Example 2: Toivanen and Waterson (2000) consider entry in local markets by fast-food restaurants in the UK. Potential entrants are Burger King, McDonald's, KFC, Wendy's, etc.
(2) Only local potential entrants. We do not know the identity of the potential entrants.
In fact, most potential entrants may be local, i.e., they consider entry in only one local
market. For this type of data we only observe market characteristics and the number of
active firms in the market. The data set is:
$$\{s_m, n_m : m = 1, 2, \ldots, M\} \tag{7.2}$$
where $n_m$ is the number of firms operating in market m.
With this type of data we can nonparametrically identify $\Pr(n_m | s_m)$. We can identify
firm heterogeneity only to a limited extent and based on functional form identification.
Notice also that we do not know the number of potential entrants N, and this may vary over
markets.
Example 1: Bresnahan and Reiss (REStud, 1990). Car dealers in small towns.
Example 2: Bresnahan and Reiss (JPE, 1991). Restaurants, dentists, and other retail
and professional services in small towns.
Example 3: Seim (2003). Video rental stores.
(3) Both global and local potential entrants. This case combines and encompasses
the previous two cases. There are $N_G$ firms which are potential entrants in all the markets,
and we know the identity of these firms. But there are also other potential entrants which
are just local. We observe:
$$\{s_m, n_m, x_{im}, a_{im} : m = 1, 2, \ldots, M;\ i = 1, 2, \ldots, N_G\} \tag{7.3}$$
With this data we can nonparametrically identify $\Pr(n_m, a_m | x_m)$. We can allow for
heterogeneity between global players in a very general way. Heterogeneity between local players
should be much more restrictive.
7.2. Entry games with complete information.
7.2.1. Model without firm heterogeneity (Bresnahan and Reiss, JPE 1991).
7.2.2. Model and econometric issues. Consider a market with N potential entrants. These
firms play a two-period sequential game. In the first period they decide whether or not to
enter the market. In the second period, the firms that have entered compete in
quantities or in prices. We solve this game backwards and obtain a subgame perfect Nash
equilibrium. Players have complete information.
If a firm does not enter the market, its profits are zero. If a firm enters the market,
its profits are determined in the second stage of the game. Let $\pi_i(\cdot)$ be the profit of firm i
if it operates in the market.
Example 1: Homogeneous product, Cournot competition, and homogeneous marginal costs.
Suppose that the inverse market demand is linear, $p = A - BQ$, where p is the market
price, Q is the aggregate quantity, and A and B are parameters. Firm i's cost function is
$C_i(q_i) = c\,q_i + F_i$, where c is the marginal production cost and $F_i$ is a fixed cost. Suppose
that there are n firms operating in the market. You can show that a firm's equilibrium profit
under Cournot competition is:
$$\pi_i(n) = \frac{1}{B}\left(\frac{A - c}{n+1}\right)^2 - F_i \tag{7.4}$$
Notice that, in this model, if potential entrants have homogeneous marginal costs, profits
depend on the number of competitors but not on the identity of the competitors. This is not
the case if firms have heterogeneous marginal costs. Suppose that $C_i(q_i) = c_i q_i + F_i$. Then
the equilibrium profit is:
$$\pi_i(n) = \frac{1}{B}\left(\frac{A - c_i + \sum_{j=1}^{N} a_j (c_j - c_i)}{n+1}\right)^2 - F_i \tag{7.5}$$
where, for j = 1, 2, ..., N, the variable $a_j$ is the indicator of firm j's entry decision, so that
$n = \sum_{j=1}^{N} a_j$. It is clear that profits depend on the identity of the opponents, i.e., the
more efficient the opponents, the lower the profit.
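As a quick numerical check of expressions (7.4) and (7.5), the following sketch computes Cournot profits in two ways: from the closed form, and directly from the equilibrium price implied by the first-order conditions. The function names and parameter values are illustrative assumptions, not part of the original notes, and an interior equilibrium (every active firm produces a positive quantity) is assumed.

```python
def cournot_profits(A, B, costs, fixed):
    """Closed-form profits (7.5) for the active firms.

    Inverse demand is p = A - B*Q; firm i has cost c_i*q_i + F_i.
    Only active firms are passed in, so a_j = 1 for every listed firm.
    """
    n = len(costs)
    total_cost = sum(costs)
    profits = []
    for c_i, F_i in zip(costs, fixed):
        # sum_j a_j (c_j - c_i) over active firms equals total_cost - n*c_i
        margin = (A - c_i + (total_cost - n * c_i)) / (n + 1)
        profits.append(margin ** 2 / B - F_i)
    return profits


def cournot_profits_direct(A, B, costs, fixed):
    """Same profits from the equilibrium price, as a cross-check."""
    n = len(costs)
    # Summing the first-order conditions A - B*Q - B*q_i - c_i = 0 over i gives Q.
    Q = (n * A - sum(costs)) / (B * (n + 1))
    p = A - B * Q
    # Each firm's FOC implies q_i = (p - c_i)/B, so variable profit is (p - c_i)^2 / B.
    return [(p - c) ** 2 / B - F for c, F in zip(costs, fixed)]
```

With identical marginal costs the first function reduces to (7.4); with different costs, a firm's profit changes when one rival is replaced by a more efficient one even though n is unchanged.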
In general, the profit of firm i if it operates in market m depends on market and firm
characteristics affecting demand and costs, and on the entry decisions of the other potential
entrants: $\pi_{im} = \pi_i(x_m, a_{-im})$, where $a_{-im} \equiv \{a_{jm} : j \neq i\}$. A Nash equilibrium in the first
period is an N-tuple $a^*_m = (a^*_{1m}, a^*_{2m}, \ldots, a^*_{Nm})$ such that for any player i:
$$a^*_{im} = I\{\pi_i(x_m, a^*_{-im}) \geq 0\} \tag{7.6}$$
Consider the following specification of the profit function:
$$\pi_{im} = x_{im}\beta_i - a_{-im}\delta_i + \varepsilon_{im} \tag{7.7}$$
where $\beta_i$ is a K × 1 vector of parameters; $\delta_i = \{\delta_{ij} : j \neq i\}$ is an (N − 1) × 1 vector of
parameters, and $\delta_{ij}$ is the reduction in the profit of firm i associated with the entry of firm j;
and $\varepsilon_{im}$ is a mean-zero random variable that is observable to the players but unobservable to
the econometrician. The parameters $\delta_i$ are particularly important because they capture the
strategic interactions.
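ASSUMPTION: Define $\varepsilon_m \equiv (\varepsilon_{1m}, \varepsilon_{2m}, \ldots, \varepsilon_{Nm})$ and $x_m \equiv (x_{1m}, x_{2m}, \ldots, x_{Nm})$. We assume
that $\varepsilon_m$ is independent of $x_m$, and that it is iid $N(0, \Omega)$. We normalize the variances of the $\varepsilon$'s to
one, but the covariances are free parameters.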
An equilibrium in this model is a system of N simultaneous equations where the endogenous
variables are dummy or binary variables:
$$a_{im} = I\{x_{im}\beta_i - a_{-im}\delta_i + \varepsilon_{im} \geq 0\} \tag{7.8}$$
Suppose that we have a random sample of M markets where we observe $\{x_{im}, a_{im}\}$. We
want to use this sample to estimate the vector of parameters $\theta = \{\beta_i, \delta_i : i = 1, 2, \ldots, N\}$.
There are two main econometric problems in the estimation of this model: (1) endogeneity
or simultaneity; and (2) multiple equilibria.
(1) Endogeneity (simultaneity). A firm's entry decision depends on all the common
knowledge variables, including the other firms' $\varepsilon$'s. Therefore, $\mathrm{cov}(a_{-im}, \varepsilon_{im}) \neq 0$. An ML
estimator of the probit model (7.8) that ignores this simultaneity will provide inconsistent
estimates. To illustrate this issue in a simple framework, consider two players and linear
best response functions:
$$\begin{aligned} a_{1m} &= x_{1m}\beta_1 - \delta_1 a_{2m} + \varepsilon_{1m} \\ a_{2m} &= x_{2m}\beta_2 - \delta_2 a_{1m} + \varepsilon_{2m} \end{aligned} \tag{7.9}$$
The sample $\{a_{1m}, a_{2m}\}$ contains equilibrium points for different realizations of $(\varepsilon_1, \varepsilon_2)$. If we
could fix $\varepsilon_1$ and vary $\varepsilon_2$, then we would obtain sample points along firm 1's best response
function and we could identify $\beta_1$ and $\delta_1$ with an OLS regression of $a_{1m}$ on $a_{2m}$. If we could
fix $\varepsilon_2$ and vary $\varepsilon_1$, then we could identify $\beta_2$ and $\delta_2$ with an OLS regression of $a_{2m}$ on $a_{1m}$.
But the sample points are obtained with variations in both $\varepsilon_1$ and $\varepsilon_2$.
The reduced form equation for $a_{2m}$ is:
$a_{2m} = (1 - \delta_1\delta_2)^{-1}(x_{2m}\beta_2 - \delta_2 x_{1m}\beta_1 + \varepsilon_{2m} - \delta_2\varepsilon_{1m})$.
Taking into account this reduced form equation, we can show that, if we ignore the
endogeneity problem and estimate the first equation by OLS, the asymptotic bias of the OLS
estimator of $\delta_1$ is:
$$\operatorname{plim}\,\mathrm{Bias}(\hat{\delta}_1) = -\frac{\mathrm{cov}(a_2, \varepsilon_1)}{\mathrm{var}(a_2)} = \frac{(1 - \delta_1\delta_2)\,(\delta_2 - \mathrm{corr}(\varepsilon_1, \varepsilon_2))}{\mathrm{var}(\varepsilon_2 - \delta_2\varepsilon_1)} \tag{7.10}$$
If $\delta_1$ and $\delta_2$ lie in the unit interval (0, 1), and $\mathrm{corr}(\varepsilon_1, \varepsilon_2) = 0$, then $\hat{\delta}_1$ is upward biased. In
general, if there are negative strategic interactions (i.e., $\delta$'s > 0) and players' unobservables
are uncorrelated, then ignoring the simultaneity of players' decisions will induce an upward
bias in our estimates of the strategic interaction parameters $\delta$.
However, it is reasonable to expect a positive correlation between the unobservables $\varepsilon_1$
and $\varepsilon_2$. For instance, there may be market-specific unobservables (some markets are more
profitable than others for every firm):
$$\varepsilon_{im} = \omega_m + \varepsilon^*_{im} \tag{7.11}$$
where $\{\varepsilon^*_{im}\}$ are purely idiosyncratic. With this error structure,
$\mathrm{corr}(\varepsilon_1, \varepsilon_2) = \sigma^2_\omega/(\sigma^2_\omega + \sigma^2_{\varepsilon^*})$,
where $\sigma^2_\omega$ is the variance of the common market shock and $\sigma^2_{\varepsilon^*}$ is the variance of the
idiosyncratic shock. Therefore, if the market variability is large enough such that $\mathrm{corr}(\varepsilon_1, \varepsilon_2) > \delta_2$,
the OLS estimator $\hat{\delta}_1$ will be downward biased, i.e., we underestimate the magnitude
of strategic interactions.
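The sign switch in the bias can be checked by simulation. The sketch below draws data from the linear system (7.9) without regressors (an assumption made to keep the example short), estimates $\delta_1$ by a no-intercept OLS regression of $a_1$ on $a_2$, and compares the result with formula (7.10) under unit variances. The function names and sample size are illustrative.

```python
import random


def ols_bias_formula(d1, d2, rho):
    """Asymptotic bias of the OLS estimator of delta_1, as in (7.10), unit variances."""
    return (1 - d1 * d2) * (d2 - rho) / (1 + d2 ** 2 - 2 * d2 * rho)


def simulate_ols_delta1(d1, d2, rho, n_markets=100_000, seed=0):
    """OLS estimate of delta_1 from simulated equilibrium points of (7.9)."""
    rng = random.Random(seed)
    s_a1a2 = 0.0
    s_a2a2 = 0.0
    for _ in range(n_markets):
        e1 = rng.gauss(0, 1)
        # e2 has unit variance and correlation rho with e1
        e2 = rho * e1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        # Reduced forms of the two linear best responses (no regressors)
        a2 = (e2 - d2 * e1) / (1 - d1 * d2)
        a1 = (e1 - d1 * e2) / (1 - d1 * d2)
        s_a1a2 += a1 * a2
        s_a2a2 += a2 * a2
    # The regression coefficient on a2 estimates -delta_1
    return -s_a1a2 / s_a2a2
```

With $\delta_1 = \delta_2 = 0.5$, uncorrelated shocks give an estimate above 0.5, while a correlation larger than $\delta_2$ pushes the estimate below 0.5, in line with the discussion above.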
The method of instrumental variables is the most common approach to deal with
endogeneity in linear models. However, the model that we want to estimate is not the linear
model (7.9), but the nonlinear model (7.8). For some nonlinear models with endogenous variables
it is possible to construct moment conditions and use these conditions to estimate the
parameters by GMM. That is not possible for this discrete choice model. A third approach to
deal with endogeneity is to derive the probability distribution of the endogenous variables
conditional on the observable exogenous variables (i.e., the reduced form of the model) and
use these probabilities to estimate the model by maximum likelihood. In our case, we would
estimate $\theta$ by maximizing the log-likelihood function:
$$l(\theta) = \sum_{m=1}^{M} \ln \Pr(a_{1m}, a_{2m}, \ldots, a_{Nm} \,|\, x_{1m}, x_{2m}, \ldots, x_{Nm}; \theta) \tag{7.12}$$
This is the approach that we are going to use here. However, we will have to deal with the
problem of multiple equilibria.
(2) Multiple equilibria. The linear model in equation (7.9) has a unique equilibrium. In
fact, the reduced form equations represent the equilibrium values of $a_{1m}$ and $a_{2m}$ for different
values of the exogenous variables. However, when best response functions are nonlinear in
other players' actions, the model can have multiple equilibria. That is the case in our binary
choice game.
Consider the model with two players and assume that $\delta_1 \geq 0$ and $\delta_2 \geq 0$:
$$\begin{aligned} a_{1m} &= I\{x_{1m}\beta_1 - \delta_1 a_{2m} + \varepsilon_{1m} \geq 0\} \\ a_{2m} &= I\{x_{2m}\beta_2 - \delta_2 a_{1m} + \varepsilon_{2m} \geq 0\} \end{aligned}$$
If we try to obtain the reduced form of this model, we have that:
(a) $(a_{1m}, a_{2m}) = (0, 0)$ iff $\{x_{1m}\beta_1 + \varepsilon_{1m} < 0\}$ and $\{x_{2m}\beta_2 + \varepsilon_{2m} < 0\}$.
(b) $(a_{1m}, a_{2m}) = (1, 1)$ iff $\{x_{1m}\beta_1 + \varepsilon_{1m} \geq \delta_1\}$ and $\{x_{2m}\beta_2 + \varepsilon_{2m} \geq \delta_2\}$.
(c) If $\{x_{1m}\beta_1 + \varepsilon_{1m} < \delta_1\}$ and $\{x_{2m}\beta_2 + \varepsilon_{2m} \geq 0\}$, then $(a_{1m}, a_{2m}) = (0, 1)$ is an equilibrium.
(d) If $\{x_{1m}\beta_1 + \varepsilon_{1m} \geq 0\}$ and $\{x_{2m}\beta_2 + \varepsilon_{2m} < \delta_2\}$, then $(a_{1m}, a_{2m}) = (1, 0)$ is an equilibrium.
Notice that when:
$$\{0 \leq x_{1m}\beta_1 + \varepsilon_{1m} < \delta_1\} \ \text{ and } \ \{0 \leq x_{2m}\beta_2 + \varepsilon_{2m} < \delta_2\}$$
both $(a_{1m}, a_{2m}) = (0, 1)$ and $(a_{1m}, a_{2m}) = (1, 0)$ are possible outcomes. For this range of
values of the exogenous variables, the reduced form (i.e., the equilibrium) is not uniquely
determined. Therefore, we cannot uniquely determine the probability $\Pr(a_{1m}, a_{2m} | x_{1m}, x_{2m}; \theta)$
that we need to estimate the model by ML. For models with more than two players, the
range of values of the exogenous variables for which there are multiple equilibria becomes
wider.
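The four cases can be verified mechanically by checking every action profile against both best responses. In the sketch below, `v1` and `v2` are shorthand (an assumption of this example) for $x_{im}\beta_i + \varepsilon_{im}$, and the function name is illustrative.

```python
from itertools import product


def pure_nash_equilibria(v1, v2, d1, d2):
    """All pure-strategy Nash equilibria of the two-player entry game.

    v_i stands for x_im*beta_i + eps_im; firm i enters iff v_i - d_i*a_j >= 0.
    """
    equilibria = []
    for a1, a2 in product((0, 1), repeat=2):
        br1 = int(v1 - d1 * a2 >= 0)  # firm 1's best response to a2
        br2 = int(v2 - d2 * a1 >= 0)  # firm 2's best response to a1
        if (a1, a2) == (br1, br2):
            equilibria.append((a1, a2))
    return equilibria
```

For values of `v1` and `v2` inside the rectangle $[0, \delta_1) \times [0, \delta_2)$ the function returns both monopoly outcomes, and outside it a single profile.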
We describe below different solutions that have been proposed to deal with this problem.
But before we discuss these solutions, it is important to make a clarification: equilibrium
uniqueness is neither a necessary nor a sufficient condition for the identification of a model.
Or in other words, the problem of multiple equilibria is not a problem of identification.
(3) Multiple equilibria and identification. Let $P_0(\text{Data}) \in \mathcal{P}$ be the true distribution
of the data in the population under study, where $\mathcal{P}$ is the space of probability distributions.
Let $P(\text{Data}|\theta)$ be the distribution of the data predicted by the model when the vector of
parameters is $\theta \in \Theta$. The model is identified if there is a unique $\theta_0 \in \Theta$ such that
$P_0(\text{Data}) = P(\text{Data}|\theta_0)$. Or, in other words, we have an identification problem if there are multiple
values of $\theta$ for which $P_0(\text{Data}) = P(\text{Data}|\theta)$. We have a model with multiple equilibria if
for some values of $\theta \in \Theta$ the model generates or predicts more than one probability
distribution for the data, i.e., $P_A(\text{Data}|\theta)$, $P_B(\text{Data}|\theta)$, $P_C(\text{Data}|\theta)$, ...
It should be clear that multiple equilibria and non-identification are very different
problems. Define the mapping F from the space of parameters $\Theta$ to the space of probability
distributions $\mathcal{P}$ such that $F(\theta)$ contains all the probability distributions predicted by the
model when the vector of parameters is $\theta$. Multiple equilibria means that $F(\theta)$ is not a
function but a correspondence. And no identification means that the inverse mapping $F^{-1}(\cdot)$,
evaluated at the population distribution $P_0(\text{Data})$, is a correspondence. Of course, F can be
a correspondence and $F^{-1}$ a function, or vice versa. For a formal discussion of this problem
in a general framework see Jovanovic (Econometrica, 1989).
Example: $y \in \{0, 1\}$. The distribution in the population is $P_0 \equiv \Pr(y = 1)$. The model is
$\Pr(y = 1) = f(\theta)$, where $f(\theta)$ is implicitly defined as a fixed point of the mapping:
$$\Psi(f) = \frac{\exp\{\theta + 12 f\}}{1 + \exp\{\theta + 12 f\}} \tag{7.13}$$
It is simple to verify that for some values of $\theta$ there is more than one value of f that solves
this fixed point problem, i.e., the model has multiple equilibria. For instance, for $\theta = -4$ we have that
$f_A = 0.023$, $f_B = 0.235$, and $f_C = 0.999$ are equilibria. However, for any population
probability $P_0 \in (0, 1)$, the model is identified. The inverse of $\Psi$ evaluated at $P_0$ implies
that:
$$\theta = \ln\left(\frac{P_0}{1 - P_0}\right) - 12\, P_0 \tag{7.14}$$
There is a unique value of $\theta$ that rationalizes $P_0$ as a fixed point of $\Psi$.
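The example can be reproduced numerically. The sketch below locates the fixed points of the logistic mapping (7.13) by scanning a grid for sign changes of $\Psi(f) - f$ and refining each by bisection, and it implements the inverse (7.14). The function names, the grid size, and the tolerance are illustrative assumptions.

```python
import math


def psi(f, theta):
    """Logistic mapping in (7.13)."""
    return 1.0 / (1.0 + math.exp(-(theta + 12.0 * f)))


def fixed_points(theta, grid=2000, tol=1e-12):
    """All solutions of psi(f) = f on [0, 1] found by sign changes plus bisection."""
    g = lambda f: psi(f, theta) - f
    roots = []
    for k in range(grid):
        lo, hi = k / grid, (k + 1) / grid
        if g(lo) == 0.0:
            roots.append(lo)
        elif g(lo) * g(hi) < 0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


def theta_from_p0(p0):
    """Inverse mapping (7.14): the unique theta that rationalizes p0."""
    return math.log(p0 / (1.0 - p0)) - 12.0 * p0
```

Calling `fixed_points(-4.0)` returns three equilibria, and applying `theta_from_p0` to any of them recovers $\theta = -4$: the model has multiple equilibria but the parameter is identified.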
The econometric approaches to deal with multiple equilibria in discrete games can be
classified into two groups. The first group includes those methods that impose some
additional structure on the model to guarantee equilibrium uniqueness or at least uniqueness of
some observable outcome (e.g., number of entrants). In this group we have Bresnahan and
Reiss (JE, 1990, and REStud 1991) and Berry (Econometrica, 1992). The second group
of studies does not impose additional structure and uses a two-stage pseudo maximum
likelihood procedure to estimate structural parameters. We have in this group Tamer (REStud,
2003) for discrete games of complete information and Aguirregabiria (Econ Letters, 2004)
for games of incomplete information.
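7.2.3. Recursive strategic interactions. Consider the simultaneous equations dummy
variable model:
$$\begin{aligned} a_{1m} &= I\{x_{1m}\beta_1 - \delta_{12} a_{2m} - \cdots - \delta_{1N} a_{Nm} + \varepsilon_{1m} \geq 0\} \\ a_{2m} &= I\{x_{2m}\beta_2 - \delta_{21} a_{1m} - \cdots - \delta_{2N} a_{Nm} + \varepsilon_{2m} \geq 0\} \\ &\ \ \vdots \\ a_{Nm} &= I\{x_{Nm}\beta_N - \delta_{N1} a_{1m} - \cdots - \delta_{N,N-1} a_{N-1,m} + \varepsilon_{Nm} \geq 0\} \end{aligned} \tag{7.15}$$
Heckman (Econometrica, 1978) showed that, if the distribution of $\varepsilon_m = (\varepsilon_{1m}, \varepsilon_{2m}, \ldots, \varepsilon_{Nm})$
is continuous with support $\mathbb{R}^N$, a necessary and sufficient condition for this model to
have a unique reduced form (i.e., equilibrium uniqueness) for any value of the exogenous
variables is that the model is recursive. That is:
$$\begin{aligned} \delta_{1i} &= 0 \ \text{ for any } i \\ \delta_{2i} &= 0 \ \text{ for any } i \neq 1 \\ \delta_{3i} &= 0 \ \text{ for any } i \neq 1, 2 \\ &\ \ \vdots \end{aligned} \tag{7.16}$$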
Though this approach can be useful in some applications, it imposes very strong
restrictions in the case of games of market entry. The strategic interaction parameters $\delta$ are our
main parameters of interest here, and we want to learn about them from the data. Furthermore,
we do not need a unique reduced form (unique equilibrium) to identify these parameters.
7.2.4. Bresnahan and Reiss (JE, 1990). In the model with two potential entrants, N =
2, it is clear that we have well-defined probabilities for the distribution of the number of
entrants:
$$\begin{aligned} \Pr(n_m = 2 | x_m, \theta) &= \Pr(1, 1 | x_m, \theta) \\ \Pr(n_m = 0 | x_m, \theta) &= \Pr(0, 0 | x_m, \theta) \\ \Pr(n_m = 1 | x_m, \theta) &= 1 - \Pr(1, 1 | x_m, \theta) - \Pr(0, 0 | x_m, \theta) \end{aligned} \tag{7.17}$$
Therefore, we can define a log likelihood function for the probability distribution of the
number of firms and estimate $\theta$ by maximizing this likelihood. That is, we concentrate only
on the predictions of the model for the number of active firms, not for the identity of the
firms.
Comment 1: If we have the second type of data where we do not know the identity of the
players (i.e., only "local players") this approach does not impose any important restriction.
With this type of data we can only identify $\Pr(n_m | x_m)$.
Comment 2: However, if we know the identity of some �rms (i.e., global players) and we
want to allow for between-players heterogeneity in some parameters, this approach is not
very useful. In particular, we cannot identify heterogeneous parameters using only data on
the distribution of the number of �rms.
7.2.5. Berry (1992). Berry extends in several directions the approach in Bresnahan and
Reiss. First, he considers a model with an arbitrary number of potential entrants N and
provides conditions under which the equilibrium number of entrants is uniquely determined.
Second, he shows that if we incorporate in this model an assumption on the order of entry,
we can allow for between-firm heterogeneity and we can estimate the model by ML. And
third, he proposes an algorithm to calculate the probabilities of each possible outcome and
to estimate the model by simulated ML.
Suppose that the profit function has the following form:
$$\pi_{im} = \upsilon_{im} - \delta_i\, h\Big(1 + \sum_{j \neq i} a_{jm}\Big) \tag{7.18}$$
where $\upsilon_{im}$ is an index that represents the profitability of firm i in market m; $\delta_i$ are positive
parameters; and $h(\cdot)$ is a continuous and monotonically increasing function. Notice that a
firm's profit depends on other firms' decisions only through the number of other active firms:
i.e., the identity of the competitors is not relevant for a firm's profit, only the number of
competitors is important. This specification rules out heterogeneity in marginal production
costs, and other forms of firm heterogeneity.
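A Nash equilibrium in this model is a vector $\{a^*_{im} : i = 1, 2, \ldots, N\}$ such that:
$$\{a^*_{im} = 1\} \Leftrightarrow \upsilon_{im} - \delta\, h(n^*_m) \geq 0 \quad \text{and} \quad \{a^*_{im} = 0\} \Leftrightarrow \upsilon_{im} - \delta\, h(1 + n^*_m) < 0 \tag{7.19}$$
where $n^*_m = \sum_{i=1}^{N} a^*_{im}$, and the competition parameter is taken to be common across firms,
$\delta_i = \delta$. It is possible to prove (e.g., by contradiction) that given a vector
of profitability indexes $\{\upsilon_{im}\}$ there is a unique value $n^*_m$ for the number of active firms in
equilibrium. There may be multiple equilibria in terms of the identity of the entrants, but
the number of entrants is unique.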
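The uniqueness of the number of entrants can be illustrated with a small brute-force check. In the sketch below, `equilibrium_n` uses the characterization that the equilibrium number of firms is the largest n for which at least n firms satisfy $\upsilon_i \geq \delta h(n)$, and `all_equilibria` enumerates every action profile against conditions (7.19). The function names, the choice h(n) = n in the usage note, and the profitability values are illustrative assumptions.

```python
from itertools import product


def equilibrium_n(v, delta, h):
    """Largest n such that at least n firms have v_i >= delta*h(n)."""
    n_star = 0
    for n in range(1, len(v) + 1):
        if sum(v_i >= delta * h(n) for v_i in v) >= n:
            n_star = n
    return n_star


def all_equilibria(v, delta, h):
    """Every action profile that satisfies the equilibrium conditions (7.19)."""
    equilibria = []
    for a in product((0, 1), repeat=len(v)):
        n = sum(a)
        ok = all(
            (v_i - delta * h(n) >= 0) if a_i else (v_i - delta * h(n + 1) < 0)
            for v_i, a_i in zip(v, a)
        )
        if ok:
            equilibria.append(a)
    return equilibria
```

With `v = [1.5, 1.4, 0.2]`, `delta = 1`, and `h = lambda n: n`, either of the first two firms can be the single entrant, so the identity of the entrant is not determined, but every equilibrium has exactly one active firm.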
Suppose that $\upsilon_{im} = x_m\beta + \varepsilon_{im}$, and that our data set only includes information on the
number of firms and the exogenous market characteristics: $\{n_m, x_m\}$. The model implies a
well-defined probability $\Pr(n_m | x_m, \theta)$, with $\theta = (\beta, \delta)$. However, to calculate the distribution
of the equilibrium number of firms we should be careful in defining the region of integration
associated with each value of $n_m$. In particular, we should avoid integrating more than once
over some areas in the space of the unobservables $\varepsilon$. Here we describe the method proposed by
Berry to calculate $\Pr(n_m | x_m, \theta)$.
First, it is convenient to define the following sets. For $a \in \{0, 1\}^N$ and $n \in \{0, 1, \ldots, N\}$,
define the region:
$$R(a | n, x_m, \theta) = \left\{\varepsilon \in \mathbb{R}^N : a_i = I\big(x_m\beta - \delta\, h(1 + n) + \varepsilon_{im} \geq 0\big) \text{ for every } i\right\} \tag{7.20}$$
This is the set of $\varepsilon$'s in $\mathbb{R}^N$ that generates the vector a as a best response to the hypothesis that
there are n other firms operating in the market. Notice that $R(a | n, x_m, \theta)$ is a "rectangle" in
the N-dimensional Euclidean space and that for any $a' \neq a$, $R(a' | n, x_m, \theta)$ and $R(a | n, x_m, \theta)$
are disjoint sets. Associated with these regions, define the probabilities:
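$$H(a | n, x_m, \theta) = \int_{R(a | n, x_m, \theta)} p(\varepsilon)\, d\varepsilon \tag{7.21}$$
These are well-defined probabilities, there is no double integration, and it is simple to verify
that:
$$\sum_{a \in \{0,1\}^N} H(a | n, x_m, \theta) = 1$$
Now, the issue is how to use these "best response probabilities" to obtain the equilibrium
probabilities $\Pr(n^*_m = n | x_m, \theta)$. Because $R(a | n, \cdot)$ is defined for n other active firms, the
equilibrium number of entrants is at least n + 1 exactly when at least n + 1 firms find entry
profitable under that hypothesis. It is simple to show that:
$$\Pr(n^*_m \geq n + 1 \,|\, x_m, \theta) = \sum_{a \in \{0,1\}^N} I\Big\{\sum_{i=1}^{N} a_i \geq n + 1\Big\}\, H(a | n, x_m, \theta)$$
and $\Pr(n^*_m = n | x_m, \theta)$ follows by differencing these probabilities across consecutive values of n.
The computation of these probabilities involves multiple integration. We can use an unbiased
simulator of the $H(\cdot)$ probabilities and therefore of the probabilities of the number of entrants.
In fact, given that $R(a | n, x_m, \theta)$ is a rectangle, it is possible to use the GHK importance
sampling simulator.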
7.2.6. Tamer (2003). Tamer (REStud, 2003) shows that we do not need equilibrium
uniqueness, or uniqueness of the equilibrium number of firms, to identify the parameters of
the model. He also proposes a pseudo maximum likelihood method to estimate the model
with two potential entrants.
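Consider the two-player game:
$$\begin{aligned} a_{1m} &= I\{x_{1m}\beta_1 - \delta_1 a_{2m} + \varepsilon_{1m} \geq 0\} \\ a_{2m} &= I\{x_{2m}\beta_2 - \delta_2 a_{1m} + \varepsilon_{2m} \geq 0\} \end{aligned} \tag{7.22}$$
Assumption 1: $\{a_{1m}, a_{2m}, x_{1m}, x_{2m} : m = 1, 2, \ldots, M\}$ is a random sample from the population
distribution $P_0(a_1, a_2 | x_1, x_2)$.
Assumption 2: $P_0(a_1, a_2 | x_1, x_2) > 0$ for any $(a_1, a_2) \in \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ and any
$(x_1, x_2) \in X$.
Assumption 3: $\varepsilon_m = (\varepsilon_{1m}, \varepsilon_{2m})$ is iid $N\left(0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right)$.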
(1) Identification. Theorem 1 in Tamer (2003). Suppose that: (1) Assumptions 1 to 3
hold; (2) $\delta_1 \geq 0$ and $\delta_2 \geq 0$; (3) (exclusion restriction) there is a variable $x_{1k}$ such that
$x_{1k} \in x_1$ and $x_{1k} \notin x_2$ (or, equivalently, $x_{2k} \in x_2$ and $x_{2k} \notin x_1$) and $x_{1k}$ has positive density
everywhere over the real line; and (4) $\mathrm{var}(x_1)$ and $\mathrm{var}(x_2)$ have full column rank. Under
conditions (1) to (4), the vector of parameters $\theta = (\beta_1, \beta_2, \delta_1, \delta_2, \rho)$ is identified. Proof in
Tamer (2003, page 154).
(2) Pseudo maximum likelihood estimation. Suppose that we had a complete model
that provides unique probabilities for all the outcomes of the game. The log-likelihood
function of that model would be:
$$\begin{aligned} l(\theta) = \sum_{m=1}^{M} \Big[\, & a_{1m} a_{2m} \ln P(1, 1 | x_m, \theta) + (1 - a_{1m})(1 - a_{2m}) \ln P(0, 0 | x_m, \theta) \\ & + (1 - a_{1m})\, a_{2m} \ln P(0, 1 | x_m, \theta) + a_{1m} (1 - a_{2m}) \ln P(1, 0 | x_m, \theta) \Big] \end{aligned} \tag{7.23}$$
Our model is incomplete. It provides unique probabilities for the outcomes (0, 0) and (1, 1)
but not for the outcomes (0, 1) and (1, 0). However, we can identify nonparametrically
from the data all the population probabilities: $P_0(1, 1 | x_m)$, $P_0(0, 0 | x_m)$, $P_0(1, 0 | x_m)$, and
$P_0(0, 1 | x_m)$. Given these probabilities, we can define the following pseudo likelihood function:
$$\begin{aligned} Q_M(\theta, P_0) = \sum_{m=1}^{M} \Big[\, & a_{1m} a_{2m} \ln P(1, 1 | x_m, \theta) + (1 - a_{1m})(1 - a_{2m}) \ln P(0, 0 | x_m, \theta) \\ & + (1 - a_{1m})\, a_{2m} \ln P_0(0, 1 | x_m) \\ & + a_{1m} (1 - a_{2m}) \ln \big(1 - P(1, 1 | x_m, \theta) - P(0, 0 | x_m, \theta) - P_0(0, 1 | x_m)\big) \Big] \end{aligned} \tag{7.24}$$
This is a well-defined pseudo likelihood function, and the true $\theta_0$ uniquely maximizes $Q_\infty(\theta, P_0)$.
The pseudo ML estimator of $\theta_0$ is such that:
$$\hat{\theta} = \arg\max_{\theta \in \Theta}\ Q_M(\theta, \hat{P}_0) \tag{7.25}$$
where $\hat{P}_0$ is a nonparametric estimator of $P_0$. This estimator is root-M consistent and
asymptotically normal.
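Tamer shows that it is possible to obtain a more efficient PML estimator if we take into
account that the model provides a lower and an upper bound for the probability of (0, 1):
$$P_L(x, \theta) \leq P(0, 1 | x, \theta) \leq P_U(x, \theta) \tag{7.26}$$
where:
$$\begin{aligned} P_L(x, \theta) &= \Pr(\varepsilon_1 < -x_1\beta_1 + \delta_1;\ \varepsilon_2 \geq -x_2\beta_2 + \delta_2) \\ &\quad + \Pr(\varepsilon_1 < -x_1\beta_1;\ -x_2\beta_2 \leq \varepsilon_2 < -x_2\beta_2 + \delta_2) \\ P_U(x, \theta) &= \Pr(\varepsilon_1 < -x_1\beta_1 + \delta_1;\ \varepsilon_2 \geq -x_2\beta_2) \end{aligned} \tag{7.27}$$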
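We can construct the pseudo likelihood function:
$$\begin{aligned} Q^*_M(\theta, P_0) = \sum_{m=1}^{M} \Big[\, & a_{1m} a_{2m} \ln P(1, 1 | x_m, \theta) + (1 - a_{1m})(1 - a_{2m}) \ln P(0, 0 | x_m, \theta) \\ & + (1 - a_{1m})\, a_{2m} \ln P^*(0, 1 | x_m; P_0, \theta) \\ & + a_{1m} (1 - a_{2m}) \ln \big(1 - P(1, 1 | x_m, \theta) - P(0, 0 | x_m, \theta) - P^*(0, 1 | x_m; P_0, \theta)\big) \Big] \end{aligned} \tag{7.28}$$
where:
$$P^*(0, 1 | x_m; P_0, \theta) = \begin{cases} P_L(x_m, \theta) & \text{if } P_0(0, 1 | x_m) \leq P_L(x_m, \theta) \\ P_0(0, 1 | x_m) & \text{if } P_L(x_m, \theta) < P_0(0, 1 | x_m) < P_U(x_m, \theta) \\ P_U(x_m, \theta) & \text{if } P_0(0, 1 | x_m) \geq P_U(x_m, \theta) \end{cases} \tag{7.29}$$
Again, this is a well-defined pseudo likelihood function, and the true $\theta_0$ uniquely maximizes
$Q^*_\infty(\theta, P_0)$. The pseudo ML estimator of $\theta_0$ maximizes $Q^*_M(\theta, \hat{P}_0)$, where $\hat{P}_0$ is a
nonparametric estimator of $P_0$. This estimator is root-M consistent and asymptotically normal and
is asymptotically more efficient than the estimator that maximizes $Q_M(\theta, \hat{P}_0)$.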
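To make the bounds (7.27) and the truncation (7.29) concrete, the sketch below evaluates them in the special case of independent standard normal shocks. Setting $\rho = 0$ is an assumption of this example only, chosen so that the joint probabilities are products of univariate normal CDFs; with $\rho \neq 0$ a bivariate normal CDF would be needed. The arguments `v1` and `v2` stand for $x_1\beta_1$ and $x_2\beta_2$, and the function names are illustrative.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def bounds_01(v1, v2, d1, d2):
    """Lower and upper bounds (7.27) on P(0,1 | x, theta) when rho = 0."""
    # Upper bound: eps1 < -v1 + d1 and eps2 >= -v2
    p_upper = PHI(d1 - v1) * (1 - PHI(-v2))
    # Lower bound: the two regions where (0,1) is the unique equilibrium
    p_lower = (PHI(d1 - v1) * (1 - PHI(d2 - v2))
               + PHI(-v1) * (PHI(d2 - v2) - PHI(-v2)))
    return p_lower, p_upper


def p_star_01(p0_01, p_lower, p_upper):
    """Truncate the nonparametric estimate to the model-implied interval, as in (7.29)."""
    return min(max(p0_01, p_lower), p_upper)
```

The gap between the two bounds equals the probability of the region where both (0, 1) and (1, 0) are equilibria, so the bounds coincide when $\delta_1 = 0$ or $\delta_2 = 0$.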
7.3. Entry games with incomplete information.
7.3.1. Model and basic assumptions. Consider a market with N potential entrants. If
firm i does not operate in market m ($a_{im} = 0$), its profit is zero. If the firm is active in the
market ($a_{im} = 1$), the profit is:
$$\pi_{im} = \pi_i(s_m, a_{-im}) - \varepsilon_{im} \tag{7.30}$$
For instance,
$$\pi_{im} = s_{im}\beta_i - \varepsilon_{im} - \sum_{j \neq i} \delta_{ij}\, a_{jm} \tag{7.31}$$
where $\beta_i$ and $\delta_i$ are parameters. These parameters and the vector $s_m = (s_{1m}, s_{2m}, \ldots, s_{Nm})$
contain the variables which are common knowledge for all players. Now $\varepsilon_{im}$ is private
information of firm i. For the moment, we assume that private information variables are
independent of $s_m$ and independently distributed over firms with distribution functions $G_i(\varepsilon_{im})$.
The distribution function $G_i$ is strictly increasing in $\mathbb{R}$. The information of player i is
$(s_m, \varepsilon_{im})$.
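A player's strategy depends on the variables in his information set. Let
$\sigma \equiv \{\sigma_i(s_m, \varepsilon_{im}) : i = 1, 2, \ldots, N\}$ be a set of strategy functions, one for each player, such that
$\sigma_i : S \times \mathbb{R} \to \{0, 1\}$. The actual payoff/profit $\pi_{im}$ is unknown to player i because the private
information of the other players is unknown to player i. Players maximize expected profits:
$$\pi^e_i(s_m, \varepsilon_{im}, \sigma_{-i}) = s_{im}\beta_i - \varepsilon_{im} - \sum_{j \neq i} \delta_{ij} \left(\int I\{\sigma_j(s_m, \varepsilon_{jm}) = 1\}\, dG_j(\varepsilon_{jm})\right) \tag{7.32}$$
or:
$$\begin{aligned} \pi^e_i(s_m, \varepsilon_{im}, \sigma_{-i}) &= s_{im}\beta_i - \varepsilon_{im} - \sum_{j \neq i} \delta_{ij}\, P^\sigma_j(s_m) \\ &= s_{im}\beta_i - \varepsilon_{im} - P^\sigma_{-i}(s_m)'\delta_i \end{aligned} \tag{7.33}$$
where $P^\sigma_j(s_m) \equiv \int I\{\sigma_j(s_m, \varepsilon_{jm}) = 1\}\, dG_j(\varepsilon_{jm})$ is player j's probability of entry if he
behaves according to his strategy in $\sigma$.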
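Suppose that players other than i play their respective strategies in $\sigma$. What is player
i's best response? Let $b_i(s_m, \varepsilon_{im}, \sigma_{-i})$ be player i's best response function. This function is:
$$\begin{aligned} b_i(s_m, \varepsilon_{im}, \sigma_{-i}) &= I\{\pi^e_i(s_m, \varepsilon_{im}, \sigma_{-i}) \geq 0\} \\ &= I\{\varepsilon_{im} \leq s_{im}\beta_i - P^\sigma_{-i}(s_m)'\delta_i\} \end{aligned} \tag{7.34}$$
Associated with the best response function $b_i$ (in the space of strategies), we can define a
best response probability function in the space of probabilities as:
$$\begin{aligned} \Psi_i(s_m, P^\sigma_{-i}) &= \int I\{b_i(s_m, \varepsilon_{im}, \sigma_{-i}) = 1\}\, dG_i(\varepsilon_{im}) \\ &= \int I\{\varepsilon_{im} \leq s_{im}\beta_i - P^\sigma_{-i}(s_m)'\delta_i\}\, dG_i(\varepsilon_{im}) \\ &= G_i\big(s_{im}\beta_i - P^\sigma_{-i}(s_m)'\delta_i\big) \end{aligned} \tag{7.35}$$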
A Bayesian Nash equilibrium (BNE) in this model is a set of strategy functions $\sigma^*$ such
that, for any player i and any value of $(s_m, \varepsilon_{im})$, we have that:
$$\sigma^*_i(s_m, \varepsilon_{im}) = b_i(s_m, \varepsilon_{im}, \sigma^*_{-i}) \tag{7.36}$$
Associated with the set of strategies $\sigma^*$ we can define a set of choice probability functions
$P^* = \{P^*_i(s_m) : i = 1, 2, \ldots, N\}$ such that $P^*_i(s_m) \equiv \int I\{\sigma^*_i(s_m, \varepsilon_{im}) = 1\}\, dG_i(\varepsilon_{im})$. Note
that these equilibrium choice probabilities are such that, for any player i and any value of
$s_m$:
$$\begin{aligned} P^*_i(s_m) &= \Psi_i(s_m, P^*_{-i}) \\ &= G_i\big(s_{im}\beta_i - P^*_{-i}(s_m)'\delta_i\big) \end{aligned} \tag{7.37}$$
Therefore, we can define a BNE in terms of strategy functions $\sigma^*$ or in terms of choice
probabilities $P^*$. There is a one-to-one relationship between $\sigma^*$ and $P^*$. Given $\sigma^*$, it is clear that
there is only one set of choice probabilities $P^*$ defined as $P^*_i(s_m) \equiv \int I\{\sigma^*_i(s_m, \varepsilon_{im}) = 1\}\, dG_i(\varepsilon_{im})$.
And given $P^*$, there is only one set of strategies $\sigma^*$ that is a BNE and is consistent with
$P^*$. These strategy functions are:
$$\sigma^*_i(s_m, \varepsilon_{im}) = I\{\varepsilon_{im} \leq s_{im}\beta_i - P^*_{-i}(s_m)'\delta_i\} \tag{7.38}$$
Suppose that the distribution of $\varepsilon_{im}$ is known up to some scale parameter $\sigma_i$. For instance,
suppose that $\varepsilon_{im} \sim$ iid $N(0, \sigma_i^2)$. Then, the equilibrium choice probabilities in
market m solve the fixed point mapping in probability space:
$$P^*_i(s_m) = \Phi\left(s_{im}\frac{\beta_i}{\sigma_i} - P^*_{-i}(s_m)'\frac{\delta_i}{\sigma_i}\right)$$
For notational simplicity we will use $\beta_i$ and $\delta_i$ to represent $\beta_i/\sigma_i$ and $\delta_i/\sigma_i$, respectively.
We use $\theta$ to represent the vector of structural parameters $\{\beta_i, \delta_i : i = 1, 2, \ldots, N\}$. To
emphasize that equilibrium probabilities depend on $\theta$ we use $P(s_m, \theta) = \{P_i(s_m, \theta) : i = 1, 2, \ldots, N\}$
to represent a vector of equilibrium probabilities associated with the exogenous
conditions $(s_m, \theta)$. In general, there are values of $(s_m, \theta)$ for which the model has multiple
equilibria. This is very common in models where players are heterogeneous, but we can also
find multiple symmetric equilibria in models with homogeneous players, especially if there is
strategic complementarity (i.e., $\delta_i < 0$) as in coordination games.
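The fixed point in probability space can be computed by iterating the best response probability mapping. The sketch below does this for the probit case with normalized parameters; `index[i]` stands for $s_{im}\beta_i$ and `delta[i][j]` for $\delta_{ij}$. Plain fixed-point iteration is an assumption of this example rather than a method prescribed in the notes: it is guaranteed to converge only when the mapping is a contraction, and it finds at most one equilibrium per starting value.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def best_response_probs(index, delta, P):
    """Psi_i(P_{-i}) = Phi(index_i - sum_{j != i} delta[i][j] * P_j) for every i."""
    N = len(P)
    return [
        PHI(index[i] - sum(delta[i][j] * P[j] for j in range(N) if j != i))
        for i in range(N)
    ]


def solve_bne(index, delta, P_init, tol=1e-12, max_iter=10_000):
    """Iterate P <- Psi(P) from P_init; returns a fixed point if the iteration converges."""
    P = list(P_init)
    for _ in range(max_iter):
        P_new = best_response_probs(index, delta, P)
        if max(abs(p - q) for p, q in zip(P, P_new)) < tol:
            return P_new
        P = P_new
    raise RuntimeError("fixed-point iteration did not converge")
```

With strategic substitutes such as `index = [0.5, 0.5]` and `delta = [[0, 1], [1, 0]]` the iteration converges to the same probabilities from any starting point. With strategic complementarity, for example `index = [-3, -3]` and `delta = [[0, -6], [-6, 0]]`, starting values near 0 and near 1 converge to different symmetric equilibria.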
7.3.2. Data and identification. Suppose that we observe this game played at M
independent markets. We observe players' actions and a subset of the common knowledge state
variables, $x_{im} \subseteq s_{im}$. That is,
$$\text{Data} = \{x_{im}, a_{im} : m = 1, 2, \ldots, M;\ i = 1, 2, \ldots, N\} \tag{7.39}$$
The researcher does not observe private information variables. It is important to distinguish
two cases:
Case I: No common knowledge unobservables, i.e., $x_{im} = s_{im}$.
Case II: Common knowledge unobservables, i.e., $s_{im} = (x_{im}, \omega_{im})$, where
$\omega_{im}$ is unobservable.
Case I: No common knowledge unobservables.
(A) Data with global players. Suppose that we have a random sample of markets and we observe:
$$\{x_{im}, a_{im} : m = 1, 2, \ldots, M;\ i = 1, 2, \ldots, N\} \tag{7.40}$$
Let $P^0 = \{P^0_i(x) : i = 1, 2, \ldots, N;\ x \in X\}$ be players' entry probabilities in the population
under study. The population is an equilibrium of the model. That is, there is a $\theta^0$ such that,
for any i and any $x \in X$:
$$P^0_i(x) = \Phi\big(x_i\beta^0_i - P^0_{-i}(x)'\delta^0_i\big) \tag{7.41}$$
From our sample, we can nonparametrically identify the population $P^0$, i.e., $P^0_i(x) =
E(a_{im} | x_m = x)$. Given $P^0$ and the equilibrium conditions in (7.41), can we uniquely identify
$\theta^0$? Notice that we can write these equations as:
$$\Phi^{-1}\big(P^0_i(x_m)\big) = x_{im}\beta^0_i - P^0_{-i}(x_m)'\delta^0_i = Z_{im}\theta^0_i$$
Define $Y_{im} \equiv \Phi^{-1}(P^0_i(x_m))$; $Z_{im} \equiv (x_{im}, -P^0_{-i}(x_m)')$; and $\theta^0_i \equiv (\beta^0_i, \delta^0_i)$. Then,
$$Y_{im} = Z_{im}\theta^0_i$$
And we can also write this system as:
$$E(Z'_{im} Y_{im}) = E(Z'_{im} Z_{im})\, \theta^0_i$$
It is clear that $\theta^0_i$ is uniquely identified if $E(Z'_{im} Z_{im})$ is a nonsingular matrix. Note that if $x_{im}$
contains variables that vary both over markets and over players, then we have exclusion
restrictions that imply that $E(Z'_{im} Z_{im})$ is a nonsingular matrix.
(B) Data with only local players. Suppose that we have a random sample of markets
and we observe:
$$\{x_m, n_m : m = 1, 2, \ldots, M\} \tag{7.42}$$
Let $P^0 = \{P^0(x) : x \in X\}$ be the entry probabilities in the population under study.
The population is an equilibrium of the model, and therefore there is a $\theta^0$ such that for any
$x \in X$:
$$P^0(x) = \Phi\big(x\beta^0 - \delta^0 H(P^0[x])\big) \tag{7.43}$$
From our sample, we can nonparametrically identify the population $P^0$. To see this, notice
that: (1) we can identify the distribution for the number of firms: $\Pr(n_m = n | x_m = x)$;
(2) the model implies that conditional on $x_m = x$ the number of firms follows a Binomial
distribution with arguments N and $P^0(x)$, therefore
$$\Pr(n_m = n | x_m = x) = \binom{N}{n} P^0(x)^n \big(1 - P^0(x)\big)^{N - n};$$
and (3) given the previous expression, we can obtain the $P^0(x)$ associated with $\Pr(n_m =
n | x_m = x)$. Given $P^0$ and the equilibrium condition $P^0(x) = \Phi(x\beta^0 - \delta^0 H(P^0[x]))$, can we
uniquely identify $\theta^0$? Notice that we can write these equations as:
$$Y_m = x_m\beta^0 - \delta^0 H(P^0[x_m]) = Z_m\theta^0$$
where $Y_m \equiv \Phi^{-1}(P^0(x_m))$; $\theta^0 \equiv (\beta^0, \delta^0)$; and $Z_m \equiv (x_m, -H(P^0[x_m]))$. And we can also write
this system as:
$$E(Z'_m Y_m) = E(Z'_m Z_m)\, \theta^0$$
It is clear that $\theta^0$ is uniquely identified if $E(Z'_m Z_m)$ is a nonsingular matrix.
Case II: Common knowledge unobservables. Intuition: conditional on $x_m$, players' actions
are still correlated across markets. This is evidence that ....
In applications where we do not observe the identity of the potential entrants,
we consider a model without firm heterogeneity:
$$\pi_{im} = x_m\beta - \delta\, h\Big(1 + \sum_{j \neq i} a_{jm}\Big) + \varepsilon_{im} \tag{7.44}$$
A symmetric Bayesian Nash equilibrium in this model is a probability of entry
$P^*(x_m, \theta)$ that solves the fixed point problem:
$$P^*(x_m, \theta) = \Phi\big(x_m\beta - \delta\, H(P^*[x_m, \theta])\big) \tag{7.45}$$
where $H(P)$ is the expected value of $h\big(1 + \sum_{j \neq i} a_j\big)$ conditional on the information
of firm i, and under the condition that the other firms behave according to their
entry probabilities in P. That is,
$$H(P) = \sum_{a_{-i}} \left(\prod_{j \neq i} P_j^{a_j} [1 - P_j]^{1 - a_j}\right) h\Big(1 + \sum_{j \neq i} a_j\Big) \tag{7.46}$$
and $\sum_{a_{-i}}$ represents the sum over all the possible actions of firms other than i.
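Expression (7.46) is a finite sum over the rivals' action profiles, so it can be evaluated by direct enumeration when the number of rivals is small. The sketch below does this; the function name and the choices of h used in the usage note are illustrative assumptions.

```python
from itertools import product


def expected_h(P_others, h):
    """H(P) in (7.46): expected value of h(1 + number of rival entrants).

    P_others lists the entry probabilities of the firms other than i,
    which enter independently of each other.
    """
    total = 0.0
    for a in product((0, 1), repeat=len(P_others)):
        prob = 1.0
        for a_j, p_j in zip(a, P_others):
            prob *= p_j if a_j else (1.0 - p_j)
        total += prob * h(1 + sum(a))
    return total
```

When h is linear, say `h = lambda n: n`, the result is simply one plus the sum of the rivals' entry probabilities; for a nonlinear h such as the logarithm, the enumeration accounts for the full distribution of the number of rivals. The cost grows as $2^{N-1}$, so this direct approach is practical only for a small number of potential entrants.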
7.3.3. Pseudo ML estimation. The problem is to estimate the vector of structural
parameters $\theta^0$ given a random sample $\{x_{im}, a_{im}\}$. Equilibrium probabilities are not uniquely
determined for some values of the primitives. However, for any vector of probabilities P, the
best response probability functions $\Phi\big(x_{im}\beta_i - \sum_{j \neq i} \delta_{ij} P_j(x_m)\big)$ are always well-defined.
We define a pseudo likelihood function based on best responses to the population
probabilities:
$$\begin{aligned} Q_M(\theta, P^0) = \sum_{m=1}^{M} \sum_{i=1}^{N} \Big[\, & a_{im} \ln \Phi\Big(x_{im}\beta_i - \sum_{j \neq i} \delta_{ij} P^0_j(x_m)\Big) \\ & + (1 - a_{im}) \ln \Phi\Big(-x_{im}\beta_i + \sum_{j \neq i} \delta_{ij} P^0_j(x_m)\Big) \Big] \end{aligned} \tag{7.47}$$
It is possible to show that $\theta^0$ uniquely maximizes $Q_\infty(\theta, P^0)$. The PML estimator of $\theta^0$
maximizes $Q_M(\theta, \hat{P}^0)$, where $\hat{P}^0$ is a consistent nonparametric estimator of $P^0$. This estimator
is consistent and asymptotically normal. Iterating this procedure can provide efficiency
gains both in finite samples and asymptotically (Aguirregabiria, Economics Letters, 2004).
8. Entry and Spatial Competition
Based on Seim (RAND, 2006)
In today's lecture we will examine empirical models of entry and spatial location, with
particular attention to the model in Seim (RAND, 2006) and some extensions of that model.
Seim's paper is an important contribution to the literature on market entry games not only
because it deals with an important empirical question but also because it was one of the
first studies with methodological contributions such as relaxing the assumption of isolated
markets, endogenizing product characteristics (location), introducing incomplete
information in firms' profits, and dealing with endogeneity problems due to unobserved market
characteristics.
8.1. Model. How do the market power and profits of a retail firm depend on the location of its store(s) relative to the location of competitors? How important is spatial differentiation, or more generally horizontal product differentiation, in explaining market power? This is an important question in IO.
Seim studies this question by looking at the location decisions of video rental stores. She
considers a model of market entry that: (1) endogenizes firms' spatial location decisions; (2)
relaxes the assumption of isolated markets; (3) introduces firms' private information; and
(4) takes into account endogeneity problems due to unobserved market characteristics.
From a geographical point of view, a market in this model is a compact set in the two-dimensional Euclidean space. There are $L$ locations in the market where firms can operate stores. These locations are exogenously given, and they could be chosen as the set of points of a grid, where the grid can be as fine as we want. We index locations by $\ell$, which belongs to the set $\{1, 2, \dots, L\}$.
There are $N$ potential entrants in the market. Each firm makes two decisions: (1) whether to be active or not in the market; and (2) if it decides to be active, the location of its store. Note that Seim does not model multi-store firms. Aguirregabiria and Vicentini (2007) present an extension of Seim's model with multi-store firms, endogenous consumer behavior, and dynamics.
Let $a_i$ represent the decision of firm $i$, such that $a_i \in \{0, 1, \dots, L\}$, where $a_i = 0$ represents "no entry" and $a_i = \ell > 0$ represents entry in location $\ell$.
The profit of not being active in the market is normalized to zero. Let $\Pi_{i\ell}$ be the profit of firm $i$ if it has a store in location $\ell$. These profits depend on the store location decisions of the other firms. In particular, $\Pi_{i\ell}$ declines with the number of other stores "close to" location $\ell$.
Of course, the specific meaning of being close to location $\ell$ is key for the implications
of this model. This should depend on the extent to which consumers perceive stores in
different locations as close substitutes. In principle, if we had data on quantities and prices for the different
stores active in this city, we could estimate a demand system that would provide measures
of consumers' transportation costs and of the degree of substitution in demand between
stores at different locations. That is what Jackie Wang did in his job market paper for the
banking industry (Wang, 2010). However, for this industry we do not have information on
prices and quantities at the store level, and even if we had, stores' location decisions may
contain useful (and even better) information to identify the degree of competition between
stores at different locations.
Seim's specification of the profit function is "semi-structural" in the sense that it does
not model consumer behavior explicitly, but it is consistent with the idea that consumers
face transportation costs and therefore spatial differentiation (ceteris paribus) can increase
profits.
For every location $\ell$ in the city, Seim defines $B$ rings around the location: a first ring of radius $d_1$ (say half a mile), a second ring of radius $d_2 > d_1$ (say one mile), and so on. The profit of a store depends on the number of other stores located within each of the $B$ rings. We expect closer stores to have stronger negative effects on profits. The profit function of an active store at location $\ell$ is:
$$\Pi_{i\ell} = x_\ell \beta + \sum_{b=1}^{B} \gamma_b N_{b\ell} + \xi_\ell + \varepsilon_{i\ell}$$
where $\beta$, $\gamma_1$, $\gamma_2$, ..., and $\gamma_B$ are parameters; $x_\ell$ is a vector of observable exogenous characteristics that affect profits in location $\ell$; $N_{b\ell}$ is the number of stores in ring $b$ around location $\ell$, excluding firm $i$'s own store; $\xi_\ell$ represents exogenous characteristics of location $\ell$ that are unobserved to the researcher but common and observable to firms; and $\varepsilon_{i\ell}$ is a component of the profit of firm $i$ in location $\ell$ that is private information to this firm. For the no entry choice, $\Pi_{i0} = \varepsilon_{i0}$.
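The ring construction can be sketched in a few lines. The code below counts rival stores in each distance ring and evaluates the profit function for given parameter values. It assumes, for illustration, that ring $b$ is the band of distances between the previous radius and $d_b$, so that a store belongs to at most one ring; the function names, coordinates, and parameter values are hypothetical.

```python
import math

def ring_counts(location, rival_stores, radii):
    """Count rival stores in each distance ring around `location`.

    Ring b is taken to be the band (radii[b-1], radii[b]], with the first
    ring starting at distance 0. This banding is an assumption of the sketch.
    """
    counts = [0] * len(radii)
    for store in rival_stores:
        d = math.dist(location, store)
        for b, radius in enumerate(radii):
            if d <= radius:
                counts[b] += 1
                break  # each store belongs to at most one ring
    return counts

def profit(x_beta, gammas, counts, xi=0.0, eps=0.0):
    """Profit x*beta + sum_b gamma_b * N_b + xi + eps of an active store."""
    return x_beta + sum(g * n for g, n in zip(gammas, counts)) + xi + eps

# Hypothetical city: rings of radius 0.5 and 1.0 miles around the origin.
rivals = [(0.3, 0.0), (0.0, 0.8), (2.0, 2.0)]
counts = ring_counts((0.0, 0.0), rivals, radii=[0.5, 1.0])
pi = profit(x_beta=1.0, gammas=[-0.4, -0.1], counts=counts)
```

In this example one rival falls in each ring and the third rival is too far away to matter, so the store's profit is reduced by the sum of the two ring coefficients.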
ASSUMPTION: Let $\varepsilon_i = \{\varepsilon_{i\ell} : \ell = 0, 1, \dots, L\}$ be the vector with the private information variables of firm $i$ at every possible location. $\varepsilon_i$ is i.i.d. over firms and locations with an extreme value type 1 distribution.
The information of firm $i$ is $(x, \xi, \varepsilon_i)$, where $x$ and $\xi$ represent the vectors with $x_\ell$ and $\xi_\ell$, respectively, at every location in the city. Firm $i$ does not know the $\varepsilon$'s of other firms. Therefore, $N_{b\ell}$ is unknown to a firm. Firms only know the probability distribution of $N_{b\ell}$. Therefore, firms maximize expected profits. The expected profit of firm $i$ is:
$$\Pi^e_{i\ell} = x_\ell \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell} + \xi_\ell + \varepsilon_{i\ell}$$
where $N^e_{b\ell}$ represents $E(N_{b\ell} \mid x, \xi)$.
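A firm's strategy depends on the variables in its information set. Let $\alpha_i(x, \xi, \varepsilon_i)$ be a strategy function for firm $i$ such that $\alpha_i : X \times \mathbb{R}^2 \to \{0, 1, \dots, L\}$. Given expectations $N^e_{b\ell}$, the best response strategy of player $i$ is:
$$\alpha_i(x, \xi, \varepsilon_i) = \arg\max_{\ell \in \{0, 1, \dots, L\}} \left\{ x_\ell \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell} + \xi_\ell + \varepsilon_{i\ell} \right\}$$
Or similarly, $\alpha_i(x, \xi, \varepsilon_i) = \ell$ if and only if $x_\ell \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell} + \xi_\ell + \varepsilon_{i\ell}$ is greater than $x_{\ell'} \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell'} + \xi_{\ell'} + \varepsilon_{i\ell'}$ for any other location $\ell'$.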
From the point of view of other firms that do not know the private information of firm $i$ but know the strategy function $\alpha_i(x, \xi, \varepsilon_i)$, the strategy of firm $i$ can be described as a probability distribution $P_i \equiv \{P_{i\ell} : \ell = 0, 1, \dots, L\}$, where $P_{i\ell}$ is the probability that firm $i$ chooses location $\ell$ when following its strategy $\alpha_i(x, \xi, \varepsilon_i)$. That is,
$$P_{i\ell} \equiv \int 1\{\alpha_i(x, \xi, \varepsilon_i) = \ell\} \, dF(\varepsilon_i)$$
where $F(\varepsilon_i)$ is the CDF of $\varepsilon_i$. By construction, $\sum_{\ell=0}^{L} P_{i\ell} = 1$.
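Given expectations $N^e_{b\ell}$, we can also represent the best response strategy of firm $i$ as a choice probability. A best response probability $P_{i\ell}$ is:
$$P_{i\ell} = \int 1\left[ \ell = \arg\max_{\ell'} \left\{ x_{\ell'} \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell'} + \xi_{\ell'} + \varepsilon_{i\ell'} \right\} \right] dF(\varepsilon_i)$$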
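And given the extreme value assumption on $\varepsilon_i$:
$$P_{i\ell} = \frac{\exp\left\{ x_\ell \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell} + \xi_\ell \right\}}{1 + \sum_{\ell'=1}^{L} \exp\left\{ x_{\ell'} \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell'} + \xi_{\ell'} \right\}}$$
In this application, there is no information on firm-level exogenous characteristics, and Seim assumes that the equilibrium is symmetric: $\alpha_i(x, \xi, \varepsilon_i) = \alpha(x, \xi, \varepsilon_i)$ and $P_{i\ell} = P_\ell$ for every firm $i$.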
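The logit formula can be checked numerically with a short sketch. The function below takes the deterministic part of expected profit at each location and returns the probabilities of no entry and of entry at each location. The name `choice_probabilities` and the example values are assumptions for illustration.

```python
import math

def choice_probabilities(v):
    """Logit probabilities over {no entry, location 1, ..., location L}.

    v[l-1] is the deterministic part of expected profit at location l,
    i.e. x_l*beta + sum_b gamma_b*N^e_bl + xi_l. No entry has value 0.
    Returns [P_0, P_1, ..., P_L].
    """
    shift = max(0.0, max(v))  # subtract the largest index to avoid overflow
    exp_out = math.exp(-shift)
    exp_in = [math.exp(v_l - shift) for v_l in v]
    denom = exp_out + sum(exp_in)
    return [exp_out / denom] + [e / denom for e in exp_in]

# Two locations with indices 0 and ln(2): the weights are 1 (no entry), 1 and 2.
probs = choice_probabilities([0.0, math.log(2.0)])
```

The "1" in the denominator of the formula is the exponential of the no-entry value, which is normalized to zero; the shift inside the function leaves the probabilities unchanged and only protects against overflow.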
The expected number of firms in ring $b$ around location $\ell$, $N^e_{b\ell}$, is determined by the vector of entry probabilities $P \equiv \{P_{\ell'} : \ell' = 1, 2, \dots, L\}$. That is:
$$N^e_{b\ell} = \sum_{\ell'=1}^{L} 1\{\ell' \text{ belongs to ring } b \text{ around } \ell\} \, P_{\ell'} \, N$$
To emphasize this dependence we use the notation $N^e_{b\ell}(P)$.
Therefore, we can define a (symmetric) equilibrium in this game as a vector of probabilities $P \equiv \{P_\ell : \ell = 1, 2, \dots, L\}$ that solves the following system of equilibrium conditions: for every $\ell = 1, 2, \dots, L$:
$$P_\ell = \frac{\exp\left\{ x_\ell \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell}(P) + \xi_\ell \right\}}{1 + \sum_{\ell'=1}^{L} \exp\left\{ x_{\ell'} \beta + \sum_{b=1}^{B} \gamma_b N^e_{b\ell'}(P) + \xi_{\ell'} \right\}}$$
By Brouwer's Theorem an equilibrium exists. The equilibrium may not be unique. Seim shows that if the $\gamma$ parameters are not large and they decline fast enough with $b$, then the equilibrium is unique.
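One simple way to compute a symmetric equilibrium is to iterate on the equilibrium mapping until the probabilities stop changing. The sketch below uses a damped fixed-point iteration, which is an assumption of this example rather than the algorithm used by Seim; it is not guaranteed to converge when the competition effects are strong. All function names, the ring structure, and the parameter values are hypothetical.

```python
import math

def logit_entry_probs(v):
    """P_l = exp(v_l) / (1 + sum_k exp(v_k)) for l = 1..L."""
    shift = max(0.0, max(v))
    expv = [math.exp(v_l - shift) for v_l in v]
    denom = math.exp(-shift) + sum(expv)
    return [e / denom for e in expv]

def expected_ring_counts(p, ring_of, n_rings, n_firms):
    """N^e_bl(P) = sum_k 1{k in ring b around l} * P_k * N."""
    n_loc = len(p)
    ne = [[0.0] * n_rings for _ in range(n_loc)]
    for l in range(n_loc):
        for k in range(n_loc):
            b = ring_of[l][k]
            if b is not None:
                ne[l][b] += p[k] * n_firms
    return ne

def solve_equilibrium(base, gammas, ring_of, n_firms,
                      damping=0.5, tol=1e-10, max_iter=10_000):
    """Damped fixed-point iteration on the equilibrium mapping.

    base[l] stands for x_l*beta + xi_l. Returns (P, converged).
    """
    n_loc = len(base)
    p = [1.0 / (n_loc + 1)] * n_loc  # start from uniform choice probabilities
    for _ in range(max_iter):
        ne = expected_ring_counts(p, ring_of, len(gammas), n_firms)
        v = [base[l] + sum(g * n for g, n in zip(gammas, ne[l]))
             for l in range(n_loc)]
        p_new = logit_entry_probs(v)
        gap = max(abs(q - r) for q, r in zip(p_new, p))
        p = [(1 - damping) * r + damping * q for q, r in zip(p_new, p)]
        if gap < tol:
            return p, True
    return p, False

# Hypothetical example: 4 locations on a line. Ring 0 is the location itself,
# ring 1 the adjacent locations; locations further away do not compete.
n_loc = 4
ring_of = [[abs(l - k) if abs(l - k) <= 1 else None for k in range(n_loc)]
           for l in range(n_loc)]
base = [0.2, 0.0, 0.0, 0.2]
gammas = [-0.3, -0.1]
p_eq, ok = solve_equilibrium(base, gammas, ring_of, n_firms=5)
```

When all ring coefficients are zero the mapping does not depend on $P$ and the iteration returns the plain logit probabilities, which is a convenient sanity check.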
8.2. Econometric model and Estimation. Let $\theta = \{N, \beta, \gamma_1, \gamma_2, \dots, \gamma_B\}$ be the vector of parameters of the model. These parameters can be estimated even if we have data only
from one city. Suppose that the data set is $\{x_\ell, n_\ell : \ell = 1, 2, \dots, L\}$ for $L$ different locations in a city, where $L$ is large, and $n_\ell$ represents the number of stores in location $\ell$. We want to
use these data to estimate $\theta$. I describe the estimation with data from only one city. Later,
we will see that the extension to data from more than one city is trivial.
Let $x$ be the vector $\{x_\ell : \ell = 1, 2, \dots, L\}$. All the analysis is conditional on $x$, which is a description of the "landscape" of observable socioeconomic characteristics in the city. Given $x$, we can think of $\{n_\ell : \ell = 1, 2, \dots, L\}$ as one realization of a spatial stochastic process. In terms of the econometric analysis, this has similarities with time series econometrics in the sense that a time series is a single realization from a stochastic process. Despite having just one realization of a stochastic process, we can estimate the parameters of that process consistently as long as we make some stationarity assumptions.
8.2.1. Estimation without unobserved location heterogeneity. This is the model considered
by Seim (2006): there is unobserved heterogeneity across cities (her dataset includes multiple cities)
but within a city there is no unobserved location heterogeneity.
Conditional on $x$, spatial correlation/dependence in the unobservable variables $\xi_\ell$ can generate dependence between the number of firms at different locations $\{n_\ell\}$. We start with the simpler case where there is no unobserved location heterogeneity: i.e., $\xi_\ell = 0$ for every location $\ell$.
Without unobserved location heterogeneity, and conditional on $x$, the variables $n_\ell$ are independently distributed, and $n_\ell$ is a random draw from a Binomial random variable with arguments $(N, P_\ell(x, \theta))$, where $P_\ell(x, \theta)$ are the equilibrium probabilities defined above, now written with $(x, \theta)$ as explicit arguments:
$$n_\ell \sim \text{independent over } \ell, \ \text{Binomial}(N, P_\ell(x, \theta))$$
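Therefore,
$$\Pr(n_1, n_2, \dots, n_L \mid x, \theta) = \prod_{\ell=1}^{L} \Pr(n_\ell \mid x, \theta) = \prod_{\ell=1}^{L} \frac{N!}{n_\ell! \, (N - n_\ell)!} \, P_\ell(x, \theta)^{n_\ell} \, (1 - P_\ell(x, \theta))^{N - n_\ell}$$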
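The log-likelihood function is:
$$l(\theta) = \sum_{\ell=1}^{L} \left[ \ln\left( \frac{N!}{(N - n_\ell)!} \right) + n_\ell \ln P_\ell(x, \theta) + (N - n_\ell) \ln(1 - P_\ell(x, \theta)) \right]$$
where the term $-\ln(n_\ell!)$ is omitted because it does not depend on $\theta$.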
And the maximum likelihood estimator, $\hat{\theta}$, is the value of $\theta$ that maximizes this likelihood.
Later, I will present and describe in detail several algorithms to obtain this MLE. The part
of this estimation that is computationally most demanding is that the probabilities are the
solution of a fixed point/equilibrium problem.
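For given equilibrium probabilities, the log-likelihood itself is cheap to evaluate. The following sketch computes it with the log-gamma function so that the factorials do not overflow. The function name and the example values are assumptions for illustration, and the probabilities are taken as inputs rather than solved for.

```python
import math

def binomial_loglik(n_firms, counts, probs):
    """l(theta) = sum_l ln(N!/(N-n_l)!) + n_l ln P_l + (N-n_l) ln(1-P_l).

    counts[l] is the observed number of stores n_l and probs[l] the
    equilibrium probability P_l implied by theta (taken as given here).
    """
    total = 0.0
    for n_l, p_l in zip(counts, probs):
        # ln N! - ln (N - n_l)! via the log-gamma function
        total += math.lgamma(n_firms + 1) - math.lgamma(n_firms - n_l + 1)
        total += n_l * math.log(p_l) + (n_firms - n_l) * math.log(1.0 - p_l)
    return total

# One location, N = 2 potential entrants, one store observed, P = 0.5.
ll = binomial_loglik(n_firms=2, counts=[1], probs=[0.5])
```

In a full estimation routine, each evaluation at a trial value of $\theta$ would first solve the fixed-point problem for the probabilities and then call a function of this kind.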
The parameters of the model, including the number of potential entrants $N$, are identified.
Partly, the identification comes from functional form assumptions. However, there are also
exclusion restrictions that can provide identification even if some of these assumptions are
relaxed. In particular, for the identification of $\beta$ and $\gamma_b$, the model implies that $N^e_{b\ell}$ depends
on socioeconomic characteristics at locations other than $\ell$ (i.e., $x_{\ell'}$ for $\ell' \neq \ell$). Therefore,
$N^e_{b\ell}$ has sample variability that is independent of $x_\ell$ and this implies that the effects of $x_\ell$
and $N^e_{b\ell}$ on a firm's profit can be identified even if we relax the linearity assumption.
Haiqing Xu's job market paper (2010) (titled "Parametric and Semiparametric Structural
Estimation of Hotelling-type Discrete Choice Games in A Single Market with An Increasing
Number of Players") studies the asymptotics of this type of estimator. His model is a bit
different from Seim's model because players and locations are the same thing.
8.2.2. Estimation WITH unobserved location heterogeneity. Now, let's consider the model where $\xi_\ell \neq 0$. A simple (but restrictive) approach is to assume that there is a number $R$ of "regions" or districts in the city, where the number of regions $R$ is small relative to the number of locations $L$, such that all the unobserved heterogeneity is between regions but there is no unobserved heterogeneity within regions. Under this assumption, we can control for unobserved heterogeneity by including region dummies. In fact, this case is equivalent to the previous case without unobserved location heterogeneity, with the only difference being that the vector of observables $x_\ell$ now includes region dummies.
A more interesting case is when the unobserved heterogeneity is at the location level. We assume that $\xi = \{\xi_\ell : \ell = 1, 2, \dots, L\}$ is independent of $x$ and it is a random draw from a spatial stochastic process. The simplest process is when $\xi_\ell$ is i.i.d. with a known distribution, say $N(0, \sigma^2_\xi)$, where the zero mean is without loss of generality. However, we can allow for spatial dependence in this unobservable. For instance, we may consider a spatial autoregressive process (SAR):
$$\xi_\ell = \rho \, \bar{\xi}^C_\ell + u_\ell$$
where $u_\ell$ is i.i.d. $N(0, \sigma^2_u)$, $\rho$ is a parameter, and $\bar{\xi}^C_\ell$ is the mean value of $\xi$ at the $C$ locations closest to location $\ell$, excluding location $\ell$ itself. To obtain a random draw of the vector $\xi$ from this stochastic process it is convenient to write the process in vector form:
$$\xi = \rho \, W_C \, \xi + u$$
where $\xi$ and $u$ are $L \times 1$ vectors, and $W_C$ is an $L \times L$ weighting matrix such that every row, say row $\ell$, has values $1/C$ at the positions that correspond to the $C$ locations closest to location $\ell$, and zeroes otherwise. Then, we can write $\xi = (I - \rho W_C)^{-1} u$. First, we take independent draws from $N(0, \sigma^2_u)$ to generate the vector $u$, and then we pre-multiply that vector by $(I - \rho W_C)^{-1}$ to obtain $\xi$.
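The two-step draw described above can be sketched as follows. Instead of forming the inverse matrix explicitly, this example solves the vector equation by iterating the recursion, which converges when the spatial parameter is smaller than one in absolute value because each row of the weighting matrix sums to one. That substitution, the one-dimensional coordinates, and all names are assumptions of the sketch.

```python
import random

def nearest_neighbor_weights(coords, c):
    """Row l has 1/c at the c locations closest to l (excluding l itself)."""
    n_loc = len(coords)
    w = [[0.0] * n_loc for _ in range(n_loc)]
    for l in range(n_loc):
        others = sorted((k for k in range(n_loc) if k != l),
                        key=lambda k: abs(coords[k] - coords[l]))
        for k in others[:c]:
            w[l][k] = 1.0 / c
    return w

def draw_sar(w, rho, sigma_u, rng, tol=1e-12, max_iter=10_000):
    """Draw xi solving xi = rho*W*xi + u by iterating the recursion.

    Returns (xi, u) so that the draw can be checked against the equation.
    """
    n_loc = len(w)
    u = [rng.gauss(0.0, sigma_u) for _ in range(n_loc)]
    xi = u[:]
    for _ in range(max_iter):
        new = [rho * sum(w[l][k] * xi[k] for k in range(n_loc)) + u[l]
               for l in range(n_loc)]
        gap = max(abs(a - b) for a, b in zip(new, xi))
        xi = new
        if gap < tol:
            break
    return xi, u

# Hypothetical example: 6 locations on a line, C = 2 neighbors, rho = 0.5.
w = nearest_neighbor_weights([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], c=2)
xi, u = draw_sar(w, rho=0.5, sigma_u=1.0, rng=random.Random(0))
```

For a large number of locations a sparse linear solver would be the more usual choice, since each row of the weighting matrix has only $C$ nonzero entries.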
Note that now the vector of structural parameters includes the parameters in the stochastic process of $\xi$, i.e., $\sigma_u$ and $\rho$.
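Now, conditional on $x$ AND $\xi$, the variables $n_\ell$ are independently distributed, and $n_\ell$ is a random draw from a Binomial random variable with arguments $(N, P_\ell(x, \xi, \theta))$, where $P_\ell(x, \xi, \theta)$ are the equilibrium probabilities. Importantly, for different values of $\xi$ we have different equilibrium probabilities. Then, with $G$ denoting the distribution of $\xi$,
$$\begin{aligned} \Pr(n_1, n_2, \dots, n_L \mid x, \theta) &= \int \Pr(n_1, n_2, \dots, n_L \mid x, \xi, \theta) \, dG(\xi) \\ &= \int \left[ \prod_{\ell=1}^{L} \Pr(n_\ell \mid x, \xi, \theta) \right] dG(\xi) \\ &= \left[ \prod_{\ell=1}^{L} \frac{N!}{n_\ell! \, (N - n_\ell)!} \right] \int \left[ \prod_{\ell=1}^{L} P_\ell(x, \xi, \theta)^{n_\ell} (1 - P_\ell(x, \xi, \theta))^{N - n_\ell} \right] dG(\xi) \end{aligned}$$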
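And the log-likelihood function is:
$$l(\theta) = \sum_{\ell=1}^{L} \ln\left( \frac{N!}{(N - n_\ell)!} \right) + \ln\left( \int \left[ \prod_{\ell=1}^{L} P_\ell(x, \xi, \theta)^{n_\ell} (1 - P_\ell(x, \xi, \theta))^{N - n_\ell} \right] dG(\xi) \right)$$
And the maximum likelihood estimator is defined as usual.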
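The integral over the unobserved heterogeneity has no closed form, and one common way to approximate it is by Monte Carlo simulation. The sketch below is only an illustration under strong simplifying assumptions: the unobservables are drawn i.i.d. normal rather than from a SAR process, and the equilibrium mapping is replaced by a plain logit without competition effects (the stand-in function `entry_probs`). All names and values are hypothetical.

```python
import math
import random

def entry_probs(x_beta, xi):
    """Stand-in for the equilibrium mapping: logit with no strategic effects."""
    v = [xb + e for xb, e in zip(x_beta, xi)]
    shift = max(0.0, max(v))
    expv = [math.exp(v_l - shift) for v_l in v]
    denom = math.exp(-shift) + sum(expv)
    return [e / denom for e in expv]

def simulated_loglik(n_firms, counts, x_beta, sigma_xi, n_draws, rng):
    """Monte Carlo version of the integrated log-likelihood.

    xi is drawn i.i.d. N(0, sigma_xi^2); the integral over xi is replaced
    by an average over n_draws simulated vectors.
    """
    const = sum(math.lgamma(n_firms + 1) - math.lgamma(n_firms - n + 1)
                for n in counts)
    log_terms = []
    for _ in range(n_draws):
        xi = [rng.gauss(0.0, sigma_xi) for _ in counts]
        p = entry_probs(x_beta, xi)
        log_terms.append(sum(n * math.log(p_l)
                             + (n_firms - n) * math.log(1.0 - p_l)
                             for n, p_l in zip(counts, p)))
    # log-mean-exp keeps the average of tiny products numerically stable
    top = max(log_terms)
    mean = sum(math.exp(t - top) for t in log_terms) / n_draws
    return const + top + math.log(mean)

ll_sim = simulated_loglik(n_firms=5, counts=[1, 0, 2], x_beta=[0.1, -0.2, 0.3],
                          sigma_xi=0.5, n_draws=200, rng=random.Random(1))
```

In an actual application each simulated vector would require solving the equilibrium fixed point, which is what makes this likelihood expensive, and the same draws would typically be reused across trial values of $\theta$.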
Identification???
8.2.3. Estimation algorithms.
Nested Fixed Point (NFXP).
Nested Pseudo Likelihood (NPL).
Mathematical Programming with Equilibrium Constraints (MPEC).
8.3. Empirical Application.