6. Parameter Estimation
ECE 302, Fall 2009, TR 3-4:15pm, Purdue University, School of ECE
Prof. Ilya Pollak
Parameter Estimation: Framework

Measurement:
• Unobserved parameter x, a nonrandom but unknown constant. E.g.:
  – the President's approval rating
  – the number of tanks in the enemy's army
  – the bias of a coin
• Observation Y, a random variable. E.g.:
  – people's responses to poll questions
  – serial numbers of captured enemy tanks
  – results of several flips of the coin

Inference: given an observation Y = y, estimate x.

In order to have any hope of reasonably estimating x from observing Y, the distribution of Y must depend on x. To emphasize this dependence, we will write fY(y; x) or pY(y; x). Viewed as a function of x, fY(y; x) or pY(y; x) is called a likelihood function.

E.g., find the value x* of x which maximizes the probability of the observation Y = y, i.e., the value x* that maximizes the likelihood function fY(y; x) or pY(y; x). This is called the maximum likelihood (ML) estimate.
Parameter estimation from multiple observations

• Typically, we will estimate a parameter x from several observations Y = (Y1, …, Yn).
  – Often, it is unrealistic to produce a useful estimate based on a single observation.
  – E.g., you cannot really estimate P(heads) for a coin based on a single coin toss.

Given observations Y = (Y1, …, Yn), an estimator of x is a random variable of the form X̂ = g(Y) for some function g.

Note: the distribution of Y depends on x. Hence, the distribution of X̂ also depends on x. Sometimes, we will denote the estimator by X̂n, to emphasize the role of the number of observations.
Parameter estimation from multiple observations: estimation error and bias

Suppose we have an estimator X̂n of x based on the observations Y = (Y1, …, Yn). The mean Ex[X̂n] and variance varx(X̂n) both depend on x.

The estimation error, denoted X̃n, is defined by X̃n = X̂n − x.

The bias of the estimator, denoted bx(X̂n), is the expected value of the estimation error: bx(X̂n) = Ex[X̃n] = Ex[X̂n] − x.

X̂n is unbiased if Ex[X̂n] = x, for every possible value of x.

X̂n is asymptotically unbiased if lim_{n→∞} Ex[X̂n] = x, for every possible value of x.
Consistent estimators

X̂n is consistent if the sequence X̂n converges in probability to x, for every possible x:

lim_{n→∞} P(|X̂n − x| ≥ ε) = 0, for every ε > 0 and every possible x.
Mean squared estimation error

Ex[X̃n²] = (Ex[X̃n])² + varx(X̃n) = bx²(X̂n) + varx(X̂n − x) = bx²(X̂n) + varx(X̂n)
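To make the decomposition concrete, here is a minimal Python sketch (an addition, not part of the original slides) that estimates the three quantities by Monte Carlo for a deliberately biased estimator of a Bernoulli mean; the estimator (k + 1)/(n + 2) and all parameter values are arbitrary illustrative choices:

import random

# Monte Carlo check that E[(Xhat - x)^2] = bias^2 + variance.
# Xhat = (k + 1) / (n + 2) is a deliberately biased estimator of the
# mean x of a Bernoulli random variable, chosen only for illustration.
random.seed(0)
x, n, trials = 0.3, 20, 200_000
estimates = []
for _ in range(trials):
    k = sum(random.random() < x for _ in range(n))
    estimates.append((k + 1) / (n + 2))

mean_est = sum(estimates) / trials
mse = sum((e - x) ** 2 for e in estimates) / trials
bias = mean_est - x
variance = sum((e - mean_est) ** 2 for e in estimates) / trials
print(round(mse, 6), round(bias ** 2 + variance, 6))  # the two numbers should agree closely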
Maximum likelihood estimation

• Unknown parameter x.
• Observations Y = (Y1, …, Yn) whose joint distribution depends on x:
  – joint PMF pY(y; x) if Y is discrete;
  – joint PDF fY(y; x) if Y is continuous.
• When viewed as functions of x, pY(y; x) and fY(y; x) are called likelihood functions.
• The maximum likelihood estimate of x based on an observation y of Y:

x̂n = argmax_x pY(y; x) if Y is discrete;  x̂n = argmax_x fY(y; x) if Y is continuous.
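As an illustration (not from the slides), the ML estimate can also be found numerically when no closed form is handy. The sketch below does a simple grid search over candidate values of x, using the Bernoulli likelihood of the next example as a stand-in; the helper names and grid size are arbitrary choices:

import math

# Maximum likelihood estimation by brute-force grid search over candidate
# values of x.  The likelihood here is the Bernoulli one from the next
# example; the grid search itself only needs a pointwise-evaluable likelihood.
def log_likelihood(x, y):
    # log p_Y(y; x) for independent Bernoulli(x) observations y_i in {0, 1};
    # the endpoints x = 0 and x = 1 are excluded for simplicity.
    if x <= 0.0 or x >= 1.0:
        return float("-inf")
    k = sum(y)
    return k * math.log(x) + (len(y) - k) * math.log(1.0 - x)

def ml_estimate(y, grid_size=1000):
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=lambda x: log_likelihood(x, y))

print(ml_estimate([1, 0, 1, 1, 0, 1]))  # close to 4/6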
Ex. 9.2: Estimating the mean of a Bernoulli r.v.

• Biased coin with P(heads) = x.
• Y1, …, Yn = n independent tosses of the coin (Yi = 1 for heads, Yi = 0 for tails).
• Find the ML estimator for x.
• Let k = # heads in n tosses. Then

pY(y; x) = \prod_{i=1}^{n} pYi(yi; x) = \prod_{i: yi = 1} x · \prod_{i: yi = 0} (1 − x) = x^k (1 − x)^{n−k}
Estimating the mean of a Bernoulli r.v.

pY(y; x) = x^k (1 − x)^{n−k}

It's easier to maximize the log-likelihood:

ln pY(y; x) = k ln x + (n − k) ln(1 − x)

(Taking logs is OK unless x = 0 or x = 1.) To find the max, differentiate:

∂/∂x ln pY(y; x) = k/x − (n − k)/(1 − x) = (k − kx − nx + kx) / (x(1 − x)) = (k − nx) / (x(1 − x))

Setting this to zero gives x̂n = k/n, provided x̂n ≠ 0 and x̂n ≠ 1, i.e., this derivation will not work for k = 0 and k = n.

Check that this is a maximum:

∂²/∂x² ln pY(y; x) = −k/x² − (n − k)/(1 − x)² < 0
Estimating the mean of a Bernoulli r.v.

pY(y; x) = x^k (1 − x)^{n−k}

x̂n = k/n, unless k = 0 or k = n.

If k = 0 then pY(y; x) = (1 − x)^n, which is maximized by x̂n = 0 = k/n.
If k = n then pY(y; x) = x^n, which is maximized by x̂n = 1 = k/n.

Hence, the formula x̂n = k/n actually works for any k.

ML estimate: x̂n = k/n = (y1 + … + yn)/n

ML estimator: X̂n = (Y1 + … + Yn)/n
Estimating the mean of a Bernoulli r.v.

ML estimator: X̂n = (Y1 + … + Yn)/n

E[X̂n] = (E[Y1] + … + E[Yn])/n = x, hence this estimator is unbiased.

By the WLLN, X̂n → x in probability. Hence, this estimator is consistent.
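A small simulation sketch (an addition, not part of the slides) illustrating both properties: the average of many estimates stays near x (unbiasedness) and the estimates concentrate around x as n grows (consistency). The value x = 0.7 and the trial counts are arbitrary:

import random

# Simulation: the sample mean of n Bernoulli(x) tosses is unbiased and
# concentrates around x as n grows.  x = 0.7 is an arbitrary choice.
random.seed(1)
x = 0.7
for n in (10, 100, 1000, 10000):
    estimates = [sum(random.random() < x for _ in range(n)) / n
                 for _ in range(1000)]
    avg = sum(estimates) / len(estimates)
    worst = max(abs(e - x) for e in estimates)
    print(n, round(avg, 4), round(worst, 4))  # avg stays near x; worst error shrinks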
German tank problem

• Total n German tanks, numbered 1, 2, …, n.
• The number n is unknown to the Allies.
• Tanks with serial numbers y1, …, yk have been captured or otherwise observed by the Allies.
• Objective: based on observing y1, …, yk, estimate n.
• Model: y1, …, yk are observations of Y1, …, Yk, which are a random combination of k numbers from {1, 2, …, n}.
• Approach 1: ML estimation.
German tanks: ML estimation

There are \binom{n}{k} sets of k distinct numbers which are subsets of {1, 2, …, n}. Assuming each is equally likely,

pY(y; n) = 1/\binom{n}{k} if {y1, …, yk} ⊂ {1, 2, …, n}, and 0 otherwise.

The total number of tanks must be greater than or equal to the largest observed serial number, max(y1, …, yk). Let's call it mk. As a function of n,

pY(y; n) = 1/\binom{n}{k} if n ≥ mk, and 0 otherwise.
German tanks: ML estimation

pY(y; n) = 1/\binom{n}{k} if n ≥ mk = max(y1, …, yk), and 0 otherwise.

[Figure: plot of pY(y; n) versus n — zero for n < mk, then decreasing in n for n ≥ mk.]

Since 1/\binom{n}{k} decreases as n increases, the likelihood is largest at the smallest admissible n. Hence, the ML estimate is n̂k = mk = max(y1, …, yk).
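A short simulation sketch of this estimator (an addition, not from the slides); the true n = 500 and k = 10 are arbitrary choices:

import random

# Simulation of the German tank ML estimate: observe k serial numbers drawn
# without replacement from {1, ..., n} and estimate n by their maximum.
random.seed(2)
n, k = 500, 10
serials = random.sample(range(1, n + 1), k)
print(sorted(serials), max(serials))  # the maximum is typically below the true n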
Is the ML estimator of n unbiased?

The ML estimator is Mk = max(Y1, …, Yk).

To compute the bias, we will:
• compute the CDF of Mk;
• compute the PMF of Mk;
• compute the bias, E[Mk] − n.
CDF of the ML estimator of n

The ML estimator is Mk = max(Y1, …, Yk), where

pY(y; n) = 1/\binom{n}{k} if {y1, …, yk} ⊂ {1, 2, …, n}, and 0 otherwise.

FMk(r) = P(Mk ≤ r) = P(Y1 ≤ r, …, Yk ≤ r) = (1/\binom{n}{k}) · [# subsets of {1, 2, …, r} of size k]

  = 0, if r ≤ k − 1
  = \binom{r}{k} / \binom{n}{k}, if r = k, k+1, …, n
  = 1, if r ≥ n
PMF of the ML estimator of n

FMk(r) = P(Mk ≤ r) =
  0, if r ≤ k − 1
  \binom{r}{k} / \binom{n}{k}, if r = k, k+1, …, n
  1, if r ≥ n

fMk(r) = P(Mk = r) = P(Mk ≤ r) − P(Mk ≤ r − 1) = FMk(r) − FMk(r − 1)

  = 0, if r ≤ k − 1 or r ≥ n + 1
  = [\binom{r}{k} − \binom{r−1}{k}] / \binom{n}{k} = \binom{r−1}{k−1} / \binom{n}{k}, if r = k, k+1, …, n
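A quick Python check (an addition, not from the slides) comparing this PMF with the empirical distribution of Mk when k serial numbers are sampled without replacement from {1, …, n}; n = 20 and k = 4 are arbitrary:

import random
from math import comb

# Compare the derived PMF of M_k = max(Y_1, ..., Y_k) with an empirical
# histogram from repeated sampling without replacement.
random.seed(3)
n, k, trials = 20, 4, 100_000
counts = {}
for _ in range(trials):
    m = max(random.sample(range(1, n + 1), k))
    counts[m] = counts.get(m, 0) + 1

for r in range(k, n + 1):
    empirical = counts.get(r, 0) / trials
    exact = comb(r - 1, k - 1) / comb(n, k)
    print(r, round(empirical, 4), round(exact, 4))  # the two columns should match closely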
A useful combinatorial identity

fMk(r) = \binom{r−1}{k−1} / \binom{n}{k} for r = k, k+1, …, n, and 0 otherwise.

Because fMk is a PMF,

1 = \sum_{r=k}^{n} fMk(r) = \sum_{r=k}^{n} \binom{r−1}{k−1} / \binom{n}{k} = (1/\binom{n}{k}) \sum_{r=k}^{n} \binom{r−1}{k−1}

Therefore,

\sum_{r=k}^{n} \binom{r−1}{k−1} = \binom{n}{k}
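A brute-force check of this identity for small n and k (an addition, not part of the slides):

from math import comb

# Brute-force check of  sum_{r=k}^{n} C(r-1, k-1) = C(n, k).
for n in range(1, 12):
    for k in range(1, n + 1):
        assert sum(comb(r - 1, k - 1) for r in range(k, n + 1)) == comb(n, k)
print("identity holds for all 1 <= k <= n <= 11")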
Another useful combinatorial identity

\sum_{r=k}^{n} \binom{r−1}{k−1} = \binom{n}{k}

Let r′ = r + d + 1 and k′ = k + d + 1. Then

\sum_{r=k}^{n} \binom{r+d}{k+d} = \sum_{r′=k′}^{n+d+1} \binom{r′−1}{k′−1} = \binom{n+d+1}{k′} = \binom{n+d+1}{k+d+1}

That is,

\sum_{r=k}^{n} \binom{r+d}{k+d} = \binom{n+d+1}{k+d+1}
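And a similar brute-force check of the shifted identity for small n, k, and d (again an addition, not from the slides):

from math import comb

# Brute-force check of  sum_{r=k}^{n} C(r+d, k+d) = C(n+d+1, k+d+1).
for n in range(1, 10):
    for k in range(1, n + 1):
        for d in range(0, 5):
            lhs = sum(comb(r + d, k + d) for r in range(k, n + 1))
            assert lhs == comb(n + d + 1, k + d + 1)
print("identity holds for the tested range of n, k, d")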
Mean of the ML estimator of n

fMk(r) = \binom{r−1}{k−1} / \binom{n}{k} for r = k, k+1, …, n, and 0 otherwise.

E[Mk] = \sum_{r=k}^{n} r \binom{r−1}{k−1} / \binom{n}{k} = \sum_{r=k}^{n} [r! / ((k−1)!(r−k)!)] / \binom{n}{k} = \sum_{r=k}^{n} [k · r! / (k!(r−k)!)] / \binom{n}{k} = (k / \binom{n}{k}) \sum_{r=k}^{n} \binom{r}{k}
Simplifying a little further…

By the second combinatorial identity with d = 0, \sum_{r=k}^{n} \binom{r}{k} = \binom{n+1}{k+1}, so

E[Mk] = (k / \binom{n}{k}) \binom{n+1}{k+1} = k · [(n+1)! / ((k+1)!(n−k)!)] · [k!(n−k)! / n!] = k (n+1)! k! / ((k+1)! n!) = k(n+1) / (k+1)
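A quick check of this formula against a direct computation of the expectation from the PMF (an addition, not from the slides; n = 30, k = 5 are arbitrary):

from math import comb

# Check E[M_k] = k(n+1)/(k+1) against a direct sum over the PMF of M_k.
n, k = 30, 5
direct = sum(r * comb(r - 1, k - 1) for r in range(k, n + 1)) / comb(n, k)
print(direct, k * (n + 1) / (k + 1))  # the two values should be equal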
The bias of the ML estimator of n

b(Mk) = E[Mk] − n = k(n+1)/(k+1) − n = (k − n)/(k + 1)

Therefore, the ML estimator is biased!

What's the intuition here? Unless we observe ALL serial numbers, we are never guaranteed to see the largest one. In fact, typically we will NOT see the largest one, especially if the number of observations is a lot smaller than the total number of tanks. So, if we estimate the number of tanks as the largest serial number observed, we will systematically underestimate.
An unbiased estimator of n

To correct for the bias of the ML estimator, let's use a different estimator:

N̂k = ((k+1)/k) Mk − 1

Recall that E[Mk] = k(n+1)/(k+1) and b(Mk) = (k − n)/(k + 1). Then

E[N̂k] = ((k+1)/k) E[Mk] − 1 = ((k+1)/k) · k(n+1)/(k+1) − 1 = (n + 1) − 1 = n

The estimator N̂k is therefore unbiased.
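A simulation sketch (an addition, not part of the slides) comparing the two estimators; with n = 1000 tanks and k = 8 observed serial numbers (arbitrary choices), the sample average of Mk falls well below n while that of N̂k is close to n:

import random

# Compare the biased ML estimator M_k = max(serials) with the corrected
# estimator N_k = (k+1)/k * M_k - 1.
random.seed(4)
n, k, trials = 1000, 8, 50_000
sum_ml = sum_corrected = 0.0
for _ in range(trials):
    m = max(random.sample(range(1, n + 1), k))
    sum_ml += m
    sum_corrected += (k + 1) / k * m - 1
print(round(sum_ml / trials, 1))         # noticeably below n = 1000
print(round(sum_corrected / trials, 1))  # close to n = 1000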
References

• R. Ruggles and H. Brodie. An empirical approach to economic intelligence in World War II. Journal of the American Statistical Association, 42(237):72–91, March 1947.
• "How a statistical formula won the war," The Guardian, July 20, 2006, www.guardian.co.uk/world/2006/jul/20/secondworldwar.tvandradio
• The same technique has recently been applied to estimate iPhone production numbers; see "Why iPhones are just like German tanks," The Guardian, Oct. 2008, www.guardian.co.uk/technology/blog/2008/oct/08/iphone.apple