Lectures on Probability and Statistical Models
Phil Pollett
Professor of Mathematics
The University of Queensland
© These materials can be used for any educational purpose provided they are not altered
Probability & Statistical Models © Philip K. Pollett
13 Markov chains
Imprecise (intuitive) definition. A Markov process is a random process that “forgets its past”, in the following sense:

Pr(Future = y | Present = x and Past = z)
= Pr(Future = y | Present = x).

Thus, given the past and the present “state” of the process, only the present state is of use in predicting the future.
Markov chains
Equivalently,

Pr(Future = y and Past = z | Present = x)
= Pr(Future = y | Present = x) × Pr(Past = z | Present = x),

so that, given the present state of the process, its past and its future are independent. If the set of states S is discrete, then the process is called a Markov chain.

Remark. At first sight this definition might appear to cover only trivial examples, but note that the current state could be complicated and could include a record of the recent past.
Andrei Andreyevich Markov
(Born: 14/06/1856, Ryazan, Russia; Died: 20/07/1922, St Petersburg, Russia)
Markov is famous for his pioneering work on Markov chains, which launched the theory of stochastic processes. His early work was in number theory, analysis, continued fractions, limits of integrals, approximation theory and convergence of series.
Markov chains
Example. There are two rooms, labelled A and B. There is a spider, initially in Room A, hunting a fly that is initially in Room B. They move from room to room independently: every minute each changes rooms (with probability p for the spider and q for the fly) or stays put, with the complementary probabilities. Once in the same room, the spider eats the fly and the hunt ceases.

The hunt can be represented as a Markov chain with three states: (0) the spider and the fly are in the same room (the hunt has ended), (1) the spider is in Room A and the fly is in Room B, and (2) the spider is in Room B and the fly is in Room A.
Markov chains
Eventually we will be able to answer questions like “What is the probability that the hunt lasts more than two minutes?”

Let X_n be the state of the process at time n (that is, after n minutes). Then, X_n ∈ S = {0, 1, 2}. The set S is called the state space. The initial state is X_0 = 1. State 0 is called an absorbing state, because the process remains there once it is reached.
Markov chains
Definition. A sequence {X_n, n = 0, 1, . . . } of random variables is called a discrete-time stochastic process; X_n usually represents the state of the process at time n. If {X_n} takes values in a discrete state space S, then it is called a Markov chain if

Pr(X_{m+1} = j | X_m = i, X_{m−1} = i_{m−1}, . . . , X_0 = i_0)
= Pr(X_{m+1} = j | X_m = i),    (1)

for all time points m and all states i_0, . . . , i_{m−1}, i, j ∈ S. If the right-hand side of (1) is the same for all m, then the Markov chain is said to be time homogeneous.
Markov chains
We will consider only time-homogeneous chains, and we shall write

p_{ij}^{(n)} = Pr(X_{m+n} = j | X_m = i) = Pr(X_n = j | X_0 = i)

for the n-step transition probabilities and

p_{ij} := p_{ij}^{(1)} = Pr(X_{m+1} = j | X_m = i) = Pr(X_1 = j | X_0 = i)

for the 1-step transition probabilities (or simply transition probabilities).
Markov chains
By the law of total probability, we have that

∑_{j∈S} p_{ij}^{(n)} = ∑_{j∈S} Pr(X_n = j | X_0 = i) = 1,

and in particular that ∑_{j∈S} p_{ij} = 1.

The matrix P^{(n)} = (p_{ij}^{(n)}, i, j ∈ S) is called the n-step transition matrix and P = (p_{ij}, i, j ∈ S) is called the 1-step transition matrix (or simply transition matrix).
Markov chains
Remarks. (1) Matrices like this (with non-negative entries and all row sums equal to 1) are called stochastic matrices. Writing 1 = (1, 1, . . . )^T (where T denotes transpose), we see that P1 = 1. Hence, P (and indeed any stochastic matrix) has an eigenvector 1 corresponding to an eigenvalue λ = 1.

(2) We may usefully set P^{(0)} = I, where, as usual, I denotes the identity matrix:

p_{ij}^{(0)} = δ_{ij} := 1 if i = j, 0 if i ≠ j.
Markov chains
Example. Returning to the hunt, the three states were: (0) the spider and the fly are in the same room, (1) the spider is in Room A and the fly is in Room B, and (2) the spider is in Room B and the fly is in Room A. Since the spider changes rooms with probability p and the fly changes rooms with probability q,

P = [ 1    0                0              ]
    [ r    (1 − p)(1 − q)   pq             ]
    [ r    pq               (1 − p)(1 − q) ],

where r = p(1 − q) + q(1 − p) = p + q − 2pq = 1 − [(1 − p)(1 − q) + pq].
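This construction is easy to sanity-check with exact rational arithmetic. The sketch below (the helper name hunt_matrix is ours, not from the notes) builds P for general p and q and confirms the identity for r and that each row sums to 1:

```python
from fractions import Fraction

def hunt_matrix(p, q):
    """Transition matrix for the spider-and-fly hunt.
    States: 0 = same room (absorbing), 1 = spider in A / fly in B,
    2 = spider in B / fly in A."""
    r = p * (1 - q) + q * (1 - p)   # exactly one creature moves: hunt ends
    stay = (1 - p) * (1 - q)        # neither moves
    swap = p * q                    # both move, trading rooms
    assert r == 1 - (stay + swap)   # the identity quoted on the slide
    return [[1, 0, 0],
            [r, stay, swap],
            [r, swap, stay]]

P = hunt_matrix(Fraction(1, 4), Fraction(1, 2))
assert all(sum(row) == 1 for row in P)   # rows of a stochastic matrix sum to 1
```

For p = 1/4 and q = 1/2 this reproduces the matrix given on the next slide.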
Markov chains
For example, if p = 1/4 and q = 1/2, then

P = [ 1     0     0   ]
    [ 1/2   3/8   1/8 ]
    [ 1/2   1/8   3/8 ].

What is the chance that the hunt is over by n minutes?

Can we calculate the chance of being in each of the various states after n minutes?
Markov chains
By the law of total probability, we have

p_{ij}^{(n+m)} = Pr(X_{n+m} = j | X_0 = i)
= ∑_{k∈S} Pr(X_{n+m} = j | X_n = k, X_0 = i) × Pr(X_n = k | X_0 = i).

But,

Pr(X_{n+m} = j | X_n = k, X_0 = i)
= Pr(X_{n+m} = j | X_n = k)    (Markov property)
= Pr(X_m = j | X_0 = k)    (time homogeneous)
Markov chains
and so, for all m, n ≥ 1,

p_{ij}^{(n+m)} = ∑_{k∈S} p_{ik}^{(n)} p_{kj}^{(m)},    i, j ∈ S,

or, equivalently, in terms of transition matrices, P^{(n+m)} = P^{(n)} P^{(m)}. Thus, in particular, we have P^{(n)} = P^{(n−1)} P (remembering that P := P^{(1)}). Therefore,

P^{(n)} = P^n,    n ≥ 1.

Note that since P^{(0)} = I = P^0, this expression is valid for all n ≥ 0.
Markov chains
Example. Returning to the hunt, if the spider and the fly change rooms with probability p = 1/4 and q = 1/2, respectively, then

P = [ 1     0     0   ]
    [ 1/2   3/8   1/8 ]
    [ 1/2   1/8   3/8 ].

A simple calculation gives

P^2 = [ 1     0      0    ]
      [ 3/4   5/32   3/32 ]
      [ 3/4   3/32   5/32 ],
Markov chains
P^3 = [ 1     0       0     ]
      [ 7/8   9/128   7/128 ]
      [ 7/8   7/128   9/128 ],

et cetera, and, to four decimal places,

P^15 = [ 1        0        0      ]
       [ 1.0000   0.0000   0.0000 ]
       [ 1.0000   0.0000   0.0000 ].

Recall that X_0 = 1, so p_{10}^{(n)} is the probability that the hunt ends by n minutes. What, then, is the probability that the hunt lasts more than two minutes? Answer: 1 − 3/4 = 1/4.
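These powers can be reproduced exactly with a few lines of Python; the helpers matmul and matpow below are illustrative, not from the notes:

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two square matrices of Fractions."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(P, n):
    """n-th power of P by repeated multiplication (P^0 = I)."""
    result = [[Fraction(int(i == j)) for j in range(len(P))]
              for i in range(len(P))]
    for _ in range(n):
        result = matmul(result, P)
    return result

F = Fraction
P = [[F(1), F(0), F(0)],
     [F(1, 2), F(3, 8), F(1, 8)],
     [F(1, 2), F(1, 8), F(3, 8)]]

P2 = matpow(P, 2)
assert P2[1] == [F(3, 4), F(5, 32), F(3, 32)]   # matches the slide
# Starting in state 1, the hunt lasts more than two minutes w.p. 1 - p_10^(2):
assert 1 - P2[1][0] == F(1, 4)
```

Repeated squaring would be faster for large n, but plain iteration keeps the sketch short.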
Markov chains
Arbitrary initial conditions. What if we are unsure about where the process starts?

Let π_j^{(n)} = Pr(X_n = j) and define a row vector π^{(n)} = (π_j^{(n)}, j ∈ S), being the distribution of the chain at time n.

Suppose that we know the initial distribution π^{(0)}, that is, the distribution of X_0 (in the previous example we had π^{(0)} = (0 1 0)).
Markov chains
By the law of total probability, we have

π_j^{(n)} = Pr(X_n = j) = ∑_{i∈S} Pr(X_n = j | X_0 = i) Pr(X_0 = i) = ∑_{i∈S} π_i^{(0)} p_{ij}^{(n)},

and so π^{(n)} = π^{(0)} P^n, n ≥ 0.

Definition. If π^{(n)} = π is the same for all n, then π is called a stationary distribution. If lim_{n→∞} π^{(n)} exists and equals π, then π is called a limiting distribution.
Markov chains
Example. Returning to the hunt with p = 1/4 and q = 1/2, suppose that, at the beginning of the hunt, each creature is equally likely to be in either room, so that π^{(0)} = (1/2 1/4 1/4). Then,

π^{(n)} = π^{(0)} P^n = (1/2 1/4 1/4) P^n,  where

P = [ 1     0     0   ]
    [ 1/2   3/8   1/8 ]
    [ 1/2   1/8   3/8 ].
Markov chains

For example,

π^{(3)} = (1/2 1/4 1/4) P^3 = (1/2 1/4 1/4) [ 1     0       0     ]
                                            [ 7/8   9/128   7/128 ]
                                            [ 7/8   7/128   9/128 ]
        = (15/16 1/32 1/32).

So, if, initially, each creature is equally likely to be in either room, then the probability that the hunt ends within 3 minutes is 15/16.
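The value of π^{(3)} can be checked by iterating the recursion π^{(n+1)} = π^{(n)} P three times; a small sketch with exact fractions (helper name ours):

```python
from fractions import Fraction

F = Fraction
P = [[F(1), F(0), F(0)],
     [F(1, 2), F(3, 8), F(1, 8)],
     [F(1, 2), F(1, 8), F(3, 8)]]

def step(pi, P):
    """One step of the distribution recursion: pi^(n+1) = pi^(n) P."""
    return [sum(pi[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P))]

pi = [F(1, 2), F(1, 4), F(1, 4)]   # initial distribution pi^(0)
for _ in range(3):
    pi = step(pi, P)

assert pi == [F(15, 16), F(1, 32), F(1, 32)]   # the slide's answer
```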
Markov chains
The two-state chain. Let S = {0, 1} and let

P = [ 1 − p   p     ]
    [ q       1 − q ],

where p, q ∈ (0, 1). It can be shown that

P = (1/(p + q)) [ 1   p  ] [ 1   0 ] [ q   p  ]
                [ 1   −q ] [ 0   r ] [ 1   −1 ],

where r = 1 − p − q. This is of the form P = V D V^{−1}. Check it! (The procedure is called diagonalization.)
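The factorization is easy to verify numerically for a particular choice of p and q; here is a sketch with exact fractions (variable names ours), taking p = 1/4 and q = 1/2 for illustration:

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two rectangular matrices."""
    n, m, k = len(A), len(B[0]), len(B)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

p, q = Fraction(1, 4), Fraction(1, 2)   # illustrative values
r = 1 - p - q

V    = [[1, p], [1, -q]]
D    = [[1, 0], [0, r]]
Vinv = [[q / (p + q), p / (p + q)],     # = (1/(p+q)) * [[q, p], [1, -1]]
        [1 / (p + q), -1 / (p + q)]]

P = matmul(matmul(V, D), Vinv)
assert P == [[1 - p, p], [q, 1 - q]]          # V D V^{-1} recovers P
assert matmul(V, Vinv) == [[1, 0], [0, 1]]    # Vinv really is V^{-1}
```

Note that the third factor on the slide is (p + q) V^{−1}, with the scalar 1/(p + q) pulled out front.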
Markov chains
This is good news because

P^2 = (V D V^{−1})(V D V^{−1}) = V D (V^{−1} V) D V^{−1} = V (D I D) V^{−1} = V D^2 V^{−1}.

Similarly, P^n = V D^n V^{−1} for all n ≥ 1. Hence,

P^{(n)} = (1/(p + q)) [ 1   p  ] [ 1   0   ] [ q   p  ]
                      [ 1   −q ] [ 0   r^n ] [ 1   −1 ]

        = (1/(p + q)) [ q + p r^n   p − p r^n ]
                      [ q − q r^n   p + q r^n ].
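The closed form can be compared entry by entry with direct matrix powers; the values p = 1/3, q = 1/6 below are just for illustration:

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

p, q = Fraction(1, 3), Fraction(1, 6)   # illustrative values
r = 1 - p - q
P = [[1 - p, p], [q, 1 - q]]

def closed_form(n):
    """n-step transition matrix from the diagonalization formula."""
    s = p + q
    return [[(q + p * r**n) / s, (p - p * r**n) / s],
            [(q - q * r**n) / s, (p + q * r**n) / s]]

Pn = [[1, 0], [0, 1]]
for n in range(6):
    assert Pn == closed_form(n)   # check P^n against the formula, n = 0..5
    Pn = matmul(Pn, P)
```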
Markov chains
Thus we have an explicit expression for the n-step transition probabilities.

Remark. The above procedure generalizes to any Markov chain with a finite state space.
Markov chains
If the initial distribution is π^{(0)} = (a b), then, since π^{(n)} = π^{(0)} P^n,

Pr(X_n = 0) = (q + (ap − bq) r^n)/(p + q),
Pr(X_n = 1) = (p − (ap − bq) r^n)/(p + q).

(You should check this for n = 0 and n = 1.) Notice that when ap = bq, we have

Pr(X_n = 0) = 1 − Pr(X_n = 1) = q/(p + q),

for all n ≥ 0, so that π = (q/(p + q) p/(p + q)) is a stationary distribution.
Markov chains
Notice also that |r| < 1, since p, q ∈ (0, 1). Therefore, π is also a limiting distribution because

lim_{n→∞} Pr(X_n = 0) = q/(p + q),
lim_{n→∞} Pr(X_n = 1) = p/(p + q).

Remark. If, for a general Markov chain, a limiting distribution π exists, then it is a stationary distribution, that is, πP = π (π is a left eigenvector corresponding to the eigenvalue 1).

For details (and the converse), you will need a more advanced course on Stochastic Processes.
Markov chains
Example. Max (a dog) is subjected to a series of trials, in each of which he is given a choice of going to a dish to his left, containing tasty food, or a dish to his right, containing food with an unpleasant taste.

Suppose that if, on any given occasion, Max goes to the left, then he will return there on the next occasion with probability 0.99, while if he goes to the right, he will do so on the next occasion with probability 0.1 (Max is smart, but he is not infallible).
Poppy and Max
Markov chains
Let X_n be 0 or 1 according as Max chooses the dish to the left or the dish to the right on trial n. Then, {X_n} is a two-state Markov chain with p = 0.01 and q = 0.9, and hence r = 0.09. Therefore, if the first dish is chosen at random (at time n = 1), then Max chooses the tasty food on the n-th trial with probability

90/91 − (89/182)(0.09)^{n−1},

the long-term probability being 90/91.
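The quoted formula can be compared with direct matrix powers; a short sketch (all helper names ours):

```python
def matpow(P, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = [[sum(R[i][k] * P[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    return R

p, q = 0.01, 0.9            # Max's switching probabilities
P = [[1 - p, p], [q, 1 - q]]
pi1 = [0.5, 0.5]            # first dish chosen at random, at trial n = 1

for n in range(1, 10):
    Pn = matpow(P, n - 1)   # n - 1 further transitions after trial 1
    prob_left = sum(pi1[i] * Pn[i][0] for i in range(2))
    formula = 90/91 - (89/182) * 0.09**(n - 1)
    assert abs(prob_left - formula) < 1e-12
```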
Markov chains
Birth-death chains. Their state space S is either the integers, the non-negative integers, or {0, 1, . . . , N}, and jumps of size greater than 1 are not permitted; their transition probabilities are therefore of the form p_{i,i+1} = a_i, p_{i,i−1} = b_i and p_{ii} = 1 − a_i − b_i, with p_{ij} = 0 otherwise.

The birth probabilities (a_i) and the death probabilities (b_i) are strictly positive and satisfy a_i + b_i ≤ 1, except perhaps at the boundaries of S, where they could be 0. If a_i = a and b_i = b, the chain is called a random walk.
Markov chains
Gambler’s ruin. A gambler successively wagers a single unit in an even-money game. X_n is his capital after n bets and S = {0, 1, . . . , N}. If his capital reaches N he stops and leaves happy, while state 0 corresponds to “bust”. Here a_i = b_i = 1/2, except at the boundaries (0 and N are absorbing states). It is easy to show that the player goes bust with probability 1 − i/N if his initial capital is i.
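The 1 − i/N ruin probability can be checked numerically. Conditioning on the first bet gives the boundary-value equations u_i = (u_{i−1} + u_{i+1})/2 with u_0 = 1 and u_N = 0 (an argument assumed here, not carried out on the slide), which a simple fixed-point iteration solves:

```python
def ruin_probabilities(N, sweeps=5000):
    """Solve u_i = (u_{i-1} + u_{i+1}) / 2, u_0 = 1, u_N = 0, by
    repeated in-place sweeps; u_i = probability of going bust from i."""
    u = [1.0] + [0.0] * N          # boundary values u[0] = 1, u[N] = 0
    for _ in range(sweeps):
        for i in range(1, N):
            u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u

N = 10
u = ruin_probabilities(N)
for i in range(N + 1):
    assert abs(u[i] - (1 - i / N)) < 1e-9   # matches 1 - i/N
```

A direct linear solve would also work; iteration keeps the sketch dependency-free.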
Markov chains
The Ehrenfest diffusion model. N particles are allowed to pass through a small aperture between two chambers A and B. We assume that at each time epoch n, a single particle, chosen uniformly and at random from the N, passes through the aperture.

Let X_n be the number in chamber A at time n. Then, S = {0, 1, . . . , N} and, for i ∈ S, a_i = 1 − i/N and b_i = i/N. In this model, 0 and N are reflecting barriers. It is easy to show that the stationary distribution is binomial B(N, 1/2).
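The binomial stationary distribution can be verified exactly: here a_i + b_i = 1, so the chain always jumps and (πP)_j = π_{j−1} a_{j−1} + π_{j+1} b_{j+1}. A sketch with the illustrative value N = 6:

```python
from fractions import Fraction
from math import comb

N = 6   # illustrative chamber size
# Birth-death transition probabilities for the Ehrenfest model
a = [Fraction(N - i, N) for i in range(N + 1)]   # a_i = 1 - i/N
b = [Fraction(i, N) for i in range(N + 1)]       # b_i = i/N

# Candidate stationary distribution: binomial B(N, 1/2)
pi = [Fraction(comb(N, i), 2**N) for i in range(N + 1)]

# Check pi P = pi componentwise (p_ii = 0 here, so only neighbours contribute)
for j in range(N + 1):
    total = Fraction(0)
    if j > 0:
        total += pi[j - 1] * a[j - 1]
    if j < N:
        total += pi[j + 1] * b[j + 1]
    assert total == pi[j]
```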
Markov chains
Population models. Here X_n is the size of the population at time n (for example, at the end of the n-th breeding cycle, or at the time of the n-th census). S = {0, 1, . . . }, or S = {0, 1, . . . , N} when there is an upper limit N on the population size (frequently interpreted as the carrying capacity). Usually 0 is an absorbing state, corresponding to population extinction, and N is reflecting.
Markov chains
Example. Take S = {0, 1, . . . } with a_0 = 0 and, for i ≥ 1, a_i = a > 0 and b_i = b > 0, where a + b = 1. It can be shown that extinction occurs with probability 1 when a ≤ b, and with probability (b/a)^i when a > b, where i is the initial population size. This is a good simple model for a population of cells: a = λ/(λ + µ) and b = µ/(λ + µ), where µ and λ are, respectively, the death and the cell division rates.
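One standard way to check the claimed extinction probability (an argument not carried out in the notes) is via the first-step equation z = b + a z² for extinction starting from a single cell, obtained by conditioning on whether that cell next dies or divides; z = b/a solves it when a > b, and z = 1 when a ≤ b:

```python
a, b = 0.7, 0.3   # illustrative rates: division beats death, so a > b

# First-step equation for extinction from one cell: z = b + a z^2
z = b / a
assert abs((b + a * z * z) - z) < 1e-12   # z = b/a is a fixed point

a2, b2 = 0.4, 0.6                          # now a <= b
assert abs((b2 + a2 * 1.0**2) - 1.0) < 1e-12   # z = 1 is the solution

# From initial population i, the i lines die out independently,
# giving extinction probability (b/a)^i when a > b.
i = 5
extinction_prob = (b / a) ** i
```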
Markov chains
The logistic model. This has S = {0, . . . , N}, with 0 absorbing and N reflecting, and, for i = 1, . . . , N − 1,

a_i = λ(1 − i/N) / (µ + λ(1 − i/N)),    b_i = µ / (µ + λ(1 − i/N)).

Here λ and µ are birth and death rates. Notice that the birth and the death probabilities depend on i only through i/N, a quantity which is proportional to the population density: i/N = (i/Area)/(N/Area). Models with this property are called density dependent.
Markov chains
Telecommunications. (1) A communications link in a telephone network has N circuits. One circuit is held by each call for its duration. Calls arrive at rate λ > 0 and are completed at rate µ > 0. Let X_n be the number of calls in progress at the n-th time epoch (when an arrival or a departure occurs). Then, S = {0, . . . , N}, with 0 and N both reflecting barriers, and, for i = 1, . . . , N − 1,

a_i = λ/(λ + iµ),    b_i = iµ/(λ + iµ).
Markov chains
(2) At a node in a packet-switching network, data packets are stored in a buffer of size N. They arrive at rate λ > 0 and are transmitted one at a time (in the order in which they arrive) at rate µ > 0. Let X_n be the number of packets yet to be transmitted just after the n-th time epoch (an arrival or a departure). Then, S = {0, . . . , N}, with 0 and N both reflecting barriers, and, for i = 1, . . . , N − 1,

a_i = λ/(λ + µ),    b_i = µ/(λ + µ).
Markov chains
Genetic models. The simplest of these is the Wright-Fisher model. There are N individuals, each of two genetic types, A-type and a-type. Mutation (if any) occurs at birth. We assume that A-types are selectively superior in that the relative survival rate of A-type over a-type individuals in successive generations is γ > 1. Let X_n be the number of A-type individuals, so that N − X_n is the number of a-type.
Markov chains
Wright and Fisher postulated that the composition of the next generation is determined by N Bernoulli trials, where the probability p_i of producing an A-type offspring is given by

p_i = γ[i(1 − α) + (N − i)β] / (γ[i(1 − α) + (N − i)β] + [iα + (N − i)(1 − β)]),

where α and β are the respective mutation probabilities. We have S = {0, . . . , N} and

p_{ij} = (N choose j) p_i^j (1 − p_i)^{N−j},    i, j ∈ S.
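The transition matrix can be assembled directly from these formulas; the parameter values below are illustrative only, not from the notes:

```python
from math import comb

def wright_fisher_matrix(N, alpha, beta, gamma):
    """Transition matrix p_ij = C(N, j) p_i^j (1 - p_i)^(N - j),
    with p_i given by the selection-plus-mutation formula."""
    P = []
    for i in range(N + 1):
        top = gamma * (i * (1 - alpha) + (N - i) * beta)
        bot = top + (i * alpha + (N - i) * (1 - beta))
        p_i = top / bot
        # Row i is the binomial B(N, p_i) distribution of A-types
        P.append([comb(N, j) * p_i**j * (1 - p_i)**(N - j)
                  for j in range(N + 1)])
    return P

# Illustrative parameter values (not from the notes):
P = wright_fisher_matrix(N=10, alpha=0.01, beta=0.02, gamma=1.5)
assert all(abs(sum(row) - 1) < 1e-12 for row in P)   # rows are distributions
```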