System Identification
Lecture 10: Prediction error methods and pseudo-linear regressions
Roy Smith
2018-11-20 10.1
Prediction
[Block diagram: $y(k) = G(z)u(k) + v(k)$, with $v(k) = H(z)e(k)$.]
Typical assumptions
$G(z)$ and $H(z)$ are stable,
$H(z)$ is stably invertible (no zeros on or outside the unit circle),
$e(k)$ has known statistics: known pdf or known moments.
One-step ahead prediction
Given $Z_K = \{u(0), y(0), \ldots, u(K-1), y(K-1)\}$, what is the best estimate of $y(K)$?
Prediction
[Block diagram: $v(k) = H(z)e(k)$.]
Noise model invertibility
Given $v(k)$, $k = 0, \ldots, K-1$, can we determine $e(k)$, $k = 0, \ldots, K-1$?
Inverse filter $H_{\text{inv}}(z)$:
$$e(k) = \sum_{i=0}^{\infty} h_{\text{inv}}(i)\,v(k-i)$$
We also want the inverse filter to be causal and stable:
$$h_{\text{inv}}(k) = 0,\ k < 0, \quad \text{and} \quad \sum_{k=0}^{\infty} |h_{\text{inv}}(k)| < \infty.$$
If $H(z)$ has no zeros for $|z| \ge 1$, then
$$H_{\text{inv}}(z) = \frac{1}{H(z)}.$$
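As a rough illustration of the inverse filter, the impulse response of $H_{\text{inv}}(z) = 1/H(z)$ can be computed by long division when $H(z)$ is a monic FIR filter. The function name `inverse_impulse_response` and the example coefficient 0.5 are assumptions made for this sketch; they do not come from the lecture.

```python
def inverse_impulse_response(h, n):
    """First n samples of h_inv, where H_inv(z) = 1/H(z) and h[0] == 1 (monic).

    Uses the convolution identity sum_i h(i) * h_inv(k - i) = delta(k).
    """
    if h[0] != 1:
        raise ValueError("H(z) must be monic: h[0] == 1")
    h_inv = [1.0]
    for k in range(1, n):
        last = min(k, len(h) - 1)
        acc = sum(h[i] * h_inv[k - i] for i in range(1, last + 1))
        h_inv.append(-acc)
    return h_inv


# Hypothetical example: H(z) = 1 + 0.5 z^-1, whose inverse has h_inv(i) = (-0.5)^i.
h_inv = inverse_impulse_response([1.0, 0.5], 6)
# Partial sum of |h_inv(k)|; it stays bounded because the zero is inside the unit circle.
abs_sum = sum(abs(x) for x in h_inv)
```

With a zero outside the unit circle (for example 0.5 replaced by 2) the same recursion still runs, but the samples grow without bound, which is the failure of stable invertibility described above.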
Prediction
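One-step ahead prediction
Given measurements of $v(k)$, $k = 0, \ldots, K-1$, can we predict $v(K)$?
Assume that we know $H(z)$; how much can we say about $v(K)$?
Assume also that $H(z)$ is monic ($h(0) = 1$).
$$v(k) = \sum_{i=0}^{\infty} h(i)\,e(k-i) = e(k) + \underbrace{\sum_{i=1}^{\infty} h(i)\,e(k-i)}_{=\, m(k-1)\ \text{("observed")}}$$
The term $m(k-1)$ involves only $e(k-1), e(k-2), \ldots$, which can be recovered from past values of $v$ when $H(z)$ is stably invertible.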
Prediction
One-step ahead prediction
The prediction of $v(k)$, based on measurements up to time $k-1$, is $\hat{v}(k|k-1)$.
We will argue that a good choice in this case is
$$\hat{v}(k|k-1) = m(k-1) = \sum_{i=1}^{\infty} h(i)\,e(k-i).$$
The error in our prediction is $e(k)$, which we clearly can't reduce.
One-step prediction statistics
General case
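Say $e(k)$ is identically distributed with pdf $f_e(x)$:
$$\text{Prob}\{x \le e(k) \le x + \delta x\} = \int_{x}^{x+\delta x} f_e(x)\,dx \approx f_e(x)\,\delta x.$$
A posteriori distribution
What are the statistics of $v(k)$ given $v_{-\infty}^{k-1} = \{v(-\infty), \ldots, v(k-1)\}$?
$$\begin{aligned}
\text{Prob}\{x \le v(k) \le x + \delta x \mid v_{-\infty}^{k-1}\}
&= \text{Prob}\{x \le e(k) + m(k-1) \le x + \delta x\} \\
&= \text{Prob}\{x - m(k-1) \le e(k) \le x - m(k-1) + \delta x\} \\
&\approx f_e(x - m(k-1))\,\delta x.
\end{aligned}$$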
One-step ahead prediction statistics
Maximum of the conditional (a posteriori) distribution
Select the prediction estimate as the peak value of the conditional distribution:
$$\hat{v}(k|k-1) = \arg\max_{x} f_e(x - m(k-1)) = m(k-1) \quad \text{for the Gaussian case.}$$
This is the most probable value of $v(k)$ given $v_{-\infty}^{k-1}$.
Mean of the conditional distribution
Select the prediction estimate as the mean value of the conditional distribution:
$$\hat{v}(k|k-1) = E\{v(k) \mid v_{-\infty}^{k-1}\} = E\{e(k) + m(k-1)\} = m(k-1) + E\{e(k)\} = m(k-1).$$
This is the expected value of $v(k)$ given $v_{-\infty}^{k-1}$, using $E\{e(k)\} = 0$.
One-step ahead prediction
Calculation
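$$\begin{aligned}
\hat{v}(k|k-1) = m(k-1) &= \sum_{i=1}^{\infty} h(i)\,e(k-i) \\
&= (H(z) - 1)\,e(k) \quad \text{(assuming $H(z)$ is monic)} \\
&= \frac{H(z) - 1}{H(z)}\,v(k) = (1 - H_{\text{inv}}(z))\,v(k) \\
&= -\sum_{i=1}^{\infty} h_{\text{inv}}(i)\,v(k-i)
\end{aligned}$$
Note that $\hat{v}(k|k-1)$ depends only on values up to time $k-1$.
With data available only from time 0, the best we can do is:
$$\hat{v}(k|k-1) = -\sum_{i=1}^{k} h_{\text{inv}}(i)\,v(k-i) \approx -\sum_{i=1}^{\infty} h_{\text{inv}}(i)\,v(k-i).$$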
Example
Moving average model
$$v(k) = e(k) + c\,e(k-1) \implies H(z) = 1 + c z^{-1}.$$
For $H(z)$ to be stably invertible we require $|c| < 1$.
$$H_{\text{inv}}(z) = \frac{1}{1 + c z^{-1}} = \sum_{i=0}^{\infty} (-c)^i z^{-i}.$$
One-step ahead predictor
$$\begin{aligned}
\hat{v}(k|k-1) = (1 - H_{\text{inv}}(z))\,v(k) &= -\sum_{i=1}^{\infty} (-c)^i v(k-i) \\
&\approx -\sum_{i=1}^{k} (-c)^i v(k-i) \\
&= c\,v(k-1) - c^2 v(k-2) + c^3 v(k-3) + \cdots - (-c)^k v(0).
\end{aligned}$$
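The truncated-sum predictor for the moving average model can be tried numerically. The sketch below is a minimal illustration under assumed values ($c = 0.5$, $K = 50$, unit-variance Gaussian $e(k)$, and $e(-1) = 0$); the function name `ma1_predict_truncated` is hypothetical. Under the at-rest assumption the truncated sum is exact, so the prediction error should reproduce $e(k)$.

```python
import random


def ma1_predict_truncated(v, c):
    """v_hat[k] = -sum_{i=1}^{k} (-c)^i v[k-i], for k = 0, ..., len(v)-1."""
    v_hat = []
    for k in range(len(v)):
        v_hat.append(-sum((-c) ** i * v[k - i] for i in range(1, k + 1)))
    return v_hat


random.seed(0)   # illustrative data only
c = 0.5          # assumed MA coefficient, |c| < 1
K = 50
e = [random.gauss(0.0, 1.0) for _ in range(K)]
# v(k) = e(k) + c e(k-1), taking e(-1) = 0 (system at rest)
v = [e[k] + (c * e[k - 1] if k > 0 else 0.0) for k in range(K)]

v_hat = ma1_predict_truncated(v, c)
errors = [v[k] - v_hat[k] for k in range(K)]
# Largest gap between the prediction error and the true innovation
max_dev = max(abs(errors[k] - e[k]) for k in range(K))
```

If the system were not at rest before $k = 0$, the truncation would leave a transient error that decays like $|c|^k$.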
Example
Moving average model
$$v(k) = e(k) + c\,e(k-1) \implies H(z) = 1 + c z^{-1}.$$
Recursive formulation
Note that
$$H(z)\,\hat{v}(k|k-1) = (H(z) - 1)\,v(k).$$
So,
$$\begin{aligned}
\hat{v}(k|k-1) + c\,\hat{v}(k-1|k-2) &= c\,v(k-1) \\
\hat{v}(k|k-1) &= c\,\underbrace{\left(v(k-1) - \hat{v}(k-1|k-2)\right)}_{\epsilon(k-1)\ \text{(prediction error at $k-1$)}} = c\,\epsilon(k-1)
\end{aligned}$$
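The recursive form is cheaper than re-evaluating the sum at every step and should give the same predictions when started from $\hat{v}(0|-1) = 0$. The following sketch compares the two on a short made-up sequence; both function names and the data are assumptions for illustration only.

```python
def ma1_predict_truncated(v, c):
    """Direct form: v_hat[k] = -sum_{i=1}^{k} (-c)^i v[k-i]."""
    return [-sum((-c) ** i * v[k - i] for i in range(1, k + 1))
            for k in range(len(v))]


def ma1_predict_recursive(v, c):
    """Recursive form: v_hat[k] = c * (v[k-1] - v_hat[k-1]), with v_hat[0] = 0."""
    v_hat = [0.0]
    for k in range(1, len(v)):
        v_hat.append(c * (v[k - 1] - v_hat[k - 1]))
    return v_hat


v = [1.0, -2.0, 0.5, 3.0]   # made-up noise samples
c = 0.5                     # assumed MA coefficient
direct = ma1_predict_truncated(v, c)
recursive = ma1_predict_recursive(v, c)
gap = max(abs(d - r) for d, r in zip(direct, recursive))
```

The recursion needs only the previous prediction error, so its cost per step is constant, whereas the direct sum grows with $k$.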
Another example
Autoregressive noise model
Our noise model is:
$$v(k) = \sum_{i=0}^{\infty} a^i e(k-i), \qquad |a| < 1 \text{ for stability.}$$
So,
$$H(z) = \sum_{i=0}^{\infty} a^i z^{-i} = \frac{1}{1 - a z^{-1}},$$
and $H_{\text{inv}}(z) = 1 - a z^{-1}$ (a moving average process).
Our one-step ahead predictor is
$$\hat{v}(k|k-1) = (1 - H_{\text{inv}}(z))\,v(k) = a\,v(k-1).$$
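For the autoregressive noise model the predictor needs only the most recent sample. A minimal sketch, assuming $a = 0.8$, $K = 100$ and $v(-1) = 0$ (values chosen here for illustration, not taken from the lecture):

```python
import random

random.seed(1)   # illustrative data only
a = 0.8          # assumed AR coefficient, |a| < 1
K = 100
e = [random.gauss(0.0, 1.0) for _ in range(K)]

# Generate v(k) = a v(k-1) + e(k), i.e. v = H(z) e with H(z) = 1 / (1 - a z^-1)
v = []
prev = 0.0
for k in range(K):
    prev = a * prev + e[k]
    v.append(prev)

# One-step ahead predictor: v_hat(k|k-1) = a v(k-1), with v(-1) = 0
v_hat = [0.0] + [a * v[k - 1] for k in range(1, K)]
errors = [v[k] - v_hat[k] for k in range(K)]
max_dev = max(abs(errors[k] - e[k]) for k in range(K))
```

Here the inverse filter is FIR, so no truncation is involved and the prediction error matches $e(k)$ up to rounding.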
Output prediction
$$y(k) = G(z)u(k) + v(k), \qquad v(k) = H(z)e(k)$$
One-step ahead prediction
Take the expected value of the conditional distribution,
$$\begin{aligned}
\hat{y}(k|k-1) = E\{y(k) \mid Z_K\} &= G(z)u(k) + \hat{v}(k|k-1) \\
&= G(z)u(k) + (1 - H_{\text{inv}}(z))\,v(k) \\
&= H_{\text{inv}}(z)G(z)u(k) + (1 - H_{\text{inv}}(z))\,y(k),
\end{aligned}$$
where the last step uses $v(k) = y(k) - G(z)u(k)$.
Output prediction
Prediction error
$$\begin{aligned}
y(k) - \hat{y}(k|k-1) &= -H_{\text{inv}}(z)G(z)u(k) + H_{\text{inv}}(z)y(k) \\
&= H_{\text{inv}}(z)\,(y(k) - G(z)u(k)) = H_{\text{inv}}(z)v(k) \\
&= e(k)
\end{aligned}$$
The innovation, $e(k)$, is the part of the output that cannot be predicted from past measurements.
Prediction error based identification
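The one-step ahead predictor is parametrised by $\theta$,
$$\hat{y}(k|\theta, Z_K) = H_{\text{inv}}(\theta, z)G(\theta, z)u(k) + (1 - H_{\text{inv}}(\theta, z))\,y(k).$$
Define a parametrised prediction error,
$$\epsilon(k, \theta) = y(k) - \hat{y}(k|\theta, Z_K),$$
which we can optionally filter,
$$\epsilon_F(k, \theta) = F(z)\,\epsilon(k, \theta) \quad \text{(weighted error).}$$
Define a cost function,
$$J(\theta, Z_K) = \frac{1}{K}\sum_{k=0}^{K-1} l(\epsilon_F(k, \theta)), \qquad \text{typically } l(\epsilon_F(k, \theta)) = \|\epsilon_F(k, \theta)\|^2.$$
The parameter estimate is the minimiser,
$$\hat{\theta} = \arg\min_{\theta} J(\theta, Z_K).$$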
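To make the cost function concrete, the sketch below applies it to the simplest possible case: a pure MA(1) noise model ($G = 0$, $H(z) = 1 + c z^{-1}$, $F(z) = 1$) with the single parameter $\theta = c$, minimised by a coarse grid search. The grid search, the data length, and the value $c = 0.5$ are all assumptions chosen for illustration; the lecture does not prescribe a particular optimiser here.

```python
import random


def prediction_errors(v, c):
    """eps(k) = v(k) - c * eps(k-1): prediction errors of the MA(1) predictor."""
    eps = []
    prev = 0.0
    for vk in v:
        prev = vk - c * prev
        eps.append(prev)
    return eps


def cost(v, c):
    """J(c) = (1/K) * sum_k eps(k, c)^2."""
    eps = prediction_errors(v, c)
    return sum(x * x for x in eps) / len(eps)


random.seed(0)   # illustrative data only
c_true = 0.5     # assumed true parameter
K = 2000
e = [random.gauss(0.0, 1.0) for _ in range(K)]
v = [e[k] + (c_true * e[k - 1] if k > 0 else 0.0) for k in range(K)]

# Coarse grid over the stably invertible range |c| < 1
grid = [i / 100 for i in range(-90, 91)]
c_hat = min(grid, key=lambda c: cost(v, c))
```

In practice a gradient-based or Gauss-Newton search would replace the grid; the grid is used here only because it makes the minimisation explicit.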
Prediction error methods: ARX models
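[Block diagram: $y(k) = \frac{B(\theta, z)}{A(\theta, z)}u(k) + v(k)$, with $v(k) = \frac{1}{A(\theta, z)}e(k)$.]
$$G(\theta, z) = \frac{B(\theta, z)}{A(\theta, z)}, \qquad H(\theta, z) = \frac{1}{A(\theta, z)},$$
$$\begin{aligned}
\hat{y}(k|\theta) &= H_{\text{inv}}(\theta, z)G(\theta, z)u(k) + (1 - H_{\text{inv}}(\theta, z))\,y(k) \\
&= B(z)u(k) + (1 - A(z))\,y(k) \\
&= \theta^T\phi(k) = \phi^T(k)\theta.
\end{aligned}$$
So,
$$Y - \Phi\theta = \varepsilon \quad \longleftarrow \text{vector of prediction errors}$$
The least squares regression approach minimises the prediction errors.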
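For a first-order ARX model the least squares problem reduces to a 2-by-2 set of normal equations, which can be solved by hand. The sketch below assumes $A(z) = 1 + a z^{-1}$, $B(z) = b z^{-1}$, a random binary input, and made-up true values $a = -0.7$, $b = 2$; the function name `arx1_least_squares` is hypothetical.

```python
import random


def arx1_least_squares(u, y):
    """LS fit of y(k) = b*u(k-1) - a*y(k-1) + e(k); returns (b_hat, a_hat)."""
    s_uu = s_uy = s_yy = r_u = r_y = 0.0
    for k in range(1, len(y)):
        p1, p2 = u[k - 1], -y[k - 1]   # regressor phi(k) = [u(k-1), -y(k-1)]
        s_uu += p1 * p1
        s_uy += p1 * p2
        s_yy += p2 * p2
        r_u += p1 * y[k]
        r_y += p2 * y[k]
    # Solve the normal equations (Phi^T Phi) theta = Phi^T Y by Cramer's rule
    det = s_uu * s_yy - s_uy * s_uy
    b_hat = (r_u * s_yy - s_uy * r_y) / det
    a_hat = (s_uu * r_y - s_uy * r_u) / det
    return b_hat, a_hat


random.seed(2)            # illustrative data only
a_true, b_true = -0.7, 2.0
K = 500
u = [random.choice([-1.0, 1.0]) for _ in range(K)]
e = [random.gauss(0.0, 0.1) for _ in range(K)]
y = []
for k in range(K):
    y_prev = y[k - 1] if k > 0 else 0.0
    u_prev = u[k - 1] if k > 0 else 0.0
    y.append(-a_true * y_prev + b_true * u_prev + e[k])

b_hat, a_hat = arx1_least_squares(u, y)
```

Because the equation error is white for the ARX structure, the estimates should settle close to the true values as the data length grows.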
Model structures
[Block diagram: $y(k) = G(\theta, z)u(k) + v(k)$, with $v(k) = H(\theta, z)e(k)$.]
ARX model structure (equation error)
$$G(\theta, z) = \frac{B(\theta, z)}{A(\theta, z)}, \qquad H(\theta, z) = \frac{1}{A(\theta, z)}.$$
ARMAX model structure
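[Block diagram: $y(k) = \frac{B(\theta, z)}{A(\theta, z)}u(k) + v(k)$, with $v(k) = \frac{C(\theta, z)}{A(\theta, z)}e(k)$.]
$$G(\theta, z) = \frac{B(\theta, z)}{A(\theta, z)}, \qquad H(\theta, z) = \frac{C(\theta, z)}{A(\theta, z)},$$
with $A(z)$, $C(z)$ monic.
Prediction error structure
$$\begin{aligned}
\hat{y}(k|\theta) &= \frac{B(z)}{C(z)}u(k) + \left(1 - \frac{A(z)}{C(z)}\right)y(k) \\
C(z)\,\hat{y}(k|\theta) &= B(z)u(k) + (C(z) - A(z))\,y(k) \\
\hat{y}(k|\theta) &= B(z)u(k) + (1 - A(z))\,y(k) + (C(z) - 1)\underbrace{\left(y(k) - \hat{y}(k|\theta)\right)}_{\epsilon(k)}
\end{aligned}$$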
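The last form of the ARMAX predictor can be run as a recursion, since $(C(z) - 1)\epsilon(k)$ involves only past prediction errors. A minimal first-order sketch follows, assuming $A(z) = 1 + a z^{-1}$, $B(z) = b z^{-1}$, $C(z) = 1 + c z^{-1}$, a plant at rest before $k = 0$, and made-up parameter values; `armax1_predict` is a hypothetical name.

```python
import random


def armax1_predict(u, y, a, b, c):
    """y_hat(k) = b u(k-1) - a y(k-1) + c eps(k-1); returns (y_hat, eps)."""
    y_hat, eps = [], []
    for k in range(len(y)):
        if k == 0:
            pred = 0.0   # plant at rest: no past data
        else:
            pred = b * u[k - 1] - a * y[k - 1] + c * eps[k - 1]
        y_hat.append(pred)
        eps.append(y[k] - pred)
    return y_hat, eps


random.seed(5)             # illustrative data only
a, b, c = -0.7, 1.0, 0.5   # assumed true parameters
K = 80
u = [random.choice([-1.0, 1.0]) for _ in range(K)]
e = [random.gauss(0.0, 0.2) for _ in range(K)]
y = []
for k in range(K):
    if k == 0:
        y.append(e[0])
    else:
        y.append(-a * y[k - 1] + b * u[k - 1] + e[k] + c * e[k - 1])

y_hat, eps = armax1_predict(u, y, a, b, c)
max_dev = max(abs(eps[k] - e[k]) for k in range(K))
```

With the true parameters the prediction error reproduces the innovation; with any other $\theta$ the past errors fed back into the predictor change as well, which is where the dependence on $\theta$ enters.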
Pseudolinear regression
One-step ahead ARMAX predictor
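$$\begin{aligned}
\hat{y}(k|\theta) &= B(z)u(k) + (1 - A(z))\,y(k) + (C(z) - 1)\,\epsilon(k) \\
&= \begin{bmatrix} b_1 & \ldots & a_1 & \ldots & c_1 & \ldots \end{bmatrix}
\begin{bmatrix} u(k-1) & \ldots & -y(k-1) & \ldots & \epsilon(k-1) & \ldots \end{bmatrix}^T \\
&= \varphi^T(\theta, k)\,\theta.
\end{aligned}$$
This is not linear in $\theta$: the regressor $\varphi(\theta, k)$ contains past prediction errors, which themselves depend on $\theta$.
Optimisation-based algorithm
$$\begin{aligned}
\underset{\theta,\,\varepsilon}{\text{minimise}} \quad & \|\varepsilon\|_2 \quad \text{(or more generally, $l(\varepsilon)$)} \\
\text{subject to} \quad & Y = \Phi(\varepsilon)^T\theta + \varepsilon \quad \text{(nonlinear equality constraint)}
\end{aligned}$$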
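The equality constraint can be checked numerically: for data generated by a second-order ARMAX model, the pair (true $\theta$, true innovations) should satisfy $Y = \Phi(\varepsilon)^T\theta + \varepsilon$. The sketch below only evaluates the constraint residual; it does not perform the minimisation. The helper names and all numerical values are assumptions made for illustration.

```python
import random


def regressor(u, y, eps, k):
    """Row of Phi(eps)^T at time k; signals are zero before k = 0 (at rest)."""
    def past(x, d):
        return x[k - d] if k - d >= 0 else 0.0
    return [past(u, 1), past(u, 2), -past(y, 1), -past(y, 2),
            past(eps, 1), past(eps, 2)]


def constraint_residual(theta, eps, u, y):
    """Y - Phi(eps)^T theta - eps, with theta = [b1, b2, a1, a2, c1, c2]."""
    res = []
    for k in range(len(y)):
        row = regressor(u, y, eps, k)
        res.append(y[k] - sum(r * t for r, t in zip(row, theta)) - eps[k])
    return res


random.seed(3)                                  # illustrative data only
theta_true = [1.0, 0.5, -1.2, 0.5, 0.4, 0.1]    # hypothetical parameter values
K = 31
u = [random.choice([-1.0, 1.0]) for _ in range(K)]
e = [random.gauss(0.0, 0.1) for _ in range(K)]

# Simulate A(z) y = B(z) u + C(z) e one sample at a time
y = []
for k in range(K):
    row = regressor(u, y, e, k)
    y.append(sum(r * t for r, t in zip(row, theta_true)) + e[k])

res_true = constraint_residual(theta_true, e, u, y)
max_res = max(abs(r) for r in res_true)
```

The true pair is feasible but need not be the minimiser for finite data: the optimiser is free to find another feasible pair with a smaller $\|\varepsilon\|$.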
ARMAX example
ARMAX model structure
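[Block diagram: $y(k) = \frac{B(\theta, z)}{A(\theta, z)}u(k) + v(k)$, with $v(k) = \frac{C(\theta, z)}{A(\theta, z)}e(k)$.]
$$A(z) = 1 + a_1 z^{-1} + a_2 z^{-2}$$
$$B(z) = b_1 z^{-1} + b_2 z^{-2}$$
$$C(z) = 1 + c_1 z^{-1} + c_2 z^{-2}$$
$$\theta = \begin{bmatrix} b_1 & b_2 & a_1 & a_2 & c_1 & c_2 \end{bmatrix}^T.$$
Experiments
- The plant is "at rest".
- Data length $K = 31$.
- PRBS input signal, $u(k)$.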
ARMAX example
Typical experimental data
[Figure: three panels showing $u(k)$, $y(k)$ and $v(k)$ plotted against index $k$, for $k = 0, \ldots, 30$.]
Constrained minimisation code
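% Create data part of regressor. Assume plant at rest.
% u, y: K-by-1 column vectors of input and output data.
K = length(y);
PhiTyu = zeros(K,4);                 % row 1 stays zero (plant at rest)
PhiTyu(2,:) = [u(1), 0, -y(1), 0];
for i = 3:K
  PhiTyu(i,:) = [u(i-1), u(i-2), -y(i-1), -y(i-2)];
end
% Decision variable x = [theta; e], theta = [b1 b2 a1 a2 c1 c2]'.
% x0 is the initial guess (length 6+K).
[x,fval] = fmincon(@(x)ARMAXobjective(x),x0,...
    [],[],[],[],[],[],@(x)ARMAXconstraint(x,y,PhiTyu));

function [f] = ARMAXobjective(x)     % x = [theta; e]
  f = sqrt(x(7:end)'*x(7:end));      % 2-norm of the prediction errors
end

function [c,ceq] = ARMAXconstraint(x,y,PhiTyu)
  K = length(y);
  theta = x(1:6);
  e = x(7:end);
  PhiTe = zeros(K,2);                % prediction-error part of the regressor
  PhiTe(2,1) = e(1);
  for j = 3:K
    PhiTe(j,:) = [e(j-1), e(j-2)];
  end
  ceq = y - [PhiTyu, PhiTe] * theta - e;   % equality constraint
  c = [];                                  % no inequality constraints
end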
ARMAX example
Transfer function averages: 128 experiments (data length, $K = 31$)
[Figure: "Mean estimate comparison, K = 31". Magnitude versus frequency on log-log axes, with a second panel on the same frequency axis; legend: Gz, Hz, Gopt, Hopt.]
ARMAX example
Coefficient statistics for 128 experiments
[Figure: spread of the estimated coefficients $b_1$, $b_2$, $a_1$, $a_2$, $c_1$, $c_2$; experiment data length = 31 (green circles are the true values).]
ARMAX example
Coefficient error for averages: 2, 4, 8, ..., 128 experiments
[Figure: theta error versus total data length (#avgs x expt length) on log-log axes; one trace per coefficient $b_1$, $b_2$, $a_1$, $a_2$, $c_1$, $c_2$.]
ARMAX example
Transfer function estimates: 128 experiments (data length, $K = 31$)
[Figure: $|G(j\omega)|$ estimates (128) and mean estimate for $K = 31$; magnitude versus frequency (rad/sample) on log-log axes.]
ARMAX example
Transfer function estimates: 128 experiments (data length, $K = 31$)
[Figure: $|H(j\omega)|$ estimates (128) and mean estimate for $K = 31$; magnitude versus frequency (rad/sample) on log-log axes.]
ARMAX example
Prediction errors and actual innovations (data length, $K = 31$)
[Figure: "Actual innovations vs. optimum prediction error". Mean $\|e\|/\sqrt{N}$ and mean $\|\epsilon\|/\sqrt{N}$ versus total data length (#avgs x expt length) on log-log axes.]
ARMAX example
Longer experiments: $K = 127$
[Figure: three panels showing $u(k)$, $y(k)$ and $v(k)$ plotted against index $k$.]
ARMAX example
Coefficient statistics comparison: $K = 31$ and $K = 127$
[Figure: spread of the estimated coefficients $b_1$, $a_1$ and $c_1$, shown side by side for $K = 31$ and $K = 127$.]
ARMAX example
Coefficient error comparison: $K = 31$ and $K = 127$
[Figure: theta error versus total data length (#avgs x expt length) on log-log axes; one trace per coefficient $b_1$, $b_2$, $a_1$, $a_2$, $c_1$, $c_2$.]
ARMAX example
Prediction error comparison: $K = 31$ and $K = 127$
[Figure: "Actual innovations vs. optimum prediction error". Mean $\|e\|/\sqrt{N}$ and mean $\|\epsilon\|/\sqrt{N}$ versus total data length (#avgs x expt length) on log-log axes.]
ARMAX example
Transfer function estimates: 32 experiments (data length, $K = 127$)
[Figure: $|G(j\omega)|$ estimates (32) and mean estimate for $K = 127$; magnitude versus frequency (rad/sample) on log-log axes.]
ARMAX example
Transfer function estimates: 32 experiments (data length, $K = 127$)
[Figure: $|H(j\omega)|$ estimates (32) and mean estimate for $K = 127$; magnitude versus frequency (rad/sample) on log-log axes.]
ARARMAX model structure
[Block diagram of the ARARMAX structure.]
$$G(\theta, z) = \frac{B(\theta, z)}{A(\theta, z)}, \qquad H(\theta, z) = \frac{C(\theta, z)}{A(\theta, z)D(\theta, z)},$$
with $A(z)$, $C(z)$ and $D(z)$ monic.
Output error model structure
[Block diagram: $y(k) = \frac{B(\theta, z)}{F(\theta, z)}u(k) + e(k)$.]
$$G(\theta, z) = \frac{B(\theta, z)}{F(\theta, z)}, \qquad H(\theta, z) = 1,$$
with $F(z)$ monic.
Pseudolinear predictor framework
$$\hat{y}(k|\theta) = \frac{B(\theta, z)}{F(\theta, z)}u(k) = \phi(k, \theta)^T\theta,$$
where
$$\phi(k, \theta)^T = \big[\, u(k-1) \ \ldots \ u(k-m) \ \ \underbrace{-\hat{y}(k-1|\theta) \ \ldots \ -\hat{y}(k-n_f|\theta)}_{\text{pseudolinear terms}} \,\big].$$
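The output error predictor differs from the ARX predictor in that it feeds back its own past predictions rather than past measured outputs. A minimal first-order sketch, assuming $B(z) = b z^{-1}$, $F(z) = 1 + f z^{-1}$, a plant at rest, and made-up values $b = 1$, $f = -0.8$ (`oe1_predict` is a hypothetical name):

```python
import random


def oe1_predict(u, b, f):
    """y_hat(k) = b u(k-1) - f y_hat(k-1): uses past predictions, not past outputs."""
    y_hat = []
    for k in range(len(u)):
        if k == 0:
            y_hat.append(0.0)   # plant at rest
        else:
            y_hat.append(b * u[k - 1] - f * y_hat[k - 1])
    return y_hat


random.seed(4)      # illustrative data only
b, f = 1.0, -0.8    # assumed true parameters
K = 60
u = [random.choice([-1.0, 1.0]) for _ in range(K)]
e = [random.gauss(0.0, 0.3) for _ in range(K)]

# Simulate the plant: x = (B/F) u, then add white measurement noise (H = 1)
x = []
for k in range(K):
    x.append(0.0 if k == 0 else -f * x[k - 1] + b * u[k - 1])
y = [x[k] + e[k] for k in range(K)]

y_hat = oe1_predict(u, b, f)
eps = [y[k] - y_hat[k] for k in range(K)]
max_dev = max(abs(eps[k] - e[k]) for k in range(K))
```

Since the measured output never enters the predictor, the regressor depends on $\theta$ through $\hat{y}$, which is what makes the regression pseudolinear.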
Box-Jenkins model structure
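[Block diagram: $y(k) = \frac{B(\theta, z)}{F(\theta, z)}u(k) + v(k)$, with $v(k) = \frac{C(\theta, z)}{D(\theta, z)}e(k)$.]
$$G(\theta, z) = \frac{B(\theta, z)}{F(\theta, z)}, \qquad H(\theta, z) = \frac{C(\theta, z)}{D(\theta, z)},$$
Predictor
$$\hat{y}(k|\theta) = \frac{D(z)}{C(z)}\frac{B(z)}{F(z)}u(k) + \left(1 - \frac{D(z)}{C(z)}\right)y(k)$$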
General model structure
[Block diagram of the general model structure.]
$$G(\theta, z) = \frac{B(\theta, z)}{A(\theta, z)F(\theta, z)}, \qquad H(\theta, z) = \frac{C(\theta, z)}{A(\theta, z)D(\theta, z)}$$
Predictor
$$\hat{y}(k|\theta) = \frac{D(z)}{C(z)}\frac{B(z)}{F(z)}u(k) + \left(1 - \frac{D(z)A(z)}{C(z)}\right)y(k)$$
A pseudo-linear regression can be derived.
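As a consistency check (a worked substitution added here, not taken from the slides), inserting the general $G$ and $H$ into the generic predictor $\hat{y} = H_{\text{inv}}G\,u + (1 - H_{\text{inv}})\,y$ gives the two coefficients of the general predictor:

$$H_{\text{inv}}(z)G(z) = \frac{A(z)D(z)}{C(z)}\cdot\frac{B(z)}{A(z)F(z)} = \frac{D(z)}{C(z)}\frac{B(z)}{F(z)}, \qquad 1 - H_{\text{inv}}(z) = 1 - \frac{D(z)A(z)}{C(z)}.$$

Setting $A(z) = 1$ recovers the Box-Jenkins predictor, and setting $D(z) = F(z) = 1$ recovers the ARMAX predictor $\hat{y}(k|\theta) = \frac{B(z)}{C(z)}u(k) + \left(1 - \frac{A(z)}{C(z)}\right)y(k)$.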
Known noise model (with ARMAX dynamics)
Assume that the noise model is known,
$$v(k) = L(z)e(k).$$
So
$$A(z)y(k) = B(z)u(k) + L(z)e(k).$$
Filter the signals via $L^{-1}(z)$,
$$y_L(k) = L^{-1}(z)y(k), \qquad u_L(k) = L^{-1}(z)u(k),$$
giving
$$A(z)y_L(k) = B(z)u_L(k) + e(k),$$
for which LS methods give consistent estimates.
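The filtered relation follows in one line because linear time-invariant filters commute. This short derivation is added for completeness and assumes $L(z)$ is stably invertible:

$$A(z)y_L(k) = A(z)L^{-1}(z)y(k) = L^{-1}(z)\big(B(z)u(k) + L(z)e(k)\big) = B(z)u_L(k) + e(k).$$

The filtered data therefore satisfy an ARX model with white equation error, which is the condition under which least squares is consistent.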
High-order model fitting
Assume that the noise is autoregressive (ARARX structure),
$$A(z)y(k) = B(z)u(k) + \frac{1}{D(z)}e(k), \qquad e(k) \sim \mathcal{N}(0, \lambda).$$
Fit a high order model (the order of $D(z)$ is $n_d$):
$$A(z)D(z)y(k) = B(z)D(z)u(k) + e(k).$$
Use a least squares estimate with orders $n + n_d$ and $m + n_d$. This gives a consistent estimate of
$$\frac{B(z)D(z)}{A(z)D(z)} = \frac{B(z)}{A(z)}.$$
This amounts to making the noise model sufficiently rich to capture additional autoregressive features in the noise.
In practice the cancellation will not be exact: the estimated $A(z)$ and $B(z)$ polynomials will be high order.
Bibliography
Prediction
Lennart Ljung, System Identification: Theory for the User, 2nd Ed., Prentice-Hall, 1999 [section 3.2].
Model parametrisations
Lennart Ljung, System Identification: Theory for the User, 2nd Ed., Prentice-Hall, 1999 [sections 1.3 and 4.2].
Linear and pseudolinear regression
Lennart Ljung, System Identification: Theory for the User, 2nd Ed., Prentice-Hall, 1999 [sections 10.1 and 10.2].