CHAPTER 3
LINEAR PREDICTION TECHNIQUES FOR VBR VIDEO TRAFFIC
3.1 INTRODUCTION
Optimal allocation of network resources for streaming of
multimedia content is a major challenge facing the research community today.
Videos encoded using popular compression standards such as MPEG-4 and
H.264 generate variable bit rate (VBR) traffic. Allocating fixed bandwidth
for such traffic would either degrade the quality of service or lead to under-
utilization of the network resources. Instead, the network bandwidth can be
allocated dynamically based on the nature of data currently being transmitted.
Traffic Prediction is the process of predicting future network traffic
based on the characteristics of the past traffic. Optimal dynamic allocation of
network resources is possible if future traffic content can be predicted in
advance. Since predictions have to be done in real-time, the technique used
must be simple, fast and accurate.
Linear Prediction techniques are simple and efficient, and thus are
suitable for real-time predictions. These techniques are based on estimating
the future traffic patterns as a linear weighted sum of past traffic patterns.
Initially, a suitable mathematical model is identified for the observed past
traffic, and the model parameters are then changed dynamically (based on
criteria such as prediction error and scene changes) so as to adapt to
changes in the future traffic behavior. Predictions are generally done at the
frame level or the GOP level. Most earlier work has focused on predicting
I-frame traffic, and only occasionally P- or B-frame traffic.
Most of the existing techniques are based on linear regression (AR,
DAR, ARMA, ARIMA, etc.), and a few are based on simple adaptive filters
such as LMS (Least Mean Square), NLMS (Normalized Least Mean Square) and
RLS (Recursive Least Squares). Model parameters are generally estimated
using one of the Yule-Walker, Levinson-Durbin, Least Squares, Maximum
Entropy or Maximum Likelihood methods (Porat 1994).
The accuracy of the linear predictor depends on the number of past
observations used in prediction, which is called the prediction order. Higher
order predictors might give better prediction results, but at the expense of
increased computational complexity. Moreover, MPEG encoded traces
exhibit both SRD (Short Range Dependent) and LRD (Long Range
Dependent) behaviors. Thus, for real-time predictions, lower order predictors
are preferred over higher order predictors, as these are simple and also show
better performance if appropriately used. All of the regression based models
can be classified under SRD models.
Accurately predicting traffic with unknown behavior is an acid test
for predictors. MPEG encoded VBR video traces exhibit a high level of
burstiness, which is indicative of large and sudden traffic fluctuations.
A higher level of burstiness is generally attributed to changes in the content
and activity of the video. Smoothing the traffic prior to prediction can
improve prediction performance. Traffic smoothing can be achieved by
aggregation, differencing, or removing sudden peaks.
Almost all predictors perform well when the traffic variations are
small. But if the variations are high with no trend, the forecast becomes
uncertain and prediction performance decreases. ARIMA based models are
understood to perform better in such situations. Prediction models are
evaluated by comparing the characteristics of the predicted trace with those
of the empirical trace, using measures such as QQ plots, RPE, ACF comparison,
histogram comparison, SNR^{-1}, RMSE, leaky bucket simulations and NMSE.
A brief discussion on some of the popular existing techniques is
given in Section 2.1.3.
3.2 PROPOSED WORK
3.2.1 Moving Average Based Predictors for VBR Video Traffic
The aim is to achieve high accuracy in online VBR traffic
prediction at reduced complexity. Here some new prediction techniques are
proposed, which are equivalent in complexity to a second order linear
predictor, but perform much better. This work focuses on real-time prediction
of video traffic encoded in MPEG-4 standard.
An MPEG-4 encoded video is made up of units called groups of
pictures (GOPs), which consist of I, P and B frames. The GOP-level traffic
can be considered as an aggregation of the frame-level traffic. The focus here
is at single-step-ahead GOP size prediction, where the size of the next GOP in
the video traffic sequence is predicted using the previously encountered GOP
sizes.
New prediction techniques for MPEG-4 encoded variable bit rate
video traffic are proposed based on the concept of the moving average, and
the discussion is extended further using the gradient-descent approach. The
resultant predictors are both simple and accurate, and suitable for real-time
prediction. Here, the NLMS technique is used as the base-line predictor for
comparative study, and a performance improvement of about 11% is achieved.
3.2.1.1 Simple linear prediction
As stated in Section 2.1.1, given two consecutive GOP sizes $G_{i-2}$
and $G_{i-1}$, the value of $G_i$ is predicted as follows:
Linear Predictor: $G_i = \alpha G_{i-2} + \beta G_{i-1}$ (3.1)
where the coefficients α and β depend on the characteristics of the traffic and
are determined through the method of least squares.
The least squares method determines the values of α and β, for
which the sum of squares of prediction errors is minimum. This method
requires the entire video trace to be known in advance, and hence is not
suitable for real-time prediction. On the other hand, the LMS filter (Yoo
2002) is a real-time approach to prediction, where the coefficients are
adaptively set after each prediction step based on the prediction error in that
step. Another issue with the linear predictor given above is that it assumes
the size of every GOP to be linearly dependent on the sizes of the previous
two GOPs only and ignores its dependence on all other GOP sizes. In other
words, the predictor considers only the local traffic characteristics and
ignores the average traffic characteristics.
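As an illustration of the offline least-squares fit described above, the following minimal sketch solves the two normal equations for the coefficients of the second-order linear predictor. The function name `fit_linear_predictor` and the reading of Equation (3.1) in which α weights $G_{i-2}$ and β weights $G_{i-1}$ are assumptions made for this example. It is not the implementation used in the experiments.

```python
def fit_linear_predictor(gop_sizes):
    """Offline least-squares fit of G_i ~ alpha*G_{i-2} + beta*G_{i-1}.

    Hypothetical helper: needs the whole trace, so it is not usable online.
    """
    g = [float(x) for x in gop_sizes]
    s11 = s12 = s22 = b1 = b2 = 0.0
    for i in range(2, len(g)):
        x1, x2, y = g[i - 2], g[i - 1], g[i]
        # Accumulate the entries of the 2x2 normal equations.
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("regressors are collinear; coefficients are not unique")
    alpha = (b1 * s22 - b2 * s12) / det
    beta = (b2 * s11 - b1 * s12) / det
    return alpha, beta
```

Because every GOP size of the trace enters the sums, the fit can only be computed once the trace is complete, which is the limitation that motivates the adaptive (LMS-type) update.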
The proposed new traffic predictors exploit the average
characteristics of the video traffic. The method is applied to online traffic
prediction and is found to outperform the conventional linear predictors.
3.2.1.2 Moving average predictors
Based on the average measures as discussed in Section 2.1.3.3, two
new predictors for GOP size prediction are proposed here. The size of the
next GOP is predicted as a linear combination of the current GOP size and
one of the moving average measures. Given a GOP size $G_k$, the value of
$G_{k+1}$ is predicted as follows:
SMA Predictor: $G_{k+1} = \alpha G_k + \beta \, \mathrm{SMA}_k$ (3.2)
EMA Predictor: $G_{k+1} = \alpha G_k + \beta \, \mathrm{EMA}_k$ (3.3)
In the next section, it is shown that through reasonable
approximations, these predictors can be simplified further.
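The two predictors can be sketched as follows. The moving average definitions belong to Section 2.1.3.3 and are not reproduced here, so this example assumes that $\mathrm{SMA}_k$ is the mean of the N GOP sizes preceding $G_k$ and that $\mathrm{EMA}_k = \delta G_{k-1} + (1 - \delta) \mathrm{EMA}_{k-1}$. This reading is chosen because it reduces both predictors to the linear predictor when N = 1 or δ = 1. All function names are hypothetical.

```python
def sma(previous_sizes, n):
    """Simple moving average of the last n GOP sizes preceding G_k (assumed definition)."""
    window = previous_sizes[-n:]
    return sum(window) / len(window)


def ema_update(prev_ema, newest_previous_size, delta):
    """Exponential moving average update (assumed definition, weight delta on G_{k-1})."""
    return delta * newest_previous_size + (1.0 - delta) * prev_ema


def predict_next(g_k, ma_k, alpha, beta):
    """Equations (3.2)/(3.3): predicted G_{k+1} from G_k and a moving average."""
    return alpha * g_k + beta * ma_k
```

For instance, with previous sizes `[10, 20, 30]` and current size 40, `predict_next(40, sma([10, 20, 30], 1), alpha, beta)` uses only $G_k$ and $G_{k-1}$, which is the second-order linear predictor.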
3.2.1.3 Shot change detection
A shot is a group of consecutive GOPs having similar sizes. The
accuracy of GOP prediction can be improved if we have a separate predictor
for each shot. In the case of the second-order linear predictor, the values of
α and β can be varied for each shot (Kwong and Johnston 1992). First, a simple
algorithm for shot boundary detection is given in Figure 3.1.
A shot boundary is declared when the standard deviation (SD) of
GOP sizes (since the last shot boundary) exceeds a given threshold T.
The smaller the value of T, the greater the number of shots and hence the
higher the prediction accuracy. For the experiments, the value of T was set
to 20000 bytes.
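The boundary rule described above can be sketched in Python as follows. This is a minimal illustration: the function name `shot_boundaries` is hypothetical, the population form of the standard deviation is assumed, and GOP indices are 1-based as in the text.

```python
import math


def shot_boundaries(gop_sizes, threshold):
    """Return the 1-based GOP indices k at which a shot boundary is declared."""
    boundaries = []
    prev = 0  # index of the last GOP before the current shot
    for k in range(1, len(gop_sizes) + 1):
        segment = gop_sizes[prev:k]  # G_{prev+1} .. G_k
        mean = sum(segment) / len(segment)
        sd = math.sqrt(sum((g - mean) ** 2 for g in segment) / len(segment))
        if sd > threshold:
            boundaries.append(k)
            prev = k - 1  # the boundary GOP opens the next shot
    return boundaries
```

With T = 20000 bytes, a trace such as `[100000, 101000, 99000, 300000, 301000, 299000]` yields a single boundary at the fourth GOP, where the jump in size pushes the running SD above the threshold.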
Algorithm: Shot_Boundary
    prev = 0
    for k = 1 to no_of_gops
        $\bar{G} = \frac{1}{k - prev} \sum_{i = prev+1}^{k} G_i$
        $SD = \sqrt{\frac{1}{k - prev} \sum_{i = prev+1}^{k} (G_i - \bar{G})^2}$
        if SD > T
            Declare 'k' as a shot boundary
            prev = k-1
        end if
    end for
Figure 3.1 Algorithm for Shot Boundary Detection
3.2.1.4 α-Predictors
The shot detection algorithm was applied to an MPEG-4 video trace
of Tokyo Olympics encoded in medium quality. The parameter T in the shot
detection algorithm was set to 20000 bytes. For each shot detected, the values
of the linear predictor coefficients α and β were determined using the method
of least squares. It was found that there was a high negative correlation
between α and β, with the coefficient of determination equal to 0.940994. The
plot of α vs. β is shown in Figure 3.2; the points fit neatly onto a straight
line. The corresponding equation is 1.008284 α + 1.020820 β = 1, which can be
approximated by α + β = 1. Clearly, it is reasonable to approximate β by 1 – α.
Figure 3.2 Plot of alpha (α) vs. beta (β) for Tokyo Olympics (MQ)
Based on this observation, the proposed moving average predictors are
modified to arrive at predictors which are simpler in nature. These simplified
predictors will from now on be referred to as α-SMA and α-EMA. Given a GOP
size $G_k$, the value of $G_{k+1}$ is predicted as follows:
α-SMA Predictor: $G_{k+1} = \alpha G_k + (1 - \alpha) \, \mathrm{SMA}_k$ (3.4)
α-EMA Predictor: $G_{k+1} = \alpha G_k + (1 - \alpha) \, \mathrm{EMA}_k$ (3.5)
In addition, a modification of the linear predictor, denoted as the
α-Linear predictor, is also introduced,
α-Linear Predictor: $G_{k+1} = \alpha G_k + (1 - \alpha) G_{k-1}$ (3.6)
and its performance is analyzed.
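Since each α-predictor has a single free coefficient, its offline least-squares fit has a closed form. Writing $M_k$ for the second term ($\mathrm{SMA}_k$, $\mathrm{EMA}_k$ or $G_{k-1}$), the model is $G_{k+1} - M_k = \alpha (G_k - M_k)$, so the minimizing value is $\alpha = \sum (G_{k+1} - M_k)(G_k - M_k) / \sum (G_k - M_k)^2$. The sketch below evaluates this ratio. The function name and argument layout are hypothetical and only illustrate the calculation.

```python
def fit_alpha(current, average, target):
    """Least-squares alpha for target ~ alpha*current + (1 - alpha)*average.

    current[j], average[j] and target[j] hold G_k, M_k and G_{k+1}
    for the j-th training sample (hypothetical argument layout).
    """
    num = 0.0
    den = 0.0
    for g_k, m_k, g_next in zip(current, average, target):
        d = g_k - m_k
        num += (g_next - m_k) * d
        den += d * d
    if den == 0.0:
        raise ValueError("G_k equals M_k everywhere; alpha is not identifiable")
    return num / den
```

Compared with the two-coefficient fit, only one ratio has to be evaluated, which is the simplification that the α + β = 1 approximation buys.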
3.2.1.5 Real-Time prediction
The performance of the proposed predictors is analyzed on two
prediction tasks, namely offline traffic prediction, where the entire video
trace is known in advance, and online traffic prediction, where the predictor
coefficients need to be learnt in real-time. For online traffic prediction, an
approach similar to that of the Normalized Least Mean Square (NLMS) filter
(Yoo 2002) is used.
An online version of the α-Moving Average predictors based on
the NLMS predictor is presented first. The gradient-descent approach is used
for optimization. Initially, the coefficient α is set to 0 and, after each
prediction step, it is adjusted so as to minimize the mean square error.
Given GOP size $G_k$ and moving average $MA_k$, the prediction error
in predicting $G_{k+1}$ is given by
$err_k = G_{k+1} - (\alpha_k G_k + (1 - \alpha_k) MA_k)$ (3.7)
where $\alpha_k$ is the value of the predictor coefficient at instant k.
The mean square error (MSE) is given by
$\xi_k = E\{err_k^2\}$ (3.8)
where E{.} denotes expected value. The partial derivative of the MSE
function with respect to the coefficient $\alpha_k$ is
$\nabla \xi_k = \nabla E\{err_k^2\} = -2 E\{err_k (G_k - MA_k)\}$ (3.9)
The coefficient is then updated by taking steps proportional to the
negative of the gradient as follows:
$\alpha_{k+1} = \alpha_k - \frac{\mu}{2} \nabla \xi_k = \alpha_k + \mu E\{err_k (G_k - MA_k)\}$ (3.10)
where 0 < µ < 2 is called the step-size and determines the convergence of the
algorithm.
By approximating the expectation term, we get
$\alpha_{k+1} = \alpha_k + \mu \, err_k (G_k - MA_k)$. (3.11)
Generally, a normalized version of this update equation is used as given
below:
$\alpha_{k+1} = \alpha_k + \frac{\mu \times err_k \times (G_k - MA_k)}{G_k^2 + MA_k^2}$ (3.12)
In a similar manner, an online version of the α-linear predictor will
require the following update equation:
$\alpha_{k+1} = \alpha_k + \frac{\mu \times err_k \times (G_k - G_{k-1})}{G_k^2 + G_{k-1}^2}$ (3.13)
In the case of the EMA and SMA predictors with two coefficients,
the online versions will contain two update equations as given below:
$\alpha_{k+1} = \alpha_k + \frac{\mu \times err_k \times G_k}{G_k^2 + MA_k^2}$ (3.14)
$\beta_{k+1} = \beta_k + \frac{\mu \times err_k \times MA_k}{G_k^2 + MA_k^2}$ (3.15)
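To make the online procedure concrete, the sketch below runs the α-SMA predictor over a trace, updating α after every step with the normalized rule of Equation (3.12). It is a minimal illustration under stated assumptions: the function name is hypothetical, the moving average is taken over the N GOP sizes preceding $G_k$, and the update is skipped when the normalizing term is zero.

```python
def online_alpha_sma(gop_sizes, n, mu):
    """One-step-ahead online alpha-SMA prediction with a normalized update.

    Returns the list of predictions for G_{k+1} and the final alpha.
    """
    alpha = 0.0  # initial coefficient, as stated in the text
    predictions = []
    for k in range(n, len(gop_sizes) - 1):
        g_k = gop_sizes[k]
        ma_k = sum(gop_sizes[k - n:k]) / n  # assumed SMA over the n preceding GOPs
        prediction = alpha * g_k + (1.0 - alpha) * ma_k
        predictions.append(prediction)
        err = gop_sizes[k + 1] - prediction  # Equation (3.7)
        energy = g_k ** 2 + ma_k ** 2
        if energy > 0:  # guard against division by zero (assumption)
            alpha += mu * err * (g_k - ma_k) / energy  # Equation (3.12)
    return predictions, alpha
```

Only one scalar is adjusted per step, which reflects the low complexity claimed for the α-predictors. The two-coefficient SMA and EMA versions would carry a second update of the same form for β.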
3.2.1.6 Experiments and results
The proposed predictors were tested on MPEG-4 video traces from
the video trace library and their performances were compared with that of the
linear predictor in (Lanfranchi and Bing 2008) and the NLMS predictor.
Three videos were considered for the experiment, namely Star Wars IV, NBC
News and Tokyo Olympics. In all these videos, the frame rate was 30 fps and
each GOP was encoded in G16B1 pattern. Table 3.1 contains the detailed
video characteristics for the video traces encoded in low (QP = 28), medium
(QP = 12) and high (QP = 2) quality, where QP is the quantization parameter
used in encoding.
Table 3.1 Video Characteristics

Quality   QP   Frame         Mean Frame   S.D. of       Mean Frame   Peak Frame   Frame Peak/
               Compression   Size         Frame Sizes   Bit Rate     Bit Rate     Mean Ratio
               Ratio

Star Wars IV (CIF 352x288: 53953 frames)
High       2     39.47        3852.25     32319.06        924540      8665680        9.37
Medium    12    240.74         631.65      6847.82        151595      2478240       16.35
Low       28    329.94         460.89      3955.55        110613      1495920       13.52

NBC News (CIF 352x288: 49523 frames)
High       2     10.99       13839.02     56641.99       3321367     14670480        4.42
Medium    12     93.94        1618.79     14378.43        388510      4074000       10.49
Low       28    168.09         904.66      7110.84        217118      2040960        9.40

Tokyo Olympics (CIF 352x288: 133127 frames)
High       2     19.60        7757.12     56863.20       1861705     18666000       10.03
Medium    12    124.52        1221.19     10418.80        293084      4393200       14.99
Low       28    193.22         787         4966.64        188879      1755120        9.29
It can be seen that the peak-to-mean ratio of the frame sizes, which
is representative of the burstiness of a video, is maximum for the traces
with medium quality encoding. To analyze the performance of the proposed
predictors, the Relative Percentage Error (RPE) given in (Lanfranchi and Bing
2008) is used:
$RPE = \frac{\sum_{i=1}^{L} |\varepsilon_i|}{\sum_{i=1}^{L} G_i}$ (3.16)
where L is the total number of GOPs predicted and $\varepsilon_i$ is the
error corresponding to the i-th predicted GOP, whose size is $G_i$.
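The RPE measure can be computed as in the following sketch. The function name is hypothetical, and the scaling by 100 is an assumption made here because the tabulated RPE values read as percentages.

```python
def relative_percentage_error(actual, predicted):
    """RPE: sum of absolute prediction errors over the sum of actual GOP sizes, in percent."""
    abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * abs_error / sum(actual)
```

For example, actual sizes `[100, 200, 100]` and predictions `[90, 220, 100]` give a total absolute error of 30 over a total size of 400, i.e. an RPE of 7.5.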
Offline traffic prediction
The performance of the SMA and α-SMA predictors is evaluated in
offline video traffic prediction, where the least squares method was used for
determining the coefficients. From the plot in Figure 3.3, which corresponds
to the Tokyo Olympics (MQ) trace, it is seen that the SMA predictor
outperforms the α-SMA predictor for all values of N. Also, at N = 4, the RPE
is minimum for both predictors and lower than the RPE for the linear and
α-linear predictors.
Similarly, in Figure 3.4, the EMA predictor outperforms the α-EMA
predictor, and for 0.25 < δ < 0.68 both these predictors outperform the
linear and α-linear predictors. Also, note that when δ tends to 1, the
performance of the EMA predictor is the same as that of the linear predictor,
and that of the α-EMA predictor is the same as that of the α-linear predictor.
The error statistics for the various offline predictors are tabulated in
Tables 3.2 and 3.3. For each video trace, the statistics reported correspond
to the parameter values that gave minimum RPE. For the EMA and α-EMA
predictors, the value of δ that gave the least RPE is denoted by δmin. The
average values of δmin for these predictors are 0.414 and 0.373 respectively.
The low values of δmin indicate that the past traffic has a large influence on
future predictions. For the SMA and α-SMA predictors, the value of N that
gave minimum RPE is denoted as Nmin. The value of Nmin varies from 1 to 9
for different video traces.
Figure 3.3 Plot of N vs. RPE for Tokyo Olympics (MQ)
Figure 3.4 Plot of δ vs. RPE for Tokyo Olympics (MQ)
Table 3.2 Error Statistics for SMA, EMA and Linear Predictors (offline prediction case)

          SMA Predictor                         EMA Predictor                          Linear Predictor
Quality   Nmin   Mean |ε|   S.D. |ε|   RPE      δmin    Mean |ε|   S.D. |ε|   RPE      Mean |ε|   S.D. |ε|   RPE

Star Wars IV
LQ        5       104.62     125.48    22.689   0.383    104.17     126.20    22.591    105.14     129.64    22.802
MQ        3       153.31     203.39    24.257   0.493    152.73     203.41    24.165    153.56     207.36    24.297
HQ        1       749.92    1012.22    19.453   0.611    749.60    1005.56    19.445    749.92    1012.22    19.453

NBC News
LQ        3       205.72     245.81    22.736   0.54     204.303    247.19    22.578    206.34     254.63    22.804
MQ        3       369.12     454.40    22.794   0.408    367.44     451.82    22.69     371.59     470.37    22.947
HQ        5      1693.33    2041.72    12.233   0.316   1690.1     2037.86    12.209   1706.62    2101.41    12.329

Tokyo Olympics
LQ        4       109.49     130.86    13.91    0.421    109.36     130.84    13.892    109.82     132.73    13.951
MQ        4       180.19     240.03    14.752   0.368    180.08     239.61    14.743    180.89     241.46    14.809
HQ        9       885.37    1317.34    11.41    0.183    884.68    1317.53    11.401    891.16    1317.92    11.485
The small values of Nmin indicate that the simple moving average
based predictors perform well only when the average is computed for a few
previous frames in the local neighborhood.
Online traffic prediction
The online versions of the predictors were tested on different video
traces. For the EMA and α-EMA predictors, the value of δ was taken as 0.15.
For the SMA predictor, N was set to 20, while for the α-SMA predictor, the
value of N was fixed at 4. As expected, it is seen from Tables 3.2, 3.3 and 3.4
that the online versions do not perform as well as their offline counterparts.
Table 3.3 Error Statistics for α-SMA, α-EMA and α-Linear Predictors (offline prediction case)

          α-SMA Predictor                       α-EMA Predictor                        α-Linear Predictor
Quality   Nmin   Mean |ε|   S.D. |ε|   RPE      δmin    Mean |ε|   S.D. |ε|   RPE      Mean |ε|   S.D. |ε|   RPE

Star Wars IV
LQ        5       105.96     125.19    22.979   0.439    105.54     126.61    22.888    106.02     130.87    22.992
MQ        1       155.09     209.80    24.539   0.617    154.85     205.89    24.5      155.09     209.80    24.539
HQ        1       754.20    1022.59    19.564   0.001    753.93     988.34    19.557    754.20    1022.59    19.564

NBC News
LQ        3       207.71     246.06    22.955   0.444    206.34     244.42    22.827    208.12     256.76    23
MQ        3       374.02     453.98    23.097   0.429    372.21     451.79    22.985    374.55     474.80    23.129
HQ        5      1701.43    2040.30    12.291   0.273   1697.17    2028.62    12.26    1712.97    2107.78    12.375

Tokyo Olympics
LQ        4       109.85     131.12    13.976   0.476    109.75     131.44    13.942    110.01     133.46    13.976
MQ        4       180.80     240.88    14.802   0.455    180.66     241.11    14.79     181.23     243.06    14.837
HQ        7       887.55    1323.42    11.438   0.222    891.16    1317.92    11.427    892.51    1324.71    11.502
The NLMS predictor with prediction order 2 was used as the base-line
predictor for performance analysis. The value of the step-size µ that gave
minimum prediction error for the base-line predictor was noted and used as
the step-size for the proposed online predictors. It is clearly seen that the
proposed predictors perform better than the NLMS predictor (even with higher
prediction order). Unlike in offline prediction, the α-predictors perform the
best among all the online predictors, with an average performance improvement
of 10-11% over the base-line predictor.
Surprisingly, the α-SMA predictor gives the best overall
performance in online video traffic prediction. An important advantage of this
predictor is that it has the least complexity among the online predictors
considered and requires only a single variable to be changed after each
prediction step.
Clearly, the predictor is both simple and accurate, and so it is well
suited for real-time VBR traffic prediction.
Table 3.4 Comparison of Performance of Online Predictors (Relative Percentage Error)

          Higher-order Predictors      Base-line   Proposed Predictors                               % Improvement Over
          (NLMS)                       Predictor                                                     Base-line Predictor
                                       (NLMS)
Quality   P=8      P=6      P=4        P=2         α-Linear   SMA      α-SMA    EMA      α-EMA      α-Linear   α-SMA   α-EMA

Star Wars IV
LQ        26.218   25.818   25.580     25.171      22.957     24.287   22.893   24.126   22.886      8.79       9.05    9.08
MQ        28.664   28.278   27.863     27.253      24.482     26.296   24.440   26.076   24.381     10.17      10.32   10.54
HQ        23.916   23.509   23.208     23.797      19.564     22.883   19.654   22.406   19.596     17.79      17.41   17.65

NBC News
LQ        25.809   25.723   25.560     25.742      22.891     24.438   22.720   24.108   22.838     11.08      11.74   11.28
MQ        26.244   26.047   25.822     25.955      23.048     24.683   22.863   24.319   22.900     11.20      11.91   11.77
HQ        15.017   14.538   14.310     13.912      12.362     13.469   12.253   13.231   12.213     11.14      11.92   12.21

Tokyo Olympics
LQ        15.691   15.518   15.358     15.080      13.981     14.693   13.954   14.672   13.963      7.29       7.47    7.41
MQ        17.021   16.898   16.780     16.381      14.853     15.909   14.826   15.886   14.889      9.33       9.49    9.11
HQ        13.506   13.346   13.155     12.692      11.482     12.274   11.436   12.318   11.446      9.53       9.90    9.82

Average                                                                                             10.70      11.02   10.99
3.2.2 VSSNLMS Augmented ARIMA Based Prediction for VBR
Video Traffic
An ARIMA based model augmented by VSSNLMS for real-time
prediction of VBR video traffic is introduced here. The synergy of the two
can successfully address challenges in traffic prediction such as prediction
accuracy, resource management and utilization. Applying ARIMA to a
VBR video trace results in a component-wise representation of the trace,
which is then used for prediction. The step-size-adjusted adaptive linear
predictor (ALP) applied afterwards ensures consistency in error fluctuation
and, in turn, better accuracy.
Performance evaluation of the proposed method is carried out using
RMSE. The average prediction accuracy is improved by 26% and the
average error variance is reduced by 26%. The performance of the proposed
method is thoroughly investigated by applying it to video traces of different
qualities and characteristics.
3.2.2.1 ARIMA process for GOP (16, 1)
ARIMA (Autoregressive Integrated Moving Average) as
introduced in Section 2.1.3.5 is a statistical methodology in time series
analysis which is chiefly used in the forecast or prediction of future terms
based on the characteristics of the past terms. It is the combination of three
components namely the Autoregression (AR), Integration (I) and Moving
Average (MA). The model is generally referred to as an ARIMA (p,d,q)
model, where p, d, and q are integers greater than or equal to zero and refer to
the order of the autoregressive, integrated and moving average parts of the
model, respectively. When one of the terms is zero, it is usual to drop AR, I or
MA. For instance, an I(1) model is ARIMA(0,1,0) and MA(1) model is
ARIMA(0,0,1). The proposed work makes good use of ARIMA for modeling
the frame size process.
The entire work can be divided into two segments. The input to the
first segment is the trace consisting of a frame size sequence. Here, an
ARIMA based mechanism is used to predict the future frame size based on
the past and current frame sizes. It is a well established fact that ARIMA
based predictors perform best for traffic with no seasonality or trend present
in it (Kang et al 2010). So, first of all, the input trace is preprocessed to
remove the seasonality components and trend, if any, to make it fit for
ARIMA. Then the trace is decomposed into several component processes and
represented as a linear combination of its own past values along with the past
values of a newly generated ARIMA process obtained from the original trace,
as done in (Kang et al 2010). The prediction is done over the ARIMA model
to yield the predicted values of the future frame sizes. A comparison between
predicted and actual values is done to evaluate the performance of the model.
First of all the input trace is prepared. For this, the seasonality
components must be removed. Another added advantage of this process is that
the input can be decomposed and expressed in terms of additive components
so that a separate model can be used for each subsequence.
Let $X_t$ be the input frame size sequence of a VBR compressed video
with a regular fixed GOP pattern, denoted by GOP(s,S), where s and S are the
lags between a P frame and the preceding I or P frame and between
consecutive I frames, respectively. The sample process $X_t$ is decomposed as:
$X_t = X_t^s + X_t^S + \varepsilon_t$ (3.17)
where $X_t^s$ and $X_t^S$ denote the seasonal components that appear in
every s-th and S-th sample respectively, and $\varepsilon_t$ is the error
term. Then the differencing operation is performed multiple times for each
lag. The difference orders D and d are set prior to performing the
differencing operation. As indicated earlier, most of the traces used for
experimentation in this work are encoded in the G16B1 pattern. Assuming a
GOP pattern GOP (16, 1), the differenced process $Y_t$ can be formulated as:
$Y_t = (1 - B^1)^d (1 - B^{16})^D X_t$ (3.18)
where B is the backward shift operator, which is widely used in statistics to
make time series expressions more compact and is given by:
$B^k X_t = X_{t-k}$ (3.19)
Thus $(1 - B) X_t$ denotes the differenced time series, $X_t - X_{t-1}$.
Let the ARIMA model under consideration be $(1,1,1)_1 \times (1,1,1)_{16}$, so
that d = D = 1. Equation (3.18) can then be rewritten as:
$Y_t = (1 - B^1)(1 - B^{16}) X_t = (1 - B^1 - B^{16} + B^{17}) X_t$ (3.20)
Expanding using the above convention,
$Y_t = X_t - X_{t-1} - X_{t-16} + X_{t-17}$ (3.21)
In deriving the state space model for ARIMA, we need to represent
$X_t$ as a linear combination of its past values. From Equation (3.21), it is
verifiable that
$X_t = X_{t-1} + X_{t-16} - X_{t-17} + Y_t$ (3.22)
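The differencing step and its inversion can be checked numerically. The sketch below applies the lag-1 and lag-16 differences with d = D = 1 to obtain $Y_t$, and then rebuilds $X_t$ from its past values and $Y_t$ as in Equation (3.22). The function names are hypothetical and the code illustrates only the algebra, not the full ARIMA fitting procedure.

```python
def seasonal_difference(x, season=16):
    """Y_t = X_t - X_{t-1} - X_{t-season} + X_{t-season-1}, for t >= season + 1."""
    lag = season + 1
    return [x[t] - x[t - 1] - x[t - season] + x[t - lag]
            for t in range(lag, len(x))]


def reconstruct(x_history, y_t, season=16):
    """X_t = X_{t-1} + X_{t-season} - X_{t-season-1} + Y_t, as in Equation (3.22).

    x_history holds X_0 .. X_{t-1} and needs at least season + 1 entries.
    """
    return (x_history[-1] + x_history[-season]
            - x_history[-(season + 1)] + y_t)
```

The first 17 samples are consumed by the differencing, so the series $Y_t$ starts at t = 17 for the 16-frame GOP pattern.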
The differenced process $Y_t$ is a multiplicative ARMA process
represented as ARMA $(1,1)_1 \times (1,1)_{16}$. This can be represented as
$(1 - \phi B^1)(1 - \Phi B^{16}) Y_t = (1 + \theta B^1)(1 + \Theta B^{16}) \varepsilon_t$ (3.23)
where φ and Φ are the coefficients of the autoregressive expression and θ and
Θ are the coefficients of the moving average expression.
To make the above equation more manageable, a new AR process
$Z_t$ is introduced as:
$(1 - \phi B^1)(1 - \Phi B^{16}) Z_t = \varepsilon_t$ (3.24)
Now, $Y_t$ can be rewritten in terms of $Z_t$ as
$Y_t = Z_t + \theta Z_{t-1} + \Theta Z_{t-16} + \theta \Theta Z_{t-17}$ (3.25)
Equation (3.22) can be rewritten by substituting for $Y_t$ using Equation (3.25)
as
$X_t = \tilde{X}_t + \tilde{Z}_t$ (3.26)
where $\tilde{X}_t = X_{t-1} + X_{t-16} - X_{t-17}$ and
$\tilde{Z}_t = Z_t + \theta Z_{t-1} + \Theta Z_{t-16} + \theta \Theta Z_{t-17}$ (3.27)
Thus, the terms $X_{t-1}$, $X_{t-16}$, $X_{t-17}$, $Z_{t-1}$, $Z_{t-16}$ and
$Z_{t-17}$ are enough to express $X_t$. However, as these terms are several
lags apart, all the terms $X_{t-1}$ through $X_{t-17}$ and $Z_t$ through
$Z_{t-17}$ have to be retained to represent $X_t$ properly.
Equation (3.26) is the governing equation of the ARIMA model and
hence of the entire prediction process. It is used to generate the predicted
frame size sequence. The error is measured by taking the difference between
the actual value and the predicted value in each step.
3.2.2.2 Proposed methodology
The second segment of the work chiefly comprises the application of
VSSNLMS (variable step-size NLMS) to the result obtained in the first segment.
Two major modifications to the traditional NLMS are proposed here.
First, a threshold is chosen for the prediction error based on the
characteristics of the input trace. This can keep the error values from
varying beyond the threshold in either direction, which results in a lower
error variance. Because the fluctuation of the error is curtailed by the
threshold value, network management gains considerable freedom in
allocating resources. This is shown schematically in Figure 3.5.
Next, the policy of using a variable step-size in NLMS has been
adopted. Adaptive Linear Prediction (ALP) is a basic and well-known technique
in the fields of signal processing and forecasting. Many studies have centered
on ALP and have paved the way for novel prediction methodologies that vary
in effectiveness, ease of use, accuracy and so on. It is a linear prediction
mechanism where the predicted value can be expressed as a linear
combination of a certain number of past values of the same process.
Figure 3.5 Schematic Diagram of the Proposed Work
3.2.2.3 LMS algorithm
This algorithm belongs to the class of adaptive linear predictors,
where the predictor coefficients (also called weights) are updated at every
prediction step based on the prediction error.
The fundamental equation that governs the p-th order linear predictor
is expressed as:
$\hat{X}_{t+N} = \sum_{l=0}^{p-1} w_t(l) X_{t-l} = w_t^T X_t$ (3.28)
where $w_t = (w_t(0), \ldots, w_t(p-1))^T$ denotes the prediction filter
coefficient vector which minimizes the mean square error. Parameter p
indicates the number of past values used for prediction.
[Figure 3.5 block labels: Original trace; ARIMA trace; Threshold; Equation to
compute 'next' µ; Algorithm to compute 'next' µ; Step-Size-Adjusted ALP;
Predicted process]
Let $e_t = X_{t-N} - \hat{X}_{t-N}$ be the prediction error at the t-th
instant. The new weights for the prediction at time instant 't+1' are given by
$W(t+1) = W(t) + \mu e(t) X(t)$ (3.29)
where µ is the step-size, with $\mu \in (0, 2]$ (Haykin 1991). It is usually
fixed throughout the prediction process, but it is difficult to choose an
optimal step-size. A higher step-size contributes to faster convergence but
poorer performance (i.e. high prediction error). Conversely, a smaller
step-size leads to slower convergence but provides better performance.
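A single LMS weight update of the form in Equation (3.29) can be sketched as below. The function name and the explicit `target` argument (the actual value that the prediction is compared against) are assumptions for illustration.

```python
def lms_step(weights, inputs, target, mu):
    """One LMS update: predict, measure the error, move the weights along mu*e*X."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = target - prediction
    new_weights = [w + mu * error * x for w, x in zip(weights, inputs)]
    return new_weights, prediction, error
```

Note that the correction scales with the magnitude of the input vector, which is why a step-size that suits one trace can be too aggressive for a trace with much larger frame sizes.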
It has been proved that LMS will converge in the mean if µ satisfies the
condition $0 < \mu < \frac{2}{\lambda_{max}}$, where $\lambda_{max}$ is the
largest eigenvalue of the autocorrelation matrix of the input.
The optimal weights for the LMS predictor can be obtained by
solving the Wiener-Hopf equations, which requires knowledge of the
autocorrelation function of the whole trace (Haykin 1991). But for online
prediction, the whole trace is not known in advance. So, the weights obtained
in this way only give a theoretical minimum on the mean square error.
3.2.2.4 NLMS algorithm
Unlike LMS predictors, the NLMS predictors have a different
weight update equation, as given below. Starting with an initial estimate of
the filter coefficient vector w, for each new data point the ALP method
updates $w_t$ using the recursive equation
$W_{t+1} = W_t + \frac{\mu e_t X_t}{\|X_t\|^2}$ (3.30)
NLMS has an advantage over LMS in terms of its sensitivity to the
step-size µ, i.e. NLMS is relatively less sensitive to changes in µ. Moreover,
NLMS will converge in the mean when 0 < µ < 2 (Haykin 1991). A larger µ
results in larger prediction error and a smaller µ results in smaller
prediction error. On the other hand, the convergence rate is high for larger
values of µ and low for smaller values of µ.
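The NLMS recursion of Equation (3.30) can be sketched as a one-step-ahead predictor over a trace. This is a minimal illustration with a fixed step-size: the function name, the zero initial weights and the restriction to one-step-ahead prediction are assumptions made for the example.

```python
def nlms_predict(trace, order, mu):
    """One-step-ahead NLMS prediction with a fixed step-size mu.

    Returns the predictions of trace[t + 1] and the final weight vector.
    """
    weights = [0.0] * order  # assumed initial estimate
    predictions = []
    for t in range(order - 1, len(trace) - 1):
        x = [trace[t - l] for l in range(order)]  # X_t = (x_t, ..., x_{t-p+1})
        prediction = sum(w * xi for w, xi in zip(weights, x))
        predictions.append(prediction)
        error = trace[t + 1] - prediction
        norm_sq = sum(xi * xi for xi in x)
        if norm_sq > 0:  # skip the update for an all-zero input (assumption)
            weights = [w + mu * error * xi / norm_sq
                       for w, xi in zip(weights, x)]
    return predictions, weights
```

Dividing by the squared norm of the input makes the size of the correction independent of the scale of the frame sizes, which is consistent with the lower sensitivity to µ noted above.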
During the process of updating the weight vector of generic NLMS,
the step-size parameter µ is generally fixed. This will be referred to as
FSSNLMS (fixed step-size NLMS). But it has been verified in (Zhao et al 2002)
that the use of a variable step-size (VSS) for the prediction of each value
can result in better prediction accuracy. Presented here is a variable
step-size prediction technique along with the incorporation of a threshold
value to reduce the error variance.
The NLMS mechanism is altered to include a VSS value at every
prediction step. To derive the value of the step-size to be used for predicting
the next frame size, a modified version of the variable step-size prediction
equation in (Zhao et al 2002) is used. The equation is as follows:
$\mu_{k+1} = \alpha \mu_k + \gamma (q_1 e_k^2 + q_2 e_{k-1}^2)$ (3.31)
where α, γ, $q_1$ and $q_2$ are constants having values determined in
(Zhao et al 2002) as 0.98, 0.015, 0.7 and 0.3 respectively.
The above equation does not work well for traces with large frame
sizes. Thus the above equation is modified as below.
$\mu_{k+1} = \alpha \mu_k + \gamma \frac{q_1 e_k^2 + q_2 e_{k-1}^2}{e_k^2 + e_{k-1}^2}$ (3.32)
where $\mu_i$ and $e_i$ represent the step-size and the prediction error
respectively for the i-th value in the trace. This change can be explained by
the observation that the original equation is largely unfit for the huge frame
sizes generally present in high quality video frame size sequences. The
step-size for the (k+1)-th prediction instance ($\mu_{k+1}$) is computed using
Equation (3.32) if the prediction error in the k-th prediction instance is
less than a predefined threshold T. Otherwise, the value of $\mu_{k+1}$ is
determined by the algorithm given in Figure 3.6.
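The step-size selection described above can be sketched as follows. This is a hedged illustration rather than the implementation used in the experiments. The function name is hypothetical, the error is compared with T by magnitude, and the two fallbacks (a zero denominator, and no earlier error below T) are assumptions added so that the sketch always returns a value.

```python
def next_step_size(mu_history, err_history, threshold,
                   alpha=0.98, gamma=0.015, q1=0.7, q2=0.3):
    """Choose mu_{k+1} from the step-sizes and errors of instances 1..k.

    mu_history[i] and err_history[i] belong to the same prediction
    instance; both lists need at least two entries.
    """
    mu_k = mu_history[-1]
    e_k, e_prev = err_history[-1], err_history[-2]
    if abs(e_k) < threshold:
        energy = e_k ** 2 + e_prev ** 2
        if energy == 0.0:
            return alpha * mu_k  # assumption: drop the undefined ratio
        # Normalized variable step-size update, Equation (3.32).
        return alpha * mu_k + gamma * (q1 * e_k ** 2 + q2 * e_prev ** 2) / energy
    # Otherwise reuse the most recent step-size whose error stayed below T.
    for mu_i, e_i in zip(reversed(mu_history), reversed(err_history)):
        if abs(e_i) < threshold:
            return mu_i
    return mu_k  # assumption: keep the current step-size if none qualifies
```

Because the normalized term in Equation (3.32) is a weighted average of $q_1$ and $q_2$, the increment added to $\alpha \mu_k$ stays between γ$q_2$ and γ$q_1$ whatever the scale of the frame sizes.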
Algorithm: Find_µ_{k+1}
    for (i = k; i > 0; i--)
        if (e_i < T)
        {
            µ_{k+1} = µ_i
            break;
        }
    end for
Figure 3.6 Algorithm for Computing µ_{k+1}
3.2.2.5 Experimental results
A variety of analyses have been performed here for video traffic
of varying quality, using 10 minute traces taken from the movie Star Wars IV
and from NBC News. The two traces analyzed here are taken from videos of
different kinds: the movie Star Wars IV has rapid scene changes and
insignificant correlations, whereas NBC News possesses a relatively
high degree of correlation. Using both traces shows that the procedure
explained here yields good results regardless of the broad type of video used
as input.
The general attributes of the input video traces are as follows. All
the traces used in the experiments are encoded using H.264 or MPEG-4. Each
of them is 10 minutes long and contains a total of 18000 frames at 30 frames
per second. The GOP pattern is G16B1 and the resolution is CIF (352 × 288).
The frame size statistics of the individual input video traces used
for analysis are summarized in Table 3.5.
Table 3.5 Frame Size Statistics of Input Video Traces

Name           Quality   Max Frame Size   Min Frame Size   Avg Frame Size   Burstiness
Star Wars IV   Low             13344              168           699.58         19.06
               Medium          82152              168          6332.6          12.97
               High           476912            19600        111352             4.28
NBC News       Low             19024              168          1553.7          12.25
               Medium         140040              168         19227             7.28
               High           706032            61056        228174            11.56
A comparison of the three prediction schemes, traditional FSSNLMS,
ARIMA, and ARIMA augmented by VSSNLMS, is carried out using RMSE as the
performance measure. The results obtained are summarized in Table 3.6.
It can be observed from the table that ARIMA combined with variable
step-size NLMS performs best for all of the input samples. Also, ARIMA
is superior to the traditional FSSNLMS. The method presented here, i.e.,
ARIMA augmented with VSSNLMS, can therefore be expected to perform well
for input traces that differ substantially in quality. As evident from
Table 3.6, the average performance improvement of the proposed method
over FSSNLMS is 26.97% and over standard ARIMA it is 16.94%. When traces
with strongly contrasting characteristics (Star Wars IV and NBC News)
were tested with the method, the results were found to be satisfactory,
which supports the claim that the proposed method performs better than
both baselines and can handle traffic of different genres efficiently.
Table 3.6 Performance Analysis by RMSE
                       RMSE                              Percentage of Performance Improved
Input trace   Quality  FSSNLMS  ARIMA   ARIMA augmented  ARIMA over  ARIMA augmented with  ARIMA augmented with
                                        with VSSNLMS     FSSNLMS     VSSNLMS over FSSNLMS  VSSNLMS over ARIMA
Star Wars IV  Low      4.64     3.87    3.13             16.59       32.54                 23.64
Star Wars IV  Medium   9.21     8.29    6.47             9.98        18.89                 10.97
Star Wars IV  High     14.29    11.76   10.40            17.70       27.22                 13.07
NBC News      Low      4.83     3.75    3.24             22.36       32.91                 15.74
NBC News      Medium   7.87     6.72    5.12             14.61       29.86                 21.73
NBC News      High     13.98    12.96   10.12            7.29        20.45                 16.54
Average performance improvement                          14.75       26.97                 16.94
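For reference, the two quantities reported in Table 3.6 can be computed as sketched below. RMSE is the standard root mean square error. The improvement formula, (baseline − proposed) / baseline × 100, is an assumption that appears consistent with the "ARIMA over FSSNLMS" column. Function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percent_improvement(baseline, proposed):
    """Relative reduction of an error measure, in percent (assumed formula)."""
    return (baseline - proposed) / baseline * 100.0


# RMSE values for the low-quality Star Wars IV trace, from Table 3.6.
fssnlms_rmse, arima_rmse = 4.64, 3.87
print(f"{percent_improvement(fssnlms_rmse, arima_rmse):.2f}")
```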
A detailed analysis was performed to assess how effectively the
provision for specifying a threshold keeps the prediction error
consistent. Performance analysis is done using the error variance as the
measure. Throughout the analysis, the error threshold is kept at 10
percent of the average frame size. It has been observed that the method
described here gives the minimum variance in prediction error, and
outperforms both ARIMA and the traditional FSSNLMS method. This is
quantified in Table 3.7. The best average improvement, that of ARIMA
with VSSNLMS over FSSNLMS, is 26.87%.
Table 3.7 Performance Analysis by Error Variance
                       Variance                            Percentage of Performance Improved
Input trace   Quality  FSSNLMS     ARIMA       ARIMA with  ARIMA over  ARIMA with VSSNLMS  ARIMA with VSSNLMS
                                               VSSNLMS     FSSNLMS     over FSSNLMS        over ARIMA
Star Wars IV  Low      14517.414   12026.137   10023.76    17.16       30.95               16.65
Star Wars IV  Medium   56156.529   50700.641   45510.812   9.72        18.96               10.24
Star Wars IV  High     181345.601  162231.823  137691.16   10.54       24.07               15.13
NBC News      Low      11610.331   9467.55     7935.801    18.46       31.65               16.18
NBC News      Medium   13007.78    12752.034   10401.548   1.97        20.04               18.43
NBC News      High     186534.925  156091.6    120225.71   16.32       35.55               22.98
Average performance improvement                            12.36       26.87               16.60
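The error-variance measure and the rule of setting the threshold to 10 percent of the average frame size can be sketched as follows. The function names are hypothetical, and the population variance is used here as an assumption, since the text does not say whether the population or the sample variance was computed.

```python
from statistics import fmean, pvariance


def error_threshold(frame_sizes, fraction=0.10):
    """Threshold T as a fraction of the average frame size of the trace."""
    return fraction * fmean(frame_sizes)


def error_variance(actual, predicted):
    """Variance of the prediction errors (population variance assumed)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return pvariance(errors)


# Tiny made-up trace, for illustration only.
frames = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 310.0]
print(error_threshold(frames), error_variance(frames, forecast))
```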
Figure 3.7 shows the prediction error obtained with the proposed
method for the high-quality trace of the movie Star Wars IV. The
threshold value was fixed at 20000, and it can be observed that the
fluctuation in error is limited by the threshold.
Figure 3.7 Fluctuation of Error (High Quality – Star Wars-IV)
3.3 SUMMARY
Some simple and accurate predictors for real-time MPEG-4 GOP
traffic prediction have been implemented and their performance has been
thoroughly analyzed. Performance analysis of the proposed methods is
carried out using RPE. Extensive simulation results show that the
proposed predictors outperform the standard NLMS predictor in online
prediction. To validate these claims, a wide variety of experiments were
conducted on traces of different qualities and types. The proposed
techniques are both simple and accurate, and are suitable for real-time
prediction. The average performance improvement achieved over the
baseline predictor is up to 11%.
An ARIMA-based mechanism augmented by VSSNLMS for the prediction of
VBR video traffic was devised and implemented. The variable step size
for NLMS has been implemented using a modified equation. Upon evaluating
the performance using RMSE, an average improvement of 16.94% over
standard ARIMA was observed (Table 3.6). Also, the error variance was
reduced by 26.87% relative to FSSNLMS (Table 3.7). The methodology was
validated through a detailed analysis of VBR video traces of different
qualities taken from two different video files. The provision for
specifying a threshold value effectively limits the fluctuation in
prediction errors. Here, the threshold value was fixed empirically based
on the characteristics of the input trace.