Nonstationarities in teletraffic data which may spoil your statistical tests Piotr Żuraniewski (UvA/TNO/AGH) Felipe Mata (UAM), Michel Mandjes (UvA), Marco Mellia (POLITO)


Page 1: Nonstationarities in teletraffic data which may spoil your statistical tests

Nonstationarities in teletraffic data which may spoil your statistical tests

Piotr Żuraniewski (UvA/TNO/AGH)

Felipe Mata (UAM), Michel Mandjes (UvA), Marco Mellia (POLITO)

Page 2: Nonstationarities in teletraffic data which may spoil your statistical tests

Stationarity

• Many models assume stationarity: statistical properties do not change over time
  – strong stationarity: all statistical properties remain the same over time
  – weak stationarity: statistical properties up to second order (mean, variance, covariance) remain unchanged
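In symbols, the two notions can be written as follows. This is the standard textbook formulation, not taken from the slides; the process $(X_t)$, the shift $h$, the lag $k$ and the autocovariance function $\gamma$ are notation introduced here.

```latex
% Strong (strict) stationarity: every finite-dimensional distribution is shift-invariant
(X_{t_1}, \dots, X_{t_n}) \overset{d}{=} (X_{t_1+h}, \dots, X_{t_n+h})
\qquad \text{for all } n,\ t_1, \dots, t_n,\ h .

% Weak (second-order) stationarity: constant mean, finite variance,
% and covariance depending only on the lag k
\mathbb{E}[X_t] = \mu, \qquad
\operatorname{Var}(X_t) = \gamma(0) < \infty, \qquad
\operatorname{Cov}(X_t, X_{t+k}) = \gamma(k)
\qquad \text{for all } t,\ k .
```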

Page 3: Nonstationarities in teletraffic data which may spoil your statistical tests

Nonstationarity – problems

• Real life: things are changing…

• Bad news: stationarity of a sample cannot be positively verified

• Best answer we can get: ‘we found no evidence of a given type of nonstationarity’

• Some examples:
  – mean shift
  – polynomial deterministic trend
  – variance change
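The three example types can be illustrated with synthetic data. The sketch below is only an illustration under assumed settings: Gaussian white noise as the base signal, and the function names and default parameters are hypothetical, not from the slides.

```python
import random


def mean_shift(n, change_at, mu0=0.0, mu1=2.0, seed=0):
    """White noise whose mean jumps from mu0 to mu1 at index change_at."""
    rng = random.Random(seed)
    return [rng.gauss(mu0 if t < change_at else mu1, 1.0) for t in range(n)]


def polynomial_trend(n, coeffs=(0.0, 0.01), seed=0):
    """White noise plus the deterministic trend sum_k coeffs[k] * t**k."""
    rng = random.Random(seed)
    return [
        sum(c * t ** k for k, c in enumerate(coeffs)) + rng.gauss(0.0, 1.0)
        for t in range(n)
    ]


def variance_change(n, change_at, sigma0=1.0, sigma1=3.0, seed=0):
    """Zero-mean noise whose standard deviation jumps at index change_at."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma0 if t < change_at else sigma1) for t in range(n)]
```

Each function returns a plain list, so the series can be passed directly to any test whose behaviour under a given type of nonstationarity is to be checked.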

Page 4: Nonstationarities in teletraffic data which may spoil your statistical tests

Example

• Change in the number of users in a VoIP system

• Model: load change in an M/G/inf queue

• Sample ACF suggests very high correlation
  – slow decay?
  – long-range dependence?

[Figure: two plots: number of users versus time, and sample ACF versus lag (lags 0 to 20)]
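A load change in an M/G/inf system can be simulated in a few lines. The sketch below is a minimal illustration, not the authors' setup: it assumes exponential service times (any distribution G could be substituted), a single jump of the Poisson arrival rate at `change_time`, and sampling at integer times; all names and parameters are hypothetical.

```python
import bisect
import random


def poisson_arrivals(rng, start, end, rate):
    """Arrival epochs of a homogeneous Poisson process on [start, end)."""
    times = []
    t = start + rng.expovariate(rate)
    while t < end:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def simulate_mg_inf(horizon, change_time, rate_before, rate_after,
                    mean_service=1.0, warmup=20.0, seed=0):
    """Number of users in the system at times 0, 1, ..., horizon - 1.

    The arrival rate jumps from rate_before to rate_after at change_time.
    Arrivals start at -warmup so that the system is close to its
    stationary regime at time 0.
    """
    rng = random.Random(seed)
    # Both pieces are sorted and the first ends before the second starts,
    # so the concatenation is sorted as well.
    arrivals = (poisson_arrivals(rng, -warmup, change_time, rate_before)
                + poisson_arrivals(rng, change_time, horizon, rate_after))
    # Assumption: exponential service times with the given mean.
    departures = sorted(a + rng.expovariate(1.0 / mean_service) for a in arrivals)
    counts = []
    for s in range(horizon):
        arrived = bisect.bisect_right(arrivals, s)
        departed = bisect.bisect_right(departures, s)
        counts.append(arrived - departed)  # users present at time s
    return counts
```

In the stationary regime the number of users is Poisson with mean equal to the arrival rate times the mean service time, so for example `simulate_mg_inf(200, 100, 300.0, 400.0)` produces a series whose level moves from about 300 to about 400 users.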

Page 5: Nonstationarities in teletraffic data which may spoil your statistical tests

Example

[Figure: two plots: number of users versus time, and sample ACF versus lag (lags 0 to 20)]

• The changepoint detection procedure we developed makes it possible to separate the parts with different load

• There is no significant correlation in either of these parts

• The sample ACF does not estimate the ACF when the data are nonstationary

[Figure: sample ACF versus lag (lags 0 to 20)]
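The effect is easy to reproduce with synthetic data. The sketch below assumes two parts of independent observations at levels 300 and 400, with a normal approximation to Poisson counts; the levels and function names are illustrative assumptions. The sample ACF of the concatenated series is large at all small lags, while the sample ACF of each part separately stays close to zero.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator, divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / (n * c0)
        for k in range(max_lag + 1)
    ]


def two_level_series(n_per_part, low=300.0, high=400.0, seed=0):
    """Two parts of independent observations with different means.

    Assumption: Poisson counts are approximated by normal variables
    with variance equal to the mean.
    """
    rng = random.Random(seed)
    part1 = [rng.gauss(low, math.sqrt(low)) for _ in range(n_per_part)]
    part2 = [rng.gauss(high, math.sqrt(high)) for _ in range(n_per_part)]
    return part1, part2


if __name__ == "__main__":
    p1, p2 = two_level_series(100)
    print("whole series:", [round(r, 2) for r in sample_acf(p1 + p2, 5)])
    print("first part:  ", [round(r, 2) for r in sample_acf(p1, 5)])
    print("second part: ", [round(r, 2) for r in sample_acf(p2, 5)])
```

The high values for the whole series come entirely from the shift in the mean: the global sample mean lies between the two levels, so deviations within each part share the same sign and their products are systematically positive.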

Page 6: Nonstationarities in teletraffic data which may spoil your statistical tests

Changepoint detection

• A window of 50 samples is presented to the detection procedure

• Add the newest observation, drop the oldest, and repeat the detection procedure

• In this example: true change in window number 51

• Changepoint detection works well – see output of 500 experiments

[Figure: two plots: detection ratio versus window number, and number of users versus time]
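The sliding-window scheme can be sketched as follows. The slides do not specify the detection procedure, so the code substitutes a simple stand-in: a scan over all split points of the window with a two-sample t-type statistic for a mean shift, and a threshold calibrated by Monte Carlo on Gaussian noise. The levels 300 and 400, the normal approximation to Poisson counts, and all names are assumptions made for illustration.

```python
import math
import random


def scan_statistic(x, min_seg=5):
    """Largest two-sample t-type statistic over all split points of x."""
    w = len(x)
    total = sum(x)
    total_sq = sum(v * v for v in x)
    best = left = left_sq = 0.0
    for k in range(1, w):
        left += x[k - 1]
        left_sq += x[k - 1] ** 2
        if k < min_seg or w - k < min_seg:
            continue
        m1 = left / k
        m2 = (total - left) / (w - k)
        # Pooled within-segment sum of squares.
        ss = (left_sq - k * m1 * m1) + (total_sq - left_sq - (w - k) * m2 * m2)
        var = max(ss / (w - 2), 1e-12)
        t = abs(m1 - m2) / math.sqrt(var * (1.0 / k + 1.0 / (w - k)))
        best = max(best, t)
    return best


def calibrate_threshold(window=50, alpha=0.01, n_sim=2000, seed=0):
    """Empirical (1 - alpha) quantile of the statistic under 'no change'."""
    rng = random.Random(seed)
    stats = sorted(
        scan_statistic([rng.gauss(0.0, 1.0) for _ in range(window)])
        for _ in range(n_sim)
    )
    return stats[min(n_sim - 1, int((1 - alpha) * n_sim))]


def detection_ratio(n_exp, threshold, n=200, change_at=100,
                    mu0=300.0, mu1=400.0, window=50, seed=1):
    """Fraction of experiments in which each window raises an alarm."""
    rng = random.Random(seed)
    n_windows = n - window + 1
    hits = [0] * n_windows
    for _ in range(n_exp):
        # Assumption: normal approximation to Poisson counts.
        x = [
            rng.gauss(mu0, math.sqrt(mu0)) if t < change_at
            else rng.gauss(mu1, math.sqrt(mu1))
            for t in range(n)
        ]
        for i in range(n_windows):
            # Window i holds samples i .. i + window - 1: one new, one dropped.
            if scan_statistic(x[i:i + window]) > threshold:
                hits[i] += 1
    return [h / n_exp for h in hits]
```

With `threshold = calibrate_threshold()` and `detection_ratio(500, threshold)`, the ratio should stay near the false alarm level for windows that lie entirely before the change and rise towards one once the window contains enough post-change samples.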

Page 7: Nonstationarities in teletraffic data which may spoil your statistical tests

Changepoint detection

• However, if we add a deterministic trend, things go wrong

• Observe the high false alarm ratio after polluting the data with a trend

[Figure: two plots: data with added trend versus time, and detection ratio versus window number]
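The false alarm effect can be reproduced with any detector that looks for a mean shift. The sketch below uses a deliberately simple stand-in, not the authors' procedure: it compares the means of the two halves of each window with a pooled two-sample t statistic. The threshold 2.68 is roughly the two-sided 1% critical value of a t distribution with 48 degrees of freedom; the level 300, the slope and all names are illustrative assumptions. The data contain no changepoint at all, only a linear trend.

```python
import math
import random


def half_split_alarm(window, threshold=2.68):
    """Alarm if the means of the two window halves differ significantly."""
    h = len(window) // 2
    a, b = window[:h], window[h:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    var = ss / (len(window) - 2)  # pooled variance
    t = abs(ma - mb) / math.sqrt(var * (1.0 / len(a) + 1.0 / len(b)))
    return t > threshold


def false_alarm_ratio(slope, n_exp=200, n=150, window=50, mean=300.0, seed=0):
    """Fraction of windows raising an alarm when there is no changepoint.

    The data are mean + slope * t plus noise, so every alarm is false.
    """
    rng = random.Random(seed)
    alarms = total = 0
    for _ in range(n_exp):
        x = [mean + slope * t + rng.gauss(0.0, math.sqrt(mean)) for t in range(n)]
        for i in range(n - window + 1):
            alarms += half_split_alarm(x[i:i + window])
            total += 1
    return alarms / total
```

Comparing `false_alarm_ratio(0.0)` with `false_alarm_ratio(1.0)` shows the problem: a trend of one user per time step moves the mean of the second half of each window about 25 units above the first, which the test cannot distinguish from a genuine shift.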

Page 8: Nonstationarities in teletraffic data which may spoil your statistical tests

Work in progress

• Real VoIP data from Italian service provider and aggregated IP data from Spanish university backbone network

• Current research: estimate and remove trend from traffic

• Only then apply changepoint detection procedure(s)

[Figure: time series plot; horizontal axis from 1.2912 to 1.293 (x 10^9), vertical axis from 0 to 900]

Page 9: Nonstationarities in teletraffic data which may spoil your statistical tests

Work in progress

• Trend estimation methods:
  – moving average?
  – kernel/wavelet smoothing?
  – parametric methods?
  – time series regression?

• How to judge if estimated trend is really significant?

• Models other than M/G/inf?
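As an illustration of the first candidate in the list above, a centered moving average can serve as a trend estimate that is subtracted before any changepoint test is applied. This is only a sketch of one option, not the method chosen in this work; the function names and the `half_width` parameter are hypothetical.

```python
def moving_average(x, half_width):
    """Centered moving average; the averaging window shrinks near the edges."""
    n = len(x)
    out = []
    for t in range(n):
        lo = max(0, t - half_width)
        hi = min(n, t + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def detrend(x, half_width):
    """Return (residuals, trend), where trend is the moving-average estimate."""
    trend = moving_average(x, half_width)
    residuals = [v - m for v, m in zip(x, trend)]
    return residuals, trend
```

The choice of `half_width` matters: a short window follows the data so closely that a genuine level shift is absorbed into the estimated trend, while a long window leaves part of the trend in the residuals. This is one reason why judging the significance of an estimated trend remains an open question here.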

Page 10: Nonstationarities in teletraffic data which may spoil your statistical tests

Conclusions

• Different types of nonstationarities may severely influence statistical tests or values of estimators

• Even if we try to detect one type of nonstationarity, the other type may ruin our original test

• We always have to pay attention to the assumptions of the theorems used

• Share your experience!