[IEEE 2013 International Carnahan Conference on Security Technology (ICCST) - Medellin, Colombia...

Influence of HEVC Compression on Event Detection in Security Video Sequences

Stanislav Vitek, Lukas Krasula, Milos Klima, Vojtech Hvezda

Czech Technical University in Prague

Technicka 2, 166 27 Prague 6, Czech Republic

{viteks,krasuluk,klima}@fel.cvut.cz

Marcelo Herrera Martinez

Universidad de San Buenaventura

Carrera 8H, Bogota, Colombia

mherrera@usbbog.edu.co

Abstract-In this paper, the impact of the degree of compression by High Efficiency Video Coding (HEVC) on observers' ability to detect certain events in videos obtained by outdoor CCTV cameras is studied by extensive subjective testing. The testing was also performed for H.264/MPEG-4 Part 10 AVC compression at similar bit rates; comparing the detection capabilities showed HEVC to be superior. A threshold for the encoder setting is also proposed, beyond which increasing the bit rate or decreasing the quantization parameter (QP) does not improve observers' ability to detect the events.

Keywords-Closed Circuit Television (CCTV), Event Detection, High Efficiency Video Coding (HEVC), Security, Video Compression, H.264/MPEG-4 Part 10 AVC

I. INTRODUCTION

In recent years the importance of Closed Circuit Television (CCTV) has grown significantly and it is being used for various purposes. Its first generation was analogue and recorded on video cassettes. This placed heavy demands on storage space, and the tapes had to be changed manually, which was prone to mistakes (e.g. overwriting). The true potential of CCTV could not be utilized until the era of digitalization, which brought brand new possibilities. Recordings could be stored on hard drives, the resolution was no longer strictly limited, searching through the data became much faster and easier, the system could be connected to the internet, etc.

These new advantages led to the development of so-called smart CCTV [1]. These systems are able to analyze the video, decide whether an event is unusual, unexpected or possibly dangerous, and notify the operator. Demands on the operator's concentration are thus significantly lowered. Smart CCTV systems can focus on various events. Nam and Tewfik [2] proposed a system for detection of tampering, methods for detecting human activity from video were proposed by Lv et al. [3] and by Ribeiro and Santos-Victor [4], and Kettnaker and Zabih developed an algorithm for counting people from records obtained by multiple cameras [5]. Some systems are also able to detect several of these events at the same time (e.g. the system proposed by Nam et al. [6]).

The biggest problem of CCTV videos is their quality and how to evaluate it. The traditional ways of evaluating video quality are not sufficient because in security applications the required quality depends on the expected outcome. The video does not need to be of good subjectively perceived quality as long as the quality is sufficient for the specific task (e.g. face identification, car plate recognition, event detection, etc.).

The main aspects influencing the quality are the lighting of the scene, the camera, video compression, quality of the transmission channel, and the display. All of them were investigated in detail by Smith et al. [7]. In the scope of this paper, only the compression is taken into account.

The choice of the compression is very important because security cameras mostly record for very long time periods, which means a huge amount of data to be stored. For that reason CCTV recordings are often heavily compressed, and that brings massive quality degradation. Some of the compression techniques used in the field of security were evaluated by Klima et al. [8]. The impact of compression of CCTV videos on the ability to identify a person's face was investigated by Kovesi [9] and by Keval and Sasse [10], who propose the use of DCT-based compression over wavelet-based compression for these purposes. Kovesi also points out that the color information is distorted due to quantization and therefore the importance of pigmentation is lost.

So far the most advanced compression technique used was H.264/MPEG-4 Part 10 [11]. In this paper the authors investigate the influence of the High Efficiency Video Coding (HEVC) compression standard, developed by the Joint Collaborative Team on Video Coding (JCT-VC), on event detection in CCTV videos by extensive subjective tests. The superiority of HEVC over H.264 was verified by Hanhart et al. [12].

The paper is organized as follows: Section II describes the subjective test procedure, Section III presents the dataset used, the test environment and equipment are discussed in Section IV, Section V shows the results and Section VI concludes the paper.

II. TESTING PROCEDURE

A. Subjects

32 observers participated in the subjective test. All the participants were students from the Faculty of Electrical Engineering, Czech Technical University in Prague. Most of the students were male. Every subject had normal or corrected-to-normal vision. None of them had previous experience with security video surveillance, but each had experience with watching videos of various degrees of compression.

Untrained non-professional observers could be used because CCTV surveillance systems are no longer exclusively a matter for police and professional security companies. The current cost of these systems makes them available even to private owners, and the amount of security equipment sold grows every year.

Plenty of CCTV videos are also available online (e.g. some towns and cities provide their security camera recordings on their websites 24/7). US web users are, for example, encouraged to watch the Texas-Mexico border via their web browsers and report any suspicious situations¹. A similar example can be found in east London, where subscribers to the community safety channel are able to view the digital CCTV video on their televisions².

For these reasons, it is necessary to conduct subjective tests not only with trained professionals but also with naive observers.

B. Procedure

There are plenty of standardized methodologies for subjective testing. Probably the most important ones, such as Double Stimulus Impairment Scale (DSIS) or Double Stimulus Continuous Quality Scale (DSCQS) and many others, can be found in ITU-R Recommendation BT.500-12 [13]. However, all of these procedures are defined for the assessment of perceived quality in the classical sense. For the purpose of this paper a different method was needed.

Before the beginning of the test an introductory presentation was given, in which the purpose of the study was explained, the parameters of the test and the procedure were summarized, and some important notes on what is allowed and prohibited were stressed. Every observer was supposed to watch 20 videos. The duration of every video was 10 seconds. All of the subjects were given an evaluation sheet, according to the group they were assigned to, with 20 situation descriptions and the questions that followed.

Before watching the video, the participant was supposed to read the description of the situation. Typically, the person requiring the observer's attention was described (e.g. "person in a black coat crossing the road"). Then the subject watched the video and afterwards had to answer a few questions (e.g. "is the person carrying something"). In this way they were supposed to answer all of the questions for all the videos.

Subjects were not allowed to watch videos more than once or to stop the videos. They were also instructed not to guess and to enter an answer only when they were sure about it. Otherwise they were supposed to mark the situation as undetectable.

¹ BBC News, "Web users to patrol US border," http://news.bbc.co.uk/1/hi/world/americas/5040372.stm (2006).

² BBC News, "Rights group criticises Asbo TV," http://news.bbc.co.uk/1/hi/england/london/4597990.stm (2006).

The duration of the test was about ten minutes, therefore no problems with declining observer attention should emerge.

C. Test Parameters

The authors chose 20 videos from a security surveillance system and 5 degrees of compression for both H.264 and HEVC, which gives a total of 200 videos. However, every participant was supposed to see every video only once in order not to be biased. Observers were therefore divided into 10 groups. Input videos for every group were assigned randomly and the order of videos in the test was also randomized. Which video was assigned to which group can be seen in Table I. The bold numbers denote the number of the video and the capitals stand for the compression. Letters A - E represent the degrees of compression for HEVC with decreasing quality (for more information about the compressions used, refer to Section III). F - J mean the same for H.264/MPEG-4. Coordinates 1A therefore mean the 1st video compressed by HEVC with the lowest Quantization Parameter. This video was evaluated by group number 5 in this experiment.

Table I SERIAL NUMBERS OF GROUPS TO EVALUATE VIDEOS

Video    A   B   C   D   E   F   G   H   I   J
    1    5   1   3  10   9   6   4   8   7   2
    2    6   4   3   7   9  10   1   5   2   8
    3    1   3   9   2   6   7   5   8   4  10
    4    2   5   7   3   8   9   1  10   4   6
    5    3   2   5   4   8   6   9   1   7  10
    6    6   2   8   3   1  10   9   7   5   4
    7   10   5   1   3   9   4   7   2   6   8
    8   10   6   3   4   1   2   7   8   9   5
    9    6   2  10   1   4   8   5   3   7   9
   10    7   5   4   1   2   8  10   6   9   3
   11    2   6   1   8   3   4  10   7   5   9
   12    3   6   5   1   7   8   4   2  10   9
   13    6   4   2   8   3   9  10   7   1   5
   14    1   2   4   9   8   5  10   6   7   3
   15    4   6   7   3   2   1   5   9   8  10
   16    9  10   8   7   4   5   2   1   6   3
   17    1   9   4   3   2   5   7  10   6   8
   18   10   4   2   3   6   1   7   8   9   5
   19    5   6   9   2   8   4   7   1  10   3
   20    8   5  10   9   7   6   4   2   3   1
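The assignment described above (one random permutation of the 10 groups per video, plus a randomized playback order per group) can be sketched as follows. This is a minimal illustration only: the function names, the seeding and the use of Python's `random` module are assumptions and not the authors' actual tooling.

```python
import random

CONDITIONS = "ABCDEFGHIJ"  # A-E: HEVC, F-J: H.264, as labelled in Table I
NUM_VIDEOS = 20
NUM_GROUPS = 10


def build_assignment(seed=None):
    """Return {video: {condition: group}} using one random permutation per video."""
    rng = random.Random(seed)
    table = {}
    for video in range(1, NUM_VIDEOS + 1):
        groups = list(range(1, NUM_GROUPS + 1))
        rng.shuffle(groups)  # each group gets exactly one version of this video
        table[video] = dict(zip(CONDITIONS, groups))
    return table


def playlist_for_group(table, group, seed=None):
    """List the (video, condition) pairs one group watches, in random order."""
    rng = random.Random(seed)
    items = [
        (video, condition)
        for video, row in table.items()
        for condition, assigned in row.items()
        if assigned == group
    ]
    rng.shuffle(items)  # randomize presentation order within the session
    return items
```

With this scheme every group sees each of the 20 videos exactly once, but the number of times a group meets a given compression level is left to chance, which is also the case in Table I.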

III. DATASET

As stated above, 20 videos from a CCTV surveillance system were used. The recordings were obtained within the joint project of the TNO Physics and Electronics Laboratory (TNO-FEL) in The Hague and Czech Technical University in Prague. In this project the effectiveness of CCTV was studied [14].

Every video was 10 seconds long and contained a certain event the subjects were supposed to detect. They were typically asked about the number of people riding a bicycle, whether a person is carrying something, whether they are able to read the number on a bus or tram, how many people are getting out of a car, and so on.

The original videos had a resolution of 768x576 pixels, 25 frames per second and YUV 4:2:0 color sampling. They were then compressed by HEVC and H.264/MPEG-4. The HM 9.0 encoder³ was used for HEVC compression and the JM 18.4 encoder⁴ for H.264. The Group of Pictures was set to 4 and the Intra Period to one frame for both encoders. The coding order was 0 1 2 3 4. For HEVC the authors decided to use the Low Delay (LD) configuration over the Random Access (RA) configuration. More information about the encoder settings can be found in Table II.

Table II DETAILED SETTINGS FOR BOTH ENCODERS

Codec AVC/H.264 HEVC/H.265
Encoder JM 18.4 HM 9.0
Profile Main Main
Reference Frames 4 4
R/D Optimization on on
Motion Estimation EPZS EPZS
Weighted Prediction off
Search Range 64 64
Group of Pictures 4 4
Hierarchical Encoding on on
Temporal Levels 4 4
Intra Period 1 frame 1 frame
Deblocking off off
Rate Control on off
8x8 Transform on
Adaptive Loop Filter off
Coding Unit size / depth 64 / 4
Transform Unit size min / max 4 / 32

5 different degrees of HEVC compression were applied to the videos. The degrees differed in the Quantization Parameter (QP), therefore no rate control was used. The QPs were 37, 42, 46, 49 and 51. The column marked as A in Table I stands for videos compressed by HEVC with QP = 37, B with QP = 42, etc.
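To give a feeling for how coarse these settings are, the sketch below evaluates the commonly quoted approximation for H.264 and HEVC in which the quantization step size doubles for every increase of QP by 6, i.e. Qstep is roughly 2^((QP - 4) / 6). The approximation and the helper name `quant_step` are not taken from this paper; they are stated here as assumptions for illustration.

```python
# QP values used for columns A-E (HEVC), taken from the text above.
HEVC_QPS = (37, 42, 46, 49, 51)


def quant_step(qp):
    """Approximate quantization step size; doubles every 6 QP steps (assumed model)."""
    return 2.0 ** ((qp - 4) / 6.0)


def relative_coarseness(qp, reference_qp=37):
    """How many times coarser the quantizer is at qp than at reference_qp."""
    return quant_step(qp) / quant_step(reference_qp)
```

Under this approximation, calling `relative_coarseness(51)` compares the strongest setting (column E) with the weakest one (column A).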

To compare the performance of both compressions, average bit rates were calculated from the output of the encoder. The specific values are stated in Table III. These values were used in the H.264 encoder to create videos with bit rates similar to those of HEVC. Here the rate control was employed. The videos in column F in Table I are therefore compressed by H.264 with expected bit rate 46.7 kbps, G with 25.9 kbps, etc.

Figure 1. Multimedia Technology Group's post-processing lab.

³ https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.0-dev/

⁴ http://iphome.hhi.de/suehring/tml/

Table III AVERAGE BIT RATES FOR VIDEOS COMPRESSED BY HEVC WITH DIFFERENT QP

Quantization Parameter   Average Bit Rate (kbps)
37                       46.7
42                       25.9
46                       15.7
49                       10.8
51                        8.5

In some cases, the H.264 encoder was not able to reach the requested bit rate. This was mostly the case for the two lowest expected bit rates (10.8 kbps and 8.5 kbps, respectively), where the compression is so strong that it is almost impossible for the H.264 encoder to achieve these bit rates.
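The scale of the compression can be illustrated by comparing the average bit rates in Table III with the bit rate of the uncompressed source. The sketch below assumes 8 bits per sample and 1 kbps = 1000 bit/s, neither of which is stated in the paper; the function names are hypothetical.

```python
WIDTH, HEIGHT, FPS = 768, 576, 25   # source format given in Section III
SAMPLES_PER_PIXEL = 1.5             # YUV 4:2:0: one luma plus half a chroma sample per pixel
BITS_PER_SAMPLE = 8                 # assumption: 8-bit video

# Average HEVC bit rates from Table III, keyed by QP.
HEVC_KBPS = {37: 46.7, 42: 25.9, 46: 15.7, 49: 10.8, 51: 8.5}


def raw_bitrate_kbps():
    """Bit rate of the uncompressed source in kbps."""
    return WIDTH * HEIGHT * SAMPLES_PER_PIXEL * BITS_PER_SAMPLE * FPS / 1000.0


def average_bitrate_kbps(size_bytes, duration_s=10.0):
    """Average bit rate of an encoded stream from its size and duration."""
    return size_bytes * 8 / duration_s / 1000.0


def compression_ratio(qp):
    """Ratio of the raw bit rate to the average HEVC bit rate at the given QP."""
    return raw_bitrate_kbps() / HEVC_KBPS[qp]
```

For example, `compression_ratio(42)` relates the raw stream to the 25.9 kbps setting that the results later identify as sufficient.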

IV. ENVIRONMENT AND EQUIPMENT

The subjective tests were conducted in the Multimedia Technology Group's⁵ post-processing lab at the Faculty of Electrical Engineering, Czech Technical University in Prague. The laboratory organization can be seen in Figure 1. Ten workspaces were available for evaluation. Every workspace was equipped with a color calibrated LCD display. The resolution of the screens was 1600x1200 pixels.

No special software for displaying videos was used. All the content was played using ordinary Windows Media Player.

V. RESULTS

Considering that most of the questions in the sheet had three parts, the maximum score for every video was 3. If the participant answered every part of the question correctly (that is, was able to reliably detect and recognize the event), the score for the question was 3. Therefore the result for every question could be 3, 2, 1 or 0.

⁵ http://www.multimediatech.cz

Figure 3. Two-sample right-tailed t-test results (x axis: compression method).

The processing of results was done according to ITU-R Recommendation BT.500-12 [13]. Mean scores for every applied compression with corresponding confidence intervals at the significance level of 0.05 can be found in Figure 2.
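The scoring and the statistics behind Figure 2 can be sketched as below, assuming the usual BT.500-style 95% confidence interval (mean plus or minus 1.96 times the sample standard deviation divided by the square root of the number of scores). The helper names and the example scores in the usage note are illustrative and are not data from the study.

```python
import math
from statistics import mean, stdev


def score_question(parts_correct):
    """Score one video: one point per correctly answered part (0 to 3)."""
    return sum(1 for correct in parts_correct if correct)


def mean_with_ci(scores):
    """Return (mean, lower, upper) for a 95% confidence interval.

    Uses delta = 1.96 * S / sqrt(N), with S the sample standard deviation.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    m = mean(scores)
    delta = 1.96 * stdev(scores) / math.sqrt(n)
    return m, m - delta, m + delta
```

For instance, `mean_with_ci([3, 3, 2, 3, 1, 3, 2, 3])` returns the mean score of one hypothetical compression setting together with the bounds that would be drawn as its error bar.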

For a proper comparison of the influence of the compressions, a two-sample right-tailed t-test was employed. Its function is to decide whether the scores for different compressions come from the same distribution (i.e. whether the event detection capability of observers is the same). The t-test results are visualized in Figure 3. In cases when detection in videos compressed by the method on the y axis was statistically significantly more successful than in videos compressed by the method on the x axis, the particular square is white. Otherwise it is black.
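One cell of Figure 3 can be reproduced with a sketch like the following. The paper does not say whether equal variances were assumed, so the pooled-variance Student test used here is an assumption; the right-tail probability is obtained by numerically integrating the t density so that only the standard library is needed. All names are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_right_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def right_tailed_t_test(y_scores, x_scores, alpha=0.05):
    """Test whether the mean of y_scores is significantly greater than that of x_scores.

    Returns (t, p, significant); assumes equal variances (pooled estimate).
    """
    ny, nx = len(y_scores), len(x_scores)
    df = ny + nx - 2
    pooled = ((ny - 1) * variance(y_scores) + (nx - 1) * variance(x_scores)) / df
    if pooled == 0:
        raise ValueError("both samples are constant; t statistic undefined")
    t = (mean(y_scores) - mean(x_scores)) / math.sqrt(pooled * (1 / ny + 1 / nx))
    p = t_right_tail(t, df)
    return t, p, p < alpha
```

A white square in the figure would correspond to the third return value being true when the scores of the method on the y axis are passed first.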

The average percentage of successful detection for each compression method can be found in Table IV.

Table IV AVERAGE PERCENTAGE OF SUCCESSFUL DETECTION

Compression Method⁶   Average Percentage (%)
A                     88.5
B                     88.3
C                     70.1
D                     51.3
E                     54.8
F                     74.2
G                     61.8
H                     31.9

⁶ A - E: HEVC from best quality to worst; F - J: H.264/MPEG-4 from best quality to worst

The results show several interesting things. The first important finding is that videos compressed by method B (HEVC with QP = 42, average bit rate 25.9 kbps) are more suitable for detection than videos compressed by method F (H.264 with expected bit rate 46.7 kbps). This shows that videos compressed by HEVC are better for detection than videos of almost double the bit rate compressed by H.264. This hypothesis was not always confirmed (C did not outperform G, D did not outperform I and J). This is probably caused by the widening of the confidence intervals at these lower bit rates, which is a consequence of massive artifacts complicating reliable detection. Some events were easier to detect than others, and this difference is much more significant when the videos are heavily distorted.
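The "almost double the bit rate" comparison between methods B and F can be checked with simple arithmetic on the values from Tables III and IV. The sketch below only restates those published numbers; the dictionary layout and function names are illustrative assumptions.

```python
# Values taken from Table III (bit rate) and Table IV (detection percentage).
METHOD_B = {"codec": "HEVC", "bitrate_kbps": 25.9, "detection_pct": 88.3}
METHOD_F = {"codec": "H.264", "bitrate_kbps": 46.7, "detection_pct": 74.2}


def bitrate_ratio(high, low):
    """How many times larger the bit rate of `high` is than that of `low`."""
    return high["bitrate_kbps"] / low["bitrate_kbps"]


def bitrate_saving_pct(high, low):
    """Bit rate saved, in percent, by using `low` instead of `high`."""
    return 100.0 * (1.0 - low["bitrate_kbps"] / high["bitrate_kbps"])


def detection_gain_points(better, worse):
    """Difference in successful detection, in percentage points."""
    return better["detection_pct"] - worse["detection_pct"]
```

Evaluating `bitrate_ratio(METHOD_F, METHOD_B)` and `detection_gain_points(METHOD_B, METHOD_F)` puts the bit rate gap and the detection gap side by side.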

The other, and maybe even more important, outcome of the study is that there is no significant difference between the detection capabilities when using methods A and B. That means that using HEVC with QP lower than 42 (or bit rates higher than 25.9 kbps) does not improve the quality of event detection and is thus redundant for CCTV surveillance purposes.

The quality indexes for videos no. 2 and 12 measured by objective video quality metrics, together with the actual bit rates of the particular videos, are stated in Tables V and VI, respectively. A MATLAB-based framework developed by Murthy [15] was employed to assess the quality by four criteria: VQM [16], Averaged PSNR, Averaged SSIM [17] and Averaged VSNR [18]. Note that unlike the other metrics, VQM decreases with better quality. Also, in the case of video 12 the H.264 encoder reached the same bit rate for rate control targets of 10.8 and 8.5 kbps.
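Of the four criteria, Averaged PSNR is the simplest to illustrate. The sketch below uses the textbook definition (per-frame PSNR from the mean squared error, then the arithmetic mean over frames) with an assumed peak value of 255 for 8-bit samples. It is not the implementation of the MATLAB framework cited above, and the function names are hypothetical.

```python
import math


def frame_psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two frames given as equal-length flat sample sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def averaged_psnr(reference_frames, distorted_frames):
    """Arithmetic mean of the per-frame PSNR values over a sequence."""
    if len(reference_frames) != len(distorted_frames) or not reference_frames:
        raise ValueError("sequences must be non-empty and of equal length")
    values = [frame_psnr(r, d) for r, d in zip(reference_frames, distorted_frames)]
    return sum(values) / len(values)
```

Here each frame is a flat list of luma samples; a higher result means the compressed frame is numerically closer to the original.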

VI. CONCLUSION

In this work the authors studied the suitability of HEVC compression for CCTV surveillance systems. 20 video sequences obtained by an outdoor security camera were compressed using 5 different parameter settings. The same videos were also compressed by H.264/MPEG-4 Part 10 AVC with comparable bit rates. These video sequences were then shown to 32 observers, who were asked to detect particular events in the videos.

The results showed that, in most cases, HEVC enables more reliable detection than H.264 even at about half the bit rate.

It was also shown that bit rates higher than about 26 kbps do not improve observers' ability to detect the events, and this bit rate is therefore sufficient for security video compressed by HEVC. This bit rate was obtained with QP set to 42.

ACKNOWLEDGMENT

This work was supported by the grant No. P102/10/1320 "Research and modeling of advanced methods of image quality evaluation" of the Grant Agency of the Czech Republic.

Figure 2. The subjective tests results with confidence intervals on significance level 0.05 (x axis: compression method; legend: HEVC, H.264).

Table V OBJECTIVE QUALITY METRICS RESULTS AND ACTUAL BIT RATES FOR VIDEO No. 2

Compression Method   VQM      Averaged PSNR   Averaged SSIM   Averaged VSNR   Actual Bit Rate (kbps)
A                    0.4720   33.3712         0.9299          25.5473         44.0960
B                    0.6271   30.2172         0.8844          20.6540         25.0996
C                    0.7796   27.4695         0.8221          16.6779         15.4718
D                    0.9068   25.6711         0.7641          14.1261         10.8579
E                    0.9678   24.4582         0.7222          12.5501         8.6315
F                    0.7698   25.8306         0.8727          17.6400         51.0100
G                    0.9004   25.0564         0.8272          15.6801         27.8400
H                    1.0019   24.1259         0.7807          13.6564         17.1800
I                    1.0154   23.6163         0.7626          12.8200         12.9000
J                    1.0172   23.5519         0.7591          12.6695         12.4000

Table VI OBJECTIVE QUALITY METRICS RESULTS AND ACTUAL BIT RATES FOR VIDEO No. 12

Compression Method   VQM      Averaged PSNR   Averaged SSIM   Averaged VSNR   Actual Bit Rate (kbps)
A                    0.4664   33.6295         0.9323          25.6088         43.4193
B                    0.6382   30.3134         0.8832          20.3115         23.9281
C                    0.7934   27.6619         0.8217          16.4305         14.3311
D                    0.9027   25.7214         0.7626          13.8593         9.91111
E                    0.9733   24.6455         0.7212          12.3980         7.91037
F                    0.6944   29.5971         0.8710          19.9469         48.2200
G                    0.8257   27.2824         0.8270          16.4995         28.8000
H                    0.9144   25.7122         0.7893          14.3492         16.5000
I                    0.9576   24.6300         0.7641          13.0417         11.7600
J                    0.9576   24.6300

REFERENCES

[1] C. Held, J. Krumm, P. Markel and R. P. Schenke, "Intelligent Video Surveillance," in Computer, Vol. 45, No. 3, pp. 83-84, March 2012.

[2] J. Nam and A. H. Tewfik, "Detection of gradual transitions in video sequences using B-spline interpolation," in IEEE Transactions on Multimedia, Vol. 7, No. 4, pp. 667-679, August 2005.

[3] F. Lv, J. Kang, R. Nevatia, I. Cohen and G. Medioni, "Automatic tracking and labeling of human activities in a video sequence," in Proceedings of the 6th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, 2004.

[4] P. C. Ribeiro, J. Santos-Victor, "Human activity recognition from video: modeling, feature selection and classification architecture," in International Workshop on Human Activity Recognition and Modeling, pp. 61-70, 2005.

[5] V. Kettnaker and R. Zabih, "Counting people from multiple cameras," in IEEE International Conference on Multimedia Computing and Systems, Vol. 2, pp. 267-271, July 1999.

[6] Y. Nam, S. Rho and J. H. Park, "Intelligent video surveillance system: 3-tier context-aware surveillance system with metadata," in Multimedia Tools and Applications, Vol. 57, No. 2, pp. 315-334, March 2012.

[7] R. A. Smith, K. MacLennan-Brown, J. F. Tighe, N. Cohen, S. Triantaphillidou and L. W. MacDonald, "Colour analysis and verification of CCTV images under different lighting conditions," in Image Quality and System Performance, Proc. SPIE, Vol. 6808, 2008.

[8] M. Klima and K. Fliegel, "Image compression techniques in the field of security technology: examples and discussion," in 38th Annual 2004 International Carnahan Conference on Security Technology, pp. 78-284, October 2004.

[9] P. Kovesi, "Video Surveillance: Legally Blind?" in Digital Image Computing: Techniques and Applications (DICTA 2009), pp. 204-211, 2009.

[10] H. U. Keval and M. A. Sasse, "Can we ID from CCTV: image quality in digital CCTV and face identification performance," in Mobile Multimedia/Image Processing, Security, and Applications, Proc. SPIE, Vol. 6982, 2008.

[11] ISO, "Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding," in Tech. Rep. ISO/IEC 14496-10:2005, ISO/IEC, 2005.

[12] P. Hanhart, M. Rerabek, F. De Simone and T. Ebrahimi, "Subjective quality evaluation of the upcoming HEVC video compression standard," in Applications of Digital Image Processing XXXV, Proc. SPIE, Vol. 8499, 2012.

[13] ITU-R Recommendation BT.500-12, "Methodology for the subjective assessment of the quality of television pictures," September 2009.

[14] G. van Voorthuijsen, H. van Hoof, M. Klima, K. Roubik, M. Bernas, et al., "CCTV Effectiveness Study," in Proc. of 39 IEEE ICCST, Piscataway: IEEE, pp. 105-108, 2005.

[15] A. V. Murthy and L. J. Karam, "A MATLAB-based framework for image and video quality evaluation," in Proceedings QoMEX 2010, 2010.

[16] M. Pinson, S. Wolf, "A new standardized method for objectively measuring video quality," in IEEE Transactions on Broadcasting, Vol. 50, No. 3, pp. 312-446, September 2004.

[17] Z. Wang et al., "Image quality assessment: From error visibility to structural similarity," in IEEE Transactions on Image Processing, Vol. 13, No. 4, pp. 600-612, April 2004.

[18] D. M. Chandler and S. S. Hemami, "VSNR: A Wavelet-Based Visual Signal-to-Noise Ratio for Natural Images," in IEEE Transactions on Image Processing, Vol. 16, No. 9, pp. 2284-2298, September 2007.

Stanislav Vitek graduated from the Czech Technical University in Prague in 2002 and received his PhD in 2008. Currently he is an assistant professor with the Dept. of Radioelectronics at the Faculty of Electrical Engineering at the Czech Technical University in Prague. His main research interests are assistive technologies, multimedia processing and database systems.

Lukas Krasula graduated from the Czech Technical University in Prague in 2013. Currently he is a Ph.D. student at the Faculty of Electrical Engineering at the Czech Technical University in Prague. His research interests are oriented to image processing and image compression for security and multimedia applied imaging systems.

Milos Klima graduated from the Czech Technical University in Prague in 1974 and received his PhD in 1978. He has been a full professor since 2000. Currently he is the head of the Dept. of Radioelectronics at the Faculty of Electrical Engineering at the Czech Technical University in Prague and the leader of the Multimedia Technology Group. His research interests are oriented to image sensing, image processing and image compression for security and multimedia applied imaging systems. He has participated in the ICCST since 1991.

Vojtech Hvezda graduated from the Czech Technical University in Prague in 2013. His research interests are oriented to image processing and image compression for security and multimedia applied imaging systems.

Marcelo Herrera Martinez graduated from the Czech Technical University in Prague in 2003 and received his PhD in 2010. His research topics are Psychoacoustics, Noise Control, and Digital Signal Processing (especially the fields of Perceptual Compression and Broadcasting Technology). At present he is a full-time research professor at Universidad de San Buenaventura-Bogota.
