3
Security Evolution of the Webkit Browser Engine Ren´ ata Hodov´ an Department of Software Engineering University of Szeged Dugonics t´ er 13., 6720 Szeged, Hungary [email protected] ´ Akos Kiss * Department of Software Engineering University of Szeged Dugonics t´ er 13., 6720 Szeged, Hungary [email protected] Abstract—By now, the Web has matured into a full-scale application platform and its popularity is constantly rising. However, popularity comes at a cost: the Web is becoming more and more of a target for attackers. In this paper, we argue that the security analysis of web browsers deserves attention and the data on their security evolution is of high value. We examine the security evolution of the widely used WebKit browser engine, investigate its historical security data, and point at the alarming trends. Keywords-Security; web browser; evolution I. I NTRODUCTION By now, the Web has matured from a distributed document storage system, what it was in the beginning, into a full- scale application platform. Its popularity as a platform is constantly rising [1] and is comparable to the desktop domain. This is all due to scripting, i.e., the extension of both the server-side and the client-side with code execution capa- bility. Scripting allowed the generation of dynamic content and smooth user interaction. Today, the “web programming languages” like JavaScript and PHP are among the most popular and widely used languages [2]. However, such popularity comes at a cost: the Web is becoming more and more of a target for malicious attackers. Thus, its security is of major concern. In fact, since the mid-2005, more security vulnerabilities have been reported for the Web than for the desktop domain [3]. When the Web and its security is discussed, focus mostly falls on the security of web applications [4], or the applica- tions and the browsers are handled as one joint category [3] (unlike in the desktop domain, where the applications and the operating system are discussed separately). However, similarly to the web applications that are the analogous counterparts of user-space desktop applications, browsers and browser engines correspond to operating systems and the kernel [5]. Thus, we argue that the security of web browsers deserves to be analysed separately as well. In this paper, we take a look at the security evolution of the widely used web browser engine, WebKit [6]. We investigate historical security data and analyse the trends. The rest of the paper is organised as follows: In Sec- tion II, we introduce the WebKit project and the trends of its development. In Section III, we analyse the security 2007M12 2008M01 2008M02 2008M03 2008M04 2008M05 2008M06 2008M07 2008M08 2008M09 2008M10 2008M11 2008M12 2009M01 2009M02 2009M03 2009M04 2009M05 2009M06 2009M07 2009M08 2009M09 2009M10 2009M11 2009M12 2010M01 2010M02 2010M03 2010M04 2010M05 2010M06 2010M07 2010M08 2010M09 2010M10 2010M11 2010M12 2011M01 2011M02 2011M03 2011M04 2011M05 2011M06 2011M07 2011M08 2011M09 2011M10 2011M11 2011M12 2012M01 2012M02 2012M03 2012M04 0 500000 1000000 1500000 2000000 2500000 Lines of code Figure 1. The change of the code size of WebKit over time. evolution of the browser engine by mining several databases. In Section IV, we give a brief overview of the related publications. Finally, in Section V, we summarize the paper and outline our plans for future work. II. DEVELOPMENT TRENDS We have chosen WebKit [6] as the target of our analysis since it is the most widespread browser engine today. It powers several desktop browsers (Apple Safari and Google Chrome among others) and mobiles as well (like those shipped with iPhone, Android, and MeeGo phones). Accord- ing to the April 2012 statistics of StatCounter [7], WebKit holds more than 38% of the browser market share. Although not all of the above mentioned browsers are developed in the public, their core, the WebKit browser engine is an open source project. Thus, we can access its code and bug repositories, and analyse the rate of its development. Figures 1 and 2 show ever-increasing trends. At the end of 2007, the size of code in WebKit was cca. 680 KLOC (thousand lines of code), while by April 2012, it became almost 2.3 MLOC, which is a more than 3-fold increase. Not only is the code constantly growing but it is changing faster and faster. In March 2012, the revision number of the code repository was incremented by more than 3000, which means that the code was modified in every 15 minutes on average. III. SECURITY EVOLUTION One might naively assume that the quality (and thus, the security) of software which has been developed for a long time – and the development of WebKit started more than 17 978-1-4673-3055-8/12/$31.00 © 2012 IEEE

[IEEE 2012 IEEE 14th International Symposium on Web Systems Evolution (WSE) - Trento, Italy (2012.09.28-2012.09.28)] 2012 14th IEEE International Symposium on Web Systems Evolution

  • Upload
    akos

  • View
    215

  • Download
    1

Embed Size (px)

Citation preview

Page 1: [IEEE 2012 IEEE 14th International Symposium on Web Systems Evolution (WSE) - Trento, Italy (2012.09.28-2012.09.28)] 2012 14th IEEE International Symposium on Web Systems Evolution

Security Evolution of the Webkit Browser Engine

Renata Hodovan

Department of Software Engineering

University of Szeged

Dugonics ter 13., 6720 Szeged, Hungary

[email protected]

Akos Kiss∗

Department of Software Engineering

University of Szeged

Dugonics ter 13., 6720 Szeged, Hungary

[email protected]

Abstract—By now, the Web has matured into a full-scaleapplication platform and its popularity is constantly rising.However, popularity comes at a cost: the Web is becoming moreand more of a target for attackers. In this paper, we argue thatthe security analysis of web browsers deserves attention and thedata on their security evolution is of high value. We examine thesecurity evolution of the widely used WebKit browser engine,investigate its historical security data, and point at the alarmingtrends.

Keywords-Security; web browser; evolution

I. INTRODUCTION

By now, the Web has matured from a distributed document

storage system, what it was in the beginning, into a full-

scale application platform. Its popularity as a platform is

constantly rising [1] and is comparable to the desktop

domain. This is all due to scripting, i.e., the extension of both

the server-side and the client-side with code execution capa-

bility. Scripting allowed the generation of dynamic content

and smooth user interaction. Today, the “web programming

languages” like JavaScript and PHP are among the most

popular and widely used languages [2]. However, such

popularity comes at a cost: the Web is becoming more and

more of a target for malicious attackers. Thus, its security is

of major concern. In fact, since the mid-2005, more security

vulnerabilities have been reported for the Web than for the

desktop domain [3].

When the Web and its security is discussed, focus mostly

falls on the security of web applications [4], or the applica-

tions and the browsers are handled as one joint category [3]

(unlike in the desktop domain, where the applications and

the operating system are discussed separately). However,

similarly to the web applications that are the analogous

counterparts of user-space desktop applications, browsers

and browser engines correspond to operating systems and the

kernel [5]. Thus, we argue that the security of web browsers

deserves to be analysed separately as well.

In this paper, we take a look at the security evolution

of the widely used web browser engine, WebKit [6]. We

investigate historical security data and analyse the trends.

The rest of the paper is organised as follows: In Sec-

tion II, we introduce the WebKit project and the trends

of its development. In Section III, we analyse the security

20

07

M1

22

00

8M

01

20

08

M0

22

00

8M

03

20

08

M0

42

00

8M

05

20

08

M0

62

00

8M

07

20

08

M0

82

00

8M

09

20

08

M1

02

00

8M

11

20

08

M1

22

00

9M

01

20

09

M0

22

00

9M

03

20

09

M0

42

00

9M

05

20

09

M0

62

00

9M

07

20

09

M0

82

00

9M

09

20

09

M1

02

00

9M

11

20

09

M1

22

01

0M

01

20

10

M0

22

01

0M

03

20

10

M0

42

01

0M

05

20

10

M0

62

01

0M

07

20

10

M0

82

01

0M

09

20

10

M1

02

01

0M

11

20

10

M1

22

01

1M

01

20

11

M0

22

01

1M

03

20

11

M0

42

01

1M

05

20

11

M0

62

01

1M

07

20

11

M0

82

01

1M

09

20

11

M1

02

01

1M

11

20

11

M1

22

01

2M

01

20

12

M0

22

01

2M

03

20

12

M0

4

0

500000

1000000

1500000

2000000

2500000

Lin

es o

f co

de

Figure 1. The change of the code size of WebKit over time.

evolution of the browser engine by mining several databases.

In Section IV, we give a brief overview of the related

publications. Finally, in Section V, we summarize the paper

and outline our plans for future work.

II. DEVELOPMENT TRENDS

We have chosen WebKit [6] as the target of our analysis

since it is the most widespread browser engine today. It

powers several desktop browsers (Apple Safari and Google

Chrome among others) and mobiles as well (like those

shipped with iPhone, Android, and MeeGo phones). Accord-

ing to the April 2012 statistics of StatCounter [7], WebKit

holds more than 38% of the browser market share.

Although not all of the above mentioned browsers are

developed in the public, their core, the WebKit browser

engine is an open source project. Thus, we can access

its code and bug repositories, and analyse the rate of its

development. Figures 1 and 2 show ever-increasing trends.

At the end of 2007, the size of code in WebKit was cca.

680 KLOC (thousand lines of code), while by April 2012,

it became almost 2.3 MLOC, which is a more than 3-fold

increase. Not only is the code constantly growing but it

is changing faster and faster. In March 2012, the revision

number of the code repository was incremented by more

than 3000, which means that the code was modified in every

15 minutes on average.

III. SECURITY EVOLUTION

One might naively assume that the quality (and thus, the

security) of software which has been developed for a long

time – and the development of WebKit started more than

17978-1-4673-3056-5/12/$31.00 © 2012 IEEE978-1-4673-3055-8/12/$31.00 © 2012 IEEE

Page 2: [IEEE 2012 IEEE 14th International Symposium on Web Systems Evolution (WSE) - Trento, Italy (2012.09.28-2012.09.28)] 2012 14th IEEE International Symposium on Web Systems Evolution

2007M

12

2008M

01

2008M

02

2008M

03

2008M

04

2008M

05

2008M

06

2008M

07

2008M

08

2008M

09

2008M

10

2008M

11

2008M

12

2009M

01

2009M

02

2009M

03

2009M

04

2009M

05

2009M

06

2009M

07

2009M

08

2009M

09

2009M

10

2009M

11

2009M

12

2010M

01

2010M

02

2010M

03

2010M

04

2010M

05

2010M

06

2010M

07

2010M

08

2010M

09

2010M

10

2010M

11

2010M

12

2011M

01

2011M

02

2011M

03

2011M

04

2011M

05

2011M

06

2011M

07

2011M

08

2011M

09

2011M

10

2011M

11

2011M

12

2012M

01

2012M

02

2012M

03

2012M

04

0

500

1000

1500

2000

2500

3000

3500

Nu

mb

er

of

revis

ion

s

Figure 2. The number of changes committed monthly in the repositoryof WebKit.

20

07

M1

22

00

8M

01

20

08

M0

32

00

8M

04

20

08

M0

52

00

8M

06

20

08

M0

72

00

8M

08

20

08

M0

92

00

8M

10

20

08

M1

12

00

8M

12

20

09

M0

12

00

9M

02

20

09

M0

32

00

9M

04

20

09

M0

52

00

9M

06

20

09

M0

72

00

9M

08

20

09

M0

92

00

9M

10

20

09

M1

12

00

9M

12

20

10

M0

12

01

0M

02

20

10

M0

32

01

0M

04

20

10

M0

52

01

0M

06

20

10

M0

72

01

0M

08

20

10

M0

92

01

0M

10

20

10

M1

12

01

0M

12

20

11

M0

12

01

1M

02

20

11

M0

32

01

1M

04

20

11

M0

52

01

1M

06

20

11

M0

72

01

1M

08

20

11

M0

92

01

1M

10

20

11

M1

12

01

1M

12

20

12

M0

12

01

2M

02

20

12

M0

32

01

2M

04

0

10

20

30

40

50

60

70

Nu

mb

er

of

se

cu

rity

pa

tch

es

Figure 3. The number of security-related changes committed monthly inthe repository of WebKit.

10 years ago – is improving over time. However, Pressman

forecasts that continuous changes and extensions are likely

to introduce undesired behaviour, i.e., defects to the code [8].

Thus, our hypothesis is that such an increase in size and such

a pace of development as observable in WebKit might have

an adverse effect on the security of the project in the long

run.

To verify our hypothesis, we have started to analyse the

publicly available sources of information on the security

problems of WebKit. First, we focused on the repositories of

the project itself. The bug database of WebKit has a special

section for issues that are labelled as security-related by the

experts of the developer community. By relating the code

repository and the bug database, we have been able to collect

the changesets that belong to security bugs. When building

our security database, we also recorded the date of the

modifications and the size of the changesets. Altogether, we

have found 877 security bug entries and 998 modifications

(with some bug fixes spanning over multiple changesets).

Our findings, which are depicted on Figures 3 and 4, un-

derpin our hypothesis. Both the number of the modifications

fixing security problems and the size of the code affected

by these modifications are increasing. In 2007, only 0.38%

of the changes were security-related, but in the first four

months of 2012, this number crept up to 2.03%, which, as

an absolute value, is not unacceptably high but shows a more

than 5-fold increase in the efforts spent on fixing security

problems.

Additionally to the project repositories, we have mined the

Common Vulnerabilities and Exposures (CVE) List [9] for

2007M

12

2008M

01

2008M

03

2008M

04

2008M

05

2008M

06

2008M

07

2008M

08

2008M

09

2008M

10

2008M

11

2008M

12

2009M

01

2009M

02

2009M

03

2009M

04

2009M

05

2009M

06

2009M

07

2009M

08

2009M

09

2009M

10

2009M

11

2009M

12

2010M

01

2010M

02

2010M

03

2010M

04

2010M

05

2010M

06

2010M

07

2010M

08

2010M

09

2010M

10

2010M

11

2010M

12

2011M

01

2011M

02

2011M

03

2011M

04

2011M

05

2011M

06

2011M

07

2011M

08

2011M

09

2011M

10

2011M

11

2011M

12

2012M

01

2012M

02

2012M

03

2012M

04

0

500

1000

1500

2000

Lin

es a

ffe

cte

d b

y s

ecu

rity

pa

tch

es

Figure 4. The number of lines changed monthly as a result of security-related changes in WebKit.

20

07

20

08

20

09

20

10

20

11

0

20

40

60

80

100

120

140

160

CV

Es

low medium high critical

Figure 5. The number of WebKit-related CVEs.

WebKit-related entries. We have found 356 reported vulnera-

bilities between the years 2007 and 2011. The description of

the CVEs allowed us to categorize them manually according

to their severity. For rating the vulnerabilities, we have used

the guidelines of the Chromium project [10]. The trends are

shown in Figure 5. The gross numbers are in line with the

trends observed in the code repository and are shown in

Figure 3. The number of CVEs are rising almost every year

(with 2011 being a bit more “fortunate”). Most importantly

(or distressingly), the ratio of critical vulnerabilities is rising

as well: in 2007, only 19% of the vulnerabilities were rated

critical, but by showing growing trends, this ratio rose to

84% in 2011.

IV. RELATED WORK

From the dawn of the internet era, when it was only

used by the military and by scientists, to our days when

we are managing even the paying of bills via ebank, the

Internet has been working with sensitive data. This calls

for the need to determine the rules and the restrictions to

reduce abuses. The first articles in the web security topic

were written at the end of the ‘80s. These first papers

focused primarily on the security of data transmission [11].

Recently however, as the Web is becoming dynamic and

the number of web applications is continuously increasing,

interest has turned to the security issues of web applications.

Experts have been trying to draw up guides on writing secure

18

Page 3: [IEEE 2012 IEEE 14th International Symposium on Web Systems Evolution (WSE) - Trento, Italy (2012.09.28-2012.09.28)] 2012 14th IEEE International Symposium on Web Systems Evolution

code [12], defining metrics for measuring the security level

of applications [13], and developing tools for automatic

security assessment [14]. Additionally, the most frequent

attack forms against web applications are regularly compiled

and reported [4].

V. SUMMARY AND FUTURE WORK

In this paper, we have argued that the security analysis of

web browsers deserves attention and the historical data on

their security evolution is of high value. We have analysed

the widely-used WebKit browser engine from the perspective

of security and found that the trends are alarming. Based on

the data extracted from two data sources, we have found

that more and more security problems are arising and their

fixing is requiring increasing efforts. Additionally, the ratio

of the critical bugs is also rising.

Our ultimate goal is not only to observe the trends of

browser security but also to help developers avoid the

introduction of security holes or to help them recognise

the problems at an early stage. We plan to thoroughly

analyse the patch database we have extracted from the code

repository of WebKit and apply code analysis techniques

(e.g., computation of metrics [15], [16]) on them or on the

affected parts of the code to find characteristic common

attributes. We hope that such characteristics can help us point

out bad smells, suspicious parts of the code.

Additionally, we are also interested in defining an attack

surface metric [17] for web browsers (which is ultimately

also a good indicator of the security of a product). The

information gathered in this work will help to validate the

defined metric.

REFERENCES

[1] B. A. Huberman and L. A. Adamic, “Growth dynamics ofthe world-wide web,” Nature, vol. 401, p. 131, Sep. 1999.

[2] TIOBE Software, “TIOBE programming community index,”http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html.

[3] M. Andrews, “The state of web security,” IEEE Security &Privacy, vol. 4, no. 4, pp. 14–15, Jul./Aug. 2006.

[4] OWASP Top 10 – 2010 – The Ten Most Critical Web Appli-cation Security Risks, OWASP – The Open Web ApplicationSecurity Project, Apr. 2010,https://www.owasp.org/.

[5] S. Shah, “Teflon: Anti-stick for the browser’s attack surface,”Oct. 22–24, 2008, Hack.LU 2008.

[6] “The WebKit open source project,”http://webkit.org/.

[7] StatCounter, “StatCounter Global Stats,”http://gs.statcounter.com/.

[8] R. S. Pressman, Software Engineering – A Practicioner’sApproach, 5th ed. McGraw-Hill, 2001.

[9] The MITRE Corporation, “Common vulnerabilities and ex-posures list,”http://cve.mitre.org/.

[10] The Chromium Projects, “Severity guidelines for securityissues,”http://www.chromium.org/developers/severity-guidelines.

[11] W.-P. Lu and M. Sundareshan, “Secure communication ininternet environments: a hierarchical key management schemefor end-to-end encryption,” IEEE Transactions on Communi-cations, vol. 37, no. 10, pp. 1014–1023, Oct. 1989.

[12] J. Meier, “Web application security engineering,” IEEE Secu-rity & Privacy, vol. 4, no. 4, pp. 16–24, Jul.–Aug. 2006.

[13] T. Heumann, S. Turpe, and J. Keller, “Quantifying the attacksurface of a web application,” in Sicherheit 2010: Sicherheit,Schutz und Zuverlassigkeit, Beitrage der 5. Jahrestagung desFachbereichs Sicherheit der Gesellschaft fur Informatik e.V.(GI), Berlin, Germany, Oct. 5–7, 2010, pp. 305–316.

[14] M. Curphey and R. Arawo, “Web application security as-sessment tools,” IEEE Security & Privacy, vol. 4, no. 4, pp.32–41, Jul.–Aug. 2006.

[15] T. Gyimothy, “To use or not to use? The metrics to measuresoftware quality (developers’ view),” in Proceedings of the

13th European Conference on Software Maintenance andReengineering (CSMR09). IEEE Computer Society, 2009,pp. 3–4.

[16] P. Hegedus, T. Bakota, L. Illes, G. Ladanyi, R. Ferenc, andT. Gyimothy, “Source code metrics and maintainability: acase study,” in Proceedings of the 2011 International Confer-ence on Advanced Software Engineering and Its Applications(ASEA 2011). Springer Verlag, Dec. 2011, pp. 272–284.

[17] P. K. Manadhata and J. M. Wing, “An attack surface metric,”IEEE Transactions on Software Engineering, vol. 37, no. 3,pp. 371–386, May–Jun. 2011.

19