
Ian wcre2011


An Exploratory Study of the Evolution of Communicated Information about the Execution of Large Software Systems

Weiyi Shang

Zhen Ming Jiang

Bram Adams

Ahmed E. Hassan

Michael W. Godfrey

University of Waterloo

Queen’s University

Mohamed Nasser

Parminder Flora

Research In Motion (RIM)

2

What run-time actions cause the failure?

3

Automated profiling & instrumentation

Detail

No domain knowledge

Large scale

4

Communicated information (CI)

Execution Logs

System Alerts

Code Comments

/*…*/

Static

Dynamic

Field experience

Developer experience

5

CI forms basis of Ecosystem of Log Processing Apps

Workload recovery

Anomaly detection

Capacity planning

System monitoring

Performance analysis

Failure diagnosis

6

How to keep Log Processing Apps in sync with CI?

Release 1 Release 2 Release 3

7

Our Study Dimensions

Quantity: How does CI evolve over time?

Type: What types of modifications happen to CI?

Content: What information is conveyed by the short-lived CI?

8

Case Study Setup

Data Collection

Log Abstraction

System Deployment

time=1, Trying to launch, TaskID=01A

time=$t, Trying to launch, TaskID=$id

Enterprise Application (EA)

LogEvents
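
The log abstraction step on this slide can be sketched as follows. This is a minimal illustration, assuming that dynamic values appear as `key=value` fields; the `PLACEHOLDERS` table, the `$v` fallback, and both function names are hypothetical and are not taken from the study's tooling.

```python
import re
from collections import Counter

# Hypothetical mapping from field name to placeholder. The slide only shows
# the `time` and `TaskID` fields; any other field falls back to "$v".
PLACEHOLDERS = {"time": "$t", "TaskID": "$id"}

# A field is a word, an equals sign, and a value that runs to the next
# comma or whitespace character.
_FIELD = re.compile(r"(\w+)=([^,\s]+)")


def abstract_log_line(line):
    """Replace the dynamic value of each key=value field with a placeholder."""
    def substitute(match):
        key = match.group(1)
        return f"{key}={PLACEHOLDERS.get(key, '$v')}"
    return _FIELD.sub(substitute, line)


def count_log_events(lines):
    """Group raw log lines by their abstracted event and count each group."""
    return Counter(abstract_log_line(line) for line in lines)


if __name__ == "__main__":
    print(abstract_log_line("time=1, Trying to launch, TaskID=01A"))
```

Counting the distinct keys returned by `count_log_events` for one release gives the number of execution events for that release under these assumptions.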

9

Our Study Dimensions

Quantity: How does CI evolve over time?

Type: What types of modifications happen to CI?

Content: What information is conveyed by the short-lived CI?

10

CI keeps on growing over time

[Figure: number of execution events (y-axis, 0 to 180) per release (x-axis, releases 0.14.0 through 0.21.0)]

11

…even when system size decreases

Release   # K SLOC   # Execution log events
0.19.0    293        113
0.20.0    250        121
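
The arithmetic behind this slide can be checked with a short sketch. The figures are copied from the slide; the helper names and the "events per K SLOC" density measure are illustrative assumptions, not metrics defined by the study.

```python
# Figures copied from the slide: release -> (K SLOC, execution log events).
RELEASES = {"0.19.0": (293, 113), "0.20.0": (250, 121)}


def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


def events_per_ksloc(ksloc, events):
    """Hypothetical density measure: execution log events per K SLOC."""
    return events / ksloc


old_ksloc, old_events = RELEASES["0.19.0"]
new_ksloc, new_events = RELEASES["0.20.0"]

size_change = relative_change(old_ksloc, new_ksloc)
event_change = relative_change(old_events, new_events)

if __name__ == "__main__":
    print(f"size change:  {size_change:+.1%}")
    print(f"event change: {event_change:+.1%}")
```

With these numbers the code base shrinks by roughly 15% while the number of execution log events grows by roughly 7%.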

12

CI is impacted by re-engineering

[Figure: percentage of unchanged CI (y-axis, 0% to 100%) per release (x-axis, releases 0.15.0 through 0.21.0)]

Large amounts of implementation changes
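
One way to compute an "unchanged CI" percentage per release is sketched below. The slide does not define the measure, so this sketch assumes it is the share of the previous release's abstracted events that reappear verbatim in the next release; the function names and the input layout are hypothetical.

```python
def unchanged_ci_percentage(previous_events, current_events):
    """Percentage of the previous release's events still present, verbatim,
    in the current release. Assumed definition, not taken from the study."""
    previous = set(previous_events)
    if not previous:
        return 0.0
    return 100.0 * len(previous & set(current_events)) / len(previous)


def unchanged_by_release(history):
    """history: list of (release name, iterable of events), oldest first.

    Returns a dict mapping each release (except the first) to the percentage
    of its predecessor's events that it keeps unchanged.
    """
    results = {}
    for (_, previous), (name, current) in zip(history, history[1:]):
        results[name] = unchanged_ci_percentage(previous, current)
    return results
```

A sharp drop in this percentage between two releases would point at a release where many events were modified or removed.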

13

Quantity: How does CI evolve over time?

Growing & changing

Document & track

Type: What types of modifications happen to CI?

Content: What information is conveyed by the short-lived CI?

14

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

15

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

Hadoop mapred Reduce task fetch n bytes

Hadoop MapReduce task Reduce fetch n bytes

16

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

ShuffleRamManager memory limit n MaxSingleShuffleLimit m

ShuffleRamManager memory limit n MaxSingleShuffleLimit m mergeThreshold Q

17

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

Adding task to tasktracker

Adding Map Task to tasktracker

Adding Reduce Task to tasktracker
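
The example pairs on the preceding slides can be used to illustrate how some modification types might be told apart automatically. The sketch below is a toy token-based heuristic and not the classification procedure used in the study; the function names, the case-insensitive comparison, and the fallback label are all assumptions. Merging and redundant information are not handled.

```python
def _tokens(event):
    """Lower-cased whitespace tokens of an abstracted log event."""
    return event.lower().split()


def _is_subsequence(small, big):
    """True if every token of `small` appears in `big` in the same order."""
    remaining = iter(big)
    return all(token in remaining for token in small)


def classify_modification(old_event, new_events):
    """Guess the modification type for one old event and its successor(s).

    Toy heuristic (assumption): several successors mean "diverging"; a single
    successor that only gains tokens means "adding information"; one that only
    loses tokens means "deleting information"; anything else is flagged as a
    possible rephrasing that needs manual review.
    """
    if len(new_events) > 1:
        return "diverging"
    old, new = _tokens(old_event), _tokens(new_events[0])
    if old == new:
        return "unchanged"
    if _is_subsequence(old, new):
        return "adding information"
    if _is_subsequence(new, old):
        return "deleting information"
    return "rephrasing (needs manual review)"
```

For instance, the `ShuffleRamManager` pair is labelled "adding information" because the newer event keeps every original token and appends `mergeThreshold Q`.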

18

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

Avoidable

19

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

Recoverable

20

Six types of modification exist

Rephrasing

Redundant information

Adding information

Deleting information

Diverging

Merging

Unavoidable

21

Most modifications can be avoided

[Figure: share of each modification type (redundant info, rephrasing, adding info, deleting info, diverging, merging), grouped as avoidable, recoverable, or unavoidable; y-axis 0% to 100%; bar values in extraction order: 9.86%, 61.97%, 14.08%, 7.04%, 7.04%, 2.82%]

22

Quantity: How does CI evolve over time?

Growing & changing

Document & track

Type: What types of modifications happen to CI?

6 types

Are mostly avoidable

Content: What information is conveyed by the short-lived CI?

23

Short-lived CI contains implementation details

Hadoop saves output to a machine.

Hadoop assigns a reduce task to a machine.

Map task updates its progress.

Hadoop reads from a local file.

Hadoop Attempt saves its output and reports to the task tracker.

Node name

Local path

Using ipc

Output file name
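
Short-lived CI could be located with a sketch like the one below. The slides do not state a threshold, so this assumes an event is short-lived when it appears in at most `max_releases` releases (default 1); the function name and the input layout are hypothetical.

```python
from collections import Counter


def short_lived_events(events_by_release, max_releases=1):
    """Return events that occur in at most `max_releases` releases.

    events_by_release: dict mapping a release name to an iterable of
    abstracted events. The count is the number of releases containing the
    event; it does not check that those releases are consecutive.
    """
    lifetimes = Counter()
    for events in events_by_release.values():
        lifetimes.update(set(events))  # count each event once per release
    return {event for event, n in lifetimes.items() if n <= max_releases}
```

The resulting set can then be inspected manually to see what kind of information such events carry.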

24

Quantity: How does CI evolve over time?

Growing & changing

Document & track

Type: What types of modifications happen to CI?

6 types

Are mostly avoidable

Content: What information is conveyed by the short-lived CI?

Implementation-level details

Fragile

Maintenance effort

25