
Cost Effective Dynamic Program Slicing


Page 1: Cost Effective  Dynamic  Program  Slicing

1

Cost Effective Dynamic Program Slicing

Xiangyu Zhang, Rajiv Gupta

The University of Arizona

Page 2: Cost Effective  Dynamic  Program  Slicing

2

Program Slicing

Definition

Slice(v@S)

• Slice of v at S is the set of statements involved in computing v’s value at S. [Mark Weiser, 1982]

Static slice is the set of statements that COULD influence the value of a variable for ANY input.

• Construct static dependence graph
    - Control dependences
    - Data dependences

• Traverse dependence graph to compute slice
    - Transitive closure over control and data dependences
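
The traversal step can be illustrated with a minimal sketch. The six-statement program in the comment and the `DEPS` table are hypothetical, made up for illustration; they are not taken from the slides. The function computes the backward transitive closure over the combined control and data dependence edges.

```python
def static_slice(deps, criterion):
    """Return every statement reachable from `criterion` over dependence edges."""
    in_slice = set()
    worklist = [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt in in_slice:
            continue
        in_slice.add(stmt)
        # Follow both control and data dependences of this statement.
        worklist.extend(deps.get(stmt, ()))
    return in_slice


# Hypothetical program (illustration only):
#   1: x = read()
#   2: y = 0
#   3: if x > 0:
#   4:     y = x * 2
#   5: z = 7
#   6: print(y)
DEPS = {
    3: {1},      # data dependence on x
    4: {1, 3},   # data dependence on x, control dependence on 3
    6: {2, 4},   # y may come from either definition
}

print(sorted(static_slice(DEPS, 6)))  # statement 5 is not in the slice
```

Because the static graph has to cover every possible input, both definitions of `y` end up in the slice, even though any single run uses only one of them.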

Page 3: Cost Effective  Dynamic  Program  Slicing

3

Dynamic Slicing

Dynamic slice is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and Laski, 1988]

• Execution trace
    - control flow trace -- dynamic control dependences
    - memory reference trace -- dynamic data dependences

• Construct a dynamic dependence graph

• Traverse dynamic dependence graph to compute slices

• Smaller, more precise slices are more helpful

Page 4: Cost Effective  Dynamic  Program  Slicing

4

Slice Sizes: Static vs. Dynamic

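| Program | Statements | Static slice (avg. of 25 slices) | Dynamic slice (avg. of 25 slices) | Static / Dynamic |
|---|---|---|---|---|
| 126.gcc | 585,491 | 51,098 | 6,614 | 7.72 |
| 099.go | 95,459 | 16,941 | 5,382 | 3.14 |
| 134.perl | 116,182 | 5,242 | 765 | 6.85 |
| 130.li | 31,829 | 2,450 | 206 | 11.89 |
| 008.espresso | 74,039 | 2,353 | 350 | 6.72 |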

Static slice can be much larger than the dynamic slice

Page 5: Cost Effective  Dynamic  Program  Slicing

5

Applications of Dynamic Slicing

Debugging [Korel & Laski - 1988]

Detecting Spyware [Jha - 2003]

• Installed without users’ knowledge

Software Testing [Duesterwald, Gupta, & Soffa - 1992]

• Dependence based structural testing - output slices.

Module Cohesion [N.Gupta & Rao - 2001]

• Guide program structuring

Performance Enhancing Transformations

• Instruction criticality [Zilles & Sohi - 2000]

• Instruction isomorphism [Sazeides - 2003]

Others…

Page 6: Cost Effective  Dynamic  Program  Slicing

6

The Graph Size Problem

| Program | Statements Executed (Millions) | Dynamic Dependence Graph Size (MB) |
|---|---|---|
| 300.twolf | 140 | 1,568 |
| 256.bzip2 | 67 | 1,296 |
| 255.vortex | 108 | 1,442 |
| 197.parser | 123 | 1,816 |
| 181.mcf | 118 | 1,535 |
| 164.gzip | 71 | 835 |
| 134.perl | 220 | 1,954 |
| 130.li | 124 | 1,745 |
| 126.gcc | 131 | 1,534 |
| 099.go | 138 | 1,707 |

Graphs of realistic program runs do not fit in memory.

Page 7: Cost Effective  Dynamic  Program  Slicing

7

Space and Time Cost of LP [ICSE 2003]

| Program | Slicing Time, Average (Minutes) | Max. Dynamic Dependence Graph Size (MB) |
|---|---|---|
| 300.twolf | 13.9 | 296 |
| 256.bzip2 | 9.2 | 81 |
| 255.vortex | 10.2 | 34 |
| 197.parser | 9.9 | 40 |
| 181.mcf | 12.3 | 114 |
| 164.gzip | 4.69 | 35 |
| 134.perl | 25.2 | 54 |
| 130.li | 11.3 | 105 |
| 126.gcc | 12.1 | 58 |
| 099.go | 10.7 | 162 |

Still not fast enough. Need to keep graph in memory.

Page 8: Cost Effective  Dynamic  Program  Slicing

8

Dependence Graph Representation

Input: N=2

     1: z=0
     2: a=0
     3: b=2
     4: p=&b
     5: for i = 1 to N do
     6:   if (i%2 == 0) then
     7:     p=&a
          endif
     8:   a=a+1
     9:   z=2*(*p)
        endfor
    10: print(z)

[Figure: dynamic dependence graph with one node per executed statement instance: 1^1, 2^1, 3^1, 4^1, 5^1, 6^1, 8^1, 9^1, 5^2, 6^2, 7^1, 8^2, 9^2, 10^1, where s^k is the k-th execution of statement s.]

Page 9: Cost Effective  Dynamic  Program  Slicing

9

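Dependence Graph Representation

Input: N=2

Execution trace (timestamp: statement instance, where s^k is the k-th execution of statement s):

     1: 1^1   z=0
     2: 2^1   a=0
     3: 3^1   b=2
     4: 4^1   p=&b
     5: 5^1   for i = 1 to N do
     6: 6^1   if (i%2 == 0) then
     7: 8^1   a=a+1
     8: 9^1   z=2*(*p)
     9: 5^2   for i = 1 to N do
    10: 6^2   if (i%2 == 0) then
    11: 7^1   p=&a
    12: 8^2   a=a+1
    13: 9^2   z=2*(*p)
    14: 10^1  print(z)

[Figure: one node per statement (1-10); each dependence edge is labeled with <t1,t2> timestamp pairs, where t2 is the timestamp of the dependent instance and t1 the timestamp of the instance it depends on.]

Data dependence labels: 8 → 2: <2,7>; 8 → 8: <7,12>; 9 → 3: <3,8>; 9 → 4: <4,8>; 9 → 7: <11,13>; 9 → 8: <12,13>; 10 → 9: <13,14>

Control dependence labels: 6 → 5: <5,6><9,10>; 7 → 6: <10,11>; 8 → 5: <5,7><9,12>; 9 → 5: <5,8><9,13>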
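
To make the labeled representation concrete, here is a minimal sketch that replays the ten-statement example program for N=2, records a timestamp pair on each dependence edge, and then slices on z at print(z). The helper names (`run_example`, `step`, `dynamic_slice`) are hypothetical, and two modeling choices are assumptions: the loop variable i is not tracked as data, and each statement inside the loop is treated as control dependent on the loop header or the enclosing if, as the slide's labels suggest.

```python
from collections import defaultdict


def run_example(n):
    """Replay the example program; return (trace, edges).

    trace[t-1] is the statement executed at timestamp t.
    edges[(user, source)] is a list of (t_source, t_user) labels.
    """
    trace = []
    edges = defaultdict(list)
    last_def = {}   # variable -> (statement, timestamp) of its latest write
    val = {}        # concrete values; "p" holds the name of the pointee

    def step(stmt, uses=(), defines=None, ctrl=None):
        trace.append(stmt)
        t = len(trace)
        sources = [last_def[v] for v in uses]
        if ctrl is not None:
            sources.append(ctrl)
        for src_stmt, src_t in sources:
            edges[(stmt, src_stmt)].append((src_t, t))
        if defines is not None:
            last_def[defines] = (stmt, t)
        return (stmt, t)

    step(1, defines="z")
    val["z"] = 0
    step(2, defines="a")
    val["a"] = 0
    step(3, defines="b")
    val["b"] = 2
    step(4, defines="p")
    val["p"] = "b"
    for i in range(1, n + 1):
        loop = step(5)
        cond = step(6, ctrl=loop)
        if i % 2 == 0:
            step(7, defines="p", ctrl=cond)
            val["p"] = "a"
        step(8, uses=["a"], defines="a", ctrl=loop)
        val["a"] += 1
        # Statement 9 reads p and whatever p currently points to.
        step(9, uses=["p", val["p"]], defines="z", ctrl=loop)
        val["z"] = 2 * val[val["p"]]
    step(10, uses=["z"])
    return trace, dict(edges)


def dynamic_slice(edges, stmt, t):
    """Backward traversal over statement instances, following matching labels."""
    seen = set()
    worklist = [(stmt, t)]
    while worklist:
        s, ts = worklist.pop()
        if (s, ts) in seen:
            continue
        seen.add((s, ts))
        for (user, source), labels in edges.items():
            if user != s:
                continue
            for t_src, t_use in labels:
                if t_use == ts:
                    worklist.append((source, t_src))
    return sorted({s for s, _ in seen})


trace, edges = run_example(2)
print(trace)
print(dynamic_slice(edges, 10, 14))
```

For N=2 the recorded labels agree with the pairs on the slide, and the slice of z at timestamp 14 leaves out statements 1, 3 and 4: in the last iteration p points to a, so b=2 and p=&b did not affect the printed value. A static slice would have to keep them.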

Page 10: Cost Effective  Dynamic  Program  Slicing

10

OPT: Compacted Graph Algorithm

Compaction

• Elimination of timestamp labels.
    - Remove labels that can be inferred
    - Transform dependence graph to enable elimination
    - Remove labels that are redundant

Fast Traversal

• Long search for relevant dependence is often replaced by quick computation of dependence
    - Consequence of compaction

Page 11: Cost Effective  Dynamic  Program  Slicing

11

OPT-1a. Infer Local Def-Use Labels: Full Elimination

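[Figure: a local def-use edge between "X =" and "= X" carrying labels (10,10), (20,20), (30,30); after the optimization the edge carries only the label 0.]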

Assign timestamps on node level

Page 12: Cost Effective  Dynamic  Program  Slicing

12

OPT-1b. Infer Local Def-Use Labels: Partial Elimination In Presence of Aliasing

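[Figure: the statements "X =", "*P =", "= X" shown before and after the optimization. Before: labels (10,10) and (20,20). After: one edge keeps (10,10) and the other carries label 0.]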

*P is a may alias of X

Page 13: Cost Effective  Dynamic  Program  Slicing

13

OPT-2a. Transform Local Def-Use Labels: Full Elimination In Presence of Aliasing

[Figure: the statements "Z =", "Y =", "X = f(Y)", "*P = g(Z)", "= X" shown before and after the transformation. Before: edge labels (10,11), (20,21), (11,11), (21,21). After: the transformed graph has edges with label 0.]

Page 14: Cost Effective  Dynamic  Program  Slicing

14

OPT-2b. Transform Non-local Def-Use to Local Use-Use Edges

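[Figure: a definition "X =" and two uses "= X", shown before and after the transformation. Before: both def-use edges carry labels (10,11), (20,21). After: one def-use edge keeps (10,11), (20,21) and a use-use edge with label 0 connects the two uses.]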

Page 15: Cost Effective  Dynamic  Program  Slicing

15

OPT-2c. Transform Non-Local Def-Use to Local Def-Use Edges

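[Figure: a definition of X, two definitions of Y (marked 1 and 2) and the uses "= Y", "= X", shown before and after the transformation. Before: labels (1,3), (2,3), (10,12), (11,12). After: a "node for path" is added, labels (1,3), (2,3) remain, and the new local edges carry label 0.]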

Page 16: Cost Effective  Dynamic  Program  Slicing

16

OPT-3. Redundant Labels Across Non-Local Def-Use Edges

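[Figure: definitions "X =", "Y =" and uses "= Y", "= X", shown before and after the optimization. Before: both non-local def-use edges carry the labels (1,2), (10,11). After: the labels (1,2), (10,11) appear only once.]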

Page 17: Cost Effective  Dynamic  Program  Slicing

17

OPT-4.(Control Dep.) Infer Fixed Distance Unique Control Ancestor

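| Path | Timestamps |
|---|---|
| 1.2.3.5 | 10.11.12.13 |
| 1.2.4.5 | 20.21.22.23 |
| 1.2.3.4.5 | 30.31.32.33.34 |

[Figure: graph over nodes 1-5 with control dependence labels (10,11), (20,21), (30,31); (11,12), (31,32); (21,22); (32,33); (10,13), (20,23), (30,34). After the optimization two edges carry the fixed distance 1 in place of their labels.]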
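
The inference can be checked against the slide's own numbers with a minimal sketch. The paths and timestamps come from the slide; the `CONTROL_ANCESTORS` table is an assumption read off the timestamp pairs, and picking the most recently executed candidate as the dynamic control ancestor is likewise an assumption of this sketch.

```python
# Paths and timestamps as listed on the slide.
PATHS = [
    ([1, 2, 3, 5], [10, 11, 12, 13]),
    ([1, 2, 4, 5], [20, 21, 22, 23]),
    ([1, 2, 3, 4, 5], [30, 31, 32, 33, 34]),
]

# Assumption: possible control ancestors per node, read off the labels.
CONTROL_ANCESTORS = {2: [1], 3: [2], 4: [2, 3], 5: [1]}


def control_labels(paths, ancestors):
    """Build (t_ancestor, t_node) labels for each control dependence edge."""
    labels = {}
    for nodes, stamps in paths:
        when = dict(zip(nodes, stamps))
        for node, candidates in ancestors.items():
            executed = [a for a in candidates if a in when]
            if node not in when or not executed:
                continue
            # Assumption: the latest executed candidate is the dynamic ancestor.
            anc = max(executed, key=when.get)
            labels.setdefault((node, anc), []).append((when[anc], when[node]))
    return labels


def fixed_distances(labels):
    """Nodes with one control ancestor at a constant timestamp distance."""
    by_node = {}
    for (node, anc), pairs in labels.items():
        by_node.setdefault(node, []).append(pairs)
    result = {}
    for node, edge_pairs in by_node.items():
        if len(edge_pairs) != 1:
            continue  # multiple control ancestors: not inferable this way
        gaps = {t_node - t_anc for t_anc, t_node in edge_pairs[0]}
        if len(gaps) == 1:
            result[node] = gaps.pop()
    return result


labels = control_labels(PATHS, CONTROL_ANCESTORS)
print(fixed_distances(labels))
```

Nodes 2 and 3 come out with distance 1, so for them the ancestor's timestamp is simply the node's timestamp minus 1 and the labels need not be stored. Node 4 has two different ancestors and node 5 sees distances 3 and 4, so neither qualifies without further transformation.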

Page 18: Cost Effective  Dynamic  Program  Slicing

18

OPT-5a. Transform Multiple Control Ancestors

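[Figure: the graph over nodes 1-5 shown before and after the transformation. Before: labels (32,33), (21,22) and (10,13), (20,23), (30,34), with fixed distance 1 on the remaining edges. After: the transformed graphs keep (10,13), (30,34) and carry labels 0 and 1 on the other edges.]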

Page 19: Cost Effective  Dynamic  Program  Slicing

19

OPT-5b. Transform Varying Distance to Unique Control Ancestors

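[Figure: the graph over nodes 1-5 shown before and after the transformation. Before: edge labels 1, 1, 1, 3. After: the restructured graph carries label 0 on its edges.]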

Page 20: Cost Effective  Dynamic  Program  Slicing

20

OPT-6. Redundant Across Non-Local Def- Use and Control Dependence Edges

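[Figure: the statements "X =", "If P", "= X" shown before and after the optimization. Before: the def-use edge and the control dependence edge both carry label (1,2). After: the label (1,2) appears only once.]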

Page 21: Cost Effective  Dynamic  Program  Slicing

21

Completeness of Label Elimination Optimizations

Data Dependence Labels

• Local to a basic block
    - Infer (OPT-1a, OPT-1b)
    - Transform (OPT-2a)

• Non-Local across basic blocks
    - Transform (OPT-2b, OPT-2c)
    - Redundant (OPT-3)

Control Dependence Labels
    - Infer (OPT-4)
    - Transform (OPT-5a, OPT-5b)
    - Redundant (OPT-6)

Page 22: Cost Effective  Dynamic  Program  Slicing

22

Slicing algorithm (1)

    s2: x=
    s1: v=f(x,…)        (def-use edge from s1 to s2 with label 0)

    Slice(v,s1) @ t = {s2} ∪ Slice(x,s2) @ t

Page 23: Cost Effective  Dynamic  Program  Slicing

23

Slicing algorithm (2)

    s2: …=x
    s1: v=f(x,…)        (use-use edge from s1 to s2 with label 0)

    Slice(v,s1) @ t = Slice(x,s2) @ t

Page 24: Cost Effective  Dynamic  Program  Slicing

24

Slicing algorithm (3)

    s3: x=…
    s4: x=…
    s1: v=f(x,…)        (edge from s1 to s3 labeled …<t’,t>…; edge from s1 to s4 labeled …)

    Slice(v,s1) @ t = {s3} ∪ Slice(x,s3) @ t’
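
The three traversal rules can be combined into one recursive function. The following is a minimal sketch under stated assumptions: the `Edge` record, the `GRAPH` layout (statement, then used variable, then candidate edges) and the five-statement example are hypothetical, and labeled edges are tried before label-0 edges. It does no memoization and does not guard against cycles, which a real implementation would need.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    target: int                    # statement the edge points to
    use_use: bool = False          # True: target is an earlier use, not a definition
    labels: Optional[list] = None  # [(t_source, t_use), ...]; None stands for label 0


def slice_var(graph, stmt, var, t):
    """Statements that affected `var` as used by `stmt` at timestamp `t`."""
    edges = graph.get(stmt, {}).get(var, [])
    # Try explicitly labeled edges first, then fall back to label-0 edges.
    for edge in sorted(edges, key=lambda e: e.labels is None):
        if edge.labels is None:
            t_src = t                      # rules (1) and (2): same timestamp
        else:
            t_src = next((ts for ts, tu in edge.labels if tu == t), None)
            if t_src is None:
                continue                   # this edge was not exercised at t
        if edge.use_use:
            # Rule (2): the earlier use is not added to the slice itself.
            return slice_var(graph, edge.target, var, t_src)
        # Rules (1) and (3): add the defining statement and continue from it.
        return {edge.target} | slice_stmt(graph, edge.target, t_src)
    return set()


def slice_stmt(graph, stmt, t):
    """Union of the slices of every variable used by `stmt` at timestamp `t`."""
    result = set()
    for var in graph.get(stmt, {}):
        result |= slice_var(graph, stmt, var, t)
    return result


# Hypothetical compacted graph:
#   0: x = ...    1: x = ...    2: y = ...    3: ... = x    4: v = f(x, y)
GRAPH = {
    3: {"x": [Edge(0, labels=[(1, 10)]), Edge(1, labels=[(15, 20)])]},
    4: {"x": [Edge(3, use_use=True)], "y": [Edge(2)]},
}

print(sorted(slice_stmt(GRAPH, 4, 20)))
print(sorted(slice_stmt(GRAPH, 4, 10)))
```

At timestamp 20 the use-use edge leads to statement 3, whose labeled edge <15,20> selects definition 1; at timestamp 10 the label <1,10> selects definition 0 instead. Statement 2 is reached through a label-0 edge in both cases.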

Page 25: Cost Effective  Dynamic  Program  Slicing

25

Shortcuts to Speed Up Traversal

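    0: X =
    1: Y = f(X)
    2: Z = g(Y)
    3: … = Z

[Figure: the chain of statements 0-3 shown without and with shortcuts. Without: one edge carries labels (10,11), (20,21) and two edges carry label 0. With: the labels (10,11), (20,21) and one label 0 remain, and a shortcut annotated {2} is added.]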

Page 26: Cost Effective  Dynamic  Program  Slicing

26

Experimental Setup

Implementation

• Trimaran: C programs, IR (intermediate representation)

• An instrumented interpreter executes the IR and collects a compact control flow trace and memory trace.

• CFG and PDG are constructed at the IR level, so slicing is also done at the IR level.

Experiment

• To get fair comparisons among algorithms, we shared as much code as possible across the different implementations.

• 2.2 GHz Pentium, 2 GB RAM, 1 GB swap space.

• For each benchmark we collected 3 different traces; for each trace we computed 25 slices at random.

Page 27: Cost Effective  Dynamic  Program  Slicing

27

OPT: Compacted Graph Sizes

| Program | Graph Size (MB), Before | Graph Size (MB), After | Before / After | Explicit Dependences (%) |
|---|---|---|---|---|
| 300.twolf | 1,568 | 210 | 7.72 | 13.40 |
| 256.bzip2 | 1,296 | 51 | 25.68 | 3.89 |
| 255.vortex | 1,442 | 65 | 22.26 | 4.49 |
| 197.parser | 1,816 | 70 | 26.03 | 3.84 |
| 181.mcf | 1,535 | 170 | 9.02 | 11.09 |
| 164.gzip | 835 | 52 | 16.19 | 6.18 |
| 134.perl | 1,954 | 21 | 93.40 | 1.07 |
| 130.li | 1,745 | 97 | 18.09 | 5.53 |
| 126.gcc | 1,534 | 75 | 20.54 | 4.87 |
| 099.go | 1,707 | 131 | 13.01 | 7.69 |

Page 28: Cost Effective  Dynamic  Program  Slicing

28

OPT: Effects

Page 29: Cost Effective  Dynamic  Program  Slicing

29

OPT: Slicing Times at Different Execution Points

Page 30: Cost Effective  Dynamic  Program  Slicing

30

OPT: Benefit of Shortcuts

OPT Slicing Times (Avg. of 25 slices)

| Program | W/O Shortcuts (Seconds) | With Shortcuts (Seconds) |
|---|---|---|
| 300.twolf | 68.0 | 36.3 |
| 256.bzip2 | 6.1 | 2.1 |
| 255.vortex | 5.6 | 1.9 |
| 197.parser | 4.9 | 2.2 |
| 181.mcf | 22.0 | 17.1 |
| 164.gzip | 4.5 | 1.7 |
| 134.perl | 12.6 | 4.1 |
| 130.li | 15.7 | 6.1 |
| 126.gcc | 9.8 | 3.8 |
| 099.go | 26.9 | 11.4 |

Page 31: Cost Effective  Dynamic  Program  Slicing

31

OPT vs. LP: Graph Sizes

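Graph Size (MB)

| Program | OPT | LP (Max. of 25) |
|---|---|---|
| 300.twolf | 210 | 296 |
| 256.bzip2 | 51 | 81 |
| 255.vortex | 65 | 35 |
| 197.parser | 70 | 40 |
| 181.mcf | 170 | 113 |
| 164.gzip | 52 | 35 |
| 134.perl | 21 | 54 |
| 130.li | 97 | 105 |
| 126.gcc | 75 | 57 |
| 099.go | 131 | 162 |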

Page 32: Cost Effective  Dynamic  Program  Slicing

32

OPT vs. LP: Slicing Times

Slicing Times (Avg. of 25 slices)

| Program | OPT (Seconds) | LP (Minutes) |
|---|---|---|
| 300.twolf | 36.3 | 13.9 |
| 256.bzip2 | 2.1 | 9.2 |
| 255.vortex | 1.9 | 10.2 |
| 197.parser | 2.2 | 9.9 |
| 181.mcf | 17.1 | 12.3 |
| 164.gzip | 1.7 | 4.7 |
| 134.perl | 4.1 | 25.2 |
| 130.li | 6.1 | 11.3 |
| 126.gcc | 3.8 | 12.1 |
| 099.go | 11.4 | 10.7 |
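
Since the two columns use different units, the comparison is easier to read as a speedup factor. The following minimal sketch converts the LP times to seconds and divides by the OPT times; the numbers are the ones on this slide, and the variable names are chosen here for illustration.

```python
# Average slicing times from the slide: OPT in seconds, LP in minutes.
OPT_SECONDS = {
    "300.twolf": 36.3, "256.bzip2": 2.1, "255.vortex": 1.9, "197.parser": 2.2,
    "181.mcf": 17.1, "164.gzip": 1.7, "134.perl": 4.1, "130.li": 6.1,
    "126.gcc": 3.8, "099.go": 11.4,
}
LP_MINUTES = {
    "300.twolf": 13.9, "256.bzip2": 9.2, "255.vortex": 10.2, "197.parser": 9.9,
    "181.mcf": 12.3, "164.gzip": 4.7, "134.perl": 25.2, "130.li": 11.3,
    "126.gcc": 12.1, "099.go": 10.7,
}

# Speedup of OPT over LP, with both times expressed in seconds.
speedup = {p: LP_MINUTES[p] * 60 / OPT_SECONDS[p] for p in OPT_SECONDS}

for program, factor in sorted(speedup.items(), key=lambda kv: kv[1]):
    print(f"{program:12s} {factor:6.1f}x")
```

With these figures the factor ranges from about 23x for 300.twolf to about 369x for 134.perl.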

Page 33: Cost Effective  Dynamic  Program  Slicing

33

Traditional vs. OPT: Short Program Runs

Slicing Times (Avg. of 25 slices)

| Program | OPT (Seconds) | Traditional (Seconds) |
|---|---|---|
| 300.twolf | 36.3 : 68.0 | 66.0 |
| 256.bzip2 | 2.1 : 6.1 | 5.9 |
| 255.vortex | 1.9 : 5.6 | 6.2 |
| 197.parser | 2.2 : 4.86 | 5.3 |
| 181.mcf | 17.1 : 22.0 | 21.7 |
| 164.gzip | 4.5 : 1.7 | 4.8 |
| 134.perl | 4.1 : 12.6 | - |
| 130.li | 6.1 : 15.7 | 17.9 |
| 126.gcc | 3.8 : 9.8 | 11.0 |
| 099.go | 11.4 : 26.9 | 29.8 |

Page 34: Cost Effective  Dynamic  Program  Slicing

34

Graph Construction Cost

• Trace Generation - Instrumented program takes twice as long to run as the uninstrumented program.

• Trace Preprocessing for Graph Construction: Time(LP) < Time(OPT) < Time(Traditional)

| Program | LP (min) | OPT (min) | Trad. (min) |
|---|---|---|---|
| 300.twolf | 14.54 | 65.29 | 99.62 |
| 256.bzip2 | 9.38 | 38.36 | 80.78 |
| 255.vortex | 16.35 | 44.46 | 55.47 |
| 197.parser | 16.23 | 44.06 | 67.57 |
| 181.mcf | 16.64 | 53.64 | 71.17 |
| 164.gzip | 14.56 | 23.52 | 31.66 |
| 134.perl | 17.18 | 51.12 | - |
| 130.li | 19.23 | 49.88 | 74.86 |
| 126.gcc | 26.65 | 48.83 | 52.70 |
| 099.go | 17.06 | 35.24 | 42.17 |

Page 35: Cost Effective  Dynamic  Program  Slicing

35

Conclusion

A straightforward implementation of a precise algorithm is not practical.

Carefully designed precise dynamic slicing algorithms provide precise dynamic slices at reasonable space and time costs.

Our work is one step toward making dynamic slicing practical.

• Ongoing work: efficient online compression gives another 5-10 times reduction: 15 MB for 150 million executed statements (over 100 times reduction in total), at a 4-10 times slowdown.