
Parametric Heap Usage Analysis for Functional Programs

Leena Unnikrishnan

Scott D. Stoller


Motivation

• Goal: upper bound on live heap usage.

• Understanding memory usage of programs.

• Guiding automatic methods, e.g., help program transformations select space-efficient programs.

• Required for time analysis: running times depend on memory allocation, garbage collection, etc.

• Determining/verifying space usage of embedded applications.


Challenges

• Determine live heap usage over entire program execution, and over multiple execution paths.

• Unlike running times, live heap space increases and decreases during execution.

• Static determination of liveness of objects.


Overview

• Input: Program P in a functional language with lists.

• Output: worst-case live heap usage of P expressed in terms of sizes of P’s inputs.

• Example Input: program reverse(ls) to reverse a list

Output: Max live heap usage = 2|ls| - 1, if |ls| > 0

• Method

– Transform functions in P into bound functions that describe their heap space usage and related metrics.

– Simplify and rewrite bound functions into recurrences.

– Solve recurrences into closed forms.


Outline

• Bound functions

• Recurrences containing the max operator

• Soundness checks for R functions

• Examples

• Related work

• Conclusion


Bound Functions

• Construct three bound functions for each function f in program P.

• Arguments of each bound function for f(v1,…,vn) are the sizes v1s,…,vns of v1,…,vn.

• Size of a boolean or number: its value

• Size of a flat list: its length


Bound Functions

For f(v1,…,vn) in the input program, construct

• Live heap space bound function, Sf(v1s,…,vns): upper bound on live heap use of f, for all possible arguments v1,…,vn of sizes v1s,…,vns.

• New result-space bound function, Nf(v1s,…,vns, v1w,…,vnw): upper bound on amount of newly allocated space in the result of f, for all possible arguments v1,…,vn of sizes v1s,…,vns. viw is the amount of new space in vi.

– “New” is w.r.t. start of evaluation of call to f.

• Size bound function, Rf(v1s,…,vns): upper bound on size of result of f, for all possible arguments v1,…,vn of sizes v1s,…,vns.


Live Heap Space Bound Functions

• For f(v1,…,vn) = e, Sf(v1s,…,vns) = S[e].

• Example clauses in the definition of S[.].

S[v] = 0

S[cons(e1, e2)] = 1 + max(S[e1], S[e2])

S[f(e1, e2)] = max(S[e1], N[e1] + S[e2], N[e1] + N[e2] + Sf(R[e1], R[e2]))

reverse(ls) = if null?(ls) then nil
              else append(reverse(cdr(ls)), cons(car(ls), nil))

Srev(lss) = if lss=0 then 0
            else max(Srev(lss-1),
                     Nrev(lss-1)+1,
                     Nrev(lss-1)+1+Sapp(Rrev(lss-1)))
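The Srev recurrence can be sanity-checked numerically. The Python sketch below transcribes it directly, but it has to assume three helper bounds that this slide does not define: Sapp(n) = n (append copies its first argument), and Rrev(n) = Nrev(n) = n (reverse returns a newly built list of the same length). Under those assumptions the computed values match the bound 2|ls| - 1 quoted in the overview.

```python
from functools import lru_cache

# Assumed helper bounds (not given on the slide):
def s_app(n):
    """Live heap bound for append whose first argument has size n."""
    return n

def r_rev(n):
    """Size bound for the result of reverse."""
    return n

def n_rev(n):
    """New space in the result of reverse, approximated by its size."""
    return n

@lru_cache(maxsize=None)
def s_rev(n):
    """Srev transcribed from the slide, evaluated by memoised recursion."""
    if n == 0:
        return 0
    return max(s_rev(n - 1),
               n_rev(n - 1) + 1,
               n_rev(n - 1) + 1 + s_app(r_rev(n - 1)))

# Compare the recurrence with the closed form 2n - 1 for a few sizes.
for n in range(1, 6):
    print(n, s_rev(n), 2 * n - 1)
```

The third argument of max dominates for every n > 0, which is where the 2n - 1 comes from.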


Size Bound Functions

• For f(v1,…,vn) = e, Rf(v1s,…,vns) = R[e].

• Example clauses in the definition of R[.].

R[v] = vs

R[cons(e1, e2)] = 1 + R[e2]

R[if p then e1 else e2] = { max(R[e1], R[e2]),               if p contains “car”
                          { if R[p] then R[e1] else R[e2],   otherwise

R[f(e1,…,en)] = Rf(R[e1],…,R[en])

reverse(ls) = if null?(ls) then nil
              else append(reverse(cdr(ls)), cons(car(ls), nil))

Rrev(lss) = if lss=0 then 0 else Rapp(Rrev(lss-1), 1)
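A minimal Python sketch of how the R[.] clauses could be evaluated on sizes follows. The tuple encoding of expressions, the names `size_bound` and `R_FUNCS`, and the bound Rapp(a, b) = a + b are all assumptions made for illustration; the slide only gives the clauses themselves. The size of cdr(ls) is modelled by a variable `rest` bound to lss-1, matching how the slide writes Rrev.

```python
# Hypothetical expression encoding for this sketch:
#   ("var", name) | ("nil",) | ("cons", e1, e2) | ("call", fname, [args])

def size_bound(expr, env, r_funcs):
    """Evaluate R[expr]. env maps variables to sizes; r_funcs maps names to R functions."""
    tag = expr[0]
    if tag == "var":                    # R[v] = vs
        return env[expr[1]]
    if tag == "nil":                    # assumed: the empty list has size 0
        return 0
    if tag == "cons":                   # R[cons(e1, e2)] = 1 + R[e2]
        return 1 + size_bound(expr[2], env, r_funcs)
    if tag == "call":                   # R[f(e1,...,en)] = Rf(R[e1],...,R[en])
        args = [size_bound(a, env, r_funcs) for a in expr[2]]
        return r_funcs[expr[1]](*args)
    raise ValueError(f"unknown expression tag: {tag}")

# Else-branch of reverse: append(reverse(cdr(ls)), cons(car(ls), nil))
REVERSE_BODY = ("call", "append", [
    ("call", "reverse", [("var", "rest")]),
    ("cons", ("var", "head"), ("nil",)),
])

def r_rev(n):
    """Rrev(lss) = if lss=0 then 0 else Rapp(Rrev(lss-1), 1)."""
    if n == 0:
        return 0
    return size_bound(REVERSE_BODY, {"rest": n - 1, "head": 0}, R_FUNCS)

R_FUNCS = {
    "append": lambda a, b: a + b,       # assumed Rapp
    "reverse": r_rev,
}

print([r_rev(n) for n in range(6)])
```

With the assumed Rapp, the recurrence collapses to Rrev(lss) = lss, as expected for a function that returns a permutation of its input.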


New Result-Space Bound Functions

• For f(v) in the input, the N function is Nf(vs, vw).

• f(e’) in input program yields Nf(R[e’], N[e’]) in bound functions.

– “New” is w.r.t. the start of evaluation of f(e’).

– vw is the part of v created in e’.

– Nf(…) is the part of the result created in e’ or the body of f.

• For f(v) = e, Nf(vs, vw) = N[e].

• Example clauses in the definition of N[.].

N[v] = vw , N[cons(e1, e2)] = 1 + N[e2]

N[f(e1,…,en)] = Nf(R[e1],…,R[en], N[e1],…,N[en])


New Result-Space Bound Functions

reverse(ls) = if null?(ls) then nil
              else append(reverse(cdr(ls)), cons(car(ls), nil))

Nrev(lss, lsw) = if lss=0 then 0 else Napp(Rrev(lss-1), 1, Nrev(lss-1, 0), 1)
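To see what Nrev evaluates to, the recurrence can be run in Python. The slide does not define Napp or a closed form for Rrev, so the sketch assumes Rrev(n) = n and Napp(l1s, l2s, l1w, l2w) = l1s + l2w, i.e. append copies every cell of its first argument and shares its second argument. These are illustrative assumptions, not definitions from the talk.

```python
def r_rev(n):
    """Assumed closed form for Rrev."""
    return n

def n_app(l1s, l2s, l1w, l2w):
    """Assumed Napp: the copied first list is all new, plus the new part of the second list."""
    return l1s + l2w

def n_rev(lss, lsw=0):
    """Nrev(lss, lsw) transcribed from the slide; lsw is unused, as in the original."""
    if lss == 0:
        return 0
    return n_app(r_rev(lss - 1), 1, n_rev(lss - 1, 0), 1)

print([n_rev(n) for n in range(6)])
```

Under these assumptions Nrev(lss, lsw) = lss: every cell of the reversed list is newly allocated, regardless of how much of the input was new.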


Outline

• Bound functions

• Recurrences containing the max operator

• Soundness checks for R functions

• Examples

• Related work

• Conclusion


Composite Recurrences

• Bound functions are simplified into recurrences that may contain max. We call these composite recurrences.

• Example: For insert in insertion sort, Rins(n) = max(1+n, 1+Rins(n-1)), if n>0

• Solve composite recurrences using a library of solution templates.
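As a small illustration of a composite recurrence, the Python sketch below unrolls Rins iteratively. It uses the base case Rins(0) = 1 from the insertion-sort example later in the deck; the comparison with 1 + n is a numerical check, not a derivation.

```python
def r_ins(n):
    """Unroll Rins(n) = max(1+n, 1+Rins(n-1)) starting from Rins(0) = 1."""
    value = 1
    for k in range(1, n + 1):
        value = max(1 + k, 1 + value)
    return value

# Both arguments of max coincide at every step, so the recurrence grows linearly.
for n in range(6):
    print(n, r_ins(n), 1 + n)
```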


Solving Composite Recurrences: Example

For a recurrence of the form

T(n) = { c1,…,cj+1,                       if n = i,…,(i+j), respectively
       { max(e1(n), e2(n) + a·T(n-b)),    otherwise

where the arguments of max are equal for

(a) “base cases”, i.e., e1(n) = e2(n) + a·c(n-b-i+1), for n in [i+j+1, i+j+b], where c(k) denotes the base value ck

(b) all other values of n, i.e., e1(n) = e2(n) + a·e1(n-b), for n > (i+j+b)

the solution is T(n) = e1(n), for n > (i+j).
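The template can be exercised numerically. The Python sketch below checks conditions (a) and (b) up to a chosen limit and, separately, evaluates T(n) straight from the recurrence. The function names and the `n_max` cutoff are hypothetical, and a finite check like this is only evidence that the template applies, not a proof. The sketch also assumes b ≤ j+1 so that T(n-b) is always defined.

```python
def template_applies(e1, e2, a, b, i, base, n_max):
    """Check conditions (a) and (b) for i+j < n <= n_max; base = [c1, ..., c_{j+1}]."""
    j = len(base) - 1
    if b > j + 1:
        raise ValueError("T(n-b) would fall below the first base case")
    for n in range(i + j + 1, n_max + 1):
        if n <= i + j + b:
            prev = base[n - b - i]      # c_{n-b-i+1}, with base indexed from 0
        else:
            prev = e1(n - b)
        if e1(n) != e2(n) + a * prev:
            return False
    return True

def solve_directly(e1, e2, a, b, i, base, n):
    """Evaluate T(n) from the recurrence itself, for cross-checking."""
    table = {i + k: c for k, c in enumerate(base)}
    for m in range(i + len(base), n + 1):
        table[m] = max(e1(m), e2(m) + a * table[m - b])
    return table[n]

# Rins from insertion sort: base case Rins(0) = 1, e1(n) = 1+n, e2(n) = 1, a = b = 1.
e1 = lambda n: 1 + n
e2 = lambda n: 1
print(template_applies(e1, e2, a=1, b=1, i=0, base=[1], n_max=50))
print(solve_directly(e1, e2, a=1, b=1, i=0, base=[1], n=10), e1(10))
```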


Outline

• Bound functions

• Recurrences containing the max operator

• Soundness checks for R functions

• Examples

• Related work

• Conclusion


Soundness Checks on Uses of R Functions

• Intuition: It is not necessarily the object with the largest size that leads to worst-case live heap usage.

• R function results are arguments to S and N functions.

• Consider the R function, Rf(n), of greaterlist in quicksort.

Rf(n) = { 0,                          if n=0
        { max(1+Rf(n-1), Rf(n-1)),    o.w.
      = max(n, …, 0)

Rf(n) is the maximum of a set of values. Rf is a complex R function.
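The point that Rf(n) summarises a whole set of possible result sizes can be illustrated in Python. The definition of greaterlist is not shown on the slide; the sketch assumes it returns the elements of a list that are strictly greater than a pivot, as in a typical quicksort. It enumerates small inputs and compares the observed result sizes with Rf(n).

```python
from itertools import product

def greaterlist(pivot, ls):
    """Assumed definition: elements of ls strictly greater than pivot."""
    return [x for x in ls if x > pivot]

def r_greaterlist(n):
    """Rf(n) from the slide: 0 if n = 0, else max(1 + Rf(n-1), Rf(n-1))."""
    if n == 0:
        return 0
    prev = r_greaterlist(n - 1)
    return max(1 + prev, prev)

n = 3
# All lists of length n over {0, 1, 2}, filtered against pivot 1.
observed_sizes = {len(greaterlist(1, list(ls))) for ls in product([0, 1, 2], repeat=n)}
print(sorted(observed_sizes), r_greaterlist(n))
```

The result can have any size from 0 to n, while Rf(n) reports only the largest; this is why the soundness checks on uses of R functions are needed.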


Soundness Checks on Uses of R Functions

Using the result of Rf(n) as an argument to a bound function, say Sg, yields a true upper bound down the line only if the following hold:

– Case 1: If Sg(Rf(n)) is a recursive call in the body of Sg, then using the result of Rf(n) in the definition of Sg yields the maximum result. E.g., this holds for recurrences of the form

T(n) = { c1,…,cj+1,           if n = i,…,(i+j), respectively
       { e(n) + a·T(R(n)),     otherwise

where

• base values of T are monotonically increasing

• largest base value < value at smallest non-base input

• e(n) is monotonically increasing w.r.t. n

– Case 2: check if the solution of Sg is monotonically increasing w.r.t. its single argument.
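For Case 2, a bounded numerical check of monotonicity is easy to sketch in Python. The helper name and the range limits are hypothetical, and testing a finite range only gives evidence of monotonicity; an actual analysis would establish it symbolically from the closed form. The example uses the closed form Sins(lss) = 1 + lss from the insertion-sort example in this deck.

```python
def is_monotonically_increasing(f, lo, hi):
    """Return True if f(n) <= f(n+1) for every n in [lo, hi)."""
    return all(f(n) <= f(n + 1) for n in range(lo, hi))

s_ins = lambda n: 1 + n          # closed form of Sins from the example slides
sawtooth = lambda n: n % 3       # a deliberately non-monotonic function

print(is_monotonically_increasing(s_ins, 0, 100))
print(is_monotonically_increasing(sawtooth, 0, 100))
```

When the check succeeds, passing the largest possible size Rf(n) to Sg gives the largest value of Sg, so the result is still an upper bound.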


Improving Solvability of Recurrences

• Eliminate unused arguments from bound functions.

– Simplifies recurrences. May eliminate dependence on other recurrences.

• Use invariants to simplify max expressions.

– May simplify composite recurrences or make them regular.

– Example: Sf(e1,…,en) ≥ Nf(e1,…,en,0,…,0). So, max(Sf(e1,…,en), Nf(e1,…,en,0,…,0)) = Sf(e1,…,en).

• Use simpler, approximate definitions of N functions.

precise: Nf(v1s,…,vns, v1w,…,vnw) = N[e]

approx: Nf(v1s,…,vns) = Rf(v1s,…,vns)

In our examples, even with this simplification, we get tight bounds on worst-case space usage.


Examples

Program                            S Function (vs > c)                       Tight?
reverse(ls)                        Srev(lss) = 2lss – 1                      Yes
insertion-sort(ls)                 Sis(lss) = 2lss – 1                       Yes
selection-sort(ls)                 Sss(lss) = lss(lss+1)/2                   Yes
merge-sort(ls)                     Sms(lss) = 2lss – 1                       Yes
quicksort(ls)                      Sqs(lss) = 3·2^(lss-1) – 1                No
longest-common-subseq(ls1, ls2)    Slcs(ls1s, ls2s) = 2ls2s + 2              Yes
string-edit-distance(ls1, ls2)     Sse(ls1s, ls2s) = 2ls1 + 2ls2 + 1         Yes
binomial-coefficient(n, m)         Sbc(ns, ms) = 2ms+2, if ms<ns             Yes

• Derives linear, quadratic, and exponential bounds.

• Bounds are tight (exact) for all examples except quicksort.


Related Work

• Timing analysis using recurrence relations [Wegbreit ‘75, LeMetayer ‘88].

• Static prediction of heap space [Hofmann & Jost ‘03]

– Applies to a linearly-typed functional language.

– Derives linear heap bounds.

• Parametric prediction of heap memory requirements [Braberman et al ’08]

– Java-like language.

– Region-based memory management models heap recovery.

– Derives polynomial bounds.


Related Work

• Memory resource bounds for low-level programs [Chin et al ‘08]:

– Assembly-like language.

– Explicit deallocation to model heap recovery.

– Derives linear bounds.

• Cost analysis [Albert et al ‘07]:

– Java bytecode.

– Escape analysis to model heap recovery.

– Analysis for live heap at ISMM ‘09!


Conclusion

• Automatic, accurate analysis of live heap usage of functional programs.

– Framework of bound functions that describe live heap usage of input programs.

– Methods to simplify and solve composite recurrences.

– Soundness checks involving monotonicity and monotonicity-like properties of recurrences.

• Results are at source-code level and are easy to understand.

• Results are not restricted to any complexity class.


Thank You!


Extensions to Handle Other Data Types

• Specialize R functions and size arguments to each data type.

• Binary Trees

– Possible size measures: height, total.

– R functions: Rheight, Rtotal

• Extend S, N, R transformations to new data type.

Rheight[tree(e, el, er)] = 1 + max(Rheight[el], Rheight[er])

Rtotal[tree(e, el, er)] = 1 + Rtotal[el] + Rtotal[er]

• Live heap usage of a given program may be better represented in terms of one specific size measure, than others.
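The two size measures mirror the Rheight and Rtotal clauses directly. The Python sketch below computes both on a concrete tree; the `Tree` class and the convention that an empty tree has size 0 under either measure are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tree:
    value: int
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None

def height(t):
    """Height measure: 1 + max of the subtree heights (empty tree assumed to be 0)."""
    if t is None:
        return 0
    return 1 + max(height(t.left), height(t.right))

def total(t):
    """Total measure: 1 + sum of the subtree totals (empty tree assumed to be 0)."""
    if t is None:
        return 0
    return 1 + total(t.left) + total(t.right)

# A left-leaning tree: the two measures diverge as the tree gets less balanced.
t = Tree(1, Tree(2, Tree(3)), Tree(4))
print(height(t), total(t))
```

A program that walks one root-to-leaf path is naturally bounded in terms of height, while one that copies the whole tree is naturally bounded in terms of total.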


Example

Input Program

insertion-sort(ls) = if null?(ls) then nil
                     else insert(car(ls), insertion-sort(cdr(ls)))

insert(x, ls) = if null?(ls) then cons(x, nil)
                else if lesseroreq?(x, car(ls)) then cons(x, ls)
                else cons(car(ls), insert(x, cdr(ls)))

Bound Functions

Sis(lss) = if lss=0 then 0 else max(Sis(lss-1), Nis(lss-1) + Sins(Ris(lss-1)))

Nis(lss) = if lss=0 then 0 else Nins(Ris(lss-1))

Ris(lss) = if lss=0 then 0 else Rins(Ris(lss-1))

Sins(lss) = if lss=0 then 1 else max(1, 1+Sins(lss-1))

Nins(lss) = Rins(lss)

Rins(lss) = if lss=0 then 1+0 else max(1+lss, 1+Rins(lss-1))


Example

Input Program

insertion-sort(ls) = if null?(ls) then nil
                     else insert(car(ls), insertion-sort(cdr(ls)))

insert(x, ls) = if null?(ls) then cons(x, nil)
                else if lesseroreq?(x, car(ls)) then cons(x, ls)
                else cons(car(ls), insert(x, cdr(ls)))

Recurrences/Equations

Sis(lss) = { 0,                          if lss=0
           { max(Sis(lss-1), 2lss-1),    otherwise

Nis(lss) = lss

Ris(lss) = { 0,                 if lss=0
           { 1 + Ris(lss-1),    otherwise

Sins(lss) = { 1,                         if lss=0
            { max(1, 1+Sins(lss-1)),     otherwise

Nins(lss) = 1+lss

Rins(lss) = { 1,                              if lss=0
            { max(1+lss, 1+Rins(lss-1)),      otherwise


Example

Input Program

insertion-sort(ls) = if null?(ls) then nil
                     else insert(car(ls), insertion-sort(cdr(ls)))

insert(x, ls) = if null?(ls) then cons(x, nil)
                else if lesseroreq?(x, car(ls)) then cons(x, ls)
                else cons(car(ls), insert(x, cdr(ls)))

Closed Forms

Sis(lss) = { 0,         if lss=0
           { 2lss-1,    otherwise

Nis(lss) = lss

Ris(lss) = lss

Sins(lss) = 1+lss

Nins(lss) = 1+lss

Rins(lss) = 1+lss
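The closed forms for insertion sort can be cross-checked against the bound functions from the earlier example slide. The Python sketch below transcribes those six bound functions as given and compares them with the closed forms over a small range. The Python names are transliterations chosen for this sketch, and agreement on a finite range is a sanity check rather than a proof.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def r_ins(n):
    """Rins(lss) = if lss=0 then 1+0 else max(1+lss, 1+Rins(lss-1))."""
    return 1 if n == 0 else max(1 + n, 1 + r_ins(n - 1))

def n_ins(n):
    """Nins(lss) = Rins(lss), the approximate N definition."""
    return r_ins(n)

@lru_cache(maxsize=None)
def s_ins(n):
    """Sins(lss) = if lss=0 then 1 else max(1, 1+Sins(lss-1))."""
    return 1 if n == 0 else max(1, 1 + s_ins(n - 1))

@lru_cache(maxsize=None)
def r_is(n):
    """Ris(lss) = if lss=0 then 0 else Rins(Ris(lss-1))."""
    return 0 if n == 0 else r_ins(r_is(n - 1))

def n_is(n):
    """Nis(lss) = if lss=0 then 0 else Nins(Ris(lss-1))."""
    return 0 if n == 0 else n_ins(r_is(n - 1))

@lru_cache(maxsize=None)
def s_is(n):
    """Sis(lss) = if lss=0 then 0 else max(Sis(lss-1), Nis(lss-1) + Sins(Ris(lss-1)))."""
    return 0 if n == 0 else max(s_is(n - 1), n_is(n - 1) + s_ins(r_is(n - 1)))

# Each row: size, then (recurrence value, closed form) for Sis, Nis, Ris.
for n in range(1, 6):
    print(n, (s_is(n), 2 * n - 1), (n_is(n), n), (r_is(n), n))
```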