Shape Analysis for Low-level Code Hongseok Yang (Seoul National University) (Joint work with Cristiano Calcagno, Dino Distefano and Peter O’Hearn)

Page 1: Shape Analysis for Low-level Code

Shape Analysis for Low-level Code

Hongseok Yang (Seoul National University)

(Joint work with Cristiano Calcagno, Dino Distefano and Peter O’Hearn)

Page 2: Shape Analysis for Low-level Code

Dream

Automatically verify the memory safety of systems code, such as device drivers and memory managers.

Challenges:
1. Pointer arithmetic.
2. Scalability.
3. Concurrency.

Page 3: Shape Analysis for Low-level Code

Our Analyzer

• Handles programs for dynamic memory management.
• Experimental results (Pentium 3.2GHz, 4GB).
• Found a hidden assumption of the K&R memory manager. These are “fixed” versions.
• Proved memory safety and even partial correctness.

Page 4: Shape Analysis for Low-level Code

Sample Analysis Result

Program: ans = malloc_bestfit_acyclic(n);

Precondition: n≥2 ∧ mls(freep,0)

Postcondition:
(ans=0 ∧ n≥2 ∧ mls(freep,0))
∨ (n≥2 ∧ nd(ans,q’,n) * mls(freep,0))
∨ (n≥2 ∧ nd(ans,q’,n) * mls(freep,q’) * mls(q’,0))

Page 5: Shape Analysis for Low-level Code

Hidden Assumption in K&R Malloc/Free

[Figure: memory layout along an address axis marked 0 and 220, divided into Global Vars, Stack and Heap.]

Page 10: Shape Analysis for Low-level Code

Multiword Lists

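[Figure: a multiword list in memory, headed by lp. Each node starts with a Link Field followed by a Size Field; the link and size cells shown are (15, 3), (18, 3), (24, 5) and (nil, 2).]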

Page 11: Shape Analysis for Low-level Code

Coalescing

p = lp;
while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = q;
  }
}

[Figure: animation of the loop on the multiword list. p starts at lp and q is loaded from the link field of p. When the node at p ends exactly where the node at q begins, the size field at p changes from 3 to 8 and its link field from 18 to 24, which removes the node at 18 from the list. The loop stops with p=0.]

Nodeful High-level View

Nodeful High-level View

Nodeless Low-level View

Complex numerical relationships are used only for reconstructing a high-level view.
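The coalescing loop from the preceding slides can be replayed on a concrete heap. The following is a minimal Python sketch, not the analyzer itself: the dictionary-based memory model, the names `coalesce` and `nodes`, and the example heap (nodes at addresses 5, 15, 18 and 24 with sizes 3, 3, 5 and 2, read off the figure residue) are all assumptions made for illustration.

```python
def coalesce(mem, lp):
    """Replay the coalescing loop on a word-addressed heap.

    mem maps an address to the word stored there. A node at address p
    keeps its link in mem[p] and its size in mem[p + 1]; 0 plays nil.
    """
    p = lp
    while p != 0:
        q = mem[p]                                  # local q = *p
        if p + mem[p + 1] == q:                     # node at p ends where q begins
            mem[p + 1] = mem[p + 1] + mem[q + 1]    # absorb the size of q
            mem[p] = mem[q]                         # bypass q in the list
        else:
            p = q
    return mem


def nodes(mem, lp):
    """Return the (address, size) pairs reachable from lp."""
    out, p = [], lp
    while p != 0:
        out.append((p, mem[p + 1]))
        p = mem[p]
    return out


# Assumed example heap: 5 -> 15 -> 18 -> 24 -> nil.
heap = {5: 15, 6: 3, 15: 18, 16: 3, 18: 24, 19: 5, 24: 0, 25: 2}
before = nodes(heap, 5)
after = nodes(coalesce(heap, 5), 5)
print(before)
print(after)
```

On this heap only the nodes at 15 and 18 are adjacent (15 + 3 = 18), so the list shrinks from four nodes to three and the node at 15 ends up with size 8. The loop relies only on pointer arithmetic over the size fields, which is the numerical information the nodeless view has to track.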

Page 19: Shape Analysis for Low-level Code

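Separation Logic

blk(p+2,p+5)    [Figure: a block of cells from p+2 up to p+5]

nd(p,q,5) =def (p↦q) * (p+1↦5) * blk(p+2,p+5)    [Figure: a node at p with link q and size 5, ending at p+5]

mls(p,q)    [Figure: a multiword list segment from p to q with node sizes 3, 4 and 2]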
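One way to read these predicates is as checks on a concrete heap. The sketch below is illustrative only. It assumes that blk(E,F) owns the cells E, …, F-1, that nd(p,q,n) consists of a link word, a size word and a block, and that mls(p,q) is a possibly empty chain of nodes from p to q. The function names and the dictionary-based heap are hypothetical, and each function returns the set of cells the predicate owns (or None when it does not hold).

```python
def blk(heap, lo, hi):
    """Footprint of blk(lo, hi): the cells lo..hi-1, or None if one is missing."""
    if all(a in heap for a in range(lo, hi)):
        return set(range(lo, hi))
    return None


def nd(heap, p, q, n):
    """Footprint of nd(p, q, n) = (p |-> q) * (p+1 |-> n) * blk(p+2, p+n)."""
    if n < 2 or heap.get(p) != q or heap.get(p + 1) != n:
        return None
    body = blk(heap, p + 2, p + n)
    return None if body is None else {p, p + 1} | body


def mls(heap, p, q):
    """Footprint of a possibly empty multiword list segment from p to q."""
    used = set()
    while p != q:
        if p not in heap or p + 1 not in heap:
            return None
        cells = nd(heap, p, heap[p], heap[p + 1])
        if cells is None or cells & used:   # '*' requires disjoint footprints
            return None
        used |= cells
        p = heap[p]
    return used


# Assumed example: three nodes of sizes 3, 4 and 2, ending in nil (0).
heap = {10: 20, 11: 3, 12: 0,
        20: 30, 21: 4, 22: 0, 23: 0,
        30: 0, 31: 2}
print(nd(heap, 10, 20, 3))
print(mls(heap, 10, 0) == set(heap))
```

Returning footprints rather than booleans makes the separating conjunction explicit: two conjuncts hold together only if the cell sets they claim are disjoint.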

Page 20: Shape Analysis for Low-level Code

Symbolic Heaps

∃x’,y’. (P1 ∧ P2 ∧ … ∧ Pn) ∧ (H1 * H2 * … * Hm)

where
P ::= E=F | E≤F | E!=F | …
H ::= E↦F | blk(E,F) | mls(E,F) | nd(E,F,G) | …

Page 21: Shape Analysis for Low-level Code

Abstract Domain

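Nodeful domain: P(CanSymH)⊤, ordered by ⊆, with elements {T1,T2,…,Tn}, for example

nd(x,y,z) * mls(y,0)

Nodeless domain: Pfin(SymH)⊤, ordered by ⊆, with elements {Q1, Q2, … ,Qn}, for example

y=x+z ∧ x↦y * x+1↦z * blk(x+2,0) * mls(y,0)

P(Emb) and P(Abs) translate between the two domains.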

Page 22: Shape Analysis for Low-level Code

Our Analysis

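while(B) {
  C;
}

Nodeful View: P(CanSymH)⊤, with {T1,T2,…,Tn} at the loop head and {T’1,T’2,…,T’m} after the body.

Nodeless View: Pfin(SymH)⊤, with {Q1, Q2, … ,Qn} before and {Q’1, Q’2, … ,Q’m} after symbolic execution.

Steps: Emb; Rearrangement, then Sym. Execution, then Abstraction.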

Page 24: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = P(SymExec(A) ∘ Rearrange(A))d
⟦while b C⟧d = FixComp(P(Abs) ∘ F)

where F : P(CanSymHeaps) → P(CanSymHeaps)
      F(d’) = P(Abs)(d ∪ ⟦C⟧d’)

Page 25: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = (P(SymExec(A)) ∘ lift(Rearrange(A)))d
⟦while b C⟧d = FixComp(P(Abs) ∘ F)

where F : P(CanSymHeaps) → P(CanSymHeaps)
      F(d’) = P(Abs)(d ∪ ⟦C⟧d’)

SymExec(A): Proof Rules in Sep. Log.

Rearrange(A): Unrolling of mls and nd

Page 26: Shape Analysis for Low-level Code

Analysis

⟦C⟧ : Pfin(SymH)⊤ → Pfin(SymH)⊤

⟦A⟧d = (P(SymExec(A)) ∘ lift(Rearrange(A)))d
⟦while b C⟧d = FixComp(F)

where F : P(CanSymH)⊤ → P(CanSymH)⊤
      F(d’) = P(Abs)(d ∪ (⟦C⟧ ∘ P(Emb))d’)

Emb : CanSymH → SymH
Abs : SymH → CanSymH

Information Loss

Widened Differential Fixpoint Algorithm

Page 27: Shape Analysis for Low-level Code

Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

Example, step by step:

(5 ≤ x+x ∧ p+3=z’) ∧ (p↦q’ * p+1↦3 * blk(p+2,z’) * mls(q’,0))

After packaging all nodes:

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))

A leftover cell is replaced by true:

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0) * r↦4)

becomes

(5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0) * true)

After dropping numerical relationships:

nd(p,q’,3) * mls(q’,0)

After combining two connected multiword lists:

mls(p,0)

Page 35: Shape Analysis for Low-level Code

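Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

Precondition: true

… (x↦x’,s) * blk(x+2,x+s) ⇝ … nd(x,x’,s)

[Figure: the cells holding x’ and s at address x, followed by the block from x+2 to x+s, are packaged into a single node spanning x to x+s.]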

Page 36: Shape Analysis for Low-level Code

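Abstraction Function Abs

Abs : SymH → CanSymH

1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.

Precondition: s = s’+i

… (x↦x’,s) * blk(x+2,x+i) * nd(x+i,y’,s’) ⇝ … nd(x,x’,s)

[Figure: the cells holding x’ and s at address x, the block from x+2 to x+i, and the node holding y’ and s’ from x+i to x+i+s’ are packaged into a single node spanning x to x+s.]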
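The first packaging rule (the one with precondition true) can be sketched as a rewrite on a list of heap atoms. This is a hedged illustration, not the analyzer's implementation: the dataclasses `Plus`, `Pts`, `Blk`, `Nd`, `Mls` and the function `package` are hypothetical names, and the match is purely syntactic, so it does not perform the arithmetic reasoning that the second rule (precondition s = s’+i) would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plus:            # the address expression base + off
    base: str
    off: object        # an int or a variable name


@dataclass(frozen=True)
class Pts:             # addr |-> link, size  (two consecutive cells)
    addr: str
    link: str
    size: str


@dataclass(frozen=True)
class Blk:             # blk(lo, hi)
    lo: Plus
    hi: Plus


@dataclass(frozen=True)
class Nd:              # nd(addr, link, size)
    addr: str
    link: str
    size: str


@dataclass(frozen=True)
class Mls:             # mls(start, end)
    start: object
    end: object


def package(atoms):
    """Rewrite Pts(x, x', s) * Blk(x+2, x+s) into Nd(x, x', s) while possible."""
    atoms = list(atoms)
    changed = True
    while changed:
        changed = False
        for atom in atoms:
            if not isinstance(atom, Pts):
                continue
            body = Blk(Plus(atom.addr, 2), Plus(atom.addr, atom.size))
            if body in atoms:                 # purely syntactic match
                atoms.remove(atom)
                atoms.remove(body)
                atoms.append(Nd(atom.addr, atom.link, atom.size))
                changed = True
                break
    return atoms


heap = [Pts("x", "x'", "s"), Blk(Plus("x", 2), Plus("x", "s")), Mls("x'", 0)]
packed = package(heap)
print(packed)
```

Frozen dataclasses are used instead of plain tuples so that a `Pts` and an `Nd` with the same fields never compare equal, which keeps the match from confusing a raw two-cell header with an already packaged node.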

Page 37: Shape Analysis for Low-level Code

Coalescing

while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}

Loop invariant:

mls(lp,p) * mls(p,0)

After local q = *p:

p!=0 ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

After the test p + *(p+1) == q succeeds:

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’ * blk(p+2,p+s’) * mls(q,0)

After *(p+1) = *(p+1) + *(q+1):

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦q,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

After *p = *q:

p!=0 ∧ p+s’=q ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q↦r’,t’ * blk(q+2,q+t’) * mls(r’,0)

Abstraction of the last state, step by step:

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * q’↦r’,t’ * blk(q’+2,q’+t’) * mls(r’,0)

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p↦r’,s’+t’ * blk(p+2,p+s’) * nd(q’,r’,t’) * mls(r’,0)

p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)

mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)

mls(lp,p) * mls(p,0)

Page 43: Shape Analysis for Low-level Code

Theorem Prover for “Q1 ⊢ Q2”

              without prover    with prover
malloc_K&R    about 20 hours    502.23 secs
free_K&R      23.844 secs       9.69 secs

Page 44: Shape Analysis for Low-level Code

Put Prover inside Hoare Powerdomain?

Q1 ⊢ Q2, Q3 ⊢ Q4

{Q1, Q2, Q3, Q4}

x0 = {}

x1 = F(x0) = {Q1, Q2, Q4}

x2 = F(x1) = {Q1, Q2, Q3, Q4}

P(CanSymH), ⊆ vs. PH(CanSymH), ⊑

{Q2, Q3} ⊑

But this works only when ⊢ is transitive.
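The difference between the two orders can be seen on the iterates of this slide. The sketch below assumes the usual Hoare (lower) powerdomain preorder, where d1 ⊑ d2 holds when every member of d1 entails some member of d2; the slides do not spell this definition out. The `entails` function is a stand-in for the prover that is reflexive and otherwise knows only the two entailments listed on the slide.

```python
PROVES = {("Q1", "Q2"), ("Q3", "Q4")}   # the entailments given on the slide


def entails(q, r):
    """Stand-in for the prover: reflexive, plus the listed pairs."""
    return q == r or (q, r) in PROVES


def hoare_leq(d1, d2):
    """d1 ⊑ d2 iff every member of d1 entails some member of d2."""
    return all(any(entails(q, r) for r in d2) for q in d1)


x1 = {"Q1", "Q2", "Q4"}
x2 = {"Q1", "Q2", "Q3", "Q4"}
print(x2 <= x1)            # subset order
print(hoare_leq(x2, x1))   # Hoare order
```

Under the subset order x2 is strictly larger than x1, so the iteration has to continue. Under the Hoare order x2 ⊑ x1 already holds, because the new element Q3 entails Q4, so the iteration could stop one step earlier.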

Page 45: Shape Analysis for Low-level Code

Put Prover inside Hoare Powerdomain?

Q1 ⊢ Q2, Q2 ⊢ Q3, Q3 ⊢ Q1

x0 = {}

x1 = F(x0) = {Q1, Q2}

x2 = F(x1) = {Q2, Q3}

x3 = F(x2) = {Q3, Q1}

x4 = F(x3) = {Q1, Q2}

P(CanSymH), ⊆ vs. PH(CanSymH), ⊑

But this works only when ⊢ is transitive.

Page 46: Shape Analysis for Low-level Code

Put Prover inside Widening!

∇ : P(CanSymH) × P(CanSymH) → P(CanSymH)

x0 ∇ x1 =def x0 ∪ { Q ∈ x1 | ∀Q’ ∈ x0. Q ⊬ Q’ }

x0 = {}

x1 = x0 ∇ F(x0)

x2 = x1 ∇ F(x1)

…

xn+1 = xn ∇ F(xn)

x0 ⊆ x1 ⊆ x2 ⊆ x3 …
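A small sketch shows why widening copes with a prover that is not transitive. It assumes that the widening keeps x0 and adds only those elements of x1 that entail no element of x0. The cyclic entailments are the ones from the previous slide, and the table `STEP` is a hypothetical F chosen to reproduce the oscillating iterates shown there.

```python
PROVES = {("Q1", "Q2"), ("Q2", "Q3"), ("Q3", "Q1")}   # cyclic, not transitive


def entails(q, r):
    """Stand-in for the prover: reflexive, plus the listed pairs."""
    return q == r or (q, r) in PROVES


def widen(x0, x1):
    """x0 ∇ x1: keep x0 and add the members of x1 not covered by x0."""
    return x0 | {q for q in x1 if not any(entails(q, old) for old in x0)}


# Hypothetical F reproducing the oscillation {Q1,Q2} -> {Q2,Q3} -> {Q3,Q1} -> ...
STEP = {
    frozenset(): frozenset({"Q1", "Q2"}),
    frozenset({"Q1", "Q2"}): frozenset({"Q2", "Q3"}),
    frozenset({"Q2", "Q3"}): frozenset({"Q3", "Q1"}),
    frozenset({"Q3", "Q1"}): frozenset({"Q1", "Q2"}),
}


def F(x):
    return STEP[frozenset(x)]


def widened_chain(start=frozenset()):
    """Iterate x_{n+1} = x_n ∇ F(x_n) until two iterates coincide."""
    chain = [frozenset(start)]
    while True:
        nxt = widen(chain[-1], F(chain[-1]))
        if nxt == chain[-1]:
            return chain
        chain.append(nxt)


chain = widened_chain()
print(chain)
```

Because the widening never removes elements, the iterates form an increasing chain under ⊆. Here the chain stabilizes at {Q1, Q2}: both candidates from the next round, Q2 and Q3, entail something already present, so neither is added.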

Page 47: Shape Analysis for Low-level Code

Add Differencing

F : P(CanSymH) → P(CanSymH)

x0 = {}

x1 = x0 ∇ F({}) = {Q1}

x2 = x1 ∇ F({Q1}) = {Q1,Q2}

x3 = x2 ∇ F({Q1,Q2}) = {Q1,Q2,Q3}

x4 = x3 ∇ F({Q1,Q2,Q3}) = {Q1,Q2,Q3}

With differencing: xn+1 = xn ∇ F(yn), yn+1 = xn+1 - xn

Nonstandard Fixpoint Algorithm:

• NOT y ⊆ (x ∇ y).

• NOT F(wdfix F) ⊆ wdfix F.

NOT (F(wdfix F)) ⊆ (wdfix F)
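The recurrence xn+1 = xn ∇ F(yn), yn+1 = xn+1 - xn can be sketched as a loop that feeds only the newly added elements back into F. This is an illustrative sketch under stated assumptions: the widening adds only elements that entail nothing already present, the first round applies F to all of d0, and the example F (built from the hypothetical table `SUCC`) and the equality-based prover are invented for the demonstration.

```python
def widen(x0, x1, entails):
    """x0 ∇ x1: keep x0 and add the members of x1 not covered by x0."""
    return x0 | {q for q in x1 if not any(entails(q, old) for old in x0)}


def wdfix(F, d0, entails):
    """Widened differential iteration: x_{n+1} = x_n ∇ F(y_n), y_{n+1} = x_{n+1} - x_n."""
    x = frozenset(d0)
    y = x                              # the first round feeds all of d0 to F
    while True:
        nxt = widen(x, F(y), entails)
        x, y = nxt, nxt - x            # y becomes the elements added this round
        if not y:
            return x


# Hypothetical transfer function: always yields Q1 and follows SUCC.
SUCC = {"Q1": "Q2", "Q2": "Q3"}
calls = []


def F(d):
    calls.append(set(d))               # record what F was applied to
    return {"Q1"} | {SUCC[q] for q in d if q in SUCC}


result = wdfix(F, set(), lambda q, r: q == r)
print(result)
print(calls)
```

After the first round, each call to F receives a single new element rather than the whole accumulated set, which is the saving that differencing is meant to provide. The example F distributes over union, so nothing is lost by looking only at the difference; for a general F this need not hold, which fits the slide's remark that the algorithm is nonstandard.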

Page 48: Shape Analysis for Low-level Code

Soundness

Analysis results can be compiled into separation-logic proofs.

Page 49: Shape Analysis for Low-level Code

Widened Differential Fixpoint Algo.

⟦while (*) C⟧d0 = ??

x0 = d0

x1 = x0 ∇ F(x0)    y1 = x1 - x0

x2 = x1 ∇ F(y1)    y2 = x2 - x1

x3 = x2 ∇ F(y2) = x2

x3 = d0 ∇ F(d0) ∇ F(y1) ∇ F(y2)

(x3) ⊆ (d0) ∪ (y1) ∪ (y2)

(x3) ⊇ (d0) ∪ (F(d0)) ∪ (F(y1)) ∪ (F(y2))

Page 50: Shape Analysis for Low-level Code

Widened Differential Fixpoint Algo.

{d0} C {F(d0)}    {y1} C {F(y1)}    {y2} C {F(y2)}

Consequence: (x3) ⊇ (d0) ∪ (F(d0)) ∪ (F(y1)) ∪ (F(y2))

{d0} C {x3}    {y1} C {x3}    {y2} C {x3}

Disjunction Rule

{d0 ∨ y1 ∨ y2} C {x3}

Consequence: (x3) ⊆ (d0) ∪ (y1) ∪ (y2)

{x3} C {x3}

{x3} while (*) C {x3}

{d0} while (*) C {x3}