Context-Sensitivity Analysis: Literature Review

by José Nelson Amaral ([email protected])

University of Alberta

Dimensions of Pointer Analysis

• Unification-based × Insertion-based (inclusion-based)
• Flow-sensitive × flow-insensitive
• Field-sensitive × field-insensitive × field-based
• Context-sensitive × context-insensitive

CMPUT 680 - Compiler Design and Optimization

Andersen’s × Steensgaard’s (Example): Insertion × Unification

Program:
    a = &b;

Steensgaard: S = {(a,b)}
    a → b

Andersen: S = {(a,b)}
    a → b

After (Shapiro/Horwitz, POPL97)

Andersen’s × Steensgaard’s (Example)

Program:
    a = &b;
    b = &c;

Steensgaard: S = {(a,b); (b,c)}
    a → b → c

Andersen: S = {(a,b); (b,c)}
    a → b → c

After (Shapiro/Horwitz, POPL97)

Andersen’s × Steensgaard’s (Example)

Program:
    a = &b;
    b = &c;
    if(cond) a = &d;

Before the new statement:

Steensgaard: S = {(a,b); (b,c)}
    a → b → c

Andersen: S = {(a,b); (b,c)}
    a → b → c

What should happen in each analysis?

After (Shapiro/Horwitz, POPL97)

Andersen’s × Steensgaard’s (Example)

Program:
    a = &b;
    b = &c;
    if(cond) a = &d;

Steensgaard: S = {(a,b); (b,c); (a,d); (d,c)}
    a → (b,d) → c

Andersen: S = {(a,b); (b,c); (a,d)}
    a → b → c
    a → d

After (Shapiro/Horwitz, POPL97)

Andersen’s × Steensgaard’s (Example)

Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;

Before the new statement:

Steensgaard: S = {(a,b); (b,c); (a,d); (d,c)}
    a → (b,d) → c

Andersen: S = {(a,b); (b,c); (a,d)}
    a → b → c
    a → d

And now?

After (Shapiro/Horwitz, POPL97)

Andersen’s × Steensgaard’s (Example)

Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;

Steensgaard: S = {(a,b); (b,c); (a,d); (d,c); (d,e); (b,e)}
    a → (b,d) → (c,e)

Andersen: S = {(a,b); (b,c); (a,d); (d,e)}
    a → b → c
    a → d → e

After (Shapiro/Horwitz, POPL97)
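The two result sets of the four-statement example can be reproduced with a small sketch. It is only an illustration under simplifying assumptions: it handles address-of statements only (which is all the example uses), it ignores control flow because both analyses are flow-insensitive, and every name in it (`analyze`, `unify`, `count_pairs`, and so on) is hypothetical rather than taken from either paper.

```c
/* Variables of the slide's example, numbered 0..4. */
enum { A, B, C, D, E, NVARS };

/* One address-of statement "lhs = &rhs". */
struct stmt { int lhs, rhs; };

/* a = &b; b = &c; if(cond) a = &d; d = &e;  (the condition is ignored) */
static const struct stmt program[] = { {A, B}, {B, C}, {A, D}, {D, E} };
enum { NSTMTS = sizeof program / sizeof program[0] };

/* Insertion-based: every variable keeps its own target set (a bitmask). */
static unsigned and_pts[NVARS];

/* Unification-based: union-find classes, each with at most one pointee. */
static int parent[NVARS];
static int pointee[NVARS];          /* -1 when the class points to nothing */

static int find(int v) {
    while (parent[v] != v) v = parent[v];
    return v;
}

/* Merge two classes and, recursively, whatever they point to. */
static void unify(int x, int y) {
    x = find(x);
    y = find(y);
    if (x == y) return;
    int px = pointee[x], py = pointee[y];
    parent[y] = x;
    if (px < 0) pointee[x] = py;
    else if (py >= 0) unify(px, py);
}

/* Runs both analyses over the program above. */
static void analyze(void) {
    for (int v = 0; v < NVARS; v++) {
        and_pts[v] = 0;
        parent[v] = v;
        pointee[v] = -1;
    }
    for (int i = 0; i < NSTMTS; i++) {
        int l = program[i].lhs, r = program[i].rhs;
        and_pts[l] |= 1u << r;                      /* insert one pair */
        int cl = find(l);
        if (pointee[cl] < 0) pointee[cl] = find(r); /* first target */
        else unify(pointee[cl], r);                 /* merge the targets */
    }
}

static int andersen_points_to(int x, int y) {
    return (int)((and_pts[x] >> y) & 1u);
}

static int steensgaard_points_to(int x, int y) {
    int p = pointee[find(x)];
    return p >= 0 && find(p) == find(y);
}

/* Number of (x, y) pairs reported by the given query. */
static int count_pairs(int (*points_to)(int, int)) {
    int n = 0;
    for (int x = 0; x < NVARS; x++)
        for (int y = 0; y < NVARS; y++)
            n += points_to(x, y) ? 1 : 0;
    return n;
}
```

After `analyze()`, `count_pairs(andersen_points_to)` counts the four pairs of the Andersen set, while `count_pairs(steensgaard_points_to)` counts six, because unifying b with d also makes d point to c and b point to e.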

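Flow-sensitive × Flow-insensitive (Example)

Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;

Flow-sensitive: one points-to graph per program point.
    after a = &b:    a → b
    after b = &c:    a → b → c
    after a = &d:    a → d, b → c

Strong update: not only does a now point to d, but a also no longer points to b.

Flow-insensitive: one points-to graph for the whole program.
    Insertion-based:      a → b → c, a → d → e
    Unification-based:    a → (b,d) → (c,e)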
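The difference between a strong update and plain insertion can be sketched in a few lines. This is a hand-unrolled illustration for the program above, not a general dataflow solver; the `struct state` layout and the function names are assumptions made for the sketch.

```c
/* Targets b, c, d, e encoded as bits of a points-to set. */
enum { TB = 1, TC = 2, TD = 4, TE = 8 };

/* Points-to sets of the pointers a, b, and d at one program point. */
struct state { unsigned a, b, d; };

/* Flow-sensitive: follows the control flow, overwriting on assignment.
   Stores the state inside the branch in *in_branch and returns the
   state at the end of the program. */
static struct state flow_sensitive(struct state *in_branch) {
    struct state s = {0, 0, 0};
    s.a = TB;                    /* a = &b; */
    s.b = TC;                    /* b = &c; */

    struct state taken = s;      /* path on which cond holds */
    taken.a = TD;                /* a = &d; strong update kills a -> b */
    *in_branch = taken;

    /* Join of the taken and not-taken paths: set union. */
    s.a |= taken.a;
    s.b |= taken.b;
    s.d |= taken.d;

    s.d = TE;                    /* d = &e; */
    return s;
}

/* Flow-insensitive: statements only ever insert, in any order. */
static struct state flow_insensitive(void) {
    struct state s = {0, 0, 0};
    s.a |= TB;                   /* a = &b; */
    s.b |= TC;                   /* b = &c; */
    s.a |= TD;                   /* a = &d; */
    s.d |= TE;                   /* d = &e; */
    return s;
}
```

Inside the branch the flow-sensitive state has a pointing only to d, whereas the single flow-insensitive solution reports that a may point to b or d at every program point. At the end of this particular program the two analyses agree.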

Flow-sensitivity in SSA (incomplete slide)

Program:
    pb = &b;
    pc = &c;
    pd = &d;
    pe = &e;

    a0 = pb;
    *pb = pc;
    if(cond) a1 = pd;
    a2 = phi(a0, a1, FALSE, TRUE);
    *pd = pe;

All variables that had their address taken must have an “access path”, which is their address.

They can only be referenced through their access paths.

[Figure: a single points-to graph over the nodes a0, a1, a2, pb, pc, pd, pe, b, c, d, and e]

In SSA, flow-sensitive information can be obtained from the single graph above.

Field-insensitive × Field-based × Field-sensitive analysis

• Field-insensitive: Each aggregate object modeled by a single abstract variable.

• Field-based: An abstract variable models all instances of a field of an aggregate type.

• Field-sensitive: Unique abstract variable models each field of each aggregate object.

(PearceKellyHankinTOPLAS07)

Field Sensitivity (Example)

(PearceKellyHankinTOPLAS07)

Assume a flow-insensitive, insertion-based analysis.

Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;

    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
    c = a.f1;

Points-to edges added by each statement:

    Statement      Field-insensitive    Field-based       Field-sensitive
    a.f1 = &d;     a → d                f1 → d            a.f1 → d
    a.f2 = &f;     a → f                f2 → f            a.f2 → f
    b.f1 = &e;     b → e                f1 → e            b.f1 → e
    c = a.f1;      c → d, c → f         c → d, c → e      c → d
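The three abstractions differ only in which abstract variable stands for an access such as a.f1. The sketch below makes that choice explicit and replays the four statements of the example. The encoding (bitmask sets, the `abstract_var` helper, the enum names) is an assumption made for illustration and is not taken from the cited paper.

```c
enum mode { FIELD_INSENSITIVE, FIELD_BASED, FIELD_SENSITIVE };
enum { OBJ_A, OBJ_B };            /* the aggregates a and b */
enum { F1, F2 };                  /* the fields f1 and f2 */
enum { TD = 1, TE = 2, TF = 4 };  /* targets d, e, f as bits */

/* Index of the abstract variable that models obj.field in each mode. */
static int abstract_var(enum mode m, int obj, int field) {
    switch (m) {
    case FIELD_INSENSITIVE: return obj;              /* one per aggregate */
    case FIELD_BASED:       return field;            /* one per field name */
    default:                return 2 * obj + field;  /* one per (object, field) */
    }
}

/* Points-to set of c after the four statements of the example. */
static unsigned pts_of_c(enum mode m) {
    unsigned pts[4] = {0, 0, 0, 0};
    pts[abstract_var(m, OBJ_A, F1)] |= TD;   /* a.f1 = &d; */
    pts[abstract_var(m, OBJ_A, F2)] |= TF;   /* a.f2 = &f; */
    pts[abstract_var(m, OBJ_B, F1)] |= TE;   /* b.f1 = &e; */
    return pts[abstract_var(m, OBJ_A, F1)];  /* c = a.f1; */
}
```

Under these assumptions `pts_of_c` yields {d, f} for the field-insensitive mode, {d, e} for the field-based mode, and {d} for the field-sensitive mode.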

Field Sensitivity in C

• A field-sensitive analysis for C is fundamentally harder than a field-sensitive analysis for Java:
  – C allows the address of a field to be taken

• Existing field-sensitive analyses for C:
  – YongHorwitzRepsPLDI99;
  – ChandraRepsPASTE99;
  – JohnsonWagnerUSENIX04;
  – PearceKellyHankinTOPLAS07;
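A minimal sketch of the C feature in question, reusing the aggr type from the earlier example (the `store` and `field_address_demo` names are hypothetical): once `&a.f1` is taken, a field becomes a possible points-to target, and a store through the resulting pointer updates the field without ever naming it.

```c
#include <stddef.h>

typedef struct { int *f1; int *f2; } aggr;

static int d;   /* target variable, as in the slides' example */

/* Stores through a pointer to a pointer. The callee cannot tell that
   slot is really the field f1 of some aggregate. */
static void store(int **slot, int *value) {
    *slot = value;
}

/* Returns 1 when writing through &a.f1 changed a.f1 and left a.f2 alone. */
static int field_address_demo(void) {
    aggr a = { NULL, NULL };
    int **p = &a.f1;       /* address of a field: legal in C */
    store(p, &d);          /* updates a.f1 without naming it */
    return a.f1 == &d && a.f2 == NULL;
}
```

A field-sensitive analysis for C therefore has to track that p points to the field a.f1 specifically, not just to the aggregate a.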

What does context-sensitivity mean?

• Context-sensitive analysis: “the effects of a procedure call are estimated within a specific calling context”

• Context-insensitive analysis: “the effects of a procedure call summarizes the information for all calling contexts.”

(EmamiGhiyaHendrenPLDI94)

Another definition

• “A context-insensitive (CI) algorithm does not distinguish the different calling contexts of a procedure, whereas a context-sensitive (CS) does.” (ZhuCalmanPLDI04)

• “CS treats multiple calls to a single procedure independently.” (RufPLDI95)

• “CI constructs a single approximation to a procedure’s effect on all of its callers.” (RufPLDI95)

Alternative definition: The calling context problem

• The calling context problem is “the problem of correctly accounting for the calling context of a called procedure.”

HorwitzRepsBinkleyTOPLAS90

A stricter definition

• “A precise CS analysis yields results as precise as if they were computed on a modified program with all method calls inlined.”
  – Requires a context-sensitive heap abstraction:
    • a separate abstraction is needed for each copy of an allocation statement
  – Virtual call targets must be computed context-sensitively:
    • separately for each calling context;
    • using precise points-to information;

SridharanBodikPLDI06

Context-Sensitive Example

• Two calls to a function foo produce different return values because of the points-to set at the point immediately before each call to foo.
  – In other words, the return value of foo changes depending on the context within which foo is invoked.

Context-sensitive example

    #include <stdlib.h>
    typedef int arr[10000];
    arr a1, a2, a3;
    int cond1, cond2;

    int *foo (int **p2, int **p3){
      int *t;
      if(cond2){
        t = *p2;
        *p2 = *p3;
        *p3 = t;
      }
      return *p2;
    }

    int main(int argc, char *argv[]){
      int *x1, *x2, *x3, *y1, *y2, *y3;
      int *lp, *lq, r;
      cond1 = argc-1;
      cond2 = argc-2;
      a1[0] = argc; a2[0] = argc+1; a3[0] = argc+2;
      x1 = a1; x2 = a2; x3 = a3;
      y1 = a1; y2 = a2; y3 = a3;
      if(cond1){
        x1 = a2;
        x2 = a1;
      }
      lp = foo(&x2, &x3);
      lq = foo(&y2, &y3);
      return (*lp + *lq);
    }

Context-sensitive example

Is there an algorithm that “gets” this example?

• Emami, Ghiya, and Hendren (PLDI94) should get it.

• We need to study the points-to sets that the algorithm computes at points P1, P2, and P3.

[Figure: the program above with foo and the program points P1, P2, and P3 marked]

Context-sensitive example

• In the following animation:
  – an arrow from x to y drawn in one style means that x definitely points to y (variable x contains the address of variable y);
  – an arrow from x to y drawn in the other style means that x possibly points to y.

(Arrows are colored red only for convenience in the animation; they represent new points-to relations that were not in the previous slide.)

Context-sensitive example (animation)

[Animation: every frame repeats the program above next to a points-to graph over the nodes a1, a2, a3, x1, x2, x3, y1, y2, and y3. The nodes p2, p3, and t appear in the frames for points inside foo, lp appears from P2 onward, and lq appears at P3. The frames visit the program points in this order: P1; PA, PA’, PA’’, PA’’’, PB; P2; PA, PB; P3.]

Solutions to the context-sensitive problem

• Create a context for each acyclic path from the root of the call graph to the current invocation (EmamiGhiyaHendrenPLDI94).

• Create a context for each set of “relevant” aliases on entry to a procedure, also known as partial transfer functions (PTF) (WilsonLamPLDI95).
  – “to answer simple queries (PTF) requires all the results to be computed.” (WhaleyLamPLDI04)

(Descriptions taken from RufPLDI95)

Solutions to the context-sensitive problem (cont.)

• Tag each alias to allow a procedure to propagate only appropriate aliases to its callers:
  – Use the aliases on entry to the enclosing procedure (LandiRyderPLDI93).
  – Augment the summary with an abstraction of the call stack (Cooper89MScThesis, ChoiBurkeCariniPOPL93).

• A fully context-sensitive analysis is exponential in the size of the input program, unless the number of contexts considered is limited somehow.

Solutions to the context-sensitive problem (cont.)

• Create a clone of the method for each context (WhaleyLamPLDI04).
  – Reports up to 5 × 10^23 clones (for a Java source code analyzer called pmd).
  – No discussion of how the results of the analysis could be used in a real compiler.

Ruf’s Evaluation of Context Sensitivity

• Compares flow-sensitive CS and CI analyses.
  – Benchmarks:
     • Largest benchmark has 6771 lines of code and 5435 pointer or function outputs in the analysis.
     • Sparse call graphs (4.2 callers/procedure on average, 54% of procedures have a single caller).
     • Shallow nesting of pointer datatypes: most pointers reference scalar datatypes.
  – CI finds that on average each memory operation references very few locations.
  – CS analysis generates 2% fewer points-to pairs.
  – CS does not affect the indirect memory references at all.

RufPLDI95

Definition of a context

• “A context is a static abstraction of a method invocation”
  – A context-sensitive analysis “distinguishes invocations if their context is different”

LhotakHendrenCC06

Invocation (Context) Abstractions

• call sites: the context of an invocation is the program statement from which the method was invoked.
  – Derived from the call-string abstraction: a different approximation is computed for each distinct path in the call graph (defined by SharirPnueli1981).

• receiver object: the context is the static abstraction of the object on which the method was invoked.
  – (defined by MilanovaRountevRyderISSTA02)

LhotakHendrenCC06
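The two abstractions can be contrasted with a minimal sketch. The class names Invocation and ContextAbstractions, the "c" prefix for call sites, and the object labels o10 and o11 are assumptions made for illustration; they are not taken from the cited papers. The invocations listed are those of get() in the class A example used later in these slides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: Invocation and ContextAbstractions are hypothetical names.
final class Invocation {
    final int callSiteLine;       // line of the call statement
    final String receiverObject;  // abstract object the method is invoked on

    Invocation(int callSiteLine, String receiverObject) {
        this.callSiteLine = callSiteLine;
        this.receiverObject = receiverObject;
    }
}

final class ContextAbstractions {
    // Call-site abstraction: the context is the calling statement.
    static String callSite(Invocation inv) {
        return "c" + inv.callSiteLine;
    }

    // Receiver-object abstraction: the context is the abstract receiver.
    static String receiverObject(Invocation inv) {
        return inv.receiverObject;
    }

    public static void main(String[] args) {
        // Invocations of get(): line 7 runs in the constructor for the objects
        // allocated at lines 10 and 11; line 12 calls get() on the line-11 object.
        List<Invocation> gets = Arrays.asList(
            new Invocation(7, "o10"),
            new Invocation(7, "o11"),
            new Invocation(12, "o11"));

        Set<String> bySite = new HashSet<>();
        Set<String> byReceiver = new HashSet<>();
        for (Invocation inv : gets) {
            bySite.add(callSite(inv));
            byReceiver.add(receiverObject(inv));
        }
        System.out.println("call-site contexts: " + bySite);
        System.out.println("receiver contexts:  " + byReceiver);
    }
}
```

Both abstractions produce two contexts here, but they merge different invocations: the call-site abstraction merges the two constructor calls at line 7, while the receiver-object abstraction keeps them apart and instead merges the line-7 and line-12 calls on the line-11 object.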

1-level call string sensitivity

 1 class A {
 2   Object f;
 3   Object get() {
 4     return new Object();
 5   }
 6   A() {
 7     this.f = this.get();
 8   }
 9   static void main() {
10     A a1 = new A();
11     A a2 = new A();
12     Object p = a2.get();
13     p.toString();
14   }
15 }


LiangPenningsHarroldPASTE05

Points-to Graph Using 1-level Call String Sensitivity

[Figure: class A example (listed above) with the points-to graph built so far: nodes A:this10, a1, and o10.]

LiangPenningsHarroldPASTE05

A node is a variable or instance.

An edge is a variable reference or an instance field reference.

1-level call string sensitivity

[Figure: class A example with the points-to graph built so far: nodes o7#4, get:ret7, get:this7, A:this10, a1, and o10; field edge f.]

LiangPenningsHarroldPASTE05

get:ret7 is a special local variable that represents the return value of method get().

The call string is limited to size one. Node o7#4 represents the object allocated at line 4 because of a call to get from line 7.

1-level call string sensitivity

[Figure: class A example with the points-to graph built so far: nodes o7#4, get:ret7, get:this7, A:this10, a1, a2, A:this11, o10, and o11; field edge f.]

LiangPenningsHarroldPASTE05

1-level call string sensitivity

[Figure: class A example with the points-to graph built so far: nodes o7#4, get:ret7, get:this7, A:this10, a1, a2, A:this11, get:this12, o10, and o11; two field edges f.]

LiangPenningsHarroldPASTE05

The analysis cannot distinguish between the object allocations initiated by lines 10 and 11 because in both cases the new object is created at line 4 through a call from line 7.

1-level call string sensitivity

[Figure: completed 1-level call-string points-to graph: nodes o7#4, get:ret7, get:this7, A:this10, a1, a2, A:this11, get:this12, o10, o11, o12#4, p, and get:ret12; two field edges f.]

LiangPenningsHarroldPASTE05

1-level context-bound receiver-object sensitivity

[Figure: receiver-object points-to graph built so far (nodes a1, o10, and A:this10), shown beside the completed 1-level call-string graph for comparison.]

LiangPenningsHarroldPASTE05

1-level context-bound receiver-object sensitivity

[Figure: receiver-object points-to graph built so far (nodes o10#4, get:ret10, get:this10, A:this10, a1, and o10; field edge f), shown beside the completed 1-level call-string graph for comparison.]

LiangPenningsHarroldPASTE05

Node o10#4 represents the object created at line 4 when get() is invoked on the object allocated at line 10. The context is given by the receiver object and is independent of the call chain leading to the object creation.

1-level context-bound receiver-object sensitivity

[Figure: receiver-object points-to graph built so far (nodes o10#4, get:ret10, get:this10, A:this10, a1, o10, a2, A:this11, and o11; field edge f), shown beside the completed 1-level call-string graph for comparison.]

LiangPenningsHarroldPASTE05

1-level context-bound receiver-object sensitivity

[Figure: receiver-object points-to graph built so far (nodes o10#4, get:ret10, get:this10, A:this10, a1, a2, A:this11, get:this11, o10, o11, o11#4, and get:ret11; two field edges f), shown beside the completed 1-level call-string graph for comparison.]

LiangPenningsHarroldPASTE05

Now objects from lines 10 and 11 have distinct abstract representations.

1-level context-bound receiver-object sensitivity

[Figure: completed receiver-object points-to graph (nodes o10#4, get:ret10, get:this10, A:this10, a1, a2, A:this11, get:this11, o10, o11, o11#4, get:ret11, and p; two field edges f), shown beside the completed 1-level call-string graph for comparison.]

LiangPenningsHarroldPASTE05

Context insensitive

[Figure: context-insensitive points-to graph built so far (nodes A:this, a1, and o10), shown beside the completed 1-level receiver-object and 1-level call-string graphs for comparison.]

LiangPenningsHarroldPASTE05

Context insensitive

[Figure: context-insensitive points-to graph built so far (nodes o4, get:ret, get:this, A:this, a1, and o10), shown beside the completed 1-level receiver-object and 1-level call-string graphs for comparison.]

LiangPenningsHarroldPASTE05

Without context, there is a single abstraction to represent all objects allocated at line 4.

Context insensitive

[Figure: context-insensitive points-to graph built so far (nodes o4, get:ret, get:this, A:this, a1, a2, o10, and o11), shown beside the completed 1-level receiver-object and 1-level call-string graphs for comparison.]

LiangPenningsHarroldPASTE05


Context insensitive

[Figure: completed context-insensitive points-to graph (nodes o4, get:ret, get:this, A:this, a1, a2, o10, o11, and p), shown beside the completed 1-level receiver-object and 1-level call-string graphs for comparison.]

LiangPenningsHarroldPASTE05

Strings of Contexts

• A context of a method invocation i can be defined by a context string that represents the top invocations in the stack when i is invoked.

• Managing unbounded growth in the number of contexts:
  – k-limiting: limit the length of the context strings considered to k.
  – cycle collapsing: collapse all cycles in the context-insensitive call graph into a single context.
     • Used by ZhuCalmanPLDI04 and WhaleyLamPLDI04

LhotakHendrenCC06
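k-limiting can be sketched as follows. This is a minimal illustration, assuming that a context string is a list of call-site labels with the most recent call last; the class name KLimit and the labels c7, c10, and c11 (call sites at lines 7, 10, and 11 of the class A example) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: KLimit is a hypothetical helper, not code from the cited work.
final class KLimit {
    // Extend the caller's context string with a new call site, then keep
    // only the k most recent entries (the top of the abstract call stack).
    static List<String> push(List<String> callerContext, String callSite, int k) {
        List<String> extended = new ArrayList<>(callerContext);
        extended.add(callSite);
        int from = Math.max(0, extended.size() - k);
        return new ArrayList<>(extended.subList(from, extended.size()));
    }

    public static void main(String[] args) {
        // get() reached through the constructor calls at lines 10 and 11.
        System.out.println(push(Arrays.asList("c10"), "c7", 1));
        System.out.println(push(Arrays.asList("c11"), "c7", 1));
        System.out.println(push(Arrays.asList("c10"), "c7", 2));
        System.out.println(push(Arrays.asList("c11"), "c7", 2));
    }
}
```

With k = 1 the two calls to get() collapse into the same context, which matches the imprecision noted for the 1-level call-string graph; with k = 2 they stay distinct, and k = 0 degenerates to a context-insensitive analysis.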

Equivalent Contexts

• Two contexts are equivalent if their points-to relations are the same.
  – The number of distinct method-context pairs indicates how worthwhile context sensitivity may be in improving precision of points-to sets.

LhotakHendrenCC06
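Counting equivalence classes of contexts can be sketched as below. The encoding is an assumption made for illustration: each context of one method maps to its points-to relation, written as a set of "variable->object" strings, and the class name EquivalentContexts is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: the string encoding of points-to pairs is an assumption.
final class EquivalentContexts {
    // Contexts with identical points-to relations fall into one class, so the
    // number of distinct relations is the number of equivalence classes.
    static int countDistinct(Map<String, Set<String>> relationByContext) {
        return new HashSet<>(relationByContext.values()).size();
    }

    public static void main(String[] args) {
        // Made-up data: three contexts of one method, two of them equivalent.
        Map<String, Set<String>> contexts = new HashMap<>();
        contexts.put("ctxA", new HashSet<>(Arrays.asList("ret->o4")));
        contexts.put("ctxB", new HashSet<>(Arrays.asList("ret->o4")));
        contexts.put("ctxC", new HashSet<>(Arrays.asList("ret->o4", "ret->o9")));
        System.out.println(countDistinct(contexts));
    }
}
```

When the count is much smaller than the number of contexts, most of the contexts add cost without adding precision.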

Call Site × Receiver Object Context Sensitivity

• Call-site Sensitivity: The context of an invocation is the program statement from which the method is invoked.

• Receiver-Object Sensitivity: The context of an invocation is the abstraction of the object on which the method is invoked.

[Figure: class A example with two graph fragments for p: nodes o12#4 and get:ret12 (call-site sensitivity), and nodes o11#4 and get:ret11 (receiver-object sensitivity).]

Call Site × Receiver Object Context Sensitivity

• Hybrid Sensitivity: The context of an invocation is the abstraction of both the call site and the object on which the method is invoked.

[Figure: class A example with graph fragment nodes (c7,o11)#4, p, and get:ret12.]

Using a limit of 1 for string and object.

Is CS (in Java) Worth It? (number of contexts)

LhotakHendrenCC06

• Total # of contexts is the product of the number in the column by the number of methods.
• Insensitive: 1 context per method.
• 1, 2, 3: pointers are context-sensitive but pointer targets are not.
• 1H: both pointers and pointer targets are modeled with context strings of maximum length 1.

Is CS (in Java) Worth It? (number of contexts)

LhotakHendrenCC06

• Large number of contexts, but fewer that are distinct.
• Collapsing cycles models large parts of the call graph context-insensitively.

Is CS (in Java) Worth It? (Virtual Call Resolution)

                          Object-sensitivity        Call-site string
Bench      CHA  Insens      1     2     3    1H       1     2    1H
javac      908     737    720   720   720   720     720   720   720
soot-c    1748     983    913   913   913   913     938   913   938
polyglot  1332     744    592   592   592   585     592   592   592
bloat     2503    1079    962   962     -   961     962   962   962
pmd       2868    1224   1193  1193  1193  1163    1205  1205  1205

LhotakHendrenCC06

• Number of potentially polymorphic call sites (non library code).

Application of CS in Java (Cast Safety)

• A cast can potentially fail if the analysis cannot statically prove that the target type is a supertype of the run-time type of the object being cast.

• Cast safety analysis determines which casts cannot fail.

• A 1H analysis reduced the number of potentially failing casts from 3539 to 1017 in the polyglot benchmark.

LhotakHendrenCC06
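The cast-safety check can be sketched as a subtype test over the types in a points-to set. This is a simplified illustration, assuming the analysis has already reduced the points-to set of the cast operand to a collection of run-time classes; the class name CastSafety and the method castCannotFail are hypothetical. It relies on the standard Class.isAssignableFrom method.

```java
import java.util.Arrays;
import java.util.Collection;

// Illustrative only: CastSafety is a hypothetical helper.
final class CastSafety {
    // A cast to `target` cannot fail if every possible run-time type of the
    // operand is the same as, or a subtype of, `target`.
    static boolean castCannotFail(Class<?> target, Collection<Class<?>> possibleTypes) {
        for (Class<?> t : possibleTypes) {
            if (!target.isAssignableFrom(t)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A precise points-to set proves the cast to Number safe.
        System.out.println(castCannotFail(Number.class,
            Arrays.<Class<?>>asList(Integer.class, Double.class)));
        // A less precise set that also contains String cannot prove it.
        System.out.println(castCannotFail(Number.class,
            Arrays.<Class<?>>asList(Integer.class, String.class)));
    }
}
```

A more precise points-to set can only remove types from the collection, so added precision can turn a potentially failing cast into a provably safe one but never the reverse.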

Is CS (in Java) Worth It? (Lhotak-Hendren Conclusions)

• CS slightly improves call graph precision.
• CS yields a more significant improvement in virtual call resolution.
• A 1-object-sensitive or a 1H-object-sensitive analysis seems to be the best tradeoff.
• Extending the length of context strings in an object-sensitive analysis has little benefit.
• Collapsing cycles in the call graph is not a good idea for Java.

LhotakHendrenCC06

Liang/Harrold Evaluate CS on Andersen’s Analysis for Java

• CS results in more precise reference information in some benchmarks.
  – Both call-string contexts and receiver contexts are useful (in different benchmarks).
  – In some benchmarks CS makes no difference.

• They use precise models to simulate collection and map classes.

LiangPenningsHarroldPASTE05

k-limiting object names

• In the code below, the number of object names (shadows) is unbounded.

• Landi and Ryder limit the number of shadows to k.

• All object names with more than k dereferences are represented by the same name (shadow).

LandiRyderPLDI92

#include <stddef.h>  /* for NULL */

typedef struct CELL {
    int number;
    struct CELL *next;
} cell;
cell *head;

int FindMax(cell *cursor)
{
    int local_max;
    if (cursor == NULL)
        return 0;
    local_max = cursor->number;
    /* Each iteration follows one more next link, so the object names
       cursor->next, cursor->next->next, ... are unbounded. */
    for ( ; cursor->next != NULL; cursor = cursor->next)
    {
        if (cursor->number > local_max)
            local_max = cursor->number;
    }
    return local_max;
}
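The truncation of object names can be sketched as follows. This is an illustration of the idea only: the class name KLimitNames and the trailing "#" used to mark a truncated name are assumptions, not necessarily the notation of Landi and Ryder.

```java
// Illustrative only: KLimitNames and the '#' marker are hypothetical.
final class KLimitNames {
    // An object name is a base variable followed by a chain of field
    // dereferences. Names with more than k dereferences are cut after the
    // k-th one, so all of them share a single representative (shadow) name.
    static String limit(String base, String[] derefs, int k) {
        StringBuilder name = new StringBuilder(base);
        int kept = Math.min(k, derefs.length);
        for (int i = 0; i < kept; i++) {
            name.append("->").append(derefs[i]);
        }
        if (derefs.length > k) {
            name.append("#");  // marks a truncated name
        }
        return name.toString();
    }

    public static void main(String[] args) {
        String[] three = {"next", "next", "next"};
        String[] four = {"next", "next", "next", "next"};
        System.out.println(limit("cursor", three, 2));
        System.out.println(limit("cursor", four, 2));
    }
}
```

With k = 2, every name reached through three or more next links in the list traversal maps to the same shadow, which bounds the number of names the analysis has to track.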


Heap Cloning

• For each procedure, create a graphical representation of the heap objects that are manipulated by the procedure (allocated, assigned to, referenced, etc)

• Traverse the call graph cloning the graph of the callee into each call site.

LattnerLenharthAdvePLDI07

Dealing with Cloning Complexity

• Use unification-based analysis so that many clones are merged together;

• Do not clone unreachable objects from a callee into a caller;
  – For example, objects whose scope is entirely within the callee are not cloned;

• Merge (instead of cloning) global variables;

LattnerPhD05

Recursive functions

• Abandon context-sensitivity in strongly connected components (SCCs) of the call graph.
  – Merge the graphs for all functions in the SCC.

LattnerPhD05

Heap Specialization

• Heap specialization: clone heap objects along call chains (paths in the call graph).

• Nystrom et al. propose that only heap objects that escape the callee need to be cloned.

• They observe, empirically, that if the only exposure of an escaped object is through a global variable, there is no benefit in cloning.

• Their analysis is flow-insensitive, Andersen-style.

NystromKimHwuPASTE04

Demand-driven Pointer Analysis

• Aimed at JIT compilers. Only analyzes the portions of the program relevant to queries.

• Achieves 90% of the precision of a field-sensitive Andersen’s analysis within 2 ms per query.

SridharanGopanShanBodikOOPSLA05

Incremental/Compositional/Partial Pointer/Escape Analysis for Java

• Generate parameterized analysis results for each method.
  – Recursive methods use a fixed-point iterative algorithm.
  – Analyze each method independently of its callers.
  – Trade precision for time: can analyze a method without analyzing all the methods that it invokes.
  – Function summaries are flow-insensitive.
  – Based on “points-to escape graphs”:
     • (inside nodes/edges, outside nodes/edges, return value)
• Slow. Complexity of O(N^10), where N is the number of instructions in the scope of the analysis:
  – compress is 3 times slower to compile with the analysis.

VivienRinardPLDI01, WhaleyRinardOOPSLA99, SalcianuPhDMIT01

On-demand and Incremental Region-based Shape Analysis for C

• Main idea: break down the abstraction into smaller components and analyze each component separately.
  – Use a “cheap” flow-insensitive and context-sensitive pointer analysis to partition the memory into disjoint regions. Each node in the points-to graph represents a “memory region”.
     • Regions must be disjoint.
• Interprocedural propagation: uses a pair of input/output transfer functions for each function.
• On-demand: can limit interprocedural propagation to a set of regions.
• Incremental: can reuse results from previously analyzed regions.
• Analyzes OpenSSH (18.6 KLOC) in 45 seconds.

HackettRuginaPOPL05

Refinement-Based On-Demand CS points-to analysis for Java

• Based on context-free-language (CFL) reachability.
  – The context-free language L represents paths in the program that might cause a variable to point to an abstract location.
  – The balanced-parentheses property filters out unrealizable paths:
     • call/return pairs must match
     • in Java, stores/loads to fields must also match (the same is not true for C).
  – Significant increase in precision relative to context-insensitive analysis.
  – 13 minutes to analyze polyglot.

SridharanBodikPOPL05
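The call/return matching can be sketched as a stack check over the edge labels of a path. This is a simplified illustration of the balanced-parentheses idea only, not the refinement-based algorithm itself; the class name RealizablePath and the label encoding ("(i" for entering a callee from call site i, ")i" for returning to call site i) are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative only: RealizablePath and its label encoding are hypothetical.
final class RealizablePath {
    // Returns true if every return on the path matches the most recent
    // unmatched call. Labels that are neither calls nor returns are
    // intraprocedural edges and are ignored.
    static boolean isRealizable(List<String> labels) {
        Deque<String> pendingCalls = new ArrayDeque<>();
        for (String label : labels) {
            if (label.startsWith("(")) {
                pendingCalls.push(label.substring(1));
            } else if (label.startsWith(")")) {
                // A return with no pending call is allowed: the path may
                // have started inside the callee.
                if (!pendingCalls.isEmpty()
                        && !pendingCalls.pop().equals(label.substring(1))) {
                    return false;
                }
            }
        }
        // Calls left pending are allowed: the path may end inside a callee.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isRealizable(Arrays.asList("(1", "assign", ")1")));
        System.out.println(isRealizable(Arrays.asList("(1", "assign", ")2")));
    }
}
```

The second path enters a callee from call site 1 and returns to call site 2, so a context-sensitive analysis discards it, while a context-insensitive analysis would propagate points-to facts along it.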

References

• M. Sharir and A. Pnueli, “Two approaches to interprocedural data flow analysis,” in Program Flow Analysis: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1981, pp. 189-234.
• S. Horwitz, T. Reps, D. Binkley, “Interprocedural slicing using dependence graphs,” TOPLAS 1990 12(1):26-60.
• W. Landi, B. G. Ryder, “A safe approximate algorithm for interprocedural aliasing,” PLDI 1992, pp. 235-248.
• M. Emami, R. Ghiya, L. J. Hendren, “Context-sensitive interprocedural points-to analysis in the presence of function pointers,” PLDI 1994, pp. 242-246.
• J.-D. Choi, R. Cytron, J. Ferrante, “On the Efficient Engineering of Ambitious Program Analysis,” IEEE Trans. on Software Engineering, Vol. 20, No. 2, Feb. 1994, pp. 105-114.
  – Describes Factored SSA (FSSA).
• E. Ruf, “Context-insensitive alias analysis reconsidered,” PLDI 1995, pp. 13-22.
• R. P. Wilson, M. S. Lam, “Efficient context-sensitive pointer analysis for C programs,” PLDI 1995, pp. 1-12.
  – Partial Transfer Functions are not practical.
• J. Whaley, M. Rinard, “Compositional Pointer and Escape Analysis for Java Programs,” OOPSLA 1999, pp. 187-206.
• M. Fähndrich, J. Rehof, M. Das, “Scalable context-sensitive flow analysis using instantiation constraints,” PLDI 2000, pp. 253-263.
  – Based exclusively on types. Unification-based at the intra-procedural level.
• F. Vivien, M. Rinard, “Incrementalized Pointer and Escape Analysis,” PLDI 2001, pp. 35-46.
• M. Berndl, O. Lhoták, F. Qian, L. Hendren, N. Umanee, “Points-to analysis using BDDs,” PLDI 2003, pp. 103-114.
• J. Zhu, S. Calman, “Symbolic pointer analysis revisited,” PLDI 2004, pp. 145-157.
  – Treats call-graph cycles context-insensitively, which loses precision in Java.
• J. Whaley, M. S. Lam, “Cloning-based context-sensitive pointer alias analysis using binary decision diagrams,” PLDI 2004, pp. 131-144.
  – Treats call-graph cycles context-insensitively, which loses precision in Java (lots of contexts, with no practical way to use them).
• E. M. Nystrom, H.-S. Kim, W. W. Hwu, “Importance of Heap Specialization in Pointer Analysis,” PASTE 2004, pp. 43-48.
• D. Liang, M. Pennings, M. J. Harrold, “Evaluating the Impact of Context-Sensitivity on Andersen’s Algorithm for Java Programs,” PASTE 2005, pp. 6-12.
• B. Hackett, R. Rugina, “Region-Based Shape Analysis with Tracked Locations,” POPL 2005, pp. 310-323.
• M. Sridharan, D. Gopan, L. Shan, R. Bodik, “Demand-Driven Points-to Analysis for Java,” OOPSLA 2005, pp. 59-76.
• O. Lhoták, L. Hendren, “Context-Sensitive Points-to Analysis: Is It Worth It?,” Compiler Construction 2006, pp. 47-64.
• C. Lattner, A. Lenharth, V. Adve, “Making context-sensitive points-to analysis with heap cloning practical for the real world,” PLDI 2007, pp. 278-289.