104
ๅ— ไบฌ ๅคง ๅญฆ ๆŽ ๆจพ ่ฐญ ๆทป ่ฎก ็ฎ— ๆœบ ็ง‘ ๅญฆ ไธŽ ๆŠ€ ๆœฏ ็ณป ็จ‹ ๅบ ่ฎพ ่ฎก ่ฏญ ่จ€ ้™ ๆ€ ๅˆ† ๆž ็ ” ็ฉถ ็ป„ ไธŽ ่ฝฏ ไปถ ๅˆ† ๆž

Nanjing University

  • Upload
    others

  • View
    14

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Nanjing University

ๅ—ไบฌๅคงๅญฆ

ๆŽๆจพ

่ฐญๆทป

่ฎก็ฎ—ๆœบ็ง‘ๅญฆไธŽๆŠ€ๆœฏ็ณป

็จ‹ๅบ่ฎพ่ฎก่ฏญ่จ€

้™ๆ€ๅˆ†ๆž็ ”็ฉถ็ป„

ไธŽ

่ฝฏไปถๅˆ†ๆž

Page 2: Nanjing University

Nanjing University

Tian Tan

2021

Pointer Analysis

Static Program Analysis

Page 3: Nanjing University

Contents

1. Motivation2. Introduction to Pointer Analysis3. Key Factors of Pointer Analysis4. Concerned Statements

3Tian Tan @ Nanjing University

Page 4: Nanjing University

Contents

1. Motivation2. Introduction to Pointer Analysis3. Key Factors of Pointer Analysis4. Concerned Statements

4Tian Tan @ Nanjing University

Page 5: Nanjing University

Problem of CHA

5

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

Tian Tan @ Nanjing University

Page 6: Nanjing University

Problem of CHA

6

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA:

โ€ข call targets

Tian Tan @ Nanjing University

Page 7: Nanjing University

Problem of CHA

7

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based onclass hierarchyโ€ข 3 call targets

Tian Tan @ Nanjing University

Page 8: Nanjing University

Problem of CHA

8

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based onclass hierarchyโ€ข 3 call targets

Constant propagationโ€ข x =

Tian Tan @ Nanjing University

Page 9: Nanjing University

Problem of CHA

9

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based onclass hierarchyโ€ข 3 call targets

Constant propagationโ€ข x = NAC

Tian Tan @ Nanjing University

Page 10: Nanjing University

Problem of CHA

10

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based on only considersclass hierarchyโ€ข 3 call targetsโ€ข 2 false positives

Constant propagationโ€ข x = NACX X

Tian Tan @ Nanjing University

imprecise

Page 11: Nanjing University

Via Pointer Analysis

11

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based on only considersclass hierarchyโ€ข 3 call targetsโ€ข 2 false positives

Constant propagationโ€ข x = NAC

Pointer analysis: based on points-to relationโ€ข 1 call target

Tian Tan @ Nanjing University

imprecise

n points to new One

Page 12: Nanjing University

Via Pointer Analysis

12

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based on only considersclass hierarchyโ€ข 3 call targetsโ€ข 2 false positives

Constant propagationโ€ข x = NAC

Pointer analysis: based on points-to relationโ€ข 1 call target

Constant propagationโ€ข x = 1

Tian Tan @ Nanjing University

imprecise

n points to new One

Page 13: Nanjing University

Via Pointer Analysis

13

void foo() {Number n = new One();int x = n.get();

}

interface Number {int get();

}class Zero implements Number {

public int get() { return 0; }}class One implements Number {

public int get() { return 1; }}class Two implements Number {

public int get() { return 2; }}

CHA: based on only considersclass hierarchyโ€ข 3 call targetsโ€ข 2 false positives

Constant propagationโ€ข x = NAC

Pointer analysis: based on points-to relationโ€ข 1 call targetโ€ข 0 false positive

Constant propagationโ€ข x = 1

Tian Tan @ Nanjing University

imprecise

precise

n points to new One

Page 14: Nanjing University

Contents

14Tian Tan @ Nanjing University

1. Motivation2. Introduction to Pointer Analysis3. Key Factors of Pointer Analysis4. Concerned Statements

Page 15: Nanjing University

Pointer Analysisโ€ข A fundamental static analysis

โ€ข Computes which memory locations a pointer can point to

15Tian Tan @ Nanjing University

Page 16: Nanjing University

Pointer Analysisโ€ข A fundamental static analysis

โ€ข Computes which memory locations a pointer can point to

โ€ข For object-oriented programs (focus on Java)โ€ข Computes which objects a pointer (variable or field) can point to

16Tian Tan @ Nanjing University

Page 17: Nanjing University

Pointer Analysisโ€ข A fundamental static analysis

โ€ข Computes which memory locations a pointer can point to

โ€ข For object-oriented programs (focus on Java)โ€ข Computes which objects a pointer (variable or field) can point to

โ€ข Regarded as a may-analysisโ€ข Computes an over-approximation of the set of objects that a pointer

can point to, i.e., we ask โ€œa pointer may point to which objects?โ€

17Tian Tan @ Nanjing University

Page 18: Nanjing University

Pointer Analysisโ€ข A fundamental static analysis

โ€ข Computes which memory locations a pointer can point to

โ€ข For object-oriented programs (focus on Java)โ€ข Computes which objects a pointer (variable or field) can point to

โ€ข Regarded as a may-analysisโ€ข Computes an over-approximation of the set of objects that a pointer

can point to, i.e., we ask โ€œa pointer may point to which objects?โ€

18

A research area with 40+ years of history William E. Weihl, โ€œInterprocedural Data Flow Analysis in the Presence

of Pointers, Procedure Variables, and Label Variablesโ€. POPL 1980.Still an active area today

OOPSLAโ€™18, FSEโ€™18, TOPLASโ€™19, OOPSLAโ€™19, TOPLASโ€™20, OOPSLAโ€™21 โ€ฆTian Tan @ Nanjing University

Page 19: Nanjing University

Example

19

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

โ€œWhich objects a pointer can point to?โ€

Program Points-to relations

Tian Tan @ Nanjing University

Page 20: Nanjing University

Example

20

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Variable Object

a new A

x new B

Program Points-to relations

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 21: Nanjing University

Example

Variable Object

a new A

x new B

this

b

21

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Program Points-to relations

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 22: Nanjing University

Example

Variable Object

a new A

x new B

this new A

b new B

22

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Program Points-to relations

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 23: Nanjing University

Example

Variable Object

a new A

x new B

this new A

b new B

23

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Field Object

new A.b new B

Program

โ€œWhich objects a pointer can point to?โ€

Points-to relations

Tian Tan @ Nanjing University

Page 24: Nanjing University

Example

24

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Field Object

new A.b new B

Variable Object

a new A

x new B

this new A

b new B

y

Program Points-to relations

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 25: Nanjing University

Example

25

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Field Object

new A.b new B

Variable Object

a new A

x new B

this new A

b new B

y new B

Program Points-to relations

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 26: Nanjing University

Example

Program Points-to relations

26

void foo() {A a = new A();B x = new B();a.setB(x);B y = a.getB();

}

class A {B b;void setB(B b) { this.b = b; }B getB() { return this.b; }

}

Field Object

new A.b new B

Variable Object

a new A

x new B

this new A

b new B

y new B

Pointer Analysis

input output

โ€œWhich objects a pointer can point to?โ€

Tian Tan @ Nanjing University

Page 27: Nanjing University

Pointer Analysis and Alias Analysis

Two closely related but different conceptsโ€ข Pointer analysis: which objects a pointer can point to?โ€ข Alias analysis: can two pointers point to the same object?

Tian Tan @ Nanjing University 27

Page 28: Nanjing University

Pointer Analysis and Alias Analysis

Two closely related but different conceptsโ€ข Pointer analysis: which objects a pointer can point to?โ€ข Alias analysis: can two pointers point to the same object?

If two pointers, say p and q, refer to the same object, then pand q are aliases

Tian Tan @ Nanjing University 28

p = new C();q = p;x = new X();y = new Y();

p and q are aliasesx and y are not aliases

Page 29: Nanjing University

Pointer Analysis and Alias Analysis

Two closely related but different conceptsโ€ข Pointer analysis: which objects a pointer can point to?โ€ข Alias analysis: can two pointers point to the same object?

If two pointers, say p and q, refer to the same object, then pand q are aliases

Tian Tan @ Nanjing University 29

Alias information can be derived from points-to relations

p = new C();q = p;x = new X();y = new Y();

p and q are aliasesx and y are not aliases

Page 30: Nanjing University

Applications of Pointer Analysis

โ€ข Fundamental informationoCall graph, aliases, โ€ฆ

โ€ข Compiler optimizationoVirtual call inlining, โ€ฆ

โ€ข Bug detectionoNull pointer detection, โ€ฆ

โ€ข Security analysiso Information flow analysis, โ€ฆ

โ€ข And many more โ€ฆ

30Tian Tan @ Nanjing University

โ€œPointer analysis is one of the most fundamental static program analyses,

on which virtually all others are built.โ€*

Page 31: Nanjing University

Applications of Pointer Analysis

โ€ข Fundamental informationoCall graph, aliases, โ€ฆ

โ€ข Compiler optimizationoVirtual call inlining, โ€ฆ

โ€ข Bug detectionoNull pointer detection, โ€ฆ

โ€ข Security analysiso Information flow analysis, โ€ฆ

โ€ข And many more โ€ฆ

31

โ€œPointer analysis is one of the most fundamental static program analyses,

on which virtually all others are built.โ€*

*Pointer Analysis - Report from Dagstuhl Seminar 13162. 2013.Tian Tan @ Nanjing University

Page 32: Nanjing University

Contents

32Tian Tan @ Nanjing University

1. Motivation2. Introduction to Pointer Analysis3. Key Factors of Pointer Analysis4. Concerned Statements

Page 33: Nanjing University

Key Factors in Pointer Analysis

33

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 34: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

34

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 35: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

35

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 36: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

36

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 37: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

37

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 38: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

38

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 39: Nanjing University

Heap Abstraction

How to model heap memory?โ€ข In dynamic execution, the number of heap objects can be unbounded

due to loops and recursion

39

for (โ€ฆ) {A a = new A();

}

Tian Tan @ Nanjing University

Page 40: Nanjing University

Heap Abstraction

How to model heap memory?โ€ข In dynamic execution, the number of heap objects can be unbounded

due to loops and recursion

โ€ข To ensure termination, heap abstraction models dynamically allocated, unbounded concrete objects as finite abstract objects for static analysis

40

for (โ€ฆ) {A a = new A();

}

Tian Tan @ Nanjing University

Page 41: Nanjing University

โ€ฆ

Heap Abstraction

How to model heap memory?โ€ข In dynamic execution, the number of heap objects can be unbounded

due to loops and recursion

โ€ข To ensure termination, heap abstraction models dynamically allocated, unbounded concrete objects as finite abstract objects for static analysis

41

Dynamic execution Static analysis

abstracted

Bounded abstract objectsUnbounded concrete objects

for (โ€ฆ) {A a = new A();

}

Tian Tan @ Nanjing University

Page 42: Nanjing University

Heap Abstraction

42Tian Tan @ Nanjing UniversityVini Kanvar, Uday P. Khedker, โ€œHeap Abstractions for Static Analysisโ€. ACM CSUR 2016

Page 43: Nanjing University

Heap Abstraction

43Tian Tan @ Nanjing UniversityVini Kanvar, Uday P. Khedker, โ€œHeap Abstractions for Static Analysisโ€. ACM CSUR 2016

Page 44: Nanjing University

Allocation-Site Abstraction

โ€ข Model concrete objects by their allocation sitesโ€ข One abstract object per allocation site to represent

all its allocated concrete objects

44

The most commonly-used heap abstraction

Tian Tan @ Nanjing University

Page 45: Nanjing University

Allocation-Site Abstraction

โ€ข Model concrete objects by their allocation sitesโ€ข One abstract object per allocation site to represent

all its allocated concrete objects

45

1 for (i = 0; i < 3; ++i) {2 a = new A();3 โ€ฆ4 }

Dynamic execution

๐‘œ๐‘œ2, iteration i = 0๐‘œ๐‘œ2, iteration i = 1๐‘œ๐‘œ2, iteration i = 2

Tian Tan @ Nanjing University

The most commonly-used heap abstraction

Page 46: Nanjing University

Allocation-Site Abstraction

โ€ข Model concrete objects by their allocation sitesโ€ข One abstract object per allocation site to represent

all its allocated concrete objects

46

1 for (i = 0; i < 3; ++i) {2 a = new A();3 โ€ฆ4 }

๐‘œ๐‘œ2

Dynamic executionAllocation-site

abstraction

๐‘œ๐‘œ2, iteration i = 0๐‘œ๐‘œ2, iteration i = 1๐‘œ๐‘œ2, iteration i = 2

abstracted

Tian Tan @ Nanjing University

The most commonly-used heap abstraction

Page 47: Nanjing University

Allocation-Site Abstraction

โ€ข Model concrete objects by their allocation sitesโ€ข One abstract object per allocation site to represent

all its allocated concrete objects

47

1 for (i = 0; i < 3; ++i) {2 a = new A();3 โ€ฆ4 }

๐‘œ๐‘œ2

Dynamic execution

๐‘œ๐‘œ2, iteration i = 0๐‘œ๐‘œ2, iteration i = 1๐‘œ๐‘œ2, iteration i = 2

abstracted

Tian Tan @ Nanjing University

The number of allocation sites in a program is bounded,

thus the abstract objects must be finite.

The most commonly-used heap abstraction

Allocation-site abstraction

Page 48: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

48

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 49: Nanjing University

Context SensitivityHow to model calling contexts?

49

Context-sensitive Context-insensitive

Distinguish different calling contexts of a method

Merge all calling contexts of a method

Analyze each method multiple times, once for each context

Analyze each method once

Tian Tan @ Nanjing University

Page 50: Nanjing University

Context SensitivityHow to model calling contexts?

50

a.foo(x); b.foo(y);

Context 1:void foo(T p) {

โ€ฆ}

Context-sensitive Context-insensitive

Distinguish different calling contexts of a method

Merge all calling contexts of a method

Analyze each method multiple times, once for each context

Analyze each method once

Context 2:void foo(T p) {

โ€ฆ}

Tian Tan @ Nanjing University

Page 51: Nanjing University

Context SensitivityHow to model calling contexts?

51

a.foo(x); b.foo(y);

Context 1:void foo(T p) {

โ€ฆ}

Context-sensitive Context-insensitive

Distinguish different calling contexts of a method

Merge all calling contexts of a method

Analyze each method multiple times, once for each context

Analyze each method once

Context 2:void foo(T p) {

โ€ฆ}

Tian Tan @ Nanjing University

a.foo(x); b.foo(y);

void foo(T p) {โ€ฆ

}

Page 52: Nanjing University

Context SensitivityHow to model calling contexts?

52

a.foo(x); b.foo(y);

Context 1:void foo(T p) {

โ€ฆ}

Context-sensitive Context-insensitive

Distinguish different calling contexts of a method

Merge all calling contexts of a method

Analyze each method multiple times, once for each context

Analyze each method once

Context 2:void foo(T p) {

โ€ฆ}

Tian Tan @ Nanjing University

a.foo(x); b.foo(y);

void foo(T p) {โ€ฆ

}

Page 53: Nanjing University

Context SensitivityHow to model calling contexts?

53

a.foo(x); b.foo(y);

Context 1:void foo(T p) {

โ€ฆ}

Context-sensitive Context-insensitive

Distinguish different calling contexts of a method

Merge all calling contexts of a method

Analyze each method multiple times, once for each context

Analyze each method once

Context 2:void foo(T p) {

โ€ฆ}

Tian Tan @ Nanjing University

a.foo(x); b.foo(y);

void foo(T p) {โ€ฆ

}

Very useful technique Significantly improve precision More details in later lectures

We start with this

Page 54: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

54

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 55: Nanjing University

Flow SensitivityHow to model control flow?

55

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

Tian Tan @ Nanjing University

Page 56: Nanjing University

Flow SensitivityHow to model control flow?

56

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

Tian Tan @ Nanjing University

So far, all data-flow analyseswe have learnt are flow-sensitive

Page 57: Nanjing University

Flow SensitivityHow to model control flow?

57

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

Tian Tan @ Nanjing University

Page 58: Nanjing University

Flow SensitivityHow to model control flow?

58

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}

Tian Tan @ Nanjing University

Page 59: Nanjing University

Flow SensitivityHow to model control flow?

59

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}

c โž {o1}o1.f โž {"x"}

Tian Tan @ Nanjing University

Page 60: Nanjing University

Flow SensitivityHow to model control flow?

60

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž

c โž {o1}

c โž {o1}o1.f โž {"x"}

Tian Tan @ Nanjing University

Page 61: Nanjing University

Flow SensitivityHow to model control flow?

61

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

Tian Tan @ Nanjing University

Page 62: Nanjing University

Flow SensitivityHow to model control flow?

62

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

Page 63: Nanjing University

Flow SensitivityHow to model control flow?

63

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 64: Nanjing University

Flow SensitivityHow to model control flow?

64

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 65: Nanjing University

Flow SensitivityHow to model control flow?

65

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 66: Nanjing University

Flow SensitivityHow to model control flow?

66

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž {"x", "y"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 67: Nanjing University

Flow SensitivityHow to model control flow?

67

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž {"x", "y"}s โž

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 68: Nanjing University

Flow SensitivityHow to model control flow?

68

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž {"x", "y"}s โž {"x", "y"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Page 69: Nanjing University

Flow SensitivityHow to model control flow?

69

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž {"x", "y"}s โž {"x", "y"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

false positive

Page 70: Nanjing University

Flow SensitivityHow to model control flow?

70

Flow-sensitive Flow-insensitive

Respect the execution order of the statements

Ignore the control-flow order, treat the program as a set of unordered statements

Maintain a map of points-to relations at each program location

Maintain one map of points-to relations for the whole program

1 c = new C();2 c.f = "x";3 s = c.f;4 c.f = "y";

c โž {o1}o1.f โž {"x"}s โž {"x"}

c โž {o1}o1.f โž {"x", "y"}s โž {"x", "y"}

c โž {o1}

c โž {o1}o1.f โž {"x"}

c โž {o1}o1.f โž {"y"}s โž {"x"}

Chosen in this course

Page 71: Nanjing University

Key Factors in Pointer Analysis

Factor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

71

โ€ข Pointer analysis is a complex systemโ€ข Multiple factors affect the precision and efficiency of the system

Tian Tan @ Nanjing University

Page 72: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

72

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

Tian Tan @ Nanjing University

Page 73: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

73

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

1 x = new A();2 y = x;3 โ€ฆ4 z = new T();5 z.bar();

Tian Tan @ Nanjing University

Page 74: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

74

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

1 x = new A();2 y = x;3 โ€ฆ 4 z = new T();5 z.bar();

x โž {o1}y โž {o1}z โž {o4}

Tian Tan @ Nanjing University

Page 75: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

75

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

1 x = new A();2 y = x;3 โ€ฆ4 z = new T();5 z.bar();

x โž {o1}y โž {o1}z โž {o4}

Client: call graph constructionSite of interest: line 5

What points-to information do we need

Tian Tan @ Nanjing University

Page 76: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

76

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

1 x = new A();2 y = x;3 โ€ฆ4 z = new T();5 z.bar();

x โž {o1}y โž {o1}z โž {o4}

Client: call graph constructionSite of interest: line 5

z โž {o4}

Tian Tan @ Nanjing University

Page 77: Nanjing University

Analysis ScopeWhich parts of program should be analyzed?

77

Whole-program Demand-driven

Compute points-to information for all pointers in the program

Only compute points-to information for the pointers that may affect specific sites of interest (on demand)

Provide information for all possible clients Provide information for specific clients

1 x = new A();2 y = x;3 โ€ฆ 4 z = new T();5 z.bar();

Chosen in this course

Client: call graph constructionSite of interest: line 5

z โž {o4}

x โž {o1}y โž {o1}z โž {o4}

Tian Tan @ Nanjing University

Page 78: Nanjing University

Pointer Analysis in This CourseFactor Problem ChoiceHeap abstraction

How to model heap memory?

โ€ข Allocation-siteโ€ข Storeless

Context sensitivity

How to model calling contexts?

โ€ข Context-sensitiveโ€ข Context-insensitive

Flow sensitivity

How to model control flow?

โ€ข Flow-sensitiveโ€ข Flow-insensitive

Analysis scope Which parts of program should be analyzed?

โ€ข Whole-programโ€ข Demand-driven

78Tian Tan @ Nanjing University

Page 79: Nanjing University

Contents

79Tian Tan @ Nanjing University

1. Motivation2. Introduction to Pointer Analysis3. Key Factors of Pointer Analysis4. Concerned Statements

Page 80: Nanjing University

What Do We Analyze?โ€ข Modern languages typically have many kinds of statements

โ€ข if-elseโ€ข switch-caseโ€ข for/while/do-whileโ€ข break/continueโ€ข โ€ฆ

80Tian Tan @ Nanjing University

Page 81: Nanjing University

What Do We Analyze?โ€ข Modern languages typically have many kinds of statements

โ€ข if-elseโ€ข switch-caseโ€ข for/while/do-whileโ€ข break/continueโ€ข โ€ฆ

โ€ข We only focus on pointer-affecting statements

81

Do not directly affect pointers Ignored in pointer analysis

Tian Tan @ Nanjing University

Page 82: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 82

Page 83: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 83

Page 84: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 84

Sometimes referred as global variable

Page 85: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 85

Modeled as an object (pointed by x) with a field f

Page 86: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 86

Ignore indexes. Modeled as an object (pointed by array)

with a single field, say arr, which may point to any value

stored in array

array = new String[10];array[0] = "x";array[1] = "y";s = array[0];

array = new String[];array.arr = "x";array.arr = "y";s = array.arr;

Real code Perspective of pointer analysis

Page 87: Nanjing University

Pointers in Java

โ€ข Local variable: x

โ€ข Static field: C.f

โ€ข Instance field: x.f

โ€ข Array element: array[i]

Tian Tan @ Nanjing University 87

Page 88: Nanjing University

Pointer-Affecting Statements

Tian Tan @ Nanjing University 88

New x = new T()

Assign x = y

Store x.f = y

Load y = x.f

Call r = x.k(a, โ€ฆ)

Page 89: Nanjing University

Pointer-Affecting Statements

Tian Tan @ Nanjing University 89

New x = new T()

Assign x = y

Store x.f = y

Load y = x.f

Call r = x.k(a, โ€ฆ)

x.f.g.h = y;

t1 = x.ft2 = t1.gt2.h = y;

Complex memory-accesses will be converted to three-address code by

introducing temporary variables

Page 90: Nanjing University

Pointer-Affecting Statements

Tian Tan @ Nanjing University 90

New x = new T()

Assign x = y

Store x.f = y

Load y = x.f

Call r = x.k(a, โ€ฆ)

โ€ข Static call C.foo()

โ€ข Special call super.foo()/x.<init>()/this.privateFoo()

โ€ข Virtual call x.foo()

Page 91: Nanjing University

โ€ข Static call C.foo()

โ€ข Special call super.foo()/x.<init>()/this.privateFoo()

โ€ข Virtual call x.foo()

Pointer-Affecting Statements

Tian Tan @ Nanjing University 91

New x = new T()

Assign x = y

Store x.f = y

Load y = x.f

Call r = x.k(a, โ€ฆ)

focus

Page 92: Nanjing University

The X You Need To Understand in This Lecture

โ€ข What is pointer analysis?

โ€ข Understand the key factors of pointer analysis

โ€ข Understand what we analyze in pointer analysis

Tian Tan @ Nanjing University

Page 93: Nanjing University

Nanjing University

Tian Tan

2020

Pointer Analysis

Static Program Analysis

Foundations (I)

Page 94: Nanjing University

Contents

1. Pointer Analysis: Rules2. How to Implement Pointer Analysis3. Pointer Analysis: Algorithms4. Pointer Analysis with Method Calls

94Tian Tan @ Nanjing University

Page 95: Nanjing University

Contents

1. Pointer Analysis: Rules2. How to Implement Pointer Analysis3. Pointer Analysis: Algorithms4. Pointer Analysis with Method Calls

95Tian Tan @ Nanjing University

Page 96: Nanjing University

Pointer-Affecting Statements

Tian Tan @ Nanjing University 96

New x = new T()

Assign x = y

Store x.f = y

Load y = x.f

Call r = x.k(a, โ€ฆ) Will come back to this inpointer analysis with method calls

First focus on these statements(suppose the program has just one method)

Page 97: Nanjing University

Domain and Notations

97

Variables: x, y โˆˆ VFields: f, g โˆˆ FObjects: oi, oj โˆˆ OInstance fields: oi.f, oj.g โˆˆ O ร— FPointers: Pointer = V โ‹ƒ (O ร— F)Points-to relations: pt : Pointer โ†’ ๐’ซ๐’ซ(O)

โ€ข ๐’ซ๐’ซ(O) denotes the powerset of Oโ€ข pt(p) denotes the points-to set of p

Tian Tan @ Nanjing University

Page 98: Nanjing University

Rules

Kind Statement Rule

New i: x = new T() ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Assign x = y ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Store x.f = y ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฆ๐‘ฆ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“)

Load y = x.f ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)

98Tian Tan @ Nanjing University

Page 99: Nanjing University

Rules

Kind Statement Rule

New i: x = new T() ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Assign x = y ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Store x.f = y ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฆ๐‘ฆ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“)

Load y = x.f ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)

99Tian Tan @ Nanjing University

โ† premises

โ† conclusion

โ† unconditional

Page 100: Nanjing University

Rule: New

100

i: x = new T()

๐‘œ๐‘œ๐‘–๐‘–Conclusion

๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Tian Tan @ Nanjing University

Page 101: Nanjing University

Rule: Assign

101

๐‘œ๐‘œ๐‘–๐‘–Conclusion

x = y

Premises

๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Tian Tan @ Nanjing University

Page 102: Nanjing University

Rule: Store

102

๐‘œ๐‘œ๐‘–๐‘–

x.f = y

๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฆ๐‘ฆ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘œ๐‘œ๐‘–๐‘– .๐‘“๐‘“)

๐‘œ๐‘œ๐‘—๐‘—๐‘“๐‘“

Tian Tan @ Nanjing University

ConclusionPremises

Page 103: Nanjing University

Rule: Load

103

๐‘œ๐‘œ๐‘—๐‘—

y = x.f

๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘œ๐‘œ๐‘–๐‘– .๐‘“๐‘“๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)

๐‘œ๐‘œ๐‘–๐‘–๐‘“๐‘“

Tian Tan @ Nanjing University

ConclusionPremises

Page 104: Nanjing University

RulesKind Rule Illustration

New ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ

Assign ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฅ๐‘ฅ)

Store ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฆ๐‘ฆ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“)

Load ๐‘œ๐‘œ๐‘–๐‘– โˆˆ ๐‘๐‘๐‘๐‘ ๐‘ฅ๐‘ฅ , ๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘ ๐‘œ๐‘œ๐‘–๐‘– . ๐‘“๐‘“๐‘œ๐‘œ๐‘—๐‘— โˆˆ ๐‘๐‘๐‘๐‘(๐‘ฆ๐‘ฆ)

104

ConclusionPremises

๐‘œ๐‘œ๐‘—๐‘—

y = x.f

๐‘œ๐‘œ๐‘–๐‘–๐‘“๐‘“

๐‘œ๐‘œ๐‘–๐‘–

x.f = y

๐‘œ๐‘œ๐‘—๐‘—๐‘“๐‘“

x = y

๐‘œ๐‘œ๐‘–๐‘–

i: x = new T()

๐‘œ๐‘œ๐‘–๐‘–

Tian Tan @ Nanjing University