EXPLORER: Query- and Demand-Driven Exploration of Interprocedural Control Flow Properties

Yu Feng, Xinyu Wang, Isil Dillig and Calvin Lin

UT Austin

Motivation

Many problems require answers to queries about control flow properties

Can foo() transitively invoke bar()?

Security: launch(), sendSMS, exec, loadLibrary...

Can foo() transitively invoke bar() without calling goo() in the middle?

Performance bugs: onclick() !asynctask, IOAccess, DB, Socket...

Steps to answer control flow queries

Step 1: Build a callgraph

Step 2: Customize analysis

Why is this suboptimal?

User must figure out what callgraph to use

CHA and RTA are fast but imprecise

Callgraphs built with pointer analysis (k-CFA, k-obj) are precise but not scalable

User must write a different analysis for each kind of query

Our insights

General framework to answer many different kinds of queries

Users do not need to write custom analyses

To give a precise answer to a query, we don’t need a callgraph that is precise everywhere

We only need to focus on the parts of the callgraph that are relevant to the query

Contributions

General query language for describing queries on control flow properties

Refinement-based algorithm for answering control flow queries

Only refine parts of callgraph relevant to the query

Overview


Key idea 1

Our query language: regular expression

Examples written in our query language

main foo

main .* foo

(.*(!foo)* bar)
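To make the query notation concrete, here is a minimal sketch that treats a query as a regular expression whose alphabet is method names and checks one call chain against it. It is not EXPLORER's implementation: the helper names `compile_query` and `matches`, the translation to Python's `re` module, and the extra query `main (!foo)* bar` are all assumptions made for illustration.

```python
import re

def compile_query(query: str) -> re.Pattern:
    """Translate a query over method names into a Python regex.

    A call chain is encoded as space-terminated names, e.g. "main goo foo ".
    Supported tokens: a method name, "." (any method), "!name" (any method
    except name), "*", and parentheses.
    """
    tokens = re.findall(r"![\w:]+|[\w:]+|\.|\*|\(|\)", query)
    parts = []
    for tok in tokens:
        if tok in ("(", ")", "*"):
            parts.append(tok)                      # structural regex syntax
        elif tok == ".":
            parts.append(r"(?:\S+ )")              # any single method
        elif tok.startswith("!"):
            name = re.escape(tok[1:])
            parts.append(rf"(?:(?!{name} )\S+ )")  # any method except `name`
        else:
            parts.append(rf"(?:{re.escape(tok)} )")
    return re.compile("".join(parts))

def matches(query: str, chain: list) -> bool:
    """Return True if the whole call chain is in the query's language."""
    encoded = "".join(method + " " for method in chain)
    return compile_query(query).fullmatch(encoded) is not None

print(matches("main foo", ["main", "foo"]))                 # True
print(matches("main .* foo", ["main", "goo", "foo"]))       # True
print(matches("main (!foo)* bar", ["main", "foo", "bar"]))  # False
```

This only tests a single, already known call chain. Answering a query over all chains of a program needs the automaton construction described in the later slides.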

Key idea 2

Focus on graph that is relevant to the query

foo .* bar

[Figure: callgraph with nodes foo, bar, hoo, goo]

[Figure: relevant part of the callgraph, with nodes foo, bar, goo]
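For the specific query foo .* bar, the relevant part can be sketched as the set of methods that are reachable from foo and can also reach bar. The call edges below are hypothetical, since the slide names the nodes but the extracted text does not preserve the edges. `reachable` and `relevant_part` are illustrative names, not part of EXPLORER.

```python
from collections import deque

def reachable(graph, start):
    """Breadth-first search: every node reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

def relevant_part(graph, source, target):
    """Nodes lying on some path from `source` to `target`."""
    reverse = {}
    for caller, callees in graph.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)
    return reachable(graph, source) & reachable(reverse, target)

# Hypothetical edges: foo calls goo and hoo, goo calls bar.
callgraph = {"foo": {"goo", "hoo"}, "goo": {"bar"}, "hoo": set(), "bar": set()}
print(sorted(relevant_part(callgraph, "foo", "bar")))  # ['bar', 'foo', 'goo']
```

Under these assumed edges, hoo drops out because it cannot reach bar, so no precision needs to be spent on its call sites.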

Solution

[Flowchart: Query Automaton and Callgraph Automaton feed into Product Automaton. Product Automaton, Empty: NO. Product Automaton, Not Empty: Min-Cut, Edges, Points-to Query, Refinement. Output: YES.]

Refine callgraph by issuing pts-to query

Minimize the number of queries
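The refine-until-empty loop in the flowchart can be sketched as follows. This is a simplified stand-in, not the authors' algorithm: it picks the edges to check from one witness path found by breadth-first search instead of computing a min-cut, and the points-to query is abstracted as a caller-supplied function `is_feasible`. All names are assumptions.

```python
from collections import deque

def find_path(edges, source, target):
    """Return the edges of some source-to-target path, or None if none exists."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while parent[node] is not None:
                path.append((parent[node], node))
                node = parent[node]
            return path[::-1]
        for (u, v) in edges:
            if u == node and v not in parent:
                parent[v] = u
                queue.append(v)
    return None

def answer_query(edges, source, target, is_feasible):
    """Answer "YES" or "NO", removing edges that the oracle refutes."""
    edges = set(edges)
    while True:
        path = find_path(edges, source, target)
        if path is None:
            return "NO"            # product is empty: no witness remains
        spurious = [e for e in path if not is_feasible(e)]
        if not spurious:
            return "YES"           # every edge on the witness was confirmed
        edges -= set(spurious)     # refine and try again
```

Each iteration either returns or removes at least one edge, so the loop terminates. Choosing the edges to check via min-cut, as the slides describe, aims to reach the same answer with fewer oracle calls.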

Example

void main(...) {
  A x; A y;
  if (...) { x = new A(); y = new B(); }
  else { x = new B(); y = new C(); }
  x.foo();
}

Example - Step 1

Query: Can A:foo transitively call C:bar?

Convert Query to Query Automaton

.* (A:foo) .* (C:bar)

[Figure: query automaton with states 1, 2, 3; transition 1 to 2 on A:foo, transition 2 to 3 on C:bar, and wildcard (.) self-loops]
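As a rough illustration, the query automaton for .* (A:foo) .* (C:bar) can be written as a transition table and simulated on a call chain. The encoding (`QUERY_NFA`, `ANY`, `accepts`) is an assumption made for this sketch. It follows the regular expression literally, so state 3 has no wildcard self-loop. Whether the slide's figure draws one there cannot be recovered from the extracted text.

```python
ANY = "."  # wildcard label: matches any method name

# state -> list of (label, next_state)
QUERY_NFA = {
    1: [(ANY, 1), ("A:foo", 2)],
    2: [(ANY, 2), ("C:bar", 3)],
    3: [],
}
START, ACCEPTING = 1, {3}

def accepts(nfa, start, accepting, word):
    """Simulate the NFA by tracking the set of states it may be in."""
    current = {start}
    for symbol in word:
        current = {
            nxt
            for state in current
            for label, nxt in nfa[state]
            if label == ANY or label == symbol
        }
    return bool(current & accepting)

print(accepts(QUERY_NFA, START, ACCEPTING, ["main", "A:foo", "A:woo", "C:bar"]))  # True
print(accepts(QUERY_NFA, START, ACCEPTING, ["main", "B:foo", "B:bar"]))           # False
```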

Example - Step 2

Convert callgraph to callgraph automaton

[Figure: callgraph with nodes main, A:foo, B:foo, A:bar, B:bar, C:bar, A:woo]

[Figure: callgraph automaton with states Entry, q_m, q_af, q_bf, q_ab, q_bb, q_aw, q_cb; transitions are labeled with method names: main, A:foo, B:foo, A:bar, B:bar, A:woo, C:bar]

Example - Step 3

Compute product automaton by intersecting query and callgraph automata

[Figure: product automaton with states (Entry,1), (q_m,1), (q_af,2), (q_aw,2), (q_cb,3), (q_bf,1), (q_aw,1), (q_ab,2), (q_bb,2); transition labels: main, A:foo, A:woo, C:bar, C:bar, B:foo, A:woo, A:bar, B:bar]

Won’t scale if we construct it naively

[Figure: reduced product automaton with states (Entry,1), (q_m,1), (q_af,2), (q_aw,2), (q_cb,3); transition labels: main, A:foo, A:woo, C:bar, C:bar]
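A standard way to avoid materializing every pair of states is to build the product on the fly, exploring only pairs reachable from the initial pair. The sketch below does that. Whether EXPLORER prunes the product in exactly this way is not stated on the slide, and the two input automata are hypothetical encodings chosen for illustration.

```python
from collections import deque

ANY = "."  # wildcard label in the query automaton

def product(cg, cg_start, query, q_start):
    """Explore only product states reachable from (cg_start, q_start)."""
    start = (cg_start, q_start)
    seen = {start}
    edges = []
    queue = deque([start])
    while queue:
        s, q = queue.popleft()
        for method, s2 in cg.get(s, []):
            for label, q2 in query.get(q, []):
                if label == ANY or label == method:
                    nxt = (s2, q2)
                    edges.append(((s, q), method, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return seen, edges

# Hypothetical callgraph automaton (subset of the example's edges).
cg = {
    "Entry": [("main", "q_m")],
    "q_m": [("A:foo", "q_af"), ("B:foo", "q_bf")],
    "q_af": [("A:woo", "q_aw")],
    "q_bf": [("A:woo", "q_aw")],
    "q_aw": [("C:bar", "q_cb")],
    "q_cb": [],
}
# Query automaton for .* (A:foo) .* (C:bar)
query = {1: [(ANY, 1), ("A:foo", 2)], 2: [(ANY, 2), ("C:bar", 3)], 3: []}

states, edges = product(cg, "Entry", query, 1)
print(("q_cb", 3) in states)  # True: an accepting product state is reachable
```

A reachable accepting pair such as (q_cb, 3) means the intersection is non-empty, which is the trigger for refinement.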

How to refine effectively

If intersection is non-empty: callgraph may have spurious edges and we need to refine

Perform min-cut on the product automaton to minimize the number of points-to queries

Weights on the edges are either 1 or infinite.
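A minimum cut with weights of 1 or infinity can be computed with any max-flow algorithm. The sketch below uses a plain Edmonds-Karp implementation. The reading of the weights is an assumption of this sketch, not something the slide states: weight 1 marks an edge that a points-to query could still refute, and infinite weight marks an edge that should never be chosen for the cut. Node names and the example graph are hypothetical.

```python
from collections import deque

INF = float("inf")

def min_cut(capacity, source, sink):
    """capacity: dict (u, v) -> weight. Returns the set of edges in a min cut."""
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)      # reverse edges for the residual graph
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, set()).add(v)

    def bfs():
        """Parents of all nodes reachable through positive residual capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = bfs()
    while sink in parent:
        # Bottleneck capacity along the augmenting path.
        bottleneck, v = INF, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        if bottleneck == INF:
            raise ValueError("no finite cut: a path uses only infinite edges")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        parent = bfs()
    side = set(parent)                       # source side of the cut
    return {(u, v) for (u, v) in capacity if u in side and v not in side}

capacity = {
    ("Entry", "m"): INF,
    ("m", "af"): 1,
    ("af", "aw"): 1,
    ("aw", "cb"): 1,
    ("af", "cb"): 1,
}
print(min_cut(capacity, "Entry", "cb"))  # {('m', 'af')}
```

In this example a single edge separates Entry from cb, so one query would suffice to decide whether the whole witness region is spurious, instead of one query per C:bar edge.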

Example - Step 4

Refine product automaton via min-cut and demand-driven pointer analysis

void main(...) {
  A x; A y;
  if (...) { x = new A(); y = new B(); }
  else { x = new B(); y = new C(); }
  x.foo();
}

[Figure: product automaton with states (Entry,1), (q_m,1), (q_af,2), (q_aw,2), (q_cb,3) and transition labels main, A:foo, A:woo, C:bar, C:bar; refinement removes the C:bar transitions]

Answer: NO!

Evaluation

Analysis of the observer design pattern in Java programs

Identification of performance bugs caused by GUI lagging in Android applications

Analysis of inter-component communication in Android

Evaluation

class MainActivity {
  onClick() {
    Cursor c = sqLiteDatabase.query(...);
  }
}

class MainActivity {
  onClick() {
    new LongOperation().execute("");
  }
}

class LongOperation extends AsyncTask {
  doInBackground(...) {
    Cursor c = sqLiteDatabase.query(...);
  }
}

.* onClick (!doInBackground)* query

Evaluation

Running time for detecting performance bugs (lower is better) [Liu et al. ICSE 2014]

[Bar chart: CHA, KOBJ, and EXPLORER on Ushahidi, c:geo, Geohash, FireFox, APG, BitcoinWallet, MyTrack; y-axis from 0 to 4000; one entry is marked T/O]

Evaluation

Number of warnings (lower is better)

[Bar chart: CHA, KOBJ, and EXPLORER on Ushahidi, c:geo, Geohash, FireFox, APG, BitcoinWallet, MyTrack; y-axis from 0 to 400; one entry is marked N/A]

Conclusion

EXPLORER is …

Query Driven

Demand Driven

Practical

http://fredfeng.github.io/explorer/

Thank you!

Sridharan et al. "Refinement-based context-sensitive points-to analysis for Java." PLDI 2006.

Liu et al. "Characterizing and detecting performance bugs for smartphone applications." ICSE 2014.

Christensen et al. "Precise analysis of string expressions." SAS 2003.
