Precise Identification of Side-Effect-Free Methods in Java
Atanas (Nasko) Rountev, Ohio State University
ICSM'04, Sept 12



Side-Effect-Free Methods
- Side effects of a method: changes to the values of memory locations
  - In C++: global variables, local variables on the stack, static fields, heap objects
  - In Java: static fields, heap objects
- Side-effect-free method: does not make changes that are observable by its callers
- This work: static analysis of Java code for identifying side-effect-free methods


Motivation
- Program comprehension
- Reverse engineering of UML class diagrams: isQuery attributes
- Code transformations
- Specifications (e.g. JML)
- Assertions
- Simplification for other analyses


Talk Outline
- Static analyses for finding methods that are free of side effects
  - Works on a partial program
  - Parameterized by class analysis
  - Tracks transitive modifications of static fields and observable heap objects
- Experiments
  - How many methods are reported by the analysis as being free of side effects?
  - How many are really free of side effects?


Problem Definition
- Analyzed component: set of Java classes
- Some methods are designated as boundary methods
- Could an unknown caller of a boundary method observe side effects?
  - Static fields
  - Fields of observable objects


Example

class WordIterator { …
  private CharIterator text;
  // Writes a field of the receiver object.
  public void setText(CharIterator c) { text = c; }
  public int next() { return nextPos(); }
  // Advances the CharIterator stored in the text field.
  private int nextPos() { … text.nextChar(); … }
  // Advances only a CharIterator that is created locally.
  public char m() {
    CharIterator t = new CharIterator("abc");
    return t.nextChar();
  } …
}

class CharIterator { …
  private int curr;
  // Writes the curr field of the receiver object.
  public char nextChar() { … curr++; … } …
}


Class Analysis
- Which objects does a reference variable point to?
  - Similar question for reference object fields
- Set of object names, e.g.:
  - One name per class
  - One name per “new” expression
- Points-to pairs
  - (x, o): variable x points to object name o
  - (o1.f, o2): field f of o1 points to o2
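To make the two kinds of points-to pairs concrete, here is a minimal sketch of a flow-insensitive, context-insensitive points-to computation. It is an illustration only and not the class analysis used in this work: the class name `PointsToSketch`, the four statement kinds, and the string encoding of variables and object names are all assumptions made for the example. It assumes one object name per “new” expression.

```java
import java.util.*;

// Hypothetical sketch: a tiny flow-insensitive, context-insensitive
// points-to analysis. All names here are invented for illustration.
class PointsToSketch {
    enum Kind { NEW, COPY, STORE, LOAD }

    static final class Stmt {
        final Kind kind;
        final String lhs, rhs, field;
        Stmt(Kind kind, String lhs, String rhs, String field) {
            this.kind = kind; this.lhs = lhs; this.rhs = rhs; this.field = field;
        }
    }

    // Pairs (x, o): variable x may point to object name o.
    final Map<String, Set<String>> varPts = new HashMap<>();
    // Pairs (o1.f, o2): field f of o1 may point to o2; the key is "o1.f".
    final Map<String, Set<String>> fieldPts = new HashMap<>();

    private static Set<String> get(Map<String, Set<String>> m, String k) {
        return m.computeIfAbsent(k, x -> new TreeSet<>());
    }

    // Re-applies every statement until no new pair appears. Statement order
    // does not matter, which is what flow-insensitive means here.
    void solve(List<Stmt> stmts) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Stmt s : stmts) {
                switch (s.kind) {
                    case NEW:   // lhs = new ...; rhs holds the object name
                        changed |= get(varPts, s.lhs).add(s.rhs);
                        break;
                    case COPY:  // lhs = rhs
                        changed |= get(varPts, s.lhs)
                                .addAll(new ArrayList<>(get(varPts, s.rhs)));
                        break;
                    case STORE: // lhs.field = rhs
                        for (String o : new ArrayList<>(get(varPts, s.lhs)))
                            changed |= get(fieldPts, o + "." + s.field)
                                    .addAll(new ArrayList<>(get(varPts, s.rhs)));
                        break;
                    case LOAD:  // lhs = rhs.field
                        for (String o : new ArrayList<>(get(varPts, s.rhs)))
                            changed |= get(varPts, s.lhs)
                                    .addAll(new ArrayList<>(get(fieldPts, o + "." + s.field)));
                        break;
                }
            }
        }
    }

    // Encodes a caller that creates both iterators and calls wi.setText(ci).
    static PointsToSketch example() {
        List<Stmt> stmts = Arrays.asList(
            new Stmt(Kind.NEW, "wi", "obj1", null),
            new Stmt(Kind.NEW, "ci", "obj2", null),
            new Stmt(Kind.COPY, "setText.this", "wi", null),   // receiver passing
            new Stmt(Kind.COPY, "setText.c", "ci", null),      // argument passing
            new Stmt(Kind.STORE, "setText.this", "setText.c", "text"));
        PointsToSketch p = new PointsToSketch();
        p.solve(stmts);
        return p;
    }
}
```

In this encoding, `example()` yields the variable pairs (wi, obj1) and (ci, obj2) and the field pair (obj1.text, obj2).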


Class Analysis
- Flow sensitivity: track statement order
- Context sensitivity: track calling context
  - Set of context abstractions
  - (x^c, o): variable x points to object name o when the calling context is c
- We consider context-sensitive, flow-insensitive class analyses
  - Most class analyses are in this category
- Typically, whole-program analyses


Artificial Main
- Goal: simulate everything that unknown callers can do with respect to object references
- Run a whole-program class analysis on main and the component

[Figure: artificial main, containing artificial variables and statements]
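As a rough illustration of the idea, the sketch below shows what an artificial main could look like for the WordIterator example. Everything in it is an assumption: the method bodies of the two stand-in classes are invented so that the code compiles and runs (the original bodies are elided), and the choice of one placeholder variable per component type is only one plausible way to set up the artificial variables. The actual construction used by the analysis may differ.

```java
// Hypothetical stand-ins for the component classes. The method bodies are
// invented for illustration; they are not the elided bodies of the example.
class CharIterator {
    private final String s;
    private int curr;
    CharIterator(String s) { this.s = s; }
    // Returns the current character and advances, wrapping around at the end.
    public char nextChar() { return s.charAt(curr++ % s.length()); }
}

class WordIterator {
    private CharIterator text;
    public void setText(CharIterator c) { text = c; }
    public int next() { return nextPos(); }
    private int nextPos() { return text.nextChar(); }
    public char m() {
        CharIterator t = new CharIterator("abc");
        return t.nextChar();
    }
}

// Sketch of an artificial main. It exists only to be fed to a whole-program
// class analysis together with the component; it is not application code.
class ArtificialMain {
    public static void main(String[] args) {
        // Artificial variables: one placeholder per component type (assumed).
        WordIterator wi = new WordIterator();
        CharIterator ci = new CharIterator("abc");
        // Artificial statements: call each boundary method on the placeholders,
        // passing placeholders as arguments.
        wi.setText(ci);
        int pos = wi.next();
        char fromLocal = wi.m();
        char fromShared = ci.nextChar();
    }
}
```

With these stand-in bodies, a caller can observe that `next()` advances the iterator it passed to `setText`, while repeated calls to `m()` leave everything the caller can reach unchanged.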


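Some Points-to Pairs

[Figure: points-to graph for the example. Recoverable labels: artificial variables wi and ci; objects obj1 [WordIter], obj2 [CharIter], and obj3 [CharIter], where obj3 comes from t = new CharIter followed by t.nextChar() in m; context-annotated variables setText.this^obj1, setText.c^obj1, m.t^obj1, and nextChar.this^obj3; field edge text]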


Direct Side Effects
- Observable objects: reachable from the artificial variables in the points-to graph
  - In the example: obj1 and obj2
- Direct side effects of a method
  - x.f = … , where x refers to an observable object
  - new C, when representing an observable object
  - C.f = … , where f is a static field
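The notion of observable objects amounts to graph reachability, which can be sketched as follows. This is a simplified illustration under assumptions: the class name `ObservableSketch` and the map-based graph encoding are hypothetical, field names are dropped from the edges, and the input graph is written by hand to mirror the example rather than produced by a class analysis.

```java
import java.util.*;

// Hypothetical sketch: observable objects are those reachable from the
// artificial variables by following points-to edges.
class ObservableSketch {
    static Set<String> observable(Collection<String> artificialVars,
                                  Map<String, Set<String>> varPts,
                                  Map<String, Set<String>> fieldEdges) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        // Start from every object an artificial variable may point to.
        for (String v : artificialVars)
            for (String o : varPts.getOrDefault(v, Collections.emptySet()))
                if (seen.add(o)) work.add(o);
        // Follow field edges until no new object is found.
        while (!work.isEmpty()) {
            String o = work.remove();
            for (String succ : fieldEdges.getOrDefault(o, Collections.emptySet()))
                if (seen.add(succ)) work.add(succ);
        }
        return seen;
    }

    // Hand-written graph mirroring the example: wi and ci are artificial
    // variables, m.t is a local of the component method m.
    static Set<String> example() {
        Map<String, Set<String>> varPts = new HashMap<>();
        varPts.put("wi", new HashSet<>(Arrays.asList("obj1")));
        varPts.put("ci", new HashSet<>(Arrays.asList("obj2")));
        varPts.put("m.t", new HashSet<>(Arrays.asList("obj3")));
        Map<String, Set<String>> fieldEdges = new HashMap<>();
        fieldEdges.put("obj1", new HashSet<>(Arrays.asList("obj2"))); // obj1.text
        return observable(Arrays.asList("wi", "ci"), varPts, fieldEdges);
    }
}
```

Here obj3 is left out of the result because only the local variable t of m refers to it, so a write to its fields is not a direct side effect in the sense above.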


Example
- setText: this.text = c
- nextChar: this.curr++
- Direct(setText,obj1) = { obj1.text }
- Direct(nextChar,obj2) = { obj2.curr }

[Figure: points-to edges setText.this^obj1 -> obj1, nextChar.this^obj2 -> obj2, nextChar.this^obj3 -> obj3]


Transitive Side Effects
- Transitive side effects of a method
  - Under context c1, calls another method with some calling context c2
  - For c2, the callee has side effects

private int nextPos() { … text.nextChar(); … }

[Figure: points-to edges nextPos.this^obj1 -> obj1 and obj1.text -> obj2]

- Direct(nextChar,obj2) = { obj2.curr }
- Trans(nextPos,obj1) = { obj2.curr }
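The propagation of side effects from callees to callers can be sketched as a fixed-point computation over (method, context) pairs. The sketch below is an illustration under assumptions, not the implementation from this work: the class name `ModSketch`, the `method@context` string encoding, and the hand-written call edges are hypothetical. It computes the union of direct and transitive effects for each pair.

```java
import java.util.*;

// Hypothetical sketch: propagate side effects along context-sensitive
// call edges until nothing changes.
class ModSketch {
    final Map<String, Set<String>> direct = new HashMap<>();
    final Map<String, Set<String>> callees = new HashMap<>();

    // A node is a method analyzed under one calling context.
    static String node(String method, String context) {
        return method + "@" + context;
    }

    void addDirect(String node, String location) {
        direct.computeIfAbsent(node, k -> new TreeSet<>()).add(location);
    }

    void addCall(String caller, String callee) {
        callees.computeIfAbsent(caller, k -> new TreeSet<>()).add(callee);
    }

    // Result for a node: its direct effects plus the effects of everything
    // it may call, directly or indirectly.
    Map<String, Set<String>> solve() {
        Map<String, Set<String>> mod = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : direct.entrySet())
            mod.put(e.getKey(), new TreeSet<>(e.getValue()));
        for (Map.Entry<String, Set<String>> e : callees.entrySet()) {
            mod.putIfAbsent(e.getKey(), new TreeSet<>());
            for (String callee : e.getValue())
                mod.putIfAbsent(callee, new TreeSet<>());
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, Set<String>> e : callees.entrySet())
                for (String callee : e.getValue())
                    changed |= mod.get(e.getKey())
                            .addAll(new ArrayList<>(mod.get(callee)));
        }
        return mod;
    }

    // Hand-written input mirroring the example. nextChar under context obj3
    // has no direct effect because obj3 is not observable.
    static Map<String, Set<String>> example() {
        ModSketch s = new ModSketch();
        s.addDirect(node("setText", "obj1"), "obj1.text");
        s.addDirect(node("nextChar", "obj2"), "obj2.curr");
        s.addCall(node("next", "obj1"), node("nextPos", "obj1"));
        s.addCall(node("nextPos", "obj1"), node("nextChar", "obj2"));
        s.addCall(node("m", "obj1"), node("nextChar", "obj3"));
        return s.solve();
    }
}
```

For this input, the sets for nextPos and next under context obj1 both become { obj2.curr }, while the set for m under obj1 stays empty.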


Boundary Methods

class WordIterator { …
  public void setText(CharIterator c) { … }
  public int next() { … }
  public char m() { … } // free of side effects
}

class CharIterator { …
  public char nextChar() { … } …
}

- Mod(setText,obj1) = { obj1.text }
- Mod(next,obj1) = { obj2.curr }
- Mod(nextChar,obj2) = { obj2.curr }
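Given Mod sets like the ones above, the final classification step can be sketched as follows. The rule used here, that a boundary method is reported as side-effect-free only when its Mod set is empty under every context, is an assumption made for the illustration, as are the class name `BoundarySketch` and the nested-map encoding. The empty sets for m and for nextChar under obj3 are also assumed; they follow from obj3 not being observable.

```java
import java.util.*;

// Hypothetical sketch: report the boundary methods whose Mod set is empty
// under every calling context.
class BoundarySketch {
    // modByContext maps a method name to (context -> Mod set).
    static Set<String> sideEffectFree(Collection<String> boundary,
                                      Map<String, Map<String, Set<String>>> modByContext) {
        Set<String> free = new TreeSet<>();
        for (String m : boundary) {
            boolean clean = true;
            // A method with no recorded context is treated as clean (vacuously).
            for (Set<String> mod : modByContext.getOrDefault(m, Collections.emptyMap()).values())
                if (!mod.isEmpty()) clean = false;
            if (clean) free.add(m);
        }
        return free;
    }

    private static void put(Map<String, Map<String, Set<String>>> mods,
                            String method, String context, String... locations) {
        mods.computeIfAbsent(method, k -> new TreeMap<>())
            .put(context, new TreeSet<>(Arrays.asList(locations)));
    }

    // Mod sets from the example, plus the assumed empty sets.
    static Set<String> example() {
        Map<String, Map<String, Set<String>>> mods = new TreeMap<>();
        put(mods, "setText", "obj1", "obj1.text");
        put(mods, "next", "obj1", "obj2.curr");
        put(mods, "nextChar", "obj2", "obj2.curr");
        put(mods, "nextChar", "obj3");   // assumed empty: obj3 is not observable
        put(mods, "m", "obj1");          // assumed empty
        return sideEffectFree(Arrays.asList("setText", "next", "m", "nextChar"), mods);
    }
}
```

Under these assumptions only m is reported. nextChar is not, because under context obj2 it modifies a field of an observable object.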


Experimental Study
- Small-scale study of:
  - two side-effect analyses based on two class analyses
  - feasibility of the analysis solution
- Class analyses
  - Context-sensitive, flow-insensitive analysis similar to the one from the example (CSA)
  - Rapid Type Analysis (RTA): low end of the theoretical precision spectrum


Experimental Study
- Seven components from the standard Java libraries
  - Classes that implement some functionality, plus all their transitive server classes
  - Boundary methods: provide access to the functionality
- Implementation based on the Soot framework (McGill) and the BANE constraint solver (UC Berkeley)


Results from CSA

[Figure: bar chart. x-axis: components gzip, zip, checked, collator, date, number, boundary; y-axis: number of methods (0 to 40); series: boundary methods, side-effect-free methods]


Summary of Results
- CSA-based analysis: a quarter of the boundary methods are side-effect-free
- Perfect precision: we proved that all other methods did have real side effects
  - Manual examination of the code
  - The analysis did not miss any side-effect-free methods
- Surprising result: RTA-based analysis achieves almost the same precision


Conclusions
- It is possible to adapt many whole-program class analyses to find side-effect-free methods in partial programs
- Preliminary experimental results
  - Significant number of side-effect-free methods
  - Analyses achieve very good precision
- Future work: more comprehensive study