Parameterized Object Sensitivity for Points-to Analysis for Java

Presented by: Anand Bahety and Dan Bucatanschi

Presentation Roadmap

• Introduction
• Terms and Definitions
• Application of previous techniques to OOP
• Imprecision analyzed
• Object-sensitive analysis and its advantages
• Parameterized Object Sensitivity

Introduction

• Points-to analysis: a static analysis for Java that determines the set of objects that a reference variable or a reference object field may point to
• Goal
• Advantages

Terms and Definitions

• Side-effect analysis
• Def-use analysis
• Flow-sensitive and flow-insensitive analysis
• Context-sensitive and context-insensitive analysis
• Object sensitivity

Sample points-to graph

Object-Oriented Programming

• Encapsulation
• Inheritance
• Collections (containers)…

Let's try to analyze these features using a flow-insensitive, context-insensitive analysis.

Semantics

• R: set of all reference variables
• O: set of all objects created at object allocation sites
• F: set of all instance fields in the program's classes
• Edge (r, oi) ∈ R × O: reference variable r points to object oi
• Edge (<oi, f>, oj) ∈ (O × F) × O: field f of object oi points to object oj
• Transfer functions
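
The sets and edges above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the presenters' implementation: the class name PointsToGraph, the string encoding of variables and objects, and the four method names are hypothetical, and the four statement forms are the ones commonly handled by Andersen-style analyses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of a flow- and context-insensitive points-to graph (names are hypothetical). */
class PointsToGraph {
    // Edges (r, o) in R x O: variable r may point to object o.
    final Map<String, Set<String>> varPts = new HashMap<>();
    // Edges (<o, f>, o') in (O x F) x O: field f of object o may point to o'.
    final Map<String, Set<String>> fieldPts = new HashMap<>();

    Set<String> pt(String v) {
        return varPts.computeIfAbsent(v, k -> new TreeSet<>());
    }

    Set<String> pt(String o, String f) {
        return fieldPts.computeIfAbsent(o + "." + f, k -> new TreeSet<>());
    }

    // Each transfer function returns true if it added an edge,
    // so a caller can iterate over all statements until nothing changes.

    /** l = new C(), where o names the allocation site. */
    boolean alloc(String l, String o) { return pt(l).add(o); }

    /** l = r */
    boolean assign(String l, String r) { return pt(l).addAll(pt(r)); }

    /** l.f = r */
    boolean store(String l, String f, String r) {
        boolean changed = false;
        for (String o : new ArrayList<>(pt(l))) changed |= pt(o, f).addAll(pt(r));
        return changed;
    }

    /** l = r.f */
    boolean load(String l, String r, String f) {
        boolean changed = false;
        for (String o : new ArrayList<>(pt(r))) changed |= pt(l).addAll(pt(o, f));
        return changed;
    }
}
```

Because the analysis is flow-insensitive, a driver would apply these functions to every statement repeatedly, in any order, until all of them return false.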

Encapsulation

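[Figure: points-to graph for the encapsulation example, with variables x1, x2, y1, y2, this, and x, objects O1 to O4, and field edges labeled f]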

Inheritance

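[Figure: points-to graph for the inheritance example, with variables y, z, b, c, this, A.xa, B.xb, and C.xc, objects O1 to O4, and field edges labeled f]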

Imprecision

• Encapsulation
• Inheritance
– Both are core concepts of OOP
– But they are not captured precisely by the older techniques
– The solution is object sensitivity

Object Sensitivity

• Revised semantics
– O': set of all object names
– R': set of replicas of reference variables
– Relation α(C, m)
– Set of new transfer functions
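
The effect of replicating variables per receiver object can be illustrated on the encapsulation case. The analyzed program in the comment is an assumed reconstruction of that example (the class X, its field f, and the method set are hypothetical), and the two methods hard-code the constraints of that one program rather than implementing a general solver.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class EncapsulationDemo {
    // Analyzed program (assumed reconstruction, names are hypothetical):
    //   class X { Y f; void set(Y x) { this.f = x; } }
    //   x1 = new X(); /* o1 */   x2 = new X(); /* o2 */
    //   y1 = new Y(); /* o3 */   y2 = new Y(); /* o4 */
    //   x1.set(y1);              x2.set(y2);
    // Each call is recorded as {receiver object, argument object}.
    static final String[][] CALLS = { {"o1", "o3"}, {"o2", "o4"} };

    /** Context-insensitive: a single 'this' and a single 'x' shared by all calls. */
    static Map<String, Set<String>> insensitive() {
        Set<String> thisPts = new TreeSet<>();
        Set<String> xPts = new TreeSet<>();
        for (String[] call : CALLS) {
            thisPts.add(call[0]);
            xPts.add(call[1]);
        }
        Map<String, Set<String>> fieldF = new TreeMap<>();
        for (String o : thisPts) {                       // this.f = x
            fieldF.computeIfAbsent(o, k -> new TreeSet<>()).addAll(xPts);
        }
        return fieldF;
    }

    /** Object-sensitive: one replica of 'this' and 'x' per receiver object o. */
    static Map<String, Set<String>> objectSensitive() {
        Map<String, Set<String>> xReplica = new TreeMap<>();   // points-to set of x^o, keyed by o
        for (String[] call : CALLS) {
            xReplica.computeIfAbsent(call[0], k -> new TreeSet<>()).add(call[1]);
        }
        Map<String, Set<String>> fieldF = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : xReplica.entrySet()) {
            // this^o.f = x^o, where this^o points only to o
            fieldF.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
        }
        return fieldF;
    }

    public static void main(String[] args) {
        System.out.println("insensitive:      " + insensitive());
        System.out.println("object-sensitive: " + objectSensitive());
    }
}
```

In this sketch the insensitive version reports that field f of both o1 and o2 may point to o3 and o4, while the object-sensitive version keeps o1.f and o2.f apart.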

Context sensitivity included

• B.this^o3, B.xb^o3, A.xa^o3
• C.this^o4, C.xc^o4, A.xa^o4

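[Figure: points-to graph for the inheritance example with context-sensitive replicas, with variables y, z, b, c, this, A.xa, B.xb, and C.xc, objects O1 to O4, and field edges labeled f; one part is labeled "Older representation"]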

Advantages

• Models OOP features
• Distinguishes between different receiver objects
• Static methods and variables can be handled context-insensitively
• Can be parameterized

Parameterized Object Sensitivity

• Two dimensions
– Degree of precision in the object naming scheme, e.g. o21, o31
– Set R* of reference variables for which multiple points-to sets should be maintained
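
The two dimensions can be sketched as two small functions. Both rest on assumptions that the slides do not spell out: an object name is taken to be its allocation site followed by the name of the receiver in whose context it was allocated, truncated to k sites (so o21 reads as "o2 allocated in the context of o1"), and map(v, c) is taken to return a replica of v only when v is in R*. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ParameterizedNaming {
    /**
     * Dimension 1: precision of object names.
     * Assumed scheme: allocation site first, then the receiver's name, cut to k sites (k >= 1).
     */
    static List<String> objectName(String site, List<String> receiverName, int k) {
        List<String> name = new ArrayList<>();
        name.add(site);
        name.addAll(receiverName);
        return name.subList(0, Math.min(k, name.size()));
    }

    /**
     * Dimension 2: which variables are replicated.
     * Assumed definition: variables in R* get one replica per context, others stay shared.
     */
    static String map(String v, List<String> context, Set<String> rStar) {
        return rStar.contains(v) ? v + "^" + String.join(".", context) : v;
    }
}
```

With k = 1 every object is named by its allocation site alone, and with an empty R* no variable is replicated, so in this sketch both settings together fall back to the context-insensitive analysis.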

Implementation and Performance

• Techniques for implementation and optimization

• Side-effect analysis (MOD)
• Def-Use analysis
• Empirical Results
• Conclusions
• Future Work

Techniques for Implementation

• Typical implementation of flow- and context-insensitive analysis (Andersen’s technique):
– Statement processing routine: processes different kinds of program statements
– Virtual dispatch routine: models the semantics of virtual calls

Techniques for Implementation

• Implementation of parameterized object-sensitive analysis:
– Implement function map(v, c)
– Process each statement once for every possible context
– Augment the virtual dispatch routine to map the return variable and the formal parameters of the invoked method to the corresponding context
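
An augmented dispatch routine of this kind might look as follows. This is a sketch under assumptions: contexts are single receiver-object names (depth 1), replicas are encoded as strings of the form v^c, only one callee is modeled, and the class ObjSensDispatch and its method signatures are hypothetical.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ObjSensDispatch {
    final Map<String, Set<String>> pts = new TreeMap<>();
    final Set<String> rStar;   // variables that get one replica per context

    ObjSensDispatch(Set<String> rStar) { this.rStar = rStar; }

    Set<String> pt(String v) {
        return pts.computeIfAbsent(v, k -> new TreeSet<>());
    }

    /** map(v, c): the replica of v for context c if v is in R*, else v itself. */
    String map(String v, String c) {
        return rStar.contains(v) ? v + "^" + c : v;
    }

    /**
     * Models the call "l = r.m(a)" seen in caller context c,
     * where m has formal parameter p and return variable ret.
     */
    void virtualCall(String l, String r, String a, String c, String p, String ret) {
        // Every receiver object o becomes the context of the invoked method.
        for (String o : new ArrayList<>(pt(map(r, c)))) {
            pt(map("this", o)).add(o);                    // this^o points only to o
            pt(map(p, o)).addAll(pt(map(a, c)));          // actual -> formal in context o
            pt(map(l, c)).addAll(pt(map(ret, o)));        // return variable -> left-hand side
        }
    }
}
```

A full analysis would also process the body of the invoked method once per context and repeat until no points-to set changes; that loop is left out here.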

Techniques for Optimization

• The points-to set of a replica this^o is {o}.
• Suppose statement s contains only nonreplicated variables (i.e. variables that are not in the set R*); then s is analyzed only once, for one “default” context.
• Similarly, if l ∈ R* but r ∉ R*, and l is assigned only at statements of the form:
– l = r
– l = r.f

Techniques for Optimization

• Suppose l ∈ R* and p ∉ R*.
• Consider the assignments: l.f=p, p=l, p=l.f, and p.f=l.

• We can add a nonreplicated variable l’ and a new (context-dependent) statement l’=l.

• Then the points-to set of l’ is the union of the points-to sets of all context copies of l.

• So the statements can be analyzed context-independently.
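
The union behind l’ can be sketched in a few lines. The encoding of a context copy of l as the string l^c, and the class and method names, are assumptions made for this illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReplicaMerge {
    /**
     * Points-to set of the added nonreplicated variable l':
     * the union of the points-to sets of all context copies l^c.
     */
    static Set<String> mergeReplicas(String l, Map<String, Set<String>> pts) {
        Set<String> merged = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : pts.entrySet()) {
            if (e.getKey().startsWith(l + "^")) {
                merged.addAll(e.getValue());
            }
        }
        return merged;
    }
}
```

Once the merged set is available, a statement such as p=l can be handled as p=l’ a single time instead of once per context.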

Side-effect Analysis (MOD)

• Goal:
– For each statement s and context c of the method enclosing s, compute the set Mod(s, c) of objects that could be modified by executing s in context c.
– Also compute MMod(m, c), the set of objects that could be modified by the contextual version of method m for context c.

• The previous optimizations can be applied.
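
For the simplest case, an instance field assignment, Mod(s, c) can be sketched as below. The sketch assumes the usual reading in which "p.f = q" may modify exactly the objects that p may point to in context c; the string encoding of replicas, the class name, and the method names are hypothetical, and calls made by the method are not modeled.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ModSketch {
    /** map(v, c): the replica of v for context c if v is in R*, else v itself. */
    static String map(String v, String c, Set<String> rStar) {
        return rStar.contains(v) ? v + "^" + c : v;
    }

    /** Mod(s, c) for s of the form "p.f = q": the objects p may point to in context c. */
    static Set<String> modOfFieldStore(String p, String c,
                                       Map<String, Set<String>> pts, Set<String> rStar) {
        return new TreeSet<>(pts.getOrDefault(map(p, c, rStar), Collections.emptySet()));
    }

    /**
     * MMod(m, c) restricted to the field stores inside m:
     * storeBases lists the variable p of every statement "p.f = q" in m.
     */
    static Set<String> mmod(List<String> storeBases, String c,
                            Map<String, Set<String>> pts, Set<String> rStar) {
        Set<String> all = new TreeSet<>();
        for (String p : storeBases) {
            all.addAll(modOfFieldStore(p, c, pts, rStar));
        }
        return all;
    }
}
```

When this is in R*, a store through this in context o1 is reported as modifying only o1; with an empty R* the same store is reported as modifying every object that any receiver may be.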

Side-effect Analysis (MOD)

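[Equations not recovered: definitions of Mod(s, c) for instance field assignments, virtual method calls, and static method calls. A slide annotation on one formula reads "Typo: should be c".]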

Def-Use Analysis

• Goal: compute def-use associations between pairs of statements.

• A def-use association for a memory location l is a pair of statements (m, n) such that m assigns a value to l and subsequently n uses that value.
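
The definition can be made concrete for straight-line code without pointers, where the definition that reaches a use is simply the most recent assignment to the same location. The class StraightLineDefUse and its statement encoding are assumptions for this sketch; branches, loops, calls, and indirect accesses are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StraightLineDefUse {
    /** A statement that defines at most one location and uses zero or more locations. */
    static final class Stmt {
        final String def;          // may be null if the statement defines nothing
        final List<String> uses;

        Stmt(String def, String... uses) {
            this.def = def;
            this.uses = Arrays.asList(uses);
        }
    }

    /** Returns associations "m->n:l": statement m assigns l and statement n uses that value. */
    static List<String> defUse(List<Stmt> program) {
        Map<String, Integer> lastDef = new HashMap<>();
        List<String> pairs = new ArrayList<>();
        for (int n = 0; n < program.size(); n++) {
            Stmt s = program.get(n);
            for (String l : s.uses) {
                Integer m = lastDef.get(l);
                if (m != null) pairs.add(m + "->" + n + ":" + l);
            }
            // A later definition of the same location replaces the earlier one.
            if (s.def != null) lastDef.put(s.def, n);
        }
        return pairs;
    }
}
```

For the program a = ...; b = a; a = ...; c = a + b (statements 0 to 3), the sketch reports 0->1:a, 2->3:a, and 1->3:b; statement 0 does not reach statement 3 because statement 2 redefines a.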

Standard Def-Use Analysis

• For procedural languages, there are well-known methods for computing intraprocedural and interprocedural associations.

• We need a pointer analysis to disambiguate indirect definitions and uses.

• Reaching definitions (RD) analysis needed to determine the sets of definitions that may reach a program statement (because of use of pointers), in order to identify def-use pairs.

Object Sensitivity in Def-Use Analysis

• Points-to analysis must be used in order to determine which objects may be accessed by expressions of the form p.f.

• For every oi ∈ Pt(p), memory location oi.f is added to the DEF or USE set for the corresponding statement.

• MDEF(m) contains definitions created in method m and in all direct and indirect callees of m.
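
The step from Pt(p) to DEF and USE sets can be sketched as follows. The representation of a memory location oi.f as a string and the names FieldLocations, locations, and mayAssociate are assumptions for this illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

class FieldLocations {
    /** Locations touched by an access p.f: one location o.f per object o in Pt(p). */
    static Set<String> locations(Set<String> ptOfP, String f) {
        Set<String> locs = new TreeSet<>();
        for (String o : ptOfP) {
            locs.add(o + "." + f);
        }
        return locs;
    }

    /** A write and a read can form a def-use association only if they share a location. */
    static boolean mayAssociate(Set<String> defSet, Set<String> useSet) {
        return !Collections.disjoint(defSet, useSet);
    }
}
```

A more precise points-to analysis yields smaller Pt(p) sets, hence smaller DEF and USE sets and fewer spurious associations between unrelated statements.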

Standard Def-Use Analysis

[Equations not recovered. The slide labels, in order: the DEF set and the direct and indirect DEF set; the reaching-definitions set, broken down by type of node (statement); and the DEF-USE pairs.]

Implementations

• Parameterized object-sensitive points-to analysis (context depth = 1):
– ObjSens1: keeps context-sensitive information for the implicit parameter this and the formal parameters of instance methods and constructors.
– ObjSens2: the same as ObjSens1, but it also keeps track of return variables.

Implementations

• Context-sensitive analysis based on the call-string approach to context sensitivity, with call-string length k = 1 (CallSite).
• Distinguishes context per call site.
• To allow for comparison, context replication is performed for this, formal parameters, and return variables in instance methods and constructors.
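
The difference between the two context choices can be illustrated with a toy selector. The encoding of a call as a {call site, receiver object} pair and the class ContextChoice are assumptions for this sketch; it only counts the contexts each approach would distinguish and performs no analysis.

```java
import java.util.Set;
import java.util.TreeSet;

class ContextChoice {
    /**
     * Each call is {callSite, receiverObject}.
     * CallSite-style (k = 1) uses the call site as context;
     * object-sensitive (depth 1) uses the receiver object.
     */
    static Set<String> contexts(String[][] calls, boolean objectSensitive) {
        Set<String> result = new TreeSet<>();
        for (String[] call : calls) {
            result.add(objectSensitive ? call[1] : call[0]);
        }
        return result;
    }
}
```

If one call site is reached with receivers o1 and o2, the call-string approach merges both invocations into a single context, while the object-sensitive approach keeps them apart; the opposite happens when several call sites share one receiver.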

Implementations

• The 3 context-sensitive analyses were built on top of an existing implementation of Andersen’s context-insensitive points-to analysis (And).

• The analyses use the optimization techniques discussed earlier.

• The Soot framework was used to process Java bytecode and to build a typed intermediate representation.

Characteristics of Programs

Analysis Cost

Discussion

• Time and memory costs are comparable to those of Andersen’s analysis.

• Amount of work is similar: And has to consider all possible objects for a statement s. Even though context-sensitive analyses do more work to keep track of different contexts, they eventually end up doing less work per statement s.

• For the majority of programs, adding the return values to R* does not increase cost.

Discussion

• Call string context-sensitive analysis (CallSite) achieves practical cost.

• CallSite has poor running time for larger programs, probably because it is less precise than ObjSens2.

MOD Analysis Implementation

• Measurements of ObjSens2, CallSite, and And.
• Percentages are with respect to the number of statements that modify at least one object.
• Each column shows the percentage of the total number of statements that modify the respective number of objects.

• More precise analyses produce a smaller percentage number.

MOD Analysis Precision

Conclusions

• Presented a framework for parameterized object-sensitive points-to analysis, and side-effect and def-use analyses based on it.

• Object-sensitive analysis achieves significantly better precision than context-insensitive analysis, while remaining efficient and practical.

Future Work

• Investigate other instantiations of the framework: more precise naming of sub-objects of composite objects.

• Investigate applications of points-to, side-effect, and def-use analyses in the context of software productivity tools.
