Upload
lucinda-atkinson
View
215
Download
0
Tags:
Embed Size (px)
Citation preview
Strauss:A Specification
Miner
Glenn AmmonsDepartment of Computer SciencesUniversity of Wisconsin-Madison
2
How should programs behave?
• Strauss mines temporal specifications• for example, always call free after malloc
• Why?• To debug
• To verify
• To test
• To modify
• To understand
3
How should programs behave?
• Strauss mines temporal specifications• for example, always call free after malloc
• Why?• To debug
• To verify
• To test
• To modify
• To understand
• Specs constrain programs• like structured programming, types, etc.
4
Strauss output
Spec. says what programs should do:•What order of calls? •What values in calls?
X := socket()
Y := accept(sock = X)
close(sock = Y)
read(sock = Y)
write(sock = Y)
start
end
For all calls Y := accept(sock = X):
5
Strauss inputs
...socket(domain = 2, type = 1, proto = 0, return = 7))...
Traces
...socket(domain = 2, type = 1, proto = 0, return = 7))...
...7 := socket(domain = 2, type = 1, proto = 0)...
StraussStrauss
LibraryClient programs
Interface (e.g., sockets)
6
Strauss inputs
...socket(domain = 2, type = 1, proto = 0, return = 7))...
Traces
...socket(domain = 2, type = 1, proto = 0, return = 7))...
...7 := socket(domain = 2, type = 1, proto = 0)...
StraussStrauss
LibraryClient programs
Interface (e.g., sockets)
Scenario criterion
accept
State transition model (STM)
def socket()close(def use sock)def accept(use sock)read(use sock)write(use sock)
7
My contributions• A dynamic approach to mining
• analyze communication, not source code
• A tool for mining specifications• practical: scalable and mostly automatic• temporal specs may refer to multiple data
values• requires little info about code
• New method for debugging specifications• Birds of a feather flock together (concept
analysis)
• Found specifications and bugs
8
Summary of results
• Mined 17 specs. for X11• example: XInternAtom should only be
called during initialization, not in the event loop.
• Found 199 bugs• In widely distributed programs
• wish (Tk), xpdf, xpilot, ...
• Included races, leaks, logic errors, and performance bugs
• By verifying dynamically (on traces)
9
Related work: obtaining specs.
• By hand• From programmers
• Types• Specification languages and annotations
• From analysts, testers• Abstraction tools (Bandera [Dwyer et al.])
• With specification mining
10
Comparing spec. miners• Strauss
• Specs: temporal, multiple objects, arbitrary NFAs
• Mining: from traces, local, no code necessary
• Goal: specs. for interfaces
• Other work• Concurrent work
• FSMs for Java objects [Whaley, Martin, and Lam]• Simple templates [Metacompilation]• Loop invariants [ESC/Java]
• Earlier work• Arithmetic properties [Daikon]• Cliches [Programmer’s Apprentice]• Communicating processes [Verisoft]
11
Overview of Strauss
Strauss
Programs and interface
Instrumentedprograms
Training traces
STM and scenariocriterion
Unvalidatedspecification
Preprocess
Mine
Debug
Developer’sjudgement
Specification
Scenarios
Tracer
Run ontest inputs
Scenarioextractor
Learner
Cable
12
Trace + STM = Semantic trace
1 7 := socket(domain = 2, type = 1, proto = 0)2 0x100 := malloc(size = 23)3 8 := accept(sock = 7, addr = 0x40, addr_len = 0x50)4 23 := write(sock = 8, buf = 0x100, len = 23)5 0 := close(sock = 8)...
State transition model (STM)
def socket()close(def use sock)def accept(use sock)read(use sock)write(use sock)
Semantic trace
Vars: v7, v8
1 v7 := socket()2 skip3 v8 := accept(sock = v7)4 write(sock = v8)5 v8 := close(sock = v8)...
13
Trace + STM = Semantic trace
State transition model (STM)
def socket()close(def use sock)def accept(use sock)read(use sock)write(use sock)
Semantic trace
Vars: v7, v8
1 v7 := socket()2 skip3 v8 := accept(sock = v7)4 write(sock = v8)5 v8 := close(sock = v8)...
1 7 := socket(domain = 2,domain = 2, type = 1,type = 1, proto = 0proto = 0)2 0x100 := malloc(size = 23)0x100 := malloc(size = 23)3 8 := accept(sock = 7,, addr = 0x40,addr = 0x40, addr_len = 0x50addr_len = 0x50)4 23 := write(sock = 8,, buf = 0x100,buf = 0x100, len = 23len = 23)5 0 :=0 := close(sock = 8)...
14
Trace + STM = Semantic trace
State transition model (STM)
def socket()close(def use sock)def accept(use sock)read(use sock)write(use sock)
Semantic trace
Vars: v7, v8
1 v7 := socket()2 skip3 v8 := accept(sock = v7)4 write(sock = v8)5 v8 := close(sock = v8)...
1 7 := socket(domain = 2,domain = 2, type = 1,type = 1, proto = 0proto = 0)2 0x100 := malloc(size = 23)0x100 := malloc(size = 23)3 8 := accept(sock = 7,, addr = 0x40,addr = 0x40, addr_len = 0x50addr_len = 0x50)4 23 := write(sock = 8,, buf = 0x100,buf = 0x100, len = 23len = 23)5 0 :=0 := close(sock = 8)...
15
A different STMState transition model (STM)
def malloc()free(def p)read(use buf)write(use buf)
Semantic trace
Var: v0x100
1 skip2 v0x100 := malloc()3 skip4 write(buf = v0x100)5 skip...
1 7 := socket(domain = 2, type = 1, proto = 0)2 0x100 := malloc(size = 23)3 8 := accept(sock = 7, addr = 0x40, addr_len = 0x50)4 23 := write(sock = 8, buf = 0x100, len = 23)5 0 := close(sock = 8)...
16
A different STMState transition model (STM)
def malloc()free(def p)read(use buf)write(use buf)
Semantic trace
Var: v0x100
1 skip2 v0x100 := malloc()3 skip4 write(buf = v0x100)5 skip...
1 7 := socket(domain = 2,7 := socket(domain = 2, type = 1,type = 1, proto = 0)proto = 0)2 0x100 := malloc(size = 23size = 23)3 8 := accept(sock = 7,8 := accept(sock = 7, addr = 0x40,addr = 0x40, addr_len = 0x50addr_len = 0x50)4 23 :=23 := write(sock = 8,sock = 8, buf = 0x100,, len = 23len = 23)5 0 := close(sock = 8)0 := close(sock = 8)...
17
From dependences to scenarios
Pick a seedSlice forwardsSlice backwardsChop each pair
18
From dependences to scenarios
Pick a seedSlice forwardsSlice backwardsChop each pair
19
From dependences to scenarios
Pick a seedSlice forwardsSlice backwardsChop each pair
20
From dependences to scenarios
Pick a seedSlice forwardsSlice backwardsChop each pair
21
From dependences to scenarios
Pick a seedSlice forwardsSlice backwardsChop each pair
22
Extracting scenarios: example
Vars: v7, v8
1 v7 := socket
2 v8 := accept(sock = v7)
3 write(sock = v8)
4 v8 := close(sock = v8)
5 v7 := close(sock = v7)
23
Extracting scenarios: example
Pick a seedVars: v7, v8
1 v7 := socket
2 v8 := accept(sock = v7)
3 write(sock = v8)
4 v8 := close(sock = v8)
5 v7 := close(sock = v7)
24
Extracting scenarios: example
Pick a seedSlice forwards
Vars: v7, v8
1 v7 := socket
2 v8 := accept(sock = v7)
3 write(sock = v8)
4 v8 := close(sock = v8)
5 v7 := close(sock = v7)
25
Extracting scenarios: example
Pick a seedSlice forwardsSlice backwards
Vars: v7, v8
1 v7 := socket
2 v8 := accept(sock = v7)
3 write(sock = v8)
4 v8 := close(sock = v8)
5 v7 := close(sock = v7)
26
Extracting scenarios: example
Pick a seedSlice forwardsSlice backwardsChop each pair
Vars: v7, v8
1 v7 := socket
2 v8 := accept(sock = v7)
3 write(sock = v8)
4 v8 := close(sock = v8)
5 v7 := close(sock = v7)
27
Overview of Strauss
Strauss
Programs and interface
Instrumentedprograms
Training traces
STM and scenariocriterion
Unvalidatedspecification
Preprocess
Mine
Debug
Developer’sjudgement
Specification
Scenarios
Tracer
Run ontest inputs
Scenarioextractor
Learner
Cable
28
Overview of learning
standardizeScenarios(dep. graphs)
ACEGB
ACEGB
ACEGB
Strings
NFA Learn NFA
Scenario standardize
NFA
Scenario acceptor (SA)
String
Yes or no
Specification = (SA, STM, scenario criterion)
29
Standardizing scenarios
v7 := socket()
v8 := accept(sock = v7)
write(sock = v8)
read(sock = v8)
v8 := close(sock = v8)
v7 := close(sock = v7)
v2 := socket()
v3 := accept(sock = v2)
v2 := close(sock = v2)
read(sock = v3)
write(sock = v3)
v3 := close(sock = v3)
30
Standardizing scenarios
v7 := socket()
v8 := accept(sock = v7)
v7 := close(sock = v7)
read(sock = v8)
write(sock = v8)
v8 := close(sock = v8)
v2 := socket()
v3 := accept(sock = v2)
v2 := close(sock = v2)
read(sock = v3)
write(sock = v3)
v3 := close(sock = v3)
31
Standardizing scenarios
X := socket()
Y := accept(sock = X)
X := close(sock = X)
read(sock = Y)
write(sock = Y)
Y := close(sock = Y)
X := socket()
Y := accept(sock = X)
X:= close(sock = X)
read(sock = Y)
write(sock = Y)
Y := close(sock = Y)
32
Standard scenario = string
X := socket()
Y := accept(sock = X)
X := close(sock = X)
read(sock = Y)
write(sock = Y)
Y := close(sock = Y)
X := socket()
Y := accept(sock = X)
X:= close(sock = X)
read(sock = Y)
write(sock = Y)
Y := close(sock = Y)
A
B
C
D
E
F
33
Overview of learning
standardizeScenarios(dep. graphs)
ACEGB
ACEGB
ACEGB
Strings
NFA Learn NFA
Scenario standardize
NFA
Scenario acceptor (SA)
String
Yes or no
Specification = SA + STM + scenario criterion
34
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
35
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
socket
accept
read write
write
write
read
read
write
write
write
100
5
5
95
70
7025
25
25
5
end70
525
36
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
socket
accept
read write
write
write
read
read
write
write
write
100
5
5
95
70
7025
25
25
5
end70
525
37
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
socket
accept
read write
writeread
read
write
write
100
5
5
95
70
7025
25
5
end100
25
38
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
socket
accept
readwrite
read
readwrite
100
595
7025
25
end100
25
5
75
39
Raman’s PFSA algorithm1. Build a weighted retrieval tree2. Merge similar states
socket
accept
readwrite
readwrite
100
595
70
25
25
end100
25
5
75
40
Overview of Strauss
Strauss
Programs and interface
Instrumentedprograms
Training traces
STM and scenariocriterion
Unvalidatedspecification
Preprocess
Mine
Debug
Developer’sjudgement
Specification
Scenarios
Tracer
Run ontest inputs
Scenarioextractor
Learner
Cable
41
Overview of Debugging
Cable
Debugged specification
SpecificationACEGB
ACEGB
ACEGB
Classifiedscenarios
Learner
Scenarios
42
Classifying scenarios
b0b2
b1
G0G0
G0
G0G0
g0
g1
43
Classifying scenarios
b0b2
b1
G0G0
G0
G0G0
b0b2
b1 G0G0
G0
G0G0
g0
g1
g0
g1
44
Classifying scenarios
b0b2
b1
g0
g1
G0G0
G0
G0G0
b0b2
b1 g0
g1
G0G0
G0
G0G0
g0
g1
G0G0
G0
G0G0
45
Concept analysis: mammals
4-legged
hairy smart marine thumbed
cats yes yes
dogs yes yes
dolphins
yes yes
gibbons yes yes yes
humans yes yes
whales yes yes
46
Concept analysis: mammals
4-legged
hairy smart marine thumbed
cats yes yes
dogs yes yes
dolphins
yes yes
gibbons yes yes yes
humans yes yes
whales yes yes
47
Cable method: good pets
Good pets:
Bad pets:
48
Cable method: good pets
catsgibbonsdogsdolphinshumanswhales
{}
Good pets:
Bad pets:
49
Cable method: good pets
hairy
catsgibbonsdogs
Good pets:
Bad pets:
50
Cable method: good pets
4-leggedhairy
cats dogs
Good pets:
Bad pets:
51
Cable method: good pets
4-leggedhairy
cats dogs
Good pets:cats, dogs
Bad pets:
52
Cable method: good pets
Good pets:cats, dogs
Bad pets:
gibbons dolphinshumanswhales
smart
53
Cable method: good pets
Good pets:cats, dogs
Bad pets:gibbons dolphins humanswhales
gibbons dolphinshumanswhales
smart
54
Concept analysis: scenarios
Takestransition 0
Takestransition 1
Takes transition 2
Takes transition 3
Takes transition 4
Scen. 0 yes yes
Scen. 1 yes yes
Scen. 2 yes yes
Scen. 3 yes yes yes
Scen. 4 yes yes
Scen. 5 yes yes
55
Cable method: good scenarios
Good scenarios:
Bad scenarios:
56
Cable method: good scenarios
scen. 0scen. 1scen. 2scen. 3scen. 4scen. 5
{}
Good scenarios:
Bad scenarios:
57
Cable method: good scenarios
scen. 0scen. 1scen. 2
Takes transition 1
Good scenarios:
Bad scenarios:
58
Cable method: good scenarios
scen. 0 scen. 2
Takes transition 0Takes transition 1
Good scenarios:
Bad scenarios:
59
Cable method: good scenarios
scen. 0 scen. 2
Takes transition 0Takes transition 1
Good scenarios:0, 2
Bad scenarios:
60
Cable method: good scenarios
Takes transition 2Takes transition 3Takes transition 4
Good scenarios:0, 2
Bad scenarios:
scen. 1 scen. 3scen. 4scen. 5
61
Cable method: good scenarios
Takes transition 2Takes transition 3Takes transition 4
Good scenarios:0, 2
Bad scenarios:1, 3, 4, 5
scen. 1 scen. 3scen. 4scen. 5
62
Experimental results
• Where to find bugs?• in programs (static verification)?
• or in traces (dynamic verification)?
• Specifications and bugs
63
How Strauss verifies a spec
extract scenarios
Scenario acceptor
...socket(domain = 2, type = 1, proto = 0, return = 7))...
...socket(domain = 2, type = 1, proto = 0, return = 7))...
...socket(domain = 2, type = 1, proto = 0, return = 7))...
socket(...)
accept(...)
read(...) write(...)
close(...)
socket(...)
accept(...)
read(...) write(...)
close(...)
socket(...)
accept(...)
read(...) write(...)
close(...)
Traces Scenarios(dep. graphs)
Positive specifications:Accept trace iff every seed hasa scenario that the SA accepts.
Negative specifications:Accept trace iff every seed hasno scenario that the SA accepts.
Yes or no
64
Experimental results
•90 traces•72 programs
•7 “book” rules•10 “graph” rules
Rule STM States Edges Bugs
PrsAccelTbl 9 3 10 1PrsTransTbl 5 3 4 0Quarks 4 63 93 0RegionsAlloc 4 8 14 16RegionsBig 33 370 604 12RmvTimeOut 3 3 3 0XFreeGC 3 10 13 44XGContextFromGC 4 22 36 44XGetSelOwner 3 5 6 9XInternAtom 17 3 11 42XPutImage 9 3 7 2+2XSaveContext 8 989 1519 33XSetFont 4 20 37 44+1XSetSelOwner 16 3 8 7XtFree 4 95 171 45XtOwnSel 16 4 23 1XtRealizeProc 4 22 36 0
65
1. X11 selection protocol specs.:• Found 17 bugs total.• English spec is buggy!
2. Performance: XInternAtom should not be called in the event loop
• 42 buggy programs (out of 72); degree varied
3. XPutImage spec. (more on this later)• 2 different bugs! But, also 2 false positives.
4. Other bugs:• 138 total (resource errors)
Bugs found
66
Cable was usefulCost = concepts visited+ concepts labeled
SpecificationBase Optimal Expert
PrsAccelTbl 18 4 8PrsTransTbl 6 2 2Quarks 16 2 2RegionsAlloc 20 8 16RegionsBig 540 X 149RmvTimeOut 4 2 2XFreeGC 20 6 12XGContextFromGC 50 14 24XGetSelOwner 4 4 5XInternAtom 20 4 5XPutImage 12 6 10XSaveContext 184 X 42XSetFont 56 X 53XSetSelOwner 12 4 5XtFree 224 X 28XtOwnSel 14 4 5XtRealizeProc 76 12 13
Cost of labeling
67
XSetSelectionOwner
English: Pass XSetSelectionOwner the timestampfrom the last event.
For all calls XSetSelectionOwner(time = X)
XSetSelectionOwner(time = X)
(event.time = X) := XNextEvent()XFilterEvent(event.time = X)XtDispatchEvent(event.time = X)(event.time = X) := XCheckWindowEvent()cb_XtActionProc(event.time = X)
68
XInternAtom
English: Don’t call XInternAtom in the event loop.
For no calls XInternAtom(display = X)
cb_XtEventHandler(event.display = X)cb_XtActionProc(event.display = X)(event.display = X) := XtAppNextEvent()(event.display = X) := XNextEvent()XFilterEvent(event.display = X)XtCallActionProc(event.display = X)(event.display = X) := XWindowEventXtDispatchEvent(event.display = X)(event.display = X) := XCheckWindowEvent()
XInternAtom(display = X)
69
XPutImage
English: The image and GC passed to XPutImagemust have been created on the same display.
For all calls XPutImage(display = X, image = Y, gc = Z)
Z := XCreateGC(display = X)
Y := XCreateImage(display = X)Y := XShmCreateImage(display = X)
XPutImage(display = X, image = Y, gc = Z)
70
Conclusions• Strauss is
• local• scalable• dynamic
• Strauss finds• real specs.• real bugs
• Does Strauss• generalize well?• work for others?• do too much or too little?
71
Suggestions for future work
• Increasing automation• Exploiting patterns• Exploiting labels
• Beyond mining from traces• Static; hybrids (profile-based)
• More expressive models• Structured heaps; arithmetic properties
• Other domains• Program understanding; testing
72
My publications• On Strauss.
• Debugging temporal specifications with concept analysis. With Rastislav Bodik, James R. Larus, and David Mandelin. PLDI03.
• Mining specifications. With Rastislav Bodik and James R. Larus. POPL02.
• Path profiling• Improving data-flow analysis with path profiles.
With James R. Larus. PLDI98. Selected for 20 Years of PLDI (1979-1999).
• Exploiting hardware performance counters with flow and context sensitive profiling. With Tom Ball and James R. Larus. PLDI97.
73
End of talk
74
Classifying scenarios
b0b2
b1
g0
g1
G0G0
G0
G0G0
b0b2
b1 g0
g1
G0G0
G0
G0G0
g0
g1
G0G0
G0
G0G0
75
Overview of StraussTraces and STM
apply STM extract scenarios
socket(...)
accept(...)
read(...) write(...)
close(...)
Trace programs(with PDGs)
socket(...)
accept(...)
read(...) write(...)
close(...)
socket(...)
accept(...)
read(...) write(...)
close(...)
Seed pattern
Scenarios
Classify and learn
Specification
X := socket()
Y := accept(so = X)
close(so = Y)close(so = X)
read(so = Y)
write(so = Y)
start
end
..
76
Overview of StraussTraces and STM
apply STM extract scenarios
socket(...)
accept(...)
read(...) write(...)
close(...)
Trace programs(with PDGs)
socket(...)
accept(...)
read(...) write(...)
close(...)
socket(...)
accept(...)
read(...) write(...)
close(...)
Seed pattern
Scenarios
Classify and learn
Specification
X := socket()
Y := accept(so = X)
close(so = Y)close(so = X)
read(so = Y)
write(so = Y)
start
end
..
77
Strauss inputs
...socket(domain = 2, type = 1, proto = 0, return = 7))...
Traces
...socket(domain = 2, type = 1, proto = 0, return = 7))...
...7 := socket(domain = 2, type = 1, proto = 0)...
State transition model (STM)
def socket()close(def use sock)def accept(use sock)read(use sock)write(use sock)
StraussStrauss
LibraryClient programs
Interface (e.g., sockets)