Synthesis and Inductive Learning – Part 2
Sanjit A. Seshia
EECS Department, UC Berkeley
NSF ExCAPE Summer School, June 23-25, 2015
Acknowledgments to several Ph.D. students, postdoctoral researchers, and collaborators, and to
the students of EECS 219C, Spring 2015, UC Berkeley
– 2 –
Questions of Interest for this Tutorial
How can inductive synthesis be used to solve other (non-synthesis) problems?
– Reducing a problem to synthesis
How does inductive synthesis compare with machine learning? What are the common themes among various inductive synthesis efforts?
– The Oracle-Guided Inductive Synthesis (OGIS) framework
Is there a complexity/computability theory for inductive synthesis?
– Yes! A first step: theoretical analysis of counterexample-guided inductive synthesis (CEGIS)
– 3 –
Outline for this Lecture Sequence
Examples of Reduction to Synthesis
– Specification
– Verification
Differences between Inductive Synthesis and Machine Learning
Oracle-Guided Inductive Synthesis
– Examples, CEGIS
Theoretical Analysis of CEGIS
– Properties of Learner
– Properties of Verifier
Demo: Requirement Mining for Cyber-Physical Systems
– 4 –
Comparison with Machine Learning
– 5 –
Formal Inductive Synthesis
Given:
– Class of artifacts C
– Formal specification Φ
– Set of (labeled) examples E (or a source of E)
Find, using only E, an f ∈ C that satisfies Φ.
– 6 –
Counterexample-Guided Inductive Synthesis (CEGIS)
INITIALIZE with a structure hypothesis ("syntax-guidance") and initial examples.
SYNTHESIZE sends a candidate artifact to VERIFY; VERIFY returns a counterexample to SYNTHESIZE.
The loop ends when verification succeeds or synthesis fails.
– 7 –
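As a concrete (if toy) illustration, the loop above can be sketched in Python. Everything here is hypothetical: the small candidate list, the executable specification, and the exhaustive verifier merely stand in for the structure hypothesis, the formal spec, and a real verification engine.

```python
# Minimal CEGIS sketch (illustrative names, not from the slides):
# synthesize a function matching a hidden spec on a small finite domain.

DOMAIN = range(-8, 8)

def spec(x):
    # Formal specification, given here as an executable reference: |x|
    return x if x >= 0 else -x

# Structure hypothesis ("syntax-guidance"): a small space of candidates.
CANDIDATES = [
    ("x", lambda x: x),
    ("-x", lambda x: -x),
    ("x*x", lambda x: x * x),
    ("max(x,-x)", lambda x: max(x, -x)),
]

def synthesize(examples):
    # Learner: return any candidate consistent with the examples so far.
    for name, f in CANDIDATES:
        if all(f(x) == y for x, y in examples):
            return name, f
    return None  # synthesis fails: no candidate fits the examples

def verify(f):
    # Verification oracle: exhaustively check the finite domain,
    # returning a counterexample input if one exists.
    for x in DOMAIN:
        if f(x) != spec(x):
            return x
    return None

def cegis():
    examples = []  # initial example set (empty here)
    while True:
        cand = synthesize(examples)
        if cand is None:
            return None                    # synthesis fails
        name, f = cand
        cex = verify(f)
        if cex is None:
            return name                    # verification succeeds
        examples.append((cex, spec(cex)))  # learn from the counterexample

print(cegis())  # finds "max(x,-x)"
```

The counterexamples steer the learner away from candidates that merely fit the examples seen so far, which is the essence of the loop in the diagram.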
CEGIS = Learning from Examples & Counterexamples
INITIALIZE with a "concept class" and initial examples.
The LEARNING ALGORITHM sends a candidate concept to the VERIFICATION ORACLE, which returns a counterexample.
The loop ends when learning succeeds or learning fails.
– 8 –
CEGIS is an Instance of Active Learning
The ACTIVE LEARNING ALGORITHM exchanges examples with an oracle and is characterized by:
1. Search Strategy: how to search the space of candidate concepts?
2. Selection Strategy: which examples to learn from?
– 9 –
Some Instances of CEGIS you've seen (or will see soon)

Instance             Concept Class        Learner             Verifier
SKETCH               Programs in SKETCH   SAT/SMT solver      SAT/SMT solver
Enumerative SyGuS    Defined by grammar   Enumerative search  SMT solver
Stochastic SyGuS     Defined by grammar   Stochastic search   SMT solver
Constraint SyGuS     Defined by grammar   SMT solver          SMT solver
Req. Mining for STL  Parametric STL       Parameter search    Simulation-based falsifier

(see [Alur et al., FMCAD'13])
– 10 –
Comparison*

Feature                  Formal Inductive Synthesis   Machine Learning
Concept/Program Classes  Programmable, complex        Fixed, simple
Learning Algorithms      General-purpose solvers      Specialized
Learning Criteria        Exact, w/ formal spec        Approximate, w/ cost function
Oracle-Guidance          Common (can control oracle)  Rare (black-box oracles)

* Between a typical inductive synthesizer and a typical machine learning algorithm
– 11 –
Oracle-Guided Inductive Synthesis
– 12 –
Oracle-Guided Inductive Synthesis
Given:
– Domain of examples D
– Concept class C
– Formal specification Φ ⊆ D
– Oracle O that can answer queries of type Q
Find, by only querying O, an f ∈ C that satisfies Φ.
– 13 –
Common Oracle Query Types
(The LEARNER poses queries; the ORACLE responds.)
– Positive witness: return x ∈ Φ, if one exists, else ⊥
– Negative witness: return x ∉ Φ, if one exists, else ⊥
– Membership: Is x ∈ Φ? Yes / No
– Equivalence: Is f = Φ? Yes / No, plus a counterexample x ∈ f ⊕ Φ
– Subsumption/Subset: Is f ⊆ Φ? Yes / No, plus x ∈ f \ Φ
– Distinguishing input: given f and X ⊆ f, return f' s.t. f' ≠ f and X ⊆ f', if one exists; otherwise ⊥
– 14 –
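The query types above become concrete over finite sets, where an oracle can answer every query by enumeration. The Python sketch below is purely illustrative; the domain, the specification (here `PHI`), and the use of `None` to stand for ⊥ are all assumptions for the example.

```python
# Illustrative oracle over finite sets: concepts and the specification PHI
# are Python sets over a small domain, so each query type can be answered
# by direct enumeration.

DOMAIN = set(range(10))
PHI = {x for x in DOMAIN if x % 2 == 0}  # the target specification

def positive_witness():
    # Some x in PHI, or None (standing in for "bottom") if PHI is empty.
    return next(iter(PHI), None)

def negative_witness():
    return next(iter(DOMAIN - PHI), None)

def membership(x):
    return x in PHI

def equivalence(f):
    # Yes, or a counterexample from the symmetric difference f ^ PHI.
    diff = f ^ PHI
    return (True, None) if not diff else (False, next(iter(diff)))

def subsumption(f):
    # Is f a subset of PHI? If not, return a witness from f \ PHI.
    diff = f - PHI
    return (True, None) if not diff else (False, next(iter(diff)))

print(membership(4))        # True
print(subsumption({0, 2}))  # (True, None)
print(equivalence({0, 2}))  # (False, <some even number not in {0, 2}>)
```

A learner interacting only through these functions never sees `PHI` directly, which is exactly the OGIS setting.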
Examples of OGIS
L* algorithm to learn DFAs:
– Membership + equivalence queries
CEGIS used in SKETCH/SyGuS solvers:
– (Positive) witness + equivalence/subsumption queries
CEGIS used in reactive model predictive control:
– covered in Vasu Raman's lecture
Two different examples:
– Learning programs from distinguishing inputs [Jha et al., ICSE 2010]
– Learning LTL properties for synthesis from counterstrategies [Li et al., MEMOCODE 2011]
– 15 –
Reverse Engineering Malware by Program Synthesis
Obfuscated code (from the Conficker worm). Input: y; output: modified value of y.

  a=1; b=0; z=1; c=0;
  while(1) {
    if (a == 0) {
      if (b == 0) { y=z+y; a=~a; b=~b; c=~c; if (~c) break; }
      else        { z=z+y; a=~a; b=~b; c=~c; if (~c) break; }
    }
    else if (b == 0) { z=y << 2; a=~a; }
    else             { z=y << 3; a=~a; b=~b; }
  }

What it does: y = y * 45
We solve this using program synthesis.
Paper: S. Jha et al., "Oracle-Guided Component-Based Program Synthesis", ICSE 2010.
– 16 –
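A quick sanity check of the deobfuscated semantics: since 45 = 32 + 8 + 4 + 1, multiplication by 45 decomposes into shifts and additions, which is why shift components (like the y << 2 and y << 3 appearing above) suffice to build it. The decomposition below is one illustrative choice, not necessarily the exact computation path taken by the obfuscated loop.

```python
# Shift-and-add decomposition of multiplication by 45 (assumption: this is
# an illustration of why shift/add components suffice, not the loop's path).
def times45(y):
    return (y << 5) + (y << 3) + (y << 2) + y  # 32y + 8y + 4y + y = 45y

for y in range(-100, 100):
    assert times45(y) == 45 * y
print("ok")
```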
Class of Programs: "Loop-Free"
Programs implementing functions I → O, of the form
P(I):
  O1 = f1(V1)
  O2 = f2(V2)
  …
  On = fn(Vn)
where f1, f2, …, fn are functions from a given component library.
The fi could be if-then-else definitions; hence this form represents any loop-free code.
– 18 –
Program Learning as Set Cover [Goldman & Kearns, 1995]
Space of all possible programs; each dot represents a semantically unique program.
– 19 –
Program Learning as Set Cover
Each example (i1, o1) rules out a set of incorrect programs.
– 21 –
Program Learning as Set Cover
Let Ej be the set of incorrect programs ruled out by example (ij, oj), for j = 1, …, n.
Theorem [Goldman & Kearns, '95]: The smallest set of I/O examples needed to learn the correct program is a minimum-size subset of {E1, E2, …, En} that covers all the incorrect programs.
– 22 –
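Finding a minimum cover is NP-hard in general, so in practice one approximates. The sketch below applies the standard greedy set-cover heuristic to hypothetical data: `incorrect` and `ruled_out` are made-up stand-ins for the set of incorrect programs and the sets Ej.

```python
# Greedy approximation for the set-cover view of example selection:
# repeatedly pick the example E_j that rules out the most not-yet-
# eliminated incorrect programs. Toy data; the sets are illustrative.
def greedy_cover(universe, sets):
    remaining = set(universe)
    chosen = []
    while remaining:
        # Pick the example set eliminating the most remaining programs.
        best = max(sets, key=lambda name: len(sets[name] & remaining))
        if not sets[best] & remaining:
            return None  # the examples cannot rule out every incorrect program
        chosen.append(best)
        remaining -= sets[best]
    return chosen

incorrect = {"P1", "P2", "P3", "P4", "P5"}
ruled_out = {            # E_j -> incorrect programs ruled out by example j
    "E1": {"P1", "P2", "P3"},
    "E2": {"P3", "P4"},
    "E3": {"P4", "P5"},
    "E4": {"P1"},
}
print(greedy_cover(incorrect, ruled_out))  # ['E1', 'E3']
```

The greedy choice achieves the classic ln(n)-factor approximation of the optimal cover; the online setting on the next slides is harder still, since Ej is disclosed only after example j is chosen.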
Program Learning as Set Cover
Practical challenge: we cannot enumerate all inputs and compute the set Ej for each.
– 23 –
Program Learning as Set Cover
ONLINE set cover: in each step, choose some (ij, oj) pair; the set Ej of incorrect programs it eliminates is disclosed only afterwards.
– 24 –
Program Learning as Set Cover
Our heuristic for the online setting: pick examples with |Ej| ≥ 1, i.e., each chosen example identifies at least one incorrect program.
– 25 –
Approach: Learning Based on Distinguishing Inputs [Jha et al., ICSE'10]
Space of all possible programs; each dot represents a semantically unique program.
– 26 –
Learning from Distinguishing Inputs
Step 1: Make a (positive) witness query. Example I/O set E := {(i1, o1)}.
– 27 –
Step 2: The learner synthesizes a candidate program P1 consistent with E, using an SMT solver.
– 28 –
Step 3: Using the SMT solver, find a second program P2 that is also consistent with E, together with an input i2 on which P1 and P2 disagree. This is implemented as a single distinguishing-input query: given P1, find P2 s.t. P1 ≠ P2 and (i2, o2) is a distinguishing input.
– 31 –
Step 4: Query the I/O oracle on i2 and grow the example set: E := E ∪ {(i2, o2)}.
– 32 –
Repeat steps 2-4, adding one distinguishing input at a time: E := E ∪ {(ij, oj)}, …, E := E ∪ {(ik, ok)}.
– 34 –
Learning from Distinguishing Inputs
After E := E ∪ {(in, on)}, no distinguishing input remains: the candidate is a semantically unique program consistent with E. But is it the correct program?
– 35 –
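The loop on the preceding slides can be sketched over a tiny finite program class, with brute-force search standing in for the SMT solver. All names here (the program class, the I/O oracle) are illustrative assumptions, not the ICSE 2010 implementation.

```python
# Hypothetical sketch of the distinguishing-input loop over a finite
# program class: keep a candidate consistent with E; ask for another
# program that also agrees on E but differs somewhere; the point of
# disagreement, once labeled by the I/O oracle, becomes a new example.
INPUTS = range(8)
PROGRAMS = {
    "2*x":   lambda x: 2 * x,
    "x+x":   lambda x: x + x,    # semantically equal to 2*x
    "x*x":   lambda x: x * x,
    "3*x-1": lambda x: 3 * x - 1,
}

def io_oracle(x):  # the black-box program being reverse engineered
    return 2 * x

def consistent(f, E):
    return all(f(x) == y for x, y in E)

def distinguishing_input(name1, E):
    # Find a P2 consistent with E but semantically different from P1,
    # together with an input where they disagree.
    f1 = PROGRAMS[name1]
    for name2, f2 in PROGRAMS.items():
        if name2 == name1 or not consistent(f2, E):
            continue
        for x in INPUTS:
            if f1(x) != f2(x):
                return x
    return None  # P1 is semantically unique among programs consistent with E

def learn():
    E = [(1, io_oracle(1))]  # initial positive-witness query
    while True:
        cand = next(n for n, f in PROGRAMS.items() if consistent(f, E))
        x = distinguishing_input(cand, E)
        if x is None:
            return cand  # unique up to semantics; correctness checked separately
        E.append((x, io_oracle(x)))

print(learn())  # "2*x"
```

Note that "x+x" never triggers a new query: a distinguishing input must witness a semantic difference, so semantically equal programs do not count, mirroring the "semantically unique program" endpoint on the slide.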
Soundness
– If the library of components is sufficient, a correct design is produced.
– If the library is not sufficient and the I/O pairs expose this infeasibility, infeasibility is reported.
– Otherwise (insufficient library, infeasibility not exposed), an incorrect design may result.
Soundness is thus conditional on the validity of the structure hypothesis. We can place this learning approach within an outer CEGIS loop.
– 38 –
Example 2: Assumption Synthesis
Reactive Synthesis from LTL: system requirements φs and environment assumptions φe are fed to a synthesis tool, which either reports Realizable and produces an FSM, or reports Unrealizable.
Unrealizability is often due to incomplete environment assumptions!
Example
Inputs: request r and cancel c. Output: grant g.
System specification φs:
– G(r → X F g)
– G(c → X ¬g)
Environment assumption φe: true
Is it realizable? No: the environment can force c to be high all the time.
Counter-Strategy and Counter-Trace
A counter-strategy is a strategy for the environment to force violation of the specification.
A counter-trace is a fixed input sequence such that the specification is violated regardless of the outputs generated by the system.
System φs: G(r → X F g), G(c → X ¬g)
A counter-trace: r: 1 1 (1)ω; c: 1 1 (1)ω
CounterStrategy-Guided Environment Assumption Synthesis [Li et al., MEMOCODE 2011]
Flow: starting from the formal specification (plus specification templates and user scenarios), run the synthesis tool. If realizable, done. If unrealizable, compute a counterstrategy, mine assumptions from it, add them to the specification, and repeat.

This is an instance of CEGIS:
– Concept class: all LTL formulas over I and O "constraining I"
– Domain of examples: all finite-state transducers with input O and output I, i.e., environments (the system Sys has input I and output O)
– Formal spec Φ: the set of environment transducers for which there exists an implementation of Sys satisfying φs
– The oracle supports a subsumption query: does the current assumption rule out all environments outside Φ? If not, give one such counterexample.
Assumption Learning Algorithm
Generate a candidate φ from specification templates (e.g., G F ?) and user scenarios, instantiating the template (e.g., G F p) while checking consistency with existing assumptions. Then use the counter-strategy to either eliminate or accept the candidate; once a candidate is accepted, we are done.
Assumption Synthesis Algorithm
As above, but eliminate-or-accept is decided by invoking a model checker against the counter-strategy E: if E satisfies the candidate's negation (i.e., E violates φ, so φ rules this environment behavior out), accept φ and finish; otherwise eliminate φ. Consistency with existing assumptions is checked when candidates are generated.
VERSION SPACES:
– Retain candidates consistent with the examples
– Traverse from weakest to strongest
Version Space Learning
(Originally due to [Mitchell, 1978])
Candidate assumptions form a lattice ordered from the WEAKEST (most general) formula, true, down to the STRONGEST (most specific), false. In between lie template instances such as G F p1, …, G F pn; G X p1, …, G X pn; G(p1 → X p2); G(p1 ∨ p2), …, G(pn-1 ∨ pn); and G p1, …, G pn. Learning traverses this lattice from weakest to strongest, retaining candidates consistent with the examples.
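A minimal candidate-elimination sketch over a toy lattice: each candidate assumption is paired with an evaluator on finite traces of the environment variable c, and candidates that the counterstrategy satisfies are eliminated, since they fail to rule it out. The templates, the trace encoding, and the finite-trace reading of G F ¬c are all simplifying assumptions for illustration; this is not the MEMOCODE'11 tool.

```python
# Toy version-space sketch over a tiny, hypothetical lattice of assumption
# templates, evaluated on finite traces of the environment variable c
# (1 = cancel asserted). "G F not c" is approximated on a finite trace as
# "c is 0 at some point".

# Candidates ordered weakest (most general) to strongest (most specific).
CANDIDATES = [
    ("true",      lambda trace: True),
    ("G F not c", lambda trace: 0 in trace),           # c goes low eventually
    ("G not c",   lambda trace: all(v == 0 for v in trace)),
]

def eliminate(candidates, counterstrategy_trace):
    # Keep only candidates the counterstrategy VIOLATES: adding such an
    # assumption rules this environment behavior out.
    return [(n, f) for n, f in candidates if not f(counterstrategy_trace)]

# Counter-trace from the slides: c stays high forever (finite prefix here).
survivors = eliminate(CANDIDATES, [1, 1, 1, 1])
print([n for n, _ in survivors])  # ['G F not c', 'G not c']
```

Traversing weakest-to-strongest means picking the first survivor, here 'G F not c': the weakest assumption that still rules out the counterstrategy, which avoids over-constraining the environment.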
Example (continued)
System φs: G(r → X F g), G(c → X ¬g)
Counter-trace: r: 1 1 (1)ω; c: 1 1 (1)ω
Test assumption candidates by checking the negation against the counter-trace. For the candidate G F ¬c, the counter-trace satisfies ¬(G F ¬c) = F G c, so the candidate is violated by the counter-trace and is accepted.
Resulting environment assumption φe: G F ¬c
Theoretical Results
Theorem 1 [Completeness]: If there exist environment assumptions under our structure hypothesis that make the specification realizable, then the procedure finds them (terminates successfully). This is a "conditional completeness" guarantee.
Theorem 2 [Soundness]: The procedure never adds inconsistent environment assumptions.
– 50 –
Summary of Lecture 2
– Differences between formal inductive synthesis and machine learning
– Common features across various inductive synthesis methods
– The Oracle-Guided Inductive Synthesis framework, with two instances:
  – Learning programs from distinguishing inputs: http://www.eecs.berkeley.edu/~sseshia/pubs/b2hd-jha-icse10.html
  – Learning assumptions from counterstrategies: http://www.eecs.berkeley.edu/~sseshia/pubs/b2hd-li-memocode11.html
– Brief introduction to the theory (more tomorrow)