Program Synthesis
• Synthesize a program in some underlying language from user intent using some search technique.
• Why today?
  – Variety of (cheap) computational devices and platforms
    • Billions of non-experts have access to these devices!
  – Enabling technology is now available
    • Better search algorithms
    • Faster machines (good application for multi-cores)
Dimensions in Synthesis
• Concept Language (Application)
  – Programs
    • Straight-line programs
  – Automata
  – Queries
  – Sequences
• User Intent (Ambiguity)
  – Logic, Natural Language
  – Examples, Demonstrations/Traces
• Search Technique (Algorithm)
  – SAT/SMT solvers (Formal Methods)
  – A*-style goal-directed search (AI)
  – Version space algebras (Machine Learning)
PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.
Compilers vs. Synthesizers
Dimension          Compilers                        Synthesizers
Concept Language   Executable Program               Variety of concepts: Program, Automata, Query, Sequence
User Intent        Structured language              Variety/mixed form of constraints: logic, examples, traces
Search Technique   Syntax-directed translation      Uses some kind of search
                   (No new algorithmic insights)    (Discovers new algorithmic insights)
Students and Teachers
End-Users
Algorithm Designers
Software Developers
Most Transformational Target
Potential Users of Synthesis Technology
Most Useful Target
• Vision for End-users: Enable people to have (automated) personal assistants.
• Vision for Education: Enable every student to have access to free & high-quality education.
Organization
Lecture 1: Algorithms
• Synthesis of Straight-line Programs from Logic
  – Bit-vector Algorithms
  – Geometry Constructions
Lecture 2: Applications
• Intelligent Tutoring Systems
Lecture 3: Ambiguity
• Synthesis from Examples & Keywords
Lab
Intelligent Tutoring Systems
Technical Goals:
• Identify a useful task that can be formalized as a synthesis problem.
• Propose an appropriate user interaction model.
• Propose an appropriate search technique.
Synthesizing Bitvector Algorithms
PLDI 2011: Gulwani, Jha, Tiwari, Venkatesan
Bitvector Algorithms
Straight-line programs that use
– Arithmetic Operators: +,-,*,/
– Logical Operators: Bitwise and/or/not, Shift left/right
Examples of Bitvector Algorithms
Turn-off rightmost 1-bit: Z & (Z-1)
Z         = 1 0 1 0 1 1 0 0
Z-1       = 1 0 1 0 1 0 1 1
Z & (Z-1) = 1 0 1 0 1 0 0 0
Turn-off rightmost contiguous sequence of 1-bits: Z & (1 + (Z | (Z-1)))
Z                     = 1 0 1 0 1 1 0 0
Z & (1 + (Z | (Z-1))) = 1 0 1 0 0 0 0 0
Ceil of average of two integers without overflowing: (Y|Z) - ((Y⊕Z) >> 1)
Higher order half of product of x and y
o1 := and(x,0xFFFF);
o2 := shr(x,16);
o3 := and(y,0xFFFF);
o4 := shr(y,16);
o5 := mul(o1,o3);
o6 := mul(o2,o3);
o7 := mul(o1,o4);
o8 := mul(o2,o4);
o9 := shr(o5,16);
o10 := add(o6,o9);
o11 := and(o10,0xFFFF);
o12 := shr(o10,16);
o13 := add(o7,o11);
o14 := shr(o13,16);
o15 := add(o14,o12);
res := add(o15,o8);
Round up to next highest power of 2
o1 := sub(x,1);
o2 := shr(o1,1);
o3 := or(o1,o2);
o4 := shr(o3,2);
o5 := or(o3,o4);
o6 := shr(o5,4);
o7 := or(o5,o6);
o8 := shr(o7,8);
o9 := or(o7,o8);
o10 := shr(o9,16);
o11 := or(o9,o10);
res := add(o11,1);
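The bitvector examples above can be tried out directly. The following is a minimal Python sketch, assuming 32-bit words (the word size, the MASK constant, and all function names are assumptions made for illustration, not part of the slides).

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit machine words


def turn_off_rightmost_one(z):
    """Z & (Z-1): clears the lowest set bit."""
    return z & (z - 1) & MASK


def turn_off_rightmost_run(z):
    """Z & (1 + (Z | (Z-1))): clears the rightmost contiguous run of 1-bits."""
    return z & (1 + (z | (z - 1))) & MASK


def ceil_average(y, z):
    """(Y|Z) - ((Y xor Z) >> 1): no intermediate value exceeds max(y, z)."""
    return (y | z) - ((y ^ z) >> 1)


def round_up_to_power_of_2(x):
    """Smear the highest set bit of x-1 to the right, then add 1."""
    x = (x - 1) & MASK
    for shift in (1, 2, 4, 8, 16):
        x |= x >> shift
    return (x + 1) & MASK


def high_half_of_product(x, y):
    """Upper 32 bits of the 64-bit product, using only 32-bit intermediates."""
    lo_x, hi_x = x & 0xFFFF, x >> 16
    lo_y, hi_y = y & 0xFFFF, y >> 16
    t = hi_x * lo_y + ((lo_x * lo_y) >> 16)
    w1 = lo_x * hi_y + (t & 0xFFFF)
    return (hi_x * hi_y + (t >> 16) + (w1 >> 16)) & MASK
```

Because Python integers are unbounded, the mask stands in for fixed-width wraparound; on real hardware the same expressions work without it.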
Problem Definition
Given:
• Specification of desired functionality
• Specification of library components
Synthesize a straight-line program that meets the desired specification, where
• Each argument variable of the component on line j is either a program input or the output of some line k, where k<j
• The order in which the library components appear is a permutation of 1...n
Verification Constraint
• Specification of desired functionality
• Specification of library components
Problem Definition: Turn-off rightmost 1 bit
Synthesis Constraint
Verification Constraint
Synthesis Constraint
A program is determined by which component goes on which location (line #) and from which locations each component gets its input arguments. We encode this with location variables L.
Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order
Example: Possible programs that use 2 components and their Representation using Location Variables
• Consistency Constraint: Every line in the program should have at most one component.
Encoding Well-formedness of Programs
• Acyclicity Constraint: A variable should be initialized before being used.
The following constraint ensures that L assignments correspond to well-formed programs.
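As a rough illustration of the two well-formedness conditions, here is a small Python sketch that checks them on a concrete assignment of location variables. The representation (a list out_loc with the line of each component's output, a list arg_locs with the locations each component reads from, and inputs occupying locations 0..num_inputs-1) is an assumption made for this example, not the encoding used in the slides.

```python
def is_well_formed(num_inputs, out_loc, arg_locs):
    """Check consistency and acyclicity of a location assignment.

    out_loc[c]  : line on which component c is placed (hypothetical encoding)
    arg_locs[c] : locations from which component c reads its arguments
    Locations 0..num_inputs-1 hold the program inputs.
    """
    n = len(out_loc)
    valid_lines = range(num_inputs, num_inputs + n)
    # Every component must sit on one of the n program lines.
    if any(loc not in valid_lines for loc in out_loc):
        return False
    # Consistency: no two components share a line.
    if len(set(out_loc)) != n:
        return False
    # Acyclicity: each argument is defined strictly before it is used.
    return all(0 <= a < out_loc[c] for c in range(n) for a in arg_locs[c])


# Two components (decrement, and) over one input Z at location 0.
# Z & (Z-1): decrement on line 1 reads Z; and on line 2 reads Z and line 1.
print(is_well_formed(1, [1, 2], [[0], [0, 1]]))   # well-formed
print(is_well_formed(1, [2, 1], [[0], [0, 1]]))   # and reads its own line
print(is_well_formed(1, [1, 1], [[0], [0, 0]]))   # two components on one line
```

In the actual approach these conditions are stated as a logical constraint over L and handed to a solver; the function above only evaluates them for one given assignment.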
Encoding data-flow
The following constraint describes connections between inputs and outputs of various components.
Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order
Synthesis constraint is of the form: ∃L ∀Y F(L,Y)
Idea # 2: Using CEGIS style procedure to solve the Synthesis Constraint
1. Choose some values y1,..,yn for Y.
2. Finite Synthesis Step: solve ∃L F(L,y1) ∧ … ∧ F(L,yn).
   – No Solution: report Failure.
   – Solution L = S: go to step 3.
3. Verification Step: Does ∀Y F(S,Y) hold? Or, equivalently, solve ∃Y ¬F(S,Y).
   – No Solution: return S.
   – Solution Y = yn+1: add yn+1 to the chosen values and go to step 2.
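The loop can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: 8-bit words, a toy candidate space of programs of the shape Z op1 (Z op2 1), and brute-force enumeration standing in for the SMT solver in both the finite synthesis step and the verification step. The names OPS, spec, run, and cegis are made up for this example.

```python
import itertools

WIDTH = 8                      # assumption: 8-bit vectors keep verification cheap
MASK = (1 << WIDTH) - 1

# Hypothetical component library: binary operators on WIDTH-bit values.
OPS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
}


def spec(z):
    """Desired functionality: turn off the rightmost 1-bit of z."""
    for i in range(WIDTH):
        if (z >> i) & 1:
            return z & ~(1 << i) & MASK
    return 0


def run(candidate, z):
    """Candidate (outer, inner) denotes the program  z outer (z inner 1)."""
    outer, inner = candidate
    return OPS[outer](z, OPS[inner](z, 1))


def cegis(initial_inputs=(0,)):
    examples = list(initial_inputs)
    while True:
        # Finite synthesis step: find L consistent with all chosen inputs.
        s = next((c for c in itertools.product(OPS, OPS)
                  if all(run(c, y) == spec(y) for y in examples)), None)
        if s is None:
            return None                      # Failure
        # Verification step: look for an input on which S violates the spec.
        cex = next((y for y in range(MASK + 1) if run(s, y) != spec(y)), None)
        if cex is None:
            return s                         # S is correct on all inputs
        examples.append(cex)                 # refine with the counterexample


print(cegis())
```

Each counterexample rules out the current candidate for good, so the loop terminates on this finite space; the returned pair corresponds to Z & (Z-1).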
Experiments: Comparison with Brute-force Search
Program         Brahma          AHA
Name   lines    iters   time    time
P1     2        2       3       0.1
P2     2        3       3       0.1
P3     2        3       1       0.1
P4     2        2       3       0.1
P5     2        3       2       0.1
P6     2        2       2       0.1
P7     3        2       1       2
P8     3        2       1       1
P9     3        2       6       7
P10    3        14      76      10
P11    3        7       57      9
P12    3        9       67      10
P13    4        4       6       X
P14    4        4       60      X
P15    4        8       119     X
P16    4        5       62      X
P17    4        6       78      109
P18    6        5       46      X
P19    6        5       35      X
P20    7        6       108     X
P21    8        5       28      X
P22    8        8       279     X
P23    10       8       1668    X
P24    12       9       224     X
P25    16       11      2779    X
Synthesizing Geometry Constructions
PLDI 2011: Gulwani, Korthikanti, Tiwari.
Ruler/Compass based Geometry Constructions
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
[Figure: triangle XYZ, lines L1 and L2 intersecting at N, and circle C]
• Draw a regular hexagon given a side.
• Given 3 parallel lines, draw an equilateral triangle whose vertices lie on the parallel lines.
• Given 4 points, draw a square whose sides contain those points.
Other Examples of Geometry Constructions
Significance
• Good platform for teaching logical reasoning.
  – Visual Nature:
    • Makes it more accessible.
    • Exercises both logical/visual abilities of left/right brain.
  – Fun Aspect:
    • Ruler/compass restrictions make it fun, as in sports.
• Application in dynamic geometry or animations.
  – “Constructive” geometry macros (unlike numerical methods) enable fast re-computation of derived objects from free (moving) objects.
Programming Language for Geometry Constructions
Types: Point, Line, Circle
Methods:
• Ruler(Point, Point) -> Line
• Compass(Point, Point) -> Circle
• Intersect(Circle, Circle) -> Pair of Points
• Intersect(Line, Circle) -> Pair of Points
• Intersect(Line, Line) -> Point
Geometry Program: A straight-line composition of the above methods.
Example Problem: Program
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);
[Figure: triangle XYZ with construction circles C1, C2, D1, D2, points P1, P2, R1, R2, lines L1 and L2 intersecting at N, and circle C]
Specification Language for Geometry Programs
Conjunction of predicates over arithmetic expressions
Predicates p := e1 = e2
             | e1 ≠ e2
             | e1 ≤ e2
Arithmetic Expressions e := Distance(Point, Point) | Slope(Point, Point) | e1 ± e2 | c
Example Problem: Precondition/Postcondition
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
Precondition: Slope(X,Y) ≠ Slope(X,Z) ∧ Slope(X,Y) ≠ Slope(Z,X)
Postcondition: LiesOn(X,C) ∧ LiesOn(Y,C) ∧ LiesOn(Z,C)
Where LiesOn(X,C) ≡ Distance(X,Center(C)) = Radius(C)
Verification/Synthesis Problem for Geometry Programs
• Let P be a geometry program that computes outputs O from inputs I.
• Verification Problem: Check the validity of the following Hoare triple.
    Assume Pre(I);
    P
    Assert Post(I,O);
• Synthesis Problem: Given Pre(I), Post(I,O), find P such that the above Hoare triple is valid.
• Problem: Given two polynomials P1 and P2, determine whether they are equivalent.
• The naïve deterministic algorithm of expanding polynomials to compare them term-wise is exponential.
• A simple randomized test is probabilistically sufficient:
  – Choose random values r for polynomial variables x
  – If P1(r) ≠ P2(r), then P1 is not equivalent to P2.
  – Otherwise P1 is equivalent to P2 with high probability.
Randomized Polynomial Identity Testing
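A minimal Python sketch of the randomized test follows. It assumes the polynomials have integer coefficients and are given as ordinary Python callables; evaluating modulo a large prime, the number of trials, and the name probably_equivalent are choices made for this illustration.

```python
import random


def probably_equivalent(p1, p2, num_vars, trials=20, prime=2**61 - 1):
    """Randomized identity test for integer-coefficient polynomials.

    Returns False as soon as a random point separates p1 and p2 (they are
    then certainly different); returns True if no trial separates them,
    which means they are equivalent with high probability.
    """
    for _ in range(trials):
        r = [random.randrange(prime) for _ in range(num_vars)]
        if p1(*r) % prime != p2(*r) % prime:
            return False
    return True


# (x + y)^2 written in factored and in expanded form.
factored = lambda x, y: (x + y) ** 2
expanded = lambda x, y: x * x + 2 * x * y + y * y
wrong = lambda x, y: x * x + y * y

print(probably_equivalent(factored, expanded, 2))
print(probably_equivalent(factored, wrong, 2))
```

The test never expands the polynomials, so its cost is that of a few evaluations regardless of how many terms the expanded form would have.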
Pre(I), P, Post(I,O)
a) Symbolic decision procedures are complex.
b) New efficient approach: Random Testing!
   1. Choose I’ randomly from the set { I | Pre(I) }.
   2. Compute O’ := P(I’).
   3. If O’ satisfies Post(I’,O’) output “Verified”.
Correctness Proof of (b):
• Objects constructed by P can be described using polynomial ops (+,-,*), square-root & division operator.
• The randomized polynomial identity testing algorithm lifts to square-root and division operators as well!
Approaches to Verification Problem
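Random testing of a geometry program can be sketched as follows for the circumcircle example. This is an illustration only: it uses floating-point arithmetic with a tolerance instead of exact reasoning, tests non-collinearity with a cross product (with a safety margin) in place of the slope-based precondition, and all function names are assumptions.

```python
import math
import random


def compass(p, q):
    """Circle centered at p passing through q, as (center, radius)."""
    return (p, math.dist(p, q))


def ruler(p, q):
    """Line through p and q, as (a, b, c) meaning a*x + b*y = c."""
    a, b = q[1] - p[1], p[0] - q[0]
    return (a, b, a * p[0] + b * p[1])


def intersect_lines(l1, l2):
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)


def intersect_circles(c1, c2):
    (p0, r0), (p1, r1) = c1, c2
    d = math.dist(p0, p1)
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)   # distance from p0 to the chord
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))    # half the chord length
    mx = p0[0] + a * (p1[0] - p0[0]) / d
    my = p0[1] + a * (p1[1] - p0[1]) / d
    ox, oy = h * (p1[1] - p0[1]) / d, h * (p0[0] - p1[0]) / d
    return (mx + ox, my + oy), (mx - ox, my - oy)


def program(X, Y, Z):
    """The 10-step ruler/compass construction of the circle through X, Y, Z."""
    P1, P2 = intersect_circles(compass(X, Y), compass(Y, X))
    L1 = ruler(P1, P2)
    R1, R2 = intersect_circles(compass(Z, X), compass(X, Z))
    L2 = ruler(R1, R2)
    N = intersect_lines(L1, L2)
    return compass(N, X)


def pre(X, Y, Z):
    """Assumed stand-in for Pre: X, Y, Z are clearly non-collinear."""
    cross = (Y[0] - X[0]) * (Z[1] - X[1]) - (Y[1] - X[1]) * (Z[0] - X[0])
    return abs(cross) > 1.0


def lies_on(P, C, tol=1e-6):
    center, radius = C
    return abs(math.dist(P, center) - radius) <= tol * (1.0 + radius)


def verify(prog, rng=random):
    """One round of random testing: sample I' with Pre, run P, check Post."""
    while True:
        I = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(3)]
        if pre(*I):
            break
    C = prog(*I)
    return all(lies_on(P, C) for P in I)


print("Verified" if verify(program) else "Not verified")
```

A construction that is wrong in general, such as returning Compass(X,Y), is rejected by the same check, because a random triangle is very unlikely to hide the error.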
Synthesis Algorithm:
// First obtain a random input-output example.
1. Choose I’ randomly from the set { I | Pre(I) }.
2. Compute O’ s.t. Post(I’,O’) using numerical methods.
// Now obtain a construction that can generate O’ from I’ (using exhaustive search).
3. S := I’;
4. While (S does not contain O’)
5.    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
6. Output construction steps for O’.
Idea 1 (from Theory): Symbolic Reasoning -> Concrete
Error Probability of the algorithm is extremely low.
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
…
L1 = Ruler(P1,P2);
…
L2 = Ruler(R1,R2);
N = Intersect(L1,L2);
C = Compass(N,X);
• For an equilateral △XYZ, incenter coincides with circumcenter N.
• But what are the chances of choosing a random △XYZ to be an equilateral one?
[Figure: triangle XYZ, lines L1 and L2 intersecting at N, and circle C]
Synthesis algorithm times out because programs are large.
• Identify a library of commonly used patterns (pattern = “sequence of geometry methods”)
  – E.g., perpendicular/angular bisector, midpoint, tangent, etc.
• Search over library methods instead of primitive methods, i.e., replace
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
  with
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
• Two key advantages:
  – Search space: large depth -> small depth
  – Easier to explain solutions to students.
Idea 2 (from PL): High-level Abstractions
Use of high-level abstractions reduces program size
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
Without high-level abstractions:
1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);
With high-level abstractions:
1. L1 = PBisector(X,Y);
2. L2 = PBisector(X,Z);
3. N = Intersect(L1,L2);
4. C = Compass(N,X);
Synthesis algorithm is inefficient because the search space is too wide and hence still huge.
• Prune forward search by using A* style heuristics, i.e., replace
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
  with
    S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods, IsGood(M(O1,O2)) }
• Example: If a method constructs a line L that passes through a desired output point, then L is “good” (i.e., worth constructing).
Idea 3 (from AI): Goal Directed Search
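The effect of the IsGood filter can be seen in a small Python sketch of the forward search on one concrete triangle. Everything here is a simplification made for illustration: objects are tagged tuples of floats, equality is decided by rounding, the library contains only Ruler, PBisector, Compass and line-line Intersect, the goal circle is computed by hand, and its center is treated as the desired output point.

```python
import itertools
import math


def make_line(a, b, px, py):
    """Line with normal (a, b) through (px, py): ("line", a, b, c), a*x + b*y = c."""
    n = math.hypot(a, b)
    if a < 0 or (a == 0 and b < 0):
        n = -n                      # canonical sign so equal lines compare equal
    return ("line", a / n, b / n, (a * px + b * py) / n)


def ruler(p, q):
    return make_line(q[2] - p[2], p[1] - q[1], p[1], p[2])


def pbisector(p, q):
    return make_line(q[1] - p[1], q[2] - p[2], (p[1] + q[1]) / 2, (p[2] + q[2]) / 2)


def compass(p, q):
    return ("circle", p[1], p[2], math.hypot(q[1] - p[1], q[2] - p[2]))


def intersect(l1, l2):
    det = l1[1] * l2[2] - l2[1] * l1[2]
    if abs(det) < 1e-9:
        return None                 # parallel lines
    return ("point", (l1[3] * l2[2] - l2[3] * l1[2]) / det,
            (l1[1] * l2[3] - l2[1] * l1[3]) / det)


def key(obj):
    """Rounded form used to detect that two constructed objects coincide."""
    return (obj[0],) + tuple(round(v, 6) for v in obj[1:])


def is_good(obj, goal_points):
    """Heuristic: a line is worth keeping only if it passes through a goal point."""
    if obj[0] != "line":
        return True
    return any(abs(obj[1] * g[1] + obj[2] * g[2] - obj[3]) < 1e-6 for g in goal_points)


def search(inputs, goal, goal_points, directed, max_rounds=3):
    S = {key(o): o for o in inputs}
    for _ in range(max_rounds):
        points = [o for o in S.values() if o[0] == "point"]
        lines = [o for o in S.values() if o[0] == "line"]
        new = []
        for p, q in itertools.permutations(points, 2):
            new += [ruler(p, q), pbisector(p, q), compass(p, q)]
        for l1, l2 in itertools.combinations(lines, 2):
            x = intersect(l1, l2)
            if x is not None:
                new.append(x)
        for o in new:
            if not directed or is_good(o, goal_points):
                S.setdefault(key(o), o)
        if key(goal) in S:
            return True, len(S)     # found, number of objects constructed
    return False, len(S)


# Triangle with circumcenter (2, 1) and circumradius sqrt(5).
X, Y, Z = ("point", 0.0, 0.0), ("point", 4.0, 0.0), ("point", 1.0, 3.0)
goal = ("circle", 2.0, 1.0, math.sqrt(5.0))
center = ("point", 2.0, 1.0)
print(search([X, Y, Z], goal, [center], directed=True))
print(search([X, Y, Z], goal, [center], directed=False))
```

Both runs reach the goal circle, but the goal-directed run keeps fewer objects because lines such as the triangle's sides, which miss the desired point, are never added to S.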
Effectiveness of Goal-directed search
Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.
• L1 and L2 are immediately constructed since they pass through output point N.
• On the other hand, other lines like angular bisectors are not eagerly constructed.
[Figure: triangle XYZ, lines L1 and L2 intersecting at N, and circle C]
Experimental Results
25 benchmark problems.
• Example: Construct a square whose extended sides pass through 4 given points.
• 18 problems took less than 1 second, 4 problems took 1-3 seconds, and 3 problems took 13-82 seconds.
• Idea 2 (high-level abstractions) reduces programs of size 3-45 to 3-13.
• Idea 3 (goal-directedness) improves performance by a factor of 10-1000 on most problems.
Search space Exploration: With/without goal-directedness
Optional Advance Preparation
• Lecture 2
  – Section 4 in WAMBSE 2012 keynote paper “Synthesis from Examples”, Gulwani.
• Lab
  – Section 4 in WAMBSE 2012 keynote paper.
  – NCERT Online Book Website. http://ncert.nic.in/NCERTS/textbook/textbook.htm
• Lecture 3
  – Sections 1-3 in WAMBSE 2012 keynote paper
Intelligent Tutoring Systems
• Motivation
  – Online learning sites: Khan Academy, edX, Udacity, Coursera
    • Increasing class sizes with even less personal attention
  – New technologies: Tablets/Smartphones, NUI, Cloud
• Various Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry
• Various Domains
  – K-12: Mathematics, Physics, Chemistry
  – Undergraduate: Introductory Programming, Automata Theory
  – Language Learning