Programming by Sketching
Armando Solar-Lezama, Liviu Tancau, Gilad Arnold, Rastislav Bodik, Sanjit Seshia UC Berkeley, Rodric Rabbah MIT, Kemal Ebcioglu, Vijay Saraswat, Vivek Sarkar IBM
Merge sort
looks simple to code, but there is a bug

int[] mergeSort (int[] input, int n) {
  return merge( mergeSort (input[0::n/2]),
                mergeSort (input[n/2+1::n]), n);
}

int[] merge (int[] a, int b[], int n) {
  int j=0, k=0;
  for (int i = 0; i < n; i++) {
    if ( a[j] < b[k] ) {
      result[i] = a[j++];
    } else {
      result[i] = b[k++];
    }
  }
  return result;
}

Merge sort

int[] mergeSort (int[] input, int n) {
  return merge( mergeSort (input[0::n/2]),
                mergeSort (input[n/2+1::n]), n);
}

int[] merge (int[] a, int b[], int n) {
  int j, k;
  for (int i = 0; i < n; i++) {
    if ( j<n && ( !(k<n) || a[j] < b[k]) ) {
      result[i] = a[j++];
    } else {
      result[i] = b[k++];
    }
  }
  return result;
}

The sketching experience
[Diagram: sketch + specification (spec) → implementation (completed sketch)]

The spec: bubble sort

int[] sort (int[] input, int n) {
  for (int i=0; i<n; ++i)
    for (int j=i+1; j<n; ++j)
      if (input[j] < input[i])
        swap(input, j, i);
}

Merge sort: sketched

int[] mergeSort (int[] input, int n) {
  return merge( mergeSort (input[0::n/2]),
                mergeSort (input[n/2+1::n]), n);
}

int[] merge (int[] a, int b[], int n) {
  int j, k;
  for (int i = 0; i < n; i++) {
    if ( expression( ||, &&, <, !, [] ) ) {   // hole
      result[i] = a[j++];
    } else {
      result[i] = b[k++];
    }
  }
  return result;
}

Merge sort: synthesized

int[] mergeSort (int[] input, int n) {
  return merge( mergeSort (input[0::n/2]),
                mergeSort (input[n/2::n]) );
}

int[] merge (int[] a, int b[], int n) {
  int j, k;
  for (int i = 0; i < n; i++) {
    if ( j<n && ( !(k<n) || a[j] < b[k]) ) {
      result[i] = a[j++];
    } else {
      result[i] = b[k++];
    }
  }
  return result;
}

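
The step from the sketched merge to the synthesized one can be imitated in a few lines of Python. This is only a rough sketch of the idea, not the SKETCH synthesizer: the candidate list `CANDIDATES` is a hypothetical, hand-picked stand-in for the expression grammar, bounds use each array's own length (`len(a)`, `len(b)`) instead of `n`, and Python's `sorted` plays the role of the spec. A candidate is accepted only if the merge agrees with the spec on every small pair of sorted inputs and never reads out of bounds.

```python
from itertools import product

# Hypothetical candidates for the hole; each receives (a, b, j, k).
CANDIDATES = {
    "a[j] < b[k]":
        lambda a, b, j, k: a[j] < b[k],
    "j<len(a) && a[j] < b[k]":
        lambda a, b, j, k: j < len(a) and a[j] < b[k],
    "!(k<len(b)) || a[j] < b[k]":
        lambda a, b, j, k: not (k < len(b)) or a[j] < b[k],
    "j<len(a) && (!(k<len(b)) || a[j] < b[k])":
        lambda a, b, j, k: j < len(a) and (not (k < len(b)) or a[j] < b[k]),
}

def merge(a, b, cond):
    """Merge two sorted lists, using cond to decide which side to take from."""
    result, j, k = [], 0, 0
    for _ in range(len(a) + len(b)):
        if cond(a, b, j, k):
            result.append(a[j])
            j += 1
        else:
            result.append(b[k])
            k += 1
    return result

def sorted_lists(max_len=2, values=range(3)):
    """All sorted lists of length <= max_len over a small value range."""
    for n in range(max_len + 1):
        for t in product(values, repeat=n):
            if list(t) == sorted(t):
                yield list(t)

def implements_spec(cond):
    """True if merge(a, b, cond) matches the spec on every bounded input."""
    for a in sorted_lists():
        for b in sorted_lists():
            try:
                if merge(a, b, cond) != sorted(a + b):
                    return False
            except IndexError:      # out-of-bounds read: candidate rejected
                return False
    return True

resolved = [name for name, cond in CANDIDATES.items() if implements_spec(cond)]
print(resolved)
```

Among these four candidates, only the fully guarded condition survives; the unguarded `a[j] < b[k]` is rejected because it reads past the end of an exhausted array.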
Sketching: spec vs. sketch
• Specification
  – executable: easy to debug, serves as a prototype
  – a reference implementation: simple and sequential
  – written by domain experts: crypto, bio, MPEG committee
• Sketched implementation
  – program with holes: filled in by synthesizer
  – programmer sketches strategy: machine provides details
  – written by performance experts: vector wizard; cache guru

How sketching fits into autotuning
• Autotuning: two methods for obtaining code variants
  1. optimizing compiler: transform a “spec” in various ways
  2. custom generator: for a specific algorithm
• We seek to simplify the second approach
• Scenario 1: library of variants stores resolved sketches
  – as if written by hand
• Scenario 2: library has unresolved, flexible sketches
  – sketch works for a variety of specifications: e.g., a class of stencils

SKETCH
• A language with support for sketching-based synthesis
  – like C without pointers
  – two simple synthesis constructs
• restricted to finite programs:
  – input size known at compile time, terminates on all inputs
• most high-performance kernels are finite:
  – matrix multiply: yes
  – binary search tree: no
• we’re already working on relaxing the finiteness restriction
  – later in this talk

Ex1: Isolate rightmost 0-bit. 1010 0111 → 0000 1000

bit[W] isolate0 (bit[W] x) { // W: word size
  bit[W] ret = 0;
  for (int i = 0; i < W; i++)
    if (!x[i]) { ret[i] = 1; break; }
  return ret;
}

bit[W] isolate0Fast (bit[W] x) implements isolate0 {
  return ~x & (x+1);
}

bit[W] isolate0Sketched (bit[W] x) implements isolate0 {
  return ~(x + ??) & (x + ??);
}

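
To make the example concrete, the three functions can be modelled in Python and the two `??` holes resolved by brute force. This is a minimal sketch under stated assumptions: the word size is fixed at `W = 8` so that every input can be checked exhaustively, bit 0 is taken to be the rightmost bit, and `resolve_holes` is a hypothetical helper that simply tries every pair of constants. It is not how the SKETCH synthesizer searches.

```python
W = 8                      # assumed word size; small enough to check all inputs
MASK = (1 << W) - 1

def isolate0(x):
    """Reference spec: a word with only the rightmost 0-bit of x set."""
    for i in range(W):
        if not (x >> i) & 1:
            return 1 << i
    return 0               # x is all ones: there is no 0-bit to isolate

def isolate0_fast(x):
    """The hand-written bit trick, truncated to W bits."""
    return ~x & (x + 1) & MASK

def isolate0_sketched(x, c1, c2):
    """The sketch, with the two ?? holes passed in as c1 and c2."""
    return ~(x + c1) & (x + c2) & MASK

def resolve_holes():
    """All (c1, c2) for which the sketch matches the spec on every input."""
    return [(c1, c2)
            for c1 in range(1 << W)
            for c2 in range(1 << W)
            if all(isolate0_sketched(x, c1, c2) == isolate0(x)
                   for x in range(1 << W))]

print(format(isolate0(0b10100111), "08b"))   # the slide's example input
print((0, 1) in resolve_holes())
```

The pair `(0, 1)` corresponds to `isolate0Fast`; the search confirms that it is a valid resolution of the sketch.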
Programmer’s view of sketches
• the ?? operator is replaced with a suitable constant
• as directed by the implements clause
• the ?? operator introduces non-determinism
• the implements clause constrains it

Beyond synthesis of literals
• Synthesizing values of ?? already very useful
  – parallelization machinery: bitmasks, tables in crypto codes
  – array indices: A[i+??,j+??]
• We can synthesize more than constants
  – semi-permutations: functions that select and shuffle bits
  – polynomials: over one or more variables
  – actually, arbitrary expressions, programs

Synthesizing polynomials

int spec (int x) {
  return 2*x*x*x*x + 3*x*x*x + 7*x*x + 10;
}

int p (int x) implements spec {
  return (x+1)*(x+2)*poly(3,x);
}

int poly(int n, int x) {
  if (n==0) return ??;
  else return x * poly(n-1, x) + ??;
}

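
The recursive `poly` generator is Horner's rule with one hole per level. The Python sketch below illustrates that reading on a smaller, hypothetical instance rather than the slide's own spec: `spec_demo`, the factor `(x+1)`, the coefficient range `0..7`, and the `holes` tuple (which stands in for the distinct `??` reached at each recursion depth) are all assumptions made for illustration.

```python
from itertools import product

def poly(n, x, holes):
    """Horner-form polynomial of degree n; holes[d] models the ?? at depth d."""
    if n == 0:
        return holes[0]
    return x * poly(n - 1, x, holes) + holes[n]

def spec_demo(x):
    """Hypothetical spec: the expansion of (x+1)*(2x^2 + 3x + 7)."""
    return 2*x*x*x + 5*x*x + 10*x + 7

def p_demo(x, holes):
    """Hypothetical sketch: a fixed factor times a degree-2 polynomial with holes."""
    return (x + 1) * poly(2, x, holes)

def resolve(max_coeff=8, inputs=range(-5, 6)):
    """Try every coefficient triple; keep those matching the spec on all inputs."""
    return [holes
            for holes in product(range(max_coeff), repeat=3)
            if all(p_demo(x, holes) == spec_demo(x) for x in inputs)]

print(resolve())
```

With these assumptions the only surviving triple is `(2, 3, 7)`, i.e. the quotient polynomial 2x² + 3x + 7.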
Karatsuba’s multiplication

x = x1*b + x0    y = y1*b + y0    b = 2^k

x*y = b^2*x1*y1 + b*(x1*y0 + x0*y1) + x0*y0

x*y = poly(??,b) * x1*y1
    + poly(??,b) * poly(1,x1,x0,y1,y0) * poly(1,x1,x0,y1,y0)
    + poly(??,b) * x0*y0

x*y = (b^2 + b) * x1*y1
    - b * (x1 - x0)*(y1 - y0)
    + (b + 1) * x0*y0

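
The resolved three-multiplication form can be checked numerically. Note that for the identity to hold, the middle term must enter with a minus sign: x*y = (b^2 + b)*x1*y1 - b*(x1 - x0)*(y1 - y0) + (b + 1)*x0*y0. The function name `karatsuba_step` and the choice of test values are illustrative assumptions.

```python
import random

def karatsuba_step(x, y, k):
    """One level of Karatsuba with base b = 2**k: three multiplications."""
    b = 1 << k
    x1, x0 = divmod(x, b)          # x = x1*b + x0
    y1, y0 = divmod(y, b)          # y = y1*b + y0
    t11 = x1 * y1
    t12 = (x1 - x0) * (y1 - y0)
    t22 = x0 * y0
    return (b * b + b) * t11 - b * t12 + (b + 1) * t22

rng = random.Random(0)
for _ in range(1000):
    x, y = rng.randrange(1 << 32), rng.randrange(1 << 32)
    assert karatsuba_step(x, y, 16) == x * y
```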
Sketch of Karatsuba

bit[N*2] k<int N>(bit[N] x, bit[N] y) implements mult {
  if (N<=1) return x*y;
  bit[N/2] x1 = x[0:N/2-1];
  bit[N/2+1] x2 = x[N/2:N-1];
  bit[N/2] y1 = y[0:N/2-1];
  bit[N/2+1] y2 = y[N/2:N-1];
  bit[2*N] t11 = x1 * y1;
  bit[2*N] t12 = poly(1, x1, x2, y1, y2) * poly(1, x1, x2, y1, y2);
  bit[2*N] t22 = x2 * y2;
  return multPolySparse<2*N>(2, N/2, t11) // log b = N/2
       + multPolySparse<2*N>(2, N/2, t12)
       + multPolySparse<2*N>(2, N/2, t22);
}

bit[2*N] poly<int N>(int n, bit[N] x0, x1, x2, x3) {
  if (n<=0) return ??;
  else return (??*x0 + ??*x1 + ??*x2 + ??*x3) * poly<N>(n-1, x0, x1, x2, x3);
}

bit[2*N] multPolySparse<int N>(int n, int x, bit[N] y) {
  if (n<=0) return 0;
  else return y << x*?? + multPolySparse<N>(n-1, x, y);
}

Semantic view of sketches
• a sketch represents a set of functions:
  – the ?? operator modeled as reading from an oracle

Sketch:

int f (int y) {
  x = ??;
  loop (x) {
    y = y + ??;
  }
  return y;
}

Oracle model:

int f (int y, bit[][K] oracle) {
  x = oracle[0];
  loop (x) {
    y = y + oracle[1];
  }
  return y;
}

Synthesizer must find oracle satisfying f implements g
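
The oracle view translates directly into Python: the sketch becomes an ordinary function with one extra argument, and each oracle value picks out one member of the set of functions the sketch represents. In this minimal sketch the spec `g` (add 6), the constant range `0..7`, and the helper `find_oracles` are hypothetical choices made for illustration.

```python
def f(y, oracle):
    """The sketch with its two ?? holes replaced by reads from the oracle."""
    x = oracle[0]
    for _ in range(x):          # loop (x) { ... }
        y = y + oracle[1]
    return y

def g(y):
    """Hypothetical spec for illustration: add 6."""
    return y + 6

def find_oracles(max_const=8, inputs=range(-4, 5)):
    """Every oracle (c0, c1) under which f agrees with g on the test inputs."""
    return [(c0, c1)
            for c0 in range(max_const)
            for c1 in range(max_const)
            if all(f(y, (c0, c1)) == g(y) for y in inputs)]

print(find_oracles())
```

Several oracles satisfy this spec (any pair whose product is 6), which shows that a sketch may have more than one valid resolution.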
Synthesis algorithm: overview
1. translation: represent spec and sketch as circuits
2. synthesis: find suitable oracle
3. code generation: specialize sketch wrt oracle

Ex: Population count. 0010 0110 → 3

int pop (bit[W] x) {
  int count = 0;
  for (int i = 0; i < W; i++) {
    if (x[i]) count++;
  }
  return count;
}

[Figure: circuit F(x) for pop. The inputs x, count = 0000 and one = 0001 feed a chain of + and mux stages, each producing the next value of count.]
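
One way to read the circuit is as the loop fully unrolled, with the `if` in each iteration turned into a multiplexer that selects between `count + 1` and `count`. The Python sketch below illustrates that reading only; the word size `W = 8` and the helper `mux` are assumptions, and the actual translation performed by the SKETCH compiler is not shown in the slides.

```python
W = 8   # assumed word size

def pop(x):
    """The spec: count the 1-bits of x with a loop and a branch."""
    count = 0
    for i in range(W):
        if (x >> i) & 1:
            count += 1
    return count

def mux(sel, if_one, if_zero):
    """Two-input multiplexer: pick if_one when sel is 1, else if_zero."""
    return if_one if sel else if_zero

def pop_circuit(x):
    """Branch-free form: each iteration is one adder + mux stage of the chain."""
    count = 0
    for i in range(W):
        count = mux((x >> i) & 1, count + 1, count)
    return count

print(pop(0b00100110), pop_circuit(0b00100110))
```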
Synthesis as generalized SAT
• The sketch synthesis problem is an instance of 2QBF:
    ∃o . ∀x . P(x) = S(x,o)
• Counter-example driven solver:

I = {}
x = random()
do
  I = I ∪ {x}
  c = synthesizeForSomeInputs(I)
  if c = nil then exit("buggy sketch")
  x = verifyForAllInputs(c) // x: counter-example
while x != nil
return c

• synthesizeForSomeInputs(I), with I = { x1, x2, …, xk }: find c such that S(x1, c)=P(x1) ∧ … ∧ S(xk, c)=P(xk)
• verifyForAllInputs(c): find x such that S(x, c) ≠ P(x)
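
The counter-example driven loop can be tried out on the isolate-rightmost-0 sketch. The version below is a toy sketch under stated assumptions, not the real solver: both the synthesis and the verification step are done by plain enumeration where SKETCH would use a SAT-based search, the word size is fixed at `W = 8`, and all function names are illustrative.

```python
import random

W = 8
MASK = (1 << W) - 1

def spec(x):
    """P(x): a word with only the rightmost 0-bit of x set."""
    for i in range(W):
        if not (x >> i) & 1:
            return 1 << i
    return 0

def sketch(x, c):
    """S(x, c): ~(x + ??) & (x + ??), with the two holes given by c."""
    c1, c2 = c
    return ~(x + c1) & (x + c2) & MASK

CANDIDATES = [(c1, c2) for c1 in range(1 << W) for c2 in range(1 << W)]

def synthesize_for_some_inputs(inputs):
    """Return some c with S(x, c) == P(x) for every x in inputs, or None."""
    for c in CANDIDATES:
        if all(sketch(x, c) == spec(x) for x in inputs):
            return c
    return None

def verify_for_all_inputs(c):
    """Return a counter-example x with S(x, c) != P(x), or None if c is correct."""
    for x in range(1 << W):
        if sketch(x, c) != spec(x):
            return x
    return None

def cegis(seed=0):
    """Run the loop; return the resolved holes and the number of inputs used."""
    rng = random.Random(seed)
    inputs = set()
    x = rng.randrange(1 << W)
    while x is not None:
        inputs.add(x)
        c = synthesize_for_some_inputs(inputs)
        if c is None:
            raise ValueError("buggy sketch")
        x = verify_for_all_inputs(c)
    return c, len(inputs)

c, rounds = cegis()
print(c, rounds)
```

Each counter-example is an input the current candidate gets wrong, so it cannot already be in `I`; the set grows by one per round and the loop terminates after at most 2^W rounds on this finite domain.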
Case study
• Implemented AES
  – the modern block-cipher standard
  – 14 rounds: each has table lookup, permutation, GF-multiply
  – a good implementation collapses each round into table lookups
• Our results
  – we synthesized a 32 Kbit oracle!
  – synthesis time: about 1 hour
  – counterexample-driven synthesizer iterated 655 times
  – performance of synthesized code within 10% of hand-tuned

Finite programs
• In theory, SKETCH is complete for all finite programs:
  – specification can specify any finite program
  – sketch can describe any implementation over given instructions
  – synthesizer can resolve any sketch
• In practice, SKETCH scales for small finite programs
  – small finite programs: block ciphers, small kernels
  – large finite: big-integer multiplication, matrix multiplication
• Solution:
  – synthesize for a small input size
  – prove (or examine) that result of synthesis works for bigger inputs

Lossless abstraction
• Problem
  – does result of synthesis for a small matrix work for all matrices?
• Approach
  – spec, sketch have unbounded-input/output
  – abstract them into finite functions, with the same abstraction
  – synthesize
  – obtained oracle works for original sketch
• Stencil kernels
  – concrete: matrix A[N] → matrix B[N]
  – abstract: A[e(i)], i, N → B[i]

Example: divide and conquer parallelization
• Parallel algorithm:
  – Data rearrangement + parallel computation
• spec:
  – sequential version of the program
• sketch:
  – parallel computation
• automatically synthesized:
  – Rearranging the data (dividing the data structure)