Learning Programs:
A Hierarchical Bayesian Approach
ICML - Haifa, Israel June 24, 2010
Percy Liang Michael I. Jordan Dan Klein
Motivating Application: Repetitive Text Editing
I like programs, but I wish programs would just program themselves since I don't like programming.
⇒
I like <i>programs</i>, but I wish <i>programs</i> would just <i>program</i> themselves since I don't like <i>programming</i>.
Goal: Programming by Demonstration
If the user demonstrates italicizing the first occurrence, can we generalize to the remaining occurrences?
Solution: represent the task by a program to be learned
1. Move to next occurrence of word with prefix program
2. Insert <i>
3. Move to end of word
4. Insert </i>
Challenge: learn from very few examples
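The four-step editing program above can be written out by hand to make the target of learning concrete. The sketch below is only an illustration: the function name and the use of Python's `re` module are assumptions, and the talk's system learns such a program from demonstrations rather than having it written manually.

```python
import re

def italicize_words_with_prefix(text: str, prefix: str = "program") -> str:
    """Wrap every word that starts with `prefix` in <i>...</i>."""
    word_start = re.compile(r"\b" + re.escape(prefix))
    pieces = []
    cursor = 0
    while True:
        match = word_start.search(text, cursor)  # 1. move to next word with the prefix
        if match is None:
            break
        start = match.start()
        end = start
        while end < len(text) and text[end].isalnum():  # 3. move to end of word
            end += 1
        pieces.append(text[cursor:start])
        pieces.append("<i>")             # 2. insert <i>
        pieces.append(text[start:end])
        pieces.append("</i>")            # 4. insert </i>
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)

sentence = ("I like programs, but I wish programs would just program "
            "themselves since I don't like programming.")
print(italicize_words_with_prefix(sentence))
```

Each loop iteration is one application of the four steps, so a single demonstration corresponds to a single iteration.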
General Setup
Goal:
Training data (X1, Y1), · · · , (Xn, Yn) ⇒ consistent program Z such that (Z Xj) = Yj for every j
Challenge:
When n is small, many programs are consistent with the training data.
I like <i>programs</i>, but I wish programs would just program themselves since I don't like programming.
Move to beginning of third word, ...
Move to beginning of word after like, ...
Move 7 spaces to the right, ...
Move to word with prefix program, ...
· · ·
Which program to choose?
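The ambiguity can be checked directly. In the sketch below, each candidate "move" program maps a text to a cursor position. All four agree on the single demonstration but disagree on a new sentence. The function names and the second sentence are made up for illustration.

```python
import re

def third_word(text):
    # Start index of the third whitespace-separated token.
    return [m.start() for m in re.finditer(r"\S+", text)][2]

def word_after_like(text):
    # Index just past the first "like" and the whitespace after it.
    return re.search(r"\blike\s+", text).end()

def seven_right(text):
    # Always move 7 characters from the start.
    return 7

def prefix_program(text):
    # Start index of the first word beginning with "program".
    return re.search(r"\bprogram", text).start()

CANDIDATES = [third_word, word_after_like, seven_right, prefix_program]

def consistent(program, examples):
    """True if (Z Xj) = Yj for every training pair."""
    return all(program(x) == y for x, y in examples)

demo = "I like programs, but I wish programs would just program themselves."
examples = [(demo, 7)]            # the user put the cursor at index 7
new_text = "We really like programs"

print([consistent(z, examples) for z in CANDIDATES])
print({z.__name__: z(new_text) for z in CANDIDATES})
```

With one example, consistency alone cannot separate the candidates, which is why a prior over programs is needed.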
Key Intuition
One task:
Want to choose a program which is simple (Occam’s razor).
Examples ⇒ Z
What’s the right complexity metric (prior)? No general answer.
Multiple tasks:
Task 1 examples ⇒ Z1
· · ·
Task K examples ⇒ ZK
Find programs that share common subprograms.
• Programs do tend to share common components.
• Penalize joint complexity of all K programs.
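Outline of Proposed Solution
Program representation: What are subprograms?
Combinatory logic [figure: combinator tree with leaves +, ∗, I, I, 1 and labels B, S, B, C]
Probabilistic model: Which programs are favorable?
Nonparametric Bayes [figure: graphical model over α0, p0, {Gt}, Zi, Yij, Xij with plates n and K]
Statistical inference: How do we search for good programs?
MCMC [figure: a tree transformation that relabels nodes r0, r1 as r′0, r′1]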
Representation: What Language?
Goal: allow sharing of subprograms
Our language:
Combinatory logic [Schönfinkel, 1924]
+ higher-order combinators (new)
+ routing intuition, visual representation (new)
Properties: no mutation, no variables ⇒ simple semantics
Result:
• Programs are trees
• Subprograms are subtrees
Programs with No Arguments
Example: compute min(3, 4)
(if (< 3 4) 3 4) ⇒ (if true 3 4) ⇒ 3
General:
(x y) ⇒ result of applying function x to argument y
Arguments are curried
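A minimal sketch of this evaluation rule, assuming programs are stored as nested pairs `(x, y)` meaning "apply x to y" and primitives are curried Python functions. The names `evaluate`, `if_` and `lt` are illustrative, not taken from the talk.

```python
def evaluate(tree):
    """Leaves evaluate to themselves; a pair (x, y) applies x to y."""
    if isinstance(tree, tuple):
        function, argument = tree
        return evaluate(function)(evaluate(argument))
    return tree

# Curried primitives: each takes one argument at a time.
if_ = lambda cond: lambda then: lambda other: then if cond else other
lt = lambda a: lambda b: a < b

# (if (< 3 4) 3 4) written as left-nested binary applications.
program = (((if_, ((lt, 3), 4)), 3), 4)
print(evaluate(program))
```

Because every application is binary, any subtree is itself a well-formed program, which is the property the later slides rely on.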
Programs with One Argument
Example: x ↦ x² + 1
Lambda calculus: λx . (+ (∗ x x) 1)
Combinatory logic: (C (B + (S (B ∗ I) I)) 1)
Intuition:
Combinators {B, C, S, I} encode placement of arguments
Semantics:
A tree node with children x, y and label r ⇔ (r x y), where r ∈ {B, C, S, I}
Rules: (B f g x) = (f (g x)), ...
Programs with One Argument
Example: Apply x ↦ x² + 1 to 5
Routing rules:
(C x y z) ⇔ ((x z) y)   route left
(B x y z) ⇔ (x (y z))   route right
(S x y z) ⇔ ((x z) (y z))   route left and right
(I x) ⇔ x   stop
Reduction:
((C (B + (S (B ∗ I) I)) 1) 5)
⇒ ((B + (S (B ∗ I) I) 5) 1)   [C routes 5 left]
⇒ (+ ((S (B ∗ I) I) 5) 1)   [B routes 5 right]
⇒ (+ ((B ∗ I 5) (I 5)) 1)   [S routes 5 left and right]
⇒ (+ ((B ∗ I 5) 5) 1)   [I stops]
⇒ (+ (∗ 5 5) 1)   [B routes 5 right, then I stops]
= 26
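The four routing rules translate directly into curried functions. The sketch below uses Python lambdas as a stand-in for the talk's tree representation. The names `add`, `mul` and `square_plus_one` are illustrative.

```python
# Routing combinators as curried functions.
B = lambda x: lambda y: lambda z: x(y(z))      # route z right
C = lambda x: lambda y: lambda z: x(z)(y)      # route z left
S = lambda x: lambda y: lambda z: x(z)(y(z))   # route z left and right
I = lambda x: x                                # stop

# Curried arithmetic primitives.
add = lambda a: lambda b: a + b
mul = lambda a: lambda b: a * b

# (C (B + (S (B * I) I)) 1), i.e. x -> x*x + 1 without any variable.
square_plus_one = C(B(add)(S(B(mul)(I))(I)))(1)

print(square_plus_one(5))
```

Reading the expression from the outside in mirrors the reduction trace: C sends the argument into the left subtree, B sends it past `+`, and S duplicates it for the two inputs of `∗`.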
Programs with Multiple Arguments
Example: (x, y) ↦ min(x, y)
Classical: first-order combinators {B, C, S, I}
Complete basis, so can implement min, but cumbersome
New: higher-order combinators {B, C, S, I}∗
Infinite basis, but resulting programs are more intuitive
e.g., CS routes 1st arg. left, 2nd arg. left and right
min as a tree: (CS (SC (BB if <) I) I)
Reduction on arguments 3 and 4:
((CS (SC (BB if <) I) I) 3 4)
⇒ ((S (C (B if (< 3)) 3) I) 4)   [first argument routed]
⇒ (if (< 3 4) 3 4)   [second argument routed]
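One way to read a higher-order combinator is as a routing string that is consumed one letter per incoming argument. The sketch below implements that reading for strings over B, C and S only. How I behaves inside a longer string is not spelled out on the slide, so I appears here only as the identity leaf. The helper name `node` is hypothetical.

```python
def node(label, x, y):
    """Application node with children x, y and routing string `label`.

    Each letter routes the next incoming argument:
    'C' = left child only, 'B' = right child only, 'S' = both children.
    An empty label is plain application of x to y.
    """
    if not label:
        return x(y)
    head, rest = label[0], label[1:]

    def take(z):
        if head == "C":
            return node(rest, x(z), y)
        if head == "B":
            return node(rest, x, y(z))
        if head == "S":
            return node(rest, x(z), y(z))
        raise ValueError(f"unsupported routing letter: {head!r}")

    return take

I = lambda a: a
if_ = lambda cond: lambda then: lambda other: then if cond else other
lt = lambda a: lambda b: a < b

# (CS (SC (BB if <) I) I)
minimum = node("CS", node("SC", node("BB", if_, lt), I), I)
print(minimum(3)(4))
```

With a one-letter label, `node` reduces to the first-order rules, so the same helper covers both the classical and the extended basis.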
Using Combinators for Refactoring
min: (CS (SC (BB if <) I) I)
max: (CS (SC (BB if >) I) I)
No significant sharing of subtrees (subprograms)
Refactored:
min: ((CCS (CSC (BBB if I) I) I) <)
max: ((CCS (CSC (BBB if I) I) I) >)
Fruitful sharing of subtrees (subprograms)
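The difference in sharing can be counted mechanically. The sketch below stores each tree as a nested tuple `(label, left, right)` and intersects the sets of application nodes of two trees. The tuple encoding and the empty label for the outermost plain application are assumptions made for illustration.

```python
def internal_subtrees(tree):
    """Yield every application node (label, left, right) inside `tree`."""
    if isinstance(tree, tuple):
        yield tree
        _, left, right = tree
        yield from internal_subtrees(left)
        yield from internal_subtrees(right)

def shared(a, b):
    """Application nodes that occur in both trees."""
    return set(internal_subtrees(a)) & set(internal_subtrees(b))

# Direct encodings: the comparison operator sits at the bottom.
min_direct = ("CS", ("SC", ("BB", "if", "<"), "I"), "I")
max_direct = ("CS", ("SC", ("BB", "if", ">"), "I"), "I")

# Refactored encodings: the comparison operator is the last argument.
core = ("CCS", ("CSC", ("BBB", "if", "I"), "I"), "I")
min_refactored = ("", core, "<")
max_refactored = ("", core, ">")

print(len(shared(min_direct, max_direct)))
print(len(shared(min_refactored, max_refactored)))
```

In the direct form the differing leaf sits under every application node, so no node matches. After refactoring, the whole `core` subtree and its descendants are common to both programs.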
Summary
Introduced new combinatory logic basis (intuition: routing)
Purpose of these combinators:
• Represent multi-argument functions
• Allow refactoring to expose common substructures
Achieved uniformity: Every subtree is a subprogram
Probabilistic Context-Free Grammars
GenIndep(t): [returns a combinator of type t]
With probability λ0:
    Return a random primitive combinator (e.g., +, 3, I)
Else:
    Choose a type s
    x ← GenIndep(s → t)
    y ← GenIndep(s)
    Return (x, y)
Examples:
GenIndep(int → int) ⇒ (+ 1)
GenIndep(int → int) ⇒ (∗ (− 3 1))
Problem: No encouragement to share subprograms
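A runnable sketch of GenIndep under strong simplifying assumptions: the only argument type ever chosen for s is int, the primitive library is a tiny made-up table, and a depth cap forces termination. None of these choices come from the talk. They only make the recursion concrete.

```python
import random

INT = "int"
INT_TO_INT = (INT, INT)                 # int -> int
INT_TO_INT_TO_INT = (INT, INT_TO_INT)   # int -> int -> int

# Hypothetical primitive library, indexed by type.
PRIMITIVES = {
    INT: ["1", "2", "3"],
    INT_TO_INT: ["I"],
    INT_TO_INT_TO_INT: ["+", "*", "-"],
}

VALUES = {
    "1": 1, "2": 2, "3": 3,
    "I": lambda a: a,
    "+": lambda a: lambda b: a + b,
    "*": lambda a: lambda b: a * b,
    "-": lambda a: lambda b: a - b,
}

def gen_indep(t, rng, lam0=0.6, depth=0, max_depth=8):
    """Sample a combinator tree of type t; a pair (x, y) means x applied to y."""
    function_type = (INT, t)  # assumption: s is always int
    can_expand = function_type in PRIMITIVES and depth < max_depth
    if not can_expand or rng.random() < lam0:
        return rng.choice(PRIMITIVES[t])
    x = gen_indep(function_type, rng, lam0, depth + 1, max_depth)
    y = gen_indep(INT, rng, lam0, depth + 1, max_depth)
    return (x, y)

def evaluate(tree):
    """Interpret a sampled tree with the VALUES table."""
    if isinstance(tree, tuple):
        x, y = tree
        return evaluate(x)(evaluate(y))
    return VALUES[tree]

tree = gen_indep(INT_TO_INT, random.Random(0))
print(tree, evaluate(tree)(5))
```

Every call draws its subtrees afresh, so a subprogram that was useful once is no more likely to be generated again. This is the lack of sharing that the slide points out.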
Adaptor Grammars [Johnson, 2007]
$C_t$ ← [ ] for each type t [cached list of combinators]
(notation: return∗ c adds c to $C_t$ and returns c)
GenCache(t): [returns a combinator of type t]
With probability $\frac{\alpha_0 + N_t d}{\alpha_0 + |C_t|}$:
    With probability λ0:
        Return∗ a random primitive combinator (e.g., +, 3, I)
    Else:
        Choose a type s
        x ← GenCache(s → t)
        y ← GenCache(s)
        Return∗ (x, y)
Else:
    Return∗ z ∈ $C_t$ with probability $\frac{M_z - d}{|C_t| - N_t d}$
Interpretation of cache $C_t$: library of generally useful (unnamed) subroutines which are reused.
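A sketch of GenCache, reading N_t as the number of distinct combinators in the cache C_t and M_z as the number of times z occurs in it. The slide does not define these symbols, but this reading makes the reuse probabilities sum to one. The type table, the fixed choice s = int, the depth cap and all parameter values are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict

INT = "int"
INT_TO_INT = (INT, INT)
INT_TO_INT_TO_INT = (INT, INT_TO_INT)
PRIMITIVES = {  # hypothetical primitive library
    INT: ["1", "2", "3"],
    INT_TO_INT: ["I"],
    INT_TO_INT_TO_INT: ["+", "*", "-"],
}

class CachedGenerator:
    def __init__(self, alpha0=1.0, d=0.5, lam0=0.6, seed=0, max_depth=8):
        self.alpha0, self.d, self.lam0 = alpha0, d, lam0
        self.max_depth = max_depth
        self.rng = random.Random(seed)
        self.cache = defaultdict(list)  # C_t for each type t

    def gen_cache(self, t, depth=0):
        cached = self.cache[t]
        counts = Counter(cached)        # M_z for each distinct z
        p_new = (self.alpha0 + len(counts) * self.d) / (self.alpha0 + len(cached))
        if self.rng.random() < p_new:
            c = self._gen_new(t, depth)
        else:
            items = list(counts)
            weights = [counts[z] - self.d for z in items]  # proportional to M_z - d
            c = self.rng.choices(items, weights=weights)[0]
        cached.append(c)                # "return*": record c in C_t
        return c

    def _gen_new(self, t, depth):
        function_type = (INT, t)        # assumption: s is always int
        can_expand = function_type in PRIMITIVES and depth < self.max_depth
        if not can_expand or self.rng.random() < self.lam0:
            return self.rng.choice(PRIMITIVES[t])
        x = self.gen_cache(function_type, depth + 1)
        y = self.gen_cache(INT, depth + 1)
        return (x, y)

generator = CachedGenerator(seed=0)
samples = [generator.gen_cache(INT_TO_INT) for _ in range(200)]
print(Counter(samples).most_common(3))
```

Frequently returned combinators gain weight in the cache, so later draws tend to reuse them, including as subtrees of larger programs.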
Inference via MCMC
User provides a tree structure that encodes a set of programs U
Objective: sample from the posterior given that the program is in U
Use Metropolis-Hastings
Proposal: sample a random program transformation
Program transformations maintain the invariant that the program is correct (likelihood is 1)
Two types of transformations:
1. Switching
2. Refactoring
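Because every proposed program stays consistent with the data, the likelihood cancels and the Metropolis-Hastings acceptance ratio depends only on the prior (and on the proposal, if it is asymmetric). The sketch below shows this on a toy state space of three consistent programs. The program names, their sizes, the 2^(-size) prior and the uniform symmetric proposal are all hypothetical stand-ins for the talk's adaptor-grammar prior and tree transformations.

```python
import math
import random
from collections import Counter

# Hypothetical consistent programs with made-up sizes.
SIZES = {"program_a": 3, "program_b": 4, "program_c": 6}

def log_prior(program):
    # Toy prior: probability proportional to 2 ** (-size).
    return -SIZES[program] * math.log(2.0)

def propose(program, rng):
    # Symmetric proposal: uniform over the other programs.
    others = [p for p in SIZES if p != program]
    return rng.choice(others)

def metropolis_hastings(initial, steps, rng):
    current = initial
    visits = Counter()
    for _ in range(steps):
        candidate = propose(current, rng)
        # Likelihood is 1 for both programs, so only the prior ratio remains.
        log_ratio = log_prior(candidate) - log_prior(current)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = candidate
        visits[current] += 1
    return visits

visits = metropolis_hastings("program_c", 20000, random.Random(0))
total = sum(visits.values())
print({p: round(n / total, 3) for p, n in visits.items()})
```

The chain spends most of its time on the smallest program, which is the behaviour a simplicity prior is meant to produce.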
Program transformations (MCMC moves)
Switching: Change content, preserve empirical semantics
Data: {(2, 8)}
(S ∗ (+ 2)) [x ↦ x(x + 2)] ⇔ (S ∗ (S ∗ I)) [x ↦ x³]
Purpose: change generalization
Refactoring: Change form, preserve total semantics
(S ∗ (+ 2)) [x ↦ x(x + 2)] ⇔ (BS ∗ + 2) [x ↦ x(x + 2)]
Purpose: expose different subprograms for sharing
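The two kinds of move can be checked numerically. In the sketch below the combinators are curried Python functions, and `BS` is written out by hand as "first argument right, second argument left and right". The variable names are illustrative.

```python
S = lambda x: lambda y: lambda z: x(z)(y(z))              # route left and right
I = lambda x: x                                           # stop
BS = lambda x: lambda y: lambda a: lambda b: x(b)(y(a)(b))  # a right, b both

add = lambda a: lambda b: a + b
mul = lambda a: lambda b: a * b

original = S(mul)(add(2))        # (S * (+ 2)):    x -> x * (x + 2)
switched = S(mul)(S(mul)(I))     # (S * (S * I)):  x -> x ** 3
refactored = BS(mul)(add)(2)     # (BS * + 2):     x -> x * (x + 2)

data = [(2, 8)]
print([original(x) == y and switched(x) == y for x, y in data])
print(original(3), switched(3), refactored(3))
```

Switching keeps agreement only on the observed pair (2, 8) and changes the prediction elsewhere. Refactoring leaves the function unchanged on every input but turns `+` and `2` into separate subtrees.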
Text Editing Experiments
Setup:
Dataset of [Lau et al., 2003]
K = 24 tasks
Each task: train on 2–5 examples, test on ≈ 13 examples
10 random trials
Example task:
Cardinals 5, Pirates 2.
⇒
GameScore[ winner 'Cardinals'; loser 'Pirates'; scores [5, 2]].
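For concreteness, the example task can be written as a hand-coded transformation. This is not how the system in the talk represents or learns it. The regular expression, the function name and the assumption that the winner is always listed first are illustrative only.

```python
import re

def game_score(line: str) -> str:
    """Rewrite 'Team A, Team B.' score lines into the GameScore format."""
    match = re.fullmatch(r"(\w+) (\d+), (\w+) (\d+)\.", line)
    if match is None:
        raise ValueError(f"unrecognized score line: {line!r}")
    winner, winner_score, loser, loser_score = match.groups()
    return (f"GameScore[ winner '{winner}'; loser '{loser}'; "
            f"scores [{winner_score}, {loser_score}]].")

print(game_score("Cardinals 5, Pirates 2."))
```

A learner sees only a few input and output pairs like this one and has to recover an equivalent program.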
Experimental Results
[Plot: error (y-axis, ticks 5 to 25) against x-axis ticks 2 to 5, presumably the number of training examples per task; curves: Uniform prior, Independent prior, Joint prior]
Observations:
• Independent prior is even worse than uniform prior
• Joint prior (multi-task learning) is effective
Summary
X ⇒ program ⇒ Y
Key challenge: learn programs from few examples
Main idea: share subprograms across multiple tasks
Tools:
• Combinatory logic: expose subprograms to be shared
• Adaptor grammars: encourage sharing of subprograms
• Metropolis-Hastings: proposals are program transformations