Upload
kaylee-cockrill
View
222
Download
1
Embed Size (px)
Citation preview
Overview
• What is Dynamic Programming?
• A Sequence of 4 Steps
• Longest Common Subsequence
• Elements of Dynamic Programming
S-1
What is Dynamic Programming?
• Dynamic programming solves optimization problems by combining solutions to sub-problems
• “Programming” refers to a tabular method with a series of choices, not “coding”
S-2
What is Dynamic Programming? …contd
• A set of choices must be made to arrive at an optimal solution
• As choices are made, subproblems of the same form arise frequently
• The key is to store the solutions of subproblems to be reused in the future
S-3
What is Dynamic Programming? …contd
• Recall the divide-and-conquer approach– Partition the problem into independent
subproblems– Solve the subproblems recursively– Combine solutions of subproblems
• This contrasts with the dynamic programming approach
S-4
What is Dynamic Programming? …contd
• Dynamic programming is applicable when subproblems are not independent
– i.e., subproblems share subsubproblems– Solve every subsubproblem only once and
store the answer for use when it reappears
• A divide-and-conquer approach will do more work than necessary
S-5
A Sequence of 4 Steps
• A dynamic programming approach consists of a sequence of 4 steps
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution in a bottom-up fashion
4. Construct an optimal solution from computed information
S-6
Longest Common Subsequence (LCS)
Application: comparison of two DNA stringsEx: X= {A B C B D A B }, Y= {B D C A B A}
Longest Common Subsequence:
X = A B C B D A B
Y = B D C A B A
Brute force algorithm would compare each subsequence of X with the symbols in Y
S-7
LCS Algorithm• if |X| = m, |Y| = n, then there are 2m
subsequences of x; we must compare each with Y (n comparisons)
• So the running time of the brute-force algorithm is O(n 2m)
• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes of X and Y”
S-8
LCS Algorithm• First we’ll find the length of LCS. Later we’ll
modify the algorithm to find LCS itself.
• Define Xi, Yj to be the prefixes of X and Y of length i and j respectively
• Define c[i,j] to be the length of LCS of Xi and Yj
• Then the length of LCS of X and Y will be c[m,n]
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
S-9
LCS recursive solution
• We start with i = j = 0 (empty substrings of x
and y)
• Since X0 and Y0 are empty strings, their LCS
is always empty (i.e. c[0,0] = 0)
• LCS of empty string and any other string is
empty, so for every i and j: c[0, j] = c[i,0] = 0
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
S-10
LCS recursive solution
• When we calculate c[i,j], we consider two
cases:
• First case: x[i]=y[j]: one more symbol in
strings X and Y matches, so the length of LCS
Xi and Yj equals to the length of LCS of
smaller strings Xi-1 and Yi-1 , plus 1
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
S-11
LCS recursive solution
• Second case: x[i] != y[j]
• As symbols don’t match, our solution is not
improved, and the length of LCS(Xi , Yj) is the
same as before (i.e. maximum of LCS(Xi, Yj-1)
and LCS(Xi-1,Yj)
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
S-12
LCS Length AlgorithmLCS-Length(X, Y)1. m = length(X) 2. n = length(Y)
3. for i = 1 to m c[i,0] = 0 // special case: Y0
4. for j = 1 to n c[0,j] = 0 // special case: X0
5. for i = 1 to m // for all Xi
6. for j = 1 to n // for all Yj
7. if ( Xi == Yj )8. c[i,j] = c[i-1,j-1] + 19. else c[i,j] = max( c[i-1,j], c[i,j-1] )10. return c S-13
LCS ExampleWe’ll see how LCS algorithm works on the
following example:
• X = ABCB
• Y = BDCAB
LCS(X, Y) = BCBX = A B C BY = B D C A B
What is the Longest Common Subsequence of X and Y?
S-14
LCS Example (0)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
X = ABCB; m = |X| = 4Y = BDCAB; n = |Y| = 5Allocate array c[4,5]
ABCBBDCAB
S-15
LCS Example (1)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
for i = 1 to m c[i,0] = 0 for j = 1 to n c[0,j] = 0
ABCBBDCAB
S-16
LCS Example (2)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
0
ABCBBDCAB
S-17
LCS Example (3)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
0 0 0
ABCBBDCAB
S-18
LCS Example (4)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
0 0 0 1
ABCBBDCAB
S-19
LCS Example (5)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
000 1 1
ABCBBDCAB
S-20
LCS Example (6)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
0 0 10 1
1
ABCBBDCAB
S-21
LCS Example (7)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 1 11
ABCBBDCAB
S-22
LCS Example (8)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 1 1 1 2
ABCBBDCAB
S-23
LCS Example (10)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
21 1 11
1 1
ABCBBDCAB
S-24
LCS Example (11)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 21 11
1 1 2
ABCBBDCAB
S-25
LCS Example (12)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 21 1
1 1 2
1
22
ABCBBDCAB
S-26
LCS Example (13)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 21 1
1 1 2
1
22
1
ABCBBDCAB
S-27
LCS Example (14)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj )c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 21 1
1 1 2
1
22
1 1 2 2
ABCBBDCAB
S-28
LCS Example (15)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
B
Yj BB ACD
0
0
00000
0
0
0
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] )
1000 1
1 21 1
1 1 2
1
22
1 1 2 2 3
ABCBBDCAB
S-29
LCS Algorithm Running Time
• LCS algorithm calculates the values of each entry of the array c[m,n]
• So what is the running time?
O(m*n)
since each c[i,j] is calculated in constant time, and there are m*n elements in the array
S-30
How to find actual LCS• So far, we have just found the length of LCS,
but not LCS itself.
• We want to modify this algorithm to make it output Longest Common Subsequence of X and Y
Each c[i,j] depends on c[i-1,j] and c[i,j-1]
or c[i-1, j-1]
For each c[i,j] we can say how it was acquired:
2
2 3
2 For example, here c[i,j] = c[i-1,j-1] +1 = 2+1=3
S-31
How to find actual LCS - continued
• Remember that
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
So we can start from c[m,n] and go backwards
Whenever c[i,j] = c[i-1, j-1]+1, remember x[i] (because x[i] is a part of LCS)
When i=0 or j=0 (i.e. we reached the beginning), output remembered letters in reverse order S-32
Finding LCSj 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
Yj BB ACD
0
0
00000
0
0
0
1000 1
1 21 1
1 1 2
1
22
1 1 2 2 3B
S-33
Finding LCS (2)j 0 1 2 3 4 5
0
1
2
3
4
i
Xi
A
B
C
Yj BB ACD
0
0
00000
0
0
0
1000 1
1 21 1
1 1 2
1
22
1 1 2 2 3B
B C BLCS (reversed order):
LCS (straight order): B C B
S-34
Elements of Dynamic Programming
• For dynamic programming to be applicable, an optimization problem must have:
1. Optimal substructure– An optimal solution to the problem contains within it
optimal solution to subproblems
2. Overlapping subproblems– The space of subproblems must be small; i.e., the
same subproblems are encountered over and over
S-35
Review: Dynamic programming
• DP is a method for solving certain kind of problems
• DP can be applied when the solution of a problem includes solutions to subproblems
• We need to find a recursive formula for the solution
• We can recursively solve subproblems, starting from the trivial case, and save their solutions in memory
• In the end we’ll get the solution of the whole problem
S-36
Properties of a problem that can be solved with dynamic
programming• Simple Subproblems
– We should be able to break the original problem to smaller subproblems that have the same structure
• Optimal Substructure of the problems– The solution to the problem must be a
composition of subproblem solutions
• Subproblem Overlap– Optimal subproblems to unrelated problems can
contain subproblems in common
S-37
Review: Longest Common Subsequence (LCS)
• Problem: how to find the longest pattern of characters that is common to two text strings X and Y
• Dynamic programming algorithm: solve subproblems until we get the final solution
• Subproblem: first find the LCS of prefixes of X and Y.
• this problem has optimal substructure: LCS of two prefixes is always a part of LCS of bigger strings
S-38
Review: Longest Common Subsequence (LCS)
continued• Define Xi, Yj to be prefixes of X and Y of
length i and j; m = |X|, n = |Y| • We store the length of LCS(Xi, Yj) in c[i,j]
• Trivial cases: LCS(X0 , Yj ) and LCS(Xi, Y0) is empty (so c[0,j] = c[i,0] = 0 )
• Recursive formula for c[i,j]:
otherwise]),1[],1,[max(
],[][ if1]1,1[],[
jicjic
jyixjicjic
c[m,n] is the final solutionS-39
Review: Longest Common Subsequence (LCS)
• After we have filled the array c[ ], we can use this data to find the characters that constitute the Longest Common Subsequence
• Algorithm runs in O(m*n), which is much better than the brute-force algorithm: O(n 2m)
S-40
Conclusion• Dynamic programming is a useful
technique of solving certain kind of problems
• When the solution can be recursively described in terms of partial solutions, we can store these partial solutions and re-use them as necessary
• Running time (Dynamic Programming algorithm vs. brute-force algorithm):– LCS: O(m*n) vs. O(n * 2m)
S-41
The End
S-42