Recursion In this Chapter we Study: Recursion Divide and Conquer Dynamic Programming

Recursion

In this chapter we study:

• Recursion

• Divide and Conquer

• Dynamic Programming

Recursion involves:

• Self-reference (as in a recursive definition, or in a recursive method).

• Each recursive call to a function should lead to a problem of smaller size.

• The chain of self-reference is terminated by a base case, in which the solution to the problem is trivial.

General form of a recursive function:

if (condition for which the problem is trivial)    // fundamental case
    trivial solution
else                                               // general case
    recursive call to the function for a smaller case

Example 1:

The factorial of a non-negative integer, n, is defined as follows:

$$n! = \begin{cases} 1 & \text{if } n = 0 \\ n \cdot (n-1)! & \text{if } n > 0 \end{cases}$$

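Example 2:

The power n of a number x (n a positive integer) is defined as follows:

$$x^n = \begin{cases} x & \text{if } n = 1 \\ x \cdot x^{\lfloor n/2 \rfloor} \cdot x^{\lfloor n/2 \rfloor} & \text{if } n \text{ is odd} \\ x^{n/2} \cdot x^{n/2} & \text{else} \end{cases}$$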

Let us illustrate this with some examples.


Example 3: The GCD of two non-negative integers m and n, not both 0, is defined as follows:

$$\gcd(m, n) = \begin{cases} m & \text{if } n = 0 \\ \gcd(n, m \bmod n) & \text{if } n > 0 \end{cases}$$

Example 4: The product of two positive integers m and n is defined as follows:

$$P(m, n) = \begin{cases} m & \text{if } n = 1 \\ m + P(m, n-1) & \text{if } n > 1 \end{cases}$$
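The recursive definitions of Examples 1 to 4 translate almost line by line into code. The following is a minimal sketch in C, not a reference implementation: the function names, the unsigned types and the `assert` precondition checks are assumptions made for illustration, and overflow for large arguments is ignored.

```c
#include <assert.h>

/* Example 1: n! = 1 if n = 0, n * (n-1)! if n > 0. */
unsigned long long factorial(unsigned int n) {
    if (n == 0)                        /* base case */
        return 1;
    return n * factorial(n - 1);       /* smaller problem: n - 1 */
}

/* Example 2: x^n for n >= 1, computed by halving the exponent. */
double power(double x, unsigned int n) {
    assert(n >= 1);                    /* the definition starts at n = 1 */
    if (n == 1)                        /* base case */
        return x;
    double half = power(x, n / 2);     /* integer division */
    if (n % 2 == 1)
        return x * half * half;        /* n odd */
    return half * half;                /* n even */
}

/* Example 3: gcd(m, n) = m if n = 0, gcd(n, m mod n) if n > 0. */
unsigned int gcd(unsigned int m, unsigned int n) {
    if (n == 0)                        /* base case */
        return m;
    return gcd(n, m % n);              /* second argument strictly decreases */
}

/* Example 4: P(m, n) = m if n = 1, m + P(m, n-1) if n > 1. */
unsigned int product(unsigned int m, unsigned int n) {
    assert(n >= 1);                    /* defined for positive n only */
    if (n == 1)                        /* base case */
        return m;
    return m + product(m, n - 1);      /* smaller problem: n - 1 */
}
```

In each function the first test is the base case and the last line is the recursive call on a smaller instance, matching the general form given earlier. Note that `power` computes the half power once and reuses it, so it makes only about log2 n recursive calls.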

Example 5: Towers of Hanoi

A classic example of a problem that is neatly solved by recursion:

We have: n disks in order of decreasing size, stacked on a peg, say A.

We want to: move these disks to another peg, say B, so that they are in the same decreasing order of size, using a third peg, say C, as workspace, under the following rules:

• We can move only one disk at a time.

• We cannot place a larger disk on top of a smaller one in any move.

[Figure: the three pegs A, B and C]

Recursive Solution

Move (n, A, B, C) = Move (n-1, A, C, B) + Move (1, A, B, C) + Move (n-1, C, B, A)

The parameters of Move are, in order: how many disks, from which peg, to which peg, and which peg serves as workspace.
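The recursive solution can be sketched in C as follows. This is an illustrative sketch under a few assumptions: the function name `hanoi`, the use of n = 0 (nothing to move) as the base case instead of the single-disk move, and the returned move count are not part of the slides.

```c
#include <assert.h>
#include <stdio.h>

/* Moves n disks from peg `from` to peg `to`, using peg `work` as workspace.
   Prints every single-disk move and returns how many moves were made. */
unsigned long hanoi(int n, char from, char to, char work) {
    assert(n >= 0);
    if (n == 0)                                 /* base case: nothing to move */
        return 0;
    unsigned long moves = hanoi(n - 1, from, work, to);   /* Move(n-1, A, C, B) */
    printf("Move disk %d from %c to %c\n", n, from, to);  /* Move(1, A, B, C)   */
    moves += 1;
    moves += hanoi(n - 1, work, to, from);                 /* Move(n-1, C, B, A) */
    return moves;
}
```

Calling `hanoi(3, 'A', 'B', 'C')` prints seven moves. In general the count satisfies M(n) = 2 M(n-1) + 1, which gives 2^n - 1 moves for n disks.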

Example 6:

Another nice example of recursion, already discussed in Stacks. In fact, there is a great similarity between stacks and recursion!

The problem involves printing the digits of a number n, say n = 1234, from right to left, giving 4321.

What is the solution?

We can extract the digits from right to left (n mod 10 gives the last digit, n div 10 the remaining digits), which is exactly the order in which we want to print them.

Recursive Solution

Algorithm RecursiveReverseNumber

Input: A non-negative number, n

Output: The digits of n in reverse order.

procedure printDecimal(int n)
    Print n mod 10
    if (n >= 10)
        printDecimal(n div 10)

What if we want to print the digits in the same order? Idea: use a stack.
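The two orders differ only in whether the digit is emitted before or after the recursive call. The sketch below is an illustration, not the slides' code: it writes the digits into a caller-supplied buffer instead of printing them (an assumption made so the result can be inspected), and the function names are hypothetical. The in-order version relies on the call stack itself as the stack mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the digits of n from right to left into buf, starting at index pos.
   Returns the index just past the last digit written. */
size_t digitsReversed(unsigned int n, char *buf, size_t pos) {
    assert(buf != NULL);
    buf[pos++] = (char)('0' + n % 10);          /* emit the last digit first */
    if (n >= 10)
        pos = digitsReversed(n / 10, buf, pos); /* then the remaining digits */
    return pos;
}

/* Writes the digits of n from left to right: the pending calls on the
   call stack hold the low-order digits until the high-order ones are out. */
size_t digitsInOrder(unsigned int n, char *buf, size_t pos) {
    assert(buf != NULL);
    if (n >= 10)
        pos = digitsInOrder(n / 10, buf, pos);  /* leading digits first */
    buf[pos++] = (char)('0' + n % 10);          /* emit on the way back */
    return pos;
}
```

The caller must provide a buffer large enough for all digits (10 characters suffice for a 32-bit unsigned value) and add the terminating `'\0'` at the returned index if a C string is wanted.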

Implementation of Recursion

Recursion is implemented by using a stack of activation records.

Each activation record stores all the parameters related to the present call, so that the call can be completed when we return to it.

Why a stack? Because calls are completed in the reverse of the order in which they are generated.

Example: compute the sum of the integers from 1 to n:

1 + 2 + . . . + n

This can also be written as:

n + (sum of the integers from 1 to n - 1) = n + 1 + 2 + . . . + (n - 1), that is, Sum(n) = n + Sum(n - 1)

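Trace of Sum(4), with the values of n held on the stack of activation records at each call:

call 1: Sum(4)    stack: 4          returns 4 + Sum(3) = 4 + 6 = 10
call 2: Sum(3)    stack: 3 4        returns 3 + Sum(2) = 3 + 3 = 6
call 3: Sum(2)    stack: 2 3 4      returns 2 + Sum(1) = 2 + 1 = 3
call 4: Sum(1)    stack: 1 2 3 4    n == 1, returns 1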

int Sum (int n) {
    if (n == 1)                    // fundamental case
        return 1;
    else                           // general case
        return (n + Sum(n - 1));
}

Proof of correctness of recursive algorithms

The proof of the correctness of a recursive algorithm is often a proof by induction.

Formulating a proof by induction

1. State the proposition P(n) and the range of n for which you are trying to prove the proposition.

2. Verify the base case: that is, verify that P(n) is true for the smallest value of n in the range.

3. Formulate the inductive hypothesis: that is, assume P(k) is true for some k in the range.

4. Prove the induction step: that is, prove that the proposition is also true for the next value k + 1. We prove the following: if P(k) is true, then P(k + 1) is also true.

5. Conclude that P(n) is true for all n in the stated range.


Correctness of Factorial algorithm

The factorial of an integer n is defined as follows: 0! = 1 and n! = 1 × 2 × 3 × 4 × . . . × n for all n ≥ 1.

A recursive method to calculate the factorial of n:

public static int factorial(int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);
}

In order to prove the correctness of the factorial algorithm, we need to prove the following proposition:

n! = n × (n − 1)! is true for all n ≥ 1.

Proof by induction

1. Proposition: n! = n × (n − 1)! for all n ≥ 1.

2. Verify the base case: 1! = 1 × (1 − 1)! = 1 × 0! = 1. The proposition is true for n = 1.

3. Induction hypothesis: k! = k × (k − 1)! is true for n = k.

4. Prove the induction step: prove that (k + 1)! = (k + 1) × k!. Proof: by definition, (k + 1)! = 1 × 2 × . . . × k × (k + 1) = (1 × 2 × . . . × k) × (k + 1) = (k + 1) × k!.

5. Conclusion: the proposition n! = n × (n − 1)! is true for all n ≥ 1.

Consider the following problem:

Compute the value of the Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34, …

Recursive definition:

$$\mathrm{Fib}(n) = \begin{cases} 1 & \text{if } n = 1 \text{ or } n = 2 \\ \mathrm{Fib}(n-1) + \mathrm{Fib}(n-2) & \text{if } n > 2 \end{cases}$$

Recursive implementation:

int fib(int n) {
    if (n == 1 || n == 2)
        return 1;
    else                           // n > 2
        return fib(n - 1) + fib(n - 2);
}

int Iterativefib(int n) {
    int prev1 = 1, prev2 = 1, current = 1;
    for (int i = 3; i <= n; i++) {
        current = prev1 + prev2;
        prev2 = prev1;
        prev1 = current;
    }
    return current;
}

The recursive solution for the Fibonacci numbers is simple but highly inefficient!

• The tree of function calls grows exponentially.

• Many calls are made with the same parameters!

• The iterative solution is much more efficient.

If n = 35 then: the iterative program performs 33 additions, whereas the recursive program performs about 18.5 million calls!
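The figures quoted for n = 35 can be checked by instrumenting the recursive function with a call counter. The sketch below assumes the definition above (Fib(1) = Fib(2) = 1); the global counter `fibCalls` and the function names are hypothetical additions for illustration only.

```c
#include <assert.h>

static unsigned long fibCalls = 0;   /* hypothetical counter: calls to fibRec */

/* Direct transcription of the recursive definition, counting every call. */
unsigned long fibRec(int n) {
    assert(n >= 1);
    fibCalls++;
    if (n == 1 || n == 2)            /* base cases */
        return 1;
    return fibRec(n - 1) + fibRec(n - 2);
}

/* Iterative version: one addition per loop iteration, n - 2 in total. */
unsigned long fibIter(int n) {
    assert(n >= 1);
    unsigned long prev1 = 1, prev2 = 1, current = 1;
    for (int i = 3; i <= n; i++) {
        current = prev1 + prev2;
        prev2 = prev1;
        prev1 = current;
    }
    return current;
}
```

After resetting `fibCalls` to 0, the call `fibRec(35)` returns 9227465 and leaves `fibCalls` at 18454929, that is 2 × Fib(35) − 1 calls, which is the "18.5 million" above. `fibIter(35)` obtains the same value with 33 additions.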

Complexity of the recursive algorithm

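• The recursive algorithm fiboRec that calculates the nth Fibonacci number is exponential: it runs in $O(2^n)$ time.

• Proving the exponential complexity of fiboRec requires proving the following two propositions:

• Proposition 1: fiboRec(n) performs $f_n - 1$ additions before it terminates ($f_n$ is the nth Fibonacci number).

• Proposition 2: The number of additions $f_n - 1$ performed by fiboRec(n) is greater than some exponential function of n.

Proof of proposition 1: by induction on n. For n = 1 and n = 2, fiboRec performs no addition, and $f_1 - 1 = f_2 - 1 = 0$. For n > 2, fiboRec(n) performs the additions of fiboRec(n − 1) and of fiboRec(n − 2) plus one more, that is $(f_{n-1} - 1) + (f_{n-2} - 1) + 1 = f_n - 1$.

The complexity of the iterative algorithm for Fibonacci is O(n)!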

Recursion or iteration?

• In general, there exist both an iterative and a recursive solution for each problem.

• A recursive approach is a good solution when the problem is intrinsically recursive. In this case iterative solutions are much more complicated (e.g. Towers of Hanoi).

• In general, iterative solutions are more efficient than recursive ones:

• Redundancy of computation (e.g. Fibonacci).

• Recursion implies numerous function calls.

• Each function call implies a transfer of parameters, a copy of the state of the program, …

• Recursive calls need a lot of memory.

Proof by induction that if n ≥ 5, $n^2 < 2^n$.

True for n = 5: 25 < 32.

Assume that $k^2 < 2^k$ is true for a given k ≥ 5.

Then $2k^2 < 2 \cdot 2^k = 2^{k+1}$.

When k ≥ 5, $k^2 \ge 5k > 2k + 1$.

Then $(k + 1)^2 = k^2 + 2k + 1 < k^2 + k^2 = 2k^2 < 2^{k+1}$.

What is the big-Oh complexity of the following algorithm? What is the return value of the call Func(15)?

Func(n) {
    If (n ≤ 0)
        Return 10
    Else
        Return Func(n-2) – 2
}

Func(15) = Func(13) – 2 = Func(11) – 2 – 2 = Func(9) – 2 – 2 – 2 = … = Func(-1) – 2 – 2 – 2 – 2 – 2 – 2 – 2 – 2 = 10 – 16 = –6

The algorithm makes about n/2 recursive calls, each performing one subtraction, so its complexity is O(n).

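Proof by induction that $\sum_{i=0}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$ for n ≥ 0.

True for n = 0: 0 = 0.

Assume $\sum_{i=0}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$ is true for a given n.

$$\begin{aligned}
1 + 4 + 9 + \cdots + n^2 + (n+1)^2 &= \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \\
&= \frac{(n+1)\bigl(n(2n+1) + 6(n+1)\bigr)}{6} \\
&= \frac{(n+1)(2n^2 + 7n + 6)}{6}
\end{aligned}$$

We also have $(n+2)(2(n+1)+1) = (n+2)(2n+3) = 2n^2 + 7n + 6$.

Then

$$\sum_{i=0}^{n+1} i^2 = \frac{(n+1)(n+2)(2(n+1)+1)}{6}$$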

Divide-and-Conquer

An important paradigm in algorithm design.

Divide: Recursively break down a problem into two (or more) sub-problems of the same (or related) type, until these become simple enough to be solved directly (base case).

Conquer: The solutions to the sub-problems are then combined to give a solution to the original problem.

Example: Linear and Binary search

Linear search

Given an unsorted array, find the index position of a value x in the array.

Algorithm:

Compare x to the values in the array, one by one, until x is found or the end of the array is reached.

If x is found in the array return its index position, else return -1.

Complexity:

The algorithm is linear, O(n), in the worst case.

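Binary search

Given a sorted array, find the index position of a value x in the array.

Example: x = BinarySearch(a, 0, 14, 25);

indices  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
array    0  2  4  6  8 10 12 14 16 18 20 22 24 26 28

Portion of the array still to be searched at each step (key = 25):

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28    (low = 0, high = 14)
16 18 20 22 24 26 28                       (low = 8, high = 14)
24 26 28                                   (low = 12, high = 14)
24                                         (low = 12, high = 12)

NOTE: the figure marks the elements of the array that are tested. The key 25 is not in the array, so the search ends with low > high and returns -1.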

int BinarySearch (const int a[ ], int low, int high, int key) {
    // a[low . . high] sorted in increasing order
    int mid;
    if (low > high)                    // fundamental case: not found
        return -1;
    else {
        mid = (low + high) / 2;
        if (a[mid] == key)             // fundamental case: found in the middle
            return mid;
        else if (key < a[mid])         // search in the lower half
            return BinarySearch(a, low, mid - 1, key);
        else                           // search in the higher half
            return BinarySearch(a, mid + 1, high, key);
    }
}

The complexity in the worst case: O(log n)

An iterative version of the binary search

int BinarySearch (const int a[ ], int low, int high, int key) {
    // a[low . . high] sorted in increasing order
    int mid;
    while (low <= high) {
        mid = (low + high) / 2;
        if (a[mid] == key)             // found in the middle
            return mid;
        else if (key < a[mid])         // search in lower half
            high = mid - 1;
        else                           // search in higher half
            low = mid + 1;
    }
    return -1;                         // key not found
}

In fact, most recursive algorithms can be written in an iterative form. The complexity is the same, but the iterative program is usually more efficient in practice.

How to choose the best algorithm for a given problem?

Look at the complexity of each part of the algorithm.

Determine how many times each part of the algorithm will be applied.

Example with the linear/binary search:

linear search:

build the array in O(n)

search in the array in O(n)

binary search:

build the array in O(n)

sort the array in O(n log n)

search in the array in O(log n)

So, for a problem:

with few searches -> linear search

with a lot of searches -> binary search

Example: Maximum Contiguous Subsequence Problem

$a_1, a_2, \ldots, a_{n/2} \;|\; a_{n/2+1}, a_{n/2+2}, \ldots, a_n$ (first half | second half)

• Find the maximum contiguous subsequence (MCS) of the first half.

• Find the maximum contiguous subsequence of the second half.

Is the MCS the maximum of these two?

What if the maximum contiguous subsequence straddles both halves?

Example: -1, 2, 3, | 4, 5, -6

The MCS in the first half is 2, 3.

The MCS in the second half is 4, 5.

However, the MCS of the given sequence is 2, 3, 4, 5.

It straddles both halves!

Observation: A straddling sequence contains both $a_{n/2}$, the last element of the first half, and $a_{n/2+1}$, the first element of the second half.

In the example:

Last element = 3, first element = 4

Computing a maximum straddling sequence:

$a_1, a_2, a_3, a_4 \;|\; a_5, a_6, a_7, a_8$

Find the maximum of:

$a_4$

$a_4 + a_3$

$a_4 + a_3 + a_2$

$a_4 + a_3 + a_2 + a_1$

Let the maximum be max1.

Find the maximum of:

$a_5$

$a_5 + a_6$

$a_5 + a_6 + a_7$

$a_5 + a_6 + a_7 + a_8$

Let the maximum be max2.

Required maximum:

max1 + max2

Required subsequence:

Glue together the subsequence for max1 and the subsequence for max2.

Complexity of finding the straddling sequence

Theorem: For an n-element sequence, the complexity of finding a maximum straddling sequence is O(n).

Proof: We look at each element exactly once!

Putting things together…

$S = a_1, a_2, \ldots, a_{n/2}, a_{n/2+1}, a_{n/2+2}, \ldots, a_n$

$S_1 = a_1, a_2, \ldots, a_{n/2}$

$S_2 = a_{n/2+1}, a_{n/2+2}, \ldots, a_n$

MCS(S) = Max( MCS($S_1$), MCS($S_2$), Max(straddling sequence) )

MCS($S_1$) and MCS($S_2$) are found recursively.
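Putting the three cases together gives the following sketch in C. It is an illustration under stated assumptions rather than the slides' own code: the function returns only the maximum sum (not the subsequence itself), the subsequence is required to be non-empty, and the names `mcsRec` and `maxOf` are hypothetical.

```c
#include <assert.h>

static long maxOf(long a, long b) { return a > b ? a : b; }

/* Maximum sum of a non-empty contiguous subsequence of a[low..high]. */
long mcsRec(const int a[], int low, int high) {
    assert(low <= high);
    if (low == high)                          /* base case: one element */
        return a[low];

    int mid = low + (high - low) / 2;
    long bestFirst  = mcsRec(a, low, mid);        /* MCS of the first half  */
    long bestSecond = mcsRec(a, mid + 1, high);   /* MCS of the second half */

    /* max1: best sum of a run ending at a[mid], scanning to the left */
    long sum = 0, max1 = a[mid];
    for (int i = mid; i >= low; i--) {
        sum += a[i];
        if (sum > max1) max1 = sum;
    }

    /* max2: best sum of a run starting at a[mid + 1], scanning to the right */
    long max2 = a[mid + 1];
    sum = 0;
    for (int i = mid + 1; i <= high; i++) {
        sum += a[i];
        if (sum > max2) max2 = sum;
    }

    /* best of: first half, second half, straddling sequence */
    return maxOf(maxOf(bestFirst, bestSecond), max1 + max2);
}
```

For the example sequence -1, 2, 3, 4, 5, -6 the two halves give 5 and 9, while the straddling sequence gives max1 + max2 = 5 + 9 = 14, which is the sum of 2, 3, 4, 5.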

Complexity of Recursive MCS

Recursive complexity:

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ 2\,T(n/2) + O(n) & \text{if } n > 1 \end{cases} \qquad (1)$$

How do we compute the overall complexity?

T(n/2) = time to solve a problem of size n/2

O(n) = complexity of finding a maximum straddling subsequence

What is a solution to (1)?

Approach 1:

Number of levels in the tree of recursive calls: $1 + \log_2 n$.

O(n) work is needed to go from one level of the tree of recursive calls to the next higher level.

Therefore, we have an overall complexity of O(n log n).

The tree of recursive calls (problem sizes at each level):

level 0:  n                                        (total n)
level 1:  n/2  n/2                                 (total n)
level 2:  n/4  n/4  n/4  n/4                       (total n)
level 3:  n/8  n/8  n/8  n/8  n/8  n/8  n/8  n/8   (total n)
…

There are log n + 1 levels, for a total of O(n log n).

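Approach 2:

$$\begin{aligned}
2^0\,T(n) &\le 2^1\,T(n/2) + c\,n \\
2^1\,T(n/2) &\le 2^2\,T(n/4) + 2\,c\,n/2 \\
2^2\,T(n/4) &\le 2^3\,T(n/8) + 2^2\,c\,n/4 \\
2^3\,T(n/8) &\le 2^4\,T(n/16) + 2^3\,c\,n/8 \\
&\;\;\vdots
\end{aligned}$$

Adding up, we have:

$$2^0 T(n) + 2^1 T(n/2) + 2^2 T(n/4) + 2^3 T(n/8) + \cdots \le 2^1 T(n/2) + 2^2 T(n/4) + 2^3 T(n/8) + \cdots + c\,n + 2\,c\,n/2 + 2^2 c\,n/4 + \cdots$$

The terms $2^1 T(n/2), 2^2 T(n/4), \ldots$ appear on both sides and cancel, leaving (apart from the lower-order contribution $n\,T(1)$ of the base cases):

$$T(n) \le \underbrace{c\,n + c\,n + c\,n + \cdots}_{\log_2 n \text{ terms}}$$

$$T(n) \le c\,n \log_2 n$$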

Recursive MCS takes T(n) = O(n log n),

whereas non-recursive MCS takes T(n) = O(n).

Divide-and-conquer is not always the best solution!
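The slides do not show the non-recursive O(n) algorithm. One well-known linear-time method, usually attributed to Kadane, is sketched below as an assumption about what is meant; the function name `mcsLinear` is hypothetical, and like the recursive sketch it returns only the maximum sum of a non-empty subsequence.

```c
#include <assert.h>

/* Maximum sum of a non-empty contiguous subsequence of a[0..n-1],
   computed in a single left-to-right pass. */
long mcsLinear(const int a[], int n) {
    assert(n >= 1);
    long bestEndingHere = a[0];   /* best sum of a run ending at index i */
    long best = a[0];             /* best sum seen so far */
    for (int i = 1; i < n; i++) {
        /* either extend the previous run, or start a new run at a[i]
           when the previous run has a non-positive sum */
        bestEndingHere = (bestEndingHere > 0 ? bestEndingHere : 0) + a[i];
        if (bestEndingHere > best)
            best = bestEndingHere;
    }
    return best;
}
```

Each element is examined exactly once, so the running time is O(n), compared with O(n log n) for the divide-and-conquer version.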

Another example:

Find both the maximum and minimum of a set of n elements, S

Naive solution:

Find maximum: n – 1 comparisons

Find minimum: n – 2 comparisons

Total: 2n – 3 comparisons

Can we do better? Yes.

Algorithm Divide-and-Conquer

procedure maxMin(S)
    if |S| = 2    // S = {a, b}
        return (max(a, b), min(a, b))
    else
        divide S into two subsets, say S1 and S2, each with half of the elements
        (max1, min1) ← maxMin(S1)
        (max2, min2) ← maxMin(S2)
        return (max(max1, max2), min(min1, min2))

Analysis of the divide and conquer algorithm for maxMin

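Number of comparisons?

T(n) = number of comparisons on n elements

$$T(n) = \begin{cases} 1 & \text{if } n = 2 \\ 2\,T(n/2) + 2 & \text{if } n > 2 \end{cases}$$

Solution:

T(4) = 2T(2) + 2 = 4
T(8) = 2T(4) + 2 = 10
T(16) = 2T(8) + 2 = 22
…
$T(2^k) = 3 \cdot 2^{k-1} - 2$ (proof by induction)

That is, $\tfrac{3}{2} \cdot 2^k - 2 = \tfrac{3}{2}n - 2$ comparisons for divide-and-conquer vs. $2 \cdot 2^k - 3 = 2n - 3$ for the naive solution, where $n = 2^k$.

However, T(n) = O(n) for both the naive and the divide-and-conquer algorithm!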

For the curious!!

Approximately 3n/2 – 2 comparisons are both necessary and sufficient to find the maximum and minimum of a set of n elements.
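The comparison count can be observed directly by instrumenting the divide-and-conquer procedure. The sketch below is illustrative: the struct `MaxMin`, the global counter `comparisons`, and the extra base case for a single element (needed when n is not a power of 2) are assumptions that do not appear in the slides.

```c
#include <assert.h>

typedef struct { int max; int min; } MaxMin;

static unsigned long comparisons = 0;   /* hypothetical counter */

/* Returns the maximum and minimum of s[low..high]. */
MaxMin maxMin(const int s[], int low, int high) {
    assert(low <= high);
    MaxMin r;
    if (low == high) {                   /* |S| = 1: assumption, no comparison */
        r.max = r.min = s[low];
        return r;
    }
    if (high == low + 1) {               /* |S| = 2: one comparison */
        comparisons++;
        if (s[low] > s[high]) { r.max = s[low];  r.min = s[high]; }
        else                  { r.max = s[high]; r.min = s[low];  }
        return r;
    }
    int mid = low + (high - low) / 2;    /* divide into two halves */
    MaxMin r1 = maxMin(s, low, mid);
    MaxMin r2 = maxMin(s, mid + 1, high);
    comparisons += 2;                    /* one for the max, one for the min */
    r.max = r1.max > r2.max ? r1.max : r2.max;
    r.min = r1.min < r2.min ? r1.min : r2.min;
    return r;
}
```

With `comparisons` reset to 0, running `maxMin` on 16 elements leaves the counter at 22, in agreement with T(16) = 22, whereas the naive solution needs 2 × 16 − 3 = 29 comparisons.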

Dynamic Programming

Reminder:

The complexity of the recursive Fibonacci algorithm is in $O(2^n)$, while the complexity of the iterative one is in O(n)!

Why?

Because the recursive algorithm performs the same operations a huge number of times.

Idea: store the values already computed in an array t.

Dynamic programming algorithm for Fibonacci numbers:

t[0] = 1; t[1] = 1
For i = 2 to n { t[i] = t[i-1] + t[i-2] }

index i:  0  1  2  3  4  . . .  n
t[i]:     1  1  2  3  5  . . .

Essence of this method

• A method that solves a problem by combining the solutions of its subproblems.

• Applies to a recursive problem composed of dependent (overlapping) subproblems.

• In fact: a transformation of a recursive algorithm that uses a data structure to store intermediate solutions.

• Each subproblem is solved only once and its solution is stored in an array for later use.

• Can lower the complexity from exponential to polynomial.

• Numerous applications to optimization problems.
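The transformation described above can be illustrated on the recursive Fibonacci function: the recursion is kept, but each result is stored the first time it is computed. This top-down variant (often called memoization) is a sketch and not the table-filling loop from the slides; the array `memo`, the bound `MAX_N` and the 1-based indexing (Fib(1) = Fib(2) = 1) are assumptions made for the example.

```c
#include <assert.h>

#define MAX_N 90   /* Fib(90) still fits in an unsigned 64-bit integer */

static unsigned long long memo[MAX_N + 1];   /* 0 means "not computed yet" */

/* Recursive Fibonacci in which every subproblem is solved only once. */
unsigned long long fibMemo(int n) {
    assert(n >= 1 && n <= MAX_N);
    if (n == 1 || n == 2)             /* base cases */
        return 1;
    if (memo[n] != 0)                 /* already solved: reuse the stored value */
        return memo[n];
    memo[n] = fibMemo(n - 1) + fibMemo(n - 2);   /* solve once and store */
    return memo[n];
}
```

Because each value from 3 to n is computed at most once, the number of additions drops from exponential to at most n − 2, the same as in the iterative version.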

Another example of an optimization problem:

Given coins worth $c_1, c_2, \ldots, c_n$ cents,

make up change for k cents, using the minimum number of coins of the above denominations.

So that this is always possible, we assume that $c_1 = 1$.

Let min(k) denote the minimum number of coins needed to make k cents of change.

If k is itself one of the denominations, min(k) = 1. Otherwise,

min(k) = min { min(r) + min(k-r) } for all 1 ≤ r ≤ k/2,

provided we know min(1), min(2), …, min(k-1).

Example:

$c_1$ = 1 cent, $c_2$ = 5 cents, $c_3$ = 10 cents, $c_4$ = 25 cents

min(1) = 1, and min(2) = min(1) + min(1) = 2. We also know that min(5) = min(10) = min(25) = 1.

We already know the value of min(2), so

min(3) = min(1) + min(2) = 1 + 2 = 3

Again, we already know the values of min(2) and min(3), so:

min(4) = min{min(1) + min(3), min(2) + min(2)} = min{4, 4} = 4

# of cents   Min   How the solution is found

1            1     min(1) (1 cent is a denomination)
2            2     min{min(1)+min(1)}
3            3     min{min(1)+min(2)}
4            4     min{min(1)+min(3), min(2)+min(2)}
5            1     5 cents is a denomination
6            2     min{min(1)+min(5), min(2)+min(4), min(3)+min(3)}
7            3     min{min(1)+min(6), min(2)+min(5), min(3)+min(4)}
8            4     min{min(1)+min(7), min(2)+min(6), min(3)+min(5), min(4)+min(4)}
9            5     min{min(1)+min(8), min(2)+min(7), min(3)+min(6), min(4)+min(5)}
10           1     10 cents is a denomination

min(11) = min{min(1)+min(10), min(2)+min(9), min(3)+min(8), min(4)+min(7), min(5)+min(6)}
        = min{1+1, 2+5, 3+4, 4+3, 1+2}
        = min{2, 7, 7, 7, 3}
        = 2

Why don’t we use recursion? It is quite inefficient!

What do we use instead? Dynamic programming.

The algorithm uses one array:

coinsUsed: stores the minimum number of coins needed to make change of k cents, k = 1, …, maxChange.

coinsUsed[0] ← 0
for cents ← 1 to maxChange do                      // maxChange = k
    minCoins ← cents
    for j ← 1 to diffCoins do                      // diffCoins = n
        if (coins[j] > cents)
            continue                               // cannot use coin j
        if (coinsUsed[cents - coins[j]] + 1 < minCoins)
            minCoins ← coinsUsed[cents - coins[j]] + 1
    coinsUsed[cents] ← minCoins
Print “minimum number of coins:”, coinsUsed[maxChange]

Time complexity: O(n k)

where n = number of coins of different denominations and k = amount of change we want to make.
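The pseudocode above can be turned into C almost directly. The following sketch keeps the names `coinsUsed`, `diffCoins` and `maxChange` from the pseudocode; the function name `minCoins`, the 0-based coin array, the dynamic allocation and the -1 return value on allocation failure are assumptions made for illustration. As in the text, a 1-cent coin is assumed to be among the denominations.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimum number of coins needed to make maxChange cents with the given
   denominations (which must include a 1-cent coin).
   Returns -1 if the table cannot be allocated. */
int minCoins(const int coins[], int diffCoins, int maxChange) {
    assert(diffCoins >= 1 && maxChange >= 0);
    int *coinsUsed = malloc((size_t)(maxChange + 1) * sizeof *coinsUsed);
    if (coinsUsed == NULL)
        return -1;

    coinsUsed[0] = 0;                            /* no coin for 0 cents */
    for (int cents = 1; cents <= maxChange; cents++) {
        int best = cents;                        /* worst case: 1-cent coins only */
        for (int j = 0; j < diffCoins; j++) {
            if (coins[j] > cents)
                continue;                        /* cannot use coin j */
            if (coinsUsed[cents - coins[j]] + 1 < best)
                best = coinsUsed[cents - coins[j]] + 1;
        }
        coinsUsed[cents] = best;
    }

    int result = coinsUsed[maxChange];
    free(coinsUsed);
    return result;
}
```

With the denominations 1, 5, 10 and 25, `minCoins` gives 2 for 11 cents (10 + 1) and 5 for 9 cents (5 + 1 + 1 + 1 + 1), in agreement with the table above. The two nested loops make the O(n k) running time visible.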

Another example from bioinformatics: sequence alignment by dynamic programming

• Important problem in bioinformatics: determining the level of similarity between two protein sequences.

• Very useful for discovering the function of a new protein.

• A protein sequence is represented by a sequence of characters, each character corresponding to one amino acid.

• Align: to match as many characters as possible between two sequences in order to maximize a similarity score.

• Three possible configurations (match):
  Identity: A / A
  Substitution (mismatch): A / C
  Insertion or deletion (gap): A / - or - / A

Example:

A T - G G - T
A A C - G C T

• A score value is associated with each configuration.

• The global score is the sum of the scores of all matches in the alignment.

Example:

Identity score: 4
Mismatch score: -1
Gap score: -2

A T - G G - T
A A C - G C T

Score: 4 - 1 - 2 - 2 + 4 - 2 + 4 = 5

Problem: to find the best alignment naively, all possible alignments have to be constructed.

• The solution is defined recursively.

• We define $P = p_1, p_2, \ldots, p_n$ as the first sequence and $Q = q_1, q_2, \ldots, q_m$ as the second sequence.

• We define F(i,j) as the score of the best alignment between $p_1, p_2, \ldots, p_i$ and $q_1, q_2, \ldots, q_j$.

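Finding the score of i,j

Three ways to build the alignment of 1…i with 1…j:

• the alignment of 1…i-1 with 1…j, plus the column (i, -)

• the alignment of 1…i-1 with 1…j-1, plus the column (i, j)

• the alignment of 1…i with 1…j-1, plus the column (-, j)

In order to compute the score of 1…i versus 1…j, all we need are the scores of:

• 1…i-1 versus 1…j-1

• 1…i versus 1…j-1

• 1…i-1 versus 1…j

Cédric Notredame (21/04/23)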

Formalizing the algorithm

$$F(i,j) = \text{best} \begin{cases} F(i-1,j) + \text{Gap} & \text{(X aligned with -)} \\ F(i-1,j-1) + \text{Mat}[i,j] & \text{(X aligned with X)} \\ F(i,j-1) + \text{Gap} & \text{(- aligned with X)} \end{cases}$$

• The direct application of the recursive formula is in $O(c^n)$.

• There is overlap between subproblems => dynamic programming.


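Arranging Everything in a Table

We use a two-dimensional array D to store all the values of the partial alignments. In the example, one sequence (- F A T) labels the columns and the other (- F A S T) labels the rows.

The cell for (1…I, 1…J) is computed from the three neighbouring cells (1…I-1, 1…J-1), (1…I, 1…J-1) and (1…I-1, 1…J).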


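Taking Care of the Limits

Match = 2, MisMatch = -1, Gap = -1

The first row and the first column correspond to alignments of a prefix with gaps only:

      -    F    A    T
 -    0   -1   -2   -3
 F   -1
 A   -2
 S   -3
 T   -4

For example, -1 is the score of F aligned with -, -2 the score of FA aligned with --, and -3 the score of FAT (or FAS) aligned with ---.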


Filling Up the Matrix

Each cell is the best of the three candidate values obtained from its neighbours (Match = 2, MisMatch = -1, Gap = -1):

      -    F    A    T
 -    0   -1   -2   -3
 F   -1   +2   +1    0
 A   -2   +1   +4   +3
 S   -3    0   +3   +3
 T   -4   -1   +2   +5


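Delivering the alignment: Trace-back

The last cell of the table holds the score of 1…3 vs 1…4, which is the optimal alignment score (+5). Following back the choices that produced it, from the last column of the alignment to the first, gives the columns T/T, S/-, A/A, F/F, that is:

F A S T
F A - T

Dynamic programming algorithm complexity: $O(n^2)$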
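The table-filling step can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the original author's implementation: it computes only the optimal score (no trace-back), it replaces the substitution matrix Mat[i,j] by two constants `match` and `mismatch` as in the FAST/FAT example, and the names `alignScore` and `max3` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Score of the best global alignment of p and q, filling the table
   F(i,j) row by row. Aborts if the table cannot be allocated. */
int alignScore(const char *p, const char *q, int match, int mismatch, int gap) {
    assert(p != NULL && q != NULL);
    size_t n = strlen(p), m = strlen(q);
    size_t cols = m + 1;
    int *F = malloc((n + 1) * cols * sizeof *F);   /* F(i,j) is F[i*cols + j] */
    if (F == NULL)
        abort();

    for (size_t j = 0; j <= m; j++)                /* first row: gaps only */
        F[j] = (int)j * gap;
    for (size_t i = 1; i <= n; i++) {
        F[i * cols] = (int)i * gap;                /* first column: gaps only */
        for (size_t j = 1; j <= m; j++) {
            int s = (p[i - 1] == q[j - 1]) ? match : mismatch;
            F[i * cols + j] = max3(F[(i - 1) * cols + j] + gap,      /* p_i with -   */
                                   F[(i - 1) * cols + j - 1] + s,    /* p_i with q_j */
                                   F[i * cols + j - 1] + gap);       /* - with q_j   */
        }
    }

    int score = F[n * cols + m];
    free(F);
    return score;
}
```

For example, `alignScore("FAST", "FAT", 2, -1, -1)` returns 5, the value in the last cell of the table above. The two nested loops fill (n + 1)(m + 1) cells with constant work per cell, which is where the quadratic complexity comes from.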

• Alignment between GCTCTGCGAATA and CGTTGAGATACT (match = 2, mismatch = -1 and gap = -2)

       -   G   C   T   C   T   G   C   G   A   A   T   A
  -    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22 -24
  C   -2  -1   0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20
  G   -4   0  -2  -1  -3  -5  -4  -6  -8 -10 -12 -14 -16
  T   -6  -2  -1   0  -2  -1  -3  -5  -7  -9 -11 -10 -12
  T   -8  -4  -3   1  -1   0  -2  -4  -6  -8 -10  -9 -11
  G  -10  -6  -5  -1   0  -2   2   0  -2  -4  -6  -8 -10
  A  -12  -8  -7  -3  -2  -1   0   1  -1   0  -2  -4  -6
  G  -14 -10  -9  -5  -4  -3   1  -1   3   1  -1  -3  -5
  A  -16 -12 -11  -7  -6  -5  -1   0   1   5   3   1  -1
  T  -18 -14 -13  -9  -8  -4  -3  -2  -1   3   4   5   3
  A  -20 -16 -15 -11 -10  -6  -5  -4  -3   1   5   3   7
  C  -22 -18 -14 -13  -9  -8  -7  -3  -5  -1   3   4   5
  T  -24 -20 -16 -12 -11  -7  -9  -5  -4  -3   1   5   3

Resulting alignment:

G C T C T G C G A - A T A
- C G T T G A G A T A C T
