Lecture 4 Divide and Conquer for Nearest Neighbor Problem Shang-Hua Teng

Merge-Sort(A,p,r)
A procedure that sorts the elements in the sub-array A[p..r] using divide and conquer

• Merge-Sort(A,p,r)
  – if p >= r, do nothing
  – if p < r then
    • $q \leftarrow \lfloor (p+r)/2 \rfloor$
    • Merge-Sort(A,p,q)
    • Merge-Sort(A,q+1,r)
    • Merge(A,p,q,r)

• Start by calling Merge-Sort(A,1,n)

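A = MergeArray(L,R)
Assume L[1:s] and R[1:t] are two sorted arrays of elements. MergeArray(L,R) forms a single sorted array A[1:s+t] of all elements in L and R.

• A = MergeArray(L,R)
  – $L[s+1] \leftarrow \infty$; $R[t+1] \leftarrow \infty$
  – $i \leftarrow 1$; $j \leftarrow 1$
  – for $k \leftarrow 1$ to $s + t$
    • do if $L[i] \le R[j]$
      – then $A[k] \leftarrow L[i]$; $i \leftarrow i + 1$
      – else $A[k] \leftarrow R[j]$; $j \leftarrow j + 1$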

Complexity of MergeArray

• At each iteration, we perform 1 comparison, 1 assignment (copy one element to A) and 2 increments (to k and i or j )

• So number of operations per iteration is 4.

• Thus, MergeArray performs at most 4(s+t) operations.

• Linear in the size of the input.
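The sentinel-based merge described above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `merge_array` is hypothetical, indices are 0-based rather than 1-based, and the elements are assumed to be numbers smaller than infinity so that `math.inf` can serve as the sentinel.

```python
import math

def merge_array(left, right):
    """Merge two sorted lists into one sorted list using infinity sentinels."""
    s, t = len(left), len(right)
    # Append a sentinel to a copy of each input so neither index runs off the end.
    L = left + [math.inf]
    R = right + [math.inf]
    i = j = 0
    A = []
    for _ in range(s + t):      # exactly s + t iterations, one element copied per iteration
        if L[i] <= R[j]:
            A.append(L[i])
            i += 1
        else:
            A.append(R[j])
            j += 1
    return A

print(merge_array([1, 4, 9], [2, 3, 10, 11]))  # [1, 2, 3, 4, 9, 10, 11]
```

The loop body does a constant amount of work, which is where the linear bound comes from.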

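Merge(A,p,q,r)
Assume A[p..q] and A[q+1..r] are two sorted sub-arrays. Merge(A,p,q,r) forms a single sorted array A[p..r].

• Merge(A,p,q,r)
  – $s \leftarrow q - p + 1$; $t \leftarrow r - q$
  – $L \leftarrow A[p..q]$; $R \leftarrow A[q+1..r]$
  – $L[s+1] \leftarrow \infty$; $R[t+1] \leftarrow \infty$
  – $A[p..r] \leftarrow \text{MergeArray}(L,R)$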
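Putting Merge and Merge-Sort together, one possible Python rendering looks like this. It is a sketch under stated assumptions: indices are 0-based and inclusive (so the top-level call is `merge_sort(A, 0, n - 1)` rather than Merge-Sort(A,1,n)), the names `merge` and `merge_sort` are illustrative, and elements are assumed to be numbers so that `math.inf` works as a sentinel.

```python
import math

def merge(A, p, q, r):
    """Merge the sorted runs A[p..q] and A[q+1..r] (inclusive) back into A[p..r]."""
    L = A[p:q + 1] + [math.inf]       # copy of the left run plus sentinel
    R = A[q + 1:r + 1] + [math.inf]   # copy of the right run plus sentinel
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p, r):
    """Sort A[p..r] in place by divide and conquer."""
    if p >= r:                 # zero or one element: nothing to do
        return
    q = (p + r) // 2           # floor of the midpoint
    merge_sort(A, p, q)
    merge_sort(A, q + 1, r)
    merge(A, p, q, r)

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```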

Divide and Conquer

• Divide the problem into a number of sub-problems (similar to the original problem but smaller);

• Conquer the sub-problems by solving them recursively (if a sub-problem is small enough, just solve it in a straightforward manner);

• Combine the solutions to the sub-problems into the solution for the original problem

Merge Sort

• Divide: split the n-element sequence to be sorted into two subsequences of n/2 elements each

• Conquer: Sort the two subsequences recursively using merge sort

• Combine: merge the two sorted subsequences to produce the sorted answer

• Note: during the recursion, if the subsequence has only one element, then do nothing.

Algorithm Design Paradigm I

• Solve smaller problems, and use solutions to the smaller problems to solve larger ones– Divide and Conquer

• Correctness: mathematical induction

Running Time of Merge-Sort

• Running time as a function of the input size, that is the number of elements in the array A.

• The Divide-and-Conquer scheme yields a clean recurrence.

• Let T(n) be the running time of merge-sort for sorting an array of n elements.

• For simplicity assume n is a power of 2, that is, there exists k such that $n = 2^k$.

Recurrence of T(n)

• T(1) = 1

• for n > 1, we have $T(n) = 2T(n/2) + 4n$

• Equivalently:

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ 2T(n/2) + 4n & \text{if } n > 1 \end{cases}$$

Solution of Recurrence of T(n)

$T(n) = 4n \log_2 n + n = O(n \log n)$

• Picture Proof by Recursion Tree
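Since the recursion-tree picture is not reproduced here, the same argument can be written out by unrolling the recurrence. The derivation below is a worked sketch that assumes $n = 2^k$, T(1) = 1 and a merge cost of 4n, as on the preceding slides. Each of the k levels of the tree contributes 4n in total, and the n leaves contribute 1 each.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + 4n \\
     &= 4\,T(n/4) + 4n + 4n \\
     &= 8\,T(n/8) + 4n + 4n + 4n \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + 4nk \\
     &= n\,T(1) + 4n\log_2 n \qquad (\text{since } n = 2^k,\; k = \log_2 n) \\
     &= 4n\log_2 n + n .
\end{align*}
```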

Two Dimensional Divide and Conquer

Can we extend the divide and conquer idea to 2 dimensions?

We will consider a slightly simpler problem

(handout #33, Chapter 33.4)

Closest Pair Problems

• Input:
  – A set of points $P = \{p_1, \ldots, p_n\}$ in two dimensions

• Output:
  – The pair of points $p_i$, $p_j$ that minimizes the Euclidean distance between them.

Divide and Conquer

• An $O(n^2)$ time algorithm is easy

• Assumptions:
  – No two points have the same x-coordinate
  – No two points have the same y-coordinate

• How do we solve this problem in 1 dimension?
  – Sort the numbers and walk from left to right to find the minimum gap
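Both baselines mentioned above are short enough to sketch in Python. The function names `closest_pair_brute` and `closest_pair_1d` are illustrative, not from the lecture, and both assume at least two inputs. The first compares every pair of 2-D points; the second relies on the fact that, after sorting, the closest pair of numbers must be adjacent.

```python
import math
from itertools import combinations

def closest_pair_brute(points):
    """O(n^2): try every pair of 2-D points; returns (distance, p, q)."""
    return min((math.dist(p, q), p, q) for p, q in combinations(points, 2))

def closest_pair_1d(xs):
    """O(n log n): sort, then scan adjacent elements for the minimum gap."""
    ys = sorted(xs)
    return min((b - a, a, b) for a, b in zip(ys, ys[1:]))

print(closest_pair_1d([7, 1, 12, 4, 9]))                # (2, 7, 9)
print(closest_pair_brute([(0, 0), (3, 4), (1, 1)])[1:])  # ((0, 0), (1, 1))
```

The 1-D trick does not carry over directly to the plane, because two points that are close in the plane need not be adjacent in either sorted order.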

Divide and Conquer

• Divide and conquer has a chance to do better than $O(n^2)$.

• Assume that we can find the median in O(n) time!!!

• We can first sort the points by their x-coordinates

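Divide and Conquer for the Closest Pair Problem

Divide

• Divide by the x-median: split the points into a left half L and a right half R

Conquer

• Recursively solve L and R

• Let $\delta_1$ be the closest-pair distance found in L and $\delta_2$ the closest-pair distance found in R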

Combination I

• Take the smaller one of $\delta_1$, $\delta_2$: $\delta = \min(\delta_1, \delta_2)$

Combination II

• Is there a point in L and a point in R whose distance is smaller than $\delta$?

• If the answer is “no” then we are done!!!

• If the answer is “yes” then the closest such pair forms the closest pair for the entire set

• Why????

• How do we determine this?

• We need only consider the points in the narrow band within distance $\delta$ of the dividing line between L and R; this band can be found in O(n) time

• Denote this set by S, and assume Sy is the list of S sorted by y-coordinate

• There exist a point in L and a point in R whose distance is less than $\delta$ if and only if there exist two points in S whose distance is less than $\delta$.

• If S is the whole thing, did we gain anything?

• If s and t in S have the property that $\|s - t\| < \delta$, then s and t are within 30 positions of each other in the sorted list Sy.

• (Figure: the band between L and R divided into boxes.) There is at most one point in each box

Closest-Pair

• Closest-Pair(P)
  – Preprocessing
    • Construct Px and Py as lists sorted by x- and y-coordinates
  – Divide
    • Construct L, Lx, Ly and R, Rx, Ry
  – Conquer
    • Let $\delta_1$ = Closest-Pair(L, Lx, Ly)
    • Let $\delta_2$ = Closest-Pair(R, Rx, Ry)
  – Combination
    • Let $\delta = \min(\delta_1, \delta_2)$
    • Construct S and Sy
    • For each point in Sy, check each of its next 30 points down the list
    • If the distance is less than $\delta$, update $\delta$ to this smaller distance
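The steps above can be sketched in Python roughly as follows. This is one possible rendering, not the lecture's code: the names `closest_pair` and `_solve` are hypothetical, the function returns only the distance rather than the pair, inputs are assumed to be at least two points with distinct x-coordinates (as the slides assume), and sub-problems of at most three points are solved by brute force as the "small enough" base case.

```python
import math

def closest_pair(points):
    """Return the smallest Euclidean distance between two of the given 2-D points."""
    px = sorted(points)                        # preprocessing: sorted by x
    py = sorted(points, key=lambda p: p[1])    # preprocessing: sorted by y
    return _solve(px, py)

def _solve(px, py):
    n = len(px)
    if n <= 3:                                 # small sub-problem: brute force
        return min(math.dist(px[i], px[j])
                   for i in range(n) for j in range(i + 1, n))
    # Divide: split at the x-median; filtering py keeps each half sorted by y in O(n).
    mid = n // 2
    x_med = px[mid][0]
    lx, rx = px[:mid], px[mid:]
    ly = [p for p in py if p[0] < x_med]
    ry = [p for p in py if p[0] >= x_med]
    # Conquer: solve both halves recursively.
    delta = min(_solve(lx, ly), _solve(rx, ry))
    # Combination: build the band S (already sorted by y) and scan it.
    sy = [p for p in py if abs(p[0] - x_med) < delta]
    for i, p in enumerate(sy):
        for q in sy[i + 1:i + 31]:             # the next 30 points down the list
            delta = min(delta, math.dist(p, q))
    return delta

pts = [(0, 0), (5, 4), (3, 1), (9, 6), (2, 7), (8, 2)]
print(closest_pair(pts))
```

Passing the y-sorted list down the recursion, instead of re-sorting the band at every level, is what keeps the combination step linear.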

Complexity Analysis

• Preprocessing takes O(n lg n) time

• Divide takes O(n) time

• Conquer takes 2 T(n/2) time

• Combination takes O(n) time

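• So the recursion satisfies T(n) = 2 T(n/2) + O(n), which solves to O(n lg n); with preprocessing, the total running time is O(n lg n)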