Lecture 2: Divide-and-conquer, MergeSort, and Big-O notation
Announcements
• Homework!
  • HW1 will be released Friday.
  • It is due the following Friday.
• See the website for guidelines on homework, including:
  • Collaboration policy
  • Best practices / style guide
  • Will be posted by Friday!
Last time
• Algorithms are awesome and powerful!
• Algorithm designer's question: Can I do better?
• Karatsuba integer multiplication
  • Example of "Divide and Conquer"
  • Not-so-rigorous analysis
Cast
[Figure: four recurring characters, arranged on an axis from "Philosophy" to "Technical content".]
• Plucky the pedantic penguin
• Lucky the lackadaisical lemur
• Ollie the over-achieving ostrich
• Siggi the studious stork
Today
• Things we want to know about algorithms:
  • Does it work?
  • Is it efficient?
• We'll start to see how to answer these by looking at some examples of sorting algorithms:
  • InsertionSort
  • MergeSort
  • (SortingHatSort not discussed.)
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part 1 … more next time!)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Sorting
• Important primitive
• For today, we'll pretend all elements are distinct.
    6 4 3 8 1 5 2 7   →   1 2 3 4 5 6 7 8
I hope everyone did the pre-lecture exercise!
What was the mystery sort algorithm?
1. MergeSort
2. QuickSort
3. InsertionSort
4. BogoSort
def MysteryAlgorithmTwo(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
def mysteryAlgorithmOne(A):
    B = [None for i in range(len(A))]
    for x in A:  # insert each element of A into its place in B
        for i in range(len(B)):
            if B[i] == None or B[i] > x:
                # shift everything from position i onward to the right
                j = len(B)-1
                while j > i:
                    B[j] = B[j-1]
                    j -= 1
                B[i] = x
                break
    return B
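As printed above, `mysteryAlgorithmOne` never defines `x`: it is missing its outer loop over the elements of `A`. A self-contained sketch below restates both routines (adding the `for x in A` loop the first one needs) and checks them against Python's built-in `sorted`:

```python
import random

def mysteryAlgorithmTwo(A):
    # Shift each element left until it sits just after something smaller.
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = current

def mysteryAlgorithmOne(A):
    # Build a new sorted list B by inserting each x at its correct slot.
    B = [None for i in range(len(A))]
    for x in A:                         # the outer loop missing above
        for i in range(len(B)):
            if B[i] is None or B[i] > x:
                j = len(B) - 1
                while j > i:            # shift the tail right to make room
                    B[j] = B[j - 1]
                    j -= 1
                B[i] = x
                break
    return B

random.seed(4)
for trial in range(100):
    A = random.sample(range(1000), 8)   # distinct elements, as in lecture
    assert mysteryAlgorithmOne(A) == sorted(A)
    copy = list(A)
    mysteryAlgorithmTwo(copy)
    assert copy == sorted(A)
```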
Benchmark: insertion sort
• Say we want to sort:  6 4 3 8 5
• Insert items one at a time.
• How would we actually implement this?
We're going to go through this in some detail – it's good practice!
In your pre-lecture exercise…
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
InsertionSort example
Start by moving A[1] toward the beginning of the list until you find something smaller (or can't go any further):
    6 4 3 8 5  →  4 6 3 8 5
Then move A[2]:
    4 6 3 8 5  →  3 4 6 8 5
Then move A[3]:
    3 4 6 8 5  →  3 4 6 8 5
Then move A[4]:
    3 4 6 8 5  →  3 4 5 6 8
Then we are done!
InsertionSort
1. Does it work?
2. Is it fast?
Empirical answers…
• Does it work? You saw it worked on the pre-lecture exercise.
• Is it fast? IPython notebook lecture2_sorting.ipynb says:
InsertionSort
1. Does it work?
2. Is it fast?
• The "same" algorithm can be faster or slower depending on the implementation…
• We are interested in how fast the running time scales with n, the size of the input.
InsertionSort: running time
• n−1 iterations of the outer loop
• In the worst case, about n iterations of this inner loop
• Running time scales like n²
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
Seems plausible.
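One way to make the "scales like n²" claim concrete (a sketch, not from the slides) is to count iterations of the inner while loop. On a reversed array, every element is shifted all the way to the front, giving exactly n(n−1)/2 inner steps:

```python
def insertion_sort_count(A):
    """Insertion sort that also counts inner-while-loop iterations."""
    steps = 0
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
            steps += 1
        A[j + 1] = current
    return steps

# Worst case: a reversed array forces every shift: n*(n-1)/2 inner steps.
for n in [10, 100, 1000]:
    steps = insertion_sort_count(list(range(n, 0, -1)))
    assert steps == n * (n - 1) // 2
```

Doubling n quadruples the worst-case count, which is exactly the n² scaling pictured above.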
InsertionSort
1. Does it work?
2. Is it fast?
• Okay, so it's pretty obvious that it works.
• HOWEVER! In the future it won't be so obvious, so let's take some time now to see how we would prove this rigorously.
Why does this work?
• Say you have a sorted list,  3 4 6 8 , and another element, 5.
• Insert 5 right after the largest thing that's still smaller than 5. (Aka, right after 4.)
• Then you get a sorted list:  3 4 5 6 8

So just use this logic at every step.
    6 4 3 8 5    The first element, [6], makes up a sorted list.
                 So correctly inserting 4 into the list [6] means that [4,6] becomes a sorted list:
    4 6 3 8 5    The first two elements, [4,6], make up a sorted list.
                 So correctly inserting 3 into the list [4,6] means that [3,4,6] becomes a sorted list:
    3 4 6 8 5    The first three elements, [3,4,6], make up a sorted list.
                 So correctly inserting 8 into the list [3,4,6] means that [3,4,6,8] becomes a sorted list:
    3 4 6 8 5    The first four elements, [3,4,6,8], make up a sorted list.
                 So correctly inserting 5 into the list [3,4,6,8] means that [3,4,5,6,8] becomes a sorted list:
    3 4 5 6 8    YAY WE ARE DONE!
Recall: proof by induction
• Maintain a loop invariant.
  • A loop invariant is something that should be true at every iteration.
• Proceed by induction.
• Four steps in the proof by induction:
  • Inductive Hypothesis: The loop invariant holds after the i-th iteration.
  • Base case: the loop invariant holds before the 1st iteration.
  • Inductive step: If the loop invariant holds after the i-th iteration, then it holds after the (i+1)-st iteration.
  • Conclusion: If the loop invariant holds after the last iteration, then we win.
(This slide skipped in class; for reference only.)
Formally: induction
• Loop invariant(i): A[:i+1] is sorted.
  • A "loop invariant" is something that we maintain at every iteration of the algorithm.
• Inductive Hypothesis:
  • The loop invariant(i) holds at the end of the i-th iteration (of the outer loop).
• Base case (i=0):
  • Before the algorithm starts, A[:1] is sorted. ✓
• Inductive step:
  • Suppose the invariant holds at the end of the i-th iteration, so A[:i+1] is sorted. The (i+1)-st iteration inserts A[i+1] just after the largest element of A[:i+1] smaller than it, so A[:i+2] is sorted.
• Conclusion:
  • At the end of the (n−1)-st iteration (aka, at the end of the algorithm), A[:n] = A is sorted.
  • That's what we wanted! ✓

Example (this was iteration i=2): The first two elements, [4,6], make up a sorted list. So correctly inserting 3 into the list [4,6] means that [3,4,6] becomes a sorted list:
    4 6 3 8 5  →  3 4 6 8 5
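The loop invariant can also be checked mechanically. Here is a sketch that asserts invariant(i) — A[:i+1] is sorted — at the end of every iteration of the outer loop:

```python
def insertion_sort_checked(A):
    """Insertion sort that asserts the loop invariant after each outer iteration."""
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = current
        # Loop invariant(i): the prefix A[:i+1] is sorted.
        assert all(A[k] <= A[k + 1] for k in range(i)), "invariant broken!"

A = [6, 4, 3, 8, 5]
insertion_sort_checked(A)
assert A == [3, 4, 5, 6, 8]
```

An assertion like this is not a proof, but it is a useful sanity check that the invariant you plan to prove is actually the one the code maintains.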
Aside: proofs by induction
• We're gonna see/do/skip over a lot of them.
• I'm assuming you're comfortable with them from CS103.
  • When you assume…
• If that went by too fast and was confusing:
  • Slides [there's a hidden one with more info]
  • Lecture notes
  • Book
  • Office Hours
Make sure you really understand the argument on the previous slide! — Siggi the Studious Stork
To summarize
InsertionSort is an algorithm that correctly sorts an arbitrary n-element array in time that scales like n².
Can we do better?
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Can we do better?
• MergeSort: a divide-and-conquer approach
• Recall from last time:

Divide and Conquer:
    Big problem
      → Smaller problem + Smaller problem   (Recurse!)
        → Yet smaller problem + Yet smaller problem + Yet smaller problem + Yet smaller problem   (Recurse!)
MergeSort
    6 4 3 8 1 5 2 7
    6 4 3 8  |  1 5 2 7
    (recursive magic!)   (recursive magic!)
    3 4 6 8  |  1 2 5 7
    MERGE!
    1 2 3 4 5 6 7 8
Code for the MERGE step is given in the Lecture 2 notebook or the Lecture Notes.
How would you do this in-place? — Ollie the over-achieving ostrich
MergeSort Pseudocode

MERGESORT(A):
  • n = length(A)
  • if n ≤ 1:
    • return A                    # If A has length 1, it is already sorted!
  • L = MERGESORT(A[0 : n/2])     # Sort the left half
  • R = MERGESORT(A[n/2 : n])     # Sort the right half
  • return MERGE(L, R)            # Merge the two halves

See Lecture 2 IPython notebook for MergeSort Python code.
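The MERGE code lives in the notebook; as a self-contained sketch (this particular MERGE is my own translation, not necessarily the notebook's), the pseudocode above might look like this in Python:

```python
def merge(L, R):
    # Merge two sorted lists into one sorted list.
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i]); i += 1
        else:
            result.append(R[j]); j += 1
    result.extend(L[i:])   # at most one of these is non-empty
    result.extend(R[j:])
    return result

def merge_sort(A):
    n = len(A)
    if n <= 1:
        return A                      # length <= 1: already sorted!
    L = merge_sort(A[0:n // 2])       # sort the left half
    R = merge_sort(A[n // 2:n])       # sort the right half
    return merge(L, R)                # merge the two halves

assert merge_sort([6, 4, 3, 8, 1, 5, 2, 7]) == [1, 2, 3, 4, 5, 6, 7, 8]
```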
What actually happens? First, recursively break up the array all the way down to the base cases:
    6 4 3 8 1 5 2 7
    6 4 3 8  |  1 5 2 7
    6 4  |  3 8  |  1 5  |  2 7
    6 | 4 | 3 | 8 | 1 | 5 | 2 | 7      ← This array of length 1 is sorted!
A bunch of sorted lists of length 1 (in the order of the original sequence).

Then, merge them all back up!
    Merge! Merge! Merge! Merge!   →   4 6  |  3 8  |  1 5  |  2 7
    Merge! Merge!                 →   3 4 6 8  |  1 2 5 7
    Merge!                        →   1 2 3 4 5 6 7 8   ← Sorted sequence!
Two questions
1. Does this work?
2. Is it fast?
Empirically:
1. Seems to.
2. Maybe? IPython notebook says…
It works. Let's assume n = 2^t.
• Again we'll use induction, this time with an invariant that will remain true after every recursive call.
• Inductive hypothesis: "In every recursive call, MERGESORT returns a sorted array."
• Base case (n=1): a 1-element array is always sorted.
• Inductive step: Suppose that L and R are sorted. Then MERGE(L,R) is sorted.
  • Fill in the inductive step! (Either do it yourself or read it in CLRS!)
• Conclusion: "In the top recursive call, MERGESORT returns a sorted array."

MERGESORT(A):
  • n = length(A)
  • if n ≤ 1:
    • return A
  • L = MERGESORT(A[0 : n/2])
  • R = MERGESORT(A[n/2 : n])
  • return MERGE(L, R)
It's fast. Let's keep assuming n = 2^t.
CLAIM: MERGESORT requires at most 11·n·(log(n)+1) operations to sort n numbers.
• How does this compare to InsertionSort? Scaling like n² vs scaling like n·log(n)?
• What exactly is an "operation" here? We're leaving that vague on purpose. Also I made up the number 11.
Empirically
[See Lecture 2 Notebook for code; the plot shows one curve growing like n² and another that supposedly grows like n·log(n).]
The constant doesn't matter: eventually, n² > 111111 ⋅ n·log(n).
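To see this numerically (a sketch; the constant 111111 is the one from the slide), we can double n until n² overtakes 111111 · n·log₂(n):

```python
from math import log2

# Even with a huge constant, n^2 eventually overtakes C * n * log2(n):
C = 111111
n = 2
while n * n <= C * n * log2(n):
    n *= 2
assert n * n > C * n * log2(n)     # n^2 wins from here on out...
assert 4 * 4 < C * 4 * log2(4)     # ...but C * n * log2(n) is bigger for small n
```

The crossover happens somewhere in the low millions here, which is why asymptotics only tell you about large n.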
Quick log refresher
• log(n): how many times do you need to divide n by 2 in order to get down to 1?
    32 → 16 → 8 → 4 → 2 → 1, so log(32) = 5
    64 → 32 → 16 → 8 → 4 → 2 → 1, so log(64) = 6
• log(128) = 7, log(256) = 8, log(512) = 9, …
• log(number of particles in the universe) < 280
• All logarithms in this course are base 2.
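The "repeatedly divide by 2" definition translates directly into code (exact for powers of 2, the floor of log₂(n) otherwise):

```python
def halving_log(n):
    """How many times must n be divided by 2 to get down to 1?"""
    count = 0
    while n > 1:
        n //= 2       # integer division by 2
        count += 1
    return count

assert halving_log(32) == 5
assert halving_log(64) == 6
assert halving_log(512) == 9
```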
It's fast!
CLAIM: MERGESORT requires at most 11·n·(log(n)+1) operations to sort n numbers.
Much faster than InsertionSort for large n! (No matter how the algorithms are implemented, and no matter what that constant "11" is.)
Let's prove the claim
• Later we'll see more principled ways of analyzing divide-and-conquer algs.
• But for today let's just wing it.

Recursion tree:
    Level 0: one problem of size n
    Level 1: two problems of size n/2
    Level 2: four problems of size n/4
    …
    Level t: 2^t problems of size n/2^t
    …
    down to problems of size 1.

Focus on just one of these sub-problems, of size n/2^t. How much work in this sub-problem? It splits into two sub-problems of size n/2^(t+1), and the work is:
    time spent MERGE-ing the two subproblems + time spent within the two sub-problems.

Let k = n/2^t… How much work in this sub-problem of size k? It splits into two sub-problems of size k/2, and the work is:
    time spent MERGE-ing the two subproblems + time spent within the two sub-problems.
How long does it take to MERGE two lists of size k/2 into one list of size k?
    Example (k = 8):  3 4 6 8  and  1 2 5 7   →  MERGE!  →  1 2 3 4 5 6 7 8
Code for the MERGE step is given in the Lecture 2 notebook.
Plucky the Pedantic Penguin answers precisely:
• Time to initialize an array of size k
• Plus the time to initialize three counters
• Plus the time to increment two of those counters k/2 times each
• Plus the time to compare two values at least k times
• Plus the time to copy k values from the existing array to the big array
• Plus…

Lucky the lackadaisical lemur answers: let's say no more than 11k operations.
There's some justification for this number "11" in the lecture notes, but it's really pretty arbitrary.
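One rough way to tally "operations" (this counting scheme is my own choice, in Lucky's spirit: one per comparison plus one per element copied) shows that a merge of k total elements stays comfortably below 11k:

```python
def merge_counting(L, R):
    """Merge two sorted lists, roughly tallying 'operations':
    one per comparison and one per element copied."""
    ops = 0
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        ops += 1                       # one comparison
        if L[i] <= R[j]:
            result.append(L[i]); i += 1
        else:
            result.append(R[j]); j += 1
        ops += 1                       # one copy into result
    for leftover in L[i:] + R[j:]:
        result.append(leftover)
        ops += 1                       # copy each leftover element
    return result, ops

L, R = [3, 4, 6, 8], [1, 2, 5, 7]
merged, ops = merge_counting(L, R)
k = len(L) + len(R)
assert merged == sorted(L + R)
assert ops <= 11 * k                   # well within Lucky's budget
```

Whatever the exact accounting, the point is that the total is some constant times k, which is all the analysis below needs.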
Recursion tree

Level  | # problems | Size of each problem | Amount of work at this level
0      | 1          | n                    | 11n
1      | 2          | n/2                  | 11n
2      | 4          | n/4                  | 11n
…      | …          | …                    | …
t      | 2^t        | n/2^t                | 11n
…      | …          | …                    | …
log(n) | n          | 1                    | 11n

Total runtime…
• 11n steps per level, at every level
• log(n) + 1 levels
• 11n(log(n) + 1) steps total
That was the claim!
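Putting the levels together, a counting version of MERGESORT (using the same rough comparison-plus-copy tally as before, which is my own accounting choice) confirms the 11n(log(n)+1) bound empirically for n = 2^t:

```python
from math import log2

def merge_sort_counting(A):
    """MergeSort that tallies comparisons + copies made by all MERGEs."""
    n = len(A)
    if n <= 1:
        return A, 0
    L, ops_L = merge_sort_counting(A[:n // 2])
    R, ops_R = merge_sort_counting(A[n // 2:])
    merged, i, j, ops = [], 0, 0, 0
    while i < len(L) and j < len(R):
        ops += 2                        # one comparison, one copy
        if L[i] <= R[j]:
            merged.append(L[i]); i += 1
        else:
            merged.append(R[j]); j += 1
    leftover = L[i:] + R[j:]
    merged += leftover
    ops += len(leftover)                # one copy per leftover element
    return merged, ops + ops_L + ops_R

for t in range(1, 11):                  # n = 2^t, as in the analysis
    n = 2 ** t
    A = list(range(n, 0, -1))
    result, ops = merge_sort_counting(A)
    assert result == sorted(A)
    assert ops <= 11 * n * (log2(n) + 1)
```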
A few reasons to be grumpy
• Sorting  1 2 3 4 5 6 7 8  should take zero steps…
• What's with this 11k bound?
  • You (Mary) made that number "11" up.
  • Different operations don't take the same amount of time.

How we will deal with grumpiness
• Take a deep breath…
• Worst-case analysis
• Asymptotic notation
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Worst-case analysis
• In this class, we will focus on worst-case analysis
• Pros: very strong guarantee
• Cons: very strong guarantee

[Cartoon: the algorithm designer announces "Here is my algorithm!" (Do the thing / Do the stuff / Return the answer); an adversary replies "Here is an input!" — e.g. the already-sorted list 1 2 3 4 5 6 7 8, because sorting a sorted list should be fast!!]
Big-O notation
• What do we mean when we measure runtime?
  • We probably care about wall time: how long does it take to solve the problem, in seconds or minutes or hours?
• This is heavily dependent on the programming language, architecture, etc.
• These things are very important, but are not the point of this class.
• We want a way to talk about the running time of an algorithm, independent of these considerations.
  • (How long does an operation take? Why are we being so sloppy about that "11"?)
Main idea: Focus on how the runtime scales with n (the input size).
Asymptotic Analysis: How does the running time scale as n gets large?
Pros:
• Abstracts away from hardware- and language-specific issues.
• Makes algorithm analysis much more tractable.
Cons:
• Only makes sense if n is large (compared to the constant factors). 2^100000000000000 ⋅ n is "better" than n²?!?!
One algorithm is "faster" than another if its runtime scales better with the size of the input.
O(…) means an upper bound
• Let T(n), g(n) be functions of positive integers.
  • Think of T(n) as being a runtime: positive and increasing in n.
• We say "T(n) is O(g(n))" if g(n) grows at least as fast as T(n) as n gets large.
  • (Pronounced "big-oh of …" or sometimes "oh of …".)
• Formally,
    T(n) = O(g(n))  ⟺  ∃c, n₀ > 0 s.t. ∀n ≥ n₀,  0 ≤ T(n) ≤ c ⋅ g(n)
Example: 2n² + 10 = O(n²)
[Plot: n², 2n² + 10, and 3n²; beyond n₀, 3n² stays above 2n² + 10.]
Formally:
• Choose c = 3
• Choose n₀ = 4
• Then: ∀n ≥ 4,  0 ≤ 2n² + 10 ≤ 3 ⋅ n²
Same example: 2n² + 10 = O(n²), with different witnesses.
[Plot: n² and 7n².]
Formally:
• Choose c = 7
• Choose n₀ = 2
• Then: ∀n ≥ 2,  0 ≤ 2n² + 10 ≤ 7 ⋅ n²
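Both witness pairs can be spot-checked numerically over a finite range (evidence for, not a proof of, the ∀n ≥ n₀ statement):

```python
# Spot-check both witness pairs for 2n^2 + 10 = O(n^2):
# (c, n0) = (3, 4) and (c, n0) = (7, 2).
for c, n0 in [(3, 4), (7, 2)]:
    for n in range(n0, 10000):
        assert 0 <= 2 * n**2 + 10 <= c * n**2
```

Note that neither pair works below its n₀: for example 2·1² + 10 = 12 > 3·1² = 3. The definition only asks for the inequality eventually.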
Another example: n = O(n²)
[Plot: T(n) = n below g(n) = n².]
• Choose c = 1
• Choose n₀ = 1
• Then: ∀n ≥ 1,  0 ≤ n ≤ n²
Ω(…) means a lower bound
• We say "T(n) is Ω(g(n))" if g(n) grows at most as fast as T(n) as n gets large.
• Formally,
    T(n) = Ω(g(n))  ⟺  ∃c, n₀ > 0 s.t. ∀n ≥ n₀,  0 ≤ c ⋅ g(n) ≤ T(n)
    (Switched these compared to the definition of O!!)
Example: n log₂(n) = Ω(3n)
• Choose c = 1/3
• Choose n₀ = 3
• Then: ∀n ≥ 3,  0 ≤ 3n/3 ≤ n log₂(n)
Θ(…) means both!
• We say "T(n) is Θ(g(n))" if:
    T(n) = O(g(n))  -AND-  T(n) = Ω(g(n))
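The two definitions suggest a small helper (a sketch of my own, not from the lecture) that numerically sanity-checks a proposed witness pair (c, n₀) over a finite range; the Ω witnesses below for 2n² + 10 are likewise my own choice:

```python
def check_O(T, g, c, n0, n_max=10000):
    """Numerically sanity-check a big-O witness: 0 <= T(n) <= c*g(n)
    for n0 <= n < n_max. A finite check: evidence, not a proof."""
    return all(0 <= T(n) <= c * g(n) for n in range(n0, n_max))

def check_Omega(T, g, c, n0, n_max=10000):
    """Same idea with the inequality flipped: 0 <= c*g(n) <= T(n)."""
    return all(0 <= c * g(n) <= T(n) for n in range(n0, n_max))

# 2n^2 + 10 = O(n^2) with (c, n0) = (3, 4), and
# 2n^2 + 10 = Omega(n^2) with e.g. (c, n0) = (2, 1),
# so 2n^2 + 10 = Theta(n^2).
assert check_O(lambda n: 2 * n**2 + 10, lambda n: n**2, c=3, n0=4)
assert check_Omega(lambda n: 2 * n**2 + 10, lambda n: n**2, c=2, n0=1)
```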
Some more examples
• All degree-k polynomials* are O(nᵏ)
• For any k ≥ 1, nᵏ is not O(nᵏ⁻¹)
(On the board if we have time… if not see the lecture notes!)
* Need some caveat here… what is it?
Take-away from examples
• To prove T(n) = O(g(n)), you have to come up with c and n₀ so that the definition is satisfied.
• To prove T(n) is NOT O(g(n)), one way is proof by contradiction:
  • Suppose (to get a contradiction) that someone gives you a c and an n₀ so that the definition is satisfied.
  • Show that this someone must be lying to you by deriving a contradiction.
Yet more examples
• n³ + 3n = O(n³ − n²)
• n³ + 3n = Ω(n³ − n²)
• n³ + 3n = Θ(n³ − n²)
• 3ⁿ is NOT O(2ⁿ)
• log(n) = Ω(ln(n))
• log(n) = Θ(2^(log(log(n))))
Work through any of these that we don't have time to go through in class! — Siggi the Studious Stork
(Remember that log = log₂ in this class.)
Some brainteasers
• Are there functions f, g so that NEITHER f = O(g) nor f = Ω(g)?
• Are there non-decreasing functions f, g so that the above is true?
• Define the n-th Fibonacci number by F(0) = 1, F(1) = 1, F(n) = F(n−1) + F(n−2) for n ≥ 2.
  • 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …
  True or false:
  • F(n) = O(2ⁿ)
  • F(n) = Ω(2ⁿ)
— Ollie the Over-achieving Ostrich
What have we learned? Asymptotic Notation
• This makes both Plucky and Lucky happy.
  • Plucky the Pedantic Penguin is happy because there is a precise definition.
  • Lucky the Lackadaisical Lemur is happy because we don't have to pay close attention to all those pesky constant factors like "11".
• But we should always be careful not to abuse it.
• In the course, (almost) every algorithm we see will be actually practical, without needing to take n ≥ n₀ = 2^10000000.
("This is my happy face!")
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Wrap-Up

Recap
• InsertionSort runs in time O(n²)
• MergeSort is a divide-and-conquer algorithm that runs in time O(n log(n))
• How do we show an algorithm is correct?
  • Today, we did it by induction.
• How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic analysis

Next time
• A more systematic approach to analyzing the runtime of recursive algorithms.

Before next time
• Pre-Lecture Exercise: A few recurrence relations (see website)