Lecture 2: Divide-and-conquer, MergeSort, and Big-O notation
Announcements
• Homework!
  • HW1 will be released Friday.
  • It is due the following Friday.
• See the website for guidelines on homework, including:
  • Collaboration policy
  • Best practices / style guide
  • Will be posted by Friday!
Last time
• Algorithms are awesome and powerful!
• Algorithm designer's question: Can I do better?
• Karatsuba integer multiplication
  • Example of "Divide and Conquer"
  • Not-so-rigorous analysis
Cast
[Figure: four recurring characters, arranged on an axis from "Philosophy" to "Technical content".]
• Plucky the pedantic penguin
• Lucky the lackadaisical lemur
• Ollie the over-achieving ostrich
• Siggi the studious stork
Today
• Things we want to know about algorithms:
  • Does it work?
  • Is it efficient?
• We'll start to see how to answer these by looking at some examples of sorting algorithms:
  • InsertionSort
  • MergeSort
  • (SortingHatSort not discussed.)
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part 1 … more next time!)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Sorting
• Important primitive
• For today, we'll pretend all elements are distinct.
    6 4 3 8 1 5 2 7   →   1 2 3 4 5 6 7 8
I hope everyone did the pre-lecture exercise!
What was the mystery sort algorithm?
1. MergeSort
2. QuickSort
3. InsertionSort
4. BogoSort
def MysteryAlgorithmTwo(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
def mysteryAlgorithmOne(A):
    B = [None for i in range(len(A))]
    for x in A:  # insert each element of A into its place in B
        for i in range(len(B)):
            if B[i] == None or B[i] > x:
                # shift everything from position i onward to the right
                j = len(B)-1
                while j > i:
                    B[j] = B[j-1]
                    j -= 1
                B[i] = x
                break
    return B
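As printed above, `mysteryAlgorithmOne` never defines `x`: it is missing its outer loop over the elements of `A`. A self-contained sketch below restates both routines (adding the `for x in A` loop the first one needs) and checks them against Python's built-in `sorted`:

```python
import random

def mysteryAlgorithmTwo(A):
    # Shift each element left until it sits just after something smaller.
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = current

def mysteryAlgorithmOne(A):
    # Build a new sorted list B by inserting each x at its correct slot.
    B = [None for i in range(len(A))]
    for x in A:                         # the outer loop missing above
        for i in range(len(B)):
            if B[i] is None or B[i] > x:
                j = len(B) - 1
                while j > i:            # shift the tail right to make room
                    B[j] = B[j - 1]
                    j -= 1
                B[i] = x
                break
    return B

random.seed(4)
for trial in range(100):
    A = random.sample(range(1000), 8)   # distinct elements, as in lecture
    assert mysteryAlgorithmOne(A) == sorted(A)
    copy = list(A)
    mysteryAlgorithmTwo(copy)
    assert copy == sorted(A)
```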
Benchmark: insertion sort
• Say we want to sort:  6 4 3 8 5
• Insert items one at a time.
• How would we actually implement this?
We're going to go through this in some detail – it's good practice!
In your pre-lecture exercise…
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
InsertionSort example
Start by moving A[1] toward the beginning of the list until you find something smaller (or can't go any further):
    6 4 3 8 5  →  4 6 3 8 5
Then move A[2]:
    4 6 3 8 5  →  3 4 6 8 5
Then move A[3]:
    3 4 6 8 5  →  3 4 6 8 5
Then move A[4]:
    3 4 6 8 5  →  3 4 5 6 8
Then we are done!
InsertionSort
1. Does it work?
2. Is it fast?
Empirical answers…
• Does it work? You saw it worked on the pre-lecture exercise.
• Is it fast? IPython notebook lecture2_sorting.ipynb says:
InsertionSort
1. Does it work?
2. Is it fast?
• The "same" algorithm can be faster or slower depending on the implementation…
• We are interested in how fast the running time scales with n, the size of the input.
InsertionSort: running time
• n−1 iterations of the outer loop
• In the worst case, about n iterations of this inner loop
• Running time scales like n²
def InsertionSort(A):
for i in range(1,len(A)):
current = A[i]
j = i-1
while j >= 0 and A[j] > current:
A[j+1] = A[j]
j -= 1
A[j+1] = current
Seems plausible.
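One way to make the "scales like n²" claim concrete (a sketch, not from the slides) is to count iterations of the inner while loop. On a reversed array, every element is shifted all the way to the front, giving exactly n(n−1)/2 inner steps:

```python
def insertion_sort_count(A):
    """Insertion sort that also counts inner-while-loop iterations."""
    steps = 0
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
            steps += 1
        A[j + 1] = current
    return steps

# Worst case: a reversed array forces every shift: n*(n-1)/2 inner steps.
for n in [10, 100, 1000]:
    steps = insertion_sort_count(list(range(n, 0, -1)))
    assert steps == n * (n - 1) // 2
```

Doubling n quadruples the worst-case count, which is exactly the n² scaling pictured above.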
InsertionSort
1. Does it work?
2. Is it fast?
• Okay, so it's pretty obvious that it works.
• HOWEVER! In the future it won't be so obvious, so let's take some time now to see how we would prove this rigorously.
Why does this work?
• Say you have a sorted list,  3 4 6 8 , and another element, 5.
• Insert 5 right after the largest thing that's still smaller than 5. (Aka, right after 4.)
• Then you get a sorted list:  3 4 5 6 8

So just use this logic at every step.
    6 4 3 8 5    The first element, [6], makes up a sorted list.
                 So correctly inserting 4 into the list [6] means that [4,6] becomes a sorted list:
    4 6 3 8 5    The first two elements, [4,6], make up a sorted list.
                 So correctly inserting 3 into the list [4,6] means that [3,4,6] becomes a sorted list:
    3 4 6 8 5    The first three elements, [3,4,6], make up a sorted list.
                 So correctly inserting 8 into the list [3,4,6] means that [3,4,6,8] becomes a sorted list:
    3 4 6 8 5    The first four elements, [3,4,6,8], make up a sorted list.
                 So correctly inserting 5 into the list [3,4,6,8] means that [3,4,5,6,8] becomes a sorted list:
    3 4 5 6 8    YAY WE ARE DONE!
Recall: proof by induction
• Maintain a loop invariant.
  • A loop invariant is something that should be true at every iteration.
• Proceed by induction.
• Four steps in the proof by induction:
  • Inductive Hypothesis: The loop invariant holds after the i-th iteration.
  • Base case: the loop invariant holds before the 1st iteration.
  • Inductive step: If the loop invariant holds after the i-th iteration, then it holds after the (i+1)-st iteration.
  • Conclusion: If the loop invariant holds after the last iteration, then we win.
(This slide skipped in class; for reference only.)
Formally: induction
• Loop invariant(i): A[:i+1] is sorted.
  • A "loop invariant" is something that we maintain at every iteration of the algorithm.
• Inductive Hypothesis:
  • The loop invariant(i) holds at the end of the i-th iteration (of the outer loop).
• Base case (i=0):
  • Before the algorithm starts, A[:1] is sorted. ✓
• Inductive step:
  • Suppose the invariant holds at the end of the i-th iteration, so A[:i+1] is sorted. The (i+1)-st iteration inserts A[i+1] just after the largest element of A[:i+1] smaller than it, so A[:i+2] is sorted.
• Conclusion:
  • At the end of the (n−1)-st iteration (aka, at the end of the algorithm), A[:n] = A is sorted.
  • That's what we wanted! ✓

Example (this was iteration i=2): The first two elements, [4,6], make up a sorted list. So correctly inserting 3 into the list [4,6] means that [3,4,6] becomes a sorted list:
    4 6 3 8 5  →  3 4 6 8 5
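The loop invariant can also be checked mechanically. Here is a sketch that asserts invariant(i) — A[:i+1] is sorted — at the end of every iteration of the outer loop:

```python
def insertion_sort_checked(A):
    """Insertion sort that asserts the loop invariant after each outer iteration."""
    for i in range(1, len(A)):
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = current
        # Loop invariant(i): the prefix A[:i+1] is sorted.
        assert all(A[k] <= A[k + 1] for k in range(i)), "invariant broken!"

A = [6, 4, 3, 8, 5]
insertion_sort_checked(A)
assert A == [3, 4, 5, 6, 8]
```

An assertion like this is not a proof, but it is a useful sanity check that the invariant you plan to prove is actually the one the code maintains.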
Aside: proofs by induction
• We're gonna see/do/skip over a lot of them.
• I'm assuming you're comfortable with them from CS103.
  • When you assume…
• If that went by too fast and was confusing:
  • Slides [there's a hidden one with more info]
  • Lecture notes
  • Book
  • Office Hours
Make sure you really understand the argument on the previous slide! — Siggi the Studious Stork
To summarize
InsertionSort is an algorithm that correctly sorts an arbitrary n-element array in time that scales like n².
Can we do better?
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Can we do better?
• MergeSort: a divide-and-conquer approach
• Recall from last time:

Divide and Conquer:
    Big problem
      → Smaller problem + Smaller problem   (Recurse!)
        → Yet smaller problem + Yet smaller problem + Yet smaller problem + Yet smaller problem   (Recurse!)
MergeSort
    6 4 3 8 1 5 2 7
    6 4 3 8  |  1 5 2 7
    (recursive magic!)   (recursive magic!)
    3 4 6 8  |  1 2 5 7
    MERGE!
    1 2 3 4 5 6 7 8
Code for the MERGE step is given in the Lecture 2 notebook or the Lecture Notes.
How would you do this in-place? — Ollie the over-achieving ostrich
MergeSort Pseudocode

MERGESORT(A):
  • n = length(A)
  • if n ≤ 1:
    • return A                    # If A has length 1, it is already sorted!
  • L = MERGESORT(A[0 : n/2])     # Sort the left half
  • R = MERGESORT(A[n/2 : n])     # Sort the right half
  • return MERGE(L, R)            # Merge the two halves

See Lecture 2 IPython notebook for MergeSort Python code.
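The MERGE code lives in the notebook; as a self-contained sketch (this particular MERGE is my own translation, not necessarily the notebook's), the pseudocode above might look like this in Python:

```python
def merge(L, R):
    # Merge two sorted lists into one sorted list.
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            result.append(L[i]); i += 1
        else:
            result.append(R[j]); j += 1
    result.extend(L[i:])   # at most one of these is non-empty
    result.extend(R[j:])
    return result

def merge_sort(A):
    n = len(A)
    if n <= 1:
        return A                      # length <= 1: already sorted!
    L = merge_sort(A[0:n // 2])       # sort the left half
    R = merge_sort(A[n // 2:n])       # sort the right half
    return merge(L, R)                # merge the two halves

assert merge_sort([6, 4, 3, 8, 1, 5, 2, 7]) == [1, 2, 3, 4, 5, 6, 7, 8]
```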
What actually happens? First, recursively break up the array all the way down to the base cases:
    6 4 3 8 1 5 2 7
    6 4 3 8  |  1 5 2 7
    6 4  |  3 8  |  1 5  |  2 7
    6 | 4 | 3 | 8 | 1 | 5 | 2 | 7      ← This array of length 1 is sorted!
A bunch of sorted lists of length 1 (in the order of the original sequence).

Then, merge them all back up!
    Merge! Merge! Merge! Merge!   →   4 6  |  3 8  |  1 5  |  2 7
    Merge! Merge!                 →   3 4 6 8  |  1 2 5 7
    Merge!                        →   1 2 3 4 5 6 7 8   ← Sorted sequence!
Two questions
1. Does this work?
2. Is it fast?
Empirically:
1. Seems to.
2. Maybe? IPython notebook says…
It works. Let's assume n = 2^t.
• Again we'll use induction, this time with an invariant that will remain true after every recursive call.
• Inductive hypothesis: "In every recursive call, MERGESORT returns a sorted array."
• Base case (n=1): a 1-element array is always sorted.
• Inductive step: Suppose that L and R are sorted. Then MERGE(L,R) is sorted.
  • Fill in the inductive step! (Either do it yourself or read it in CLRS!)
• Conclusion: "In the top recursive call, MERGESORT returns a sorted array."

MERGESORT(A):
  • n = length(A)
  • if n ≤ 1:
    • return A
  • L = MERGESORT(A[0 : n/2])
  • R = MERGESORT(A[n/2 : n])
  • return MERGE(L, R)
It's fast. Let's keep assuming n = 2^t.
CLAIM: MERGESORT requires at most 11·n·(log(n)+1) operations to sort n numbers.
• How does this compare to InsertionSort? Scaling like n² vs scaling like n·log(n)?
• What exactly is an "operation" here? We're leaving that vague on purpose. Also I made up the number 11.
Empirically
[See Lecture 2 Notebook for code; the plot shows one curve growing like n² and another that supposedly grows like n·log(n).]
The constant doesn't matter: eventually, n² > 111111 ⋅ n·log(n).
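To see this numerically (a sketch; the constant 111111 is the one from the slide), we can double n until n² overtakes 111111 · n·log₂(n):

```python
from math import log2

# Even with a huge constant, n^2 eventually overtakes C * n * log2(n):
C = 111111
n = 2
while n * n <= C * n * log2(n):
    n *= 2
assert n * n > C * n * log2(n)     # n^2 wins from here on out...
assert 4 * 4 < C * 4 * log2(4)     # ...but C * n * log2(n) is bigger for small n
```

The crossover happens somewhere in the low millions here, which is why asymptotics only tell you about large n.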
Quick log refresher
• log(n): how many times do you need to divide n by 2 in order to get down to 1?
    32 → 16 → 8 → 4 → 2 → 1, so log(32) = 5
    64 → 32 → 16 → 8 → 4 → 2 → 1, so log(64) = 6
• log(128) = 7, log(256) = 8, log(512) = 9, …
• log(number of particles in the universe) < 280
• All logarithms in this course are base 2.
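The "repeatedly divide by 2" definition translates directly into code (exact for powers of 2, the floor of log₂(n) otherwise):

```python
def halving_log(n):
    """How many times must n be divided by 2 to get down to 1?"""
    count = 0
    while n > 1:
        n //= 2       # integer division by 2
        count += 1
    return count

assert halving_log(32) == 5
assert halving_log(64) == 6
assert halving_log(512) == 9
```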
It's fast!
CLAIM: MERGESORT requires at most 11·n·(log(n)+1) operations to sort n numbers.
Much faster than InsertionSort for large n! (No matter how the algorithms are implemented, and no matter what that constant "11" is.)
Let's prove the claim
• Later we'll see more principled ways of analyzing divide-and-conquer algs.
• But for today let's just wing it.

Recursion tree:
    Level 0: one problem of size n
    Level 1: two problems of size n/2
    Level 2: four problems of size n/4
    …
    Level t: 2^t problems of size n/2^t
    …
    down to problems of size 1.

Focus on just one of these sub-problems, of size n/2^t. How much work in this sub-problem? It splits into two sub-problems of size n/2^(t+1), and the work is:
    time spent MERGE-ing the two subproblems + time spent within the two sub-problems.

Let k = n/2^t… How much work in this sub-problem of size k? It splits into two sub-problems of size k/2, and the work is:
    time spent MERGE-ing the two subproblems + time spent within the two sub-problems.
How long does it take to MERGE two lists of size k/2 into one list of size k?
    Example (k = 8):  3 4 6 8  and  1 2 5 7   →  MERGE!  →  1 2 3 4 5 6 7 8
Code for the MERGE step is given in the Lecture 2 notebook.
Plucky the Pedantic Penguin answers precisely:
• Time to initialize an array of size k
• Plus the time to initialize three counters
• Plus the time to increment two of those counters k/2 times each
• Plus the time to compare two values at least k times
• Plus the time to copy k values from the existing array to the big array
• Plus…

Lucky the lackadaisical lemur answers: let's say no more than 11k operations.
There's some justification for this number "11" in the lecture notes, but it's really pretty arbitrary.
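One rough way to tally "operations" (this counting scheme is my own choice, in Lucky's spirit: one per comparison plus one per element copied) shows that a merge of k total elements stays comfortably below 11k:

```python
def merge_counting(L, R):
    """Merge two sorted lists, roughly tallying 'operations':
    one per comparison and one per element copied."""
    ops = 0
    result = []
    i = j = 0
    while i < len(L) and j < len(R):
        ops += 1                       # one comparison
        if L[i] <= R[j]:
            result.append(L[i]); i += 1
        else:
            result.append(R[j]); j += 1
        ops += 1                       # one copy into result
    for leftover in L[i:] + R[j:]:
        result.append(leftover)
        ops += 1                       # copy each leftover element
    return result, ops

L, R = [3, 4, 6, 8], [1, 2, 5, 7]
merged, ops = merge_counting(L, R)
k = len(L) + len(R)
assert merged == sorted(L + R)
assert ops <= 11 * k                   # well within Lucky's budget
```

Whatever the exact accounting, the point is that the total is some constant times k, which is all the analysis below needs.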
Recursion tree

Level  | # problems | Size of each problem | Amount of work at this level
0      | 1          | n                    | 11n
1      | 2          | n/2                  | 11n
2      | 4          | n/4                  | 11n
…      | …          | …                    | …
t      | 2^t        | n/2^t                | 11n
…      | …          | …                    | …
log(n) | n          | 1                    | 11n

Total runtime…
• 11n steps per level, at every level
• log(n) + 1 levels
• 11n(log(n) + 1) steps total
That was the claim!
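Putting the levels together, a counting version of MERGESORT (using the same rough comparison-plus-copy tally as before, which is my own accounting choice) confirms the 11n(log(n)+1) bound empirically for n = 2^t:

```python
from math import log2

def merge_sort_counting(A):
    """MergeSort that tallies comparisons + copies made by all MERGEs."""
    n = len(A)
    if n <= 1:
        return A, 0
    L, ops_L = merge_sort_counting(A[:n // 2])
    R, ops_R = merge_sort_counting(A[n // 2:])
    merged, i, j, ops = [], 0, 0, 0
    while i < len(L) and j < len(R):
        ops += 2                        # one comparison, one copy
        if L[i] <= R[j]:
            merged.append(L[i]); i += 1
        else:
            merged.append(R[j]); j += 1
    leftover = L[i:] + R[j:]
    merged += leftover
    ops += len(leftover)                # one copy per leftover element
    return merged, ops + ops_L + ops_R

for t in range(1, 11):                  # n = 2^t, as in the analysis
    n = 2 ** t
    A = list(range(n, 0, -1))
    result, ops = merge_sort_counting(A)
    assert result == sorted(A)
    assert ops <= 11 * n * (log2(n) + 1)
```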
A few reasons to be grumpy
• Sorting  1 2 3 4 5 6 7 8  should take zero steps…
• What's with this 11k bound?
  • You (Mary) made that number "11" up.
  • Different operations don't take the same amount of time.

How we will deal with grumpiness
• Take a deep breath…
• Worst-case analysis
• Asymptotic notation
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Worst-case analysis
• In this class, we will focus on worst-case analysis
• Pros: very strong guarantee
• Cons: very strong guarantee

[Cartoon: the algorithm designer announces "Here is my algorithm!" (Do the thing / Do the stuff / Return the answer); an adversary replies "Here is an input!" — e.g. the already-sorted list 1 2 3 4 5 6 7 8, because sorting a sorted list should be fast!!]
Big-O notation
• What do we mean when we measure runtime?
  • We probably care about wall time: how long does it take to solve the problem, in seconds or minutes or hours?
• This is heavily dependent on the programming language, architecture, etc.
• These things are very important, but are not the point of this class.
• We want a way to talk about the running time of an algorithm, independent of these considerations.
  • (How long does an operation take? Why are we being so sloppy about that "11"?)
Main idea: Focus on how the runtime scales with n (the input size).
Asymptotic Analysis: How does the running time scale as n gets large?
Pros:
• Abstracts away from hardware- and language-specific issues.
• Makes algorithm analysis much more tractable.
Cons:
• Only makes sense if n is large (compared to the constant factors). 2^100000000000000 ⋅ n is "better" than n²?!?!
One algorithm is "faster" than another if its runtime scales better with the size of the input.
O(…) means an upper bound
• Let T(n), g(n) be functions of positive integers.
  • Think of T(n) as being a runtime: positive and increasing in n.
• We say "T(n) is O(g(n))" if g(n) grows at least as fast as T(n) as n gets large.
  • (Pronounced "big-oh of …" or sometimes "oh of …".)
• Formally,
    T(n) = O(g(n))  ⟺  ∃c, n₀ > 0 s.t. ∀n ≥ n₀,  0 ≤ T(n) ≤ c ⋅ g(n)
Example: 2n² + 10 = O(n²)
[Plot: n², 2n² + 10, and 3n²; beyond n₀, 3n² stays above 2n² + 10.]
Formally:
• Choose c = 3
• Choose n₀ = 4
• Then: ∀n ≥ 4,  0 ≤ 2n² + 10 ≤ 3 ⋅ n²
Same example: 2n² + 10 = O(n²), with different witnesses.
[Plot: n² and 7n².]
Formally:
• Choose c = 7
• Choose n₀ = 2
• Then: ∀n ≥ 2,  0 ≤ 2n² + 10 ≤ 7 ⋅ n²
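Both witness pairs can be spot-checked numerically over a finite range (evidence for, not a proof of, the ∀n ≥ n₀ statement):

```python
# Spot-check both witness pairs for 2n^2 + 10 = O(n^2):
# (c, n0) = (3, 4) and (c, n0) = (7, 2).
for c, n0 in [(3, 4), (7, 2)]:
    for n in range(n0, 10000):
        assert 0 <= 2 * n**2 + 10 <= c * n**2
```

Note that neither pair works below its n₀: for example 2·1² + 10 = 12 > 3·1² = 3. The definition only asks for the inequality eventually.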
Another example: n = O(n²)
[Plot: T(n) = n below g(n) = n².]
• Choose c = 1
• Choose n₀ = 1
• Then: ∀n ≥ 1,  0 ≤ n ≤ n²
Ω(…) means a lower bound
• We say "T(n) is Ω(g(n))" if g(n) grows at most as fast as T(n) as n gets large.
• Formally,
    T(n) = Ω(g(n))  ⟺  ∃c, n₀ > 0 s.t. ∀n ≥ n₀,  0 ≤ c ⋅ g(n) ≤ T(n)
    (Switched these compared to the definition of O!!)
Example: n log₂(n) = Ω(3n)
• Choose c = 1/3
• Choose n₀ = 3
• Then: ∀n ≥ 3,  0 ≤ 3n/3 ≤ n log₂(n)
Θ(…) means both!
• We say "T(n) is Θ(g(n))" if:
    T(n) = O(g(n))  -AND-  T(n) = Ω(g(n))
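The two definitions suggest a small helper (a sketch of my own, not from the lecture) that numerically sanity-checks a proposed witness pair (c, n₀) over a finite range; the Ω witnesses below for 2n² + 10 are likewise my own choice:

```python
def check_O(T, g, c, n0, n_max=10000):
    """Numerically sanity-check a big-O witness: 0 <= T(n) <= c*g(n)
    for n0 <= n < n_max. A finite check: evidence, not a proof."""
    return all(0 <= T(n) <= c * g(n) for n in range(n0, n_max))

def check_Omega(T, g, c, n0, n_max=10000):
    """Same idea with the inequality flipped: 0 <= c*g(n) <= T(n)."""
    return all(0 <= c * g(n) <= T(n) for n in range(n0, n_max))

# 2n^2 + 10 = O(n^2) with (c, n0) = (3, 4), and
# 2n^2 + 10 = Omega(n^2) with e.g. (c, n0) = (2, 1),
# so 2n^2 + 10 = Theta(n^2).
assert check_O(lambda n: 2 * n**2 + 10, lambda n: n**2, c=3, n0=4)
assert check_Omega(lambda n: 2 * n**2 + 10, lambda n: n**2, c=2, n0=1)
```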
Some more examples
• All degree-k polynomials* are O(nᵏ)
• For any k ≥ 1, nᵏ is not O(nᵏ⁻¹)
(On the board if we have time… if not see the lecture notes!)
* Need some caveat here… what is it?
Take-away from examples
• To prove T(n) = O(g(n)), you have to come up with c and n₀ so that the definition is satisfied.
• To prove T(n) is NOT O(g(n)), one way is proof by contradiction:
  • Suppose (to get a contradiction) that someone gives you a c and an n₀ so that the definition is satisfied.
  • Show that this someone must be lying to you by deriving a contradiction.
Yet more examples
• n³ + 3n = O(n³ − n²)
• n³ + 3n = Ω(n³ − n²)
• n³ + 3n = Θ(n³ − n²)
• 3ⁿ is NOT O(2ⁿ)
• log(n) = Ω(ln(n))
• log(n) = Θ(2^(log(log(n))))
Work through any of these that we don't have time to go through in class! — Siggi the Studious Stork
(Remember that log = log₂ in this class.)
Some brainteasers
• Are there functions f, g so that NEITHER f = O(g) nor f = Ω(g)?
• Are there non-decreasing functions f, g so that the above is true?
• Define the n-th Fibonacci number by F(0) = 1, F(1) = 1, F(n) = F(n−1) + F(n−2) for n ≥ 2.
  • 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …
  True or false:
  • F(n) = O(2ⁿ)
  • F(n) = Ω(2ⁿ)
— Ollie the Over-achieving Ostrich
What have we learned? Asymptotic Notation
• This makes both Plucky and Lucky happy.
  • Plucky the Pedantic Penguin is happy because there is a precise definition.
  • Lucky the Lackadaisical Lemur is happy because we don't have to pay close attention to all those pesky constant factors like "11".
• But we should always be careful not to abuse it.
• In the course, (almost) every algorithm we see will be actually practical, without needing to take n ≥ n₀ = 2^10000000.
("This is my happy face!")
The plan
• Part I: Sorting Algorithms
  • InsertionSort: does it work and is it fast?
  • MergeSort: does it work and is it fast?
  • Skills:
    • Analyzing correctness of iterative and recursive algorithms.
    • Analyzing running time of recursive algorithms (part A)
• Part II: How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic Analysis
Wrap-Up

Recap
• InsertionSort runs in time O(n²)
• MergeSort is a divide-and-conquer algorithm that runs in time O(n log(n))
• How do we show an algorithm is correct?
  • Today, we did it by induction.
• How do we measure the runtime of an algorithm?
  • Worst-case analysis
  • Asymptotic analysis

Next time
• A more systematic approach to analyzing the runtime of recursive algorithms.

Before next time
• Pre-Lecture Exercise: A few recurrence relations (see website)