Advanced Algorithmics (6EAP): Dynamic Programming

Jaak Vilo, 2016 Fall (07.10.16)

Example

• Wind has blown away the +, *, (, ) signs
• What's the maximal value?
• Minimal?

2 1 7 1 4 3

• (2+1)*7*(1+4)*3 = 21*15 = 315
• 2*1+7+1*4+3 = 16

• Q: How to maximize the value of any expression?

2451981219872441123=?
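One way to answer the question is an interval DP over all split points. The sketch below is not from the slides; it assumes that only + and * may be inserted, that parentheses are free, and that all numbers are non-negative (so the maximum of an interval is always built from the maxima of its two parts, and likewise for the minimum). The name `extreme_values` is hypothetical.

```python
from functools import lru_cache


def extreme_values(nums):
    """Return (max, min) over all ways to insert +, * and parentheses.

    Assumption: every number is non-negative, so combining the best
    (or worst) values of the two halves is always optimal.
    """
    nums = tuple(nums)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Extreme values of the sub-expression nums[i..j].
        if i == j:
            return nums[i], nums[i]
        hi, lo = float("-inf"), float("inf")
        for k in range(i, j):  # last operator sits between k and k+1
            lmax, lmin = best(i, k)
            rmax, rmin = best(k + 1, j)
            hi = max(hi, lmax + rmax, lmax * rmax)
            lo = min(lo, lmin + rmin, lmin * rmin)
        return hi, lo

    return best(0, len(nums) - 1)


print(extreme_values([2, 1, 7, 1, 4, 3]))  # (315, 16)
```

There are O(n^2) intervals and O(n) split points per interval, so the sketch runs in O(n^3) time.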

• http://net.pku.edu.cn/~course/cs101/resource/Intro2Algorithm/book6/chap16.htm

• Dynamic programming, like the divide-and-conquer method, solves problems by combining the solutions to subproblems. (In Estonian: dünaamiline planeerimine.)

• Divide-and-conquer algorithms partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem.

• In contrast, dynamic programming is applicable when the subproblems are not independent, that is, when subproblems share subsubproblems.


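Dynamic programming

• Avoid calculating repeating subproblems

• fib(1) = fib(0) = 1
• fib(n) = fib(n-1) + fib(n-2)

• Although natural to encode recursively (and a useful task for novice programmers learning about recursion), this is inefficient.

[Figure: recursion tree of fib(n). The root n has children n-1 and n-2; these have children n-2, n-3 and n-3, n-4; and so on, with the same arguments repeating.]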

Structure within the problem

• The fact that it is not a tree indicates overlapping subproblems.

• A dynamic-programming algorithm solves every subsubproblem just once and then saves its answer in a table, thereby avoiding the work of recomputing the answer every time the subsubproblem is encountered.

Top-down (recursive, memoized)

• Top-down approach: This is the direct fall-out of the recursive formulation of any problem. If the solution to any problem can be formulated recursively using the solutions to its subproblems, and if its subproblems are overlapping, then one can easily memoize, or store, the solutions to the subproblems in a table. Whenever we attempt to solve a new subproblem, we first check the table to see if it is already solved. If a solution has been recorded, we can use it directly; otherwise we solve the subproblem and add its solution to the table.
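The top-down idea can be sketched on the Fibonacci recurrence, using the convention fib(0) = fib(1) = 1. The function name `fib_memo` and the use of a dictionary as the table are illustrative choices, not something the slides prescribe.

```python
def fib_memo(n, table=None):
    """Top-down Fibonacci with fib(0) = fib(1) = 1, memoized in a dict."""
    if table is None:
        table = {}
    if n <= 1:
        return 1
    if n not in table:  # solve each subproblem only once
        table[n] = fib_memo(n - 1, table) + fib_memo(n - 2, table)
    return table[n]


print(fib_memo(10))  # 89
```

Each value is computed once, so the number of additions is linear in n instead of exponential.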

Bottom-up

• Bottom-up approach: This is the more interesting case. Once we formulate the solution to a problem recursively in terms of its subproblems, we can try reformulating the problem in a bottom-up fashion: solve the subproblems first and use their solutions to build up to solutions of bigger subproblems. This is also usually done in tabular form by iteratively generating solutions to bigger and bigger subproblems from the solutions to smaller ones. For example, if we already know the values of F41 and F40, we can directly calculate the value of F42.
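A minimal bottom-up sketch of the same Fibonacci computation, again assuming fib(0) = fib(1) = 1. Only the last two values are kept, because F(n) depends on nothing older; the name `fib_bottom_up` is hypothetical.

```python
def fib_bottom_up(n):
    """Bottom-up Fibonacci with fib(0) = fib(1) = 1."""
    prev, curr = 1, 1  # fib(0), fib(1)
    for _ in range(2, n + 1):
        # Build the next value from the two smaller, already known ones.
        prev, curr = curr, prev + curr
    return curr


print(fib_bottom_up(10))  # 89
```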

• Dynamic programming is typically applied to optimization problems. In such problems there can be many possible solutions. Each solution has a value, and we wish to find a solution with the optimal (minimum or maximum) value.

• We call such a solution an optimal solution to the problem, as opposed to the optimal solution, since there may be several solutions that achieve the optimal value.


The development of a dynamic-programming algorithm can be broken into a sequence of four steps.

1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from computed information.

Edit distance (Levenshtein distance)

• Smallest number of edit operations to convert one string into the other

I N D U S T R Y

I N T E R E S T


Edit (Levenshtein) distance

• Definition: The edit distance $D(A,B)$ between strings $A$ and $B$ is the minimal number of edit operations needed to change $A$ into $B$. Allowed edit operations are deletion of a single letter, insertion of a letter, or replacement of one letter with another.

• Let $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$.
– E1: Deletion $a_i \to \varepsilon$
– E2: Insertion $\varepsilon \to b_j$
– E3: Substitution $a_i \to b_j$ (if $a_i \ne b_j$)

• Other possible variants:
– E4: Transposition $a_i a_{i+1} \to b_j b_{j+1}$, where $a_i = b_{j+1}$ and $a_{i+1} = b_j$ (e.g. lecture → letcure)

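How can we calculate this?

Write the two strings as $\alpha a$ and $\beta b$, where $a$ and $b$ are their last letters:

$$D(\alpha a, \beta b) = \min \begin{cases} D(\alpha, \beta) & \text{if } a = b \\ D(\alpha, \beta) + 1 & \text{if } a \ne b \\ D(\alpha a, \beta) + 1 \\ D(\alpha, \beta b) + 1 \end{cases}$$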

How can we calculate this efficiently?

For $S$ of length $n$ and $T$ of length $m$:

$$D(S,T) = \min \begin{cases} D(S[1..n-1], T[1..m-1]) + (S[n] = T[m] \;?\; 0 : 1) \\ D(S[1..n], T[1..m-1]) + 1 \\ D(S[1..n-1], T[1..m]) + 1 \end{cases}$$

Define: $d(i,j) = D(S[1..i], T[1..j])$. Then

$$d(i,j) = \min \begin{cases} d(i-1,j-1) + (S[i] = T[j] \;?\; 0 : 1) \\ d(i,j-1) + 1 \\ d(i-1,j) + 1 \end{cases}$$

[Figure: Recursion? In the matrix with column index j (x axis, 0..n) and row index i (y axis, 0..m), cell d(i,j) is computed from its three neighbours d(i-1,j-1), d(i,j-1) and d(i-1,j).]

Algorithm: Edit distance $D(A,B)$ using Dynamic Programming (DP)

Input: $A = a_1 a_2 \ldots a_m$, $B = b_1 b_2 \ldots b_n$
Output: value $d_{mn}$ in matrix $(d_{ij})$, $0 \le i \le m$, $0 \le j \le n$.

for i = 0 to m do d[i,0] = i
for j = 0 to n do d[0,j] = j
for j = 1 to n do
    for i = 1 to m do
        d[i,j] = min( d[i-1,j-1] + (if a[i] == b[j] then 0 else 1),
                      d[i-1,j] + 1,
                      d[i,j-1] + 1 )
return d[m,n]
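The table-filling algorithm can be sketched in Python as follows. This is a direct transcription under the assumption of unit costs for deletion, insertion and substitution; the function name `edit_distance` is illustrative, and Python's 0-based string indexing replaces the 1-based indices of the pseudocode.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the full (m+1) x (n+1) DP table."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of b[:j]
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else 1),
                d[i - 1][j] + 1,  # deletion of a[i-1]
                d[i][j - 1] + 1,  # insertion of b[j-1]
            )
    return d[m][n]


print(edit_distance("INDUSTRY", "INTEREST"))  # 6
```

Both time and space are Θ(mn) in this form.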

[Figure: Dynamic Programming. The (m+1) × (n+1) matrix with column index j (x axis) and row index i (y axis); each cell d(i,j) is filled in from d(i-1,j-1), d(i,j-1) and d(i-1,j).]

Edit distance = shortest path in

Matrix multiplication

• for i = 1..n
    for j = 1..k
        $c_{ij} = \sum_{x=1}^{m} a_{ix} b_{xj}$

[Figure: $A$ ($n \times m$) × $B$ ($m \times k$) = $C$ ($n \times k$)]

$O(nmk)$


MATRIX-MULTIPLY(A, B)
if columns[A] ≠ rows[B]
    then error "incompatible dimensions"
    else for i = 1 to rows[A]
             do for j = 1 to columns[B]
                    do C[i,j] = 0
                       for k = 1 to columns[A]
                           do C[i,j] = C[i,j] + A[i,k] * B[k,j]
         return C

Chain matrix multiplication

• The matrix-chain multiplication problem can be stated as follows: given a chain <A1, A2, ..., An> of n matrices,

• where matrix $A_i$ has dimension $p_{i-1} \times p_i$,

• fully parenthesize the product A1 A2 ... An in a way that minimizes the number of scalar multiplications.

A1 A2 A3 A4

• (A1(A2(A3A4))),
• (A1((A2A3)A4)),
• ((A1A2)(A3A4)),
• ((A1(A2A3))A4),
• (((A1A2)A3)A4).


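• Denote the number of alternative parenthesizations of a sequence of n matrices by P(n).

• Since we can split a sequence of n matrices between the kth and (k+1)st matrices for any k = 1, 2, ..., n-1 and then parenthesize the two resulting subsequences independently, we obtain the recurrence

$$P(n) = \begin{cases} 1 & \text{if } n = 1, \\ \sum_{k=1}^{n-1} P(k)\,P(n-k) & \text{if } n \ge 2. \end{cases}$$

• Problem 13-4 asked you to show that the solution to this recurrence is the sequence of Catalan numbers:

• $P(n) = C(n-1)$, where

$$C(n) = \frac{1}{n+1}\binom{2n}{n} = \Omega\!\left(4^n / n^{3/2}\right).$$

• The number of solutions is thus exponential in n, and the brute-force method of exhaustive search is therefore a poor strategy for determining the optimal parenthesization of a matrix chain.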
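As a quick sanity check of the counting argument, the sketch below evaluates the splitting recurrence for P(n) directly and compares it with the closed form of the Catalan numbers. The function names are hypothetical, and the check only covers small n.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_parenthesizations(n: int) -> int:
    """P(n): number of full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # Split between the k-th and (k+1)-st matrix, for k = 1..n-1.
    return sum(
        count_parenthesizations(k) * count_parenthesizations(n - k)
        for k in range(1, n)
    )


def catalan(n: int) -> int:
    """C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 10):
    assert count_parenthesizations(n) == catalan(n - 1)

print(count_parenthesizations(4))  # 5, matching the five groupings of A1A2A3A4
```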

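Let's crack the problem

$A_{i..j} = A_i \cdot A_{i+1} \cdots A_j$

• An optimal parenthesization of $A_1 \cdot A_2 \cdots A_n$ splits at some $k$, $k+1$.

• Optimal = $A_{1..k} \cdot A_{k+1..n}$

• $T(A_{1..n}) = T(A_{1..k}) + T(A_{k+1..n}) + T(A_{1..k} \cdot A_{k+1..n})$

• $T(A_{1..k})$ must be optimal for $A_1 \cdot A_2 \cdots A_k$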

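Recursion

• $m[i,j]$: the minimum number of scalar multiplications needed to compute the matrix $A_{i..j}$

• $m[i,i] = 0$
• $\mathrm{cost}(A_{i..k} \cdot A_{k+1..j}) = p_{i-1}\, p_k\, p_j$
• $m[i,j] = m[i,k] + m[k+1,j] + p_{i-1}\, p_k\, p_j$

• This recursive equation assumes that we know the value of $k$, which we don't. There are only $j - i$ possible values for $k$, however, namely $k = i, i+1, \ldots, j-1$.

• Since the optimal parenthesization must use one of these values for $k$, we need only check them all to find the best. Thus, our recursive definition for the minimum cost of parenthesizing the product $A_i A_{i+1} \cdots A_j$ becomes

$$m[i,j] = \begin{cases} 0 & \text{if } i = j, \\ \min_{i \le k < j} \{\, m[i,k] + m[k+1,j] + p_{i-1}\, p_k\, p_j \,\} & \text{if } i < j. \end{cases}$$

• To help us keep track of how to construct an optimal solution, let us define $s[i,j]$ to be a value of $k$ at which we can split the product $A_i A_{i+1} \cdots A_j$ to obtain an optimal parenthesization. That is, $s[i,j]$ equals a value $k$ such that $m[i,j] = m[i,k] + m[k+1,j] + p_{i-1}\, p_k\, p_j$.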

Recursion

• Checks all possibilities...

• But there are only a few subproblems: choose $i, j$ such that $1 \le i \le j \le n$, which gives $O(n^2)$ subproblems

• A recursive algorithm may encounter each subproblem many times in different branches of its recursion tree. This property of overlapping subproblems is the second hallmark of the applicability of dynamic programming.


[Figure: MATRIX-CHAIN-ORDER pseudocode with annotations: for each length from 2 to n; for each start index i; check all mid-points for optimality; new best value q, achieved at mid-point k.]

Example

matrix   dimensions
A1       30 × 35
A2       35 × 15
A3       15 × 5
A4       5 × 10
A5       10 × 20
A6       20 × 25

((A1(A2A3))((A4A5)A6))
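The bottom-up computation of the tables m and s can be sketched as below, using the six-matrix example above as input. This is a sketch in the spirit of MATRIX-CHAIN-ORDER rather than a copy of the slide's pseudocode; the helper `parenthesization` is a hypothetical addition that reads the split table s back into a string.

```python
def matrix_chain_order(p):
    """p[i-1] x p[i] is the dimension of matrix A_i, for i = 1..n."""
    n = len(p) - 1
    # 1-based tables; row and column 0 are unused padding.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # for each chain length 2..n
        for i in range(1, n - length + 2):  # for each start index i
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):           # check all mid-points
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:             # new best value q ...
                    m[i][j] = q
                    s[i][j] = k             # ... achieved at mid-point k
    return m, s


def parenthesization(s, i, j):
    """Rebuild an optimal parenthesization of A_i..A_j from table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesization(s, i, k)}{parenthesization(s, k + 1, j)})"


p = [30, 35, 15, 5, 10, 20, 25]
m, s = matrix_chain_order(p)
print(m[1][6])                    # 15125
print(parenthesization(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```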

• A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of $O(n^3)$ for the algorithm. The loops are nested three deep, and each loop index (l, i, and k) takes on at most n values.

• Time $\Omega(n^3)$ => $\Theta(n^3)$
• Space $\Theta(n^2)$

• Step 4 of the dynamic-programming paradigm is to construct an optimal solution from computed information.

• Use the table s[1..n, 1..n] to determine the best way to multiply the matrices.

Multiply using the s table

MATRIX-CHAIN-MULTIPLY(A, s, i, j)
if j > i
    then X = MATRIX-CHAIN-MULTIPLY(A, s, i, s[i,j])
         Y = MATRIX-CHAIN-MULTIPLY(A, s, s[i,j]+1, j)
         return MATRIX-MULTIPLY(X, Y)
    else return A_i

((A1(A2A3))((A4A5)A6))

Elements of dynamic programming

• Optimal substructure within an optimal solution

• Overlapping subproblems

• Memoization


• A memoized recursive algorithm maintains an entry in a table for the solution to each subproblem. Each table entry initially contains a special value to indicate that the entry has yet to be filled in. When the subproblem is first encountered during the execution of the recursive algorithm, its solution is computed and then stored in the table. Each subsequent time that the subproblem is encountered, the value stored in the table is simply looked up and returned. (tabulated)

• This approach presupposes that the set of all possible subproblem parameters is known and that the relation between table positions and subproblems is established. Another approach is to memoize by using hashing with the subproblem parameters as keys.
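The hashing variant can be sketched on the matrix-chain recurrence: a Python dict serves as the hash table, and the pair (i, j) of subproblem parameters is the key. The function name `memoized_matrix_chain` is illustrative.

```python
def memoized_matrix_chain(p):
    """Minimum scalar multiplications for the chain with dimensions p."""
    memo = {}  # hash table keyed by the subproblem parameters (i, j)

    def lookup(i, j):
        if i == j:
            return 0
        if (i, j) not in memo:  # first encounter: compute and store
            memo[(i, j)] = min(
                lookup(i, k) + lookup(k + 1, j) + p[i - 1] * p[k] * p[j]
                for k in range(i, j)
            )
        return memo[(i, j)]     # later encounters: plain lookup

    return lookup(1, len(p) - 1)


print(memoized_matrix_chain([30, 35, 15, 5, 10, 20, 25]))  # 15125
```

No table layout has to be fixed in advance here; only the subproblems that are actually reached get an entry.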

Overlapping subproblems

Longest Common Subsequence (LCS)

Optimal triangulation

Two ways of triangulating a convex polygon. Every triangulation of this 7-sided polygon has 7 - 3 = 4 chords and divides the polygon into 7 - 2 = 5 triangles.

The problem is to find a triangulation that minimizes the sum of the weights of the triangles in the triangulation.

Parse tree

Parse trees. (a) The parse tree for the parenthesized product ((A1(A2A3))(A4(A5A6))) and for the triangulation of the 7-sided polygon. (b) The triangulation of the polygon with the parse tree overlaid. Each matrix $A_i$ corresponds to the side $v_{i-1} v_i$ for i = 1, 2, ..., 6.

Optimal triangulation


Dynamic programming

• Avoid re-calculating the same subproblems by
– Characterising the optimal solution
– Clever ordering of calculations

[Figure: Dynamic Programming. The (m+1) × (n+1) edit-distance matrix; each cell d(i,j) is filled in from d(i-1,j-1), d(i,j-1) and d(i-1,j).]

Edit distance is a metric

• It can be shown that D(A,B) is a metric:
– D(A,B) ≥ 0, and D(A,B) = 0 iff A = B
– D(A,B) = D(B,A)
– D(A,C) ≤ D(A,B) + D(B,C)

Path of edit operations

• An optimal solution (the path itself) can be calculated afterwards
– Quite typical in dynamic programming

• Memorize the sets pred[i,j], recording from where $d_{ij}$ was reached.


Three possible minimizing paths

• Add into pred[i,j]
– (i-1,j-1) if $d_{ij} = d_{i-1,j-1}$ + (if $a_i = b_j$ then 0 else 1)
– (i-1,j) if $d_{ij} = d_{i-1,j} + 1$
– (i,j-1) if $d_{ij} = d_{i,j-1} + 1$

The path (in reverse order) ε → c6, b5 → b5, c4 → c4, a3 → a3, a2 → b2, b1 → a1.

Multiple paths possible

• All paths are correct

• There can be many (how many?) paths
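Recovering one optimal path can be sketched as follows. Instead of storing the sets pred[i,j] explicitly, this sketch re-checks the three conditions while walking back from d[m][n], which yields one of the possibly many optimal paths. The function name `edit_script` and the representation of operations as (x, y) pairs, with the empty string standing for ε, are assumptions made for illustration.

```python
def edit_script(a: str, b: str):
    """Return (distance, ops); each op is a pair (x, y) meaning x -> y."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                          d[i - 1][j] + 1,
                          d[i][j - 1] + 1)
    ops, i, j = [], m, n
    while i > 0 or j > 0:  # walk back from (m, n) to (0, 0)
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            ops.append((a[i - 1], b[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append((a[i - 1], ""))        # deletion
            i -= 1
        else:
            ops.append(("", b[j - 1]))        # insertion
            j -= 1
    ops.reverse()
    return d[m][n], ops


dist, ops = edit_script("INDUSTRY", "INTEREST")
print(dist)
print(ops)
```

The order of the three checks decides which optimal path is reported; any other order gives an equally valid path.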

Space can be reduced

Calculation of D(A,B) in space Θ(m)

Input: $A = a_1 a_2 \ldots a_m$, $B = b_1 b_2 \ldots b_n$ (choose m ≤ n)
Output: $d_{mn} = D(A,B)$

for i = 0 to m do C[i] = i
for j = 1 to n do
    c = C[0]; C[0] = j
    for i = 1 to m do
        d = min( c + (if a[i] == b[j] then 0 else 1), C[i-1] + 1, C[i] + 1 )
        c = C[i]    // memorize new "diagonal" value
        C[i] = d
write C[m]

Time complexity is Θ(mn), since C[0..m] is filled n times.
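A Python sketch of the single-column version follows. The names `col` and `diag` are illustrative; `diag` plays the role of the saved diagonal value d(i-1, j-1). The swap at the start is an assumption that makes the shorter string index the column, matching "choose m ≤ n".

```python
def edit_distance_linear_space(a: str, b: str) -> int:
    """Levenshtein distance keeping a single column of length m + 1."""
    if len(a) > len(b):
        a, b = b, a  # edit distance is symmetric, so keep m <= n
    m = len(a)
    col = list(range(m + 1))  # column j = 0: d[i][0] = i
    for j, bj in enumerate(b, start=1):
        diag = col[0]  # d[0][j-1]
        col[0] = j     # d[0][j]
        for i in range(1, m + 1):
            new = min(diag + (a[i - 1] != bj),  # from d[i-1][j-1]
                      col[i - 1] + 1,           # from d[i-1][j]
                      col[i] + 1)               # from d[i][j-1]
            diag = col[i]  # becomes the diagonal value for row i + 1
            col[i] = new
    return col[m]


print(edit_distance_linear_space("INDUSTRY", "INTEREST"))  # 6
```

The price of the saving is that the path of edit operations can no longer be read back from the table.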

Shortest path in the graph: http://en.wikipedia.org/wiki/Shortest_path_problem


All nodes at distance 1 from source

Edit distance = shortest path in

Observations?

• The shortest path is close to the diagonal
– If a short-distance path exists

• Values along any diagonal can only increase (by at most 1)

Diagonal

Diagonal nr. 2: $d_{02}, d_{13}, d_{24}, d_{35}, d_{46}$

Diagonal $k$, $-m \le k \le n$, is such that it contains only those $d_{ij}$ where $j - i = k$.

Property of any diagonal: moving down a diagonal, the values of the matrix $(d_{ij})$ either increase by 1 or stay the same.

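Diagonal lemma

Lemma: For each $d_{ij}$, $1 \le i \le m$, $1 \le j \le n$, it holds that $d_{ij} = d_{i-1,j-1}$ or $d_{ij} = d_{i-1,j-1} + 1$.
(Notice that $d_{ij}$ and $d_{i-1,j-1}$ are on the same diagonal.)

Proof: Since $d_{ij}$ is an integer, it suffices to show:
1. $d_{ij} \le d_{i-1,j-1} + 1$
2. $d_{ij} \ge d_{i-1,j-1}$

Claim 1 holds directly from the definition of edit distance: the recurrence gives $d_{ij} \le d_{i-1,j-1} + 1$.

Claim 2 is shown by induction on $i + j$:
– The basis is trivial when $i = 0$ or $j = 0$ (if we agree that $d_{-1,j-1} = d_{0j}$).
– Induction step: there are 3 possibilities.
  • In the minimization, $d_{ij}$ is calculated from entry $d_{i-1,j-1}$; hence $d_{ij} \ge d_{i-1,j-1}$.
  • In the minimization, $d_{ij}$ is calculated from entry $d_{i-1,j}$; hence $d_{ij} = d_{i-1,j} + 1 \ge d_{i-2,j-1} + 1 \ge d_{i-1,j-1}$.
  • In the minimization, $d_{ij}$ is calculated from entry $d_{i,j-1}$. Analogous to the previous case.
– Hence, $d_{i-1,j-1} \le d_{ij}$.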
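The lemma can also be checked empirically on concrete strings. The sketch below builds the full matrix with unit costs and tests every pair of diagonal neighbours; it illustrates the statement and is not a substitute for the proof. The function name is hypothetical.

```python
def diagonal_lemma_holds(a: str, b: str) -> bool:
    """Check d[i][j] - d[i-1][j-1] in {0, 1} for all i, j >= 1."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                          d[i - 1][j] + 1,
                          d[i][j - 1] + 1)
    return all(d[i][j] - d[i - 1][j - 1] in (0, 1)
               for i in range(1, m + 1)
               for j in range(1, n + 1))


print(diagonal_lemma_holds("INDUSTRY", "INTEREST"))  # True
```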


Transform the matrix into $f_{kp}$

• For each diagonal, only show the position (row index) where the value is increased by 1.

• Also, one can restrict the matrix $(d_{ij})$ to only the part where $d_{ij} \le d_{mn}$, since only those $d_{ij}$ can be on the shortest path.

• We'll use the matrix $(f_{kp})$ that represents the diagonals of $(d_{ij})$:
– $f_{kp}$ is the row index $i$ from $d_{ij}$ such that on diagonal $k$ the value $p$ reaches row $i$ ($d_{ij} = p$ and $j - i = k$).
– Initialization: $f_{0,-1} = -1$ and $f_{kp} = -\infty$ when $p \le |k| - 1$;
– $d_{mn} = p$ such that $f_{n-m,p} = m$

Calculating matrix $(f_{kp})$ by columns

• Assume that column $p - 1$ of $(f_{kp})$ has been calculated, and we want to calculate $f_{kp}$ (the region where $d_{ij} = p$).

• On diagonal $k$, values $p$ reach at least the row $t = \max(f_{k,p-1} + 1,\; f_{k-1,p-1},\; f_{k+1,p-1} + 1)$, if the diagonal $k$ reaches that far.

• If on row $t + 1$ additionally $a_i = b_j$ on the same diagonal, then $d_{ij}$ cannot increase, and value $p$ reaches row $t + 1$.

• Repeat the previous step until $a_i \ne b_j$ on diagonal $k$.

• $f_{k,p-1} + 1$: same diagonal

• $f_{k-1,p-1}$: diagonal below

• $f_{k+1,p-1} + 1$: diagonal above

Algorithm A(): calculate $f_{kp}$

A(k, p)
t = max( f[k,p-1] + 1, f[k-1,p-1], f[k+1,p-1] + 1 )
while a[t+1] == b[t+1+k] do t = t + 1
f[k,p] = if t > m or t + k > n then undefined else t

f 0,2 … t= max (3,2,3) = 2A[3]=B[3] , … A[5]=B[5] => f(0,2) = 5

f(1,2) … t


Algorithm: Diagonal method by columns

p = -1
while f[n-m,p] ≠ m do
    p = p + 1
    for k = -p to p do    // f[k,p] = A(k,p)
        t = max( f[k,p-1] + 1, f[k-1,p-1], f[k+1,p-1] + 1 )
        while a[t+1] == b[t+1+k] do t = t + 1
        f[k,p] = if t > m or t + k > n then undefined else t
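The diagonal method can be sketched in Python as follows. Several details are assumptions made for illustration: the entries $f_{kp}$ live in a dict keyed by (k, p), a missing key stands for both "undefined" and −∞, and the sliding loop adds explicit bounds checks that the pseudocode leaves implicit. The function name is hypothetical.

```python
def edit_distance_diagonal(a: str, b: str) -> int:
    """Edit distance via the diagonal method, column by column in p."""
    m, n = len(a), len(b)
    NEG = float("-inf")
    f = {(0, -1): -1}  # initialization: f[0,-1] = -1, everything else -inf

    def get(k, p):
        return f.get((k, p), NEG)

    p = -1
    while get(n - m, p) != m:
        p += 1
        for k in range(-p, p + 1):
            t = max(get(k, p - 1) + 1,      # same diagonal
                    get(k - 1, p - 1),      # diagonal below
                    get(k + 1, p - 1) + 1)  # diagonal above
            if t == NEG:
                continue                    # value p does not reach diagonal k
            # Slide down the diagonal while letters match (0-based strings).
            while t < m and 0 <= t + k < n and a[t] == b[t + k]:
                t += 1
            if t <= m and 0 <= t + k <= n:  # otherwise f[k,p] stays undefined
                f[(k, p)] = t
    return p


print(edit_distance_diagonal("INDUSTRY", "INTEREST"))  # 6
```

Only about (2p + 1) entries are computed per value of p, so for similar strings far fewer cells are touched than in the full m × n table.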

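• $p$ can only occur on diagonals $-p \le k \le p$.
• The method can be improved, since $k$ is often such that $f_{kp}$ is undefined.

• We can narrow the range of $k$:
– $-m \le k \le n$ (diagonal numbers)
– Let $m \le n$ and $d_{ij}$ be on diagonal $k$.

• if $-m \le k \le 0$ then $|k| \le d_{ij} \le m$
• if $1 \le k \le n$ then $k \le d_{ij} \le k + m$
• Hence, $-m \le k \le m$ if $p \le m$, and $p - m \le k \le p$ if $p \ge m$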

Extensions to basic edit distance

• New operations

• Variable costs

• Time Warping

Transposition (ab → ba)

• E4: Transposition $a_i a_{i+1} \to b_j b_{j+1}$, such that $a_i = b_{j+1}$ and $a_{i+1} = b_j$

• (e.g.: lecture → letcure)

$$d(i,j) = \min \begin{cases} d(i-1,j-1) + (S[i] = T[j] \;?\; 0 : 1) \\ d(i,j-1) + 1 \\ d(i-1,j) + 1 \\ d(i-2,j-2) + (\text{if } S[i-1]\,S[i] = T[j]\,T[j-1] \text{ then } 1 \text{ else } \infty) \end{cases}$$
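The four-way recurrence with transposition can be sketched as below. Because a transposed pair is never edited again in this recurrence, the result is the variant often called "optimal string alignment" distance; that label and the function name are additions for orientation, not terms from the slides.

```python
def transposition_distance(s: str, t: str) -> int:
    """Edit distance with adjacent transposition as a fourth operation."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j - 1] + (s[i - 1] != t[j - 1]),
                          d[i][j - 1] + 1,
                          d[i - 1][j] + 1)
            # Case 4: the last two letters of s[:i] are the last two
            # letters of t[:j] in swapped order.
            if (i > 1 and j > 1
                    and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[m][n]


print(transposition_distance("lecture", "letcure"))  # 1
```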

Generalized edit distance

• Use more operations E1...En, and provide different costs for each.

• Definition. Let $x, y \in \Sigma^*$. Then every $x \to y$ is an edit operation. An edit operation replaces $x$ by $y$.
– If $A = uxv$ then after the operation, $A = uyv$

• We denote by $w(x \to y)$ the cost or weight of the operation.

• The cost may depend on $x$ and/or $y$. But we assume $w(x \to y) \ge 0$.


Generalized edit distance

• If operations can only be applied in parallel, i.e. the part already changed cannot be modified again, then we can use dynamic programming.

• Otherwise it is an algorithmically unsolvable problem, since the question "can A be transformed into B using the operations of G" is unsolvable.

• The diagonal method in general may not be applicable.

• But since each diversion from the diagonal slightly increases the cost, one can stay within a narrow region around the diagonal.
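Under the "parallel" assumption, the DP generalizes directly: d(i, j) is the minimum over all operations x → y whose left side ends at position i of A and whose right side ends at position j of B. The sketch below assumes the operations are given as a dict from (x, y) pairs to non-negative weights, and that copying an identical letter is free; the function name and the weights in the usage example are invented for illustration.

```python
def generalized_edit_distance(a: str, b: str, ops: dict) -> float:
    """ops maps (x, y) to the weight w(x -> y) >= 0; matches cost 0."""
    INF = float("inf")
    m, n = len(a), len(b)
    d = [[INF] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 1][j - 1])  # free match
            for (x, y), w in ops.items():
                lx, ly = len(x), len(y)
                if lx == 0 and ly == 0:
                    continue  # an empty operation would not make progress
                # x must be a suffix of a[:i] and y a suffix of b[:j].
                if lx <= i and ly <= j and a[i - lx:i] == x and b[j - ly:j] == y:
                    d[i][j] = min(d[i][j], d[i - lx][j - ly] + w)
    return d[m][n]


# Hypothetical weights for two multi-letter operations.
ops = {("a", "aa"): 0.5, ("hw", "f"): 0.5}
print(generalized_edit_distance("ahwrika", "aafrika", ops))  # 1.0
```

If no sequence of the given operations transforms a into b, the sketch returns infinity. Scanning every operation at every cell is the naive part; it costs a factor proportional to the number of operations.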

Applicationsofgeneralizededitdistance

• Historicdocuments,names• Humanlanguageanddialects• Transliterationrulesfromonealphabettoanothere.g.Tõugu=>Tyugu(viaRussian)

• ...

Examples

näituseks– näiteksAhwrika - Aafrikaweikese - väikesematerjaali - materjali

tuseks -> teksa -> aa , hw -> fw -> v , e -> äaa -> a

“kavalam” otsimineDush, dušš, dushsh ? Gorbatšov, Gorbatshov, Горбачов, Gorbachevrežiim, rezhiim, riim


Links

• Est-Eng; Old Estonian; Est-Rus transliteration
– https://biit-dev.cs.ut.ee/~orasmaa/gen_ed_test/

• Pronunciation
– https://biit-dev.cs.ut.ee/~orasmaa/ing_ligikaudne/

• GitHub (Reina Uba; Siim Orasmaa)
– https://github.com/soras/genEditDist

– Orasmaa, Siim; Käärik, Reina; Vilo, Jaak; Hennoste, Tiit (2010). Information Retrieval of Word Form Variants in Spoken Language Corpora Using Generalized Edit Distance. In: Calzolari, Nicoletta; Choukri, Khalid; Maegaard, Bente; Mariani, Joseph; Odjik, Jan (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10), Valletta, Malta, May 17-23, 2010. Valletta, Malta: ELRA, 623-629.

How?

• Apply Aho-Corasick to match for all possible edit operations

• Use the minimum over all possible such operations and costs

• Implementation: Reina Käärik, Siim Orasmaa

Possible problems / tasks

• Manually create sensible lists of operations
– For English, Russian, etc.
– Old language

• Improve the speed of the algorithm (testing)

• Train for automatic extraction of edit operations and respective costs from examples of matching words

Advanced Dynamic Programming

• Robert Giegerich:
– http://www.techfak.uni-bielefeld.de/ags/pi/lehre/ADP/

• Algebraic dynamic programming
– Functional style
– Haskell compiles into C