Page 1: Intro Sequence comparisons Visualization Alignments Scoring

Intro Sequence comparisons Visualization Alignments Scoring Algorithms

Last time

• Introduction
• What is Bioinformatics?
• Databases in Bioinformatics

Page 2: Intro Sequence comparisons Visualization Alignments Scoring

Today: Sequence comparisons

• Visualisation
• Different objectives
• Pairwise alignments

Page 3: Intro Sequence comparisons Visualization Alignments Scoring

Sequence comparisons: Goals

• What are the similarities?
• Local similarities: domains and motifs
• What is variable?
• Identify positions: basis for evolutionary studies
• Understand structural similarities
• Determine ancestry

Page 7: Intro Sequence comparisons Visualization Alignments Scoring

Homology

• Definition: Homology = common ancestry
• Principle: Similarity ⇒ homology
• Quote: "These sequences are somewhat homologous". Bad! Similarity ≠ homology
• Correct: "These sequences are somewhat similar".

Page 11: Intro Sequence comparisons Visualization Alignments Scoring

Important questions

• When are two sequences significantly similar?
• How do we evaluate similarity?

Page 13: Intro Sequence comparisons Visualization Alignments Scoring

Data

• DNA: genes, genomes, non-coding DNA, etc.
• Codons
• RNA
• Peptides

Page 14: Intro Sequence comparisons Visualization Alignments Scoring

Idea of dotplots

    Q V A S K I N T N E
  S . . . • . . . . . .
  V . • . . . . . . . .
  A . . • . . . . . . .
  T . . . . . . . • . .
  K . . . . • . . . . .
  I . . . . . • . . . .
  Y . . . . . . . . . .
  M . . . . . . . . . .
  N . . . . . . • . • .
  E . . . . . . . . . •

Put dot where identical residues, then filter out randomness
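
The dotplot idea above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the filtering step (keeping only dots that lie on a diagonal run of some minimum length) is one common way to "filter out randomness", assumed here because the slide does not say which filter is meant.

```python
def identity_dots(horizontal, vertical):
    """Set of (row, column) positions where the two residues are identical."""
    return {(i, j)
            for i, v in enumerate(vertical)
            for j, h in enumerate(horizontal)
            if v == h}


def filter_runs(dots, min_run=2):
    """Keep only dots on a diagonal run of at least `min_run` dots.

    Assumed filter: isolated dots are treated as random matches.
    """
    kept = set()
    for (i, j) in dots:
        if (i - 1, j - 1) in dots:
            continue  # not the first dot of its diagonal run
        run = []
        while (i, j) in dots:
            run.append((i, j))
            i, j = i + 1, j + 1
        if len(run) >= min_run:
            kept.update(run)
    return kept


def render(horizontal, vertical, dots):
    """Draw the dot matrix as text, one row per residue of `vertical`."""
    lines = ["  " + " ".join(horizontal)]
    for i, v in enumerate(vertical):
        cells = ["•" if (i, j) in dots else "." for j in range(len(horizontal))]
        lines.append(v + " " + " ".join(cells))
    return "\n".join(lines)


horizontal = "QVASKINTNE"  # fragment from the slide, horizontal axis
vertical = "SVATKIYMNE"    # fragment from the slide, vertical axis
dots = identity_dots(horizontal, vertical)
print(render(horizontal, vertical, dots))
print()
print(render(horizontal, vertical, filter_runs(dots, min_run=2)))
```

With `min_run=2` the isolated dots in rows S and T and the off-diagonal dot in row N disappear, and only the short diagonal stretches remain.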

Page 17: Intro Sequence comparisons Visualization Alignments Scoring

Dotplots in practice

PttMAP20 (horizontal) vs. OsMAP20 (vertical)

[Figure: dotplot; horizontal axis 0 to 100, vertical axis 0 to 150]

Page 21: Intro Sequence comparisons Visualization Alignments Scoring

What happened here?

s1: A B C D
s2: A C B D

Page 23: Intro Sequence comparisons Visualization Alignments Scoring

Genomic dotplot

Many inversions around origin and termini of replication.

Page 25: Intro Sequence comparisons Visualization Alignments Scoring

Visualizing with alignment

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRR
             S+ +PK  + ++ +P   F+LHT +RA+KRA FNY VA+KI  NE  +R
PttMAP20  43 SKVAPKPFAKENTKPQE-FKLHTGQRALKRAMFNYSVATKIYMNEQQKR

OsMAP20  118 FEEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKE
               E++ K+IEE E++ MRKEMV +AQLMP FD+PF PQRS+RPLTVP+E
PttMAP20  91 QIERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPRE

OsMAP20  167 PSF
             PSF
PttMAP20 140 PSF


OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ  91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153

Page 27: Intro Sequence comparisons Visualization Alignments Scoring

Alignments

• Def: A pairwise alignment is a pairing of symbols between two sequences.
• Global alignment: Involves whole sequences.
• Local alignment: Involves parts of sequences.
• Semiglobal or ends-free alignment: Ignore "overhang" in similar sequences with different lengths.

Page 31: Intro Sequence comparisons Visualization Alignments Scoring

Global vs local

Local alignment:

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ  91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153

Global alignment:

OsMAP20    1 MEK--TRKATSPKSSMTSSTGPKSPVRNGGSPPHKKSTSEFRGRKNESQI  48
             |||  |:.|.......:|.:.|.|....|.:....|..
PttMAP20   1 MEKAHTKSALKKLVKASSQSAPWSNAARGMAKDDLKDP------------  38

OsMAP20   49 FRKGGQDSITLDESKRRSPTSQTSPKRSSPKHEQPLSYFRLHTEERAIKR  98
                      ..|:||       .:||..:.::.:| ..|:|||.:||:||
PttMAP20  39 ---------LYDKSK-------VAPKPFAKENTKP-QEFKLHTGQRALKR  71

OsMAP20   99 AGFNYQVASKINTNEIIRRFEEKLSKVIEEREIKMMRKEMVHKAQLMPAF 148
             |.|||.||:||..||..:|..|::.|:|||.|::.||||||.:|||||.|
PttMAP20  72 AMFNYSVATKIYMNEQQKRQIERIQKIIEEEEVRTMRKEMVPRAQLMPYF 121

OsMAP20  149 DKPFHPQRSTRPLTVPKEPSF--LRLKC--CIGGEFHRHFCYNA------ 188
             |:||.||||:||||||:||||  :..||  ||..:...::..:|
PttMAP20 122 DRPFFPQRSSRPLTVPREPSFHMVNSKCWSCIPEDELYYYFEHAHPHDHA 171

OsMAP20  189 -KAIK 192
              |.:|
PttMAP20 172 WKPVK 176

Page 32: Intro Sequence comparisons Visualization Alignments Scoring

More terminology

• Insertion
• Deletion
• Indel: an insertion or a deletion, when we don't know which
• Gap: indel in an alignment
• Indel character: usually "-"

  1 MEK--TRKATSPKSSMTSSTGPKSPVRNGGSPPHKKSTSEFRGRKNESQI  48
    |||  |:.|.......:|.:.|.|....|.:....|..
  1 MEKAHTKSALKKLVKASSQSAPWSNAARGMAKDDLKDP------------  38

 49 FRKGGQDSITLDESKRRSPTSQTSPKRSSPKHEQPLSYFRLHTEERAIKR  98
             ..|:||       .:||..:.::.:| ..|:|||.:||:||
 39 ---------LYDKSK-------VAPKPFAKENTKP-QEFKLHTGQRALKR  71

Page 33: Intro Sequence comparisons Visualization Alignments Scoring

Choosing alignment?

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRR
             S+ +PK  + ++ +P   F+LHT +RA+KRA FNY VA+KI  NE  +R
PttMAP20  43 SKVAPKPFAKENTKPQE-FKLHTGQRALKRAMFNYSVATKIYMNEQQKR

OsMAP20  118 FEEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKE
               E++ K+IEE E++ MRKEMV +AQLMP FD+PF PQRS+RPLTVP+E
PttMAP20  91 QIERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPRE

OsMAP20  167 PSF
             PSF
PttMAP20 140 PSF


OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ  91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153

Page 34: Intro Sequence comparisons Visualization Alignments Scoring

Principle: Identity

• Def: The identity in an alignment is the fraction of identical paired symbols.
• Early selection criterion: Choose the alignment with highest identity.

Here: 62/112 ≈ 55% identity

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ  91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153
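
The identity figure can be checked with a short Python sketch. The function name `identity` is made up for this example, and the denominator is assumed to be the number of alignment columns (gap columns included), which is what reproduces the 112 on the slide.

```python
def identity(row_a, row_b, gap="-"):
    """Return (identical pairs, alignment columns) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    same = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    return same, len(row_a)


# The two rows of the local alignment shown on this slide, joined across blocks.
os_row = ("SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF"
          "EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS"
          "F--LRLKC--CI")
ptt_row = ("SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ"
           "IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS"
           "FHMVNSKCWSCI")

same, columns = identity(os_row, ptt_row)
print(f"{same}/{columns} = {same / columns:.1%}")  # 62/112 = 55.4%
```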

Page 36: Intro Sequence comparisons Visualization Alignments Scoring

Scoring an alignment

• Identity loses info on similarity
• Better: assign a score to every pair of symbols, s(x, y) = c. Example for DNA:

    s    A   T   G   C
    A    2  -1   1  -1
    T   -1   2  -1   1
    G    1  -1   2  -1
    C   -1   1  -1   2

• Indel scores: s(x, −) = s(−, x) = −1?

Page 39: Intro Sequence comparisons Visualization Alignments Scoring

Scoring an alignment

• Alignment $x', y'$ from sequences $x$ and $y$. E.g.: x = AAGTT, y = AATT, alignment is

    x'  AAGTT
    y'  AA-TT

• Alignment score is

$$S(x', y') = \sum_{i=1}^{|x'|} s(x'_i, y'_i)$$

• Here:

$$S(x', y') = s(A, A) + s(A, A) + s(G, -) + s(T, T) + s(T, T)$$
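
As a worked example, the sum above can be evaluated with the DNA scores from the earlier slide. The sketch below assumes the tentative indel score of −1; the names `SCORES`, `GAP_SCORE`, `pair_score` and `alignment_score` are invented for this illustration.

```python
# Substitution scores s(x, y) copied from the DNA example table.
SCORES = {
    "A": {"A": 2, "T": -1, "G": 1, "C": -1},
    "T": {"A": -1, "T": 2, "G": -1, "C": 1},
    "G": {"A": 1, "T": -1, "G": 2, "C": -1},
    "C": {"A": -1, "T": 1, "G": -1, "C": 2},
}
GAP = "-"
GAP_SCORE = -1  # assumption: s(x, -) = s(-, x) = -1


def pair_score(a, b):
    """Score of one alignment column."""
    if a == GAP or b == GAP:
        return GAP_SCORE
    return SCORES[a][b]


def alignment_score(row_x, row_y):
    """Sum the column scores of an alignment given as two equal-length rows."""
    if len(row_x) != len(row_y):
        raise ValueError("aligned rows must have equal length")
    return sum(pair_score(a, b) for a, b in zip(row_x, row_y))


# s(A,A) + s(A,A) + s(G,-) + s(T,T) + s(T,T) = 2 + 2 - 1 + 2 + 2
print(alignment_score("AAGTT", "AA-TT"))  # 7
```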

Page 41: Intro Sequence comparisons Visualization Alignments Scoring

How do we choose an alignment?

• Want to choose best global alignment
• Many alignments
• Given $x = x_1 x_2 \cdots x_m$ and $y = y_1 y_2 \cdots y_n$, find $x', y'$ that maximize score $S(x', y')$.
• Idea: Find best way of ending alignment
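
To get a feeling for "many alignments", one can count them. The sketch below assumes that an alignment is any sequence of columns of the three kinds (symbol over symbol, symbol over gap, gap over symbol), so two alignments that differ only in the order of adjacent gap columns are counted separately. The function name is made up for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(m, n):
    """Number of global alignments of sequences of lengths m and n.

    The last column is either a pair, a symbol of x over a gap,
    or a gap over a symbol of y, which gives a three-way recursion.
    """
    if m == 0 or n == 0:
        return 1  # only one way: all remaining symbols against gaps
    return (count_alignments(m - 1, n - 1)
            + count_alignments(m - 1, n)
            + count_alignments(m, n - 1))


print(count_alignments(5, 4))    # x = AAGTT, y = AATT: 681 alignments
print(count_alignments(50, 50))  # already far too many to enumerate
```

Trying every alignment is therefore hopeless even for short sequences, which motivates looking at how an alignment can end.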

Page 43: Intro Sequence comparisons Visualization Alignments Scoring

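How to end alignment: alternatives

One of:

$$
\begin{array}{c|c}
x_1 \cdots x_{m-1} & x_m \\
y_1 \cdots y_{n-1} & y_n
\end{array}
\qquad M_{m-1,n-1} + s(x_m, y_n)
$$

or

$$
\begin{array}{c|c}
x_1 \cdots x_{m-1} & x_m \\
y_1 \cdots y_n & -
\end{array}
\qquad M_{m-1,n} + s(x_m, -)
$$

or

$$
\begin{array}{c|c}
x_1 \cdots x_m & - \\
y_1 \cdots y_{n-1} & y_n
\end{array}
\qquad M_{m,n-1} + s(-, y_n)
$$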

Page 47: Intro Sequence comparisons Visualization Alignments Scoring

A recursion for max alignment score

Note: for global alignment

$$M_{0,0} = 0$$

$$
M_{m,n} = \max
\begin{cases}
M_{m-1,n-1} + s(x_m, y_n) & m > 0,\ n > 0 \\
M_{m-1,n} + s(x_m, -) & m > 0,\ n \ge 0 \\
M_{m,n-1} + s(-, y_n) & m \ge 0,\ n > 0
\end{cases}
$$

We get:

$$M_{m,n} = \max_{x', y'} S(x', y')$$
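
The recursion can be written down almost literally in Python. This is a sketch, not an efficient implementation: it uses memoization instead of an explicit table, the DNA scores from the earlier example, and an assumed indel score of −1. All names are invented for the illustration.

```python
from functools import lru_cache

# Substitution scores from the DNA example; the gap score is an assumption.
SCORES = {
    "A": {"A": 2, "T": -1, "G": 1, "C": -1},
    "T": {"A": -1, "T": 2, "G": -1, "C": 1},
    "G": {"A": 1, "T": -1, "G": 2, "C": -1},
    "C": {"A": -1, "T": 1, "G": -1, "C": 2},
}
GAP_SCORE = -1


def best_global_score(x, y):
    """Maximum alignment score M[m][n] for the whole of x against the whole of y."""

    @lru_cache(maxsize=None)
    def M(m, n):
        if m == 0 and n == 0:
            return 0
        options = []
        if m > 0 and n > 0:   # end with x_m paired with y_n
            options.append(M(m - 1, n - 1) + SCORES[x[m - 1]][y[n - 1]])
        if m > 0:             # end with x_m over a gap
            options.append(M(m - 1, n) + GAP_SCORE)
        if n > 0:             # end with a gap over y_n
            options.append(M(m, n - 1) + GAP_SCORE)
        return max(options)

    return M(len(x), len(y))


print(best_global_score("AAGTT", "AATT"))  # 7
```

Plain recursion like this hits Python's recursion limit for long sequences, which is one practical reason to fill a table instead.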

Page 48: Intro Sequence comparisons Visualization Alignments Scoring

Computing $M_{m,n}$

• Keep $M_{i,j}$ in a table
• Table + Recursion = Dynamic Programming
• Needleman-Wunsch algorithm
• mn elements in table ⇒ Time complexity is ∼ mn.
• When filling the table, note alternatives.
• Backtracking for retrieving the alignment.
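
The steps above (fill a table, note which alternative won, backtrack) can be sketched as follows. The code assumes the DNA scores from the earlier example and an indel score of −1, and it records only one winning alternative per cell, so it returns a single optimal alignment even when several exist. The names are made up for this illustration.

```python
SCORES = {
    "A": {"A": 2, "T": -1, "G": 1, "C": -1},
    "T": {"A": -1, "T": 2, "G": -1, "C": 1},
    "G": {"A": 1, "T": -1, "G": 2, "C": -1},
    "C": {"A": -1, "T": 1, "G": -1, "C": 2},
}
GAP = "-"
GAP_SCORE = -1  # assumption: s(x, -) = s(-, x) = -1


def needleman_wunsch(x, y):
    """Return (score, aligned x, aligned y) for one optimal global alignment."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]

    # Fill the table row by row; each cell needs its three neighbours.
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:
                options.append((M[i - 1][j - 1] + SCORES[x[i - 1]][y[j - 1]], "pair"))
            if i > 0:
                options.append((M[i - 1][j] + GAP_SCORE, "gap_in_y"))
            if j > 0:
                options.append((M[i][j - 1] + GAP_SCORE, "gap_in_x"))
            M[i][j], back[i][j] = max(options, key=lambda option: option[0])

    # Backtrack from the bottom-right corner to recover the alignment.
    row_x, row_y = [], []
    i, j = m, n
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "pair":
            row_x.append(x[i - 1]); row_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif move == "gap_in_y":
            row_x.append(x[i - 1]); row_y.append(GAP)
            i -= 1
        else:
            row_x.append(GAP); row_y.append(y[j - 1])
            j -= 1
    return M[m][n], "".join(reversed(row_x)), "".join(reversed(row_y))


print(needleman_wunsch("AAGTT", "AATT"))  # (7, 'AAGTT', 'AA-TT')
```

The two nested loops visit each of the (m+1)(n+1) cells once and do constant work per cell, which is where the ∼ mn running time comes from.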

Page 50: Intro Sequence comparisons Visualization Alignments Scoring

DP and backtracking

From Eddy, Nature Biotech, 2004

Page 51: Intro Sequence comparisons Visualization Alignments Scoring

DP for local alignments

• Smith-Waterman algorithm
• Allow "restarting" from zero.

$$M_{0,0} = 0$$

$$
M_{m,n} = \max
\begin{cases}
M_{m-1,n-1} + s(x_m, y_n) & m > 0,\ n > 0 \\
M_{m-1,n} + s(x_m, -) & m > 0,\ n \ge 0 \\
M_{m,n-1} + s(-, y_n) & m \ge 0,\ n > 0 \\
0 & \leftarrow \text{Here!}
\end{cases}
$$
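
A minimal sketch of the local variant, again assuming the DNA scores from the earlier example and an indel score of −1. It computes only the score: because an alignment may start and end anywhere, the best local score is the largest entry anywhere in the table rather than the bottom-right corner. The function name is made up for this illustration.

```python
SCORES = {
    "A": {"A": 2, "T": -1, "G": 1, "C": -1},
    "T": {"A": -1, "T": 2, "G": -1, "C": 1},
    "G": {"A": 1, "T": -1, "G": 2, "C": -1},
    "C": {"A": -1, "T": 1, "G": -1, "C": 2},
}
GAP_SCORE = -1  # assumption: s(x, -) = s(-, x) = -1


def smith_waterman_score(x, y):
    """Best local alignment score between any part of x and any part of y."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay at 0: with a negative gap score the
    # "restart from zero" option always wins on the border.
    M = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = max(
                0,                                            # restart here
                M[i - 1][j - 1] + SCORES[x[i - 1]][y[j - 1]],  # pair x_i with y_j
                M[i - 1][j] + GAP_SCORE,                       # x_i over a gap
                M[i][j - 1] + GAP_SCORE,                       # gap over y_j
            )
            best = max(best, M[i][j])
    return best


print(smith_waterman_score("AAGTT", "CCAAGTTCC"))  # 10: AAGTT matches exactly inside y
print(smith_waterman_score("AAAA", "TTTT"))        # 0: nothing scores above zero
```

Recovering the aligned segments works as in the global case, except that backtracking starts at the cell holding the maximum and stops at the first zero.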