Genome Matrices and The Median Problem [1]

Joao Meidanis, Joao Paulo Zanetti, Priscila Biller

University of Campinas, Brazil

November 2017

[1] Zanetti, J.P.P., Biller, P., Meidanis, J. Bull Math Biol (2016) 78: 786



Summary

1 Genome Evolution and Rearrangements

2 Mathematical Modeling

3 Case Studies

4 Recent Developments

5 Future work

Zanetti, Biller, Meidanis (Campinas) Genome Matrices and The Median Problem November 2017 2 / 49


The Human Genome

Source: National Center for Biotechnology Information (NCBI), USA


A Bacterial Genome: E. coli

Source: Science, 05 Sep 1997: Vol. 277, Issue 5331, pp. 1453-1462


Genome Evolution

Evolution: Events

Point mutations

Inversions

Translocations

Transpositions

Duplications

Gain/loss

Horizontal transfer

Many others

Our focus

Genome rearrangements


Inversion

Source: yourgenome, Public Engagement Team, Wellcome Genome Campus, accessed 2017-11-08


Inversion in alignment

Source: CoGePedia, Glossary: “Inversion”, accessed 2017-11-08


Inversion in dot plot

Source: CoGe SynMap, E. coli K12 substr. DH10B × E. coli K12 substr. W3110, accessed 2017-11-08


Translocation

Source: Wikipedia, Chromosomal translocation, accessed 2017-11-08


Integration of circular virus into human genome

Source: Kenan DJ, Mieczkowski PA, Burger-Calderon R, Singh HK, Nickeleit V., J Pathol. 2015 Nov 237(3):379–389


Integration of plasmid into bacterial genome

Foster J, Aliabadi Z, Slonczewski J., Microbiology: The Human Experience, W. W. Norton & Company, Inc., Indep. Publ., 2017


Integration/excision of phage lambda

Foster J, Aliabadi Z, Slonczewski J., Microbiology: The Human Experience, W. W. Norton & Company, Inc., Indep. Publ., 2017


Transposition

Source: Created by Alana Gyemi; accessed in Wikipedia, Chromosomal translocation, 2017-11-12


Chromosome Fission

Source: what-when-how, Genomics, Comparisons with primate genomes; accessed on 2017-11-14


Chromosome Fusion

Source: what-when-how, Genomics, Comparisons with primate genomes; accessed on 2017-11-14


Chromosome Fusion

Source: Dr. Dana M. Krempels, University of Miami, Course: Genetics (BIL250), Fall 2017 Lecture Notes, Lecture 8: Mutations at the Chromosome Level; accessed on 2017-11-14


Linearization


Circularization


Link destruction or creation (cut or join)

Circular

linearization / circularization

Linear

chromosome fission / fusion


Link swap (cut and join)

Circular/Linear

Linear/Linear


Double link swap (double cut and join)

Circular chromosomes

e.g., plasmid integration/excision

e.g., inversion


Double link swap (double cut and join)

Circular and linear chromosomes

e.g., viral integration / excision

Linear chromosomes

inversion

e.g., translocation


Multichromosomal genome distances

Distances proposed

bp, inv, 2-break, k-break, etc.

DCJ, SCaJ, SCoJ, Algebraic, Rank, etc.

Weights given to rearrangements

                            DCJ   SCaJ   SCoJ   Rank
Link creation/destruction    1     1      1      1
Link swap                    1     1      2      2
Double link swap             1     2      4      2


Relationship with Double Cut-and-Join (DCJ)


Brassica mitochondrial genomes

Source: Palmer JD, Hebron LA., J Mol Evol. 1988 28:87–97


Brassica mitochondrial genomes

Source: Palmer JD, Hebron LA., J Mol Evol. 1988 28:87–97


X chromosome: human vs. mouse

Source: Pevzner P, Tesler G., Genome Research. 2003 Jan 1, 13(1):37–45


Vibrio genomes

[Figure panels: 16S phylogeny; DCJ phylogeny]

Source: Oliveira KZ., MSc Thesis, University of Campinas, 2010


Campanulaceae, family of flowering plants


Campanulaceae chloroplast genomes

Source: Biller P, Feijao P, Meidanis J., IEEE/ACM Trans Comp Bio Bioinf. 2013 Jan, 10(1):122–134


Recent Developments

Modeling genomes as matrices

Algorithms

Approximation Algorithm

Orthogonal Algorithm

MI Algorithm

Joint work with L. Chindelevitch, Simon Fraser University, Canada.


Genome elements

Links: $\{a_h, b_h\}$, $\{b_t, c_t\}$; free ends: $a_t$, $c_h$


Representing genomes as matrices

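Links: $\{a_h, b_h\}$, $\{b_t, c_t\}$; free ends: $a_t$, $c_h$

$$
\begin{array}{c|cccccc}
    & a_t & a_h & b_t & b_h & c_t & c_h \\ \hline
a_t & 1 & 0 & 0 & 0 & 0 & 0 \\
a_h & 0 & 0 & 0 & 1 & 0 & 0 \\
b_t & 0 & 0 & 0 & 0 & 1 & 0 \\
b_h & 0 & 1 & 0 & 0 & 0 & 0 \\
c_t & 0 & 0 & 1 & 0 & 0 & 0 \\
c_h & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
$$

Properties

symmetric matrix ($A = A^T$)

orthogonal matrix ($A^T = A^{-1}$)

involution ($A^2 = I$)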
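The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the helper names `genome_matrix`, `transpose`, and `matmul` are hypothetical, and extremities are written `a_t`, `a_h`, and so on.

```python
def genome_matrix(extremities, links):
    """Build the 0/1 matrix of a genome given as a list of links.

    Linked extremities are swapped; free ends are fixed points.
    """
    index = {name: i for i, name in enumerate(extremities)}
    partner = list(range(len(extremities)))  # free ends map to themselves
    for x, y in links:
        partner[index[x]], partner[index[y]] = index[y], index[x]
    n = len(extremities)
    return [[1 if partner[i] == j else 0 for j in range(n)] for i in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# The genome from the slide: links {a_h, b_h} and {b_t, c_t}, free ends a_t and c_h.
extremities = ["a_t", "a_h", "b_t", "b_h", "c_t", "c_h"]
A = genome_matrix(extremities, [("a_h", "b_h"), ("b_t", "c_t")])
identity = [[int(i == j) for j in range(6)] for i in range(6)]

is_symmetric = A == transpose(A)
# Together with symmetry, A*A = I also gives A^T = A^{-1} (orthogonality).
is_involution = matmul(A, A) == identity
```

Both checks hold for any genome built this way, because a set of disjoint links is a product of disjoint transpositions.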


Distance

Distance between two genome matrices is the rank of their difference

$d(A,B) = r(A - B)$

Properties

Rank is the maximum number of linearly independent rows

$d(A,B) = 0$ if and only if $A = B$

$d(A,B) = d(B,A)$

$d(A,C) \le d(A,B) + d(B,C)$
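As a rough illustration of the rank distance, the sketch below computes the rank by Gaussian elimination over exact rationals. The function names `rank` and `distance` and the three example genomes are assumptions made for this example; they are not taken from the authors' software.

```python
from fractions import Fraction


def rank(matrix):
    """Rank via Gaussian elimination over exact rationals (no rounding issues)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def distance(a, b):
    """Rank distance d(A, B) = r(A - B)."""
    return rank([[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)])


# Extremity order: a_t, a_h, b_t, b_h, c_t, c_h.
# A: links {a_h, b_h} and {b_t, c_t}.  C: only the link {a_h, b_h}.  I: no links.
A = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 1, 0],
     [0, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1]]
C = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 1, 0, 0, 0],
     [0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1]]
I = [[int(i == j) for j in range(6)] for i in range(6)]

d_AI = distance(A, I)  # two links destroyed
d_AC = distance(A, C)  # one link destroyed
d_CI = distance(C, I)  # one link destroyed
```

Each destroyed link contributes 1 to the distance here, which agrees with the weight of 1 that the rank distance gives to link creation or destruction.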


Matrix Median Problem

Useful for ancestor reconstruction

Definition

Given three input genome matrices $A$, $B$, and $C$, find a matrix $M$ minimizing $d(M,A) + d(M,B) + d(M,C)$.


Median may not be genomic

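Three genome matrices:

$$
A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
$$

A median of the three, which is not a genome matrix:

$$
M = \begin{pmatrix} -0.5 & 0.5 & 0.5 & 0.5 \\ 0.5 & -0.5 & 0.5 & 0.5 \\ 0.5 & 0.5 & -0.5 & 0.5 \\ 0.5 & 0.5 & 0.5 & -0.5 \end{pmatrix}
$$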

Need a way to go back from matrices to genomes


Genomes applied to region ends

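genome × vector = vector

$$
\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
$$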
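The product above can be read as "the genome sends the third extremity to the first". A small sketch, with the hypothetical helper name `apply_genome`, reproduces it:

```python
def apply_genome(genome, vector):
    """Multiply a genome matrix by a column vector given as a flat list."""
    return [sum(g * v for g, v in zip(row, vector)) for row in genome]


# Genome from the slide: extremities 1 and 3 are linked, 2 and 4 are free ends.
G = [[0, 0, 1, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1]]

e3 = [0, 0, 1, 0]            # unit vector for the third extremity
image = apply_genome(G, e3)  # unit vector for its partner, the first extremity
fixed = apply_genome(G, [0, 1, 0, 0])  # a free end is mapped to itself
```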


Division into subspaces


Approximation Algorithm

Subspaces: $V_1, V_2, V_3, V_4, V_5$

Orthonormal Bases: $B_1, B_2, B_3, B_4, B_5$

Projection Matrices: $P_1, P_2, P_3, P_4, P_5$

Median Candidates

$M_A = AP_1 + AP_2 + BP_3 + AP_4 + AP_5$

$M_B = BP_1 + BP_2 + BP_3 + AP_4 + BP_5$

$M_C = CP_1 + BP_2 + CP_3 + CP_4 + CP_5$

4/3 approximation factor for genome matrices

if $V_5 = \{0\}$ then $M_A = M_B = M_C$ is a median


Orthogonal matrices

Tests with small matrices suggested looking at orthogonal matrices

Exact, efficient algorithm

[Figure: the three input matrices A, B, and C]

“Walk towards the median”

Find rank 1 matrix H such that B + H is closer to both A and C

Always possible!


Orthogonal matrices

Algorithm

while d(A,B) + d(B,C) > d(A,C) do
    Find non-zero u ∈ im(A − B) ∩ im(C − B)
    B ← B − 2uu^T B / (u^T u)
end
return B

Nondeterministic

Does it reach all medians?
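The loop above can be sketched in plain Python with exact rational arithmetic. This is an illustrative sketch under stated assumptions, not the authors' Octave, R, or Python code: all function names are hypothetical, the direction u is simply the first basis vector found, and the intersection of images is computed from the identity im(X) = (ker X^T)^⊥.

```python
from fractions import Fraction


def rref(rows, ncols):
    """Reduced row echelon form over exact rationals; returns (rows, pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    for col in range(ncols):
        r = len(pivots)
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][col]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
    return rows, pivots


def null_space(rows, ncols):
    """Basis of {v : row . v = 0 for every row}; no rows yields the whole space."""
    reduced, pivots = rref(rows, ncols)
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row, pc in zip(reduced, pivots):
            v[pc] = -row[free]
        basis.append(v)
    return basis


def sub(a, b):
    return [[Fraction(x) - Fraction(y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def dist(a, b):
    diff = sub(a, b)
    return len(rref(diff, len(diff[0]))[1])  # rank = number of pivots


def transpose(m):
    return [list(col) for col in zip(*m)]


def image_intersection(x, y):
    """Basis of im(X) ∩ im(Y): vectors orthogonal to both ker(X^T) and ker(Y^T)."""
    n = len(x)
    constraints = null_space(transpose(x), n) + null_space(transpose(y), n)
    return null_space(constraints, n)


def walk_to_median(a, b, c):
    n = len(b)
    b = [[Fraction(v) for v in row] for row in b]
    while dist(a, b) + dist(b, c) > dist(a, c):
        candidates = image_intersection(sub(a, b), sub(c, b))
        if not candidates:
            raise RuntimeError("no direction found; inputs may not be orthogonal")
        u = candidates[0]  # arbitrary choice: this is where nondeterminism enters
        uu = sum(t * t for t in u)
        utb = [sum(u[i] * b[i][j] for i in range(n)) for j in range(n)]
        b = [[b[i][j] - 2 * u[i] * utb[j] / uu for j in range(n)] for i in range(n)]
    return b


# The three 4x4 genome matrices from the "Median may not be genomic" slide.
A = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
B = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
C = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

M = walk_to_median(A, B, C)
score = dist(M, A) + dist(M, B) + dist(M, C)
```

On this input a single reflection step is taken, and the result is the fractional matrix with -1/2 on the diagonal and 1/2 elsewhere, with total distance 3 to the inputs. Exact fractions sidestep the numerical tolerance questions that a floating-point implementation has to handle.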


Implementation

Software

GNU Octave 3.8.1

Chooses matrix closer to median to “walk”

Computes im(A − B) ∩ im(C − B) as:

V = null([(null(A'-B'))'; (null(C'-B'))'])

where X' is X^T, the transpose of X

Tries all columns of V

Also code in R, python

Hardware

Laptop, 8 GB memory, 4 cores, AMD A8-7410

Windows 10 + WSL


Data Sets

Simulation

Start with random genome

Apply random rearrangement operations

Repeat to get A, B, C

Parameters

sizes: 12, 16, 20, 30, 50, 100, 200, 300, 500 extremities

type of operation: Add/remove adjacencies (near) or DCJ (far)

number of operations: 5% to 30%

10 × each

1,080 instances
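The total of 1,080 instances is consistent with the parameters listed if the number of operations is varied in 5% steps, which is an assumption here since the slide only gives the range. A quick enumeration, with hypothetical variable names:

```python
from itertools import product

sizes = [12, 16, 20, 30, 50, 100, 200, 300, 500]  # number of extremities
operation_types = ["near", "far"]                 # add/remove adjacencies, or DCJ
rates = [5, 10, 15, 20, 25, 30]                   # ASSUMPTION: 5% steps from 5% to 30%
replicates = range(10)                            # 10 instances per setting

instances = list(product(sizes, operation_types, rates, replicates))
per_type = len(instances) // len(operation_types)
```

Under this assumption there are 9 × 2 × 6 × 10 = 1,080 instances, 540 per operation type.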


Results

Near

For all instances, the algorithm finds a median

Far

For all but 5, the algorithm finds a median

Five instances do not converge: sizes 16, 20, and 30

Not the biggest sizes!!

Times to run all instances of a given size

size   Near     Far
500    27 min   7:30 h
300    4 min    0:50 h
200    2 min    0:13 h
100    1 min    0:01 h


Alternative approach

Drawbacks of Orthogonal Algorithm

Lack of convergence

Not fast enough

Insights from Orthogonal Algorithm

Medians reach the lower bound

For any median M:

$Xv = Yv \implies Mv = Xv = Yv$,

for $X, Y \in \{A, B, C\}$ and $X \neq Y$
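Assuming the lower bound mentioned on this slide is the usual one obtained from the triangle inequality, it can be derived as follows for any matrix $M$:

```latex
\begin{align*}
2\,\bigl[d(M,A) + d(M,B) + d(M,C)\bigr]
  &= \bigl[d(A,M) + d(M,B)\bigr] + \bigl[d(B,M) + d(M,C)\bigr] + \bigl[d(A,M) + d(M,C)\bigr] \\
  &\ge d(A,B) + d(B,C) + d(A,C),
\end{align*}
\qquad\text{hence}\qquad
d(M,A) + d(M,B) + d(M,C) \;\ge\; \frac{d(A,B) + d(B,C) + d(A,C)}{2}.
```

A matrix $M$ that attains equality is therefore a median.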


MI Median

$M_I$ follows majority in $V_1$ through $V_4$

$M_I$ follows $I$ in $V_5$

Subspaces: $V_1, V_2, V_3, V_4, V_5$

Bases: $B_1, B_2, B_3, B_4, B_5$

Projection Matrices: $P_1, P_2, P_3, P_4, P_5$

Median: $M_I = AP_1 + AP_2 + BP_3 + AP_4 + IP_5$


Results

SCJ

For all 540 cases, the algorithm finds a median

Median is genomic in 535 cases

DCJ

For all 540 cases, the algorithm finds a median

Median is genomic in 254 cases

Times to run all instances of a given size

size   o-SCJ    o-DCJ    mi-SCJ     mi-DCJ
500    27 min   7:30 h   9:52 min   8:24 min
300    4 min    0:50 h   3:26 min   2:34 min
200    2 min    0:13 h   1:40 min   1:08 min
100    1 min    0:01 h   0:30 min   0:24 min


From matrices back to genomes

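$$
\begin{pmatrix}
0.2 & 0.8 & 0.5 & 0 & 0 & 0.4 & 0 & 0.1 \\
0.4 & 0 & 0 & 0 & 0 & 0.3 & 0 & 0.6 \\
0.3 & 0 & 0.5 & 0.2 & 0 & 0 & 0 & 0.3 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0.1 & 0 & 0 & 0.1 & 0.1 & 0.4 & 0.2 & 0.7 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0.3 & 0 & 0 & 0.5 & 0.1 & 0 & 0.4 & 0.1 \\
0 & 0.8 & 0.2 & 0 & 0 & 0.8 & 0.2 & 0.3
\end{pmatrix}
$$

$$
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0
\end{pmatrix}
$$

Assign weight $|a_{ij}| + |a_{ji}|$ to edge $ij$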

Take a maximum weight matching as your solution

A genome is a matching of gene extremities
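The rounding step described above can be sketched as follows. This is only an illustration: it uses an exhaustive search instead of a polynomial-time matching algorithm, so it is suitable for tiny inputs only, and the function names and the 4×4 example matrix are hypothetical rather than taken from the slides.

```python
def max_weight_matching(weight, vertices):
    """Exhaustive search; returns (total weight, list of matched pairs).

    Exponential time, so only suitable for tiny examples.
    """
    if len(vertices) < 2:
        return 0, []
    first, rest = vertices[0], vertices[1:]
    best = max_weight_matching(weight, rest)  # option: leave `first` as a free end
    for k, other in enumerate(rest):
        w, pairs = max_weight_matching(weight, rest[:k] + rest[k + 1:])
        w += weight(first, other)
        if w > best[0]:
            best = (w, [(first, other)] + pairs)
    return best


def matrix_to_genome(m):
    """Round a real matrix to a genome matrix via a maximum weight matching."""
    n = len(m)

    def weight(i, j):
        return abs(m[i][j]) + abs(m[j][i])  # edge weight |a_ij| + |a_ji|

    _, pairs = max_weight_matching(weight, list(range(n)))
    partner = list(range(n))  # unmatched extremities stay free ends
    for i, j in pairs:
        partner[i], partner[j] = j, i
    return [[1 if partner[i] == j else 0 for j in range(n)] for i in range(n)]


# Hypothetical fractional matrix on four extremities.
fractional = [[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.2],
              [0.1, 0.0, 0.5, 0.6],
              [0.0, 0.2, 0.6, 0.0]]

genome = matrix_to_genome(fractional)
```

Here the heaviest matching pairs extremity 1 with 2 and extremity 3 with 4. For realistic sizes, a polynomial-time maximum weight matching routine would replace the exhaustive search.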


Future work

More tests with fungi, bacteria, plants, mammals, etc.

Try minimax matrices

minimize $\max\{d(A,M), d(B,M), d(C,M)\}$

Determine all sorting scenarios (done for DCJ)

Extension for gene deletions and insertions (done for DCJ)

Extension for duplicated genes (done for DCJ)

Technical issues: NP-hardness, convergence, etc.

Get this presentation:

http://www.ic.unicamp.br/~meidanis/research/rear/
