Page 1

CIPRES Software architecture/development

Focus Leader: Mark Holder (FSU)

Architecture: Wayne Maddison (UBC), Mark Holder (FSU), David Swofford (FSU)

Implementation:

Project manager: Mark Miller (SDSC)

Chief programmers: Mark Holder (FSU), Terri (Liebowitz) Schwartz (SDSC)

Other programmers: Alex Borchers (SDSC), Zhijie Guan (SDSC), Tim McPhillips (SDSC)

Other contributors: Paul Lewis (UConn), Yu Fan (UConn), David Maddison (Arizona), Peter Midford (UBC), Rutger Vos (UBC), Tandy Warnow (UT), Bernard Moret (UNM)

Page 2

The Grand Goal: Phylogeny of all Life

www.tolweb.org

Page 3

The Individual Goal: Phylogeny of My Group

Page 7

The Grand Goal will be achieved by combining efforts

Combining data: Supermatrices

Combining results: Supertrees

Combining programming: Supertools

Page 8

As data grow, computational demand grows

Problem 1: How to make analytical tools that can handle trees of 100,000 taxa × 100,000 characters?
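
To put Problem 1 in numbers, here is a minimal sketch. It relies on the standard result that n labeled taxa admit (2n-5)!! distinct unrooted binary trees; the function names are illustrative only and are not taken from any CIPRES code.

```python
import math


def matrix_cells(n_taxa: int, n_chars: int) -> int:
    """Number of cells in a taxa-by-characters data matrix."""
    return n_taxa * n_chars


def unrooted_tree_count(n_taxa: int) -> int:
    """Exact number of unrooted binary trees on n labeled taxa: (2n-5)!!."""
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):  # 3, 5, ..., 2n-5
        count *= odd
    return count


def log10_unrooted_tree_count(n_taxa: int) -> float:
    """log10 of (2n-5)!!, summed in log space so large n stays cheap."""
    return sum(math.log10(odd) for odd in range(3, 2 * n_taxa - 4, 2))


if __name__ == "__main__":
    # Size of the data matrix named in Problem 1: 10**10 cells.
    print(matrix_cells(100_000, 100_000))
    # Small case, computed exactly.
    print(unrooted_tree_count(10))
    # Number of decimal digits in the tree count for 100,000 taxa.
    print(round(log10_unrooted_tree_count(100_000)))
```

The matrix alone has ten billion cells, and the number of candidate trees grows faster than exponentially with the number of taxa, so exhaustive search is out of the question at this scale.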

Page 9

Solution?: Hire 25 programmers to write the Monster App For Phylogenetics

No: that is a short-term strategy. Better to build a foundation that will enable a worldwide community of programmers to contribute.

Page 10

Solution?: Change our mode of software development

Current analytical tools:

PAUP*, MrBayes, PHYLIP, TNT: ≤ 3 authors each

Future analytical tools:

Need community programming, with shared effort and rapid incorporation of new algorithmic ideas

Page 11

The workflow of a phylogenetic study involves many data, many analyses, many results

Problem 2: How to manage information?

Page 12

Workflow for phylogenetic analysis

[Figure: workflow diagram. Labels recovered from the slide:]

Taxa: Habronattus, Pellenes, Bianor, Plexippus, ...

Steps: Samples; Preservation; Sequencing; Alignment; Phenotypic observation; Coding; Tree inference (choice of method, model inference, search strategy); Implications of trees (character evolution, speciation/extinction, coevolution); Publish!

Intermediate products: tissues; sequences; aligned sequences; matrix; data matrix; trees

Storage: Data storage (e.g., NEXUS, database); Tree storage (e.g., NEXUS, database)

Programs: Clustal, etc.; Sequencher, etc.; PAUP*, MrBayes, PHYLIP, TNT, etc.; Discrete, MacClade, Mesquite, etc.

Page 13

Databases

[Figure: the workflow diagram from Page 12, with the same labels except the program names, and with these databases added: Genbank, Specimen database, TreeBASE.]

Page 14

Information transfer

[Figure: the same workflow diagram as on Page 13, including the databases Genbank, Specimen database and TreeBASE, now under the heading "Information transfer".]

Page 15

Footnote

Programs serve so many functions that CIPRES can't possibly make major improvements for all of these software needs

Page 16

Solutions for information management

A communications infrastructure can mediate information transfer between programs

Databases can store data, methods, results

Page 17

CIPRES architecture, first goal: to build a communications infrastructure

Modules (programs) communicate data, commands and results

Modules depend on each other for services

Modules can be on different machines, in different languages
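
One way to picture modules exchanging data, commands and results across machines and languages is a plain-text message envelope routed by service name. The following is a minimal sketch, not the CIPRES protocol: the field names, the `ModuleRegistry` class and the `count_taxa` service are all hypothetical, and JSON merely stands in for whatever wire format the project actually uses.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


class ModuleRegistry:
    """Hypothetical in-process stand-in for a communications layer."""

    def __init__(self) -> None:
        self._services: Dict[str, Handler] = {}

    def register(self, service: str, handler: Handler) -> None:
        """Make a module's service callable by name."""
        self._services[service] = handler

    def dispatch(self, raw_request: str) -> str:
        """Decode a text request, run the named service, encode the reply."""
        request = json.loads(raw_request)
        handler = self._services.get(request["service"])
        if handler is None:
            reply = {"ok": False,
                     "error": "unknown service: " + request["service"]}
        else:
            reply = {"ok": True, "result": handler(request["payload"])}
        return json.dumps(reply)


def count_taxa(payload: Dict[str, Any]) -> int:
    """Toy service: count the taxa in a matrix sent as {taxon: sequence}."""
    return len(payload["matrix"])


registry = ModuleRegistry()
registry.register("count_taxa", count_taxa)

# Only text crosses the boundary, so the caller could be written in any
# language and run on any machine able to produce this JSON.
request = json.dumps({
    "service": "count_taxa",
    "payload": {"matrix": {"Habronattus": "ACGT", "Pellenes": "ACGA"}},
})
reply = json.loads(registry.dispatch(request))
print(reply)
```

Because requests and replies are self-describing text, a module can depend on another module's service without knowing where it runs or how it is implemented.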

Page 18

CIPRES modular architecture

Tree improver

Tree evaluator

Coordinator

Tree decomposer

Tree Merger

Front End

DCM (highly simplified)
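
The control flow that this diagram suggests (a coordinator passing work through a decomposer, an improver, a merger and an evaluator) can be sketched as follows. This is a toy under loud assumptions: a "tree" here is just an ordering of taxon names, and all four module functions are hypothetical stand-ins, not phylogenetic algorithms and not the DCM itself. Only the shape of the coordinator loop is the point.

```python
from typing import List, Sequence, Tuple

Tree = Tuple[str, ...]  # toy stand-in: an ordering of taxa, not a real tree


def decompose(tree: Tree, size: int = 2) -> List[Tree]:
    """Toy decomposer: cut the ordering into consecutive pieces."""
    return [tree[i:i + size] for i in range(0, len(tree), size)]


def improve(subtree: Tree) -> Tree:
    """Toy improver: 'optimize' one piece by sorting it."""
    return tuple(sorted(subtree))


def merge(subtrees: Sequence[Tree]) -> Tree:
    """Toy merger: concatenate the improved pieces."""
    return tuple(taxon for piece in subtrees for taxon in piece)


def evaluate(tree: Tree) -> int:
    """Toy evaluator: count adjacent pairs out of order (lower is better)."""
    return sum(1 for a, b in zip(tree, tree[1:]) if a > b)


def coordinate(start: Tree, max_rounds: int = 5) -> Tuple[Tree, int]:
    """Coordinator: repeat decompose -> improve -> merge -> evaluate,
    keeping a candidate only when it scores strictly better."""
    best, best_score = start, evaluate(start)
    for _ in range(max_rounds):
        candidate = merge([improve(piece) for piece in decompose(best)])
        score = evaluate(candidate)
        if score >= best_score:
            break  # these modules yield no further gain
        best, best_score = candidate, score
    return best, best_score


start = ("Plexippus", "Pellenes", "Habronattus", "Bianor")
best, best_score = coordinate(start)
print(evaluate(start), best_score, best)
```

The coordinator never looks inside the other four functions, so any of them could be replaced by a better algorithm, or by a call to a module on another machine, without changing the loop.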

Page 19

Why start by building a modular system (i.e., a communications infrastructure)?

— handles information transfer among modules

— provides flexibility of analysis

— facilitates quick implementation of new algorithms

— facilitates distributed processing

— via API, defines a common language of phylogenetic communication

— engages/creates broad community of programmers

Page 20

CIPRES Communication

Tree improver

Tree evaluator

Coordinator

Tree decomposer

Tree Merger

Front End

multi-platform

multi-language

GUI <-> command translation

snapshotting

error handling

logging

Page 21

And how far have we come, exactly?