CIPRES Software architecture/development
Focus Leader: Mark Holder (FSU)
Architecture: Wayne Maddison (UBC), Mark Holder (FSU), David Swofford (FSU)
Implementation:
Project manager: Mark Miller (SDSC)
Chief programmers: Mark Holder (FSU), Terri (Liebowitz) Schwartz (SDSC)
Other programmers: Alex Borchers (SDSC), Zhijie Guan (SDSC), Tim McPhillips (SCSC)
Other contributors: Paul Lewis (UConn), Yu Fan (UConn), David Maddison (Arizona), Peter Midford (UBC), Rutger Vos (UBC), Tandy Warnow (UT), Bernard Moret (UNM)
The Grand Goal: Phylogeny of all Life
www.tolweb.org
The Individual Goal: Phylogeny of My Group
The Grand Goal will be achieved by combining efforts
Combining data: Supermatrices
Combining results: Supertrees
Combining programming: Supertools
As data grow, computational demand grows
Problem 1: How to make analytical tools that can handle trees of 100,000 taxa × 100,000 characters?
Solution?: Hire 25 programmers to write the Monster App For Phylogenetics
No: That is a short-term strategy. Better to build a foundation that will enable a worldwide community of programmers to contribute
Solution?: Change our mode of software development
Current analytical tools:
PAUP*, MrBayes, PHYLIP, TNT: ≤ 3 authors
Future analytical tools:
Need community programming, with shared effort and rapid incorporation of new algorithmic ideas
The workflow of a phylogenetic study involves many data, many analyses, many results
Problem 2: How to manage information?
Workflow for phylogenetic analysis
[Figure: workflow diagram, shown in three builds. Recoverable labels follow.]
Example taxa: Habronattus, Pellenes, Bianor, Plexippus, ...
Steps: Samples; Preservation; Sequencing; Alignment; Phenotypic observation; Coding; Tree inference (choice of method, model inference, search strategy); Implications of trees (character evolution, speciation/extinction, coevolution); Publish!
Intermediate products: tissues; sequences; aligned sequences; matrix; data matrix; trees
Programs: Sequencher, etc.; Clustal, etc.; PAUP*, MrBayes, PHYLIP, TNT, etc.; Discrete, MacClade, Mesquite, etc.
Storage: Data storage (e.g. NEXUS, database); Tree storage (e.g. NEXUS, database)
Databases (second build): Genbank; Specimen database; TreeBASE
Information transfer (third build): label added to the diagram
Footnote: Programs serve so many functions that CIPRES can't possibly make major improvements for all of these software needs
Solutions for information management
A communications infrastructure can mediate information transfer between programs
Databases can store data, methods, results
CIPRES architecture, first goal: to build a communications infrastructure
Modules (programs) communicate data, commands and results
Modules depend on each other for services
Modules can be on different machines, in different languages
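A minimal sketch of the idea on this slide, not the CIPRES implementation: modules exchange data, commands and results as language-neutral messages, and a registry routes each request to whichever module offers the service. The `Message` format, the `Registry` class and the toy `tree_evaluator` are all hypothetical names introduced for illustration; JSON text stands in for whatever wire format lets modules live on different machines or in different languages.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict


@dataclass
class Message:
    """A language-neutral request or reply (hypothetical format)."""
    service: str   # which module's service is addressed
    command: str   # what the module is asked to do, or "result" for a reply
    payload: dict  # data going in, or results coming back (JSON-serializable)

    def to_wire(self) -> str:
        return json.dumps(asdict(self))

    @staticmethod
    def from_wire(text: str) -> "Message":
        return Message(**json.loads(text))


class Registry:
    """Routes serialized messages to the module registered for a service."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Message], Message]] = {}

    def register(self, service: str, handler: Callable[[Message], Message]) -> None:
        self._handlers[service] = handler

    def send(self, wire_text: str) -> str:
        # Only text crosses this boundary, so the caller and the module
        # could be separate processes written in different languages.
        request = Message.from_wire(wire_text)
        reply = self._handlers[request.service](request)
        return reply.to_wire()


def tree_evaluator(request: Message) -> Message:
    """Toy service: counts the taxon names in a Newick string."""
    newick = request.payload["tree"]
    for ch in "();":
        newick = newick.replace(ch, ",")
    names = [name for name in newick.split(",") if name]
    return Message("tree_evaluator", "result", {"taxa": len(names)})


registry = Registry()
registry.register("tree_evaluator", tree_evaluator)

request = Message("tree_evaluator", "evaluate", {"tree": "((A,B),(C,D));"})
reply = Message.from_wire(registry.send(request.to_wire()))
print(reply.payload)
```

The caller never imports the evaluator directly; it only knows a service name and a message shape, which is what lets one module depend on another for a service without sharing code.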
CIPRES modular architecture
Tree improver
Tree evaluator
Coordinator
Tree decomposer
Tree Merger
Front End
DCM (highly simplified)
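The division of labour among the modules named on this slide can be sketched as a control loop. This is a toy under loud assumptions, not the DCM algorithm and not CIPRES code: a "tree" is reduced to a plain ordering of taxon names, the "improver" just sorts a piece, and the score is invented for the example. Only the shape of the loop (decompose, improve the pieces, merge, evaluate, repeat) is meant to carry over.

```python
from typing import List, Tuple

Tree = List[str]  # toy stand-in: a "tree" is just an ordering of taxon names


def decompose(tree: Tree, size: int) -> List[Tree]:
    """Tree decomposer: split the problem into smaller pieces."""
    return [tree[i:i + size] for i in range(0, len(tree), size)]


def improve(subtree: Tree) -> Tree:
    """Tree improver: here, simply sort the piece alphabetically."""
    return sorted(subtree)


def merge(subtrees: List[Tree]) -> Tree:
    """Tree merger: reassemble the improved pieces into one result."""
    return [taxon for sub in subtrees for taxon in sub]


def evaluate(tree: Tree) -> int:
    """Tree evaluator: toy score, adjacent pairs out of order (lower is better)."""
    return sum(1 for a, b in zip(tree, tree[1:]) if a > b)


def coordinate(tree: Tree, size: int, max_rounds: int = 10) -> Tuple[Tree, int]:
    """Coordinator: drive the other modules until the score stops improving."""
    best, best_score = tree, evaluate(tree)
    for _ in range(max_rounds):
        candidate = merge([improve(piece) for piece in decompose(best, size)])
        score = evaluate(candidate)
        if score >= best_score:
            break  # no improvement this round, so stop
        best, best_score = candidate, score
    return best, best_score


start = ["Plexippus", "Bianor", "Pellenes", "Habronattus"]
result, score = coordinate(start, size=2)
print(result, score)
```

Because each step is a separate function with a narrow signature, any one of them could be swapped for a better algorithm without touching the coordinator, which is the flexibility the modular design is after.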
Why start by building a modular system (i.e., a communications infrastructure)?
— handles information transfer among modules
— provides flexibility of analysis
— facilitates quick implementation of new algorithms
— facilitates distributed processing
— via API, defines a common language of phylogenetic communication
— engages/creates broad community of programmers
CIPRES Communication
Tree improver
Tree evaluator
Coordinator
Tree decomposer
Tree Merger
Front End
multi-platform
multi-language
GUI <-> command translation
snapshotting
error handling
logging
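One possible reading of the services listed on this slide (command translation, snapshotting, error handling, logging) is a thin layer that sits between the front end and the modules. The sketch below is an assumption-heavy illustration of that reading, not CIPRES code: `CommLayer`, the GUI event shape and the `add_tree` command are hypothetical.

```python
import copy
import logging
from typing import Callable, Dict, List, Tuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("comm_layer")  # hypothetical logger name


class CommLayer:
    """Translates GUI events into commands and guards each command."""

    def __init__(self) -> None:
        self.state: dict = {"trees": []}
        self.snapshots: List[dict] = []
        self.commands: Dict[str, Callable[..., None]] = {}

    def register(self, name: str, handler: Callable[..., None]) -> None:
        self.commands[name] = handler

    def translate(self, gui_event: dict) -> Tuple[str, dict]:
        """Command translation: map a GUI event to (command name, arguments)."""
        return gui_event["menu"], gui_event.get("args", {})

    def dispatch(self, gui_event: dict) -> bool:
        name, args = self.translate(gui_event)
        self.snapshots.append(copy.deepcopy(self.state))  # snapshotting
        log.info("command %s %s", name, args)             # logging
        try:
            self.commands[name](self.state, **args)
            return True
        except Exception:                                 # error handling
            log.exception("command %s failed; restoring snapshot", name)
            self.state = self.snapshots.pop()
            return False


def add_tree(state: dict, newick: str) -> None:
    # Deliberately mutates before validating so the rollback is visible.
    state["trees"].append(newick)
    if not newick.endswith(";"):
        raise ValueError("Newick string must end with ';'")


layer = CommLayer()
layer.register("add_tree", add_tree)
ok = layer.dispatch({"menu": "add_tree", "args": {"newick": "(A,B);"}})
bad = layer.dispatch({"menu": "add_tree", "args": {"newick": "(A,B"}})
print(ok, bad, layer.state)
```

Keeping these concerns in the communication layer means individual modules do not each have to reimplement them; a failed command leaves the shared state as it was before the call.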
And how far have we come, exactly?