Unipro UGENE: an open-source toolkit for
complex genome analysis
Konstantin Okonechnikov,
Novosibirsk State University
Olga Golosova, Alexey Varlamov, Mikhail Fursov
Unipro Company
Unipro UGENE project
• What is UGENE?
A multiplatform open-source application for molecular biologists
• Project goal:
Quality integration of popular bioinformatics tools into the
unified visual and computational solution
• History
– Started 5 years ago as a set of small collaborative projects with several academic organizations
– Developed rapidly over the last 3 years thanks to support from the Unipro company
– Winner of several global Russian competitions
UGENE internals
• Written in C++/Qt
• Modular structure
• Integrated plugin system
• Automated testing
– > 4000 tests
• UGENE “Core” team:
– Mostly graduates of Novosibirsk State University
– Have professional skills in bioinformatics and software development
UGENE features: algorithms
Rich library of popular bioinformatics algorithms and computational methods
• Smith-Waterman, Clustal, Muscle, KAlign, Blast, Phylip, HMM, Primer3, Psipred, Bowtie, UGENE Genome Aligner…
• + several dozen more
Some algorithms are unique: contributed by local research labs and academia
UGENE features: data formats
• Support of popular biological data formats (~ 20 formats)
– FASTA, Genbank, Stockholm, PDB, Newick, Nexus(Mega), ClustalW, SAM, BAM…
• Retrieve information from remote databases:
– NCBI, PDB, Swissprot…
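To make the simplest of the formats listed above concrete, here is a minimal sketch of reading FASTA text. It is not UGENE's parser; the function name `parse_fasta` and the sample records are assumptions made for illustration only.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into a {record name: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: take the first word after '>' as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            # Sequence data may span several lines; concatenate them.
            records[name] += line
    return records


# Made-up sample data, for illustration only.
example = """>seq1 demo record
ACGT
ACGT
>seq2
TTGA
"""
parsed = parse_fasta(example.splitlines())
```

The sketch ignores format details such as duplicate names and comment lines, which a production reader would have to handle.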
UGENE features: visualization
• Sequence View
• Annotation Editor
• Sequence Circular view
• Multiple Alignment Editor
• Biopolymer 3D viewer
• Assembly Browser (new!)
• etc
UGENE features: HPC
• Optimization for existing algorithms
– Multi-core CPU
– Special instructions (SSE etc)
– GPU
• Support for launching computational tasks on clusters and clouds
Optimized algorithms examples:
HMMER2:
– 30x faster on Intel Core 2 Quad
Smith-Waterman:
– 3x faster on SSE2-capable CPUs
– NVidia CUDA version > 10x faster on GPU
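For readers unfamiliar with Smith-Waterman, named above among the optimized algorithms, the following is a minimal sketch of the plain local-alignment recurrence with a linear gap penalty. The scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration, and the code makes no attempt to reproduce the SSE2 or CUDA optimizations whose speedups are quoted on the slide.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
```

Only two rows of the matrix are kept because this sketch reports the score alone; recovering the alignment itself would require the full matrix or a traceback structure.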
UGENE Workflow Designer
Joining all together
• Rich algorithm libraries
• Unified data formats
• Powerful user interface
• High performance
Visual environment for constructing computational workflows
[Screenshot: UGENE Workflow Designer window with labeled areas: Elements Library, Workflow Scene, Element Properties, Main Toolbar]
Workflow Designer Features
• Internal data model: no data input/output conversion
• Parameters can be customized with scripts
• Easy local usage: no additional configuration required
• Support for launching workflows on remote computational resources
Reusing workflows
• Create a new shell command from a workflow
– Use your own workflow as a standalone command-line tool
– Example:
ugene align --in=file1.aln --out=file2.ali
– Where
• 'align' is the name of the workflow
• '--in' and '--out' are command-line aliases for workflow parameters
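The alias idea behind the `ugene align` example can be sketched as a small lookup from command-line aliases to workflow parameter names. This is only an illustration of the concept, not UGENE's implementation: the parameter names `reader.url` and `writer.url` and the helper `bind_aliases` are hypothetical.

```python
from typing import Dict, List

# Hypothetical alias table: command-line alias -> workflow parameter name.
ALIASES = {"in": "reader.url", "out": "writer.url"}


def bind_aliases(argv: List[str], aliases: Dict[str, str]) -> Dict[str, str]:
    """Translate --alias=value arguments into {workflow parameter: value}."""
    params: Dict[str, str] = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --alias=value, got {arg!r}")
        alias, value = arg[2:].split("=", 1)
        if alias not in aliases:
            raise ValueError(f"unknown alias: {alias}")
        params[aliases[alias]] = value
    return params


# Arguments taken from the example command on the slide.
bound = bind_aliases(["--in=file1.aln", "--out=file2.ali"], ALIASES)
```

Keeping the alias table separate from the parsing logic means a workflow can expose short, stable option names while its internal parameter names stay free to change.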
Easily extending workflows
• Script new features
– Use the embedded scripting language to design new workflow building blocks
– Customize element parameters with scripts
• Add external tools (new in 1.9.4!)
– Create custom workflow elements by configuring the input and output of an external program or script
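Wrapping an external program as a workflow element can be pictured roughly as follows. This is a conceptual sketch only and does not reflect how UGENE configures external tools: the class `ExternalToolElement` is hypothetical, and a Python one-liner stands in for the external program.

```python
import subprocess
import sys
from typing import List


class ExternalToolElement:
    """Hypothetical workflow element that wraps an external command."""

    def __init__(self, command: List[str]):
        self.command = command

    def run(self, data: str) -> str:
        # Feed the input on stdin and return the tool's stdout as the output.
        result = subprocess.run(
            self.command, input=data, capture_output=True, text=True, check=True
        )
        return result.stdout


# Stand-in "external tool": a one-liner that upper-cases its input.
to_upper = ExternalToolElement(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
)
output = to_upper.run("acgt\n")
```

With `check=True` a failing tool raises an exception instead of silently passing empty output to the next element.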
UGENE: future plans
• Web environment for workflow designer
– View & share workflows
– Launch workflows on cloud resources
• Support for NGS data analysis
– New algorithms: align, assembly, SNP/indels
– BAM viewer
UGENE community
• Over 500 downloads every month, users all over the world
• Included in major Linux distributions: Ubuntu, Fedora, SUSE etc…
• Issue tracker, forum, SVN (links on next slide)
• New members are welcome!
Questions
Thank you for your attention!
Useful links
Website: http://ugene.unipro.ru
Issue tracker: https://ugene.unipro.ru/tracker
Board: http://ugene.unipro.ru/forum/
SVN: https://ugene.unipro.ru/svn/ugene/
UGENE team e-mail: [email protected]