Unipro UGENE: an open-source toolkit for complex genome analysis

Konstantin Okonechnikov (Novosibirsk State University); Olga Golosova, Alexey Varlamov, Mikhail Fursov (Unipro Company)

Unipro ugene bosc 2011 update

Unipro UGENE project

• What is UGENE?

A multiplatform open-source application for molecular biologists

• Project goal:

High-quality integration of popular bioinformatics tools into a unified visual and computational solution

• History

– Started 5 years ago as a set of small collaborative projects with several academic organizations

– Developed rapidly over the last 3 years thanks to support from the Unipro company

– Winner of several nationwide Russian competitions

UGENE internals

• Written in C++/Qt

• Modular structure

• Integrated plugin system

• Automated testing

– More than 4000 tests

• UGENE “Core” team:

– Mostly graduates of Novosibirsk State University

– Have professional skills in bioinformatics and software development

UGENE features: algorithms

Rich library of popular bioinformatics algorithms and computational methods

• Smith-Waterman, Clustal, Muscle, KAlign, Blast, Phylip, HMM, Primer3, Psipred, Bowtie, UGENE Genome Aligner…

• Plus several dozen more

Some algorithms are unique: contributed by local research labs and academia

UGENE features: data formats

• Support for about 20 popular biological data formats

– FASTA, Genbank, Stockholm, PDB, Newick, Nexus (Mega), ClustalW, SAM, BAM…

• Retrieval of information from remote databases:

– NCBI, PDB, Swissprot…

UGENE features: visualization

• Sequence View

• Annotation Editor

• Sequence Circular view

• Multiple Alignment Editor

• Biopolymer 3D viewer

• Assembly Browser (new!)

• etc

UGENE features: HPC

• Optimization of existing algorithms

– Multi-core CPU

– Special instructions (SSE etc.)

– GPU

• Support for launching computational tasks on clusters and clouds

Optimized algorithm examples:

• HMMER2:

– 30x faster on Intel Core 2 Quad

• Smith-Waterman:

– 3x faster on SSE2-capable CPUs

– NVidia CUDA version more than 10x faster on GPU
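For readers unfamiliar with what is being accelerated, the following is a minimal plain-Python sketch of the Smith-Waterman local alignment score with a linear gap penalty. It is not UGENE's implementation; the function name and the scoring values (match=2, mismatch=-1, gap=-1) are assumptions chosen for illustration. The nested loop over the dynamic-programming matrix is the part that SIMD and GPU versions of the algorithm typically parallelize.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative scoring scheme (assumed, not taken from UGENE):
    linear gap penalty, fixed match/mismatch scores.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows of the matrix are kept in memory, which is enough when just the score (and not the alignment itself) is needed.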

UGENE Workflow Designer

Joining all together

• Rich algorithm libraries

• Unified data formats

• Powerful user interface

• High performance

Visual environment for constructing computational workflows

UGENE Workflow Designer

[Screenshot of the Workflow Designer window with labeled areas: Elements Library, Workflow Scene, Element Properties, Main Toolbar]

Workflow Designer Features

• Internal data model: no data input/output conversion

• Parameters can be customized with scripts

• Easy local usage: no additional configuration required

• Support for launching workflows on remote computational resources

Reusing workflows

• Create a new shell command from a workflow

– Use your own workflow as a standalone command-line tool

– Example:

ugene align --in=file1.aln --out=file2.ali

– Where

• 'align' is the name of the workflow

• '--in' and '--out' are command-line aliases for workflow parameters
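The alias idea above can be sketched in a few lines of Python: the first argument names the workflow, and each `--alias=value` argument is translated into a workflow parameter. This is only an illustration of the mapping, not how UGENE itself parses its command line; the function name, the alias table, and the parameter names in it are hypothetical.

```python
def parse_workflow_command(argv, aliases):
    """Split argv into (workflow_name, {parameter: value}).

    `aliases` maps a command-line alias (without the leading "--")
    to the name of a workflow parameter.
    """
    if not argv:
        raise ValueError("expected a workflow name")
    workflow, params = argv[0], {}
    for arg in argv[1:]:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --alias=value, got {arg!r}")
        alias, value = arg[2:].split("=", 1)
        if alias not in aliases:
            raise ValueError(f"unknown alias --{alias}")
        params[aliases[alias]] = value
    return workflow, params


# Hypothetical alias table for the 'align' workflow; the parameter
# names are invented for this example.
ALIGN_ALIASES = {
    "in": "read_alignment.input_file",
    "out": "write_alignment.output_file",
}
```

With this table, the arguments `align --in=file1.aln --out=file2.ali` resolve to the workflow name `align` plus the two parameters the aliases stand for.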

Extending workflows easily

• Script new features

– Use the embedded scripting language to design new workflow building blocks

– Customize element parameters with scripts

• Add external tools (new in 1.9.4!)

– Create custom workflow elements by configuring the input and output of an external program or script
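The general idea of turning an external program into a workflow step can be sketched as follows. The slides do not describe how UGENE configures external tools, so this is an assumption-laden illustration: the helper `make_external_element` is hypothetical, and it simply treats a program as a step that reads its input on stdin and returns its stdout.

```python
import subprocess
import sys


def make_external_element(command):
    """Wrap an external command as a callable step (hypothetical helper).

    The returned function feeds its argument to the program's stdin
    and returns whatever the program writes to stdout.
    """
    def element(input_text):
        result = subprocess.run(
            command,
            input=input_text,
            capture_output=True,
            text=True,
            check=True,  # raise if the tool exits with a non-zero status
        )
        return result.stdout
    return element


# Stand-in "external tool": a one-line Python script that upper-cases
# its input. Any program that reads stdin and writes stdout would do.
to_upper = make_external_element(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"]
)
```

Calling `to_upper("acgt")` runs the stand-in tool in a child process and returns its output, so the wrapped program can be chained with other steps like any ordinary function.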

UGENE: future plans

• Web environment for Workflow Designer

– View & share workflows

– Launch workflows on cloud resources

• Support for NGS data analysis

– New algorithms: alignment, assembly, SNP/indels

– BAM viewer

UGENE community

• Over 500 downloads every month, with users all over the world

• Included in major Linux distributions: Ubuntu, Fedora, SUSE, etc.

• Issue tracker, forum, SVN (links on next slide)

• New members are welcome!

Questions

Thank you for your attention!

Useful links

Website: http://ugene.unipro.ru

Issue tracker: https://ugene.unipro.ru/tracker

Board: http://ugene.unipro.ru/forum/

SVN: https://ugene.unipro.ru/svn/ugene/

UGENE team e-mail: [email protected]