2nd meeting of the ATBI workgroup infrastructure and HPC

9-11 Current situation of computing in life sciences
Coffee break
11-12 HRSM projects started in 2017
Exchange between computing facilities; future ideas?
12-13 Lunch break
13-15 Structure and roles in the upcoming Austrian ELIXIR node

Life Science Compute Cluster (LiSC) Althanstraße+Rennweg

Life Science Compute Cluster (LiSC)

Life Science campus Althanstraße + Rennweg:
• Microbiology and Ecosystem research
• Organismal systems biology, zoology, anthropology
• Botany
• Ecology
• Pharmacy
• (Cognition biology)

Life Science Compute Cluster (LiSC)

• 16 compute nodes A: 20 cores, 320G RAM, 3T tmp
• 30 compute nodes B: 16 cores AMD, 128G RAM, 6T tmp
• 5 compute nodes C: 32..40 cores, 1T RAM, 8T tmp
• 6 compute nodes D: 8 cores, 48G RAM, 3T tmp
• 5 login nodes: 1x GPU/MIC, 8/16 cores, 48/196G RAM, 2/7T tmp
• 10 NAS heads
• FC-attached storage: 5 arrays, 650TB total capacity
• 2 database servers: 128G RAM, 2 TB RAID 10
• 2 VMware servers: 256G RAM, 16 virtual machines
• FC-attached storage: 1 array, 24TB total capacity
• Fanless desktops
• Fedora Workstation 26
• Nagios monitoring server (hw, services, backup, prov.)
• ZID TSM backup
• CentOS 7, SLURM workload manager
• LDAP user management
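
As a rough illustration, the compute node figures on this slide can be added up to estimate the total core count of the cluster. The sketch below uses only the node counts and per-node core counts for types A to D; login nodes are left out, and type C is kept as a range because the slide gives 32..40 cores. The names `NODE_TYPES`, `total_cores` and `total_nodes` are made up for this example.

```python
# Node counts and per-node core counts as listed on the slide.
# Type C is given as a range (32..40 cores), so each entry stores (min, max).
NODE_TYPES = {
    "A": {"nodes": 16, "cores": (20, 20)},
    "B": {"nodes": 30, "cores": (16, 16)},
    "C": {"nodes": 5, "cores": (32, 40)},
    "D": {"nodes": 6, "cores": (8, 8)},
}


def total_cores(node_types):
    """Return (lower, upper) bounds on the total number of compute cores."""
    low = sum(t["nodes"] * t["cores"][0] for t in node_types.values())
    high = sum(t["nodes"] * t["cores"][1] for t in node_types.values())
    return low, high


def total_nodes(node_types):
    """Return the number of compute nodes across all types."""
    return sum(t["nodes"] for t in node_types.values())


if __name__ == "__main__":
    low, high = total_cores(NODE_TYPES)
    # prints: 57 compute nodes, 1008..1048 cores
    print(f"{total_nodes(NODE_TYPES)} compute nodes, {low}..{high} cores")
```

Under these assumptions the cluster comes to 57 compute nodes with roughly 1000 cores; the exact figure depends on the core counts of the five type C nodes.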

Support for users

• Software installation (on demand)

• Database provisioning (weekly/monthly)

• Helpdesk

• Courses

• Usage statistics

• …

http://cube.univie.ac.at/lisc

Environment modules on LiSC

• >200 perl, python2, python3, R modules
• 548 modules and versions

sequenceanalysis 78
ngstools 73
metagenomics 25
development 21
phylogeny 21
assembly 16
amplicons 5
genetics 5
networktools 5
visualisation 5
rnatools 3
machinelearning 2
system 2
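
The per-category figures can be tallied with a few lines of Python. This is only a sketch: the slide does not say how the category numbers relate to the 548 "modules and versions", so the code simply sums what is listed and reports the largest categories. `CATEGORY_COUNTS` and `top_categories` are names invented for the example.

```python
# Category figures copied from the slide.
CATEGORY_COUNTS = {
    "sequenceanalysis": 78,
    "ngstools": 73,
    "metagenomics": 25,
    "development": 21,
    "phylogeny": 21,
    "assembly": 16,
    "amplicons": 5,
    "genetics": 5,
    "networktools": 5,
    "visualisation": 5,
    "rnatools": 3,
    "machinelearning": 2,
    "system": 2,
}


def top_categories(counts, n=3):
    """Return the n largest categories as (name, count) pairs."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    listed = sum(CATEGORY_COUNTS.values())
    print(f"{listed} entries across {len(CATEGORY_COUNTS)} categories")
    for name, count in top_categories(CATEGORY_COUNTS):
        print(f"{name}: {count} ({count / listed:.0%})")
```

The listed categories sum to 261, so the 548 figure presumably counts something broader (for example, several versions per module); that reading is an assumption, not something the slide states.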

HRSM projects VSCBio and SOLID

Tier 1: Vienna Scientific Cluster

HRSM 2013-2017: 17 bioinformatics nodes
• CPU: 2x Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (14 cores / 28 threads per CPU)
• Storage: 4x Intel P3600 SSD 2 TB (exported via the BINFS filesystem)
• Storage: 12x Seagate ST8000NM0075 8 TB disks (exported via the BINFL filesystem)
• Intel Omnipath interconnect (100 Gigabit/s)
• Mellanox ConnectX3 Infiniband controller (40 Gigabit/s to VSC-3)
• 512GB RAM for the nodes binf-01 to binf-12,
• 1024GB RAM for the nodes binf-13 to binf-16 and
• the binf-17 node has a total of 1536GB of RAM (with slightly lower performance).
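
To get a feel for the aggregate capacity of these nodes, the figures above can be combined in a short calculation. The sketch assumes that the CPU specification (two CPUs with 14 cores and 28 threads each) applies to each of the 17 nodes; the slide does not state this explicitly. Storage is left out because it is not clear whether the disk counts are per node or in total. All variable names are illustrative.

```python
NODE_COUNT = 17
CPUS_PER_NODE = 2      # assumption: the "2x" CPU figure is per node
CORES_PER_CPU = 14
THREADS_PER_CPU = 28

# RAM per node in GB, following the node ranges given on the slide.
RAM_GB = {}
for i in range(1, 13):
    RAM_GB[f"binf-{i:02d}"] = 512
for i in range(13, 17):
    RAM_GB[f"binf-{i:02d}"] = 1024
RAM_GB["binf-17"] = 1536

total_cores = NODE_COUNT * CPUS_PER_NODE * CORES_PER_CPU
total_threads = NODE_COUNT * CPUS_PER_NODE * THREADS_PER_CPU
total_ram_gb = sum(RAM_GB.values())

if __name__ == "__main__":
    # prints: 476 cores, 952 threads, 11776 GB RAM
    print(f"{total_cores} cores, {total_threads} threads, {total_ram_gb} GB RAM")
```

Under that assumption the 17 nodes provide 476 cores and 11776 GB (11.5 TiB) of RAM.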

HRSM SOLID: 2017-2020

265 kEUR

162 kEUR

12 kEUR

220 kEUR

Distribution of expertise

Tier 1
Tier 2
Tier 3

HPC
Bioinformatics
Biology

Relevant software tools and data resources

http://bio.tools

Aim:

Make HPC resources more easily accessible to life scientists

Concept:

Bioinformatics HPC services by experts for non-experts

SOLID team

TBI, Prof. Ivo Hofacker, Fakultät für Chemie, Universität Wien

CUBE, Prof. Thomas Rattei, Forschungsverbund CMM, Universität Wien

CIBIV, Prof. Arndt von Haeseler, MFPL, Universität Wien + MedUni Wien

Medical Epigenomics Lab, Dr. Christoph Bock, Medizinische Universität Wien

CemSIIS, Biosimulation and Bioinformatics, Prof. Wolfgang Schreiner, MedUni Wien

Molekulare Ökologie, Prof. Birgit Schlick-Steiner, Universität Innsbruck

Roadmap 2017

1. Technical and administrative standards (analysis, surveys, concepts)
  • Homogeneity of HPC platforms
  • Software packaging and dependency management
2. Biological database requirements
3. Software repositories, issue tracker, documentation

Roadmap 2018

1. Software deployment on VSC
  • Environment modules
  • Singularity containers
2. Hiring mainly pre-docs on a 20% basis
3. Presentation to users, feedback
4. Quantitative usage measures, predictive planning
5. Full operation in late 2018
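
The two deployment routes named in item 1 (environment modules and Singularity containers) differ mainly in how a tool is made available inside a batch job. The following sketch generates a SLURM batch script for either route. It is an illustration only, not the project's actual tooling: `render_job`, the module name `blast` and the image name `blast.sif` are hypothetical, while the `#SBATCH` options, `module load` and `singularity exec` are standard usage of those tools.

```python
def render_job(job_name, command, cpus=4, mem_gb=16, time_limit="01:00:00",
               module=None, image=None):
    """Return the text of a SLURM batch script running `command`.

    Pass `module` to load an environment module first, or `image` to run
    the command inside a Singularity container. Both names are supplied
    by the caller; nothing here checks that they exist on a cluster.
    """
    if module and image:
        raise ValueError("choose either a module or a container image")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "",
    ]
    if module:
        # Route 1: the tool is installed on the cluster as a module.
        lines.append(f"module load {module}")
        lines.append(command)
    elif image:
        # Route 2: the tool is shipped inside a container image.
        lines.append(f"singularity exec {image} {command}")
    else:
        lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical module and image names, for illustration only.
    print(render_job("blast-mod", "blastn -version", module="blast"))
    print(render_job("blast-sif", "blastn -version", image="blast.sif"))
```

The generated text would be saved to a file and submitted with `sbatch`. Keeping both routes behind one function mirrors the idea that users should not need to care which packaging mechanism a tool uses.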

Operation 2019/2020

• bm.wfw funding until end of 2019
• In-kind contributions of partners until end of 2020

Consolidation 2020: Austrian ELIXIR node

The Austrian ELIXIR node

State of the art in Austria/international

• Austria: High-performance islands, poorly connected, few bridges
• International:
  • Institutional/Company (HPC, Software, Databases, User support)
  • Regional/National networks/facilities (HPC, Software, Databases, Workflows, User support)
  • International networks/facilities (data archives, web-based computing, standards, …)

Advantages ELIXIR

• Develop national HPC infrastructure, participate in and contribute to international infrastructure (incentives for bioinformaticians to provide infrastructures)
• Services (licensing possible, see KEGG):
  • 13 compute
  • 71 database (including essential)
  • 12 interoperability
  • 45 tools
  • 8 training
• International network integration is now a review criterion for international projects (EU) in life science (early data access, millions of EUR!)
  • PANGEA: MSCA-ITN via ELIXIR Italy, SmartBioMaps failed, DECIDE-Sys failed
• Solution to
  • Bridging users and specialists of life science HPC
  • Sustainability problem in bioinformatics

ELIXIR node structure

Mission
1. ELIXIR Austria supports Austrian bioinformaticians and provides excellent bioinformatics services to Austrian life scientists via ATBI, which will function as the Austrian hub for all ELIXIR activities.
2. ELIXIR Austria provides bioinformatics training for life scientists in Austria through ATBI-coordinated training activities.
3. The ELIXIR Austria node will foster cooperation within the Austrian bioinformatics community and with international networks for bioinformatics.

ELIXIR node structure

Structure and topics
• Section “Services”
  • Orchestrating the organization of existing bioinformatics hardware (funded by HRSM, FFG and research institutions, startup funding to new professors and others) into a collaborative framework.
  • Provisioning and support of comprehensive bioinformatics software and biological database services on the Austrian High-Performance Computing (HPC) resources.
  • Identification, maintenance and further development of Austrian core resources (hardware, software and databases for life sciences) with the goal of achieving long-term sustainability.

ELIXIR node structure

Structure and topics
• Section “Training”
  • Improvement of undergraduate and graduate bioinformatics education in Austria (e.g. synchronization of special courses between universities, e-learning material).
  • Improvement of post-graduate bioinformatics training in Austria (from PhD level to on-the-job training).
  • Development of training offers for Austrian core resources via Summer Schools etc.

ELIXIR node structure

Structure and topics

• Section “Collaboration”
  • Maintaining the connections between Austrian life scientists, bioinformaticians and ELIXIR Austria (e.g. through conferences, workshops, meetings and a jointly organized new Summer School program).
  • Identification of areas of shared interest with de.NBI (Germany’s Next Generation Bioinformatics Platform; http://www.denbi.de), SIB (the Swiss Institute of Bioinformatics; http://www.sib.swiss) and other ELIXIR nodes, creation of links between these entities and ELIXIR Austria.

ELIXIR node implementation

Phase 1: initiation

• The initial phase should be coordinated by ATBI, which will represent the Austrian bioinformatics community with regard to ELIXIR. The main tasks of this phase are:
  • Collection and provision of information about Austrian infrastructure resources for life science data (mainly through a central website, which serves as a starting point for life scientists seeking solutions for their computational and data analysis problems).
  • Establishment of the core of the three aspects (Service Provision, Training and Collaboration) by identifying relevant bioinformaticians and infrastructure-related projects in Austria.
  • Negotiation of the contractual requirements for full membership of Austria with ELIXIR.

ELIXIR node implementation

Phase 2: full operation

• The Austrian ELIXIR node should then serve as a “one-stop-shop” for Austrian life scientists with regard to computational infrastructure
• Develop support mechanisms and facilities for exchange
• Financial resources:
  • Annual fees to ELIXIR (estimated 200k EUR p.a.)
  • Operational costs of ELIXIR Austria (mainly project-based).
  • In addition, funding for a country-wide bioinformatics collaboration also needs to be identified to make an ELIXIR node fully functional.

Funds for ELIXIR (institutes/ministry/university)
• Swiss model:
  • SIB is federally funded and is the ELIXIR node for Switzerland
  • Institutions pay for SIB membership
  • Membership benefits: e.g. infrastructure access, SIB-funded positions for developing core resources
• Austrian model:
  • Membership (162 kEUR in 2018) ?
  • Structural costs (1-2 positions) ?
  • Operational costs (scientific/technical/training activities) ?