EMBL-ABR Network Focus Meeting No.1-2016

Hosted by EMBL-ABR Hub at the EMBL-ABR VLSCI node, Lab-14, 700 Swanston St, Carlton VIC 3053
Date: 22nd June 2016
Time: 10am-3pm
Notes & Actions compiled by Vicky Schneider

Hosted by
Andrew Lonie, EMBL-ABR HUB and EMBL-ABR@VLSCI node
Vicky Schneider, EMBL-ABR HUB
Philippa Griffin, EMBL-ABR HUB

Admin and Logistics
Fiona Kerr, EMBL-ABR HUB and EMBL-ABR@VLSCI node
Claudia Curcio, EMBL-ABR HUB and EMBL-ABR@VLSCI node

Invitees
Marc Wilkins, NSW
Richard Edwards, NSW
Annette McGrath, CSIRO
Kevin Dudley, QUT
Dominique Gorse, QCIF/QFAB
Michael Charleston, Utas
Sonika Tyagi, AGRF
David Adelson, UoA
Philipp Bayer, UWA
Steve Androulakis, Monash University
Helen Gardiner, EMBL-ABR HUB and EMBL-ABR@VLSCI node
Christina Hall, EMBL-ABR HUB and EMBL-ABR@VLSCI node

Apologies
Ian Small, UWA, sent proxy
Ira Cooke, JCU, no proxy, overseas
Sylvain Floret, ANU, no proxy, overseas
Ute Baumann, met on the 23rd with Vicky
Nathan Watson-Haigh, met on the 23rd with Vicky
Dan Andrews, NCI, no proxy

Purpose

The meeting was designed as an initial discussion within a focus group drawn from around Australia, following up on conversations held by the Deputy Director in recent months. We covered two topics: what EMBL-ABR is now, and what we have done since February. The meeting enabled discussion of the EMBL-ABR network structure, in particular how participants envisage their institution’s involvement and the funding strategy we are proposing.

Synopsis

Andrew Lonie welcomed everyone and gave an overview of EMBL-ABR as a distributed national research infrastructure providing bioinformatics support to life science researchers in Australia. Vicky Schneider then explained where we are in terms of network building; the processes we are following, which include the EMBL-ABR node description and EMBL-ABR EoI Activities; and the Executive structure and how we propose it will evolve as the network consolidates. Pip Griffin then provided an overview of the EMBL-ABR Key Areas: Data, Training, Tools, Compute, Platforms and International. Vicky mentioned the conferences at which EMBL-ABR will be present in 2016, and gave an overview of the key documents available. This information is also contained in the EMBL-ABR Network Building Strategy and Scope document that was distributed to all participants later in the day. All the information presented, including the key documents, is available on the EMBL-ABR website: https://www.embl-abr.org.au/key-documents/

The Key Areas currently covered by EMBL-ABR were summarised on posters available for participants’ reference in the meeting room. The Areas and Activities are also summarised below:

COMPUTE: The aim of activities in this area is to assist in the design, architecture and aspects of the delivery of infrastructure to support EMBL-ABR activities and ensure interoperability across the EMBL-ABR network as well as with international efforts. In addition, this program will enable connections between researchers and decisionmakers about future computation, analysis and storage infrastructure requirements.

DATA: Showcase Australian research and datasets at an international level, and

• Deliver the data Australia needs
• Catalogue the data
• Provide support and best practice (data management, data life cycle)

PLATFORMS: Provide Australia with access to bioinformatics platforms that link multiple tools, facilitate data sharing and analysis, and trace and record analysis pipelines. EMBL-ABR will leverage the existing Australian expertise in platform development (e.g. EMBL-ABR@VLSCI node: Genomics Virtual Lab).

TOOLS: Currently exploring the need for bioinformatics software hosting, maintenance and support among Australian researchers and bioinformaticians. Also gaining an overview of the tools Australian researchers need and how best to deliver these.

TRAINING: EMBL-ABR Training activities span:

• Training registry (in place, see here http://www.embl-abr.org.au/about/events/ )
• Training materials registry
• Training programme
  o Best practice workshop (the first five workshops will run 24th-28th Oct 2016)
  o Training impact monitoring
• Train the Trainer programme
• User training in Australia.

INTERNATIONAL: EMBL-ABR is actively engaging with BD2K (NIH), CyVerse (NSF), the ELIXIR Hub and nodes (European bioinformatics infrastructure) and EMBL-EBI.

Participants were then asked to introduce themselves and provide three words to represent themselves. Subsequently participants worked in groups to map the national (and pan-national) initiatives in bioinformatics they were aware of around the world. The three groups came up with a nice set of initiatives, many of which were then covered in the overview of national level efforts presented by Vicky. These are illustrated below:

There was general consensus about EMBL-ABR being the appropriate corresponding Australian national-level initiative in bioinformatics, to lead the interactions and foster collaborations with the other national initiatives around the world. It was clear that all felt bioinformatics is key for biological and medical research and that there are tasks and activities that should be resourced at the national level.

Participants were then asked to group again and reflect on the EMBL-ABR Key Areas (Data, Tools, Platforms, Training, Compute, International). Groups discussed:

• what is there currently in Australia?
• how is it done?
• who is doing it?
• what is missing?

Each group was chaired by one of the following members: Helen Gardiner, Christina Hall and Andrew Lonie, and chose a rapporteur to present its conclusions. Chairs were explicitly directed to ensure that all participants were engaged and that each group’s presentation represented the views of its members overall. The three groups were:

Platypus: Andrew Lonie, Marc Wilkins, Dominique Gorse and David Adelson
Powerful Owl: Steve Androulakis, Christina Hall, Pip Griffin, Kevin Dudley, Sonika Tyagi
Kookaburra: Helen Gardiner, Annette McGrath, Richard Edwards, Philipp Bayer

In parallel, group members were asked to reflect, as representatives of their institutions, on the activities their node could contribute in these areas and on what they would like to see covered by ‘other nodes’, and to note these down. The information shared and collected by the groups is summarised below:

EMBL-ABR Key Areas: What, Who & How? Interactive Activity Summary

Key Area What/Who/How/What is missing

PLATFORMS

- GVL
- GALAXY
- CVL
- STEMFORMATICS
- TERN
- GARVAN-human genome, DNA NEXUS
- MISSING: platform for both bioinformaticians and life scientists, off-loading of basic skills, what is O/S?

DATA

- BIOPLATFORMS AUSTRALIA FRAMEWORK DATASETS (e.g. Melanoma Australia)
- IMB MIRRORING (RDS)
- ANDS
- TCGA (unsure whether this is available or missing?)
- AGRF, METABOLOMICS AUSTRALIA, PROTEOMICS, MHTP/MICROMON -- PUBLIC
- RDSI
- MISSING: AUSTRALIAN UNIQUE DATASETS on the world stage
- MISSING: what is best practice? Guidelines on annotation, sharing, reporting DoI/impact
- MISSING: Motive to put data out!
- Challenge: reluctance to publish data that is not well annotated

TRAINING

- QFAB
- SOFTWARE CARPENTRY
- MONASH TRAINING
- BIOPLATFORMS AUSTRALIA/CSIRO
- VLSCI
- UNIVERSITY OF ADELAIDE
- COMBINE
- ABACBS
- GALAXY (QUT, VLSCI)
- WEHI R PIPELINE
- BIOINFOSUMMER
- WINTER SCHOOL (UQ)
- TRAINING EVENTS PORTAL
- OTHER INSTITUTE-SPECIFIC
- GOBLET
- MOOCS (JOHNS HOPKINS R)
- MISSING: DEDICATED ONLINE TRAINING, ADVANCED SKILLS, RADSEQ DATA PROCESSING, LONG-READ ASSEMBLY, STATISTICS, DATA MANAGEMENT AND ARCHIVAL/CURATION/SHARING/METADATA, ANNOTATION/PATHWAYS, OMICS INTEGRATION
- National Curriculum in Bioinformatics, many Universities don’t have a bachelor degree in Bioinformatics
- Need to bring the community along

TOOLS

- GORDON SMYTH, LIMMA
- DEGUST (MONASH)
- TORSTEN SEEMANN
- DENIS BAUER (SPARK, VARIANT ANALYSIS)
- CENTER FOR COMPARATIVE GENOMICS
- MEME UQ (MOTIF FINDING)
- UQ IMB CHIP SEQ
- TONY PAPENFUSS (STRUCTURAL VARIANT DETECTION TOOLS)
- RADSEQ TOOLS
- PATSEQ
- GARVAN NON-CODING RNA DATABASE
- RNA CENTRAL
- QIMR (TCGA, TINY STRUCTURAL VARIANT FINDER TOOL)
- DATA VISUALISATION TOOLS
- ALISTAIR FORREST, FANTOM ANNOTATION PROJECT IN JAPAN
- SEAN O'DONOGHUE PROTEIN VIEWER
- UTE BAUMANN COMBINING OMICS DATA ACPFG
- BESPOKE PIPELINES AND MODIFICATIONS
- MISSING: KNOWLEDGE OF TOOLS, SUPPORT/MODIFICATIONS, PROFESSIONAL SOFTWARE ENGINEERING, TRAINING AND ADVOCACY OF SOFTWARE ENGINEERING
- a way to know what people are doing?; Endorsed versus in progress, labelling system, Developer Help space; review S/W
- MISSING: SUPPORT BODY FOR SOFTWARE SUSTAINABILITY, GUIDELINES

COMPUTE

- NECTAR (INDIVIDUAL NODES)
- CLUSTER (EG INSTITUTES, NCI, STATE BASED)
- AARNET
- NCI
- QCIF, INTERSECT ETC
- COMMERCIAL CLOUDS (e.g. Amazon for large jobs)
- CSIRO-all levels: partner shares + merit allocation + Biotools install across all systems. Problem though are barriers to entry and visibility.
- MISSING: possible benefits in better coordination?

INTERNATIONAL

- INTERNATIONAL WHEAT GENOME CONSORTIUM
- PAG Conference
- VIBE
- GALAXY
- GOBLET
- ISMB
- TCGA
- 'SISTER' UNIVERSITIES
- MISSING: AUSTRALIAN REPS ON INTERNATIONAL COMMITTEES (DATA STANDARDS AND BEST PRACTICE), AUSTRALIAN PARTICIPATION IN MAJOR PROJECTS
- MISSING: AUSTRALIAN LEADERSHIP IN DISPERSING EXPERTISE INTERNATIONALLY – E.G. IN COMPUTE INFRASTRUCTURE

YOUR NODE & ‘OTHER NODES’ Participants were asked to note down individually what they felt their node could contribute as well as what they would like to see as contributions from other nodes. The table below is a compilation of this activity based on the written information collected at the meeting.

Who Your node ‘Other nodes’

CSIRO, Annette McGrath

Your node:
- Platforms: workspace for biology?
- Data: CSIRO AU data from national collections and Environomics FSP
- maintain international connections e.g. GOBLET, GALAXY
- Training
- CSIRO developed Tools
- CSIRO Compute: Ruby, Bowen, GPU Cluster, Pearcey NCI nodes

‘Other nodes’:
- connection to ABACBS and community efforts
- what tools are being developed in Australia, by who?, and where are they?
- build a platform that bioinformatics can actively support and engage with
- Compute: how do we make our resources more visible and who and how we can access them?

Uni of Adelaide, David Adelson

Your node:
- Storage: compute local HPC
- Training: local workshops, HDR + ECR, Linux, R-tools pipelines
- Research support-analysis
- Tools (BioGo)

‘Other nodes’:
- Compute: access to HPC for occasional massive jobs
- Tools Repository + continued development for orphan tools
- Platforms: i.e. Galaxy for Education/Training

VLSCI, Andrew Lonie

Your node:
- VLSCI + Res Cloud
- ELIXIR Connection + BD2K
- Prokka, local developer, take ownership
- Some coordination: huge value but time expensive
- Common platform for using cloud resources
- need a data archiving collection
- Common approach to NCRIS roadmap

‘Other nodes’:
- Pool of evaluators to share metrics & experiences
- expertise in depositing data, tagging data
- sharing training best practice
- tool registry + expertise
- advice on other international connections
- using other resources: e.g. QCIF ITB nodes

QCIF, Dominique Gorse

Your node:
- Compute: Nectar, QRIS cloud, Local support
- Training: workshops, carpentries, hacking hours
- International: Data chaperoning
- Platforms: GVL, adding tools, custom flavours, local support
- Data: support data life cycle

‘Other nodes’:
- Platform/Tools development share expertise
- Training: share resources maintenance

UNSW, Richard Edwards

Your node:
- material review: expertise, GOBLET
- SLIMSuite + Servers, Wrappers and expertise/advice
- EMBL, VIBE
- Rest Servers (small wrappable jobs)

‘Other nodes’:
- platform that is good for life scientists and bioinformaticians (collaboration!)
- shared resources/materials: best practice
- shared resources: common storage and big data
- developer network/env.: best practice BUT realistic and non judgemental
- directory of tool developers
- directory of expertise/contacts
- funding opportunities
- AWS pipelines

AGRF, Sonika Tyagi

Your node:
- Platforms: experience of working with variety of data types, tech, expt set up
- Data: Australian researchers' data, BPA, International Projects
- Tools: customized workflows, collaborative tools
- Training: AGRF can contribute new workflows & materials on non model organisms e.g. GBS, miRNAseq
- International: network of scientists

‘Other nodes’:
- Platform for integrative analysis
- Tools: s/w engineering education, guidelines, templates, shared tools/pipelines
- Training: individual nodes might have their own training materials e.g. genotyping by sequencing
- International: networking
- Compute: access to high performance computing via the network

Monash Uni, Steve Androulakis

Your node:
- Training development and steering
- Instrument integration and pipelines
- software hosting/standards
- data management, consultation, best practice adopter
- clarity around landscape, clear channels
- Tools for training/collaboration

‘Other nodes’:
- 3rd generation sequencing analysis expertise
- GVL, Galaxy Training
- International links, Gov, Funding

UWA, Philipp Bayer

Your node:
- Data: lots of plant data, no infrastructure to share
- Training: software/data carpentry; happy to teach!
- International: AgBiodata, several plant genome consortia
- Tools: genome assembly expertise
- Compute: several clusters, PBS

‘Other nodes’:
- National Bioinformatics curriculum
- National, citable data sharing platform

UNSW, Marc Wilkins

Your node:
- Training via participation with BPA
- Data largely using international repositories (PRIDE, ProteomeXchange, GEO, SRA)
- probably not taking advantage of available platforms
- modest tool development with network visualization, co-analysis of NGS and proteomic data
- Could be better connected in international aspects of tools and best practice. How do we ensure we are not reinventing the wheel?
- Compute: use of state based (Intersect), national (NCI) and Amazon web services

‘Other nodes’:
- Platforms: development of platforms for mature bioinformatics tools + processes
- Training: Clinical genomics, training in R, command line
- Data: Listing of databases, repositories, interfacing with international entities (NCBI, EMBL etc)
- International: best practice?, facilitating, tapping into networks, DBs, infrastructure
- Tools: hosting of tools
- Compute: hosting and managing of compute + storage. Staff to support users of HPC

QUT, Kevin Dudley

Your node:
- Platforms: Galaxy QLD, GVL
- Data: Sequencers producing NGS data, Projects: proteomics metadata, epigenomes, non model transcriptomics
- Training: sample prep, experimental design
- Tools: non model organisms genome assembly
- Compute: QLD cluster

‘Other nodes’:
- Training: training on how to analyse data when no reference is available
- Tools: genome assembly and annotation

VLSCI (completed as an example node), Pip Griffin

Your node:
- Platforms: GVL dev and support, Genomics resources (Wallabase, ADEER): building + support
- Data: BP data management, examples and documentation
- Training: end-user training: introductory and selected advanced user workshops, online training (always available) using dedicated GVL, SEPSIS data training
- International: BEACON project + other BD2K connections, Galaxy AU connection to Galaxy
- Tools: tool hosting + maintenance, BP documentation for tool developers, setting an example
- Compute: sharing knowledge between compute and bioinformatics, estimations of usage requirements

‘Other nodes’:
- Platforms: adoption of parts of CyVerse, topic-specific platform activities
- Data: hosting showcase datasets in line with BP standard formats, UX design + improvement for data interface, training + documentation on data publishing, sharing etc.
- Training: extra topics (e.g. PacBio assembly), BP data management workshops (also HUB), face to face training
- International: topic-specific initiative membership, representation in groups, i.e. data standards, BP
- Tools: development/improvement of extra tools for genomics platforms
- Compute: communication interface between researchers and compute ppl

Wrap Up and What’s Next

Vicky asked each individual to think of something they really liked about the meeting and something they would like to see happen next. The individual contributions are summarised below:

Variety How to put value on data accessibility National strategy Gain an overview

Agreement among participants More engagement with NCRIS

resonance How do we take advantage of existing compute?

Hear concerns about non model organisms Annotation pipeline interactivity More meetings like this networking More clarity

We are all here! More structure Being able to put faces to the names Activities after this

Openness to collaborate Positive and constructive feedback

Thinking about Australian bioinformatics in an international context

ACTIONS

- Heads of nodes to work on their EMBL-ABR Node Description form and ideally provide these to Vicky Schneider by July 15th. It is important to think of this as a two-phase process: what you are doing now that is suitable and possible to open up or share at the national level (whenever possible, also estimate this in terms of FTEs); versus what you would ideally like to see funded in a matching funding model, so that your host institution can commit in principle to fund 50% of the FTE involved in the activities of your node.
- Steve Androulakis and Vicky Schneider will work on a glossary of acronyms and terms (perhaps with an accompanying diagram) of the compute infrastructure efforts and initiatives around Australia, which will be available through the EMBL-ABR website. Proposed deadline: September 2016
- Helen Gardiner will work on the question: what is the return on $1 spent on access to data? Helen will of course need all your input and help to find information about this. Proposed deadline: September 2016
- All to send Christina Hall the page listing the Bioinformatics training events they run, or relevant pages for events they are aware of taking place in Australia, so that EMBL-ABR Hub can curate these into the event registry and ultimately build a comprehensive map of existing Bioinformatics training in Australia.
- All to book in their calendars the 7th of December for the EMBL-ABR All Hands Meeting and ideally be present or send a proxy on the 8th of December for the EMBL-ABR Meeting on Bioinformatics Training in Australia. EMBL-ABR Hub and EMBL-ABR@VLSCI node will provide the catering for the meetings and admin support; travel and accommodation are expected to be covered by participants.
- All nodes to think about whom they would like to send to take part in the week of workshop events on 24th-28th October 2016 covering annotation and curation best practice, and data life cycle (plants, animals, microbes and health).
- Suggestion for all to sign up to the EMBL-ABR monthly newsletter here (https://www.embl-abr.org.au/embl-abr-news/), which will feature one node per issue as the nodes become more defined.