Industry Engagement sector update

Gabriella Rustici, Bioinformatics Training Facility, School of the Biological Sciences

The ELIXIR UK industry survey by Gabriella Rustici



Activities so far

Appointed an Industry Engagement advisory committee

Confirmed members:

•  Claus Bendtsen, AstraZeneca

•  Mark Forster, Syngenta

•  Samiul Hasan, GSK

•  Wendy Filsell, Unilever

•  William Spooner, Eagle Genomics

•  Audrey Kauffmann, Novartis

To be confirmed:

•  Justin Powell, Takeda (no reply yet)

Activities so far

Circulated two surveys to:

•  help us understand the bioinformatics-related training needs of industry and

•  consequently ensure that suitable training activities are developed and honed to target such needs.

One survey targeted bioinformaticians; the other targeted wet-lab scientists.

Survey results

[Figure: bar chart comparing Bioinformaticians and Wet lab, split into Large company and Small-to-medium enterprise; axis 0 to 90]

[Figure: bar chart by company; axis 0 to 80]

Companies: Biogen Idec, Bioindustry Park Silvano Fumero, Databiology, DNAdigest.org, DNAnexus, Dupont, EMD Serono (Merck Serono), Euformatics Oy, Genentech, Ina Harrow Consulting, Instem Scientific, LGC, Life Technologies - Thermo Fisher, Lundbeck, MedImmune, Novo Nordisk, Omixon Biocomputing LTD, Redoxis AB, Roche, Astellas Pharma Inc., Bayer Healthcare, Eli Lilly & Company, Heptares, OP, Pfizer Inc., UCB, Unilever, AstraZeneca, Bayer, Bayer Pharma AG, NIBR, Sanofi, Eagle Genomics, Illumina, GlaxoSmithKline, Novartis

Discipline of interest

[Figure: bar chart by discipline for Bioinformaticians and Wet lab; axis 0 to 60]

Disciplines: Virology, Toxicology, Systems biology, Proteomics, Plant Sciences, Oncology, Neurobiology, Molecular Biology, Microbiology, Medicine, Infectious diseases, Immunology, Genomics/epigenomics, Epidemiology, Drug development, Computer Science, Computational chemistry, Chemistry, Cell biology, Biomedical Sciences, Bioinformatics, Biochemistry/Biophysics, Bioanalytics

What databases do they use in their work?

[Figure: bar chart by database type for bioinformaticians and wet lab; axis 0 to 60]

•  Systems databases (e.g. BioModels, Reactome, KEGG, ...)

•  Structures databases (e.g. PDBe)

•  Protein databases (e.g. UniProt, Pfam, IntAct, ...)

•  Ontology resources (e.g. Gene Ontology, ...)

•  No, I do not use databases

•  Literature services (e.g. PubMed, ...)

•  Gene expression databases (e.g. ArrayExpress, Gene Expression Omnibus, ...)

•  DNA & RNA databases (e.g. Ensembl, 1000 Genomes, UCSC, ...)

•  Chemical biology databases (e.g. ChEMBL, ...)

Bioinformaticians: Software tools/Data mining software

[Figure: bar chart, "Data mining software"; axis 0 to 10]

Data mining software: cBioPortal, Custom, Databiology, Genedata, IGV, Matlab, Omicsoft, Pathway Studio, PostgreSQL, SQLite, SVM, Spotfire, Arraystudio, Expressionist, Weka, Knime, Linguamatics, R/Bioconductor

[Figure: bar chart, "Software tools"; axis 0 to 20]

Software tools: Affymetrix, Ensembl, Oracle, Plink, SAS, Eclipse, Spotfire, Cytoscape, R/Bioconductor, NGS tools

Wet lab scientists: analysis software

[Figure: bar chart by software type; axis 0 to 70]

•  Workflow tools (e.g. Galaxy)

•  No, I do not use any software

•  Gene set enrichment testing tools (e.g. DAVID)

•  Next-generation sequencing read alignment and assembly programs (e.g. BWA, Bowtie, ...)

•  Pathway & network analysis tools (e.g. Cytoscape, Biocarta, Ingenuity, ...)

•  Data analysis environments (e.g. R/Bioconductor, Matlab, ...)

•  Sequence alignment, similarity & homology tools (e.g. Blast, Clustal, ...)

•  Microsoft software (e.g. Excel)

Wet lab scientists and statistics

How confident are you with statistics?

[Figure: pie chart with segments of 6%, 29%, 59% and 6%; response options:]

•  Very confident

•  Confident

•  Not so confident

•  I am not even sure of what statistics I need to know

Do you collaborate with a bioinformatician/statistician?

[Figure: pie chart with segments of 34.0%, 1.5%, 32.4% and 32.4%; response options:]

•  No, I do not have any support. I am responsible for analyzing the data that I generate.

•  No, the data analysis is carried out by someone else. I just receive a file with the results.

•  Yes, occasionally I interact with a bioinformatician/statistician at my Institute, particularly when I get stuck and I don’t know how to proceed.

•  Yes, I have a bioinformatician in the group that helps me to design experiments and also provides support for the data analysis.

Programming experience/languages

Bioinformaticians: programming languages

[Figure: bar chart by language; axis 0 to 20]

Languages: Python, R/Bioconductor, Perl, Java, C++, Matlab, Ruby, JavaScript, Unix bash, HTML, MySQL, PL/SQL, Scala, SPARQL, SQL

Wet lab: programming experience

[Figure: pie chart with segments of 26.5% and 73.5%; response options: Yes, No]

Bioinformaticians: What competencies are crucial?

[Figure: bar chart by competency; axis 0% to 90%]

•  Using and building ontologies

•  Using and applying standards

•  Version control tools

•  Retrieving and manipulating data from public repositories

•  Working with high-performance computing or cloud-based solutions

•  Modeling and warehousing of biological data

•  Programming

•  Integrating public and private data-sets

•  Use of scripting languages

•  Ability to use statistical analysis software packages

•  Data mining of large biological data-sets

Wet lab: what expertise would you like to acquire?

[Figure: bar chart by expertise; axis 0.0% to 80.0%]

•  If other, please specify

•  Data publishing skills (e.g. how do I publish my results?)

•  Scientific knowledge (e.g. how should I design my experiment to obtain meaningful results?)

•  Data manipulation skills (e.g. what software is more appropriate to analyze my data? How does a specific software work?)

•  Statistical knowledge (e.g. what statistics do I need to know to be able to analyze my data?)

•  Data visualization skills (e.g. what does my data look like? How do I interpret and present my data?)

Bioinformaticians: Which bioinformatics training would you most value in relation to your work?

[Figure: bar chart by training type; axis 0% to 90%]

•  Basic computing skills

•  Data analysis skills

•  Programming skills

•  Statistical methodologies

•  Use of data standards in curation and/or data integration practices

Bioinformaticians: What topics would you like to see covered in future training activities?

[Figure: bar chart by topic; axis 0 to 40]

Topics: Programming, HPC, Cloud solutions, Text mining, Data visualization, Drug development/discovery, New resources/latest technologies, Network analysis, Workflows, Statistics, Data integration, NGS analysis

What training format do they prefer?

[Figure: bar chart by training format for Bioinformaticians and Wet lab; axis 0.0% to 70.0%]

•  Face-to-face training courses on site

•  Face-to-face training courses off site

•  Online/eLearning

•  Face-to-face combined with online

Summary

•  Statistics is a major weakness, regardless of the user group: the topic is inadequately covered in undergraduate curricula, and its teaching is often approached theoretically rather than practically

•  High priorities for bioinformaticians are: analysis of high-throughput data, primarily with popular data mining software; knowledge of programming languages (Python, R, Perl); data integration; and network analysis

•  Basic data manipulation, visualization and statistics are fundamental to wet-lab scientists, who lack confidence in using statistical software that requires scripting

•  Face-to-face training is consistently popular, but a lot of basic training can be delivered online

Various considerations

•  Collate all industry use cases already available to define key competencies in industry and disseminate these to all other sectors

•  Utilize this information to prioritize key training areas

•  Hold the first advisory committee meeting within the next six months

•  Collaborations with other ELIXIR nodes:

•  ELIXIR-NL will use the same surveys to assess the training needs of industry, then share the results and collate more information; the plan is to do this in collaboration with all ELIXIR nodes

•  Engage industries beyond pharma?