elixir-uk
Industry Engagement sector update
Gabriella Rustici
Bioinformatics Training Facility, School of the Biological Sciences
Activities so far
Appointed an Industry Engagement advisory committee
Confirmed members:
• Claus Bendtsen, AstraZeneca
• Mark Forster, Syngenta
• Samiul Hasan, GSK
• Wendy Filsell, Unilever
• William Spooner, Eagle Genomics
• Audrey Kauffmann, Novartis
To be confirmed:
• Justin Powell, Takeda (not yet replied)
Circulated two surveys to:
• help us understand the bioinformatics-related training needs of industry, and
• ensure that suitable training activities are developed and refined to target those needs.
One survey targeted bioinformaticians, the other wet-lab scientists.
Survey results
Chart: bioinformaticians vs. wet lab, by company type (large company, small-to-medium enterprise)
Chart: companies represented
Biogen Idec, Bioindustry Park Silvano Fumero, Databiology, DNAdigest.org, DNAnexus, DuPont, EMD Serono (Merck Serono), Euformatics Oy, Genentech, Ina Harrow Consulting, Instem Scientific, LGC, Life Technologies - Thermo Fisher, Lundbeck, MedImmune, Novo Nordisk, Omixon Biocomputing LTD, Redoxis AB, Roche, Astellas Pharma Inc., Bayer Healthcare, Eli Lilly & Company, Heptares, OP, Pfizer Inc., UCB, Unilever, AstraZeneca, Bayer, Bayer Pharma AG, NIBR, Sanofi, Eagle Genomics, Illumina, GlaxoSmithKline, Novartis
Discipline of interest
Chart: bioinformaticians vs. wet lab, by discipline
Virology, Toxicology, Systems biology, Proteomics, Plant Sciences, Oncology, Neurobiology, Molecular Biology, Microbiology, Medicine, Infectious diseases, Immunology, Genomics/epigenomics, Epidemiology, Drug development, Computer Science, Computational chemistry, Chemistry, Cell biology, Biomedical Sciences, Bioinformatics, Biochemistry/Biophysics, Bioanalytics
What databases do they use in their work?
Chart: bioinformaticians vs. wet lab, by database type
• Systems databases (e.g. BioModels, Reactome, KEGG)
• Structure databases (e.g. PDBe)
• Protein databases (e.g. UniProt, Pfam, IntAct)
• Ontology resources (e.g. Gene Ontology)
• No, I do not use databases
• Literature services (e.g. PubMed)
• Gene expression databases (e.g. ArrayExpress, Gene Expression Omnibus)
• DNA & RNA databases (e.g. Ensembl, 1000 Genomes, UCSC)
• Chemical biology databases (e.g. ChEMBL)
Bioinformaticians: Software tools/Data mining software
Data mining software: cBioPortal, Custom, Databiology, Genedata, IGV, MATLAB, Omicsoft, Pathway Studio, PostgreSQL, SQLite, SVM, Spotfire, Arraystudio, Expressionist, Weka, KNIME, Linguamatics, R/Bioconductor
Software tools: Affymetrix, Ensembl, Oracle, PLINK, SAS, Eclipse, Spotfire, Cytoscape, R/Bioconductor, NGS tools
Wet lab scientists: analysis software
• Workflow tools (e.g. Galaxy)
• No, I do not use any software
• Gene set enrichment testing tools (e.g. DAVID)
• Next-generation sequencing read alignment and assembly programs (e.g. BWA, Bowtie)
• Pathway & network analysis tools (e.g. Cytoscape, BioCarta, Ingenuity)
• Data analysis environments (e.g. R/Bioconductor, MATLAB)
• Sequence alignment, similarity & homology tools (e.g. BLAST, Clustal)
• Microsoft software (e.g. Excel)
Wet lab scientists and statistics
How confident are you with statistics?
• Very confident
• Confident
• Not so confident
• I am not even sure of what statistics I need to know
Pie chart values: 6%, 29%, 59%, 6%
Do you collaborate with a bioinformatician/statistician?
• No, I do not have any support. I am responsible for analyzing the data that I generate.
• No, the data analysis is carried out by someone else. I just receive a file with the results.
• Yes, occasionally I interact with a bioinformatician/statistician at my Institute, particularly when I get stuck and I don’t know how to proceed.
• Yes, I have a bioinformatician in the group who helps me to design experiments and also provides support for the data analysis.
Pie chart values: 34.0%, 1.5%, 32.4%, 32.4%
Programming experience/languages
Bioinformaticians: programming languages
Python, R/Bioconductor, Perl, Java, C++, MATLAB, Ruby, JavaScript, Unix bash, HTML, MySQL, PL/SQL, Scala, SPARQL, SQL
Wet lab: programming experience
• Yes
• No
Pie chart values: 26.5%, 73.5%
Bioinformaticians: What competencies are crucial?
• Using and building ontologies
• Using and applying standards
• Version control tools
• Retrieving and manipulating data from public repositories
• Working with high-performance computing or cloud-based solutions
• Modeling and warehousing of biological data
• Programming
• Integrating public and private data-sets
• Use of scripting languages
• Ability to use statistical analysis software packages
• Data mining of large biological data-sets
Wet lab: what expertise would you like to acquire?
• If other, please specify
• Data publishing skills (e.g. how do I publish my results?)
• Scientific knowledge (e.g. how should I design my experiment to obtain meaningful results?)
• Data manipulation skills (e.g. what software is most appropriate to analyze my data? How does a specific piece of software work?)
• Statistical knowledge (e.g. what statistics do I need to know to be able to analyze my data?)
• Data visualization skills (e.g. what does my data look like? How do I interpret and present my data?)
Bioinformaticians: Which bioinformatics training would you most value in relation to your work?
• Basic computing skills
• Data analysis skills
• Programming skills
• Statistical methodologies
• Use of data standards in curation and/or data integration practices
Bioinformaticians: What topics would you like to see covered in future training activities?
• Programming
• HPC
• Cloud solutions
• Text mining
• Data visualization
• Drug development/discovery
• New resources/latest technologies
• Network analysis
• Workflows
• Statistics
• Data integration
• NGS analysis
What training format do they prefer?
Chart: bioinformaticians vs. wet lab, by training format
• Face-to-face training courses on site
• Face-to-face training courses off site
• Online/eLearning
• Face-to-face combined with online
Summary
• Statistics is a major weakness, regardless of the user group. The topic is inadequately covered in undergraduate curricula, and its teaching is often approached theoretically rather than practically.
• Analysis of high-throughput data (primarily with popular data mining software), knowledge of programming languages (Python, R, Perl), data integration and network analysis are high priorities for bioinformaticians.
• Basic data manipulation, visualization and statistics are fundamental to wet-lab scientists, who lack confidence in using statistical software that requires scripting.
• Face-to-face training is always popular, but a lot of basic training can be delivered online.
Various considerations
• Collate all industry use cases already available to define key competencies in industry, and disseminate these to all other sectors
• Use this information to prioritize key training areas
• Hold the first advisory committee meeting within the next six months
• Collaborations with other ELIXIR nodes:
  • ELIXIR-NL will use the same surveys to assess the training needs of industry, share the results and collate more information; the plan is to do this in collaboration with all ELIXIR nodes
• Engage industries beyond pharma?