Democratizing Data Science - Denver SQL

Democratizing Data Science With Microsoft

Jan Mulkens, Microsoft BI Consultant, Ordina Belgium

Kimberly Hermans, Data Scientist, Ordina Belgium

Who am I

Kimberly Hermans

Data Scientist & CRM consultant

Practice Manager: Data Talks

“Bringing big data, data science & visualisation together and getting rid of the silos is what will get you the return on data”

Who am I

Jan Mulkens

Microsoft BI Consultant

Competence Lead: Microsoft Advanced Analytics

@JanMulkens

www.janmulkens.be

www.globalpowerbi.com

Activities

Blog:

- www.janmulkens.be

- www.globalpowerbi.com

Speaker:

- Belgian SQL Server user group (DataMinds)

- Belgian Information Worker user group (BIWUG)

- Various other external events

- Webinars

- Guest lectures

Organiser:

- Internal Ordina events

- Virtual user group events

Our employer

What Ordina says

“We increase our customers 'Return on Data' by taking them on a journey to a modern & innovative data culture. We organically grow into the most focused, fast, flexible, friendly and fun BI & Data Science community on the Belgian Market employing >100 growing & happy employees”

What we experience

“We increase our customers 'Return on Data' by taking them on a journey to a modern & innovative data culture. We organically grow into the most focused, fast, flexible, friendly and fun BI & Data Science community on the Belgian Market employing >100 growing & happy employees”

*Emphasis is my own

Setting expectations

What to expect

• Targeted at anyone looking to promote data science within their organisation, including data scientists who want the business to build upon their work

• Focus is on learning which tools can help enable citizen data scientists

Take-away

• Tools enable citizen data science.

• Microsoft made the tools for you.

Data Culture

- The age of Classic BI: IT → End User
- The age of Self Service BI: Analyst → End User
- The age of Data Culture: Everyone

Amir Netz & Kamal Hathi on the Age of Data Culture

https://www.youtube.com/watch?v=rg3lzHRxNqM

Agenda

- Intro to Data Science
- Tooling
- A Fool with a Tool

Intro to Data Science

Why?

Value

Source: Gartner (October 2014), http://www.gartner.com/newsroom/id/2881218

Who?

- ML Researcher
- Data Scientist
- ML Engineer
- Citizen Data Scientist

[Figure: Data Science Time Schedule. Weekly time spent on ETL, Data Cleaning, Exploratory Data Analysis and Machine Learning. Labels shown: "Up to 4 hrs / week", "Up to 15 hrs / week" (three times), "Full work week".]

Source: Generalized from O’Reilly’s “2015 Data Science Salary Survey” (Sep 2015)

Citizen Data Scientists

- Exploratory Analysis

- Visualization

- Putting insights into action

Come from both the business and IT

- BI Developers & Analysts

- Power users

Why?

- Requirements going beyond BI

- Shortage of data scientists

- Ad-hoc analysis doesn’t scale

- Getting direct support from the business and IT for your data science project

- Bigger chance at project success & deployment because of buy-in

How?

Education

- Basic education on required data science topics where necessary

Education

Different data problems require different knowledge

Machine learning algorithms

- Clustering & dimensionality reduction (e.g. K-means clustering)
- Regression (e.g. linear regression)
- Association (e.g. recommenders)
- Classification (e.g. logistic regression, trees)

[Figure labels: Likely buy, Buy, Similar]
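To make one of the algorithm families above concrete, here is a minimal sketch of linear regression fitted by ordinary least squares, using only the Python standard library. The data and the names `fit_line`, `spend` and `units` are hypothetical and chosen for illustration; the deck itself demonstrates these ideas with Microsoft tooling rather than hand-written code.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations divided by sum of squared x-deviations gives the slope.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: advertising spend (x) versus units sold (y).
# The points lie exactly on y = 2x + 1 so the fit is easy to check by hand.
spend = [1, 2, 3, 4, 5]
units = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, units)
predicted = slope * 6 + intercept  # predict a value for an unseen x
print(f"slope={slope}, intercept={intercept}, prediction for x=6: {predicted}")
```

Regression answers "how much?" questions. Classification, clustering and association follow the same pattern of learning from examples but produce categories, groups or item relations instead of a number.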

Education

Different data problems require different knowledge

- Predict values
- Find unusual occurrences
- Discover structure
- Predict categories
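The four questions above are commonly matched to algorithm families as sketched below. The mapping is an assumption based on the usual reading of these categories, not something the slide spells out, and `QUESTION_TO_FAMILY`, `suggest_family` and `unusual` are hypothetical names. The second function illustrates "find unusual occurrences" with a simple z-score rule, which is only one of many possible approaches.

```python
from statistics import mean, pstdev

# Assumed mapping from the business question to an algorithm family.
QUESTION_TO_FAMILY = {
    "predict values": "regression",
    "predict categories": "classification",
    "discover structure": "clustering",
    "find unusual occurrences": "anomaly detection",
}


def suggest_family(question):
    """Return the algorithm family usually associated with a question type."""
    key = question.strip().lower()
    if key not in QUESTION_TO_FAMILY:
        raise ValueError(f"unknown question type: {question!r}")
    return QUESTION_TO_FAMILY[key]


def unusual(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily order counts with one obvious outlier.
orders = [10, 11, 9, 10, 12, 10, 11, 50]
print(suggest_family("Find unusual occurrences"))
print(unusual(orders))
```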

Education

Pick a data science process as a guide

- Define goal
- Collect data & explore
- Build model
- Evaluate model
- Present results
- Deploy model & monitor

Example processes:

- KDD
- CRISP-DM
- CCC Big Data Pipeline
- Microsoft Team Data Science Process
- ...
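The process steps on this slide (define goal, collect data, build model, evaluate model, present results, deploy) can be sketched end to end in a few lines. This is an illustrative sketch under stated assumptions, not any of the named processes: the dataset is synthetic, the error threshold `goal_max_error` is an arbitrary value, and all function names are hypothetical.

```python
import random
from statistics import mean


def collect_data(n=100, seed=0):
    """Collect data: a synthetic, hypothetical dataset where y = 3x + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3 * x + rng.gauss(0, 1)) for x in xs]


def build_model(train):
    """Build model: least-squares line through the training rows."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    sxx = sum((x - x_bar) ** 2 for x, _ in train)
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def evaluate(model, rows):
    """Evaluate model: mean absolute error on the given rows."""
    return mean(abs(model(x) - y) for x, y in rows)


goal_max_error = 2.0                 # define goal (assumed acceptance threshold)
rows = collect_data()                # collect data
train, test = rows[:80], rows[80:]   # hold out 20% the model never sees
model = build_model(train)           # build model
error = evaluate(model, test)        # evaluate model on held-out rows

# Compare against a naive baseline that always predicts the training mean.
train_mean = mean(y for _, y in train)
baseline = evaluate(lambda x: train_mean, test)

print(f"model MAE={error:.2f}, baseline MAE={baseline:.2f}")  # present results
ready_to_deploy = error <= goal_max_error and error < baseline  # deploy decision
```

Evaluating on held-out rows and comparing against a naive baseline are the two checks that most directly guard against the "fool with a tool" failure mode: a model that looks good only on the data it was trained on.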

How?

Create a Data Culture

- Open communication

- Supportive learning environment

- Reviews of models & performance

- Data scientist available to help

How?

Tooling

- Using the right tools to help non-data scientists develop predictive solutions

Tooling

Data Catalog

- Register

- Annotate

- Understand

- Discover

Ingest Prepare Analyze Publish Consume

Cognitive Services

Azure ML Studio

- Easy to get started

- Create, Share & Publish

- Supports team collaboration

- Web-based: https://studio.azureml.net

Cortana Intelligence Gallery

Power BI

Azure DS VM

- Microsoft R Server

- Anaconda Python

- Julia Pro

- Jupyter Notebooks (R, Python, Julia)

- Visual Studio Community Edition, with Python, R and Node.js tools

- Power BI Desktop

- SQL Server 2016 Developer Edition, including support for in-database analytics with R Server

- Open Source Deep Learning Tools

What could possibly go wrong?

A Fool with a Tool

Horror stories (logic)

Horror stories (stability)

Horror stories (relativity)

Source: http://timoelliott.com/blog/wp-content/uploads/2015/03/citizen-data-scientists.png

Wrapup: Be a citizen data scientist

Wrapup: Use tools that enable you

Thank you!

Resources

Get inspired: Amir Netz & Kamal Hathi on “The age of data culture”

- https://www.youtube.com/watch?v=rg3lzHRxNqM

Gartner on Advanced Analytics (Oct 2014)

- http://www.gartner.com/newsroom/id/2881218

Azure Data Catalog

- https://azure.microsoft.com/en-us/services/data-catalog/

Cortana Intelligence Gallery

- https://gallery.cortanaintelligence.com/

Power BI Partner Showcase

- https://powerbi.microsoft.com/en-us/partner-showcase/

Azure Data Science VM

- https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-ads.standard-data-science-vm

Machine Learning Basics Infographic

- https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-basics-infographic-with-algorithm-examples

Microsoft Professional Program in Data Science

- https://academy.microsoft.com
