Intro to Data Science @ ABM


Data Science @ ABM

All rights reserved ABM Systems 2018

Contents

• TL;DR

• Introduction

• Python Examples

• Finance, FMCG/Retail, Geospatial

• Integration with Qlik Sense

• How to get started


TL;DR (too long; didn’t read)

• You can get started on data science-like activities sooner than you think

• Getting started is free (download Qlik Sense, download Python/R, get free sample code from GitHub, Kaggle, DataCamp etc)

• Learning this is easy – so many materials online (Google is your friend)

• Data science tools are good standalone for ad-hoc analysis. Getting results used by a larger audience (putting code into production) means you need a tool like Qlik

• Talk to ABM about data science activities you want to do


INTRODUCTION


Introduction

• Popularity of data science continues to grow (global and domestic)

• Data science is not just for advanced analytics – tools typically used in that space can be, and often are, used to solve simpler problems in an agile way

• A focus on data science lets ABM concentrate on the data rather than the software, so analysis and problem solving can start sooner

• From there, ABM can automate solutions to go into 3rd party software or even create our own UX/front end experience

• Large organisations like investment banks have chosen to build their own in-house platforms (e.g. Bank of America’s Python-based Quartz or Goldman Sachs’ SecDB/Slang)

• This focus on how to analyse data is supported by a focus on data literacy, part of ABM’s overall agnostic approach


Benefits of Data Science tools

• Open Source typically

• Agile development

• Rapid prototyping

• No vendor lock-in

• Culture of collaboration (GitHub, Decision Inc)

• Easier to diagnose/audit problems

• Speed – ability to get started quickly

• Ability to start small

• Attracting better talent

• Future proofing

• Repeatable processes/queries

• Lowers costs

• Acts as a base for further product development


Why Open Source?

• Fast to market – ability to jump straight into data

• Take advantage of best of breed innovations across the open source community

• Allows value to be proven without significant up front investment.

• Helps you understand data before productionizing

• There is nothing worse than building a prototype and only then realizing that the analysis still needs work

• Leads to further analysis/questions

• Further advanced analytics work can also be performed utilizing the data science stack of tools


Common Use Cases


PYTHON EXAMPLES


Finance - Loan Classification – LINK

Analysis of what factors affect a client’s ability to repay a loan (or not).
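
A rough sketch of the first step in this kind of analysis: comparing repayment rates across the categories of one candidate factor. The records, the `employment_status` factor and the function name are made up for illustration and are not taken from the linked example. It uses only the standard library so it runs anywhere; on a real project pandas and scikit-learn would be the more usual tools.

```python
from collections import defaultdict

# Hypothetical toy records: (employment_status, repaid). Not real loan data.
LOANS = [
    ("employed", True),
    ("employed", True),
    ("employed", False),
    ("self-employed", True),
    ("self-employed", False),
    ("unemployed", False),
    ("unemployed", False),
    ("unemployed", True),
]


def repayment_rate_by(records):
    """Return {category: share of loans repaid} for (category, repaid) pairs."""
    totals = defaultdict(int)
    repaid = defaultdict(int)
    for category, was_repaid in records:
        totals[category] += 1
        if was_repaid:
            repaid[category] += 1
    return {category: repaid[category] / totals[category] for category in totals}


rates = repayment_rate_by(LOANS)
# Print categories from highest to lowest repayment rate.
for category, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:15s} {rate:.0%}")
```

A large gap between categories suggests the factor is worth feeding into a proper classification model; a small gap suggests it adds little.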


Finance - Stock market forecasting using Prophet - LINK

Predicting future prices based on a new forecasting model called Prophet
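
Prophet fits an additive model made up of trend, seasonality and holiday effects. The sketch below is not Prophet: it is a deliberately simplified stand-in that fits only a straight-line trend by least squares and extends it forward, to show the basic fit-then-extrapolate idea. The prices and function names are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares fit of value = intercept + slope * t, with t = 0..n-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return intercept, slope


def forecast(values, periods):
    """Extend the fitted trend line for the given number of future periods."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods)]


# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 104.0, 106.0, 108.0]
next_three = forecast(prices, 3)
print(next_three)
```

Real price series are far noisier than this toy input, which is exactly why a model with seasonality and uncertainty intervals is used in practice.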


Finance - Algorithmic Based Trading using Yahoo Finance – LINK

Creating a simple trading strategy with the help of DataCamp
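
One common beginner strategy is a moving-average crossover: hold the asset while a short-term average sits above a long-term one. The sketch below assumes that strategy for illustration; it is not necessarily the one used in the linked example. The prices are hypothetical and the window sizes are kept tiny so the result is easy to check by hand.

```python
def moving_average(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def crossover_signals(prices, short_window=2, long_window=4):
    """Return 1 (hold) where the short average is above the long one, else 0."""
    short_ma = moving_average(prices, short_window)
    long_ma = moving_average(prices, long_window)
    signals = []
    for short_value, long_value in zip(short_ma, long_ma):
        if short_value is None or long_value is None:
            signals.append(0)  # not enough history yet
        else:
            signals.append(1 if short_value > long_value else 0)
    return signals


# Hypothetical daily prices: a rise followed by a fall.
prices = [10, 11, 12, 13, 12, 11, 10, 9]
signals = crossover_signals(prices)
print(signals)
```

With real data the prices would come from a market data source and the windows would be much longer; the logic stays the same.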


Geospatial – LINK and LINK

Visualizing Uber Trips and using Plotly for Geospatial analysis
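
Before plotting trips on a map, a typical first calculation is the straight-line distance between pickup and drop-off. A minimal sketch using the haversine formula follows; the coordinates are approximate, hypothetical trip endpoints and the function name is illustrative rather than taken from the linked examples.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pickup and drop-off (roughly central Sydney to Bondi Beach).
pickup = (-33.8688, 151.2093)
dropoff = (-33.8915, 151.2767)
print(f"Trip distance: {haversine_km(*pickup, *dropoff):.1f} km")
```

This gives distance as the crow flies, not road distance, so it is best treated as a lower bound on the actual trip length.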


INTEGRATION WITH QLIK


Qlik Sense offers native and third party capabilities

• Native capabilities include statistical functions, interactive visualizations, and scripting for many advanced analytics use cases

• Advanced Analytics Integration offers engine-level sharing between Qlik Sense and third party tools such as R and Python, for incorporating advanced calculation and machine learning into analyses


The Old Way

[Diagram: Data Acquisition, Data Preparation, Data Creation, Data Selection, Model Selection, Result Generation, Result Presentation, Results Interpretation, Action Steps]


The New Way

[Diagram: Model Creation, Governance Oversight, Data Acquisition, Data Selection, Data Preparation, Auto Model Selection, Advanced Parameterisation, Basic Parameterisation, Result Generation, Result Preparation, Action Steps]


Benefits

Advanced analytics in Qlik Sense deliver powerful insights to business users

• Data scientists build advanced models and calculations

• Business decision makers can utilize them in the context of associative exploration

• Analytics are calculated and visualized in real-time as the user explores, based on selected context


Introducing Advanced Analytics Integration

• Direct integration with 3rd party advanced analytics engines through server-side extension APIs

• Allows data to be directly exchanged between the Qlik engine and external tools during analysis

  – Leverages Qlik’s Associative Engine to pass relevant data based on user context

• Full integration with Qlik Sense expressions and libraries

• Connectors can be built for any external engines

• Open source Analytic Connections are available on GitHub for R and Python (https://github.com/qlik-oss)

• Demo apps built by Nabeel from Qlik are also available at https://github.com/nabeel-oz/qlik-py-tools


How it works


Example Expression Syntax


DEMO




Getting Started


Foundations of a Data Scientist

[Venn diagram: Subject Matter Expertise, Maths & Statistics, Computer Science]

No matter which area you’re coming from, the key takeaway is that you can easily upskill on the other parts.

We can’t all be unicorns… but they’re not real anyway… or are they?


How to get started

• Install Anaconda (https://www.anaconda.com/distribution/)

• Do a project on data you’re interested in

• You’ll learn all the tools you need for that project (pandas, NumPy, visualisation packages, and scikit-learn for machine learning)

• It will be hard, but if you’re invested in the project you will learn what you need to

• If you hit a roadblock on something you’re interested in, you are far more likely to push through it

• Talk about it afterwards, teach it, write a blog post


Where to get data

Work Sources

- Your own work data is a great place to start and can help convince your business to fund your learning

Australian Sources

- www.data.gov.au

- State sources as well, such as the Bureau of Crime Statistics and Research (BOCSAR) - https://bocsar.nsw.gov.au/Pages/bocsar_datasets/Datasets-.aspx

Other

- Kaggle Data competitions - https://www.kaggle.com/datasets
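
Most of these sources offer CSV downloads. A minimal sketch of a first look at such a file follows, using only the standard library. The inline sample, its column names and the `summarise` helper are hypothetical stand-ins for whatever file is actually downloaded; pandas would be the more usual choice once Anaconda is installed.

```python
import csv
import io
import statistics

# Hypothetical stand-in for the contents of a downloaded CSV file.
SAMPLE_CSV = """suburb,year,incidents
Alpha,2017,120
Alpha,2018,135
Beta,2017,80
Beta,2018,70
"""


def summarise(csv_text, column):
    """Return count, mean, min and max of one numeric column in CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


summary = summarise(SAMPLE_CSV, "incidents")
print(summary)
```

A quick summary like this is often enough to spot missing values or odd units before any modelling starts.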


Learning Materials

Online Courses

- Udemy – usually have specials on

  - Mark (Presenter) is currently taking the following courses: Python for Finance: Investment Fundamentals & Data Analytics, Python for Financial Analysis & Algorithmic trading, Forecasting Models with Python, Interactive Python Dashboards with Plotly and Dash, Python for Data Science and Machine Learning Bootcamp

- Coursera

- Udacity

- DataCamp

Podcasts

- DataFramed (by DataCamp) – Good for showcasing how different data scientists are doing their work in their fields and how they overcome various issues they encounter

- Traders Who Code episode on Chat with Traders - Finance focused but great for getting started with data science tools: https://www.youtube.com/watch?v=Jr0szNMsgMY


ABM Systems

3 Spring Street, Sydney NSW 2000 Australia

Tel: 02 8249 4351

Fax: 02 8249 4001

Support: 02 9029 8021

Email: [email protected]

Web: www.abmsystems.com