Intro to Data Science @ ABM
Data Science @ ABM
All rights reserved ABM Systems 2018
Contents
• TL;DR
• Introduction
• Python Examples
• Finance, FMCG/Retail, Geospatial
• Integration with Qlik Sense
• How to get started
TL;DR (too long; didn’t read)
• You can get started on data-science-like activities sooner than you think
• Getting started is free (download Qlik Sense, download Python/R, get free sample code from GitHub, Kaggle, DataCamp etc.)
• Learning this is easy – there are many materials online (Google is your friend)
• Data science tools work well standalone for ad-hoc analysis. Getting results used by a larger audience (putting code into production) means you need a tool like Qlik
• Talk to ABM about data science activities you want to do
INTRODUCTION
Introduction
• Popularity of data science continues to grow (global and domestic)
• Data science is not just for advanced analytics – tools typically used in that space can be, and often are, used to solve simpler problems in an agile way
• A focus on data science lets ABM concentrate on the data rather than the software, so analysis and problem solving can start sooner
• From there, ABM can automate solutions to go into 3rd party software or even create our own UX/front end experience
• Large organisations like investment banks have built their own in-house platforms and languages (e.g. Bank of America’s Python-based Quartz or Goldman Sachs’s SecDB/Slang)
• This focus on how to analyse data is supported by a focus on data literacy, part of the overall agnostic approach that ABM takes
Benefits of Data Science tools
• Typically open source
• Agile development
• Rapid prototyping
• No vendor lock-in
• Culture of collaboration (GitHub, Decision Inc)
• Easier to diagnose/audit problems
• Speed – ability to get started quickly
• Ability to start small
• Attracting better talent
• Future proofing
• Repeatable processes/queries
• Lowers costs
• Acts as a base for further product development
Why Open Source?
• Fast to market – ability to jump straight into data
• Take advantage of best of breed innovations across the open source community
• Allows value to be proven without significant up-front investment
• Helps you understand data before productionizing
• There is nothing worse than building a prototype and only then realizing that the analysis needs more work
• Leads to further analysis/questions
• Further advanced analytics work can also be performed utilizing the data science stack of tools
Common Use Cases
PYTHON EXAMPLES
Finance - Loan Classification – LINK
Analysis of which factors affect a client’s ability to repay a loan (or not).
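As a rough illustration of this kind of analysis, the sketch below compares repayment rates across one candidate factor. The records, the factor (employment status) and the function name are made up for the example and are not taken from the linked notebook; a real study would use actual loan data and typically a classifier from a library such as scikit-learn.

```python
from collections import defaultdict

# Made-up illustrative records: (employment_status, loan_repaid)
loans = [
    ("employed", True),
    ("employed", True),
    ("employed", False),
    ("self-employed", True),
    ("self-employed", False),
    ("unemployed", False),
    ("unemployed", False),
    ("unemployed", True),
]


def repayment_rate_by(records):
    """Return {factor_value: share of loans repaid} for (factor, repaid) pairs."""
    totals = defaultdict(int)
    repaid = defaultdict(int)
    for factor, was_repaid in records:
        totals[factor] += 1
        if was_repaid:
            repaid[factor] += 1
    return {factor: repaid[factor] / totals[factor] for factor in totals}


rates = repayment_rate_by(loans)

# Print factors from highest to lowest repayment rate
for status, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{status:15s} {rate:.0%}")
```

A large gap between groups suggests the factor is worth feeding into a proper model; it does not prove causation on its own.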
Finance - Stock market forecasting using Prophet - LINK
Predicting future prices using Prophet, a forecasting library open-sourced by Facebook
Finance - Algorithmic Trading using Yahoo Finance – LINK
Creating a simple trading strategy with the help of DataCamp
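One common "simple trading strategy" is a moving-average crossover: hold the stock while a short-term average price sits above a long-term one. The sketch below assumes that strategy for illustration only; the slide does not say which strategy the linked tutorial uses, and the prices, window sizes and function names here are invented rather than pulled from Yahoo Finance.

```python
def moving_average(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def crossover_signals(prices, short_window=3, long_window=5):
    """Return 1 (hold the stock) when the short average is above the long one, else 0."""
    short_ma = moving_average(prices, short_window)
    long_ma = moving_average(prices, long_window)
    signals = []
    for short_value, long_value in zip(short_ma, long_ma):
        if short_value is None or long_value is None:
            signals.append(0)  # not enough history yet
        else:
            signals.append(1 if short_value > long_value else 0)
    return signals


# Made-up closing prices, purely for illustration
prices = [10, 11, 12, 13, 14, 13, 12, 11, 10, 9]
signals = crossover_signals(prices)
print(signals)
```

In practice the windows would be much longer (for example tens of trading days) and the strategy would be backtested before anyone acted on it.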
Geospatial – LINK and LINK
Visualizing Uber Trips and using Plotly for Geospatial analysis
INTEGRATION WITH QLIK
Qlik Sense offers native and third party capabilities
• Native capabilities include statistical functions, interactive visualizations, and scripting for many advanced analytics use cases
• Advanced Analytics Integration offers engine-level sharing between Qlik Sense and third party tools such as R and Python, for incorporating advanced calculation and machine learning into analyses
The Old Way
• Data Acquisition
• Data Preparation
• Data Creation
• Data Selection
• Model Selection
• Result Generation
• Result Presentation
• Results Interpretation
• Action Steps
The New Way
• Model Creation
• Governance Oversight
• Data Acquisition
• Data Selection
• Data Preparation
• Auto Model Selection
• Adv. Parameterisation
• Basic Parameterisation
• Result Generation
• Result Preparation
• Action Steps
Benefits
Advanced analytics in Qlik Sense deliver powerful insights to business users
• Data scientists build advanced models and calculations
• Business decision makers can utilize them in the context of associative exploration
• Analytics are calculated and visualized in real-time as the user explores, based on selected context
Introducing Advanced Analytics Integration
• Direct integration with 3rd party advanced analytics engines through server-side extension APIs
• Allows data to be directly exchanged between the Qlik engine and external tools during analysis – Leverages Qlik’s Associative Engine to pass relevant data based on user context
• Full integration with Qlik Sense expressions and libraries
• Connectors can be built for any external engines
• Open source Analytic Connections are available on GitHub for R and Python (https://github.com/qlik-oss)
• Demo apps built by Nabeel from Qlik are also available at https://github.com/nabeel-oz/qlik-py-tools
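Conceptually, the exchange described above is row based: the Qlik engine sends the values for the user's current selection, and the external engine returns one result per row. The sketch below is only a stand-in for that idea in plain Python. A real connector talks to Qlik over the server-side extension protocol, which is left out here, and every name in the sketch (`evaluate_rows`, `margin`, the sample rows) is hypothetical.

```python
from typing import Callable, Iterable, Iterator, Sequence

Row = Sequence[float]


def evaluate_rows(rows: Iterable[Row], func: Callable[[Row], float]) -> Iterator[float]:
    """Apply `func` to each row of parameter values and yield one result per row."""
    for row in rows:
        yield func(row)


def margin(row: Row) -> float:
    """Hypothetical calculation: (sales - cost) / sales, or 0.0 when sales is zero."""
    sales, cost = row
    return (sales - cost) / sales if sales else 0.0


# Stand-in for the rows the engine would pass for the current selection
selected_rows = [(100.0, 60.0), (250.0, 200.0), (0.0, 10.0)]
results = list(evaluate_rows(selected_rows, margin))
print(results)
```

Because the function only sees the rows it is handed, the same calculation is re-run on a different slice of data each time the user changes their selection.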
How it works
Example Expression Syntax
DEMO
Getting Started
Foundations of a Data Scientist
• Subject Matter expertise
• Maths & Statistics
• Computer Science
No matter which area you’re coming from, the key takeaway is that you can easily upskill on the other parts.
We can’t all be unicorns… but they’re not real anyway… or are they?
How to get started
• Install Anaconda (https://www.anaconda.com/distribution/)
• Do a project on data you’re interested in
• You’ll learn all the tools you need for that project (Pandas, NumPy, visualisation packages, and machine learning libraries such as scikit-learn)
• It will be very hard, but if you’re invested in it you will learn what you need to
• If you hit a roadblock on something you’re interested in, you will more than likely push through and succeed
• Talk about it afterwards, teach it, write a blog post
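A first project can be as small as loading a file and summarising one column. The sketch below uses only the Python standard library so it runs on any install; the data and column names are invented for the example. With Anaconda installed, the Pandas equivalent would be roughly `pandas.read_csv(...)` followed by `DataFrame.describe()`.

```python
import csv
import io
import statistics

# Stand-in for a file you care about; with a real file, use open("my_data.csv")
raw = io.StringIO(
    "suburb,incidents\n"
    "A,12\n"
    "B,30\n"
    "C,18\n"
)

rows = list(csv.DictReader(raw))
incidents = [int(r["incidents"]) for r in rows]

summary = {
    "count": len(incidents),
    "mean": statistics.mean(incidents),
    "max_suburb": max(rows, key=lambda r: int(r["incidents"]))["suburb"],
}
print(summary)
```

From here the natural next steps are a chart of the same column and a question the summary raises, which is usually where the real learning starts.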
Where to get data
Work Sources
- Your own work data is a great place to start and can help convince your business to fund your learning
Australian Sources
- www.data.gov.au
- State sources as well, such as the Bureau of Crime Statistics and Research (BOCSAR) - https://bocsar.nsw.gov.au/Pages/bocsar_datasets/Datasets-.aspx
Other
- Kaggle Data competitions - https://www.kaggle.com/datasets
Learning Materials
Online Courses
- Udemy – usually has specials on
- Mark (Presenter) is currently taking the following courses: Python for Finance: Investment Fundamentals & Data Analytics, Python for Financial Analysis & Algorithmic trading, Forecasting Models with Python, Interactive Python Dashboards with Plotly and Dash, Python for Data Science and Machine Learning Bootcamp
- Coursera
- Udacity
- DataCamp
Podcasts
- DataFramed (by DataCamp) – Good for showcasing how different data scientists are doing their work in their fields and how they overcome various issues they encounter
- Traders Who Code episode on Chat with Traders - Finance focused but great for getting started with data science tools: https://www.youtube.com/watch?v=Jr0szNMsgMY
ABM Systems 3 Spring Street, Sydney
NSW 2000 Australia
Tel: 02 8249 4351 Fax: 02 8249 4001
Support: 02 9029 8021 Email: [email protected]
Web: www.abmsystems.com