TOPICS - joomla.damapdx.org/images/Presentations/dama presentation with dem…

TOPICS

• What is BIG DATA? • What is BIG DATA ANALYTICS? • Overview of MACHINE LEARNING • Tools & Technologies of BIG DATA ANALYTICS • Advanced Data Visualization Tools • DEMONSTRATION • MARKET TRENDS

• Volume: from terabytes to petabytes and up • Variety: an expanding universe of data types and sources • Velocity: accelerated data flow in all directions

Challenges of traditional data management techniques

Volume

Traditional analytics are often designed to analyze relatively small sample sizes

Data storage across multiple drives presents problems for traditional techniques

The cost to analyze large data sets using traditional techniques is too high from both a time and memory perspective

Velocity

Rapidly changing data sets require dynamic, real-time analysis that is not available with traditional techniques

Information Management processes need to intelligently decide in real time what data to save and what data to discard. This is not possible with traditional techniques
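One way to picture a real-time "keep or discard" decision is reservoir sampling, which holds a fixed-size uniform random sample of a stream without ever storing the whole stream. This is only an illustrative sketch of one such technique, not a method the presentation prescribes; the function name and parameters are hypothetical.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Each arriving item is either saved (possibly evicting an older one)
    or discarded immediately, so memory use stays at k items.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items are always kept.
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir

# Usage: sample 5 readings from a simulated stream of 10,000 events.
sample = reservoir_sample(range(10_000), 5, random.Random(0))
print(sample)
```

The decision for each item is made once, as it arrives, which is the property that batch-oriented tools lack.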

Variety

The proliferation of data types creates compatibility issues with traditional tools

The increasing demand for data mash-ups and deep insights challenges traditional techniques that struggle with non-numeric data

Demystifying Machine Learning

Categories of Machine Learning Tasks

• Supervised Learning • Unsupervised Learning • Reinforcement Learning

Examples of Supervised Learning

• Given data about the size of houses on the real estate market, try to predict their price – Regression Problem

• Given data about the size of houses on the real estate market, ascertain whether the home(s) in question will sell for more or less than the asking price – Classification Problem
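The two house examples above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sizes, prices, and "sold above asking" labels are made up, ordinary least squares stands in for the regression step, and a 3-nearest-neighbor vote stands in for the classification step. Neither is claimed to be the method used in the presentation.

```python
from statistics import mean

# Hypothetical training data (invented for illustration only).
sizes = [1000, 1500, 2000, 2500, 3000]                      # square feet
prices = [200_000, 290_000, 410_000, 500_000, 600_000]     # sale price in USD
sold_above_asking = [0, 0, 1, 1, 1]                         # 1 = sold above asking

def fit_line(xs, ys):
    """Regression: fit price = slope * size + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict_price(size, slope, intercept):
    """Predict a continuous value (the price) for a new house."""
    return slope * size + intercept

def classify_knn(size, xs, labels, k=3):
    """Classification: majority vote among the k houses closest in size."""
    nearest = sorted(zip(xs, labels), key=lambda pair: abs(pair[0] - size))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

slope, intercept = fit_line(sizes, prices)
print(predict_price(1800, slope, intercept))           # a number: regression
print(classify_knn(2700, sizes, sold_above_asking))    # a category: classification
```

The difference between the two tasks is visible in the return types: regression returns a number on a continuous scale, while classification returns one of a fixed set of labels.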

Examples of Unsupervised Learning

• Automatically group a collection of 1000 essays into a small number of groups whose members are similar or related on variables such as word frequency, sentence length, page count, and so on – Clustering Problem

• Suppose a pediatrician, over years of experience, forms associations in his mind between patient characteristics and illnesses. When a new patient shows up, the doctor uses the patient’s characteristics (symptoms, physical attributes, mental outlook, etc.) to associate possible illnesses with what he has seen before in similar patients – Association Problem
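The essay-grouping example can be sketched with k-means, one common clustering algorithm. This is a minimal sketch under assumptions: each essay is reduced to two hypothetical features (average sentence length in words, page count), the values are invented, and the first k points are used as starting centroids for the sake of a repeatable run. Real use would typically scale the features and choose starting points more carefully.

```python
from math import dist

# Hypothetical features per essay: (average sentence length, page count).
essays = [(12.0, 3.0), (30.0, 10.0), (11.0, 2.0),
          (13.0, 4.0), (28.0, 9.0), (32.0, 11.0)]

def k_means(points, k, iterations=20):
    """Group points into k clusters; no labels are supplied in advance."""
    centroids = list(points[:k])  # naive, deterministic starting centroids
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels

centroids, labels = k_means(essays, k=2)
print(labels)
```

Unlike the supervised examples, nothing tells the algorithm what the groups mean; it only reports which essays sit close together in feature space.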

Big Data Analytics - Technologies

• Data Mining • Hadoop • In-memory Analytics • Predictive Analytics • Text Mining • Data Visualization

Why Put Big Data and Analytics Together?

• Big Data provides very large statistical samples, which improve the results of analytic tools • Analytic tools and databases can now handle big data • The economics of analytics are now more favorable than ever • Messy data still has a lot to teach, as long as there is enough of it

Options for Big Data Analytics Plotted by Potential Growth and Commitment

Advanced Data Visualization Tools

Key Capabilities of Advanced Data Visualization Tools

• Highly interactive graphics • Intuitive analytic capabilities • Easy report building • In-memory processing capabilities

Forrester Wave: Advanced Data Visualization Platforms

Key Capabilities Pillars: Tableau Software

• Enables fast, ad-hoc analysis of Big Data • Provides advanced in-memory analytics • Makes data mash-ups easy • Gives users powerful, self-service analytics

Walk-through of Visualizations

• Highlight Table • Tree Maps • Maps – Symbol Maps, Filled Maps

Evolving Business Analytics: From Descriptive to Prescriptive

Gartner’s Definition: Prescriptive Analytics

Optimization is at the heart of Prescriptive Analytics
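As a rough illustration of why optimization sits at the center of prescriptive analytics: a predictive model estimates what will happen under each option, and an optimizer then recommends the option with the best outcome. The sketch below assumes a made-up linear demand model, a made-up unit cost, and a simple search over candidate prices; all names and numbers are hypothetical.

```python
def predicted_demand(price):
    """Hypothetical predictive model: units sold at a given price."""
    return max(0.0, 1000 - 20 * price)

def profit(price, unit_cost=10.0):
    """Objective to maximize: margin per unit times predicted units sold."""
    return (price - unit_cost) * predicted_demand(price)

# Candidate decisions: prices from 10.00 to 50.00 in steps of 0.50.
candidate_prices = [p / 2 for p in range(20, 101)]

# Prescriptive step: recommend the decision with the best predicted outcome.
best_price = max(candidate_prices, key=profit)
print(best_price, profit(best_price))
```

The predictive part answers "what will demand be at this price?", while the prescriptive part answers "which price should we set?".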

Big Data Predictions for 2017

• SQL is not going anywhere • Turn “data lakes” into insights • Combine Big Data with Machine Learning for real-time analytics

Business Analytics Trends for 2017

• Modern BI is the new normal • Analytics are everywhere, thanks to embedded BI • People will work with data in more natural ways • Advanced Analytics becomes more accessible • Collaborative analytics will take center stage

TOPICS

• What is DATA SCIENCE? • Skills of a Data Scientist • Applications of Data Science

What is Data Science?

Skills of a Data Scientist

• Business Acumen • Knowledge of Hadoop • Skills in SQL • Competent in R Programming and Statistical Analysis/Techniques • Skills in coding (e.g. Java, Python) • Adept at Data Visualization • Skills in Effective Communication

Zuckerberg Test for Hiring a Data Scientist

• They know the techniques of Data Science • They know the tools of Data Science • They can think at different altitudes • They are superb communicators • They can give and take criticism well • They are confident but not arrogant • They get things done • They make the people around them better • They are fun to be around

Types of Data Science Projects

• Tactical Optimization • Predictive Analytics • Nuanced Learning • Recommendation engines • Automated decision engines

Industry Specific Applications of Data Science - Finance

• Online Lending

Industry Specific Applications of Data Science - Travel

• Truly Personalized Offers • Enhanced Customer Service • Safer Travel

Industry Specific Applications of Data Science - Retail

• Customer Experience • Merchandizing • Marketing • Supply Chain Logistics

THANK YOU

APPENDIX

Highlight Table

• Step 5: Drag Profit Measure into Rows

Visualization #2: Highlight Table Chart

Tree Maps

• Step 5: Drag Profit Measure into Label Mark

Visualization #8: Tree Map Chart

Symbol Maps

• Step 5: Under ‘Measures’, drag “Profit” onto ‘Label’ in the ‘Marks’ area

Visualization #5: Symbol Maps

Filled Maps

• Step 4: Select ‘Filled Maps’ from the ‘Show Me’ dialog

Visualization #6: Filled Maps
