BIG DATA AGILE ANALYTICS Ken Collier, PhD Director, Agile Analytics @theagilist #thoughtworks

Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks


DESCRIPTION

We are in the midst of an exciting time. There is an explosion of very interesting data, along with the emergence of powerful new technologies for harnessing it and of devices that enable people to receive tremendous benefits from it. What is required are innovative processes that enable the creation and delivery of value from all of that data. More often than not, it is the predictive (what will happen?) and prescriptive (how to make it happen!) analytics that produce this value, not the raw data itself.

Agile software teams continually work on projects that involve rich, complex, and messy data. Often this data represents innovative analytics opportunities. Being analytics-aware gives these teams the opportunity to collaborate with stakeholders to innovate by creating additional value from the data. This session is aimed at making Agile software teams more analytics-aware so that they will recognize these innovation opportunities.

The trouble with conventional analytics (like conventional software development) is that it involves long, phased, sequential steps that take too long and fail to deliver actionable results. This talk will examine the convergence of the following elements of an exciting emerging field called Agile Analytics:

• sophisticated analytics techniques, plus
• lean learning principles, plus
• agile delivery methods, plus
• so-called "big data" technologies

Learn:

• The analytical modeling process and techniques
• How analytical models are deployed using modern technologies
• The complexities of data discovery, harvesting, and preparation
• How to apply agile techniques to shorten the analytics development cycle
• How to apply lean learning principles to develop actionable and valuable analytics
• How to apply continuous delivery techniques to operationalize analytical models

Citation preview

Page 1: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

BIG DATA AGILE ANALYTICS Ken Collier, PhD Director, Agile Analytics @theagilist #thoughtworks


Page 2: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Axes: Value, Complexity

Descriptive Analytics: What happened?

Diagnostic Analytics: Why did it happen?

Predictive Analytics: What will happen?

Prescriptive Analytics: How can we make it happen?

Page 3: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Axes: Value, Complexity

Descriptive Analytics: What happened?

Diagnostic Analytics: Why did it happen?

Predictive Analytics: What will happen?

Prescriptive Analytics: How can we make it happen?

Groupings: Traditional Business Intelligence, Advanced Analytics

Page 4: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact, Advanced Analytics

Page 5: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact, Advanced Analytics

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence

Page 6: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Big Data Analytics Pipeline

Diagram labels: Modeling Data, Operational Data, External Data, Data Integration, Clean Data, Dimension Mapping, Dimensional Data, Reporting Engine, Report (three instances), Data Sampling, Feature Selection, Data Partitioning, Training Data, Test Data, Analytical Modeling, Candidate Model, Model Validation, Accepted Model
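The modeling labels on this slide (data partitioning, training and test data, analytical modeling, candidate model, model validation, accepted model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the single-feature threshold "model", the 30% test split, and the 0.8 acceptance bar do not come from the talk, and data sampling and feature selection are left out.

```python
import random
from statistics import mean


def partition(rows, test_fraction=0.3, seed=42):
    """Data partitioning: shuffle, then split into training and test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def fit_threshold_model(training):
    """Analytical modeling (toy): split at the midpoint between class means."""
    positives = [x for x, label in training if label == 1]
    negatives = [x for x, label in training if label == 0]
    threshold = (mean(positives) + mean(negatives)) / 2
    return lambda x: 1 if x >= threshold else 0


def validate(model, test):
    """Model validation: accuracy on the held-out test data."""
    hits = sum(1 for x, label in test if model(x) == label)
    return hits / len(test)


def build_accepted_model(rows, min_accuracy=0.8):
    """Return (accepted model or None, accuracy of the candidate model)."""
    training, test = partition(rows)
    candidate = fit_threshold_model(training)
    accuracy = validate(candidate, test)
    accepted = candidate if accuracy >= min_accuracy else None
    return accepted, accuracy


# Made-up modeling data: (single feature value, label).
rows = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]
model, accuracy = build_accepted_model(rows)
```

The point of the split is that the candidate model is judged only on data it never saw during fitting; a candidate that misses the acceptance bar is rejected rather than deployed.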

Page 7: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact, Advanced Analytics

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence

Page 8: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence

Page 9: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

How Advanced Analytics Works

Discover & Explore; Analyze & Act

Data Convergence; Analytical Divergence

Discover, Harvest, Filter, Integrate, Augment, Analyze, Act

Analytical Opportunities: If we knew X, we could do Y

Page 10: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Traditional Analytics

Typical Timeline: 3-6 months, 2 months, 2-4 months

Data Convergence; Analytical Divergence

Discover, Harvest, Filter, Integrate, Augment, Analyze, Act

Analytical Opportunities: If we knew X, we could do Y

Page 11: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery

Page 12: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure

Page 13: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agility in Analytics

BUILD, MEASURE, LEARN

Repeat this cycle solving small problems every few days

Data Convergence; Analytical Divergence

Discover, Harvest, Filter, Integrate, Augment, Analyze, Act

Analytical Opportunities: If we knew X, we could do Y

Page 14: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Like this example…

High value business goal: Retain high value customers

Page 15: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

What’s the smallest, simplest thing we can do?

Like this example…

Retain high value customers

Common features of defectors?
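One very small first step toward "common features of defectors" could be to compare feature averages between customers who left and customers who stayed. The sketch below is only an illustration: the records, the field names (`monthly_visits`, `support_tickets`, `defected`), and the use of a difference of means are all assumptions, not something taken from the talk.

```python
from statistics import mean

# Hypothetical customer records; field names and values are made up.
customers = [
    {"monthly_visits": 2, "support_tickets": 5, "defected": True},
    {"monthly_visits": 1, "support_tickets": 4, "defected": True},
    {"monthly_visits": 3, "support_tickets": 6, "defected": True},
    {"monthly_visits": 9, "support_tickets": 1, "defected": False},
    {"monthly_visits": 8, "support_tickets": 0, "defected": False},
    {"monthly_visits": 10, "support_tickets": 2, "defected": False},
]


def feature_gaps(records, features, flag="defected"):
    """Return {feature: mean among defectors - mean among retained customers}."""
    defectors = [r for r in records if r[flag]]
    retained = [r for r in records if not r[flag]]
    return {
        f: mean(r[f] for r in defectors) - mean(r[f] for r in retained)
        for f in features
    }


gaps = feature_gaps(customers, ["monthly_visits", "support_tickets"])

# Features ordered by how strongly defectors differ from retained customers.
ranked = sorted(gaps, key=lambda f: abs(gaps[f]), reverse=True)
```

A gap this simple is not a model, but it can be produced and checked with stakeholders quickly, which is what makes it useful for deciding whether the next question is worth asking.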

Page 16: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Is it useful & actionable?

Like this example…

Retain high value customers

Common features of defectors?

Page 17: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Repeat!

Like this example…

Retain high value customers

Common features of defectors?

Shopping behaviors of defectors?

Page 18: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Like this example…

Retain high value customers

Common features of defectors?

What leads to customers leaving?

Shopping behaviors of defectors?

What do defectors say about us?

Customers’ sentiment before defecting?

What encourages customers to stay?

Do incentives reduce defection rates?
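A question such as "Do incentives reduce defection rates?" can start as a plain comparison of two groups. The sketch below assumes made-up outcome lists and hypothetical names (`with_incentive`, `without_incentive`); it only measures the gap, and a real analysis would also need comparable groups and a significance check before anyone acts on the number.

```python
def defection_rate(outcomes):
    """Fraction of customers in a group who defected (True means defected)."""
    return sum(outcomes) / len(outcomes)


# Made-up outcomes for two hypothetical groups of 20 customers each.
with_incentive = [False] * 18 + [True] * 2
without_incentive = [False] * 14 + [True] * 6

rate_with = defection_rate(with_incentive)
rate_without = defection_rate(without_incentive)

# Positive value: the incentive group defected less often in this sample.
absolute_reduction = rate_without - rate_with
```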

Page 19: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Problem solved or continue?

Like this example…

What leads to customers leaving?

Common features of defectors?

Shopping behaviors of defectors?

What do defectors say about us?

Customers’ sentiment before defecting?

What encourages customers to stay?

Do incentives reduce defection rates?

Page 20: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Science, Machine Learning, Statistics

Page 21: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

THE “DATA SCIENTIST”

Machine Learning
• Artificial Neural Networks
• Decision Tree Learning
• Support Vector Machines
• Clustering
• …and many more…

Statistical Modeling
• Bayesian Classification
• Monte Carlo Simulation
• Logistic Regression
• K-Nearest Neighbor
• …and many more…

Domain Knowledge
• Data Semantics
• Business Understanding
• Business Communication

Programming Skills
• Functional Programming
• Data “Wrangling”
• Map/Reduce, SQL, & NoSQL

Page 22: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure

Page 23: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

drones.pitchinteractive.com

Data Visualization

Page 24: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks
Page 25: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Reduction

Page 26: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Objective Truth

Discoverable Truth

Uninterpretable

Irrelevant Noise

Not Actionable

Impactful New Insights

“Little Data”

Page 27: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Reduction, Insight, Knowledge, Action, Disruption

Page 28: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Reduction, Insight, Knowledge, Action, Disruption, Business vs. IT, Focus vs. Platform, Monitor & Measure

Page 29: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks
Page 30: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks
Page 31: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Reduction, Insight, Knowledge, Action, Disruption, Business vs. IT, Focus vs. Platform, Monitor & Measure, Privacy Controls, Radical Transparency, Data Democracy, Open Data

Page 32: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks
Page 33: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks
Page 34: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Agile Analytics

Elements: Advanced Analytics, Big Data, Solutions Thinking, Ethics, Agile Delivery, Lean Learning, Impact

Surrounding labels: Data Science, Visual Storytelling, Machine Learning, Statistics, Volume, Velocity, Variety, NoSQL, Complexity, Polyglot Persistence, Continuous Integration, Collaboration, Evolve, Continuous Delivery, Hypothesis, Build, Learn, Measure, Data Reduction, Insight, Knowledge, Action, Disruption, Business vs. IT, Focus vs. Platform, Monitor & Measure, Privacy Controls, Radical Transparency, Data Democracy, Open Data

Page 35: Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtworks

Ken Collier, Director, Agile Analytics [email protected]

Cool New Technologies + Sophisticated Analytics + Lean Learning Principles + Fast Agile Delivery = Value Creation