Analytics - Lessons Learnt

Dr. Venkata Pingali

April 1, 2016

Basic Process

Conceptual Process

[Diagram: Biz, Analytics Team, and Data Engg exchange Qtns/Context, Data Req, Datasets, Model Results, and Story Telling]

All three roles could be in a single team!

Process in Reality

[Diagram: Biz, Analytics Team, and Data Engg exchange Qtns/Context, Data Req, Datasets, Model Results, and Story Telling]

Iterative, Uncertain, Expensive, Laborious

http://fortune.com/2016/02/05/why-big-data-isnt-paying-off-for-companies-yet/

"80% of .. companies' strategic decisions go haywire .. “flawed” data"

Nature of Domain

Sense-making with Purpose

● Goal is impact - real change in the real world
  ○ Not mathematical machoness
  ○ Not blogs, presentations, etc.
● Model + Delivery = Impact
● Model - an approximation to real world
  ○ Three levels - Question, Domain, Process
  ○ Real world has (unknown) complexity
  ○ Not an end in itself
● Delivery - Facilitation of incremental change
  ○ Multiple levels - Mindsets, technology, processes

Closing loop is a reality check

An imperfect search process

● Imperfect questions, data, and process
  ○ Complexity discovered over time
  ○ Iterative refinement
● Laborious, error prone, and always incomplete
  ○ Data preparation (60-80% of work) is error prone
  ○ Questions -> Answers -> Questions
● Initial framing is just the beginning
  ○ Story will reveal itself over time

Design for uncertainty

Successful Analytics Shifts Power

● There are winners and losers
  ○ Change is always painful
  ○ Efficiencies have to come from somewhere
● Mostly through power to contradict
  ○ Upsets conventional wisdom

● Sometimes through new paths forward

Analytics is serious business

Trust is #1 requirement

● Change requires trust in output (evidence and path forward)
● Gaining trust is hard work
  ○ Delivered by what you do and how
  ○ All the time and everything you do
● Integrity required through the entire lifecycle
  ○ Data
  ○ Process
  ○ Interpretation

Design for trust

Math is either correct or not

● Sense-making may be qualitative but data or transformations are not
  ○ Every step is a mathematical step
● Correct math is the basis for trust
  ○ Process is laborious
  ○ Work should not be trusted by default!
● “Hidden” transformations are risky
  ○ Excel changes
  ○ Filtering rules

Mathematical indiscipline will be punished
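
One way to keep filtering rules from staying "hidden" is to name each rule and record how many rows it drops. The sketch below is only an illustration: the `AuditedPipeline` class, the rule names, and the sample `orders` rows are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class AuditedPipeline:
    """Applies named filtering rules and records the row counts for each one."""
    log: List[dict] = field(default_factory=list)

    def apply_filter(self, rows: List[Row], name: str,
                     keep: Callable[[Row], bool]) -> List[Row]:
        kept = [r for r in rows if keep(r)]
        # Every rule leaves a trace: what it was called and what it removed.
        self.log.append({
            "step": name,
            "rows_in": len(rows),
            "rows_out": len(kept),
            "dropped": len(rows) - len(kept),
        })
        return kept


# Hypothetical sample data, for illustration only.
orders = [
    {"id": 1, "amount": 120.0, "region": "north"},
    {"id": 2, "amount": -5.0, "region": "north"},
    {"id": 3, "amount": 40.0, "region": None},
    {"id": 4, "amount": 75.5, "region": "south"},
]

pipeline = AuditedPipeline()
clean = pipeline.apply_filter(orders, "drop negative amounts",
                              lambda r: r["amount"] >= 0)
clean = pipeline.apply_filter(clean, "drop missing region",
                              lambda r: r["region"] is not None)

for entry in pipeline.log:
    print(entry)
```

The log can then travel with the results, so a reviewer can see which rule removed which share of the data.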

Efficiency is #2 requirement

● Data science getting out of the lab environment
● Decision makers have realized that they could be wrong (often?)
  ○ Need to be contradicted only once - happening frequently
  ○ Now they are asking for input in all areas
● Sea change in last 4 years
● Growing combinations - #decisions x #scope x #frequency x #depth
  ○ Growing much faster than people & process can cope

Process efficiency is essential to scaling
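
The "growing combinations" product can be made concrete with a little arithmetic. This is a rough sketch with made-up numbers; the function name and the sample values are assumptions, not figures from the talk.

```python
def analysis_combinations(decisions: int, scope: int,
                          frequency: int, depth: int) -> int:
    """Number of distinct analyses implied by the four multiplied factors."""
    return decisions * scope * frequency * depth


# Hypothetical values: each factor merely doubles between the two cases.
before = analysis_combinations(decisions=10, scope=2, frequency=4, depth=2)
after = analysis_combinations(decisions=20, scope=4, frequency=8, depth=4)

print(before, after, after / before)
```

Because the factors multiply, doubling each of the four raises the workload by 2^4 = 16 times, while team size rarely grows anywhere near that fast.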

Team Character determines Quality

● Fundamentally about collaborative reasoning under uncertainty
  ○ Need a creative group of people
● Balanced skill along multiple dimensions
  ○ Domain (technology, business, individual)
  ○ Approach (model, experiment, field work)
  ○ Engagement (presentation, tech delivery, ops)
● Balanced process
  ○ Increased curiosity bandwidth will give people mastery, purpose

● Sense of purpose

Look to build a strong team

Surviving the Insight Ladder

● Step 1 Wrangling - Get to facts at summary level
● Step 2 Discovery - Frame initial questions & iterate to get to real questions
● Step 3 Relevance - Meaningful imprecise answers
● Step 4 Accuracy - Meaningful precise answers
● Step 5 Robustness - Meaningful, precise, robust answers

Continuously increase curiosity bandwidth

Time spent here = Curiosity bandwidth

Business

Has to be shared organizational experience

● Mistakes are frequent
  ○ Through the entire lifecycle
● Domain knowledge is discovered
  ○ More important than math

Make analytics a collective experience

Costs are front-loaded

● Data preparation/wrangling
  ○ Takes arbitrary amount of time
  ○ Time/Effort ~ #elements ^ 2
● Errors in model development and operation
● Data version updates
● Changes in narratives

Budgeting and expectation setting should be realistic
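
For budgeting, the quadratic rule of thumb (Time/Effort growing with the square of #elements) can be turned into a quick estimate. The sketch assumes the relation holds exactly and that the constant of proportionality is unknown, so it only compares effort against a baseline; the function name and the element counts are hypothetical.

```python
def relative_effort(n_elements: int, baseline_elements: int) -> float:
    """Effort relative to a baseline, assuming effort scales as n_elements ** 2."""
    return (n_elements / baseline_elements) ** 2


# Hypothetical element counts (e.g. tables or fields to prepare).
for n in (10, 20, 40):
    print(n, relative_effort(n, baseline_elements=10))
```

Under this assumption, doubling the number of elements quadruples the preparation effort, which is why early estimates based on a small pilot tend to run low.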

Empathetic delivery

● Analytics has collateral damage
  ○ People get fired, budgets are cut, new responsibilities get added
● Empathetic positioning and language
  ○ Understand that everybody wants to do their job well
  ○ People are not dumb
● Incremental actionables
  ○ Show way forward in bite-sized chunks

Plan the delivery carefully

Analytics work is risky

● Over-hyped context
  ○ Bigger, better examples everywhere - real or imagined
● Burden of expectations/magic from customer
● Things go wrong
  ○ Underwhelming/no results, methodological issues, wrong data
● Crisis as a teaching moment
  ○ Culture of learning, understanding and continuous refinement

Enable team to take risks and have honest conversations

Individual

Don't be Pygmalion

● Don't fall in love with data
  ○ It is imperfect like everything else
● Even simple data is too rich
  ○ You see what you want to see
● Be deeply skeptical
● Explore without judgment, detached

Develop non-judgmental curiosity

Extra

Decision-maker Questions

1. Where did the numbers come from? (Correctness, Lineage)
   a. Assumption, models, datasets
2. Is this an accident? Does it hold now? (Reproducibility, Retargetability)
   a. Model, dataset, and question revisions
3. Can you get the results faster? (Efficiency)
   a. Time, effort, cost
4. Can you also analyze X? (Extensibility)
   a. Different dataset, question
5. Could we try X? (Dataset generation - synthetic and real)
   a. What if scenarios, field experiments
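
Questions 1 and 2 (lineage and reproducibility) are easier to answer if every result carries a fingerprint of what produced it. Below is a minimal sketch using only the Python standard library; the `run_fingerprint` function, the parameter names, and the sample inputs are hypothetical.

```python
import hashlib
import json


def run_fingerprint(dataset_bytes: bytes, params: dict, code_version: str) -> str:
    """Hash the dataset, model parameters, and code version into one identifier."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(dataset_bytes).digest())
    # sort_keys makes the hash independent of dictionary key order.
    h.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    h.update(code_version.encode("utf-8"))
    return h.hexdigest()


# Hypothetical inputs, for illustration only.
data = b"id,amount\n1,120.0\n4,75.5\n"
fp_a = run_fingerprint(data, {"model": "linear", "alpha": 0.1}, "v1")
fp_b = run_fingerprint(data, {"alpha": 0.1, "model": "linear"}, "v1")
fp_c = run_fingerprint(data, {"model": "linear", "alpha": 0.2}, "v1")

print(fp_a == fp_b)  # same inputs, different key order
print(fp_a == fp_c)  # a changed parameter
```

Storing the fingerprint next to each reported number lets the team check later whether a result came from the same data, assumptions, and code.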