LinkedIn.com/in/LarryMaccherone
If you build it, they will come... or will they?
Measuring Delivered Value
Is it an unachievable snipe hunt or can it really be done?
Shoeless Joe Jackson
“Is this heaven?...”
“No, it's Iowa...”
“Iowa? Is there a heaven?...”
“Oh yeah. It's the place where dreams come true.”
~ Field of Dreams
Why not trust our gut?
• Surprising results
Would you have guessed that the Nigerian scam artists whose opening emails have the WORST grammar make the MOST money?
• Human bias
• Confidence that you are not missing opportunities
These are not new reasons, but Lean Startup techniques are largely QUALITATIVE
What I’m talking about today is decidedly QUANTITATIVE
Larry Maccherone
Software Development Performance Index (SDPI) metrics
1. Productivity - Throughput
2. Predictability - Variation in Throughput
3. Responsiveness - Time in Process (TiP)
4. Quality - Defect density/aging
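A minimal sketch of how the four metrics could be computed from work-item records. The record shape `(start_date, finish_date, is_defect)`, the weekly bucketing, the coefficient of variation, and the defect share are all assumptions made for illustration. They are not the published SDPI definitions.

```python
from collections import Counter
from datetime import date
from statistics import mean, median, pstdev


def sdpi_sketch(items):
    """items: list of (start_date, finish_date, is_defect) tuples (hypothetical shape)."""
    # Productivity: throughput, counted as items finished per ISO (year, week).
    # Weeks in which nothing finished are not represented in this simple count.
    weekly = Counter(tuple(finish.isocalendar())[:2] for _, finish, _ in items)
    counts = list(weekly.values())
    avg_throughput = mean(counts)

    # Predictability: variation in throughput, here the coefficient of variation.
    throughput_cv = pstdev(counts) / avg_throughput

    # Responsiveness: Time in Process (TiP) in days, summarized by the median.
    median_tip_days = median((finish - start).days for start, finish, _ in items)

    # Quality: share of finished items that were defects (a stand-in for defect density).
    defect_share = sum(1 for _, _, is_defect in items if is_defect) / len(items)

    return {
        "avg_throughput": avg_throughput,
        "throughput_cv": throughput_cv,
        "median_tip_days": median_tip_days,
        "defect_share": defect_share,
    }


# Made-up work items for one team over three weeks.
example_items = [
    (date(2024, 1, 1), date(2024, 1, 3), False),
    (date(2024, 1, 2), date(2024, 1, 5), False),
    (date(2024, 1, 8), date(2024, 1, 12), True),
    (date(2024, 1, 9), date(2024, 1, 10), False),
    (date(2024, 1, 8), date(2024, 1, 11), False),
    (date(2024, 1, 15), date(2024, 1, 19), False),
    (date(2024, 1, 15), date(2024, 1, 17), False),
]
example_metrics = sdpi_sketch(example_items)
```

A lower `throughput_cv` means steadier throughput, which is the sense of "Predictability" used here.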
SDPI Research Findings
Teams with low WiP have up to:
● 4x better Quality
● 2x faster Time to market
● But potentially 34% worse Productivity
Stable teams result in up to:
● 60% better Productivity
● 40% better Predictability
Dedicated teams: Teams made up of people who only work on that one team have double the Productivity
Team size:
● Smaller teams have better Productivity
● Larger teams have better Quality
Iteration length:
● Teams using two-week iterations have the best balanced performance
● Longer iterations correlate with higher Quality
● Shorter iterations correlate with higher Productivity and Responsiveness
Ratio of testers to developers:
● More testers lead to better Quality
● More testers lead to worse Productivity and Responsiveness
● Teams with no testers have:
  ● The best Productivity
  ● Almost as good Quality
  ● But much wider variation in Quality
And more on Kanban vs Scrum, co-location, Big bang agile, Gender diversity, etc.
Software Development Performance Index (SDPI) metrics
1. Productivity - Throughput
2. Predictability - Variation in Throughput
3. Responsiveness - Time in Process (TiP)
4. Quality - Defect density/aging
Never got a chance to incorporate at Rally...
5. Customer/stakeholder satisfaction – NPS (Pendo)
6. Employee happiness/engagement (AgilityHealth)
7. Build-the-right-thing - ??? (Pendo)
8. Code quality - Static analysis (Comcast)
Approach to SDPI research
Types of signals
Outcome signal
• Evidence of an outcome you are trying to effect
• SDPI Examples: Throughput, Defect density, Time in Process (TiP)
• Plotted on the y-axis
Lever signal
• Evidence of some decision you have made or influenced
• SDPI Examples: Simultaneous WiP, Ratio of testers to developers
• Plotted on the x-axis
Correlation of lever signals to outcome signals
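A minimal sketch of one way to quantify how a lever signal (x-axis) relates to an outcome signal (y-axis). It uses the Pearson correlation coefficient, which is swapped in here for illustration; the slides do not say which measure the research used. The WiP and TiP numbers are made up.

```python
from math import sqrt


def pearson(lever_values, outcome_values):
    """Pearson correlation between a lever signal and an outcome signal."""
    n = len(lever_values)
    if n != len(outcome_values) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(lever_values) / n
    mean_y = sum(outcome_values) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(lever_values, outcome_values))
    var_x = sum((x - mean_x) ** 2 for x in lever_values)
    var_y = sum((y - mean_y) ** 2 for y in outcome_values)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical teams: simultaneous WiP (lever) vs. median TiP in days (outcome).
wip_per_team = [2, 3, 5, 8, 9]
tip_per_team = [4, 5, 9, 14, 15]
r = pearson(wip_per_team, tip_per_team)
```

A value of `r` near +1 or -1 suggests a strong linear relationship; it says nothing about cause, which is why controlling for other possible causes matters.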
Nuance - Controlling for other possible causes
Nuance - Overlap in spread (among other nuances)
Approach to “build-the-right-thing” research
Three capabilities:
1. Analytics on visitor data
If you are a SaaS software vendor, you would buy Pendo to gain insight into how visitors use your product
2. Guides
In-app education for your visitors on how to use your app. Think dialog bubbles
3. Polls
Survey your visitors: Net Promoter Score (NPS)? Did you like this feature? What’s your role? Target a particular segment (visitor or account cohort)
A little more context
1. Captures all browser events with no explicit coding
Enables analysis going back in time as soon as events are “tagged”
Enables research on even untagged features
2. Powerful segmentation
3. Three critical product manager capabilities for the cost of only one product and only one browser agent integration (cost, performance, resource consumption, etc.)
4. Combines the use of analytics with guides
Example: As a product manager, I want to interview visitors that have used feature X at least 5 times to get their feedback. A guide can use that metric and prompt only those visitors to sign up for an interview slot.
Data model (assumes SaaS B2B application)
• Account – Another business paying you (or in trial) for use of your SaaS product
Example metric: % of last 30 days that account used your product
• Visitor – A user at an account
Example metric: Weekly/monthly visitor churn
• Page – A page in your app
Example metric: % of users that utilize this page
• Feature – A feature in your app
Example metric: % of users that utilize this feature
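A minimal sketch of the data model and two of the example metrics. The `Event` record, its field names, and both function names are hypothetical and chosen for illustration; they are not Pendo's schema or API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Event:
    account_id: str   # the paying (or trial) business
    visitor_id: str   # a user at that account
    target_id: str    # a tagged page or feature in the app
    day: date         # day on which the event happened


def pct_days_active(events, account_id, as_of, window_days=30):
    """% of the last `window_days` days on which the account used the product."""
    window_start = as_of - timedelta(days=window_days - 1)
    active_days = {
        e.day for e in events
        if e.account_id == account_id and window_start <= e.day <= as_of
    }
    return 100.0 * len(active_days) / window_days


def pct_visitors_using(events, target_id):
    """% of all observed visitors that used the given page or feature."""
    all_visitors = {e.visitor_id for e in events}
    if not all_visitors:
        return 0.0
    users = {e.visitor_id for e in events if e.target_id == target_id}
    return 100.0 * len(users) / len(all_visitors)


# Made-up events for one account with two visitors.
sample_events = [
    Event("a1", "v1", "guides", date(2024, 3, 1)),
    Event("a1", "v2", "pages", date(2024, 3, 1)),
    Event("a1", "v1", "pages", date(2024, 3, 10)),
    Event("a1", "v1", "pages", date(2024, 3, 30)),
]
account_activity = pct_days_active(sample_events, "a1", date(2024, 3, 30))
guides_usage = pct_visitors_using(sample_events, "guides")
```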
Confusing Meta Meta
• Pendo uses Pendo to measure its own accounts/visitors
• My research started with this data and that drives most of the examples in this talk
This can be very confusing because the same word will show up twice. Example: The “feature” in Pendo that allows our customers to look at data related to a “feature” in their app. So it is the “feature feature” in Pendo’s use of Pendo.
• However, the primary goal is to offer these capabilities to Pendo customers
Research data
• Terabytes of data
• “Tagged” to page, feature, visitor, and account
Not yet considered but have potential access to:
• Personas of visitor or account
• Net Promoter Score (NPS) survey results
“Build-the-right-thing” signals
Outcome signals
• Churn vs Renew (requires a full renewal cycle)
• Win vs Loss (can gather much more rapidly)
• (later) Net promoter score (NPS)
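For reference, a short sketch of the standard NPS calculation on 0 to 10 survey answers: the percentage of promoters (9 or 10) minus the percentage of detractors (0 through 6). The function name and sample scores are illustrative only.

```python
def net_promoter_score(scores):
    """Standard NPS: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("at least one survey response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


# Made-up responses from one account's visitors.
account_nps = net_promoter_score([10, 10, 9, 3])
```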
Win/Loss and Renew/Churn outcome signals
Criteria for good outcome signals
• Can be extracted without human input using simple heuristics:
  • 15 consecutive months* (configurable) of usage => Renewed, 9 months* of use followed by no use => Churn
  • First 4 months* of use => Win, 4 months* since first use and most recent month no use => Loss
• May get better with human input
* Month = 30 days
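The heuristics above can be sketched as follows. The input is assumed to be a list of booleans, one per 30-day month since first use, where `True` means the account showed any usage that month. The function names and the exact reading of "followed by no use" (taken here as: the most recent month has no usage) are assumptions, not the actual implementation.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def renewal_outcome(monthly_usage, renew_months=15, churn_months=9):
    """Return "Renewed", "Churn", or None when there is not enough evidence yet."""
    if longest_run(monthly_usage) >= renew_months:
        return "Renewed"
    if sum(monthly_usage) >= churn_months and not monthly_usage[-1]:
        return "Churn"
    return None


def sale_outcome(monthly_usage, trial_months=4):
    """Return "Win", "Loss", or None when fewer than `trial_months` have passed."""
    if len(monthly_usage) < trial_months:
        return None
    if all(monthly_usage[:trial_months]):
        return "Win"
    if not monthly_usage[-1]:
        return "Loss"
    return None
```

For example, `sale_outcome([True, True, False, False])` labels the account a Loss, while an account still in its second month gets no label yet.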
“Lever” (input) Signals
Lots of alternatives, but which are strong signals?
Lever signal strength
[Figure panels: a weak/moderate signal; a strong signal]
Signal strength analysis
Results for Pendo (Win/Loss)
Strong levers for Win/Loss
1. Weekly new visitors
2. Use of the “feature” feature
3. Use of the “page” feature
4. Days active
Weak levers for Win/Loss
1. All features except “pages” and “features”
Could be much stronger if I can control for visitor persona and/or account type. Suspect there are a few key features for each persona/account type that would be highly predictive
2. Use of the “guide” feature
Matches Pendo sales experience
3. Visitor change (churn)
Results for Pendo (Renew/Churn)
Strong levers for Renew/Churn
1. Weekly new visitors
#1 strongest for Win/Loss
2. Event rate
#5 strongest for Win/Loss
3. Time on site
#6 strongest for Win/Loss
Weak levers for Renew/Churn
1. Use of “page” feature
#5 weakest for Win/Loss
2. Use of “feature” feature
#6 weakest for Win/Loss
3. Visitor change
#3 weakest for Win/Loss
Combining signals with Bayesian regression
Bayesian regression
• Think spam filters
• Builds conclusions from lots of little bits of evidence
• Deals well with missing evidence
• Intrinsically provides a confidence factor
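A minimal sketch in the spirit of the spam-filter analogy: a categorical naive Bayes classifier, which is a stand-in technique and not necessarily the model used in this research. Each lever signal is assumed to be pre-bucketed into labels such as "low" and "high"; `None` marks missing evidence and is simply skipped. The class name, feature names, and data are hypothetical.

```python
from collections import defaultdict
from math import exp, log


class NaiveBayesSketch:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength
        self.class_counts = defaultdict(int)
        # value_counts[feature][label][value] -> number of training rows
        self.value_counts = defaultdict(
            lambda: defaultdict(lambda: defaultdict(int)))
        self.values_seen = defaultdict(set)

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for feature, value in row.items():
                if value is None:  # missing evidence contributes nothing
                    continue
                self.value_counts[feature][label][value] += 1
                self.values_seen[feature].add(value)
        return self

    def predict_proba(self, row):
        """Posterior probability per label; doubles as the confidence factor."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n_label in self.class_counts.items():
            score = log(n_label / total)  # prior
            for feature, value in row.items():
                if value is None or feature not in self.values_seen:
                    continue  # skip missing or never-seen evidence
                counts = self.value_counts[feature][label]
                n_observed = sum(counts.values())
                n_values = len(self.values_seen[feature])
                score += log((counts.get(value, 0) + self.alpha)
                             / (n_observed + self.alpha * n_values))
            log_scores[label] = score
        top = max(log_scores.values())
        weights = {label: exp(s - top) for label, s in log_scores.items()}
        norm = sum(weights.values())
        return {label: w / norm for label, w in weights.items()}


# Made-up training accounts with bucketed lever signals.
train_rows = [
    {"new_visitors": "high", "days_active": "high"},
    {"new_visitors": "high", "days_active": "low"},
    {"new_visitors": "low", "days_active": "low"},
    {"new_visitors": "low", "days_active": None},
]
train_labels = ["Win", "Win", "Loss", "Loss"]
model = NaiveBayesSketch().fit(train_rows, train_labels)
posterior = model.predict_proba({"new_visitors": "high", "days_active": None})
```

Accounts whose posterior sits near 0.5 would be the "on-the-bubble" ones in this framing.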
Bayesian regression
Bayesian regression results for Pendo
• Can predict Win/Loss 90%+
• Can predict Renew/Churn 93%+
• Accurately targets “on-the-bubble” accounts to nudge them one way
Takeaways
Takeaways for Pendo
1. Growing users in an account is the best predictor for both Win/Loss AND Renew/Churn
2. Visitor change (churn) is generally a bad predictor
May be a Pendo-only phenomenon because Pendo does not charge by visitor
3. Confirms qualitative suspicion that the “guides” feature is not important for the sale, but is important for the renewal
4. No “key” low-level features found
Should change with visitor/account persona cohort analysis
5. Event rate might be a proxy for account size and/or financial health
Suspect trend would be valuable here but haven’t tried yet
Up next to address
• How much of this holds true when we look at Pendo’s customers’ customers?
• Some personas (visitor or account) only use some features
Will try clustering of these entities
• Some features are only meant to be used once a month
Will seek to add this feature meta-data to the analysis
• Trends often matter more than level. Sometimes it’s level AND trend
Will try to add trend analysis
Next two sections are “sidebars”
Skip if there is no time
Non-parametric modeling
Is either of these normally distributed? Weibull?
Non-parametric modeling - Pros
• Deals elegantly with non-normal, non-Weibull, etc. distributions
Handles fat tails, bi-modal, tri-modal, etc.
• No need for a human to identify the distribution
Humans get it wrong or often just default to something (usually Normal) rather than do the analysis to pick the right one
• No need to pick histogram bucketing approach
Constant width (percentiles) works great almost all of the time
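A minimal sketch of percentile bucketing, assuming "constant width (percentiles)" means bucket edges placed at equally spaced percentiles so that each bucket covers the same share of the data. It uses `statistics.quantiles` from the Python standard library (3.8+); the helper names and sample values are illustrative.

```python
from bisect import bisect_right
from statistics import quantiles


def percentile_edges(values, n_buckets=4):
    """Cut points that split the data into n_buckets equal-percentile buckets."""
    return quantiles(values, n=n_buckets, method="inclusive")


def bucket_of(value, edges):
    """Index of the bucket (0 .. len(edges)) that a value falls into."""
    return bisect_right(edges, value)


# A fat-tailed sample: one extreme value does not distort the bucket edges.
event_rates = [1, 2, 3, 4, 5, 6, 7, 8, 100]
edges = percentile_edges(event_rates, n_buckets=4)
outlier_bucket = bucket_of(100, edges)
```

No distribution has to be named or fitted; the outlier simply lands in the top bucket.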
Non-parametric modeling - Cons
• It’s not sophisticated
Just about every data science competition is won by a “simple” Bayesian network algorithm
• It requires a lot of computing power
So what? We have it
• It’s not what all the stats and data science teachers teach
Inertia of status quo is hard to overcome
• Requires a bit more data
SDPI research extensions with AgilityHealth
AgilityHealth Radar
• 75 Question Survey
• Taken in a coaching-oriented facilitation way. Each section is explained before the section’s questions are answered.
• Facilitated by highly-trained agile experts
• Targeted at self-assessment and improvement, not comparative benchmarking
Analyzing AgilityHealth survey data
• Even more closely related to my SDPI work
• Should be even more interesting if/when we can correlate with SDPI data
• Promising research but with its own set of issues and confounds
• Those surveyed often give middle-of-the-road responses (Caspar Milquetoast effect)
Even worse when survey results for all team members are averaged
• How to sense if there is survey fatigue and account for it? (75 questions)
May be partially or wholly mitigated by coaching/facilitation during the survey
• Self-assessment confounds? Do late novices rate themselves higher than late intermediates because they don’t yet know what they don’t know?
• Hard to tease outcome signals from lever signals
Preliminary results
Best levers for team performance
1. Team confidence and skills
2. Short-term plan
3. Backlog management
4. Clarity of roles
5. Leadership (summarized)
(actually 6 tied for 5th including Tech leadership, PO leadership, PO engagement, Servant leadership, Self organization, and Vision & purpose)
Worst levers for team performance
1. Sustainable pace
Do “harder working” teams perform better?
2. Physical environment and “tools”
Agile manifesto correct?
3. Team allocation and stability
Why does this contradict SDPI research? It also contradicts AgilityHealth experience
4. Roadmap
More important for portfolio? Not team
5. Manager role in process improvement
What about process improvement in general?
People will come
‘People will come for reasons they can't even fathom. They'll arrive at your door and you’ll say "Of course, we won't mind if you look around. It's only $20 per person". They'll pass over the money without even thinking about it. And they'll watch the game and it'll be as if they dipped themselves in magic waters.’
~ Field of Dreams
They may come but…
‘News Corp tried to guide MySpace, to add planning, and to use “professional management” to determine the business’s future. That was fatally flawed when competing with Facebook which was managed letting the marketplace decide where the business should go.’
~ Forbes
Thanks and questions