Copyright ©2012 The Nielsen Company. Confidential and proprietary.
CONTENTS
• Big Data in the 1930’s and why that matters now
• TV measurement and Return Path Data (STB)
• Interesting questions for understanding error
BIG DATA 1930’S STYLE
PROBABILITY SAMPLING 1930’S STYLE
EVOLUTION OF STATISTICAL CONCEPTS IN RESEARCH
Early days: Novel, non-scientific
1930’s: Scientific sampling
Since the 1950’s: weighting, probability models, imputation techniques, data fusion, time series analyses, hybrid (Big Data/sample integration)
NIELSEN AND AUDIENCE MEASUREMENT
1923: Nielsen Founded
1950: Introduces TV Audience Measurement
Current technology: People Meter
• Electronic measurement
• Probability samples
• All people and sets in home measured
Nielsen Ratings are the currency for US TV advertising
THE CHANGING TV ENVIRONMENT
• Fragmentation of Viewing Choices
• Proliferation of Devices
• Increasing Population Diversity
RESEARCH DATA - STATISTICAL TOOLS
From: Sample/Measure/Project (Panel Data)
To: Sample/Measure/Project + Integrate
- Data Fusion
- Probability Modeling
- Calibration
- Predictive Modeling
Using Multiple Panels, Census Data, Surveys
WHAT STB AND PANELS CAN GIVE US
STB (data): Large convenience samples, stable results
+
Panels (research products): Completeness of Audience Measurement
=
In combination, STB + Panels offer the possibility of stable, UNBIASED RESEARCH
STB GAPS AND BIAS
1. Data Quality/coverage/timeliness/representativeness
2. Set Activity (On/Off/Other Source)
3. Household Characteristics
4. Persons viewing (including visitors in the home)
5. Other Viewing Activity
[Diagram: Total Survey Error split into Bias and Standard Error, shown for STB, for People Meter, and for STB + People Meter?]
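One common way to put bias and standard error on a single scale is root mean squared error, in which squared bias and squared standard error add. The sketch below assumes that convention for total survey error; the function name and all figures are hypothetical and are not Nielsen results.

```python
import math

def total_survey_error(bias, standard_error):
    """Root mean squared error: sqrt(bias^2 + standard_error^2)."""
    return math.sqrt(bias ** 2 + standard_error ** 2)

# Hypothetical rating-point errors, chosen only to illustrate the trade-off.
sources = {
    "STB": {"bias": 0.40, "standard_error": 0.05},           # large sample, biased
    "People Meter": {"bias": 0.05, "standard_error": 0.30},  # small sample, nearly unbiased
    "STB + People Meter": {"bias": 0.10, "standard_error": 0.10},
}

for name, err in sources.items():
    tse = total_survey_error(err["bias"], err["standard_error"])
    print(f"{name}: total survey error = {tse:.3f}")
```

Under these made-up inputs the combined source has the smallest total error, which is the possibility the question mark on the slide is asking about.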
STB DATA QUALITY – EXAMPLE ANALYSES
• Good…
[Chart: STB Tuning Activity, AA % by hour of day, 12:00 AM to 11:00 PM]
• Not so good…
[Chart: Adjacent Tuning Sessions - April 22nd 2011, % by half hour, 5:00 PM to 1:00 AM; series: Same Channel, Different Channel; annotations: Machine Reboot Activity, Program junction spikes]
ARE WE IMPROVING THE MEASUREMENT?
1. Transparency and validation at each step and overall
2. Total Survey Error
[Chart: Local Station Ratings M-F Nov 2010 - Women 18+, by daypart from 5 AM to 1 AM; series: People Meter, Hybrid]
Total Survey Error % Reduction
                  Broadcast   Cable   Total
Females 18+           19         5       7
Females 18 - 34       38        41      40
[Diagram labels: Total Survey Error, Bias, Standard Error]
ASSESSING INTEGRATION ERROR
• Input Error (GIGO)
• Matching Error
• Statistical Error
• Validity Levels
• Multiple Database error compounding
ASSESSING INTEGRATION ERRORS
• Input Error (GIGO)
- Coverage Gaps, Definitional problems, Input Errors etc
- But possible improvement through integration weighting effects
Most problems remain but some can be mitigated through integration
ASSESSING INTEGRATION ERRORS
• Matching Error (eg address matching)
- Good – correct match, Bad – no match, Ugly – incorrect match
- Trade-off between match rates and error rates
Multiple databases may have correlated errors – that may be preferable to random errors since overall effect is restricted to a smaller group (eg new householders in some address lists)
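The trade-off between match rates and error rates can be sketched by sweeping an acceptance threshold over scored candidate matches. The scores, truth labels and function name below are hypothetical; in practice the true status of a match is only known for a validation subset.

```python
# Hypothetical candidate matches: (similarity score, whether the match is truly correct).
candidates = [
    (0.99, True), (0.97, True), (0.95, True), (0.92, True), (0.90, False),
    (0.88, True), (0.85, False), (0.80, True), (0.75, False), (0.70, False),
]

def match_and_error_rates(candidates, threshold):
    """Return (match rate, error rate among accepted matches) at a score threshold."""
    accepted = [is_correct for score, is_correct in candidates if score >= threshold]
    match_rate = len(accepted) / len(candidates)
    # "Ugly" matches are accepted pairs that are actually incorrect.
    error_rate = accepted.count(False) / len(accepted) if accepted else 0.0
    return match_rate, error_rate

for threshold in (0.95, 0.85, 0.70):
    match_rate, error_rate = match_and_error_rates(candidates, threshold)
    print(f"threshold {threshold:.2f}: match rate {match_rate:.0%}, error rate {error_rate:.0%}")
```

Lowering the threshold turns "bad" non-matches into matches, but a growing share of the added matches are "ugly" ones.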
STATISTICAL ERROR (SAMPLE-BASED IMPUTATION)
• Model bias leads to attenuation (regression to mean)
• Individual data point bias can be undetectable due to sampling error
[Chart: Persons 2+ Total Viewing Weekly Average Hours across 1000 Product Categories; series: Fused, Actual, Fused Best Fit]
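Attenuation shows up as a best-fit slope below 1 when fused estimates are regressed on actual values. The sketch below uses made-up viewing hours and assumes, purely for illustration, that fusion pulls every estimate 10% of the way toward the overall mean.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

# Hypothetical weekly viewing hours for five product categories.
actual = [10.0, 20.0, 25.0, 30.0, 40.0]
overall_mean = sum(actual) / len(actual)

# Assumed fusion behaviour: each estimate is pulled 10% of the way to the mean.
fused = [overall_mean + 0.9 * (a - overall_mean) for a in actual]

slope = least_squares_slope(actual, fused)
print(f"best-fit slope of fused on actual: {slope:.2f}")  # below 1 means attenuation
```

With real data the fused points also scatter around the line because of sampling error, which is why bias in any single category is hard to detect.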
SEPARATING MODEL BIAS AND SAMPLING ERROR
[Chart: Actual vs Expected Distribution of Differences between Real and Fused Results; series: M-Sun 1am-6am, Expected]
Z-tests on each comparison and evaluation of Z-score distributions
Deviation from expected distribution gives bias estimate
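A minimal sketch of the Z-test idea, assuming each comparison is a difference between two independent proportions with known sample sizes. The ratings, the sample size of 2,000 and the function names are hypothetical. If only sampling error were present, about 5% of comparisons would fall beyond |z| = 1.96; an excess points to model bias.

```python
import math

def z_score(p_real, n_real, p_fused, n_fused):
    """Unpooled two-sample z statistic for a difference in proportions."""
    se = math.sqrt(p_real * (1 - p_real) / n_real + p_fused * (1 - p_fused) / n_fused)
    return (p_real - p_fused) / se

def share_significant(z_scores, critical=1.96):
    """Share of comparisons with |z| above the two-sided 5% critical value."""
    return sum(abs(z) > critical for z in z_scores) / len(z_scores)

# Hypothetical (real, fused) ratings as proportions, each from 2,000 respondents.
comparisons = [(0.10, 0.09), (0.20, 0.16), (0.05, 0.05), (0.30, 0.25), (0.15, 0.14)]
z_scores = [z_score(r, 2000, f, 2000) for r, f in comparisons]

observed = share_significant(z_scores)
expected = 0.05  # share expected from sampling error alone
print(f"observed {observed:.0%} vs expected {expected:.0%} beyond |z| = 1.96")
```

A real analysis would use many more comparisons and compare the whole Z-score distribution with the standard normal, not just its tails.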
STATISTICAL ERROR - MULTIPLE DATA SETS
[Diagram: TV, Web and Buy data sets fused two ways, Hub and Spoke and Sequential, with numbered fusion steps]
Comparison with Single Source Data: Nielsen National People Meter TV and Internet matched with Credit Card Purchase Data
ACCURACY TEST
[Diagram: Hub and Spoke vs Sequential fusion of TV, Web and Buy; correlations shown on the diagram: R = 0.4, R = 0.5, R = 0.67, R = 0.44]
Correlation of 8 product categories with 14 TV Networks and 60 Websites
SEQUENTIAL VS HUB AND SPOKE
• Unless the Hub has all the relevant linking information, a sequential approach gives better results
• In our example, we captured interactions between web and purchase behavior through the sequential fusion
• However, sequential fusions can fall down with too many data sets, as error compounds.
VALIDITY LEVELS – INDIVIDUAL VS AGGREGATED
Individual Prediction
• IDEAL SCENARIO: You can predict every individual’s behavior
• REALITY: With most imputation methods we can do better than random, but rarely can we get close to 100% accuracy.
• Eg ~40% improvement on random when predicting product users based on cookies, ie 14% of online ad impressions delivered to product users rather than 10%
Aggregate Prediction
• Imputation methods can reliably predict aggregate level behavior given good predictive variables
• Eg 90% Accuracy (10% regression to mean) for TV audience estimates by product users
• Errors compound with multiple sources but extent varies by case
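The individual-level example is simple arithmetic: 14% against a 10% random baseline is a 40% relative improvement, yet most impressions still miss. A short sketch using the two percentages from the slide (the function name is illustrative):

```python
def lift_over_random(targeted_rate, random_rate):
    """Relative improvement of targeted delivery over random delivery."""
    return targeted_rate / random_rate - 1

# Figures from the slide: 14% of impressions reach product users versus 10% at random.
lift = lift_over_random(0.14, 0.10)
print(f"improvement on random: {lift:.0%}")

# Even so, most targeted impressions still miss product users.
missed = 1 - 0.14
print(f"impressions not reaching product users: {missed:.0%}")
```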
CONCLUSION
• Data Everywhere!
• Data quality and relevance are essential
• Integration brings insights and error
• Statistical Integrity is as important now as it was in the 1930’s
AD EFFECTIVENESS - MORE COMPLICATED
• Imagine a data set of 10,000 people for whom you have tracked exposure to a brand’s website and subsequent purchase of that brand.
• In our initial thought experiment, 76% converted.
[Diagram: HUB (matching info) with spokes PURCHASE and Website visit; remaining spokes marked TBD]
A BASIC EXPERIMENT
• Now imagine that you have measurement error in 10% of your cases. We ran a simulation of 1000 datasets which had incorrect data on site visits in 10% of cases.
• The difference between the original conversion rate and that in the 1000 error-ridden test cases is about 8.5%. SD is xx.
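A minimal sketch of this kind of simulation follows. It is not the presenters' code and will not reproduce the 8.5% figure: the visit rate, the purchase rates, the seed and the function names are all assumptions, and the example runs fewer and smaller data sets than the slide describes to keep it fast.

```python
import random
import statistics

def conversion_rate(visits, purchases):
    """Share of recorded site visitors who purchased."""
    buyers = [p for v, p in zip(visits, purchases) if v]
    return sum(buyers) / len(buyers)

def simulate(n_people=2000, n_datasets=200, error_rate=0.10, seed=1):
    rng = random.Random(seed)
    # Assumed ground truth: half the people visit; visitors buy 76% of the
    # time, non-visitors 30% of the time (both rates are illustrative).
    visits = [rng.random() < 0.5 for _ in range(n_people)]
    purchases = [rng.random() < (0.76 if v else 0.30) for v in visits]
    true_rate = conversion_rate(visits, purchases)

    differences = []
    for _ in range(n_datasets):
        # Flip the recorded visit flag in roughly 10% of cases.
        noisy = [(not v) if rng.random() < error_rate else v for v in visits]
        differences.append(true_rate - conversion_rate(noisy, purchases))
    return true_rate, statistics.mean(differences), statistics.stdev(differences)

true_rate, mean_diff, sd_diff = simulate()
print(f"true conversion {true_rate:.3f}, mean shortfall {mean_diff:.3f}, SD {sd_diff:.3f}")
```

Because mislabelled non-visitors buy less often, the recorded conversion rate is diluted toward the non-visitor rate. The size of the shortfall depends on the gap between the two assumed purchase rates.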
A BASIC EXPERIMENT
• What happens when we add another data set?
[Diagram: HUB (matching info) with spokes PURCHASE, Website visit and Saw TV ad; remaining spokes marked TBD]
MORE DATA – SAME ERROR
• Given two types of ad exposure data to measure, the impact of error in a single data source should be less...
• Imagine that you have measurement error in 10% of your cases for one data source – the same error as in the previous experiment.
• As expected, conversion values are closer to our error-free data set. SD =
MORE DATA – MORE ERROR
• Next, we introduced error into the TV data set as well.
• Performance worsens. SD is xx.
• But it looks more additive than exponential.
MORE DATA – EVEN MORE ERROR
• Next, we imagined combining 6 data sets, each with 10% error.
• WHAT DO WE SEE?
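The slide leaves the result open. As a rough reference point only, and not the simulation result, the sketch below assumes errors strike each data set independently and computes the share of combined records affected by at least one error, next to the simple sum of the error rates. The function name is illustrative.

```python
def share_with_any_error(n_datasets, error_rate=0.10):
    """Share of combined records with an error in at least one data set,
    assuming errors strike each data set independently."""
    return 1 - (1 - error_rate) ** n_datasets

for k in (1, 2, 6):
    compounded = share_with_any_error(k)
    additive = k * 0.10  # simple sum of the per-data-set error rates
    print(f"{k} data sets: compounded {compounded:.1%}, additive bound {additive:.0%}")
```

Under this assumption the affected share grows no faster than the additive sum. How much those errors move a conversion estimate is a separate question that only the simulation can answer.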
MATCHING ERROR
• In any data combination, there is an additional source of error: mismatches to the HUB or identity variable.
• Misspelled names can lead to false negatives. Non-deterministic matching can lead to false positives.
• Introducing 10% matching error (to the first data set only, to both, and to the second only) suggests that the impact on conversion is negligible relative to the error-free data.
• This suggests that the quality of the data is more important than the matching quality.
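Both kinds of mismatch can be illustrated with Python's standard `difflib.SequenceMatcher`. The names, thresholds and the `best_match` helper below are hypothetical; this is a sketch of the idea, not the matching method used in the experiment.

```python
from difflib import SequenceMatcher

def best_match(name, hub_names, threshold):
    """Return the most similar hub name, or None if none reaches the threshold."""
    scored = [(SequenceMatcher(None, name, hub).ratio(), hub) for hub in hub_names]
    score, hub = max(scored)
    return hub if score >= threshold else None

# Hypothetical hub of known identities.
hub_names = ["maria lopez", "mario lopez"]

# "maria lopes" is a misspelling of a hub member; "marco lopez" is not in the hub.
for name in ("maria lopes", "marco lopez"):
    exact = name if name in hub_names else None
    strict = best_match(name, hub_names, threshold=0.95)
    loose = best_match(name, hub_names, threshold=0.90)
    print(f"{name}: exact={exact}, strict={strict}, loose={loose}")
```

Exact and strict matching miss the misspelled record (a false negative), while the looser threshold recovers it but also links the unrelated "marco lopez" to "mario lopez" (a false positive).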
ASIDE: THE IMPORTANCE OF WEIGHT
• Here, TV data was heavily weighted toward exposure.
• That overwhelmed any error from website visit data. Indeed, it appeared to counterbalance it.
ASIDE: THE IMPORTANCE OF CORRELATION
• The greater the correlation between the dependent and independent variable, the greater the impact of error.
[Charts: Weaker correlation between webvisit and purchase (xx); Strong correlation between webvisit and purchase (xx)]
WHAT DO WE KNOW THUS FAR?
• Still more work to do certainly. But we have formed certain hypotheses:
• When combining multiple data sets, the error appears additive.
• Error rates being equal, the underlying aspects of the data are more likely to impact the outcome than the combination.
• It is important, however, to qualify basic relatedness between each independent variable and the dependent outcome. This argues for a hub and spoke approach to data combination.
• SO how did these hypotheses fare in a quick test using real world data? (next slide on your recent error work)
COMBINING DATA SETS
There are two basic paths to integrating data
A serial integration: (A+B)+C
Each data set resulting from an integration is smaller than either original source due to non-matches.
[Diagram: Data Source A + Data Source B = Data Source A+B; Data Source A+B + Data Source C = Data Source A+B+C]
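Serial integration can be sketched as repeated inner joins on a shared key. The person IDs, field names and `integrate` helper below are hypothetical; the point is only that each step keeps matched records, so the result shrinks.

```python
def integrate(left, right):
    """Inner join of two data sets keyed by person ID; non-matches are dropped."""
    return {key: {**left[key], **right[key]} for key in left.keys() & right.keys()}

# Hypothetical data sources keyed by person ID.
source_a = {1: {"tv": True}, 2: {"tv": False}, 3: {"tv": True}, 4: {"tv": True}}
source_b = {2: {"web": True}, 3: {"web": False}, 4: {"web": True}, 5: {"web": True}}
source_c = {3: {"buy": True}, 4: {"buy": False}, 6: {"buy": True}}

a_b = integrate(source_a, source_b)    # serial step 1: A+B
a_b_c = integrate(a_b, source_c)       # serial step 2: (A+B)+C

print(len(source_a), len(a_b), len(a_b_c))
```

Any error introduced in the first join is carried into the second, which is one reason long serial chains are risky.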
COMBINING DATA SETS
Another approach is a hub and spoke model: (A+B)+(A+C)...etc.
While the final integrated set is still reduced due to non-matches, the error from each match to the HUB is known.
[Diagram: HUB (matching info) with spokes marked TBD]
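A hub and spoke integration can be sketched by matching every spoke to the same set of hub IDs and recording a match rate per spoke. The IDs, spoke names and `match_to_hub` helper below are hypothetical, and the match rate stands in for the per-spoke error, which would need validation data to measure properly.

```python
def match_to_hub(hub_ids, spoke):
    """Keep spoke records whose ID is in the hub; report the spoke's match rate."""
    matched = {key: value for key, value in spoke.items() if key in hub_ids}
    return matched, len(matched) / len(spoke)

# Hypothetical hub of matching IDs and three spoke data sets.
hub_ids = {1, 2, 3, 4, 5}
spokes = {
    "tv": {1: True, 2: False, 3: True, 9: True},
    "web": {2: True, 3: False, 4: True, 5: True},
    "buy": {3: True, 4: False, 8: True, 9: False},
}

matched, match_rates = {}, {}
for name, spoke in spokes.items():
    matched[name], match_rates[name] = match_to_hub(hub_ids, spoke)

# The integrated set holds only IDs matched in every spoke.
complete_ids = set.intersection(*(set(m) for m in matched.values()))
print(match_rates, sorted(complete_ids))
```

Each spoke is matched once, directly to the hub, so its match quality can be assessed on its own without passing through earlier joins.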
AD EFFECTIVENESS - MORE COMPLICATED
Ad effectiveness captures the correlation between exposure to advertising and subsequent purchase of a product.
When someone who sees an ad buys a product, we say they have CONVERTED.
[Diagram: HUB (matching info) with spoke PURCHASE; remaining spokes marked TBD]