Prediction of social media campaign outcomes: study using Support Vector Machine PropheZy for prediction of multivariate datasets (Z/Yen Group Limited with Outside Line)

Prediction of social media campaign outcomes zyen may 2011

DESCRIPTION

I am an associate with Z/Yen Group, a risk and forecasting consultancy in the City. We have built specialist analytical tools for crunching very large unstructured business datasets, to help us find trends, anomalies and patterns in data. We wanted to see whether social media metrics are highly patterned or not. So after a chat with the IAB Social Media Council, we paired up with Ronnie Brown at agency Outside Line to analyse some campaign data using our prediction software tool, PropheZy. Its usual applications are in trading and compliance, where it detects the best indicators of success and predicts the outcomes of transactions. We wondered: if our financial clients can know in advance what’s going to be troublesome and what’s most likely to succeed, could a social media campaign do the same?

We looked at a year-long, 2-million-click blogger outreach campaign run by Outside Line on behalf of its former client LG, the consumer electronics manufacturer. In the campaign, Outside Line had engaged with 70 bloggers, aiming for positive coverage of gadgets. Blogger outreach is generally thought of as a hit-and-miss platform, with a high reliance on touchy-feely factors and virality. When Outside Line was running the campaign, they blogged that the keys to success were tone and approach.

We all got a shock when the data crunching was done. Our software engine worked only with the first month’s metrics across ten data fields, and built a predictive model of the rest of the campaign. It then used the model to forecast the readership of each blogger for the rest of the year. Its predictions, when compared to what actually happened, turned out five times better than random across all of the roughly 1,300 blog posts in the campaign. It was especially precise in predicting the big-hitting blogs. When PropheZy was asked to identify which bloggers would go on to take the campaign to 70% of its KPI of hits, its list of the 14 hottest bloggers turned out to be 80% correct.

This means that the factors which create social media success were, in fact, rather easy to predict in this case. The LG blogger campaign, if it had run this prediction software on its first month’s data, would have known not to bother with half the bloggers from then on. It could have hit its KPIs with only half the campaign resources. This brings a new set of possibilities to the old adage that half the marketing budget is wasted, you just can’t know which half. Perhaps now you can.

The folks at Z/Yen Group and Outside Line are all looking closely at how to build on this discovery. Potentially, there’s a massive saving of time and money to be made in social media campaigning. We have put the full deck of slides up here. This was a limited trial, and blogger outreach is one small corner of the social media landscape. But the message for marketers is unambiguous: the delivery of social media campaign goals is predicted by patterns in data that are detectable and reliable.

Citation preview

Page 1: Prediction of social media campaign outcomes zyen may 2011

Prediction of social media campaign outcomes

Study using Support Vector Machine PropheZy for prediction of multivariate datasets

Z/Yen Group Limited with Outside Line

n(b) < n(p) (R(KPI) 25%-100% ) x 2

Page 2

Z/Yen data trial, April 2011

Predict SM outcomes from a campaign dataset

• Internet Advertising Bureau (IAB) brokered
• Blogger Outreach campaign data provided
• Client: LG
• Agency: Outside Line
• 1271 records spanning 13 months
• 84 million hits
• 10 data fields:
– Blog Name
– Blog type
– Unique users
– Technorati author
– Technorati blog rank
– Content Type
– Week posted
– Comments
– Reaction
– Replies

Page 3

Is social media predictable?

• Virality, “buzz”, network effects are inherently hard to model
– Unquantifiable and changeable factors
– Exaggerated impacts from “herd” movements
• Data clusters at the extremes:
– Few blogs with many hits
– Many blogs with few hits

Page 4

PropheZy SVM

• Multivariate regressions on large unstructured datasets
• Trained to find best fitting models
• Proven ability to predict
– Wholesale & retail finance transaction outcome
– Process management outcome
– Compliance and anomaly detection

Page 5

Predicting Blogger Outreach: approach

• KPI = Hits on blog
• 10 fields usable data (small dataset)
• Expanded to 26 further derived fields
– E.g. time since previous blogs 1-5
• No external data
– No author biog, content analysis, environment info
• Banding scheme
– Equal number of records in all bands
• Predictions are correct if within one band
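
The banding and scoring rule on this slide can be illustrated with a minimal sketch. It is not PropheZy's implementation: the helper names, the choice of five bands, and the hits figures are all hypothetical, invented only to show how equal-frequency bands and a "within one band" score might be computed.

```python
from bisect import bisect_right

def equal_frequency_edges(values, n_bands):
    """Return n_bands - 1 cut points so each band holds roughly the same number of records."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bands] for i in range(1, n_bands)]

def band_of(value, edges):
    """Band index (0 = lowest) for a KPI value, given the cut points."""
    return bisect_right(edges, value)

def within_one_band_accuracy(actual, predicted, edges):
    """Share of records whose predicted band is at most one band away from the actual band."""
    hits = sum(
        abs(band_of(a, edges) - band_of(p, edges)) <= 1
        for a, p in zip(actual, predicted)
    )
    return hits / len(actual)

# Made-up hits per record, skewed like the campaign data (few large, many small).
actual_hits = [12, 40, 55, 90, 150, 300, 800, 2500, 9000, 40000]
predicted_hits = [30, 35, 400, 100, 120, 10, 900, 2000, 45000, 38000]

# Five bands is an assumption; the slides do not state the number of bands used.
edges = equal_frequency_edges(actual_hits, n_bands=5)
accuracy = within_one_band_accuracy(actual_hits, predicted_hits, edges)
print(edges, accuracy)
```

Because the bands are cut by record count rather than by equal-width ranges of hits, the handful of very large blogs does not leave most bands empty, which matters for data that clusters at the extremes.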

Page 6

Prediction quality

• Predicts better than standard regression
• Prediction improves by: more fields, top range of KPIs

Technique              Random outcome    Correct prediction
Standard regression    10%               28%
PropheZy               10%               48%

Prediction based on...                   % accuracy
Top 3 fields of data                     36%
All fields of data                       48%
Top 3 quartiles of total KPI score       79%
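
As a quick worked check, the figures on this slide can be turned into a "times better than random" ratio. The variable names are illustrative; only the percentages come from the slide.

```python
# Percent of correct predictions, as reported on the prediction-quality slide.
random_baseline = 10
results = {"Standard regression": 28, "PropheZy": 48}

# Lift = how many times better than the random baseline each technique scored.
lift = {name: correct / random_baseline for name, correct in results.items()}

for name, value in lift.items():
    print(f"{name}: {value:.1f}x random")
```

On these numbers PropheZy comes out at 4.8 times the random baseline, which is consistent with the "five times better than random" figure quoted in the description.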

Page 7

What does it tell us?

Effectiveness

• Blogger success on average can be predicted with 48% accuracy on these data.
• Higher accuracy for:
– top-ranking blogs (79%)
– more data (maybe to between 54% and 77%)
• Training set: 10 mn hits on 69 blogs.
– The 14 top scoring blogs (7 mn hits) were predicted with accuracy 79%
– PropheZy could ensure that 70% of campaign target is achieved with just 20% of campaign input.
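
The "70% of target with 20% of input" claim follows from the training-set figures on this slide, assuming "campaign input" is read as the share of blogs engaged (an interpretation, not something the slide states). A short arithmetic check under that assumption:

```python
# Training-set figures from the slide.
total_blogs = 69
top_blogs = 14
total_hits_mn = 10
top_hits_mn = 7

# Assumption: "campaign input" is measured as the share of blogs engaged.
share_of_blogs = top_blogs / total_blogs      # about 0.20
share_of_hits = top_hits_mn / total_hits_mn   # 0.70

print(f"{share_of_blogs:.0%} of blogs delivered {share_of_hits:.0%} of hits")
```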

Page 8

What does it tell us?

Blogger Outreach

• Promising outlook for PropheZy as a predictor of this and other Social Media outputs
• Predictability is greatest for highest KPI scores
• Possible applications:
– Optimisation tools to increase campaign ROI?
– Benchmark for blogger outreach campaigns?

if: 20/70 rule: top 20% of blogs get 70% of hits

then: n(b) < n(p) (R(KPI) 25%-100% ) x 2