Prediction of Influencers from Word Use Chan Shing Hei


Introduction

• Goal: detecting influential users before they show observable signals of influence

• Method: psycholinguistic category scores from word usage

• Twitter data: built predictive models of influence from such category-based features

Measuring Influence

• Influence score: the average number of retweets generated by a user's tweets
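
The definition above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: the function name `influence_score` and the input shape (one retweet count per tweet) are hypothetical, and returning 0 for a user with no tweets is an assumption.

```python
from statistics import mean

def influence_score(retweet_counts):
    """Average number of retweets per tweet for one user.

    retweet_counts: one retweet count per tweet (hypothetical input shape).
    Returns 0.0 for a user with no tweets (an assumption, not from the slides).
    """
    if not retweet_counts:
        return 0.0
    return mean(retweet_counts)

# Example with invented numbers: five tweets, two retweets in total.
score = influence_score([0, 1, 0, 0, 1])
```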

Dataset

• Twitter's streaming API

• Nov 1, 2013 to Nov 14, 2013

• 1000 randomly sampled users

• Tweets from the preceding month (Oct 2013)

• Historical tweets (max 200 per user)

• Average influence score: 0.213 (SD: 0.098)

Psycholinguistic Analysis from Text

• How to measure word use?

Analyze users' historical tweets with the Linguistic Inquiry and Word Count (LIWC) 2001 dictionary

• How to compute a user's LIWC-based scores?

The score in each category is the ratio of the number of occurrences of words in that category in the user's tweets to the total number of words in those tweets
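
The ratio described above can be sketched as follows. The LIWC dictionary itself is not reproduced here, so `TOY_DICTIONARY`, its category names, and its word lists are hypothetical stand-ins. The lowercase regex tokenizer is also an assumption, and the sketch matches whole words only.

```python
import re
from collections import Counter

# Toy stand-in for the LIWC dictionary; categories and word lists are hypothetical.
TOY_DICTIONARY = {
    "posemo": {"good", "great", "love"},
    "negemo": {"bad", "hate", "sad"},
}

def category_scores(tweets, dictionary=TOY_DICTIONARY):
    """Return {category: matching words / total words} over all of a user's tweets."""
    # Simple tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    total = len(words)
    counts = Counter()
    for w in words:
        for category, vocab in dictionary.items():
            if w in vocab:
                counts[category] += 1
    # A user with no words gets 0.0 in every category (an assumption).
    return {c: (counts[c] / total if total else 0.0) for c in dictionary}

# Eight words in total: two in "posemo" (love, great), one in "negemo" (bad).
scores = category_scores(["I love this great day", "bad traffic again"])
```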

Influence and Word Use

Findings from analysis

• LIWC categories that correlated negatively with influence score:

Negative emotion

Physical states

Inhibition

Findings from previous analysis

• LIWC categories that correlated positively with influence score:

more interactive

positive feelings or emotion

determination and desires for the future
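
The findings relate each LIWC category score to the influence score across users. The slides do not say which correlation measure was used. The sketch below assumes Pearson's r, and the function name `pearson_r` and the sample numbers are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Invented per-user values: one category score and the influence score.
category_score = [0.01, 0.02, 0.03, 0.04]
influence = [0.1, 0.2, 0.3, 0.4]
r = pearson_r(category_score, influence)  # positive: both rise together
```

A category would appear in the "negative" list when r is below zero and in the "positive" list when it is above zero, subject to whatever significance test the analysis applied.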

Prediction Models

• Using Weka

• Regression analysis and classification of influence score

• Evaluate the performance

Regression analysis

• Linear regression predicts influence score from LIWC measures
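
The study ran its models in Weka. As a rough stand-in, the sketch below fits an ordinary least squares model in plain Python by solving the normal equations. The function names `fit_linear_regression` and `predict` are hypothetical, and the two features and all numbers are invented.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations.

    X: list of feature rows, y: list of targets.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend a constant column for the intercept
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(k):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]

def predict(weights, features):
    """Apply fitted weights to one feature row."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

# Invented data generated from y = 0.1 + 2*x1 - 1*x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [0.1, 2.1, -0.9, 1.1, 3.1]
weights = fit_linear_regression(X, y)
```

In practice each row of `X` would hold one user's LIWC category scores and `y` the matching influence scores.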

Classification study 1

• Supervised binary classification with machine learning algorithms

Classification study 2

• Divide influence scores into 10 equal-sized bins and train supervised classifiers with 10 classes
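
"Equal-sized" could mean equal width or equal count. The sketch below assumes equal count, so each class holds about the same number of users. The function name `equal_frequency_bins` is hypothetical. An equal-width split would instead cut the score range into 10 intervals of the same length.

```python
def equal_frequency_bins(scores, n_bins=10):
    """Label each score 0..n_bins-1 so that bins hold (nearly) equal counts.

    Ties are broken by input order because Python's sort is stable.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(scores)
    return labels

# Invented scores for 20 users, listed from highest to lowest.
scores = [i / 20 for i in range(20)][::-1]
labels = equal_frequency_bins(scores)
```

The resulting labels would serve as the 10 target classes for a supervised classifier.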

Suggestions

• Add more criteria for measuring influence

• Other social media

• Real application: political campaigns

Conclusion

• Found correlations of word usage with influence behavior

• Discovered a set of psycholinguistic categories correlated with influence

• Can identify users likely to become influencers early