Discovery of Twitter User Interestingness Based on Retweets, Reply Mentions and Pure Mentions Relationships
Ong Kok Chien, Poo Kuan Hoong and Chiung Ching Ho
Faculty of Computing and Informatics, Multimedia University, Cyberjaya.
2016 International Conference on Information in Business and Technology Management (I2BM)
Outline
Introduction
Objective
Methods
Results
Summary
Introduction
Explore the graph relationships between Retweets (RT), Reply-Mentions (RM) and Pure-Mentions (PM)
Compare the resulting ranking with the hand-marked (HM) ranking by seven (7) annotators
Twitter: a microblogging site with a maximum of 140 characters per tweet.
“A Tweet is an expression of a moment or idea. It can contain text, photos, and videos. Millions of Tweets are shared in real time, every day.”
Reply
Retweet
Favorite
Hashtags
https://about.twitter.com/what-is-twitter/story-of-a-tweet
Objectives
To rank Twitter users using PageRank.
6
2016 International Conference on Information in Business and Technology Management (I2BM)
Methods
Link-based ranking algorithms (PageRank)
Twitter Users as Nodes.
Relationships as Edges.
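The node and edge mapping above can be sketched with a small power-iteration PageRank. This is a minimal illustration, not the authors' implementation: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the edge direction (from the user who retweets or mentions to the user being retweeted or mentioned, by analogy with backlinks) are all assumptions.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Rank the nodes of a directed graph given as (source, target) pairs.

    Assumption: an edge points from the user who retweets or mentions
    to the user being retweeted or mentioned.
    """
    nodes = sorted({user for edge in edges for user in edge})
    out_links = defaultdict(set)
    for src, dst in edges:
        out_links[src].add(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each user passes its rank in equal shares to the users it links to.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Two users retweeting the same account, as in the retweet example slide.
ranks = pagerank([("shatyrah2", "Khairykj"), ("AyenSanji", "Khairykj")])
for user in sorted(ranks, key=ranks.get, reverse=True):
    print(user, round(ranks[user], 3))
```

With this direction, the account that is retweeted by both users ends up with the highest score.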
7
2016 International Conference on Information in Business and Technology Management (I2BM)
Example
PageRank (PR), e.g. backlinks in websites: links referring back to the original content.
- Sergey Brin & Larry Page (1998). The anatomy of a large-scale hypertextual Web search engine.
Image extracted from Wikipedia
8
2016 International Conference on Information in Business and Technology Management (I2BM)
Example
Khairykj (Minister of Youth & Sports)
shatyrah2 RT-ed Khairykj
AyenSanji RT-ed Khairykj
https://twitter.com/Khairykj/status/410964119521460224
9
2016 International Conference on Information in Business and Technology Management (I2BM)
Architecture
Pipeline (architecture diagram, steps numbered 1 to 4): Twitter Streaming API with configured keywords, JSON raw data, HiveQL, Unix script.
10
2016 International Conference on Information in Business and Technology Management (I2BM)
Keywords
HyppTV, Streamyx, UMobile, Unifi, Yes4G, Celcom, xpaxsays
11
2016 International Conference on Information in Business and Technology Management (I2BM)
Basic Statistics
Dataset (1st Feb 2015 to 7th Feb 2015)
Total tweets: 7,931
After discarding non-native retweets: 7,922
English tweets (language=en): 2,229
Unique RT pairs of users: 512
Unique PM pairs of users: 620
Unique RM pairs of users: 545
Unique Full-Mention (FM) pairs of users: 1,154
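One way the unique user pairs could be pulled out of the raw JSON is sketched below. The slides use HiveQL and Unix scripts for this step, so the Python here is only an illustration. The field names (`lang`, `user.screen_name`, `retweeted_status`, `in_reply_to_screen_name`, `entities.user_mentions`) follow the tweet JSON of the Twitter v1.1 API. The exact rules are assumptions: a tweet carrying `retweeted_status` counts as a native retweet, the reply target is the RM pair, any other mentioned user is a PM pair, and FM is taken as the union of RM and PM. The function name `extract_pairs` is hypothetical.

```python
import json


def extract_pairs(json_lines):
    """Collect unique (author, other user) pairs per relationship type."""
    pairs = {"RT": set(), "RM": set(), "PM": set()}
    for line in json_lines:
        tweet = json.loads(line)
        if tweet.get("lang") != "en":
            continue  # keep English tweets only (language=en)
        author = tweet["user"]["screen_name"]

        original = tweet.get("retweeted_status")
        if original:
            # Native retweet: pair the retweeter with the original author.
            pairs["RT"].add((author, original["user"]["screen_name"]))
            continue

        reply_to = tweet.get("in_reply_to_screen_name")
        if reply_to:
            pairs["RM"].add((author, reply_to))

        # Assumption: mentions other than the reply target are pure mentions.
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            name = mention["screen_name"]
            if name != reply_to and name != author:
                pairs["PM"].add((author, name))

    # Assumption: a full mention is either a reply mention or a pure mention.
    pairs["FM"] = pairs["RM"] | pairs["PM"]
    return pairs
```

Each set can then be used directly as the edge list of the corresponding graph.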
12
2016 International Conference on Information in Business and Technology Management (I2BM)
Categories of Tweets
Tweets are categorized into the following categories:
(1) News - products/company info
(2) Advertisements - promotion
(3) Business - special offers
(4) Jokes - funny/prank content
(5) Questions - seeking answers/responses
(6) Answers - response to a question (@mentions)
(7) Statement - complaints/comments/feedback
(8) Conversation - response to another tweet
(9) Irrelevant - not related to telco products/services
13
2016 International Conference on Information in Business and Technology Management (I2BM)
Interestingness score schema
The interestingness score schema ranges from 0 to 4:
0 = Irrelevant
1 = Less Interesting/Informative
2 = Interesting/Informative
3 = Quite Interesting/Informative
4 = Very Interesting/Informative
14
2016 International Conference on Information in Business and Technology Management (I2BM)
Results
Average informative/interestingness score: 1.33 out of 4, from our 7 annotators over 2 iterations.
Agreement amongst the 7 annotators after 2 iterations: 62.27% for the 9 categories and 51.64% for the score (between 0 and 4).
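The slides do not state how agreement was calculated. One common choice, shown here purely as an assumption, is the average pairwise percent agreement: for every pair of annotators, count the items on which both gave the same label. The function name `pairwise_agreement` and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_by_annotator):
    """Percent of matching labels over all annotator pairs and all items.

    labels_by_annotator: one equal-length list of labels per annotator.
    """
    matches = 0
    comparisons = 0
    for first, second in combinations(labels_by_annotator, 2):
        for a, b in zip(first, second):
            matches += a == b
            comparisons += 1
    return 100.0 * matches / comparisons


# Toy example: three annotators scoring three tweets on the 0-4 schema.
print(pairwise_agreement([[1, 2, 3], [1, 2, 0], [1, 0, 3]]))
```

The same function applies to category labels (1 to 9) and to interestingness scores (0 to 4).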
15
2016 International Conference on Information in Business and Technology Management (I2BM)
Results
Rank  HM              RM              PM              FM              RT
1     Asianadotmy     Azwnrafi        Zaynneutron     Zaynneutron     Zaynneutron
2     Zulhusnia       Lauravinzant    Shahril_Wokay2  Ndiarzali88     Ndiarzali88
3     IzRijap         TuneTalk        Azwnrafi        TuneTalk        UniFiEdge
4     Socasnov        FirdausAzil     TuneTalk        Lauravinzant    TuneTalk
5     Pjolll          Alliebnormand   Alliebnormand   Azwnrafi        Asianadotmy
6     uk_htc          FookHeng_Lee    Twtwanitaa      FirdausAzil     uk_htc
7     FookHeng_Lee    Pjolll          Lauravinzant    FookHeng_Lee    HyppWorld
8     TuneTalk        Ndiarzali88     FirdausAzil     Twtwanitaa      Zulhusnia
9     NurIllihazwani  UniFiEdge       FookHeng_Lee    Pjolll          IzRijap
10    Shahril_Wokay2  HyppWorld       Pjolll          Alliebnormand   FookHeng_Lee
• RT shows the closest match to the HM ranking sequence, compared with RM and PM.
• Between RM and PM, RM appears to be the better match to the HM sequence.
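The slides do not say which measure was used to judge how closely a ranking matches HM. As an illustration only, two ranked lists could be compared by how many users they share, how many sit at the same rank, and how far the shared users are displaced. The function name `compare_rankings` and the user names below are hypothetical.

```python
def compare_rankings(reference, candidate):
    """Compare two top-k lists of user names (no duplicates within a list)."""
    shared = set(reference) & set(candidate)
    # Users that appear at exactly the same rank in both lists.
    same_position = sum(1 for r, c in zip(reference, candidate) if r == c)
    # Total rank displacement of the users present in both lists.
    displacement = sum(
        abs(reference.index(user) - candidate.index(user)) for user in shared
    )
    return {
        "shared": len(shared),
        "same_position": same_position,
        "displacement": displacement,
    }


# Hypothetical hand-marked ranking and one graph-based ranking.
hand_marked = ["u1", "u2", "u3", "u4"]
graph_based = ["u2", "u1", "u3", "u9"]
print(compare_rankings(hand_marked, graph_based))
```

Different measures can order the RT, RM and PM rankings differently, so the choice of measure should be stated alongside any comparison.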
16
2016 International Conference on Information in Business and Technology Management (I2BM)
Summary
A PR graph relationships analysis of how RT, RM and PM impact the perception of user-level informative/interestingness, validated with HM evaluations.
17
2016 International Conference on Information in Business and Technology Management (I2BM)
Future Work
Further evaluation to be conducted using different weightages of RT, RM and PM.
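A possible shape for such an evaluation is sketched below: the three relationship types are merged into one graph, each edge carrying the weight of its type, and rank is shared in proportion to edge weight. This is an assumption about how the weightages might be applied, not a method from the slides. The function name `weighted_pagerank`, the weight values, and the user names are hypothetical.

```python
from collections import defaultdict


def weighted_pagerank(edges_by_type, weights, damping=0.85, iterations=50):
    """PageRank over a graph whose edge weights depend on relationship type.

    edges_by_type: e.g. {"RT": [(src, dst), ...], "RM": [...], "PM": [...]}
    weights: weight per relationship type, e.g. {"RT": 2.0, "RM": 1.0, "PM": 1.0}
    """
    # Sum the weights of all relationship types linking the same ordered pair.
    edge_weight = defaultdict(float)
    for kind, edges in edges_by_type.items():
        for src, dst in edges:
            edge_weight[(src, dst)] += weights[kind]

    nodes = sorted({user for pair in edge_weight for user in pair})
    out_total = defaultdict(float)
    for (src, _dst), weight in edge_weight.items():
        out_total[src] += weight

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Users with no weighted outgoing edges spread their rank evenly.
        dangling = sum(rank[node] for node in nodes if out_total[node] == 0)
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each user shares its rank in proportion to the outgoing edge weights.
        for (src, dst), weight in edge_weight.items():
            if out_total[src] > 0:
                new_rank[dst] += damping * rank[src] * weight / out_total[src]
        rank = new_rank
    return rank


# Hypothetical weights that favour retweets over mentions.
ranks = weighted_pagerank(
    {"RT": [("a", "b")], "PM": [("a", "c")]},
    {"RT": 2.0, "RM": 1.0, "PM": 1.0},
)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Re-running the ranking under several weight settings and comparing each result with the HM ranking would show how sensitive the outcome is to the weightages.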