Collating Social Network Profiles
2
Objective
<Company Name> → System → <Twitter Profile, Facebook Profile, G+ Profile, …>
3
Objective
Input: Company Name → System → Output: Social Network Profiles <Twitter Profile, Facebook Profile, G+ Profile, …>
4
Record Linkage + Identity
5
Agenda
Introduction: Objective, Contrast to Existing Work
Work Done: Baseline System, Individual Network Approach, Machine Learning Experiments
Next Steps, Q&A
6
Baseline System
7
Ground Truth
Two networks: Facebook and Twitter
Top seventy 2013 Fortune 500 companies
8
Baseline Algorithm
1. Take company name.
2. Search Facebook/Twitter API using it.
3. Return first result from each.
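The three steps above can be sketched as follows. This is a minimal illustration, not the deck's actual code: the `SearchFn` type, `baseline_match`, and the two stand-in search functions are hypothetical names, and the canned results replace real Facebook/Twitter API calls, whose request format the slides do not show.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a search function takes a query string and returns
# profile identifiers in the order the network's own search ranks them.
SearchFn = Callable[[str], List[str]]


def baseline_match(company: str, searchers: Dict[str, SearchFn]) -> Dict[str, Optional[str]]:
    """Return the first search result from each network, or None if there is none."""
    matches: Dict[str, Optional[str]] = {}
    for network, search in searchers.items():
        results = search(company)                      # step 2: search by company name
        matches[network] = results[0] if results else None  # step 3: keep the first hit
    return matches


# Stand-in searchers with canned results (illustrative only, not real API calls).
def fake_facebook_search(query: str) -> List[str]:
    canned = {"Acme": ["facebook.example/acme", "facebook.example/acme-fans"]}
    return canned.get(query, [])


def fake_twitter_search(query: str) -> List[str]:
    canned = {"Acme": ["@acme"]}
    return canned.get(query, [])


searchers = {"facebook": fake_facebook_search, "twitter": fake_twitter_search}
print(baseline_match("Acme", searchers))
```

Passing the search functions in as arguments keeps the baseline logic independent of any particular API client, so the same function can be exercised against recorded results.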
9
Baseline Performance
Bar chart of correct matches out of 70 companies: Facebook 34, Twitter 52, Both 30.
10
Individual Network Approach
11
New Approach
Score profiles based on:
Edit Distance
  Company Name – Username
  Company Name – Display Name
Relative Popularity
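To make the edit-distance comparison concrete, here is a small sketch using Levenshtein distance. The slides do not say which edit-distance variant or preprocessing was used, so both the choice of Levenshtein and the `normalize` step (lowercasing, dropping punctuation and spaces) are assumptions, and the company and profile names are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits (assumed preprocessing)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


company = "Acme Corp"  # hypothetical company name
username_distance = edit_distance(normalize(company), normalize("acmecorp"))
display_distance = edit_distance(normalize(company), normalize("Acme Corporation"))
print(username_distance, display_distance)
```

A smaller distance means the username or display name is closer to the company name.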
12
Display Name
Username
14
Scoring
Edit Distance Score:
Popularity Score:
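The formulas on this slide did not survive extraction, so the following is only one plausible shape for the two scores, not the deck's definitions. It assumes the edit-distance score is the distance normalized by the longer string's length, the popularity score is a candidate's follower count relative to the most popular candidate, and the two are combined with a hypothetical `weight`. The candidate data is invented for illustration.

```python
from typing import Dict, List


def edit_distance_score(distance: int, name_len: int, handle_len: int) -> float:
    """Map an edit distance to [0, 1]; 1.0 means identical strings (assumed form)."""
    longest = max(name_len, handle_len)
    return 1.0 if longest == 0 else 1.0 - distance / longest


def popularity_score(followers: int, max_followers: int) -> float:
    """Popularity relative to the most popular candidate (assumed form)."""
    return 0.0 if max_followers == 0 else followers / max_followers


def best_candidate(name_len: int, candidates: List[Dict], weight: float = 0.5) -> str:
    """Return the username with the highest weighted sum of both scores."""
    max_followers = max(c["followers"] for c in candidates)

    def total(c: Dict) -> float:
        name_part = edit_distance_score(c["distance"], name_len, len(c["username"]))
        pop_part = popularity_score(c["followers"], max_followers)
        return weight * name_part + (1 - weight) * pop_part

    return max(candidates, key=total)["username"]


# Invented candidates for a company whose normalized name is "acme" (length 4);
# "distance" is the precomputed edit distance between that name and the username.
candidates = [
    {"username": "acme", "distance": 0, "followers": 90_000},
    {"username": "acme_fans", "distance": 5, "followers": 1_200},
    {"username": "acmeinc", "distance": 3, "followers": 100_000},
]
print(best_candidate(4, candidates))
```

Setting `weight` to 1.0 ranks by name similarity alone and 0.0 by popularity alone, which makes it easy to compare combinations of the two signals.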
15
Best Performing Combination
Bar chart of correct matches out of 70 companies (Facebook / Twitter / Both):
Baseline: 34 / 52 / 30
Username Edit Distance + Popularity: 40 / 50 / 34
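The chart's counts can be turned into accuracy figures with a few lines of arithmetic. This sketch assumes the six bar labels read as Baseline 34/52/30 and Username Edit Distance + Popularity 40/50/34 for Facebook/Twitter/Both, with all counts out of the 70 ground-truth companies; the variable names are illustrative.

```python
TOTAL = 70  # companies in the ground truth

# Correct matches per network, as read from the chart (assumed label order).
baseline = {"Facebook": 34, "Twitter": 52, "Both": 30}
combined = {"Facebook": 40, "Twitter": 50, "Both": 34}


def accuracy(correct: int, total: int = TOTAL) -> float:
    """Fraction of companies matched correctly."""
    return correct / total


for network in baseline:
    delta = combined[network] - baseline[network]
    print(f"{network}: {accuracy(baseline[network]):.1%} -> "
          f"{accuracy(combined[network]):.1%} ({delta:+d} companies)")
```

Under that reading, the combination gains on Facebook and on Both while losing two matches on Twitter.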
16
Machine Learning Experiments
17
Freebase Ground Truth
1,422 with a social media presence
917 with Facebook, 687 with Twitter
598 with both
553 with valid profiles
18
Training Set
553 Correct
553 Incorrect
1,106 Total
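A balanced set like this one (one incorrect example per correct example) could be assembled as sketched below. The slides do not say how the incorrect pairs were produced; pairing each company with another company's profile is purely an assumption, and `build_training_set` is a hypothetical helper that expects all profiles to be distinct.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (company name, profile identifier)


def build_training_set(correct: List[Pair]) -> List[Tuple[str, str, int]]:
    """Label each true pair 1 and add one mismatched pair per company labeled 0."""
    if len(correct) < 2:
        raise ValueError("need at least two pairs to form mismatches")
    examples = [(company, profile, 1) for company, profile in correct]
    for i, (company, _) in enumerate(correct):
        # Borrow the next company's profile so the pairing is known to be wrong.
        wrong_profile = correct[(i + 1) % len(correct)][1]
        examples.append((company, wrong_profile, 0))
    return examples


# Placeholder names only; 553 matches the number of valid pairs on the slide.
pairs = [(f"company{i}", f"profile{i}") for i in range(553)]
training = build_training_set(pairs)
print(len(training), sum(label for _, _, label in training))
```

Negatives built this way are usually easy to separate from positives; harder negatives would come from near-miss profiles such as fan pages or similarly named accounts.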
19
Cross Validation Results
Classifier                 Test | Train   Train | Test
Linear Regression          0.734          0.707
Gaussian Naïve Bayes       0.972          0.956
Multinomial Naïve Bayes    0.511          0.506
Bernoulli Naïve Bayes      0.720          0.701
Decision Tree              0.954          0.935
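For readers unfamiliar with how such scores are obtained, here is a dependency-free sketch of k-fold cross-validation around a small Gaussian Naive Bayes classifier. The deck does not state which library, features, or fold count were used, so everything here is an assumption: the two features stand in for an edit-distance score and a popularity score, and the toy data is synthetic and unrelated to the results in the table.

```python
import math


def fit_gaussian_nb(X, y):
    """Per class, store the prior and each feature's mean and variance."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(rows) for col in columns]
        # A small constant keeps the variance from being exactly zero.
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (len(rows) / len(y), means, variances)
    return model


def predict(model, x):
    """Pick the class with the highest log posterior under the Gaussian model."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)


def cross_validate(X, y, k=5):
    """Mean held-out accuracy over k folds; fold i holds out rows i, i+k, i+2k, ..."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        model = fit_gaussian_nb(train_X, train_y)
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Synthetic, well-separated toy data: [name-similarity score, popularity score].
X = ([[0.90 + 0.01 * (i % 5), 0.80 + 0.02 * (i % 3)] for i in range(20)]
     + [[0.20 + 0.01 * (i % 5), 0.10 + 0.02 * (i % 3)] for i in range(20)])
y = [1] * 20 + [0] * 20
print(cross_validate(X, y, k=5))
```

The same `cross_validate` loop works with any classifier that exposes a fit step and a predict step, which is how several classifiers can be compared on identical folds.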
22
Next Steps
Improve training set: provide harder examples
Incorporate more profile data
Build system around classifiers