Private traits and attributes are predictable from digital records of human behavior
Michal Kosinski, David Stillwell, and Thore Graepel
Computational Social Media
Karolos Antoniadis
Presentation, 12th of March, 2020
Private traits and attributes:
• openness
• extraversion
• age
• sexual orientation
• gender
• ethnicity
• etc.
Problem
• Information about people might be predicted.
• For example, studies have shown that personal attributes can be predicted from browsing logs and from the language people use.
Question: Can basic digital records be used to automatically and accurately estimate personal attributes?
Contribution
With Facebook likes, we can accurately estimate a wide range of personal attributes (typically assumed private).
Approach - Data
Objects: quotes, web sites, press articles, books, images, etc.
Users share Likes with friends to express support, to bookmark content, etc.
Users liked 9 million unique objects.
The majority of objects are associated with very few users.
Discard Likes associated with fewer than 20 users and users with fewer than 2 Likes.
What remains?
55,814 Objects, 58,466 Users
User-Like matrix: one row per user, one column per object; each entry is 0 or 1 (1 if the user liked the object).
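The thresholds and the 0/1 user-object matrix described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `filter_likes` and `to_matrix`, the (user, object) pair input format, and the toy data are assumptions. The thresholds are applied in a single pass (objects first, then users), which the slides do not specify.

```python
from collections import defaultdict


def filter_likes(likes, min_users_per_object=20, min_likes_per_user=2):
    """Drop rarely liked objects, then users with too few remaining likes.

    `likes` is an iterable of (user, object) pairs (assumed input format).
    """
    pairs = set(likes)

    # Count distinct users per object and keep sufficiently popular objects.
    users_of = defaultdict(set)
    for user, obj in pairs:
        users_of[obj].add(user)
    pairs = {(u, o) for u, o in pairs if len(users_of[o]) >= min_users_per_object}

    # Count remaining likes per user and keep sufficiently active users.
    objects_of = defaultdict(set)
    for user, obj in pairs:
        objects_of[user].add(obj)
    return {(u, o) for u, o in pairs if len(objects_of[u]) >= min_likes_per_user}


def to_matrix(pairs):
    """Build the 0/1 matrix: one row per user, one column per object."""
    users = sorted({u for u, _ in pairs})
    objects = sorted({o for _, o in pairs})
    matrix = [[1 if (u, o) in pairs else 0 for o in objects] for u in users]
    return users, objects, matrix


# Toy data with small thresholds so the effect is visible.
toy_likes = [
    ("ann", "cats"), ("ann", "jazz"),
    ("bob", "cats"), ("bob", "chess"),
    ("cy", "jazz"), ("cy", "chess"),
    ("dee", "rare"),  # "rare" has a single user, so it is dropped
    ("eve", "cats"),  # "eve" has a single like, so she is dropped
]
kept = filter_likes(toy_likes, min_users_per_object=2, min_likes_per_user=2)
users, objects, matrix = to_matrix(kept)
```

With the deck's thresholds (20 users per object, 2 likes per user) the same functions apply unchanged. At the scale of the real data a sparse matrix representation would be the more practical choice than nested lists.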
Approach - Labels
Personality traits: from the International Personality Item Pool (IPIP) questionnaire.
Religion, political party, etc.: from users’ Facebook profiles.
Ethnicity: by looking at users’ pictures.
Two types of variables: dichotomous and numeric.
Approach - Models
Reduce the dimensionality of the User-Like matrix with SVD.
Use 100 components.
Build models that predict traits and attributes.
For numeric variables: linear regression
For dichotomous variables: logistic regression
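The modelling steps above (SVD reduction, then linear or logistic regression on the components) can be sketched with NumPy alone. This is a hedged illustration rather than the authors' pipeline: the helper names, the synthetic data, the gradient-descent fit, and the choice of 5 components for a tiny toy matrix (the deck uses 100) are all assumptions. The models are fitted and evaluated on the same rows, so the sketch gives no estimate of out-of-sample accuracy.

```python
import numpy as np


def svd_components(user_like, k):
    """Return one k-dimensional row per user from a truncated SVD."""
    u, s, _ = np.linalg.svd(user_like, full_matrices=False)
    return u[:, :k] * s[:k]


def add_intercept(features):
    """Prepend a column of ones so the models learn an intercept."""
    return np.column_stack([np.ones(len(features)), features])


def fit_linear(features, target):
    """Least-squares linear regression, for numeric variables."""
    weights, *_ = np.linalg.lstsq(add_intercept(features), target, rcond=None)
    return weights


def fit_logistic(features, labels, steps=2000, rate=0.1):
    """Logistic regression by plain gradient descent, for dichotomous variables."""
    design = add_intercept(features)
    weights = np.zeros(design.shape[1])
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(design @ weights)))
        weights -= rate * design.T @ (probs - labels) / len(labels)
    return weights


# Synthetic stand-in for the User-Like matrix (200 users, 30 objects).
rng = np.random.default_rng(0)
user_like = (rng.random((200, 30)) < 0.2).astype(float)
components = svd_components(user_like, k=5)

# Synthetic labels, loosely tied to two of the objects (illustration only).
numeric_label = 20 + 3 * user_like[:, 0] + rng.normal(0, 1, 200)
binary_label = (user_like[:, 1] + rng.random(200) > 0.8).astype(float)

linear_weights = fit_linear(components, numeric_label)
numeric_pred = add_intercept(components) @ linear_weights

logistic_weights = fit_logistic(components, binary_label)
binary_prob = 1.0 / (1.0 + np.exp(-(add_intercept(components) @ logistic_weights)))
```

Working on a small number of components instead of the full matrix keeps the regression well-posed even when there are far more objects than any single user has liked.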
Approach - Overview
Results - Dichotomous Variables
Highest accuracy: gender & ethnicity
Lowest accuracy: divorced parents
Results - Numeric Variables
Results - Predictive Likes
Results
Results - Power of Likes
Even a single Like results in non-negligible accuracy.
Results - Overview
Likes can accurately predict individual traits and attributes.
Few users were associated with explicitly revealing Likes.
Fewer than 5% of gay users liked explicitly gay objects.
Conclusion
Personal attributes, ranging from sexual orientation to intelligence, can be automatically and accurately inferred from a user’s Facebook Likes.
PROS
• Improve products and services
• Improve recommendations
• New avenues in psychology
CONS
• Revealing without consent (danger)
• What do we reveal?
• Distrust in online services
Thank you!