Automated Personality Classification A. KARTELJ and V. FILIPOVIC School of Mathematics, University of Belgrade, Serbia and V. MILUTINOVIC School of Electrical Engineering, University of Belgrade, Serbia


Page 1: Automated Personality Classification

Automated Personality Classification

A. KARTELJ and V. FILIPOVIC
School of Mathematics, University of Belgrade, Serbia
and
V. MILUTINOVIC
School of Electrical Engineering, University of Belgrade, Serbia

Page 2: Automated Personality Classification

Agenda

• Problem overview
• Classification of the existing solutions
• Presentation of the existing solutions
• Comparison of the solutions
• Work in progress: Bayesian Structure Learning for the APC
• Future work: Video Based APC
• Conclusions

MULTI 2012, 3.10.2012

Page 3: Automated Personality Classification

Problem Overview


Page 4: Automated Personality Classification

The Big 5 Model


Page 5: Automated Personality Classification

The Steps in Our Research

1. Survey paper (under review at ACM CSUR)
2. Research paper: A new APC model based on Bayesian structure learning (in progress)
3. Real-purpose application of the APC model from step 2
4. Go to step 3

Page 6: Automated Personality Classification

Elements of APC

• Corpus: Essay, weblog, email, news group, Twitter counts...
• Personality measurement: Questionnaire (internet and written). We are searching for an alternative!
• Model: Stylistic analysis, linguistic features, machine learning techniques


Page 7: Automated Personality Classification

Applications


Page 8: Automated Personality Classification

Mining People’s Characteristics


Page 9: Automated Personality Classification

Classification of Solutions


• C1 criterion separates solutions by type of conversation (1 = self-reflexive, N = continuous)

• C2 criterion separates solutions by approach (TD = top-down, DD = data-driven, or HY = hybrid)

Page 10: Automated Personality Classification

Linguistic Styles: Language Use as an Individual Difference
Pennebaker and King [1999]


Page 11: Automated Personality Classification

LIWC and MRC Features

Feature | Type | Example
--- | --- | ---
Anger words | LIWC | Hate, kill
Metaphysical issues | LIWC | God, heaven, coffin
Physical state / function | LIWC | Ache, breast, sleep
Inclusive words | LIWC | With, and, include
Social processes | LIWC | Talk, us, friend
Family members | LIWC | Mom, brother, cousin
Past tense verbs | LIWC | Walked, were, had
References to friends | LIWC | Pal, buddy, coworker
Imagery of words | MRC | Low: future, peace – High: table, car
Syllables per word | MRC | Low: a – High: uncompromisingly
Concreteness | MRC | Low: patience, candor – High: ship
Frequency of use | MRC | Low: duly, nudity – High: he, the
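To make the feature types above concrete, here is a minimal Python sketch of dictionary-based feature extraction. It is an illustration only: `TOY_LEXICON`, `category_frequencies`, and `mean_syllables_per_word` are hypothetical names, the lexicon contains just a few example words from the slide rather than the real LIWC dictionary, and the syllable counter is a crude vowel-group heuristic standing in for an MRC database lookup.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon built only from example words on the slide;
# the real LIWC dictionary is far larger and is not reproduced here.
TOY_LEXICON = {
    "anger": {"hate", "kill"},
    "inclusive": {"with", "and", "include"},
    "social": {"talk", "us", "friend"},
    "family": {"mom", "brother", "cousin"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_frequencies(text, lexicon=TOY_LEXICON):
    """Return, per category, the share of tokens that belong to it."""
    tokens = tokenize(text)
    if not tokens:
        return {category: 0.0 for category in lexicon}
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    return {category: counts[category] / len(tokens) for category in lexicon}


def mean_syllables_per_word(text):
    """Crude MRC-style stand-in: count vowel groups as syllables."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    syllables = [max(1, len(re.findall(r"[aeiouy]+", t))) for t in tokens]
    return sum(syllables) / len(tokens)
```

Each text then becomes a small numeric vector (one relative frequency per category plus any word-level averages), which is the kind of input the correlation and classification approaches in the surveyed papers operate on.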


Page 12: Automated Personality Classification

What Are They Blogging About? Personality, Topic and Motivation in Blogs
Gill et al. [2009]


Page 13: Automated Personality Classification

Taking Care of the Linguistic Features of Extraversion
Gill and Oberlander [2002]


Page 14: Automated Personality Classification

Personality Based Latent Friendship Mining
Wang et al. [2009]


Page 15: Automated Personality Classification

A Comparative Evaluation of Personality Estimation Algorithms for the TWIN Recommender System
Roshchina et al. [2011]


Page 16: Automated Personality Classification

Predicting Personality with Social Media
Golbeck et al. [2011]


Page 17: Automated Personality Classification

Our Twitter Profiles, Our Selves: Predicting Personality with Twitter
Quercia et al. [2011]


Page 18: Automated Personality Classification

Paper | Input | Corpus | Features | Algorithm | Software | Citations | I | S | A | R
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
[Pennebaker and King 1999] | text | essays | LIWC | correlations | n/a | 455 | H | H | H | M
[Mairesse et al. 2007] | text, speech | essays | LIWC, MRC | C4.5, NB, SMO, M5’ | Weka | 99 | M | M | H | M
[Gill et al. 2009] | text | weblogs (14.8words) | LIWC | linear regression | n/a | 26 | H | H | M | M
[Yarkoni 2010] | text | weblogs (100K words) | LIWC | correlations | n/a | 21 | H | M | M | M
[Gill and Oberlander 2002] | text | emails (105 students) | bigrams | bigram analysis | n/a | 49 | L | M | M | L
[Nowson et al. 2005] | text | weblogs (410K words) | word list | correlations | n/a | 48 | L | H | H | L
[Oberlander 2006] | text | weblogs (410K words) | N-grams | NB, SMO | Weka | 53 | H | M | H | M
[Wang et al. 2009] | text, | weblogs (200 pairs) | lexical freq., TFIDF | logistic regression | Minitab | 1 | H | M | M | M
[Iacobelli et al. 2011] | text | weblogs (3000) | LIWC, bigrams, | SVM, SMO, NB.. | Weka | 1 | H | H | M | H
[Argamon et al. 2005] | text | essays | word list, conj. | SMO | Weka | 38 | H | M | M | M
[Argamon et al. 2007] | text | essays | word list, conj. | SMO | Weka, ATMan | 45 | H | M | M | M
[Mairesse and Walker 2006] | text, conv. extracts | 96 persons (≈100K words) | LIWC, MRC, utterance… | RankBoost | n/a | 22 | M | M | H | M
[Rigby and Hassan 2007] | text | mail. lists (140K emails) | LIWC | C4.5 | Weka, SPSS | 30 | M | H | M | L
[Roshchina et al. 2011] | text | TripAdvisor reviews | LIWC, MRC | Linear, M5, SVM | Weka | 2 | H | M | L | M
[Quercia et al. 2011] | meta | 335 Twitter users | Twitter counts | M5’ rules | Weka | 5 | M | H | M | M
[Golbeck et al. 2011] | text, meta | 279 FB users | 5 classes (161 in total) | M5’ rules, Gaussian processes | Weka | 12 | H | M | M | M
[Celli 2012] | text | 1065 posts | 22 ling. features | majority-based classification | n/a | 1 | M | M | M | M


Page 19: Automated Personality Classification

Naive Bayes Classifier


Page 20: Automated Personality Classification

Naive Bayes and Bayesian Network
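For reference, the standard textbook factorizations that distinguish the two models can be written as follows. The notation is assumed here and does not come from the slides: C stands for the class (for example, the level of one personality trait), F_1, ..., F_n for the observed features, and Pa(X_j) for the parents of node X_j in the network graph.

```latex
% Naive Bayes: features are conditionally independent given the class C
P(C \mid F_1, \dots, F_n) \propto P(C) \prod_{i=1}^{n} P(F_i \mid C)

% General Bayesian network over variables X_1, ..., X_m (class and features),
% where Pa(X_j) denotes the parents of X_j in the directed acyclic graph
P(X_1, \dots, X_m) = \prod_{j=1}^{m} P\bigl(X_j \mid \mathrm{Pa}(X_j)\bigr)
```

Naive Bayes is the special case of the second formula in which the class is the only parent of every feature; a learned network structure may additionally connect features to one another.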


Page 21: Automated Personality Classification

Bayesian Network for the APC


Page 22: Automated Personality Classification

Bayesian Network Structure Learning

1. Obtain corpus (training set T)
2. Fit T to appropriate network structure by:
   a) ILP formulation + solver (CPLEX, Gurobi…) on smaller instances
   b) Apply metaheuristic on larger instances
3. Validate quality of metaheuristic approach
4. Compare obtained APC accuracy with other approaches
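The search in step 2b can be sketched as score-based local search over directed acyclic graphs. The slides do not name the scoring function or the metaheuristic, so the sketch below makes its own assumptions: a BIC score, binary variables, and plain greedy hill climbing as a simple stand-in for a metaheuristic. The ILP branch of step 2a is not shown, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def family_score(data, child, parents):
    """BIC contribution of one node given its parents (binary variables assumed)."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    log_lik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint.items())
    n_params = 2 ** len(parents)  # one free parameter per parent configuration
    return log_lik - 0.5 * math.log(n) * n_params


def creates_cycle(parents, child, new_parent):
    """True if adding the edge new_parent -> child would close a directed cycle."""
    stack, seen = [new_parent], set()
    while stack:  # walk upwards through the ancestors of new_parent
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def hill_climb(data, variables, max_parents=2):
    """Greedily add or remove single edges while the total BIC score improves."""
    parents = {v: set() for v in variables}
    scores = {v: family_score(data, v, []) for v in variables}
    improved = True
    while improved:
        improved = False
        best_gain, best_move = 1e-9, None
        for parent, child in permutations(variables, 2):
            if parent in parents[child]:
                candidate = parents[child] - {parent}  # try removing the edge
            elif (len(parents[child]) < max_parents
                  and not creates_cycle(parents, child, parent)):
                candidate = parents[child] | {parent}  # try adding the edge
            else:
                continue
            gain = family_score(data, child, sorted(candidate)) - scores[child]
            if gain > best_gain:
                best_gain, best_move = gain, (child, candidate)
        if best_move is not None:
            child, candidate = best_move
            parents[child] = candidate
            scores[child] += best_gain
            improved = True
    return parents


# Toy training set T: y copies x, z is independent of both.
T = [{"x": i % 2, "y": i % 2, "z": (i // 2) % 2} for i in range(40)]
structure = hill_climb(T, ["x", "y", "z"])
```

On this toy data the search links x and y and leaves z unconnected. Because the score decomposes per node, each candidate move only requires rescoring one family, which is what keeps local search affordable as the number of features grows.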


Page 23: Automated Personality Classification

Other Ideas

• Games with a purpose (GWAP)
• Clustering personality characteristics

Page 24: Automated Personality Classification

Packing everything together: Video Based APC


Page 25: Automated Personality Classification

Conclusions

• Classification of the existing solutions (Survey paper)
• Filling the gaps inside classification tree
• Introducing Bayesian Structure Learning for the APC
• Utilizing metaheuristics in dealing with high dimensionality
• APC potential: social networks, recommender, and expert systems
