Automated Personality Classification
A. KARTELJ and V. FILIPOVIC, School of Mathematics, University of Belgrade, Serbia
and V. MILUTINOVIC, School of Electrical Engineering, University of Belgrade, Serbia
Agenda
• Problem overview
• Classification of the existing solutions
• Presentation of the existing solutions
• Comparison of the solutions
• Work in progress: Bayesian Structure Learning for the APC
• Future work: Video Based APC
• Conclusions
MULTI 2012, 3.10.2012
Problem Overview
The Big 5 Model
The Steps in Our Research
1. Survey paper (under review at ACM CSUR)
2. Research paper: A new APC model based on Bayesian structure learning (in progress)
3. Real-purpose application of the APC model from step 2
4. Go to step 3
Elements of APC
• Corpus: essay, weblog, email, news group, Twitter counts...
• Personality measurement: questionnaire (internet and written). We are searching for an alternative!
• Model: stylistic analysis, linguistic features, machine learning techniques
Applications
Mining People’s Characteristics
Classification of Solutions
• C1 criterion separates solutions by type of conversation (1 = self-reflexive, N = continuous)
• C2 criterion separates solutions by approach (TD = top-down, DD = data-driven, or HY = hybrid)
Linguistic Styles: Language Use as an Individual Difference
Pennebaker and King [1999]
LIWC and MRC Features
Feature | Type | Example
Anger words | LIWC | Hate, kill
Metaphysical issues | LIWC | God, heaven, coffin
Physical state / function | LIWC | Ache, breast, sleep
Inclusive words | LIWC | With, and, include
Social processes | LIWC | Talk, us, friend
Family members | LIWC | Mom, brother, cousin
Past tense verbs | LIWC | Walked, were, had
References to friends | LIWC | Pal, buddy, coworker
Imagery of words | MRC | Low: future, peace; High: table, car
Syllables per word | MRC | Low: a; High: uncompromisingly
Concreteness | MRC | Low: patience, candor; High: ship
Frequency of use | MRC | Low: duly, nudity; High: he, the
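To make the LIWC-style features concrete, here is a minimal sketch of how category frequencies could be computed from a text. It is an illustration only: the toy lexicon holds just the example words listed above, and the names `TOY_LEXICON` and `category_frequencies` are hypothetical, not part of LIWC or of the presented work.

```python
import re
from collections import Counter

# Toy lexicon built only from the example words on the slide.
# A real LIWC dictionary is far larger; the category keys are illustrative.
TOY_LEXICON = {
    "anger": {"hate", "kill"},
    "metaphysical": {"god", "heaven", "coffin"},
    "social": {"talk", "us", "friend"},
    "family": {"mom", "brother", "cousin"},
}


def category_frequencies(text, lexicon=TOY_LEXICON):
    """Return, for each category, the share of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    # An empty text yields 0.0 for every category instead of dividing by zero.
    return {cat: (counts[cat] / total if total else 0.0) for cat in lexicon}


sample = "I hate Mondays but my mom and my brother talk to us"
features = category_frequencies(sample)
for category, share in features.items():
    print(f"{category}: {share:.3f}")
```

The resulting dictionary can serve as one feature vector per author, which is the kind of input the surveyed correlation and classification approaches work on.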
What Are They Blogging About? Personality, Topic and Motivation in Blogs
Gill et al. [2009]
Taking Care of the Linguistic Features of Extraversion
Gill and Oberlander [2002]
Personality Based Latent Friendship Mining
Wang et al. [2009]
A Comparative Evaluation of Personality Estimation Algorithms for the TWIN Recommender System
Roshchina et al. [2011]
Predicting Personality with Social Media
Golbeck et al. [2011]
Our Twitter Profiles, Our Selves: Predicting Personality with Twitter
Quercia et al. [2011]
Paper | Input | Corpus | Features | Algorithm | Soft. | Cit. | I | S | A | R
[Pennebaker and King 1999] | text | essays | LIWC | correlations | n/a | 455 | H | H | H | M
[Mairesse et al. 2007] | text, speech | essays | LIWC, MRC | C4.5, NB, SMO, M5’ | Weka | 99 | M | M | H | M
[Gill et al. 2009] | text | weblogs (14.8words) | LIWC | linear regression | n/a | 26 | H | H | M | M
[Yarkoni 2010] | text | weblogs (100K words) | LIWC | correlations | n/a | 21 | H | M | M | M
[Gill and Oberlander 2002] | text | emails (105 students) | bigrams | bigram analysis | n/a | 49 | L | M | M | L
[Nowson et al. 2005] | text | weblogs (410K words) | word list | correlations | n/a | 48 | L | H | H | L
[Oberlander 2006] | text | weblogs (410K words) | N-grams | NB, SMO | Weka | 53 | H | M | H | M
[Wang et al. 2009] | text, | weblogs (200 pairs) | lexical freq., TFIDF | logistic regression | Minitab | 1 | H | M | M | M
[Iacobelli et al. 2011] | text | weblogs (3000) | LIWC, bigrams, | SVM, SMO, NB.. | Weka | 1 | H | H | M | H
[Argamon et al. 2005] | text | essays | word list, conj. | SMO | Weka | 38 | H | M | M | M
[Argamon et al. 2007] | text | essays | word list, conj. | SMO | Weka, ATMan | 45 | H | M | M | M
[Mairesse and Walker 2006] | text, conv. extracts | 96 persons (≈100K words) | LIWC, MRC, utterance… | RankBoost | n/a | 22 | M | M | H | M
[Rigby and Hassan 2007] | text | mail. lists (140K emails) | LIWC | C4.5 | Weka, SPSS | 30 | M | H | M | L
[Roshchina et al. 2011] | text | TripAdvisor reviews | LIWC, MRC | Linear, M5, SVM | Weka | 2 | H | M | L | M
[Quercia et al. 2011] | meta | 335 Twitter users | Twitter counts | M5’ rules | Weka | 5 | M | H | M | M
[Golbeck et al. 2011] | text, meta | 279 FB users | 5 classes (161 in total) | M5’ rules, Gaussian processes | Weka | 12 | H | M | M | M
[Celli 2012] | text | 1065 posts | 22 ling. features | majority-based classification | n/a | 1 | M | M | M | M
Naive Bayes Classifier
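As a reminder of how a naive Bayes classifier works on text, here is a minimal sketch of a multinomial variant, assuming bag-of-words features and add-one (Laplace) smoothing. The class name `NaiveBayesText`, the toy training sentences and the two labels are hypothetical and are not taken from the presentation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocabulary = set()
        for doc, label in zip(documents, labels):
            words = doc.lower().split()
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def log_scores(self, document):
        """Log prior plus summed log likelihoods, one score per class."""
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_label in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_label / n_docs)
            for word in document.lower().split():
                if word not in self.vocabulary:
                    continue  # words never seen in training are ignored
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, document):
        scores = self.log_scores(document)
        return max(scores, key=scores.get)


# Hypothetical toy corpus with two made-up personality labels.
train_docs = [
    "party friends talk fun",
    "friends party tonight",
    "quiet book alone",
    "alone reading quiet evening",
]
train_labels = ["extravert", "extravert", "introvert", "introvert"]
model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict("party with friends"))
```

The "naive" part is the assumption that words are conditionally independent given the class, which is why the per-word log likelihoods are simply added.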
Naive Bayes and Bayesian Network
Bayesian Network for the APC
Bayesian Network Structure Learning
1. Obtain corpus (training set T)
2. Fit T to appropriate network structure by:
a) ILP formulation + solver (CPLEX, Gurobi…) on smaller instances
b) Apply metaheuristic on larger instances
3. Validate quality of metaheuristic approach
4. Compare obtained APC accuracy with other approaches
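Step 2b can be illustrated with a very simple local search. The sketch below is not the authors' metaheuristic: it is plain greedy hill climbing that only adds edges, it scores candidate structures with a BIC-style score on discrete data, and it counts parameters from the parent configurations actually observed. All function names and the toy variables `A`, `B`, `C` are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def family_score(data, child, parents):
    """BIC-style score of one node given a parent set (discrete data)."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    log_lik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    child_states = len({row[child] for row in data})
    # Assumption: parameters are counted over observed parent configurations.
    n_params = (child_states - 1) * len(parent_counts)
    return log_lik - 0.5 * math.log(n) * n_params


def creates_cycle(parents, child, new_parent):
    """True if the edge new_parent -> child would close a directed cycle."""
    # That happens exactly when child is already an ancestor of new_parent.
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def hill_climb(data, variables, max_parents=2):
    """Greedily add the best-scoring edge until no addition improves the score."""
    parents = {v: [] for v in variables}
    scores = {v: family_score(data, v, []) for v in variables}
    while True:
        best_gain, best_move = 1e-9, None
        for parent, child in permutations(variables, 2):
            if parent in parents[child] or len(parents[child]) >= max_parents:
                continue
            if creates_cycle(parents, child, parent):
                continue
            candidate = family_score(data, child, parents[child] + [parent])
            gain = candidate - scores[child]
            if gain > best_gain:
                best_gain, best_move = gain, (parent, child)
        if best_move is None:
            return parents, sum(scores.values())
        parent, child = best_move
        parents[child].append(parent)
        scores[child] += best_gain


# Toy training set T: B copies A, while C is independent of both.
T = [{"A": a, "B": a, "C": c} for a in (0, 1) for c in (0, 1)] * 5
structure, total_score = hill_climb(T, ["A", "B", "C"])
print(structure, round(total_score, 3))
```

On small instances the same score could be optimized exactly (step 2a), which gives a reference point for judging how close the local search gets (step 3).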
Other Ideas
• Games with a purpose (GWAP)
• Clustering personality characteristics
Packing everything together: Video Based APC
Conclusions
• Classification of the existing solutions (Survey paper)
• Filling the gaps inside classification tree
• Introducing Bayesian Structure Learning for the APC
• Utilizing metaheuristics in dealing with high dimensionality
• APC potential: social networks, recommender, and expert systems
THANK YOU.
Aleksandar Kartelj
V. Filipovic
V. Milutinovic