
To be or not be engaged: What are the questions (to ask)?


DESCRIPTION

In the online world, user engagement refers to the quality of the user experience that emphasizes the phenomena associated with wanting to use a web application longer and more frequently. User engagement is a multifaceted, complex phenomenon, giving rise to a number of approaches for its measurement: self-reporting (e.g., questionnaires); observational methods (e.g., facial expression analysis, desktop actions); and web analytics using online behavior metrics. These methods represent various trade-offs between the scale of the data analyzed and the depth of understanding. For instance, surveys are hardly scalable but offer rich, qualitative insights, whereas click data can be collected on a large scale but are more difficult to analyze. Still, the core research questions each type of measurement is able to answer are unclear. This talk will present various efforts that combine approaches to measuring engagement, seeking to provide insights into what questions to ask when measuring engagement.

Keynote at the 18th International Conference on Application of Natural Language to Information Systems (NLDB 2013), University of Salford, MediaCityUK. Blog: http://labtomarket.wordpress.com


Page 1

To be or not be engaged: What are the questions (to ask)?

Mounia Lalmas, Yahoo! Labs Barcelona, [email protected]

Page 2

About me

• Since January 2011: Visiting Principal Scientist at Yahoo! Labs Barcelona (user engagement, social media, search)
• 2008-2010: Microsoft Research/RAEng Research Professor at the University of Glasgow (quantum theory to model information retrieval)
• 1999-2008: Lecturer (assistant professor) to Professor at Queen Mary, University of London (XML retrieval and evaluation, INEX)

Blog: labtomarket.wordpress.com

Page 3

Why is it important to engage users?

• In today's wired world, users have enhanced expectations about their interactions with technology, resulting in increased competition amongst the purveyors and designers of interactive systems.
• In addition to utilitarian factors, such as usability, we must consider the hedonic and experiential factors of interacting with technology, such as fun, fulfillment, play, and user engagement.
• In order to make engaging systems, we need to understand what user engagement is and how to measure it.

Page 4

Why is it important to measure and interpret user engagement well?

CTR

Page 5

Outline

• What is user engagement?
• What are the characteristics of user engagement?
• How to measure user engagement?
• What are the questions to ask?

saliency, interestingness, serendipity, relevance, sentiment, reading, news, social media, user generated content, automatic linking, aesthetics

Page 6

WHAT IS USER ENGAGEMENT?

Page 7

Engagement is on everyone's mind

http://thenextweb.com/asia/2013/05/03/kakao-talk-rolls-out-plus-friend-home-a-revamped-platform-to-connect-users-with-their-favorite-brands/
http://socialbarrel.com/70-percent-of-brand-engagement-on-pinterest-come-from-users/51032/
http://iactionable.com/user-engagement/
http://www.cio.com.au/article/459294/heart_foundation_uses_gamification_drive_user_engagement/
http://www.localgov.co.uk/index.cfm?method=news.detail&id=109512
http://www.trefis.com/stock/lnkd/articles/179410/linkedin-makes-a-90-million-bet-on-pulse-to-help-drive-user-engagement/2013-04-15

Page 8

What is user engagement?

User engagement is a quality of the user experience that emphasizes the positive aspects of interaction, in particular the fact of being captivated by the technology (Attfield et al, 2011).

An emotional, cognitive and behavioural connection that exists, at any point in time and over time, between a user and a technological resource.

• user feelings: happy, sad, excited, …
• user interactions: click, read, comment, buy, …
• user mental states: involved, lost, concentrated, …

Page 9

Considerations in the measurement of user engagement

• Short term (within session) and long term (across multiple sessions)
• Laboratory vs. field studies
• Subjective vs. objective measurement
• Large scale (e.g., dwell time of 100,000 people) vs. small scale (e.g., gaze patterns of 10 people)
• User engagement as process vs. as product

One is not better than the other; it depends on the aim.

Page 10

CHARACTERISTICS OF USER ENGAGEMENT

Page 11

Characteristics of user engagement (I)

Focused attention (Webster & Ho, 1997; O'Brien, 2008)
• Users must be focused to be engaged
• Distortions in the subjective perception of time are used to measure it

Positive affect (O'Brien & Toms, 2008)
• Emotions experienced by the user are intrinsically motivating
• An initial affective "hook" can induce a desire for exploration, active discovery or participation

Aesthetics (Jacques et al, 1995; O'Brien, 2008)
• Sensory, visual appeal of the interface stimulates the user and promotes focused attention
• Linked to design principles (e.g. symmetry, balance, saliency)

Endurability (Read, MacFarlane & Casey, 2002; O'Brien, 2008)
• People remember enjoyable, useful, engaging experiences and want to repeat them
• Reflected in, e.g., the propensity of users to recommend an experience, a site or a product

Page 12

Characteristics of user engagement (II)

Novelty (Webster & Ho, 1997; O'Brien, 2008)
• Novelty, surprise, unfamiliarity and the unexpected
• Appeal to users' curiosity; encourage inquisitive behavior and promote repeated engagement

Richness and control (Jacques et al, 1995; Webster & Ho, 1997)
• Richness captures the growth potential of an activity
• Control captures the extent to which a person is able to achieve this growth potential

Reputation, trust and expectation (Attfield et al, 2011)
• Trust is a necessary condition for user engagement
• An implicit contract among people and entities which is more than technological

Motivation, interests, incentives, and benefits (Jacques et al., 1995; O'Brien & Toms, 2008)
• Why should users engage?
• Difficulties in setting up "laboratory"-style experiments

Page 13

MEASURING USER ENGAGEMENT

Page 14

Measuring user engagement

Self-reported engagement
• Measures: questionnaire, interview, report, product reaction cards, think-aloud
• Characteristics: subjective; short- and long-term; lab and field; small-scale; product outcome

Cognitive engagement
• Measures: task-based methods (time spent, follow-on task); physiological measures (e.g. EEG, SCL, fMRI, eye tracking, mouse tracking)
• Characteristics: objective; short-term; lab and field; small-scale and large-scale; process outcome

Interaction engagement
• Measures: web analytics (metrics + models)
• Characteristics: objective; short- and long-term; field; large-scale; process outcome

Page 15

Large-scale measurement of user engagement: web analytics

Intra-session measures
• Dwell time / session duration
• Play time (video)
• Click-through rate (CTR)
• Mouse movement
• Number of pages viewed (click depth)
• Conversion rate (mostly for e-commerce)
• Number of UGC items (comments)

Inter-session measures
• Fraction of return visits
• Time between visits (inter-session time, absence time)
• Total view time per month (video)
• Lifetime value (number of actions)
• Number of sessions per unit of time
• Total usage time per unit of time
• Number of friends on site (social networks)
• Number of UGC items (comments)

• Intra-session engagement measures our success in attracting the user to remain on our site for as long as possible.
• Inter-session engagement can be measured directly or, for commercial sites, by observing lifetime customer value.
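For concreteness, here is a minimal sketch (not from the talk; the log format and field names are invented) of how two of the metrics above, dwell time and absence time, fall out of a raw page-view log:

```python
from collections import defaultdict

# Hypothetical log: one (user_id, session_id, timestamp_in_seconds) per page view.
log = [
    ("u1", "s1", 0), ("u1", "s1", 40), ("u1", "s1", 95),
    ("u1", "s2", 86400), ("u1", "s2", 86460),
    ("u2", "s3", 100), ("u2", "s3", 130),
]

sessions = defaultdict(list)
for user, session, ts in log:
    sessions[(user, session)].append(ts)

# Intra-session: dwell time = last event minus first event within a session.
dwell = {k: max(v) - min(v) for k, v in sessions.items()}

# Inter-session: absence time = gap between the end of one session and
# the start of the same user's next session.
by_user = defaultdict(list)
for (user, session), ts in sessions.items():
    by_user[user].append((min(ts), max(ts)))
absence = {}
for user, spans in by_user.items():
    spans.sort()
    absence[user] = [nxt_start - prev_end
                     for (_, prev_end), (nxt_start, _) in zip(spans, spans[1:])]

print(dwell)    # {('u1', 's1'): 95, ('u1', 's2'): 60, ('u2', 's3'): 30}
print(absence)  # {'u1': [86305], 'u2': []}
```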

Page 16

Cognitive engagement
• Eye tracking
• Mouse movement
• Facial expression
• Psychophysiological measures: respiration, pulse rate, temperature, brain waves, skin conductance, …

Page 17

Signals - Signals - Signals: Five studies

[figure: self-reported engagement and interaction engagement signals]

WHAT ARE THE QUESTIONS TO ASK?

Page 18

STUDY I
• Domain: entertainment news
• Study: saliency
• Measurement: focused attention and affect

+ Lori McCay-Peet + Vidhya Navalpakkam

Page 19

• How the visual catchiness (saliency) of "relevant" information impacts user engagement metrics such as focused attention and emotion (affect)
  • focused attention refers to the exclusion of other things
  • affect relates to the emotions experienced during the interaction
• Saliency model of visual attention developed by (Itti & Koch, 2000)

Self-reported engagement
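The Itti & Koch model combines color, intensity and orientation channels with center-surround filtering; as a rough illustration of the center-surround idea only, here is a toy sketch (my own simplification, not the actual model) using differences of Gaussians on an intensity map:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def toy_saliency(intensity: np.ndarray) -> np.ndarray:
    """Crude center-surround saliency proxy: regions that differ strongly
    from their local surround (difference of Gaussians at several scales)
    get high values. The real Itti & Koch model also uses color and
    orientation channels plus a winner-take-all normalization."""
    sal = np.zeros_like(intensity, dtype=float)
    for center, surround in [(1, 4), (2, 8), (4, 16)]:
        c = gaussian_filter(intensity, center)
        s = gaussian_filter(intensity, surround)
        sal += np.abs(c - s)
    return sal / sal.max()  # normalize to [0, 1]

# A flat page with one visually "catchy" bright block stands out.
page = np.zeros((64, 64))
page[20:28, 30:44] = 1.0
print(toy_saliency(page).argmax())  # flat index of the most salient pixel, in/near the block
```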

Page 20

Manipulating saliency

[figure: web page screenshot and the corresponding saliency maps, under the salient and non-salient conditions]

(McCay-Peet et al, 2012)

Page 21

Study design
• 8 tasks: finding the latest news or headline on a celebrity or entertainment topic
• Affect measured pre- and post-task using the Positive (e.g. "determined", "attentive") and Negative (e.g. "hostile", "afraid") Affect Schedule (PANAS)
• Focused attention measured with the 7-item focused attention subscale (e.g. "I was so involved in my news tasks that I lost track of time", "I blocked things out around me when I was completing the news tasks") and perceived time
• Interest level in topics (pre-task) and questionnaire (post-task), e.g. "I was interested in the content of the web pages", "I wanted to find out more about the topics that I encountered on the web pages"
• 189 (90+99) participants from Amazon Mechanical Turk

Page 22

PANAS (10 positive items and 10 negative items)
• You feel this way right now, that is, at the present moment
  [1 = very slightly or not at all; 2 = a little; 3 = moderately; 4 = quite a bit; 5 = extremely] [randomize items]

Negative items: distressed, upset, guilty, scared, hostile, irritable, ashamed, nervous, jittery, afraid
Positive items: interested, excited, strong, enthusiastic, proud, alert, inspired, determined, attentive, active

(Watson, Clark & Tellegen, 1988)
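Scoring PANAS is simply summing the ten positive and the ten negative item responses separately, giving a positive affect (PA) and a negative affect (NA) score; a minimal sketch with made-up responses:

```python
POSITIVE = ["interested", "excited", "strong", "enthusiastic", "proud",
            "alert", "inspired", "determined", "attentive", "active"]
NEGATIVE = ["distressed", "upset", "guilty", "scared", "hostile",
            "irritable", "ashamed", "nervous", "jittery", "afraid"]

def panas_scores(responses: dict) -> tuple:
    """responses maps each item to a 1-5 rating; returns (PA, NA),
    each a sum in the range 10-50."""
    pa = sum(responses[item] for item in POSITIVE)
    na = sum(responses[item] for item in NEGATIVE)
    return pa, na

# Pre/post-task comparison as in the study: a lower post-task PA is a drop in positive affect.
pre = {item: 3 for item in POSITIVE + NEGATIVE}
post = dict(pre, excited=1, attentive=2)
print(panas_scores(pre), panas_scores(post))  # (30, 30) (27, 30)
```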

Page 23

7-item focused attention subscale
(part of the 31-item user engagement scale)
5-point scale (strongly disagree to strongly agree)

1. I lost myself in this news tasks experience
2. I was so involved in my news tasks that I lost track of time
3. I blocked things out around me when I was completing the news tasks
4. When I was performing these news tasks, I lost track of the world around me
5. The time I spent performing these news tasks just slipped away
6. I was absorbed in my news tasks
7. During the news tasks experience I let myself go

(O'Brien & Toms, 2010)

Page 24

Saliency and positive affect

• When headlines are visually non-salient, users are slow at finding them, report more distraction due to web page features, and show a drop in affect
• When headlines are visually catchy (salient), users find them faster, report that it is easy to focus, and maintain positive affect

• Saliency helps task performance, focusing (avoiding distraction), and maintaining positive affect

Page 25

Saliency and focused attention
• Adapted the focused attention subscale from the online shopping domain to the entertainment news domain
• Users reported it was "easier to focus in the salient condition", BUT there was no significant improvement in the focused attention subscale and no difference in perceived time spent on tasks

• User interest in web page content is a good predictor of focused attention, which in turn is a good predictor of positive affect

Page 26

Self-reporting, crowdsourcing, saliency and user engagement

• The interaction of saliency, focused attention, and affect, together with user interest, is complex.
• Using crowdsourcing worked!
• What next?
  • include web page content as a quality of user engagement in the focused attention scale
  • a more "realistic" (interactive) user reading experience
  • other measurements: mouse tracking, eye tracking, facial expression analysis, etc.

(McCay-Peet, Lalmas & Navalpakkam, 2012)

Page 27

STUDY II
• Domain: news and user generated content (comments)
• Study: interestingness and sentiment
• Measurement: focused attention, affect and gaze

+ Ioannis Arapakis + Barla Cambazoglu + Mari-Carmen Marcos + Joemon Jose

Page 28

Gaze and self-reporting

• News + comments
• Sentiment, interest
• 57 users (lab-based)
• Reading tasks (114)
• Questionnaire (qualitative data)
• Eye-tracking recordings (quantitative data)

Three metrics: gaze, focused attention and positive affect

(Lin et al, 2007)

Page 29

Interesting content promotes user engagement metrics

• All three metrics:
  • focused attention, positive affect & gaze
• What is the right trade-off?
  • news is news :)
• Can we predict?
  • provider, editor, writer, category, genre, visual aids, …, sentimentality, …
• Role of user-generated content (comments)
  • As a measure of engagement?
  • To promote engagement?

Page 30

Lots of sentiment, but with negative connotations!

• Positive affect (and interest, enjoyment and wanting to know more) correlates
  • positively (↑) with sentimentality (lots of emotions)
  • negatively (↓) with positive polarity (happy news)

SentiStrength (from -5 to 5 per word)
  sentimentality: sum of absolute values (amount of sentiment)
  polarity: sum of values (direction of the sentiment: positive vs. negative)

(Thelwall, Buckley & Paltoglou, 2012)
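Given per-word scores in [-5, 5] such as SentiStrength produces, the two quantities above reduce to two sums; a minimal sketch with invented word scores (the real tool's lexicon and phrase handling are more involved):

```python
def sentimentality(word_scores):
    """Amount of sentiment: sum of absolute per-word scores."""
    return sum(abs(s) for s in word_scores)

def polarity(word_scores):
    """Direction of sentiment: plain sum (positive vs. negative)."""
    return sum(word_scores)

# "Happy news" vs. emotionally loaded but mixed news.
happy = [3, 2, 0, 1]       # mildly positive throughout
loaded = [5, -4, 3, -5]    # lots of emotion, little net polarity
print(sentimentality(happy), polarity(happy))    # 6 6
print(sentimentality(loaded), polarity(loaded))  # 17 -1
```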

Page 31

Effect of comments on user engagement

• 6 rankings of comments:
  • most replied, most popular, newest
  • high sentimentality, low sentimentality
  • positive polarity, negative polarity
• Longer gaze on:
  • newest and most popular for interesting news
  • most replied and high sentimentality for non-interesting news
• Can we leverage this to prolong user attention?

Page 32

Gaze, sentimentality, interest

• Interesting and "attractive" content!
• Sentiment as a proxy for focused attention, positive affect and gaze?
• Next:
  • larger-scale study
  • other domains (beyond news!)
  • role of social signals (e.g. Facebook, Twitter)
  • lots more data: mouse tracking, EEG, facial expressions

(Arapakis et al., 2013)

Page 33

STUDY III
• Domain: news and social media (Wikipedia)
• Study: interestingness, aesthetics, task
• Measurement: focused attention, affect and mouse movement

+ David Warnock

Page 34

Mouse tracking and self-reporting
• 324 users from Amazon Mechanical Turk (between-subjects design)
• Two domains (BBC News and Wikipedia)
• Two tasks (reading and search)
• "Normal" vs. "ugly" interface
• Questionnaires (qualitative data)
  • focused attention, positive affect, novelty
  • interest, usability, aesthetics
  • + demographics, handedness & hardware
• Mouse tracking (quantitative data)
  • movement speed, movement rate, click rate, pause length, percentage of time still
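As an illustration, a minimal sketch of how features like these can be derived from a raw stream of timestamped mouse positions; the event format and the "still" threshold are my own assumptions:

```python
import math

def mouse_features(events, still_threshold=2.0):
    """events: time-ordered list of (t_seconds, x, y) samples.
    Returns movement speed (px/s), movement rate (moves/s) and
    percentage of time still (gaps with movement below threshold)."""
    total_time = events[-1][0] - events[0][0]
    dist = moves = still_time = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(events, events[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        dist += step
        if step < still_threshold:
            still_time += t1 - t0
        else:
            moves += 1
    return {
        "movement_speed": dist / total_time,
        "movement_rate": moves / total_time,
        "pct_time_still": 100 * still_time / total_time,
    }

events = [(0, 0, 0), (1, 0, 0), (2, 30, 40), (3, 30, 40), (4, 60, 80)]
print(mouse_features(events))
# {'movement_speed': 25.0, 'movement_rate': 0.5, 'pct_time_still': 50.0}
```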

Page 35

"Ugly" vs. "Normal" interface (BBC News)

Page 36

Mouse tracking can tell us about
• Age
• Hardware
  • mouse
  • trackpad
• Task
  • Searching: "There are many different types of phobia. What is Gephyrophobia a fear of?"
  • Reading: (Wikipedia) Archimedes, Section 1: Biography

Page 37

Mouse tracking could not tell us much about
• focused attention and positive affect
• user interest in the task/topic

BUT:
• the "ugly" variant did not result in lower aesthetics scores (although BBC > Wikipedia)
• the comments left told another story:
  • Wikipedia: "The website was simply awful. Ads flashing everywhere, poor text colors on a dark blue background."; "The webpage was entirely blue. I don't know if it was supposed to be like that, but it definitely detracted from the browsing experience."
  • BBC News: "The website's layout and color scheme were a bitch to navigate and read."; "Comic sans is a horrible font."

Page 38

Mouse tracking and user engagement

• Task and hardware
• Do we have a Hawthorne effect?
• "Usability" vs. engagement
• An "even uglier" interface?
• Within- vs. between-subjects design?
• What next?
  • sequences of movements
  • automatic clustering

(Warnock & Lalmas, 2013)

Page 39

STUDY IV
• Domain: news
• Study: automatic linking
• Measurement: interestingness

+ Ioannis Arapakis + Hakan Ceylan + Pinar Donmez

Page 40

Automatic linking & reading experience

Keeping users reading more articles

Page 41

LEPA: Linker for Events to Past Articles

LEPA is a fully automated approach to constructing hyperlinks in news articles using "simple" text processing and understanding techniques.

Indexer
• Processes articles over a time period by extracting features from each article and storing them to facilitate faster retrieval

Linker
• Identifies sentences that contain newsworthy events
• For each such event, retrieves from the index all the matching articles and links the top-ranked one with the event
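The slide only names the two components, so purely to illustrate the indexer/linker split, here is a toy sketch in which "features" are just word sets, an "event sentence" is naively any sentence containing a year, and ranking is word overlap; all of this is a hypothetical stand-in, not LEPA's actual techniques:

```python
import re
from collections import defaultdict

class Indexer:
    """Stores past articles in an inverted index over their words."""
    def __init__(self):
        self.index = defaultdict(set)   # word -> set of article ids
        self.articles = {}
    def add(self, art_id, text):
        self.articles[art_id] = text
        for word in set(re.findall(r"\w+", text.lower())):
            self.index[word].add(art_id)

class Linker:
    """Finds 'event' sentences and links each to the best matching article."""
    def __init__(self, indexer):
        self.ix = indexer
    def link(self, article):
        links = []
        for sent in article.split("."):
            if not re.search(r"\b(19|20)\d\d\b", sent):  # naive "newsworthy event" test
                continue
            words = set(re.findall(r"\w+", sent.lower()))
            candidates = defaultdict(int)
            for w in words:
                for art_id in self.ix.index[w]:
                    candidates[art_id] += 1            # rank by word overlap
            if candidates:
                links.append((sent.strip(), max(candidates, key=candidates.get)))
        return links

ix = Indexer()
ix.add("a1", "The 2010 oil spill devastated the gulf coast")
linker = Linker(ix)
print(linker.link("Cleanup continues. The region recovered from the 2010 oil spill slowly."))
# [('The region recovered from the 2010 oil spill slowly', 'a1')]
```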

Page 42

Three-stage evaluation

• Pilot study
• Assessing links
• Assessing reading experience

Page 43

Pilot study

Professional editors rated a collection of system-embedded links (164 article-link combinations) on a 5-point Likert scale: (i) bad, (ii) fair, (iii) good, (iv) excellent, and (v) not judged.

• Rating results:
  • Bad: 35.15%
  • Fair: 33.93%
  • Good: 20%
  • Excellent: 9.09%
  • Not judged: 1.81%

• With 63.03% of the links rated fair or better: initial evidence that LEPA is not too far from the optimum achieved by human editors

Page 44

Assessing the links: are they related?

• 664 participants recruited through Amazon Mechanical Turk; between-groups design (two groups)
• Precision = fraction of links (total = 164) that received, in terms of relatedness, a score equal to or greater than 3 on a 5-point Likert scale

                            System-embedded links    Manually-curated links
                            A      B      All        A      B      All
Related to the main theme   49%    42%    45%        54%    51%    53%
Related to a subtopic       21%    24%    22%        31%    34%    33%
Tangentially related        13%    15%    14%         9%    12%    10%
Unrelated                   15%    16%    16%         5%     1%     3%
Other                        2%     2%     2%         1%     2%     1%

(A and B are the two participant groups.)
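The precision definition above reduces to one line over the relatedness judgments; a tiny sketch with invented scores:

```python
def precision(relatedness_scores, threshold=3):
    """Fraction of links whose relatedness score (1-5 Likert) is >= threshold."""
    return sum(s >= threshold for s in relatedness_scores) / len(relatedness_scores)

scores = [5, 4, 2, 3, 1, 4, 3, 2]   # hypothetical judgments for 8 links
print(precision(scores))            # 0.625
```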

Page 45

Assessing the reading experience
• 120 participants recruited through Amazon Mechanical Turk; between-groups design (three groups)
• Editors + two opposite "extremes" of LEPA:
  • High recall: best at embedding newsworthy links & articles that provide interesting insights
  • High precision: best in terms of embedding the right number of links

[figure: inductive, thematic coding of open-ended questions; themes: good topical coverage, informativeness, broader perspective, interesting insights, link presentation, content volume, positive news reading experience]

Page 46

Automatic linking and the news reading experience

• Even under realistic and uncontrolled conditions, the performance of LEPA is comparable to that of editors, and in some cases better
• High precision vs. high recall: a high-precision threshold leads to a better news reading experience; less is more
  "They were too many, being mostly quite long, in some cases more than half the length of the main article, and sometimes they repeated the same identical information"

Page 47

STUDY V
• Domain: social media (Yahoo! Answers and Wikipedia)
• Study: serendipity
• Measurement: relevance, unexpectedness, interestingness

+ Ilaria Bordino + Yelena Mejova

Page 48

Entity-driven exploratory search

Linguistically Motivated Semantic Aggregation Engines: "transition to a truly semantic aggregation paradigm where machines understand a user's intent, discover and organize facts, identify opinions, experiences and trends"

Entity search: we build an entity-driven serendipitous search system based on entity networks extracted from Wikipedia and Yahoo! Answers

Serendipity: finding something good or useful while not specifically looking for it; serendipitous search systems provide relevant and interesting results

Page 49

Yahoo! Answers vs. Wikipedia

Yahoo! Answers: community-driven question & answer portal
• 67,336,144 questions & 261,770,047 answers
• January 1, 2010 to December 31, 2011
• English-language
• minimally curated; opinions, gossip, personal info; variety of points of view

Wikipedia: community-driven encyclopedia
• 3,795,865 articles (as of end of December 2011)
• English Wikipedia
• curated, high-quality knowledge; variety of niche topics

Page 50

Entity & relationship extraction

• entity: any well-defined concept that has a Wikipedia page
• relationship: a topical relationship/similarity between a pair of entities based on document co-occurrence, related to the number of documents in which the two entities occur

Dataset          # Nodes     # Edges       Density   # Isolated
Yahoo! Answers   896,799     112,595,138   0.00028   69,856
Wikipedia        1,754,069   237,058,218   0.00015   82,381

Dataset          Avg Degree  Max Degree    Size of Largest CC
Yahoo! Answers   251         231,921       826,402 (92.15%)
Wikipedia        270         346,070       1,671,241 (95.28%)
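A minimal sketch of how such a co-occurrence network can be built once documents have been reduced to their entity sets (the documents here are made up; the paper's extraction pipeline is more sophisticated):

```python
from itertools import combinations
from collections import Counter

# Hypothetical docs, each already reduced to its set of Wikipedia entities.
docs = [
    {"Steve Jobs", "IPhone", "IPad"},
    {"Steve Jobs", "Steve Wozniak"},
    {"IPhone", "Netflix"},
    {"Steve Jobs", "IPhone"},
]

# Edge weight = number of documents in which the two entities co-occur.
edges = Counter()
for entities in docs:
    for a, b in combinations(sorted(entities), 2):
        edges[(a, b)] += 1

for (a, b), w in edges.most_common(3):
    print(f"{a} -- {b}: co-occurs in {w} document(s)")
# IPhone -- Steve Jobs: co-occurs in 2 document(s), ...
```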

Page 51

[figure: visualizations of the Wikipedia and Yahoo! Answers entity networks]

Page 52

Retrieval: retrieve the entities most related to a query entity using a random walk

                Wikipedia   Yahoo! Answers   Combined
Precision @ 5   0.668       0.724            0.744
MAP             0.716       0.762            0.782

Query entities: Justin Bieber, Nicki Minaj, Katy Perry, Shakira, Eminem, Lady Gaga, Jose Mourinho, Selena Gomez, Kim Kardashian, Miley Cyrus, Robert Pattinson, Adele (singer), Steve Jobs, Osama bin Laden, Ron Paul, Twitter, Facebook, Netflix, iPad, iPhone, TouchPad, Kindle, Olympic Games, Cricket, FIFA, Tennis, Mount Everest, Eiffel Tower, Oxford Street, Nürburgring, Haiti, Chile, Libya, Egypt, Middle East, Earthquake, Oil spill, Tsunami, Subprime mortgage crisis, Bailout, Terrorism, Asperger syndrome, McDonald's, Vitamin D, Appendicitis, Cholera, Influenza, Pertussis, Vaccine, Childbirth

Judgments: 3 labels per query-result pair; gold-standard quality control
• Annotator agreement (overlap): 0.85
• Average overlap in the top-5 results between the two sources: < 1

Example: top results for Steve Jobs
• Yahoo! Answers: Jon Rubinstein, Timothy Cook, Kane Kramer, Steve Wozniak, Jerry York
• Wikipedia: System 7, PowerPC G4, SuperDrive, Power Macintosh, Power Computing Corp.
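Retrieving the entities most related to a query entity via a random walk is typically done with a random walk with restart (personalized PageRank); a small sketch on a toy weighted graph, where the graph, weights and restart probability are illustrative assumptions rather than the paper's settings:

```python
def random_walk_with_restart(graph, query, alpha=0.15, iters=50):
    """graph: {node: {neighbor: weight}}. Returns stationary scores of a
    walk that, at each step, restarts at the query node with prob. alpha."""
    nodes = list(graph)
    p = {n: 0.0 for n in nodes}
    p[query] = 1.0
    for _ in range(iters):
        nxt = {n: (alpha if n == query else 0.0) for n in nodes}
        for n, mass in p.items():
            total = sum(graph[n].values())
            for nbr, w in graph[n].items():
                nxt[nbr] += (1 - alpha) * mass * w / total
        p = nxt
    return p

graph = {
    "Steve Jobs": {"Steve Wozniak": 3, "IPhone": 5, "Netflix": 1},
    "Steve Wozniak": {"Steve Jobs": 3, "IPhone": 1},
    "IPhone": {"Steve Jobs": 5, "Steve Wozniak": 1, "Netflix": 2},
    "Netflix": {"Steve Jobs": 1, "IPhone": 2},
}
scores = random_walk_with_restart(graph, "Steve Jobs")
print(sorted(scores, key=scores.get, reverse=True))  # query first, then closest entities
```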

Page 53

Serendipity: "making fortunate discoveries by accident"
Serendipity = unexpectedness + relevance

"Expected" result baselines from web search. Each baseline is scored by |relevant & unexpected| / |unexpected| (serendipitous results out of all unexpected results retrieved), with |relevant & unexpected| / |retrieved| (serendipitous results out of all retrieved) in parentheses:

Baseline                                           Data   Score
Top: 5 entities that occur most frequently in      WP     0.63 (0.58)
top-5 search results from Bing and Google          YA     0.69 (0.63)
Top -WP: same as above, but excluding the          WP     0.63 (0.58)
Wikipedia page from the results                    YA     0.70 (0.64)
Rel: top 5 entities in the related-query           WP     0.64 (0.61)
suggestions provided by Bing and Google            YA     0.70 (0.65)
Rel + Top: union of Top and Rel                    WP     0.61 (0.54)
                                                   YA     0.68 (0.57)
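With binary per-result judgments, the two ratios above take a few lines; a sketch with invented judgments, where "unexpected" stands for "not returned by the web-search baselines":

```python
def serendipity_scores(results):
    """results: list of (relevant: bool, unexpected: bool) per retrieved entity.
    Returns (|rel & unexp| / |unexp|, |rel & unexp| / |retrieved|)."""
    rel_unexp = sum(r and u for r, u in results)
    unexp = sum(u for _, u in results)
    return rel_unexp / unexp, rel_unexp / len(results)

# 5 retrieved entities for one query, each judged (relevant, unexpected).
results = [(True, True), (True, False), (False, True), (True, True), (False, False)]
print(serendipity_scores(results))  # (0.666..., 0.4)
```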

Page 54

Interestingness ≠ Relevance

Some results are more interesting than relevant (Interesting > Relevant), others more relevant than interesting (Relevant > Interesting). Example query → result pairs (WP = Wikipedia, YA = Yahoo! Answers):

• Oil Spill → Penguins in Sweaters (WP)
• Robert Pattinson → Water for Elephants (WP)
• Lady Gaga → Britney Spears (WP)
• Egypt → Cairo Conference (WP)
• Netflix → Blu-ray Disc (YA)
• Egypt → Ptolemaic Kingdom (WP & YA)

(Bordino, Mejova & Lalmas, 2013)

Page 55

Assessing "interestingness"

Following (Arguello et al, 2011):
1. Labelers provide pairwise comparisons between results
2. Combine these into a reference ranking
3. Compare the result ranking to the optimal ranking using Kendall's tau

Similarity (Kendall's tau-b) between result sets and the reference ranking:

Question asked of labelers                                       Data   tau-b
Which result is more relevant to the query?                      WP     0.162
                                                                 YA     0.336
If someone is interested in the query, would they also be        WP     0.162
interested in the result?                                        YA     0.312
Even if you are not interested in the query, is the result       WP     0.139
interesting to you personally?                                   YA     0.324
Would you learn anything new about the query from the results?   WP     0.167
                                                                 YA     0.307
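Step 3 can be reproduced with scipy, whose kendalltau handles tied ranks (the tau-b variant); toy rankings for illustration:

```python
from scipy.stats import kendalltau

# Hypothetical positions of 6 results: system ranking vs. the reference
# ranking aggregated from pairwise labels (ties allowed, hence tau-b).
system_rank = [1, 2, 3, 4, 5, 6]
reference_rank = [2, 1, 3, 3, 6, 5]   # items 3 and 4 tied in the reference

tau, p_value = kendalltau(system_rank, reference_rank)
print(round(tau, 3))  # 1.0 means identical order, 0 means no association
```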

Page 56

Serendipity in multimedia search?

Multimedia search activities are often driven by entertainment needs, not by information needs (Slaney, 2011).

Page 57

What are the questions to ask?
• No one measurement is perfect or complete.
• All studies (process or product) have different constraints.
• Need to ensure methods are applied consistently, with attention to reliability: what is a good signal?
• More emphasis should be placed on using mixed methods to improve the validity of the measures.
• Beware the WEIRD syndrome (Western, Educated, Industrialized, Rich, and Democratic participants).

Page 58

Acknowledgements

• Collaborators: Ioannis Arapakis, Ilaria Bordino, Barla Cambazoglu, Hakan Ceylan, Pinar Donmez, Lori McCay-Peet, Yelena Mejova, Vidhya Navalpakkam, David Warnock, and others at Yahoo! Labs.
• This talk uses some material from the tutorial "Measuring User Engagement" given at WWW 2013, Rio de Janeiro (with Heather O'Brien and Elad Yom-Tov).

Blog: labtomarket.wordpress.com