Andorra Future Of Web Search Talk

A talk at the Andorra Workshop on the Future Of Web Search

The Social Media Opportunity and Application for Landmark Search

Mor Naaman

Yahoo! Advanced Development Research

Social Media Patterns:

Semantic space (from any text)

Activity and viewing data

User/personal data

Social network

Location/time metadata

E.g., Semantic Patterns

Heard of Flickr?

This is not An Arch

“Noisy” data

Photographer biases

Wrong data

[Map labels: 6 km, 5 km]

Tag Patterns

What is “Social Media”?

Online media published or shared by individuals and organizations, in an environment that encourages significant individual participation and that promotes curation, discussion and re-use.

Social Media = Context

Context is king

Predictor of content

Modifies perception of content

Social media: context also predicts activity?

Social Media = Challenge

Content is still hard…

Unstructured data (no semantics)

Tags, not ground truth labels

Noise

Scale

• Computation

• Long tail means no supervised learning

Social Media = Opportunity

New types of data / context

Openness/need for new applications and experiences

Social environment encourages engagement

Opportunity for MM!

[Diagram: User and Community. Labels: Analyze, extract patterns; Support, motivate; Encourage, enable]

Approach

Analyze context to extract patterns

Reduce content analysis to constrained scenario/task

Leverage content to improve metadata, relevance

Location-driven Modeling

Extracting Knowledge

More “activity” in a certain location indicates the importance of that location

Tags that are unique to a certain location can be used to represent the location
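The two heuristics above can be sketched as a TF-IDF-style score: a tag scores high in a region when it is used often there and rarely elsewhere. This is only an illustration under assumptions. The function name `tag_scores`, the region keys, and the exact weighting are hypothetical, and the slides do not say which formula the system actually uses.

```python
import math
from collections import Counter


def tag_scores(region_tags):
    """Score tags per region with a TF-IDF-style weight (an assumed formula).

    region_tags maps a region id to the list of tags used on photos there.
    Returns a dict mapping each region id to (tag, score) pairs, best first.
    """
    n_regions = len(region_tags)

    # Count in how many regions each tag appears at least once.
    region_freq = Counter()
    for tags in region_tags.values():
        region_freq.update(set(tags))

    scores = {}
    for region, tags in region_tags.items():
        counts = Counter(tags)
        ranked = [
            # Local activity (count) times uniqueness (log inverse region frequency).
            (tag, count * math.log(n_regions / region_freq[tag]))
            for tag, count in counts.items()
        ]
        # Highest score first; ties broken alphabetically for stable output.
        ranked.sort(key=lambda pair: (-pair[1], pair[0]))
        scores[region] = ranked
    return scores


# Toy data: hypothetical grid cells with made-up tag lists.
photos = {
    "cell_a": ["goldengatebridge", "goldengatebridge", "sanfrancisco", "bridge"],
    "cell_b": ["alcatraz", "alcatraz", "sanfrancisco"],
    "cell_c": ["coittower", "sanfrancisco", "bridge"],
}
scores = tag_scores(photos)
```

In this toy data, a tag such as `sanfrancisco` that appears in every cell gets a score of zero, while a tag concentrated in one cell rises to the top of that cell's list.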

Tag Maps - SF

Make a World Explorer

published in JCDL 2007

http://tagmaps.research.yahoo.com

Better Image Search

Approach

Analyze context to extract patterns

Reduce content analysis to constrained scenario/task

Leverage content to improve metadata, relevance

Rolling in Content

We identified the landmarks...

We know where they are...

We can get the matching photos...

System Overview

published in WWW 2008

published in ACM MM 2007

Learning from noisy labels

Visual Features

• Color: moments over a 5x5 grid

• Texture: Gabor over global image

• Interest points: SIFT
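As an illustration of the first feature, grid color moments could be computed roughly as follows. This is a minimal sketch with assumptions: it uses only the first two moments (mean and standard deviation) per RGB channel, it represents the image as nested lists of pixel tuples, and the name `color_moments` is hypothetical. The slides do not specify the color space or how many moments were used.

```python
import math


def color_moments(image, grid=5):
    """Per-cell color moments over a grid x grid partition of the image.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Assumes the image is at least grid pixels wide and tall.
    Returns a flat list: for each cell and each channel, mean then std.
    """
    height, width = len(image), len(image[0])
    features = []
    for gy in range(grid):
        # Row range covered by this grid row.
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            # Column range covered by this grid column.
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                values = [p[channel] for p in pixels]
                mean = sum(values) / len(values)
                var = sum((v - mean) ** 2 for v in values) / len(values)
                features.extend([mean, math.sqrt(var)])
    return features


# A uniform red 10x10 test image: every cell has mean 255 in the red channel.
red_image = [[(255, 0, 0)] * 10 for _ in range(10)]
vector = color_moments(red_image)
```

With a 5x5 grid, three channels, and two moments, the resulting vector has 150 entries.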

Ranking Clusters (1)

The same “objects” appearing often in a cluster’s photos suggest relevance

Ranking Clusters (2)

Similarity between photos inside cluster versus outside the cluster suggests coherence

Use Visual Features to compare average intra-cluster and inter-cluster similarity
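The coherence comparison can be sketched as a ratio of average similarities. The similarity function here (an inverse of Euclidean distance) and the names `similarity` and `coherence` are assumptions made for illustration; the slides do not state which similarity measure or which way of combining the two averages was used.

```python
import math
from itertools import combinations


def similarity(a, b):
    """Assumed similarity: 1 for identical vectors, falling toward 0 with distance."""
    return 1.0 / (1.0 + math.dist(a, b))


def coherence(cluster, others):
    """Ratio of average intra-cluster to average inter-cluster similarity.

    cluster and others are lists of feature vectors; cluster needs at least
    two members and others at least one. Higher values mean the cluster's
    photos resemble each other more than they resemble outside photos.
    """
    intra_pairs = list(combinations(cluster, 2))
    intra = sum(similarity(a, b) for a, b in intra_pairs) / len(intra_pairs)
    inter = sum(similarity(a, b) for a in cluster for b in others)
    inter /= len(cluster) * len(others)
    return intra / inter


# Toy 2-D feature vectors: one tight cluster and one spread-out cluster.
tight = [(0, 0), (0, 1), (1, 0)]
spread = [(10, 10), (10, 20), (20, 10)]
tight_score = coherence(tight, spread)
spread_score = coherence(spread, tight)
```

On this toy data the tight cluster scores higher than the spread-out one, which is the behavior the ranking relies on.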

System Overview

Sample Results: Golden Gate

[Figure: Golden Gate result photos in three columns: Tags-only, Tags+Location, Tags+Location+Visual; some photos are marked with an X.]

Performance: Precision

[Bar chart: Precision @ 10 (axis from 0 to 1.00) per landmark: alcatraz, baybridge, coittower, deyoung, ferrybuilding, goldengatebridge, lombardstreet, palaceoffinearts, sfmoma, transamerica, and the average. Series: Tag-Only, Tag-Location, Tag-Visual, Tag-Location-Visual.]

+45% w/visual

+30% w/location
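For reference, the Precision @ 10 metric on this slide is conventionally the fraction of the top 10 returned photos that are relevant. A minimal sketch follows; the function name `precision_at_k` and the photo ids are hypothetical, and the example data is made up rather than taken from the evaluation.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked items that are relevant.

    ranked_ids is a list of item ids in rank order; relevant_ids is a set of
    ids judged relevant. Missing positions (fewer than k results) count as
    non-relevant, since the hit count is always divided by k.
    """
    top = ranked_ids[:k]
    hits = sum(1 for item in top if item in relevant_ids)
    return hits / k


# Made-up ranking of ten photo ids, seven of which are judged relevant.
ranking = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9", "p10"]
relevant = {"p1", "p2", "p3", "p5", "p6", "p8", "p9"}
score = precision_at_k(ranking, relevant)
```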

Performance: Representative

[Bar chart: Representative Photos (axis from 0 to 10.0) per landmark: alcatraz, baybridge, coittower, deyoung, ferrybuilding, goldengatebridge, lombardstreet, palaceoffinearts, sfmoma, transamerica, and the average. Series: Tag-Only, Tag-Location, Tag-Visual.]

Improve Relevance

Repeated in Other Context

Analyze context to extract patterns

Reduce content analysis to constrained scenario/task

Leverage content to improve metadata, relevance

Social Media @ Music Events

Analyze context to get set of media items from a single event

Use content (AF) to robustly synchronize the clips

Increase relevance, findability

Social Media = Opportunity

To better understand media content

And robustly apply content analysis

To predict and enhance use and engagement

To invent the future of multimedia systems

Thanks

With: Lyndon Kennedy, Tye Rattenbury, Alex Jaffe, Shane Ahern, Rahul Nair, Jeannie Yang

Some Slides: http://slideshare.net/mor

All photos/icons CC or with permission

http://infolab.stanford.edu/~mor

mor@yahoo-inc.com, mor@cs.stanford.edu

Ranking images: low-level similarity

Euclidean distance from cluster centroid in color and texture space.
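A minimal sketch of this ranking, assuming each photo already has a numeric feature vector (for example, concatenated color and texture features). The names `centroid` and `rank_by_centroid_distance` and the toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def rank_by_centroid_distance(features):
    """Order photo ids by Euclidean distance to the cluster centroid.

    features maps a photo id to its feature vector. Photos closest to the
    centroid come first, on the assumption that they are the most typical.
    """
    center = centroid(list(features.values()))
    return sorted(features, key=lambda pid: math.dist(features[pid], center))


# Toy cluster: three nearby photos and one outlier.
cluster_features = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [2.0, 2.0],
    "d": [10.0, 10.0],
}
ranking = rank_by_centroid_distance(cluster_features)
```

The outlier pulls the centroid toward itself, yet it still ranks last because it remains the farthest photo from the mean.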

Notes: Fire Eagle

http://fireeagle.yahoo.net

Landmark Graph Structure

[Figure labels: Close-up Shots of Tower; Less-Connected Photos; Highly-Connected Photos; Taken From Tower; Far-away Shots]

(a) Golden Gate Bridge (b) Coit Tower

Results: Palace of Fine Arts

[Figure: Palace of Fine Arts result photos in three columns: Tags-only, Tags+Location, Tags+Location+Visual; some photos are marked with an X.]
