Towards Boosting Video Popularity via Tag Selection

Elizeu Santos-Neto, Tatiana Pontes, Jussara Almeida, Matei Ripeanu

University of British Columbia - Vancouver, Canada
Universidade Federal de Minas Gerais - Belo Horizonte, Brazil

{elizeus,matei}@ece.ubc.ca
{tpontes,jussara}@dcc.ufmg.br

Glasgow, Scotland - April 1st, 2014

SoMuS Workshop

Introduction

100 hours of video per minute

Specialized content management companies are responsible for publishing, monitoring, and promoting the owner's content.

Ad revenues are shared between the manager and the video's owner.

Because revenue is directly related to the number of ad prints, managers are motivated to boost content popularity.


Introduction

Viewers reach a content item via different leads:

● keyword/tag-based search
● website
● email
● promotion campaign
● URL

[Figure: a video in the video repository with its title, description, tags, and comments.]


Introduction

Textual features have a major impact on how users find a video, and consequently on the advertisement-generated revenues.

Textual features: title, description, tags, comments.

Focus on automated tag selection to boost video popularity.

Q1. What are the challenges in building a ground truth?

Q2. Can the textual features of existing videos be further optimized to attract traffic?


Context

Assumption: annotating a video with the terms users would use to search for it increases the chance that users view the video.

Recommendation Pipeline

Data Sources

● Experts
● Peers

Recommenders

● Frequency & Random Walk

● Goal is not to design a novel and more efficient tag recommendation algorithm.

● Goal is to understand whether videos currently published on YouTube have their tags optimized to attract search traffic.
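To make the simpler of the two recommenders concrete, here is a minimal sketch of a frequency-based scorer. The slides do not define the scoring rule, so this assumes that a candidate keyword's score is the number of data-source documents that mention it. The function name `frequency_scores`, the whitespace tokenizer, and the sample documents are all hypothetical.

```python
from collections import Counter


def frequency_scores(documents):
    """Score each term by the number of documents containing it.

    Assumed rule for illustration only; the talk does not specify it.
    """
    counts = Counter()
    for text in documents:
        # Count a term once per document so long documents do not dominate.
        counts.update(set(text.lower().split()))
    return counts


# Hypothetical data-source texts about one movie trailer.
docs = [
    "Batman returns trailer",
    "official batman trailer",
    "batman review",
]
scores = frequency_scores(docs)
print(scores.most_common(2))
```

Counting document frequency rather than raw term frequency is a design choice made here for simplicity; either would fit the "Frequency" label.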

Optimization

Let v be a video and C = < k_i >, i = 1, …, n, be a list of candidate keywords:

\[
\text{maximize} \quad \sum_{i=1}^{n} f(k_i, v)\, x_i
\qquad \text{subject to} \quad \sum_{i=1}^{n} |k_i|\, x_i \le B
\]

where f(k_i, v) is the scoring function provided by the recommender for k_i with respect to v, |k_i| is the length of k_i in bytes, B is the budget, and x_i ∈ {0,1} is an indicator variable.
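Read this way, the selection step is a 0/1 knapsack problem: each candidate keyword has a value (its recommender score) and a weight (its length in bytes), and the total weight must fit the budget. The sketch below solves it with the standard dynamic program. It assumes that the objective is the plain sum of the scores of selected keywords and that only keyword bytes count against the budget (no separators). The name `select_tags` and the sample scores are hypothetical.

```python
def select_tags(candidates, budget):
    """Pick keywords maximizing total score within a byte budget.

    candidates: list of (keyword, score) pairs.
    budget: maximum total keyword length in bytes (UTF-8 assumed).
    Returns (selected keywords, total score).
    """
    weights = [len(keyword.encode("utf-8")) for keyword, _ in candidates]
    # best[b] holds (best total score, chosen indices) using at most b bytes.
    best = [(0.0, ())] * (budget + 1)
    for i, (_, score) in enumerate(candidates):
        w = weights[i]
        # Iterate budgets downward so each keyword is used at most once.
        for b in range(budget, w - 1, -1):
            total = best[b - w][0] + score
            if total > best[b][0]:
                best[b] = (total, best[b - w][1] + (i,))
    total, chosen = best[budget]
    return [candidates[i][0] for i in chosen], total


# Hypothetical candidates with made-up scores.
candidates = [
    ("trailer", 0.9),
    ("batman", 0.8),
    ("official movie trailer", 1.0),
    ("hd", 0.3),
]
selected, total = select_tags(candidates, budget=15)
print(selected, round(total, 2))
```

The dynamic program runs in time proportional to the number of candidates times the budget, which is cheap for budgets on the order of 100 bytes.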

Ground Truth

Q1. What are the challenges in building a ground truth?

The ideal ground truth would come from an experiment that varies a video's tagset and captures the impact on the number of views attracted.

Challenges of this approach:

➔ running it requires the publishing rights for the videos;

➔ it is a time-consuming experiment;

➔ unpopular videos (even with an optimized tagset) may bias the results.

Building the Ground Truth

Our task - Watch a movie trailer video and answer the following question:

What query terms would you use to search for this video?

For each video, associate a minimum of 3 and a maximum of 10 keywords.

Amazon Mechanical Turk (AMT)

AMT Task Properties

Property        Value
# Videos        382
# Turkers       33
# Evaluations   1,146
$ / Task        $0.30
$ / Hour        $3.00
Total Cost      $345.00

Quality Control - each evaluation was inspected before approval.

Ground Truth Characterization

58% of turkers evaluated more than 5 videos.

Observation: one turker evaluated 333 videos.


96% of the videos have 10 or more different keywords associated.

Observation: 82% of the evaluations provided more than the required minimum of 3 keywords.


32% of videos have keywords summing up to 100 characters.

Observation: These values guided the budget parameter in our experiments.

Experimental Evaluation

Q2. Can the current YouTube video tags be improved?

By improvement we mean recommending tags that better match the ground truth.

Recommender algorithms: Frequency and Random Walk.

Success Metric: F3-measure.
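The F3-measure is the F-beta score with beta = 3, which weights recall more heavily than precision: F = (1 + beta^2) * P * R / (beta^2 * P + R). The sketch below computes it for one video. How a recommended tag is matched against a ground-truth keyword is not stated in the slides, so exact case-insensitive equality is assumed here. The function name `f_beta` and the sample tagsets are hypothetical.

```python
def f_beta(recommended, ground_truth, beta=3.0):
    """F-beta score of a recommended tagset against ground-truth keywords.

    Matching is exact and case-insensitive (an assumption for illustration).
    """
    rec = {t.lower() for t in recommended}
    truth = {t.lower() for t in ground_truth}
    hits = len(rec & truth)
    if hits == 0:
        return 0.0
    precision = hits / len(rec)
    recall = hits / len(truth)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical tagsets for one trailer.
recommended = ["Batman", "trailer", "hd", "2012"]
ground_truth = ["batman", "trailer", "dark knight"]
print(round(f_beta(recommended, ground_truth), 4))
```

With precision 2/4 and recall 2/3 in this example, the score sits much closer to the recall than to the precision, which is the intended effect of choosing beta = 3.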

Performance of the original YouTube tagset

Performance of the recommended tagset produced by all data sources combined

Experimental Results

Kolmogorov-Smirnov test of significance:

Frequency: D- = 0.44, p-value = 3.9 x 10^-16

Random Walk: D- = 0.43, p-value = 5.5 x 10^-15

The performance of all data sources combined is higher than that achieved by the YouTube tags.
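For readers unfamiliar with the test, a one-sided two-sample Kolmogorov-Smirnov statistic is the largest gap, in one direction, between the empirical distribution functions of the two samples. The sketch below computes that gap in pure Python. Which sign convention the reported D- follows is not stated, so the direction used here (first sample's ECDF minus the second's) is an assumption. The p-value is not computed, and the sample scores are made up.

```python
from bisect import bisect_right


def ks_one_sided(a, b):
    """Return max over x of ECDF_a(x) - ECDF_b(x).

    A large value means sample a tends to take smaller values than sample b.
    """
    sa, sb = sorted(a), sorted(b)
    # The maximum gap is attained at one of the observed data points.
    return max(
        bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb)
        for x in sa + sb
    )


# Hypothetical per-video F3 scores.
original_tags = [0.1, 0.2, 0.3, 0.4]
recommended_tags = [0.3, 0.4, 0.5, 0.6]
print(ks_one_sided(original_tags, recommended_tags))
```

In this toy data the original scores are shifted toward lower values, so the statistic is large in this direction and zero when the arguments are swapped.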

Conclusions

➔ Tags currently assigned to YouTube videos can be improved by automated methods to attract more search traffic;

➔ The results show that even simple recommenders with a combination of data sources can improve tags.

Future Work

➔ Compare the performance of data sources individually and grouped by type (peer vs. expert, structured vs. unstructured);

➔ Investigate if the number of contributors in a peer-produced data source affects the value of tags;

➔ Explore other classes of videos.

Thank you!
