Towards Boosting Video Popularity via Tag Selection
Elizeu Santos-Neto, Tatiana Pontes, Jussara Almeida, Matei Ripeanu
University of British Columbia - Vancouver, Canada
Universidade Federal de Minas Gerais - Belo Horizonte, Brazil
{elizeus,matei}@ece.ubc.ca
{tpontes,jussara}@dcc.ufmg.br
Glasgow, Scotland - April 1st, 2014
SoMuS Workshop
Introduction
100 hours of video uploaded per minute
Specialized content management companies are responsible for publishing, monitoring, and promoting the owner's content.
Ad revenues are shared between the manager and the video's owner.
As revenue is directly related to the number of ad prints, this motivates managers to boost content popularity.
Introduction
Viewers reach a content item via different leads: keyword/tag-based search, websites, promotion campaigns, and direct URLs.
[Figure: a video repository page with its title, description, tags, and comments.]
Introduction
Textual features have a major impact on how users find a video, and consequently on
the advertisement-generated revenues.
Textual features: title, description, tags, and comments.
Focus on automated tag selection to boost video popularity.
Q1. What are the challenges in building a ground truth?
Q2. Can the textual features of existing videos be further optimized to attract traffic?
Context
Assumption: annotating a video with the terms users would use to search for it increases the chance that users view the video.
Recommendation Pipeline
Data Sources
Experts
Peers
Recommenders
● Frequency & Random Walk
● Goal is not to design a novel and more efficient tag recommendation algorithm.
● Goal is to understand whether videos currently published on YouTube have their tags optimized to attract search traffic.
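The slides do not spell out how the Frequency recommender scores candidates. One plausible reading, shown here purely as an illustrative sketch, is to count how often each term appears across the documents gathered from the data sources. The function name `frequency_scores` and the input layout (one list of terms per source document) are assumptions, not the authors' implementation.

```python
from collections import Counter

def frequency_scores(source_documents):
    """Rank candidate keywords by how often they occur across source documents.

    source_documents: iterable of term lists, one per data-source document
    (hypothetical layout chosen for this sketch).
    Returns a list of (term, count) pairs, most frequent first.
    """
    counts = Counter()
    for terms in source_documents:
        # Case-fold so that "Trailer" and "trailer" count as the same keyword.
        counts.update(term.lower() for term in terms)
    return counts.most_common()

# Usage: three hypothetical documents describing the same video.
docs = [["Trailer", "HD"], ["trailer", "movie"], ["trailer", "hd"]]
ranking = frequency_scores(docs)
```

The resulting counts could serve as the per-keyword scores fed to the tag selection step.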
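Optimization
Let v be a video and C = < ki >, i = 1, …, n, be a list of candidate keywords:

\[
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} f(k_i, v)\, x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} |k_i|\, x_i \le B, \qquad x_i \in \{0, 1\}
\end{aligned}
\]

where f(ki, v) is the scoring function provided by the recommender for ki with respect to v, |ki| is the length of ki in bytes, B is the budget, and xi ∈ {0,1} is an indicator variable that marks whether ki is selected.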
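The selection problem above has the shape of a 0/1 knapsack: each keyword has a value (its recommender score) and a weight (its length in bytes), and the budget caps the total weight. The sketch below solves it with the standard dynamic program over byte budgets. The function name `select_tags`, the UTF-8 byte count, and the choice to ignore separators between tags are assumptions made for illustration; the slides do not say which solver the authors used.

```python
def select_tags(candidates, budget):
    """Pick the keyword subset with maximum total score within a byte budget.

    candidates: list of (keyword, score) pairs from a recommender.
    budget: maximum total keyword length in bytes (separators are ignored,
            which is an assumption of this sketch).
    Returns (best_total_score, chosen_keywords).
    """
    # best[b] holds the best (score, keywords) achievable with at most b bytes.
    best = [(0.0, [])] * (budget + 1)
    for keyword, score in candidates:
        size = len(keyword.encode("utf-8"))
        # Walk budgets downward so each keyword is selected at most once.
        for b in range(budget, size - 1, -1):
            with_keyword = best[b - size][0] + score
            if with_keyword > best[b][0]:
                best[b] = (with_keyword, best[b - size][1] + [keyword])
    return best[budget]

# Usage with hypothetical scores and a 14-byte budget.
candidates = [("trailer", 3.0), ("official trailer", 4.0), ("hd", 1.0), ("movie", 2.5)]
total, chosen = select_tags(candidates, 14)
```

The dynamic program runs in time proportional to the number of candidates times the budget, which is cheap for budgets on the order of a few hundred bytes.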
Ground Truth
Q1. What are the challenges in building a ground truth?
The ideal ground truth would come from an experiment that varies each video's tagset and captures the impact on the number of views attracted.
Challenges of this approach:
➔ running such an experiment requires publishing rights for the videos;
➔ it is a time-consuming experiment;
➔ unpopular videos (even with an optimized tagset) may bias the results.
Building the Ground Truth
Our task - Watch a movie trailer video and answer the following question:
What query terms would you use to search for this video?
For each video, associate a minimum of 3 and a maximum of 10 keywords.
Amazon Mechanical Turk (AMT)
AMT Task Properties
Property        Value
# Videos        382
# Turkers       33
# Evaluations   1,146
$ / Task        $0.30
$ / Hour        $3.00
Total Cost      $345.00
Quality Control - each evaluation was inspected before approval.
Ground Truth Characterization
58% of turkers evaluated more than 5 videos.
Observation: one turker evaluated 333 videos.
Ground Truth Characterization
96% of the videos have 10 or more different keywords associated.
Observation: 82% of the evaluations provided more than the required minimum of 3 keywords.
Ground Truth Characterization
32% of videos have keywords summing up to 100 characters.
Observation: These values guided the budget parameter in our experiments.
Experimental Evaluation
Q2. Can the current YouTube video tags be improved?
By improvement we mean recommending tags that better match the ground truth.
Recommender algorithms: Frequency and Random Walk.
Success Metric: F3-measure.
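For reference, the F3-measure is the F-beta score with beta = 3, which weights recall more heavily than precision. A minimal sketch of the standard formula applied to a recommended tagset and a ground-truth keyword set follows; the function name `f_beta` and the exact-match comparison of keywords are assumptions of this example, since the slides do not describe how tags were matched against the ground truth.

```python
def f_beta(recommended, ground_truth, beta=3.0):
    """F-beta score of a recommended tagset against ground-truth keywords.

    Uses exact set matching between keywords (an assumption of this sketch).
    """
    recommended, ground_truth = set(recommended), set(ground_truth)
    hits = len(recommended & ground_truth)
    if hits == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = hits / len(recommended)
    recall = hits / len(ground_truth)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Usage: 2 of 4 recommended tags appear among 3 ground-truth keywords.
score = f_beta({"a", "b", "c", "d"}, {"a", "b", "e"})
```

With beta = 3, a tagset that covers most ground-truth keywords is rewarded even if it also contains some terms the evaluators did not list.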
Compared: performance of the original YouTube tagset vs. performance of the recommended tagset produced by all data sources combined.
Experimental Results
Kolmogorov-Smirnov test of significance:
Frequency: D- = 0.44, p-value = 3.9 x 10^-16
Random Walk: D- = 0.43, p-value = 5.5 x 10^-15
The performance of all data sources combined is higher than that achieved by the YouTube tags.
Conclusions
➔ Tags currently assigned to YouTube videos can be improved by automated methods to attract more search traffic;
➔ The results show that even simple recommenders with a combination of data sources can improve tags.
Future Work
➔ Compare the performance of data sources individually and grouped by type (peer vs. expert, structured vs. unstructured);
➔ Investigate if the number of contributors in a peer-produced data source affects the value of tags;
➔ Explore other classes of videos.
Thank you!