Media Advertising Retrieval
Mihăiță Barbu, Oana Baron, Vlad Dogaru, Cătălin Moraru, Roxana Murăruș
The Advertising Market
● Traditional advertising:
○ Few, expensive opportunities
○ Targeting en masse, by immediate context only
○ Difficult to measure effectiveness
● Internet advertising:
○ Billions of opportunities daily
○ Open to personalization via the rich context of each impression
○ Effectiveness is measurable: click-through rates as % of impressions, and conversions as % of clicks
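The two measures named above can be written out as a minimal sketch. The campaign numbers are hypothetical and serve only as an illustration.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks as a fraction of impressions (0.0 when nothing was shown)."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(conversions: int, clicks: int) -> float:
    """Conversions as a fraction of clicks (0.0 when nobody clicked)."""
    return conversions / clicks if clicks else 0.0


# Hypothetical campaign numbers, not taken from the slides.
impressions, clicks, conversions = 200_000, 3_000, 90
ctr = click_through_rate(clicks, impressions)
cvr = conversion_rate(conversions, clicks)
print(f"CTR = {ctr:.2%}, conversion rate = {cvr:.2%}")
```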
The Advertising Market (2)
● "The internet has become as fundamental as television to advertisers"
Search Advertising Business Models CPI – Cost per Impression
● Also known as CPM, cost per 1000 (mille) ad impressions
● Advertiser is charged whenever ad is shown
● Very popular in the mid-90s (all those banners on top of search result pages)
○ Still used, mainly for building brand recognition
○ However, Display Advertising is much more prevalent today on general Web pages, with targeted audience segments
● Very simple mechanism
○ But requires trust between search engine and advertiser
● Risk is all on the advertiser
Search Advertising Business Models CPC – Cost per Click
● Advertisers bid for keywords that will trigger the ad’s placement
● The main business model of search engines since the turn of the century
● Still a relatively simple mechanism
● Shared risk/win-win situation – the success of an ad campaign is in the best interest of all parties
● Trackable by all parties, which seems to overcome the trust issue
○ However, invalid clicks are an issue
● Click fraud - clicks made maliciously
Search Advertising Business Models CPA – Cost per Action
● Also known as cost per conversion
● Search engine/publisher gets a fee only if the user, upon clicking an ad, completes some transaction at the advertiser’s site
○ Doesn’t necessarily mean that the user buys something
● Amazon was among the pioneers of this model, paying referral fees when revenue was generated
● Risk is all on the publisher/search engine
● Requires the advertiser to share some business data with search engine
○ But more difficult to spam
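To make the three charging models concrete, here is a small sketch of what an advertiser would pay for the same campaign under each. All prices and campaign counts are made-up assumptions.

```python
def cpm_cost(impressions: int, cpm: float) -> float:
    """CPI/CPM: charged per 1000 impressions shown."""
    return impressions / 1000 * cpm


def cpc_cost(clicks: int, cpc: float) -> float:
    """CPC: charged per click."""
    return clicks * cpc


def cpa_cost(actions: int, cpa: float) -> float:
    """CPA: charged per completed action (conversion)."""
    return actions * cpa


# Hypothetical campaign and prices, for illustration only.
impressions, clicks, actions = 200_000, 3_000, 90
costs = {
    "CPM": cpm_cost(impressions, cpm=2.50),
    "CPC": cpc_cost(clicks, cpc=0.20),
    "CPA": cpa_cost(actions, cpa=5.00),
}
for model, cost in costs.items():
    print(f"{model}: {cost:.2f}")
```

The charge depends on a different event in each model, which is why the risk moves from the advertiser (CPM) to the publisher (CPA).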
Search Advertising Business Models
Behavioral Advertising
● The way in which searchers examine a SERP is influenced by the position and relevance of the results.
● Searchers have a strong bias towards result entries at higher positions on the SERP.
● "Golden Triangle" ("F-shaped pattern")
○ a gaze heat map depicting the distribution of visual attention
○ describes where people allocate their visual attention on SERPs
Behavioral Advertising (2)
● Types of searchers:
○ Exhaustive searchers - carefully look at all entries to find which one fits their query best
○ Economic users - usually just click the most noticeable relevant entry
● Longer snippets influence search performance depending on the task type:
○ informational tasks - better search performance
○ navigational tasks - degraded performance
Eye-tracking study of ad quality
● Focus on ads and related searches
● Tested with different variations of:
○ type of task (informational or navigational)
○ quality of the ads (relevant or irrelevant to the query)
○ sequence in which ads of different quality were presented
● Clicks and attention:
○ the top results - more clicks than attention
○ the top and right ads - more attention than clicks
○ the top ads - get the same amount of attention as the organic results, but far fewer clicks.
Effects of Task Type
● Time spent on the SERP depends on the type of task
○ Users spend more time when presented with informational tasks than with navigational tasks.
○ For informational tasks
■ Users spend most of their time on the upper search box and the organic search results.
■ Strikingly, the extra time they had was not spent on the top ads.
■ Attention was not evenly distributed; most of it went to the top two organic search results.
○ Users spend twice as much time on the search box for informational tasks as for navigational tasks.
Effects of Ad Quality
● Good quality ads
○ users spend twice as much time on the top ads
○ less attention was given to the organic search results
○ these ads had a direct effect on the performance and attention of the users towards the SERP.
● Right rail ads seem to be ignored for the most part.
● The order in which users see pages with good ads or bad ads has a strong effect on user behavior.
○ When good and bad ads are displayed in random order, the ads tend to be ignored even more, regardless of their quality.
○ For consistently showing good ads, the ads get more attention and clicks.
Content Analysis
Image Advertising
● The attempt to create a favorable mental picture of a product or firm in the minds of consumers.
● This image aims to associate the advertised product and/or firm with certain lifestyles or values.
● Three basic functions:
○ increase consumer awareness
○ convert the awareness into familiarity
○ use the familiarity to influence consumer buying behavior.
Why is Image Retrieval Hard?
● What is the topic of this image?
● What are the right keywords to index this image?
● What words would you use to retrieve this image?
=> The Semantic Gap
● A picture is worth a thousand words
● The meaning of an image is highly individual and subjective
Image retrieval frameworks
● Text-based Image Retrieval
○ images are manually annotated by human labelers and then searched using the annotated keywords.
○ disadvantage: manually annotating a large quantity of images is too tedious and time-consuming.
● Content-based Image Retrieval (CBIR)
○ images are indexed by their visual content, such as color, texture, and shapes.
○ disadvantage: the effectiveness is still limited by the semantic gap between low-level image features and high-level semantic concepts
Bridging the semantic gap in CBIR
1. Semantic similarity based on text models
● A document = the textual information around the image
● Term-level model
○ based on tf-idf & cosine similarity
● Topic-level model
○ Latent Dirichlet Allocation & Kullback-Leibler divergence
○ LDA is a generative probabilistic model of a corpus
○ documents are random mixtures over latent topics
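The two text models use different similarity measures, sketched here in plain Python. The toy documents are invented. The topic mixtures `p` and `q` are assumed to come from an already fitted LDA model, which this sketch does not train. The tf-idf variant (relative tf times log(N/df)) is one common choice and not necessarily the one used by the authors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Term-level model: one sparse tf-idf vector (dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def kl_divergence(p, q, eps=1e-12):
    """Topic-level model: KL(p || q) between two topic distributions."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Toy "surrounding text" documents (hypothetical).
docs = [["red", "sports", "car"], ["fast", "sports", "car"], ["cheap", "car", "insurance"]]
v = tfidf_vectors(docs)
sim_01 = cosine_similarity(v[0], v[1])
sim_02 = cosine_similarity(v[0], v[2])

# Assumed topic mixtures for two documents, as an LDA model might produce.
p, q = [0.7, 0.2, 0.1], [0.1, 0.2, 0.7]
kl_pq = kl_divergence(p, q)
```

Note that KL divergence is not symmetric, so a symmetric variant may be preferable when a distance is needed.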
2. Model placement & Semantic maps
● Based on K-means clustering
● For each cluster, take the corresponding
○ sub image set
○ sub textual document set
● For each sub document set D with N documents
○ construct a local semantic map S defined as:
■ a square matrix with N x N elements
■ cell (i,j) = semantic distance between docs i, j in D, calculated using the specific text model
■ cells (i,i) set to zero
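A local semantic map as described above can be sketched as follows. `jaccard_distance` is only a stand-in for the distance of the chosen text model (term-level or topic-level), and the map is assumed to be symmetric, which the slides do not state.

```python
def semantic_map(docs, distance):
    """Build the N x N map S with S[i][j] = distance(doc i, doc j), S[i][i] = 0."""
    n = len(docs)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: the distance is symmetric, so fill both cells at once.
            S[i][j] = S[j][i] = distance(docs[i], docs[j])
    return S


def jaccard_distance(a, b):
    """Stand-in semantic distance: 1 - |A ∩ B| / |A ∪ B| over the word sets."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


# Hypothetical sub document set D of one cluster.
D = [["red", "sports", "car"], ["fast", "sports", "car"], ["cheap", "car", "insurance"]]
S = semantic_map(D, jaccard_distance)
```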
3. Ranking-based distance metric learning
● probabilistic model for the IR problem
○ (any ranking list can be the final list with a certain probability)
● for each local set, learn a Mahalanobis (quadratic) distance metric M = AᵀA
● learn a linear transformation A of the input space so as to get good retrieval performance in the transformed space
● design a ranking-based cost function to optimize leave-one-out retrieval
● base concepts for the model:
○ visual distance distribution
○ semantic distance distribution
○ cost function has scale-invariant properties
● outputs: local models with distance metrics
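With M = AᵀA, the Mahalanobis distance between x and y is the squared Euclidean distance between Ax and Ay, which the sketch below makes explicit. The matrix `A` is hand-picked for illustration rather than learned, and the ranking-based cost function is not implemented.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def mahalanobis_sq(A, x, y):
    """(x - y)^T A^T A (x - y), computed as the squared norm of A(x - y)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return sum(v * v for v in matvec(A, diff))


def rank_by_distance(A, query, items):
    """Indices of items sorted from nearest to farthest under the metric."""
    return sorted(range(len(items)), key=lambda i: mahalanobis_sq(A, query, items[i]))


identity = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 0.1]]  # hypothetical transform that down-weights feature 2

query = [0.0, 0.0]
items = [[1.0, 0.0], [0.0, 3.0]]
euclidean_order = rank_by_distance(identity, query, items)
learned_order = rank_by_distance(A, query, items)
```

In this example the transform reverses the retrieval order, which is the effect a learned metric relies on.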
4. Model fusion
● takes the local models and joins them into a final model
● the final model is represented by a ranking function f
● f outputs the ranking score for any image given an unseen query image
Application to Search-based Image Annotation
● How SBIA differs from traditional annotation:
○ web-scale training data
○ unlimited vocabulary for annotation
○ real-time speed
● Steps
○ search for visually similar images
○ mine text annotations from the retrieved images
● SBIA with ranking-based distance metric learning (SBIA-RDML):
○ term model - the term score depends on term co-occurrence in the documents of the retrieved images
○ topic model (LDA) - the term score depends on the probability of latent topics given the retrieved image docs, the probability of the term within the topics, and the retrieved image ranking score
Bridging the Semantic Gap - Results
● 2 million images for training; annotated dataset for testing
● comparing RDML (term & topic) with CBIR without distance metric learning and with CBIR using the LNCA algorithm
● the LDA semantic measure based on the topic-level model performs better than the cosine measure because:
○ LDA doesn't require identical terms between all documents
○ cosine similarity doesn't distinguish polysemy & synonymy
● the topic-level text model in annotation mining is very useful when semantic maps are built upon the topic-level model (best results)
Visual Contextual Advertising
● Advertisements are recommended entirely based on the visual context of an image
● Recommendations for images with little or no text
● Exploit the annotated image data from social Web sites such as Flickr to link the visual feature space and the word space
Visual Contextual Advertising (2)
● Model the visual contextual advertising problem with a Markov chain which utilizes annotated images to transform images from the image feature space to the word space.
● With the representations of images in word space, a language model for information retrieval is then applied to find the most relevant advertisements.
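The two stages can be illustrated with a heavily simplified sketch. It uses a single transition step (query image to annotated images to words) and a smoothed unigram language model per ad. The similarity values, tags, ads and the smoothing parameter `mu` are all invented, and the actual model in the cited work is likely richer than this.

```python
import math
from collections import Counter


def image_to_words(similarities, annotations):
    """Mix the tag distributions of annotated images, weighted by visual similarity."""
    total = sum(similarities.values())
    word_probs = Counter()
    for img, sim in similarities.items():
        tags = annotations[img]
        for tag in tags:
            word_probs[tag] += (sim / total) / len(tags)
    return dict(word_probs)


def ad_score(word_probs, ad_text, vocab_size, mu=1.0):
    """Expected log-likelihood of the image's words under a smoothed ad language model."""
    counts = Counter(ad_text)
    n = len(ad_text)
    return sum(
        p * math.log((counts[w] + mu) / (n + mu * vocab_size))
        for w, p in word_probs.items()
    )


# Hypothetical visual similarities between the query image and annotated images.
similarities = {"img1": 0.8, "img2": 0.2}
annotations = {"img1": ["beach", "sea"], "img2": ["city"]}
ads = {"travel": ["beach", "holiday", "sea"], "bank": ["loan", "city", "bank"]}

word_probs = image_to_words(similarities, annotations)
vocab = {w for text in ads.values() for w in text} | set(word_probs)
scores = {name: ad_score(word_probs, text, len(vocab)) for name, text in ads.items()}
best_ad = max(scores, key=scores.get)
```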
Video-based advertising
Adwords on Video Scripts
● Placement of relevant advertisements based on the content displayed to the user.
● When the content is video, automatic keyword extraction becomes a problem.
● Speech recognition is difficult, so we use written scripts that already exist for the videos.
● Bad choice in some cases, as video conveys more than just a verbal message:
○ Someone drives a sports car while speaking about insurance – do you serve an insurance ad or a sports car ad?
Adwords on Video Scripts (2)
● Consider each (wordwise) unigram and bigram in the script as adword candidate.
● Use maximum entropy models as the learning algorithm.
● Additionally, use situation-aware advertising:
○ Define a limited set of situations (scene types) in a video.
○ Categorize products based on shopping websites and manual intervention.
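Candidate generation from a script can be sketched as below. The tokenization rule is an assumption, and the maximum entropy classifier that would score the candidates is not shown.

```python
import re


def adword_candidates(script: str):
    """Return every word-level unigram and bigram of the script as a candidate."""
    tokens = re.findall(r"[a-z0-9']+", script.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


# Hypothetical line from a video script.
candidates = adword_candidates("He drives a sports car.")
print(candidates)
```

This simple version also forms bigrams across sentence boundaries; a real pipeline might split sentences first.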
Image Advertising
Approaches:
● Directly matching ad to query image – common words for description
● Ontology design patterns – available text?
● Based on large-scale video collection
Types of video frames:
● Story frames – main concepts (human knowledge)
● Ads frames
Image Advertising (cont.)
Process:
● Input: keyframes of video, image(s) query, text (tags, OCR, surrounding texts)
● Analyze similarities: visual and textual → find ads
● Expand search results – analyze the rest of the video – noise, scattered topics
● Ads clustering and ranking
● Multi-Modal Dirichlet Process Mixture Sets model – learns key topics from results, ranks ads with learnt topics
● Challenges: latent topics, determining the number of topics, topic mining and ads ranking, topic discovery
Video Advertising
What do users want?
● less intrusiveness – no interruption of the cognitive process (e.g. a commercial during an exciting scene)
● relevant ads
What can we do?
● find the right timing
● show contextually relevant ads
Detect the perfect timing
● Detect insertion points (logical shot boundary):
● Detect discontinuity – scene fragmentation, merge similar shots gradually
● Detect attractiveness – estimate attention or use a model (values depend on neighbors' values)
● Analyze audio-visual properties – spatial color, texture histograms
● Detect textual, visual-aural relevance
VideoSense – Ad ranking
Database of ads – which one to insert?
● Use textual information describing the video:
– direct text: title, tags, [query], [captions], expanded tags
– indirect text: categorization
● Representation for document D: (k1,...,kn; c1,...,cn)
ki – keyword of direct text, associated with a weight wi
ci – category associated with a probability pi
Dx, Dy – textual documents
R(Dx, Dy) – textual relevance
Vector Model
D represented as vector of weights:
● w(D) – weighting vector of index keywords
● w(D) – could be tf-idf, but not such a good idea (small tf, unstable idf)
● Solution: use just term frequency
Probabilistic Model
D represented as a vector of probabilities
● Tree of predefined categories
d(ci) – depth of category ci
l(ci, cj) – depth of first common ancestor
R depends on l(ci, cj) and a predefined parameter
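The quantities d(ci) and l(ci, cj) can be computed on a category tree as sketched below. The tree in `PARENT` is hypothetical, the root is assumed to have depth 0, and "first common ancestor" is read as the deepest shared ancestor. The relevance R itself is not computed, since its exact form is not given here.

```python
# Hypothetical category tree, stored as child -> parent (the root has no parent).
PARENT = {
    "root": None,
    "vehicles": "root",
    "cars": "vehicles",
    "sports cars": "cars",
    "car insurance": "cars",
    "food": "root",
    "pizza": "food",
}


def path_to_root(category):
    """Categories from the given one up to the root, inclusive."""
    path = []
    while category is not None:
        path.append(category)
        category = PARENT[category]
    return path


def depth(category):
    """d(ci): number of edges between the category and the root."""
    return len(path_to_root(category)) - 1


def common_ancestor_depth(ci, cj):
    """l(ci, cj): depth of the deepest category that is an ancestor of both."""
    ancestors = set(path_to_root(ci))
    for node in path_to_root(cj):  # walks upwards, so the first hit is the deepest
        if node in ancestors:
            return depth(node)
    raise ValueError("categories do not share a root")


print(depth("sports cars"), common_ancestor_depth("sports cars", "car insurance"))
```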
Dynamic Ad Allocation
PageSense
● Style-wise web page advertising platform
● Automatic detection of style and position
● Rank the ads by semantic relevance and web page style
● Embed relevant ads in non-intrusive areas
Detecting blank area and embedding relevant style-consistent ads
PageSense - System overview
Three major components:
○ Ad matching engine
○ Ad position detection engine
○ Ad delivery engine
PageSense - Performance Evaluation
Evaluation of ad satisfaction
ImageSense
● contextual in-image advertising
● should be locally relevant
○ surrounding text
○ visual content
ImageSense - Segmentation
Vision-based structure of a sample Web page
ImageSense - Textual relevance
ImageSense - Content relevance
An example of saliency map and weight map of an image. (a) original image (b) saliency map with weight grids overlaid (c) weight map.
Multimedia Answering
● Targets Q&A communities (Yahoo Answers, Stack Overflow, Quora etc.)
● Some answers are more easily expressed with images or video than text.
● Paper proposes automatic answering of questions by text, multimedia, or a combination thereof.
● Approaches already exist, but in narrow domains (e.g. cooking).
● Two-step approach:
○ First, answer a question automatically with text (choose from previous answers).
○ Second, from the chosen answer, choose multimedia content that answers the question.
Answer Medium Selection
● Some questions can be answered text-only: "When did Kim Jong-il die?"
● Split answers into four categories:
○ Text only
○ Text and image
○ Text and video
○ Text, image and video
● Certain keywords induce text-only answers: be, can, will, have, when, how+adj/adv.
● Opposite end of the spectrum: people or events deemed important: president, king, singer, battle, war
Multimedia Query Generation and Selection
● Can be generated from either the original question, or the textual answer which was chosen in the first step.
○ Q: "What does Disneyland look like?"
○ Q: "What is the capital of Romania?" A: "Bucharest."
● The proposed method chooses between three queries:
○ Modified initial question
○ Textual answer keywords
○ Combination of the above two
● Existing search engines store keywords, surrounding text, title and alt text.
● That may not be useful for generated queries.
● A query-dependent re-ranking algorithm is employed.
Trends
Microcomputations as Payment
● Online advertisement revenue is declining.
● Users do not trust online ads.
● Proposed framework lets users run microcomputations on behalf of the website, in exchange for viewing content.
● The company behind the website can then exchange the results they got from users for currency.
Actors in the Micropayment Model
● Service Provider (e.g. newspaper): offers content and services
● User: content consumer, has to agree to run microcomputations
● Intermediary: is responsible for distributing microcomputations to the User, also deals with the Service Provider. Additionally, the intermediary is responsible for breaking larger tasks into microcomputations and verifying and aggregating results.
● Customer (SETI@home, research facilities): gives tasks to the Intermediary and receives the batched results. As the name says, the customer also provides remuneration, negotiated with the Intermediary and Service Provider.
Transforming Distributed Tasks into Verifiable Computations
● Intermediary uses selective redundancy to verify the results:
● N input values are chosen randomly (ringers) and results for them are precalculated.
● User results are only accepted if all ringers embedded in the input data result in the correct values.
● Statistically, it is very hard to fool the Intermediary, because the ringers are unknown to the User.
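The ringer check can be sketched as follows. The `work` function stands in for an arbitrary microcomputation, and the batch size, number of ringers and random seed are hypothetical.

```python
import random


def work(x: int) -> int:
    """Stand-in microcomputation; always returns a non-negative value."""
    return (x * x) % 1_000_003


def choose_ringers(inputs, n_ringers, rng):
    """Intermediary: pick N inputs at random and precalculate their results."""
    return {x: work(x) for x in rng.sample(inputs, n_ringers)}


def accept(results, ringers):
    """Accept a User's results only if every ringer has the correct value."""
    return all(results.get(x) == expected for x, expected in ringers.items())


rng = random.Random(42)          # fixed seed only to make the sketch repeatable
inputs = list(range(100))        # batch sent to the User, ringers included
ringers = choose_ringers(inputs, n_ringers=5, rng=rng)

honest_results = {x: work(x) for x in inputs}
lazy_results = {x: -1 for x in inputs}   # a User who skips the computation

print(accept(honest_results, ringers), accept(lazy_results, ringers))
```

A User who fakes only a fraction of the results is caught whenever at least one ringer falls in the faked part, so the chance of slipping through drops quickly as the number of ringers grows.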
Challenges and Solutions
● Users redistributing data free of charge: high overhead, small gains.
● MITM attacks to get other Users to perform our work: HTTPS can be employed once a MITM attack is detected.
● Malicious Intermediaries: no solution provided, as problem is subtly ignored.