Thanks To: Dr. Andrei Broder Dr. Evgeniy Gabrilovich Dr. Vanja
Josifovski
Slide 5
2012 Global Ad Spend $530 Billion Half the money I spend on
advertising is wasted; the trouble is I don't know which half. --
John Wanamaker (attributed) [1838-1922]
Slide 6
Yahoo! Research 2011 6 US online advertising spending (source:
eMarketer.com, November 2010) YearOnlineOnline % of total media
2009$22.7B13.9% 2010$25.8B15.3% 2011$28.5B16.7% 2012$32.6B18.3%
2013$36.0B19.8% 2014$40.5B21.5%
Slide 7
Yahoo! Research 2011 7 measurability and reach No more coupon
codes Flexible ad targeting + conversion tracking Experimentation
rules ! What changed in 100 years?
Slide 8
Yahoo! Research 2011 8 What is Computational Advertising? A new
scientific sub-discipline that provides the foundation for building
online ad retrieval platforms Find the optimal ad for a given user
in a specific context Information retrieval Statistical modeling
& machine learning Large-scale text analysis Microeconomics
Computational Advertising
Slide 9
Ad Industry Structure Advertiser Audience Media $$$$ $$$$$
$$
Slide 10
The Great Divide BrandDirect Response Emotions Indirect
benefits Banners, TV, stadiums Transactions Gross profits Search,
coupons, 1-800, radio, mail
Slide 11
Slide 12
Slide 13
Yahoo! Research 2011 13 Anatomy of an ad
http://research.yahoo.com/sigir10_compadv Display URL Landing URL
Bid phrases: {SIGIR 2010, computational advertising, Evgeniy
Gabrilovich,...} Bid: $0.10 Landing page Creative Title Tutorial at
SIGIR 2010 Information Retrieval Challenges in Computational
Advertising research.yahoo.com/tutorials/ sigir
Yahoo! Research 2011 28 Our approach to sponsored search Query
Front end Candidate ads Revenue reordering Ad slate Ad query
generation First pass retrieval Relevance reordering Ad search
engine Ad query Miele Rich query Use bid phrases and landing pages
to augment the ads (cf. query expansion)
Slide 29
Yahoo! Research 2011 29 Ads vs. Web pages Very short Optimized
for presentation, not for indexing creatives have low SNR Legacy
bid-phrase-centric definition dictated by the exact match scenario
very limiting today Complex structure Not overly short (at least
more often than not ) Simple structure: sections/subsections and
(optional) HTML markup Ads Web pages
Slide 30
Yahoo! Research 2011 30 Smaller corpus Much broader notion of
relevance (relatedness) Different (but rich) information is
available bids, budgets, landing pages, conversion rates, elaborate
nested structure of campaigns, Huge corpus Mainly aiming at pages
that subsume all the query terms Strict notion of relevance Anchor
text and other valuable signals are available Ad retrieval vs. Web
search Ad retrieval Web search
Slide 31
RPV Optimization: Problems with Sort by CPC Example Term: "mba"
Ad TitleUniv. of Phx: Online MBAUniv. of Washington MBA Ad Body
100% online university. Fully accredited. Foster School of
business. Top 30 ranked. CPC$10.00$0.50 CTR0.01%4% Position#1#10
RPV$0.0010$0.0200
Slide 32
RPV Optimization: Problems with Sort by CPC Example Term: "mba"
Ad TitleUniv. of Phx: Online MBAUniv. of Washington MBA Ad Body
100% online university. Fully accredited. Foster School of
business. Top 30 ranked. CPC$10.00$0.50 CTR0.01%4% Position#1#10
RPV$0.0010$0.0200
Slide 33
RPV Optimization Sort by (CPC_Bid x CTR)
Slide 34
Yahoo! Research 2011 One does not have to show ads! Roughly
half of the queries have no ads Repeatedly showing non-relevant ads
can have detrimental long-term effects Modeling actual (short- and
long-term) costs of showing non-relevant ads is very difficult
Goal: predict when (not) to show ads 34 Learning when (not) to
advertise (CIKM 2008, Broder et al.) Should we show ads at all
Slide 35
Yahoo! Research 2011 35 Two approaches: Thresholding vs.
Machine Learning Global threshold on relevance scores of individual
ads Only show ads with scores above the threshold Problem: Scores
are not necessarily comparable across queries! Learn a binary
prediction model for sets of ads Features defined over sets of ads
rather than individual ads Relevance (word overlap, cosine
similarity between ad and query/page etc.) Result set cohesiveness
(coefficient of variation of ad scores, result set clarity,
entropy) Should we show ads at all
Slide 36
Yahoo! Research 2011 36 Features Relevance features Word
overlap, cosine similarity between ad and query/page Vocabulary
mismatch features Translation models PMI between query/page terms
and bid terms Ad-based features Bid price (higher bids often
indicate better ads) Result-set cohesiveness features Coefficient
of variation of ad scores (std/mean) Result set clarity If the set
of ads is very cohesive and focused on 1-2 topics, the relevance
language model is very different from the collection model
Entropy
Slide 37
Yahoo! Research 2011 37 Estimating advertizability of tail
queries for sponsored search [Pandey et al. SIGIR 2010] The
previous model mainly looked at the candidate set of ads An
alternative approach is to look at the words in the queries
download, buy, insurance, flight, hotel amenable to advertising
weather, free, university low ad CTR Parameters (= word
propensities to attract ad clicks) are estimated from training
data
Slide 38
Yahoo! Research 2011 38 Binary classifier (relevant /
non-relevant ads) Baseline: text overlap features (query/ad) Click
history (query/ad) with back-off Click propensity in query/ad
translation Incorporating click history (WSDM 2010, Hillard et al.)
Cold start (i.e., no click history) is OK Using click data to
overcome synonymy Query = running gear Ad = Best jogging shoes
Results Query coverage 9% Ads per query 12% CTR 10% Same # clicks
on fewer ads Counting clicks for query/ad word pairs IBM Model 1
Bayes rule Should we show ads at all
Slide 39
Yahoo! Research 2011 39 Ready to buy or just browsing ?
Classifying research- and purchase-oriented sessions Inferring eye
gaze position from observable actions Keystrokes, GUI
(scroll/click), mouse movement, browser (new tab, close,
back/forward) Research vs. purchase classification (in lab): F1 =
0.96 Ad clickthrough in sessions classified as Purchase > 2X
compared to sessions classified as Research Predicting future ad
clicks: F1 = 0.07 0.17 (+141%) Incorporating multi-modal
interaction data (SIGIR 2010, Guo & Agichtein) Should we show
ads at all
Ad Industry Structure Advertiser Audience Media $$$$ $$$$$
$$
Slide 44
End Users Dont bug me Unless I like what you have to offer
Slide 45
Yahoo! Research 2011 Ads as another source of content for
enriching Web search results 45 I do not regard advertising as
entertainment or an art form, but as a medium of information.
Slide 46
Better Matching Context detection GPS, location App vs. content
In-game Info seeker vs. transactor Calendars/schedules/events
Social networks/status Twitter - now Behavioral esp. w/knowledge of
specific site behaviors Contextual Privacy Google AOL search
data
Slide 47
Yahoo! Research 2011 47 Textual advertising Sponsored Search
Ads driven by search keywords a.k.a. keyword driven ads or paid
search Content Match Ads driven by the content of a web page a.k.a.
context driven ads or contextual ads Textual advertising on the Web
is strongly related to NLP and information retrieval
Slide 48
Context? Flowers Mentos gum Trial Prep Credit score Cosmetics
Hampton Inns WeightWatchers Vacation Home Rentals Home Depot Web
Hosting WebMD Colon Cleanse Warning My Teeth Arent Yellow
Classmates.com
Slide 49
TESTING
Slide 50
Nine Differences A B upgrade! Lost 90% revenue. Reverting
coupon code increased CR 6.5%
Slide 51
Testing
Slide 52
A/B Split Test
Slide 53
Testing Sample Size, margin of error, confidence x = Z( c / 100
) 2 r(100-r) n = N x / ((N-1)E 2 + x) E = Sqrt[ (N - n)x / n(N-1)
]
Slide 54
Sample Size Problems So many ideas, so little to sample
Disproportionate advantage to scale Multivariate testing Taguchi
Method Method for calculating signal-to-noise ratio of different
parameters in an experimental design Allows optimization with A/B
test of each cross-product 00
Slide 55
Testing
Slide 56
Repetition
Slide 57
Slide 58
Professional Photos BeforeAfter We observed an immediate 30%
increase in conversion rates
Slide 59
Fact Sheet Design
Slide 60
The Great Divide BrandDirect Response Emotions Indirect
benefits Banners, TV, stadiums Transactions Gross profits Search,
coupons, 1-800, radio, mail
Slide 61
Graphical ads all over the web! 61
Slide 62
Advertisers: basic principles Display advertisers aim to Reach
audiences of interest with certain frequency Achieve certain
performance of the campaign In marketing terms, display advertising
aims for both Brand marketing that raises the awareness for a brand
Direct marketing For both reach and performance campaigns, the key
challenge is to select the right audience, i.e. to target the right
users Many advertisers have multiple simultaneous campaigns with
different goals In all cases, ultimate goal is to maximize ROI
62
Slide 63
The tasks in targeting 63 Targeting can be decomposed into
three related tasks User profile generation Audience selection:
find the best audience for a given ad Performance prediction: find
the best ad for a given impression User profile generation
Understanding the user based on his past activity and available
metadata Determining what are the predictive features Audience
selection Define the objective Training data, Modeling methods Can
be done offline/near-line and by third party.
Using demographics in advertising Important indicator of
peoples interest and potential of a conversion Imagine you want to
sell a $50K sports car. Who do you target? Used widely in
traditional advertising: TV, magazines, etc. maintain very detailed
statistics of their audience Common classic dimensions: Age Gender
Income bracket Location Interests (Golf enthusiast) 65
Slide 66
Obtaining Demographic Information User supplied demographic
information Most reliable if filled correctly In some cases 15-20%
of users born on 1 st of January Most users see very little
incentive to fill the form Privacy concerns But CC data, shipping
address, etc, are ~100% reliable. Inferred demographic information
Guess demographics based on browsing/querying behavior 74% women/
58% of men seek health or medical info online 34% women/ 25% men
seek religious info online Wider reach virtually every user 66
Slide 67
Geo targeting Goal: determine user location Home Current Often
wrong Inputs Registration data IP (Main source!) Browser default
language Search language Etc Lots of papers/results, but no time to
discuss 67
Slide 68
Behavioral Targeting A technique used by publishers and
advertisers to increase campaign effectiveness based on a given
users historical behavior: Previous searches/search sessions
Previous browsing activity Previous ad-clicks Previous conversions
Declared demographics data Etc. Utility everyone wins! (at least in
theory ) Advertisers: get a more appropriate/receptive audience,
increased conversion rate, better ROI Publishers: can ask for a
premium Users: see more interesting ads 68 Main techniques
Slide 69
Basic search retargeting scheme User searches for shoes on the
XYZ engine 69 Site ABC sends ad request + XYZ cookie to XYZ XYZ
creates shoes ad based on XYZ cookie that remembers shoes
Slide 70
Basic browse retargeting scheme User Joe views skiing site ABC
that contains some XYZ produced ad or just beacon Joes XYZ cookie
captures visit to ABC Now on site DEF Joe sees ski ads 70 Sends ad
request + XYZ cookie to XYZ XYZ creates skiing ad based on XYZ
cookie that remembers skiing Alternative: XYZ puts ad for ABC