Graph Theory for Online Advertising
J. Tipan Verella
March 19, 2014
Tipan GTOA March 19, 2014 1 / 18
Introduction
What is so great about Graphs?
A graph G = (V, E) is a pair of sets: vertices V and edges E.
Degree of a vertex, connected components
Systems engineering for complex behavioral systems
  biochemical reaction networks, ecological systems, distributed adaptive systems
  self-organization, phase transitions
  markets, herd behavior and crowdsourcing, BitTorrent
Graphs (networks) are a versatile tool for understanding the structure of complex systems.
What does it have to do with online advertising?
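The two notions named on this slide, the degree of a vertex and connected components, can be made concrete with a small sketch. The vertex names and the helper functions `build_adjacency`, `degree`, and `connected_components` are illustrative assumptions, not part of the talk.

```python
from collections import deque


def build_adjacency(vertices, edges):
    """Return an adjacency map for an undirected graph G = (V, E)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree(adj, v):
    """Degree of v: the number of edges incident to v."""
    return len(adj[v])


def connected_components(adj):
    """Return the connected components as a list of vertex sets (BFS)."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            u = queue.popleft()
            component.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Hypothetical toy graph: a path a-b-c and a separate edge d-e.
V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("b", "c"), ("d", "e")]
adj = build_adjacency(V, E)
components = connected_components(adj)
```

Here `degree(adj, "b")` counts the two edges at `b`, and `components` separates `{a, b, c}` from `{d, e}`.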
Anecdotes from the Industry
Facebook: Presto (2013), Demonstrating the Scalability of Presto [1]
Microsoft: Horton (2012) is a research project in the eXtreme Computing Group to enable querying large distributed graphs. [2]
Yahoo!: Apache Giraph (2011) is an iterative graph processing system built for high scalability. [3]
Google: Pregel (2010), A System for Large-Scale Graph Processing [4]
Inspired by Leslie Valiant's Bulk Synchronous Parallel model for distributed computing.
[1] https://www.facebook.com/notes/facebook-engineering/scaling-apache-giraph-to-a-trillion-edges/10151617006153920
[2] http://research.microsoft.com/en-us/projects/ldg/
[3] https://giraph.apache.org/
[4] http://dl.acm.org/citation.cfm?id=1807184
Strategy and Structure: Optimization Problems in Online Performance Advertising
Performance Advertising
Advertiser would prefer to only pay for actions
Publisher would prefer to only charge on views (impressions)
The Advertiser Problem
$\beta_j$ is the proportion of your budget you spend on site $j$
$N_j(\beta_j)$ are the impressions procured by spending $\beta_j$ on site $j$
$\rho_j$ is the conversion rate of your ad on site $j$
\[
\max \sum_{j \in J} \underbrace{N_j(\beta_j) \cdot \rho_j}_{\text{Actions}_j}
\]
subject to:
\[
\sum_{j} \beta_j \le \text{Budget}
\]
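The slide leaves the impression response of each site unspecified. As a minimal sketch, assume a concave response of the form impressions = scale * sqrt(spend); this functional form, the numbers, and every function name below are assumptions made only for illustration. Under that assumption the Lagrangian conditions give a closed form: spend on site j is proportional to (scale_j * rate_j)^2.

```python
import math


def impressions(scale, spend):
    """Assumed concave response: impressions bought by `spend` on one site."""
    return scale * math.sqrt(spend)


def expected_actions(scales, rates, spends):
    """Objective from the slide: sum over sites of impressions * conversion rate."""
    return sum(impressions(a, b) * r for a, r, b in zip(scales, rates, spends))


def optimal_spend(scales, rates, budget):
    """Closed-form optimum for the assumed sqrt response.

    Maximizing sum_j w_j * sqrt(b_j) subject to sum_j b_j = budget,
    with w_j = scale_j * rate_j, gives b_j = budget * w_j**2 / sum_k w_k**2.
    """
    weights = [(a * r) ** 2 for a, r in zip(scales, rates)]
    total = sum(weights)
    return [budget * w / total for w in weights]


# Hypothetical sites: (scale, conversion rate).
scales = [1000.0, 800.0, 1500.0]
rates = [0.02, 0.05, 0.01]
budget = 100.0

best = optimal_spend(scales, rates, budget)
even = [budget / len(scales)] * len(scales)
actions_best = expected_actions(scales, rates, best)
actions_even = expected_actions(scales, rates, even)
```

Comparing `actions_best` with `actions_even` shows how much an even split leaves on the table in this toy setting. With a different response curve the closed form no longer applies and a numerical solver would be needed.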
The Publisher Problem
$\pi(i, n)$ is the revenue if impression $n$ is awarded to advertiser $i$
$\delta_{i,n}$ is 1 or 0 depending on whether or not impression $n$ is awarded to advertiser $i$
$I$ is the set of advertisers
\[
\max_{\delta_{i,n}} \sum_{n \in N} \sum_{i \in I} \pi(i, n) \cdot \delta_{i,n}
\]
subject to:
\[
\sum_{i \in I} \delta_{i,n} \le 1 \quad \forall n
\]
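Because the only constraint is "at most one advertiser per impression", the problem as stated decouples across impressions: each impression can go to whichever advertiser yields the highest revenue for it. The sketch below illustrates that observation with made-up revenue figures; the function name `award_impressions` and the data layout are assumptions.

```python
def award_impressions(revenue):
    """Award each impression to its highest-revenue advertiser.

    revenue[n][i] is the revenue if impression n goes to advertiser i.
    An impression is left unassigned when no advertiser yields positive revenue.
    Returns (assignment, total_revenue).
    """
    assignment = {}
    total = 0.0
    for n, row in revenue.items():
        best = max(row, key=row.get)
        if row[best] > 0:
            assignment[n] = best
            total += row[best]
    return assignment, total


# Hypothetical revenues for three impressions and two advertisers.
revenue = {
    "imp1": {"A": 0.5, "B": 0.8},
    "imp2": {"A": 0.3, "B": 0.1},
    "imp3": {"A": 0.0, "B": 0.0},
}
assignment, total = award_impressions(revenue)
```

The hard part in practice is not this argmax but knowing the revenue of each pairing before the impression is served.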
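The AdNetwork Problem
$\phi_{i,j}$ is the fraction of the inventory on site $j$ allocated to advertiser $i$
$N_j$ is the total number of impressions from site $j$
$\rho_{i,j}$ is the conversion rate of advertiser $i$ on site $j$
$\pi(i)$ is the amount paid per conversion by advertiser $i$
$c_j$ is the cost per impression on site $j$
$B_i$ is the budget of advertiser $i$
\[
\max_{\phi} \sum_{i \in I} \sum_{j \in J} \Bigl( \underbrace{\phi_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{revenue}} - \overbrace{c_j \cdot N_j}^{\text{cost}} \Bigr)
\]
subject to:
\[
\sum_{j \in J} \phi_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i) \le B_i \quad \forall i \in I
\]
\[
\sum_{i \in I} \phi_{i,j} \le 1 \quad \forall j \in J
\]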
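To make the objective and the two constraint families concrete, the sketch below evaluates them for a hand-picked allocation. All numbers, the dictionary layout, and the names `spend`, `objective`, and `is_feasible` are illustrative assumptions; the variable `phi` stands for the allocation fraction of site j's inventory given to advertiser i.

```python
def spend(i, phi, N, rho, pi, sites):
    """Spend of advertiser i: sum over sites of phi * N * rho * pi."""
    return sum(phi[i, j] * N[j] * rho[i, j] * pi[i] for j in sites)


def objective(phi, N, rho, pi, c, advertisers, sites):
    """Objective as written on the slide: revenue minus cost inside both sums.

    The cost term c[j] * N[j] does not depend on phi, so it shifts the
    objective value without changing which allocation is optimal.
    """
    return sum(
        phi[i, j] * N[j] * rho[i, j] * pi[i] - c[j] * N[j]
        for i in advertisers
        for j in sites
    )


def is_feasible(phi, N, rho, pi, B, advertisers, sites, tol=1e-9):
    """Check non-negativity, the budget constraints, and the inventory constraints."""
    nonneg = all(v >= 0 for v in phi.values())
    budgets_ok = all(
        spend(i, phi, N, rho, pi, sites) <= B[i] + tol for i in advertisers
    )
    inventory_ok = all(
        sum(phi[i, j] for i in advertisers) <= 1 + tol for j in sites
    )
    return nonneg and budgets_ok and inventory_ok


# Hypothetical instance: two advertisers, two sites.
advertisers = ["a1", "a2"]
sites = ["s1", "s2"]
N = {"s1": 1000, "s2": 2000}
rho = {("a1", "s1"): 0.01, ("a1", "s2"): 0.02,
       ("a2", "s1"): 0.03, ("a2", "s2"): 0.01}
pi = {"a1": 5.0, "a2": 4.0}
c = {"s1": 0.01, "s2": 0.02}
B = {"a1": 160.0, "a2": 120.0}

phi = {("a1", "s1"): 0.0, ("a1", "s2"): 0.75,
       ("a2", "s1"): 0.8, ("a2", "s2"): 0.25}
value = objective(phi, N, rho, pi, c, advertisers, sites)
feasible = is_feasible(phi, N, rho, pi, B, advertisers, sites)

# Over-allocating site s2 (fractions sum to 1.25) violates the inventory constraint.
over = dict(phi)
over["a1", "s2"] = 1.0
over_feasible = is_feasible(over, N, rho, pi, B, advertisers, sites)
```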
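The Centralized Approach: Linear Programming
\[
\max_{\phi} \sum_{i \in I} \sum_{j \in J} \Bigl( \underbrace{\phi_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{revenue}} - \overbrace{c_j \cdot N_j}^{\text{cost}} \Bigr)
\]
subject to:
\[
\underbrace{\sum_{j \in J} \phi_{i,j} \cdot N_j \cdot \rho_{i,j} \cdot \pi(i)}_{\text{spend of advertiser } i} \le B_i \quad \forall i \in I
\]
\[
\sum_{i \in I} \phi_{i,j} \le 1 \quad \forall j \in J
\]
Plan, Evaluate, Update
Duality can say a lot about the structure of your problem
DOES NOT SCALE!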
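As an illustration of the duality remark, here is a sketch of the dual of the linear program above. It assumes the allocation fractions are non-negative, writes $\phi_{i,j}$ for the fraction of site $j$'s inventory given to advertiser $i$, and uses the shorthand $w_{i,j}$ for the revenue coefficient. The multipliers $u_i$ (one per budget constraint) and $v_j$ (one per inventory constraint) are names introduced here, not on the slide.

```latex
% Shorthand for the revenue coefficient of phi_{i,j}
w_{i,j} = N_j \cdot \rho_{i,j} \cdot \pi(i)

% Primal (the cost term is constant in phi)
\max_{\phi \ge 0} \; \sum_{i \in I} \sum_{j \in J} w_{i,j}\,\phi_{i,j} \;-\; |I| \sum_{j \in J} c_j N_j
\quad \text{s.t.} \quad
\sum_{j \in J} w_{i,j}\,\phi_{i,j} \le B_i \;\; \forall i, \qquad
\sum_{i \in I} \phi_{i,j} \le 1 \;\; \forall j

% Dual
\min_{u \ge 0,\; v \ge 0} \; \sum_{i \in I} B_i\,u_i + \sum_{j \in J} v_j \;-\; |I| \sum_{j \in J} c_j N_j
\quad \text{s.t.} \quad
w_{i,j}\,u_i + v_j \ge w_{i,j} \;\; \forall i \in I,\; j \in J

% Equivalent form of the dual constraint
v_j \ge w_{i,j}\,(1 - u_i)
```

Read this way, $v_j$ acts as a price for site $j$'s inventory and $u_i$ discounts the value of a budget-constrained advertiser. By complementary slackness, $\phi_{i,j} > 0$ only where $v_j = w_{i,j}(1 - u_i)$, so each site's inventory goes to the advertisers with the highest discounted value, which resembles an auction.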
The Decentralized Approach: The Market Paradigm
Publisher runs auctions: the good (impressions) goes to the agent that values it the most [5]
  the monopoly should provide as detailed a description of the good as possible
  the auction solves the allocation problem
Advertiser places bids: in a 2nd price auction it is optimal to bid your valuation
  valuation depends on conversion rates, a priori unknown!
  the number of auctions is also unknown!
  performance rates have to be estimated
  control algorithms have to be implemented in order to pace the delivery of the ad campaign
Markets are complex systems!
[5] Hal Varian on the Online Ad Auction
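A small sketch of the second-price mechanism may help show why bidding your valuation is optimal. It assumes a sealed-bid auction with no reserve price, and it assumes the value of an impression is the estimated conversion rate times the payout per conversion; both assumptions, the bidder names, and the helper names are illustrative only.

```python
def second_price_auction(bids):
    """Sealed-bid second-price auction.

    bids maps bidder -> bid. The highest bidder wins and pays the
    second-highest bid (0.0 if there is only one bidder).
    Returns (winner, price), or (None, 0.0) when there are no bids.
    """
    if not bids:
        return None, 0.0
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


def impression_value(conversion_rate, payout_per_conversion):
    """Assumed valuation: expected payout of one impression."""
    return conversion_rate * payout_per_conversion


def utility(value, bid, rival_bids):
    """Payoff to bidder 'me' with true value `value` when bidding `bid`."""
    bids = dict(rival_bids)
    bids["me"] = bid
    winner, price = second_price_auction(bids)
    return value - price if winner == "me" else 0.0


value = impression_value(0.02, 50.0)  # hypothetical: 2% rate, 50 per conversion

weak_rivals = {"r1": 0.8, "r2": 0.6}
strong_rivals = {"r1": 1.2}

truthful_weak = utility(value, value, weak_rivals)    # wins, pays 0.8
underbid_weak = utility(value, 0.7, weak_rivals)      # loses a profitable auction
truthful_strong = utility(value, value, strong_rivals)  # loses, pays nothing
overbid_strong = utility(value, 1.5, strong_rivals)   # wins but overpays
```

Shading the bid can only lose auctions that would have been profitable, and overbidding can only win auctions at a loss. The catch named on the slide remains: the conversion rate behind `value` has to be estimated.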
Strategy and Structure: Graphs and Behavior
More About Graphs: Random Graphs
Let $V$ be a vertex set, with $|V| = n$.
For each pair of vertices $(u, v)$, with $u, v \in V$, we decide whether to include the edge $(u, v)$ based on the outcome of a coin flip, with probability $p = \frac{c}{n}$.
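The construction above translates almost line for line into code. The sketch below is one way to sample such a graph; the function name `random_graph`, the seed, and the parameter values are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def random_graph(n, c, seed=None):
    """Sample a random graph on vertices 0..n-1.

    Each of the n*(n-1)/2 unordered pairs becomes an edge independently
    with probability p = c / n.
    """
    rng = random.Random(seed)
    p = c / n
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


n, c = 500, 3.0
edges = random_graph(n, c, seed=0)

# Each vertex has n - 1 potential neighbours, so the expected degree
# is p * (n - 1), which is close to c when n is large.
mean_degree = 2 * len(edges) / n
```

The parameter c is therefore, up to a factor (n - 1)/n, the average degree of the graph.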
Erdős and Rényi
Paul Erdős and Alfréd Rényi [6] proved (1960) that such a graph experiences a phase transition at $c = 1$.
Figure: as $c$ goes from $< 1$ to $> 1$
[6] On the Evolution of Random Graphs
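The transition can be observed numerically by measuring the share of vertices in the largest connected component on either side of c = 1. The following is a simulation sketch, not a proof; the function name, the graph size n = 1000, the seed, and the two values of c are arbitrary choices. It samples each edge with probability c / n and tracks components with a union-find structure.

```python
import random
from itertools import combinations


def largest_component_fraction(n, c, seed=None):
    """Fraction of vertices in the largest component of a random graph
    where each pair of vertices is joined with probability c / n."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            ru, rv = find(u), find(v)
            if ru != rv:
                # Union by size: attach the smaller tree under the larger one.
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru
                size[ru] += size[rv]

    return max(size[find(v)] for v in range(n)) / n


below = largest_component_fraction(1000, 0.5, seed=1)  # c < 1
above = largest_component_fraction(1000, 2.0, seed=1)  # c > 1
```

For c < 1 the largest component stays a vanishing share of the graph, while for c > 1 a single giant component holds a constant fraction of the vertices. For c = 2 that fraction is the positive solution of s = 1 - exp(-2s), roughly 0.8.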
Local Interactions in the Quantitative Social Sciences
Sociologist Mark Granovetter: The Strength of Weak Ties (1973)
Economists: predictive power of social interactions
  Lawrence Blume (1993) proposes using models from statistical mechanics to understand strategic interactions
  Edward Glaeser et al. (1996), Crime and Social Interactions
  Steven Durlauf (1999) asks in PNAS, How can statistical mechanics contribute to social science?
  H. Peyton Young (2001), Individual Strategy and Social Structure: An Evolutionary Theory of Institutions
By 1996, Social Network Analysis: Methods and Applications by Faust and Wasserman.
More recently, sociologists at Cornell University have been using graph-based sampling methods [7] to do estimations for hidden populations
Sociologists like A.V. Papachristos have been using social networks to understand crime in Chicago [8].
[7] respondent-driven sampling
[8] Social Networks and Gang Violence
Scale and Complexity: The Web is a Network! ... Bipartite Graphs Everywhere!
CrowdSourcing: Power to the People!
Yochai Benkler on Directories and Google PageRank
channels/categories/directories,
advertisers/campaigns/creatives
Site Networks and Audiences
Community Detection
Why understand community structures of complex networks?
Size, problem reduction
Topology, diverse degree distribution
Biological Sciences Perspective:
a network enables the discovery of the organization and interactions of a biochemical system
Complex Networks as backbone of Complex Systems
Communities enable decomposition into subsystems, modules
In online advertising: Feature Extraction!
The Pinned Random Walk
Definition (PRW)
Let $G = (V, E)$ be a connected undirected graph. Let $P$ be the transition probability matrix induced by the incidence matrix, $P_{ij} = \frac{E_{ij}}{\sum_j E_{ij}}$. Let $\mu_0$ be a probability measure on $V$ and $\alpha \in (0, 1)$. We call a pinned random walk the discrete-time stochastic process $X_k$ on $G$ that changes the measures $\mu_k$ on $V$ according to:
\[
X_0 = x_0 \quad \text{almost surely}
\]
\[
\mu_k = \alpha \mu_{k-1} P + (1 - \alpha) \mu_0 \tag{1}
\]
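Update (1) can be iterated directly. The sketch below does so on a four-vertex path graph, taking the vertex-by-vertex matrix in the definition to be a 0/1 adjacency matrix and pinning the walk to vertex 0. The graph, the mixing weight 0.85, the step count, and the function names are assumptions chosen for illustration.

```python
def transition_matrix(adjacency):
    """Row-normalize the matrix: P[i][j] = E[i][j] / sum_j E[i][j].

    Assumes every vertex has at least one edge, which holds for a
    connected graph with more than one vertex.
    """
    P = []
    for row in adjacency:
        total = sum(row)
        P.append([a / total for a in row])
    return P


def pinned_step(mu, mu0, P, alpha):
    """One application of update (1): alpha * (mu P) + (1 - alpha) * mu0."""
    n = len(mu)
    walked = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return [alpha * walked[j] + (1 - alpha) * mu0[j] for j in range(n)]


def pinned_random_walk(adjacency, mu0, alpha, steps):
    """Iterate update (1) for `steps` steps, starting from mu0."""
    P = transition_matrix(adjacency)
    mu = list(mu0)
    for _ in range(steps):
        mu = pinned_step(mu, mu0, P, alpha)
    return mu


# Hypothetical path graph 0 - 1 - 2 - 3, pinned at vertex 0.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
mu0 = [1.0, 0.0, 0.0, 0.0]
alpha = 0.85
mu = pinned_random_walk(adjacency, mu0, alpha, steps=200)
```

Each step keeps the measure a probability measure, and because the update contracts distances by a factor alpha, the iterates converge to a fixed point that concentrates mass near the pinned vertex. The same recursion appears in the literature as a random walk with restart.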
Conclusion
So . . . What is so great about Networks?
They come out of the woodwork of the systems you deal with in online advertising, because your systems are complex!
They are the underlying structures of your advertising systems
They are predictive!
Statisticians are actively working on tools to extract information from those rich structures.
Thank You!
Millennial Media
Rosalee MacKinnon
Rick Daggett
Dr. Jean M. Grow