15
CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Embed Size (px)

Citation preview

Page 1: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

CS155b: E-Commerce

Lecture 16: April 10, 2001

WWW Searching and Google

Page 2: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

WWW Digraph

• More than 1 Billion Nodes (Pages)

• Average Degree (links/Page) is 5-15. (Hard to Compute!)

• Massive, Distributed, Explicit Digraph

(Not Like Call Graphs)

Page 3: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

“Hot” Research Area

• Graph Representation

• Duplicate Elimination

• Clustering

• Ranking Query Results

http://theory.stanford.edu/~focs98/tutorials.html

A. Broder & M. Henzinger

Page 4: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

“Abundance” Problem

http://simon.cs.cornell.edu/home/kleinber/ ….. kleinber.html

• Given a query find:– Good Content (“Authorities”)– Good Sources of Links (“Hubs”)

• Mutually Reinforcing

• Simple (Core) Algorithm

A

H

Page 5: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

T = {n Pages}, A = {Links}

Xp >0, p T non-negative “Authority Weights”

Yp >0, p T non-negative “Hub Weights”

I operation Update Authority Weights

Xp Yq

O operation Update Hub Weights

Yp Xq

Normalize: X2 = Y2 = 1

(q,p) A

(p,q) A

pp T p T

p

Page 6: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Core Algorithm

Z (1,1,…,1)

X Y Z

Repeat until Convergence

Apply I /* Update Authority weights */

Apply O /* Update Hub Weights */

Normalize

Return Limit (X*, Y*)

Page 7: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Convergence of (Xi, Yi) = (OI)i(Z,Z)

A = n x n “Adjacency Matrix”

Rewrite I and O:

X ATY ; Y AX

Xi = (ATA) i-1 ATZ ; Yi = (AAT)iZ

AAT Symm., Non-negative and Z = (1,1,…, 1)

X* = lim Xi = 1(ATA)

Y* = lim Yi = 1 (AAT)i

i

88

Page 8: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Whole Algorithm (k,d,c)

q Search Engine |S| < k

Base Set T:(In S, S , S) and <d links/page

Remove “Internal Links”Run Core Algorithm on TFrom Result (X,Y), Select

C pages with max X* valuesC pages with max Y* values

Page 9: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Examples (k= 200, d=5)q = censorship + net

www.EFF.orgwww.EFF.org/BlueRib.html

www.CDT.orgwww.VTW.orgwww.ACLU.prg

q = Gateswww.roadahead.comwww.microsoft.comwww.ms.com/corpinfo/bill-g.html

[Compares well with Yahoo, Galaxy, etc.]

Page 10: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Approach to “Massiveness”:Throw Out Most of G!!

• Non-principal Eigenvectors correspond to

“Non-principal Communities”

• Open (?):

Objective Performance Criteria

Dependence on Search Engine

Nondeterministic Choice of S and T

Page 11: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Google History

• Founded in 1998 by Larry Page and Sergey Brin, two Stanford Ph.D. candidates.

• Privately held company, whose backers include Kleiner Perkins Caufield & Byers and Sequoia Capital.

• Continues to win top awards for Search Engines. Computer Scientists love it!!!

Page 12: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Major Partners

• Yahoo!• Palm• Nextel• Netscape• Cisco Systems• Virgin Net• Netease.com• RedHat• Virgilio• Washingtonpost.com

Page 13: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Business Model

• The company delivers services through its own web site at www.google.com and by licensing its search technology to commercial sites

• Advertising: – Premium Sponsorship – Purchase a keyword

– AdWords – Manage your Ad text

Page 14: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

I’d like to buy a Keyword

The advertiser’s text-based ad will appear at the top of a Google results page whenever the keyword they have purchased is included in a user's search.

The ads appear adjacent to, but are distinguished from, the results listings.

Page 15: CS155b: E-Commerce Lecture 16: April 10, 2001 WWW Searching and Google

Category purchase

• Google uses a classification system to create an ongoing "Virtual Directory" of categories an advertiser can purchase.

• Advertisers can select the categories most appropriate to their business and Google will match the most relevant category ads to each user's search.

• The advantage of this approach is that it covers a broader audience that might be missed through the purchase of keywords alone.