1 Milena Mihail [email protected] Web Science Tea Feb 29, 08 Discussion Topic:


Milena [email protected]

Web Science Tea Feb 29, 08

Discussion Topic:

What is Web Science ?

Includes some intersection of comp sci, economics, social sci.

Our grassroots discussions :

Microsoft: New Cambridge Lab, Jennifer Chayes

Yahoo: Raghavan (WWW06), Brachman (GT talk)

Chris Klaus GT talk

NSF : CDI

Elsewhere :

Our non-grassroots discussions:
Super-Duper Data Center, à la Jeannette Wing.
Should revisit this point, in view of NSF-Google-IBM?

What is Web Science ?

The study of the WWW, broadly defined.
By virtue of the pervasiveness of the object of study.

Systems-like science (like chemistry or biology).
As opposed to “computer science”, which is the study of “computation”;
biology is the study of “life”, from the cell to evolution to animals…

Should be studied in terms of its descriptive/predictive/explanatory/prescriptive analytic value.

Parenthesis: MSN SemGrail 07

Why should there be Web Science ?

Encourage collaboration across different areas.
Something between the union and intersection of several areas.
Need to establish common vocabulary, goals, problems.
“Understanding the elephant versus the tail or the trunk”.

Educate students for industry.

Encourage academia to understand the study of the Web as a discipline.

Parenthesis: MSN SemGrail 07

Themes cutting across subareas of Web science

Long Tails / Economics / Culture

Fractal Nature, multi-scale

Humans and machines interact, and the interactions are registered.
New dimension in social sciences.

Transformed the way we think about information (analogy to the introduction of the printing press).
Democracy of information: producers and consumers of information coincide.

Dynamics, emergent systems, social networks.
Requires new analytics (e.g., what are the right logics, probabilistic and approximation metrics?)

Parenthesis: MSN SemGrail 07

What is Web Science ?

Includes some intersection of comp sci, economics, social sci.

Our grassroots discussions :

(in this spirit)

Outline:
Wide Range of Models
Canonical Example: Modeling the Small World Phenomenon
Model Parameters/Metrics and their Relevance
Models: Structural, Explanatory (Optimization or Incentive Driven), Hybrid
Which question are you (am I) trying to answer?

Range of Models

Internet (general) Routing Internet

AS Level

Routing Level

(nice pictures with some meaning)

few long links in a flat world

Sparse Power Law Graphs with very different assortativity

Range of Models

Patent / co-author network in Boston area

(nice pictures with some meaning)

Flickr social network from Flickr search keyword “graph”

notice bottleneck (bad cut)

notice no bottleneck (no bad cut)

( Range of Flickr Pictures - meaning ? )

Technology Platforms
Local Facebook Friendship Graph

A Web Page Organization

4 Color Theorem

Range of Models

Biological Networks with unclear meaning, but make the front page of Nature/Science/PNAS

Range of Models

(nice pictures with no meaning)

Range of Mathematical Models

Rick Durrett, Cornell, Probabilist

Matthew Jackson, Stanford, Economist

Canonical Example: Modeling the Small World Phenomenon

Milgram’s Experiment, 60’s:
Even though relationships are highly clustered, most people are pairwise reachable via short paths.
“Six Degrees of Separation” (for fun, see also the Facebook group)

Watts & Strogatz’s Model, 90’s:
In a clustered graph of size n, a few random links decrease the diameter to O(log n).

Clustering and Small Diameter
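The effect of a few random links on diameter can be checked numerically. The following is a minimal sketch, not the original Watts and Strogatz construction: it assumes a ring lattice as the clustered graph and adds shortcuts rather than rewiring edges. The function names and the sizes (200 vertices, 20 shortcuts) are illustrative choices.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Clustered graph: each vertex links to its k nearest neighbours on each side of a ring."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj

def add_random_links(adj, m, rng):
    """Add m random shortcuts between distinct vertices that are not yet adjacent."""
    n = len(adj)
    added = 0
    while added < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1

def eccentricity(adj, source):
    """Largest BFS distance from source (the graph is assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def diameter(adj):
    return max(eccentricity(adj, v) for v in adj)

rng = random.Random(0)          # fixed seed so runs are repeatable
graph = ring_lattice(200, 2)
before = diameter(graph)        # diameter of the pure ring lattice
add_random_links(graph, 20, rng)
after = diameter(graph)         # diameter once the shortcuts are present
print("diameter before:", before, "after:", after)
```

Adding links can never increase the diameter; how much it drops depends on the random shortcuts drawn.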

Kleinberg, 90’s: Navigability!
These short paths can be found efficiently with local search!


Kleinberg’s navigability model

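Theorem: The only value of the exponent r for which the network is navigable is r = 2.
(On the 2D grid, each long-range link from u to v is chosen with probability proportional to d(u,v)^(-r).)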

Are there natural network models which are navigable and have, eg, power-law degree distributions ?

Are there natural models where the threshold is not sharp ?
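Local search in Kleinberg's model can be sketched as greedy routing: forward the message to whichever known contact is closest to the target in grid distance. The code below is an illustrative sketch under stated assumptions: a 2D grid without wraparound, one long-range contact per node drawn with probability proportional to d^(-r), and contacts sampled lazily when a node is first visited. Lazy sampling is safe here because greedy routing never revisits a node. All function names are hypothetical.

```python
import random

def lattice_distance(a, b):
    """Manhattan distance on the grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(u, side, r, rng):
    """Pick one contact v != u with probability proportional to d(u, v) ** (-r)."""
    nodes = [(x, y) for x in range(side) for y in range(side) if (x, y) != u]
    weights = [lattice_distance(u, v) ** (-r) for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

def greedy_route(source, target, side, r, rng):
    """Return the number of hops taken by greedy routing from source to target."""
    current, hops = source, 0
    while current != target:
        x, y = current
        # Local grid neighbours that stay inside the side x side grid.
        candidates = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < side and 0 <= y + dy < side]
        # Plus the single long-range contact of the current node.
        candidates.append(long_range_contact(current, side, r, rng))
        # Greedy step: some grid neighbour is always strictly closer, so this terminates.
        current = min(candidates, key=lambda v: lattice_distance(v, target))
        hops += 1
    return hops

if __name__ == "__main__":
    rng = random.Random(0)
    side = 20
    for r in (0, 2, 4):
        trials = [greedy_route((0, 0), (side - 1, side - 1), side, r, rng)
                  for _ in range(20)]
        print("r =", r, "mean hops:", sum(trials) / len(trials))
```

On grids this small the differences between exponents are modest; the theorem is an asymptotic statement in n, so this sketch only shows the mechanics of local search.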

Model Parameters/Metrics (as a function of n) and their Relevance

Average degree and Degree distribution

Clustering coefficient (small dense subgraphs)

Diameter

Expansion/Conductance (bottlenecks)

Eigenvalues, eigenvectors (quantify bottlenecks and find groups efficiently)

e.g., in prediction / simulation: economics, engineering

Evolving toward monopolies/oligopolies?

Can it be searched, crawled efficiently?

Can PageRank be computed efficiently?
Can it route with low congestion?
Does it support efficient info retrieval?
How does information/technology spread?

Important to have FLEXIBLE network models

Assortativity
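Several of the metrics listed above can be computed directly on a small graph. The sketch below is illustrative only: the example graph (two triangles joined by one edge, an obvious bottleneck) and all function names are assumptions, and the spectral quantities and assortativity are left out to keep it short.

```python
from collections import Counter, deque
from itertools import combinations

def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def average_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)

def degree_distribution(adj):
    """Map each degree to the number of vertices having it."""
    return Counter(len(nb) for nb in adj.values())

def average_clustering(adj):
    """Mean over vertices of the fraction of neighbour pairs that are themselves linked."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue  # vertices with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def diameter(adj):
    """Longest shortest path, by BFS from every vertex (connected graph assumed)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def conductance(adj, subset):
    """Edges leaving the subset divided by the smaller of the two side volumes."""
    subset = set(subset)
    crossing = sum(1 for u in subset for v in adj[u] if v not in subset)
    vol_in = sum(len(adj[u]) for u in subset)
    vol_out = sum(len(adj[u]) for u in adj if u not in subset)
    return crossing / min(vol_in, vol_out)

# Two triangles joined by the single edge (2, 3).
example = build_graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
print(average_degree(example), dict(degree_distribution(example)))
print(average_clustering(example), diameter(example), conductance(example, {0, 1, 2}))
```

Low conductance for the cut between the two triangles is the numerical signature of the bottleneck.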

Structural / Macroscopic Models
Random graphs with desirable graph properties, thought to be aggregating all microscopic primitives

Example 1: Power Law Random Graph

Given degrees d_1, ..., d_n (power law), choose a random perfect matching over the d_1 + ... + d_n mini-vertices (d_i copies of vertex i).
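A random perfect matching for a given degree sequence can be sketched with the usual stub-pairing construction: make one stub per unit of degree, shuffle the stubs, and pair them off in order. This is a sketch under the assumption that the slide refers to that standard construction; the function names and the sample degree sequence are illustrative, and the result may contain self-loops and parallel edges.

```python
import random
from collections import Counter

def random_matching_graph(degrees, rng):
    """Pair up degree stubs uniformly at random; returns a multigraph edge list."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sum must be even")
    # Vertex v contributes degrees[v] stubs.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in a uniformly shuffled list form a uniform perfect matching.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def realized_degrees(edges, n):
    """Degree of each vertex in the multigraph (a self-loop counts twice)."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[v] for v in range(n)]

rng = random.Random(0)
degrees = [5, 3, 2, 2, 1, 1]   # illustrative skewed degree sequence
edges = random_matching_graph(degrees, rng)
print(edges)
```

The realized degrees always equal the requested ones, which is what makes the construction convenient for matching an observed degree distribution.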

Example 2: Growth & Preferential Attachment

One vertex at a time.
New vertex attaches to existing vertices with probability proportional to their degrees.
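Growth with preferential attachment can be sketched in a few lines. The version below is an illustrative assumption, not a specific published variant: it seeds the process with a small clique, and each new vertex picks m distinct targets by repeated degree-proportional draws (discarding duplicates makes the rule only approximately proportional for m > 1). The function name and parameters are hypothetical.

```python
import random
from collections import Counter

def preferential_attachment(n, m, rng):
    """Grow a graph to n vertices; each new vertex links to m distinct existing vertices."""
    # Seed: a clique on m + 1 vertices, so every starting vertex has positive degree.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Vertex v appears in this list deg(v) times, so a uniform draw is degree-proportional.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

rng = random.Random(0)
edges = preferential_attachment(1000, 2, rng)
degree = Counter(x for e in edges for x in e)
print("max degree:", max(degree.values()), "min degree:", min(degree.values()))
```

Early vertices tend to accumulate many more links than late ones, which is the mechanism behind the heavy-tailed degree distribution.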

Some evolutionary random graph models may also capture more factors, e.g., geography, and hence varying conductance.

Example 2, generalization towards flexibility:

Explanatory / Microscopic Models / Optimization Driven

Example: HOT, evolutionary, new node attaches by minimizing cost and maximizing quality of service

Point: Optimization primitives can yield power law distributions.

Explanatory / Microscopic Models / Incentive Driven

Example: A Network Formation Game

How fast can such a stable configuration be reached?

RANDOM DOT PRODUCT GRAPH MODEL
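The slide names this model without defining it. Assuming the usual formulation, each vertex carries a vector and each pair of vertices is linked independently with probability equal to the dot product of their vectors. The sketch below follows that assumption; clipping the dot product to [0, 1], the function names, and the sample vectors are illustrative choices.

```python
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def random_dot_product_graph(vectors, rng):
    """Link i and j independently with probability dot(vectors[i], vectors[j])."""
    n = len(vectors)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Clip so the value is a valid probability (an assumption of this sketch).
            p = min(1.0, max(0.0, dot(vectors[i], vectors[j])))
            if rng.random() < p:
                edges.append((i, j))
    return edges

rng = random.Random(0)
# Hypothetical interest vectors: two "communities" plus one mixed vertex.
vectors = [(0.9, 0.1), (0.8, 0.0), (0.1, 0.9), (0.0, 0.8), (0.5, 0.5)]
print(random_dot_product_graph(vectors, rng))
```

Vertices with similar vectors are linked with high probability, so both degree and community structure are controlled by the choice of vectors.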

Hybrid Models

Example 1:

Example 2:


SUMMARY

It is important to identify critical metrics and parameters, i.e., how they impact network performance.
It is important to develop flexible network models in which critical parameters vary.

It is important to identify network primitives related to optimization and incentives.
It is important to develop mechanisms that affect such primitives.

HOW ABOUT YOU ?

WHICH QUESTIONS DO YOU WANT TO ANSWER ?