Approximating Average Parameters of Graphs Oded Goldreich, Weizmann Institute Dana Ron, Tel Aviv...

Preview:

Citation preview

Approximating Average Parameters

of GraphsOded Goldreich, Weizmann

InstituteDana Ron, Tel Aviv

University

The Type of Problems we Consider

Let f be a “natural” function defined on graphs.The domain of f : vertices/pairs of vertices/etc.

The goal: Estimate the average value of f .

The means:(1) Queries to f ;(2) Queries to the graph: a. Neighbor queries (who is j’th neighbor of v?) b. Vertex-pair queries (are u and v neighbors?)

Questions of interest: (1) Can we do this much more efficiently as compared to general functions? (2) How do different queries influence the complexity?

The Particular Problems we Study

Problem1: Estimating the average degree in a graph (first considered by Feige (STOC04))

Problem2: Estimating the average distance to a given vertex in a graph

Problem3: Estimating the average distance between pairs of vertices in a graph

deg(v)

distv(u)

dist(u,v)

Our Results

Problem1: Estimating average degree in a (simple) graph G=(V,E) , |V|=n, |E|=m, dG=2m/n deg(v)

UB: Can obtain (1+)-approximation in time n1/2 poly(log n / )by using neighbor queries (only).

LB: A (1+)-approximation requires ((n / )1/2) queries (when allowed all types of queries).

Compare to Feige: (2+)-approx in similar complexity using degree queries only (queries to f ); (2-o(1))-approx requires (n) degree queries.

Note: Can improve when avg degree is high: (n/dG)1/2 instead of n1/2 with matching lower bound.

Our Results Cont’

Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance

UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )

by using distance queries (only).

LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.

Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.

Our Results: Summary

Common to problems:

• Can obtain (1+)-approximation in sublinear-time.

• Dependence on n: n1/2; dependence on (1/ ): polynomial.

• Running time improves as avg increases.

• Matching lower bounds in terms of dependence on n/avg

Differences between problems:

• In case of avg degree: neighbor queries help (to improve quality of estimate), can do without degree queries (to f)

• In case of avg distance: neighbor queries do not help, cannot do without distance queries (to f)

Estimating Average DegreeG=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

deg(v)

Idea for algorithm inspired by [Kaufman, Krivelevich, R]’s procedure for selecting an edge “almost uniformly”

Ingredient 1: consider partition of all graph vertices into buckets: In bucket Bi vertices v s.t.

(1+)i-1 < deg(v) ≤ (1+)i ( = /8 , O((log n)/) buckets )

Suppose for every Bi had estimate bi s.t. bi = |Bi|(1 /8) .

(1/n) i bi (1+)i = (1 ) dG (*)How to get bi? By sampling. Difficulty: if |Bi| is small (<< n1/2) then necessary sample is too large ( (|Bi|/n)-1 >> n1/2 ).

Ingredient 2: ignore small Bi ‘s, take sum in (*) over sufficiently large bi ‘s (Bi ‘s). ( For large Bi ‘s get bi by sampling n1/2 vertices )

Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

Reminders: Bi = { v : (1+)i-1 < deg(v) ≤ (1+)i } ( = /8 )

bi = |Bi|(1 /8) for |Bi| > (( n)1/2/ 4(log n)/))

Consider sum: (1/n) i: Bi large bi (1+)i (**)

i.e., small overestimate. By how much can we underestimate?

clearly: ≤ (1+ ) dG

Sum of degrees = 2 num of edges. Large bucketsSmall

buckets

Counted twice

Counted once

Not Counted

Total not counted ≤ ( n)/2. All others, at least once. Hence, sum in (**) ≥ dG / (2 + )

Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

Recap: Can get factor-(2+ ) approx using n1/2 poly((log n)/) degree queries only (alternative to [Feige]).

Recall [Feige]: cannot do (much) better using degree queries only.

Large buckets

Small buckets

Counted twice

Counted once

Not Counted (few)

Ingredient 3: Estimate num of edges counted once and compensate for them.

Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

Ingredient 3: Estimate num of edges counted once and compensate.

Large bucketsSmall

buckets

Bi

For each large Bi estimate num of edges btwn Bi and small buckets.

Implementation: Let Si be vertices sampled in Bi . For each uSi select random neighbor v . Let i be frac of v’s in small buckets.

Estimate is: (1/n) i: Bi large bi (1+ i) (1+)i

W.h.p estimate is (1 ) dG

Large bucketsSmall

buckets

Bi

Estimating Average Degree SummaryG=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)

Average Degree Approximation Algorithm:

1. Unif. and indep. select K=O(n1/2 poly((log n)/)) vertices. Let S be (multi-)set of selected vertices.

2. For i=0,…,log1+n , let Si=SBi .

3. Let L = { i : |Si|/K ≥ (/(8n))1/2 / (log1+n +1) }

4. For each i L, every u Si, select random neighbor v of u.

Let i be frac. of rand. neighbors in small buckets.

5. Output (1/K) iL |Si| (1+ i) (1+)i

Note 1: If have l.b. , dG≥ , then : O((n/ )1/2 poly((log n)/))(sufficient to get good bi for |Bi| > (( n)1/2/ 4(log n)/)) )

Note 2: Can get same complexity without knowing l.b. (Can search for l.b. because alg does never overestimates)

Estimating Average Degree L.B.G=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)

Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. deg. d and avg degree (1+ )d

requires ((n/( d))1/2) queries (all types allowed)

For k ( d n)1/2 consider (random labelings of) graphs:

dG1 = d

k verticesclique

G2 :

n-k verticesd-regular

dG2= (1+ ) d

To distinguish must select vertex in small component

n-k verticesd-regular

G1 :

k verticesd-regular

Estimating Average Distance

Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance

UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )

by using distance queries (only).

LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.

Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.

Estimating Average DistanceAlgorithms: For both problems simply take sample of vertices / pairs of vertices and compute average over sample.

Analysis: Uses Chebyshev’s inequality – prove small variance.

Sketch for Problem 3 (avg. dist. btwn pairs):

Let dmax be max distance btwn pairs;

For i=0,…,dmax let pi be fraction of pairs at distance i;

Let be distributed according to pi : E[]=dG.

We show that E[2] = O(n1/2 E[]2). Core of proof: showing that E[]= (d2

max /n). Reason: if some pair are far, then many pairs are far.

v0 v1 vd

. . . . . .vi

. . .vd-i

wFor i=1,…, d/3 , d=dmax , wdist(w,vi)+dist(w,vd-i)≥d/3

Estimating Average Distance L.B. (for Problem 2 – avg. dist. to vertex s)

G=(V,E) , |V|=n, |E|=m, dG= avg dist to s

Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. dist. d and avg dist (1+ )d

requires ((n/( d))1/2) queries (all types allowed)

Construction for >1/4: For k (2(1+ )(d n))1/2 and t ((1+ )1/2-1/2)(2 d n)1/2 consider (rand labelings of) graphs:

dG1 = (1+ ) d

G2 (two-sided broom graph):

dG2 < d

To distinguish must select/reach right-side edge

G1 (broom graph):

vks

. . .v1

w1

w2

wn-k-1

.

.s

. . .v1 vt

w1

w2

wn-k-1

.

.vt+1

vt+2

vk

Summary

Study estimation of average value of “natural” functions on graphs: average degree and average distance.

Give sublinear algorithms (dependence on n is n1/2) and roughly matching lower bounds.

Different problems exhibit different behavior in terms of the “power” of the queries: queries to the estimated functions vs. queries to the structure of the graph.

Recommended