Approximating Average Parameters of Graphs Oded Goldreich, Weizmann Institute Dana Ron, Tel Aviv University

Approximating Average Parameters

of GraphsOded Goldreich, Weizmann

InstituteDana Ron, Tel Aviv

University

The Type of Problems we Consider

Let f be a “natural” function defined on graphs.The domain of f : vertices/pairs of vertices/etc.

The goal: Estimate the average value of f .

The means:(1) Queries to f ;(2) Queries to the graph: a. Neighbor queries (who is j’th neighbor of v?) b. Vertex-pair queries (are u and v neighbors?)

Questions of interest: (1) Can we do this much more efficiently as compared to general functions? (2) How do different queries influence the complexity?

The Particular Problems we Study

Problem1: Estimating the average degree in a graph (first considered by Feige (STOC04))

Problem2: Estimating the average distance to a given vertex in a graph

Problem3: Estimating the average distance between pairs of vertices in a graph

deg(v)

distv(u)

dist(u,v)

Our Results

Problem1: Estimating average degree in a (simple) graph G=(V,E) , |V|=n, |E|=m, dG=2m/n deg(v)

UB: Can obtain (1+)-approximation in time n1/2 poly(log n / )by using neighbor queries (only).

LB: A (1+)-approximation requires ((n / )1/2) queries (when allowed all types of queries).

Compare to Feige: (2+)-approx in similar complexity using degree queries only (queries to f ); (2-o(1))-approx requires (n) degree queries.

Note: Can improve when avg degree is high: (n/dG)1/2 instead of n1/2 with matching lower bound.

Our Results Cont’

Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance

UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )

by using distance queries (only).

LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.

Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.

Our Results: Summary

Common to problems:

• Can obtain (1+)-approximation in sublinear-time.

• Dependence on n: n1/2; dependence on (1/ ): polynomial.

• Running time improves as avg increases.

• Matching lower bounds in terms of dependence on n/avg

Differences between problems:

• In case of avg degree: neighbor queries help (to improve quality of estimate), can do without degree queries (to f)

• In case of avg distance: neighbor queries do not help, cannot do without distance queries (to f)

Estimating Average DegreeG=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

deg(v)

Idea for algorithm inspired by [Kaufman, Krivelevich, R]’s procedure for selecting an edge “almost uniformly”

Ingredient 1: consider partition of all graph vertices into buckets: In bucket Bi vertices v s.t.

(1+)i-1 < deg(v) ≤ (1+)i ( = /8 , O((log n)/) buckets )

Suppose for every Bi had estimate bi s.t. bi = |Bi|(1 /8) .

(1/n) i bi (1+)i = (1 ) dG (*)How to get bi? By sampling. Difficulty: if |Bi| is small (<< n1/2) then necessary sample is too large ( (|Bi|/n)-1 >> n1/2 ).

Ingredient 2: ignore small Bi ‘s, take sum in (*) over sufficiently large bi ‘s (Bi ‘s). ( For large Bi ‘s get bi by sampling n1/2 vertices )

Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1

Reminders: Bi = { v : (1+)i-1 < deg(v) ≤ (1+)i } ( = /8 )

bi = |Bi|(1 /8) for |Bi| > (( n)1/2/ 4(log n)/))

Consider sum: (1/n) i: Bi large bi (1+)i (**)

i.e., small overestimate. By how much can we underestimate?

clearly: ≤ (1+ ) dG

Sum of degrees = 2 num of edges. Large bucketsSmall

buckets

Counted twice

Counted once

Not Counted

Total not counted ≤ ( n)/2. All others, at least once. Hence, sum in (**) ≥ dG / (2 + )


Recap: Can get factor-(2+ ) approx using n1/2 poly((log n)/) degree queries only (alternative to [Feige]).

Recall [Feige]: cannot do (much) better using degree queries only.

Large buckets

Small buckets

Counted twice

Counted once

Not Counted (few)

Ingredient 3: Estimate num of edges counted once and compensate for them.


Ingredient 3: Estimate num of edges counted once and compensate.

Large bucketsSmall

buckets

Bi

For each large Bi estimate num of edges btwn Bi and small buckets.

Implementation: Let Si be vertices sampled in Bi . For each uSi select random neighbor v . Let i be frac of v’s in small buckets.

Estimate is: (1/n) i: Bi large bi (1+ i) (1+)i

W.h.p estimate is (1 ) dG

Large bucketsSmall

buckets

Bi

Estimating Average Degree SummaryG=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)

Average Degree Approximation Algorithm:

1. Unif. and indep. select K=O(n1/2 poly((log n)/)) vertices. Let S be (multi-)set of selected vertices.

2. For i=0,…,log1+n , let Si=SBi .

3. Let L = { i : |Si|/K ≥ (/(8n))1/2 / (log1+n +1) }

4. For each i L, every u Si, select random neighbor v of u.

Let i be frac. of rand. neighbors in small buckets.

5. Output (1/K) iL |Si| (1+ i) (1+)i

Note 1: If have l.b. , dG≥ , then : O((n/ )1/2 poly((log n)/))(sufficient to get good bi for |Bi| > (( n)1/2/ 4(log n)/)) )

Note 2: Can get same complexity without knowing l.b. (Can search for l.b. because alg does never overestimates)

Estimating Average Degree L.B.G=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)

Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. deg. d and avg degree (1+ )d

requires ((n/( d))1/2) queries (all types allowed)

For k ( d n)1/2 consider (random labelings of) graphs:

dG1 = d

k verticesclique

G2 :

n-k verticesd-regular

dG2= (1+ ) d

To distinguish must select vertex in small component

n-k verticesd-regular

G1 :

k verticesd-regular

Estimating Average Distance

Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance

UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )

by using distance queries (only).

LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.

Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.

Estimating Average DistanceAlgorithms: For both problems simply take sample of vertices / pairs of vertices and compute average over sample.

Analysis: Uses Chebyshev’s inequality – prove small variance.

Sketch for Problem 3 (avg. dist. btwn pairs):

Let dmax be max distance btwn pairs;

For i=0,…,dmax let pi be fraction of pairs at distance i;

Let be distributed according to pi : E[]=dG.

We show that E[2] = O(n1/2 E[]2). Core of proof: showing that E[]= (d2

max /n). Reason: if some pair are far, then many pairs are far.

v0 v1 vd

. . . . . .vi

. . .vd-i

wFor i=1,…, d/3 , d=dmax , wdist(w,vi)+dist(w,vd-i)≥d/3

Estimating Average Distance L.B. (for Problem 2 – avg. dist. to vertex s)

G=(V,E) , |V|=n, |E|=m, dG= avg dist to s

Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. dist. d and avg dist (1+ )d

requires ((n/( d))1/2) queries (all types allowed)

Construction for >1/4: For k (2(1+ )(d n))1/2 and t ((1+ )1/2-1/2)(2 d n)1/2 consider (rand labelings of) graphs:

dG1 = (1+ ) d

G2 (two-sided broom graph):

dG2 < d

To distinguish must select/reach right-side edge

G1 (broom graph):

vks

. . .v1

w1

w2

wn-k-1

.

.s

. . .v1 vt

w1

w2

wn-k-1

.

.vt+1

vt+2

vk

Summary

Study estimation of average value of “natural” functions on graphs: average degree and average distance.

Give sublinear algorithms (dependence on n is n1/2) and roughly matching lower bounds.

Different problems exhibit different behavior in terms of the “power” of the queries: queries to the estimated functions vs. queries to the structure of the graph.

Documents

Approximating Average Parameters of Graphs Oded Goldreich, Weizmann Institute Dana Ron, Tel Aviv University