Upload
rosa-bell
View
212
Download
0
Embed Size (px)
Citation preview
Approximating Average Parameters
of GraphsOded Goldreich, Weizmann
InstituteDana Ron, Tel Aviv
University
The Type of Problems we Consider
Let f be a “natural” function defined on graphs.The domain of f : vertices/pairs of vertices/etc.
The goal: Estimate the average value of f .
The means:(1) Queries to f ;(2) Queries to the graph: a. Neighbor queries (who is j’th neighbor of v?) b. Vertex-pair queries (are u and v neighbors?)
Questions of interest: (1) Can we do this much more efficiently as compared to general functions? (2) How do different queries influence the complexity?
The Particular Problems we Study
Problem1: Estimating the average degree in a graph (first considered by Feige (STOC04))
Problem2: Estimating the average distance to a given vertex in a graph
Problem3: Estimating the average distance between pairs of vertices in a graph
deg(v)
distv(u)
dist(u,v)
Our Results
Problem1: Estimating average degree in a (simple) graph G=(V,E) , |V|=n, |E|=m, dG=2m/n deg(v)
UB: Can obtain (1+)-approximation in time n1/2 poly(log n / )by using neighbor queries (only).
LB: A (1+)-approximation requires ((n / )1/2) queries (when allowed all types of queries).
Compare to Feige: (2+)-approx in similar complexity using degree queries only (queries to f ); (2-o(1))-approx requires (n) degree queries.
Note: Can improve when avg degree is high: (n/dG)1/2 instead of n1/2 with matching lower bound.
Our Results Cont’
Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance
UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )
by using distance queries (only).
LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.
Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.
Our Results: Summary
Common to problems:
• Can obtain (1+)-approximation in sublinear-time.
• Dependence on n: n1/2; dependence on (1/ ): polynomial.
• Running time improves as avg increases.
• Matching lower bounds in terms of dependence on n/avg
Differences between problems:
• In case of avg degree: neighbor queries help (to improve quality of estimate), can do without degree queries (to f)
• In case of avg distance: neighbor queries do not help, cannot do without distance queries (to f)
Estimating Average DegreeG=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1
deg(v)
Idea for algorithm inspired by [Kaufman, Krivelevich, R]’s procedure for selecting an edge “almost uniformly”
Ingredient 1: consider partition of all graph vertices into buckets: In bucket Bi vertices v s.t.
(1+)i-1 < deg(v) ≤ (1+)i ( = /8 , O((log n)/) buckets )
Suppose for every Bi had estimate bi s.t. bi = |Bi|(1 /8) .
(1/n) i bi (1+)i = (1 ) dG (*)How to get bi? By sampling. Difficulty: if |Bi| is small (<< n1/2) then necessary sample is too large ( (|Bi|/n)-1 >> n1/2 ).
Ingredient 2: ignore small Bi ‘s, take sum in (*) over sufficiently large bi ‘s (Bi ‘s). ( For large Bi ‘s get bi by sampling n1/2 vertices )
Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1
Reminders: Bi = { v : (1+)i-1 < deg(v) ≤ (1+)i } ( = /8 )
bi = |Bi|(1 /8) for |Bi| > (( n)1/2/ 4(log n)/))
Consider sum: (1/n) i: Bi large bi (1+)i (**)
i.e., small overestimate. By how much can we underestimate?
clearly: ≤ (1+ ) dG
Sum of degrees = 2 num of edges. Large bucketsSmall
buckets
Counted twice
Counted once
Not Counted
Total not counted ≤ ( n)/2. All others, at least once. Hence, sum in (**) ≥ dG / (2 + )
Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1
Recap: Can get factor-(2+ ) approx using n1/2 poly((log n)/) degree queries only (alternative to [Feige]).
Recall [Feige]: cannot do (much) better using degree queries only.
Large buckets
Small buckets
Counted twice
Counted once
Not Counted (few)
Ingredient 3: Estimate num of edges counted once and compensate for them.
Estimating Average Degree Cont’G=(V,E) , |V|=n, |E|=m, dG=2m/n ≥ 1
Ingredient 3: Estimate num of edges counted once and compensate.
Large bucketsSmall
buckets
Bi
For each large Bi estimate num of edges btwn Bi and small buckets.
Implementation: Let Si be vertices sampled in Bi . For each uSi select random neighbor v . Let i be frac of v’s in small buckets.
Estimate is: (1/n) i: Bi large bi (1+ i) (1+)i
W.h.p estimate is (1 ) dG
Large bucketsSmall
buckets
Bi
Estimating Average Degree SummaryG=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)
Average Degree Approximation Algorithm:
1. Unif. and indep. select K=O(n1/2 poly((log n)/)) vertices. Let S be (multi-)set of selected vertices.
2. For i=0,…,log1+n , let Si=SBi .
3. Let L = { i : |Si|/K ≥ (/(8n))1/2 / (log1+n +1) }
4. For each i L, every u Si, select random neighbor v of u.
Let i be frac. of rand. neighbors in small buckets.
5. Output (1/K) iL |Si| (1+ i) (1+)i
Note 1: If have l.b. , dG≥ , then : O((n/ )1/2 poly((log n)/))(sufficient to get good bi for |Bi| > (( n)1/2/ 4(log n)/)) )
Note 2: Can get same complexity without knowing l.b. (Can search for l.b. because alg does never overestimates)
Estimating Average Degree L.B.G=(V,E) , |V|=n, |E|=m, dG=2m/n (≥ 1)
Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. deg. d and avg degree (1+ )d
requires ((n/( d))1/2) queries (all types allowed)
For k ( d n)1/2 consider (random labelings of) graphs:
dG1 = d
k verticesclique
G2 :
n-k verticesd-regular
dG2= (1+ ) d
To distinguish must select vertex in small component
n-k verticesd-regular
G1 :
k verticesd-regular
Estimating Average Distance
Problem2: Estimating the average distance to a vertexProblem3: Estimating the average distance between vertices G=(V,E) , |V|=n, |E|=m, dG= avg distance
UB: Can obtain (1+)-approx in time (n/dG)1/2 poly(log n / )
by using distance queries (only).
LB: A (1+)-approximation requires ((n /( dG))1/2) queries when allowed all types of queries.If allow only neighbor queries: (m) necessary for constant approximation.
Note (non sublinear): Can obtain (1+)-approx using O(m n1/2 poly(1/)) neighbor queries only. When graph not too dense ( m=o(n3/2) ) save as compared to computing exact/approx for all pairs.
Estimating Average DistanceAlgorithms: For both problems simply take sample of vertices / pairs of vertices and compute average over sample.
Analysis: Uses Chebyshev’s inequality – prove small variance.
Sketch for Problem 3 (avg. dist. btwn pairs):
Let dmax be max distance btwn pairs;
For i=0,…,dmax let pi be fraction of pairs at distance i;
Let be distributed according to pi : E[]=dG.
We show that E[2] = O(n1/2 E[]2). Core of proof: showing that E[]= (d2
max /n). Reason: if some pair are far, then many pairs are far.
v0 v1 vd
. . . . . .vi
. . .vd-i
wFor i=1,…, d/3 , d=dmax , wdist(w,vi)+dist(w,vd-i)≥d/3
Estimating Average Distance L.B. (for Problem 2 – avg. dist. to vertex s)
G=(V,E) , |V|=n, |E|=m, dG= avg dist to s
Thm. For any n, d [2 , o(n) ], ((1/(dn)) , o(n/d) ) distinguishing between avg. dist. d and avg dist (1+ )d
requires ((n/( d))1/2) queries (all types allowed)
Construction for >1/4: For k (2(1+ )(d n))1/2 and t ((1+ )1/2-1/2)(2 d n)1/2 consider (rand labelings of) graphs:
dG1 = (1+ ) d
G2 (two-sided broom graph):
dG2 < d
To distinguish must select/reach right-side edge
G1 (broom graph):
vks
. . .v1
w1
w2
wn-k-1
.
.s
. . .v1 vt
w1
w2
wn-k-1
.
.vt+1
vt+2
vk
Summary
Study estimation of average value of “natural” functions on graphs: average degree and average distance.
Give sublinear algorithms (dependence on n is n1/2) and roughly matching lower bounds.
Different problems exhibit different behavior in terms of the “power” of the queries: queries to the estimated functions vs. queries to the structure of the graph.