


Distributed Parameter Estimation via Pseudo-likelihood

Qiang Liu, Alexander Ihler

Department of Computer Science, University of California, Irvine

Motivation

•Graphical models in exponential family form: p(x | θ) = exp( Σ_α θ_α φ_α(x) − Φ(θ) ), where Φ(θ) = log Z(θ) is the log-partition function.

•Task: design distributed algorithms for estimating the parameters θ given i.i.d. data x^1, …, x^n.

•Example: a wireless sensor network modeled as an MRF, where:

o Local sensors have limited computational power and memory.

o Communication is costly.

•Challenge: the likelihood requires the partition function Z, which also underlies key tasks such as probability of evidence and parameter estimation:

o Computing Z is #P-complete in general graphs.

o Approximations and bounds are needed.

M-Estimators

•M-estimator: θ̂ = argmax_θ Σ_k f(x^k; θ), for a chosen objective f.

•Asymptotic consistency and normality: if θ* is the unique maximizer of E[f(x; θ)], then θ̂ → θ* and √n (θ̂ − θ*) → N(0, H⁻¹ J H⁻¹) (the sandwich formula), with Fisher information J = var(∇f(x; θ*)) and Hessian H = −E[∇²f(x; θ*)].

•Intuition: θ̂ solves the empirical version of the population optimality condition E[∇f(x; θ*)] = 0, so it concentrates around θ* as n grows.

•Maximum likelihood estimator (MLE): θ̂ = argmax_θ Σ_k log p(x^k | θ); requires calculating the partition function Z, which is NP-hard.

•Maximum pseudo-likelihood (PL) estimator (MPLE): θ̂ = argmax_θ Σ_k Σ_i log p(x_i^k | x_N(i)^k; θ); importantly, each term of PL involves only local data and parameters (see the sketch below).
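To make the locality of MPLE concrete, here is a minimal runnable sketch on a hypothetical 3-node binary chain; the model, sampler, and grid search are illustrative choices, not the poster's actual setup. Each conditional in the pseudo-likelihood is a logistic term that touches only one node and its neighbors.

```python
import numpy as np

rng = np.random.default_rng(0)
EDGES = [(0, 1), (1, 2)]             # 3-node chain, x_i in {-1, +1}
theta_true = np.array([0.8, -0.5])   # one coupling per edge (assumed values)

def local_field(X, theta, i):
    """Sum of coupling * neighbor value at node i -- its local statistics."""
    f = 0.0
    for (a, b), th in zip(EDGES, theta):
        if i == a:
            f = f + th * X[..., b]
        elif i == b:
            f = f + th * X[..., a]
    return f

def gibbs(theta, n=2000, burn=200):
    """Sample the model with Gibbs; p(x_i = +1 | rest) = sigmoid(2 * field)."""
    x = rng.choice([-1.0, 1.0], size=3)
    out = np.empty((n, 3))
    for t in range(n + burn):
        for i in range(3):
            p = 1.0 / (1.0 + np.exp(-2.0 * local_field(x, theta, i)))
            x[i] = 1.0 if rng.random() < p else -1.0
        if t >= burn:
            out[t - burn] = x
    return out

def neg_pseudo_loglik(theta, X):
    """-log PL = -sum_k sum_i log p(x_i^k | x_N(i)^k); each term is local."""
    nll = 0.0
    for i in range(3):
        f = local_field(X, theta, i)
        nll += np.sum(np.log1p(np.exp(-2.0 * X[:, i] * f)))
    return nll / len(X)

X = gibbs(theta_true)
grid = np.linspace(-1.5, 1.5, 31)    # crude grid search keeps it dependency-free
best = min(((a, b) for a in grid for b in grid),
           key=lambda ab: neg_pseudo_loglik(np.array(ab), X))
print("MPLE estimate:", best, " true:", theta_true)
```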

A Distributed Paradigm

•Perform local estimation on the sensor nodes: each node i computes a local estimate θ̂^i from its own and its neighbors' data (e.g., a local MPLE).

•Combine the local estimates (the two simple rules are illustrated in the sketch after this list):

o Joint optimization consensus (joint MPLE): maximize the sum of the local objectives subject to consensus constraints on shared parameters.

o Linear consensus: θ̂_α = Σ_i w_α^i θ̂_α^i, with Σ_i w_α^i = 1.

o Max consensus: for each parameter α, keep the maximum of the local estimates θ̂_α^i.

o Max consensus is a special case of linear consensus (with data-dependent indicator weights).

o Under mild conditions, all these consensus estimators are asymptotically consistent whenever the local estimators are asymptotically consistent.
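As a concrete illustration of the two simple combination rules, assuming three nodes have already produced local estimates of two shared parameters (the numbers, and the elementwise-maximum reading of max consensus, are our assumptions here):

```python
import numpy as np

# Hypothetical local estimates: rows = nodes, columns = shared parameters alpha.
theta_hat = np.array([[0.9, -0.4],
                      [0.7, -0.6],
                      [1.1, -0.5]])

# Linear consensus: per-parameter weights, each column summing to 1.
w = np.array([[0.5, 0.2],
              [0.3, 0.5],
              [0.2, 0.3]])
linear = (w * theta_hat).sum(axis=0)   # theta_alpha = sum_i w_alpha^i theta_alpha^i

# Max consensus: the largest local estimate of each parameter, i.e.
# linear consensus with (data-dependent) indicator weights.
max_cons = theta_hat.max(axis=0)

print("linear consensus:", linear)     # -> [ 0.88 -0.53]
print("max consensus   :", max_cons)   # -> [ 1.1  -0.4 ]
```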




Choosing the Optimal Weights

•Matrix consensus (introduced for theoretical purposes): θ̂ = Σ_i W^i θ̂^i, with Σ_i W^i = I.

•Matrix consensus reduces to linear consensus when the W^i are diagonal matrices.

•Asymptotic covariance of θ̂: follows from the sandwich formula, in terms of the weights W^i and the joint asymptotic covariance of the local estimators.

•Joint optimization consensus is asymptotically equivalent to matrix consensus with W^i = H^i (see the numeric sketch below).
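A small numeric sketch of matrix consensus with Hessian-proportional weights; the local Hessians and noise model below are made up for illustration, and normalizing W^i = (Σ_j H^j)⁻¹ H^i keeps Σ_i W^i = I:

```python
import numpy as np

rng = np.random.default_rng(2)
theta_true = np.array([1.0, -2.0])
# Hypothetical local Hessians: each node is informative in different directions.
H = [np.diag(h) for h in ([4.0, 1.0], [1.0, 4.0], [2.0, 2.0])]
# Local estimates, noisier in directions where the local Hessian is small.
est = [theta_true + rng.multivariate_normal(np.zeros(2), np.linalg.inv(Hi))
       for Hi in H]

Hsum_inv = np.linalg.inv(sum(H))
W = [Hsum_inv @ Hi for Hi in H]          # W_i proportional to H_i, sum_i W_i = I
theta_mc = sum(Wi @ ei for Wi, ei in zip(W, est))

print("matrix consensus estimate:", np.round(theta_mc, 3))
print("plain average            :", np.round(np.mean(est, axis=0), 3))
# With diagonal W_i (as here), this coincides with per-parameter linear consensus.
```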

•The optimal weights should minimize the asymptotic mean squared error (MSE).

•The optimal weights of matrix consensus are given by this minimization; after a change of variables it reformulates as a quadratic program — solvable, but requiring global computation.

•If corr(s^i, s^j) = 0 for i ≠ j, then joint optimization consensus (joint MPLE, W^i = H^i) achieves the optimal asymptotic MSE.

•Optimal weights for linear consensus θ̂_α = Σ_i w_α^i θ̂_α^i: if corr(s_α^i, s_α^j) = 0 for i ≠ j, the optimal weights are w_α ∝ V_α⁻¹ 1 — the normalized row sums of V_α⁻¹, where V_α is the asymptotic covariance of the local estimates of θ_α (see the check below).

•Optimal weights for max consensus: put all weight on the single best local estimator. If corr(s_α^i, s_α^j) = 1 for i ≠ j, then max consensus with these weights achieves the performance of the best linear consensus. (Recall that max consensus is a special case of linear consensus.)
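A quick numeric check of the w_α ∝ V_α⁻¹ 1 rule; V below is an invented covariance for three local estimates of a single shared parameter:

```python
import numpy as np

# Hypothetical asymptotic covariance of three local estimates of theta_alpha.
V = np.array([[1.0, 0.2, 0.0],
              [0.2, 0.5, 0.1],
              [0.0, 0.1, 2.0]])

ones = np.ones(3)
w = np.linalg.solve(V, ones)     # row sums of V^{-1} (V is symmetric)
w /= w.sum()                     # normalize so the weights sum to 1

avar = lambda w: w @ V @ w       # asymptotic variance of sum_i w_i * theta_i
print("optimal weights        :", np.round(w, 3))
print("optimal variance       :", round(avar(w), 4))
print("uniform-weight variance:", round(avar(ones / 3), 4))
print("best single estimator  :", V.diagonal().min())   # max-consensus-style pick
```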


Experiments

•Binary two-node toy model: estimating a single parameter (with given true value) while the two other, known parameters are varied.

•Star graphs (unbalanced degrees): max consensus is preferred.

•4×4 grid (balanced degrees): joint MPLE is preferred.

•Large-scale models (100 nodes): similar trends as in the small models.


ADMM for Joint Optimization Consensus

•Joint optimization consensus can be solved in a distributed fashion via the alternating direction method of multipliers (ADMM).

•Augmented Lagrangian: L({θ^i}, θ, λ) = Σ_i [ f_i(θ^i) + λ_i^T (θ^i − θ) + (ρ/2) ‖θ^i − θ‖² ], where f_i is node i's local objective (its negative local pseudo-likelihood, in minimization form).

•ADMM iteration (sketched below): alternate local updates of the θ^i with an "iterative" linear consensus step that updates θ; see a similar algorithm in Wiesel & Hero 2012.

o "Anytime" property: once θ is initialized to be asymptotically consistent (together with a suitable initialization of the multipliers), it remains asymptotically consistent at every iteration.
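A generic consensus-ADMM sketch in scaled form; quadratic local objectives stand in for the negative local pseudo-likelihoods so that every update is closed-form, and all data here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, rho = 3, 4, 1.0
A = [rng.normal(size=(5, d)) for _ in range(m)]           # per-node data
theta_star = rng.normal(size=d)
b = [Ai @ theta_star + 0.1 * rng.normal(size=5) for Ai in A]

theta_i = [np.zeros(d) for _ in range(m)]   # local parameter copies
z = np.zeros(d)                             # consensus variable
u = [np.zeros(d) for _ in range(m)]         # scaled dual variables

for it in range(50):
    # Local updates: argmin_x f_i(x) + (rho/2)||x - z + u_i||^2 (closed form
    # here because f_i(x) = 0.5 * ||A_i x - b_i||^2).
    for i in range(m):
        Hi = A[i].T @ A[i] + rho * np.eye(d)
        theta_i[i] = np.linalg.solve(Hi, A[i].T @ b[i] + rho * (z - u[i]))
    # Consensus step: plain averaging -- the "iterative" linear consensus.
    z = np.mean([theta_i[i] + u[i] for i in range(m)], axis=0)
    # Dual updates.
    for i in range(m):
        u[i] += theta_i[i] - z

print("consensus estimate:", np.round(z, 3))
print("ground truth      :", np.round(theta_star, 3))
```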