Random Fields A short introduction Abhishek Srivastav Penn State Univ.(UP) April 06, 2010 Srivastav (PSU) Random Fields April 06, 2010 1 / 23


  • Random Fields: A short introduction

    Abhishek Srivastav

    Penn State Univ.(UP)

    April 06, 2010

    Srivastav (PSU) Random Fields April 06, 2010 1 / 23

  • Overview

    OVERVIEW

    • From random variables to random fields
        • Generalization of stochastic processes

    • Random fields
        • Measure-theoretic and Kolmogorov's definition

    • Random fields on graphs
        • Markov and Gibbs random fields

    • Equivalence Theorem

    • Main questions and solutions
        • Problems of interest, sampling and approximate methods

    • Approximate methods
        • Mean Field Theory

    • Example application - sensor networks

    • References


  • Random Fields

    RANDOM FIELDS: Generalization of stochastic processes

    Stochastic process

    Let $(\Omega, \Upsilon, P)$ be a probability space. A stochastic process $z(t, \zeta)$ is defined as a measurable map $z : T \times \Omega \to \mathbb{R}$, where the index set $T \subseteq \mathbb{R}$ serves as the time parameter.

    Definition (Random Field)

    Let $(K, \mathcal{K}, P)$ be a complete probability space and $T$ a topological space. Then a measurable mapping $F : K \to (\mathbb{R}^T)^d$ is called a real-valued random field, where $(\mathbb{R}^T)^d$ is the space of all $\mathbb{R}^d$-valued functions on the topological space $T$.

    Examples:

    • Crop yields and disease/infection over a spatially distributed region

    • Geographical distribution of parameters such as temperature or rainfall

    • Images

    • Magnetic Resonance (MR) Images - Brain and other tissues

    • Dynamics of complex networks


  • Random Fields

    RANDOM FIELDS: Measure-theoretic definition

    Definition

    Let $(K, \mathcal{K}, P)$ be a measure space. Let $G^{N,d}$ be the set of all $\mathbb{R}^d$-valued functions on $\mathbb{R}^N$, with $N, d \in \mathbb{N}$, and let $\mathcal{G}^{N,d}$ be the corresponding σ-algebra. Then a measurable map $F : (K, \mathcal{K}) \to (G^{N,d}, \mathcal{G}^{N,d})$ is called an N-dimensional random field.

    • The random field $F$ maps elements $k$ of the sample space $K$ to functions in $G^{N,d}$. By measurability, the preimage of every set in $\mathcal{G}^{N,d}$ is a set in $\mathcal{K}$

    • $\mathcal{G}^{N,d}$ contains sets of the form $\{g \in G^{N,d} : g(\vec{r}_i) \in B_i,\ i = 1, \dots, m\}$, where $m$ is arbitrary, $\vec{r}_i \in \mathbb{R}^N$ and $B_i \in \mathcal{B}^d$

    • Note: Sets of the form $\{g \in G^{N,d} : g(\vec{r}_\alpha) \in B_\alpha,\ \alpha \in I\}$, where $I$ is uncountable, are not usually present in $\mathcal{G}^{N,d}$. Including them requires $F$ to be separable.

    • For a given $k \in K$, the corresponding function in $G^{N,d}$ is called a realization of the random field and is denoted $F(\bullet, k)$. At a given point $\vec{r} \in \mathbb{R}^N$ the value of this function is written as $F(\vec{r}, k)$


  • Random Fields

    RANDOM FIELDS: Kolmogorov's definition

    Definition

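    Let $F$ be a family of random variables such that

    $$F = \{F(\vec{r}, \bullet) : (\Omega, \Upsilon) \to (\mathbb{R}^d, \mathcal{B}^d),\ \vec{r} \in \mathbb{R}^N\} \tag{1}$$

    Then $F$ is a random field if the distribution function $P_{\vec{r}_1, \dots, \vec{r}_n}(\vec{x}_1, \dots, \vec{x}_n)$, $\vec{x}_i \in \mathbb{R}^d$, satisfies the following:

    1 Symmetry: $P_{\vec{r}_1, \dots, \vec{r}_n}(\vec{x}_1, \dots, \vec{x}_n)$ is invariant under identical permutations of $\vec{x}$ and $\vec{r}$.

    2 Consistency: $P_{\vec{r}_1, \dots, \vec{r}_{n+m}}(B \times \mathbb{R}^{md}) = P_{\vec{r}_1, \dots, \vec{r}_n}(B)$ for every $n, m \ge 1$ and $B \in \mathcal{B}^{nd}$

    • Distribution function $P_{\vec{r}_1, \dots, \vec{r}_n}(\vec{x}_1, \dots, \vec{x}_n) = P(\dots, (-\infty, \vec{x}_i], \dots)$. Each semi-open interval $(-\infty, \vec{x}_i]$ is d-dimensional.

    • Distribution function $P_{\vec{r}_1, \dots, \vec{r}_n}(\vec{x}_1, \dots, \vec{x}_n)$ is defined on $\mathcal{B}^{nd}$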


  • Random Fields

    RANDOM FIELDS ON GRAPHS

    • Let $G \triangleq (S, E)$ be a graph with node set $S = \{s_1, s_2, \dots, s_N\}$ and edge set $E$

    • Let $\partial = \{\partial_i\}_{s_i \in S}$ be the neighborhood system, where $\partial_i$ is the neighbor set of node $s_i$.

    • $G = (S, E)$ and $G = (S, \partial)$ are equivalent definitions. For a given edge set $E$, $\partial_i = \{s_k : (s_i, s_k) \in E\}$

    • $\Omega$ is the set of all states (or labels) that any node $s_i$ can take

    • $K \triangleq \Omega^N$ is called the configuration space

    Random field over graph

    A random field over $G = (S, E)$ is defined as

    $$F = \{F_i : \Omega \to \mathbb{R}^d,\ i = 1, 2, \dots, N\}$$

    • $k \in K$ is an ordered sequence $(\omega_1, \omega_2, \dots, \omega_N)$ of $N$ states
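The equivalence between the edge-set and neighborhood-system descriptions, and the size of the configuration space, can be illustrated with a minimal sketch. The four-node cycle, the binary state set, and the helper name `neighborhood_system` are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def neighborhood_system(nodes, edges):
    """Build {s_i: neighbor set} from an undirected edge set E."""
    nbrs = {s: set() for s in nodes}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    return nbrs

# Hypothetical example graph: a cycle on four nodes.
nodes = ["s1", "s2", "s3", "s4"]
edges = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s4", "s1")]
nbrs = neighborhood_system(nodes, edges)

# Configuration space K = Omega^N for an assumed binary state set Omega.
states = (0, 1)
configs = list(product(states, repeat=len(nodes)))
```

Here `nbrs["s1"]` contains `s2` and `s4`, and `configs` enumerates all |Ω|^N ordered state sequences.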


  • Random Fields

    MARKOV RANDOM FIELDS: Random field on graphs

    Definition

    $F$ is called a Markov Random Field (MRF) with respect to $G$ (or equivalently $\partial$) if and only if it satisfies the conditions:

    1 Positivity: $P(k) > 0,\ \forall k \in K$

    2 Markov Property: $P(\omega_i \mid k_{S \setminus \{i\}}) = P(\omega_i \mid k_{\partial_i})$

    where $k_{S \setminus \{i\}}$ is the configuration specified for the node set $S \setminus \{i\}$, $k_{\partial_i}$ is the configuration of the neighbors of $s_i$, and $P$ is the probability measure over the random field $F$.

    • Markov random field evolves as a local process

    • Allows contextual constraints or dependencies to be modeled.


  • Random Fields

    GIBBS RANDOM FIELDS: Random field on graphs

    Definition

    A random field $F$ on a graph $G$ is called a Gibbs random field (GRF) when the probability measure $P$ on $F$ follows the Gibbs distribution

    $$P(k) = \frac{1}{Z} \exp(-\beta H(k)) \tag{2}$$

    where

    • $Z = \sum_{k \in K} \exp(-\beta H(k))$ is the partition function and serves as a normalizing constant

    • $\beta$ is the inverse temperature

    • $H(k)$ is called the Hamiltonian and defines the energy of the configuration $k \in K$

    • Gibbs Random Field evolves as a global process
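For a very small graph the Gibbs distribution can be evaluated by brute-force enumeration of the configuration space. The sketch below assumes an Ising-type Hamiltonian on a four-node cycle with states ±1; the energy function, the graph, and the function names are illustrative choices, not part of the slides.

```python
import itertools
import math

def gibbs_distribution(hamiltonian, states, n_nodes, beta):
    """Enumerate K = states^n_nodes and return (P, Z) for P(k) = exp(-beta*H(k)) / Z."""
    configs = list(itertools.product(states, repeat=n_nodes))
    weights = [math.exp(-beta * hamiltonian(k)) for k in configs]
    Z = sum(weights)  # partition function
    P = {k: w / Z for k, w in zip(configs, weights)}
    return P, Z

# Assumed example: pairwise energy on a 4-cycle, H(k) = -sum over edges of k_i * k_j.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]

def ising_energy(k):
    return -sum(k[i] * k[j] for i, j in EDGES)

P, Z = gibbs_distribution(ising_energy, (-1, +1), n_nodes=4, beta=0.5)
most_likely = max(P, key=P.get)
```

Enumeration costs |Ω|^N evaluations of H, which is why Z becomes intractable for networks of realistic size.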


  • Random Fields

    MARKOV-GIBBS EQUIVALENCE

    • The Hamiltonian or the energy function for $k$ is defined as

    $$H(k) = \sum_{A \subset S} V_A(k_A)$$

    • The $V_A$ are called clique potentials if $V_A = 0$ whenever $A$ is not a clique, so that, with $C$ the set of cliques,

    $$H(k) = \sum_{c \in C} V_c(k_c)$$

    • The corresponding Gibbs field is called a neighbor Gibbs random field

    [Figure: a four-node graph on S1, S2, S3, S4 and its cliques]

    Theorem (MRF-GRF Equivalence)

    Let $\partial$ be the neighborhood system on a node set $S$. Let $F$ be a random field on $S$. Then $F$ is a Markov random field with respect to $\partial$ if and only if it is a neighbor Gibbs random field with respect to $\partial$

    Remarks

    • Equivalence theorem essentially relates global and local behavior

    • Provides a convenient way to model local Markov dependencies as clique potentials

    • Joint density (global behavior) is given by the Gibbs distribution


  • Random Fields

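    MARKOV-GIBBS EQUIVALENCE: Proof

    Proof that GRF ⟹ MRF

    Let $k'$ denote a configuration that agrees with $k$ everywhere except possibly at node $s_i$, where it takes the value $\omega'_i$. Then

    $$P(\omega_i \mid k_{S \setminus \{i\}}) = \frac{P(\omega_i, k_{S \setminus \{i\}})}{P(k_{S \setminus \{i\}})} = \frac{P(k)}{\sum_{\omega'_i \in \Omega} P(k')} \tag{3}$$

    $$P(\omega_i \mid k_{S \setminus \{i\}}) = \frac{\exp(-\beta H(k))}{\sum_{\omega'_i \in \Omega} \exp(-\beta H(k'))} \tag{4}$$

    $$P(\omega_i \mid k_{S \setminus \{i\}}) = \frac{\Lambda(C_i, k)\, \Lambda(\tilde{C}_i, k)}{\sum_{\omega'_i \in \Omega} \Lambda(C_i, k')\, \Lambda(\tilde{C}_i, k')} \tag{5}$$

    where $C_i$ is the set of cliques containing $s_i$, $\tilde{C}_i$ is the set of cliques not containing $s_i$, and

    $$\Lambda(A, Q) = \exp\Big(-\beta \sum_{c \in A} V_c(Q_c)\Big)$$

    Since $k$ and $k'$ differ only at $s_i$, $\Lambda(\tilde{C}_i, k') = \Lambda(\tilde{C}_i, k)$ and this factor cancels:

    $$P(\omega_i \mid k_{S \setminus \{i\}}) = \frac{\exp\big(-\beta \sum_{c \in C_i} V_c(k_c)\big)}{\sum_{\omega'_i \in \Omega} \exp\big(-\beta \sum_{c \in C_i} V_c(k'_c)\big)} = P(\omega_i \mid k_{\partial_i}) \tag{6}$$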
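The cancellation in the GRF ⟹ MRF argument can be checked numerically on a small example. The sketch below assumes a pairwise (Ising-type) energy on a four-node cycle, where node 0 has neighbors 1 and 3 and node 2 is not a neighbor; the graph, energy, and value of β are illustrative assumptions.

```python
import math

BETA = 0.7
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]  # assumed 4-cycle
STATES = (-1, +1)

def energy(k):
    """Pairwise clique potentials: H(k) = -sum over edges of k_i * k_j."""
    return -sum(k[i] * k[j] for i, j in EDGES)

def conditional(i, value, k):
    """P(omega_i = value | all other nodes as in k), computed from the joint Gibbs weights."""
    def weight(v):
        k2 = list(k)
        k2[i] = v  # k' differs from k only at node i
        return math.exp(-BETA * energy(tuple(k2)))
    return weight(value) / sum(weight(v) for v in STATES)

# Same neighbor states (nodes 1 and 3), different state at the non-neighbor node 2.
a = conditional(0, +1, (+1, +1, +1, +1))
b = conditional(0, +1, (+1, +1, -1, +1))
```

The two conditionals agree up to rounding, because every clique that does not contain node 0 contributes the same factor to the numerator and the denominator.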


  • Random Fields

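    MARKOV-GIBBS EQUIVALENCE: Proof (Contd . . . )

    Proof that MRF ⟹ GRF [Besag]

    $$\frac{P(k^1)}{P(k^2)} = \prod_{i=1}^{|S|} \frac{P(\omega^1_i \mid \omega^1_1, \dots, \omega^1_{i-1}, \omega^2_{i+1}, \dots, \omega^2_{|S|})}{P(\omega^2_i \mid \omega^1_1, \dots, \omega^1_{i-1}, \omega^2_{i+1}, \dots, \omega^2_{|S|})} \tag{7}$$

    This factorization requires the positivity condition $P(k) > 0,\ \forall k \in K$.

    Define

    $$Q(k) \equiv \ln\left(\frac{P(k)}{P(0)}\right) \tag{8}$$

    Theorem

    For $P$ satisfying the conditions of an MRF, there exists an expansion of $Q(k)$, unique on $K$, of the following form:

    $$Q(k) = \sum_{1 \le i \le n} \omega_i G_i(\omega_i) + \sum\sum_{1 \le i \cdots} \cdots$$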

  • Random Fields

    MAIN QUESTIONS & SOLUTIONS: Random Fields

    Problems of interest

    Problems of interest when dealing with a random field can be grouped as

    • Sampling from the joint distribution $P(k)$. Examples:
        • Sensor networks
        • Multi-agent systems, e.g. swarms

    • Minimization of the Hamiltonian function $H(\vec{\sigma})$ over the configuration space $K$. Examples:
        • Optimization under local constraints
        • Image processing: MAP-MRF labeling for restoration of noisy images

    • Computation of expected values. Examples:
        • Physics: net magnetization
        • Social networks: opinion formation

    Solution

    • Exact solution of the joint density is usually difficult due to the intractable $Z$

    • The main solution approaches are sampling methods and variational approximation
        • Metropolis Algorithm
        • Simulated Annealing
        • Gibbs Sampling (Geman & Geman)
        • Variational approximations
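As an illustration of the sampling route, the following is a minimal Gibbs-sampling sketch: each node is resampled in turn from its conditional distribution given its neighbors, so Z is never computed. It assumes an Ising-type energy H(k) = -Σ k_i k_j on a four-node cycle with states ±1; the graph, parameters, and function names are hypothetical.

```python
import math
import random

NEIGHBORS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # assumed 4-cycle
BETA = 0.5

def gibbs_sweep(k, rng):
    """Resample every node once from P(omega_i | neighbors)."""
    for i in range(len(k)):
        field = sum(k[j] for j in NEIGHBORS[i])
        # For H = -sum k_i k_j: P(+1 | nbrs) = 1 / (1 + exp(-2 * beta * field))
        p_up = 1.0 / (1.0 + math.exp(-2.0 * BETA * field))
        k[i] = +1 if rng.random() < p_up else -1
    return k

def sample(n_sweeps, burn_in, seed=0):
    """Run the chain and return empirical configuration frequencies after burn-in."""
    rng = random.Random(seed)
    k = [rng.choice((-1, +1)) for _ in range(len(NEIGHBORS))]
    counts = {}
    for t in range(n_sweeps):
        gibbs_sweep(k, rng)
        if t >= burn_in:
            key = tuple(k)
            counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

freq = sample(n_sweeps=20000, burn_in=1000)
```

The empirical frequencies in `freq` approximate the Gibbs probabilities P(k); the aligned configurations should dominate at this value of β.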


  • Random Fields Approximate methods

    APPROXIMATE METHODS

    Definition (Kullback-Leibler (KL) divergence)

    Let $P$ and $Q$ be two probability measures over $K$. The relative entropy, or Kullback-Leibler (KL) divergence, is defined as:

    $$D_{KL}(Q \| P) = \sum_k Q(k) \ln\left(\frac{Q(k)}{P(k)}\right)$$

    $D_{KL}(Q \| P) \ge 0$; equality holds if and only if $P$ and $Q$ are identical

    For a trial distribution $Q$ and the Gibbs density $P(k)$, with temperature $T = 1/\beta$,

    $$T\, D_{KL}(Q \| P) = T \ln Z + E_Q[H] - T S(Q)$$

    • $E_Q[H]$ is the variational energy

    • $S(Q)$ is the entropy of $Q$

    • $F \equiv -T \ln Z$ is called the Helmholtz free energy

    • $F(Q) = E_Q[H] - T S(Q)$ is called the variational free energy

    Optimization Problem

    • The Helmholtz free energy is bounded by the variational free energy, $F \le F(Q)$, since $T\, D_{KL}(Q \| P) = F(Q) - F \ge 0$

    • The true distribution can be recovered as $P = \arg\min_Q F(Q)$
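The bound F ≤ F(Q) can be verified numerically on a system small enough to enumerate. The sketch below assumes an Ising-type energy on a four-node cycle and a uniform trial distribution Q; both are illustrative choices, and the function name `free_energy_gap` is hypothetical.

```python
import itertools
import math

EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]  # assumed 4-cycle

def energy(k):
    return -sum(k[i] * k[j] for i, j in EDGES)

def free_energy_gap(energy, configs, Q, T):
    """Return (F, F_Q, D_KL) for a trial distribution Q and the Gibbs density at temperature T."""
    beta = 1.0 / T
    weights = {k: math.exp(-beta * energy(k)) for k in configs}
    Z = sum(weights.values())
    P = {k: w / Z for k, w in weights.items()}
    F = -T * math.log(Z)                                   # Helmholtz free energy
    E_Q = sum(Q[k] * energy(k) for k in configs)           # variational energy
    S_Q = -sum(Q[k] * math.log(Q[k]) for k in configs if Q[k] > 0)  # entropy of Q
    F_Q = E_Q - T * S_Q                                    # variational free energy
    D_KL = sum(Q[k] * math.log(Q[k] / P[k]) for k in configs if Q[k] > 0)
    return F, F_Q, D_KL

configs = list(itertools.product((-1, +1), repeat=4))
Q_uniform = {k: 1.0 / len(configs) for k in configs}
T = 2.0
F, F_Q, D_KL = free_energy_gap(energy, configs, Q_uniform, T)
```

The gap `F_Q - F` equals `T * D_KL` up to rounding, so minimizing F(Q) over a family of trial distributions is the same as minimizing the KL divergence to P within that family.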


  • Random Fields Approximate methods

    MEAN FIELD SOLUTIONS: Approximate methods contd . . .

    • For $Q(k) = \prod_i Q_i(\omega_i)$, the approximation obtained is called the mean-field solution

    • All nodes in the system are assumed to be independent under the mean-field $Q$

    • $F(Q)$ can be optimized with respect to the individual factors $Q_i$ to get the mean-field equations

    $$Q_i(\omega_i) = \frac{1}{Z_i} \exp\big(-\beta E_Q[H(\bullet \mid \omega_i)]\big)$$

    where $Z_i = \sum_{\omega_i} \exp\big(-\beta E_Q[H(\bullet \mid \omega_i)]\big)$ is the local partition function

    [Figure: under the mean-field approximation the nodes S1, S2, S3, S4 are decoupled, and each node $s_i$ is driven by its mean field $E_Q[H(\cdot \mid \omega_i)]$]

    Remarks

    • Mean-field solution gives the local update equations for a node si

    • Local updates do not explicitly depend on the states of the rest of the network

    • Local updates rely on the expectation (or mean-fields) EQ[H(•|ωi )] computed w.r.t. Q.

    • Mean-field assumption transforms the underlying graph G to a new graph with no edges

    • Iterative methods can be used to find a fixed point or an approximate solution $Q^*$

    • Possibility of more than one fixed point due to non-linear local update equations
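The remarks above can be made concrete with a small fixed-point iteration. For an assumed Ising-type energy H(k) = -Σ k_i k_j with states ±1, the mean-field equations reduce to m_i = tanh(β Σ_{j∈∂i} m_j), where m_i = E_Q[ω_i]. The four-node cycle, the values of β, and the function name are illustrative assumptions.

```python
import math

NEIGHBORS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # assumed 4-cycle

def mean_field(beta, m0, n_iter=500, tol=1e-12):
    """Iterate m_i <- tanh(beta * sum of neighbor means) until the update is below tol."""
    m = list(m0)
    for _ in range(n_iter):
        new = [math.tanh(beta * sum(m[j] for j in NEIGHBORS[i]))
               for i in range(len(m))]
        converged = max(abs(a - b) for a, b in zip(new, m)) < tol
        m = new
        if converged:
            break
    return m

high_temp = mean_field(beta=0.2, m0=[0.5] * 4)   # small beta
up = mean_field(beta=1.0, m0=[0.5] * 4)          # large beta, positive start
down = mean_field(beta=1.0, m0=[-0.5] * 4)       # large beta, negative start
```

At small β the iteration collapses to the single fixed point m = 0, while at larger β the two starting points reach different fixed points, which illustrates the possibility of more than one solution of the non-linear update equations.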


  • Random Fields Sensor networks

    Collaboration in DSNs: Developing the framework . . .

    Objective

    A framework for collaboration in DSNs based on statistical mechanics and random field approach, which is

    • Robust

    • Decentralized

    • Local in Action, Global in Effect

    • Scalable

    • Resource-aware

    • Dynamically adaptive and event-driven

    • Real time

    • Let $\mathcal{P}$ be a $|Q|$-simplex:

    $$\mathcal{P} = \Big\{ p = (p_1, \dots, p_{|Q|}) : p_k \in \mathbb{R},\ p_k \ge 0,\ \sum_{k=1}^{|Q|} p_k = 1 \Big\}$$

    • Let the set of vertices $V$ of the simplex be

    $$V = \Big\{ \sigma = (v_1, \dots, v_{|Q|}) : v_k \in \{0, 1\},\ \sum_{k=1}^{|Q|} v_k = 1 \Big\}$$

    • Let the sensor network be represented by a graph $G$

    • Define a random field $F = \{F_i : Q \to V,\ s_i \in S\}$

    • $E = \{\epsilon, e_1, \dots, e_{m-1}\}$ is the set of events

    • $\epsilon$ = null event

    • Let $B = \{b_0, b_1, \dots, b_{m-1}\}$ be the vertices of a $|E|$-simplex with $b(\epsilon) = b_0$


  • Random Fields Sensor networks

    Collaboration in DSNs: Developing the framework . . .

    • Clique potentials are defined for inter-node and node-event interaction as follows:

    $$H(\vec{\sigma}) = -\sum_{\{i,j\} \in C_2} \sigma_i^T J_{ij} \sigma_j - \sum_{\{i\} \in C_1} \sigma_i^T K_i b_i$$

    • $J_{ij}$ are $|Q| \times |Q|$ neighbor interaction matrices (NIM) for neighbor pairs $\{i, j\} \in C_2$

    • $K_i$ are $|Q| \times |E|$ event response matrices (ERM) for nodes $s_i$

    [Figure: a 12-node sensor network with node states $\sigma_1, \dots, \sigma_{12}$, neighbor interaction matrices $J_{ij}$ on the edges, and event inputs $b_i$ acting on each node through $K_i$]

    Remarks

    • Objective is to have local node dynamics that conform to the joint density

    • Distributed sampling from the Gibbs distribution P is needed

    • Methods such as Gibbs sampling require current states of neighbors

    • Wireless communication problems prohibit such an approach

    • A variational approximation is proposed for distributed sampling from P

    • For independent updates, the trial distribution $Q = \prod_i Q_i(\sigma_i \mid b_i)$ is completely factorized

    • Motivation is the graphical structure induced by the mean-field approximation


  • Random Fields Sensor networks

    Collaboration in DSNs: Developing the framework . . .

    • Mean-field equations become

    $$Q_i(\sigma_i \mid b_i) = \frac{1}{Z_i} \exp\big(-\beta E_Q[H(\bullet \mid b_i, \sigma_i)]\big)$$

    • Let $Q_i(\sigma_i \mid b_i) = p_i$ be the state probability vector

    • Since $\sigma_i \in V$, $E_Q[\sigma_i \mid b_i] = p_i$

    • Mean-field equations are a map $T_i(e, \bullet) : \mathcal{P}^{|\partial_i|} \to \mathcal{P}$

    $$h(\sigma_i; k) = -\sigma_i^T K_i b_i(\tau) - \sum_{j \in \partial_i} \sigma_i^T J_{ij}\, p_j(k)$$

    $$p_i(\sigma_i; k+1) = \frac{\exp(-\beta h(\sigma_i; k))}{\sum_{\ell} \exp(-\beta h(\sigma_i^{\ell}; k))}$$

    Remarks

    • Node dynamics are assumed to be faster than the time-scale ($\tau$) of events

    • The maps $T_i(e, \bullet)$ induce a continuous map $T_G(\vec{e}, \bullet) : \mathcal{P}^{|S|} \to \mathcal{P}^{|S|}$ on the entire sensor network

    • By the Brouwer fixed-point theorem, $T_G(\vec{e}, \bullet)$ has a fixed point $T_G(\vec{e}, \vec{p}) = \vec{p}$

    • Since the system is non-linear there may be more than one fixed point

    • For symmetric $J$ and sufficiently small $\beta$, $T_G(\vec{e}, \bullet)$ has only fixed-point attractors

    • Unique fixed point of $T_G(\vec{e}, \bullet)$ for $\beta < \beta_c$, a critical value
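The local update above is a softmax of the negative local energy, which makes it easy to sketch. The example below iterates all nodes synchronously on a hypothetical three-node line network with |Q| = 3 states and |E| = 2 events; the matrices `K`, the coupling `J = w·I`, the event inputs `b`, and all function names are illustrative assumptions rather than values from the slides.

```python
import math

def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def local_update(i, p, b, J, K, neighbors, beta):
    """p_i(k+1) = softmax(-beta * h), with h = -K_i b_i - sum_j J_ij p_j(k)."""
    drive = matvec(K[i], b[i])
    for j in neighbors[i]:
        drive = [d + x for d, x in zip(drive, matvec(J[(i, j)], p[j]))]
    return softmax([beta * d for d in drive])  # -beta * h = beta * drive

def iterate(p, b, J, K, neighbors, beta, n_iter=100):
    """Apply the network map T_G repeatedly (synchronous updates)."""
    for _ in range(n_iter):
        p = [local_update(i, p, b, J, K, neighbors, beta) for i in range(len(p))]
    return p

# Hypothetical setup: line network 0 - 1 - 2, identity coupling scaled by w.
w, beta = 1.0, 0.1
neighbors = {0: [1], 1: [0, 2], 2: [1]}
WI = [[w, 0, 0], [0, w, 0], [0, 0, w]]
J = {(i, j): WI for i in neighbors for j in neighbors[i]}
K = [[[0.0, 2.0], [0.0, 0.0], [1.0, 0.0]]] * 3   # |Q| x |E| event response, same for all nodes
b = [[0, 1], [1, 0], [1, 0]]                      # node 0 sees event e1, others the null event
p0 = [[1 / 3] * 3 for _ in range(3)]
p_star = iterate(p0, b, J, K, neighbors, beta)
```

Each node only needs the probability vectors of its neighbors, not their sampled states. For this small β the iteration settles on a single fixed point `p_star`.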


  • Random Fields Sensor networks

    i-PFSA: Framework for collaboration in DSNs

    Definition (i-PFSA)

    An interacting Probabilistic Finite State Automaton (i-PFSA) is defined as the tuple $A_i = \{Q, E, p, p^*, T_i(e, \{p\}_i)\}$ where:

    • $Q$ is a finite set of states of the automaton with $Q = Q_C \cup Q_{NC}$;

    • $E$ is a strictly partially ordered (using $\prec$) finite set of events;

    • $p \in \mathcal{P}$ defines the dynamics, or the state visit probability vector;

    • $p^* \in \mathcal{P}$ is a vector of reference state visit probabilities; and

    • $T_i(e, \{p\}_i)$ defines the local update dynamics

    Remarks

    • (A, G) is an i-PFSA network where A = {Ai , si ∈ S}

    • i-PFSA runs at the top (application) layer of sensor nodes

    • Probabilistic state transitions generate a state sequence

    • Actions in each state may lead to the discovery of events

    • Visiting set QC may give new dynamics of neighboring nodes

    • New dynamics of i-PFSA are computed if required

    [Figure: an i-PFSA with states q1, q2, q3 under neighborhood interaction; the state sequence . . . q1 q2 . . . produces actions and events]


  • Adaptive Sensor Activity Scheduling Problem Description

    Adaptive sensor activity scheduling: Problem description

    Problem description

    • Sensor nodes are often operated on very low duty cycles

    • Fixed duty-cycles good for data collection, not for detection of rare and random events

    • Most power saving schemes available are geared towards communication

    • Enabling detection of unpredictable events requires sensor nodes to be omni-active

    • Energy costs for sensing are further aggravated if active sensors (e.g. radar) are in use

    Methods in use

    • Coordinated sleep/activity scheduling methods

    • Un-coordinated or randomized scheduling

    • A hierarchical approach of passive vigilance

    • Switching between low resolution and a higher resolution sensing mode

    Objective

    Event-driven adaptive scheduling of sensor activity to enable resource-aware detection and tracking using the i-PFSA framework


  • Adaptive Sensor Activity Scheduling Target Tracking

    Adaptive Sensor Activity Scheduling (A-SAS): Target tracking

    [Figure: target tracking on a 30 × 20 sensor field. Snapshots show node states Inactive, Sense+Rx, Cluster Head and Tx; panels (a), (b), (c) show contour plots with levels from 0.4 to 0.9]


  • Adaptive Sensor Activity Scheduling Target Tracking

    DEMONSTRATION: Adaptive Sensor Activity Scheduling (A-SAS)

    [Embedded videos: SensorFieldMovie.avi, ContourMovie.avi]

  • Adaptive Sensor Activity Scheduling Adaptive pattern tracking in multi-hop networks

    Adaptive pattern tracking in multi-hop networks

    • Q = {S + R, R, I}

    • QC = {S + R, R}; QNC = {I}

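    • $E = \{\epsilon, \text{mEvent}, \text{tEvent}\}$

    • $\epsilon \prec \text{mEvent} \prec \text{tEvent}$

    • Dominance defined as $p_C \ge p_d$

    • $p^* = [0.3\ \ 0.1\ \ 0.6]^T$; $p = [0.3\ \ 0.69\ \ 0.01]^T$ and $p = [0.99\ \ 0.005\ \ 0.005]^T$ for mEvent and tEvent respectively

    $$J = \begin{pmatrix} w & 0 & 0 \\ 0 & w & 0 \\ 0 & 0 & w \end{pmatrix}$$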

    • Local sinks determined via diffusion of sink-hop values

    • mEvent is triggered at a node when it is the local sink

    [Figure: eight snapshots of the sensor field on a 300 × 200 region, with a color scale from 0 to 1]


  • Adaptive Sensor Activity Scheduling Adaptive pattern tracking in multi-hop networks

    REFERENCES

    1 Markov Random Field Modeling in Computer Vision, Stan Z. Li, Springer-Verlag

    2 Markov Random Fields and Their Applications, R. Kindermann and J.L. Snell, Contemporary Mathematics Series, American Mathematical Society (1980)

    3 Spatial interaction and the statistical analysis of lattice systems, J. Besag, J. of the Royal Statistical Society, Series B, Vol. 36, No. 2 (1974)

    4 Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images, S. Geman and D. Geman, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 6, No. 6 (1984)

    5 On the analysis of dirty pictures, J. Besag, J. of the Royal Statistical Society, Series B, Vol. 48, No. 4 (1986)

