Estimating Time-Varying Networks
Mladen Kolar
The University of Chicago, Booth School of Business
November 20, 2013
Acknowledgments
E. Xing, L. Song, A. Ahmed, A. Parikh
Networks are mathematical abstractions of complex systems

Networks are useful for
- visualization
- discovery of regularity patterns
- exploratory analysis
- . . .
of complex systems.
Interactions between variables are not always observable
Data collected over a period of time is easily accessible
Estimating time-varying networks
Talk Objective
How to recover changing interactions between objects from data collected over time?
Challenges:
- Small number of samples
- Large number of objects
- Noisy data
- Data may contain missing values
- . . .
Outline
1 Estimating Conditional Independence Relationships
Representation – Markov Networks
Estimating Graph Structure
2 Time-Varying Networks
Smoothly Varying Networks
Networks With Jumps
3 An Application
4 Some theoretical results
Markov Networks
Random vector $X = (X_1, \ldots, X_p)'$

Graph $G = (V, E)$ with $p$ nodes
- represents conditional independence relationships between nodes
Useful for exploring associations between measured variables
$$(a, b) \notin E \iff X_a \perp X_b \mid X_{\setminus ab} \qquad (\setminus ab := V \setminus \{a, b\})$$
$$\mathbb{P}[X_a \mid X_b, X_{\setminus ab}] = \mathbb{P}[X_a \mid X_{\setminus ab}]$$
(Koller and Friedman, 2009)
Two Common Markov Networks
Gaussian Markov Network: $X \sim \mathcal{N}(\mu, \Sigma)$
$$p(x) \propto \exp\left(-\tfrac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$
The precision matrix $\Omega = \Sigma^{-1}$ encodes both the parameters and the graph structure
$$\Omega = \begin{pmatrix} \ast & \ast & \ast & \ast & \ast & 0 \\ \ast & \ast & \ast & \ast & \ast & 0 \\ \ast & \ast & \ast & 0 & 0 & 0 \\ \ast & \ast & 0 & \ast & 0 & 0 \\ \ast & \ast & 0 & 0 & \ast & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$
[Figure: graph on nodes 1–6 corresponding to this sparsity pattern; node 6 is isolated]
Discrete Markov Network: $X \in \{-1, 1\}^p$ (Ising model)
$$p(x; \Theta) \propto \exp\left(\sum_{a \in V} \theta_{aa} x_a + \sum_{(a, b) \in V \times V} \theta_{ab} x_a x_b\right)$$
$\Theta = (\theta_{ab})_{ab}$ encodes the conditional independence relationships
(Koller and Friedman, 2009; Lauritzen, 1996)
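To make the correspondence between the zero pattern of $\Omega$ and the graph concrete, here is a minimal Python sketch (illustrative only; the numerical values below are hypothetical, only the sparsity pattern from the example above matters):

```python
import numpy as np

# Precision matrix with the sparsity pattern from the 6-node example above;
# the off-diagonal zeros/nonzeros are what define the Markov network.
omega = np.array([
    [2.0, 0.4, 0.3, 0.3, 0.3, 0.0],
    [0.4, 2.0, 0.3, 0.3, 0.3, 0.0],
    [0.3, 0.3, 2.0, 0.0, 0.0, 0.0],
    [0.3, 0.3, 0.0, 2.0, 0.0, 0.0],
    [0.3, 0.3, 0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 2.0],
])

def edges_from_precision(omega, tol=1e-10):
    """Edge set {(a, b) : Omega[a, b] != 0, a < b}, with 1-indexed nodes."""
    p = omega.shape[0]
    return [(a + 1, b + 1) for a in range(p) for b in range(a + 1, p)
            if abs(omega[a, b]) > tol]

print(edges_from_precision(omega))
# [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]  -- node 6 is isolated
```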
Structure Learning Problem
Given an i.i.d. sample $\mathcal{D}_n = \{x_i\}_{i=1}^n$ from a distribution $\mathbb{P} \in \mathcal{P}$

Learn the set of conditional independence relationships
$$\hat{G} = \hat{G}(\mathcal{D}_n)$$

Gaussian Markov Networks (Drton and Perlman, 2007)
- Form the maximum likelihood estimator of the covariance matrix
- Test for zeros in the precision matrix

Discrete Markov Networks (Chickering, 1996)
- Hard to learn the structure, since the log partition function cannot be evaluated efficiently
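As a rough illustration of the Gaussian recipe above, the sketch below forms the MLE of the covariance, inverts it, and declares edges where the estimated partial correlations are clearly nonzero (a plain cutoff stands in for the formal testing procedure of Drton and Perlman, and it assumes $n \gg p$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate n i.i.d. samples from N(0, Sigma) with a known sparse precision matrix.
p, n = 6, 2000
omega_true = np.eye(p)
omega_true[0, 1] = omega_true[1, 0] = 0.4   # edge (1, 2)
omega_true[2, 3] = omega_true[3, 2] = 0.4   # edge (3, 4)
x = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega_true), size=n)

# MLE of the covariance matrix and the implied precision matrix.
sigma_hat = np.cov(x, rowvar=False, bias=True)
omega_hat = np.linalg.inv(sigma_hat)

# Partial correlations; keep pairs whose estimate clearly exceeds the noise level.
d = np.sqrt(np.diag(omega_hat))
partial_corr = -omega_hat / np.outer(d, d)
edges = [(a + 1, b + 1) for a in range(p) for b in range(a + 1, p)
         if abs(partial_corr[a, b]) > 0.1]
print(edges)   # expected: [(1, 2), (3, 4)]
```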
Structure Learning in High-Dimensions
Penalized Pseudo-Likelihood Estimation
- Neighborhood Selection
- Useful for learning the structure of Gaussian and discrete Markov Networks
$$\hat{\theta}_a = \arg\max_{\theta_a \in \mathbb{R}^p} \sum_{i \in [n]} \gamma(\theta_a; x_i) - \lambda \|\theta_a\|_1$$

Conditional likelihood: $\gamma(\theta_a; x_i) = \log \mathbb{P}[x_{i,a} \mid x_{i,\setminus a}; \theta_a]$
(Meinshausen and Buhlmann, 2006; Ravikumar, Wainwright, and Lafferty, 2009)
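In the Gaussian case the penalized pseudo-likelihood above reduces to an $\ell_1$-penalized regression of each node on all the others. A minimal sketch (scikit-learn's Lasso as the solver; the penalty level `lam` is an illustrative choice rather than a tuned value):

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(x, lam):
    """Lasso regression of each node on the remaining nodes.

    x   : (n, p) data matrix, one sample per row
    lam : l1 penalty level (lambda)
    Returns {a: estimated neighborhood of node a}, nodes 0-indexed.
    """
    n, p = x.shape
    neighborhoods = {}
    for a in range(p):
        others = [b for b in range(p) if b != a]
        fit = Lasso(alpha=lam).fit(x[:, others], x[:, a])
        neighborhoods[a] = {others[j] for j, w in enumerate(fit.coef_) if w != 0.0}
    return neighborhoods

def combine_and(neighborhoods):
    """AND rule: keep edge (a, b) only if a selects b and b selects a."""
    return {(a, b) for a in neighborhoods for b in neighborhoods[a]
            if a < b and a in neighborhoods[b]}

# Usage: edges = combine_and(neighborhood_selection(x, lam=0.1))
```

The AND/OR symmetrization step is one common way to turn the per-node neighborhoods into a single undirected graph.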
Neighborhood Selection
Local structure estimation
$$\hat{\theta}_a = \arg\max_{\theta_a \in \mathbb{R}^p} \ell(\theta_a; \mathcal{D}_n) - \lambda \|\theta_a\|_1$$
$$\hat{\theta}_1 = (\ast\ \ast\ 0\ \ast\ 0\ 0\ 0)$$

Estimated neighborhood
$$\hat{N}_a = \{b \in V \mid \hat{\theta}_{ab} \neq 0\}, \qquad \hat{N}_1 = \{2, 3, 5\}$$

[Figure: node 1 with candidate coefficients $\theta_{12}, \ldots, \theta_{18}$ linking it to nodes 2–8; the nonzero estimates give the neighborhood $\{2, 3, 5\}$]
Properties of Neighborhood Selection
Graph structure can be recovered consistently
- provable guarantees in a high-dimensional setting
- Meinshausen and Buhlmann (2006); Ravikumar, Wainwright, and Lafferty (2009); Peng, Wang, Zhou, and Zhu (2009)
Fast estimation procedures
- efficient solvers for $\ell_1$-penalized problems
- Beck and Teboulle (2009); Friedman, Hastie, and Tibshirani (2008)
Outline
1 Estimating Conditional Independence Relationships
Representation – Markov Networks
Estimating Graph Structure
2 Time-Varying Networks
Smoothly Varying Networks
Networks With Jumps
3 An Application
4 Some theoretical results
Estimating Time-Varying Networks
$$E^t = \{(a, b) \in V \times V \mid \theta^t_{ab} \neq 0\}$$
General Estimation Framework
Data: $\mathcal{D}_n = \{x^t \mid x^t \sim \mathbb{P}(\theta^t; G^t)\}_{t \in \mathcal{T}_n}$, where $\mathcal{T}_n = \{1/n, 2/n, \ldots, 1\}$

$$\arg\max_{\theta^t} \; \ell(\mathcal{D}_n, \theta^t) - \mathrm{pen}(\theta^t)$$

Loss $\ell(\mathcal{D}_n, \theta^t)$
- measures the fit of the model to the data

Penalty $\mathrm{pen}(\theta^t)$
- balances model complexity against the fit to the data
- encodes structural assumptions about the model class
Two scenarios
1 Smooth Networks
(Song et al., 2009; Kolar et al., 2010; Kolar and Xing, 2011; Kolar and Xing, 2012c)

2 Networks With Jumps
(Kolar et al., 2010; Kolar, Song, and Xing, 2009; Kolar and Xing, 2012a)
Smoothly Evolving Networks
$$\gamma(\theta; x^t) = \log \mathbb{P}[x_{t,a} \mid x_{t,\setminus a}; \theta]$$
$$w_\tau(t) = \frac{K_h(t - \tau)}{\sum_{t' \in \mathcal{T}_n} K_h(t' - \tau)}$$
Kolar, Song, Ahmed, and Xing (2010)
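A minimal sketch of the smoothly varying case, assuming (this objective is not spelled out in the extracted slides) that the estimate at a time point $\tau$ maximizes a kernel-weighted, $\ell_1$-penalized pseudo-likelihood; for Gaussian data this is a weighted lasso regression of node $a$ on the remaining nodes:

```python
import numpy as np
from sklearn.linear_model import Lasso

def kernel_weights(times, tau, h):
    """Gaussian kernel weights w_tau(t) = K_h(t - tau) / sum_t' K_h(t' - tau)."""
    k = np.exp(-0.5 * ((times - tau) / h) ** 2)
    return k / k.sum()

def neighborhood_at(x, times, a, tau, h, lam):
    """Kernel-weighted lasso estimate of node a's neighborhood at time tau.

    x     : (n, p) observations, one row per time point in `times`
    times : (n,) observation times in [0, 1]
    """
    w = kernel_weights(times, tau, h)
    others = [b for b in range(x.shape[1]) if b != a]
    fit = Lasso(alpha=lam).fit(x[:, others], x[:, a], sample_weight=w)
    return {others[j] for j, c in enumerate(fit.coef_) if c != 0.0}

# Usage: track how node 0's neighborhood evolves along a grid of time points.
# for tau in np.linspace(0.1, 0.9, 9):
#     print(tau, neighborhood_at(x, times, a=0, tau=tau, h=0.1, lam=0.05))
```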
Networks With Jumps
$$\max_{\{\theta^t\}_{t \in \mathcal{T}_n}} \; \sum_t \gamma(\theta^t; x^t) - \lambda_1 \sum_t \|\theta^t\|_1 - \lambda_2 \sum_t \|\theta^t - \theta^{t-1}\|_2$$
- $\lambda_1 \sum_t \|\theta^t\|_1$ encourages sparsity
- $\lambda_2 \sum_t \|\theta^t - \theta^{t-1}\|_2$ is the fused penalty (Tibshirani et al., 2005); it captures structural changes

[Figure: coefficient paths $\theta^t_{12}, \theta^t_{13}, \ldots, \theta^t_{1p}$ over time, piecewise constant with occasional jumps]

Kolar, Song, Ahmed, and Xing (2010)
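To make the roles of the two penalties explicit, the sketch below simply evaluates this objective for a candidate sequence of coefficient vectors for one node, with a Gaussian pseudo-likelihood standing in for the generic loss $\gamma$ (the slides leave it abstract); maximizing it would still require a dedicated solver:

```python
import numpy as np

def fused_objective(theta_seq, x_seq, a, lam1, lam2):
    """Evaluate the jump-penalized objective for node `a` (to be maximized).

    theta_seq : (T, p-1) array, one coefficient vector per time point
    x_seq     : list of T arrays; x_seq[t] has shape (n_t, p)
    """
    fit = 0.0
    for t, xt in enumerate(x_seq):
        others = np.delete(xt, a, axis=1)
        resid = xt[:, a] - others @ theta_seq[t]
        fit -= 0.5 * np.sum(resid ** 2)           # Gaussian pseudo-likelihood, up to constants
    sparsity = lam1 * np.sum(np.abs(theta_seq))   # lambda_1 * sum_t ||theta^t||_1
    jumps = lam2 * sum(np.linalg.norm(theta_seq[t] - theta_seq[t - 1])
                       for t in range(1, len(theta_seq)))  # lambda_2 * sum_t ||theta^t - theta^{t-1}||_2
    return fit - sparsity - jumps
```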
Outline
1 Estimating Conditional Independence Relationships
Representation – Markov Networks
Estimating Graph Structure
2 Time-Varying Networks
Smoothly Varying Networks
Networks With Jumps
3 An Application
4 Some theoretical results
Drosophila Life Cycle
Data from Arbeitman et al. (2002)
66 microarray measurements across the full life cycle
Four stages in the life cycle
- embryo
- larva
- pupal
- adult
Analyze a subset of 588 genes related to development
Kolar, Song, Ahmed, and Xing (2010)
Estimated Dynamic Network
[Figure: estimated dynamic network; genes annotated by Gene Ontology category: molecular function, biological process, cellular component]
Transient Group Interactions
Known Gene Interactions
Outline
1 Estimating Conditional Independence Relationships
Representation – Markov Networks
Estimating Graph Structure
2 Time-Varying Networks
Smoothly Varying Networks
Networks With Jumps
3 An Application
4 Some theoretical results
Theoretical Properties
Theorem (Kolar and Xing (2012c))
Under suitable technical assumptions, the graph $G^\tau$ is recovered with exponentially high probability for any fixed point $\tau \in [0, 1]$.

Fisher information matrix:
$$Q^\tau_a := \mathbb{E}\left[\nabla^2 \log \mathbb{P}_{\theta^\tau_a}[X_a \mid X_{\setminus a}]\right], \quad a \in V, \; \tau \in [0, 1]$$
- bounded eigenvalues
- incoherence condition

Smoothness: the entries of $\Sigma^t = (\sigma^t_{ab})$ are smooth functions of time

The kernel satisfies regularity conditions
Parameters: $\lambda \asymp \frac{\sqrt{\log p}}{n^{1/3}}$, $h \asymp n^{-1/3}$

Sparsity: $\frac{s^3 \log p}{n^{2/3}} = o(1)$ ($s$ – maximal node degree)

Signal strength: $\theta_{\min} = \min_{e \in E^\tau} |\theta^\tau_e| = \Omega\left(\frac{\sqrt{s \log p}}{n^{1/3}}\right)$

$$\mathbb{P}\left[\text{graph not recovered}\right] = O\left(\exp\left(-C s^{-3} n h + C' \log p\right)\right) \xrightarrow{n, p \to \infty} 0$$
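A quick check of why these conditions make the failure probability vanish (my reading of the displayed rates; this step is not spelled out on the slides): with $h \asymp n^{-1/3}$ the effective sample size is $nh \asymp n^{2/3}$, so the exponent in the bound satisfies
$$-C s^{-3} nh + C' \log p \;\asymp\; -C s^{-3} n^{2/3} + C' \log p \;=\; -\log p \left(C \, \frac{s^{-3} n^{2/3}}{\log p} - C'\right),$$
and the sparsity condition $\frac{s^3 \log p}{n^{2/3}} = o(1)$ says precisely that $\frac{s^{-3} n^{2/3}}{\log p} \to \infty$, so the exponent tends to $-\infty$ and the probability of missing the graph goes to $0$ as $n, p \to \infty$.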
Simulation Results
[Figure: chain graph. Hamming distance versus scaled sample size $n/(s^{4.5} \log^{1.5}(p))$ for $p = 60, 100, 140$]
Thank you!
References
Arbeitman, M., E. Furlong, F. Imam, E. Johnson, B. Null, B. Baker, M. Krasnow, M. Scott, R. Davis, and K. White (2002). “Gene Expression During the Life Cycle of Drosophila melanogaster”. In: Science 297, pp. 2270–2275.
Balakrishnan, S., M. Kolar, A. Rinaldo, and A. Singh (2012). “Recovering Block-structured Activations Using Compressive Measurements”. In: arXiv preprint arXiv:1209.3431.
Banerjee, Onureena, Laurent El Ghaoui, and Alexandre d’Aspremont (2008). “Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data”. In: J. Mach. Learn. Res. 9, pp. 485–516.
Beck, A. and M. Teboulle (2009). “A fast iterative shrinkage-thresholding algorithm for linear inverse problems”. In: SIAM Journal on Imaging Sciences 2.1, pp. 183–202.
Chickering, David Maxwell (1996). “Learning Bayesian Networks is NP-Complete”. In: Learning from Data: Artificial Intelligence and Statistics V. Springer-Verlag, pp. 121–130.
Drton, Mathias and Michael D. Perlman (2007). “Multiple Testing and Error Control in Gaussian Graphical Model Selection”. In: Statistical Science 22.3, pp. 430–449. doi: 10.1214/088342307000000113.
Friedman, J., T. Hastie, and R. Tibshirani (2008). “Regularization paths for generalized linear models via coordinate descent”. In: Department of Statistics, Stanford University, Tech. Rep.
Kolar, M., H. Liu, and E. P. Xing (2012). “Graph Estimation From Multi-attribute Data”. In: ArXiv e-prints. arXiv:1210.7665 [stat.ML].
Kolar, M. and J. Sharpnack (2012). “Variance function estimation in high-dimensions”. In: ArXiv e-prints. arXiv:1205.4770 [stat.ML].
Kolar, M. and E. P. Xing (2011). “On Time Varying Undirected Graphs”. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS).
— (2012a). “Estimating Networks With Jumps”. In: Electronic Journal of Statistics.
— (2012b). “Estimating Sparse Precision Matrices from Data with Missing Values”. In:
Kolar, M., S. Balakrishnan, A. Rinaldo, and A. Singh (2011). “Minimax localization of structural information in large noisy matrices”. In: Advances in Neural Information Processing Systems.
Kolar, Mladen, John Lafferty, and Larry Wasserman (2011). “Union Support Recovery in Multi-task Learning”. In: J. Mach. Learn. Res. 12, pp. 2415–2435.
Kolar, Mladen and Hai Liu (2012a). “Optimal ROAD For Feature Selection in High-Dimensional Classification”. In: Submitted.
— (2012b). “Union Support Recovery in Multi-task Learning”. In: J. Mach. Learn. Res. (AISTATS track) WCP Vol 22, pp. 647–655.
Kolar, Mladen, Ankur P. Parikh, and Eric P. Xing (2010). “On Sparse Nonparametric Conditional Covariance Selection”. In: ICML ’10: Proceedings of the 27th Annual International Conference on Machine Learning. Haifa, Israel.
Kolar, Mladen, Le Song, and Eric Xing (2009). “Sparsistent Learning of Varying-coefficient Models with Structural Changes”. In: Advances in Neural Information Processing Systems 22. Ed. by Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, pp. 1006–1014.
Kolar, Mladen and Eric P. Xing (2010). “Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: Screening Approach”. In: AISTATS 2010: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, pp. 413–420.
Kolar, Mladen and Eric P. Xing (2012c). “Sparsistent Estimation of Time-Varying Discrete Markov Random Fields”. In: arXiv:0907.2337.
Kolar, Mladen, Le Song, Amr Ahmed, and Eric P. Xing (2010). “Estimating Time-Varying Networks”. In: Annals of Applied Statistics 4.1, pp. 94–123.
Koller, D. and N. Friedman (2009). Probabilistic graphical models: principles and techniques. MIT Press.
Lauritzen, S. L. (1996). Graphical Models (Oxford Statistical Science Series). Oxford University Press, USA.
Meinshausen, Nicolai and Peter Buhlmann (2006). “High-dimensional graphs and variable selection with the Lasso”. In: Annals of Statistics 34.3, pp. 1436–1462.
Nesterov, Yu. (2005). “Smooth minimization of non-smooth functions”. In: Mathematical Programming 103.1, pp. 127–152. doi: 10.1007/s10107-004-0552-5.
Peng, Jie, Pei Wang, Nengfeng Zhou, and Ji Zhu (2009). “Partial Correlation Estimation by Joint Sparse Regression Models”. In: Journal of the American Statistical Association 104.486, pp. 735–746. doi: 10.1198/jasa.2009.0126.
Ravikumar, P., M. J. Wainwright, and J. D. Lafferty (2009). “High-Dimensional Ising Model Selection Using $\ell_1$-Regularized Logistic Regression”. In: Annals of Statistics, to appear.
Song, L., M. Kolar, and E. P. Xing (2009). “Time-varying dynamic Bayesian networks”. In: Advances in Neural Information Processing Systems 22, pp. 1732–1740.
Song, Le, Mladen Kolar, and Eric P. Xing (2009). “KELLER: Estimating Time-Evolving Interactions Between Genes”. In: Proceedings of the 16th International Conference on Intelligent Systems for Molecular Biology.
Tibshirani, Robert, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight (2005). “Sparsity and smoothness via the fused lasso”. In: Journal of the Royal Statistical Society, Series B 67.1, pp. 91–108.