Learning Dynamic Models from Unsequenced Data. Jeff Schneider, School of Computer Science, Carnegie Mellon University. Joint work with Tzu-Kuo Huang and Le Song. Learning dynamic models: Hidden Markov Models (e.g. for speech recognition), Dynamic Bayesian Networks (e.g. for protein/gene interaction).
1
Learning Dynamic Models from Unsequenced Data
Jeff Schneider
School of Computer Science, Carnegie Mellon University
joint work with Tzu-Kuo Huang, Le Song
2
Hubble Ultra Deep Field
Learning Dynamic Models
Hidden Markov Models, e.g. for speech recognition
Dynamic Bayesian Networks, e.g. for protein/gene interaction
System Identification, e.g. for control
[source: Wikimedia Commons]
[source: SISL ARLUT]
[Bagnell & Schneider, 2001] [source: UAV ETHZ]
• Key Assumption: SEQUENCED observations
• What if observations are NOT SEQUENCED?
3
When are Observations not Sequenced?
Galaxy evolution
• dynamics are too slow to watch
Slow-developing diseases
• Alzheimer's
• Parkinson's
Biological processes
• measurements are often destructive
[source: STAGES]
[source: Getty Images]
[source: Bryan Neff Lab, UWO]
How can we learn dynamic models for these?
4
Outline
• Linear Models [Huang and Schneider, ICML, 2009]
• Nonlinear Models [Huang, Song, Schneider, AISTATS, 2010]
• Combining Sequence and Unsequenced Data [Huang and Schneider, NIPS, 2011]
5
Problem Description
Estimate A from the sample of xi’s
6
Doesn't seem impossible …
7
Identifiability Issues
8
Identifiability Issues
9
A Maximum Likelihood Approach
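suppose we knew the dynamic model and the predecessor $\tilde{x}_i$ of each point $x_i$ …
$$p(X \mid \tilde{X}, A, \sigma) = \prod_{i=1}^{n} p(x_i \mid \tilde{x}_i, A, \sigma) = \prod_{i=1}^{n} \frac{\exp\left(-\|x_i - A\tilde{x}_i\|^2 / (2\sigma^2)\right)}{(2\pi\sigma^2)^{p/2}}$$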
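If the predecessor of every point really were known, the likelihood on this slide could be evaluated directly. The following is a minimal sketch, assuming the linear model with isotropic Gaussian noise of standard deviation `sigma`; the function name `log_likelihood` and the example matrix are illustrative and not from the talk.

```python
import math

def log_likelihood(points, predecessors, A, sigma):
    """Gaussian log-likelihood of points given their (assumed known) predecessors.

    points, predecessors: lists of vectors (lists of floats) of dimension p.
    A: p x p matrix as a list of rows. sigma: noise standard deviation.
    """
    p = len(A)
    total = 0.0
    for x, x_prev in zip(points, predecessors):
        # Prediction A @ x_prev, written out without external libraries.
        pred = [sum(A[r][c] * x_prev[c] for c in range(p)) for r in range(p)]
        sq_err = sum((xi - pi) ** 2 for xi, pi in zip(x, pred))
        total += -sq_err / (2 * sigma ** 2) - (p / 2) * math.log(2 * math.pi * sigma ** 2)
    return total

# Illustrative example: A is a 90-degree rotation and every point is
# exactly A applied to its predecessor, so all residuals are zero.
A = [[0.0, 1.0], [-1.0, 0.0]]
prev = [[1.0, 0.0], [0.0, 2.0]]
pts = [[0.0, -1.0], [2.0, 0.0]]
ll = log_likelihood(pts, prev, A, sigma=1.0)
print(ll)
```

With zero residuals only the normalising constants remain, which gives a quick sanity check on the implementation.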
10
Likelihood continued
11
Likelihood (continued)
• we don’t know the time either so also integrate out over time
• then use the empirical density as an estimate for the resulting marginal distribution
12
Unordered Method (UM): Estimation
13
Expectation Maximization
14
Sample Synthetic Result
[figure panels: input, output]
15
Partial-order Method (PM)
16
Partial Order Approximation (PM)
Perform estimation by alternating maximization
• Replace UM's E-step with a maximum spanning tree on the complete graph over data points
- weight on each edge is the probability of one point being generated from the other given A and σ
- enforces a global consistency on the solution
• M-step is unchanged: weighted regression
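The weighted regression in the M-step has a closed form. The derivation below is a sketch under assumed notation: $w_{ij}$ is the weight given to $x_j$ being the predecessor of $x_i$ (presumably 1 on the spanning-tree edges and 0 elsewhere in PM); this symbol does not appear on the slides.

```latex
\min_A \sum_{i,j} w_{ij}\,\|x_i - A x_j\|^2
\quad\Longrightarrow\quad
\sum_{i,j} w_{ij}\,(x_i - A x_j)\,x_j^T = 0
\quad\Longrightarrow\quad
A = \Big(\sum_{i,j} w_{ij}\, x_i x_j^T\Big)\Big(\sum_{i,j} w_{ij}\, x_j x_j^T\Big)^{-1}
```

The inverse exists when the weighted predecessors span the full state space.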
17
Learning Nonlinear Dynamic Models
[Huang, Song, Schneider, AISTATS, 2010]
18
Learning Nonlinear Dynamic Models
An important issue
• Linear model provides a severely restricted space of models
- we know a model is wrong because the regression yields large residuals and low likelihoods
• The nonlinear models are too powerful; they can fit anything!
• Solution: restrict the space of nonlinear models
1. form the full kernel matrix
2. use a low-rank approximation of the kernel matrix
19
Synthetic Nonlinear Data: Lorenz Attractor
Estimated gradients by kernel UM
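To make the setting concrete, unsequenced Lorenz data can be imitated by simulating a trajectory and discarding the order. This is only a sketch: the classic parameters (10, 28, 8/3), the forward-Euler step size, the starting point, and the function name are assumptions, since the slides do not say how the synthetic data were generated.

```python
import random

def lorenz_unsequenced(n, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0, seed=0):
    """Simulate n forward-Euler steps of the Lorenz system, then shuffle them.

    Returns (shuffled, ordered): the same states with and without time order.
    Parameters and step size are illustrative assumptions.
    """
    x, y, z = 1.0, 1.0, 1.0
    ordered = []
    for _ in range(n):
        dx, dy, dz = s * (y - x), x * (r - z) - y, x * y - b * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        ordered.append((x, y, z))
    shuffled = list(ordered)
    random.Random(seed).shuffle(shuffled)  # throw away the time order
    return shuffled, ordered

shuffled, ordered = lorenz_unsequenced(500)
print(shuffled[:3])
```

A learner in this setting would see only `shuffled`; `ordered` is kept here purely so that a recovered model could be checked against the true sequence.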
20
Ordering by Temporal Smoothing
21
Ordering by Temporal Smoothing
22
Ordering by Temporal Smoothing
23
Evaluation Criteria
24
Results: 3D-1
25
Results: 3D-2
26
3D-1: Algorithm Comparison
27
3D-2: Algorithm Comparison
28
Methods for Real Data
1. Run k-means to cluster the data
2. Find an ordering of the cluster centers
• TSP on pairwise L1 distances (TSP+L1), OR
• Temporal Smoothing Method (TSM)
3. Learn a dynamic model for the cluster centers
4. Initialize UM/PM with the learned model
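Step 2 with the TSP+L1 option can be sketched for a handful of cluster centers by brute force. The sketch assumes an open path rather than a closed tour, uses exhaustive search in place of a real TSP solver, and the function names and example centers are hypothetical.

```python
from itertools import permutations

def l1(a, b):
    """L1 (Manhattan) distance between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def order_centers(centers):
    """Return (order, length) of the shortest open path through all centers.

    Brute force over permutations, so only practical for a few centers.
    """
    best_order, best_len = None, float("inf")
    for perm in permutations(range(len(centers))):
        if perm[0] > perm[-1]:
            continue  # a path and its reverse have the same length; keep one
        length = sum(l1(centers[perm[i]], centers[perm[i + 1]])
                     for i in range(len(perm) - 1))
        if length < best_len:
            best_order, best_len = perm, length
    return list(best_order), best_len

# Hypothetical cluster centers, listed out of order.
centers = [[2.0, 2.0], [0.0, 0.0], [3.0, 3.5], [1.0, 0.5]]
order, length = order_centers(centers)
print(order, length)
```

An ordering found this way carries no time direction: the reversed path is equally short, so the direction has to come from elsewhere.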
29
Gene Expression in Yeast Metabolic Cycle
30
Gene Expression in Yeast Metabolic Cycle
31
Results on Individual Genes
32
Results over the whole space
33
Cosine score in high dimensions
Probability of random direction achieving a cosine score > 0.5 (plotted against dimension)
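The drop in this probability with dimension can be checked with a small Monte Carlo experiment. The sketch assumes the cosine score is the cosine between a random direction and a fixed reference direction; the function name, trial count, and seed are arbitrary choices.

```python
import math
import random

def prob_cosine_above(dim, threshold=0.5, trials=20000, seed=0):
    """Estimate P(cosine(random direction, e_1) > threshold) in `dim` dimensions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A normalised standard Gaussian vector is uniform on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        # Cosine with the reference direction e_1 is v[0] / ||v||.
        if v[0] / norm > threshold:
            hits += 1
    return hits / trials

p2 = prob_cosine_above(2)
p3 = prob_cosine_above(3)
p50 = prob_cosine_above(50)
print(p2, p3, p50)
```

In 2 dimensions the exact value is 1/3 (the direction must fall within 60 degrees of the reference), and in 3 dimensions it is 1/4; by 50 dimensions a random direction almost never scores above 0.5, so a high cosine score there is strong evidence of a good model.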
34
Suppose we have some sequenced data
linear dynamic model: $x_{t+1} = A x_t + \epsilon_t$, $\epsilon_t \sim N(0, \sigma^2 I)$
perform a standard regression:
$\min_A \|Y - AX\|_F^2$
$Y = [x_2, x_3, \ldots, x_n]$, $X = [x_1, x_2, \ldots, x_{n-1}]$
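The standard regression on sequenced data can be sketched in a few lines. This version is restricted to 2-d states so that the matrix inverse can be written by hand without external libraries; the helper names and the example matrix are illustrative, not from the talk.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def fit_linear_dynamics(seq):
    """Least-squares A minimising ||Y - A X||_F^2 for one sequence of 2-d states."""
    X = transpose(seq[:-1])  # columns x_1 .. x_{n-1}
    Y = transpose(seq[1:])   # columns x_2 .. x_n
    Xt = transpose(X)
    return matmul(matmul(Y, Xt), inv2(matmul(X, Xt)))  # Y X^T (X X^T)^{-1}

# Noise-free illustrative sequence generated by a known matrix.
A_true = [[0.9, 0.2], [-0.2, 0.9]]
seq = [[1.0, 0.0]]
for _ in range(5):
    x = seq[-1]
    seq.append([A_true[0][0] * x[0] + A_true[0][1] * x[1],
                A_true[1][0] * x[0] + A_true[1][1] * x[1]])
A_hat = fit_linear_dynamics(seq)
print(A_hat)
```

Without noise the estimate matches the generating matrix up to rounding. With noise and only a few points, $XX^T$ becomes poorly conditioned and the estimate is unreliable.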
what if the amount of data is not enough to regress reliably?
35
Regularization for Regression
add regularization to the regression:
ridge regression: $\min_A \|Y - AX\|_F^2 + \lambda \|A\|_F^2$
lasso: $\min_A \|Y - AX\|_F^2 + \lambda \|A\|_1$
can the unsequenced data be used in regularization?
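For reference, the ridge problem has a closed-form solution, while the lasso does not and needs an iterative solver. The short derivation below assumes a regularization weight written as $\lambda$, a symbol that did not survive in the slide text.

```latex
\frac{\partial}{\partial A}\Big(\|Y - AX\|_F^2 + \lambda\|A\|_F^2\Big)
  = -2\,(Y - AX)\,X^T + 2\lambda A = 0
\quad\Longrightarrow\quad
A\,(XX^T + \lambda I) = YX^T
\quad\Longrightarrow\quad
A_{\text{ridge}} = YX^T\,(XX^T + \lambda I)^{-1}
```

For $\lambda > 0$ the matrix $XX^T + \lambda I$ is always invertible, which is why ridge still yields an estimate when there are too few sequenced points for plain regression.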
36
Lyapunov Regularization
Lyapunov equation relates dynamic model to steady state distribution:
$A Q A^T + \sigma^2 I = Q$
Q – covariance of steady state distribution
$\min_A \|Y - AX\|_F^2 + \lambda \|A \hat{Q} A^T + \sigma^2 I - \hat{Q}\|_F^2$
1. estimate Q from the unsequenced data!
2. optimize via gradient descent using the unpenalized or the ridge regression solution as the initial point
37
Lyapunov Regularization: Toy Example
• 2-d linear system
• 2nd column of A fixed at the correct value
• given 4 sequence points
• given 20 unsequenced points
$A = \begin{pmatrix} -0.428 & 0.572 \\ -1.043 & -0.714 \end{pmatrix}$, $\sigma = 1$
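For this toy system, the steady-state covariance Q that satisfies the Lyapunov equation can be approximated by iterating the equation itself. The sketch assumes the matrix rows are (-0.428, 0.572) and (-1.043, -0.714) and that the trailing "= 1" is the noise level; both are readings of the slide text. The function names are illustrative, and fixed-point iteration is only one of several ways to solve the equation.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def lyapunov_step(A, Q, sigma):
    """One application of Q -> A Q A^T + sigma^2 I."""
    AQAt = matmul(matmul(A, Q), transpose(A))
    n = len(A)
    return [[AQAt[i][j] + (sigma ** 2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def steady_state_cov(A, sigma, iters=500):
    """Fixed-point iteration; converges when A is stable (spectral radius < 1)."""
    Q = [[0.0] * len(A) for _ in A]
    for _ in range(iters):
        Q = lyapunov_step(A, Q, sigma)
    return Q

A = [[-0.428, 0.572], [-1.043, -0.714]]  # toy matrix as read from the slide
Q = steady_state_cov(A, sigma=1.0)
Q_next = lyapunov_step(A, Q, 1.0)
# Residual of the Lyapunov equation A Q A^T + sigma^2 I - Q at the solution.
residual = max(abs(Q_next[i][j] - Q[i][j]) for i in range(2) for j in range(2))
print(Q, residual)
```

In the method described here the direction is reversed: Q is estimated from the unsequenced points and the residual of the same equation serves as a penalty on A.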
38
Lyapunov Regularization: Toy Example
39
Results on Synthetic Data
Random 200-dimensional sparse (1/8) stable system
40
Work in Progress …
• cell cycle data from: [Zhou, Li, Yan, Wong, IEEE Trans on Inf Tech in Biomedicine, 2009]
• 49 features on protein subcellular location
• 34 sequences having a full cycle and length at least 30 were identified
• another 11,556 are unsequenced
• use the 34 sequences as ground truth and train on the unsequenced data
A set of 100 sequenced images
A tracking algorithm identified 34 sequences
41
Preliminary Results: Protein Subcellular Location Dynamics
[plot axes: cosine score, normalized error]
42
Conclusions and Future Work
• Demonstrated ability to learn (non)linear dynamic models from unsequenced data
• Demonstrated method to use sequenced and unsequenced data together
• Continuing efforts on real scientific data
• Can we do this with hidden states?
43
EXTRA SLIDES
44
Real Data: Swinging Pendulum Video
45
Results: Swinging Pendulum Video
46
47