Processing Sequential Sensor Data
John Krumm, Microsoft Research
Redmond, Washington
[email protected]
Interpret a Sequential Signal

[Figure: a 1-D signal plotted against time (seconds)]

The signal is:
• Often a function of time (as above)
• Often from a sensor
Pervasive/Ubicomp Examples

Signal sources:
• Accelerometer
• Light sensor
• Gyro sensor
• Indoor location
• GPS
• Microphone
• …

Interpretations:
• Speed
• Mode of transportation
• Location
• Moving vs. not moving
• Proximity to other people
• Emotion
• …
Goals of this Tutorial

• Confidence to add sequential signal processing to your research
• Ability to assess research that uses simple sequential signal processing
• Know the terminology
• Know the basic techniques
  • How to implement them
  • Where they're appropriate
• Assess numerical results in an accepted way
• At least give the appearance that you know what you're talking about
Not Covering

• Regression – fit a function to data
• Classification – classify things based on measured features
• Statistical tests – determine whether data support a hypothesis

[Figures: example plots for regression, classification, and statistical tests]
Outline

• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Signal Dimensionality

1D: z(t)

2D: z(t) = (z1(t), z2(t))ᵀ

[Figures: a 1-D signal vs. time (seconds); a 2-D signal plotted as z2 (meters) vs. z1 (meters)]

bold means vector
Sampled Signal

Cannot measure or store a continuous signal, so take samples instead:

[ z(0), z(Δ), z(2Δ), … , z((n-1)Δ) ] = [ z1, z2, z3, … , zn ]

Δ = sampling interval, e.g. 1 second, 5 minutes, …

[Figure: a 1-D signal vs. time (seconds), sampled with Δ = 0.1 seconds]
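As a quick sketch, sampling looks like this in NumPy. The signal itself is a made-up stand-in for whatever the sensor measures; only Δ = 0.1 s comes from the slide.

```python
import numpy as np

delta = 0.1                               # sampling interval (seconds)
t = np.arange(0, 5, delta)                # sample times 0, delta, 2*delta, ...
z = 10 * t + 3 * np.sin(2 * np.pi * t)    # hypothetical continuous signal, sampled

print(len(z))  # n = 50 samples over 5 seconds
```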
Signal + Noise

zi = xi + vi

• zi: measurement from noisy sensor
• xi: actual value, but unknown
• vi: random number representing sensor noise

[Figure: a noisy 1-D signal vs. time (seconds)]

Noise:
• Often assumed to be Gaussian
• Often assumed to be zero mean
• Often assumed to be i.i.d. (independent, identically distributed)
• vi ~ N(0, σ) for zero-mean, Gaussian, i.i.d. noise; σ is the standard deviation
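A minimal sketch of this noise model, assuming σ = 3 as in the upcoming running example (the underlying signal is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 3.0                               # assumed noise standard deviation
x = np.linspace(0, 100, 1000)             # actual (normally unknown) values x_i
v = rng.normal(0.0, sigma, size=x.shape)  # v_i ~ N(0, sigma): zero-mean, Gaussian, i.i.d.
z = x + v                                 # measurements z_i = x_i + v_i

# Because the noise is zero mean and i.i.d., its sample mean is near 0
# and its sample standard deviation is near sigma for large n.
print(v.mean(), v.std())
```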
Running Example

Track a moving person in (x, y):
• 1000 (x, y) measurements
• Δ = 1 second

zi = xi + vi

• zi: measurement vector
• xi = (xi, yi)ᵀ: actual location
• vi = (vi(x), vi(y))ᵀ ~ (N(0, 3), N(0, 3))ᵀ: noise, zero mean, standard deviation = 3 meters

Also 10 randomly inserted outliers with N(0, 15) noise.

[Figure: actual path and measured locations, x (meters) vs. y (meters), with the start point and an outlier marked]
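The running example can be simulated along these lines. The true path here is a made-up circular course, since the slide's actual path isn't given numerically; the noise parameters match the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000                      # 1000 (x, y) measurements, delta = 1 second
sigma = 3.0                   # measurement noise std (meters)

# Hypothetical actual path x_i (a circle standing in for the slide's path)
t = np.linspace(0, 2 * np.pi, n)
actual = np.column_stack([50 + 40 * np.cos(t), 50 + 40 * np.sin(t)])

# z_i = x_i + v_i, with v_i ~ N(0, 3) independently in x and y
measured = actual + rng.normal(0.0, sigma, size=(n, 2))

# 10 randomly chosen samples become outliers with N(0, 15) noise instead
outliers = rng.choice(n, size=10, replace=False)
measured[outliers] = actual[outliers] + rng.normal(0.0, 15.0, size=(10, 2))
```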
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Mean Filter

• Also called "moving average" and "box car filter"
• Apply to the x and y measurements separately

[Figure: zx vs. t; the filtered version of a point is the mean of the points in a sliding window (solid box)]

• "Causal" filter because it doesn't look into the future
• Causes lag when values change sharply
• Mitigate the lag with decaying weights, e.g. an exponentially weighted moving average
• Sensitive to outliers, i.e. one really bad point can cause the mean to take on any value
• Simple and effective (I will not vote to reject your paper if you use this technique)
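A minimal causal mean filter, as a sketch (the 10-sample window matches the next slide's example):

```python
import numpy as np

def mean_filter(z, width=10):
    """Causal moving average: each output is the mean of the most recent
    `width` samples (fewer near the start, since we can't see the future)."""
    out = np.empty(len(z))
    for i in range(len(z)):
        out[i] = z[max(0, i - width + 1): i + 1].mean()
    return out

# Apply to x and y measurements separately, e.g.:
# smooth_x = mean_filter(zx); smooth_y = mean_filter(zy)
```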
Mean Filter

[Figure: actual path and measured locations vs. the mean-filtered result, x (meters) vs. y (meters); 10 points in each mean; outlier marked]

• The outlier has a noticeable impact
• If only there were some convenient way to fix this …
Median Filter

[Figure: zx vs. t; the filtered version of a point is the median of the points in the solid box, insensitive to the value of any single point]

median (1, 3, 4, 7, 1 × 10¹⁰) = 4
mean (1, 3, 4, 7, 1 × 10¹⁰) ≈ 2 × 10⁹

The median is far less sensitive to outliers than the mean.
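The same causal window, with the median in place of the mean, as a sketch:

```python
import numpy as np

def median_filter(z, width=10):
    """Causal median filter: each output is the median of the most recent
    `width` samples. One wild outlier in the window barely moves it."""
    out = np.empty(len(z))
    for i in range(len(z)):
        out[i] = np.median(z[max(0, i - width + 1): i + 1])
    return out

z = np.array([1.0, 3.0, 4.0, 7.0, 1e10])
print(np.median(z))   # 4.0 -- the outlier is ignored
print(z.mean())       # ~2e9 -- the outlier dominates
```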
Median Filter

[Figure: actual path and measured locations vs. the median-filtered result, x (meters) vs. y (meters); 10 points in each median; outlier marked]

• The outlier has noticeably less impact
Mean and Median Filter

[Figure: mean-filtered vs. median-filtered paths, x (meters) vs. y (meters)]

Editorial: mean vs. median
The median is almost always better to use than the mean.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Kalman Filter

My favorite book on Kalman filtering

• Mean and median filters assume smoothness
• The Kalman filter adds an assumption about the trajectory

[Figure: assumed trajectory is parabolic; a balance scale weighing "data" against "dynamics"]

Weight the data against assumptions about the system's dynamics.

Big difference #1: the Kalman filter includes (helpful) assumptions about the behavior of the measured process
Kalman Filter

Big difference #2: the Kalman filter can include state variables that are not measured directly

The Kalman filter separates measured variables from state variables.

Measure: zi = (zi(x), zi(y))ᵀ
Running example: measure (x, y) coordinates (noisy)

Infer state: xi = (xi, yi, vi(x), vi(y))ᵀ
Running example: estimate location and velocity (!)

[Figure: measured path, x (meters) vs. y (meters)]
Kalman Filter Measurements

zi = Hi xi + vi

The measurement vector is related to the state vector by a matrix multiplication plus noise:

(zi(x), zi(y))ᵀ = [1 0 0 0; 0 1 0 0] (xi, yi, vi(x), vi(y))ᵀ + N(0, Ri)

Sleepy eyes threat level: orange

Running example:
zi(x) = xi + N(0, σr)
zi(y) = yi + N(0, σr)

• In this case, measurements are just noisy copies of the actual location
• Makes the sensor noise explicit, e.g. GPS has a σ of around 5 meters
Kalman Filter Dynamics

Insert a bias for how we think the system will change through time:

xi = Φi−1 xi−1 + wi−1

(xi, yi, vi(x), vi(y))ᵀ = [1 0 Δti 0; 0 1 0 Δti; 0 0 1 0; 0 0 0 1] (xi−1, yi−1, vi−1(x), vi−1(y))ᵀ + (0, 0, N(0, σs), N(0, σs))ᵀ

xi = xi−1 + Δti vi(x) – location is standard straight-line motion
vi(x) = vi−1(x) + N(0, σs) – velocity changes randomly (because we don't have any idea what it actually does)
Kalman Filter Ingredients

H matrix: gives measurements for a given state
H = [1 0 0 0; 0 1 0 0]

Measurement noise N(0, Ri): sensor noise

Φ matrix: gives the time dynamics of the state
Φ = [1 0 Δti 0; 0 1 0 Δti; 0 0 1 0; 0 0 0 1]

Process noise N(0, Qi): uncertainty in the dynamics model
Kalman Filter Recipe

Predict:
x̂i(−) = Φi−1 x̂i−1(+)
Pi(−) = Φi−1 Pi−1(+) Φi−1ᵀ + Qi−1

Update:
Ki = Pi(−) Hiᵀ (Hi Pi(−) Hiᵀ + Ri)⁻¹
x̂i(+) = x̂i(−) + Ki (zi − Hi x̂i(−))
Pi(+) = (I − Ki Hi) Pi(−)

• Just plug in measurements and go
• Recursive filter – the current time step uses state and error estimates from the previous time step

Sleepy eyes threat level: red

Big difference #3: the Kalman filter gives an uncertainty estimate in the form of a Gaussian covariance matrix
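The recipe above, as a sketch for the running example. The matrices H, Φ, Q, and R come from the earlier slides; σr = 3 matches the running example's sensor noise, while σs and the initial covariance are assumed tuning values.

```python
import numpy as np

def kalman_track(zs, dt=1.0, sigma_r=3.0, sigma_s=1.0):
    """Kalman filter over (x, y) measurements zs (an n-by-2 array).
    State is (x, y, vx, vy); returns the filtered states."""
    Phi = np.array([[1, 0, dt, 0],          # straight-line motion dynamics
                    [0, 1, 0, dt],
                    [0, 0, 1,  0],
                    [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],             # measure position only
                  [0, 1, 0, 0]], dtype=float)
    Q = np.diag([0, 0, sigma_s**2, sigma_s**2])  # process noise on velocity
    R = (sigma_r**2) * np.eye(2)                 # sensor noise

    x = np.array([zs[0][0], zs[0][1], 0.0, 0.0])  # init state at first z
    P = np.eye(4) * 100.0                         # assumed large initial uncertainty
    out = []
    for z in zs:
        # Predict: x(-) = Phi x(+),  P(-) = Phi P(+) Phi^T + Q
        x = Phi @ x
        P = Phi @ P @ Phi.T + Q
        # Update: K = P(-) H^T (H P(-) H^T + R)^-1, then fold in measurement
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)
```

On a constant-velocity track the velocity states converge and the position estimate settles onto the path.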
Kalman Filter

[Figure: Kalman-filtered path, x (meters) vs. y (meters)]

Velocity model: vi(x) = vi−1(x) + N(0, σs)

• Smooth
• Tends to overshoot corners
• Too much dependence on the straight-line velocity assumption
• Too little dependence on the data
Kalman Filter

[Figure: untuned vs. tuned Kalman-filtered paths, x (meters) vs. y (meters)]

Velocity model: vi(x) = vi−1(x) + N(0, σs)

• Hard to pick the process noise σs
• Process noise models our uncertainty in the system dynamics
• Here it accounts for the fact that the motion is not a straight line

"Tuning" σs (by trying a bunch of values) gives a better result
Kalman Filter

Editorial: Kalman filter
The Kalman filter was fine back in the old days. But I really prefer more modern methods that are not saddled with the Kalman filter's restriction to continuous state variables and its linearity assumptions.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Particle Filter

Dieter Fox et al.
WiFi tracking in a multi-floor building

• Multiple "particles" serve as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data
Particle Filter

• Allows multi-modal uncertainty (Kalman is unimodal Gaussian)
• Allows continuous and discrete state variables (e.g. 3rd floor)
• Allows a rich dynamic model (e.g. must follow the floor plan)
• Can be slow, especially if the state vector dimension is too large (e.g. (x, y, identity, activity, next activity, emotional state, …))
Particle Filter Ingredients

p(zi | xi)
• z = measurement, x = state; not necessarily the same
• Probability distribution of a measurement given the actual value
• Can be anything, not just Gaussian like Kalman
• But we use a Gaussian for the running example, just like Kalman

[Figure: p(zi | xi) plotted against zi, centered on xi]

For the running example, the measurement is a noisy version of the actual value.

E.g. measured speed (in z) will be slower if emotional state (in x) is "tired"
Particle Filter Ingredients

p(xi | xi−1)
• Probabilistic dynamics: how the state changes through time
• Can be anything, e.g.
  • Tend to go slower up hills
  • Avoid left turns
  • Attracted to Scandinavian people
• A closed form is not necessary
• Just need a dynamic simulation with a noise component
• But we use a Gaussian for the running example, just like Kalman

[Figure: xi generated from xi−1 plus a random vector]
Home Example

Measurements: z = ((x, y) location in house from WiFi)ᵀ
State (what we want to estimate): x = (room, activity)

Measurement model p(zi | xi), e.g.:
• p((x, y) in kitchen | in bathroom) = 0

Dynamics model p(xi | xi−1), e.g.:
• p(sleeping now | sleeping previously) = 0.9
• p(cooking now | working previously) = 0.02
• p(watching TV & sleeping | *) = 0
• p(bedroom 4 | master bedroom) = 0

Rich measurement and state dynamics models
Particle Filter Algorithm

Start with N instances of the state vector xi(j), i = 0, j = 1 … N
1. i = i + 1
2. Take a new measurement zi
3. Propagate particles forward in time with p(xi | xi−1), i.e. generate new, random hypotheses
4. Compute importance weights wi(j) = p(zi | xi(j)), i.e. how well does the measurement support the hypothesis?
5. Normalize the importance weights so they sum to 1.0
6. Randomly pick new particles based on the importance weights
7. Go to 1

Compute the state estimate:
• Weighted mean (assumes unimodal)
• Median

Sleepy eyes threat level: orange
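The steps above can be sketched as a bootstrap particle filter for the running example. The dynamics and measurement models match the Kalman setup (straight-line motion, Gaussian noise); σr and σs are assumed tuning values.

```python
import numpy as np

def particle_filter(zs, n_particles=1000, sigma_r=3.0, sigma_s=1.0, seed=0):
    """Particle filter over (x, y) measurements zs (an n-by-2 array).
    Particle columns are x, y, vx, vy; returns (x, y) estimates."""
    rng = np.random.default_rng(seed)
    p = np.zeros((n_particles, 4))
    p[:, :2] = zs[0] + rng.normal(0, sigma_r, size=(n_particles, 2))  # init near first z
    estimates = []
    for z in zs:
        # 3. Propagate with p(x_i | x_{i-1}): straight-line motion,
        #    velocity perturbed by N(0, sigma_s); dt = 1 second
        p[:, 2:] += rng.normal(0, sigma_s, size=(n_particles, 2))
        p[:, :2] += p[:, 2:]
        # 4. Importance weights w = p(z_i | x_i): Gaussian likelihood
        d2 = ((p[:, :2] - z) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / sigma_r**2)
        w /= w.sum()                                  # 5. normalize to 1.0
        # 6. Resample particles in proportion to their weights
        p = p[rng.choice(n_particles, size=n_particles, p=w)]
        # State estimate: mean of the (resampled) particles
        estimates.append(p[:, :2].mean(axis=0))
    return np.array(estimates)
```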
Particle Filter Running Example

[Figure: actual path vs. particle filter estimates with 1,000 and 1,000,000 particles, x (meters) vs. y (meters)]

Sometimes increasing the number of particles helps.

p(zi | xi): the measurement model reflects the true, simulated measurement noise. Same as Kalman in this case.

p(xi | xi−1):
xi = xi−1 + Δti vi(x) – location is standard straight-line motion
vi(x) = vi−1(x) + N(0, σs) – velocity changes randomly (because we don't have any idea what it actually does)

Straight-line motion with a random velocity change. Same as Kalman in this case.
Particle Filter Resources
UbiComp 2004
Especially Chapter 1
Particle Filter
Editorial: Particle filter
The particle filter is wonderfully rich and expressive if you can afford the computations. Be careful not to let your state vector get too large.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Hidden Markov Model (HMM)

Big difference from the previous methods: states are discrete, e.g.
• Spoken phoneme
• {walking, driving, biking, riding bus}
• {moving, still}
• {cooking, sleeping, watching TV, playing game, …}

[Photos: Andrey Markov (1856–1922), and "Hidden Markov"]
(Unhidden) Markov Model

[State diagram: three states (bus, walk, drive) with transition probabilities labeling the arrows, ranging from 0.0 to 0.9]

• Move to a new state (or not)
  • at every time click
  • when finished with the current state
• Transition probabilities control the state transitions

Example inspired by:
UbiComp 2003
Hidden Markov Model

[State diagram: the same bus/walk/drive states and transition probabilities as before]

Can "see" the states only via a noisy sensor (accelerometer)
HMM: Two Parts

[Trellis diagram: initial state probabilities P(X0(j)), then alternating transition probabilities ajk and observation probabilities P(Xi(j) | zi) at each time step]

Two parts to every HMM:
1) Observation probabilities P(Xi(j) | zi) – probability of state j given the measurement at time i
2) Transition probabilities ajk – probability of a transition from state j to state k

• Find the path that maximizes the product of the probabilities (observation & transition)
• Use the Viterbi algorithm to find the path efficiently
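A compact Viterbi sketch, working in log space so the long product of probabilities doesn't underflow (the input matrices are hypothetical; zero-probability entries are clipped rather than passed to log):

```python
import numpy as np

def viterbi(obs_prob, trans_prob, init_prob):
    """Find the state path maximizing the product of observation and
    transition probabilities.
    obs_prob[i, j]: observation probability for state j at time i
    trans_prob[j, k]: a_jk, probability of moving from state j to k
    init_prob[j]: initial probability of state j"""
    lo = np.log(np.maximum(obs_prob, 1e-300))    # clip zeros to avoid -inf/NaN
    lt = np.log(np.maximum(trans_prob, 1e-300))
    score = np.log(np.maximum(init_prob, 1e-300)) + lo[0]
    n_steps, n_states = obs_prob.shape
    back = np.zeros((n_steps, n_states), dtype=int)
    for i in range(1, n_steps):
        cand = score[:, None] + lt               # score of arriving from each state
        back[i] = cand.argmax(axis=0)            # remember the best predecessor
        score = cand.max(axis=0) + lo[i]
    path = [int(score.argmax())]                 # best final state, then backtrack
    for i in range(n_steps - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```

With sticky transition probabilities, a few contrary observations are not enough to flip the inferred state, which is exactly the smoothing effect the next slides show.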
Smooth Results with HMM

Classify moving vs. still:

[Figure: signal strength vs. time (sec.), with periods labeled still and moving; two-state diagram with self-transition probabilities 0.99989 and cross-transition probabilities 0.00011]

Signal strength has a higher noise variance when moving → observation probabilities

Transitions between states are relatively rare (made-up numbers) → transition probabilities
Smooth Results with HMM

[Figure: trellis of still/moving states over the signal, with transition probabilities 0.99989/0.00011 and example observation probabilities (0.4/0.6, 0.2/0.8, 0.9/0.1, 0.7/0.3)]

The Viterbi algorithm finds the path with the maximum product of observation and transition probabilities.

[Figure: still vs. moving estimate over time (seconds): actual, inferred, and inferred-and-smoothed-with-HMM traces]

Results in fewer false transitions between states, i.e. smoother and slightly more accurate
Running Example

Hidden Markov Model: discrete states are 10,000 1 m × 1 m squares

[Figure: HMM-estimated path, x (meters) vs. y (meters)]

Observation probabilities spread in a Gaussian over nearby squares as per the measurement noise model.

Transition probabilities go to the 8-connected neighbors:

0.011762  0.136136  0.011762
0.139640  0.401401  0.139640
0.011762  0.136136  0.011762
HMM Reference

• Good description of the Viterbi algorithm
• Also how to learn the model from data
Hidden Markov Model

Editorial: Hidden Markov Model
The HMM is great for certain applications when your states are discrete.

Tracking in (x, y, z) with an HMM?
• Huge state space (→ slow)
• Long dwells
• Interactions with other airplanes
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Presenting Continuous Performance Results

ei = ‖x̂i − xi‖ – the Euclidean distance between the estimated value x̂i and the actual value xi

Plot the mean or median of the Euclidean distance error:
• The median is less sensitive to error outliers

[Bar charts: tracking error vs. filter (Measured, Mean, Median, Kalman (untuned), Kalman (tuned), Particle, HMM), showing mean and median error in meters]

Note: Don't judge these filtering methods based on these plots. I didn't spend much time tuning the methods to improve their performance.
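Computing the per-sample errors and their summaries is a one-liner with NumPy, sketched here on made-up estimates:

```python
import numpy as np

def tracking_errors(estimated, actual):
    """e_i = ||x_hat_i - x_i|| (Euclidean distance per sample), plus mean
    and median summaries; the median is less sensitive to error outliers."""
    e = np.linalg.norm(estimated - actual, axis=1)
    return e, e.mean(), np.median(e)

est = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
act = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 100.0]])
e, mean_e, median_e = tracking_errors(est, act)
print(e)          # [0. 5. 100.]
print(median_e)   # 5.0 -- the huge error barely affects the median
```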
Presenting Continuous Performance Results

[Figure: cumulative error distribution, fraction of samples vs. error (meters), for the Median, HMM, Kalman (tuned), Particle, Mean, and Kalman (untuned) filters]

Cumulative error distribution:
• Shows how the errors are distributed
• More detailed than just a mean or median error

50% of the time, the particle filter gives an error of 2 meters or less (median error)
95% of the time, the particle filter gives an error of 6 meters or less (95th percentile error)
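The cumulative error distribution and percentile errors can be computed as follows (the error values here are made up for illustration):

```python
import numpy as np

def error_cdf(errors):
    """Empirical cumulative error distribution: for each sorted error value,
    the fraction of samples with an error at or below it."""
    e = np.sort(errors)
    frac = np.arange(1, len(e) + 1) / len(e)
    return e, frac   # plot frac vs. e to get the CDF curve

errors = np.array([1.0, 2.0, 2.0, 3.0, 10.0])
print(np.median(errors))           # 2.0 -- median (50th percentile) error
print(np.percentile(errors, 95))   # 95th percentile error
```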
Presenting Discrete Performance Results

Techniques like the particle filter and HMM can classify sequential data into discrete classes.

Confusion matrix (rows: actual activities; columns: inferred activities):

                 Sitting  Standing  Walking  Up stairs  Down stairs  Elevator down  Elevator up  Brushing teeth
Sitting            75%      24%       1%       0%          0%            0%            0%            0%
Standing           29%      55%       6%       1%          0%            4%            3%            2%
Walking             4%       7%      79%       3%          4%            1%            1%            1%
Up stairs           0%       1%       4%      95%          0%            0%            1%            0%
Down stairs         0%       1%       7%       0%         89%            2%            0%            0%
Elevator down       0%       2%       1%       0%          8%           87%            1%            0%
Elevator up         0%       2%       2%       6%          0%            3%           87%            0%
Brushing teeth      2%      10%       3%       0%          0%            0%            0%           85%

Pervasive 2006
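A confusion matrix like the one above can be built along these lines, row-normalized so each actual class sums to 100% (the toy labels are hypothetical, and the sketch assumes every class appears at least once):

```python
import numpy as np

def confusion_matrix(actual, inferred, n_classes):
    """Entry [a, p]: fraction of samples of actual class a that were
    inferred as class p. Rows sum to 1 (assumes every class appears)."""
    m = np.zeros((n_classes, n_classes))
    for a, p in zip(actual, inferred):
        m[a, p] += 1
    return m / m.sum(axis=1, keepdims=True)

# Toy example with classes 0 = sitting, 1 = standing (hypothetical labels)
actual   = [0, 0, 0, 0, 1, 1]
inferred = [0, 0, 0, 1, 1, 1]
print(confusion_matrix(actual, inferred, 2))
# row 0: 75% of sitting samples inferred correctly, 25% confused with standing
```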
End

[Recap figures: tracking error vs. filter bar chart; actual path and measured locations, x (meters) vs. y (meters)]

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Ubiquitous Computing Fundamentals, CRC Press, © 2010