Processing Sequential Sensor Data
John Krumm, Microsoft Research
Redmond, Washington
[email protected]
Interpret a Sequential Signal

[Figure: a 1-D signal plotted against time (seconds)]

The signal is:
• Often a function of time (as above)
• Often from a sensor
Pervasive/Ubicomp Examples

Signal sources:
• Accelerometer
• Light sensor
• Gyro sensor
• Indoor location
• GPS
• Microphone
• …

Interpretations:
• Speed
• Mode of transportation
• Location
• Moving vs. not moving
• Proximity to other people
• Emotion
• …
Goals of this Tutorial

• Confidence to add sequential signal processing to your research
• Ability to assess research that uses simple sequential signal processing
• Know the terminology
• Know the basic techniques
  • How to implement them
  • Where they're appropriate
• Assess numerical results in an accepted way
• At least give the appearance that you know what you're talking about
Not Covering

• Regression – fit a function to data
• Classification – classify things based on measured features
• Statistical tests – determine whether data support a hypothesis

[Figures: example plots for regression, classification, and statistical tests]
Outline

• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Signal Dimensionality

1D: z(t)

2D: z(t) = (z1(t), z2(t))ᵀ

[Figures: a 1-D signal vs. time (seconds); a 2-D signal plotted as z2 (meters) vs. z1 (meters)]

bold means vector
Sampled Signal

Cannot measure or store a continuous signal, so take samples instead:

[ z(0), z(Δ), z(2Δ), … , z((n-1)Δ) ] = [ z1, z2, z3, … , zn ]

Δ = sampling interval, e.g. 1 second, 5 minutes, …

[Figure: a 1-D signal vs. time (seconds), sampled with Δ = 0.1 seconds]
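As a quick sketch, sampling looks like this in NumPy. The signal itself is a made-up stand-in for whatever the sensor measures; only Δ = 0.1 s comes from the slide.

```python
import numpy as np

delta = 0.1                               # sampling interval (seconds)
t = np.arange(0, 5, delta)                # sample times 0, delta, 2*delta, ...
z = 10 * t + 3 * np.sin(2 * np.pi * t)    # hypothetical continuous signal, sampled

print(len(z))  # n = 50 samples over 5 seconds
```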
Signal + Noise

zi = xi + vi

• zi: measurement from noisy sensor
• xi: actual value, but unknown
• vi: random number representing sensor noise

[Figure: a noisy 1-D signal vs. time (seconds)]

Noise:
• Often assumed to be Gaussian
• Often assumed to be zero mean
• Often assumed to be i.i.d. (independent, identically distributed)
• vi ~ N(0, σ) for zero-mean, Gaussian, i.i.d. noise; σ is the standard deviation
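A minimal sketch of this noise model, assuming σ = 3 as in the upcoming running example (the underlying signal is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 3.0                               # assumed noise standard deviation
x = np.linspace(0, 100, 1000)             # actual (normally unknown) values x_i
v = rng.normal(0.0, sigma, size=x.shape)  # v_i ~ N(0, sigma): zero-mean, Gaussian, i.i.d.
z = x + v                                 # measurements z_i = x_i + v_i

# Because the noise is zero mean and i.i.d., its sample mean is near 0
# and its sample standard deviation is near sigma for large n.
print(v.mean(), v.std())
```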
Running Example

Track a moving person in (x, y):
• 1000 (x, y) measurements
• Δ = 1 second

zi = xi + vi

• zi: measurement vector
• xi = (xi, yi)ᵀ: actual location
• vi = (vi(x), vi(y))ᵀ ~ (N(0, 3), N(0, 3))ᵀ: noise, zero mean, standard deviation = 3 meters

Also 10 randomly inserted outliers with N(0, 15) noise.

[Figure: actual path and measured locations, x (meters) vs. y (meters), with the start point and an outlier marked]
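The running example can be simulated along these lines. The true path here is a made-up circular course, since the slide's actual path isn't given numerically; the noise parameters match the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000                      # 1000 (x, y) measurements, delta = 1 second
sigma = 3.0                   # measurement noise std (meters)

# Hypothetical actual path x_i (a circle standing in for the slide's path)
t = np.linspace(0, 2 * np.pi, n)
actual = np.column_stack([50 + 40 * np.cos(t), 50 + 40 * np.sin(t)])

# z_i = x_i + v_i, with v_i ~ N(0, 3) independently in x and y
measured = actual + rng.normal(0.0, sigma, size=(n, 2))

# 10 randomly chosen samples become outliers with N(0, 15) noise instead
outliers = rng.choice(n, size=10, replace=False)
measured[outliers] = actual[outliers] + rng.normal(0.0, 15.0, size=(10, 2))
```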
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Mean Filter

• Also called "moving average" and "box car filter"
• Apply to the x and y measurements separately

[Figure: zx vs. t; the filtered version of a point is the mean of the points in a sliding window (solid box)]

• "Causal" filter because it doesn't look into the future
• Causes lag when values change sharply
• Mitigate the lag with decaying weights, e.g. an exponentially weighted moving average
• Sensitive to outliers, i.e. one really bad point can cause the mean to take on any value
• Simple and effective (I will not vote to reject your paper if you use this technique)
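A minimal causal mean filter, as a sketch (the 10-sample window matches the next slide's example):

```python
import numpy as np

def mean_filter(z, width=10):
    """Causal moving average: each output is the mean of the most recent
    `width` samples (fewer near the start, since we can't see the future)."""
    out = np.empty(len(z))
    for i in range(len(z)):
        out[i] = z[max(0, i - width + 1): i + 1].mean()
    return out

# Apply to x and y measurements separately, e.g.:
# smooth_x = mean_filter(zx); smooth_y = mean_filter(zy)
```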
Mean Filter

[Figure: actual path and measured locations vs. the mean-filtered result, x (meters) vs. y (meters); 10 points in each mean; outlier marked]

• The outlier has a noticeable impact
• If only there were some convenient way to fix this …
Median Filter

[Figure: zx vs. t; the filtered version of a point is the median of the points in the solid box, insensitive to the value of any single point]

median (1, 3, 4, 7, 1 × 10¹⁰) = 4
mean (1, 3, 4, 7, 1 × 10¹⁰) ≈ 2 × 10⁹

The median is far less sensitive to outliers than the mean.
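The same causal window, with the median in place of the mean, as a sketch:

```python
import numpy as np

def median_filter(z, width=10):
    """Causal median filter: each output is the median of the most recent
    `width` samples. One wild outlier in the window barely moves it."""
    out = np.empty(len(z))
    for i in range(len(z)):
        out[i] = np.median(z[max(0, i - width + 1): i + 1])
    return out

z = np.array([1.0, 3.0, 4.0, 7.0, 1e10])
print(np.median(z))   # 4.0 -- the outlier is ignored
print(z.mean())       # ~2e9 -- the outlier dominates
```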
Median Filter

[Figure: actual path and measured locations vs. the median-filtered result, x (meters) vs. y (meters); 10 points in each median; outlier marked]

• The outlier has noticeably less impact
Mean and Median Filter

[Figure: mean-filtered vs. median-filtered paths, x (meters) vs. y (meters)]

Editorial: mean vs. median
The median is almost always better to use than the mean.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Kalman Filter

My favorite book on Kalman filtering

• Mean and median filters assume smoothness
• The Kalman filter adds an assumption about the trajectory

[Figure: assumed trajectory is parabolic; a balance scale weighing "data" against "dynamics"]

Weight the data against assumptions about the system's dynamics.

Big difference #1: the Kalman filter includes (helpful) assumptions about the behavior of the measured process
Kalman Filter

Big difference #2: the Kalman filter can include state variables that are not measured directly

The Kalman filter separates measured variables from state variables.

Measure: zi = (zi(x), zi(y))ᵀ
Running example: measure (x, y) coordinates (noisy)

Infer state: xi = (xi, yi, vi(x), vi(y))ᵀ
Running example: estimate location and velocity (!)

[Figure: measured path, x (meters) vs. y (meters)]
Kalman Filter Measurements

zi = Hi xi + vi

The measurement vector is related to the state vector by a matrix multiplication plus noise:

(zi(x), zi(y))ᵀ = [1 0 0 0; 0 1 0 0] (xi, yi, vi(x), vi(y))ᵀ + N(0, Ri)

Sleepy eyes threat level: orange

Running example:
zi(x) = xi + N(0, σr)
zi(y) = yi + N(0, σr)

• In this case, measurements are just noisy copies of the actual location
• Makes the sensor noise explicit, e.g. GPS has a σ of around 5 meters
Kalman Filter Dynamics

Insert a bias for how we think the system will change through time:

xi = Φi−1 xi−1 + wi−1

(xi, yi, vi(x), vi(y))ᵀ = [1 0 Δti 0; 0 1 0 Δti; 0 0 1 0; 0 0 0 1] (xi−1, yi−1, vi−1(x), vi−1(y))ᵀ + (0, 0, N(0, σs), N(0, σs))ᵀ

xi = xi−1 + Δti vi(x) – location is standard straight-line motion
vi(x) = vi−1(x) + N(0, σs) – velocity changes randomly (because we don't have any idea what it actually does)
Kalman Filter Ingredients

H matrix: gives measurements for a given state
H = [1 0 0 0; 0 1 0 0]

Measurement noise N(0, Ri): sensor noise

Φ matrix: gives the time dynamics of the state
Φ = [1 0 Δti 0; 0 1 0 Δti; 0 0 1 0; 0 0 0 1]

Process noise N(0, Qi): uncertainty in the dynamics model
Kalman Filter Recipe

Predict:
x̂i(−) = Φi−1 x̂i−1(+)
Pi(−) = Φi−1 Pi−1(+) Φi−1ᵀ + Qi−1

Update:
Ki = Pi(−) Hiᵀ (Hi Pi(−) Hiᵀ + Ri)⁻¹
x̂i(+) = x̂i(−) + Ki (zi − Hi x̂i(−))
Pi(+) = (I − Ki Hi) Pi(−)

• Just plug in measurements and go
• Recursive filter – the current time step uses state and error estimates from the previous time step

Sleepy eyes threat level: red

Big difference #3: the Kalman filter gives an uncertainty estimate in the form of a Gaussian covariance matrix
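The recipe above, as a sketch for the running example. The matrices H, Φ, Q, and R come from the earlier slides; σr = 3 matches the running example's sensor noise, while σs and the initial covariance are assumed tuning values.

```python
import numpy as np

def kalman_track(zs, dt=1.0, sigma_r=3.0, sigma_s=1.0):
    """Kalman filter over (x, y) measurements zs (an n-by-2 array).
    State is (x, y, vx, vy); returns the filtered states."""
    Phi = np.array([[1, 0, dt, 0],          # straight-line motion dynamics
                    [0, 1, 0, dt],
                    [0, 0, 1,  0],
                    [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],             # measure position only
                  [0, 1, 0, 0]], dtype=float)
    Q = np.diag([0, 0, sigma_s**2, sigma_s**2])  # process noise on velocity
    R = (sigma_r**2) * np.eye(2)                 # sensor noise

    x = np.array([zs[0][0], zs[0][1], 0.0, 0.0])  # init state at first z
    P = np.eye(4) * 100.0                         # assumed large initial uncertainty
    out = []
    for z in zs:
        # Predict: x(-) = Phi x(+),  P(-) = Phi P(+) Phi^T + Q
        x = Phi @ x
        P = Phi @ P @ Phi.T + Q
        # Update: K = P(-) H^T (H P(-) H^T + R)^-1, then fold in measurement
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)
```

On a constant-velocity track the velocity states converge and the position estimate settles onto the path.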
Kalman Filter

[Figure: Kalman-filtered path, x (meters) vs. y (meters)]

Velocity model: vi(x) = vi−1(x) + N(0, σs)

• Smooth
• Tends to overshoot corners
• Too much dependence on the straight-line velocity assumption
• Too little dependence on the data
Kalman Filter

[Figure: untuned vs. tuned Kalman-filtered paths, x (meters) vs. y (meters)]

Velocity model: vi(x) = vi−1(x) + N(0, σs)

• Hard to pick the process noise σs
• Process noise models our uncertainty in the system dynamics
• Here it accounts for the fact that the motion is not a straight line

"Tuning" σs (by trying a bunch of values) gives a better result
Kalman Filter

Editorial: Kalman filter
The Kalman filter was fine back in the old days. But I really prefer more modern methods that are not saddled with the Kalman filter's restriction to continuous state variables and its linearity assumptions.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Particle Filter

Dieter Fox et al.
WiFi tracking in a multi-floor building

• Multiple "particles" serve as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data
Particle Filter

• Allows multi-modal uncertainty (Kalman is unimodal Gaussian)
• Allows continuous and discrete state variables (e.g. 3rd floor)
• Allows a rich dynamic model (e.g. must follow the floor plan)
• Can be slow, especially if the state vector dimension is too large (e.g. (x, y, identity, activity, next activity, emotional state, …))
Particle Filter Ingredients

p(zi | xi)
• z = measurement, x = state; not necessarily the same
• Probability distribution of a measurement given the actual value
• Can be anything, not just Gaussian like Kalman
• But we use a Gaussian for the running example, just like Kalman

[Figure: p(zi | xi) plotted against zi, centered on xi]

For the running example, the measurement is a noisy version of the actual value.

E.g. measured speed (in z) will be slower if emotional state (in x) is "tired"
Particle Filter Ingredients

p(xi | xi−1)
• Probabilistic dynamics: how the state changes through time
• Can be anything, e.g.
  • Tend to go slower up hills
  • Avoid left turns
  • Attracted to Scandinavian people
• A closed form is not necessary
• Just need a dynamic simulation with a noise component
• But we use a Gaussian for the running example, just like Kalman

[Figure: xi generated from xi−1 plus a random vector]
Home Example

Measurements: z = ((x, y) location in house from WiFi)ᵀ
State (what we want to estimate): x = (room, activity)

Measurement model p(zi | xi), e.g.:
• p((x, y) in kitchen | in bathroom) = 0

Dynamics model p(xi | xi−1), e.g.:
• p(sleeping now | sleeping previously) = 0.9
• p(cooking now | working previously) = 0.02
• p(watching TV & sleeping | *) = 0
• p(bedroom 4 | master bedroom) = 0

Rich measurement and state dynamics models
Particle Filter Algorithm

Start with N instances of the state vector xi(j), i = 0, j = 1 … N
1. i = i + 1
2. Take a new measurement zi
3. Propagate particles forward in time with p(xi | xi−1), i.e. generate new, random hypotheses
4. Compute importance weights wi(j) = p(zi | xi(j)), i.e. how well does the measurement support the hypothesis?
5. Normalize the importance weights so they sum to 1.0
6. Randomly pick new particles based on the importance weights
7. Go to 1

Compute the state estimate:
• Weighted mean (assumes unimodal)
• Median

Sleepy eyes threat level: orange
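The steps above can be sketched as a bootstrap particle filter for the running example. The dynamics and measurement models match the Kalman setup (straight-line motion, Gaussian noise); σr and σs are assumed tuning values.

```python
import numpy as np

def particle_filter(zs, n_particles=1000, sigma_r=3.0, sigma_s=1.0, seed=0):
    """Particle filter over (x, y) measurements zs (an n-by-2 array).
    Particle columns are x, y, vx, vy; returns (x, y) estimates."""
    rng = np.random.default_rng(seed)
    p = np.zeros((n_particles, 4))
    p[:, :2] = zs[0] + rng.normal(0, sigma_r, size=(n_particles, 2))  # init near first z
    estimates = []
    for z in zs:
        # 3. Propagate with p(x_i | x_{i-1}): straight-line motion,
        #    velocity perturbed by N(0, sigma_s); dt = 1 second
        p[:, 2:] += rng.normal(0, sigma_s, size=(n_particles, 2))
        p[:, :2] += p[:, 2:]
        # 4. Importance weights w = p(z_i | x_i): Gaussian likelihood
        d2 = ((p[:, :2] - z) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / sigma_r**2)
        w /= w.sum()                                  # 5. normalize to 1.0
        # 6. Resample particles in proportion to their weights
        p = p[rng.choice(n_particles, size=n_particles, p=w)]
        # State estimate: mean of the (resampled) particles
        estimates.append(p[:, :2].mean(axis=0))
    return np.array(estimates)
```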
Particle Filter Running Example

[Figure: actual path vs. particle filter estimates with 1,000 and 1,000,000 particles, x (meters) vs. y (meters)]

Sometimes increasing the number of particles helps.

p(zi | xi): the measurement model reflects the true, simulated measurement noise. Same as Kalman in this case.

p(xi | xi−1):
xi = xi−1 + Δti vi(x) – location is standard straight-line motion
vi(x) = vi−1(x) + N(0, σs) – velocity changes randomly (because we don't have any idea what it actually does)

Straight-line motion with a random velocity change. Same as Kalman in this case.
Particle Filter Resources
UbiComp 2004
Especially Chapter 1
Particle Filter
Editorial: Particle filter
The particle filter is wonderfully rich and expressive if you can afford the computations. Be careful not to let your state vector get too large.
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Hidden Markov Model (HMM)

Big difference from the previous methods: states are discrete, e.g.
• Spoken phoneme
• {walking, driving, biking, riding bus}
• {moving, still}
• {cooking, sleeping, watching TV, playing game, …}

[Photos: Andrey Markov (1856–1922), and "Hidden Markov"]
(Unhidden) Markov Model

[State diagram: three states (bus, walk, drive) with transition probabilities labeling the arrows, ranging from 0.0 to 0.9]

• Move to a new state (or not)
  • at every time click
  • when finished with the current state
• Transition probabilities control the state transitions

Example inspired by:
UbiComp 2003
Hidden Markov Model

[State diagram: the same bus/walk/drive states and transition probabilities as before]

Can "see" the states only via a noisy sensor (accelerometer)
HMM: Two Parts

[Trellis diagram: initial state probabilities P(X0(j)), then alternating transition probabilities ajk and observation probabilities P(Xi(j) | zi) at each time step]

Two parts to every HMM:
1) Observation probabilities P(Xi(j) | zi) – probability of state j given the measurement at time i
2) Transition probabilities ajk – probability of a transition from state j to state k

• Find the path that maximizes the product of the probabilities (observation & transition)
• Use the Viterbi algorithm to find the path efficiently
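A compact Viterbi sketch, working in log space so the long product of probabilities doesn't underflow (the input matrices are hypothetical; zero-probability entries are clipped rather than passed to log):

```python
import numpy as np

def viterbi(obs_prob, trans_prob, init_prob):
    """Find the state path maximizing the product of observation and
    transition probabilities.
    obs_prob[i, j]: observation probability for state j at time i
    trans_prob[j, k]: a_jk, probability of moving from state j to k
    init_prob[j]: initial probability of state j"""
    lo = np.log(np.maximum(obs_prob, 1e-300))    # clip zeros to avoid -inf/NaN
    lt = np.log(np.maximum(trans_prob, 1e-300))
    score = np.log(np.maximum(init_prob, 1e-300)) + lo[0]
    n_steps, n_states = obs_prob.shape
    back = np.zeros((n_steps, n_states), dtype=int)
    for i in range(1, n_steps):
        cand = score[:, None] + lt               # score of arriving from each state
        back[i] = cand.argmax(axis=0)            # remember the best predecessor
        score = cand.max(axis=0) + lo[i]
    path = [int(score.argmax())]                 # best final state, then backtrack
    for i in range(n_steps - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```

With sticky transition probabilities, a few contrary observations are not enough to flip the inferred state, which is exactly the smoothing effect the next slides show.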
Smooth Results with HMM

Classify moving vs. still:

[Figure: signal strength vs. time (sec.), with periods labeled still and moving; two-state diagram with self-transition probabilities 0.99989 and cross-transition probabilities 0.00011]

Signal strength has a higher noise variance when moving → observation probabilities

Transitions between states are relatively rare (made-up numbers) → transition probabilities
Smooth Results with HMM

[Figure: trellis of still/moving states over the signal, with transition probabilities 0.99989/0.00011 and example observation probabilities (0.4/0.6, 0.2/0.8, 0.9/0.1, 0.7/0.3)]

The Viterbi algorithm finds the path with the maximum product of observation and transition probabilities.

[Figure: still vs. moving estimate over time (seconds): actual, inferred, and inferred-and-smoothed-with-HMM traces]

Results in fewer false transitions between states, i.e. smoother and slightly more accurate
Running Example

Hidden Markov Model: discrete states are 10,000 1 m × 1 m squares

[Figure: HMM-estimated path, x (meters) vs. y (meters)]

Observation probabilities spread in a Gaussian over nearby squares as per the measurement noise model.

Transition probabilities go to the 8-connected neighbors:

0.011762  0.136136  0.011762
0.139640  0.401401  0.139640
0.011762  0.136136  0.011762
HMM Reference

• Good description of the Viterbi algorithm
• Also how to learn the model from data
Hidden Markov Model

Editorial: Hidden Markov Model
The HMM is great for certain applications when your states are discrete.

Tracking in (x, y, z) with an HMM?
• Huge state space (→ slow)
• Long dwells
• Interactions with other airplanes
Outline

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results
Presenting Continuous Performance Results

ei = ‖x̂i − xi‖ – the Euclidean distance between the estimated value x̂i and the actual value xi

Plot the mean or median of the Euclidean distance error:
• The median is less sensitive to error outliers

[Bar charts: tracking error vs. filter (Measured, Mean, Median, Kalman (untuned), Kalman (tuned), Particle, HMM), showing mean and median error in meters]

Note: Don't judge these filtering methods based on these plots. I didn't spend much time tuning the methods to improve their performance.
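Computing the per-sample errors and their summaries is a one-liner with NumPy, sketched here on made-up estimates:

```python
import numpy as np

def tracking_errors(estimated, actual):
    """e_i = ||x_hat_i - x_i|| (Euclidean distance per sample), plus mean
    and median summaries; the median is less sensitive to error outliers."""
    e = np.linalg.norm(estimated - actual, axis=1)
    return e, e.mean(), np.median(e)

est = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
act = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 100.0]])
e, mean_e, median_e = tracking_errors(est, act)
print(e)          # [0. 5. 100.]
print(median_e)   # 5.0 -- the huge error barely affects the median
```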
Presenting Continuous Performance Results

[Figure: cumulative error distribution, fraction of samples vs. error (meters), for the Median, HMM, Kalman (tuned), Particle, Mean, and Kalman (untuned) filters]

Cumulative error distribution:
• Shows how the errors are distributed
• More detailed than just a mean or median error

50% of the time, the particle filter gives an error of 2 meters or less (median error)
95% of the time, the particle filter gives an error of 6 meters or less (95th percentile error)
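The cumulative error distribution and percentile errors can be computed as follows (the error values here are made up for illustration):

```python
import numpy as np

def error_cdf(errors):
    """Empirical cumulative error distribution: for each sorted error value,
    the fraction of samples with an error at or below it."""
    e = np.sort(errors)
    frac = np.arange(1, len(e) + 1) / len(e)
    return e, frac   # plot frac vs. e to get the CDF curve

errors = np.array([1.0, 2.0, 2.0, 3.0, 10.0])
print(np.median(errors))           # 2.0 -- median (50th percentile) error
print(np.percentile(errors, 95))   # 95th percentile error
```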
Presenting Discrete Performance Results

Techniques like the particle filter and HMM can classify sequential data into discrete classes.

Confusion matrix (rows: actual activities; columns: inferred activities):

                 Sitting  Standing  Walking  Up stairs  Down stairs  Elevator down  Elevator up  Brushing teeth
Sitting            75%      24%       1%       0%          0%            0%            0%            0%
Standing           29%      55%       6%       1%          0%            4%            3%            2%
Walking             4%       7%      79%       3%          4%            1%            1%            1%
Up stairs           0%       1%       4%      95%          0%            0%            1%            0%
Down stairs         0%       1%       7%       0%         89%            2%            0%            0%
Elevator down       0%       2%       1%       0%          8%           87%            1%            0%
Elevator up         0%       2%       2%       6%          0%            3%           87%            0%
Brushing teeth      2%      10%       3%       0%          0%            0%            0%           85%

Pervasive 2006
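A confusion matrix like the one above can be built along these lines, row-normalized so each actual class sums to 100% (the toy labels are hypothetical, and the sketch assumes every class appears at least once):

```python
import numpy as np

def confusion_matrix(actual, inferred, n_classes):
    """Entry [a, p]: fraction of samples of actual class a that were
    inferred as class p. Rows sum to 1 (assumes every class appears)."""
    m = np.zeros((n_classes, n_classes))
    for a, p in zip(actual, inferred):
        m[a, p] += 1
    return m / m.sum(axis=1, keepdims=True)

# Toy example with classes 0 = sitting, 1 = standing (hypothetical labels)
actual   = [0, 0, 0, 0, 1, 1]
inferred = [0, 0, 0, 1, 1, 1]
print(confusion_matrix(actual, inferred, 2))
# row 0: 75% of sitting samples inferred correctly, 25% confused with standing
```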
End

[Recap figures: tracking error vs. filter bar chart; actual path and measured locations, x (meters) vs. y (meters)]

• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
  • Mean and median filters
  • Kalman filter
  • Particle filter
  • Hidden Markov model
• Presenting performance results

Ubiquitous Computing Fundamentals, CRC Press, © 2010