S. Patil, S. Srinivasan, S. Prasad, R. Irwin, G. Lazarou and J. Picone
Intelligent Electronic Systems, Center for Advanced Vehicular Systems
Mississippi State University
URL: http://www.cavs.msstate.edu/hse/ies/publications/conferences/ieee_secon/2006/state_space_algorithms/
SEQUENTIAL STATE-SPACE FILTERS FOR SPEECH ENHANCEMENT
Page 2 of 15: Sequential State-Space Filters for Speech Enhancement
Recursive State-Space Filters
Signal Model: models the state x(n) as the output of a linear system excited by white noise v1(n):

    x(n+1) = F(n+1, n) x(n) + v1(n)

Observation Model: relates the observable output of the system y(n) to the state x(n):

    y(n) = H(n) x(n) + v2(n)

The filtering problem is posed as follows: using the set of observed data y(1), ..., y(n), find for each n ≥ 1 the MMSE estimate of the state vector x(n+1), and then estimate the clean signal s(n) from the signal model, where y(n) = s(n) + v2(n) and ŝ(n) = H x̂(n | y_n).
Auto-Regressive Model for Speech
• It is common to model speech data as being generated by an AR process:

    s(n) = Σ_{i=1}^{p} a(i) s(n−i) + u(n)

  a(i): AR coefficients    p: AR order    u(n): driving/process noise

• This framework can be used in a conventional state-space setup as follows:

    X_k = F X_{k−1} + U_k    (state evolution)
    Y_k = H X_k + V_k        (measurement)

  where X_k = [ŝ_k, ŝ_{k−1}, ..., ŝ_{k−p+1}]^T are the filtered estimates, Y_k are the noisy observations,

    F = | a_1  a_2  ...  a_{p−1}  a_p |
        |  1    0   ...     0      0  |
        |  0    1   ...     0      0  |
        |  :    :           :      :  |
        |  0    0   ...     1      0  |    (state-evolution matrix)

    H = [1, 0, ..., 0]                     (measurement matrix)
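As a concrete illustration, the companion-matrix construction above can be sketched in NumPy (the function name and the AR coefficient values below are ours, not from the paper):

```python
import numpy as np

def ar_state_space(a):
    """Build the state-transition matrix F and measurement row H for an
    AR(p) model with coefficients a = [a1, ..., ap] (hypothetical helper)."""
    p = len(a)
    F = np.zeros((p, p))
    F[0, :] = a                     # first row carries the AR coefficients
    F[1:, :p - 1] = np.eye(p - 1)   # sub-diagonal identity shifts the state
    H = np.zeros((1, p))
    H[0, 0] = 1.0                   # observe only the most recent sample
    return F, H

# Example with made-up AR(3) coefficients:
F, H = ar_state_space([0.5, -0.2, 0.1])
```

Multiplying the state [s_k, s_{k−1}, s_{k−2}]^T by F yields [Σ a_i s_{k−i+1}, s_k, s_{k−1}]^T, i.e. one step of the AR recursion.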
Kalman Filters
• Linear recursive filters where a Gaussian random (state) vector is propagated through a linear state space model.
• Objective: to find a 'filtered' estimate of the state vector, X_k, at time k, represented as a linear combination of the measurements up to time k, such that the following quadratic cost function is minimized:
• P, Q and R represent the covariance matrices for the state-estimation error, process noise and measurement noise.
    J = E[ (X_k − X̂_k)^T M (X_k − X̂_k) ],   M a symmetric nonnegative definite weighting matrix

Prediction:
    X̂_k⁻ = F X̂_{k−1}
    P_k⁻ = F P_{k−1} F^T + Q

Estimation / Filtering:
    K_k = P_k⁻ H^T (H P_k⁻ H^T + R)^{−1}
    X̂_k = X̂_k⁻ + K_k (Y_k − H X̂_k⁻)
    P_k = (I − K_k H) P_k⁻
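A minimal NumPy sketch of one predict/update cycle, transcribing the prediction and estimation equations on this slide (the function name and the toy example values are ours):

```python
import numpy as np

def kalman_step(x, P, y, F, H, Q, R):
    """One predict/update cycle of the linear Kalman filter."""
    # Prediction
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Estimation / filtering
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)    # correct with the innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy 1-D example: one noisy measurement of a constant state
x1, P1 = kalman_step(np.zeros(1), np.eye(1), np.array([1.0]),
                     np.eye(1), np.eye(1), 0.01 * np.eye(1), np.eye(1))
```

With equal prior and measurement variance the gain is close to 1/2, so the updated state lands about halfway to the measurement and the posterior variance shrinks.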
Nonlinear Recursive State-Space Filters
Nonlinear State Estimation:

    x_{k+1} = F(x_k, v_k)
    y_k = H(x_k, n_k)        (F, H are nonlinear vector functions)

• Predicting the state vector:   x̂_k⁻ = E[F(x̂_{k−1}, v_{k−1})]
• Predicting the observation:   ŷ_k⁻ = E[H(x̂_k⁻, n_k)]
• Kalman gain:   K_k = P_{x_k y_k} P_{y_k y_k}^{−1}
• a posteriori estimate:   x̂_k = x̂_k⁻ + K_k (y_k − ŷ_k⁻), where (y_k − ŷ_k⁻) is the innovation

• When propagating a GRV through a linear model (Kalman filtering), it is easy to compute the expectation E[·].
• If the same GRV passes through a nonlinear model, the resultant estimate may no longer be Gaussian, and calculation of E[·] is no longer easy!
• The calculation of E[·] in the recursive filtering process requires an estimate of the pdf of the propagating random vector.
Unscented Kalman Filters
• Extended Kalman Filters (EKFs) avoid this by linearizing the state-space equations and reducing the estimate equations to point estimates:

    x̂_k⁻ = F(x̂_{k−1}, v̄)
    ŷ_k⁻ = H(x̂_k⁻, n̄)
    K_k = P_{x_k y_k} P_{y_k y_k}^{−1}

• The covariance matrices (P) are computed by linearizing the dynamics:

    x_{k+1} ≈ A x_k + B v_k,    y_k ≈ C x_k + D n_k

• The state distributions are approximated by a GRV and propagated analytically through a 'first-order linearization' of the nonlinear system.
• If the assumption of local linearity is violated, the resulting filter will be unstable.
• The linearization process itself involves computation of Jacobians and Hessians (which is nontrivial).
The Unscented Transform
Problem statement: calculation of the statistics of a random variable which has undergone a nonlinear transformation
Approach: It is easier to approximate a distribution than it is to approximate an arbitrary nonlinear transformation.
Consider a nonlinearity: y = g(x)

• The unscented transform chooses (deterministically) a set of 'sigma' points in the original (x) space which, when mapped to the transformed (y) space, will be capable of generating accurate sample means and covariance matrices in the transformed space.
• An accurate estimation of these allows us to use the Kalman filter in the usual setup, with the expectation values replaced by sample means and covariance matrices as appropriate.
Unscented Kalman Filters (UKFs)
• Instead of propagating a GRV through the “linearized” system, the UKF technique focuses on an efficient estimation of the statistics of the random vector being propagated through a nonlinear function
• To compute the statistics of y = g(x), we 'cleverly' sample this function at 2L+1 deterministically chosen sigma points about the mean of x:

    χ_0 = x̄
    χ_i = x̄ + ( √((L+λ) P_x) )_i,      i = 1, 2, ..., L
    χ_i = x̄ − ( √((L+λ) P_x) )_{i−L},  i = L+1, ..., 2L

  where ( · )_i denotes the i-th column of the matrix square root.
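The sigma-point construction can be sketched as follows; using a Cholesky factor as the matrix square root is an implementation choice on our part, and the function name is ours:

```python
import numpy as np

def sigma_points(x_mean, P, lam=1.0):
    """Generate the 2L+1 sigma points about x_mean; lam is the
    scaling parameter lambda from the sigma-point definition."""
    L = len(x_mean)
    S = np.linalg.cholesky((L + lam) * P)  # square root of (L+lam)*P
    pts = [x_mean]
    for i in range(L):
        pts.append(x_mean + S[:, i])       # points 1..L
    for i in range(L):
        pts.append(x_mean - S[:, i])       # points L+1..2L
    return np.array(pts)

# Example: 5 sigma points for a 2-D zero-mean, identity-covariance vector
pts = sigma_points(np.zeros(2), np.eye(2), lam=1.0)
```

Because the points come in symmetric ± pairs around the mean, their unweighted sample mean already equals x̄.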
Unscented Kalman Filters (UKFs)
It can be proved that with these as sample points, the following ‘weighted’ sample means and covariance matrices will be equal to the true values
• We use this set of equations to predict the states and observations in the nonlinear state-estimation model. The expected values E[·] and the covariance matrices are replaced with these 'weighted' sample versions.
    ȳ ≈ Σ_{i=0}^{2L} W_i^{(m)} y_i
    P_y ≈ Σ_{i=0}^{2L} W_i^{(c)} (y_i − ȳ)(y_i − ȳ)^T,    where y_i = g(χ_i), i = 0, 1, ..., 2L

with weights

    W_0^{(m)} = λ / (L + λ)
    W_0^{(c)} = λ / (L + λ) + (1 − α² + β)
    W_i^{(m)} = W_i^{(c)} = 1 / (2(L + λ)),    i = 1, 2, ..., 2L
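These weight definitions and the weighted sample statistics translate directly into code (a sketch; the parameter defaults and function names are ours):

```python
import numpy as np

def ut_weights(L, lam=1.0, alpha=1.0, beta=2.0):
    """Mean (Wm) and covariance (Wc) weights for 2L+1 sigma points."""
    Wm = np.full(2 * L + 1, 1.0 / (2 * (L + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha ** 2 + beta)
    return Wm, Wc

def unscented_transform(sigma_y, Wm, Wc):
    """Weighted sample mean and covariance of transformed sigma points."""
    y_mean = Wm @ sigma_y             # weighted sample mean
    d = sigma_y - y_mean              # deviations from the mean
    P_y = (Wc[:, None] * d).T @ d     # weighted sample covariance
    return y_mean, P_y
```

For a linear g these recover the exact statistics: pushing the sigma points of a zero-mean, identity-covariance vector through the identity map returns ȳ = 0 and P_y = I.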
Particle Filters
• Particle filters are being increasingly employed for sequential state estimation in systems with nonlinear state-space models
• These do not restrict the statistics of the propagating random vector to a Gaussian distribution
• These are based on sequential Monte Carlo techniques
The particle filtering process forms a pipeline:

Observed Signal → Represent the pdfs using particles → Update these particles with each observation → Use these 'learned' pdfs for state estimation → Use the state estimates for prediction / filtering / smoothing
Particle Filters
• Probability distributions are represented in terms of randomly picked samples (particles)
• These particles are then recursively updated using Bayesian estimation procedures
• Direct generation of particles for arbitrary distributions is difficult
• Particles are instead generated from another distribution, the importance density function, chosen so that samples are easy to draw
• Accuracy of density estimation using particles, and hence, of state estimation and filtering will improve with an increase in the number of particles used
• 100 particles were used for density estimation in this setup
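A bootstrap particle filter along these lines can be sketched as follows. It uses the transition prior as the importance density, so the weights reduce to the measurement likelihood; Gaussian noise, a scalar state, and all function/parameter names are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pf(y, f, h, q_std, r_std, n_particles=100):
    """Bootstrap particle filter for a scalar model
    x_k = f(x_{k-1}) + q,  y_k = h(x_k) + r."""
    particles = rng.standard_normal(n_particles)    # initial particle cloud
    estimates = []
    for yk in y:
        # propagate particles through the (possibly nonlinear) state model
        particles = f(particles) + q_std * rng.standard_normal(n_particles)
        # weight by the Gaussian measurement likelihood
        w = np.exp(-0.5 * ((yk - h(particles)) / r_std) ** 2)
        w /= w.sum()
        estimates.append(w @ particles)             # MMSE state estimate
        # resample to combat weight degeneracy
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)
```

Tracking a constant state through an identity observation model, the particle cloud collapses onto the true value within a few observations.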
Experimental Setup
• A comparative analysis of the performance of three state-space sequential filtering algorithms is presented
• Three test signals were used:
• An Auto-Regressive process with known driving parameters
• A single tone sinusoid
• Sample data from an acoustic utterance
• An Auto-Regressive model (order=10) was used to represent these signals.
• Filtering was performed on a per-frame basis, to ensure that stationarity is not violated.
• The AR coefficients are learned using the Yule-Walker equations.
• Filtering in each frame is initialized with the last 10 samples of the previous (filtered) frame.
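The per-frame AR estimation step can be sketched as follows, solving the Yule-Walker normal equations from biased autocorrelation estimates (our sketch; the authors may use a different solver such as Levinson-Durbin):

```python
import numpy as np

def yule_walker(frame, p=10):
    """Estimate AR(p) coefficients for one frame via the
    Yule-Walker equations with biased autocorrelation estimates."""
    n = len(frame)
    # autocorrelation lags r[0..p]
    r = np.array([frame[:n - k] @ frame[k:] for k in range(p + 1)]) / n
    # Toeplitz system R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])    # AR coefficients a1..ap

# Sanity check on a synthetic AR(1) process with a1 = 0.8
rng = np.random.default_rng(1)
x = np.zeros(5000)
for i in range(1, 5000):
    x[i] = 0.8 * x[i - 1] + rng.standard_normal()
a_hat = yule_walker(x, p=1)
```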
Experimental results
Output MSE as a function of input SNR (dB):

            AR process              Sine                    Speech
I/P SNR   KF      PF      UKF     KF      PF      UKF     KF      PF      UKF
 -10.0  0.0293  0.0182  0.0064  0.0514  0.5607  0.1369  0.0194  0.0185  0.0079
  -5.0  0.0128  0.0012  0.0052  0.0150  0.5408  0.0832  0.0065  0.0060  0.0051
   0.0  0.0039  0.0051  0.0038  0.0053  0.1911  0.0366  0.0022  0.0019  0.0023
   5.0  0.0015  0.0020  0.0026  0.0017  0.0882  0.0244  0.0007  0.0006  0.0011
  10.0  0.0006  0.0007  0.0020  0.0006  0.0363  0.0207  0.0002  0.0002  0.0008
  15.0  0.0001  0.0003  0.0017  0.0002  0.0129  0.0195  0.0001  0.0001  0.0007
  20.0  0.0006  0.0005  0.0016  0.0001  0.0050  0.0206  0.0001  0.0001  0.0007
KF performs better at high SNRs
UKF performs better at low SNRs
Conclusions and Future Work
• A comparison of three sequential state-space filtering algorithms was presented
• Performance studied as a function of SNR:
At high SNRs, KF consistently outperforms UKF and PF
At low SNRs, UKF and PF perform better than KF
• This experimental design is ready for applications in speech enhancement and robust speech recognition.
• The code is publicly available and suitable for use in large-scale speech enhancement / recognition applications.
• This work can be extended to a nonlinear AR model using a perceptron to learn the nonlinear vector functions.
• This will allow modeling of nonlinear state-space functions and measurement functions.
• Another attractive model for speech recognition applications could be a linear AR model and a nonlinear observation model.
Resources
• Pattern Recognition Applet: compare popular linear and nonlinear algorithms on standard or custom data sets
• Speech Recognition Toolkits: a state-of-the-art ASR toolkit for testing the efficacy of these algorithms on recognition tasks
• Foundation Classes: generic C++ implementations of many popular statistical modeling approaches
References
• L. Rabiner and B. Juang, Fundamentals of Speech Recognition, Prentice Hall, Englewood Cliffs, NJ, 1993.
• M. Gabrea, “Robust Adaptive Kalman Filtering-Based Speech Enhancement Algorithm,” Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. I-301–I-304, May 2004.
• E. Wan and R. van der Merwe, “The Unscented Kalman Filter for Nonlinear Estimation,” Proceedings of the Adaptive Systems for Signal Processing, Communications and Control Symposium, pp. 153-158, November 2005.
• G. Welch and G. Bishop, “Introduction to the Kalman Filter,” Technical Report TR 95-041, Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, April 2004.
• S. Haykin and E. Moulines, “From Kalman to Particle Filters,” Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, PA, USA, March 2005.
• R. van der Merwe, N. de Freitas, A. Doucet, and E. Wan, “The Unscented Particle Filter,” Technical Report CUED/F-INFENG/TR 380, Cambridge University Engineering Department, Cambridge University, U.K., August 2000.
• P. Djuric, J. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M. Bugallo, and J. Miguez, “Particle Filtering,” IEEE Signal Processing Magazine, vol. 20, no. 5, pp. 19-38, September 2003.
• M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking,” IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174-188, February 2002.