Text Independent Speaker Identification Using Gaussian Mixture Model

International Conference on Intelligent and Advanced Systems 2007

Chee-Ming Ting, Sh-Hussain Salleh, Tian-Swee Tan, A. K. Ariff

Jain-De Lee

OUTLINE

◦ INTRODUCTION
◦ GMM SPEAKER IDENTIFICATION SYSTEM
◦ EXPERIMENTAL EVALUATION
◦ CONCLUSION

INTRODUCTION

Speaker recognition is generally divided into two tasks
◦ Speaker Verification (SV)
◦ Speaker Identification (SI)

Speaker model
◦ Text-dependent (TD)
◦ Text-independent (TI)

Many approaches have been proposed for TI speaker recognition
◦ VQ-based method
◦ Hidden Markov Models
◦ Gaussian Mixture Model

VQ based method


Hidden Markov Models
◦ State Probability
◦ Transition Probability

Classify acoustic events corresponding to HMM states to characterize each speaker in TI task

TI performance is unaffected by discarding transition probabilities in HMM models


Gaussian Mixture Model

◦ Corresponds to a single-state continuous ergodic HMM
◦ Discarding the transition probabilities in the HMM models

The use of GMM for speaker identity modeling

◦ The Gaussian components represent some general speaker-dependent spectral shapes

◦ The capability of Gaussian mixture to model arbitrary densities


GMM SPEAKER IDENTIFICATION SYSTEM

The GMM speaker identification system consists of the following elements
◦ Speech processing
◦ Gaussian mixture model
◦ Parameter estimation
◦ Identification

Speech Processing

The Mel-scale frequency cepstral coefficients (MFCC) extraction is used in front-end processing

Mel-scale cepstral feature analysis: Input speech signal → Pre-emphasis → Frame → Hamming window → FFT → Triangular band-pass filter → Logarithm → DCT
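The front-end chain (pre-emphasis, framing, Hamming window, FFT, triangular band-pass filters, logarithm, DCT) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the sample rate, frame length, hop size, FFT size, filter count, and pre-emphasis coefficient are all assumed values that the slides do not specify, and the function name `mfcc` is hypothetical.

```python
import numpy as np

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=12, pre_emph=0.97):
    """Pre-emphasis -> framing -> Hamming -> FFT -> mel filterbank -> log -> DCT.

    All default parameter values are assumptions chosen for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: first-order high-pass filter.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # Framing (the signal is assumed to be at least one frame long).
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each zero-padded frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular band-pass filters spaced evenly on the mel scale.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)

    # Logarithm of the filterbank energies (small offset avoids log(0)).
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II; keep coefficients 1..n_ceps (the 0th coefficient is dropped).
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), n + 0.5) / n_filters)
    return log_energy @ basis.T
```

Calling `mfcc(x)` on a one-dimensional waveform returns one row of `n_ceps` coefficients per frame; setting `n_ceps=12` or `n_ceps=16` matches the feature dimensions mentioned in the experiments.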

Gaussian mixture model

The Gaussian mixture density is a weighted linear combination of M unimodal Gaussian component densities

$$p(x \mid \lambda) = \sum_{i=1}^{M} w_i \, b_i(x)$$

where $x$ is a D-dimensional vector, $b_i(x)$, $i = 1, \ldots, M$, are the component densities, and $w_i$, $i = 1, \ldots, M$, are the mixture weights

The mixture weights satisfy the constraint that

$$\sum_{i=1}^{M} w_i = 1$$

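Each component density is a D-variate Gaussian function of the form

$$b_i(x) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_i)^{T} \Sigma_i^{-1} (x - \mu_i) \right\}$$

where $\mu_i$ is the mean vector and $\Sigma_i$ is the covariance matrix

The Gaussian mixture density model is denoted as

$$\lambda = (w_i, \mu_i, \Sigma_i), \quad i = 1, \ldots, M$$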
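As a small illustration of the mixture density, the weighted sum of D-variate Gaussian components can be evaluated directly. The function names `component_density` and `gmm_density` are hypothetical, and this direct form is only a sketch: practical systems usually work with log densities to avoid underflow.

```python
import numpy as np

def component_density(x, mean, cov):
    """b_i(x): D-variate Gaussian density with mean vector and covariance matrix."""
    d = len(mean)
    diff = x - mean
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} diff without forming the inverse.
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def gmm_density(x, weights, means, covs):
    """p(x | lambda) = sum_i w_i * b_i(x)."""
    return sum(w * component_density(x, m, c)
               for w, m, c in zip(weights, means, covs))
```

For example, a mixture of two identical standard normal components with weights 0.5 and 0.5 gives the same value as a single standard normal density.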

Parameter estimation

Conventional GMM training process: Input training vectors → LBG algorithm → EM algorithm → Convergence? (Y) → End

LBG Algorithm

Flowchart (in order): Input training vectors → Overall average → Clustering → Cluster's average → Calculate distortion → (D − D')/D < δ ? → D' = D → m < M ? → End
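The LBG steps (overall average, clustering, cluster averages, distortion test, growing the codebook until it has M entries) can be sketched as follows. The slides do not show how the codebook grows, so the binary split by a small additive perturbation `eps` is an assumption, as are the requirement that the codebook size is a power of two, the default thresholds, and the function name `lbg`. The stopping test uses the relative decrease in distortion between passes.

```python
import numpy as np

def lbg(vectors, n_codewords, delta=1e-3, eps=0.01):
    """Grow a codebook by repeated splitting (n_codewords assumed a power of two)."""
    vectors = np.asarray(vectors, dtype=float)
    codebook = vectors.mean(axis=0, keepdims=True)        # overall average
    while len(codebook) < n_codewords:                     # m < M
        # Assumed split rule: perturb every codeword in two directions.
        codebook = np.vstack([codebook + eps, codebook - eps])
        prev = np.inf                                      # D'
        while True:
            # Clustering: assign each vector to its nearest codeword.
            dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Distortion D of the assignment just made.
            distortion = dists.min(axis=1).mean()
            # Cluster averages (empty clusters keep their old codeword).
            for i in range(len(codebook)):
                members = vectors[labels == i]
                if len(members) > 0:
                    codebook[i] = members.mean(axis=0)
            if distortion == 0 or (prev - distortion) / distortion < delta:
                break
            prev = distortion                              # D' = D
    return codebook
```

The resulting codewords would typically serve as initial component means for the EM stage of conventional GMM training.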

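EM Algorithm

Speaker model training estimates the GMM parameters via maximum likelihood (ML) estimation, using the expectation-maximization (EM) algorithm

For T training vectors $X = \{x_1, \ldots, x_T\}$, the likelihood is

$$p(X \mid \lambda) = \prod_{t=1}^{T} p(x_t \mid \lambda)$$

Mixture weight re-estimation

$$\bar{w}_i = \frac{1}{T} \sum_{t=1}^{T} p(i \mid x_t, \lambda)$$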
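One EM iteration for a GMM can be sketched as below. To keep the example short it assumes diagonal covariance matrices (stored as per-dimension variances) and a small variance floor, neither of which is stated in the slides; the names `log_component_densities` and `em_step` are hypothetical. The posterior `post[t, i]` plays the role of p(i | x_t, λ).

```python
import numpy as np

def log_component_densities(X, means, variances):
    """log b_i(x_t) for diagonal-covariance components; returns shape (T, M)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.log(2.0 * np.pi * variances).sum(axis=1)[None, :]
                   + (diff ** 2 / variances[None, :, :]).sum(axis=2))

def em_step(X, weights, means, variances):
    """One EM iteration.

    Returns the updated weights, means, variances, and log p(X | lambda)
    evaluated with the parameters passed in (before the update).
    """
    # E-step: posterior p(i | x_t, lambda), computed with log-sum-exp.
    log_joint = np.log(weights)[None, :] + log_component_densities(X, means, variances)
    peak = log_joint.max(axis=1, keepdims=True)
    log_px = peak + np.log(np.exp(log_joint - peak).sum(axis=1, keepdims=True))
    post = np.exp(log_joint - log_px)
    # M-step: re-estimate weights, means, and variances.
    counts = post.sum(axis=0)
    new_weights = counts / len(X)
    new_means = (post.T @ X) / counts[:, None]
    new_vars = (post.T @ X ** 2) / counts[:, None] - new_means ** 2
    # Assumed variance floor to keep the densities well defined.
    return new_weights, new_means, np.maximum(new_vars, 1e-6), float(log_px.sum())
```

Repeating `em_step` until the returned log-likelihood stops changing gives a simple training loop; the log-likelihood should not decrease from one iteration to the next.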

This paper proposes an algorithm that consists of two steps

Cluster the training vectors to the mixture component with the highest likelihood

$$C = \arg\max_{1 \le i \le M} b_i(x)$$

Re-estimate the parameters of each component

◦ $w_i$ = number of vectors classified in cluster i / total number of training vectors
◦ $\mu_i$ = sample mean of the vectors classified in cluster i
◦ $\Sigma_i$ = sample covariance matrix of the vectors classified in cluster i
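The two steps (hard assignment to the most likely component, then re-estimation from the assigned vectors) can be sketched as one training pass. This is an illustration under stated assumptions, not the paper's code: covariances are taken to be diagonal, clusters with fewer than two vectors keep their previous mean and variance, a small variance floor is applied, and the name `highest_likelihood_step` is hypothetical.

```python
import numpy as np

def highest_likelihood_step(X, means, variances):
    """One pass of highest-mixture-likelihood clustering (diagonal covariances assumed)."""
    X = np.asarray(X, dtype=float)
    means = np.array(means, dtype=float)
    variances = np.array(variances, dtype=float)
    # Step 1: assign each vector to the component with the highest b_i(x).
    diff = X[:, None, :] - means[None, :, :]
    log_b = -0.5 * (np.log(2.0 * np.pi * variances).sum(axis=1)[None, :]
                    + (diff ** 2 / variances[None, :, :]).sum(axis=2))
    labels = log_b.argmax(axis=1)
    # Step 2: re-estimate each component from the vectors in its cluster.
    weights = np.zeros(len(means))
    for i in range(len(means)):
        members = X[labels == i]
        weights[i] = len(members) / len(X)
        if len(members) > 1:
            means[i] = members.mean(axis=0)
            variances[i] = np.maximum(members.var(axis=0), 1e-6)
    return weights, means, variances, labels
```

Compared with an EM iteration, each vector contributes to exactly one component, so no posterior probabilities need to be accumulated.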

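Identification

The feature sequence is classified to the speaker whose model likelihood is the highest

$$\hat{S} = \arg\max_{1 \le k \le S} p(X \mid \lambda_k)$$

The above can be formulated in logarithmic terms

$$\hat{S} = \arg\max_{1 \le k \le S} \log p(X \mid \lambda_k) = \arg\max_{1 \le k \le S} \sum_{t=1}^{T} \log p(x_t \mid \lambda_k)$$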
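The decision rule in logarithmic form can be sketched as scoring the test vectors against every speaker model and picking the largest total log-likelihood. Diagonal covariances are assumed here for brevity, and the names `gmm_log_likelihood` and `identify`, as well as the representation of a model as a `(weights, means, variances)` tuple, are hypothetical.

```python
import numpy as np

def gmm_log_likelihood(X, weights, means, variances):
    """log p(X | lambda) = sum_t log sum_i w_i b_i(x_t), diagonal covariances assumed."""
    X = np.asarray(X, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = X[:, None, :] - means[None, :, :]
    log_b = -0.5 * (np.log(2.0 * np.pi * variances).sum(axis=1)[None, :]
                    + (diff ** 2 / variances[None, :, :]).sum(axis=2))
    log_joint = np.log(np.asarray(weights, dtype=float))[None, :] + log_b
    # Log-sum-exp over components, then sum over frames.
    peak = log_joint.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    return float(per_frame.sum())

def identify(X, speaker_models):
    """Return the index k of the model with the highest log p(X | lambda_k)."""
    scores = [gmm_log_likelihood(X, *model) for model in speaker_models]
    return int(np.argmax(scores))
```

Summing log-likelihoods over frames avoids the numerical underflow that multiplying many small per-frame likelihoods would cause.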

EXPERIMENTAL EVALUATION

Database and Experiment Conditions
◦ 7 male and 3 female speakers
◦ The same 40 sentence utterances, each with different text
◦ The average sentence duration is approximately 3.5 s

Performance Comparison between EM and Highest Mixture Likelihood Clustering Training
◦ 16 Gaussian components
◦ 16-dimensional MFCCs
◦ 20 utterances are used for training

Convergence condition

$$\left| p(X \mid \lambda^{(k+1)}) - p(X \mid \lambda^{(k)}) \right| < 0.03$$

The comparison between EM and highest likelihood clustering training on identification rate
◦ 10 sentences were used for training
◦ 25 sentences were used for testing
◦ 4 Gaussian components
◦ 8 iterations

Effect of Different Number of Gaussian Mixture Components and Amount of Training Data
◦ MFCC feature dimension is fixed to 12
◦ 25 sentences are used for testing

Effect of Feature Set on Performance for Different Number of Gaussian Mixture Components
◦ Combination with first and second order difference coefficients was tested
◦ 10 sentences are used for training
◦ 30 sentences are used for testing
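First and second order difference coefficients can be appended to the static MFCCs as sketched below. The slides do not give the difference formula, so the regression over a window of ±2 frames with edge padding is an assumption, and the names `delta` and `add_deltas` are hypothetical.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based difference coefficients over +/- width frames.

    The window width and the edge padding (repeating the first and last
    frames) are assumptions made for this sketch.
    """
    features = np.asarray(features, dtype=float)
    n_frames = len(features)
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n: width + n + n_frames]
        behind = padded[width - n: width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom

def add_deltas(static):
    """Stack static, first order, and second order difference coefficients."""
    d1 = delta(static)
    d2 = delta(d1)
    return np.hstack([static, d1, d2])
```

With 12 static coefficients per frame, `add_deltas` yields 36-dimensional vectors; dropping `d2` gives the 24-dimensional set with first order differences only.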

CONCLUSION

The proposed highest mixture likelihood clustering training performs comparably to conventional EM training but with less computational time

First order difference coefficients are sufficient to capture the transitional information with reasonable dimensional complexity

The 16-component GMM with 12-dimensional features, trained on 5 sentences, achieved a 98.4% identification rate
