
Supervisor: Dr. Eddie Jones

Electronic Engineering Department Final Year Project 2008/09

Development of a Speaker Recognition/Verification System for Security Applications

Contents

What is a Speech Recognition/Verification System
How Speech Recognition Works
How Speech Verification Works
Basics
Milestones
Timeline
Questions

What is a Speech Recognition/Verification System

Speech recognition applications include voice dialing (e.g., "Call home").

Speech recognition converts spoken words to a digital signal. This signal is divided into very short waveform segments, which are compared with a database of known pronunciations, called phonemes. The matches are then used to recognise the word that was spoken.

Speech verification applications try to verify correctness of pronunciation.

Speech verification does not try to decode unknown speech. Instead, knowing what is to be said, it attempts to match the pitch, pronunciation etc. of the spoken sentence with the pitch, pronunciation etc. of a stored speaker.

How Speech Recognition Works

Speech at its basic level is broken down into phonemes – representations of the sounds we make and put together to form sentences.

When the analogue speech is converted to a digital signal, it is divided into small segments as short as a few hundredths of a second. These segments are then matched to known phonemes (~40 in the English language).
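The segmentation step can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `split_into_frames`, the 20 ms frame length, the 10 ms hop and the 8 kHz sample rate in the usage note are all assumptions chosen to match "a few hundredths of a second".

```python
def split_into_frames(samples, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split a sequence of samples into fixed-length, overlapping frames.

    frame_ms and hop_ms are illustrative defaults (assumptions), not values
    taken from the project.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop lengths must be at least one sample")

    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames
```

For example, one second of audio sampled at 8 kHz gives frames of 160 samples each, and each frame would then be passed on to feature extraction and phoneme matching.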

A program will then examine phonemes in the context of other phonemes around them.

How Speech Verification Works

There are essentially 4 different types of speech verification:

1. Fixed Phrase Verification – One fixed phrase is stored and, when verifying, that phrase is used to identify the user.

2. Fixed Vocabulary Verification – Multiple phrases are stored and when verifying, one at random is used to identify the user.

3. Flexible Vocabulary Verification – During system training using phrases, a set of subwords is generated, and these can be used during verification.

4. Text Independent Verification – The system learns the user's voice (pitch, dialect, pronunciation, tone etc.). In verification, the user is free to say anything he/she wishes and the system recognises their voice.
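The difference between the first two types comes down to how the prompt is chosen at verification time, which can be sketched as below. The phrases in `ENROLLED_PHRASES` and the function `select_prompt` are hypothetical names invented for illustration; they are not part of the project.

```python
import random

# Hypothetical phrases recorded by the user during enrolment (assumption).
ENROLLED_PHRASES = [
    "open the main door",
    "my voice is my password",
    "access level two",
]


def select_prompt(mode, phrases, rng=None):
    """Pick the phrase the user is asked to say during verification."""
    if not phrases:
        raise ValueError("at least one enrolled phrase is required")
    if mode == "fixed_phrase":
        # Fixed phrase: the same stored phrase is used every time.
        return phrases[0]
    if mode == "fixed_vocabulary":
        # Fixed vocabulary: one of the stored phrases is chosen at random.
        rng = rng or random.Random()
        return rng.choice(phrases)
    raise ValueError(f"unsupported mode: {mode}")
```

Choosing the prompt at random means a recording of one earlier attempt is less likely to match the phrase requested next time.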

Basics

Training – Recording multiple speakers saying many varying sentences. The recordings are then analysed and stored for future recognition. Certain speakers will have a higher security level, and this can be used in security applications.

Recognition – Setting the system to recognition mode and reading out the sentence provided. If a match is found, access to a secured location is granted. If not, the user must try again or is denied access.
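The match decision in recognition mode can be sketched as a nearest-template search with a threshold. This is only an illustration under stated assumptions: the feature vectors, the Euclidean distance measure, the threshold value and the names `euclidean_distance` and `find_match` are all hypothetical, and the project's actual classifier may differ.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_match(features, templates, threshold):
    """Return the name of the closest stored speaker, or None if no
    stored template is within the threshold (access not granted)."""
    best_name, best_dist = None, math.inf
    for name, template in templates.items():
        dist = euclidean_distance(features, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None
```

A lower threshold makes false acceptance less likely but forces genuine users to retry more often, so in practice it would be tuned during the baseline performance evaluation.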

[Figure: Basic Block Diagram of Training. Label: Stored Voice 1]

Note: There will be multiple stored voices for speech verification

[Figure: Basic Block Diagram of Verification. Labels: Unknown voice; Fails; Passes security check]

Milestones

1. Research into Speaker Recognition/Verification and simulate on various FEP in Matlab.

2. Simulation of classifier(s), and baseline performance evaluation of system.

3. Investigate speaker recognition/verification over the Internet. Simulation of channel errors and additive noise.

4. Investigation of real-time implementation, including selecting a suitable development platform. Translation to C with a view to real-time implementation and functional verification against the Matlab reference.

5. Development of a real-time version of the system.
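The channel-error and additive-noise simulation named in milestone 3 could look something like the sketch below. It is written in Python for illustration only (the milestones name Matlab and C), and it assumes white Gaussian noise at a target signal-to-noise ratio and independent random bit errors; the function names `add_noise` and `flip_bits` and both channel models are assumptions, not the project's chosen method.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR (dB)."""
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def flip_bits(data, bit_error_rate, rng):
    """Flip each bit of a byte string independently with the given probability."""
    out = bytearray(data)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < bit_error_rate:
                out[i] ^= 1 << bit
    return bytes(out)
```

Passing a seeded `random.Random` instance as `rng` makes each simulated condition repeatable, which helps when comparing recognition accuracy across noise levels.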

Timeline

Milestone 1: Mid-Late November

Milestone 2: Mid January

Milestone 3: February (Pending research)

Milestone 4: Early March

Milestone 5: Late March

Questions?
