Teaching Tool For French Speech Pronunciation
Capstone Design Project 2008
Joseph Ciaburri
Advisor: Professor Catravas
Motivation
Use feedback that allows for self diagnosis
Make tool as simple as possible for student
Improve French pronunciation through the repetition of visual and aural aids
Tony Blair Congratulating Nicolas Sarkozy on Election Win
Interview with Dominique de Villepin
Proposed Learning System
[Block diagram: Window 1 shows the native speaker, with audio and video from the Audio/Visual Databank. The user's speech is captured by the microphone and webcam and passed to Data Acquisition, which sends audio and video data to Window 2 (audio and video of the user speaking) and to Window 3 (diagnostics).]
Design Specifications
Goals
Read in audio and video at the same time
Play back audio and video at the same time within 1 second
Minimize system requirements
Implement diagnostics that are sensitive to pronunciation differences
Provide pronunciation feedback via bulls eye
Simplicity
Microphone and Webcam to Data Acquisition
[Diagram: USER to Webcam (video) and Microphone (speech) to Data Acquisition (data)]
Webcam
Auto light compensation
Average frame rate: 15 frames per second
Large file stored as a variable
Length of video is short: ~5 seconds
Camera lights up when recording
Microphone
Sampling rate used is adjustable up to 44.1 kHz
Saved as a variable
Reads in simultaneously with video
Built into webcam
Auto noise cancellation
Power comes from computer
Able to crop to only speech
Repetition of User
[Diagram: Data Acquisition sends the user's audio and video data to Window 2 (video of user)]
Play back from variable
Allows for a quicker load time
Less than 3 seconds to load video
Audio and Video do not play in sync
Play length ~ 5 Seconds
Keeps memory requirements low
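A rough estimate shows why a short clip matters when the recording is held in a variable. This is only a sketch in Python (the project itself runs in MATLAB): the frame rate, clip length, and audio sampling rate come from the slides, while the 640x480 frame size, 8-bit RGB pixels, and double-precision mono audio are assumptions that the slides do not state.

```python
# Rough memory estimate for a short clip held in a variable.
FRAME_RATE_FPS = 15          # average frame rate, from the slides
CLIP_SECONDS = 5             # approximate clip length, from the slides
AUDIO_RATE_HZ = 44_100       # maximum sampling rate, from the slides

# Assumptions (NOT stated in the slides):
FRAME_WIDTH = 640            # assumed webcam resolution
FRAME_HEIGHT = 480
BYTES_PER_PIXEL = 3          # assumed 8-bit RGB
BYTES_PER_AUDIO_SAMPLE = 8   # assumed double precision, mono


def clip_memory_bytes(seconds=CLIP_SECONDS):
    """Return (video_bytes, audio_bytes) for an uncompressed clip."""
    frames = FRAME_RATE_FPS * seconds
    video = frames * FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PIXEL
    audio = AUDIO_RATE_HZ * seconds * BYTES_PER_AUDIO_SAMPLE
    return video, audio


video_bytes, audio_bytes = clip_memory_bytes()
print(f"video: {video_bytes / 1e6:.1f} MB, audio: {audio_bytes / 1e6:.1f} MB")
# prints "video: 69.1 MB, audio: 1.8 MB"
```

Under these assumptions the video dominates, and the footprint grows linearly with clip length, so doubling the clip to 10 seconds would double the memory needed.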
Diagnostics
[Diagram: Data Acquisition sends the user's audio and video data to Window 3 (diagnostics)]
Can create and graph spectrogram data
– Allows for determination of vowels using the formants, and of consonants using the transitions of the formants
Can create and graph cepstrum data
– Inverse Fourier transform of the log of the Fourier transform
Can find fundamental frequency
Can find zero crossings
– Zero crossings show silence versus speech
Bulls eye allows for two inputs, along the x and y axes, graphed as a percent distance from the center
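The cepstrum definition above (inverse Fourier transform of the log of the Fourier transform) can be sketched in a few lines. This is an illustrative Python version, not the project's MATLAB code. The Hamming window, the 50 to 400 Hz search range, and the function names are assumptions, and picking the largest cepstral peak is one common way to estimate a fundamental frequency, not necessarily the method used in the project.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT via the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def real_cepstrum(frame):
    """Inverse FFT of the log magnitude of the FFT of a windowed frame."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]  # Hamming window (assumed)
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(windowed)]
    return [v.real for v in ifft(log_mag)]


def estimate_f0(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate fundamental frequency from the largest cepstral peak."""
    cep = real_cepstrum(frame)
    q_lo = int(sample_rate / f_max)   # shortest pitch period, in samples
    q_hi = int(sample_rate / f_min)   # longest pitch period, in samples
    peak = max(range(q_lo, q_hi + 1), key=lambda q: cep[q])
    return sample_rate / peak


# Synthetic test frame: a decaying pulse train repeating every 40 samples,
# a crude stand-in for voiced speech at 8000 / 40 = 200 Hz.
SAMPLE_RATE = 8000
PERIOD = 40
frame = [0.0] * 1024
for k, start in enumerate(range(0, 1024, PERIOD)):
    frame[start] = 0.5 ** k

f0 = estimate_f0(frame, SAMPLE_RATE)
print(f"estimated fundamental frequency: {f0:.1f} Hz")
```

The search range limits the peak search to pitch periods plausible for speech; real recordings would also need framing and a voiced/unvoiced decision, which this sketch leaves out.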
Results
[Figure: Time Domain, Non Native Speaker and Native Speaker]
[Figure: Spectrogram, Non Native Speaker and Native Speaker]
Results Continued
[Figure: two plots of amplitude versus time (s), amplitude axis from -1 to 1]
[Figure: Cepstrum, Non Native Speaker and Native Speaker]
[Figure: Zero Crossings, Non Native Speaker and Native Speaker]
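For reference, zero crossings like those compared in the last pair of panels can be counted per frame as sketched below. This is an illustrative Python version, not the project's MATLAB code. The function names, the 400-sample frame length, and the synthetic signal (digital silence followed by a 100 Hz tone) are assumptions made for the example.

```python
import math


def zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    count = 0
    for a, b in zip(samples, samples[1:]):
        if (a >= 0) != (b >= 0):
            count += 1
    return count


def zero_crossings_per_frame(samples, frame_len):
    """Zero-crossing count for each non-overlapping frame."""
    return [zero_crossings(samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


# Synthetic signal at 8000 Hz: 0.1 s of digital silence, then 0.1 s of a
# 100 Hz tone (phase offset keeps samples away from exact zeros).
SAMPLE_RATE = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE + 0.5) for n in range(800)]

counts = zero_crossings_per_frame(silence + tone, frame_len=400)
print(counts)  # prints [0, 0, 10, 10]
```

A real recording contains background noise, so separating silence from speech would need a threshold on these counts (often combined with signal energy); the slides do not say how that threshold was chosen.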
Design Specifications
Goals
Read in audio and video at the same time
Play back audio and video at the same time within 1 second
Implement diagnostics that are sensitive to pronunciation differences
Provide pronunciation feedback via bulls eye
Simplicity
Minimize system requirements
Accomplished
Can read synchronized audio and video into MATLAB
Can play back audio or video separately, or unsynchronized audio and video in MATLAB
Can plot diagnostics and find fundamental frequency
Can plot on bulls eye
Uses an all-in-one webcam and keeps the whole program in MATLAB
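As described under Diagnostics, the bulls eye takes two inputs, along x and y, graphed as a percent distance from the center. A minimal Python sketch of that mapping follows (the project itself is in MATLAB). It assumes the center represents the native speaker's value for each input; the function names and the numbers are hypothetical placeholders, not measurements from the project.

```python
import math


def percent_offset(user_value, reference_value):
    """Signed percent difference of the user's value from the reference."""
    return 100.0 * (user_value - reference_value) / reference_value


def bullseye_point(user_xy, reference_xy):
    """Return (x, y, distance) in percent, with the reference at the center."""
    x = percent_offset(user_xy[0], reference_xy[0])
    y = percent_offset(user_xy[1], reference_xy[1])
    return x, y, math.hypot(x, y)


# Hypothetical placeholder values for two speech measurements.
reference = (300.0, 2000.0)   # assumed native-speaker values (center)
user = (330.0, 1800.0)        # assumed user values

x, y, distance = bullseye_point(user, reference)
print(f"x = {x:+.1f}%, y = {y:+.1f}%, distance = {distance:.1f}%")
# prints "x = +10.0%, y = -10.0%, distance = 14.1%"
```

A point at the center means both inputs match the reference; which two speech measurements feed the axes is still listed as work in progress.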
In Progress
• Identifying components of speech that are specific to French
– Vowels
– Consonants
• Quantifying these components and using them on the bulls eye
• Creating a GUI
• Gathering more video samples
Future Research
• Integrating other languages
• Evaluation
– Use of non-native speakers
– Use of native speakers
• Testing the role of facial communication in oral communication
• Basis for comparison of other audible signals
Acknowledgements
• Professor Rudko
• Professor Hanson
• Professor Streignitz
• Professor Cotter
• Professor Catravas
• Professor Chilcoat
• Professor Pickering
• Professor Spallholz