Image Processing Algorithm for Speech Acoustics


Erin Plasse

Advisors: Professor Hanson, Professor Rudko

Introduction

Experiment done in the 1960s by Kenneth Stevens and Dr. Sven Öhman in Sweden.

Used cineradiography (X-ray motion pictures) to take lateral images of the vocal tract.

31 utterances and 2 sentences were recorded.

Analyzed how the articulators are displaced over time.

45 frames/second.

Movie Clip

Image Processing

Perkell (1969) used manual methods to make tracings of the images.

Typical Tracing

Perkell (1969) used manual tracings.

Typical Analysis

Perkell (1969) used manual measures

Goals & Parameters

Design an algorithm in MATLAB to automate the tracings using edge detection methods.

Trace certain articulators, such as the lips, velum, epiglottis, and hyoid bone.

Results should be similar to the original tracings.

Only 13 utterances were analyzed.

Obtain tracings for the 20 utterances not analyzed by Perkell (1969).

Manual extraction is time consuming.

Curves should be smooth and continuous.

Design Alternatives

Snakes: Active Contour Models (MATLAB script written by Eric Debreuve)

Chan-Vese Region-Based Segmentation Algorithm (MATLAB script written by Shawn Lankton)

EdgeTrak System for ultrasound images (VIMS Lab, University of Delaware)

Customize one of the above to create a design suited to the data.

Snakes: Active Contour Models (Michael Kass)

Snake: an energy-minimizing spline guided by external forces.

Image forces pull it toward lines and edges.

MATLAB code written by Eric Debreuve.

Only worked with binary images.

$$E = \int_0^1 \frac{1}{2}\left(\alpha\,|x'(s)|^2 + \beta\,|x''(s)|^2\right) + E_{ext}(x(s))\,ds$$
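A discrete version of the snake energy can make the two internal terms concrete. The sketch below is in Python rather than the MATLAB used in the project, and it is not Debreuve's script: the function name `snake_energy`, the finite-difference scheme, and the default weights are assumptions chosen for illustration. Here `alpha` weights the first-derivative (tension) term and `beta` the second-derivative (rigidity) term.

```python
def snake_energy(points, ext_energy, alpha=1.0, beta=1.0):
    """Discrete approximation of the snake energy for an open contour.

    points: list of (x, y) snake elements along the contour.
    ext_energy: callable (x, y) -> float giving the external (image) energy.
    alpha, beta: assumed weights on the tension and rigidity terms.
    """
    n = len(points)
    total = 0.0
    for i, (x, y) in enumerate(points):
        # Tension: forward difference, skipped at the last point.
        if i + 1 < n:
            dx = points[i + 1][0] - x
            dy = points[i + 1][1] - y
            total += 0.5 * alpha * (dx * dx + dy * dy)
        # Rigidity: central second difference on interior points only.
        if 0 < i < n - 1:
            ddx = points[i - 1][0] - 2 * x + points[i + 1][0]
            ddy = points[i - 1][1] - 2 * y + points[i + 1][1]
            total += 0.5 * beta * (ddx * ddx + ddy * ddy)
        total += ext_energy(x, y)
    return total


def flat_image(x, y):
    # Hypothetical image with no edges: external energy is zero everywhere.
    return 0.0


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
bent = [(0, 0), (1, 1), (2, 0), (3, 1)]
e_straight = snake_energy(straight, flat_image)
e_bent = snake_energy(bent, flat_image)
```

With no image forces, the zig-zag contour has higher energy than the straight one, which is why minimizing the internal terms favors smooth curves.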

Chan-Vese Algorithm

Region-based segmentation.

Uses homogeneity of intensity in a region as the constraint.

Only applicable to closed contours.

Uses an initial mask region.

MATLAB script written by Shawn Lankton.
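The homogeneity constraint can be illustrated with the region-fitting part of the Chan-Vese energy: each pixel is penalized by its squared deviation from the mean intensity of the region it falls in. This is a minimal Python sketch under stated assumptions, not Lankton's MATLAB script; it omits the contour-length and area regularization terms of the full functional, and the name `chan_vese_fit_energy` is hypothetical.

```python
def chan_vese_fit_energy(image, mask):
    """Sum of squared deviations from the mean intensity inside and outside the mask.

    image: 2D list of intensities; mask: 2D list of 0/1 flags of the same shape.
    """
    inside, outside = [], []
    for img_row, mask_row in zip(image, mask):
        for value, flag in zip(img_row, mask_row):
            (inside if flag else outside).append(value)
    # Mean intensity of each region (0.0 if a region is empty).
    c1 = sum(inside) / len(inside) if inside else 0.0
    c2 = sum(outside) / len(outside) if outside else 0.0
    return (sum((v - c1) ** 2 for v in inside)
            + sum((v - c2) ** 2 for v in outside))


# Toy image: bright left half, dark right half.
image = [[10, 10, 0, 0],
         [10, 10, 0, 0]]
good_mask = [[1, 1, 0, 0],
             [1, 1, 0, 0]]
bad_mask = [[1, 1, 1, 0],
            [1, 1, 1, 0]]
e_good = chan_vese_fit_energy(image, good_mask)
e_bad = chan_vese_fit_energy(image, bad_mask)
```

The mask that matches the bright region gives zero fitting energy, while the mask that leaks into the dark region is penalized, so evolving the mask to lower this energy pulls the boundary toward the region edge.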

Pharynx using Chan-Vese

EdgeTrak System (Li, Kambhamettu, Stone)

Uses gradient image forces and intensity information in local regions.

Energy definition for snakes: $E_{Total} = \alpha E_{int} + \beta E_{ext}$

Energy band gap

External energy is redefined for EdgeTrak as: $E'_{ext}(v_i) = E_{band}(v_i) \cdot E_{ext}(v_i)$

Not effective for closed contours.

Good for tracking the tongue in noisy images with high-contrast unrelated edges.
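To show how the band term enters the total, the sketch below evaluates the weighted energy with the external term scaled by the band energy at each snake element. It is a Python illustration, not the EdgeTrak implementation: summing over elements, the function name, and all numeric values are assumptions, and here `alpha` and `beta` weight the internal and external energies as on this slide.

```python
def edgetrak_total_energy(e_int, e_ext, e_band, alpha, beta):
    """Total energy with the band-scaled external term, summed over snake elements.

    e_int, e_ext, e_band: per-element energies (lists of equal length).
    """
    total = 0.0
    for ei, ee, eb in zip(e_int, e_ext, e_band):
        e_ext_prime = eb * ee  # E'_ext(v_i) = E_band(v_i) * E_ext(v_i)
        total += alpha * ei + beta * e_ext_prime
    return total


# Made-up per-element values, for illustration only.
e_int = [1.0, 2.0]
e_ext = [-4.0, -2.0]
e_band = [1.0, 0.5]
total = edgetrak_total_energy(e_int, e_ext, e_band, alpha=0.2, beta=0.8)
```

A band energy below 1 weakens the pull of an edge at that element, which is how the intensity information can discount strong but unrelated edges.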

Energy Minimization Band

Main contribution of the EdgeTrak method: it uses the intensity of the regions around the contour.

Energy band regions are found around each snake element.

Find the mean intensity difference between regions.

Find the new external energy using the band energy.

Minimize the total energy using dynamic programming.
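The dynamic programming step can be sketched for an open contour as follows. Each snake element has a small set of candidate positions, and the recursion picks one candidate per element so that the summed energy is globally minimal. This Python sketch is an assumption-laden illustration, not the EdgeTrak code: it uses only a first-order internal term (squared distance between consecutive elements), and `minimize_open_snake` and the toy energy are hypothetical.

```python
def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def minimize_open_snake(candidates, ext_energy, alpha=1.0, beta=1.0):
    """candidates[i] lists the candidate (x, y) positions of snake element i.

    Minimizes sum(beta * ext(v_i)) + sum(alpha * |v_i - v_{i-1}|^2).
    Returns (best_energy, best_path).
    """
    # cost[j]: best energy of a partial contour ending at candidate j.
    cost = [beta * ext_energy(p) for p in candidates[0]]
    back = []  # back[i - 1][j]: best predecessor index for candidate j of element i
    for i in range(1, len(candidates)):
        prev = candidates[i - 1]
        new_cost, pointers = [], []
        for p in candidates[i]:
            k = min(range(len(prev)),
                    key=lambda m: cost[m] + alpha * dist2(prev[m], p))
            new_cost.append(cost[k] + alpha * dist2(prev[k], p)
                            + beta * ext_energy(p))
            pointers.append(k)
        cost = new_cost
        back.append(pointers)
    # Trace the optimal path back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    best = cost[j]
    path = [candidates[-1][j]]
    for i in range(len(back) - 1, -1, -1):
        j = back[i][j]
        path.append(candidates[i][j])
    path.reverse()
    return best, path


# Toy problem: three elements, candidates at y = 0, 1, 2; the "edge" lies at y = 1.
candidates = [[(x, y) for y in range(3)] for x in range(3)]
edge_at_one = lambda p: -1.0 if p[1] == 1 else 0.0
best_energy, best_path = minimize_open_snake(candidates, edge_at_one,
                                             alpha=0.1, beta=1.0)
```

Unlike a greedy update, this search considers all combinations of candidates along the contour, at a cost proportional to the number of elements times the square of the number of candidates per element.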

EdgeTrak Program

The Final Design

Combined methods from the EdgeTrak system, Chan-Vese, and snakes.

Implemented using MATLAB.

Used only the image gradient to find edges.

The tongue is the articulator focused on.

MATLAB Code

User picks 5 points.

33 snake elements are found using spline interpolation.

Computes internal and external energy of the initial snake elements.

Computes internal and external energies of the points surrounding each initial point.

Finds the surrounding point with the lowest energy; this becomes the new point.

New contour is graphed.
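The steps on this slide can be sketched as an interpolation followed by a greedy neighborhood search. The code below is Python, not the author's MATLAB program, and several pieces are assumptions: a Catmull-Rom spline stands in for the unspecified spline interpolation (8 samples on each of the 4 segments plus the end point gives 33 elements), the internal energy is the mean squared distance to the neighboring elements, and `radius` is a hypothetical search-window size.

```python
def catmull_rom(points, samples_per_segment=8):
    """Interpolate a smooth open curve through the control points."""
    # Duplicate the end points so every segment has four control points.
    p = [points[0]] + list(points) + [points[-1]]
    curve = []
    for i in range(1, len(p) - 2):
        p0, p1, p2, p3 = p[i - 1], p[i], p[i + 1], p[i + 2]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t * t
                       + (3 * b - a - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)))
    curve.append(tuple(float(v) for v in points[-1]))
    return curve


def greedy_step(snake, ext_energy, alpha=0.2, beta=0.8, radius=1):
    """Move each element to the lowest-energy point in its neighborhood."""
    new_snake = []
    for i, (x, y) in enumerate(snake):
        neighbors = [snake[j] for j in (i - 1, i + 1) if 0 <= j < len(snake)]
        best, best_e = (x, y), None
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                cx, cy = x + dx, y + dy
                # Assumed internal energy: mean squared distance to neighbors.
                e_int = sum((cx - nx) ** 2 + (cy - ny) ** 2
                            for nx, ny in neighbors) / len(neighbors)
                e = alpha * e_int + beta * ext_energy((cx, cy))
                if best_e is None or e < best_e:
                    best, best_e = (cx, cy), e
        new_snake.append(best)
    return new_snake


# Five hypothetical user-picked points and a toy external energy
# whose minimum (the "edge") lies along y = 3.
picked = [(0, 0), (10, 0), (20, 0), (30, 0), (40, 0)]
snake = catmull_rom(picked)
edge_energy = lambda p: (p[1] - 3) ** 2
moved = snake
for _ in range(3):
    moved = greedy_step(moved, edge_energy)
```

In this toy case every element climbs one pixel per iteration until it sits on the edge. With this simple internal term the end elements also drift toward their neighbors, so a practical version would likely add a continuity term that preserves element spacing.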

MATLAB Code Demo

%Edge_trak_demo

%Coded by Erin Plasse

Final Results

Results

Energy of original snake = -96.9553

Energy of new snake = 1.2244

Percent change in snake energy = 101.2629

Alpha = .2, Beta = .8, Delta = 5

E_snake_orig = -21.8775

E_snake_new = -0.5480

Percent_change_Snake_energy = 97.495
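The slides do not state how the percent change is computed. The Python sketch below uses a formula inferred from the reported numbers, the change in energy divided by the magnitude of the original energy; this is an assumption, although it reproduces both reported values to within rounding of the four-decimal energies.

```python
def percent_change(e_orig, e_new):
    # Assumed formula, inferred from the reported results.
    return 100.0 * (e_new - e_orig) / abs(e_orig)


first = percent_change(-96.9553, 1.2244)    # reported as 101.2629
second = percent_change(-21.8775, -0.5480)  # reported as 97.495
```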

Initial Points

Final Points

Application to other articulators

Cont.

Future Work

Apply the contour model to a sequence of consecutive frames.

Find more articulators.

Use the intensity method for external energy as described in the EdgeTrak program.

References

Perkell, Joseph S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, MA: The MIT Press, 1969.

Stevens, Kenneth, and Öhman, Sven. "Cineradiographic Studies of Speech: Procedures and Objectives." J. Acoust. Soc. Am., vol. 35, p. 1889, 1963.

M. Kass, A. Witkin, and D. Terzopoulos. "Snakes: Active Contour Models." Int. J. Comput. Vis., vol. 1, pp. 321-331, 1988.

T. F. Chan and L. A. Vese. "Active Contours Without Edges." IEEE Trans. Image Process., vol. 10, pp. 266-277, 2001.

M. Li, C. Kambhamettu, and M. Stone. "Automatic Contour Tracking in Ultrasound Images." 2004.

Questions?
