Erin Plasse
Advisors: Professor Hanson, Professor Rudko

Image Processing Algorithm for Speech Acoustics



Page 2: Image Processing Algorithm for Speech Acoustics

Introduction
  • Experiment done in the 1960s by Kenneth Stevens and Dr. Sven Öhman in Sweden
  • Used a cineradiograph x-ray to take lateral images of the vocal tract
  • 31 utterances and 2 sentences were recorded
  • Analyzed how the articulators displace over time
  • 45 frames/second

Page 3: Image Processing Algorithm for Speech Acoustics

Movie Clip

Page 4: Image Processing Algorithm for Speech Acoustics

Image Processing

Perkell (1969) used manual methods to make tracings of the images

Page 5: Image Processing Algorithm for Speech Acoustics

Typical Tracing

Perkell (1969) used manual

Page 6: Image Processing Algorithm for Speech Acoustics

Typical Analysis

Perkell (1969) used manual measures

Page 7: Image Processing Algorithm for Speech Acoustics

Goals & Parameters
  • Design an algorithm in MATLAB to automate the tracings using edge detection methods
  • Trace certain articulators, such as the lips, velum, epiglottis, hyoid bone, etc.
  • Results should be similar to the original tracings
  • Only 13 utterances were analyzed
  • Obtain tracings for the 20 utterances not analyzed by Perkell (1969)
  • Manual extraction is time consuming
  • Smooth and continuous curves

Page 8: Image Processing Algorithm for Speech Acoustics

Design Alternatives

  • Snakes: Active Contour Models
      MATLAB script written by Eric Debreuve
  • Chan-Vese Region Based Segmentation Algorithm
      MATLAB script written by Shawn Lankton
  • EdgeTrak System for ultrasound images
      VIMS Lab, University of Delaware
  • Customize one of the above to create our own design for the data

Page 9: Image Processing Algorithm for Speech Acoustics

Snakes: Active Contour Models
Michael Kass
  • Snake: energy-minimizing spline guided by external forces
  • Image forces pull it toward lines and edges
  • MATLAB code written by Eric Debreuve
  • Only worked with binary images

$$E = \int_0^1 \tfrac{1}{2}\left(\alpha\,|x'(s)|^2 + \beta\,|x''(s)|^2\right) + E_{ext}(x(s))\,ds$$
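A discrete version of the snake energy functional can be sketched as follows. This is a Python illustration (the project itself used MATLAB), assuming unit spacing between snake elements, finite differences for the derivatives, and the elasticity and rigidity weights `alpha` and `beta` of Kass et al.; the function name `snake_energy` and the callable `ext_energy` are hypothetical.

```python
def snake_energy(points, ext_energy, alpha, beta):
    """Approximate the snake energy of an open contour.

    points     : list of (x, y) snake elements
    ext_energy : callable (x, y) -> external energy at that position
    alpha/beta : weights on the first- and second-derivative terms
    """
    n = len(points)
    total = 0.0
    for i, (x, y) in enumerate(points):
        e = ext_energy(x, y)
        # Internal energy is evaluated at interior elements only,
        # where both neighbours exist.
        if 0 < i < n - 1:
            (xp, yp), (xn, yn) = points[i - 1], points[i + 1]
            dx, dy = (xn - xp) / 2.0, (yn - yp) / 2.0       # x'(s), central difference
            ddx, ddy = xn - 2 * x + xp, yn - 2 * y + yp     # x''(s)
            e += 0.5 * (alpha * (dx * dx + dy * dy)
                        + beta * (ddx * ddx + ddy * ddy))
        total += e
    return total


flat = lambda x, y: 0.0  # no image forces, internal energy only
straight = snake_energy([(0, 0), (1, 0), (2, 0)], flat, 1.0, 1.0)
bent = snake_energy([(0, 0), (1, 1), (2, 0)], flat, 1.0, 1.0)
print(straight, bent)
```

With no external energy, the bent contour scores higher than the straight one, which is the smoothing behavior the internal terms are meant to provide.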

Page 10: Image Processing Algorithm for Speech Acoustics

Chan-Vese Algorithm
  • Region-based segmentation
  • Uses homogeneity of intensity in a region as the constraint
  • Only applicable to closed contours
  • Uses an initial mask region
  • MATLAB script written by Shawn Lankton
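The region-homogeneity idea can be illustrated with a heavily simplified two-region update. This Python sketch is not Lankton's MATLAB script and omits the curve-length (smoothness) term of the full Chan-Vese model: starting from an initial mask, it repeatedly computes the mean intensity inside and outside the mask and reassigns each pixel to the region whose mean it is closer to. All function names are hypothetical.

```python
def region_means(image, mask):
    """Mean intensity inside (mask True) and outside (mask False)."""
    inside, outside = [], []
    for img_row, mask_row in zip(image, mask):
        for v, m in zip(img_row, mask_row):
            (inside if m else outside).append(v)
    c1 = sum(inside) / len(inside) if inside else 0.0
    c2 = sum(outside) / len(outside) if outside else 0.0
    return c1, c2


def update_mask(image, mask):
    """Assign each pixel to the region whose mean intensity fits it better."""
    c1, c2 = region_means(image, mask)
    return [[(v - c1) ** 2 < (v - c2) ** 2 for v in row] for row in image]


def segment(image, mask, max_iter=50):
    """Iterate the update until the mask stops changing."""
    for _ in range(max_iter):
        new_mask = update_mask(image, mask)
        if new_mask == mask:
            break
        mask = new_mask
    return mask


# Toy image: a bright 2x2 block on a dark background.
image = [[0, 0, 0, 0],
         [0, 10, 10, 0],
         [0, 10, 10, 0],
         [0, 0, 0, 0]]
# Initial mask: the top-left 3x3 corner, overlapping the block only partly.
init = [[r < 3 and c < 3 for c in range(4)] for r in range(4)]
result = segment(image, init)
```

On this toy image the mask converges to the bright block, showing how an initial mask region evolves toward a closed region of homogeneous intensity.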

Page 11: Image Processing Algorithm for Speech Acoustics

Pharynx using Chan-Vese

Page 12: Image Processing Algorithm for Speech Acoustics

EdgeTrak System
Li, Kambhamettu, Stone
  • Uses gradient image forces and intensity information in local regions
  • Energy definition for snakes: ETotal = α Eint + β Eext
  • Energy band gap
  • External energy is redefined for EdgeTrak as: E′ext(vi) = Eband(vi) · Eext(vi)
  • Not effective for closed contours
  • Good for tracking the tongue in noisy images with high-contrast unrelated edges
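The two formulas on this slide combine as in the short Python sketch below. The per-element energy values are made up for illustration, and the function names are hypothetical; how Eband itself is computed is not shown here.

```python
def modulated_external_energy(e_band, e_ext):
    """E'ext(v_i) = Eband(v_i) * Eext(v_i), element by element."""
    return [b * e for b, e in zip(e_band, e_ext)]


def total_energy(e_int, e_ext, alpha, beta):
    """ETotal = alpha * Eint + beta * Eext, summed over all snake elements."""
    return sum(alpha * ei + beta * ee for ei, ee in zip(e_int, e_ext))


# Illustrative (invented) values for a two-element snake.
e_band = [1.0, 0.5]
e_ext = [-4.0, -2.0]
e_int = [1.0, 2.0]

e_ext_mod = modulated_external_energy(e_band, e_ext)
e_total = total_energy(e_int, e_ext_mod, alpha=0.2, beta=0.8)
print(e_ext_mod, e_total)
```

The band term scales the gradient-based external energy per element, so an edge with a small band value contributes less to the total.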

Page 13: Image Processing Algorithm for Speech Acoustics

Energy Minimization Band

  • Main contribution of the EdgeTrak method; finds the intensity of the regions
  • Energy band regions are found around each snake element
  • Find the mean intensity difference between regions
  • Find the new external energy using the band energy
  • Minimize the total energy using dynamic programming
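The dynamic-programming step can be sketched as a Viterbi-style search over candidate positions for each snake element. This Python sketch is a simplification, not the EdgeTrak implementation: it assumes the internal energy is only the squared distance between consecutive elements (weighted by `alpha`), and it takes the external energy as a callable. The names `dp_minimize`, `candidates`, and `ext_energy` are hypothetical.

```python
def dp_minimize(candidates, ext_energy, alpha):
    """Pick one candidate per snake element so the total energy is minimal.

    candidates[i] : list of (x, y) options for element i
    ext_energy    : callable (x, y) tuple -> external energy
    alpha         : weight on the squared distance between neighbours
    """
    # cost[j] = lowest energy of any partial contour ending at candidates[i][j]
    cost = [ext_energy(p) for p in candidates[0]]
    back = []  # back[i-1][j] = best predecessor index for candidates[i][j]
    for i in range(1, len(candidates)):
        new_cost, ptr = [], []
        for p in candidates[i]:
            best_k, best = 0, float("inf")
            for k, q in enumerate(candidates[i - 1]):
                c = cost[k] + alpha * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
                if c < best:
                    best, best_k = c, k
            new_cost.append(best + ext_energy(p))
            ptr.append(best_k)
        cost = new_cost
        back.append(ptr)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    best_total = cost[j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], best_total


# Three elements, each free to sit at y = 0, 1, or 2; the "edge" lies at y = 1.
candidates = [[(i, 0), (i, 1), (i, 2)] for i in range(3)]
edge = lambda p: -1.0 if p[1] == 1 else 0.0
contour, energy = dp_minimize(candidates, edge, alpha=0.1)
```

Unlike a greedy update, this search considers all combinations of candidates jointly, at a cost proportional to the number of elements times the square of the candidates per element.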

Page 14: Image Processing Algorithm for Speech Acoustics

EdgeTrak Program

Page 15: Image Processing Algorithm for Speech Acoustics

The Final Design

  • Used methods from the EdgeTrak System, Chan-Vese, and snake methods
  • Implemented using MATLAB
  • Used only the image gradient to find edges
  • The tongue is the articulator that is focused on
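A gradient-only external energy can be sketched as below. This Python illustration assumes the common choice of the negative squared gradient magnitude, so that strong edges become energy minima; the slides do not state the exact formula used, and the function names are hypothetical.

```python
def gradient_magnitude_sq(image):
    """Squared gradient magnitude using central differences.

    Indices are clamped at the image border, so border pixels get a
    one-sided difference scaled by 1/2.
    """
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = (image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]) / 2.0
            gy = (image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]) / 2.0
            g[r][c] = gx * gx + gy * gy
    return g


def external_energy(image):
    """External energy: low (very negative) where the gradient is strong."""
    return [[-v for v in row] for row in gradient_magnitude_sq(image)]


# Toy image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(3)]
e_ext = external_energy(image)
print(e_ext[1])
```

On the toy step edge, the two columns adjacent to the intensity jump receive the lowest energy, which is where a snake element would be pulled.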

Page 16: Image Processing Algorithm for Speech Acoustics

MATLAB Code

  • User picks 5 points
  • 33 snake elements found using spline interpolation
  • Computes internal and external energy of the initial snake elements
  • Computes internal and external energies of the points surrounding each initial point
  • Finds the surrounding point with the lowest energy; this becomes the new point
  • New contour is graphed
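The steps above can be sketched in Python as follows (the original is MATLAB and is not reproduced in these slides). Two assumptions are worth flagging: a Catmull-Rom spline stands in for whatever spline interpolation the MATLAB code used, with 8 samples per segment so that 5 picked points yield 33 elements; and `delta` is a hypothetical search radius that is not confirmed to be the Delta parameter reported with the results. The `energy_at` callable is also hypothetical.

```python
def catmull_rom(points, per_segment=8):
    """Resample control points with a Catmull-Rom spline through every point."""
    # Duplicate the end points so the first and last segments have neighbours.
    p = [points[0]] + list(points) + [points[-1]]
    out = []
    for i in range(1, len(p) - 2):
        p0, p1, p2, p3 = p[i - 1], p[i], p[i + 1], p[i + 2]
        for k in range(per_segment):
            t = k / per_segment
            out.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t * t
                       + (3 * b - a - 3 * c + d) * t ** 3)
                for a, b, c, d in zip(p0, p1, p2, p3)))
    out.append(tuple(float(v) for v in points[-1]))
    return out


def greedy_step(points, energy_at, delta=1):
    """Move each element to the lowest-energy position within +/- delta pixels.

    energy_at(i, candidate, points) returns the energy of placing
    element i at `candidate` while the other elements stay put.
    """
    new_points = []
    for i, (x, y) in enumerate(points):
        best, best_e = (x, y), energy_at(i, (x, y), points)
        for dx in range(-delta, delta + 1):
            for dy in range(-delta, delta + 1):
                cand = (x + dx, y + dy)
                e = energy_at(i, cand, points)
                if e < best_e:
                    best, best_e = cand, e
        new_points.append(best)
    return new_points


picked = [(0, 0), (10, 5), (20, 0), (30, 5), (40, 0)]  # 5 user-picked points
elements = catmull_rom(picked)                          # 33 snake elements

# Toy energy: stay near the current x, prefer the row y = 3.
toy = lambda i, p, pts: (p[0] - pts[i][0]) ** 2 + (p[1] - 3) ** 2
moved = greedy_step([(0, 0), (1, 5)], toy, delta=2)
```

In practice `greedy_step` would be repeated until no element moves, with `energy_at` combining the internal and external energies of the snake.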

Page 17: Image Processing Algorithm for Speech Acoustics

MATLAB Code Demo

%Edge_trak_demo

%Coded by Erin Plasse

Page 18: Image Processing Algorithm for Speech Acoustics

Final Results

Results
  • Energy of original snake = -96.9553
  • Energy of new snake = 1.2244
  • Percent change in snake energy = 101.2629

Alpha = 0.2, Beta = 0.8, Delta = 5

Page 19: Image Processing Algorithm for Speech Acoustics

E_snake_orig = -21.8775

E_snake_new = -0.5480

Percent_change_Snake_energy = 97.495
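Both reported percent changes are consistent with the formula (E_new - E_orig) / |E_orig| × 100. The slides do not state the formula, so the Python check below is an inference from the printed numbers; the function name `percent_change` is hypothetical.

```python
def percent_change(e_orig, e_new):
    """Change in snake energy relative to the magnitude of the original energy."""
    return (e_new - e_orig) / abs(e_orig) * 100.0


first = percent_change(-96.9553, 1.2244)    # reported: 101.2629
second = percent_change(-21.8775, -0.5480)  # reported: 97.495
print(round(first, 2), round(second, 2))
```

Small differences in the last decimal place are expected, since the energies on the slides are themselves rounded.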

Page 20: Image Processing Algorithm for Speech Acoustics

Initial Points

Final Points

Page 21: Image Processing Algorithm for Speech Acoustics

Application to other articulators

Page 22: Image Processing Algorithm for Speech Acoustics

Cont.

Page 23: Image Processing Algorithm for Speech Acoustics

Future Work
  • Apply the contour model to a sequence of consecutive frames
  • Find more articulators
  • Use the intensity method for external energy as described in the EdgeTrak program

Page 24: Image Processing Algorithm for Speech Acoustics

References
  • Perkell, Joseph S. Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, MA: The MIT Press, 1969.
  • Stevens, Kenneth, and Öhman, Sven. “Cineradiographic studies of speech: procedures and objectives.” J. Acoust. Soc. Am., vol. 35, p. 1889, 1963.
  • M. Kass, A. Witkin, and D. Terzopoulos. “Snakes: Active Contour Models.” Int. J. Comput. Vis., vol. 1, pp. 321-331, 1988.
  • T. F. Chan and L. A. Vese. “Active Contours Without Edges.” IEEE Trans. on Image Processing, vol. 10, pp. 266-277, 2001.
  • M. Li, C. Kambhamettu, and M. Stone. “Automatic Contour Tracking in Ultrasound Images.” 2004.

Page 25: Image Processing Algorithm for Speech Acoustics

Questions?