…chenzhe142.github.io/nu-eecs352/poster.pdf · 2015-03-22

Zhe Chen, Chenxing Wu, Maxin Chen
EECS 352 - Machine Perception of Music and Audio, Northwestern University

Figure 2 Spectrogram comparison of audio files in speech-to-sing function: (a) spectrogram of speech, (b) spectrogram of melody, (c) spectrogram of singing-speech. Each panel plots frequency (Hz, 0 to 1000) against frame number (0 to 350).
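The spectrograms in Figure 2 plot frequency against frame number. The poster does not say how they were computed, so the following is only a minimal sketch of one common approach: a Hann-windowed, frame-by-frame DFT with magnitudes converted to decibels. The function names, the frame length, the hop size and the decibel scaling are all assumptions made for illustration, and the naive DFT is chosen for readability rather than speed.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram_db(signal, frame_len, hop, eps=1e-12):
    """Return one row per frame; each row holds dB magnitudes for
    DFT bins 0 .. frame_len // 2 (naive DFT, fine for short signals)."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # eps avoids log10(0) for silent bins (assumed convention)
            row.append(20 * math.log10(abs(acc) + eps))
        frames.append(row)
    return frames


def bin_to_hz(k, frame_len, sample_rate):
    """Centre frequency of DFT bin k."""
    return k * sample_rate / frame_len


# Hypothetical test signal: a 100 Hz sine sampled at 1000 Hz.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(400)]
spec = spectrogram_db(tone, frame_len=100, hop=50)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = bin_to_hz(peak_bin, 100, sample_rate)
```

In this layout the frame index corresponds to the horizontal axis of each panel and the bin index, converted with `bin_to_hz`, to the vertical axis.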

Figure 3 User Interface

Figure 1 Pitches before and after auto-tuning. The plot shows frequency (Hz, 0 to 500) against frame number (0 to 900), with one curve before auto-tuning and one after auto-tuning.
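A pitch track like the one in Figure 1 needs a pitch estimate for every frame. The poster does not name the detector used, so this sketch swaps in a plain autocorrelation peak picker as an assumption. The function name and the 50 to 500 Hz search range are illustrative choices; the upper limit is simply borrowed from the frequency axis of Figure 1.

```python
import math


def detect_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame in Hz.

    Picks the lag with the largest autocorrelation inside the lag range
    that corresponds to [fmin, fmax]. Returns 0.0 when no positive
    correlation is found (for example, a silent frame).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


# Hypothetical frame: 400 samples of a 200 Hz sine at 8000 Hz.
frame = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
estimate = detect_pitch(frame, 8000)
silent = detect_pitch([0.0] * 400, 8000)
```

Running the detector on successive frames of a recording gives the kind of "before auto-tuning" curve the figure describes. A production detector would normally add voicing decisions and octave-error handling, which this sketch omits.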

Melody import
Output with 3 function choices
Recording
Input and display

Audio input (Recording)
Speech modification
Onset detection
Extract syllables
Adjust each syllable to the determined beats
Extract notes of the melody
Matching syllables and notes, modifying pitch
Modified speech output
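The onset detection and beat adjustment steps listed above can be sketched as follows. The poster gives no algorithmic detail, so everything here is an assumption: onsets are taken as frames where short-time energy crosses a fixed threshold, and each onset time is then snapped to the nearest point of a regular beat grid derived from a tempo in BPM. All function names and parameters are hypothetical.

```python
def short_time_energy(signal, frame_len):
    """Sum of squared samples for each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def detect_onsets(energy, threshold):
    """Frame indices where energy rises from below the threshold
    to at or above it (a crude syllable-onset marker)."""
    onsets = []
    prev = 0.0
    for idx, e in enumerate(energy):
        if e >= threshold and prev < threshold:
            onsets.append(idx)
        prev = e
    return onsets


def quantize_to_beats(onset_times, bpm, subdivisions=1):
    """Snap each onset time (seconds) to the nearest beat-grid point."""
    step = 60.0 / bpm / subdivisions
    return [round(t / step) * step for t in onset_times]


# Hypothetical signal: silence, burst, silence, burst (4 samples each).
signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [1.0] * 4
energy = short_time_energy(signal, frame_len=4)
onset_frames = detect_onsets(energy, threshold=1.0)

# Hypothetical onset times in seconds, aligned to a 120 BPM grid.
aligned = quantize_to_beats([0.05, 0.58, 0.97], bpm=120)
```

Moving a syllable to its aligned time would still require shifting or time-stretching the audio segment itself, which is outside the scope of this sketch.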

Pitch detection
Modifying to nearest whole tone
Tuned singing output
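The "nearest whole tone" step can be illustrated with a small pitch-snapping function. The poster does not say whether the target grid is every equal-tempered note (semitone spacing) or a literal whole-tone spacing of two semitones, so the sketch takes the grid step as a parameter. The A4 = 440 Hz reference and the function name are assumptions.

```python
import math


def snap_pitch(freq_hz, step_semitones=1, ref_hz=440.0):
    """Move freq_hz to the nearest grid pitch.

    The grid is spaced step_semitones apart in 12-tone equal temperament,
    anchored at ref_hz (assumed to be A4 = 440 Hz).
    """
    semitones_from_ref = 12 * math.log2(freq_hz / ref_hz)
    snapped = round(semitones_from_ref / step_semitones) * step_semitones
    return ref_hz * 2 ** (snapped / 12)


slightly_sharp = snap_pitch(445.0)               # stays on A4
closer_to_next = snap_pitch(455.0)               # moves up one semitone
whole_tone_grid = snap_pitch(470.0, step_semitones=2)
```

Applying this to each detected frame pitch yields the target curve; the audio would then be pitch-shifted by the ratio between the snapped and the detected frequency.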

Melody input
Choose Auto-tuning
Speech-to-rap
Speech-to-sing
Choose Speech modification
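For the speech-to-sing path, the flowchart names a step that matches syllables to melody notes and modifies pitch. How the matching is done is not stated, so this sketch assumes the simplest rule: the i-th syllable is paired with the i-th note, extra items on either side are dropped, and the required pitch change is expressed in semitones. The names and the pairing rule are hypothetical.

```python
import math


def semitone_shift(source_hz, target_hz):
    """Signed pitch shift in semitones that moves source_hz to target_hz."""
    return 12 * math.log2(target_hz / source_hz)


def match_syllables_to_notes(syllable_pitches, note_pitches):
    """Pair syllables and notes in order (assumed rule) and return
    (syllable_index, target_hz, shift_in_semitones) for each pair."""
    plan = []
    for idx, (syl_hz, note_hz) in enumerate(zip(syllable_pitches, note_pitches)):
        plan.append((idx, note_hz, semitone_shift(syl_hz, note_hz)))
    return plan


# Hypothetical data: two spoken syllables near 220 Hz, three melody notes.
plan = match_syllables_to_notes([220.0, 220.0], [440.0, 220.0, 330.0])
```

Each entry of `plan` says how far one syllable would have to be shifted to land on its note; the pitch shifting itself is left to whatever resynthesis method is used.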