35
Timefrequency Masking EECS 352: Machine Percep=on of Music & Audio Zafar Rafii, Winter 2014 1

Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Time-­‐frequency  Masking  

EECS  352:  Machine  Percep=on  of  Music  &  Audio  

Zafar  Rafii,  Winter  2014   1  

Page 2: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Real spectrogram

time

frequ

ency

•  The  Short-­‐Time  Fourier  Transform  (STFT)  is  a  succession  of  local  Fourier  Transforms  (FT)  

STFT  

STFT  

Zafar  Rafii,  Winter  2014   2  

+  j*  Imaginary spectrogram

time

frequ

ency

frame  i   frame  i  

Time signal

timewindow  i  

Page 3: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Imaginary spectrum

frequency

Real spectrum

frequency

•  If  we  used  a  window  of  N  samples,  the  FT  has  N  values,  from  0  to  N-­‐1;  e.g.,  if  N  =  8…  

Time signal

timewindow  i  

FT  

STFT  

Zafar  Rafii,  Winter  2014   3  

+  j*  window  i   frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

N  frequency  values   N  frequency  values  

Page 4: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Imaginary spectrum

frequency

Real spectrum

frequency

•  Frequency  index  0  is  the  DC  component;  it  is  always  real  (it  is  the  sum  of  the  =me  values!)  

Time signal

timewindow  i  

FT  

STFT  

Zafar  Rafii,  Winter  2014   4  

+  j*  window  i   frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Page 5: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Imaginary spectrum

frequency

Real spectrum

frequency

•  Frequency  indices  from  1  to  floor(N/2)  are  the  “unique”  complex  values  (a  +  j*b)  

Time signal

timewindow  i  

FT  

STFT  

Zafar  Rafii,  Winter  2014   5  

+  j*  window  i   frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Page 6: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Imaginary spectrum

frequency

Real spectrum

frequency

•  Frequency  indices  from  floor(N/2)  to  N-­‐1  are  the  “mirrored”  complex  conjugates  (a  -­‐  j*b)  

Time signal

timewindow  i  

FT  

STFT  

Zafar  Rafii,  Winter  2014   6  

+  j*  window  i   frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Page 7: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Imaginary spectrum

frequency

Real spectrum

frequency

•  If  N  is  even,  there  is  a  pivot  component  at  frequency  index  N/2;  it  is  always  real!  

Time signal

timewindow  i  

FT  

STFT  

Zafar  Rafii,  Winter  2014   7  

+  j*  window  i   frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Page 8: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Real spectrogram

time

frequ

ency

•  Summary  of  the  frequency  indices  and  values  in  the  STFT  (in  colors!)  

STFT  

Zafar  Rafii,  Winter  2014   8  

Imaginary spectrogram

time

frequ

ency

Frequency  0  =    DC  component  (always  real)    

Frequency  1  to  floor(N/2)  =    “unique”  complex  values  

Frequency  N/2  =    “pivot”  component  (always  real)    

Frequency  floor(N/2)  to  N-­‐1  =    “mirrored”  complex  conjugates  

N  frequency  values  =    frequency  0  to  N-­‐1  

+  j*  

Page 9: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  The  (magnitude)  spectrogram  is  the  magnitude  (absolute  value)  of  the  STFT  

Magnitude spectrogram

time

frequ

ency

Real spectrogram

time

frequ

ency

Spectrogram  

Zafar  Rafii,  Winter  2014   9  

Imaginary spectrogram

time

frequ

ency

abs  

frame  i  frame  i   frame  i  

+  j*  

Page 10: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  For  a  complex  number  𝑎+𝑗∗𝑏,  the  absolute  value  is  |𝑎+𝑗∗𝑏|=√𝑎↑2 + 𝑏↑2    

abs  

Spectrogram  

Zafar  Rafii,  Winter  2014   10  

Imaginary spectrum

frequency

Real spectrum

frequency

+  j*  frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Magnitude spectrum

frequency

=  

frame  i  

1.4   1.4   1.4   1.4  3   0   2   0  

Page 11: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

1.4   1.4   1.4  

•  All  the  N  frequency  values  (frequency  indices  from  0  to  N-­‐1)  are  real  and  posiHve  (abs!)  

abs  

Spectrogram  

Zafar  Rafii,  Winter  2014   11  

Imaginary spectrum

frequency

Real spectrum

frequency

+  j*  frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Magnitude spectrum

frequency

0   1   2   3   4   5   6   7  =   1.4  3   0   2   0  

frame  i  

N  frequency  values  

Page 12: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

1.4   1.4   1.4  

•  Frequency  indices  from  0  to  floor(N/2)  are  the  unique  frequency  values  (with  DC  and  pivot)  

abs  

Spectrogram  

Zafar  Rafii,  Winter  2014   12  

Imaginary spectrum

frequency

Real spectrum

frequency

+  j*  frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Magnitude spectrum

frequency

0   1   2   3   4   5   6   7  =   1.4  3   0   2   0  

frame  i  

Page 13: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

1.4   1.4   1.4  

•  Frequency  indices  from  floor(N/2)+1  to  N-­‐1  are  the  mirrored  frequency  values  

abs  

Spectrogram  

Zafar  Rafii,  Winter  2014   13  

Imaginary spectrum

frequency

Real spectrum

frequency

+  j*  frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Magnitude spectrum

frequency

0   1   2   3   4   5   6   7  =   1.4  3   0   2   0  

frame  i  

Page 14: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

1.4   1.4   1.4  

•  Since  they  are  redundant,  we  can  discard  the  frequency  values  from  floor(N/2)+1  to  N-­‐1  

abs  

Spectrogram  

Zafar  Rafii,  Winter  2014   14  

Imaginary spectrum

frequency

Real spectrum

frequency

+  j*  frame  i   frame  i  

-­‐3   -­‐1   0   1   2   1   0   -­‐1   0   -­‐1   0   1   0   -­‐1   0   1  +  j*  0   1   2   3   4   5   6   7   0   1   2   3   4   5   6   7  

Magnitude spectrum

frequency

0   1   2   3   4   5   6   7  =   1.4  3   0   2   0  

frame  i  

floor(N/2)+1  unique    frequency  values  

Page 15: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  The  spectrogram  has  therefore  floor(N/2)+1  unique  frequency  values  (with  DC  and  pivot)  

Magnitude spectrogram

time

frequ

ency

Real spectrogram

time

frequ

ency

Spectrogram  

Zafar  Rafii,  Winter  2014   15  

Imaginary spectrogram

time

frequ

ency

frame  i  frame  i   frame  i  

abs  +  j*  

Page 16: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Spectrogram  

•  Why  the  magnitude  spectrogram?  – Easy  to  visualize  (compare  with  the  STFT)  – Magnitude  informa=on  more  important  – Human  ear  less  sensi=ve  to  phase  

Zafar  Rafii,  Winter  2014   16  

Time signal

time

Magnitude spectrogram

time

frequ

ency +  

 0  

Page 17: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Spectrogram  

•  When  you  display  a  spectrogram  in  Matlab…  –  imagesc:  data  is  scaled  to  use  the  full  colormap  – 10*log10(V):  magnitude  spectrogram  in  dB  – set(gca,’YDir’,’normal’):  y-­‐axis  from  boiom  to  top  

Zafar  Rafii,  Winter  2014   17  

Time signal

time

Magnitude spectrogram

time

frequ

ency +  

 0  

Page 18: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  The  signal  cannot  be  reconstructed  from  the  spectrogram  (phase  informa=on  is  missing!)  

Spectrogram  

Zafar  Rafii,  Winter  2014   18  

Time signal

time

Imaginary spectrogram

time

frequ

ency

Real spectrogram

time

frequ

ency

Magnitude spectrogram

time

frequ

ency iSTFT  ???  

???  

Page 19: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Suppose  we  have  a  mixture  of  two  sources:        a  music  signal  and  a  voice  signal  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   19  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

+    0  

+    0  

+    0  

Page 20: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  assume  that  the  sources  are  sparse  =  most  of  the  =me-­‐frequency  bins  have  null  energy  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   20  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Page 21: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  assume  that  the  sources  are  sparse  =  most  of  the  =me-­‐frequency  bins  have  null  energy  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   21  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mostly  low    energy  bins  

Mostly  low    energy  bins   Mixture spectrogram

timefre

quen

cy

Page 22: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  assume  that  the  sources  are  disjoint  =  their  =me-­‐frequency  bins  do  not  overlap  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   22  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Page 23: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  assume  that  the  sources  are  disjoint  =  their  =me-­‐frequency  bins  do  not  overlap  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   23  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Not  a  lot  of    overlapping  

Page 24: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Assuming  sparseness  and  disjointness,  we  can  discriminate  the  bins  between  mixed  sources  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   24  

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Music signal

time

Mixture signal

time

Voice signal

time

Page 25: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Assuming  sparseness  and  disjointness,  we  can  discriminate  the  bins  between  mixed  sources  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   25  

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Music signal

time

Mixture signal

time

Voice signal

time

Source  1  =  bright  Source  2  =  dark  

Page 26: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Bins  that  are  likely  to  belong  to  one  source  are  assigned  to  1,  the  rest  to  0  =  binary  masking!  

Binary mask

time

frequ

ency

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   26  

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Binary mask

time

frequ

ency 1  

 0  

Music signal

time

Voice signal

time

Source  of  interest   Interfering  source  

Page 27: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  By  mul=plying  the  binary  mask  to  the  mixture  spectrogram,  we  can  “preview”  the  es=mate  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   27  

Binary mask

time

frequ

ency .x  

Mixture spectrogram

time

frequ

ency

Masked spectrogram

time

frequ

ency

+    0  

+    0  

1    0  

Page 28: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  However,  we  cannot  derive  the  es=mate  itself  because  we  cannot  invert  a  spectrogram!  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   28  

Binary mask

time

frequ

ency .x  

Mixture spectrogram

time

frequ

ency

Masked spectrogram

time

frequ

ency

+    0  

+    0  

Music estimate

time

1    0  

Page 29: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  mirror  the  redundant  frequencies  from  the  unique  frequencies  (without  DC  and  pivot)    

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   29  

Binary mask

time

frequ

ency

Binary mask

time

frequ

ency

Binary mask

time

frequency

1    0  

1    0  

Page 30: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  We  then  apply  this  full  binary  mask  to  the  STFT  using  a  element-­‐wise  mul=plica=on  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   30  

Binary mask

time

frequ

ency

Binary mask

time

frequ

ency

Binary mask

time

frequency

Imaginary spectrogram

time

frequ

ency

Real spectrogram

timefre

quen

cy

1    0  

1    0  

.x  

Page 31: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  The  es=mate  signal  can  now  be  reconstructed  via  inverse  STFT  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   31  

Binary mask

time

frequ

ency

1    0  

Masked imaginary

time

frequ

ency

Masked real

time

frequ

ency

Music estimate

time

iSTFT  

Page 32: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Sources  are  not  really  sparse  or  disjoint  in  =me-­‐frequency  in  the  mixture  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   32  

Music signal

time

Mixture signal

time

Voice signal

time

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Page 33: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

Soft mask

time

frequ

ency

•  Bins  that  are  likely  to  belong  to  one  source  are  close  to  1,  the  rest  close  to  0  =  soP  masking!  

Music spectrogram

time

frequ

ency

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   33  

Music spectrogram

time

frequ

ency +  

 0  

+    0  

+    0  

Voice spectrogram

time

frequ

ency

Mixture spectrogram

timefre

quen

cy

Music signal

time

Voice signal

time

Source  of  interest   Interfering  source  

1    0  

Soft mask

time

frequ

ency

Page 34: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  Let’s  listen  to  the  results!  

Time-­‐frequency  Masking  

Zafar  Rafii,  Winter  2014   34  

Music estimate

time

Music signal

time

demix  mix  

Mixture signal

time

Page 35: Time%frequency-Masking- - Interactive Audio Lab · Time%frequency-Masking-Zafar-Rafii,-Winter-2014- 33 Music spectrogram time frequency +-0 +-0 +-0 Voice spectrogram time frequency

•  How  can  we  efficiently  model  a  binary/som  =me-­‐frequency  mask  for  source  separa=on?...  

•  To  be  con=nued…  

Ques=on  

Zafar  Rafii,  Winter  2014   35  

Mixture spectrogram

time

frequ

ency

+    0  

Soft mask

time

frequ

ency

1    0  ???