Physically Informed Subtraction of a String’s Resonances from Monophonic, Discretely Attacked Tones: A Phase Vocoder Approach
Matthieu Hodgkinson
Ph.D. Thesis Defence, April 2012
Dept. of Computer Science, NUI Maynooth
Chair: Prof. Raymond O’Neill
External Examiner: Prof. Rudolf Rabenstein
Internal Examiner: Dr. Tomás Ward
Supervisor: Dr. Joseph Timoney
String Extraction
Idea: subtract string resonances from monophonic, plucked or hit string tones.
If input = string + excitation, then the remainder is input – string = excitation.
Then “string extraction” reduces to “excitation extraction”.
[Figure: three waveforms, amplitude versus time: Input (Viola Pizzicato), Excitation, String.]
If input = string + excitation + other, then input – string = excitation + other.
Now a few examples of “other”...
Environmental noise I
Electric guitars: very short excitation (non-resonant body); electric buzz audible after string extraction.
[Figure: waveform, amplitude versus time: Stratocaster.]
Environmental noise II
Accidental background noises in the recording room.
[Figure: waveform, amplitude versus time: Martin (custom recording).]
Other strings
It is sometimes awkward to mute open strings. Non-muted strings respond to the excitation and to the vibrations of the target string.
[Figure: Input and Output waveforms, amplitude versus time: Acoustic guitar (Open D).]
Granulation and Extraction
The string extraction uses a transparent Phase Vocoder scheme.
The waveform is processed repeatedly over short time intervals.
Each short-time output is added to the long-term output.
This process is known as overlap-add.
It is illustrated in the next few slides.
The waveform is multiplied by a window to make a grain.
The sinusoidal components of the string are subtracted.
The residual is added to the output.
The process is repeated at regular time intervals, and the residuals are added.
[Figure: Input and Output waveforms over 0 to 0.025 s, amplitude versus time (s); legend: window, grain, processed grain.]
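The granulation steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis implementation: the function name `overlap_add`, the periodic Hann window, the grain length and the hop size are all choices made here, and the `process` argument stands in for the string cancelation (it defaults to leaving the grain untouched).

```python
import numpy as np

def overlap_add(x, grain_len=1024, hop=512, process=lambda g: g):
    """Window x into grains, process each grain, and sum the results."""
    n = np.arange(grain_len)
    # Periodic Hann window: copies shifted by grain_len / 2 sum to exactly 1.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / grain_len)
    y = np.zeros(len(x))
    for start in range(0, len(x) - grain_len + 1, hop):
        grain = window * x[start:start + grain_len]    # make the grain
        y[start:start + grain_len] += process(grain)   # add the residual to the output
    return y

# With no processing, the scheme is transparent away from the signal edges.
x = np.sin(2 * np.pi * 220 * np.arange(8192) / 44100)
y = overlap_add(x)
interior = slice(1024, 8192 - 1024)
max_err = np.max(np.abs(y[interior] - x[interior]))
```

With an identity `process`, the output matches the input to within rounding error wherever two grains overlap, which is one way to read "transparent" in this context.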
Short-Time String Cancelation
Within each grain, the sinusoidal components of the string are measured and subtracted.
[Figure: original spectrum, magnitude (dB) versus frequency (Hz), 0 to 4000 Hz.]
The subtraction takes place in the frequency domain.
The first harmonic is detected with a peak search.
The complex spectrum is used to measure the parameters of the sinusoid.
A spectrum of this sinusoid is synthesized and subtracted.
The process is repeated for each harmonic.
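The loop described above can be sketched as follows, under several assumptions that are not in the slides: the test signal is a sum of complex decaying exponentials (so there are no negative-frequency mirrors), a rough guess of the fundamental is available to bound each peak search, the window is a Hann window, and the whole spectrum of each partial is synthesized by DFT. The name `cancel_harmonics` and its parameters are hypothetical.

```python
import numpy as np

def cancel_harmonics(frame, window, omega0_guess, num_harmonics):
    """Subtract harmonics one by one in the frequency domain.

    frame holds N + 1 samples; omega0_guess is in radians per sample.
    """
    N = len(window)
    n = np.arange(N)
    X = np.fft.fft(window * frame[:N])        # spectrum of the grain
    Y = np.fft.fft(window * frame[1:N + 1])   # same grain, one sample later
    bins_per_harmonic = omega0_guess * N / (2 * np.pi)
    for k in range(1, num_harmonics + 1):
        lo = int(round((k - 0.5) * bins_per_harmonic))
        hi = int(round((k + 0.5) * bins_per_harmonic))
        b = lo + int(np.argmax(np.abs(X[lo:hi])))        # peak search
        rate = np.log(Y[b] / X[b])                       # decay rate + j * frequency
        S = np.fft.fft(window * np.exp(rate * n))        # unit-amplitude partial
        gain = X[b] / S[b]                               # complex amplitude (demodulation)
        X = X - gain * S                                 # subtract from both spectra
        Y = Y - gain * np.exp(rate) * S
    return X

# Made-up test grain: three decaying harmonics of 0.2 rad/sample.
N = 2048
n = np.arange(N + 1)
frame = sum(0.5 ** k * np.exp((-0.001 * k + 0.2j * k) * n) for k in range(1, 4))
window = np.hanning(N)
original_energy = np.sum(np.abs(np.fft.fft(window * frame[:N])) ** 2)
residual_energy = np.sum(np.abs(cancel_harmonics(frame, window, 0.2, 3)) ** 2)
```

On this synthetic grain the residual energy is a very small fraction of the original; real recordings add noise, phantom partials and overlapping peaks, which the later slides address.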
[Figure sequence: the same spectrum after each successive harmonic is subtracted, magnitude (dB) versus frequency (Hz); legend: original spectrum, synthesized harmonic, processed spectrum.]
Modeling and measurement of harmonics
The harmonics are modeled as sinusoids of 1st-order phase and amplitude (i.e. constant frequency and exponential amplitude):
$$x[n] = \exp\big(\lambda + j\phi + (\gamma + j\omega)\,n\big)$$
The measurement of these partials is based on the Complex Spectral Phase-Magnitude Evolution (CSPME) method, a generalisation of the CSPE method (Short and Garcia, 2006) to 1st-order amplitude signals, introduced in the thesis.
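$$x[n] = \exp\big(\lambda + j\phi + (\gamma + j\omega)\,n\big)$$
$$y[n] = x[n+1] = x[n]\,\exp(\gamma + j\omega)$$
$$X = \mathrm{DFT}\{x\}, \qquad Y = X\,\exp(\gamma + j\omega)$$
$$\gamma = \operatorname{Re}\,\log\frac{Y X^{*}}{|X|^{2}}, \qquad \omega = \operatorname{Im}\,\log\frac{Y X^{*}}{|X|^{2}}$$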
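As a rough numerical check of the relations above, the sketch below estimates the decay rate and the frequency of a single synthetic partial from the spectra of a grain and of the same grain advanced by one sample. The helper name `cspme`, the Hann window and the test values are assumptions made for this illustration only.

```python
import numpy as np

def cspme(frame, window):
    """Per-bin decay rate (1/sample) and frequency (rad/sample).

    frame holds N + 1 samples: x is the first N, y is x advanced by one sample.
    """
    X = np.fft.fft(window * frame[:-1])
    Y = np.fft.fft(window * frame[1:])
    ratio = np.log(Y * np.conj(X) / np.abs(X) ** 2)
    return ratio.real, ratio.imag, X

# Made-up test partial: frequency 0.3 rad/sample, decay rate -0.002 per sample.
N = 1024
n = np.arange(N + 1)
frame = np.exp((-0.002 + 0.3j) * n)
gamma, omega, X = cspme(frame, np.hanning(N))

peak = int(np.argmax(np.abs(X)))   # only the peak bin needs to be read
gamma_hat, omega_hat = gamma[peak], omega[peak]
```

For a single noiseless partial the two estimates match the true values to rounding error at any bin, which is consistent with the remark that zero-padding is superfluous with this method.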
The standard magnitude spectrum is shown here for illustration, along with the CSPME frequency (ω) and decay rate (γ) spectra.
In practice, ω and γ need only be evaluated at the peak maxima.
With this method, zero-padding is superfluous (note the “angularity” of the spectra).
[Figure: three panels versus frequency (Hz): Magnitude Spectrum (decibels), Frequency Spectrum (hertz), Decay Rate Spectrum (hertz).]
After the 1st-order phase and amplitude terms (i.e. frequency and decay rate) are evaluated, the 0th-order terms can be evaluated in turn.
The spectrum of a synthetic signal s[n] is used in a process known as demodulation (Marchand, 1998, Zölzer, 2002, Short and Garcia, 2006).
$$s[n] = \exp\big((\gamma + j\omega)\,n\big) = \frac{1}{\exp(\lambda + j\phi)}\,x[n]$$
$$\lambda = \operatorname{Re}\,\log\frac{X}{S}, \qquad \phi = \operatorname{Im}\,\log\frac{X}{S}$$
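The demodulation step can be illustrated with the short sketch below. It assumes that the frequency and decay rate have already been measured, and it uses made-up values for the 0th-order terms; the Hann window and all variable names are choices made here.

```python
import numpy as np

N = 1024
n = np.arange(N)
window = np.hanning(N)

gamma, omega = -0.002, 0.3              # assumed already measured (1st-order terms)
lam_true, phi_true = np.log(0.5), 0.7   # made-up 0th-order terms to recover

x = np.exp(lam_true + 1j * phi_true + (gamma + 1j * omega) * n)   # analysed partial
s = np.exp((gamma + 1j * omega) * n)                              # synthetic signal

X = np.fft.fft(window * x)
S = np.fft.fft(window * s)
b = int(np.argmax(np.abs(X)))           # read the ratio at the peak bin
zeroth = np.log(X[b] / S[b])            # lambda + j * phi
lam_hat, phi_hat = zeroth.real, zeroth.imag
```

Here λ is recovered as a log-amplitude, so `np.exp(lam_hat)` gives the linear amplitude of the partial at n = 0.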
Likewise, the phase φ and amplitude λ are shown at all spectral indices for illustration.
Note that, due to the exponential decay of the partials, the partial amplitude measurements in the middle plot differ slightly from the magnitude maxima in the upper plot.
[Figure: three panels versus frequency (Hz): Magnitude Spectrum (decibels), Amplitude Spectrum (decibels), Phase Spectrum (radians).]
Thereafter, the complete partial can be synthesized, Fourier-transformed, and subtracted.
Instead of taking the DFT of the synthetic partial, the spectrum can be directly synthesised with a Fourier-series approximation (derived in the thesis).
The spectral values of the main lobe can thus be synthesised alone instead of an entire spectrum, allowing computational savings.
[Equation: Fourier-series approximation of the partial's spectrum, derived in the thesis.]
The problem of Inharmonicity
The phenomenon of inharmonicity makes the problem of excitation/string extraction much more delicate, for the reasons that we are going to see.
Strings with negligible stiffness exhibit a linear frequency series:
$$\omega_k = k\,\omega_0$$
Very often, stiffness causes a nonlinear “stretch” of this series:
$$\omega_k = k\,\omega_0\sqrt{1 + \beta k^{2}}$$
k is the harmonic number, ω0 is the fundamental frequency, and β is the inharmonicity coefficient.
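To give a feel for the size of the stretch, the sketch below evaluates both series for an acoustic guitar open A, using the values quoted in this presentation (FF = 110 Hz, IC = 7×10⁻⁵). The function name `partial_frequencies` and the choice of 59 partials are assumptions of this example.

```python
import numpy as np

def partial_frequencies(f0, beta, num_partials):
    """Frequencies k * f0 * sqrt(1 + beta * k^2) for k = 1 .. num_partials."""
    k = np.arange(1, num_partials + 1)
    return k * f0 * np.sqrt(1.0 + beta * k ** 2)

stiff = partial_frequencies(110.0, 7e-5, 59)   # stretched series
ideal = partial_frequencies(110.0, 0.0, 59)    # linear series, no stiffness
stretch_hz = stiff - ideal                     # grows roughly with k cubed
```

The first partial moves by a few millihertz only, whereas the 59th lands a little above 7.2 kHz instead of 6490 Hz.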
Spectral Stretch
The stretch caused by inharmonicity is negligible at low harmonic indices, but often substantial at high harmonic indices.
A linear model therefore cannot be assumed for the identification of the harmonics.
A simple, robust and accurate method was presented in (Hodgkinson et al., 2009).
[Figure: magnitude of the linearised spectrum versus frequency (Hz), in three frequency ranges, with partials 1 to 5, 20 to 26 and 51 to 59 numbered: Acoustic guitar open A (FF = 110 Hz, IC = 7×10⁻⁵).]
Appearance of Phantom Partials
A longitudinal series of vibrations has the same fundamental frequency as the transverse series and one quarter of its inharmonicity (Bank and Sujbert, 2003).
When the inharmonicity is 0, this series merges with the main series. Otherwise, phantom partials can be salient, and must be subtracted as well.
In this spectrum, the transverse partials are numbered in black, and the phantom partials in red.
[Figure: magnitude versus frequency (Hz), with the found transverse partials (black) and phantom partials (red) numbered from 19 to 44.]
Overlap of phantom partials
Transverse and phantom partials may overlap.
This compromises the accuracy of the partial measurements.
In situations of overlap, a partial may come out as a bulge instead of a peak.
Transverse partials are generally larger, but not always!
[Figure: magnitude versus frequency (Hz), with predicted partials 31 to 34 numbered. A transverse partial (black) under a phantom partial (red)!]
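Where such overlaps can be expected may be estimated from the two series themselves. The sketch below assumes that the phantom series follows the same stretched-series formula with a quarter of the inharmonicity coefficient, which is one reading of the statement on the previous slide, and reuses the open A values (FF = 110 Hz, IC = 7×10⁻⁵). The 5 Hz tolerance and all names are arbitrary choices for illustration.

```python
import numpy as np

def series(f0, beta, k):
    """Stretched partial series k * f0 * sqrt(1 + beta * k^2)."""
    return k * f0 * np.sqrt(1.0 + beta * k ** 2)

f0, beta = 110.0, 7e-5
k = np.arange(1, 61)
transverse = series(f0, beta, k)
phantom = series(f0, beta / 4.0, k)    # assumed model: a quarter of the inharmonicity

tolerance_hz = 5.0
gaps = np.abs(transverse[:, None] - phantom[None, :])
# Same-index pairs nearly coincide at low k and are skipped here.
close_pairs = [(int(k[i]), int(k[j]))
               for i, j in zip(*np.nonzero(gaps < tolerance_hz))
               if i != j]
```

Under these assumptions, transverse partial 34 and phantom partial 35 come within a few hertz of each other, which is the kind of situation where a peak search may find a bulge or a single merged peak.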
Algorithmic search
This greatly complicates the search and cancelation of the partials.
An integrated algorithmic detection/cancelation process is proposed in the thesis.
“First come, first served” does not always apply!
The linearity of our subtractive cancelation process is used to tackle situations of overlap.
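Time-varying FF and IC
Large vibrational amplitude of the string may cause a downward glide in Fundamental Frequency (FF) and an upward trend in Inharmonicity Coefficient (IC) (Hodgkinson et al., 2010).
We propose simplified models, based on the string length and tension derivations of (Legge and Fletcher, 1984) and (Bank, 2009):
$$\omega_0(n) = \omega_\Delta e^{\gamma_\omega n} + \omega_\infty$$
$$\beta(n) = \beta_\Delta e^{\gamma_\beta n} + \beta_\infty$$
$$\lim_{n\to\infty}\omega_0(n) = \omega_\infty, \qquad \lim_{n\to\infty}\beta(n) = \beta_\infty$$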
The measurements alongside were taken from an acoustic guitar open E3, played fortissimo.
To fit the coefficients ωΔ, ω∞, γω, βΔ, β∞ and γβ, a fast and robust fitting method based on Fourier analysis (FEPCF) was developed (Hodgkinson, 2011).
[Figure: Fundamental Frequency (upper) and Inharmonicity Coefficient (lower, ×10⁻⁴) versus time (s), 0 to 1 s; legend: measurements, FEPCF fit.]
With these coefficients found, an entire spectrum of partial tracks can be deployed with only 6 coefficients (Hodgkinson et al., 2010).
Alongside, we show the Ovation tone’s spectrogram fitted with our model (solid lines).
To show the importance of the time-varying IC, we also show tracks with constant IC (dashed lines).
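How six coefficients deploy a whole set of partial tracks can be sketched as below. The exponential-plus-constant form and the stretched-series formula are taken from this presentation, but the coefficient values are invented for illustration (they are not the fitted values), time is expressed in seconds rather than samples, and the function names are hypothetical.

```python
import numpy as np

def exp_plus_const(t, delta, gamma, limit):
    """Exponential-plus-constant trajectory: delta * exp(gamma * t) + limit."""
    return delta * np.exp(gamma * t) + limit

def partial_tracks(t, k, ff_coeffs, ic_coeffs):
    """Frequency track of every partial k over time t; shape (len(k), len(t))."""
    f0 = exp_plus_const(t, *ff_coeffs)      # time-varying fundamental frequency
    beta = exp_plus_const(t, *ic_coeffs)    # time-varying inharmonicity coefficient
    k = np.asarray(k, dtype=float)[:, None]
    return k * f0 * np.sqrt(1.0 + beta * k ** 2)

t = np.linspace(0.0, 1.0, 100)              # seconds
ff_coeffs = (0.5, -5.0, 82.0)               # made-up: delta (Hz), rate (1/s), limit (Hz)
ic_coeffs = (-5e-6, -5.0, 1.16e-4)          # made-up: delta, rate (1/s), limit
tracks = partial_tracks(t, [1, 20, 40], ff_coeffs, ic_coeffs)

# Constant-IC variant, for comparison with the time-varying tracks.
tracks_const_ic = partial_tracks(t, [1, 20, 40], ff_coeffs, (0.0, 0.0, 1.16e-4))
```

With these made-up values the fundamental glides downward while the inharmonicity coefficient rises, so the difference between the two sets of tracks is largest for high partial numbers near the onset.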
Onset-overlapping grains
Another major difficulty is the cancelation of the partials for the grains that overlap with the onset of the tone.
If the onset takes place at time t=ν, then the sinusoidal model in attack-overlapping grains must be formulated as
$$x[n] = h[n-\nu]\,\exp\big(\lambda + j\phi + (\gamma + j\omega)\,n\big)$$
where h[n] is the unit-step function,
$$h[n] = \begin{cases} 0, & n < 0 \\ 1, & n \geq 0 \end{cases}$$
Attack-overlapping grains
In an attack-overlapping grain, the signal can be seen as windowed by a “unit-stepped” window.
Unfortunately, the frequency-domain characteristics of such windows are far from optimal.
The lower plot compares the spectrum of a 2nd-order continuous window with that of a unit-stepped window.
[Figure: upper plot, amplitude versus time: unit-step function, unit-stepped window; lower plot, magnitude (dB) versus frequency (Hz).]
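The loss of spectral quality can be reproduced numerically. The sketch below uses a Hann window as a stand-in for the window of the slide, places the onset one third of the way into the grain, and compares the highest sidelobe level found more than 10 bins away from the main lobe. The window, the onset position, the zero-padding factor and the names are assumptions of this example.

```python
import numpy as np

N = 1024
pad = 16 * N                                     # zero-padding to resolve sidelobes
window = np.hanning(N)
step = (np.arange(N) >= N // 3).astype(float)    # unit step, onset at one third
stepped = window * step                          # "unit-stepped" window

def spectrum_db(w):
    """Magnitude spectrum in dB, normalised to its maximum."""
    mag = np.abs(np.fft.rfft(w, pad))
    return 20 * np.log10(mag / mag.max() + 1e-12)

far = slice(10 * pad // N, None)                 # more than 10 bins from DC
worst_plain = spectrum_db(window)[far].max()
worst_stepped = spectrum_db(stepped)[far].max()
```

The discontinuity introduced by the step makes the sidelobes decay much more slowly, so `worst_stepped` sits tens of decibels above `worst_plain`; this leakage is what disturbs the measurement of neighbouring partials.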
Further illustration of the situation, with the spectra of an attack-overlapping grain (black) and a regular grain (white).
[Figure: upper plot, amplitude versus time; lower plot, magnitude (dB) versus frequency (Hz).]
String extraction in onset-overlapping grains can nevertheless contribute to the aural quality of the excitation.
The sound example shows string extraction starting from the first non-onset-overlapping grain only (white), and from the grain before (black, overlap factor 1/3).
[Figure: amplitude versus time (s); legend: no overlap, 1/3 overlap.]
A few remarks concerning string extraction in onset-overlapping grains.
The CSPME frequency and exponential decay estimates are sensitive to the excessive leakage seen in onset-overlapping grains. A standard quadratic-fit approach may be preferable there.
Because of the poor resolution of the partials, the search for phantom partials seems futile.
For the application of Commuted Waveguide Synthesis (Karjalainen et al., 1993, Smith, 1993), it is desirable to leave some sinusoidal energy at the onset of the tone. Onset-overlapping frames can then be left unprocessed.
Recapitulation on contributions
• Fundamental Frequency and Inharmonicity Coefficient estimation method (Hodgkinson et al., 2009).
• FF and IC time-varying models (Hodgkinson et al., 2010).
• Exponential-Plus-Constant fitting method (Hodgkinson, 2011).
• Inharmonicity-related complications “revealed”.
• Bulge search replaces peak search.
• Frequency-domain subtractive approach to excitation/string extraction.
• Generalisation of the CSPE to exponential-amplitude signals.
• Analytical formulation of exponential-amplitude-modulated cosine-window spectra.
• Onset-overlapping frames approached.
Thesis Contents
Introduction: Conceptual definition of String Extraction; applicability; applications.
Chapter 1: Comprehensive string model
• Basis (Fletcher and Rossing, 1991, Raichel, 2000, Steiglitz, 1996)
• Damping (Chaigne and Askenfelt, 1993, Trautmann and Rabenstein, 2003)
• Inharmonicity (Fletcher et al., 1962)
• Longitudinal vibrations (Morse and Ingard, 1986, Giordano and Korty, 1996, Bank and Sujbert, 2003)
• Tension modulation (Legge and Fletcher, 1984, Bank, 2009, Hodgkinson et al., 2010)
Chapter 2: Frequency-domain component estimation and cancelation
• Windowing (Harris, 1978, Nuttall, 1981)
• Fourier-series approximation of cosine window DFTs
• Estimation of partial frequencies with CSPE (Short and Garcia, 2006) and exponential-amplitude generalisation (CSPME)
Chapter 3: Phase Vocoder (PV) approach to String Extraction
• PV scheme (Portnoff, 1981)
• Unit-step modeling of attack
• Inharmonicity estimation (Hodgkinson et al., 2009)
• Phantom partials
Chapter 4: Tests and results
• CSPME
• Fourier-series approximation
• Onset-overlapping frames
• Phantom partials
Conclusion: Aims, organisation, contributions and future work.
References
(Legge and Fletcher, 1984) Nonlinear generation of missing modes on a vibrating string. Journal of the Acoustical Society of America, 76(1), 1984
(Bank, 2009) Balázs Bank. Energy-based synthesis of tension modulation in strings. In Proceedings of the 12th International Conference on Digital Audio Effects (DAFx-09), Como, Italy, 2009.
(Hodgkinson, 2011) Exponential-plus-constant fitting based on Fourier analysis. In JIM2011 – 17èmes Journées d’Informatique Musicale, Saint-Etienne, France, 2011.
(Fletcher and Rossing, 1991) Neville H. Fletcher and Thomas D. Rossing. The Physics of Musical Instruments. Springer-Verlag New York Inc., New-York, USA, 1991.
(Bank and Sujbert, 2003) Balázs Bank and László Sujbert. Modeling the longitudinal vibration of piano strings. In Proceedings of the Stockholm Music Acoustics Conference, August 6-9, 2003 (SMAC 03), Stockholm, Sweden, 2003.
(Morse and Ingard, 1986) Philip M. Morse and K. Uno Ingard. Theoretical Acoustics. Princeton University Press,
(Marchand, 1998) Improving spectral analysis precision with an enhanced phase vocoder using signal derivatives. In Proc. DAFx98 Digital Audio Effects Workshop, pages 114-118. MIT Press, 1998.
(Karjalainen et al., 1993) Towards high-quality sound synthesis of the guitar and string instruments. In International Computer Music Conference, Tokyo, Japan, 1993.
(Smith, 1993) Efficient synthesis of stringed musical instruments. In International Computer Music Conference, Tokyo, Japan, 1993.
(Short and Garcia, 2006) Kevin M. Short and Ricardo A. Garcia. Signal analysis using the complex spectral phase evolution (CSPE) method. In AES 120th Convention, Paris, France, 2006.
(Hodgkinson et al., 2009) Matthieu Hodgkinson, Jian Wang, Joseph Timoney and Victor Lazzarini “Handling Inharmonic Series with Median-Adjustive Trajectories”, Proceedings of the 12th International Conference on Digital Audio Effects (DAFx-09), Como, Italy, 2009.
(Hodgkinson et al., 2010) Matthieu Hodgkinson, Joseph Timoney and Victor Lazzarini “A Model of Partial Tracks for Tension-Modulated, Steel-String Guitar Tones”, Proceedings of the 13th International Conference on Digital Audio Effects (DAFx-10), Graz, Austria, 2010.
(Raichel, 2000) Daniel R. Raichel. The Science and Applications of Acoustics. Springer-Verlag New York Inc., New-York, USA, 2000.
(Steiglitz, 1996) Ken Steiglitz. A Digital Signal Processing Primer. Addison-Wesley Publishing Company, Inc., Menlo Park, California, USA, 1996.
(Chaigne and Askenfelt, 1993) Antoine Chaigne and Anders Askenfelt. Numerical simulations of piano strings. I. A physical model for a struck string using finite difference methods. Journal of the Acoustical Society of America, 95(2), 1993.
(Trautmann and Rabenstein, 2003) Lutz Trautmann and Rudolf Rabenstein. Digital Sound Synthesis by Physical Modeling Using the Functional Transformation Method. Kluwer Academic/Plenum Publishers, New York, 2003.
(Giordano and Korty, 1996) N. Giordano and A. J. Korty. Motion of a piano string: Longitudinal vibrations and the role of the bridge. Journal of the Acoustical Society of America, 100(6), 1996.
(Harris, 1978) Frederic J. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE 66(1), 1978.
(Nuttall, 1981) Albert H. Nuttall. Some windows with very good sidelobe behavior. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-29(1), 1981.
(Portnoff, 1981) Michael R. Portnoff. Time-scale modification of speech based on short-time Fourier analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(3), 1981.