Aaron Ballew, Aleksandar Kuzmanovic, C. C. Lee. Northwestern University, Dept. of Electrical Engineering and Computer Science. July 7th 2011. Fusion of Live Audio Recordings for Blind Noise Reduction

Aaron Ballew, Aleksandar Kuzmanovic, C. C. Lee

Northwestern University

Dept. of Electrical Engineering and Computer Science

July 7th 2011

Fusion of Live Audio Recordings for Blind Noise Reduction

Aaron Ballew Fusion of Live Audio Recordings for Blind Noise Reduction – Fusion 2011

Observation

You Attend a Concert
● You’d like a recording of the show
● Live albums exist, but…
● You want the show you went to, back in San Jose CA on Feb 22nd 2010

Bootleggers

At the show, you remember cell phones and cameras in the air

Observation, cont’d

Seek it Out

You find some of those recordings uploaded

Not just one, but three, four, or five copies of your favorite songs

Varying quality

Online Database

Opportunity

Each song is an unknown source signal with receiver diversity

There must be a way to take advantage of the diversity in these recordings to generate a new recording whose quality is better than any of the originals

Opportunity, cont’d

All the recordings have something in common – a sameness from the music that was generated

They have something uncommon too – a differentness from noisy applause, screaming fans, wind, etc.

[Figure: each recording as SIGNAL + NOISE. Labels: Music Source; Multipath Echo; Applause, Screaming, etc.]

Complications

● No reference (except in your mind) that defines which part is music rather than noise
  ● Studio recording won’t work in general
● You don’t know the SNR of any signal
● There’s no pilot signal to imply the channel
● No opportunity to pre-code a digital waveform
  ● It’s an analog source
  ● No M-ary QPSK, matched filters
● Uncountably many sources and relatively few recordings, not a good fit for ICA

Assumptions

● Recordings are mono
  ● Stage speakers may be physically separated and multitrack
  ● Relative to the venue’s scale and the listener’s perspective, the multitracks arrive synchronized and are recorded as mono by the mic
● Recordings are not synchronized to each other
  ● Different start/stop times and duration
● Receivers are distributed arbitrarily among the audience
● Noise at one receiver is not the same noise at another
  ● Not necessarily true if two receivers are close to each other
  ● Not true out-of-context, such as a quiet auditorium

Sample vs. Sample

Noise vs. Noise

Strategy

● We will never know the absolute SNR of any of the recordings
● However, if we could be confident their signal powers were equal, then the differences in their total powers would be due to the noise
  ● Assumes the noise is (close to) uncorrelated
  ● Does not assume we know what the signal power actually is
● If we could use the total power as a proxy for noise power (given bullet 2 above), we could:
  ● Rank recordings by SNR
  ● Apply a classic averaging technique to cancel noise
  ● Measure whether noise power went up or down compared to any original recording
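
The reasoning in these bullets can be written out as a short derivation. This is a sketch under stated assumptions, not a formula taken from the slides: each time-aligned recording is modeled as $x_i = s + n_i$ with a common signal $s$, and the noise terms $n_i$ are assumed zero-mean, uncorrelated with $s$, and uncorrelated with each other. The symbols $\sigma_x^2$, $\sigma_s^2$, $\sigma_n^2$ follow the deck's figures; the indices $i$, $j$ and the composite $\bar{x}_M$ are introduced here for illustration.

```latex
% Total power of recording i splits into signal power plus noise power
\sigma_{x_i}^2 = \sigma_s^2 + \sigma_{n_i}^2

% With equal signal powers, a difference in total power is a difference in noise power
\sigma_{x_i}^2 - \sigma_{x_j}^2 = \sigma_{n_i}^2 - \sigma_{n_j}^2

% Averaging M recordings keeps the signal power and shrinks the summed noise power by 1/M^2
\bar{x}_M = \frac{1}{M}\sum_{i=1}^{M} x_i
\quad\Longrightarrow\quad
\sigma_{\bar{x}_M}^2 = \sigma_s^2 + \frac{1}{M^2}\sum_{i=1}^{M}\sigma_{n_i}^2
```

Under these assumptions, ranking by total power is the same as ranking by SNR, and any drop in total power after averaging is a drop in noise power, even though $\sigma_s^2$ itself is never known.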

Strategy, cont’d

It would look like this:

Step 1 – Internal Reference

Similarity & Synchronization

No external reference, so pick an internal one from the sample set

Cross-correlations show:
● Which sample is most similar to all other samples
● The time-shift (lag) between any sample pair

[Figure: four samples (1 to 4) compared pairwise, giving matrices R = {r_ij} and T = {t_ij}]
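
A minimal sketch of this step, assuming NumPy and short mono arrays. The function names `peak_xcorr` and `similarity_and_lags`, and the rule of picking the reference as the row of R with the largest summed similarity, are illustrative assumptions; the slides only state that cross-correlations give the most similar sample and the pairwise lags.

```python
import numpy as np

def peak_xcorr(a, b):
    """Return (r, lag): the peak normalized cross-correlation of a and b,
    and the shift k at which a[n + k] best lines up with b[n]."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    a0 = a0 / np.linalg.norm(a0)
    b0 = b0 / np.linalg.norm(b0)
    c = np.correlate(a0, b0, mode="full")   # O(N^2); fine for a short demo
    idx = int(np.argmax(np.abs(c)))
    lag = idx - (len(b0) - 1)               # index 0 corresponds to k = -(len(b) - 1)
    return c[idx], lag

def similarity_and_lags(samples):
    """Build R = {r_ij} (similarity) and T = {t_ij} (lag) for all sample pairs."""
    n = len(samples)
    R = np.eye(n)
    T = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            r, lag = peak_xcorr(samples[i], samples[j])
            R[i, j] = R[j, i] = r
            T[i, j], T[j, i] = lag, -lag
    # Assumed selection rule: the sample most similar to all the others
    ref = int(np.argmax(np.abs(R).sum(axis=1)))
    return R, T, ref

# Synthetic demo: three noisy excerpts of one source with different start offsets
rng = np.random.default_rng(0)
s = rng.standard_normal(1600)
starts = [10, 0, 25]
noise_std = [0.1, 0.8, 1.0]
samples = [s[k:k + 1500] + sd * rng.standard_normal(1500)
           for k, sd in zip(starts, noise_std)]
R, T, ref = similarity_and_lags(samples)
```

For full-length audio, an FFT-based correlation would be the practical choice; the direct `np.correlate` call is used here only to keep the sketch short.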

Step 2 – Normalize

In Absence of SNR,
● The effect of combining samples is unclear
● Need a way to isolate changes in signal or noise power
● It would be helpful if signal powers were already equal
  ● Implies combining affects the noise

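[Figure: signal power σ_s², noise power σ_n², and total power σ_x² of two samples before and after averaging (Avg); without equal signal powers the averaged result is marked "??"]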

Step 2 – Normalize, cont’d

Use the Right Tool

Use covariance, not the correlation coefficient r, to normalize signal powers

You still don’t know the absolute signal powers

You only know that the differences are due to noise

Now, you can tell whether noise goes up or down after combining
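
One way covariance can equalize signal powers is sketched below. This is an illustration under assumptions, not necessarily the authors' procedure: recordings are already time-aligned, recording i is modeled as `a[i] * s + noise_i`, the noise is uncorrelated across recordings, and at least three recordings are available. Under that model the covariance between two different recordings depends only on the shared signal, whereas r also divides by the total (signal plus noise) standard deviations. The function name `equalize_signal_power` and the demo gains are hypothetical.

```python
import numpy as np

def equalize_signal_power(x, ref):
    """Scale each recording so its signal gain matches that of recording `ref`.

    x   : (K, N) array of time-aligned recordings, K >= 3
    ref : index of the internal reference recording
    Only off-diagonal covariances are used, since they contain no noise power
    when the noise is uncorrelated across recordings.
    """
    x = np.asarray(x, dtype=float)
    K = len(x)
    C = np.cov(x)                      # rows are variables, so C is K x K
    gains = np.ones(K)
    for i in range(K):
        if i == ref:
            continue
        # Any third recording k works: cov(ref, k) / cov(i, k) = a_ref / a_i
        k = next(j for j in range(K) if j not in (i, ref))
        gains[i] = C[ref, k] / C[i, k]
    return gains[:, None] * x, gains

# Synthetic demo with hypothetical gains and noise levels
rng = np.random.default_rng(0)
N = 200_000
s = rng.standard_normal(N)
a = np.array([1.0, 0.5, 2.0, 0.8])            # unknown in practice
noise_std = np.array([0.3, 0.6, 0.2, 1.0])
x = a[:, None] * s + noise_std[:, None] * rng.standard_normal((4, N))
x_norm, gains = equalize_signal_power(x, ref=0)
```

After scaling, the product `gains * a` should be close to 1 for every recording, so the remaining differences in total power come from the noise.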

Step 3 – Fusion

“Weighted” Average

● Find the average of the first M ranked samples, such that total power is minimized
● Why the first M?
  ● A sample’s noise power may be so large it increases the composite’s noise

[Figure: total power σ_x² split into σ_s² and σ_n² for two composites formed by averaging (Avg); not to scale]
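
The selection of M can be sketched as follows, assuming the recordings are already aligned and have equal signal power. The function name `fuse_first_m` is hypothetical, and reading the "weighted" average as weights of 1/M for the first M samples and 0 for the rest is an interpretation of the slide.

```python
import numpy as np

def fuse_first_m(x):
    """Average the first M power-ranked recordings, with M chosen to minimize
    the composite's total power.

    x : (K, N) array of aligned recordings with equalized signal power
    Returns the composite, the indices used, and the power for each M.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x.var(axis=1))                 # lowest total power first
    counts = np.arange(1, len(x) + 1)[:, None]
    running_mean = np.cumsum(x[order], axis=0) / counts
    power = running_mean.var(axis=1)                  # total power for M = 1..K
    m = int(np.argmin(power)) + 1
    return running_mean[m - 1], order[:m], power

# Synthetic demo: three moderately noisy recordings and one very noisy one
rng = np.random.default_rng(1)
N = 50_000
s = rng.standard_normal(N)
noise_std = np.array([0.5, 5.0, 0.5, 0.5])
x = s + noise_std[:, None] * rng.standard_normal((4, N))
fused, used, power = fuse_first_m(x)
```

In this demo the very noisy recording (index 1) should be left out: including it would add more noise power than the extra averaging removes.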

Benefits

Identify a “best” quality recording without having to manually listen to each

Generate a recording that exceeds the “best” in quality

Encourage user-generated (crowd-sourced) content sharing

Applicable to any context where the source signal is completely unknown

Ongoing and Future

Ongoing: Time-variability of noise
● Shows up as “low-frequency” noise that downselects against such a recording
● We window in time (and frequency) to take advantage of the high-quality parts of the recordings
● Stitching the windows back together post-fusion requires some attention due to an audible discontinuity when adjacent windows generate a different composite

Future: Maximal Ratio Combining
● Well-known technique that requires channel knowledge
● Gives optimal weighting of samples for maximal fusion gain
● I believe we can adapt the inference technique to MRC, such that we get the “maximal” SNR gain, though I may not know exactly what the gain is!
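
For context, the textbook form of maximal ratio combining can be stated in the deck's notation. This is a sketch under assumptions, not the adaptation the slide proposes: recordings are aligned with equal signal power $\sigma_s^2$, noise is uncorrelated across recordings, and the weights $w_i$ (a symbol introduced here) sum to one.

```latex
% Weighted composite of K recordings
y = \sum_{i=1}^{K} w_i x_i, \qquad \sum_{i=1}^{K} w_i = 1

% Output SNR under the assumptions above
\mathrm{SNR}_y = \frac{\sigma_s^2}{\sum_{i=1}^{K} w_i^2 \, \sigma_{n_i}^2}

% Maximizing weights: inverse noise power, normalized
w_i = \frac{1/\sigma_{n_i}^2}{\sum_{j=1}^{K} 1/\sigma_{n_j}^2}
\quad\Longrightarrow\quad
\mathrm{SNR}_y = \sum_{i=1}^{K} \frac{\sigma_s^2}{\sigma_{n_i}^2}
```

The difficulty the slide points to is visible here: the weights need each $\sigma_{n_i}^2$, while the normalization step only reveals differences between noise powers, not their absolute values.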

Conclusion

Thank You! http://networks.cs.northwestern.edu/~aaron/fusion.html
