
Audio and (Multimedia) Methods for Video Analysis

Dr. Gerald Friedland, fractor@icsi.berkeley.edu

CS-298 Seminar, Fall 2011

1

More on Sound

2

• Responses to Email Questions
• Useful Filters
• More Features
• Typical Machine Learning on Audio
• More on Tools

Q&A

3

Why didn’t you talk about Loudspeakers?

Loudspeaker Characteristics

Frequency dependent! (As are the response patterns of Microphones!)

Q&A

6

- Speakers don’t really help with video analysis...

- Exception: Multichannel!

Q&A

7

How is the directivity (polar) pattern of a microphone generated?

- According to IEC 60268-4

Q&A

8

Does lossy compression influence audio analysis?

- Yes, but the degree is unknown.
- Common sense: it’s better to train your models on data that went through the same compression algorithm.

Q&A

9

Why is the CD sampling rate 44100Hz and not just 44000?

- Because 44100 is divisible by both 25 video frames per second (PAL) and 30 fps (NTSC); 44000 is not divisible by 30.
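The arithmetic is easy to check. (The historical aside in the comment is the commonly given fuller explanation: early CD masters were stored on video recorders.)

```python
# 44100 is divisible by both common video frame rates; 44000 is not.
# Commonly cited background: 44100 = 3 * 245 * 60 (NTSC) = 3 * 294 * 50 (PAL)
# samples per second fit exactly into the active lines of a video recording.
for rate in (44100, 44000):
    print(rate, "PAL:", rate % 25 == 0, "NTSC:", rate % 30 == 0)
```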

Useful Filters

10

• Low-, High-, Band-pass, Graphic Equalizer
• Dynamic Range Compression
• Filter by Example: Spectral Subtraction, Wiener Filter
• Handling Multiple Channels

Remember: Fourier Transform

11

DFT

iDFT

More Information: “Introduction to Multimedia Computing”, Draft Chapter 8

Remember: Fast Fourier Transform

12

More Information: “Introduction to Multimedia Computing”, Draft Chapter 8

// Input: An array X of N samples. N must be an integer power of two.
// s is the stride.
// Output: The Discrete Fourier Transform of X in the array Y.
FFT(X, N, s)
    IF (N == 1) THEN
        Y[0] := X[0]  // Trivial size-1 case: sample = DC coefficient
    ELSE
        Y[0,...,N/2-1] := FFT(X, N/2, 2*s)             // DFT of (X[0], X[2s], X[4s], ...)
        Y[N/2,...,N-1] := FFT(X[s,...,N-1], N/2, 2*s)  // DFT of (X[s], X[s+2s], X[s+4s], ...)
        FOR k := 0 TO N/2-1  // Combine the two halves into one
            t := Y[k]
            Y[k]     := t + exp(-2*pi*i*k/N) * Y[k+N/2]
            Y[k+N/2] := t - exp(-2*pi*i*k/N) * Y[k+N/2]
    FFT := Y
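The pseudocode above translates directly into runnable Python (a sketch using list slicing instead of an explicit stride argument):

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]          # trivial size-1 case: sample = DC coefficient
    even = fft(x[0::2])                 # DFT of the even-indexed samples
    odd = fft(x[1::2])                  # DFT of the odd-indexed samples
    y = [0j] * n
    for k in range(n // 2):             # combine the two halves into one
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        y[k] = even[k] + t
        y[k + n // 2] = even[k] - t
    return y
```

For example, `fft([1, 1, 1, 1])` yields `[4, 0, 0, 0]` up to rounding: a constant signal puts all its energy in the DC bin.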

Remember: Convolution Theorem

13

More Information: “Introduction to Multimedia Computing”, Draft Chapter 8

Let f and g be two functions and f * g their convolution. Then:

F{f * g} = F{f} · F{g}

where F denotes the Fourier transform: convolution in the time domain becomes pointwise multiplication in the frequency domain.
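A quick numerical check of the theorem (NumPy sketch; the length-8 random vectors and circular convolution are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(8)
g = rng.standard_normal(8)

# Circular convolution computed directly from the definition...
direct = np.array([sum(f[m] * g[(n - m) % 8] for m in range(8)) for n in range(8)])
# ...and via the convolution theorem: multiply the spectra, transform back.
via_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.allclose(direct, via_fft))
```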

Bandpass Filters

14

Dynamic Range Compression

15

Dynamic Range Compression

16

Dark blue: Original, Light Blue: Dynamic Range Compressed
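A dynamic range compressor attenuates the loud parts of a signal relative to the quiet parts. A minimal hard-knee sketch; the threshold and ratio values are illustrative, and real compressors add attack/release smoothing:

```python
import numpy as np

def compress(x, threshold=0.5, ratio=4.0):
    """Scale the portion of each sample's amplitude above `threshold` down by `ratio`."""
    mag = np.abs(x)
    over = mag > threshold
    out = x.astype(float).copy()
    out[over] = np.sign(x[over]) * (threshold + (mag[over] - threshold) / ratio)
    return out
```

For instance, `compress(np.array([0.2, 0.9, -1.0]))` leaves 0.2 untouched and maps 0.9 to 0.6 and -1.0 to -0.625.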

Filter by Example: Spectral Subtraction

17

Dark blue: Original with humming noise, Light blue: Corrected using spectral subtraction of humming.
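Spectral subtraction needs an example of the noise alone (here, a noise-only stretch of the recording). A frame-by-frame sketch, with the frame size and the zero floor as illustrative choices:

```python
import numpy as np

def spectral_subtraction(noisy, noise_example, frame=256):
    """Subtract the noise magnitude spectrum from each frame, keep the noisy phase."""
    noise_mag = np.abs(np.fft.rfft(noise_example[:frame]))
    out = np.zeros_like(noisy, dtype=float)
    for i in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[i:i + frame])
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)   # floor negative magnitudes at zero
        out[i:i + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out
```

Real implementations overlap the frames and over-subtract slightly to suppress "musical noise" artifacts.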

Filter by Example: Wiener Filter

18

Assume:

ŝ(t) = g(t) * (s(t) + n(t))

where:
• s(t) is the original signal,
• n(t) is the noise footprint,
• ŝ(t) is the estimated signal, and
• g(t) is the Wiener filter.

Filter by Example: Wiener Filter

19

Define the error:

e(α) = s(t + α) − ŝ(t + α)

with α being the delay. Thus:

e²(α) = (s(t + α) − ŝ(t + α))²  (squared error)


Filter by Example: Wiener Filter

21

Results in:

G(f) = X(f) / (X(f) + N(f))

where:
• X(f) is the power spectrum of the signal s(t),
• N(f) is the power spectrum of the noise, and
• G(f) is the frequency-space Wiener filter.

Problem: Do we know N(f)?
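If the noise spectrum is known (in practice it is often estimated from stretches of the recording that contain only noise), applying the filter is a few lines. A sketch assuming both power spectra are given per rfft bin:

```python
import numpy as np

def wiener(noisy, signal_power, noise_power):
    """Frequency-domain Wiener filter G = X / (X + N) applied to one signal block."""
    G = signal_power / (signal_power + noise_power)
    return np.fft.irfft(G * np.fft.rfft(noisy), len(noisy))
```

With zero noise power, G = 1 everywhere and the signal passes unchanged; with overwhelming noise power, G ≈ 0 and the output is muted.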

Multichannel Audio

22

• Microphone array recordings are popular with the speech community.
• Stereo has been popular since the 60s.
• Most cameras have stereo microphones.
• Most DVDs are encoded in Dolby Digital 5.1, with Dolby Surround on the stereo track.

Dolby Digital

23

Source: http://www.aspecttv.co.uk/surround-sound-setup/

Dolby Surround

24

How to Cope with Multiple Channels?

25

• Treat them individually (different channel = different content)
• Combine them, ideally while reducing noise
• Cross-infer between channels
• Any combination of the above


Combining Channels

27

• Beamforming, two major techniques:
  - Delay-Sum
  - Frequency-Domain Beamformer

Delay-Sum Beamforming

28

More Information: http://www.xavieranguera.com/phdthesis/

Delay-Sum Beamforming

29

Output:
- Sum of all channels (lower noise)
- Delay between microphones (directional signal)
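Given the delays (in practice estimated by cross-correlating the channels, as BeamformIt does), the sum itself is simple. A sketch assuming integer sample delays:

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Shift each channel by its delay in samples, then average.
    The coherent source adds up; uncorrelated noise partially cancels."""
    shifted = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(shifted, axis=0)
```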

More on Features

30

• Mel-Frequency Cepstral Coefficients (MFCC)

Other (not explained here):
• LPC (Linear Prediction Coefficients)
• PLP (Perceptual Linear Predictive) features
• RASTA (see Morgan et al.)
• MSG (Modulation Spectrogram)

MFCC: Idea

31

MFCC computes the power cepstrum of the signal:

Audio Signal → Pre-emphasis → Windowing → FFT → Mel-Scale Filterbank → Log-Scale → DCT → MFCC
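The pipeline can be sketched end to end for a single frame. All parameter values below (0.97 pre-emphasis, 26 filters, 13 coefficients) are common defaults, not prescribed here:

```python
import numpy as np

def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr=16000, n_filters=26, n_coeffs=13):
    """Pre-emphasis -> windowing -> FFT -> mel filterbank -> log -> DCT."""
    n = len(frame)
    x = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])   # pre-emphasis
    x = x * np.hamming(n)                                    # windowing
    power = np.abs(np.fft.rfft(x)) ** 2                      # FFT power spectrum

    # Triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    log_energy = np.log(fbank @ power + 1e-10)               # log-scale mel energies
    k = np.arange(n_filters)
    # DCT-II of the log energies; keep the first n_coeffs coefficients
    return np.array([np.sum(log_energy * np.cos(np.pi * q * (2 * k + 1) / (2 * n_filters)))
                     for q in range(n_coeffs)])
```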

MFCC: Mel Scale

32

MFCC: Result

33

MFCC Variants and Derivatives

34

Derivatives:
• LFCC (no mel scale)
• AMFCC (anti-mel scale)

Parameters:
• MFCC12 (often used for ASR)
• MFCC19 (often used in speaker ID, diarization)
• “delta”: differences between consecutive coefficient frames (“first derivative”)
• “delta-delta”: “second derivative”
• Short-term: usually calculated on a 10–50 ms window

Typical Machine Learning for Audio Analysis

35

•Gaussian Mixture Models (today)

•HMMs•Bayesian Information Criterion•Supervector Approaches

Reminder: K-Means

36

Choose k initial means µi at random
loop
    for all samples xj: assign xj to the closest mean µi
    for all means µi: calculate a new µi by averaging all samples xj assigned to it
until the means µi are no longer updated significantly

Algorithm Outline (Expectation Maximization)
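The outline above in runnable form (NumPy sketch; samples are rows of `x`):

```python
import numpy as np

def kmeans(x, k, iters=100, seed=0):
    """K-means via alternating assignment and mean updates."""
    rng = np.random.default_rng(seed)
    means = x[rng.choice(len(x), k, replace=False)]     # k initial means at random
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # assign each sample to its closest mean
        labels = np.argmin(np.linalg.norm(x[:, None] - means[None], axis=-1), axis=1)
        # recompute each mean by averaging the samples assigned to it
        new = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                        else means[j] for j in range(k)])
        if np.allclose(new, means):                     # means no longer updated: stop
            break
        means = new
    return means, labels
```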

Reminder: Gaussian Mixtures

37

Reminder: Training of Mixture Models

38

Goal: Find the weights ai for

p(x) = Σi ai · N(x | µi, Σi)

Expectation: compute the responsibility of component i for each sample xj:

γij = ai N(xj | µi, Σi) / Σk ak N(xj | µk, Σk)

Maximization: re-estimate the parameters from the responsibilities:

ai = (1/n) Σj γij,   µi = Σj γij xj / Σj γij,   Σi = Σj γij (xj − µi)(xj − µi)ᵀ / Σj γij
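The two steps can be sketched for a one-dimensional mixture (a toy setup; real audio systems use multivariate Gaussians over MFCC vectors):

```python
import numpy as np

def em_step(x, a, mu, var):
    """One EM iteration for a 1-D Gaussian mixture model."""
    # Expectation: responsibility gamma[j, i] of component i for sample x[j]
    dens = a * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # Maximization: re-estimate weights, means, and variances
    nk = gamma.sum(axis=0)
    a = nk / len(x)
    mu = (gamma * x[:, None]).sum(axis=0) / nk
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return a, mu, var
```

Iterating from a rough initialization pulls the component means onto the clusters in the data.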

Magic Duo

39

~ 90% of audio papers use the combination of MFCCs and Gaussian Mixture Models to model audio signals!

More Tools: HTK

40

Open Source Speech Recognition Toolkit: http://htk.eng.cam.ac.uk/

More Tools: A nice collection of them

41

Princeton SoundLab: http://soundlab.cs.princeton.edu/software/

More Tools: BeamformIT

42

Open Source Beamformer: http://beamformit.sourceforge.net/

Next Week

43

•High-Level Design of Typical Audio Analytics

•Measuring Error•Multimedia

Organizing the Presentations

44

1) Shaunack Chatterjee
2) David Sun
3) David Sheffield
4) Armin Samii
5) Jonathan Kolker
6) Venkye Enkambaram
7) Tim Althoff
8) Giulia Fanti
9) Shuo-Yiin Chang
10) Ekatarina Gonina
11) Jaeyoung Choi

Recommended