CSP05-06 - Auditory input processing
Auditory input processing
Lecturer: Smilen Dimitrov
Cross-sensorial processing – MED7
Introduction
• The immobot base exercise
• Work on the auditory input
• Goal – sound source localization in 3D
• Setup:
– PC
– Two microphones
– Sound card
Setup – microphone problems
• We need to use two microphones to obtain a stereo signal
• For regular PC microphones (like our Sandbergs):
– Take note that they are electret!
– They demand +5V from the PC in order to work
– All PC mic inputs follow this standard: although we have a tip-ring-sleeve jack connector, it is NOT a stereo jack.
• Thus a PC mic input will always show as mono (the stereo button will be greyed out in the Recording control of the Windows mixer)
• Hence the connection cable below will NOT work (as it assumes that the electret connector is a stereo one)
• Hence, we will have to use:
– a dedicated audio card,
– with two microphone inputs,
even if we want to use cheap electrets for stereo!
• One possible sound card: M-Audio MobilePre USB
• Interfacing two electrets for stereo input:
– would involve a schematic cable like below:
• (assuming we have a stereo plug mic input on the card)
• To avoid these problems with electrets, we are going to use capacitor microphones (Generis)
• Note that these microphones must be connected using an XLR cable (the M-Audio card has such mic inputs)
• Note that condenser/capacitor microphones demand a power supply – so-called “phantom power” (the M-Audio card provides it)
• Thus, we should make sure the sound card and the microphones are compatible.
Setup
• Setup for a PC:
(In addition to the microphones and the sound card):
1. M-Audio MobilePre USB drivers
2. Max/MSP/Jitter
• Microphone parameters need not be specified in the algorithm discussed today.
Goal of the auditory processing algorithm
• Object detection:
– the application needs to detect the presence of a new object whenever it enters the monitored environment (say, a sound louder than a threshold)
• Object recognition:
– once a new object is detected, it needs to be classified to determine its type (e.g., a car versus a truck, a tiger versus a deer); this involves comparing sounds (spectrum signatures)
• Object tracking:
– assuming the new object is of interest to the application, it can be tracked as it moves through the environment. Tracking involves computing the current location of the object and its trajectory
[Diagram: Preprocess audio → Estimation of 3D location through ITD / cross-correlation]
• Relation to the model we had for visual input processing
– Not really applicable to the algorithm discussed, but could be; here we will directly do tracking
Sound-source localization using ITD and cross-correlation
• Small comparison between a stereo camera system and a microphone system
– Camera – 2D sensor (2D array of photocells)
– A single camera can give a vector of direction to the tracked object
– Two cameras can give a point (intersection of direction vectors – CPA)
– Microphone – 1D sensor (senses values at a single point – corresponds to a single photocell in a camera)
– A single microphone cannot give any geometric information
– Two microphones can only give the azimuthal angle – which corresponds to a vector of direction, confined to the “horizontal” plane
• Algorithm – computing the time delay of arrival (TDOA) of the wave front at the two microphones
– In biological terms this is the equivalent of the Interaural Time Difference (ITD)
– We compute the lag of the wave at a specific point received at both microphones (the Interaural Phase Difference (IPD))
– Must find the time difference between two identical points in the left and right sound signal – using cross-correlation
• Cross-correlation – two arrays, representing the left and right audio signal: g and h – their correlation is also an array:
$$\mathrm{Corr}(g,h)_j = \sum_{k=0}^{N-1} g_{j+k}\, h_k$$
• The length of the cross-correlation array is
$$\mathrm{length}(C) = (\mathrm{length}(A) + \mathrm{length}(B)) - 1$$
• Cross-correlation – in essence, what we are doing is taking one array and “sliding” it across the other, summing the products of the overlapping elements at each position.
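The “sliding” sum of products can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the MSP object from the lecture: the function names `cross_correlate` and `best_lag` and the two short test signals are hypothetical, and samples outside an array are treated as zero.

```python
def cross_correlate(g, h):
    """Slide h across g and sum the products of overlapping samples.

    Returns (lags, values) with values[i] = sum_k g[k + lag] * h[k]
    for lag = lags[i]; samples outside g are treated as zero.
    """
    lags = list(range(-(len(h) - 1), len(g)))
    values = []
    for lag in lags:
        total = 0.0
        for k in range(len(h)):
            j = k + lag
            if 0 <= j < len(g):
                total += g[j] * h[k]
        values.append(total)
    return lags, values


def best_lag(g, h):
    """Lag (in samples) at which the cross-correlation is largest."""
    lags, values = cross_correlate(g, h)
    peak = max(range(len(values)), key=lambda i: values[i])
    return lags[peak]


# Hypothetical signals: the same pulse, arriving 2 samples later on the right.
left = [0, 0, 1, 2, 1, 0, 0, 0]
right = [0, 0, 0, 0, 1, 2, 1, 0]

lags, values = cross_correlate(right, left)
lag = best_lag(right, left)
print(len(values), lag)
```

The result has length(g) + length(h) − 1 entries, matching the length formula, and the lag of the peak is the number of delay samples used in the following steps.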
• Cross-correlation – algorithm
• First, find the time increment between sampling:
$$\Delta = \frac{1}{44.1 \cdot 10^{3}} = 2.2676 \cdot 10^{-5}\ \mathrm{s}$$
• Assume the sound can be analyzed through the diagram below:
• Sound arriving at the left channel will arrive at the right channel after crossing distance b – we know the speed of sound, so we can also calculate the time difference
• Trigonometry:
$$\sin\theta = \frac{a}{c}, \qquad \cos\theta = \frac{b}{c}, \qquad \tan\theta = \frac{a}{b}$$
• The time difference:
$$t = \sigma \cdot \Delta$$
– Where Δ = time between sound sampling, and σ = the number of delay samples returned from the cross-correlation function.
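As a quick worked example, assume the cross-correlation returns a lag of σ = 10 samples (an illustrative value, not one from the lecture) at a 44.1 kHz sampling rate:

$$t = \sigma \cdot \Delta = 10 \cdot 2.2676 \cdot 10^{-5}\ \mathrm{s} \approx 2.27 \cdot 10^{-4}\ \mathrm{s}$$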
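• Calculate the length of line a:
$$a = t \cdot v_{sound} = \sigma \cdot \Delta \cdot v_{sound}$$
– Speed of sound v_sound ≈ 343 m/s at room temperature
• Finally, calculate the angle θ:
$$\sin\theta = \frac{a}{c} \;\Rightarrow\; \theta = \sin^{-1}\frac{a}{c} = \sin^{-1}\frac{\sigma \cdot \Delta \cdot v_{sound}}{c}$$
– Where c is the known distance between the microphones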
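The chain from delay samples to angle can be sketched as a small Python function. The name `angle_from_lag` and the default values are assumptions for illustration: a 44.1 kHz sampling rate, a hypothetical microphone spacing of 0.2 m, and roughly 343 m/s for the speed of sound in air at about 20 °C. The clamping step is also an addition, guarding against lags that would give |a| > c.

```python
import math


def angle_from_lag(sigma, sample_rate=44100.0, mic_distance=0.2, v_sound=343.0):
    """Return the azimuth angle theta (radians) for a lag of sigma samples.

    mic_distance (c) and v_sound are assumed values, not from the lecture.
    """
    delta = 1.0 / sample_rate            # time between samples, in seconds
    t = sigma * delta                    # time difference of arrival
    a = t * v_sound                      # path-length difference, in metres
    ratio = a / mic_distance             # sin(theta) = a / c
    ratio = max(-1.0, min(1.0, ratio))   # clamp: noisy lags can give |a| > c
    return math.asin(ratio)


theta = angle_from_lag(10)
print(math.degrees(theta))
```

A lag of zero gives θ = 0 (source straight ahead), and the sign of the lag determines on which side of the microphone pair the source lies.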
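• When θ is finally computed, we obtain a direction vector by rotating the unit vector in the horizontal plane (xz) around the vertical axis (y) by the angle θ
• So, the vector DA with components (-sin θ, 0, cos θ) will represent the direction of the detected audio source
$$DA = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ 0 \\ \cos\theta \end{pmatrix}$$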
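The rotation can be checked numerically with a minimal sketch. The helper names `rotate_about_y` and `direction_vector` are hypothetical; the rotation follows the sign convention that yields (-sin θ, 0, cos θ) when applied to the unit vector (0, 0, 1).

```python
import math


def rotate_about_y(theta, v):
    """Rotate vector v = (x, y, z) around the vertical y axis by theta radians."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * z, y, s * x + c * z)


def direction_vector(theta):
    """Direction DA of the audio source: the unit z vector rotated by theta."""
    return rotate_about_y(theta, (0.0, 0.0, 1.0))


da = direction_vector(math.radians(30.0))
print(da)
```

For θ = 0 the vector points straight along z, and it stays in the xz plane for any θ, which is why only the azimuth is recovered.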
• Overview of the algorithm (architecture)
• Problems with the approach
• We only retrieve a direction vector in a plane (azimuthal angle) – information about the “vertical” position of the sound source is lost
• 3D localization of audio as a 3D point is possible using two microphones, if some medium (that changes sound) is placed between the microphones (a “head”), and then a head-related transfer function is calculated.
Implementation in Max/MSP
• We will program our own MSP object to perform audio cross-correlation in real time – then proceed to vector calculation and display