22
www.tu-chemnitz.de Chemnitz 02 July 2019 Danny Kowerko 3D Indoor Audio Localization of Moving Objects Fakultät für Informatik Juniorprofessur Media Computing

3D Indoor Audio Localization of Moving Objects

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.deChemnitz ∙ 02 July 2019 ∙ Danny Kowerko

3D Indoor Audio Localization of Moving Objects

Fakultät für InformatikJuniorprofessur Media Computing

Page 2: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de2Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Determination of the Position of a Sound Source in 3D Space

So we are trying to answer the question: Where is Tux?

𝑷Tux = 𝒙, 𝒚, 𝒛 ?

Fakultät für InformatikJuniorprofessur Media Computing

What is 3D Acoustic Source Localization?

Sound Source

3D Space

Page 3: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de3Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

How is 3D Acoustic Source Localization typically approached?

1) Multiple microphones at known, fixed locations

2) Record sound emitted by sound source

3) Deduce sound source position

Loalization Method Categories

− Time Difference of Arrival (TDOA)− Beamforming− Equation System-based

Microphone Geometries

− Planar arrays

− Distributed topologies

}}

Page 4: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de4Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

Microphone Geometries – Planar Arrays vs. Distributed Topolgies

16 microphones planar array,4×4 rectangular configuration

8 microphones distributedtopology, mics in all corners of a room

Page 5: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de5Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

How does TDOA based Localization work with Planar Arrays?

x

ySound Source

Mic 1Mic 2

φd

φ.. Angle of Arrivald.. Distance b/w MicrophonesR.. Average Distance b/w Sound

Source & Microphonesv.. Speed of Soundλ.. Wave Lengthτ.. Measured Time Delay b/w

Microphone Signals

𝜑 = cos−1𝑣 ∙ 𝜏

𝑑, for 𝑅 > 𝜆

R

Far Field

x

ySound Source

Mic 3Mic 4

φ1

Mic 1Mic 2

φ2

yS*

xS*

φ1.. Angle of Arrival Mic Pair 1φ2.. Angle of Arrival Mic Pair 2xS

*, yS*.. Estimated Position of Sound

Source

Step 1 [1]

Calculate Angle ofArrivals (AOAs)

[2] Step 2Combine AOAs and

localize Sound SourceUpper Row of 4x4 Microphone Array

Page 6: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de6Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

Estimating 𝝉 – Time Delay Estimation

Selection of Time Delay Estimation Methods [3]

Peak Search

Cross Correlation Function2 Mic Recordings (1 Pair)

Cross Correlation (CC), Generalized Cross Correclation (GCC), GeneralizedCross Correclation Phase Transform (GCC-PHAT), Adaptive Least Mean SquareFilter (LMS), Maximum Likelihood Estimator (ML), Average Square DifferenceFunction (ASDF)

Page 7: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de7Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

ΘH ≈ 130°

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset I – Static Sources & Planar Arrays

MC Localization-Dataset IPlace Media Computing LabAuthor Dr. HusseinSound Sources 16 loudspeakers

coordinates knownSource Audio 10s newscastMicrophones 56 microphones in 3 arrays

coordinates knownMethodology 16 loudspeakers

consecutively played back the cource audio file. Onerecording per loudspeaker

ΘV ≈ 120°Real & Virtual Scheme of Dataset

Angle Conventions & 3D AOA Example

Page 8: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de8Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset I – Results & Visualization (TDOA Method)

Calculate 3D region (e.g. sphere) in which the AOA vectors are spatially particularly close

Ca

lcu

late

An

gle

of

Arr

iva

ls(A

OA

s)

Ste

p1

Step 2 Combine AOAs and localize Sound Source

Noise Female Speech Jingle

Page 9: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de9Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

• 60 different horizontal microphone pair combinations

• 16 sound sources

• microphone array 3 (24 microphones)

→ RMSE - Root Mean Squared Error to estimatedeviations from model

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset I – Location Determination Results of static sources

„real“ horizontal angle fhor / °

pre

dic

ted

horizon

tal angle

/ °

RMSE ~ 10°

Page 10: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de10Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset I – Location Determination Results of Static Sources

→Most accurate:Array 2:cardioid pattern

RM

SE

/ °

predicted orientations

Average over

16 static sources

array 1 array 1 array 2 array 2 array 3 array 3

fhor fvert fhor fvert fhor fvert

Page 11: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de11Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset II – Dynamic Sources

MC Localization-Dataset IIPlace Media Computing LabAuthor R. Erler & F. SchmidsbergerSound Source Moving vacuum cleaner

robot, coordinates knownSource Audio NoiseMicrophones 48 mics in 3 arrays, 8 mics

in corners, coordinatesknown

Methodology Record audio and topviewvideo synchronously whilerobot vaccums the floor

Example Video from Dataset

Page 12: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de12Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset II – How can we generate the Ground Truth?

Raw Image Rectified Image Perspective corr. Image

Camera Calibration Homography Matrix

Annotated Image

Label Robot Center in Image Coordinates

𝑥px, 𝑦px

Transform

𝑥px, 𝑦pxinto World Coordinates𝑥w, 𝑦w

x: 1m, y: 1.5m

Page 13: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de13Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset II – Ground Truth Visualization

GT Video RealworldGT Video 3D Model

Page 14: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de14Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset II – Location Determination Results

21 3

54 6

87 9

Localization:

• only direct neighbor → 9 × L-shape mic pairs

• only vertical angles considered 0° <= fvert <= 90°

4x4 Array

Page 15: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de15Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Localization:

• only direct neighbor→ 4 × X-shape mic pair

• all angles considered 0° <= fvert <= 90°

Fakultät für InformatikJuniorprofessur Media Computing

MC Localization Dataset II – Location Determination Results

4x4 Array

1 2

3 4

Page 16: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de16Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

Evaluation of Array 1 vs. Array 2 vs. Array 3

Vertical Angle Horizontal Angle

Page 17: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de17Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

→ best results for cardioidmicrophones

Fakultät für InformatikJuniorprofessur Media Computing

Evaluation of Array 1 vs. Array 2 vs. Array 3

Page 18: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de18Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

Fakultät für InformatikJuniorprofessur Media Computing

Implementation Details

Data Processing Task Essential (Matlab) functions* Matlab Toolboxes

Camera Calibration Camera Calibrator App Computer Vision

Rectification undistortImage Image Processing

Image Annotation (Ground Truth) imageLabeler (App) Computer Vision

Coordinate Transformation Cp2tform Image Processing

Time Delay of Arrival Xcorr Signal Processing

Direction of Arrival Cosine/Algebra

Evaluation/Regression Fit, (Goodnes of Fit) Curve Fitting

Visualisation scatter3 -

*Matlab 2018a

Page 19: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de19Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

• Sensor Fusion with (Deep Learning based) Localized Objects from Video Material

• Publication of results and source code

• Intregration into lecture/turorials (Matlab Live Editor)

• Optimization of localization• Beamforming

• Geometry

• Deep Learning

• Search (Industrial) Cooperations

Fakultät für InformatikJuniorprofessur Media Computing

Outlook

Page 20: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.deChemnitz ∙ 02 July 2019 ∙ Danny Kowerko

3D Indoor Audio Localization of Moving Objects

Fakultät für InformatikJuniorprofessur Media Computing

Page 21: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de21Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

[1] J. York, ‘Angle Of Arrival.’ [Online: last accessed 16.03.2019]. https://www.ese.wustl.edu/~nehorai/josh/students.cec.wustl.edu/_jly1/doa.html.

[2] J. York, ‘Cooridinate Derivation '(x,y)‘.’ [Online: last accessed 16.03.2019]. https://www.ese.wustl.edu/~nehorai/josh/students.cec.wustl.edu/_jly1/posderivation.html.

[3] Y. Zhang, Waleed, and H. Abdulla, ‘A Comparative Study of Time-Delay Estimation Techniques Using Microphone Arrays’, in The University of Auckland, School of Engineering Report, Auckland, New Zealand, 2005, vol. 619.

[4] H. Wang et al., ‘Acoustic sensor networks for woodpecker localization’, in Proc.SPIE, 2005, vol. 5910.[5] C.-E. Chen, A. M. Ali, and H. Wang, ‘Design and Testing of Robust Acoustic Arrays for Localization and Enhancement of Several Bird

Sources’, in Proceedings of the 5th International Conference on Information Processing in Sensor Networks, New York, NY, USA, 2006, pp. 268–275.

[6] J. Lanslots, F. Deblauwe, and K. Janssens, ‘Selecting sound source localization techniques for industrial applications’, Sound Vib., vol. 44, no. 6, p. 6, 2010.

[7] M. Basiri, F. Schill, P. U.Lima, and D. Floreano, ‘Localization of emergency acoustic sources by micro aerial vehicles’, J. Field Robot., vol. 35, no. 2, pp. 187–201, Mar. 2018.

[8] S. V. Sibanyoni, D. T. Ramotsoela, B. J. Silva, and G. P. Hancke, ‘A 2-D Acoustic Source Localization System for Drones in Search and Rescue Missions’, IEEE Sens. J., vol. 19, no. 1, pp. 332–341, Jan. 2019.

Fakultät für InformatikJuniorprofessur Media Computing

References

Page 22: 3D Indoor Audio Localization of Moving Objects

www.tu-chemnitz.de22Chemnitz ∙ 02 March 2019 ∙ Danny Kowerko

• Wang et al. 2005, ‘Acoustic sensor networks for woodpecker localization’ [4]

• Chen et al. 2006, ‘Design and Testing of Robust Acoustic Arrays for Localization and Enhancement of Several Bird Sources’ [5]

• Lanslots et al. 2010, ‘Selecting sound source localization techniques for industrial applications’ [6]

• Basiri et al. 2018, ‘Localization of emergency acoustic sources by micro aerial vehicles’ [7]

• Sibanyoni et al. 2019, ‘A 2-D Acoustic Source Localization System for Drones in Search and RescueMissions’ [8]

Fakultät für InformatikJuniorprofessur Media Computing

3D Acoustic Source Localization – What is it good for?

Micro Aerial Vehicle [7] Door slam [6]