Demos for QBSH

J.-S. Roger Jang (張智星)

http://mirlab.org/jang

CSIE Dept, National Taiwan University

Intro. to QBSH

QBSH: Query by Singing/Humming

Challenges
  • Robust pitch tracking
  • Key transposition
  • Collection of song databases
  • Efficient comparison
      - Karaoke box: ~10,000 songs
      - Internet: 500M songs, 12M albums (www.jogli.com)
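
The key transposition challenge listed above arises because a user rarely sings in the same key as the stored melody. One common way to handle it, sketched here under stated assumptions, is to convert pitch from Hz to semitones and subtract each sequence's mean so that only the melodic contour is compared. The function names are hypothetical, the Hz-to-semitone mapping follows the standard MIDI convention (A4 = 440 Hz = note 69), and mean removal is only one of several options; the slides do not say which scheme MIRACLE uses.

```python
import math

def hz_to_semitone(freq_hz):
    """Map a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def remove_key(pitch_semitones):
    """Shift a pitch vector so its mean is zero (a key-invariant contour)."""
    mean = sum(pitch_semitones) / len(pitch_semitones)
    return [p - mean for p in pitch_semitones]

# The same three-note phrase in two keys, 7 semitones apart (illustrative data).
low = [60.0, 62.0, 64.0]
high = [67.0, 69.0, 71.0]
same_contour = remove_key(low) == remove_key(high)
```

After mean removal the two phrases become identical, so a later comparison stage sees no difference between them.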

Efficient Retrieval in QBSH

Methods for efficient retrieval
  • Multi-stage progressive filtering
  • Indexing for different comparison methods
  • Music phrase identification
  • Repeating pattern identification
  • Distributed & parallel computing

Our focus
  • Parallel computing via GPU
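
To make the idea of multi-stage progressive filtering concrete, here is a minimal two-stage sketch: a cheap score prunes most of the database, and a costlier score reranks the survivors. Both distance functions, the `keep` parameter, and the toy database are assumptions made for illustration, not the filters used in MIRACLE.

```python
def cheap_distance(query, song):
    """Stage 1: compare only a coarse summary (here, the pitch range)."""
    return abs((max(query) - min(query)) - (max(song) - min(song)))

def costly_distance(query, song):
    """Stage 2: frame-by-frame mean absolute difference over the shared prefix."""
    n = min(len(query), len(song))
    return sum(abs(q - s) for q, s in zip(query, song)) / n

def progressive_search(query, database, keep=2, top_k=1):
    """Keep the `keep` best songs by the cheap score, then rerank only those."""
    survivors = sorted(database, key=lambda name: cheap_distance(query, database[name]))[:keep]
    reranked = sorted(survivors, key=lambda name: costly_distance(query, database[name]))
    return reranked[:top_k]

# Hypothetical pitch contours in semitones.
database = {
    "song_a": [0.0, 2.0, 4.0, 5.0],
    "song_b": [0.0, 0.0, 0.0, 0.0],
    "song_c": [5.0, 4.0, 2.0, 0.0],
}
query = [0.0, 2.0, 4.0, 5.0]
best = progressive_search(query, database)
```

The trade-off is that a song wrongly discarded in stage 1 can never be recovered, so `keep` controls the balance between speed and recall.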

MIRACLE

MIRACLE: Music Information Retrieval Acoustically via Clustered and paralleL Engines

Database (~20K songs)
  • MIDI files
  • Solo vocals (<100)
  • Melody extracted from polyphonic music (<100)

Comparison methods
  • Linear scaling
  • Dynamic time warping

Top-10 accuracy
  • ~75%

Platform
  • Single CPU+GPU
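
The two comparison methods named above can be sketched as follows. This is a textbook-style illustration, not MIRACLE's code: the scaling factors, the absolute-difference cost, the unconstrained DTW step pattern, and all function names are assumptions, and the inputs are taken to be pitch vectors in semitones that are already key-normalized.

```python
def resample(seq, length):
    """Linearly interpolate `seq` to `length` points (length >= 2)."""
    out = []
    for i in range(length):
        pos = i * (len(seq) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1.0 - frac) + seq[hi] * frac)
    return out

def linear_scaling_distance(query, song, factors=(0.5, 0.75, 1.0, 1.25, 1.5, 2.0)):
    """Stretch or compress the query by each factor and keep the best match
    against the beginning of the song (mean absolute difference)."""
    best = float("inf")
    for f in factors:
        length = round(len(query) * f)
        if length < 2 or length > len(song):
            continue  # skip scalings that are too short or overrun the song
        scaled = resample(query, length)
        dist = sum(abs(a - b) for a, b in zip(scaled, song)) / length
        best = min(best, dist)
    return best

def dtw_distance(a, b):
    """Dynamic time warping with steps (1,0), (0,1), (1,1), no path constraint."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]

# A query sung twice as fast as the stored melody (illustrative data).
query = [0.0, 2.0, 4.0]
song = [0.0, 0.0, 2.0, 2.0, 4.0, 4.0]
```

Linear scaling costs one pass per factor and assumes a uniform tempo difference; DTW tolerates local tempo changes at the price of a quadratic table per song, which is one reason the two are often combined.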

MIRACLE (II)

References (full list)
  • J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85-89, Hsinchu, Taiwan, Dec. 2000.
  • Jyh-Shing Roger Jang, Jiang-Chun Chen, Ming-Yang Kao, "MIRACLE: A Music Information Retrieval System with Clustered Computing Engines", International Symposium on Music Information Retrieval (ISMIR), 2001.
  • …
  • Chung-Che Wang and Jyh-Shing Roger Jang, "Acceleration of Query by Singing/Humming Systems on GPU: Compare from Anywhere", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.

MIRACLE Before Oct. 2011

  • Client-server distributed computing
  • Cloud computing via clustered PCs

[Figure: clients (PC, PDA/smartphone, cellular) connect to a master server backed by clustered slave servers. Request: pitch vector. Response: search result. Database size: ~12,000.]
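
The master/slave arrangement can be illustrated with a small sketch in which the master splits the database into shards, each slave returns its local top-k matches for the incoming pitch vector, and the master merges the partial lists. Threads stand in for separate machines here, and the sharding scheme, the distance function, and all names are assumptions made for illustration rather than details taken from MIRACLE.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def distance(query, song):
    """Placeholder comparison: mean absolute difference over the shared prefix."""
    n = min(len(query), len(song))
    return sum(abs(q - s) for q, s in zip(query, song)) / n

def slave_search(query, shard, top_k):
    """One slave: score every song in its shard and return its local top-k."""
    scored = [(distance(query, pitch), name) for name, pitch in shard]
    return heapq.nsmallest(top_k, scored)

def master_search(query, songs, num_slaves=3, top_k=2):
    """Master: split the database, fan out the query, merge the partial results."""
    shards = [songs[i::num_slaves] for i in range(num_slaves)]
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        partial = pool.map(lambda shard: slave_search(query, shard, top_k), shards)
        merged = [hit for hits in partial for hit in hits]
    return [name for _, name in heapq.nsmallest(top_k, merged)]

# Hypothetical database of (name, pitch vector) pairs.
songs = [
    ("song_a", [0.0, 2.0, 4.0]),
    ("song_b", [0.0, 0.0, 0.0]),
    ("song_c", [0.0, 2.0, 5.0]),
    ("song_d", [9.0, 9.0, 9.0]),
]
result = master_search([0.0, 2.0, 4.0], songs)
```

Because each slave only needs to report its own top-k, the merged result equals the global top-k while the response payload stays small regardless of shard size.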

Current MIRACLE

  • Single server with GPU
      - NVIDIA 560 Ti, 384 cores (speedup factor = 10)

[Figure: clients (PC, PDA/smartphone, cellular) connect to a single master server. Request: pitch vector. Response: search result. Database size: ~13,000.]

MIRACLE in the Future

  • Multi-modal retrieval
      - Singing, humming, speech, audio, tapping…

[Figure: clients (PC, PDA/smartphone, cellular) connect to a master server backed by clustered slave servers. Request: feature vector. Response: search result.]

QBSH for Various Platforms

  • PC: Web version
  • Embedded systems: Karaoke machines
  • Smartphones: iPhone/Android
  • Toys: 16-bit microcontroller

QBSH Prototype in MATLAB

To create a QBSH prototype in MATLAB
  • Get familiar with audio processing in MATLAB
      - See audio signal processing
  • Try the programming contests on
      - Pitch tracking
      - QBSH
  • Run exampleProgram/goDemo.m to test drive the QBSH prototype in MATLAB!

QBSH Demos

QBSH demos by our lab
  • QBSH on the web: MIRACLE
  • QBSH on toys

Existing commercial QBSH systems
  • www.midomi.com
  • www.soundhound.com

Returned Results

Typical results of MIRACLE

Online Karaoke

  • Synchronized lyrics
  • Calorie consumption
  • Real-time score
  • Recording
  • Live broadcast
  • Real-time pitch display
  • Automatic key adjustment

Future Work

Multi-modal music retrieval
  • Query by user's inputs: singing, humming, whistling, speech, tapping, beatboxing
  • Query by exact examples: audio clips

Speedup schemes
  • Repeating pattern identification, DTW indexing

Database preparation
  • Polyphonic audio music as database: the ultimate challenge!