36
1 Music Information Retrieval CTP 431 Music and Audio Computing Graduate School of Culture Technology (GSCT) Juhan Nam

Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

1

Music Information Retrieval

CTP 431 Music and Audio Computing

Graduate School of Culture Technology (GSCT)Juhan Nam

Page 2: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Introduction

2

ü  Instrument:ü  Composer:ü  Key:ü Melody

ü  Transcrip7on–Musicnota7onü  Genre:Classicalü Mood:Melancholy,Sad,…

-ELO“ADerall”-Radiohead“ExitMusic”

ChopinPiano

E-minor

Page 3: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Music Information Retrieval (MIR)

§  Information in Music–  Factual: track, artist, years–  Acoustic: loudness, pitch, timbre–  Symbolic: Instrument, melody, rhythm, chords, structure –  Semantic: genre, mood, user preference

§  Area of research that aims to infer various types of information from music data–  Make computer understand music as human does–  Provide intelligent solutions to enhance human musical activities

3

Page 4: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

MIR Tasks

§  Audio fingerprinting §  Cover song detection§  Music transcription: melody, notes, tempo, chords§  Segmentation, structure, alignment§  Similarity-based retrieval, playlists, recommendation§  Classification: genre, mood, tags, …§  Query by humming§  Source separation: vocal removal§  Symbolic MIR: score retrieval or harmony analysis§  Optical Music Recognition (OMR)

MIREX: http://www.music-ir.org/mirex/wiki/MIREX_HOME

4

Page 5: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

MIR Research Disciplines

§  Digital Signal Processing§  Acoustics§  Music theory§  Machine Learning§  Natural language processing / Computer vision§  Psychology§  Human-Computer Interaction

5

Page 6: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Music Search

§  Query by music–  Search a single unique song identified by

the query–  Audio fingerprint –  Applied to movies, TV and ads, too

§  Query by humming–  Sing with humming and find closest matches–  Melody match

6

Page 7: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Music Recommendation

§  Personalized Radio–  Generate Playlist –  Based on user data, similarity and context

7

iTunesRadio Pandora

Page 8: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Score Following

§  Listen to performance and track the notes–  Example: JKU, Tonara

8

Page 9: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Score Following

§  The Piano Music Companion (2013) –  Along with song identification

9

Page 10: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Automatic Accompaniment

§  Score following + Interactive Performance–  Examples: IRCAM’s Antefesco, Sonation’s Cadenza

10

Page 11: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Entertainment / Education

§  Focus on performance evaluation–  Learning musical instrument–  Examples: Ovelin’s Yousician, MakeMusic’s Smartmusic,

Ubisoft’s RockSmith, RockProdigy

11

Page 12: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Music Production

§  Sound Sample search–  Imagine Research’s MediaMind: search sound effect sample for

media production (e.g. film, drama)–  Izotope’s Breaktweaker: search similar timbre of drum sounds

12

Page 13: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Application: Music Composition

§  Automatic Song writing–  Automatic arrangement–  Example: MSR’s Songsmith

13

Page 14: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

CASE STUDY: Music Recommendation

14

Page 15: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Backgrounds

§  Music record market –  Offline à Online music services–  CD à MP3 à Streaming audio

§  Scale and diversity of music contents–  Commercial music tracks

•  Spotify: 30M+ songs (2015)•  Bugs music: 4.1M+ songs (2015)

–  User contents •  YouTube: 300h+ video uploaded per min (2015)•  SoundCloud: 12h+ audio uploaded per minute (2014)

–  TV, cables and online media•  Music program, concert, music videos, audition, …

15

Page 16: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Backgrounds

§  Connection with human data –  Number of users

•  Spotify: +24M active users (as of Jan, 2014)•  YouTube: +1B unique users’ visit each month (as of Dec, 2014)

–  Personal data•  Play history, rate, personal music library •  Profile: age, occupation, …

–  Social data•  The majority of online services can be logged in via SNS•  Friends, followers•  Daily posting, blog (reviews), comments

16

Page 17: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Challenges

§  There are too many choices of music contents

§  How can we find music more easily or in a human-friendly way?–  Searching music with various queries (e.g. text, humming, audio

tracks)–  Recommendation based on user data (e.g. play history, rating,

location)

§  We need to extract semantic or musical information from audio tracks, and match them to the query or user data

17

Genre,Mood,Instrument,

Songcharacteris7cs

Queryword,Playhistory,RateProfile,Loca7on

Discovery/Familiarity

UsersMusic

Page 18: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Current Approaches

§  Manual Curation§  Human Expert Analysis §  Collaborative Filtering§  Content-based Analysis (by computers)

18

Page 19: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Manual Curation

§  Playlist generation by music experts (or users)–  Traditional: AM/FM radio–  The majority of current music services are

based on this approach

§  Advantages–  Effective for usage-based music services

(workout, study, driving or prenatal education)

–  Good for music discovery–  Often with story-telling

§  Limitations–  No personalization–  Not scalable

19[www.soribada.com]

Page 20: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Human Expert Analysis

§  Pandora: music genome project (1999) –  Musicologists analyze a song for about 450 musical attributes in

various categories–  Big success as a music service

§  Advantages–  High-quality analysis–  Good for music discovery

§  Limitations–  Expensive: take 20-30 minutes for a song to be analyzed–  Not scalable : only for commercial tracks ?

20

Page 21: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Collaborative Filtering (CF)

§  Basic idea

§  Formation–  Matrix factorization (or matrix completion) problem

21

PersonA:IlikesongsA,B,CandD.PersonB:IlikesongsA,B,CandE.PersonA:Really?YoushouldcheckoutsongD.PersonB:Wow,youalsoshouldcheckoutsongE.

Juhan

GangnamStyleJuhan’slatentvector

GangnamStyle’slatentvectorxu

yspus = xu

T ys

SongPreference

qu1u2 = xu1T xu2

UserSimilarity

rs1s2 = ys1T ys2

SongSimilarity

Page 22: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Collaborative Filtering

§  Advantages–  Capture semantics of music in the aspect of human –  Enable personalized recommendation (by nature)

§  Limitations–  The cold start problem: what if a song was never played by

anyone?–  Popularity bias: likely to recommend (already) well-known songs

or songs from the same musician or album

22

Page 23: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Collaborative Filtering

§  Bad examples

23

Canyoufindsongssimilartothismusician?

ThesesongsarealreadywhatIknowwell!

[Oordet.al,2013]

Page 24: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Content-Based Analysis: Music Auto-tagging

§  Google has music service as part of Google play–  Their main features “Instant mix”, which automatically generates a

playlist based on user’s music collections or play history §  They do CF but also make use of audio content. How?

24

FastCompany,July,2013

Page 25: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Content-Based Analysis: Music Auto-tagging§  An intelligent approach that makes computers listen to

music and predict descriptive words (i.e. tags) from audio tracks–  Features: MFCC, Chroma,…–  Algorithms: GMM, SVM, Neural Networks –  Tags: genre, mood, instrument, voice quality, usage

§  Basic Framework

2525“Metal”

“Jazz”

“Classical”

AlgorithmsAudioFiles AudioFeatures

Page 26: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Example of Auto-tagging

26

Thisisa[]songthatis[],[]and[].Itfeatures[]and[]vocal.Itisasongwith[]and[]thatyoumightliketolistentowhile[].

Thisisa[verydanceable]songthatis[arousing/awakening],[exci5ng/thrilling]and[happy].Itfeatures[strong]and[fasttempo]vocal.Itisasongwith[highenergy]and[highbeat]thatyoumightliketolistentowhile[ataparty].

Thisisa[pop]songthatis[happy],[carefree/lighthearted]and[light/playful].Itfeatures[high-pitched]vocaland[alteredwitheffects]vocal.Itisasongwith[posi5vefeeling]thatyoumightliketolistentowhile[ataparty].

JamesBrown–Giveituporturnitaloose

Cardigans-Lovefool

Page 27: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Text-based Music Retrieval by Auto-tagging

§  Sort the probability of the query tag and choose top-N songs–  Like text-based Google search

§  We also can compute similarity between songs using the estimated tag probabilities–  E.g. cosine distance between two tag probability vectors–  Applicable to query by audio

27

Queryword:“FemaleLeadVocals”

Top5rankedsongs

NorahJones–Don’tknowwhy

Dido–Herewithme

SherylCrow–Ishallbelieve

Nodoubt–Simplekindoflike

Carpenters–RainydaysandMondays

Page 28: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Content-based Music Recommendation

§  Blending audio and user data–  Replace the text-based tags with the latent vector of a song

28

AudioTrackof“GangnamStyle”

Matrixfactoriza7onfromcollabora7vefiltering

[Oordet.al,2013]

“user”

“song”“GangnamStyle’slatentvector

Page 29: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Music Retrieval Results

29

Collabora7veFilteringonly Collabora7veFiltering+AudioContent

[Oordet.al,2013]

Page 30: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Content-Based Analysis: Music Auto-tagging

§  Advantages–  Free of cold-start and popularity bias–  Highly scalable: using high-performance computing–  Works for music in other media or user content as well–  Can be combined with other approaches

§  Limitations–  Some tags are unpredictable: indy, idol, …–  Hard to measure music quality (or level of performance),

especially for user contents

30

Page 31: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

CASE STUDY: Score Following

31

Page 32: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Music Score Following

§  Tracking played notes while listening to the music–  Temporally align different representations or renditions of music–  Audio to Audio, Audio to Score (or MIDI)

Page 33: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Music Score Following

§  Extracting Chroma Features–  Capture harmonic (or tonal) characteristics of music

33

CENS:NormalizedChromaFeatures(Muller,2005)

MIDI

Lisitsa

Page 34: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

34

Music Score Following

§  Computing (Dis)similarity Matrix

Page 35: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Music Score Following

35

LocalSimilarity AccumulatedSimilarity

§  Computing the Shortest Path using Dynamic Time Warping

Page 36: Music Information Retrieval - KAISTmac.kaist.ac.kr/~juhan/ctp431/2015/slides/10-MIR.pdf · 2018-09-14 · Challenges § There are too many choices of music contents § How can we

Score Following Demo