Cold-Start Music Recommendation Using Multimodal Deep Architectures
Systematic Approaches to Deep Learning Methods for Audio, ESI Workshop, Vienna, Austria
Sep 15, 2017
Outline
• Motivation: The Cold-Start Problem
• Background: Collaborative Filtering
• Cold-Start Music Recommendation:
  • Estimate Collaborative Factors from Audio
  • The Music Genome Project™
  • Multimodal Deep Architectures
Cold-Start Problem: The Long Tail
[Figure: long-tail plot of track popularity, starting from the most popular tracks, shown at three zoom levels (0.01%, 1%, and 100% of tracks).]
35% of tracks had 0 spins last week.
Collaborative Filtering: Problem Overview
[Figure: sparse Users × Items (Tracks) rating matrix; most entries are unknown ("?").]
Explicit feedback:
• Thumbs (up and down)
• Station creation
Implicit feedback:
• Track completion
• Track skips
Collaborative Filtering: Latent Factors
[Figure: tracks placed in a latent space; axis labels: Aggressive vs. Calm, Simple Harmony vs. Complex Harmony.]
Collaborative Filtering: Matrix Factorization
[Figure: the sparse Users × Items (Tracks) matrix is approximated (≈) by the product of two factor matrices of rank k, one for users and one for items.]
Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix Factorization Techniques for Recommender Systems. Computer, 42(8), 42–49.
Collaborative Filtering: Problem Formulation
Given Item i and User u:
• Item Latent Factor: $q_i \in \mathbb{R}^k$
• User Latent Factor: $p_u \in \mathbb{R}^k$
• Rating: $r_{iu}$
• Rating Approximation: $\hat{r}_{iu} = q_i^T p_u$
$$\arg\min_{q^*, p^*} \sum_{(u,i) \in S} \left( r_{iu} - q_i^T p_u \right)^2 + \lambda \left( \|q_i\|^2 + \|p_u\|^2 \right)$$
Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix Factorization Techniques for Recommender Systems. Computer, 42(8), 42–49.
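The objective above can be minimized in several ways; the slides do not say which solver was used. As a minimal sketch, assuming plain stochastic gradient descent over the observed set S (one of the approaches described by Koren et al.), with a hypothetical toy rating set and hand-picked values for k, the regularization weight and the learning rate:

```python
import numpy as np

def factorize(ratings, n_items, n_users, k=2, lam=0.05, lr=0.02, epochs=500, seed=0):
    """ratings: list of (item, user, rating) triples, i.e. the observed set S."""
    rng = np.random.default_rng(seed)
    Q = 0.1 * rng.standard_normal((n_items, k))  # item factors q_i
    P = 0.1 * rng.standard_normal((n_users, k))  # user factors p_u
    for _ in range(epochs):
        for i, u, r in ratings:
            q_i, p_u = Q[i].copy(), P[u].copy()
            err = r - q_i @ p_u                   # r_iu - q_i^T p_u
            Q[i] += lr * (err * p_u - lam * q_i)  # gradient step on q_i
            P[u] += lr * (err * q_i - lam * p_u)  # gradient step on p_u
    return Q, P

def objective(ratings, Q, P, lam=0.05):
    """Regularized squared error summed over the observed ratings."""
    return sum((r - Q[i] @ P[u]) ** 2 + lam * (Q[i] @ Q[i] + P[u] @ P[u])
               for i, u, r in ratings)

# Hypothetical toy data: 3 items, 3 users, 6 observed ratings.
toy_ratings = [(0, 0, 1.0), (0, 1, 1.0), (1, 0, 1.0),
               (1, 2, -1.0), (2, 1, -1.0), (2, 2, 1.0)]
Q0, P0 = factorize(toy_ratings, 3, 3, epochs=0)  # random initialization only
Q, P = factorize(toy_ratings, 3, 3)              # same seed, after training
loss_before = objective(toy_ratings, Q0, P0)
loss_after = objective(toy_ratings, Q, P)
```

Only observed (item, user) pairs contribute to the updates, which is why an item with no ratings never receives a meaningful factor.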
Collaborative Filtering: Example
Rank         Artist                         Title
Query Track  The Beatles                    While My Guitar Gently Weeps
Ranked 1     The Beatles                    A Day In The Life
Ranked 2     The Beatles                    A Day In The Life (Love Version)
Ranked 3     The Beatles                    Across The Universe
Ranked 35    George Harrison                While My Guitar Gently Weeps (Live)
Ranked 82    George Harrison                My Sweet Lord (Live)
Ranked 91    Paul McCartney & Eric Clapton  Something (Live)
Ranked 158   Led Zeppelin                   Tangerine
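The slides do not state how the ranked list is produced. One plausible reading, sketched here as an assumption, is that tracks are ordered by cosine similarity between the query's item factor and every other item factor. The factor values below are hypothetical:

```python
import numpy as np

def rank_by_cosine(query, item_factors):
    """Return item indices ordered from most to least similar to the query."""
    q = query / np.linalg.norm(query)
    items = item_factors / np.linalg.norm(item_factors, axis=1, keepdims=True)
    similarity = items @ q           # cosine similarity per item
    return np.argsort(-similarity)   # descending order

# Hypothetical 2-dimensional item factors for four tracks.
item_factors = np.array([[1.0, 0.0],
                         [0.9, 0.1],
                         [0.0, 1.0],
                         [-1.0, 0.0]])
ranking = rank_by_cosine(item_factors[0], item_factors)
```

The query itself always ranks first under this scheme, so a real system would drop it before presenting results.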
Collaborative Filtering: The Good and the Bad
Good:
• Rich preference-driven similarity space
• Powerful at matching the right song with the right listener
Bad:
• Latent space is generally not interpretable
• Can only recommend items that have already been rated
Estimate Collaborative Factors
[Figure: the Users × Items (Tracks) matrix and its rank-k factorization, with the k-dimensional item factors as the estimation target.]
Approximate Item Factors using Audio
van den Oord, A., Dieleman, S., & Schrauwen, B. (2013). Deep Content-based Music Recommendation. Advances in Neural Information Processing Systems, 2643–2651.
Approximating Factors using Audio: With Deep Learning
[Figure: network diagram. Input: 108 × N. 1D convolutional layers (sizes shown: 1024, 2048, 2048, 4096, 4096). Time-global pooling (mean, max, L2, var), giving 4096×4. Dense layers (4096, 4096). Output of size k.]
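The time-global pooling stage turns a variable-length sequence of convolutional features into a fixed-size vector, which is why the diagram shows 4096×4 regardless of N. A minimal sketch, assuming "L2" means the L2 norm over time and "var" the variance over time (the slide gives only the labels):

```python
import numpy as np

def global_temporal_pooling(features):
    """features: array of shape (channels, time). Returns shape (4 * channels,)."""
    mean = features.mean(axis=1)
    maximum = features.max(axis=1)
    l2 = np.sqrt((features ** 2).sum(axis=1))  # assumed meaning of "L2"
    var = features.var(axis=1)
    return np.concatenate([mean, maximum, l2, var])

rng = np.random.default_rng(0)
short_clip = rng.standard_normal((4096, 50))   # 4096 channels, 50 time steps
long_clip = rng.standard_normal((4096, 200))   # same channels, longer input
pooled_short = global_temporal_pooling(short_clip)
pooled_long = global_temporal_pooling(long_clip)
```

Both inputs produce a vector of length 16384 (4096×4), so the dense layers that follow see the same input size for patches and for full tracks.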
Approximating Item Factors using Audio: Training Data
• (Small) Dataset:
  • 83k tracks
  • 3 patches of 35 seconds per track (251k patches = M)
  • (Patches only for training!)
• Splits:
  • Train: 80%
  • Validation: 10%
  • Test: 10%
Dataset of input/target pairs: $\{\mathcal{X}, \mathcal{Y}\}$
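The 80/10/10 split can be sketched as follows. This assumes the split is made at track level, so that all patches of a track fall on the same side; the slides do not confirm this, and the function name and seed are illustrative:

```python
import numpy as np

def split_tracks(n_tracks, train=0.8, val=0.1, seed=0):
    """Shuffle track ids and cut them into train / validation / test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_tracks)
    n_train = round(train * n_tracks)
    n_val = round(val * n_tracks)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train_ids, val_ids, test_ids = split_tracks(83_000)
```

Patches would then be cut only from the tracks in `train_ids`, in line with the note that patches are used for training only.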
Approximating Item Factors: Training
• Loss function:
  • Cosine Distance
• Optimization:
  • Adam (default params)
  • 50% Dropout on Dense Layers
  • Early Stopping
  • Mini-batches of 64 examples
$$L(\theta) = 1 - \frac{1}{M} \sum_{X \in \mathcal{X},\, y \in \mathcal{Y}} \frac{f(X; \theta)^T y}{\|f(X; \theta)\|_2 \, \|y\|_2}$$
[Figure: training curve, cosine distance vs. mini-batches.]
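The cosine distance loss can be written in a few lines. This is a sketch in NumPy rather than the framework used for the talk (which is not named); `eps` is an added guard against division by zero and is not part of the slide's formula:

```python
import numpy as np

def cosine_distance_loss(predictions, targets, eps=1e-12):
    """predictions, targets: arrays of shape (M, k). Returns the mean cosine distance."""
    dots = np.sum(predictions * targets, axis=1)
    norms = np.linalg.norm(predictions, axis=1) * np.linalg.norm(targets, axis=1)
    return 1.0 - np.mean(dots / np.maximum(norms, eps))

y = np.array([[1.0, 0.0], [0.0, 2.0]])
loss_same = cosine_distance_loss(3.0 * y, y)   # same direction, different scale
loss_opposite = cosine_distance_loss(-y, y)    # opposite direction
loss_orthogonal = cosine_distance_loss(np.array([[0.0, 1.0], [1.0, 0.0]]), y)
```

The loss ignores the magnitude of the predicted factors and depends only on their direction: it is 0 for aligned vectors, 1 for orthogonal ones, and 2 for opposite ones.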
Approximating Item Factors using Audio: Results
Input                Cos Distance  # Epochs  Time/Epoch
Audio (35s Patches)  0.25          22        ~2h
Audio (Full Tracks)  0.21          -         -
The Music Genome Project™
• >1.5 million tracks manually analyzed
• ~400 attributes per track
• Attribute examples: Breathy Voice, Nasal Voice, Odd Meter, Has Banjo, Joyful Lyrics, …
Recommending Music using the MGP™: Example
Rank         Artist       Title
Query Track  The Beatles  While My Guitar Gently Weeps
Ranked 1     IV Thieves   The Sound And The Fury
Ranked 2     Journey      Too Late
Ranked 3     Albert Lee   Look Out Cleveland
Approximate Factors using the MGP™
Approximate Factors using the MGP™: Deep Architecture
[Figure: network diagram. Input of size N. Dense layers (4096, 4096). Output of size k.]
Approximating Item Factors using the MGP™: Training Data
• (Small) Dataset:
  • 83k tracks (M)
• Splits:
  • Train: 80%
  • Validation: 10%
  • Test: 10%
Dataset of input/target pairs: $\{\mathcal{X}, \mathcal{Y}\}$
Approximating Item Factors using the MGP™: Training
• Loss function:
  • Cosine Distance
• Optimization:
  • Adam (default params)
  • 50% Dropout on Dense Layers
  • Early Stopping
  • Mini-batches of 256 examples
$$L(\theta) = 1 - \frac{1}{M} \sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} \frac{f(x; \theta)^T y}{\|f(x; \theta)\|_2 \, \|y\|_2}$$
[Figure: training curve, cosine distance vs. mini-batches.]
Approximating Item Factors: Results
Input                Cos Distance  # Epochs  Time/Epoch
Audio (35s Patches)  0.25          22        ~2h
Audio (Full Tracks)  0.21          -         -
MGP                  0.15          37        7s
Beyond the MGP™
Approximate MGP with Machine Listening: Machine Listening Genes (MLG)
(Coming soon: MGP™ Estimation with Waveforms!)
Approximate Factors using MLG: Deep Architecture
[Figure: network diagram. Input of size N. Dense layers (4096, 4096). Output of size k.]
Approximating Item Factors: Results
Input                Cos Distance  # Epochs  Time/Epoch
Audio (35s Patches)  0.25          22        ~2h
Audio (Full Tracks)  0.21          -         -
MGP                  0.15          37        7s
MLG                  0.22          37        7s
Combine Methods to Approximate Factors
[Figure: the audio network (1D convolutional layers, time-global pooling, dense layers) and the dense network shown side by side, each with an output of size k.]
Combine Methods to Approximate Factors: Late-Fusion Deep Architecture
[Figure: network diagram. Input: 4096×2. Dense layers (4096, 4096). Output of size k.]
Approximating Item Factors: Results
Input                Cos Distance  # Epochs  Time/Epoch
Audio (35s Patches)  0.25          22        ~2h
Audio (Full Tracks)  0.21          -         -
MGP                  0.15          37        7s
MLG                  0.22          37        7s
Audio+MLG            0.19          37        7s
Further Multimodality to Approximate Factors
Further Multimodality to Approximate Factors: Late-Fusion Deep Architecture
[Figure: network diagram. Input: 4096×2 + 31 (the 31 extra dimensions are a one-hot vector encoding). Dense layers (4096, 4096). Output of size k.]
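Late fusion here amounts to concatenating the per-modality vectors and passing the result through dense layers. The sketch below assumes the 31-dimensional one-hot vector encodes genre (suggested by the "Audio+MLG+genres" result row), uses ReLU activations (not stated on the slides), and shrinks the layer sizes so it runs quickly; the slide's sizes would be 4096 per embedding and per hidden layer. All names are illustrative:

```python
import numpy as np

N_GENRES = 31  # size of the one-hot vector on the slide

def init_params(emb_dim, hidden, k, seed=0):
    """Random weights for two hidden dense layers and a linear output layer."""
    rng = np.random.default_rng(seed)
    in_dim = 2 * emb_dim + N_GENRES
    shapes = [(hidden, in_dim), (hidden, hidden), (k, hidden)]
    return [(0.01 * rng.standard_normal(s), np.zeros(s[0])) for s in shapes]

def late_fusion(audio_emb, mlg_emb, genre_index, params):
    """Concatenate both embeddings with a one-hot genre vector, then apply dense layers."""
    genre = np.zeros(N_GENRES)
    genre[genre_index] = 1.0
    x = np.concatenate([audio_emb, mlg_emb, genre])
    (W1, b1), (W2, b2), (W3, b3) = params
    h = np.maximum(0.0, W1 @ x + b1)  # ReLU is an assumption
    h = np.maximum(0.0, W2 @ h + b2)
    return W3 @ h + b3                # estimated k-dimensional item factor

emb_dim, hidden, k = 8, 16, 4         # toy sizes for illustration
params = init_params(emb_dim, hidden, k)
rng = np.random.default_rng(1)
factors = late_fusion(rng.standard_normal(emb_dim),
                      rng.standard_normal(emb_dim), 5, params)
```

Because the modalities are fused after each has been reduced to a fixed-size vector, a new input such as the genre vector only widens the first dense layer.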
Approximating Item Factors: Results
Input                Cos Distance  # Epochs  Time/Epoch
Audio (35s Patches)  0.25          22        ~2h
Audio (Full Tracks)  0.21          -         -
MGP                  0.15          37        7s
MLG                  0.22          37        7s
Audio+MLG            0.19          37        7s
Audio+MLG+genres     0.16          37        7s
More Data Is Alright
• LARGE Dataset:
  • ~900k most popular tracks
  • 3 patches of 35 seconds per track (~2.7M patches = M)
Input  Trained on  Test Set  Cos Distance
Audio  SMALL       SMALL     0.21
Audio  LARGE       LARGE     0.37
Audio  SMALL       LARGE     0.64
Audio  LARGE       SMALL     0.21
Recommendation Examples
Rank         Artist              Title
Query Track  The Beatles         While My Guitar Gently Weeps
Ranked 1     Bob Dylan           Knockin’ On Heaven’s Door
Ranked 2     Neil Young          Heart Of Gold
Ranked 3     The Rolling Stones  Angie
Recommendation Examples
Rank         Artist      Title
Query Track  Sargon      Continuarà
Ranked 1     Mudvayne    Happy?
Ranked 2     Mudvayne    Forget To Remember
Ranked 3     Stone Sour  Hell & Consequences
Long Tail Context
[Figure: long-tail plot over 100% of tracks, starting from the most popular tracks; 35% of tracks had 0 spins last week. Marked position: 20 spins last week.]
Recommendation Examples
Rank         Artist                                           Title
Query Track  La Bossa d’Urina                                 El Tiempo
Ranked 1     Il Divo                                          Hallelujah
Ranked 2     Sarah Brightman & The London Symphony Orchestra  Time To Say Goodbye
Ranked 3     Andrea Bocelli                                   Amapola
Long Tail Context
[Figure: long-tail plot over 100% of tracks, starting from the most popular tracks; 35% of tracks had 0 spins last week. Marked position: 0 spins last week.]
ENSEMBLE OF RECOMMENDERS MAY PRODUCE OPTIMAL RECOMMENDATIONS
MAN vs MACHINE?
MAN + MACHINE: “Mix of Art and Science”
Oramas, S., Nieto, O., Sordo, M., Serra, X., A Deep Multimodal Approach for Cold-start Music Recommendation. Deep Learning for Recommender Systems Workshop, RecSys, Como, Italy 2017
Oramas, S., Nieto, O., Barbieri, F., Serra, X., Multi-label Music Genre Classification From Audio, Text, and Images Using Deep Features. Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR). Suzhou, China, 2017
THANKS! [email protected]