Metric Learning for Music Discovery with Source and Target Playlists
Ying-Shu Kuo
August 12, 2015
Proposed Idea
No Name Artist
1 Song_A Artist_A
2 Song_B Artist_B
3 Song_C Artist_A
4 Song_D Artist_C
5 Song_E Artist_B
6 Song_F Artist_D
7 Song_G Artist_E
8 Song_H Artist_E
9 Song_I Artist_F
[Interface mockup: tabs Playlist / Your Set / Target Set; Search box; selected song Song_A by Artist_A; Parameter control]
[Figure legend: Song, Your Set, Target Set, Others, Chosen Playlist, Similarity]
※ The x and y axes have no meaning
Use Case
• Explore an unfamiliar music genre (e.g. from Jazz to Metal)
• Get to know your friend’s jam (e.g. from your favs to her favs)
What I need for this
1. Songs to play with => Million Song Dataset / Spotify API
2. Music similarity => EchoNest Audio Features
3. Cluster song sets => Metric Learning
4. 2-D Visualization => Dimension Reduction
5. Playlist Generation
Million Song Dataset
• Criteria for a good dataset
• Why use MSD?
Bertin-Mahieux, Thierry, et al. "The million song dataset." ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24-28, 2011, Miami, Florida. University of Miami, 2011. http://audiocontentanalysis.org/data-sets
Dataset RWC CAL500 GTZAN MusiCLEF MSD
size 465 502 1,000 200,000 1,000,000
has audio Y Y Y Y N*
has metadata Y Y Y (update) ? Y
* Part of it has 7digital audio previews. All of the songs have content-based features.
EchoNest Feature
• Metadata: artist name / song title / album name / year / duration
• Low-level: segment time / loudness / pitch / timbre
• Time: tempo / time signature / section time / bar time …
http://developer.echonest.com/docs/v4/_static/AnalyzeDocumentation.pdf
EchoNest Feature
• Codebook-based
[Figure labels: mean; mean + stddev; mean + stddev]
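The "mean" and "stddev" labels suggest that segment-level features are collapsed into one fixed-length vector per song. A minimal sketch of that idea, assuming 12-dimensional per-segment timbre vectors; the function name `summarize_segments` and the random input are illustrative, not taken from the slides:

```python
import numpy as np

def summarize_segments(segments: np.ndarray) -> np.ndarray:
    """Collapse a (n_segments, n_dims) matrix into one song-level vector
    by concatenating the per-dimension mean and standard deviation."""
    mean = segments.mean(axis=0)
    std = segments.std(axis=0)
    return np.concatenate([mean, std])

# Hypothetical input: 4 segments, each with a 12-dimensional timbre vector.
rng = np.random.default_rng(0)
timbre = rng.normal(size=(4, 12))

song_vector = summarize_segments(timbre)
print(song_vector.shape)  # (24,)
```

Every song ends up with the same vector length regardless of how many segments it has, which is what a distance-based method needs.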
Metric Learning
• Metric: defines how the distance between data points is measured
http://en.wikipedia.org/wiki/File:Manhattan_distance.svg
Bellet, Aurélien, Amaury Habrard, and Marc Sebban. "A survey on metric learning for feature vectors and structured data." arXiv preprint arXiv:1306.6709 (2013).
Metric Learning
• Mahalanobis Distance
http://stats.stackexchange.com/questions/62092/bottom-to-top-explanation-of-the-mahalanobis-distance
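As a small illustration, the generalized Mahalanobis distance used in metric learning is d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M; choosing M as the inverse covariance matrix gives the classical form. The sketch below uses a made-up matrix `L` (an assumption for the example) to show that M = L^T L is the same as the Euclidean distance after mapping x to Lx:

```python
import numpy as np

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ M @ diff))

# Illustrative linear map that stretches the first feature by a factor of 2.
L = np.array([[2.0, 0.0],
              [0.0, 1.0]])
M = L.T @ L

x = np.array([1.0, 1.0])
y = np.array([0.0, 0.0])

d_identity = mahalanobis(x, y, np.eye(2))    # plain Euclidean distance
d_learned = mahalanobis(x, y, M)             # distance under M
d_mapped = np.linalg.norm(L @ x - L @ y)     # Euclidean distance after mapping
```

Here `d_learned` and `d_mapped` agree, which is why learning M can be viewed as learning a linear reshaping of the feature space.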
Metric Learning
• Metric Learning: learning a distance function from data
Bellet, Aurélien, Amaury Habrard, and Marc Sebban. "A survey on metric learning for feature vectors and structured data." arXiv preprint arXiv:1306.6709 (2013).
Metric Learning
• Why do I need to reshape the feature space?
[Figure panels: original / metric learned]
Metric Learning – LMNN (Large Margin Nearest Neighbor)
Weinberger, Kilian Q., John Blitzer, and Lawrence K. Saul. "Distance metric learning for large margin nearest neighbor classification." Advances in neural information processing systems. 2005.
NOT the unlabeled one!!!
Metric Learning – GB-LMNN (Gradient-Boosted Large Margin Nearest Neighbor)
Kedem, Dor, Zhixiang Eddie Xu, and Kilian Q. Weinberger. "Gradient Boosted Large Margin Nearest Neighbors."
• Kernel trick, non-linear
• Gradient Boosted Regression Tree
Metric Learning – Evaluation
• Does starting / ending songs cluster?
• Davies–Bouldin Index
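For reference, the Davies–Bouldin index averages, over clusters, the worst-case ratio of within-cluster scatter to between-centroid distance; lower values mean tighter, better-separated clusters. A minimal NumPy sketch with toy "source" and "target" points (the data and the function name `davies_bouldin` are illustrative assumptions):

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index: mean over clusters of the worst
    (scatter_i + scatter_j) / centroid_distance_ij ratio."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in ids])
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    worst = []
    for i in range(len(ids)):
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(len(ids)) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))

# Toy example: two well-separated sets of two songs each.
source = np.array([[0.0, 0.0], [0.0, 2.0]])
target = np.array([[10.0, 0.0], [10.0, 2.0]])
X = np.vstack([source, target])
labels = np.array([0, 0, 1, 1])

score = davies_bouldin(X, labels)
print(score)
```

Moving the target set closer to the source set raises the score, so the index can be compared before and after applying a learned metric.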
Metric Learning – Evaluation
10 vs 10 ø LMNN GB-LMNN OASIS
average 9.46 10.85 5.62 12.49
max – 16.43 15.66 13.25
min – 8.89 0.61 11.99
Dimension Reduction
• High dimension to low dimension based on constraints
• Preserve the distances between data points
• 2-D visualization
Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.
Dimension Reduction – t-SNE
http://commons.wikimedia.org/wiki/File:T_distribution_1df_enhanced.svg
Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.
• Pairwise distance
• Effective neighbors = local
• Gaussian vs t-distribution
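To make the "Gaussian vs t-distribution" point concrete: t-SNE turns pairwise distances into similarities with a Gaussian kernel in the original space and a Student-t kernel with one degree of freedom in the 2-D map. The sketch below compares the unnormalized kernels only; the fixed `sigma` is an assumption for illustration (t-SNE chooses a bandwidth per point from the perplexity and then normalizes the similarities into probabilities):

```python
import numpy as np

def gaussian_kernel(d, sigma=1.0):
    """Unnormalized Gaussian similarity, used in the high-dimensional space."""
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def student_t_kernel(d):
    """Unnormalized Student-t similarity (1 degree of freedom), used in the map."""
    return 1.0 / (1.0 + d ** 2)

distances = np.array([0.0, 1.0, 3.0])
g = gaussian_kernel(distances)
t = student_t_kernel(distances)

for d, gi, ti in zip(distances, g, t):
    print(f"d={d:.1f}  gaussian={gi:.4f}  student_t={ti:.4f}")
```

The Student-t kernel decays much more slowly at large distances, which lets moderately dissimilar points sit far apart in the map without being penalized heavily.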
Playlist Generation
• Create a list of songs that satisfies some assumptions, rules, or constraints.
Playlist Generation – Related Work
aspect | Zheleva et al. [1] | McFee et al. [2] | Chen et al. [3] | mine
assumption / constraint | matching user taste and song taste | natural language | natural language | 2 clusters, smooth
input (dataset) | triplet (user, song, t) | tag 0/1; content-based | playlists | content-based
approach | topic model (LDA) | Markov chain ensemble | Markov chain | nearest neighbors
evaluation | entropy-based | log likelihood | log likelihood | ?
[1] Zheleva, Elena, et al. "Statistical models of music-listening sessions in social media." Proceedings of the 19th international conference on World wide web. ACM, 2010.
[2] McFee, Brian, and Gert RG Lanckriet. "The Natural Language of Playlists." ISMIR. 2011.
[3] Chen, Shuo, et al. "Playlist prediction via metric embedding." Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.
Playlist Generation – Related Work
aspect | Flexer [4] | Van Gulik [5] | Lamere [6] | mine
assumption / constraint | specifying start and end | high-level control of playlist | boil the frog | 2 clusters, smooth
input (dataset) | content-based | songs with metadata | songs with artist info | content-based
approach | divergence ratio | visualization, path drawing | artist similarity | nearest neighbors
evaluation | same genre | – | – | ?
[4] Flexer, Arthur, et al. "Playlist Generation using Start and End Songs." ISMIR. 2008.
[5] Van Gulik, Rob, and Fabio Vignoli. "Visual Playlist Generation on the Artist Map." ISMIR. Vol. 5. 2005.
[6] http://static.echonest.com/frog/
Playlist Generation – Method
• number of songs
• threshold
http://www.pstcc.edu/departments/natural_behavioral_sciences/Web%20Physics/E2020D0103.gif
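The slides name only the parameters, not the procedure, so the following is one possible reading rather than the author's method: place evenly spaced waypoints between the starting song and the centroid of the target set in the learned space, and pick the nearest unused song at each waypoint. The function `build_playlist`, its arguments, and the toy embeddings are all hypothetical, and the "threshold" parameter is not modeled here.

```python
import numpy as np

def build_playlist(embeddings, start_idx, target_indices, n_songs):
    """Greedy nearest-neighbor walk from a start song toward a target set.

    Assumption (not from the slides): waypoints lie on the straight line
    between the start song and the centroid of the target songs.
    """
    emb = np.asarray(embeddings, dtype=float)
    if not 2 <= n_songs <= len(emb):
        raise ValueError("n_songs must be between 2 and the number of songs")
    start = emb[start_idx]
    goal = emb[list(target_indices)].mean(axis=0)
    playlist = [start_idx]
    used = {start_idx}
    for step in range(1, n_songs):
        waypoint = start + (goal - start) * step / (n_songs - 1)
        dists = np.linalg.norm(emb - waypoint, axis=1)
        dists[list(used)] = np.inf      # never repeat a song
        nxt = int(np.argmin(dists))
        playlist.append(nxt)
        used.add(nxt)
    return playlist

# Toy 2-D embeddings: five songs on a line plus one outlier.
songs = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
                  [3.0, 0.0], [4.0, 0.0], [5.0, 5.0]])
playlist = build_playlist(songs, start_idx=0, target_indices=[4], n_songs=5)
print(playlist)  # [0, 1, 2, 3, 4]
```

The number of songs controls how fine the steps are: more songs means smaller jumps between consecutive tracks, which matches the "smooth" constraint listed for this approach.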
Playlist Generation – Result
• demo
Future Work and Discussion
• Discussion
  • feature representation
  • path finding
• Future Work
  • Implementation on Spotify API
  • User Study
Thank you!
Questions / Comments?