A scalable collaborative filtering framework based on co-clustering

Authors: Thomas George and Srujana Merugu. Published in ICDM’05.

Presenter: Rei-Zhe Liu. Date: 2010/10/26.

Outline: Introduction; System architecture; Experiments and results; Conclusion

Introduction

We propose a dynamic collaborative filtering approach that can support the entry of new users, items and ratings using a hybrid of incremental and batch versions of the co-clustering algorithm.

Empirical comparison of our approach with SVD, NNMF and correlation-based collaborative filtering techniques indicates comparable accuracy at a much lower computational effort.

System architecture

Problem definition(1/2)

The approximate matrix for prediction is given

Problem definition(2/2)

We can now pose the prediction of unknown ratings as a co-clustering problem: we seek the user and item clustering that minimizes the approximation error with respect to the known ratings of A, where a binary weight W_ij (1 if the rating A_ij is known, 0 otherwise) ensures that only the known ratings contribute to the loss function.
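As a rough illustration of this objective, the sketch below evaluates a masked squared error for a given pair of clusterings. The approximation formula is not reproduced in these slides, so the sketch assumes the form commonly described for this approach: the co-cluster average plus the offsets of the user and the item from their own cluster averages. The squared-error form of the loss, the function names, and the symbols `W`, `rho` and `gamma` are assumptions made for illustration.

```python
import numpy as np

def masked_mean(total, count, fallback):
    # Average over known ratings only; use the fallback where a group has none.
    return np.where(count > 0, total / np.maximum(count, 1), fallback)

def summary_statistics(A, W, rho, gamma, k, l):
    """Masked averages for co-clusters, user/item clusters, users and items."""
    AW = A * W                                   # zero out unknown ratings
    global_mean = AW.sum() / max(W.sum(), 1)
    R = np.eye(k)[rho]                           # m x k one-hot user clusters
    C = np.eye(l)[gamma]                         # n x l one-hot item clusters
    coc = masked_mean(R.T @ AW @ C, R.T @ W @ C, global_mean)
    row_cl = masked_mean(R.T @ AW.sum(axis=1), R.T @ W.sum(axis=1), global_mean)
    col_cl = masked_mean(C.T @ AW.sum(axis=0), C.T @ W.sum(axis=0), global_mean)
    row = masked_mean(AW.sum(axis=1), W.sum(axis=1), global_mean)
    col = masked_mean(AW.sum(axis=0), W.sum(axis=0), global_mean)
    return coc, row_cl, col_cl, row, col

def predict(stats, rho, gamma):
    # Assumed form: co-cluster average + user offset + item offset.
    coc, row_cl, col_cl, row, col = stats
    return (coc[rho][:, gamma]
            + (row - row_cl[rho])[:, None]
            + (col - col_cl[gamma])[None, :])

def masked_loss(A, W, A_hat):
    # Only entries with W == 1 (known ratings) contribute to the loss.
    return float((W * (A - A_hat) ** 2).sum())

# Toy ratings with a clean block structure; every rating is treated as known.
A = np.array([[5.0, 5.0, 1.0, 1.0],
              [5.0, 5.0, 1.0, 1.0],
              [1.0, 1.0, 5.0, 5.0]])
W = np.ones_like(A)
rho = np.array([0, 0, 1])        # user -> user cluster
gamma = np.array([0, 0, 1, 1])   # item -> item cluster

stats = summary_statistics(A, W, rho, gamma, k=2, l=2)
A_hat = predict(stats, rho, gamma)
loss = masked_loss(A, W, A_hat)
```

A co-clustering routine would search over `rho` and `gamma` to reduce `loss`; the sketch only evaluates the objective for one fixed assignment.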

Algorithm(1/3)

Algorithm(2/3)

Algorithm(3/3)

System description

P1 handles prediction and incremental training.

P2 is responsible for static training.

During incremental training, P1 also updates the raw ratings.

P2 performs co-clustering repeatedly, reading A (the current ratings matrix) and updating S (the summary statistics) when each run is done.

Data objects A and S are each stored in two parts: (a) a stable part and (b) an increment part.

At the end of each co-clustering run, the two parts are merged to obtain a new set of stable values.
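One way to picture the two-part storage is a container that keeps sums and counts separately for the stable and increment parts. The sketch below is a simplified reading, not the system's actual data layout: the class name `TwoPartStat`, its methods, and the choice to merge by simple addition are all assumptions.

```python
import numpy as np

class TwoPartStat:
    """Hypothetical summary statistic split into a stable and an increment part."""

    def __init__(self, shape):
        self.stable_sum = np.zeros(shape)
        self.stable_cnt = np.zeros(shape)
        self.inc_sum = np.zeros(shape)
        self.inc_cnt = np.zeros(shape)

    def add(self, idx, rating):
        # Incremental training touches only the increment part.
        self.inc_sum[idx] += rating
        self.inc_cnt[idx] += 1

    def mean(self, idx, default=0.0):
        # Predictions read both parts together.
        cnt = self.stable_cnt[idx] + self.inc_cnt[idx]
        if cnt == 0:
            return default
        return float((self.stable_sum[idx] + self.inc_sum[idx]) / cnt)

    def merge(self):
        # After a co-clustering run: fold the increments into the stable part.
        self.stable_sum += self.inc_sum
        self.stable_cnt += self.inc_cnt
        self.inc_sum.fill(0)
        self.inc_cnt.fill(0)

# Two new ratings arrive for co-cluster (0, 1), then the parts are merged.
s = TwoPartStat((2, 2))
s.add((0, 1), 4)
s.add((0, 1), 2)
mean_before = s.mean((0, 1))
s.merge()
mean_after = s.mean((0, 1))
```

Keeping the increment part separate lets predictions use fresh ratings while a long-running batch job works from a consistent snapshot of the stable part.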

Experiments and results

Data sets and Exp. Settings(1/2)

Data set

MovieLens: a 943 × 1682 user-by-movie matrix with 100,000 ratings in total, each on a scale from 1 to 5.

Evaluation methodology

The prediction accuracy was measured using the mean absolute error (MAE), which is the average of the absolute values of the errors over all the predictions.

The static training time was estimated in terms of the CPU time taken for the core training routines (viz. co-clustering and SVD).

The prediction time was estimated by averaging over the response time taken for all the predictions.
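The MAE defined above can be computed in a few lines. This is a minimal sketch; the function name and the sample ratings are made up for illustration.

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    # Average of |actual - predicted| over all predictions.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Absolute errors are 1, 0 and 2, so the average is 1.0.
mae = mean_absolute_error([5, 3, 1], [4, 3, 3])
```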

Data sets and Exp. Settings(2/2)

For evaluating the prediction accuracy, we created ten 80-20% random train-test splits of the datasets and averaged the results over the various splits.

We considered two scenarios: (i) static testing, where the known ratings do not change, and (ii) dynamic testing, where the ratings are updated incrementally.
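The ten random 80-20% splits could be generated along the following lines. This is a sketch under assumptions: the slides do not say how the splits were drawn, so the per-rating shuffling, the fixed seed, and the function name `train_test_splits` are all illustrative choices.

```python
import numpy as np

def train_test_splits(n_ratings, n_splits=10, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) index arrays over the known ratings."""
    rng = np.random.default_rng(seed)
    n_test = int(round(n_ratings * test_fraction))
    for _ in range(n_splits):
        perm = rng.permutation(n_ratings)     # fresh shuffle for every split
        yield perm[n_test:], perm[:n_test]

# 100,000 ratings, as in the MovieLens data set described in the slides.
splits = list(train_test_splits(100_000))
train_idx, test_idx = splits[0]
```

Each reported figure would then be the average of the per-split results, for example the mean of ten MAE values.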

Algorithms

We compared the performance of our co-clustering-based approach with SVD [13], NNMF [10] and classic correlation-based collaborative filtering [12].

An incremental SVD-based approach [14] using a folding-in technique was also implemented in order to evaluate the prediction accuracy in dynamic scenarios with changing ratings.

Evaluation(1/3)

k = l = SVD rank = NNMF rank = 3

k = l = SVD rank = 3

Evaluation(2/3)

Dataset: Mov1

Dataset: MovieLens

CoC: (m+n+kl-k-l); NNMF, SVD: (m+n)(k+l)

Evaluation(3/3)

Dataset: MovieLens

Conclusion

In this paper, we presented a new dynamic collaborative filtering approach based on simultaneous clustering of users and items.

Empirical results indicate that our approach can provide high quality predictions at a much lower computational cost compared to traditional correlation and SVD-based approaches.
