1
CONCLUSION & FUTURE WORK Given a new user with an information gathering task consisting of document IDs and respective term vectors, this can be compared in the semantic space with the inferred task clusters of the community. In order to find the task clusters closely associated with the new task, the task similarity objective function is used with calculating the average task similarity per each cluster and finding the Top N clusters with the highest average task similarity. Acknowledgements : This research is supported by NSF grant 0938074 CNS: A Task-based Hybrid Collaborative Filtering Recommender Service Sampath Jayarathna and Frank Shipman Computer Science & Engineering, Texas A&M University – College Station ABSTRACT we propose a task-based hybrid collaborative filtering (CF) recommender service to support user’s information gathering tasks. Even with the best Web Search Engines (WSEs), and the most effective query formulations, information gathering tasks require people to work through long list of documents to determine potentially relevant documents. We propose and implement SVD based LSI to find similarity in information gathering tasks to find task clusters and use these clusters for recommendations. Our method alleviates major challenges in a traditional CF such as data scalability, synonymy and privacy protection (using interest profiles via web service). The use of hybrid CF technique in CNS overcomes CF problems such as sparsity and gray sheep. Figure 1. Community Navigation Service Overview People in their everyday life rely on recommendations from others in the community for things like reference letters, news, movies, songs, and travel so forth. Collaborative Filtering uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for the other users. The aim of this proposed work is to investigate this similar behavior in performing everyday information gathering tasks by taking into consideration similar tasks of community of Interest Profile Manager Recommender Application Organizing Application Interest Profile Reading Application CNS Service CNS Client COMMUNITY NAVIGATION During information task, useful documents may be long, and cover multiple subtopics; users may read some segments and ignore others. In order to record which portion(s) of the document pique the user’s interests, an explicit interest expressions (e.g. annotations using WebAnnotate) capturing tool is used . The derived User Interest Profile consists of a unique task ID, documents IDs (URLs) and the respective term vectors (user annotations).The IPM acts as a local interest profile server where it collects information from Reading Application (annotation enabled browser) and this information is aggregated and saved in the user’s interest profile and broadcasts it to the participating applications. Figure 2. CNS Web Service Architecture Each atomic task in the interest profile consists of set of documents related to the information gathering task and their respective term vectors. Our goal is to create a N*N task similarity matrix given a set of N tasks to be clustered. The SVD based Latent Semantic Indexing (LSI) can take a large matrix of term document association data and construct a semantic space where terms and documents that are closely associated can be detected with Cosine Similarity. where, i=1,2…..m, j=1,2…..n Client-side Axis2 support Client class User Interest Profile Top N Recommendations CNS Client (Recommender Application) Service Class Input Interest Profile Output Recommendations CNS Service Axis2 Engine (via Tomcat) SOAP Response via HTTP SOAP Request via HTTP REFERENCES TASK-BASED RECOMMENDATIONS Our major contributions in this work can be summarized as: (1)A novel task-based recommender service based on traditional hybrid CF techniques; (2)We adapted the LSI in task similarity context which is the recommendations based on user’s behavior in doing a particular information gathering task; (3)We proposed an innovative task clusters in which we represented the task data in a k- dimensional semantic space in order to make recommendations based on task similarity objective function. In our future works, we are planning to incorporate CNS with the IPM’s functionality necessary to characterize user interests, including the ability to collect, parse, and determine similarity among common forms of Web documents. 1. Tolomei, G., Orlando, S. and Silvestri, F., Towards a task-based search and recommender systems. in In Proceedings of ICDE Workshops, (2010), 333-336. 2. Su, X. and Khoshgoftaar, T.M. A survey of collaborative filtering techniques. Adv. in Artif. Intell., 2009. 2-2. 3. Bae, S., Hsieh, H., Kim, D., Marshall, C.C., Meintanis, K., Moore, J.M., Zacchi, A. and Shipman, F.M. Supporting document triage via annotation- based visualizations. American Society for Information Science and Technology, 45 (1). 1-16. 4. Landauer, T., Foltz, P.W. and Laham, D. An Introduction to Latent Semantic Analysis. Discourse Processes 25. 259-284. 5. Jurgens, D. and Stevens, K. The S-Space package: an open source package for word space models Proceedings of the ACL 2010 System Demonstrations, Association for Computational Linguistics, Uppsala, Sweden, 2010, 30-35. n m j i j i ) , dSim( = ) , tSim(

CONCLUSION & FUTURE WORK Given a new user with an information gathering task consisting of document IDs and respective term vectors, this can be compared

Embed Size (px)

Citation preview

Page 1: CONCLUSION & FUTURE WORK Given a new user with an information gathering task consisting of document IDs and respective term vectors, this can be compared

CONCLUSION & FUTURE WORK

Given a new user with an information gathering task consisting of document IDs and respective term vectors, this can be compared in the semantic space with the inferred task clusters of the community.

In order to find the task clusters closely associated with the new task, the task similarity objective function is used with calculating the average task similarity per each cluster and finding the Top N clusters with the highest average task similarity.

Acknowledgements : This research is supported by NSF grant 0938074Acknowledgements : This research is supported by NSF grant 0938074

CNS: A Task-based Hybrid Collaborative Filtering Recommender Service

Sampath Jayarathna and Frank ShipmanComputer Science & Engineering, Texas A&M University – College Station

CNS: A Task-based Hybrid Collaborative Filtering Recommender Service

Sampath Jayarathna and Frank ShipmanComputer Science & Engineering, Texas A&M University – College Station

ABSTRACT

we propose a task-based hybrid collaborative filtering (CF) recommender service to support user’s information gathering tasks. Even with the best Web Search Engines (WSEs), and the most effective query formulations, information gathering tasks require people to work through long list of documents to determine potentially relevant documents.

We propose and implement SVD based LSI to find similarity in information gathering tasks to find task clusters and use these clusters for recommendations.

Our method alleviates major challenges in a traditional CF such as data scalability, synonymy and privacy protection (using interest profiles via web service). The use of hybrid CF technique in CNS overcomes CF problems such as sparsity and gray sheep.

Figure 1. Community Navigation Service Overview

People in their everyday life rely on recommendations from others in the community for things like reference letters, news, movies, songs, and travel so forth.

Collaborative Filtering uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for the other users.

The aim of this proposed work is to investigate this similar behavior in performing everyday information gathering tasks by taking into consideration similar tasks of community of recommenders.

InterestProfile

Manager

RecommenderApplication

OrganizingApplication

Interest Profile

ReadingApplication

CNSService

CNS Client

COMMUNITY NAVIGATION

During information task, useful documents may be long, and cover multiple subtopics; users may read some segments and ignore others. In order to record which portion(s) of the document pique the user’s interests, an explicit interest expressions (e.g. annotations using WebAnnotate) capturing tool is used .

The derived User Interest Profile consists of a unique task ID, documents IDs (URLs) and the respective term vectors (user annotations).The IPM acts as a local interest profile server where it collects information from Reading Application (annotation enabled browser) and this information is aggregated and saved in the user’s interest profile and broadcasts it to the participating applications.

Figure 2. CNS Web Service Architecture

Each atomic task in the interest profile consists of set of documents related to the information gathering task and their respective term vectors. Our goal is to create a N*N task similarity matrix given a set of N tasks to be clustered.

The SVD based Latent Semantic Indexing (LSI) can take a large matrix of term document association data and construct a semantic space where terms and documents that are closely associated can be detected with Cosine Similarity.

where, i=1,2…..m, j=1,2…..n

Client-side Axis2support

Client class

User Interest Profile

Top NRecommendations

CNS Client (Recommender Application)

Service Class

Input Interest Profile

OutputRecommendations

CNS Service

Axis2 Engine(via Tomcat)

SOA

P R

espo

nse

via

HT

TP

SOA

P R

eque

st v

ia H

TT

P

REFERENCES

TASK-BASED RECOMMENDATIONS

Our major contributions in this work can be summarized as:

(1)A novel task-based recommender service based on traditional hybrid CF techniques;

(2)We adapted the LSI in task similarity context which is the recommendations based on user’s behavior in doing a particular information gathering task;

(3)We proposed an innovative task clusters in which we represented the task data in a k-dimensional semantic space in order to make recommendations based on task similarity objective function.

In our future works, we are planning to incorporate CNS with the IPM’s functionality necessary to characterize user interests, including the ability to collect, parse, and determine similarity among common forms of Web documents.

1. Tolomei, G., Orlando, S. and Silvestri, F., Towards a task-based search and recommender systems. in In Proceedings of ICDE Workshops, (2010), 333-336.

2. Su, X. and Khoshgoftaar, T.M. A survey of collaborative filtering techniques. Adv. in Artif. Intell., 2009. 2-2.

3. Bae, S., Hsieh, H., Kim, D., Marshall, C.C., Meintanis, K., Moore, J.M., Zacchi, A. and Shipman, F.M. Supporting document triage via annotation-based visualizations. American Society for Information Science and Technology, 45 (1). 1-16.

4. Landauer, T., Foltz, P.W. and Laham, D. An Introduction to Latent Semantic Analysis. Discourse Processes 25. 259-284.

5. Jurgens, D. and Stevens, K. The S-Space package: an open source package for word space models Proceedings of the ACL 2010 System Demonstrations, Association for Computational Linguistics, Uppsala, Sweden, 2010, 30-35.

nm

jij

i

),dSim(

=),tSim(