Tweetool (0. 1 100 version) Final Report

A Twitter Recommend System based on Topic Modeling

Yilei Qian
Computer Science
University of Southern California
qianyilei.usc@gmail.com


Ideas

• Users follow too many accounts on Twitter
• Too much news every day
• Cannot find interesting and valuable news
• Users do not know the names of the accounts they want to follow
• Need someone to recommend who to follow
• Need someone to recommend the hottest news
• Use topic modeling to re-rank all users


Traditional Method


Topic Modeling


• A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents.
• Widely used in natural language processing.

Reference Papers:

Steyvers, M. and Griffiths, T., “Probabilistic Topic Models,” Handbook of Latent Semantic Analysis.
Blei, D. M., Ng, A. Y., and Jordan, M. I., “Latent Dirichlet Allocation,” Journal of Machine Learning Research, 2003.


Label based LDA

Steps:

1. Build the LDA model
2. Train a model instance on the training documents
3. Run LDA over all the data using the trained model instance
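The three steps can be sketched in a few lines of Python. This is only an illustration, not the Tweetool implementation: it assumes that "label based LDA" means a Labeled LDA variant in which each training document may only use the topics it is labeled with. The function names `train_llda` and `infer_topics`, the hyperparameters, and the toy documents are all hypothetical. Training uses collapsed Gibbs sampling; inference uses a simple iterative fold-in estimate with the trained word distributions held fixed, which is a simplification.

```python
import random
from collections import defaultdict


def train_llda(docs, doc_labels, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling where a document may only use its own labels.

    Returns phi: for every label, a smoothed distribution over the vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for words in docs for w in words})
    labels = sorted({lab for labs in doc_labels for lab in labs})
    word_count = {lab: defaultdict(int) for lab in labels}  # n(label, word)
    label_total = defaultdict(int)                           # n(label)
    doc_count = [defaultdict(int) for _ in docs]             # n(doc, label)

    def update(d, w, lab, delta):
        word_count[lab][w] += delta
        label_total[lab] += delta
        doc_count[d][lab] += delta

    # Random initial assignment, restricted to each document's labels.
    assign = []
    for d, words in enumerate(docs):
        assign.append([rng.choice(doc_labels[d]) for _ in words])
        for w, lab in zip(words, assign[d]):
            update(d, w, lab, +1)

    smoothing = beta * len(vocab)
    for _ in range(iters):
        for d, words in enumerate(docs):
            for i, w in enumerate(words):
                update(d, w, assign[d][i], -1)  # remove the current assignment
                weights = [
                    (doc_count[d][lab] + alpha)
                    * (word_count[lab][w] + beta)
                    / (label_total[lab] + smoothing)
                    for lab in doc_labels[d]
                ]
                assign[d][i] = rng.choices(doc_labels[d], weights=weights)[0]
                update(d, w, assign[d][i], +1)

    return {
        lab: {w: (word_count[lab][w] + beta) / (label_total[lab] + smoothing)
              for w in vocab}
        for lab in labels
    }


def infer_topics(phi, words, iters=50, alpha=0.1):
    """Estimate a label mixture for an unseen document, keeping phi fixed."""
    labels = sorted(phi)
    known = [w for w in words if w in phi[labels[0]]]  # skip unseen words
    theta = {lab: 1.0 / len(labels) for lab in labels}
    for _ in range(iters):
        expected = {lab: 0.0 for lab in labels}
        for w in known:
            scores = {lab: theta[lab] * phi[lab][w] for lab in labels}
            total = sum(scores.values())
            for lab in labels:
                expected[lab] += scores[lab] / total
        norm = len(known) + alpha * len(labels)
        theta = {lab: (expected[lab] + alpha) / norm for lab in labels}
    return theta


# Toy data (hypothetical): two single-label documents and one mixed document.
docs = [["goal", "team", "match"], ["song", "band", "album"], ["team", "song"]]
doc_labels = [["Sports"], ["Music"], ["Sports", "Music"]]
phi = train_llda(docs, doc_labels)
theta = infer_topics(phi, ["goal", "match", "unknownword"])
```

Here `theta` is the estimated topic mixture of the new document; with this toy data it leans toward "Sports" because "goal" and "match" only occur in the Sports-labeled training document.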

Problems:

1. Punctuation marks, e.g. “ ” , . = { } ( ) …
2. Frequent words, e.g. "I", "you", …
3. Other noise
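One common way to reduce these three kinds of noise is to clean each tweet before it reaches the topic model. The sketch below is an assumption about what such a step could look like, not the cleaning code used in Tweetool. The stop-word list is a tiny illustrative sample, "other noise" is taken to mean URLs, @mentions, and digits, and the name `clean_tweet` is hypothetical.

```python
import re

# Illustrative stop-word sample only; a real list would be much longer.
STOPWORDS = {"i", "you", "the", "a", "an", "and", "to", "of", "is", "it", "in", "rt"}

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
NON_LETTERS = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase a tweet, drop URLs, mentions, punctuation, and stop words."""
    text = URL_OR_MENTION.sub(" ", text.lower())
    text = NON_LETTERS.sub(" ", text)  # punctuation, digits, odd marks
    return [tok for tok in text.split() if tok not in STOPWORDS and len(tok) > 1]


tokens = clean_tweet('RT @bob: I love "Java" & you (see http://t.co/x) {LDA}=great!')
# tokens == ['love', 'java', 'see', 'lda', 'great']
```

Note that the letter-only filter also discards non-English text, so a multilingual corpus would need a different rule.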


Result Generation

1. By Angle: Value = [formula not preserved]
2. By Distance: Value = [formula not preserved]
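The two formulas did not survive in this copy of the slides. A natural reading, stated here as an assumption, is that "by angle" means the cosine of the angle between two topic vectors and "by distance" means the Euclidean distance between them. The sketch below ranks candidates under that reading; the helper names and the three-dimensional toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two topic vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between two topic vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_angle(target, candidates):
    """Names of candidates, most similar direction first."""
    return sorted(candidates,
                  key=lambda name: cosine_similarity(target, candidates[name]),
                  reverse=True)


def rank_by_distance(target, candidates):
    """Names of candidates, closest first."""
    return sorted(candidates,
                  key=lambda name: euclidean_distance(target, candidates[name]))


# Toy topic vectors (hypothetical); real vectors would have one entry per topic.
interest = [1.0, 0.0, 0.0]
users = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 1.0, 0.0]}
by_angle = rank_by_angle(interest, users)
by_distance = rank_by_distance(interest, users)
```

The two rankings can differ: the angle ignores vector length, so it compares only the proportions of topics, while the distance also reacts to how large the values are.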


13-Dimension Topics

1. Art & Design
2. Book
3. Business
4. Charity
5. Entertainment
6. Family
7. Fashion
8. Food & Drink
9. Health
10. Music
11. News
12. Science & Technology
13. Sports


Languages & Tools

• Web UI: HTML + AJAX (unfinished) + CSS (unfinished) + Twitter REST API
• Android UI: Java, Android 2.1 (unfinished)
• Server Side: Java 1.6, Servlet 2.0, Spring 3.0, Hibernate 3.3
• Twitter API: Twitter4j 2.2.1 (300 requests per hour)
• Server: Tomcat 7.08
• Database: MySQL 5.5
• Data Format: JSON
• Development Platform: Eclipse 3.4
• Total lines of code: 2000(+) + 2421 + 462 = 5000(+)
• Subversion: http://tweetool-yilei.googlecode.com/svn/trunk/tweetool-yilei-read-only
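The limit of 300 requests per hour constrains how fast a crawler can fetch data. As a rough illustration only, a client could track its own request times in a sliding window and wait when the budget is used up. The class below is a hypothetical sketch and is not part of Twitter4j or Tweetool; the clock is injectable so the logic can be checked without real waiting.

```python
import time
from collections import deque


class RequestBudget:
    """Sliding-window budget: at most `limit` requests per `window` seconds."""

    def __init__(self, limit=300, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()  # forget requests that left the window
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Call once for every request actually sent."""
        self.sent.append(self.clock())


# Small demonstration with a fake clock: 2 requests per 10 seconds.
fake_now = [0.0]
budget = RequestBudget(limit=2, window=10.0, clock=lambda: fake_now[0])
budget.record()
budget.record()
blocked_for = budget.wait_time()   # budget exhausted at t = 0
fake_now[0] = 4.0
still_blocked_for = budget.wait_time()
fake_now[0] = 10.0
free_again = budget.wait_time()
```

In a real crawler the caller would sleep for `wait_time()` seconds before each request and call `record()` after sending it.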


Architecture

[Architecture diagram; component labels: DB, Hibernate DAO, Twitterfetch, LLDA, Tweetool, Work Flow (three boxes), Servlets, ApplicationContext, HTML, Mobile Device]


Distributed Crawler & Computing


Problems (endless T_T)

1. High noise in the topic model
   • Few words, odd marks, abbreviations
2. Unfamiliarity with the Twitter API; a lot of bugs
3. Transaction problems
4. The ugly UI
5. Poor performance
6. Not enough time; many functions are unfinished
7. The Tweetool system should be reconstructed!

Environment: 7,000+ users, 220,000+ tweets


Future Work

1. Try to finish it
2. Debug
3. Build a better training file
4. Add a feedback function
5. Improve topic classification


Web UI (Design Version)


Android UI

[Android UI mockup: a Main Menu screen with a title and four function buttons, and a News Menu screen with a title and a list of news items]