Analysis of Hidden Markov Model Method Implementation in Documents Topic Sentence Extraction for Information Retrieval Alfian Akbar Gozali 113060074

Presentasi MoodUs for Nokia Lumia Apps Olympiad


DESCRIPTION

MoodUs - WP7 Apps - Chatterbox+Twitter-API Apps for Windows Phone 7 combining your twitter data and sentiment analysis for interesting mood forecast and statistics.

Citation preview

Page 1:

Analysis of Hidden Markov Model Method Implementation in Documents Topic Sentence Extraction for Information Retrieval

Alfian Akbar Gozali, 113060074

Page 2:

Background

Growth in the number of Internet users

Growth in the number of web pages

Search engine development

Enormous number of indexing terms

Need more than just an ordinary ‘trash’ thrower!

Page 3:

Solution: Document Extraction!!

Hidden Markov Model (HMM)

Document Extraction

Page 4:

The goals are to analyze…

How the HMM works

Effects of the parameters in the HMM

Differences between ordinary indexing and compression indexing with HMM

Effects of document variation on this system

Page 5:

What is HMM?

An enhancement of the Markov chain

Predicts a sequence of patterns that cannot be observed directly

Consists of two state sequences: observed and hidden
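To make the "observed versus hidden" idea concrete, here is a minimal sketch of Viterbi decoding, the standard way to recover the most likely hidden sequence from an observed one. The two hidden states (`topic`, `other`), the observation symbols (`many_keywords`, `few_keywords`), and every probability are hypothetical values chosen for illustration; none of them come from the slides.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: each sentence is either a topic sentence or not.
states = ("topic", "other")
start_p = {"topic": 0.6, "other": 0.4}
trans_p = {
    "topic": {"topic": 0.3, "other": 0.7},
    "other": {"topic": 0.2, "other": 0.8},
}
emit_p = {
    "topic": {"many_keywords": 0.7, "few_keywords": 0.3},
    "other": {"many_keywords": 0.2, "few_keywords": 0.8},
}

observed = ["many_keywords", "few_keywords", "few_keywords"]
hidden = viterbi(observed, states, start_p, trans_p, emit_p)
print(hidden)
```

The observed sequence is what the system can measure for each sentence; the returned list is the hidden labelling it infers. Log-probabilities are used so that long documents do not underflow.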

Page 6:

HMM Elements
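The content of this slide was a figure. As a reminder, the usual textbook description of a discrete HMM is given below; the symbols are the conventional ones and are an assumption here, not necessarily the notation used on the slide.

```latex
\begin{aligned}
\lambda &= (A, B, \pi) \\
S &= \{s_1, \dots, s_N\} && \text{hidden states} \\
V &= \{v_1, \dots, v_M\} && \text{observation symbols} \\
a_{ij} &= P(q_{t+1} = s_j \mid q_t = s_i) && \text{transition probabilities, } A = [a_{ij}] \\
b_j(k) &= P(o_t = v_k \mid q_t = s_j) && \text{emission probabilities, } B = [b_j(k)] \\
\pi_i &= P(q_1 = s_i) && \text{initial state distribution}
\end{aligned}
```

Here $q_t$ is the hidden state and $o_t$ the observation at step $t$; each row of $A$ and $B$, and the vector $\pi$, sums to 1.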

Page 7:

Topic Sentence Extraction

Depends on a particular language

Doesn’t depend on a particular language

• Statistical approach
• HMM Hedge

Page 8:

ROUGE-2

Measures accuracy between human extraction and system extraction
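ROUGE-2 is commonly computed as bigram recall: the share of the human extract's word bigrams that also appear in the system extract. The sketch below assumes whitespace tokenization, lowercasing, and a single reference; the slides do not say which ROUGE implementation or settings were used, and the example sentences are made up.

```python
from collections import Counter


def bigrams(text):
    """Count the word bigrams of a text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2_recall(reference, candidate):
    """Fraction of reference bigrams that also occur in the candidate."""
    ref = bigrams(reference)
    cand = bigrams(candidate)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # a reference with fewer than two words has no bigrams
    # Clipped overlap: a bigram is matched at most as often as the candidate has it.
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total


human_extract = "the team won the final match"
system_extract = "the team won the match"
score = rouge_2_recall(human_extract, system_extract)
print(f"ROUGE-2 recall: {score:.2f}")
```

In this example three of the five reference bigrams are matched, so the score is 0.60.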

Page 9:

Overall Design

Page 10:

System Testing

{NAME} and {NUMERIC} tags

α parameter in decoding

Effect of extraction

Kinds of corpus
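The slides do not describe how the {NAME} and {NUMERIC} tags are assigned. The sketch below is one possible reading, in which tagging collapses many distinct surface forms into a single term so that fewer terms need to be indexed. The rules (digits become {NUMERIC}; a capitalized word that does not start the sentence becomes {NAME}) and the function name `tag_tokens` are assumptions for illustration only.

```python
import re

# Assumption: a numeric token is a run of digits with optional . or , groups.
NUMERIC_PATTERN = re.compile(r"\d+(?:[.,]\d+)*")


def tag_tokens(sentence):
    """Replace numbers and likely names with placeholder tags (heuristic)."""
    tagged = []
    for position, token in enumerate(sentence.split()):
        word = token.strip(".,;:!?\"'()")
        if not word:
            continue  # the token was punctuation only
        if NUMERIC_PATTERN.fullmatch(word):
            tagged.append("{NUMERIC}")
        elif position > 0 and word[0].isupper():
            # Assumption: capitalized and not sentence-initial means a name.
            tagged.append("{NAME}")
        else:
            tagged.append(word.lower())
    return tagged


sentences = [
    "The striker Messi scored 2 goals in 90 minutes.",
    "The striker Ronaldo scored 3 goals in 75 minutes.",
]
untagged_terms = {w.strip(".").lower() for s in sentences for w in s.split()}
tagged_terms = {t for s in sentences for t in tag_tokens(s)}
print(len(untagged_terms), len(tagged_terms))
```

With these two sentences the tagged vocabulary is smaller than the untagged one, because every name and every number maps to the same term.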

Page 11:

Result – Scenario 1 (tagging)

[Figure: four charts, each comparing Tagging and Untagging. pTrans: number of terms against number of documents. pEmiss: number of terms against number of documents. Extracting: time against number of documents. Evaluation: time against number of documents. The number of documents ranges from 0 to 200.]

Page 12:

Result – Scenario 2 (alpha)

[Figure: Average Accuracy. Average accuracy plotted against alpha; the value labels include 34.00%, 37.00%, 40.00%, and 40.34%.]

Page 13:

Result – Scenario 3 (extraction)

[Figure: two charts, Execution Time and Number of Terms, each comparing "with compression" and "without compression". Value labels: 71.95%, 56.98%.]

Page 14:

Result – Scenario 4 (corpus)

[Figure: Extraction Accuracy. Average, maximum, and minimum accuracy (axis from 0% to 80%) for three corpora: www.footballtribal.com, www.fifa.com, and www.nytimes.com.]

Page 15:

Conclusion

Tagging can reduce extraction time and the number of indexed terms

The optimum values of the alpha parameter are 0.2 and 0.3

Compression can reduce indexing time and the number of indexed terms

Variation among the corpora can influence system accuracy

Page 16:

That’s all…

Thank You…