Discussion Class 2: A Vector Space Model for Automated Indexing

Page 1: Discussion Class 2

Discussion Class 2

A Vector Space Model for Automated Indexing

Page 2: Discussion Class 2

Discussion Classes

Format:

Questions.

Ask a member of the class to answer.

Provide opportunity for others to comment.

When answering:

Stand up.

Give your name. Make sure that the TA hears it.

Speak clearly so that all the class can hear.

Suggestions:

Do not be shy about presenting partial answers.

Differing viewpoints are welcome.

Page 3: Discussion Class 2

Question 1: Reading a Research Paper

(a) Who are the authors of this paper? What is their background? Why did they write this paper?

(b) When was the paper written? Since then, what has changed about computing?

(c) Since the paper was published, what has changed about information retrieval?

Page 4: Discussion Class 2

Question 2: Reading a Research Paper

Page 5: Discussion Class 2

Question 3: Research Methodology

Define precision and recall.
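
As a starting point for discussion, the standard set-based definitions can be sketched in a few lines of Python. Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that are retrieved. The function name and the document ids below are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids of the documents the system returned
    relevant:  ids of the documents judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 5 relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7", "d8", "d9"]
p, r = precision_recall(retrieved, relevant)
print(p, r)  # precision = 2/4, recall = 2/5
```

Retrieving more documents can only keep recall the same or raise it, but it usually lowers precision, which is why the two measures are reported together.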

Page 6: Discussion Class 2

Question 4: Summary of the Paper

(a) What is the overall hypothesis that is examined in this paper?

(b) How does Section 2, Correlation between Indexing Performance and Space Density, relate to the hypothesis?

(c) How does Section 3, Correlation between Space Density and Indexing Performance, relate to the hypothesis?

(d) How does Section 4, The Discrimination Value Model, relate to the hypothesis?

Page 7: Discussion Class 2

Question 5: Document Space

Explain this diagram

Page 8: Discussion Class 2

Question 6: Weighting -- Term Frequency

The paper examines the effect of term weighting on the space density of index terms.

(a) Why is this of interest in information retrieval?

(b) What form of term frequency (tf) is used in this paper?

(c) How does this form of term frequency differ from the standard form discussed in class? Under what circumstances is this difference significant?

Page 9: Discussion Class 2

Question 7: Discrimination Value Model

Explain the following expression, which the authors use to measure the contribution of term k to the space density.

DV_k = Q_k - Q

What does this tell us about the discrimination value of term k?
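
To make the expression concrete, here is a minimal Python sketch under stated assumptions: Q is taken to be the space density, measured as the average cosine similarity between each document vector and the collection centroid, and Q_k is the same density recomputed with term k removed from every document. The function names and the toy collection are hypothetical, and the exact density formula used in the paper may differ in detail.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def space_density(docs):
    """Assumed density measure: mean similarity of documents to the centroid."""
    n = len(docs)
    centroid = [sum(column) / n for column in zip(*docs)]
    return sum(cosine(doc, centroid) for doc in docs) / n


def discrimination_value(docs, k):
    """DV_k = Q_k - Q, where Q_k is the density with term k removed."""
    q = space_density(docs)
    docs_without_k = [doc[:k] + doc[k + 1:] for doc in docs]
    q_k = space_density(docs_without_k)
    return q_k - q


# Hypothetical collection: rows are documents, columns are term weights.
# Term 0 occurs in every document; terms 1 to 3 each occur in one document.
docs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

dv_common = discrimination_value(docs, 0)  # negative: poor discriminator
dv_rare = discrimination_value(docs, 1)    # positive: good discriminator
print(dv_common, dv_rare)
```

In this toy collection, removing the term shared by all documents spreads the documents apart (Q_k < Q, so DV_k is negative), while removing a term that occurs in only one document pulls them closer together (Q_k > Q, so DV_k is positive). Under these assumptions, a positive DV_k marks term k as a good discriminator.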

Page 10: Discussion Class 2

Question 8: Discuss this graph

[Graph not available in this transcript.]