CSE 450 – Web Mining Seminar
Professor Brian D. Davison
Fall 2005
A Presentation on
When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics
K. Bharat & G. A. Mihaila
WWW10 Conference, May 2001, Hong Kong
by Osama Ahmed Khan
10/06/2005
Problem
Query on Popular Topic Content Analysis
Solution
Most Authoritative Pages
Technical Terms
Expert Recommendation Non-affiliation
Hilltop Algorithm
1. Expert Lookup
   Detecting Host Affiliation
   Expert Selection
   Expert Indexing
2. Target Ranking
   Computing Expert Score
   Computing Target Score
Detecting Host Affiliation
Conditions (two hosts are affiliated if either holds):
Same first 3 octets of IP address
e.g. 127.0.0.1 and 127.0.0.15
Same rightmost non-generic token of hostname
e.g. www.ibm.com and www.ibm.co.mx
Union-Find Algorithm
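The two affiliation conditions and the Union-Find grouping can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `GENERIC_TOKENS` set, the function names, and the `hosts` input format (hostname mapped to IP address) are all assumptions made for the example. A real system would derive generic suffixes from how often they occur across many hosts.

```python
# Hypothetical set of generic hostname tokens (an assumption for this sketch).
GENERIC_TOKENS = {"com", "org", "net", "edu", "gov", "co", "uk", "mx"}


def ip_prefix(ip):
    """Return the first three octets of a dotted IPv4 address."""
    return ".".join(ip.split(".")[:3])


def rightmost_non_generic_token(hostname):
    """Scan hostname tokens right to left; return the first non-generic one."""
    for token in reversed(hostname.lower().split(".")):
        if token not in GENERIC_TOKENS:
            return token
    return hostname.lower()


class UnionFind:
    """Disjoint-set structure over hostnames."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def affiliation_groups(hosts):
    """hosts: dict hostname -> IP address. Returns a UnionFind over hostnames."""
    uf = UnionFind()
    first_seen = {}  # (kind, value) -> first host observed with that key
    for host, ip in hosts.items():
        uf.find(host)  # register the host even if it has no affiliates
        keys = (("ip", ip_prefix(ip)), ("tok", rightmost_non_generic_token(host)))
        for key in keys:
            if key in first_seen:
                uf.union(first_seen[key], host)
            else:
                first_seen[key] = host
    return uf
```

Two hosts are affiliated when `uf.find(a) == uf.find(b)`. Because union is transitive, hosts linked through a chain of shared IP prefixes or shared tokens end up in one group.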
Expert Selection
Retrieve all webpages with:
Out-degree > Threshold (k)
(e.g. k = 5)
Expert will have:
URLs pointing to k distinct non-affiliated hosts
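The selection rule can be sketched as a filter over a link graph. The input format (`pages` as a mapping from page URL to out-link URLs) and the `group_of` callable that returns an affiliation-group id for a hostname are hypothetical. Ignoring links to the page's own affiliation group is also an assumption of this sketch.

```python
from urllib.parse import urlsplit


def select_experts(pages, group_of, k=5):
    """Return URLs of pages that qualify as experts.

    pages: dict page URL -> list of out-link URLs (assumed input format).
    group_of: callable mapping a hostname to its affiliation-group id.
    """
    experts = []
    for url, out_links in pages.items():
        distinct_links = set(out_links)
        if len(distinct_links) <= k:  # out-degree must exceed the threshold k
            continue
        own_group = group_of(urlsplit(url).hostname)
        groups = {group_of(urlsplit(link).hostname) for link in distinct_links}
        groups.discard(own_group)  # assumption: self-affiliated links do not count
        if len(groups) >= k:  # k distinct non-affiliated hosts
            experts.append(url)
    return experts
```

In practice `group_of` would be backed by the affiliation groups computed in the previous step. Passing an identity function treats every hostname as its own group.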
Expert Indexing
Inverted Index Mapping Keywords to Experts
Key Phrases
Match Positions
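One way to picture the inverted index is a mapping from each keyword to the match positions where it occurs. The record layout below (expert URL, phrase id, phrase type, offset within the phrase) and the `experts` input format are assumptions chosen for illustration, not the layout used in the paper.

```python
from collections import defaultdict


def build_expert_index(experts):
    """Build keyword -> list of match positions.

    experts: dict expert URL -> list of (phrase_type, phrase_text, target_urls).
    Each match position is (expert URL, phrase id, phrase type, word offset).
    """
    index = defaultdict(list)
    for expert_url, key_phrases in experts.items():
        for phrase_id, (phrase_type, text, _targets) in enumerate(key_phrases):
            for offset, word in enumerate(text.lower().split()):
                index[word].append((expert_url, phrase_id, phrase_type, offset))
    return index
```

Looking up each query keyword in this index yields the candidate experts together with the key phrases that matched, which is the information the scoring step needs.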
Computing Expert Score
Condition: at least 1 URL with all query keywords
Expert Score: (S0, S1, S2)
Si = SUM{key phrases p with k-i query terms} LevelScore(p) * FullnessFactor(p,q)
(k = number of terms in query q)
Expert_Score = 2^32 * S0 + 2^16 * S1 + S2
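The score combination can be sketched as below, reading the weights as 2^32 and 2^16 so that S0 dominates S1, which in turn dominates S2. `level_score` and `fullness_factor` are passed in as caller-supplied functions because their definitions are not given on the slide. The key-phrase format and the choice to skip phrases that match no query term are assumptions of this sketch.

```python
def expert_score(key_phrases, query_terms, level_score, fullness_factor):
    """Combine S0, S1, S2 into a single expert score.

    key_phrases: list of (phrase_type, list_of_terms) for one expert.
    level_score(phrase_type) and fullness_factor(terms, query_terms) are
    caller-supplied (hypothetical signatures).
    """
    k = len(query_terms)
    query_set = set(query_terms)
    s = [0, 0, 0]
    for phrase_type, terms in key_phrases:
        matched = len(set(terms) & query_set)
        i = k - matched  # the phrase contains k - i query terms
        if matched > 0 and i <= 2:  # assumption: ignore phrases with no match
            s[i] += level_score(phrase_type) * fullness_factor(terms, query_terms)
    return 2**32 * s[0] + 2**16 * s[1] + s[2]
```

With these weights, an expert whose phrases contain all query terms outranks one that only matches a subset, provided the lower-order sums stay below 2^16.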
Computing Target Score
Condition: at least 2 non-affiliated experts
Target Score:
Edge_Score(E,T) = Expert_Score(E) * SUM{query keywords w} occ(w,T)
Target_Score = SUM{experts E} Edge_Score(E,T)
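The target-ranking step can be sketched as follows. The `edges` tuple format, the `group_of` callable, and the per-edge `occ` dictionary (keyword mapped to its occurrence count for that edge) are hypothetical. Keeping only the best edge per affiliation group is how this sketch enforces the non-affiliation condition, and should be read as an assumption rather than a statement of the exact method.

```python
def edge_score(expert_score_value, query_terms, occ):
    """Expert score times the summed keyword occurrences for one edge."""
    return expert_score_value * sum(occ.get(w, 0) for w in query_terms)


def target_scores(edges, group_of, query_terms):
    """Score every target pointed to by at least 2 non-affiliated experts.

    edges: list of (expert, expert_score, target, occ) tuples.
    group_of: callable mapping an expert to its affiliation-group id.
    """
    best = {}  # target -> {affiliation group -> best edge score}
    for expert, score, target, occ in edges:
        e = edge_score(score, query_terms, occ)
        groups = best.setdefault(target, {})
        g = group_of(expert)
        groups[g] = max(groups.get(g, 0), e)
    # Drop targets endorsed by fewer than 2 distinct affiliation groups.
    return {t: sum(g.values()) for t, g in best.items() if len(g) >= 2}
```

Sorting the returned dictionary by value in descending order gives the ranked result list.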
Evaluation
1. Locating Specific Popular Targets
Evaluation (Contd.)
2. Gathering Relevant Pages
Conclusion
Characteristics Popular Queries Expert Subset
Hilltop vs. PageRank Topic Distillation
Thank You