ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing
Techniques for Large-Scale Entity Linking, by Gianluca Demartini,
Djellel Eddine Difallah, and Philippe Cudré-Mauroux, eXascale Infolab,
U. of Fribourg, Switzerland, {firstname.lastname}@unifr.ch
Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do, by
Djellel Eddine Difallah, Gianluca Demartini, and Philippe
Cudré-Mauroux, eXascale Infolab, U. of Fribourg, Switzerland
Presented by: Muhammad Nuruddin, student ID: 2961230, email address:
[email protected], Internet Technologies and Information
Systems (ITIS), M.Sc. 4th semester, Leibniz Universität Hannover
Course details: Advanced Methods of Information Retrieval, by Dr.
Elena Demidova, Leibniz Universität Hannover
Presentation on the papers
Slide 2
Entity Linking
Entity linking algorithm (probabilistic reasoning based).
Entity linking: a suggested way to automate the construction of the Semantic Web.
Slide 3
Example: Wikipedia provides annotated pages, with entity mentions such as Military, Germany, Pacific Ocean, Historical Incidence, and France linked to their own pages.
Slide 4
Crowdsourcing: obtaining services, ideas, or content by soliciting
contributions from a large group of people, especially from an
online community. Examples:
- Wikipedia = wiki + encyclopedia = quick + encyclopedia
- IMDb top movie chart
- AMT (Amazon Mechanical Turk)
Slide 5
Paper 1: ZenCrowd: Leveraging Probabilistic Reasoning and
Crowdsourcing Techniques for Large-Scale Entity Linking
Combines an entity linking algorithm (probabilistic reasoning based)
with crowdsourcing, yielding an improvement between 4% and 35%.
Slide 6
Current techniques of Entity Linking
Entity linking is known to be extremely challenging, since parsing
and disambiguating natural language text is still very difficult for
machines. The current matching techniques:
Algorithmic matching: mostly based on probabilistic reasoning (e.g.,
TF-IDF based). Not as reliable as manual human matching.
Manual matching: fully reliable, but costly and time consuming. For
example, the New York Times (NYT) employs a whole team whose sole
responsibility is to manually create links from news articles to NYT
identifiers.
This paper represents a step towards bridging the gap between those
two classes.
Slide 7
System Architecture
The results of algorithmic matching are stored in a probabilistic
network. The Decision Engine decides:
1. If a result has a very high probability value, it is directly linked to the entity.
2. If a result has a very low confidence value, it is discarded and ignored.
3. Promising but uncertain results are passed to the Micro-Task Manager to crowdsource the problem and reach a decision (see the sketch below).
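A minimal sketch of this three-way rule in Python, assuming hypothetical thresholds tau_high and tau_low (the slide does not give the exact confidence bounds ZenCrowd uses):

def decide(candidate_links, tau_high=0.95, tau_low=0.05):
    """Split algorithmic matches into accepted, rejected, and crowdsourced links.

    candidate_links: list of (link, probability) pairs from the algorithmic matchers.
    tau_high / tau_low are hypothetical thresholds, not the paper's values.
    """
    accepted, rejected, to_crowd = [], [], []
    for link, p in candidate_links:
        if p >= tau_high:          # very high probability: link directly
            accepted.append(link)
        elif p <= tau_low:         # very low confidence: discard
            rejected.append(link)
        else:                      # promising but uncertain: send to the crowd
            to_crowd.append(link)
    return accepted, rejected, to_crowd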
Slide 8
System Architecture
After getting votes from the crowdsourcing platform, all the
information gathered from both the algorithmic matchers and the crowd
is fed into a scalable probabilistic store and used by the decision
engine to process all entities accordingly. Let's have a look at the
decision engine's mechanism for reaching a decision.
Slide 9
Example scenario
Sentence from an HTML page (doc. 1): "After the UNC workshop, Jordan gave a tutorial on nonparametric Bayesian methods."
Candidate entities for "Jordan" in the LOD cloud: the country Jordan, the Jordan River, and a Berkeley professor.
The candidate links l1, l2, l3 each have a prior probability pl_j computed from the algorithmic matches.
Workers w1 and w2 judge the links via clicks c11 ... c23; each worker has a reliability factor p(w) modelling whether the worker is good or bad.
Slide 10
Decision Engine uses a Factor Graph
A factor graph can deal with a complicated global problem by viewing
it as a factorization of several local functions.
l1, l2, l3: the three candidate entities for a link.
pl_j(): prior probability of l_j computed from the algorithmic matches.
w1, w2: the two workers employed to check the relevance of l1, l2, l3.
pw1(), pw2(): the reliability priors of workers w1 and w2.
lf_i(): linking factor, connects l_i to its related clicks (e.g., c11) and workers (e.g., w1).
sa_1-2(): same-as factor, used when the entities have a SameAs link in the LOD cloud.
Slide 11
Equations used for linking-factor calculation in the factor graph.
Slide 12
Reaching a Decision
A posterior probability is computed for all the links by running
probabilistic inference on the network. Links with a posterior
probability > 0.5 are considered to be correct.
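As an illustration of the > 0.5 decision rule, here is a simplified Bayesian stand-in for the factor-graph inference, assuming fixed worker reliabilities and ignoring the SameAs factors (so it is not the paper's exact model):

def posterior(prior, votes, threshold=0.5):
    """Illustrative stand-in for the factor-graph inference.

    prior: pl_j, the probability of link l_j from the algorithmic matchers.
    votes: list of (vote, reliability) pairs; vote is True when the worker
           clicked 'correct' for this link, reliability is that worker's p(w).
    Returns (posterior probability, decision under the 0.5 rule).
    """
    p_true, p_false = prior, 1.0 - prior
    for vote, r in votes:
        p_true *= r if vote else (1.0 - r)
        p_false *= (1.0 - r) if vote else r
    post = p_true / (p_true + p_false)
    return post, post > threshold

# Example: prior 0.4, two workers with reliability 0.8 both vote 'correct'
print(posterior(0.4, [(True, 0.8), (True, 0.8)]))  # posterior ~0.914 -> accepted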
Slide 13
Updating Priors
As entity-linking decisions are reached, the workers' profiles get
updated: from the results, each worker's accuracy can be calculated,
giving the reliability factors of w1 and w2.
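A minimal sketch of such an update, assuming the reliability prior p(w) is simply re-estimated as the fraction of a worker's votes that agreed with the final decisions (the paper's exact update may differ):

def update_reliability(history):
    """Recompute a worker's reliability prior p(w) from resolved links.

    history: list of booleans, True when the worker's vote agreed with the
    final decision reached by the decision engine.
    """
    if not history:
        return 0.5            # uninformative prior for a new worker
    return sum(history) / len(history)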
Slide 14
EXPERIMENTS
Experimental setup:
The collection consists of 25 English news articles from CNN.com,
NYTimes.com, washingtonpost.com, timesofindia.indiatimes.com, and
swissinfo.com.
489 entities were extracted using the Stanford parser.
Crowdsourcing was performed using Amazon Mechanical Turk, with 80
distinct workers.
Precision, recall, and accuracy were measured.
Slide 15
Comparison of three matching techniques
Slide 16
Observations
A hybrid model (based on both automated matching and manual human
experts) for entity linking.
4% to 35% improvement over a manually optimized agreement-voting
approach.
On average, a 14% improvement over the best automated system.
In both cases the improvement is statistically significant (t-test,
p < 0.05).
Manual work makes the total annotation process significantly slower,
so there are some questions about the time-quality tradeoff.
They classified workers into {Good, Bad} manually and calculated each
worker's reliability P(w), but did not mention any relation between
these two factors.
Slide 17
End of the presentation of the first paper
Slide 18
Paper 2: Pick-A-Crowd: Tell Me What You Like, and I'll Tell You
What to Do
This paper is about a different crowdsourcing approach based on a
push methodology. This push methodology yields better results (a 29%
relative improvement) than the usual pull strategies.
Figure: traditional crowdsourcing pull strategy, where any worker can
pull any task.
Slide 19
Example of the traditional approach: any worker can pull any task [1].
So what's wrong with this? It does not take a worker's field of
expertise into account, yet not all workers are a good fit for all
tasks; this matters especially for tasks requiring background
knowledge. "I had no idea what to answer to most questions..." was a
comment of a worker from AMT (Amazon Mechanical Turk).
[1] https://requester.mturk.com/images/graphic_process.png?1403199990
Slide 20
So how are they going to improve it? The system ranks/orders the
workers according to the type of work and the workers' skills, and
pushes the work to the most suitable workers. First, a user model is
constructed for each worker in the crowd in order to assign HITs
(Human Intelligence Tasks) to the most suitable available workers.
The user model/profile is built from the worker's social network
usage and fields of interest.
Slide 21
So how does this system rank/order the workers? With a recommender
system: assigning HITs to workers is similar to the task performed by
recommender systems. The recommender system matches HITs (Human
Intelligence Tasks) to the profiles of human workers (i.e., users)
that describe their interests and skills. The system then generates a
ranking of the candidate workers who can best perform the work.
Slide 22
System Overview
Slide 23
Workflow of the system
Calculate work difficulty: every piece of work is different from the
others; the HIT Difficulty Assessor takes each HIT and determines a
complexity score for it.
Assess worker skill: the system creates a worker profile from the
worker's liked pages and previous work experience.
Calculate the reward for the work: as every piece of work is
different and every worker's ability differs from task to task, the
rewards for different works and workers differ; the system calculates
rewards considering these factors.
Assign the work to the top-k suitable candidates: the recommender
system finds the k most suitable candidates and assigns (pushes) the
work only to these k workers (a minimal sketch of this step follows
below).
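A minimal sketch of the final push step, assuming a hypothetical similarity() function standing in for whichever assignment model is used:

def assign_hit(hit, workers, similarity, k=5):
    """Push a HIT to the top-k most suitable workers.

    hit: the task description (e.g., its text or linked entities).
    workers: dict mapping worker id -> profile (liked pages / entities).
    similarity: hypothetical profile-to-task similarity function (text,
                LOD entities, or categories), sketched on later slides.
    """
    ranked = sorted(workers, key=lambda w: similarity(workers[w], hit), reverse=True)
    return ranked[:k]          # only these k workers are offered the HIT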
Slide 24
Calculating work difficulty: 3 different possible algorithms
1. Text comparison: compare the textual description of the task with
the skill description of each worker and assess the difficulty.
2. LOD (Linked Open Data) entity based: each Facebook page liked by
the workers can be linked to its respective LOD entities. Then the
set of entities related to the HIT and the set of entities
representing the interests of the crowd can be directly compared.
The task is classified as difficult when the entities involved in
the task differ heavily from the entities liked by the crowd (see the
sketch after this list).
3. Machine learning based: a classifier is trained on previously
completed tasks, their descriptions, and their result accuracy. The
description of a new task is given as a test vector to the classifier.
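A minimal sketch of option 2, assuming the difficulty is taken as one minus the Jaccard overlap between the HIT's entities and the crowd's liked entities (the overlap measure is an assumption; the paper only says the two sets are compared):

def lod_difficulty(task_entities, crowd_entities):
    """LOD-entity-based difficulty estimate (option 2 above), as a sketch.

    task_entities: set of LOD entities related to the HIT.
    crowd_entities: set of LOD entities linked from the crowd's liked pages.
    """
    if not task_entities:
        return 0.0
    overlap = len(task_entities & crowd_entities) / len(task_entities | crowd_entities)
    return 1.0 - overlap       # 1.0 = no overlap with crowd interests = difficult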
Slide 25
4 possible ways of reward estimation. Input: a monetary budget B and
a HIT h_i.
1. Rewarding the same amount of money for each task of the same type.
2. Taking into account the difficulty d() of the HIT h (sketched
below).
3. Computing a reward based on both the specific HIT and the skill of
the worker who performs it.
4. Game-theoretic approaches to compute the optimal reward for paid
crowdsourcing incentives in the presence of workers who collude in
order to game the system.
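A minimal sketch of option 2, assuming the budget B is simply split proportionally to d(h_i) (proportional allocation is an illustrative assumption, not necessarily the paper's scheme):

def difficulty_based_rewards(budget, difficulties):
    """Split a monetary budget B across HITs proportionally to their
    difficulty d(h_i).

    difficulties: dict mapping HIT id -> difficulty score d(h_i) > 0.
    """
    total = sum(difficulties.values())
    return {h: budget * d / total for h, d in difficulties.items()}

# Example: a budget of 10 over three HITs of increasing difficulty
print(difficulty_based_rewards(10.0, {"h1": 0.2, "h2": 0.3, "h3": 0.5}))
# {'h1': 2.0, 'h2': 3.0, 'h3': 5.0}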
Slide 26
Worker Profile Selector
This module uses the similarity measure that is used for matching
workers to tasks. The entities included in the workers' profiles can
be considered, and the Facebook categories of their liked pages also
play a significant role. A generic similarity-measurement equation
uses:
A = set of candidate answers for task h_i
sim() = similarity between the worker profile and the task
description (a sketch follows below).
3 assignment models for HIT (Human Intelligence Task) assignment.
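The equation itself did not survive extraction; one plausible reading, sketched here as an assumption, is to score a worker by summing sim() over the candidate answers A of task h_i:

def worker_score(profile, candidate_answers, sim):
    """Assumed aggregation: sum the similarity between the worker profile
    and each candidate answer a in A of task h_i. The slide's exact
    formula is not available, so this is only an illustration.
    """
    return sum(sim(profile, a) for a in candidate_answers)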
Slide 27
HIT ASSIGNMENT MODELS
Category-based Assignment Model: tasks are assigned according to
Facebook pages or page categories (e.g., Entertainment -> Movie); the
requester specifies the category of the task.
Expert Profiling Assignment Model: the scoring function is based on a
voting model; the voting model considers the number of pages related
to the task, the number of pages the worker liked, and how many are
common.
Semantic-Based Assignment Model: answers and liked pages are linked
to entities, and the underlying graph structure is used to measure
the distance (similarity); example: SPARQL queries over the entity
graph.
Slide 28
Example: Expert Finding Voting Model
Figure: an example of the Expert Finding Voting Model. The final
ranking identifies worker A as the top worker, as he likes the most
pages related to the query.
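A minimal sketch of the voting model shown in the figure, with hypothetical page names; each liked page related to the task counts as one vote for its worker:

from collections import Counter

def voting_model_ranking(liked_pages, relevant_pages):
    """Rank workers by how many of their liked pages are related to the task.

    liked_pages: dict mapping worker id -> set of liked Facebook pages.
    relevant_pages: set of pages related to the task/query.
    """
    votes = Counter({w: len(pages & relevant_pages) for w, pages in liked_pages.items()})
    return votes.most_common()     # worker with the most related liked pages first

# Example mirroring the figure: worker A likes more task-related pages than B
print(voting_model_ranking(
    {"A": {"Messi", "FC Barcelona", "Champions League"}, "B": {"Messi", "Cooking"}},
    {"Messi", "FC Barcelona", "Champions League", "Real Madrid"},
))  # [('A', 3), ('B', 1)]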
Slide 29
Summary of the system: instead of any worker pulling any task, the
HIT Assigner assigns (pushes) tasks to suitable workers.
Slide 30
Experimental Evaluation
Experimental setting: 170 workers were recruited via Amazon
Mechanical Turk; overall, the workers have more than 12K distinct
liked Facebook pages.
Task categories: actors, soccer players, anime characters, movie
actors, movie scenes, music bands, and questions related to cricket;
50 images per category.
Precision and recall were measured over majority votes obtained from
3 or 5 workers.
Slide 31
Figure: crowd performance on the cricket task. Square points indicate
the 5 workers selected by the proposed system; the best worker
performs at 0.9 precision and 0.9 recall.
Slide 32
Figure: OpenTurk worker accuracy vs. the number of relevant pages a
worker likes. Observation: the more relevant pages in the worker
profile (e.g., >30), the higher the accuracy.
Slide 33
Table: average accuracy for different HIT assignment models,
assigning each HIT to 3 and 5 workers.
AMT: Amazon Mechanical Turk.
Category-based: comparison based on the category of the liked pages
and the category of the task.
En. type 3/5: entity types in the DBpedia knowledge base, assigning
each HIT to 3 and 5 workers.
Voting Model t_i: voting model based on page text relevant to the task.
Voting Model A_i: voting model based on similarity to all possible
answers.
1-step: considers directly related entities within one step in the
graph.
Results are based on 320 questions. Voting Model t_i achieves a 29%
relative improvement over the best accuracy obtained by the AMT model.
Slide 34
Observations
The push approach may lead to longer task completion times, so
real-time annotation is not possible. But in most cases, obtaining
high-quality answers is more important than getting real-time results.