
Aggregating Multiple Dimensions for Computing Document Relevance


Page 1: Aggregating Multiple Dimensions for Computing Document Relevance

1

Aggregating Multiple Dimensions for Computing Document Relevance

Mauro Dragoni
Fondazione Bruno Kessler (FBK), Shape and Evolving Living Knowledge Unit (SHELL)

2nd KEYSTONE Summer School
Santiago de Compostela, July 21st 2016

Page 2: Aggregating Multiple Dimensions for Computing Document Relevance

2

How will we spend time today? Our Goal:

to understand how documents can be evaluated by adopting a multi-criteria framework

Presentation of the theoretical framework

Case Study 1

Representing documents through different layers

Case Study 2

Combining user profiles, queries, and document content for computing relevance

Case Study 3

Merge and explode Case Study 1 and Case Study 2…

Page 3: Aggregating Multiple Dimensions for Computing Document Relevance

3

Why is this topic interesting? Indexing documents and querying repositories is not only a matter of weighting terms.

At the end of this lesson you should be able to: consider a document from different perspectives; understand why YOU can be part of the document score; know how to treat different types of information content.

What might I expect from you? To see a paper on this topic published in the near future… To get new ideas, proposed by you…

Page 4: Aggregating Multiple Dimensions for Computing Document Relevance

4

Some Background

The main idea behind this topic is “multi-criteria decision making”.

What does it mean? Suppose we have an entity E and a set C of n criteria. We need to evaluate, for each criterion Ci, how much E satisfies Ci.

We have to aggregate all satisfaction degrees to evaluate E.

Some suggested papers:
Ronald R. Yager. Modeling prioritized multicriteria decision making. IEEE Trans. Systems, Man, and Cybernetics, Part B 34(6): 2396-2404 (2004)
Ronald R. Yager. Prioritized aggregation operators. Int. J. Approx. Reasoning 48(1): 263-274 (2008)
Célia da Costa Pereira, Mauro Dragoni, Gabriella Pasi. Multidimensional relevance: Prioritized aggregation in a personalized Information Retrieval setting. Inf. Process. Manage. 48(2): 340-357 (2012)
Francesco Corcoglioniti, Mauro Dragoni, Marco Rospocher, Alessio Palmero Aprosio. Knowledge Extraction for Information Retrieval. ESWC 2016: 317-333

Page 5: Aggregating Multiple Dimensions for Computing Document Relevance

5

Further Readings

Fuzzy Logic: Zadeh’s books and papers

Knowledge Extraction Semantic Web (ISWC conference series, KBS and JWS journals, …) Knowledge Management (KR, IJCAI, AAAI, …) Natural Language Processing (ACL, COLING, …)

User Modeling and Interaction UMAP proceedings HCI papers

Page 6: Aggregating Multiple Dimensions for Computing Document Relevance

6

Introductory Example

John is looking for a bicycle for his little son. John considers two criteria, “safety” and “inexpensiveness”, and he considers “safety” > “inexpensiveness”.

We may have two scenarios:
1. John is not able to find a “safe” bicycle that is also “cheap”.
2. John has a low budget. Thus, he has to find a trade-off between the two criteria.

[Diagram: entity E evaluated against criteria C1 and C2]

Page 7: Aggregating Multiple Dimensions for Computing Document Relevance

7

Problem Representation Components

the set C of the n considered criteria: C = {C1, …, Cn};

the collection D of entities (documents in the specific case of IR);

an aggregation function F computing the score F(C1(d),…, Cn(d)) of each document d contained in D;

a priority model P defined by… someone (user, system maintainer, etc.);

a weighting schema W.
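
As a rough sketch of how these components fit together (illustrative Python; the names and structure are my assumptions, not something prescribed by the talk):

```python
# Illustrative sketch of the components above; names are assumptions, not from the talk.
from dataclasses import dataclass
from typing import Callable, List

Criterion = Callable[[str], float]   # maps a document d to a satisfaction degree in [0, 1]
Aggregator = Callable[[List[float], List[float]], float]   # F over criteria scores and weights


@dataclass
class MultiCriteriaScorer:
    criteria: List[Criterion]        # C = {C1, ..., Cn}, listed in priority order (priority model P)
    weights: List[float]             # weighting schema W, one weight per criterion
    aggregate: Aggregator            # aggregation function F

    def score(self, document: str) -> float:
        satisfactions = [c(document) for c in self.criteria]
        return self.aggregate(satisfactions, self.weights)
```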

Page 8: Aggregating Multiple Dimensions for Computing Document Relevance

8

Weighting Schema – Expert-based choice Weights are arbitrarily chosen by an expert.

No rules for computing them.

For example: C1 λ1 = 0.7, C2 λ2 = 0.5, C3 λ3 = 0.6, C4 λ4 = 0.3

You need to justify the values you choose.

Page 9: Aggregating Multiple Dimensions for Computing Document Relevance

9

Weighting Schema – Priority-based choice

Weights are computed “automatically” based on the priority between criteria.

For each document d, the weight of the most important criterion C1 is set to 1.0 by definition.

The weights of the other criteria are computed as follows:
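
The formula itself is missing from the extracted slide; reconstructed from the worked example on slide 18, the priority-based weights appear to be

$$w_1 = 1.0, \qquad w_i = w_{i-1} \cdot C_{i-1}(d) \quad (i = 2, \dots, n),$$

where criteria are indexed in decreasing order of priority, so a poorly satisfied high-priority criterion shrinks the weights of everything below it.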

Page 10: Aggregating Multiple Dimensions for Computing Document Relevance

10

Weighting Schema – Considerations

A weighting schema can be decided a priori but…

We can learn a new weighting schema: from a learning-to-rank dataset, or from IR system usage.

The choice of the weighting schema, obviously, affects the effectiveness of your information retrieval system.

Where can we apply such a weighting schema?

Page 11: Aggregating Multiple Dimensions for Computing Document Relevance

11

Three (not exhaustive) Operators

As you can imagine… there are different ways for combining weights and criteria

Operator 1: “Scoring” weighted criteria scores are summed

Operator 2: “Min” or “And” among weighted criteria scores, minimum score is selected

Operator 3: “Max” or “Or” among weighted criteria scores, maximum score is selected

Page 12: Aggregating Multiple Dimensions for Computing Document Relevance

12

The “Scoring” Operator

The overall document score is computed by summing the weighted scores computed for all criteria.

The score computed on the most important criterion drives the overall document score.

Less important criteria help in refining the overall document score.
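
In symbols, consistently with the worked example on slide 19 (the w_i are the criterion weights):

$$F_{\text{scoring}}(d) = \sum_{i=1}^{n} w_i \cdot C_i(d)$$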

Page 13: Aggregating Multiple Dimensions for Computing Document Relevance

13

The “And” (or “Min”) Operator

The document score is strongly dependent on the degree of satisfaction of the least satisfied criterion

Very restrictive operator

Suggestion: consider criteria that are really relevant for a user!!!
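
Consistently with the numeric example on slide 19, where the weights appear as exponents:

$$F_{\text{and}}(d) = \min_{i=1,\dots,n} C_i(d)^{\,w_i}$$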

Page 14: Aggregating Multiple Dimensions for Computing Document Relevance

14

The “Or” (or “Max”) Operator Dangerous operator!

Recommendation: criteria with a satisfaction degree of zero do not have to be considered.

It is useful only when priority between criteria is not used: the weighting schema is manually defined, and the weights of less important criteria are not based on the values of the most important ones.
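
Again consistently with the numeric example on slide 19:

$$F_{\text{or}}(d) = \max_{i=1,\dots,n} C_i(d)^{\,w_i}$$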

Page 15: Aggregating Multiple Dimensions for Computing Document Relevance

15

Operators’ Properties

Boundary Conditions

Continuity

Monotonicity (just for Scoring)

Absorbing Element (“0”, for Scoring and Min operators)

Page 16: Aggregating Multiple Dimensions for Computing Document Relevance

16

The Operators in Action Assume we have a document D composed as follows:

[Diagram: document fields mapped to criteria — Title → C1, Abstract → C2, Introduction → C3, Content → C4]

Page 17: Aggregating Multiple Dimensions for Computing Document Relevance

17

The Operators in Action

Suppose we perform a query as follows: Q = {qt1, qt2, qt3}

Assume that, for each document field, you have normalized similarity values:
sim(Q, DTitle) = 0.5
sim(Q, DAbstract) = 0.4
sim(Q, DIntroduction) = 0.2
sim(Q, DContent) = 0.7

As you can imagine, by using different priorities and different aggregations, the document score will be different.

Page 18: Aggregating Multiple Dimensions for Computing Document Relevance

18

The Operators in Action

Criteria score: C1 = 0.5; C2 = 0.8; C3 = 0.2; C4 = 0.7

Priority schemas:
P1: C1 > C2 > C3 > C4
P2: C1 > C2 > C4 > C3

Weights:
for P1: w1 = 1.0; w2 = 1.0 * 0.5 = 0.5; w3 = 0.5 * 0.8 = 0.4; w4 = 0.4 * 0.2 = 0.08
for P2: w1 = 1.0; w2 = 1.0 * 0.5 = 0.5; w3 = 0.5 * 0.8 = 0.4; w4 = 0.4 * 0.7 = 0.28

Page 19: Aggregating Multiple Dimensions for Computing Document Relevance

19

The Operators in Action – Document score

“Scoring” operator:
• DP1 = (0.5 * 1.0) + (0.8 * 0.5) + (0.2 * 0.4) + (0.7 * 0.08) = 1.036
• DP2 = (0.5 * 1.0) + (0.8 * 0.5) + (0.7 * 0.4) + (0.2 * 0.28) = 1.236

“And” operator:
• DP1 = min(0.5^1.0, 0.8^0.5, 0.2^0.4, 0.7^0.08) = min(0.5, 0.89, 0.53, 0.97) = 0.5
• DP2 = min(0.5^1.0, 0.8^0.5, 0.7^0.4, 0.2^0.28) = min(0.5, 0.89, 0.87, 0.64) = 0.5

“Or” operator:
• DP1 = max(0.5^1.0, 0.8^0.5, 0.2^0.4, 0.7^0.08) = max(0.5, 0.89, 0.53, 0.97) = 0.97
• DP2 = max(0.5^1.0, 0.8^0.5, 0.7^0.4, 0.2^0.28) = max(0.5, 0.89, 0.87, 0.64) = 0.89
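
A minimal Python sketch (my own code, not the talk's implementation) that reproduces the numbers above:

```python
# Reproduces the slide's numbers; criteria scores are listed in priority order.
def priority_weights(scores):
    weights = [1.0]                       # most important criterion gets weight 1.0
    for s in scores[:-1]:
        weights.append(weights[-1] * s)   # w_i = w_{i-1} * C_{i-1}(d)
    return weights

def scoring(scores, weights):
    return sum(s * w for s, w in zip(scores, weights))

def and_min(scores, weights):
    return min(s ** w for s, w in zip(scores, weights))

def or_max(scores, weights):
    return max(s ** w for s, w in zip(scores, weights))

# P1: C1 > C2 > C3 > C4   |   P2: C1 > C2 > C4 > C3
for label, scores in (("P1", [0.5, 0.8, 0.2, 0.7]), ("P2", [0.5, 0.8, 0.7, 0.2])):
    w = priority_weights(scores)
    print(label, round(scoring(scores, w), 3),
          round(and_min(scores, w), 2), round(or_max(scores, w), 2))
# P1 1.036 0.5 0.97
# P2 1.236 0.5 0.89
```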

Page 20: Aggregating Multiple Dimensions for Computing Document Relevance

20

Any question so far?

Timeout…

Page 21: Aggregating Multiple Dimensions for Computing Document Relevance

21

Case Study 1 – The Scenario

Keyword search over a multi-layer representation of documents. Document and query structure:
Textual layer: natural language text
Metadata layers:
• Entity Linking
• Predicates
• Roles/Types
• Timing Information

Problems: How to compute the score for each layer? How to aggregate such scores? How to weight each layer?

Page 22: Aggregating Multiple Dimensions for Computing Document Relevance

22

Case Study 1 – The Scenario

Natural language content is enriched with four metadata/semantic layers:
URI Layer: links to entities detected in the text and mapped to DBpedia entities
TYPE Layer: conceptual classification of the named entities detected in the text and mapped to both the DBpedia and YAGO knowledge bases
TIME Layer: metadata related to the temporal mentions found in the text by a temporal expression recognizer (e.g. “the eighteenth century”, “2015-18-12”, etc.)
FRAME Layer: output of the application of semantic role labeling techniques. Generally, this output includes predicates and their arguments describing a specific role in the context of the predicate.
Example: “He has been influenced by Carl Gauss” [framebase:Subjective_influence; dbpedia:Carl_Friedrich_Gauss]

Page 23: Aggregating Multiple Dimensions for Computing Document Relevance

23

Case Study 1 – Example Text: “astronomers influenced by Gauss”

Layers:
URI Layer: “dbpedia:Carl_Friedrich_Gauss”
TYPE Layer: “yago:GermanMathematicians”, “yago:NumberTheorists”, “yago:FellowsOfTheRoyalSociety”
TIME Layer: “day:1777-04-30”, “day:1855-02-23”, “century:1700”
FRAME Layer: “Subjective_influence.v_Carl_Friedrich_Gauss”

Annotations provided by PIKES (https://pikes.fbk.eu)

Page 24: Aggregating Multiple Dimensions for Computing Document Relevance

24

Case Study 1 - Evaluation

331 documents, 35 queries
Jörg Waitelonis, Claudia Exeler, Harald Sack. Enabled Generalized Vector Space Model to Improve Document Retrieval. NLP-DBPEDIA@ISWC 2015: 33-44

Multi-value relevance (1 = irrelevant, 5 = relevant)

Diverse queries: from keyword-based search to queries requiring semantic capabilities

Page 25: Aggregating Multiple Dimensions for Computing Document Relevance

25

Case Study 1 – Evaluation

2 baselines:
Google custom search API
Textual layer only (~Lucene)

Measures: Prec1, Prec5, Prec10, MAP, MAP10, NDCG, NDCG10

Equal total weight for the textual and semantic layers: TEXTUAL (50%), URI (12.5%), TYPE (12.5%), FRAME (12.5%), TIME (12.5%)
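
A small sketch of this weighting (illustrative code; the layer similarity values below are made up, not taken from the evaluation):

```python
# 50% to the textual layer, 12.5% to each semantic layer, as on the slide.
LAYER_WEIGHTS = {"textual": 0.50, "uri": 0.125, "type": 0.125, "frame": 0.125, "time": 0.125}

def document_score(layer_similarities):
    """layer_similarities: normalized query/document similarity per layer (assumed input)."""
    return sum(LAYER_WEIGHTS[layer] * sim for layer, sim in layer_similarities.items())

# Made-up example values:
print(document_score({"textual": 0.6, "uri": 0.8, "type": 0.4, "frame": 0.0, "time": 0.2}))  # 0.475
```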

Page 26: Aggregating Multiple Dimensions for Computing Document Relevance

26

Case Study 1 - Evaluation

Approach/System     Prec1   Prec5   Prec10   NDCG    NDCG10   MAP     MAP10
Google              0.543   0.411   0.343    0.434   0.405    0.255   0.219
Textual             0.943   0.669   0.453    0.832   0.782    0.733   0.681
KE4IR               0.971   0.680   0.474    0.854   0.806    0.758   0.713
KE4IR vs. Textual   3.03%   1.71%   4.55%    2.64%   2.99%    3.50%   4.74%

Page 27: Aggregating Multiple Dimensions for Computing Document Relevance

27

Case Study 1 - Evaluation

Layers (TEXTUAL+)       Prec1   Prec5   Prec10   NDCG    NDCG10   MAP     MAP10
URI,TYPE,FRAME,TIME     0.971   0.680   0.474    0.854   0.806    0.758   0.713
URI,TYPE,FRAME          0.971   0.680   0.474    0.853   0.804    0.757   0.712
URI,TYPE,TIME           0.971   0.680   0.474    0.851   0.802    0.757   0.712
URI,TYPE                0.971   0.680   0.474    0.849   0.801    0.755   0.710
URI,FRAME,TIME          0.971   0.674   0.465    0.844   0.796    0.750   0.702
URI,FRAME               0.971   0.674   0.465    0.842   0.795    0.749   0.702
URI,TIME                0.971   0.674   0.465    0.840   0.791    0.747   0.700
TYPE,FRAME,TIME         0.943   0.674   0.471    0.848   0.799    0.745   0.700
TYPE,TIME               0.943   0.674   0.471    0.843   0.794    0.743   0.697
TYPE,FRAME              0.943   0.674   0.468    0.847   0.797    0.743   0.695
FRAME,TIME              0.943   0.674   0.462    0.842   0.793    0.741   0.693

Page 28: Aggregating Multiple Dimensions for Computing Document Relevance

28

Case Study 1 - Evaluation

Page 29: Aggregating Multiple Dimensions for Computing Document Relevance

29

Case Study 1 – What We Learnt

How the effectiveness of a system can be affected if we change weights.

In this specific case, the use of an expert-based weighting schema helps you in balancing the importance of the semantic information…

… however, we are using learning to rank for identifying potential priorities between used layers.

Further lessons are more related to the use of the semantic layers.

Future work: to apply the approach to larger collections.

Page 30: Aggregating Multiple Dimensions for Computing Document Relevance

30

Any question onCase Study 1?

Timeout…

Page 31: Aggregating Multiple Dimensions for Computing Document Relevance

31

Case Study 2 – The Scenario

Combine document information with user profiles. Assumption: you have already computed user profiles.

Which information can you use?
RELIABILITY: How much a user trusts the document source.
COVERAGE: How strongly a user profile is represented in a document (inclusion of a user profile into a document).
APPROPRIATENESS: How much a document satisfies a user profile (similarity between user profile and document).
ABOUTNESS: Trivial criterion, how much a document matches the performed query.

Page 32: Aggregating Multiple Dimensions for Computing Document Relevance

32

Case Study 2 – Reliability

Why do I trust information sources differently?

How much do you trust an information source? You might fix such values; you might infer them.

Page 33: Aggregating Multiple Dimensions for Computing Document Relevance

33

Case Study 2 – Coverage

The “coverage” criterion computes how strongly a user profile is contained in the document.

Suppose we have a profile of a user interested in the following topics: c = {sports, economics}
Suppose we have a document talking about the following topics: d = {violence, politics, economics, sports}

c = {0, 0, 1, 1}, d = {1, 1, 1, 1} → Coverage(c,d) = 1.0
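
One simple instantiation consistent with this example (my reading, not necessarily the paper's exact fuzzy definition) is the fraction of the user's topics that appear in the document:

```python
# Coverage sketched as the fraction of profile topics present in the document (binary vectors).
def coverage(profile, doc):
    interested = sum(profile)
    hits = sum(p * d for p, d in zip(profile, doc))
    return hits / interested if interested else 0.0

print(coverage([0, 0, 1, 1], [1, 1, 1, 1]))  # 1.0, as on the slide
```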

Page 34: Aggregating Multiple Dimensions for Computing Document Relevance

34

Case Study 2 – Appropriateness

The “appropriateness” criterion computes how much a document satisfies a user profile.

Suppose we have a profile of a user interested in the following topics: c = {sports, economics}
Suppose we have a document talking about the following topics: d = {violence, politics, economics, sports}

c = {0, 0, 1, 1}, d = {1, 1, 1, 1} → Appropriateness(c,d) = 0.5
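
One similarity that reproduces the 0.5 above is the Jaccard coefficient between the two binary topic vectors; this is an assumption made for illustration, and the published definition may differ:

```python
# Appropriateness sketched as Jaccard similarity between profile and document topic vectors.
def appropriateness(profile, doc):
    inter = sum(1 for p, d in zip(profile, doc) if p and d)
    union = sum(1 for p, d in zip(profile, doc) if p or d)
    return inter / union if union else 0.0

print(appropriateness([0, 0, 1, 1], [1, 1, 1, 1]))  # 0.5, as on the slide
```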

Page 35: Aggregating Multiple Dimensions for Computing Document Relevance

35

Case Study 2 – Aboutness

The “classic” similarity between a query and the documents contained in a repository.

Many models are available… and various adaptations based on the context.

Page 36: Aggregating Multiple Dimensions for Computing Document Relevance

36

Case Study 2 – Validation

The Reuters RCV1 Collection has been used for creating user profiles and for generating user queries.
20 users have been involved in the evaluation campaign.
Different aggregation schemas have been tested.

Page 37: Aggregating Multiple Dimensions for Computing Document Relevance

37

Case Study 2 – Validation (Ab > Ap > C > R)

Page 38: Aggregating Multiple Dimensions for Computing Document Relevance

38

Case Study 2 – What We Learnt

When users are involved, it is very difficult to define an aggregation schema.

The same occurs for the priority between criteria.

Creating (or learning) a user profile is already a big problem in itself.

The quality of user profiles significantly affects the effectiveness of the retrieval algorithm.

If you start playing with criteria and weighting schemas, you will never end!!!

Page 39: Aggregating Multiple Dimensions for Computing Document Relevance

39

Any question onCase Study 2?

Timeout…

Page 40: Aggregating Multiple Dimensions for Computing Document Relevance

40

Case Study 3 Let’s get back to the first simple example…

[Diagram: document fields mapped to criteria — Title → C1, Abstract → C2, Introduction → C3, Content → C4]

Page 41: Aggregating Multiple Dimensions for Computing Document Relevance

41

Case Study 3 – Suppose that…

Each field has been annotated with different ontologies, but belonging to the same domain: this means that you have, for the same field, many layers with different annotations… one for each used ontology.

Your repository contains documents coming from different sources: is the reliability of each source the same?

Your users have a history: user profiles need to be updated. This aspect is out of the scope of this talk… but you should be aware of it…

Any other idea?

Page 42: Aggregating Multiple Dimensions for Computing Document Relevance

42

Exploding Fields

You have something to think about… Good luck!!!

Page 43: Aggregating Multiple Dimensions for Computing Document Relevance

43

So… for concluding

Considering retrieval as a multi-criteria decision making problem is interesting to explore.

There is room for investigating a lot of stuff.

Do not be scared of using user profiles. I invite you to consider recent works on simulating user interactions with IR systems:
• David Maxwell, Leif Azzopardi. Simulating Interactive Information Retrieval: SimIIR: A Framework for the Simulation of Interaction. SIGIR 2016: 1141-1144 (+ the tutorial he gave)

My suggestion: try to combine content, semantic metadata, and user history.

Page 44: Aggregating Multiple Dimensions for Computing Document Relevance

44

It’s time for questions…

Mauro Dragoni
Fondazione Bruno Kessler

https://shell.fbk.eu/index.php/Mauro_Dragoni [email protected]