Automatic detection of navigational queries according to Behavioural Characteristics

Slides used for the paper presentation at the LWA congress.

David J. Brenes Martínez, Daniel Gayo-Avello

Lernen-Wissen-Adaptivität (Learning-Knowledge-Adaptability) 08, Workshop on Information Retrieval

Würzburg

2008/10/07

Structure of the presentation

1 Context of this research

2 Previous Research

3 Methodology

4 Coefficients

5 Conclusions

6 Future Research

Context of this research

Master's in Web Engineering

Research speciality
Master's thesis

Research on Web Information Retrieval

How do users search?
Why do they search that way?
How can we help them in their search tasks?

Previous Research

Why are we researching this?

Anomalous characteristics of Web users

Short queries
Short search sessions
Few examined results

So...

Why do they not behave as expected?

What are they searching?

Simple approach

What subjects do users search for?
Popular terms and queries
Subject taxonomy

Very useful in Advertising Research

What tasks do they perform?

Tasks on a query

Different tasks
  Adding terms
  Modifying terms
  Deleting terms
  ...

Adaptation of information retrieval systems (IRS)
  Help for adding terms or modifying queries
  Deleting terms?

Abstract tasks

With what purpose?

Motivation underlying the query

Accessing some authority on a subject
Use of Web tools
Performing some actions (shopping, downloading...)
Problem resolution
Spare time

IRS usage scenarios

Specific help
Different needs
Different characteristics?

Intention Classification

Taxonomies
Common in previous IR research
Broder

Navigational: google, cnn, apple store

Informational: Large hadron collider, Trains to Würzburg

Transactional: buy concert tickets, download torrents movies

Classification Attempts

Manual classification
  Simpler
  Less data
  Lexical and semantic characteristics
    Length of the query
    Meaning of the terms
  Biases
    Performed by human experts
    Results influenced by the expert
    Contradictions

Automated classification
  More data
  More complex
  Lexical and semantic characteristics
  Behaviour of the user

Summary

Search characteristics reveal differences for Web users
Subjects don’t justify these differences
Tasks performed on queries don’t clarify the reason
Classification based on lexical and semantic characteristics introduces some biases

Methodology

Data and tools

AOL Query Log

Analysis performed over the whole query log
Python
PostgreSQL
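The slides do not show how the log was loaded. As a minimal sketch, assuming the tab-separated layout of the publicly released AOL log (columns AnonID, Query, QueryTime, ItemRank, ClickURL), rows could be read in Python roughly as follows. The names `LogRow` and `read_log` and the sample lines are hypothetical, chosen only for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional


class LogRow(NamedTuple):
    user_id: str
    query: str
    query_time: str
    clicked_url: Optional[str]  # None when the user clicked nothing


def read_log(handle) -> Iterator[LogRow]:
    """Yield one LogRow per line of a tab-separated query log."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header line
    for fields in reader:
        if len(fields) < 3:
            continue  # skip malformed lines
        # Pad to five columns: rows without a click may lack rank and URL
        fields = fields + [""] * (5 - len(fields))
        user_id, query, query_time, _rank, url = fields[:5]
        yield LogRow(user_id, query.strip().lower(), query_time, url or None)


# Made-up sample lines in the assumed layout, not real log data
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tgoogle\t2006-03-01 10:00:00\t1\thttp://www.google.com\n"
    "1\ttrains to wurzburg\t2006-03-01 10:05:00\t\t\n"
)
rows = list(read_log(io.StringIO(SAMPLE)))
```

Each row keeps the query and the clicked URL (if any), which is the only information the coefficients in the following slides rely on.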

Navigational Coefficients

Definition

‘Navigational’ criteria
Based on user’s behaviour
Translated to some statistical characteristic

Presentation

Expected behaviour
Formula
Pros and cons
Top 10 queries

Publishing

Publishing data and results

Scientific repeatability
Website of the author
  http://www.davidjbrenes.info
  Delayed publishing due to technical errors
  Now it’s just my Spanish blog :)

Coefficients

Most significant result

Expected behaviour

Traffic concentrated on one result
Strong relationship between query and result
Query as ‘name’ of the result
Proposed by Lee et al. (2005)

Proposed formula

$$NC = \frac{\text{Number\_of\_visits\_most\_popular\_result}}{\text{Number\_of\_visits\_to\_results}}$$

Rate of visits to the most visited result.
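A minimal sketch of this coefficient, assuming the clicks are available as (query, clicked URL) pairs. The function name and the sample data are hypothetical and are not taken from the authors' code.

```python
from collections import Counter, defaultdict


def most_significant_result_nc(clicks):
    """Return {query: share of its clicks that went to its most clicked result}."""
    per_query = defaultdict(Counter)
    for query, url in clicks:
        per_query[query][url] += 1
    return {
        query: max(counts.values()) / sum(counts.values())
        for query, counts in per_query.items()
    }


# Made-up click data for illustration only
clicks = [
    ("google", "google.com"),
    ("google", "google.com"),
    ("google", "google.com"),
    ("google", "images.google.com"),
    ("rare query", "example.org"),
]
nc1 = most_significant_result_nc(clicks)
```

Here "google" scores 3/4, while a query clicked only once scores 1.0, which shows why rarely issued queries can dominate the top of this ranking.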

Pros and Cons

Simple
Intuitive
Bad behaviour in the long tail (high values)
Ignores distribution of results

Results

drudge retort
soulfuldetroit
cosmology book
ttologin.com
jjj’s thumbnail gallery post
beteagle
yscu
frumsupport
cricketnext.com
msitf

Infrequently performed queries
Non-typical navigational queries
Influence of individual users

Number of results

Expected behaviour

Polysemy
  Same query means different websites for different users
  Versions of a website
Navigational behaviour for each result

Proposed formula

$$NC = 1 - \frac{\text{Number\_of\_distinct\_results}}{\text{Number\_of\_visits\_to\_results}}$$
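As a rough sketch of this second coefficient, again assuming clicks given as (query, clicked URL) pairs (the function name and data are hypothetical):

```python
from collections import Counter, defaultdict


def number_of_results_nc(clicks):
    """Return {query: 1 - distinct clicked results / total clicks}."""
    visits = Counter()
    distinct = defaultdict(set)
    for query, url in clicks:
        visits[query] += 1
        distinct[query].add(url)
    return {q: 1 - len(distinct[q]) / visits[q] for q in visits}


# Made-up click data for illustration only
clicks = [
    ("google", "google.com"),
    ("google", "google.com"),
    ("google", "google.com"),
    ("google", "images.google.com"),
    ("rare query", "example.org"),
]
nc2 = number_of_results_nc(clicks)
```

With this data "google" scores 0.5 (two distinct results over four visits) and the query clicked once scores 0.0, so infrequent queries are pushed to the bottom rather than the top.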

Pros and Cons

Focused on distribution of results
Favours the most popular queries
Bad behaviour on long tail (low values)

Results

google
yahoo.com
mapquest
yahoo
ebay
google.com
bank of america
www.google.com
www.yahoo.com
yahoo mail

Popular queries
Typical navigational queries
Lexical and semantic characteristics similar to other research

Percentage of navigational sessions

Expected behaviour

A result satisfies the informational need
Navigational queries isolated in a session

Proposed Formula

$$NC = \frac{\text{Number\_of\_navigational\_sessions}}{\text{Number\_of\_sessions\_of\_query}}$$

Navigational session: one query, one result
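A minimal sketch of this coefficient, assuming sessions have already been segmented by some other algorithm and that each session is a list of (query, clicked URL or None) events. Reading "one result" as exactly one click is an assumption here, and all names and data are hypothetical.

```python
from collections import Counter


def is_navigational(session):
    """Assumed reading of the definition: one distinct query, exactly one click."""
    queries = {query for query, _ in session}
    clicks = [url for _, url in session if url is not None]
    return len(queries) == 1 and len(clicks) == 1


def navigational_session_nc(sessions):
    """Return {query: navigational sessions / sessions containing the query}."""
    total = Counter()
    navigational = Counter()
    for session in sessions:
        queries = {query for query, _ in session}
        for query in queries:
            total[query] += 1
        if is_navigational(session):
            navigational[next(iter(queries))] += 1
    return {q: navigational[q] / total[q] for q in total}


# Made-up sessions for illustration only
sessions = [
    [("instapundit", "instapundit.com")],
    [("instapundit", "instapundit.com")],
    [("instapundit", "instapundit.com"), ("weather", "weather.com")],
    [("weather", None)],
]
nc3 = navigational_session_nc(sessions)
```

In this example "instapundit" appears in three sessions, two of them navigational, while "weather" never appears alone with a single click, so its coefficient is zero.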

Pros and cons

Doesn’t favour the most popular queries
Strong dependency on the session detection algorithm
Multitask influence
Bad behaviour on long tail
  Not common for these queries to have a navigational session
  Less important impact

Results

natural gas futures
cashbreak.com
allstar puzzles
times enterprise
instapundit
clarksville leaf chronicle
first charter online
mission viejo nadadores
county of san joaquin booking log
thomas myspace editor beta

Infrequently performed queries
Non-typical navigational queries
Lexical and semantic characteristics similar to other research
Possibly ignored by human experts

Conclusions

Comparison

A relevant result usually corresponds to a small set of visited results
The inverse relationship does not hold

A relevant result or a small set of results doesn’t guarantee navigational sessions

A high rate of navigational sessions usually implies relevant results and a small set of results

NC Combination

$$NC = NC_3 \cdot \frac{NC_1 + NC_2}{2}$$

High weight for the coefficient depending on navigational sessions
Averaging of the first two coefficients

soulfuldetroit
aol people magazine
cashbreak.com
allstar puzzles
first charter online
mission viejo nadadores
instapundit
times enterprise
clarksville leaf chronicle
el canario by the lagoon
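A small sketch of the combination, reading the slide's formula as the third coefficient multiplied by the average of the first two, as its notes on weighting and averaging suggest. The function name and the input values are hypothetical.

```python
def combined_nc(nc1: float, nc2: float, nc3: float) -> float:
    """Weight the average of the first two coefficients by the third."""
    return nc3 * (nc1 + nc2) / 2


# Made-up coefficient values for illustration only
with_sessions = combined_nc(nc1=1.0, nc2=0.5, nc3=0.5)
without_sessions = combined_nc(nc1=1.0, nc2=0.5, nc3=0.0)
```

Because the session-based coefficient is a multiplicative factor, a query with no navigational sessions gets a combined value of zero regardless of the other two coefficients.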

Lack of navigational sessions

Lowest values for NC
Navigational queries wrongly grouped with other queries
Multitask influence
Use of NC to detect navigational sessions

Lexical and semantic characteristics

Some queries present them
Research is not based on those characteristics
Results are not biased

Some queries could have gone undetected
Database of relevant terms
URL queries
Long queries

Relationship between statistics and behaviour

Intention inferred from statistical characteristics
Automatic evaluation
Theoretically extensible

Future Research

Combination of NC

Analysis of obtained results
Research on statistical methodologies for combining them

Other user intentions

Analysis of expected behaviour
Experiments for each behaviour

Detection of navigational sessions

Feedback for the session detection algorithm
Caution with biases (sessions artificially created)
Combination with other detection algorithms

And now...

Any questions?
