Presented by
Edgar Cornejo, 03.03.14
LAMI, Spring 2014
Search Engine and Services
Outline
Mobile information search for location-based information
Web-a-Where: Geotagging Web Content
The design and implementation of SPIRIT: a spatially-aware search engine for information retrieval on the Internet
Mobile information search for location-based information
Department of Industrial Engineering Tsinghua University
Beijing, China
April 2010
Chengyi Liu · Pei-Luen Patrick Rau · Fei Gao
Mobile search for location-based information
The study investigated the effects of location and information type in mobile searching for location-based information by carrying out two experiments in an airport
Mobile search scenario
High time pressure
Many environmental disturbances
Device limitations (screen size, input method)
Restricted users’ operations
Mobile searching context
Since most of the information is location-based [1,2], the results can be improved by analyzing information queries and location
Information queries + location → Search Engine → More suitable results
Features of mobile interaction [3]
Users’ hands are often used to manipulate physical objects
Users may be involved in tasks that demand a high level of visual attention
Users may be highly mobile during the task and have high-speed interaction
Search queries
Query type           Purpose                                                  Share*
Navigational query   to reach a particular site                               29.4%
Informational query  to find information                                      10.2%
Transactional query  to visit a site and perform some web-mediated activity   60.4%
*According to a large-scale study of European mobile search behavior conducted in 2008 [4]
Proposed factors that may influence mobile information search
Experiment 1 - Hypotheses
Hypothesis 1
For information searches in a mobile versus a non-mobile context:
The average number of clicks is lower
The first search result is more important
Free recall is worse
Hypothesis 2
For searches for location-based versus non-location-based information:
The number of clicks is lower
The first search result is more important
Free recall is better
Experiment 1 - Tasks
Experiment 1 - Results
Hypothesis 1
The intention was to find how the user’s context (mobile vs. non-mobile) might affect the user’s information searching performance
The average number of clicks in mobile is lower: False
The first search result is more important: False
Free recall is worse: False
Hypothesis 2
The intention was to examine how the information type (location-based vs. non-location-based) might affect the user’s information searching performance
The number of clicks is lower: True
The first search result is more important: True
Free recall is better: True
Experiment 2 - Hypotheses
Hypothesis 3
For mobile information searching under a high-pressure versus a low-pressure information requirement:
The average number of clicks is lower
The first search result is more important
Free recall is worse
Hypothesis 4
For mobile information searching with informational or navigational versus transactional queries:
The number of clicks is greater
The first search result is less important
Free recall is worse
Experiment 2 - Tasks
Experiment 2 - Results
Hypothesis 3
The intention was to examine how the information requirement pressure (high vs. low) might affect a user’s mobile search performance
The average number of clicks is lower: True
The first search result is more important: False
Free recall is worse: False
Hypothesis 4
The intention was to examine how the location-based information type (informational, navigational vs. transactional) might affect a user’s mobile search performance.
The average number of clicks is greater: True
The first search result is less important: True
Free recall is worse: True
Summary
Information type (location-based vs. non-location-based) was found to affect user performance during the information search process
Information requirement pressure and location-based information type (navigational, informational and transactional) affect the mobile search process
The first two search results were found to be very important to good search efficiency and good user satisfaction
Web-a-Where: Geotagging Web Content
Einat Amitay · Nadav Har’El · Ron Sivan · Aya Soffer
IBM Haifa Research Lab
Haifa 31905, Israel
July 2004
Web-a-Where: Geotagging Web Content
A system for associating geography with Web pages
Locates mentions of places and determines the place each name refers to
Assigns to each page a geographic focus: a locality that the page discusses as a whole
Implemented within the framework of the IBM WebFountain data mining system
A page may have two types of geography associated with it: a source and a target.
Source geography has to do with the origin of the page, the physical location, address of its author, etc.
Target geography is determined by the contents of the page and relates to the topic the page is discussing.
Ambiguities
Geo/non-geo ambiguity is the case of a place name having another, non-geographic meaning, e.g. Mobile (Alabama) or Reading (England)
Geo/geo ambiguity arises when two or more distinct places have the same name
System Components
Geotagger (main component)
Finds and disambiguates geographic names
Assigns a taxonomy node to each phrase in the text that refers to a place, e.g. Paris/France/Europe
The gazetteer
Database that keeps the list of geographic names, their canonical taxonomies and other information
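As a rough illustration of how a gazetteer entry and its taxonomy node might be represented, the sketch below uses a small in-memory dictionary. The `GazetteerEntry` class, the `GAZETTEER` contents and the `lookup` helper are assumptions made for illustration only; they are not the Web-a-Where data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GazetteerEntry:
    """One candidate meaning of a place name (hypothetical structure)."""
    name: str
    taxonomy: tuple  # most specific level first, e.g. ("Paris", "France", "Europe")

    def node(self) -> str:
        # Render the taxonomy in the slash-separated form used on the slide.
        return "/".join(self.taxonomy)


# Toy gazetteer: a single name may map to several places (geo/geo ambiguity).
GAZETTEER = {
    "paris": [
        GazetteerEntry("Paris", ("Paris", "France", "Europe")),
        GazetteerEntry("Paris", ("Paris", "Texas", "United States", "North America")),
    ],
}


def lookup(name):
    """Return all candidate entries for a name, case-insensitively."""
    return GAZETTEER.get(name.lower(), [])


if __name__ == "__main__":
    for entry in lookup("Paris"):
        print(entry.node())
```

Keeping every candidate meaning under one key makes the geo/geo ambiguity explicit: the geotagger still has to choose among the returned entries.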
Tagging individual place names
The processing of a page is done in three phases:
Spotting → Disambiguation → Focus determination
1. Spotting place name candidates
Finding all the possible geographic names in each page
Short abbreviations, e.g. IN (for Indiana) or AT (for Austria), are not spotted but are used to help disambiguate other spots, e.g. Gary, IN
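The spotting rule above can be sketched as follows. This is a minimal illustration under stated assumptions: `PLACE_NAMES`, `ABBREVIATIONS` and `spot` are hypothetical names, the name list is a toy stand-in for the gazetteer, and only single capitalized words are considered.

```python
import re

# Toy stand-ins for the gazetteer (assumptions for illustration).
PLACE_NAMES = {"Gary", "Reading", "Mobile"}
ABBREVIATIONS = {"IN": "Indiana", "AT": "Austria"}


def spot(text):
    """Return (name, qualifier) pairs for candidate place names in text.

    Two-letter abbreviations are never returned as spots themselves; they
    are only read as a qualifier when they directly follow a spotted name.
    """
    spots = []
    for match in re.finditer(r"\b[A-Z][a-z]+\b", text):
        name = match.group()
        if name not in PLACE_NAMES:
            continue
        # Look for a qualifier such as ", IN" immediately after the name.
        tail = re.match(r",\s*([A-Z]{2})\b", text[match.end():])
        qualifier = ABBREVIATIONS.get(tail.group(1)) if tail else None
        spots.append((name, qualifier))
    return spots


if __name__ == "__main__":
    print(spot("He moved to Gary, IN last year."))
    print(spot("IN the end, Reading won."))
```

In the second sentence the leading "IN" is ignored, while "Reading" is still returned as a candidate; deciding whether it is the town or the verb is left to the disambiguation phase.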
2. Disambiguating spots (Algorithm)
The geotagger assigns a unique meaning to spots that can be uniquely qualified. Confidence 95%
Combinations that are not unique are left unassigned
In a page with multiple spots with the same name where only one is qualified, this value is assigned to the others. Confidence 80%
Disambiguation contexts are also used to resolve unassigned spots with confidence less than 70%
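The first two rules above (qualified spots, then propagation to same-name spots on the page) might look roughly like this. It is a sketch, not the paper's implementation: `CANDIDATES`, `disambiguate` and the constant names are hypothetical, and the later context-based step is left out.

```python
from collections import defaultdict

# Toy candidate meanings per place name (illustrative, not the real gazetteer).
CANDIDATES = {
    "Paris": ["Paris/France", "Paris/Texas"],
    "London": ["London/England", "London/Ontario"],
}

QUALIFIED_CONFIDENCE = 0.95   # spot uniquely qualified in the text
PROPAGATED_CONFIDENCE = 0.80  # meaning copied from a qualified spot on the same page


def disambiguate(spots):
    """spots: list of (name, qualifier or None).

    Returns one (meaning, confidence) tuple per spot, or None if unassigned.
    """
    results = [None] * len(spots)
    meanings_by_name = defaultdict(set)

    # Pass 1: a qualifier that selects exactly one candidate fixes the meaning.
    for i, (name, qualifier) in enumerate(spots):
        if qualifier is None:
            continue
        matches = [c for c in CANDIDATES.get(name, [])
                   if qualifier in c.split("/")[1:]]
        if len(matches) == 1:
            results[i] = (matches[0], QUALIFIED_CONFIDENCE)
            meanings_by_name[name].add(matches[0])

    # Pass 2: propagate to same-name spots when the page offers one meaning only.
    for i, (name, _) in enumerate(spots):
        meanings = meanings_by_name.get(name, set())
        if results[i] is None and len(meanings) == 1:
            results[i] = (next(iter(meanings)), PROPAGATED_CONFIDENCE)
    return results


if __name__ == "__main__":
    page = [("Paris", "Texas"), ("Paris", None), ("London", None)]
    print(disambiguate(page))
```

The unqualified "London" stays unassigned here, which is the case the context-based step is meant to handle.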
2. Disambiguating spots (Data sources)
The Geographic Names Information System (GNIS) for U.S. locations
world-gazetteer.com for non-U.S. locations
United Nations Statistics Division (UNSD) for countries and continents
ISO 3166-1 for country and other abbreviations
3. Focus determination
The basic idea is that if several cities from the same region are mentioned, this region is probably the focus
Sometimes a page cannot be said to have only one focus
The confidence score should be taken into account when finding the focus, giving higher weight to information coming from locations with higher confidence
Example
A certain page contained four mentions of Orlando/Florida (assigned confidence 0.5), three Texas (0.75), eight Fort Worth/Texas (0.75), three Dallas/Texas (0.75), one Garland/Texas (0.75), and one Iraq (0.5)
A human was asked to judge what the geographical focus of this page is and responded with “It’s about Texas and perhaps also Orlando”
Indeed, that page comes from the “Orlando Weekly” site, in a forum titled “Just a look at The Texas Local Music Scene...”
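To make the weighting idea concrete, the sketch below sums count times confidence for each top-level region in the example page. This simple scoring rule and the names `MENTIONS` and `region_scores` are assumptions for illustration; the slides do not give the system's actual scoring formula.

```python
from collections import defaultdict

# (taxonomy node, number of mentions, confidence) taken from the example page.
MENTIONS = [
    ("Orlando/Florida", 4, 0.5),
    ("Texas", 3, 0.75),
    ("Fort Worth/Texas", 8, 0.75),
    ("Dallas/Texas", 3, 0.75),
    ("Garland/Texas", 1, 0.75),
    ("Iraq", 1, 0.5),
]


def region_scores(mentions):
    """Aggregate confidence-weighted mention counts by the broadest level given.

    Assumed scoring rule: each mention contributes its confidence to the last
    component of its node, so cities add weight to the region containing them.
    """
    scores = defaultdict(float)
    for node, count, confidence in mentions:
        region = node.split("/")[-1]
        scores[region] += count * confidence
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for region, score in region_scores(MENTIONS):
        print(region, score)
```

Under this assumed rule Texas scores 11.25 against 2.0 for Florida and 0.5 for Iraq, which agrees with the human judgment quoted above.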
Evaluating geotagging precision
Collection                    Number of pages  Accuracy
Arbitrary collection          200              81.7%
.GOV collection               200              73.3%
Open Directory Project (ODP)  200              63.1%
Geotags assigned automatically versus defined manually
Evaluating focus
92% correct up to country level:
38% precise match
30% correct state or city
24% correct country
8% incorrect country:
4% correct continent
4% continent wrong
Comparison of Web-a-Where-determined focus to human-determined one (ODP) for ~1 million pages
Summary
The system is able to correctly tag individual place name occurrences 80% of the time and define the correct focus of a page 92% of the time
Accuracy can be further improved
The main source of errors is geo/non-geo ambiguity
The design and implementation of SPIRIT
Ross Purves, Paul Clough, Christopher Jones, Avi Arampatzis, Benedicte Bucher, David Finch, Gaihua Fu, Hideo Joho, Awase Khirni Syed, Subodh Vaid and Bisheng Yang
Department of Geography, University of Zurich, Switzerland
Department of Information Studies, University of Sheffield, UK
School of Computer Science, Cardiff University, UK
Institute of Information and Computing Sciences, Utrecht University, Netherlands
Laboratoire COGIT - Institut Geographique National, France
August 2007
The design and implementation of SPIRIT
This paper describes the design and implementation of a complete solution to geographic information retrieval
Requirements
Exhaustive retrieval of relevant documents in a specified area
Place names should be automatically identified, and interactively disambiguated
Ability to query for geographical areas whose boundaries are imprecise
Spatial concepts relating different geographic entities should be represented (outside, in)
It should be possible for users to specify the area of interest on a map
Ability to view query results on a map linked to relevant web documents
Document ranking should combine both spatial and thematic aspects of document relevance
Architecture Overview
[Architecture diagram]
Components: User interface, Broker, Search Engine, Relevance ranking, Indexes (textual, spatial), Geographical ontology, Metadata (doc-to-footprint mapping), Web data collection (documents)
Run-time: Search request, Query disambiguation, Query expansion, Access indexes, Rank results
Pre-processing: Geo-parsing, Geo-coding, Spatial index, Textual index
Functionality of the components
Pre-processing the document collection
Assigning spatial footprints to web documents:
Identify geographical references (geoparsing) → Assign them to spatial coordinates (geocoding) → Spatial footprint
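The geoparsing, geocoding and footprint steps can be sketched in a few lines. Everything here is an assumption for illustration: the `COORDS` table holds approximate coordinates for two cities, geoparsing is reduced to substring matching, and the footprint is taken to be a bounding box, which may differ from the footprint representation SPIRIT actually uses.

```python
# Approximate (latitude, longitude) values, for illustration only.
COORDS = {
    "Cardiff": (51.48, -3.18),
    "Sheffield": (53.38, -1.47),
}


def geoparse(text):
    """Identify geographical references: here, known names found in the text."""
    return [name for name in COORDS if name in text]


def geocode(names):
    """Assign each recognized name to spatial coordinates."""
    return [COORDS[name] for name in names]


def footprint(points):
    """Bounding box (min_lat, min_lon, max_lat, max_lon), or None if no points."""
    if not points:
        return None
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons), max(lats), max(lons))


if __name__ == "__main__":
    document = "Trains from Cardiff to Sheffield"
    print(footprint(geocode(geoparse(document))))
```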
Building document indexes
Grid-based spatial indexing
For each cell of the grid, a list of document IDs was constructed, using the document footprints which resulted from the geo-tagging process
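A grid index of this kind could be built as in the sketch below. The one-degree `CELL_SIZE`, the bounding-box footprints and the function names are assumptions; the slides do not state the grid resolution or footprint shape used in SPIRIT.

```python
import math
from collections import defaultdict

CELL_SIZE = 1.0  # cell width and height in degrees (assumed value)


def cells_for(bbox):
    """Grid cells (row, col) overlapped by a (min_lat, min_lon, max_lat, max_lon) box."""
    min_lat, min_lon, max_lat, max_lon = bbox
    rows = range(math.floor(min_lat / CELL_SIZE), math.floor(max_lat / CELL_SIZE) + 1)
    cols = range(math.floor(min_lon / CELL_SIZE), math.floor(max_lon / CELL_SIZE) + 1)
    return [(row, col) for row in rows for col in cols]


def build_index(footprints):
    """Map each grid cell to the list of document IDs whose footprint touches it."""
    index = defaultdict(list)
    for doc_id, bbox in footprints.items():
        for cell in cells_for(bbox):
            index[cell].append(doc_id)
    return index


if __name__ == "__main__":
    footprints = {
        "d1": (51.2, -3.4, 51.8, -2.9),
        "d2": (53.1, -1.6, 53.5, -1.2),
    }
    for cell, doc_ids in sorted(build_index(footprints).items()):
        print(cell, doc_ids)
```

A document whose footprint spans several cells is listed in each of them, so a lookup by cell never misses it.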
Retrieving the results: “T” (Text) Scheme
Simplest approach
Retrieve all the documents that match the concept terms of the query and then filter to return only those which intersect the geographical scope of the place in the query (footprint)
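The retrieve-then-filter idea of the T scheme might be sketched as follows. The inverted index layout, the bounding-box footprints and the names `intersects` and `search_t` are assumptions for illustration, not SPIRIT's implementation.

```python
def intersects(a, b):
    """True if two (min_lat, min_lon, max_lat, max_lon) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_t(terms, query_bbox, text_index, footprints):
    """Text first: match all terms, then keep documents inside the query footprint."""
    postings = [text_index.get(term, set()) for term in terms]
    matches = set.intersection(*postings) if postings else set()
    return sorted(doc for doc in matches if intersects(footprints[doc], query_bbox))


if __name__ == "__main__":
    # Toy data (assumed): term -> document IDs, and document -> footprint.
    text_index = {"castle": {"d1", "d2", "d3"}, "hotel": {"d1", "d3"}}
    footprints = {
        "d1": (51.2, -3.4, 51.8, -2.9),
        "d2": (53.1, -1.6, 53.5, -1.2),
        "d3": (47.2, 8.4, 47.5, 8.7),
    }
    query_bbox = (51.0, -5.5, 53.5, -2.5)
    print(search_t(["castle", "hotel"], query_bbox, text_index, footprints))
```

The spatial test runs only on documents that already matched the text, which keeps the scheme simple but means every textual match has to be checked.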
Retrieving the results: “ST” (Space-Text) Scheme
More integrated approach
Regarded as a space-primary method
At search time the cells that intersect the query footprint are determined and then only the corresponding text indexes are searched
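A space-primary lookup along these lines is sketched below, assuming one small text index per grid cell. The cell size, the nested dictionary layout and the names `cells_for` and `search_st` are hypothetical choices made for this example.

```python
import math

CELL_SIZE = 1.0  # cell width and height in degrees (assumed value)


def cells_for(bbox):
    """Grid cells (row, col) overlapped by a (min_lat, min_lon, max_lat, max_lon) box."""
    min_lat, min_lon, max_lat, max_lon = bbox
    rows = range(math.floor(min_lat / CELL_SIZE), math.floor(max_lat / CELL_SIZE) + 1)
    cols = range(math.floor(min_lon / CELL_SIZE), math.floor(max_lon / CELL_SIZE) + 1)
    return [(row, col) for row in rows for col in cols]


def search_st(terms, query_bbox, cell_text_indexes):
    """Space first: pick the cells under the query footprint, then search their text indexes."""
    results = set()
    for cell in cells_for(query_bbox):
        index = cell_text_indexes.get(cell)
        if index is None:
            continue  # no documents were indexed in this cell
        postings = [index.get(term, set()) for term in terms]
        if postings:
            results |= set.intersection(*postings)
    return sorted(results)


if __name__ == "__main__":
    # Toy data (assumed): cell -> term -> document IDs.
    cell_text_indexes = {
        (51, -4): {"castle": {"d1"}, "hotel": {"d1"}},
        (53, -2): {"castle": {"d2"}},
    }
    print(search_st(["castle", "hotel"], (51.0, -4.5, 51.9, -3.1), cell_text_indexes))
```

Text indexes for cells outside the query footprint are never opened, which is the point of treating space as the primary key.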
Retrieving the results: “TS” (Text-Space) Scheme
Better query response time
Regarded as a text-primary method
At search time, for each term, the associated documents are grouped according to the spatial index to which they relate
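One way to read the text-primary scheme is an inverted index whose postings for each term are grouped by grid cell. The sketch below follows that reading; the grouping by cell, the cell size and the names `cells_for` and `search_ts` are assumptions, since the slides do not detail the index layout.

```python
import math

CELL_SIZE = 1.0  # cell width and height in degrees (assumed value)


def cells_for(bbox):
    """Grid cells (row, col) overlapped by a (min_lat, min_lon, max_lat, max_lon) box."""
    min_lat, min_lon, max_lat, max_lon = bbox
    rows = range(math.floor(min_lat / CELL_SIZE), math.floor(max_lat / CELL_SIZE) + 1)
    cols = range(math.floor(min_lon / CELL_SIZE), math.floor(max_lon / CELL_SIZE) + 1)
    return [(row, col) for row in rows for col in cols]


def search_ts(terms, query_bbox, term_cell_index):
    """Text first: for each term, read only the posting groups of the query's cells."""
    wanted_cells = set(cells_for(query_bbox))
    per_term = []
    for term in terms:
        groups = term_cell_index.get(term, {})
        docs = set()
        for cell, doc_ids in groups.items():
            if cell in wanted_cells:
                docs |= doc_ids
        per_term.append(docs)
    return sorted(set.intersection(*per_term)) if per_term else []


if __name__ == "__main__":
    # Toy data (assumed): term -> cell -> document IDs.
    term_cell_index = {
        "castle": {(51, -4): {"d1"}, (53, -2): {"d2"}},
        "hotel": {(51, -4): {"d1"}},
    }
    print(search_ts(["castle", "hotel"], (51.0, -4.5, 51.9, -3.1), term_cell_index))
```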
Query interfaces
Results display
Evaluation
Performance analysis
To count as relevant to a query, a document had to be both thematically and spatially relevant.
In this sense, the key result of the work is that spatially aware search outperformed text-only search.
Usability analysis
[Bar chart, responses from Strongly disagree to Strongly agree, vertical axis 0 to 30] It was easy to get started with the system and make my query
[Bar chart, responses No, not at all / A little / Yes, very much, vertical axis 0 to 30] It was easy to find the locations of documents listed to the right of the map on the map
Conclusions
The paper describes a unified approach, as well as the architecture, for introducing spatial awareness into search-engine technology
A prototype system demonstrated the effectiveness of the strategy
Personal Conclusions
The first study can lead to changes in search engines and devices to improve the mobile experience
The Web-a-Where system provides good insight for further improving location search, though it is not very precise
SPIRIT is a completely new paradigm in space-aware searching, but the interaction methods can be improved
Thank you
References
General References
[1] M. Sanderson, J. Kohler, Analyzing geographic queries, in: Proceedings of the SIGIR 2004 Workshop on Geographic Information Retrieval, Sheffield, UK, 2004.
[2] S. Asadi, Searching the World Wide Web for local services and facilities: a review on the patterns of location-based queries, in: WAIM’05, Hangzhou, China, 2005.
[3] S. Kristoffersen, F. Ljungberg, “Making Place” to make IT work: empirical explorations of HCI for mobile CSCW, in: Proceedings of the International ACM SIGGROUP Conference on Supporting Group Work, 1999.
[4] K. Church, B. Smyth, K. Bradley, P. Cotter, A large scale study of European mobile search behavior, in: Proceedings of MobileHCI’08, 2008, pp. 13–22.
[5] M.A. Neerincx, J.W. Streefkerk, Interacting in desktop and mobile context: emotion, trust and task performance, in: Proceedings of the First European Symposium on Ambient Intelligence (EUSAI), Eindhoven, The Netherlands, 2003.