24
Enabling Semantic Search in Geospatial Metadata Catalogue to Support Polar Sciences Wenwen Li 1 and Vidit Bhatia 2 1 GeoDa Center for Geospatial Analysis and Computation School of Geographical Sciences and Urban Planning 2 Department of Computer Science Ira. Fulton School of Engineering Arizona State University 10-29-2013 1

Enabling Semantic Search in Geospatial Metadata Catalogue ... · Enabling Semantic Search in Geospatial Metadata Catalogue to Support Polar Sciences Wenwen Li1 2and Vidit Bhatia 1GeoDa

  • Upload
    others

  • View
    19

  • Download
    0

Embed Size (px)

Citation preview

Enabling Semantic Search in Geospatial Metadata Catalogue to Support Polar Sciences

Wenwen Li1 and Vidit Bhatia2 1GeoDa Center for Geospatial Analysis and Computation

School of Geographical Sciences and Urban Planning 2Department of Computer Science Ira. Fulton School of Engineering

Arizona State University 10-29-2013

1

Outline

Motivation

Intelligent geospatial data discovery

A knowledge mining approach

Integrated solution for semantic-enabled metadata catalogue

Experimental results

Demo

Conclusion and discussion

2

Towards a ‘CyberPolar’ Community

Studies about polar regions has attracted great attention:

Increasing interest in mining and natural resource exploration

Both poles are sensitive to human activities and global, environmental, and climate changes

Polar regions are key drivers of the Earth climate

“President’s National Strategy for the Arctic Region”

Released in May 2013 by the white house

Overarching stewardship objectives:

“Make decisions using the best available information”

3

Towards a ‘CyberPolar’ Community

Earlier efforts:

Virtual Arctic Spatial Data Infrastructure (Li et al. 2011)

4

Existing work

5

Existing work

6

Problem Statement

Current full-text search cannot address fuzzy search need

“Frozen ground” vs. “Permafrost”.

Gap between the best available data and the most suitable research applications.

A smart mechanism for data discovery is needed to bridge the gap.

7

Intelligent Geospatial Data Discovery

Ontology-based approach

Formalize the definition of domain knowledge

Classes, individuals, properties, relationships

Current Work:

LEAD (Droegemeler et al. 2005)

VSTO (Fox et al. 2008)

SWEET based semantic search (Li et al. 2011, Li et al. 2012a)

8

Limitation

Hard to model spatial relationship (Shi, 2011)

Equal, within, touch, disjoint, intersect..

Hard to build a consensus domain ontology (ontology mapping)

Limited spatial reasoning capability

Similarity reasoning

Rodriguez and Egenhofer, 2004; Janowicz et al. 2008; Li et al. 2012a 9

Ontology Full Name Creator

SWEET Semantic Web for Earth and Environmental Terminology

NASA JPL

CUAHSI Consortium of Universities for the Advancement of Hydrologic Science

CUAHSI

MMI Marine Metadata Interoperability NSF

INSPIRE Infrastructure for Spatial Information in Europe European Commission

GEMET General Multilingual Environmental Thesaurus EEA, ETC/CDS

A Knowledge Mining Approach

Methodology:

LSATTR (Li et al. 2012)

Latent Semantic Analysis and Two Tier Ranking

Identify latent semantic association among terminologies through singular value decomposition and lower rank estimation

Proposed a new similarity measure (first tier):

Second tier: rank by number of exact match

10

LSA , SVD and Explaination

Singular Valued Decomposition

Amn = UmmSmnVTnn

Reduced Singular valued Decomposition

LSA

11

An Integrated Solution

Combine LSATTR and Geonetwork to support semantic enabled metadata catalogue

Why Geonetwork?

12

Service-Oriented Extension of Metadata Catalogue

13

14

Business logic of semantic search flow

Creation of Semantic Association Matrix

15

Count Matrix

Stemming

TF*IDF Matrix

Semantic Matrix

Demo

16

Performance Evaluation

Factors:

Precision

Recall

Experiments:

How does the degree of rank reduction impact the search performance?

If the proposed algorithm introduce enhancement to the search capability?

17

Queries used

offshore hydrocarbon

offshore hydrocarbon mining

Marine biology data

Frozen ground Canada

Sediment data Yukon

Canada air pilot

Earthquake magnitude data Arctic

Sea level increase climate

21

Comparison of Recall

22

Comparison of Precision

23

Summary

A data mining approach to improve geospatial data discovery

LSA outperforms full-text search

An alternative approach to establish domain ontology to complement the top-down approach

24

Future Work

Investigate the impact of stemming algorithm to the search effectiveness

Further improve the performance of semantic search algorithm to handle big spatial data

25

References

W. Li, Automated Data Discovery, Reasoning and Ranking in Support of Building an Intelligent Geographic Search Engine, Ph.D. Dissertation. George Mason University, August 2010.

W. Li, C. Yang, D. Nebert, R. Raskin and H. Wu, 2011. Semantic-based service chaining for building a virtual Arctic spatial data infrastructure. Computers & Geosciences. 37(11), 1752-1762.

W. Li, R. Raskin and M.F. Goodchild, 2012a. Semantic similarity measurement based on knowledge mining: an artificial neural network approach. International Journal of Geographic Information Science, 26(8), 1415-1435.

W. Li, M.F. Goodchild and R. Raskin, 2012b. Towards geospatial semantic search: exploiting latent semantic relations in geospatial data. International Journal of Digital Earth, DOI:10.1080/17538947.2012.674561.

26

Acknowledgement

National Science Foundation Polar Cyberinfrastructure Program

Dr. Nancy Wiegand for inviting us to share our research outcome

27