Thesaurus based Enterprise Search

Two Show Cases

Andreas Blumauer

Graz, September 2011

Agenda

• Semantic search scenarios

• The role of thesauri in semantic search

• PoolParty Semantic Search

– Live Demo – http://bit.ly/semantic_search

• Show Cases & Demos

Semantic search scenarios

Semantic search has many faces

Situations in which semantic search can help

• I can't remember how to spell the search term.

• I can't remember exactly what I was looking for.

• I want to gain background knowledge about a certain document.

• I want to know more about this entity in a certain context.

• I want to see facts from different sources describing this entity.

• I want to search in different languages simultaneously.

• I want the software to understand what I mean by "Jaguar".

Knowledge workers' questions

• Has anybody solved the problem xy?

• Who can I ask about xy? What are the others currently working on?

• What is state of the art in xy?

Four demands for a smarter search

1. Find information faster: provide search assistants

2. Reveal hidden information: enrich the search index with background knowledge

3. Find more specific information: query the semantic web

4. Find linked information: integrate data sources

Find information faster – Auto-Complete

I can't remember how to spell the search term.

To provide powerful auto-complete in enterprise search scenarios as well, you need to establish an enterprise vocabulary.
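A minimal sketch of vocabulary-driven auto-complete, assuming the enterprise vocabulary is available as a plain list of labels. The labels and the function name `suggest` are hypothetical and only illustrate the idea; they are not the PoolParty Suggest Service.

```python
from bisect import bisect_left

# Hypothetical enterprise vocabulary; in practice the labels would come from a thesaurus.
VOCABULARY = sorted(label.lower() for label in [
    "hydropower plants", "hydrogen", "small hydro", "solar energy", "SNCR",
])

def suggest(prefix: str, limit: int = 5) -> list[str]:
    """Return up to `limit` vocabulary labels starting with `prefix` (case-insensitive)."""
    prefix = prefix.lower()
    # Binary search for the first label that is not smaller than the prefix.
    start = bisect_left(VOCABULARY, prefix)
    matches = []
    for label in VOCABULARY[start:]:
        if not label.startswith(prefix):
            break  # labels are sorted, so no later label can match
        matches.append(label)
        if len(matches) == limit:
            break
    return matches

print(suggest("hyd"))
```

Because suggestions come only from the controlled vocabulary, the user is steered towards terms that are known to exist and are spelled correctly.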

Find information faster – Status quo

Search query: hydropower plants

I can't remember exactly what I was looking for.

Search query: Small hydro

Find information faster with related search terms

Search query: hydropower plants

http://www.reegle.info/clean-energy-search

Reveal hidden information – Status quo

Search query: SNCR

Search query: SNCR OR "Selective non-

I forgot some of the names for the entity I'm looking for.

Reveal hidden information with query expansion

Search query: SNCR OR "selective non catalytic reduction"

SNCR

selective non catalytic reduction

alternative Label

preferred Label
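Query expansion with thesaurus labels can be sketched as follows. This is an illustration under assumptions: the data structure and the names `THESAURUS` and `expand_query` are hypothetical, and which of the two labels is the preferred one is assumed here rather than taken from the slide.

```python
# Toy thesaurus: one concept with a preferred label and alternative labels.
# Which label is preferred is an assumption made for this sketch.
THESAURUS = [
    {"pref": "selective non catalytic reduction", "alt": ["SNCR"]},
]

def _quote(label: str) -> str:
    """Wrap multi-word labels in double quotes so they are searched as phrases."""
    return f'"{label}"' if " " in label else label

def expand_query(term: str) -> str:
    """Return `term` OR-ed with all other labels of the concept it belongs to."""
    for concept in THESAURUS:
        labels = [concept["pref"], *concept["alt"]]
        if term.lower() in (label.lower() for label in labels):
            others = [label for label in labels if label.lower() != term.lower()]
            return " OR ".join(_quote(label) for label in [term, *others])
    return term  # unknown terms are passed through unchanged

print(expand_query("SNCR"))
```

The user types only one of the names; the other names of the same concept are added automatically.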

Multi-lingual search based on a thesaurus

Search query: clean energy OR energía limpia

clean energy: preferred label @en

energía limpia: preferred label @es

I want to search in different languages simultaneously.
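One way to picture the multilingual case is a concept that carries one preferred label per language. The sketch below assumes such a mapping; the concept identifier `clean_energy` and the function name `multilingual_query` are hypothetical.

```python
# Language-tagged preferred labels per concept, in the spirit of SKOS prefLabel@lang.
# The concept identifier "clean_energy" is hypothetical.
PREF_LABELS = {
    "clean_energy": {"en": "clean energy", "es": "energía limpia"},
}

def multilingual_query(term: str, languages: tuple[str, ...] = ("en", "es")) -> list[str]:
    """Return the concept's preferred label in each requested language."""
    for labels in PREF_LABELS.values():
        if term.lower() in (value.lower() for value in labels.values()):
            return [labels[lang] for lang in languages if lang in labels]
    return [term]  # no matching concept: search for the term as typed

print(" OR ".join(multilingual_query("clean energy")))
```

The lookup goes from label to concept and back to labels, so a query typed in any supported language reaches documents in all of them.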

Reveal hidden information and relations

Find documents or images related to any other text.

http://poolparty.punkt.at/demozone

I want to gain background knowledge about a certain document.

Find more specific information – Status quo

Search query: Goldman Sachs

3 different contexts for "Goldman Sachs":

• Bond issuer

• Analyst

• Stock

I want to know more about this entity in a certain context.

Find more specific information with faceted search

• Facets support structured queries

• Facets help to drill down into search results and adapt dynamically

• Zero-result queries won't happen anymore
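The facet behaviour described above can be sketched in a few lines. The hits below are invented sample data, and `facet_counts` and `drill_down` are hypothetical helper names, not part of any product API.

```python
from collections import Counter

# Hypothetical search hits annotated with facet values.
HITS = [
    {"title": "Bond prospectus", "role": "Bond issuer", "year": 2010},
    {"title": "Equity research note", "role": "Analyst", "year": 2011},
    {"title": "Share price report", "role": "Stock", "year": 2011},
    {"title": "Market outlook", "role": "Analyst", "year": 2010},
]

def facet_counts(hits, field):
    """Count how many hits carry each value of `field`."""
    return Counter(hit[field] for hit in hits)

def drill_down(hits, field, value):
    """Keep only the hits whose `field` equals `value`."""
    return [hit for hit in hits if hit[field] == value]

analyst_hits = drill_down(HITS, "role", "Analyst")
print(facet_counts(HITS, "role"))
print(facet_counts(analyst_hits, "year"))
```

Facet values are computed from the current result set, so every value offered to the user has at least one hit. That is why drilling down through facets cannot end in an empty result, and why the facet list changes after each selection.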

Complex queries with faceted search over linked data

"Show me all airlines whose parent company is Lufthansa"

http://dbpedia.neofonie.de/

My Energy-Dossier about

Find linked information – Status quo

I want to see facts from different sources describing this entity.

The user has to manually piece together energy-related information about a country.

360° views: Find linked information

Energy-related information about countries is "mashed up" automatically by using "linked data".

http://www.reegle.info/countries

Add personal context to the search

I want the software to understand what I mean by "Jaguar".

Search query: Jaguar

The role of thesauri in semantic search

How vertical search can benefit from knowledge models

The role of thesauri in semantic search

The role of thesauri in semantic search (contd.)

Thesaurus as the central point to control:

• labels & query expansion

• facets

• refine search mechanisms

• metadata integration

Data integration and schema mapping based on thesauri

Source 1: <person>Thomas Miller</person>

Source 2: <employee>Tom Miller</employee>
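A rough sketch of how a thesaurus could reconcile the two sources, assuming it records that both element names denote the same concept and that both name variants are labels of the same entity. The lookup tables and the function name `normalize` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical thesaurus entries: element names and name variants mapped to shared concepts.
FIELD_CONCEPTS = {"person": "Person", "employee": "Person"}
NAME_VARIANTS = {"thomas miller": "Thomas Miller", "tom miller": "Thomas Miller"}

def normalize(xml_snippet: str) -> tuple[str, str]:
    """Map an XML element to (concept, canonical label) using the lookup tables above."""
    element = ET.fromstring(xml_snippet)
    value = (element.text or "").strip()
    # Unknown tags and values fall back to themselves.
    concept = FIELD_CONCEPTS.get(element.tag, element.tag)
    label = NAME_VARIANTS.get(value.lower(), value)
    return concept, label

print(normalize("<person> Thomas Miller</person>"))
print(normalize("<employee> Tom Miller</employee>"))
```

Both records end up with the same concept and label, so they can be indexed and searched as one entity even though the source schemas differ.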

Usage of linked data for semantic search

1. Align thesaurus concepts with DBpedia resources

– disambiguation!

– performance!

2. Enrich concept with category information

– schema.org / DBpedia ontology

– YAGO/Umbel

3. Use category information for concepts

1. to categorize documents (using transitivity)

2. to provide search facets
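Step 3 can be illustrated with a small transitive lookup. The hierarchy below is invented sample data standing in for category information from sources such as DBpedia or YAGO, and the function names are hypothetical.

```python
# Hypothetical concept-to-category links and category hierarchy.
BROADER = {
    "Lufthansa": ["Airline"],
    "Airline": ["Company"],
    "Company": ["Organisation"],
}

def all_categories(concept: str) -> set[str]:
    """Collect every category reachable from `concept` by following broader links."""
    seen: set[str] = set()
    stack = list(BROADER.get(concept, []))
    while stack:
        category = stack.pop()
        if category not in seen:  # also guards against cycles in the hierarchy
            seen.add(category)
            stack.extend(BROADER.get(category, []))
    return seen

def categorize(document_concepts: list[str]) -> set[str]:
    """Categories of a document: the union over all concepts found in it."""
    categories: set[str] = set()
    for concept in document_concepts:
        categories |= all_categories(concept)
    return categories

print(sorted(categorize(["Lufthansa"])))
```

A document that mentions only "Lufthansa" is then also findable under the broader categories, and those categories can be offered as search facets.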

PoolParty Semantic Search (PPS)

Make semantic search come true!

PoolParty System Architecture

Search Services

Search Application

Collector <xml>

Semantic Indexer

Document Index

Cartridge

Indexing and Mapping with PoolParty

• Metadata Standards

– Rich metadata in a standardized, extensible format (SKOS / RDF)

– Document metadata is mapped to concepts in the thesaurus

• Cost-efficient metadata management

– Thesaurus is managed with PoolParty's easy-to-use Thesaurus Manager

– One central metadata repository

• Improved end-user experience

– Semantic information improves search experience

PoolParty Search API & Standard GUI

• Available web services:

– Search Service

– Suggest Service

– Similarity Service

• Supported formats:

– JSON

– XML

– RSS

http://bit.ly/semantic_search

PoolParty Semantic Search Demo – Result

http://bit.ly/semantic_search

• Select proper facets

• Store queries with the search basket

• Facets support structured queries

• Find similar documents for relevant results

• Specify your query with categorised auto-complete

Show cases & Demos

Thesaurus based search on the web & intranet

Show Case No. 1: Semantic Search based on reegle thesaurus

Search Services

Search Application

Collector

Semantic Indexer

Document Index

Cartridge

Thesaurus

Projects DB

Web catalogue of actors

Actors DB

Data integration based on Reegle thesaurus

Actors DB: <sector>Hydro Power small scale</sector>

Web catalogue: <category>Micro Hydro</category>

Show Case No. 2: www.reegle.info

Show Case No. 3: Very large financial institute

Search Services

Search Application

Collector

Semantic Indexer

Document Index

Cartridge

VLFI Thesaurus

DMS 1

DMS 2

Contact

Andreas Blumauer, Managing Partner, CEO, a.blumauer@semantic-web.at

Alexander Kreiser, System Architect, a.kreiser@semantic-web.at

Semantic Web Company GmbH

Mariahilfer Straße 70, A-1070 Wien / Austria

+43-1-4021235

http://www.semantic-web.at/

http://www.poolparty.biz/

http://bit.ly/semantic_search

http://lod2.eu/

http://twitter.com/semwebcompany

http://linkd.in/oFFnO4