
Semantic Mapping through a concept hub

Dagobert Soergel

College of Information Studies, University of Maryland

Department of Library and Information Studies, University at Buffalo


Mapping through a Hub

Hub
Water transport
Inland water transport
Ocean transport
Traffic station ⊓ Water transport
Traffic station ⊓ Inland water tr.
Traffic station ⊓ Ocean transport

Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports

LCSH
Shipping
Inland water transport
Merchant marine
Harbors

German
Hafen


Outline

• Objective: Interoperability Plus

• KOS concept hub

• Method: Knowledge-based, computer-assisted creation of canonical representations of concepts

• Resulting knowledge base and applications


Objective

Improve semantic-based search of digital content across multiple collections in multiple languages.

• Interoperability between any two participating KOS (Knowledge Organization Systems)

• Support for search, esp. facet-based search
  • for any collection indexed by a participating KOS
  • for free-text search

• Assistance in cataloging (metadata creation) by catalogers or users (social tagging)

• Long-range goal: Web service where a KOS can be uploaded and mappings to specified target KOS are returned


KOS Concept Hub

• The backbone of the proposed system is a faceted core classification of atomic concepts together with a set of relationships

• Interoperability is achieved by expressing concepts from all participating KOS as a canonical representation: a description logic (DL) formula using atomic concepts and relationships

• Mapping from KOS to KOS is achieved by reasoning over these canonical representations
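
One way to read "reasoning over canonical representations", assuming (as the later examples in this deck suggest) that a canonical representation is a conjunction of atomic concepts. The symbols \varphi, c, and d are introduced here for illustration only and do not appear in the original slides:

```latex
\varphi(c) = A_1 \sqcap A_2 \sqcap \dots \sqcap A_n
\qquad
c \equiv d \iff \varphi(c) \equiv \varphi(d)
\qquad
c \sqsubseteq d \iff \varphi(c) \sqsubseteq \varphi(d)
```

Under this reading, two concepts from different KOS map exactly when their formulas are equivalent, and a concept d in the target KOS is a broader match for c when the formula of c is subsumed by the formula of d.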


Method: How to get DL formulas

Key: Efficient creation of canonical representations (DL formulas)

• Apply existing knowledge: Large knowledge base ▬► less effort for processing a new KOS

• Use knowledge of KOS structure for hierarchical inheritance

• Use linguistic analysis of terms and captions

• Eliminate redundant atomic concepts

• Check or produce mapping results from assignment of concepts to the same records

• Get human editors’ input and verification where needed through a user-friendly interface

• KOS “owners” may verify and edit data pertaining to their KOS


Knowledge base

Requires an ever larger classification and lexical knowledge base containing many kinds of data:

1. A faceted classification of atomic concepts. Seeded from sources with well-developed facets such as the AOD Thesaurus, the Harvard Business Thesaurus, the Art and Architecture Thesaurus, and various ontologies

2. Linguistic knowledge bases such as WordNet and mono-, bi-, and multilingual dictionaries and thesauri

3. Many KOS (Knowledge Organization Systems), such as LCC, DDC, DMOZ directory, LCSH, Gene Ontology, Schlagwortnormdatei

4. These will over time be fused into one large multilingual knowledge base with many terminological and translation relationships and relationships linking terms to concepts, with an increasing number of concepts semantically represented by a DL formula.


Examples of deriving DL formulas


Underlying faceted classification

L00 Transportation and traffic
  L10 Traffic system components
    L13 Traffic facilities
    L15 Traffic stations
    L17 Vehicles
  L30 Modes of transportation
    L33 Air transport
    L37 Water transport

P00 Buildings, construction
  P23 Buildings
  P27 Architecture
  P43 Construction

R00 Engineering
  R30 Acoustics
    R37 Soundproofing

T70 Military vs. civilian
  T73 Military
  T77 Civilian


Method: Assigning atomic concepts 1

HE Transportation
= L00 Transportation and traffic ⊓ T77 Civilian

HE550-560 Ports, harbors, docks, wharves, etc.
Inherited: L00 Transportation and traffic ⊓ T77 Civilian
Added by editor: L15 Traffic stations ⊓ L37 Water transport
Resolved to: L15 Traffic stations ⊓ L37 Water transport ⊓ T77 Civilian
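
The inherit, add, and resolve steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `PARENT` table, the `ancestors` and `resolve` helpers, and the nesting of the classification codes (for example L15 under L10 under L00) are assumptions read off the notation.

```python
from typing import Dict, Optional, Set

# Parent links for a slice of the faceted classification.
# ASSUMPTION: the nesting is inferred from the notation on the slides.
PARENT: Dict[str, Optional[str]] = {
    "L00": None, "L10": "L00", "L13": "L10", "L15": "L10", "L17": "L10",
    "L30": "L00", "L33": "L30", "L37": "L30",
    "T70": None, "T73": "T70", "T77": "T70",
}


def ancestors(code: str) -> Set[str]:
    """Return all broader concepts of `code`, following PARENT links upward."""
    found: Set[str] = set()
    parent = PARENT.get(code)
    while parent is not None:
        found.add(parent)
        parent = PARENT.get(parent)
    return found


def resolve(inherited: Set[str], added: Set[str]) -> Set[str]:
    """Merge inherited and editor-added concepts, then drop every concept
    that is broader than another concept in the merged set."""
    merged = inherited | added
    redundant: Set[str] = set()
    for code in merged:
        redundant |= ancestors(code) & merged
    return merged - redundant


# HE Transportation = L00 ⊓ T77; the editor adds L15 ⊓ L37 for HE550-560.
he550_560 = resolve(inherited={"L00", "T77"}, added={"L15", "L37"})
print(sorted(he550_560))  # ['L15', 'L37', 'T77']
```

L00 drops out of the result because both L15 and L37 are narrower than it, which matches the "Resolved to" line on the slide.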


Method: Assigning atomic concepts 2

NA6300-6307 Airport buildings

From database already established:
Airport = L15 Traffic stations ⊓ L33 Air transport
Buildings = P23 Buildings

Added by editor: T77 Civilian

Resolved to: L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian


Method: Assigning atomic concepts 3

TL681.S6 Airplanes. Soundproofing

From database already established:
Airplane = L17 Vehicles ⊓ L33 Air transport
Soundproofing = R37 Soundproofing

Added by editor: Nothing

Resolved to: L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing


Method: Assigning atomic concepts 4

Aeroplanes-Soundproofing

From database already established:
Aeroplanes = Airplane [Spelling variant]

Therefore the term is recognized as the same as Airplanes. Soundproofing

Resolved to: L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing


Method: Assigning atomic concepts 5

Any class formed by geographical subdivision, such as

NA6300-6307 Airport buildings
NA6305.E3 Egypt

Recognized using a dictionary of geographical names

Inherits from the subject class above it; simply add the country:
L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian ⊓ Egypt

No editor checking needed
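
The lookups in examples 2 to 5 can be sketched as a small pipeline: normalize spelling variants, take the union of the formulas already stored for each component word, add whatever the editor supplies, and handle geographical subdivisions by inheritance. All names here (`LEXICON`, `VARIANTS`, `GEO_NAMES`, `derive`, `subdivide`) are hypothetical, and treating the plural "Airplanes" as a variant of "Airplane" is an assumption made for the sketch.

```python
from typing import Dict, FrozenSet, List, Optional

Formula = FrozenSet[str]

# Entries "from database already established", as quoted on the slides.
LEXICON: Dict[str, Formula] = {
    "airport": frozenset({"L15 Traffic stations", "L33 Air transport"}),
    "airplane": frozenset({"L17 Vehicles", "L33 Air transport"}),
    "buildings": frozenset({"P23 Buildings"}),
    "soundproofing": frozenset({"R37 Soundproofing"}),
}
# ASSUMPTION: the plural form is handled like the spelling variant.
VARIANTS: Dict[str, str] = {"aeroplanes": "airplane", "airplanes": "airplane"}
# Stand-in for a dictionary of geographical names.
GEO_NAMES = {"Egypt"}


def normalize(word: str) -> str:
    """Lower-case a word and map known variants to the preferred form."""
    lowered = word.strip().lower()
    return VARIANTS.get(lowered, lowered)


def derive(words: List[str], editor_added: Formula = frozenset()) -> Optional[Formula]:
    """Union the stored formulas of all component words plus editor additions.
    Returns None when a word is unknown, i.e. a human editor is needed."""
    result = set(editor_added)
    for word in words:
        entry = LEXICON.get(normalize(word))
        if entry is None:
            return None
        result |= entry
    return frozenset(result)


def subdivide(parent: Formula, caption: str) -> Optional[Formula]:
    """Geographical subdivision: inherit the parent formula and add the place."""
    if caption in GEO_NAMES:
        return parent | {caption}
    return None


airport_buildings = derive(["Airport", "buildings"], frozenset({"T77 Civilian"}))
lcc_class = derive(["Airplanes", "Soundproofing"])
lcsh_heading = derive(["Aeroplanes", "Soundproofing"])
egypt = subdivide(airport_buildings, "Egypt")
print(lcc_class == lcsh_heading)  # True
```

Because both the LCC caption and the LCSH heading resolve to the same set of atomic concepts, they are recognized as the same concept without editor input.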


Examples from the resulting knowledge base


HE550-560 Ports, harbors, docks, wharves, etc.
= L15 Traffic stations ⊓ L37 Water transport ⊓ T77 Civilian

NA2800 Architectural acoustics
= P27 Architecture ⊓ R30 Acoustics

NA6300-6307 Airport buildings
= L15 Traffic stations ⊓ L33 Air transport ⊓ P23 Buildings ⊓ T77 Civilian

NA6330 Dock buildings, ferry houses, etc.
= L15 Traffic stations ⊓ L37 Water transport ⊓ P23 Buildings ⊓ T77 Civilian

TC350-374 Harbor works
= L15 Traffic stations ⊓ L37 Water transport ⊓ R00 Engineering ⊓ T77 Civilian

TH1725 Soundproof construction
= P23 Buildings ⊓ P43 Construction ⊓ R37 Soundproofing

TL681.S6 Airplanes. Soundproofing
= L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing

TL725-726 Airways (Routes). Airports and landing fields. Aerodromes
= L13 Traffic facilities ⊓ L33 Air transport ⊓ Technical aspects

VA67-79 Naval ports, bases, reservations, docks
= L15 Traffic stations ⊓ L37 Water transport ⊓ T73 Military

VM367.S6 Submarines. Soundproofing
= L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ T73 Military ⊓ Underwater


LC subject headings with combinations of atomic concepts

Aeroplanes-Soundproofing
= L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing

Airports-Buildings
= P23 Buildings ⊓ L15 Traffic stations ⊓ L33 Air transport

Buildings-Soundproofing
= P23 Buildings ⊓ P43 Construction ⊓ R37 Soundproofing

Ships-Soundproofing
= L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing


Mapping through a Hub

Hub
L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing
L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing
L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ T73 Military ⊓ Underwater

LCC
TL681.S6 Airplanes. Soundproofing
VM367.S6 Submarines. Soundproofing

LCSH
Aeroplanes-Soundproofing
Ships-Soundproofing
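
A minimal sketch of how a mapping from LCC to LCSH could be read off the hub formulas. It simplifies the reasoning to plain set comparison on conjunctions of atomic concepts (equal sets give an exact match, a proper subset gives a broader match) and ignores the hierarchy among the atomic concepts themselves. The function `map_concept` and the dictionary layout are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Formula = FrozenSet[str]

# Hub formulas as given on the slides, abbreviated to their codes.
LCC: Dict[str, Formula] = {
    "TL681.S6 Airplanes. Soundproofing": frozenset({"L17", "L33", "R37"}),
    "VM367.S6 Submarines. Soundproofing": frozenset(
        {"L17", "L37", "R37", "T73", "Underwater"}
    ),
}
LCSH: Dict[str, Formula] = {
    "Aeroplanes-Soundproofing": frozenset({"L17", "L33", "R37"}),
    "Ships-Soundproofing": frozenset({"L17", "L37", "R37"}),
}


def map_concept(source: Formula, target: Dict[str, Formula]) -> List[Tuple[str, str]]:
    """Return exact matches if any exist; otherwise return broader matches,
    i.e. target concepts whose formula is a proper subset of the source."""
    exact = sorted((name, "exact") for name, f in target.items() if f == source)
    if exact:
        return exact
    return sorted((name, "broader") for name, f in target.items() if f < source)


for lcc_class, hub_formula in LCC.items():
    print(lcc_class, "->", map_concept(hub_formula, LCSH))
```

In this sketch the airplane class maps exactly to Aeroplanes-Soundproofing, while the submarine class has no exact LCSH counterpart and falls back to the broader Ships-Soundproofing.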


Mapping user queries

User query
• Free text
• Combination of elemental concepts through facets (guided query formulation)
• Controlled term(s) from a KOS, possibly found through browsing a KOS

Hub
• Canonical form of query (DL formula)

Final query
• (Enriched) free text query
• Query in terms of a KOS


Query: L17 Vehicles AND R37 Soundproofing

TL681.S6 Airplanes. Soundproofing
[L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing]

VM367.S6 Submarines. Soundproofing
[L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing ⊓ Military]

Aeroplanes-Soundproofing
[L17 Vehicles ⊓ L33 Air transport ⊓ R37 Soundproofing]

Ships-Soundproofing
[L17 Vehicles ⊓ L37 Water transport ⊓ R37 Soundproofing]
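
The query above can be evaluated against the hub formulas with a simple containment test: a class or heading matches when its formula contains every atomic concept in the query. This is a sketch under that assumption; the `INDEX` layout, the "LCC"/"LCSH" prefixes in the keys, and the `search` function are illustrative and not part of the original slides.

```python
from typing import Dict, FrozenSet, List

# Hub formulas for the four entries on the slide, abbreviated to codes.
INDEX: Dict[str, FrozenSet[str]] = {
    "LCC TL681.S6 Airplanes. Soundproofing": frozenset({"L17", "L33", "R37"}),
    "LCC VM367.S6 Submarines. Soundproofing": frozenset(
        {"L17", "L37", "R37", "Military"}
    ),
    "LCSH Aeroplanes-Soundproofing": frozenset({"L17", "L33", "R37"}),
    "LCSH Ships-Soundproofing": frozenset({"L17", "L37", "R37"}),
}


def search(query: FrozenSet[str], index: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return every entry whose formula contains all query concepts."""
    return sorted(name for name, formula in index.items() if query <= formula)


# Query from the slide: L17 Vehicles AND R37 Soundproofing.
hits = search(frozenset({"L17", "R37"}), INDEX)
for hit in hits:
    print(hit)
```

All four entries match, regardless of which KOS they come from; adding L33 Air transport to the query would narrow the result to the two airplane entries.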


Examples from NALT and LCSH

• NALT National Agricultural Library Thesaurus

• LCSH Library of Congress Subject Headings


Air pollution laws

LCSH term

Air – Pollution – Laws and regulations

[isa] Legal rule [appliedTo] {[isa] Condition [isConditionOf] Air [causedBy] Pollutant [property] Undesirable}

NALT terms

Air pollution

[isa] Condition [isConditionOf] Air [causedBy] Pollutant [property] Undesirable

Laws and regulations

[isa] Legal rule

Mapping LCSH ▬► NALT

Air – Pollution – Laws and regulations ▬► Air pollution AND Laws and regulations

Interpretation for indexing and searching in both directions


Soil moisture vs. Soil water

LCSH term

Soil moisture

[isa] Water [containedIn] Soil

NALT term

Soil water

[isa] Water [containedIn] Soil

Mapping LCSH ▬► NALT

Soil moisture ▬► Soil water


Greenhouse gardening

LCSH term

Greenhouse gardening

[isa] Gardening [inEnvironment] Greenhouse [inEnvironment] Home

NALT terms

Home gardening

[isa] Gardening [inEnvironment] Home

Greenhouse

[isa] Greenhouse

Mapping LCSH ▬► NALT

Greenhouse gardening ▬► Home gardening AND Greenhouse


Salad greens

LCSH term

Salad greens

[isa] Green leafy vegetable [usedFor] Salad

NALT term

Green leafy vegetables

[isa] Green leafy vegetable

Mapping LCSH ▬► NALT

Salad greens ▬► BT Green leafy vegetables


Emerging diseases

LCSH term

Emerging infectious diseases

[isa] Disease [hasProperty] Infectious [hasProperty] Emerging

NALT term

Emerging diseases

[isa] Disease [hasProperty] Infectious ??? [hasProperty] Emerging

Mapping LCSH ▬► NALT

Emerging infectious diseases ▬► Emerging diseases

Emerging infectious diseases ▬► BT Emerging diseases
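
The single-term mappings in the NALT and LCSH examples can be sketched by treating each formula as a set of (role, filler) pairs: identical sets give an exact mapping, and a target whose pairs are a proper subset of the source is a broader term (BT). This is an illustrative simplification; the `relation` function and the pair encoding are assumptions. The combined mappings with AND are not covered, since they need reasoning over the fillers.

```python
from typing import FrozenSet, Tuple

Pair = Tuple[str, str]
Formula = FrozenSet[Pair]


def formula(*pairs: Pair) -> Formula:
    """Build a formula from (role, filler) pairs."""
    return frozenset(pairs)


def relation(source: Formula, target: Formula) -> str:
    """Classify the mapping from source to target by set comparison."""
    if source == target:
        return "exact"
    if target < source:
        return "BT"  # target says less, so it is the broader concept
    if source < target:
        return "NT"
    return "none"


soil_moisture = formula(("isa", "Water"), ("containedIn", "Soil"))
soil_water = formula(("isa", "Water"), ("containedIn", "Soil"))
salad_greens = formula(("isa", "Green leafy vegetable"), ("usedFor", "Salad"))
green_leafy = formula(("isa", "Green leafy vegetable"))

emerging_infectious = formula(
    ("isa", "Disease"), ("hasProperty", "Infectious"), ("hasProperty", "Emerging")
)
# The slide leaves open whether NALT "Emerging diseases" implies Infectious,
# so both readings are shown.
emerging_with = emerging_infectious
emerging_without = formula(("isa", "Disease"), ("hasProperty", "Emerging"))

print(relation(soil_moisture, soil_water))             # exact
print(relation(salad_greens, green_leafy))             # BT
print(relation(emerging_infectious, emerging_with))    # exact
print(relation(emerging_infectious, emerging_without)) # BT
```

The two readings of Emerging diseases reproduce the two alternative mappings listed on the slide: an exact mapping if the NALT term implies Infectious, and a BT mapping if it does not.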


Distributed implementation

• A KOS on the Web could assign DL formulas to its concepts − let's call this a semantically enhanced KOS or SEKOS

• Could use any of a number of faceted core classifications or even several (using a unique URI for each elemental concept)

• Core classifications could be mapped to each other

• It is now a simple matter to map from any SEKOS to any other (somewhat dependent on the core classifications used)


Take-home message

Semantics gives powerful systems

Semantik schafft maechtige Systeme


This project will achieve the following

• Interoperability between any two participating Knowledge Organization Systems (KOS) (to the extent the two schemes allow)
• Facet-based search
  • for any collection indexed by a participating KOS
  • for free-text search
• Assistance in cataloging (metadata creation) by catalogers or users (social tagging)
• Long-range goal: Web service where a KOS can be uploaded and mappings to specified target KOS are returned

Means

• Create a comprehensive knowledge base relating many classification schemes and subject heading lists used in libraries and in other contexts (LCC, DDC, DMOZ directory, LCSH, European schemes).
• Use combinations of atomic concepts taken from a well-structured underlying faceted classification to represent the meaning of classes and subject headings.


Mapping through a Hub (diagram: Hub, LCC, LCSH)


Koeln 20090706

• Themes
• Role indicators for building themes
• Arrangement of themes for exploration under user control
• Carry-over from citation order
• Practical problem of connection to the participating systems: should use IDs for combinations in Hub. Make sure that hub stays consistent with participating systems.