View
1.960
Download
0
Category
Tags:
Preview:
DESCRIPTION
The Web evolves into a Web of Data. In parallel Intranets of large companies will evolve into Data Intranets based on the Linked Data principles. Linked Data has the potential to complement the SOA paradigm with a light-weight, adaptive data integration approach.
Citation preview
Linked Data for Enterprise Information Integration
Sören Auer
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
The Web evolves into a Web of Data
2
Linked Open Data
FacebookOpen Graph
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
The Evolution of the Web
3
Web 1.0 - Hypertext Static Web pages Hyperlinks Link directories
Web 2.0 – Social Apps Social Web Crowd-sourcing Mashups
Web 3.0 – Linked Data REST APIs, RDF,
JSON-LD Vocabularies Rich-snippets,
Semantic Search
1990 2000 2010
Intranet 1.0 - Hypertext Static Intranet pages Keyword search Hyperlinks
Intranet 2.0 –Social Enterprise Apps Salesforce Crowd-sourcing Mashups
Intranet 3.0 –Enterprise Data Intranet URI Scheme Enterprise taxonomies /
knowledge bases RDB2RDF Mapping
1995 2005 2015
& Enterprise Intranets
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Linked Data Principles
1. Use URIs to identify the “things” in your data
2. Use http:// URIs so people (and machines) can look them up on the web
3. When a URI is looked up, return a description ofthe thing (in RDF format)
4. Include links to related things
http://www.w3.org/DesignIssues/LinkedData.html
4
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Linked Enterprise Data Principles
1. Evolve existing existing taxonomies into enterprise knowledge bases/hubs
2. Establish a enterprise wide URI scheme
3. Equip existing information systems in your intranet with Linked Data interfaces
4. Establish links between related information
5
© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Linked Enterprise Data Advantages
• Light-weight linked data integration complements more complex SOA architectures
• Unified data (access) model simplifies data integration
• Increase standardization while preserving diversity
• Facilitate information flows along supply and value creation chains
Dramatically reduce data integration costs, increase enterprise flexibility
6
Creating Knowledge out of Interlinked Data
Inter-linking/ Fusing
Classifi-cation/ Enrichment
Quality Analysis
Evolution / Repair
Search/ Browsing/
Exploration
Extraction
Storage/ Querying
Manual revision/ authoring
Linked DataLifecycle
Creating Knowledge out of Interlinked Data
Extraction
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
From unstructured sources
• NLP, text mining, annotation
From semi-structured sources
• DBpedia, LinkedGeoData, DataCube
From structured sources
• RDB2RDF
Extraction
Creating Knowledge out of Interlinked Data
Many different approaches: D2R, Virtuoso RDF Views,
Triplify,
No agreement on a formal
semantics of RDF2RDF
mapping
• LOD readiness,
SPARQL-SQL translation
W3C RDB2RDF WG
Extraction Relational Data
Tool Triplify Sparqlify D2RQ Virtuoso RDF Views
TechnologyScripting
languages (PHP)
Java JavaWhole
middleware solution
SPARQL endpoint - X X X
Mapping language SQL
SPARQL CONSTRUCT Views + SQL
RDF based RDF based
Mapping generation Manual Semi-
automaticSemi-
automatic Manual
ScalabilityMedium-
high(but no
SPARQL)Very high Medium High
Malhotra, Auer, Erling, Hausenblas: W3C RDB2RDF Incubator Group Report. W3C RDB2RDF Incubator Group, 2009.
Creating Knowledge out of Interlinked Data
• Rationale: Exploit existing
formalisms (SQL, SPARQL
Construct) as much as possible
• flexible & versatile mapping
language
• translating one SPARQL query into
exactly one efficiently executable
SQL query
• Solid theoretical formalization
based on SPARQL-relational algebra
transformations
• Extremely scalable through
elaborated view candidate selection
mechanism
• Used to publish 20B triples for
LinkedGeoData
Sparqlify
Stadler, Unbehauen, Auer, Lehmann: Sparqlify – Very Large Scale Linked Data Publication from Relational Databases. Submitted to VLDB-Journal.
SPARQLConstruct
SQLView
Bridge
Creating Knowledge out of Interlinked Data
Storage and Querying
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
Authoring
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
1. Semantic (Text)
Wikis
• Authoring of
semantically
annotated texts
2. Semantic Data
Wikis
• Direct authoring of
structured information
(i.e. RDF, RDF-
Schema, OWL)
Two Kinds of Semantic Wikis
Creating Knowledge out of Interlinked Data
The situation at Daimler (€97.76 billion revenue, 250.000
employees):
• 3.000 heterogeneous IT systems
• Different units (car, bus, truck etc.) with very different
views
• No common language
• Inability to identify crucial entities (parts, locations
etc.) enterprise wide
There is no (can not be a) single Enterprise Information Model
A distributed, iterative, bottom-up integration approach
such as Linked Data might be able to help (pay-as-you-go).
Can Linked Data help to solve the EII problem in a fortune-500 company?
Creating Knowledge out of Interlinked Data
16
Search before
Creating Knowledge out of Interlinked Data
OntoWiki with loaded car model data
Creating Knowledge out of Interlinked Data
OntoWiki with loaded car model data
Creating Knowledge out of Interlinked Data
Management of Enterprise Taxonomies with OntoWikiBased on the W3C SKOS standard
Corporate Language Management at Daimler: 500k concepts in 20 languages
Creating Knowledge out of Interlinked Data
Search afterShowing recommondations from the knowledge base integrating car model data and enterprise taxonomy
Creating Knowledge out of Interlinked Data
You can search for „Kombi“ (station wagon) and find T-Models (Daimler term for station waggon)
From
Intr
anet
to E
nter
pris
e D
ata
Web
aro
und
a kn
owle
dge
hub
Auer, Frischmuth, Klímek, Unbehauen, Holzweißig, Marquardt: Linked Data in Enterprise Information Integration Submitted to Semantic Web Journal 2012.
Creating Knowledge out of Interlinked Data
© CC-BY-NC-ND by ~Dezz~ (residae on flickr)
Linking
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked DataIn an uncontrolled
environment as the Data
Web, there will be a
proliferation of equivalent
or similar entity identifiers
Manual Link discovery:• Sindice integration into UIs
• Semantic Pingback
Semi-automatic:• SILK
• LIMES
Automatic/ Supervised:
• Raven [1]
Linking Entities on the Data Web
[1] Ngonga, Lehmann, Auer, Höffner: RAVEN -- Active Learning of Link Specifications, OM@ISWC, 2011.
Creating Knowledge out of Interlinked Data
Enrichment
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
Linked Data is mainly instance data!!!
ORE (Ontology Repair and Enrichment) tool allows to improve an
OWL ontology by fixing inconsistencies & making suggestions for
adding further axioms.• Ontology Debugging: OWL reasoning to detect inconsistencies and
satisfiable classes + detect the most likely sources for the problems.
user can create a repair plan, while maintaining full control.
• Ontology Enrichment: uses the DL-Learner framework to suggest
definitions & super classes for existing classes in the KB. works if
instance data is available for harmonising schema and data.
http://aksw.org/Projects/ORE
Enrichment & Repair
Lehmann, Auer, Tramp: Class Expression Learning for Ontology Engineering. Journal of Web Semantics (JWS), 2011.
Creating Knowledge out of Interlinked Data
Analysis
Quality
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
Quality on the Data Web is varying a lot• Hand crafted or expensively curated knowledge base
(e.g. DBLP, UMLS) vs. extracted from text or Web 2.0
sources (DBpedia)
Research Challenge• Establish measures for assessing the authority,
provenance, reliability of Data Web resources
Opportunity for EII: Employ crowd-sourced
knowledge from the Data Web in the Enterprise
Linked Data Quality Analysis
FP7-IP DIACHRON Managing the Evolution and Preservation of the Data WebStarted April 2013
Creating Knowledge out of Interlinked Data
Evolution
© CC-BY-SA by alasis on flickr)
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
Exploration
Inter-linking
Enrichment
Quality Analysis
Evolution Repair
Explora-tion
Extrac-tion
Store Query
Authoring
Creating Knowledge out of Interlinked Data
An ecosystem of LOD visualizations
LOD
Exp
lora
tion
Wid
gets
Spatial faceted-browsing
Faceted-browsing
Statisticalvisualization
Entity-/faceted-Based browsing
Domain specificvisualizations … …
LOD
Dat
aset
sCh
oreo
grap
hyla
yer
• Dataset analysis (size, vocabularies, property histograms etc.)• Selection of suitable visualization widgets
Brunetti, Auer, García: The Linked Data Visualization Model. To appear in IJSWIS, 2012.
Creating Knowledge out of Interlinked Data
LOD Life-(Washing-)cycle supported by Debian based LOD2 Stack
http://stack.lod2.eu
Creating Knowledge out of Interlinked Data
Linked Enterprise Intra Data Webs fill the gap between Intra-/Extranets and EIS/ERP
Unstructured InformationManagement
Structured InformationManagement
Support the long tail of enterprise information domains
• Human-resources• Requirements engineering• Supply-chains
Creating Knowledge out of Interlinked Data
• Linked Data is a promising technology for closing the
gap between SOA and unstructured information
management
• wealth of knowledge available as LOD can be
leveraged as background knowledge for Enterprise
applications
• The application of Linked Data in the enterprise is still
largely unexplored (opportunity)
• Linked Data will make Enterprise Information
Integration more flexible, iterative, cost effective
Take home messages
Auer, Frischmuth, Klímek, Tramp, Unbehauen, Holzweißig, Marquardt: Linked Data in Enterprise Information Integration Submitted to Semantic Web Journal.
Creating Knowledge out of Interlinked Data
Thanks for your attention!
Sören Auerhttp://www.informatik.uni-leipzig.de/~auer | http://aksw.org |
http://lod2.orgauer@cs.uni-bonn.de
Recommended