Mike Bergman presents an overview geared to laypersons for why semantic technologies make the best choice for knowledge applications
Michael K. Bergman
July 2012
The Rationale for Semantic Technologies
Outline
§ Nature of the World
§ Knowledge Representation, Not Transactions
§ The New Open World Paradigm
§ Integrating All Forms of Information
§ Connections Create Graphs
§ Network Analysis is the New Algebra
§ Information and Interaction is Distributed
§ The Web is the Perfect Medium
§ Leveraging – Not Replacing – Existing IT Assets
§ Democratizing the Knowledge Function
§ Seven Pillars of the Semantic Enterprise
§ Summary of Semantic Technology Benefits
Some Caveats
Semantic technologies are NOT:
    Cloud computing
    Big data
    Necessarily open data
    “One ring to rule them all”
    A replacement for current IT systems
These ideas are mostly orthogonal to semantics
Nature of the World
Messy
Complicated
Interconnected
Changing
Interdependent
Uncertain
Diverse
Nature of Knowledge
Knowledge is never complete
Knowledge is found in structured, semi-structured and unstructured forms
Knowledge can be found anywhere
Knowledge structure evolves with the incorporation of more information
Knowledge is contextual
Knowledge should be coherent
Knowledge is shaped by its users, who define its structure and use
Knowledge ≡ Nature of the World
Knowledge Representation, Not Transactions
Knowledge representation (KR) functions:
    Search
    Business intelligence
    Competitive intelligence
    Planning
    Data federation
    Data warehousing
    Knowledge management
    Enterprise information integration
    Master data management
Traditional IT has been transaction-oriented (e.g., “seats on a plane”)
Current Approaches Have Failed
Relational databases:
    Structured data only
    Inflexible, fragile
    Constant re-architecture
Business intelligence:
    Slow, inflexible
    Structured data only
    IT-constrained, not user-driven
Extract, Transform, Load (ETL):
    Structured data only
    Inflexible, fragile
High cost, incomplete, not adaptable
A 30-yr Quest to Integrate Content
Content and data federation has remained unsolved for the 30 years since IT systems were first adopted, because of:
    Structured + semi-structured + unstructured content
    Data “silos” and unconnected systems
    Incompatible protocols and hardware
    85% of content not in databases
    Semantic heterogeneities
    No universal data model
The New Open World Paradigm
Opposite logic of closed-world transactions
The open world assumption (OWA) means:
    Lack of a given assertion does not imply whether it is true or false: it simply is not known
    A lack of knowledge does not imply falsity
    Everything is permitted until it is prohibited
    Schema can be incremental without re-architecting prior schema (“extensible”)
    Information at various levels of incompleteness can be combined
The right logic for KR problems
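The contrast between closed-world and open-world logic can be sketched in a few lines of Python. This is a minimal illustration, not how any particular reasoner works; the facts, names, and the two helper functions are hypothetical.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts, for illustration only.
asserted_true: Set[Triple] = {("Jane", "worksFor", "Acme")}
asserted_false: Set[Triple] = {("Jane", "worksFor", "Globex")}

def closed_world(fact: Triple) -> bool:
    """Closed world: anything that is not asserted is treated as false."""
    return fact in asserted_true

def open_world(fact: Triple) -> Optional[bool]:
    """Open world: True or False only when asserted; None means 'not known'."""
    if fact in asserted_true:
        return True
    if fact in asserted_false:
        return False
    return None

query = ("Jane", "worksFor", "Initech")
print(closed_world(query))  # False: absence is read as falsity
print(open_world(query))    # None: absence is read as "not known"
```

Under the open-world reading, a later source can assert the missing fact without contradicting anything already recorded, which is what allows information at different levels of completeness to be combined.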
Integrating All Forms of Information
Uses a “canonical” data model (RDF)
RDF is a universal solvent for all information:
    Unstructured data – text, images
    Semi-structured data – markup, metadata
    Structured data – databases, tables
“Soft” (social, opinion) + “hard” (facts) information
RDF can represent anything from simple assertions (“Jane runs fast”) to complex vocabularies and languages
Generic tools can be driven by the RDF data model
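As a rough sketch of the idea, RDF statements can be pictured as subject-predicate-object triples. The example below uses plain Python tuples rather than an RDF library, and every identifier and value in it (the `ex:` names, the `describe` helper) is hypothetical.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# All identifiers and values are hypothetical.
triples = [
    ("ex:Jane", "ex:runs", "fast"),               # simple assertion taken from text
    ("ex:Jane", "ex:employeeId", "1042"),         # structured: a database row
    ("ex:report7", "ex:author", "ex:Jane"),       # semi-structured: document metadata
    ("ex:report7", "ex:sentiment", "positive"),   # "soft" information
]

def describe(subject, data):
    """A generic tool: collect everything stated about one subject."""
    properties = defaultdict(list)
    for s, p, o in data:
        if s == subject:
            properties[p].append(o)
    return dict(properties)

print(describe("ex:Jane", triples))
```

The `describe` function knows nothing about employees or reports. It depends only on the three-part shape of the data, which is the sense in which a generic tool can be driven by the data model.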
[Figure: Integrated Data and Tools using RDF]
Connections Create Graphs
Things and concepts create nodes
Relationships between things create connections (“edges”)
Adding things leads to more connections
More connections lead to more structure
Coherent structure leads to more knowledge and understanding
The natural structure of knowledge domains is a graph
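The progression from things to connections to structure can be sketched by building a graph from a handful of statements and then adding more. The statements and the helper names `build_graph` and `edge_count` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def build_graph(triples):
    """Turn (thing, relationship, thing) statements into an undirected graph."""
    graph = defaultdict(set)
    for subject, _relationship, obj in triples:
        graph[subject].add(obj)   # each statement adds one edge...
        graph[obj].add(subject)   # ...visible from both of its nodes
    return graph

def edge_count(graph):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in graph.values()) // 2

# Hypothetical statements, for illustration only.
statements = [
    ("Jane", "worksFor", "Acme"),
    ("Acme", "locatedIn", "Iowa"),
]
before = build_graph(statements)
print(len(before), edge_count(before))   # 3 nodes, 2 edges

# Adding one new thing (Bob) brings two new connections with it.
statements.append(("Bob", "worksFor", "Acme"))
statements.append(("Bob", "livesIn", "Iowa"))
after = build_graph(statements)
print(len(after), edge_count(after))     # 4 nodes, 4 edges
```

Nothing had to be redesigned to accept the new statements; the graph simply grew.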
[Figure: Graphs Grow Naturally with Knowledge]
Benefits of Graphs (ontologies)
Coherent navigation
Flexible entry points
Inferencing
Reasoning
Connections to related information
Ability to represent any form of information
Concept matching to integrate external content
A framework for disambiguation
A common vocabulary to drive content “tagging”
Network Analysis is the New Algebra
Network analysis provides new tools for gauging:
    Influence
    Relatedness
    Proximity
    Centrality
    Inference
    Shortest paths
    Diffusion
Graphs can represent any structure
Many structures can only be represented by graphs
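Two of the measures listed above, shortest paths and centrality, can be sketched with the standard library alone. The graph below is hypothetical, and degree centrality is only one of several centrality measures; a real project would more likely use a dedicated graph library.

```python
from collections import deque

# A small hypothetical graph: each node maps to the set of its neighbors.
graph = {
    "Jane": {"Acme", "Iowa"},
    "Acme": {"Jane", "Bob", "Iowa"},
    "Bob": {"Acme", "Chess Club"},
    "Iowa": {"Jane", "Acme"},
    "Chess Club": {"Bob"},
}

def shortest_path(graph, start, goal):
    """Breadth-first search: returns a path with the fewest edges, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):   # sorted for repeatable results
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

def degree_centrality(graph):
    """Share of the other nodes that each node is directly connected to."""
    others = len(graph) - 1
    return {node: len(neighbors) / others for node, neighbors in graph.items()}

print(shortest_path(graph, "Jane", "Chess Club"))
centrality = degree_centrality(graph)
print(max(centrality, key=centrality.get))   # the best-connected node
```

Here the path from Jane to the Chess Club runs through Acme and Bob, and Acme is the most central node because it touches three of the four other nodes.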
Information and Interaction is Distributed
Knowledge is everywhere
People and stakeholders are everywhere
External information needs to be integrated with internal information
A uniform access protocol/framework is desirable to:
    Preserve existing information assets
    Reflect the diversity of data formats
The Web is the Perfect Medium
All information may be accessed via the Web
All information may be given Web identifiers (URIs)
All Web tools are available for use and integration
All Web information may be integrated
Web-oriented architectures (WOA) have proven:
    Scalability
    Robustness
    Substitutability
Most Web technologies are open source
[Figure: A Distributed Web-oriented Architecture]
Leveraging – Not Replacing – Existing IT Assets
Existing IT assets represent:
    Massive sunk costs
    Legacy knowledge and expertise
    Stakeholder consensus
Yet, still stovepiped
Semantic technologies are an interoperability layer over existing IT assets
Preserve prior investments while enabling interoperability
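One way to picture an interoperability layer is a read-only view that exposes existing relational rows as triples with Web identifiers, leaving the source database untouched. The sketch below is a simplified illustration only: the `employee` table, the `http://example.org/` base URI, and the `rows_to_triples` helper are all hypothetical, and it does not follow any particular mapping standard.

```python
import sqlite3

# A stand-in for an existing system; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Jane', 'Sales')")

def rows_to_triples(conn, table, base_uri):
    """Read rows and express each non-key column as one triple.

    The table name is interpolated directly, so it must come from trusted
    code, never from user input. The first column is assumed to be the key.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        subject = f"{base_uri}{table}/{row[0]}"          # one URI per row
        for column, value in zip(columns[1:], row[1:]):
            predicate = f"{base_uri}{table}#{column}"    # one URI per column
            triples.append((subject, predicate, value))
    return triples

for triple in rows_to_triples(conn, "employee", "http://example.org/"):
    print(triple)
```

The database keeps its schema and its data; the triples are generated on top of it, so other sources described the same way can be combined with it later.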
Democratizing the Knowledge Function
Move from bespoke software to knowledge graphs
Knowledge graphs can be constructed and modified by:
    Subject matter experts
    Employees
    Partners
    Stakeholders
    General public
Graph-driven applications can be made generic by function or by visualization
Graph-driven applications democratize KR
Seven Pillars of the Semantic Enterprise
Summary of Semantic Technology Benefits
Can deploy incrementally:
    Lower risks
    Lower costs
Excellent integration approach
No need to re-do schema because of changed circumstances
Leverages existing information assets
Well-suited for knowledge applications
Can accommodate multiple viewpoints, stakeholders
Leadership visibility to the Forum