Linked Data Warehouses: A new breed of Business Intelligence

Using a Linked Data approach to publish and consume data on the Web significantly reduces the cost and complexity of reaching many more consumers of your content. This presentation highlights how Best Buy, the BBC, the US EPA and Sentara Healthcare are leveraging a Linked Data approach. Session delivered at Enterprise Data World 2012 in Atlanta, GA, USA on 2-May-2012.

Linked Data Warehouses: A New Generation of BI

ENTERPRISE DATA WORLD 2012, ATLANTA, 2-MAY-2012

By: Bernadette Hyland, Chair, W3C Government Linked Data WG

CEO, 3 Round Stones, Inc

Email: bhyland@3roundstones.com

Twitter: @BernHyland

This presentation: http://slideshare.net/3roundstones

Wednesday, May 2, 12

• Linked Data is about publishing and consuming data using international data standards

• Based on a 20-year-old idea

• A system of linked information systems

Photo credit: http://www.flickr.com/photos/sjungling/5974860/

A HISTORY OF SILOS

1970s: A neat little package

$ cat foo.txt | grep blah | sort

1980s: Client-Server

1990s: The Early Web

There is a better way to connect data silos ...

• No one vendor owns it

• It scales ... to Web-scale

• Doesn’t require a super model

• Based on International Data Exchange Standards (RDF, SPARQL)

ACCEPTABLE ROI FOR IT

[Pie chart. Segment labels: 6 months, 12 months, 18 months, 24 months, more than 24 months. Values shown: 17%, 49%, 16%, 13%, 4%]

Governments

Goals: Governmental transparency and/or improved internal efficiencies (data warehouses)

Hardware/Software Vendors

Goal: Improve interoperability between products and product lines

Retailers

Goal: Improve click-throughs on search results

Book Publishers

Goals: Improve internal manuscript pipelines, expose additional ways of finding and using content

New Media

Linked Data in Context

[Diagram labels: Web; Universal Client; Universal Connection; Universal Database; Logic and interlinking; Ubiquitous, reusable applications; URL; Curation of Data]

Why is RDF important?

• It is an international standard for publishing data on the Web (public and private)

• Data exchange model

• Serializations include RDF/XML, N-Triples, N3, Turtle, ...

• It is the future of using the Web
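To make the "data exchange model" point concrete, here is a minimal sketch of RDF statements as subject-predicate-object triples, written out in the N-Triples serialization. It uses only the Python standard library and handles just IRIs and plain string literals. The class and function names, the example.org identifiers and the sample store values are all assumptions made for illustration; none of them come from the presentation or from any particular RDF toolkit.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class IRI:
    """A Web identifier naming a thing or a property."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain string value (no datatype or language tag in this sketch)."""
    value: str


@dataclass(frozen=True)
class Triple:
    subject: IRI
    predicate: IRI
    obj: Union[IRI, Literal]


def _escape(text: str) -> str:
    # Escape the characters that N-Triples requires inside a quoted literal.
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def term_to_nt(term: Union[IRI, Literal]) -> str:
    """Render one term: IRIs in angle brackets, literals in double quotes."""
    if isinstance(term, IRI):
        return f"<{term.value}>"
    return f'"{_escape(term.value)}"'


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(
        f"{term_to_nt(t.subject)} {term_to_nt(t.predicate)} {term_to_nt(t.obj)} ."
        for t in triples
    )


# Hypothetical identifiers and values, for illustration only.
store = IRI("http://example.org/store/1")
triples = [
    Triple(store, IRI("http://example.org/vocab/name"), Literal("Example Store")),
    Triple(store, IRI("http://example.org/vocab/phone"), Literal("555-0100")),
    Triple(store, IRI("http://example.org/vocab/locatedIn"), IRI("http://example.org/place/A")),
]

print(to_ntriples(triples))
```

Because every statement has the same three-part shape, output from different publishers can be concatenated and loaded together without first agreeing on a shared schema.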

Today’s data warehouses

• Data warehouse costs are high

• Failure rates are high

• Requires a lot of cooperation ...

• Vocabulary alignment & data harmonization

• Data formats not interoperable

• Cooperation requires coordination

• 18 months or longer ...

Alternatives include ...

• Use Data Exchange Standards to host structured content

• Create Linked Data warehouses

• Faster & less expensive

• Web architecture, Web-scale

store name

address

phone

geo

hours

services

ratings

events

58% of Americans research online before they buy.

Why?

“We really didn’t go into it with any expectations. We just wanted to see if it was something we might want to do. That’s why we were caught by surprise by the results… we weren’t really expecting any.”

-- Jay Myers, Lead Development Engineer, Best Buy

The impact:

30% increase in organic search results

15% increase in click-through rate (CTR)

Marketers Reporting “Great” Return on Investment

[Chart: channels plotted by usage and by share of marketers reporting “great” ROI: paid search, house email, SEO, banners and buttons, text-link ads, affiliate marketing, behavioral targeting, contextual targeting, pop-ups/pop-unders, rich media/video, rented email lists]

BBC WANTED

The BBC publishes large amounts of content online, as text, audio and video.

As the amount of content grows, we need to make it easy for users to locate items of interest.

"The RDF representations of these web identifiers allow developers to use our data to build applications."

-- Yves Raimond, BBC

WE’VE SEEN THIS BEFORE

[Diagram: data sources]

EMR Data: Clinical Condition Specific

Internal Portal Data: Physicians, Services, Locations

Linked Data Cloud: DBpedia, PubMed, NLM

Open Government Data: CDC, EPA, US Census

Social Media: Facebook, Twitter

Clinical Ontology

Business Ontology

Value Proposition

• Decrease costly emergency department visits

• Reduce hospital re-admissions after treatment

• Improve self-care and medication compliance

• Educate patients on triggers and disease management

Functional Model

1. Define target population and clinical data from electronic medical record

2. Identify sources of open government data related to environmental, weather, and other variables related to chronic pulmonary disease exacerbations

3. Combine open content from NLM, PubMed, Medline to support education

4. Leverage a Linked Data approach, using Open Source and international data exchange standards (RDF)

5. Alert patient of possible hazardous conditions and recommend appropriate actions
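The steps above can be sketched in a few lines: two independently published sets of triples (one standing in for EMR data, one for open air quality data) are merged and joined on a shared place identifier, and patients in places with poor air quality are flagged for an alert. This is only an illustrative sketch under stated assumptions, not the system shown in the talk. The vocabulary, the identifiers, the sample values and the default threshold of 100 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical vocabulary and identifiers, for illustration only.
EX = "http://example.org/"
LIVES_IN = EX + "livesIn"
HAS_CONDITION = EX + "hasCondition"
AQI = EX + "airQualityIndex"

# Stand-in for anonymized EMR data (step 1).
emr_triples = [
    (EX + "patient/1", LIVES_IN, EX + "place/A"),
    (EX + "patient/1", HAS_CONDITION, "asthma"),
    (EX + "patient/2", LIVES_IN, EX + "place/B"),
    (EX + "patient/2", HAS_CONDITION, "COPD"),
]

# Stand-in for open government air quality data (step 2).
air_triples = [
    (EX + "place/A", AQI, 135),
    (EX + "place/B", AQI, 42),
]


def index(triples):
    """Group triples as subject -> predicate -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        graph[s][p].append(o)
    return graph


def find_alerts(triples, threshold=100):
    """Return (patient, place, aqi) for patients whose place exceeds the threshold.

    The join works only because both sources use the same identifier
    for a place; no shared schema is needed beyond that.
    """
    graph = index(triples)
    alerts = []
    for subject, props in graph.items():
        if HAS_CONDITION not in props:
            continue  # not a patient record in this sketch
        for place in props.get(LIVES_IN, []):
            for aqi in graph.get(place, {}).get(AQI, []):
                if aqi > threshold:
                    alerts.append((subject, place, aqi))
    return alerts


# Merging the two sources is plain concatenation (step 4).
alerts = find_alerts(emr_triples + air_triples)
for patient, place, aqi in alerts:
    # Step 5 would send an SMS or email; here we only print.
    print(f"Alert {patient}: air quality index {aqi} in {place}")
```

The design point is that neither source had to change its format for the other: agreeing on identifiers for places was enough to combine them.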

Leverage Linked Data, Open Source & Standards

[Diagram labels: CDC, EPA, US Census; DBpedia, PubMed, NLM; Web of Data; EMR; SMS; Email; Web. Image: CA-email-message.jpg]

Shows:

1) Air Quality data from US EPA

2) Anonymized EMR data

3) Doctor’s details from CSV file

Uses Callimachus, a Linked Data Management Platform

Tools & best practices?

• Large and small vendors are involved in Linked Data

• From Oracle and IBM to 3 Round Stones

• Listing of active projects, companies and research: see http://dir.w3.org/

• Best practices: see http://www.w3.org/2011/gld/charter

•Callimachus is a framework for data-driven applications based on Linked Data principles

•Callimachus allows Web developers to easily create data-driven applications for the Web

•Callimachus Enterprise

•http://3roundstones.com

“Linked Data means cooperation without coordination”

-- David Wood, PhD

Where the Web has been,

the enterprise is going ...

• Additional information available on the Web, in books ...

• Open Source Linked Data Management System

http://callimachusproject.org

Bernadette Hyland

Contact me at

@BernHyland

bhyland@3roundstones.com

If you’d like to learn more ...

http://semtechbizsf2012.semanticweb.com
