How Linked Data Can Speed Information Discovery

Alex Meadows, CSpring; Bubba Puryear, Syngenta


Page 1: How Linked Data Can Speed Information Discovery

How Linked Data Can Speed Information Discovery

Alex Meadows, CSpring
Bubba Puryear, Syngenta

Page 2: How Linked Data Can Speed Information Discovery

Agenda
• Linked Data Overview
• Case Study: Linked Data At Syngenta
• Q&A

Page 3: How Linked Data Can Speed Information Discovery

Business:
“We’re not sure what we want, but can’t we have it all?”
-or-
“Here are our requirements. When can we have this completed?”

BI Team:
“We don’t know your data; it’s going to take us some time.”
-or-
“We have so many other projects, we’re not sure when we can get to this request.”

Page 4: How Linked Data Can Speed Information Discovery

New source: weeks to months
Existing source: days to weeks

Page 5: How Linked Data Can Speed Information Discovery
Page 6: How Linked Data Can Speed Information Discovery

What is Linked Data?
• Coined in 2006 by Tim Berners-Lee
• Provides vocabulary for every data set
• Can combine vocabularies
• Highly structured in triple format

Page 7: How Linked Data Can Speed Information Discovery

Vocabulary: Classes

Page 8: How Linked Data Can Speed Information Discovery

Vocabulary: Properties

Page 9: How Linked Data Can Speed Information Discovery

Triples

[Figure: graph of triples. Nodes: Pale Ale, Beer, Mark, Person, Mt. Carmel Brewing Co., Brewer. Edge labels: is a (three times), Owner of, brews, Has First Name.]
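To make the triple format concrete, the graph on this slide can be written as subject-predicate-object tuples. This is a minimal sketch in plain Python: the identifiers (`person1`, `MtCarmelBrewingCo`, and so on) are hypothetical, and the choice of which nodes each edge connects is an assumption inferred from the slide labels and the later SPARQL slide.

```python
# Each triple is (subject, predicate, object).
# ASSUMPTION: the edge assignments below are reconstructed from the slide's
# node and edge labels; the identifiers are illustrative only.
TRIPLES = [
    ("PaleAle", "is_a", "Beer"),
    ("person1", "is_a", "Person"),
    ("MtCarmelBrewingCo", "is_a", "Brewer"),
    ("person1", "owner_of", "MtCarmelBrewingCo"),
    ("person1", "brews", "PaleAle"),
    ("person1", "first_name", "Mark"),
]


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects_with(predicate, obj, triples=TRIPLES):
    """Return every subject whose `predicate` points at `obj`."""
    return [s for s, p, o in triples if p == predicate and o == obj]


if __name__ == "__main__":
    print(objects_of("person1", "owner_of"))
    print(subjects_with("is_a", "Beer"))
```

Because every fact has the same three-part shape, new kinds of facts can be added without changing a schema; only the vocabulary of predicates grows.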

Page 10: How Linked Data Can Speed Information Discovery

Triples: RDF/XML
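As a rough illustration of the RDF/XML serialization, the sketch below writes one triple (a brewery and its name) with Python's standard `xml.etree.ElementTree`. The `beer:` namespace is the one used on the SPARQL slide; the subject IRI and the function name `brewery_rdf_xml` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BEER = "http://my.beer.vocab/1.0/"  # prefix from the SPARQL slide

# Ask ElementTree to emit readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("beer", BEER)


def brewery_rdf_xml(about, name):
    """Serialize the triple (about, beer:hasName, name) as RDF/XML text."""
    root = ET.Element(f"{{{RDF}}}RDF")
    # rdf:Description carries the subject in its rdf:about attribute.
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about}
    )
    # The child element is the predicate; its text is the literal object.
    ET.SubElement(description, f"{{{BEER}}}hasName").text = name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical subject IRI, used only for illustration.
    print(brewery_rdf_xml(BEER + "brewery/1", "Mt. Carmel Brewing Company"))
```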

Page 11: How Linked Data Can Speed Information Discovery

Option 1: Virtualization

New source: hours to a week
Existing source: hours to days

Page 12: How Linked Data Can Speed Information Discovery

Ontop
• Mapping layer between SQL and SPARQL
• Integrates with many tools (Protégé, Sesame, etc.)
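The idea behind virtualization can be sketched without any triple store: each predicate is mapped to a SQL query, and triples are produced on demand from the relational source rather than copied. The example below is not Ontop's API. It is a simplified stand-in using Python's `sqlite3`, and the table names, column names, and IRIs are all hypothetical.

```python
import sqlite3

# ASSUMPTION: one SQL query per RDF predicate, each returning
# (subject, object) pairs. Tables and columns are illustrative only.
MAPPINGS = {
    "beer:hasName": "SELECT 'brewery/' || id, name FROM brewery",
    "beer:first_name": "SELECT 'person/' || id, first_name FROM person",
    "beer:owner_of": "SELECT 'person/' || id, 'brewery/' || brewery_id FROM person",
}


def make_source():
    """Build a small in-memory database standing in for a real RDBMS."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE brewery (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE person (
            id INTEGER PRIMARY KEY, first_name TEXT, brewery_id INTEGER
        );
        INSERT INTO brewery VALUES (1, 'Mt. Carmel Brewing Company');
        INSERT INTO person VALUES (1, 'Mark', 1);
        """
    )
    return conn


def virtual_triples(conn, predicate):
    """Answer the pattern (?s predicate ?o) by running the mapped SQL now."""
    for subject, obj in conn.execute(MAPPINGS[predicate]):
        yield (subject, predicate, obj)


if __name__ == "__main__":
    source = make_source()
    for triple in virtual_triples(source, "beer:owner_of"):
        print(triple)
```

Since nothing is materialized, a row added to the source database shows up in the next query, which is one plausible reason the slide quotes such short lead times for this option.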

Page 13: How Linked Data Can Speed Information Discovery

Option 2: Lift and Format

New source: days to weeks
Existing source: hours to days
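In contrast to virtualization, "lift and format" extracts the source data and rewrites it as triples that are then loaded into a store. A minimal sketch of the formatting step, assuming rows have already been extracted as dictionaries and that N-Triples is the target syntax (the deck does not say which serialization was used); the row layout and subject IRIs are hypothetical.

```python
BEER = "http://my.beer.vocab/1.0/"  # prefix from the SPARQL slide


def literal(value):
    """Format a Python value as an N-Triples string literal."""
    text = (
        str(value)
        .replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')    # then double quotes
        .replace("\n", "\\n")   # then newlines
    )
    return f'"{text}"'


def lift(rows):
    """Turn extracted rows into N-Triples lines ready for bulk loading."""
    lines = []
    for row in rows:
        subject = f"<{BEER}brewery/{row['id']}>"
        lines.append(f"{subject} <{BEER}hasName> {literal(row['name'])} .")
    return lines


# Stand-in for an extract from the source system (illustrative data).
ROWS = [{"id": 1, "name": "Mt. Carmel Brewing Company"}]

if __name__ == "__main__":
    print("\n".join(lift(ROWS)))
```

The trade-off against the virtualized option is that the output is a snapshot: it must be regenerated or incrementally updated when the source changes.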

Page 14: How Linked Data Can Speed Information Discovery

SPARQL

PREFIX beer: <http://my.beer.vocab/1.0/>
SELECT ?breweryName
WHERE {
  ?brewery beer:hasName ?breweryName .
  ?person beer:owner_of ?brewery .
  ?person beer:first_name "Mark" .
}

PREFIX beer: <http://my.beer.vocab/1.0/>
SELECT ?beertype
WHERE {
  ?beer beer:isOfType ?beertype .
  ?person beer:brews ?beer .
  ?person beer:first_name "Mark" .
}

<beer:isOfType rdf:resource="beer:PaleAle"/>
<beer:isOfType rdf:resource="beer:Lager"/>

<beer:hasName>Mt. Carmel Brewing Company</beer:hasName>
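How a SPARQL `WHERE` clause finds its answers can be shown with a toy matcher: each triple pattern either binds its `?variables` or filters out candidate bindings, and shared variables join the patterns together. This is a simplified sketch of basic graph pattern matching, not a SPARQL engine. The data set, including the second brewery and its owner, is invented for illustration.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was already bound to.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant must match exactly.
            return None
    return extended


def query(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Illustrative data; only the "Mark" and "Mt. Carmel" values come from the deck.
DATA = [
    ("brewery1", "beer:hasName", "Mt. Carmel Brewing Company"),
    ("person1", "beer:owner_of", "brewery1"),
    ("person1", "beer:first_name", "Mark"),
    ("brewery2", "beer:hasName", "Other Brewery"),
    ("person2", "beer:owner_of", "brewery2"),
    ("person2", "beer:first_name", "Ann"),
]

# The first query from the slide, written as three triple patterns.
BREWERY_QUERY = [
    ("?brewery", "beer:hasName", "?breweryName"),
    ("?person", "beer:owner_of", "?brewery"),
    ("?person", "beer:first_name", "Mark"),
]

if __name__ == "__main__":
    for row in query(BREWERY_QUERY, DATA):
        print(row["?breweryName"])
```

The `?person` and `?brewery` variables appear in more than one pattern, which is what restricts the result to breweries owned by someone whose first name is "Mark".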

Page 15: How Linked Data Can Speed Information Discovery

Case Study: Linked Data At Syngenta

Page 16: How Linked Data Can Speed Information Discovery

Syngenta

Syngenta is a leading agriculture company helping to improve global food security by enabling millions of farmers to make better use of available resources.

We have two primary lines of business: Seeds and Agricultural Chemicals.

We have a huge commitment to internal R&D and that is where our linked data initiatives are.

Page 17: How Linked Data Can Speed Information Discovery

Linked Data at Syngenta

Concept Store
Enables Syngenta applications to consume and publish linked-data controlled vocabulary (reference terms and relationships)

ENVision Tool
Enables trial placements and weightings that best represent target markets

MINT Data
Makes genetic identity and inventory data available for discovery, analysis, and R&D-driven proofs of concept

Page 18: How Linked Data Can Speed Information Discovery

What we accomplished

In a 3-day hackathon we:
• Mapped about 60% of MINT’s model from 2 databases to RDF
• Built a virtualized RDF triple store
• Created a data-discovery / browsing user interface

Page 19: How Linked Data Can Speed Information Discovery

MINT Data

[Figure: MINT Data architecture diagram. Labeled components: MINT Browser; Semantic Wiki; SPARQL; Broker; RDF Repository (Open-Sesame); Repository Configuration (Identity, Material); MINT Ontology (Identity, Material); Ontology & Mapping Designer; Ontologist; RDBMS-RDF Mapper for MINT Material (RDBMS JDBC, R2RML Mapping: Material); RDBMS-RDF Mapper for MINT Identity (RDBMS JDBC, R2RML Mapping: Identity).]

Page 20: How Linked Data Can Speed Information Discovery

MINT Class Model

The MINT ontology was created within Protégé, as shown here.

Page 21: How Linked Data Can Speed Information Discovery

MINT Virtualization Mapping

Page 22: How Linked Data Can Speed Information Discovery

MINT Virtualization Mapping

Page 23: How Linked Data Can Speed Information Discovery

Next Steps
• Moving from the virtualized layer to an actual physical triple store implementation
• Partnering with our benefits tracking team to get accurate metrics on MINT adoption and value
• Linking to additional data sources to provide dashboard KPIs and analytics for our R&D seeds pipeline

Page 24: How Linked Data Can Speed Information Discovery

THANK YOU!

Page 25: How Linked Data Can Speed Information Discovery

About Alex…

Principal Consultant, CSpring
https://www.linkedin.com/in/alexmeadows
Twitter, GitHub as OpenDataAlex

Alex has spent the last ten years working in various industries to help businesses unlock the information hidden in their data sets. He specializes in open source business intelligence solutions from data warehousing to dashboards, analytics, and beyond. His latest area of research has been on linked data (also known as triple stores). Alex has a Master’s in Business Intelligence from Saint Joseph’s University in Pennsylvania and a Bachelor’s in Business Administration from Chowan University in North Carolina.

Page 26: How Linked Data Can Speed Information Discovery

About Bubba…

Team Leader, R&D IS, Syngenta
https://www.linkedin.com/in/bubbapuryear

I’ve held roles as a software engineer, architect, and manager across multiple industries. For the last 13 years I’ve worked in the life sciences industry supporting Research & Development. I’m currently the program architect / technical lead for a standardization program within Syngenta bringing Track & Trace compliance to R&D’s material operations. Many of Syngenta’s R&D product decisions for our Seeds line of business are founded on data associated with plant material identity. I have a Bachelor’s degree in Computer Science from Rose-Hulman Institute of Technology.