Publishing EPA Data as Linked Data A brief by Michael Pendleton EPA Office of Environmental Information [email protected]

EPA OEI Linked Data Process

EPA OEI Linked Data Process presentation - 2012.

Page 1: EPA OEI Linked Data Process

Publishing EPA Data as Linked Data

A brief by Michael Pendleton

EPA Office of Environmental Information

[email protected]

Page 2: EPA OEI Linked Data Process

“We’re moving from managing documents to managing discrete pieces of open data and content which can be tagged, shared, secured, mashed up and presented in the way that is most useful for the consumer of that information.”

-- Report on Digital Government: Building a 21st Century Platform to Better Serve the American People

What is driving us?

Page 3: EPA OEI Linked Data Process

Goal: Make Open Data, Content, and Web APIs the New Default

Page 4: EPA OEI Linked Data Process

Slide Credit: David G. Smith, Aug 16, 2011 presentation, U.S. Environmental Protection Agency

Linked Data: What’s It All About?

• Speak the Language of the Web

• Just as you surf web pages, linked data lets you surf data.

• SOAP was about making the web try to work like applications; REST was about making applications work like the web.

• Linked Data is about making your DATA work like the web.

Page 5: EPA OEI Linked Data Process

RDF is a lingua franca for data exchange

Page 6: EPA OEI Linked Data Process

Slide Credit: David G. Smith, U.S. Environmental Protection Agency

Linked Data Basics

•Tim Berners-Lee: 5-Star model for publishing data

• http://www.w3.org/DesignIssues/LinkedData.html
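
As a rough illustration of the 5-Star model, the sketch below scores a dataset by how far up the ladder it gets: on the Web under an open license, machine-readable, in a non-proprietary format, identified with URIs using open standards, and linked to other data. The `Dataset` class and its field names are hypothetical and exist only for this example; the scheme itself is described at the URL above.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative fields)."""
    on_web_open_license: bool      # 1 star: available on the Web, open license
    machine_readable: bool         # 2 stars: structured, machine-readable data
    non_proprietary_format: bool   # 3 stars: open format such as CSV
    uses_uris: bool                # 4 stars: URIs and open standards such as RDF
    links_to_other_data: bool      # 5 stars: links out to other people's data


def star_rating(ds: Dataset) -> int:
    """Return 0-5 stars; each star requires all of the previous ones."""
    checks = [
        ds.on_web_open_license,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.links_to_other_data,
    ]
    stars = 0
    for passed in checks:
        if not passed:
            break
        stars += 1
    return stars


csv_download = Dataset(True, True, True, False, False)
linked_data = Dataset(True, True, True, True, True)
print(star_rating(csv_download))  # 3
print(star_rating(linked_data))   # 5
```

Under this reading, a plain CSV download tops out at three stars; the fourth and fifth require URIs and links.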

Page 7: EPA OEI Linked Data Process

•Linked Data is about publishing and consuming data using international data standards

•Based on a 20-year-old idea (the Web)

•A system of linked information systems

Page 8: EPA OEI Linked Data Process
Page 9: EPA OEI Linked Data Process

Global requirements

•Comprehensively link legislation & regulations for more effective government

•Explain context, source, version & publication date with the data itself

•We need global standards for metadata
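
One way to carry source, version and publication date "with the data itself" is to publish them as RDF statements about the dataset. The sketch below writes such statements as N-Triples using Dublin Core terms (`dcterms:title`, `dcterms:source`, `dcterms:issued`) and `owl:versionInfo`. The `example.org` URIs, the sample values and the function names are assumptions made for illustration, not anything taken from the slides.

```python
DCT = "http://purl.org/dc/terms/"
OWL_VERSION_INFO = "http://www.w3.org/2002/07/owl#versionInfo"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def nt_literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def describe_dataset(dataset_uri, title, source_uri, version, issued):
    """Return N-Triples lines describing a dataset's provenance metadata."""
    s = f"<{dataset_uri}>"
    return [
        f"{s} <{DCT}title> {nt_literal(title)} .",
        f"{s} <{DCT}source> <{source_uri}> .",
        f"{s} <{OWL_VERSION_INFO}> {nt_literal(version)} .",
        f"{s} <{DCT}issued> {nt_literal(issued)}^^<{XSD_DATE}> .",
    ]


# Hypothetical dataset; every value here is made up for the example.
for line in describe_dataset(
    "http://example.org/dataset/facilities",
    "Example facility list",
    "http://example.org/source/facilities.csv",
    "1.0",
    "2012-06-01",
):
    print(line)
```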

Page 10: EPA OEI Linked Data Process

The mission of the Government Linked Data (GLD) Working Group is to provide standards and other information which help governments around the world publish their data as effective and usable Linked Data using Semantic Web technologies.

Page 11: EPA OEI Linked Data Process

Best Practices

Vocabulary Guidance

Community Building

Page 12: EPA OEI Linked Data Process

US EPA publishes lots of CSV files ...

Page 13: EPA OEI Linked Data Process

And now, Linked Open Data ...

• A proof-of-concept launched in 2011 with 5 Star Linked Data

• Publication of 1.3M facilities (FRS) and the substances (SRS) regulated by the EPA

• TRI program links to 25 years of data on major polluters

• Additional pilots in 2012 incorporating EPA and anonymized electronic medical records (EMR) data from Sentara Healthcare

• 5 Star Linked Open Data to be hosted & accessible on an EPA production Web site in summer 2012
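
To make the step from CSV files to Linked Data concrete, here is a minimal sketch that turns CSV rows into N-Triples using only the Python standard library. The column names (`facility_id`, `name`, `state`), the `example.org` URI base and vocabulary, and the sample row are all hypothetical; they are not the actual FRS schema or EPA URIs.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/id/facility/"   # hypothetical URI base
VOCAB = "http://example.org/def/"          # hypothetical vocabulary namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def nt_literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def csv_to_ntriples(csv_text: str) -> list:
    """Emit two triples per row: a label and a state for each facility URI."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row['facility_id'], safe='')}>"
        triples.append(f"{subject} <{RDFS_LABEL}> {nt_literal(row['name'])} .")
        triples.append(f"{subject} <{VOCAB}state> {nt_literal(row['state'])} .")
    return triples


SAMPLE = "facility_id,name,state\n110000000001,Example Plant,VA\n"
for line in csv_to_ntriples(SAMPLE):
    print(line)
```

Each row becomes a resource with its own URI, which is what lets other datasets link to it.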

Page 14: EPA OEI Linked Data Process

• Empower users to create their own views of data to satisfy different applications

• Build a community around the data in which users help each other to curate and connect as needed

• Skip the supermodel - Leave data in the multiple “best of breed” systems; wrap and expose on the Web of Data

Increase re-use by publishing Linked Data

Page 15: EPA OEI Linked Data Process

There is a Process

Publish

Convert

Describe

Name

Model

Identify

Maintain

Page 16: EPA OEI Linked Data Process
Page 17: EPA OEI Linked Data Process
Page 18: EPA OEI Linked Data Process
Page 19: EPA OEI Linked Data Process

• Identify a dataset others are likely to want to re-use

• Modeling

  • Onsite modeling session (half day)

  • Linked Data modeling supported by experts

• Validate the model with data owners/stewards

• Publish data on the Web (opendata.epa.gov) per Best Practices

• Produce automated scripts to maintain current data

• Announce Linked Open Data sets *

• Review usage reports to support relevance & user feedback

7 steps to publishing Linked Data

* Pending EPA Systems Security Plan approval
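
The slides do not say what the automated maintenance scripts look like. As one possible approach, the sketch below re-runs a conversion only when the source data has changed, by comparing SHA-256 digests. The `refresh` function, its parameters and the toy converter are assumptions for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the source data."""
    return hashlib.sha256(data).hexdigest()


def refresh(source_bytes: bytes, last_digest, convert):
    """Re-run `convert` only if the source changed since `last_digest`.

    Returns (converted_output_or_None, current_digest).
    """
    current = digest(source_bytes)
    if current == last_digest:
        return None, current          # unchanged: nothing to republish
    return convert(source_bytes), current


# Toy converter standing in for a real CSV-to-RDF step (hypothetical).
def to_upper(data: bytes) -> str:
    return data.decode("utf-8").upper()


v1 = b"facility_id,name\n1,Example Plant\n"
first_output, first_digest = refresh(v1, None, to_upper)
second_output, second_digest = refresh(v1, first_digest, to_upper)
print(first_output is not None)   # True: first run always converts
print(second_output is None)      # True: unchanged source is skipped
```

A scheduled job could store the last digest next to the published data and call `refresh` on each run.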

Page 20: EPA OEI Linked Data Process

Open Data Platforms

• We’re using Callimachus, a Web platform for data-driven applications based on Linked Data principles.

• It is hosted on Amazon EC2 and we have 24x7x365 data & application support.

• There are other data platforms; we selected this one because it is fully W3C standards compliant, with no vendor “lock-in”.

• It’s Open Source (Apache 2.0)

Page 21: EPA OEI Linked Data Process
Page 22: EPA OEI Linked Data Process
Page 23: EPA OEI Linked Data Process
Page 24: EPA OEI Linked Data Process
Page 25: EPA OEI Linked Data Process
Page 26: EPA OEI Linked Data Process
Page 27: EPA OEI Linked Data Process
Page 28: EPA OEI Linked Data Process
Page 29: EPA OEI Linked Data Process
Page 30: EPA OEI Linked Data Process
Page 31: EPA OEI Linked Data Process

• Linked Data promotes goals of transparency & economic development during times of fiscal austerity

• Publish in a reusable format (RDF family of standards)

• Use open rather than proprietary data formats

• Define a URI Policy and Strategy

• Use the best practices and vocabularies that already exist -- don’t reinvent the wheel

Recommendations
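
A URI policy often comes down to a predictable pattern for minting identifiers. The sketch below assumes a pattern of the form `{base}/id/{type}/{key}` under a hypothetical `http://example.org` base; neither the pattern nor the base is EPA's actual policy.

```python
import re

BASE = "http://example.org"  # hypothetical; a real policy would fix its own domain


def slug(text: str) -> str:
    """Lower-case the text and collapse non-alphanumeric runs into hyphens."""
    lowered = text.strip().lower()
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")


def mint_uri(kind: str, key: str) -> str:
    """Build a stable identifier URI from a resource type and a local key."""
    kind_slug, key_slug = slug(kind), slug(key)
    if not kind_slug or not key_slug:
        raise ValueError("kind and key must contain at least one letter or digit")
    return f"{BASE}/id/{kind_slug}/{key_slug}"


print(mint_uri("Facility", "110000000001"))
# http://example.org/id/facility/110000000001
print(mint_uri("Substance", "Carbon Monoxide"))
# http://example.org/id/substance/carbon-monoxide
```

Keeping the minting rule in one function makes it easier to apply the same policy across datasets.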

Page 32: EPA OEI Linked Data Process

Publishing Linked Data will require continual nurturing, but the rewards are worth it

Page 33: EPA OEI Linked Data Process

Resources

• VisibleGovernment.ca Website http://visiblegovernment.ca

• Hack, Mash and Peer: Crowdsourcing Government Transparency, Jerry Brito, George Mason University, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1023485

• Blog on UK Environment Agency Water Quality, see http://data.southampton.ac.uk/datasets.html

• Southampton Open Data Service, see http://data.southampton.ac.uk/datasets.html

• Blog post on Clean Energy data from Reegle, see http://blog.semantic-web.at/2012/04/13/reegle-info-linked-open-energy-data-cloud/

• Blog post on Publishing Linked Open Data in Tight Economic Times, 30-Jan-2012, http://3roundstones.com/2012/01/30/publishing-linked-open-data-makes-good-sense-in-tight-economic-times/

• Blog post on HealthData.gov from US Health & Human Services, 4-June-2012, http://www.healthdata.gov/blog/welcome-new-healthdatagov

• Blog post on US HHS Domain Challenge 1: Metadata, 2-June-2012, http://www.healthdata.gov/blog/domain-challenge-1-metadata

Page 34: EPA OEI Linked Data Process

Coming soon ...

• Best Practices for Publishing Linked Data (Editor’s Draft 20-Apr-2012), see https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html

• Linked Data Cookbook, see http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook

• Linked Data Directory, see http://dir.w3.org

• Attend the 2012 International Open Government Data Conference co-sponsored by data.gov & The World Bank 10-12 July 2012, Washington DC, see http://www.data.gov/communities/conference

Page 35: EPA OEI Linked Data Process

This work is Copyright © 2011-2012 3 Round Stones Inc.

It is licensed under the Creative Commons Attribution 3.0 Unported License.

Full details at: http://creativecommons.org/licenses/by/3.0/

You are free:

to Share — to copy, distribute and transmit the work

to Remix — to adapt the work

Under the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

Page 36: EPA OEI Linked Data Process

Credits

Jennifer Bell, VisibleGovernment.ca (CC-BY-SA)

http://www.slideshare.net/jenniferbell

1-5 Star Linked Data image

http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/

LOD Cloud Diagrams

Richard Cyganiak, Anja Jentzsch (CC-BY-SA)

http://lod-cloud.net/

Book covers © their respective owners and used under Fair Use for educational purposes

© 2012 Bernadette Hyland, released under a CC-BY-SA license