Open Source, Open Data

My presentation from Florida Linux Show 2009. Find out how open source's principles are being used outside of software, and how open source and open data can work together to change the world.

Open Source, Open Data
Kirrily Robert
Florida Linux Show, 2009

From Open Source to Open Data

1993

Me in 1993

My Linux desktop looked like this

1993

• I started using Linux in 1993

• I was very excited by it, even though it was quite primitive at the time

• Other people thought I was a little crazy

Image: Wikipedia

Image: Engadget

1999

Google’s servers in 1999

Jar Jar in 1999

1999

• By 1999 Linux + open source was starting to take off

• Companies were using Linux and other open source software, and building services on it

• We were calling it “Open Source” - a more marketable term for Free Software

Four Software Freedoms
http://www.gnu.org/philosophy/free-sw.html

• Freedom to run the program

• Freedom to study the program and modify it for your own use

• Freedom to redistribute verbatim copies

• Freedom to improve the program, and release your improvements

Free Culture

• A similar movement

• Make cultural works freely available

• Mostly over the Internet

Free Culture
http://wiki.freeculture.org/Free_Culture_Definition

• Freedom to use the work

• Freedom to study the work and to apply knowledge acquired from it

• Freedom to make and redistribute copies

• Freedom to make changes and improvements, and to distribute derivative works

Image: masternewmedia.org

What is Open Data?

Data

Image: himmelskratzer @ Flickr

What is data?

• Ones and zeroes (obviously)

• But also filing cabinets, research archives, and other offline resources

• It’s not OPEN data unless you can get at it

Open Data Freedoms

• Freedom to use the data

• Freedom to study the data and modify it for your own use

• Freedom to make and share verbatim copies

• Freedom to improve the data and redistribute the results

Data availability

• Digital

• Online

• Well formatted

Open Data Projects

public.resource.org

• Created 2007 by Carl Malamud

• “Making Government Information More Accessible”

public.resource.org

• SEC EDGAR records

• Patents database

• Copyright database

• Congressional records

• Legal decisions

• Fedflix

Data.gov

• Launched 2009

• “Increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.”

OpenStreetMap

Compare...

OpenStreetMap

Open Library Project

• CD data

• Tracks, artists, releases...

• CC license

Flickr

• Images

• Metadata
  • tags, timestamps, geolocations, etc.

• Range of CC licenses and permissive TOS

Infochimps

• Large data sets

• Various licenses

• Tools for transformation

Freebase

• Open data about “everything”

• 8.5m concepts

• CC-BY license

• API and data dumps

2,416,683 books

16,608 ships

488 cheeses

Structured data

{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [{
    "id": "/en/asiago",
    "name": "Asiago",
    "type": "/location/location"
  }],
  "source_of_milk": [{
    "id": "/en/cattle",
    "name": "Cow",
    "type": "/biology/organism_classification"
  }]
}
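
A record like the Asiago one can be read with any JSON parser. The following is a minimal sketch in Python using only the standard library; the function name `summarise` is made up for illustration, and the record is written here as valid JSON so that it parses.

```python
import json

# The Asiago record from the slide, written as valid JSON.
RECORD = """
{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [{"id": "/en/asiago", "name": "Asiago", "type": "/location/location"}],
  "source_of_milk": [{"id": "/en/cattle", "name": "Cow", "type": "/biology/organism_classification"}]
}
"""


def summarise(record_text):
    """Return (name, region names, milk source names) for one record.

    `summarise` is a hypothetical helper, not part of any Freebase API.
    """
    record = json.loads(record_text)
    regions = [r["name"] for r in record.get("region", [])]
    milk = [m["name"] for m in record.get("source_of_milk", [])]
    return record["name"], regions, milk


name, regions, milk = summarise(RECORD)
print(name, "| region:", ", ".join(regions), "| milk:", ", ".join(milk))
```

Because every linked item carries its own "id" and "type", a program can follow the links (from the cheese to its region, or to the animal) without guessing what the strings mean.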

Open Data Apps

• Apps for America competition

• Open source and open data

• Round 1: various data sources

• Round 2: Data.gov

Legistalker

Filibusted

Where the money goes

Open Source for Open Data

What can open source do?

Input

Processing

Output

Scrape

Munge

Visualise

Scraping data

• APIs
  • XML, RSS, JSON...

• Downloadable data sets
  • XML, Excel, CSV, triple dumps...

• Beautiful Soup (Python)
  • http://www.crummy.com/software/
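
When a source offers neither an API nor a download, the data has to be pulled out of HTML. The sketch below shows the idea with Python's standard `html.parser` module rather than Beautiful Soup, which would usually make this shorter. The page fragment, the class name `TableScraper`, and the function `scrape_table` are all hypothetical; a real scraper would fetch the page over HTTP first.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical page fragment used only for illustration.
PAGE = """
<table>
  <tr><th>Cheese</th><th>Milk</th></tr>
  <tr><td>Asiago</td><td>Cow</td></tr>
  <tr><td>Feta</td><td>Sheep</td></tr>
</table>
"""


def scrape_table(html):
    """Return the table in `html` as a list of rows of cell text."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.rows


rows = scrape_table(PAGE)
header, records = rows[0], rows[1:]
data = [dict(zip(header, record)) for record in records]
print(data)
```

Turning each row into a dictionary keyed by the header gives records that are ready for the munging step.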

Munging data

• Perl
  • http://perl.org/

• R (statistical analysis)
  • http://r-project.org/

• Hadoop (parallel data processing)
  • http://hadoop.apache.org/
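
Munging mostly means cleaning records and then aggregating them. As a rough illustration, here is a single-process Python sketch of the map and reduce steps that Hadoop distributes across many machines. It is not Hadoop code, and the records and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical records, as they might look straight after scraping.
RECORDS = [
    {"cheese": "Asiago", "milk": "Cow"},
    {"cheese": "Cheddar", "milk": "Cow"},
    {"cheese": "Feta", "milk": "Sheep"},
    {"cheese": "  brie ", "milk": "cow"},
]


def clean(record):
    """Normalise whitespace and capitalisation so equal values compare equal."""
    return {key: value.strip().title() for key, value in record.items()}


def map_phase(records):
    """Emit one (key, 1) pair per record, keyed by milk source."""
    for record in records:
        yield clean(record)["milk"], 1


def reduce_phase(pairs):
    """Sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


counts = reduce_phase(map_phase(RECORDS))
print(counts)
```

The cleaning step matters: without it, "Cow" and "cow" would be counted as two different sources.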

Visualisations

• MIT Simile
  • http://simile.mit.edu/

• Processing
  • http://processing.org/

http://itoworld.com

Semantic Web

• Describe meaning, not markup

• Triples: subject, predicate, object

• Expression: RDF
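
To make the triple idea concrete, the sketch below stores facts as plain (subject, predicate, object) tuples in Python and answers a question by pattern matching. It is not RDF and does not use a triple store or RDFLib; the identifiers follow the style of the Asiago record, and the predicate names and helper functions are assumptions made for illustration.

```python
# Each fact is a (subject, predicate, object) triple.
# Identifiers follow the style of the Asiago record; predicate names are hypothetical.
TRIPLES = {
    ("/en/asiago_cheese", "name", "Asiago cheese"),
    ("/en/asiago_cheese", "region", "/en/asiago"),
    ("/en/asiago_cheese", "source_of_milk", "/en/cattle"),
    ("/en/cattle", "name", "Cow"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def milk_source_names(triples, cheese):
    """Follow cheese -> source_of_milk -> name across two triples."""
    names = []
    for _, _, animal in match(triples, subject=cheese, predicate="source_of_milk"):
        for _, _, name in match(triples, subject=animal, predicate="name"):
            names.append(name)
    return names


print(milk_source_names(TRIPLES, "/en/asiago_cheese"))
```

The second function shows why triples link well: the object of one triple is the subject of another, so separate data sets that share identifiers can be joined without a common schema.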

Linked Open Data

Semantic web tools

• Triple stores
  • Sesame, BigData, Virtuoso...

• Libraries
  • RDFLib (Python), Redland RDF (librdf)...

Freebase Acre

Open source for open data

• Low barrier to entry

• Hooks into Freebase data

• Share and clone apps

• Apps are BSD licensed

FMDB

Gendered names app

Query editor

Clone!

http://freebase.com/developer

Where next?

Open Data: Issues

• License clarity

• Govt + Corporate acceptance

• Developer literacy

• What do we DO with it?

What do we do with it?

• 10 years ago we were asking the same questions of Open Source

• With Open Data, we are just starting to realise its potential

• Please join us!

Keep in touch

• Email
  • kirrily@metaweb.com

• Freebase blog
  • http://blog.freebase.com/

• Twitter
  • @fbase
