Jens Lehmann, AKSW Group, University of Leipzig
6 June 2012
Realising and Exploiting the EU Data Cloud
European Data Forum, Copenhagen, Denmark
Dataset Presentations
EU-Level Dataset Development
List of LATC Datasets
Categories: Business, Legal, Institutions
FTS (EU finance)
Eur-Lex (European Law)
EuroStat (Statistical Data)
CORDIS (EU projects, finance)
N-Lex (National Law)
Institution List
Euraxess (EU jobs, companies)
Taxation & Customs
EU Who is Who
EURES (EU jobs)
EU Patent Office
EU Barometer
EC Competition (market overview)
EU Agencies
European Election Results
eSBN (eBusiness solutions)
PreLex (inter-institutional law)
European Parliament Media
UNODC (drugs & crime statistics)
European Central Bank Statistics
Other: Eventseer, Sciencewise
Total: 22 datasets, see http://latc-project.eu/datasets/
Financial Transparency System
Step 1: Analysing the Dataset
The Financial Transparency System (FTS) contains information about 110,000+ EU grants
Contains beneficiaries, amount of funding, year, responsible department, country etc.
Covers years 2007 – 2010
Originally published in HTML, XML and CSV
Financial Transparency System
Step 2: Modelling the Data in RDF and OWL
Michael Martin, Claus Stadler, Philipp Frischmuth, Jens Lehmann: Increasing the Financial Transparency of European Commission Project Funding. Semantic Web Journal (under review)
Financial Transparency System
Step 3: Converting the Dataset
Java classes generated automatically from XML Schema
XML data accessible as Java Objects → script based transformation
High flexibility for data cleansing and special cases
Source code of transformation
● https://github.com/AKSW/FTS-EC-2-RDF/
[Diagram: XSD → (JAXB) → Java classes; XML → (JAXB) → Java objects → (transformation) → RDF]
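The conversion idea on this slide (bind XML records to typed objects, then serialise the objects as RDF) can be sketched in a few lines. This is not the authors' Java/JAXB code from the repository above; it is a minimal Python illustration, and the element names, the `Commitment` fields and the `http://example.org/fts/` namespace are assumptions made up for the example rather than the real FTS schema or vocabulary.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical namespace and element names, for illustration only;
# the real FTS XML schema and RDF vocabulary differ.
BASE = "http://example.org/fts/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

SAMPLE_XML = """
<commitments>
  <commitment id="c1">
    <beneficiary>Example University</beneficiary>
    <country>DE</country>
    <year>2009</year>
    <amount> 125000.50 </amount>
  </commitment>
</commitments>
"""


@dataclass
class Commitment:
    """Typed object standing in for a class generated from the schema."""
    ident: str
    beneficiary: str
    country: str
    year: int
    amount: float


def parse_commitments(xml_text):
    """Bind each <commitment> element to a Commitment object.

    Assumes every child element is present; whitespace is stripped here,
    which is the kind of cleansing step the object layer makes easy.
    """
    root = ET.fromstring(xml_text.strip())
    result = []
    for node in root.findall("commitment"):
        result.append(Commitment(
            ident=node.get("id"),
            beneficiary=node.findtext("beneficiary").strip(),
            country=node.findtext("country").strip(),
            year=int(node.findtext("year").strip()),
            amount=float(node.findtext("amount").strip()),
        ))
    return result


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(c):
    """Serialise one Commitment as a list of N-Triples lines."""
    subject = f"<{BASE}commitment/{c.ident}>"
    return [
        f"{subject} <{RDF_TYPE}> <{BASE}Commitment> .",
        f'{subject} <{BASE}beneficiaryName> "{escape_literal(c.beneficiary)}" .',
        f'{subject} <{BASE}countryCode> "{escape_literal(c.country)}" .',
        f'{subject} <{BASE}year> "{c.year}"^^<{XSD}gYear> .',
        f'{subject} <{BASE}amount> "{c.amount:.2f}"^^<{XSD}decimal> .',
    ]


if __name__ == "__main__":
    for commitment in parse_commitments(SAMPLE_XML):
        print("\n".join(to_ntriples(commitment)))
```

Keeping a typed object layer between parsing and serialisation is what gives room for special-case handling: any per-record fix can be applied to the object before a triple is written.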
Financial Transparency System
Step 4: Publishing the Dataset
Landing Page, Linked Data, SPARQL endpoint, browser at http://fts.publicdata.eu via OntoWiki
Metadata: Datahub (http://thedatahub.org)
Financial Transparency System
Step 5: Enriching the Dataset
Linking with LIMES (http://limes.aksw.org)
Link targets:
● LinkedGeoData: cities
● DBpedia: cities, countries, years, schema
Geo-Coding of beneficiaries on city and address level – 45k coordinates
Metadata: author, license, source, statistics using Dublin Core, VoID, Data Cube
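To make the linking step concrete, the following sketch matches city labels from one dataset against labels from another and emits owl:sameAs links when the best match exceeds a similarity threshold. It is only an illustration of threshold-based link discovery, not how LIMES works internally (LIMES uses its own link specifications and optimised comparison strategies). The source URIs under `http://example.org/` and the function names are hypothetical.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Link each source resource to its best-matching target resource.

    `source` and `target` map URIs to labels. This is a naive all-pairs
    comparison, fine for a toy example but far too slow for large datasets.
    """
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


if __name__ == "__main__":
    # Hypothetical source URIs; labels chosen so that only one pair matches.
    fts_cities = {
        "http://example.org/fts/city/1": "Leipzig",
        "http://example.org/fts/city/2": "Kobenhavn",
    }
    dbpedia_cities = {
        "http://dbpedia.org/resource/Leipzig": "Leipzig",
        "http://dbpedia.org/resource/Copenhagen": "Copenhagen",
    }
    for s, p, o in discover_links(fts_cities, dbpedia_cities):
        print(f"<{s}> <{p}> <{o}> .")
```

The second city is deliberately left unlinked: a purely label-based metric cannot bridge different language variants of a name, which is one reason real link specifications tend to combine several properties.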
Financial Transparency System
Step 6: Queries, Applications, Visualisation
RDF version allows:
● Find organisations with highest funding
● Compare funding across countries / beneficiaries
● Compare funding per year and country (from FTS) with gross domestic product (from DBpedia) – see next slide
→ overall increases transparency and may serve as input for research policy strategies
Financial Transparency System
SELECT * {
  {
    SELECT ?ftsyear ?ftscountry (SUM(?amount) AS ?funding) {
      ?com rdf:type fts-o:Commitment .
      ?com fts-o:year ?year .
      ?year rdfs:label ?ftsyear .
      ?com fts-o:benefit ?benefit .
      ?benefit fts-o:detailAmount ?amount .
      ?benefit fts-o:beneficiary ?beneficiary .
      ?beneficiary fts-o:country ?country .
      ?country owl:sameAs ?ftscountry .
    }
    GROUP BY ?ftsyear ?ftscountry
  }
  {
    SELECT ?dbpcountry ?gdpyear ?gdpnominal {
      ?dbpcountry rdf:type dbp-o:Country .
      ?dbpcountry dbp-p:gdpNominal ?gdpnominal .
      ?dbpcountry dbp-p:gdpNominalYear ?gdpyear .
    }
  }
  FILTER ((?ftsyear = str(?gdpyear)) && (?ftscountry = ?dbpcountry))
}
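The logic of the query on this slide can be restated procedurally: sum the funding per country and year on the FTS side, then keep only the rows for which DBpedia has a GDP figure for the same country and year. The sketch below does this on in-memory tuples. All values and the country identifiers "A" and "B" are invented placeholders, not FTS or DBpedia data, and the function names are hypothetical.

```python
from collections import defaultdict


def funding_per_country_year(commitments):
    """Sum amounts per (country, year label), like the first subquery."""
    totals = defaultdict(float)
    for country, year_label, amount in commitments:
        totals[(country, year_label)] += amount
    return dict(totals)


def join_with_gdp(funding, gdp_rows):
    """Join funding totals with GDP rows on country and year.

    The FTS year is a string label while the GDP year is numeric, so the
    GDP year is cast to a string first, mirroring str(?gdpyear) in the FILTER.
    """
    joined = []
    for country, gdp_year, gdp in gdp_rows:
        key = (country, str(gdp_year))
        if key in funding:
            joined.append((country, key[1], funding[key], gdp))
    return sorted(joined)


if __name__ == "__main__":
    # Placeholder data, invented purely for illustration.
    commitments = [("A", "2009", 100.0), ("A", "2009", 50.0), ("B", "2010", 70.0)]
    gdp_rows = [("A", 2009, 1000.0), ("B", 2009, 2000.0)]
    funding = funding_per_country_year(commitments)
    for row in join_with_gdp(funding, gdp_rows):
        print(row)
```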
European Employment Services
European Employment Services (EURES): cooperation network for the free movement of workers in the EU
Publishes 1.2+ million job vacancies, 700,000 CVs, 25,000 employers
RDF version can be used to:
● Compare geographical and economic information for new jobs (DBpedia, LGD)
● Compare salaries relative to standards in the job region
● Assess the quality of nearby schools
European Employment Services
Neither API nor dump available → site scraping
Modelling considered existing ontologies
Published using D2R: http://www4.wiwiss.fu-berlin.de/eures/
7 million triples, classes: Offer, Skill, Employer
3000 links to DBpedia cities + regions + countries + languages + currencies, LEXVO languages, Eurostat
Updates can be performed by scraping only new pages
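One simple way to restrict an update run to new pages is to persist the set of URLs already processed and skip them next time. The sketch below assumes this approach; the slides do not say how the EURES scraper tracks state. The `scrape` callback, the state file name and the URLs in the usage example are hypothetical, and no network access is performed here.

```python
import json
from pathlib import Path


def load_seen(state_file):
    """Return the set of URLs processed in earlier runs (empty on first run)."""
    path = Path(state_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def save_seen(state_file, seen):
    """Persist the processed URLs as a sorted JSON list."""
    Path(state_file).write_text(json.dumps(sorted(seen)))


def incremental_update(listing_urls, state_file, scrape):
    """Scrape only URLs that were not seen in a previous run.

    `scrape` is a caller-supplied function mapping a URL to a record.
    Duplicate URLs within one listing are processed once.
    """
    seen = load_seen(state_file)
    new_urls = [u for u in dict.fromkeys(listing_urls) if u not in seen]
    records = [scrape(u) for u in new_urls]
    seen.update(new_urls)
    save_seen(state_file, seen)
    return records


if __name__ == "__main__":
    def fake_scrape(url):
        # Stand-in for real page fetching and parsing.
        return {"url": url}

    state = "seen_pages.json"
    print(incremental_update(["http://example.org/job/1"], state, fake_scrape))
    print(incremental_update(
        ["http://example.org/job/1", "http://example.org/job/2"], state, fake_scrape))
```

This only detects newly listed pages. Vacancies that are edited or withdrawn after the first scrape would need a separate check, for example comparing a content hash per URL.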
Euraxess
Contains research jobs in the EU: 6,400 organisations, 1,700 open jobs, 61,000 registered researchers, 18,000 researcher CVs
http://ec.europa.eu/euraxess/
Contains information about people, jobs, skills, languages etc.
Links to DBpedia languages and LEXVO languages
Euraxess + EURES Query
Query: aggregates information about jobs and companies in a country from two different sources
SELECT DISTINCT ?job ?company WHERE {
  SERVICE <http://www4.wiwiss.fu-berlin.de/eures/sparql> {
    ?job eures:country ?countryjob .
    ?countryjob a eures:Country .
    ?countryjob rdfs:label ?n .
  }
  SERVICE <http://www4.wiwiss.fu-berlin.de/euraxess/sparql> {
    ?company euraxess:country ?countrycomp .
    ?countrycomp a euraxess:Country .
    ?countryjob owl:sameAs ?countrycomp .
  }
}
Summary / Take Away Messages
Linked Data increasingly important in EU E-Government
Many RDF conversion tools/techniques available depending on source format
Linked Data simplifies data integration – added value by enrichment, e.g. linking to other data sets or schema creation
LOD cloud provides rich background information
Thanks for your Attention!