Linked Open Data for Public Contracts Martin Nečaský Faculty of Mathematics and Physics, Charles University in Prague Faculty of Informatics and Statistics, University of Economics in Prague 13.6.2013 – Publications Office of the European Union, Luxembourg

Linked Open Data for Public Contracts


Slides for my talk at the Publications Office of the European Union, 13.6.2013

Page 1: Linked Open Data for Public Contracts

Linked Open Data for Public Contracts

Martin Nečaský

Faculty of Mathematics and Physics, Charles University in Prague

Faculty of Informatics and Statistics, University of Economics in Prague

13.6.2013 – Publications Office of the European Union, Luxembourg

Page 2: Linked Open Data for Public Contracts

Outline

• Introduction to Linked Data
• What benefits does Linked Data bring for TED and public procurement in the EU?
• What does it mean for TED and others to publish their data as Linked Data?
• What have we already done in the LOD2 project?

Page 3: Linked Open Data for Public Contracts

Linked Data - Introduction

Page 4: Linked Open Data for Public Contracts

Web Applications Eco-system

Linked Data helps to create an eco-system of web applications which publish, enrich and consume data about things in one shared global data space

[Figure: several applications (App 1 to App 5) connected to the Shared Global Data Space on the Web (Web of Data)]

Page 5: Linked Open Data for Public Contracts

Architecture of Web of Documents

Shared global space of documents

Built on top of several simple principles:

1. HTML as a format for publishing documents

2. URLs as unique global identifiers of documents

3. HTTP for locating and accessing documents by their URLs

4. hyperlinks between documents

There are two kinds of applications working in this space of documents:
• web browsers (locating and browsing documents through hyperlinks)
• search engines (indexing and full-text searching of documents)

[Figure: a web browser and a search engine access HTML documents over HTTP]

Page 6: Linked Open Data for Public Contracts

Web of Documents

The current Web (of Documents) provides a lot of data about Prague.

Problems
• Data about Prague is encoded in documents distributed across the Web.
• The documents are intended for humans, not computers.
• Documents about Prague or related things are not linked.
• Therefore, computers are not able to process data about Prague published on the Web.

[Figure: documents about Prague on the Web]
• http://monitor.statnipokladna.cz: Prague budget
• http://registry.czso.cz: Basic info about Prague
• http://www.praha.eu: Prague public contracts
• http://www.czso.cz: Demography of Prague
• http://www.risy.cz: EU funded projects in Prague

Page 7: Linked Open Data for Public Contracts

Web of Documents

Try to search for this information on the current Web:
• Top 100 suppliers of Prague with headquarters outside the Prague region.
• Money spent in Prague on new children's playgrounds in the last 5 years, per child.
• Organizations in Prague funded by EU structural funds and their top 100 suppliers.

[Figure: documents about Prague on the Web]
• http://monitor.statnipokladna.cz: Prague budget
• http://registry.czso.cz: Basic info about Prague
• http://www.praha.eu: Prague public contracts
• http://www.czso.cz: Demography of Prague
• http://www.risy.cz: EU funded projects in Prague

Page 8: Linked Open Data for Public Contracts

Linked Data

Data published on the Web according to four simple principles (introduced by Sir Tim Berners-Lee):

1. Use URIs as names for things.

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).

4. Include links to other URIs so that they can discover more things.
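The second and third principles can be illustrated with a minimal Python sketch that prepares an HTTP lookup of a thing's URI and asks for RDF instead of HTML. The URI is the illustrative one used in these slides and may not resolve, and the `text/turtle` media type is an assumption about what a publisher would serve; the request is only built here, not sent.

```python
from urllib.request import Request


def build_lookup_request(thing_uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF about a thing."""
    # The Accept header tells the server which representation the client prefers.
    return Request(thing_uri, headers={"Accept": media_type}, method="GET")


# Illustrative URI taken from the slides; it is not guaranteed to exist.
request = build_lookup_request("http://praha.eu/contract/7302")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup against a server that publishes Linked Data.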

Page 9: Linked Open Data for Public Contracts

Things as first-class citizens

• Project CZ.2.16/2.1.00/22189
• Contract DIL/23/07/007302/2010
• Prague City
• Prague Council
• Prague Demography
• Prague Budget

Page 10: Linked Open Data for Public Contracts

HTTP URIs for Things

Project CZ.2.16/2.1.00/22189

praha.eu (Prague)
• http://praha.eu/contract/7302
• http://praha.eu/council
• http://praha.eu/city

mfcr.cz (Ministry of Finance)
• http://mfcr.cz/prague/budget
• http://mfcr.cz/prague

risy.cz (Regional Information Service)
• http://risy.cz/location/prague
• http://risy.cz/contract/22189-01
• http://risy.cz/project/22189

czso.cz (Czech Statistical Office)
• http://registry.czso.cz/prague
• http://czso.cz/prague
• http://czso.cz/prague/demogstat

Page 11: Linked Open Data for Public Contracts

Data about Things in RDF

[Figure: the view shown to the client (Playground Revitalization; Authority: Prague; Delivery date: 31.8.2011; Price: 28 444 000 CZK; ...) next to the corresponding RDF graph. In the graph, http://praha.eu/contract/7302 has dcterms:title "Playground Revitalization", pc:contractingAuthority http://praha.eu/council, pc:estimatedEndDate 31.8.2011 and pc:agreedPrice http://praha.eu/contract/7302/price; the price node has gr:hasCurrency CZK and gr:hasCurrencyValue 28444000.]

Page 12: Linked Open Data for Public Contracts

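Data about Things in RDF

[Figure: the view shown to the client (Playground Revitalization; Authority: Prague; Delivery date: 31.8.2011; Price: 28 444 000 CZK; ...) and the RDF data behind http://praha.eu/contract/7302, written in Turtle:]

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix gr: <http://purl.org/goodrelations/v1#> .
@prefix pc: <http://purl.org/procurement/public-contracts#> .

<http://www.praha.eu/contract/7302>
    dcterms:title "Playground Revitalization" ;
    pc:estimatedEndDate "31.8.2011" ;
    pc:agreedPrice <http://www.praha.eu/contract/7302/price> ;
    pc:contractingAuthority <http://www.praha.eu/council> .

<http://www.praha.eu/contract/7302/price>
    gr:hasCurrency "CZK" ;
    gr:hasCurrencyValue "28444000" .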
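To make the triple structure concrete, here is a small Python sketch that holds the contract data from this slide as (subject, predicate, object) tuples and follows the link from the contract to its price node. The in-memory list and the `describe` helper are illustrative assumptions; a real application would typically use an RDF library and a triple store.

```python
Triple = tuple[str, str, str]

CONTRACT = "http://www.praha.eu/contract/7302"
PRICE = CONTRACT + "/price"

# The statements from the slide, one (subject, predicate, object) tuple each.
TRIPLES: list[Triple] = [
    (CONTRACT, "dcterms:title", "Playground Revitalization"),
    (CONTRACT, "pc:estimatedEndDate", "31.8.2011"),
    (CONTRACT, "pc:agreedPrice", PRICE),
    (CONTRACT, "pc:contractingAuthority", "http://www.praha.eu/council"),
    (PRICE, "gr:hasCurrency", "CZK"),
    (PRICE, "gr:hasCurrencyValue", "28444000"),
]


def describe(subject: str, triples: list[Triple]) -> dict[str, str]:
    """Return {predicate: object} for every triple with the given subject."""
    return {p: o for s, p, o in triples if s == subject}


contract = describe(CONTRACT, TRIPLES)
# The price is itself a thing with a URI, so it is described the same way.
price = describe(contract["pc:agreedPrice"], TRIPLES)
print(contract["dcterms:title"], price["gr:hasCurrencyValue"], price["gr:hasCurrency"])
```

Because the price is a node with its own URI rather than a bare string, the same lookup works for the contract and for anything the contract links to.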

Page 13: Linked Open Data for Public Contracts

Vocabularies

• Published RDF data would be hard to interpret if each publisher used proprietary predicates.
• Therefore, standardized (or at least widely used) predicates should take priority over proprietary ones,
  • e.g. Dublin Core, Good Relations, FOAF, schema.org, ...
  • or more specific ones for public procurement, e.g., Public Contracts Ontology (http://purl.org/procurement/public-contracts)
• Predicates are defined in so-called vocabularies (or ontologies).
  • Note: an ontology is a special case of a vocabulary; it contains more detailed reasoning rules, which are out of scope of this lecture.
  • Note: not only predicates but also classes (= types of things) are defined in vocabularies/ontologies.
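Shared predicates are themselves identified by URIs; the short names such as dcterms:title are abbreviations. The following sketch expands such prefixed names into full URIs. The Dublin Core, GoodRelations and FOAF namespaces are the published ones; the `pc` namespace is assumed to be the ontology URL from this slide with a trailing `#`.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "gr": "http://purl.org/goodrelations/v1#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    # Assumption: ontology URL from the slide plus "#".
    "pc": "http://purl.org/procurement/public-contracts#",
}


def expand(prefixed_name: str) -> str:
    """Expand a name like 'dcterms:title' into the full predicate URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown vocabulary prefix: {prefix!r}")
    return PREFIXES[prefix] + local_name


print(expand("dcterms:title"))
print(expand("gr:hasCurrencyValue"))
```

Two publishers that both use the expanded URI for a title can be processed by the same application without any per-publisher mapping.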

Page 14: Linked Open Data for Public Contracts

Linking URIs of Related Things

praha.eu (Prague)
• http://praha.eu/contract/7302
• http://praha.eu/city
• http://praha.eu/council

mfcr.cz (Ministry of Finance)
• http://mfcr.cz/prague/budget
• http://mfcr.cz/prague

risy.cz (Regional Information Service)
• http://risy.cz/location/prague
• http://risy.cz/contract/22189-01
• http://risy.cz/project/22189

czso.cz (Czech Statistical Office)
• http://registry.czso.cz/prague
• http://czso.cz/prague
• http://czso.cz/prague/demogstat

Links drawn between the URIs in the figure: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography

Page 15: Linked Open Data for Public Contracts

Linking URIs of Related Things

praha.eu (Prague)
• http://praha.eu/contract/7302

mfcr.cz (Ministry of Finance)
• http://mfcr.cz/prague/budget
• http://mfcr.cz/prague

risy.cz (Regional Information Service)
• http://risy.cz/contract/22189-01
• http://risy.cz/project/22189

czso.cz (Czech Statistical Office)
• http://czso.cz/prague/demogstat

Further URIs in the figure:
• http://praha.eu/city
• http://risy.cz/location/prague
• http://registry.czso.cz/prague
• http://czso.cz/prague
• http://praha.eu/council

Links drawn between the URIs in the figure: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography, owl:sameAs (twice)
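One way a consumer could use owl:sameAs links is to group URIs that name the same thing, so that data published under any of them can be combined. The sketch below does this with a small union-find structure. Which URIs the two owl:sameAs links actually connect is not recoverable from the slide, so the pairs used here are an assumption for illustration.

```python
def same_as_groups(pairs: list[tuple[str, str]]) -> list[set[str]]:
    """Group URIs connected, directly or transitively, by owl:sameAs links."""
    parent: dict[str, str] = {}

    def find(uri: str) -> str:
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups: dict[str, set[str]] = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Assumed links: three publishers each have their own URI for the city of Prague.
SAME_AS = [
    ("http://praha.eu/city", "http://risy.cz/location/prague"),
    ("http://risy.cz/location/prague", "http://czso.cz/prague"),
]
groups = same_as_groups(SAME_AS)
print(groups)
```

With the URIs grouped, a query about Prague can collect statements made about any member of the group, regardless of which publisher made them.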

Page 16: Linked Open Data for Public Contracts

Linked Data for TED – What are the benefits?

Page 17: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

Problem: It is hard to get a unified view of a chosen thing (e.g. contracting authority, supplier, contract, contract notice, tender, ...) from TED.
• The data about the thing is distributed across several contract notices.

LD solution: Each thing has a unique TED HTTP URI which can be used by third-party applications to get all TED data for this thing.
• Data is represented as an RDF graph respecting openly defined vocabularies shared across developers and communities.
• Data includes links to URIs of other things on TED.
• TED can flexibly and continuously extend the data provided for the thing.

Page 18: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

[Figure: a user works with a web application, which requests ?detail=http://ted.eu/contract/CZ/54782145. The TED LD Service answers for http://ted.eu/contract/CZ/54782145 with a graph that includes http://praha.eu/contract/7302, http://praha.eu/contract/7302/price and http://praha.eu/council.]

TED easily assembles data related to the requested contract and returns it as an interconnected graph to the requesting web application.

Page 19: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

[Figure: the user clicks on the contracting authority in the web application, which requests ?detail=http://ted.eu/org/CZ/00064581. The TED LD Service answers for http://ted.eu/org/CZ/00064581 with a graph that includes http://praha.eu/contract/7302, http://praha.eu/contract/7302/price and http://praha.eu/council.]

TED easily assembles data related to the requested authority and returns it as an interconnected graph to the requesting web application.
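The `?detail=` links in the figures carry the URI of a thing as a query parameter. A minimal sketch of how a web application might build and read such links follows; the application address is hypothetical, and percent-encoding the URI is a choice made here so that the link stays a valid URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of the third-party web application.
APP_URL = "http://app.example.org/view"


def detail_link(thing_uri: str) -> str:
    """Build a link that asks the application for the detail of a thing."""
    return APP_URL + "?" + urlencode({"detail": thing_uri})


def requested_thing(link: str) -> str:
    """Recover the URI of the requested thing from a detail link."""
    return parse_qs(urlsplit(link).query)["detail"][0]


link = detail_link("http://ted.eu/org/CZ/00064581")
print(link)
print(requested_thing(link))
```

The application would then look up the recovered URI at the TED LD Service and render the returned graph for the user.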

Page 20: Linked Open Data for Public Contracts

Problems with HTTP URIs

• Today, public procurement data are collected from contracting authorities in the form of contract notices (calls for tender, contract award notices, etc.).
• Notices usually do not contain explicit identifiers of contracting authorities and suppliers.
• These organizations are usually identified in the notices only by names and addresses, which are often misspelled or incorrect.
• Therefore, if we create an HTTP URI for an organization from one notice, it is often very hard to recognize whether an organization from another notice is the same one.
• This raises serious questions: What should the HTTP URI of an organization (contracting authority/supplier) look like? How should an organization be identified in a notice so that we can recognize it unambiguously?

Page 21: Linked Open Data for Public Contracts

Problems with HTTP URIs

There are two possible solutions to this problem. Both are very simple from the technical point of view but very complex from the political point of view (enforcement in all EU countries).

1st solution:
• Some countries define unique mandatory identifiers for organizations (both private companies and public institutions).
• These identifiers should be present in the notices to identify contracting authorities and suppliers.
• We can then use them to recognize organizations and associate them with the corresponding HTTP URIs.
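The first solution can be sketched as a function that derives an organization's HTTP URI from its country code and national identifier, so that two notices with differently spelled names still map to the same URI. The URI pattern follows the http://ted.eu/org/CZ/00064581 example used earlier in these slides; the notice records, the name spellings and the normalization rules are assumptions for illustration.

```python
from urllib.parse import quote

# URI pattern taken from the example in the slides; not an official TED scheme.
BASE = "http://ted.eu/org"


def org_uri(country_code: str, national_id: str) -> str:
    """Derive a stable HTTP URI from a country code and a national identifier."""
    country = country_code.strip().upper()
    identifier = "".join(national_id.split())  # drop any whitespace in the identifier
    return f"{BASE}/{quote(country)}/{quote(identifier, safe='')}"


# Hypothetical notices: the names differ, the mandatory identifier does not.
notice_a = {"name": "Hlavní město Praha", "country": "cz", "id": "00064581"}
notice_b = {"name": "Hl. m. Praha", "country": "CZ", "id": "000 64581"}

uri_a = org_uri(notice_a["country"], notice_a["id"])
uri_b = org_uri(notice_b["country"], notice_b["id"])
print(uri_a, uri_a == uri_b)
```

Matching on the names alone would treat these two notices as different organizations; matching on the derived URI does not.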

Page 22: Linked Open Data for Public Contracts

Problems with HTTP URIs

2nd solution:
• Each organization involved in public procurement should have its own public profile on the Web with its own HTTP URI.
• The public profile can be a simple HTML web page which also contains a small amount of data encoded in RDF (technically, this is very simple).
• The public profile can be part of the official website of the organization, e.g. http://praha.eu/public-profile
• Alternatively, the organization can use a service which manages public web profiles of organizations. Such services already exist, e.g. http://opencorporates.org
  • This service already contains profiles of many organizations, associates them with HTTP URIs and provides basic RDF data about them (title, address, etc.).
• The HTTP URI of the profile should become part of the notice.
• This solution also saves time and money because details about the organization do not have to be repeated in each notice; each notice links to the HTTP URI where the information is published.
  • One objection is that the profile holds only current information, which may differ from the information that was valid for earlier notices. This can be solved technically (e.g. TED and other authorities responsible for collecting public procurement data can archive that information).

Page 23: Linked Open Data for Public Contracts

Problems with HTTP URIs

2nd solution:

[Figure: the notices http://ted.europa.eu/notice/574832 and http://ted.europa.eu/notice/575833 each link to public profiles through pc:contractingAuthority and pc:supplier. The profiles are published at:]
• praha.eu (Prague): http://praha.eu/public-profile
• company-a.cz (Company A): http://company-a.cz/public-profile
• opencorporates.org: http://opencorporates.org/company-b/public-profile, http://opencorporates.org/company-c/public-profile, ...

Page 24: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

Problem: It is hard to find information related to public contracts, contracting authorities and suppliers which is published outside of TED, elsewhere on the Web, e.g.,
• data from the post-award phase
• public contracts not published on TED
• profiles of contracting authorities and suppliers

LD solution: TED publishes the basic data infrastructure of HTTP URIs of public contracts, contracting authorities, suppliers, etc.
• Others can enrich this basic infrastructure with their own data.
• The enriched TED datasets can be consumed by third-party applications and even by TED itself.

Page 25: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

[Figure: Shared Global Data Space (Web) containing the TED Linked Data Basic Infrastructure, a publisher of profiles of CZ suppliers, and a publisher of post-award data of GE contracts]

Page 26: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

[Figure: example applications and the questions they answer]
• Suitable suppliers for a contract
• Contracts similar to a contract
• Public spending per inhabitant in 2010
• PC Filing Application
• Public spending in Czech Republic "HeatMap" Application

Page 27: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

Problem: Other authorities must copy TED data into their own databases if they want to use it (which also means republishing TED data).
• The repeated work of building and maintaining such databases is paid for from public budgets (!)

LD solution: Other public authorities link their primary data (represented as Linked Data, not necessarily published) to TED without needing to copy, integrate and maintain TED data in their own databases.
• Anyone who works with the data of such a public authority can get the linked data directly from TED if necessary.

Page 28: Linked Open Data for Public Contracts

Benefits of Publishing TED as LD

Our planned experiment in the Czech Republic, in cooperation with the Czech Ministry of Finance (MoF), using data about public contracts

[Figure: linked datasets: CZ Public Budgets (MoF), NUTS & LAU CZ regions, CZ Public Contracts, Demography (Czech Stat. Office). Example question: public contracts in Prague with Prague budget and demography statistics?]

Goal: to show that institutions can share data by linking it instead of copying it.
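The idea of linking instead of copying can be sketched as a join over shared URIs: the contracts dataset and the demography dataset stay separate, and both refer to the region by the same URI. The contract price comes from the earlier slides; the second contract, the population figure, the field names and the choice of http://czso.cz/prague as the shared region URI are hypothetical.

```python
PRAGUE = "http://czso.cz/prague"  # assumed shared URI for the region

# Dataset 1: public contracts, each linked to the region it belongs to.
contracts = [
    {"uri": "http://praha.eu/contract/7302", "region": PRAGUE, "price_czk": 28_444_000},
    {"uri": "http://example.org/contract/1", "region": PRAGUE, "price_czk": 1_556_000},  # hypothetical
]

# Dataset 2: demography, keyed by the same region URI (hypothetical figure).
population = {PRAGUE: 1_200_000}


def spending_per_inhabitant(contracts: list[dict], population: dict[str, int]) -> dict[str, float]:
    """Join both datasets on the region URI and compute spending per inhabitant."""
    totals: dict[str, int] = {}
    for contract in contracts:
        region = contract["region"]
        totals[region] = totals.get(region, 0) + contract["price_czk"]
    return {
        region: total / population[region]
        for region, total in totals.items()
        if region in population
    }


result = spending_per_inhabitant(contracts, population)
print(result)
```

Neither publisher has to import the other's data; the shared URI is the only thing they need to agree on.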

Page 29: Linked Open Data for Public Contracts

Benefits for Stakeholders: Contracting Authorities and Suppliers

Unified global data space covering various aspects of public procurement across all EU countries.

Contracting authorities
• They can find contracts similar to their own.
• They can group their calls with other authorities to obtain better offers from suppliers.
• They can check their requirements against those of other buyers to improve their quality and completeness and to ask for better prices.
• They can search for suitable suppliers who have successfully delivered similar contracts in the past.

Suppliers
• They can get the necessary information about open calls for tenders.
• They can better inform potential customers about their offers.
• They can analyze previous contracts in their market to better target their tenders and improve the quality of the services they offer.
• They can team up with other suppliers with complementary offers for joint tendering.

Page 30: Linked Open Data for Public Contracts

Benefits for Stakeholders: EU and Citizens

EU saves money
• Only the basic infrastructure is built and only primary data is published.
  • Related data is published and linked by third parties.
• There is no need to build and pay for complex applications and services.
  • These will be built by third parties, not only for citizens but also for contracting authorities and suppliers, solely on the basis of demand.
• There is no need to duplicate data in different public administration services and applications.
  • Data is linked instead of copied.

EU supports building a common market and interoperability (ISA)

EU supports transparency
• Citizens can more easily monitor what public administrations buy in their city/country, from whom and for how much.
• They can also more easily compare the purchases of their city/country with those of other cities/countries.

Page 31: Linked Open Data for Public Contracts

Linked Data for TED – What needs to be done to adopt LD principles?

Page 32: Linked Open Data for Public Contracts

LOD lifecycle

[Figure: cycle of stages: Extraction; Storage, querying; Manual revision, authoring; Interlinking, fusing; Quality analysis; Evolution, repair; Search, browsing, exploration]

LOD lifecycle supported by LOD2 Stack

http://stack.lod2.eu

Page 33: Linked Open Data for Public Contracts

Public Procurement and LOD2 Project

• Vocabulary for publishing public contracts as Linked Data
  • a combination of existing, broadly adopted vocabularies and their extension for public procurement (GoodRelations, Payments Ontology, schema.org, Dublin Core, SKOS)
• Public Contracts filing application
  • a web application for contracting authorities and suppliers
  • It enables them to publish data about public contracts as Linked Data.
  • Contracting authorities can search for similar contracts and suitable suppliers.
• Experimental Linked Data from the Czech Republic, Great Britain and TED

Page 34: Linked Open Data for Public Contracts

Experimental Linked Data from the Czech Republic, Great Britain and TED, created as part of the LOD2 project

[Figure: interlinked datasets]
• CZ Public Contracts
• CZ Business Entities
• CZ Demography Stats
• CZ Public Budgets
• CZ LAU Regions
• NUTS Regions (RAMON)
• TED Public Contracts and Organizations
• GB Public Contracts and Organizations
• Common Procurement Vocabulary
• Products Ontology
• DBPedia
• SDMX

Page 35: Linked Open Data for Public Contracts

Thank You for Your Attention