© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Uncertainty reasoning for Linked Data Dave Reynolds


Page 1

Uncertainty reasoning for Linked Data
Dave Reynolds

Page 2

Uncertainty reasoning for linked data
• Linked data: a strikingly successful model for exploiting semantic web technology
• exhibits uncertainty-related issues: ambiguity, misalignment, reliability
• what approach could we take to address this?
• without losing the simplicity which has enabled significant adoption

Page 3

Linked data
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.

Page 4

Uncertainty in linked data
1. Misalignment of instance matches
• link datasets by resolving co-references and publishing links
• links published as owl:sameAs (all or nothing)
• match errors:
  − match uncertainties not accessible
  − erroneous assumptions (e.g. clinical trial example)
• can partly address by use of SKOS mapping vocabulary
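
The "all or nothing" point can be sketched in a few lines. This is a minimal illustration, not the method from the talk: triples are plain Python tuples, the identifiers (`a:trial42`, `b:drugX-generic`, and so on) are hypothetical, and the merge step is an assumed union-find over owl:sameAs links only. A link published as skos:closeMatch stays visible as a link but does not collapse the two resources into one.

```python
# Hypothetical toy links; none of these identifiers come from the slides.
links = [
    ("a:trial42", "owl:sameAs", "b:trial42"),           # strict identity claim
    ("a:drugX", "skos:closeMatch", "b:drugX-generic"),  # similar, not identical
]

def merge_identities(links):
    """Map every node to a canonical representative.

    Only owl:sameAs links merge nodes (union-find); SKOS mapping
    links are registered but never merged.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in links:
        root_s, root_o = find(s), find(o)  # registers both ends
        if p == "owl:sameAs" and root_s != root_o:
            parent[root_o] = root_s
    return {node: find(node) for node in parent}

canonical = merge_identities(links)
```

Here the two trial identifiers end up with the same representative, so every statement about one is inherited by the other, including any that rest on a wrong match. The closeMatch pair keeps separate representatives.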

Page 5

Uncertainty in linked data
2. Ambiguity from merging datasets
• datasets have different assumptions, definitions, context (esp. time) for different measures
• leads to multiple different values

E.g.
<http://dbpedia.org/resource/London>
    dbo:populationMetro 12300000 ;
    dbp:populationMetro "12,300,000 to 13,945,000" ;
    dbo:populationTotal 7556900 ;
    owl:sameAs <http://www.okkam.org/ens/id...> .

<http://www.okkam.org/ens/id...> :population 7421209 .
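
To make the ambiguity concrete, the sketch below replays the London example with triples held as Python tuples. The helper `values_after_merge` is hypothetical and follows owl:sameAs for one hop only, which is an assumed simplification; the Okkam identifier is elided in the slide and is kept elided here.

```python
from collections import defaultdict

LONDON = "http://dbpedia.org/resource/London"
OKKAM = "http://www.okkam.org/ens/id..."  # elided in the slide; kept as-is

triples = [
    (LONDON, "dbo:populationMetro", 12300000),
    (LONDON, "dbp:populationMetro", "12,300,000 to 13,945,000"),
    (LONDON, "dbo:populationTotal", 7556900),
    (LONDON, "owl:sameAs", OKKAM),
    (OKKAM, ":population", 7421209),
]

def values_after_merge(triples, subject):
    """Collect property values for `subject` and its direct owl:sameAs aliases."""
    aliases = {subject} | {
        o for s, p, o in triples if s == subject and p == "owl:sameAs"
    }
    merged = defaultdict(list)
    for s, p, o in triples:
        if s in aliases and p != "owl:sameAs":
            merged[p].append(o)
    return dict(merged)

merged = values_after_merge(triples, LONDON)

# Every value a client might reasonably read as "the population of London".
population_like = [
    value
    for prop, values in merged.items()
    if "population" in prop.lower()
    for value in values
]
```

After the merge a client sees four candidate values (two integers for the city, one integer and one free-text range for the metro area) with nothing in the data saying which applies to which definition or year.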

Page 6

Uncertainty in linked data
3. Other issues
• Misalignment of models
  − e.g. Freebase/DBpedia links generated (temporary) problems
    :Musician owl:equivalentClass :Person
• Source reliability
  − not unique to linked data, but linked data amplifies it

Page 7

Mitigation approaches?
1. Weighted link vocabulary
• Develop a simple, common vocabulary for expressing uncertain co-reference links
• Clients or intermediaries can choose how to match the link evidence to equivalence assertions

void:LinkSet

a ur:UncertainLinkSet ; ur:matchAlgorithm alg:JaroStringMatch .

[ a ur:WeightedLink ; ur:target <…> ; ur:match <…> ; ur:weight 0.7 ]
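
One way a client might turn link evidence into equivalence assertions is a simple threshold policy. The sketch below is an assumption for illustration, not part of the proposed vocabulary: the `WeightedLink` class mirrors the ur:target, ur:match and ur:weight properties, while the function `to_assertions`, the threshold values, and the `ex:` identifiers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeightedLink:
    target: str    # corresponds to ur:target
    match: str     # corresponds to ur:match
    weight: float  # corresponds to ur:weight

def to_assertions(links, same_as_min=0.9, close_min=0.6):
    """Apply a client-chosen policy (thresholds are assumed values).

    weight >= same_as_min          -> owl:sameAs
    close_min <= weight < same_as_min -> skos:closeMatch
    otherwise the link is ignored.
    """
    assertions = []
    for link in links:
        if link.weight >= same_as_min:
            assertions.append((link.target, "owl:sameAs", link.match))
        elif link.weight >= close_min:
            assertions.append((link.target, "skos:closeMatch", link.match))
    return assertions

links = [
    WeightedLink("ex:a1", "ex:b1", 0.95),
    WeightedLink("ex:a2", "ex:b2", 0.7),
    WeightedLink("ex:a3", "ex:b3", 0.3),
]

cautious = to_assertions(links)                    # default thresholds
trusting = to_assertions(links, same_as_min=0.65)  # a more permissive client
```

The same published link set yields different assertions for different clients: the 0.7 link is only a close match for the cautious client but an identity for the trusting one, and the 0.3 link is dropped by both.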

Page 8

Mitigation approaches?
2. Imprecise value vocabulary
• Develop a simple, common vocabulary for expressing imprecise values that can arise from known measurement uncertainty or merge ambiguity

:London :population [ a ur:ImpreciseValue ;
    :sampleValue [ :value 7556900 ; :source :dbpedia ; :context :year2009 ] ;
    :sampleValue [ :value 7421209 ; :source :okkam ; :context :year2008 ] ;
    :estimatedValue 7500000 ] .
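
The same structure can be mirrored as a small data type, which shows what a client could do with the retained samples. The slide does not say how the estimated value is derived; the rule used here (mean of the samples, rounded to the nearest 100,000) is purely an assumption that happens to reproduce 7500000, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class SampleValue:
    value: float
    source: str
    context: str

@dataclass
class ImpreciseValue:
    samples: list = field(default_factory=list)

    def estimate(self, ndigits=-5):
        """Mean of the samples, rounded (assumed rule, not from the slide)."""
        return round(mean(s.value for s in self.samples), ndigits)

    def bounds(self):
        """Smallest and largest sample, as a crude interval."""
        values = [s.value for s in self.samples]
        return min(values), max(values)

london_population = ImpreciseValue([
    SampleValue(7556900, "dbpedia", "year2009"),
    SampleValue(7421209, "okkam", "year2008"),
])

estimate = london_population.estimate()
low, high = london_population.bounds()
```

Keeping the samples alongside the estimate lets a client that cares about a particular source or year pick its own value instead of the published summary.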

Page 9

Mitigation approaches?
3. Override graphs
• Allow clients to choose which parts of merged data sources they adopt ("trust") and publish that decision
• Allow clients to publish deltas to public datasets correcting merge or other artefacts, at per-link and per-assertion granularity

[Diagram labels: ur:ComputedDataSet, ur:argGraph, ur:Combinator (ur:Difference, Union), void:DataSet, void:DataSet]
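
Reading the combinators as set operations over triples gives a compact picture of an override graph. This is an assumed interpretation of the diagram, not a definition of the vocabulary: datasets are Python sets of tuples, the functions `union` and `difference` are hypothetical stand-ins for the Union and ur:Difference combinators, and the `ex:` identifiers are made up.

```python
def union(*graphs):
    """All triples that appear in any argument graph."""
    result = set()
    for graph in graphs:
        result |= graph
    return result

def difference(base, *removed):
    """Triples of `base` that appear in none of the `removed` graphs."""
    result = set(base)
    for graph in removed:
        result -= graph
    return result

# Two public datasets (hypothetical identifiers).
dataset_a = {
    ("ex:London", "owl:sameAs", "ex2:London"),
    ("ex:London", "ex:populationTotal", 7556900),
}
dataset_b = {
    ("ex2:London", "ex2:population", 7421209),
}

# A client's published delta: retract one link, assert a weaker one.
retractions = {("ex:London", "owl:sameAs", "ex2:London")}
additions = {("ex:London", "skos:closeMatch", "ex2:London")}

computed = union(difference(union(dataset_a, dataset_b), retractions), additions)
```

The public datasets are never modified; the client's view is a computed dataset, and the delta itself can be published so others can adopt or ignore the same correction.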

Page 10

Conclusion
• multiple issues in ambiguity and uncertainty in linked data
• proposed problems and solutions illustrative rather than definitive
  − low hanging fruit
  − area ripe for contribution