Historical Quantitative Reasoning on the Web Albert Meroño-Peñuela Ashkan Ashkpour


Page 1: Historical Reasoning on the Web

Historical Quantitative Reasoning on the Web

Albert Meroño-Peñuela, Ashkan Ashkpour

Page 2: Historical Reasoning on the Web

Historical Open Data on the Web

• Volume
• Velocity
• Variety
• Veracity

Page 3: Historical Reasoning on the Web

(Historical) Knowledge Discovery

Page 4: Historical Reasoning on the Web

Data Preparation

• Many interesting datasets are messy, incomplete and incorrect
• Data analysis requires clean data
• Cleaning data involves careful interpretation and study
• Values and variables in the data are replaced with (more) standard terms (coding)
• Cross-dataset analyses require a further data harmonization step
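The coding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup table, the labels and the codes (`OCC-AGRI`, `OCC-METAL`) are hypothetical and do not come from any real classification or from the CEDAR tooling.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup table: raw labels as found in a source -> standard code.
# Both the labels and the codes are invented for illustration.
CODING_TABLE: Dict[str, str] = {
    "farmer": "OCC-AGRI",
    "farm labourer": "OCC-AGRI",
    "smith": "OCC-METAL",
    "blacksmith": "OCC-METAL",
}


def normalize(label: str) -> str:
    """Trim, lowercase and collapse internal whitespace."""
    return " ".join(label.strip().lower().split())


def code_value(label: str, table: Dict[str, str]) -> Optional[str]:
    """Return the standard code for a raw label, or None if it is unknown."""
    return table.get(normalize(label))


def code_dataset(rows: List[dict], column: str,
                 table: Dict[str, str]) -> Tuple[List[dict], List[str]]:
    """Return (coded_rows, uncoded_labels); the input rows are not modified."""
    coded, uncoded = [], set()
    for row in rows:
        code = code_value(row[column], table)
        if code is None:
            uncoded.add(row[column])  # left for manual interpretation
        coded.append({**row, column + "_code": code})
    return coded, sorted(uncoded)


rows = [
    {"occupation": "  Farmer ", "count": 12},
    {"occupation": "Blacksmith", "count": 3},
    {"occupation": "Weaver", "count": 5},
]
coded_rows, uncoded_labels = code_dataset(rows, "occupation", CODING_TABLE)
```

Labels that cannot be coded are returned separately rather than guessed, since the slide stresses that cleaning needs careful interpretation.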

Page 5: Historical Reasoning on the Web

Data Preparation

This ‘data preparation’ step can take up to 60% of the total work

Page 6: Historical Reasoning on the Web

We do this repeatedly for the same datasets!

Page 7: Historical Reasoning on the Web

Linking Social History Data

• Linked Open Data – machine-readable Web graph with 100 billion statements [1]

• Sharing (socio-historical) knowledge for reusability

• Solves integration

[1] http://lodlaundromat.org/

Page 8: Historical Reasoning on the Web

• Tablinker: conversion of Excel spreadsheets to RDF
• Integrator: attach harmonization rules to the raw RDF
• Qber: crowd-based, interactive coding and harmonization
• LSD Dimensions: index of statistical variables on the Web
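To give a feel for what "spreadsheet to RDF" means, here is a minimal sketch that turns table rows into N-Triples, typing each row as a `qb:Observation` from the RDF Data Cube vocabulary. It is not Tablinker's actual implementation: the `http://example.org/` namespace, the one-observation-per-row layout and the function name are assumptions made for illustration, and column names are assumed to be safe to use inside an IRI.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
BASE = "http://example.org/"  # hypothetical namespace, not a CEDAR one


def rows_to_ntriples(header, rows):
    """Emit one qb:Observation per row and one triple per cell."""
    triples = []
    for i, row in enumerate(rows):
        obs = f"<{BASE}obs/{i}>"
        triples.append(f"{obs} <{RDF_TYPE}> <{QB}Observation> .")
        for column, value in zip(header, row):
            prop = f"<{BASE}prop/{column}>"
            if isinstance(value, int):
                # Integers become typed literals.
                obj = f'"{value}"^^<{XSD_INT}>'
            else:
                # Everything else becomes a plain string literal, escaped.
                text = str(value).replace("\\", "\\\\").replace('"', '\\"')
                obj = f'"{text}"'
            triples.append(f"{obs} {prop} {obj} .")
    return triples


header = ["municipality", "population"]
table = [["Amsterdam", 1000], ["Utrecht", 250]]
triples = rows_to_ntriples(header, table)
```

The output is deliberately "raw" RDF: harmonization rules would be attached afterwards, as the Integrator bullet suggests.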

Page 9: Historical Reasoning on the Web

http://lod.cedar-project.nl/maps/

Page 10: Historical Reasoning on the Web

http://lod.cedar-project.nl/maps/

N/A

Page 11: Historical Reasoning on the Web

Edit Rules

• Data is good
• Knowledge to assess quality of data is good++

Page 12: Historical Reasoning on the Web

http://linkededitrules.org/

• Reusable rules hub
• Quality assessment tool
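An edit rule can be read as a named consistency check that a record either passes or violates. The sketch below shows that idea in plain Python; the two rules, the field names (`total`, `men`, `women`) and the `assess` helper are hypothetical and are not taken from linkededitrules.org.

```python
def total_matches_parts(record):
    """The reported total should equal the sum of its parts."""
    return record["total"] == record["men"] + record["women"]


def counts_non_negative(record):
    """Population counts cannot be negative."""
    return all(record[key] >= 0 for key in ("total", "men", "women"))


# Hypothetical rule set: rule name -> predicate over a record.
EDIT_RULES = {
    "total_matches_parts": total_matches_parts,
    "counts_non_negative": counts_non_negative,
}


def assess(records, rules):
    """Return {record id: [names of violated rules]} for failing records only."""
    report = {}
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        if failed:
            report[record["id"]] = failed
    return report


records = [
    {"id": "a", "total": 10, "men": 4, "women": 6},
    {"id": "b", "total": 10, "men": 4, "women": 5},
    {"id": "c", "total": -1, "men": -1, "women": 0},
]
violations = assess(records, EDIT_RULES)
```

Keeping rules as separate, named units is what makes them shareable and reusable across datasets, which is the point of a rules hub.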

Page 13: Historical Reasoning on the Web

SCRY

Web standards compatible statistical functions in SPARQL

PREFIX : <http://scry.rocks/example/>
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX scry: <http://scry.rocks/>
PREFIX impute: <http://scry.rocks/math/impute?>
PREFIX mean: <http://scry.rocks/math/mean?>
PREFIX sd: <http://scry.rocks/math/stdev?>

SELECT ?obs ?dim ?imputed_val WHERE {
  ?obs a qb:Observation .
  # Either a dimension or a measure property
  { ?dim a qb:DimensionProperty } UNION { ?dim a qb:MeasureProperty }
  # Keep only observations that lack a value for ?dim
  FILTER NOT EXISTS { ?obs ?dim ?val . }
  ?other_obs ?dim ?other_val .

  SERVICE <http://sparql.scry.rocks/> {
    SELECT ?imputed_val {
      GRAPH ?g1 {
        impute:v scry:input ?other_val ;
                 scry:output ?imputed_val .
      }
    }
  }
}

Delegation of non-standard function to remote SCRY orb
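What the query asks the remote service to do can be illustrated locally. The sketch below fills each missing value with the mean of the known values for the same dimension. Mean imputation is an assumption here: the slide does not say which method `impute` uses, and the function and field names are hypothetical.

```python
from statistics import mean


def impute_missing(observations, dimension):
    """Return {observation id: imputed value} for observations lacking `dimension`.

    Assumes mean imputation; returns an empty dict if no value is known.
    """
    known = [o[dimension] for o in observations if o.get(dimension) is not None]
    if not known:
        return {}
    fill = mean(known)
    return {o["id"]: fill for o in observations if o.get(dimension) is None}


observations = [
    {"id": "obs1", "population": 100},
    {"id": "obs2", "population": None},
    {"id": "obs3", "population": 140},
    {"id": "obs4"},  # dimension absent altogether
]
imputed = impute_missing(observations, "population")
```

The selection of observations without a value mirrors the FILTER NOT EXISTS clause in the query, and the known values play the role of ?other_val.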

Page 14: Historical Reasoning on the Web

Don’t like SPARQL? Neither do we!

https://github.com/CEDAR-project/Queries
http://grlc.clariah-sdh.eculture.labs.vu.nl/CEDAR-project/Queries/api-docs

Page 15: Historical Reasoning on the Web

Conclusion

• Data preparation: an expensive task (up to 60% of the total work)
• Linked Data is good for (socio-historical) data integration on the Web
• But data quality issues remain
  – Linked Edit Rules: rule hub and data quality assessment
  – SCRY: Linked Data compatible statistical functionality
  – grlc: you don’t need to know Linked Data to use Linked Data

Page 16: Historical Reasoning on the Web

Refining Statistical Data on the Web

Page 17: Historical Reasoning on the Web

Thank you
@albertmeronyo

https://github.com/CEDAR-project/
https://github.com/CLARIAH/