Data Visualization with GraphDB and Workbench
Co-lead, Innovation and Consulting Group, Ontotext Corp
Outline
↗ Intro: Ontotext, GraphDB, Webinar
↗ Writing SPARQL
↗ Built-in SPARQL Result Visualizations
↗ Using SPARQL Results in Spreadsheets
↗ Invoking SPARQL Queries, Parameterization
↗ Tools that Help With Writing SPARQL Queries
↗ Tools for Statistical Visualizations
↗ Graph Visualizations: Built-in, Developing
↗ Visualization Toolkits
↗ Declarative Visualization
↗ JDBC Data Access API
↗ Q&A
Ontotext History and Essential Facts
↗ Started in 2000 as a Semantic Web pioneer
↗ As Innovation lab within Sirma Group (listed as SKK), the biggest Bulgarian software house
↗ Got spun-off and took VC investment in 2008
↗ 65 staff, HQ in Bulgaria, reps in Canada, UK, Germany and USA
↗ Over 400 person-years invested in R&D
↗ Multiple innovation & technology awards: Washington Post, BBC, FT, BAIT, etc.
↗ Member of multiple industry bodies: W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation
Clients (selection)
GraphDB
↗ Scalable RDF 1.1 engine
↗ Platform independent
↗ W3C standards support
↗ Open source API
↗ Reasoning and consistency checking
↗ Main contributor to RDF4J project
↗ Excellent support
This webinar
• SPARQL editing and data visualization features available in GraphDB Workbench (GDB WB)
• Using queries written by others: query URL, parameterization
• Data visualizations that can be added with little programming
• 3rd party SPARQL writing aids and visualization tools that can be integrated with GraphDB (we'd be glad to do that for you)
• Full report: HTML, PDF
• Webinar: presentation, TODO recording
SPARQL Editing
• GDB WB integrates the YASGUI editor
• Automatic prefix addition (best practice: load prefixes.ttl)
• Class autocompletion
• Property autocompletion
FactForge Saved Queries
FactForge query F04
FactForge Charts: Bar
FactForge Charts: Pie
SPARQL Results in Google Sheet FactForge-Industries
Google Sheet Formulas
● Top left cell: get data (see next for the long ugly URL)
=importdata("http://factforge.net/repositories/ff-news?query=%23+F4%3A+Top-level+industries+by+number+of+companies%0A%23+-+benefits+from+the+mapping+and+consolidation+of+industry+classifications%0A%23+++and+predicates+in+DBPedia+done+in+the+FactForge%0A%23+-+benefits+from+reasoning+-+transitive+and+symmetric+properties+across%0A%23+++the+industry+classification+taxonomy+of+FactForge%0A%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0APREFIX+ff-map%3A+%3Chttp%3A%2F%2Ffactforge.net%2Fff2016-mapping%2F%3E%0A%0ASELECT+DISTINCT+%3Ftop_industry+(COUNT(*)+AS+%3Fcount)%0A%7B%0A+++%3Fcompany+dbo%3Aindustry+%3Findustry+.%0A+++%3Findustry+%5Eff-map%3AindustryVariant+%2F+ff-map%3AindustryCenter+%3Ftop_industry+.%0A%7D%0AGROUP+BY+%3Ftop_industry+ORDER+BY+DESC(%3Fcount)+")
● Third col: extract industry name from industry URL
=regexreplace(A2,"http://dbpedia.org/resource/","")
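The long URL in the importdata formula is just the programmatic endpoint plus a percent-encoded query parameter. A minimal Python sketch of how such a URL could be produced is given here. The query text is the one decoded from the formula (leading comment lines omitted), and the helper name query_url is made up for illustration. The standard library encodes a few more characters than the URL on the slide does (e.g. parentheses), which decodes to the same query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Programmatic endpoint named in the slides.
ENDPOINT = "http://factforge.net/repositories/ff-news"

# Query decoded from the importdata URL (comment lines left out).
QUERY = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ff-map: <http://factforge.net/ff2016-mapping/>

SELECT DISTINCT ?top_industry (COUNT(*) AS ?count)
{
   ?company dbo:industry ?industry .
   ?industry ^ff-map:industryVariant / ff-map:industryCenter ?top_industry .
}
GROUP BY ?top_industry ORDER BY DESC(?count)
"""


def query_url(endpoint: str, sparql: str) -> str:
    """Return endpoint?query=<percent-encoded SPARQL> (hypothetical helper)."""
    # urlencode uses quote_plus: spaces become '+', newlines become %0A.
    return endpoint + "?" + urlencode({"query": sparql})


url = query_url(ENDPOINT, QUERY)

# Decoding the URL gives back exactly the query text we started from.
assert parse_qs(urlsplit(url).query)["query"] == [QUERY]
print(url)
```

The printed URL can be pasted into =importdata("...") in the top left cell of the sheet.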
Query URL
• Interactive endpoint: http://factforge.net/sparql
− versus programmatic endpoint: http://factforge.net/repositories/ff-news
• List of repos as JSON: http://factforge.net/rest/repositories
• Get the query URL, then replace the interactive endpoint with the programmatic one
• If you dislike CSV, add Accept header, e.g. curl -H Accept:text/tab-separated-values
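The same Accept header can be set from a script instead of curl. The following Python sketch only builds the request and parses a response body; it does not contact the endpoint. The helper names tsv_request and parse_tsv are invented for this example, and the sample body reuses two result rows shown on the Query Parameters slide rather than a live response.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import Request

# Programmatic endpoint named in the slides.
ENDPOINT = "http://factforge.net/repositories/ff-news"


def tsv_request(sparql: str) -> Request:
    """Build (but do not send) a GET request that asks for TSV results."""
    url = ENDPOINT + "?" + urlencode({"query": sparql})
    return Request(url, headers={"Accept": "text/tab-separated-values"})


def parse_tsv(body: str) -> list[dict[str, str]]:
    """Turn a TSV result body into one dict per row, keyed by column header."""
    # QUOTE_NONE keeps RDF literals such as "Google"@en intact.
    reader = csv.DictReader(
        io.StringIO(body, newline=""), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


req = tsv_request("SELECT * { ?s ?p ?o } LIMIT 3")
# To send it: urllib.request.urlopen(req).read().decode("utf-8")

# Sample body taken from the slides, not fetched from the endpoint.
sample = (
    "?industry\n"
    "<http://dbpedia.org/resource/Software>\n"
    "<http://dbpedia.org/resource/Internet>\n"
)
rows = parse_tsv(sample)
print(req.get_header("Accept"), rows)
```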
Query Parameters
• E.g. find the industries of a given $company
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?industry {$company dbo:industry ?industry}
• Add parameter to query URL (value in NTriples format):
&$company=<http://dbpedia.org/resource/Google>
− URL: <http://dbpedia.org/resource/Google>
− plain string: "Google"
− string with language: "Google"@en
− date with XSD type: "2017-05-25"^^<http://www.w3.org/2001/XMLSchema#date>
• Try it, returns
?industry
<http://dbpedia.org/resource/Software>
<http://dbpedia.org/resource/Internet>
<http://dbpedia.org/resource/Mobile_device>
<http://dbpedia.org/resource/Cloud_computing>
IRISA SQUALL (CNL)
• SQUALL (Semantic Query and Update High-Level Language). 2011-2013. Papers 1, 2, 3; examples.
• Example question
Which person is an author of at least 10 publication-s?
• Translates to
SELECT DISTINCT ?x1 WHERE { ?x1 a :person . {SELECT DISTINCT ?x1 (COUNT(DISTINCT ?x3) AS ?x2) WHERE { ?x3 a :publication . ?x3 :author ?x1 .
IRISA SPARKLIS (Faceted SPARQL)
• Project
• Youtube video
• Demo
• Examples
• Paper
GrammaticalFramework and MOLTO
• GrammaticalFramework: multilingual CNL
• MOLTO: EC FP project. Ontotext publications: 1, 2, 3, 4
• Define abstract grammar about a domain, with surface grammars for several natural languages
• When one of the surface languages is SPARQL, this enables CNL to/from SPARQL translation
MOLTO: CNL query to SPARQL
Question in English/Swedish is translated to SPARQL
MOLTO: RDF to NL Generation (Lexicalization)
Painting description in a dozen languages
W3C Data Cube
W3C Data Cube ontology:
• OLAP data model
• Statistical classifications following SDMX
Many statistical datasets available as RDF, e.g.:
• Linked SDMX Data developed by Sarven Capadisli: International Monetary Fund IMF, OECD, UN Food and Agriculture Organization FAO, Swiss Federal Statistical Office BFS, European Central Bank ECB, World Bank, Transparency International.
• Eurostat developed by the LOD Around the Clock (LATC) project (static)
• Eurostat wrapper developed by Benedikt Kämpgen (updateable)
• US Securities and Exchange Commission SEC Edgar Wrapper developed by Benedikt Kämpgen
• UN ComTrade developed by the Multisensor project
AKSW CubeViz
CubeViz: faceted statistical browser, visualization charts.
● Original project: OntoWiki addon (dependency), PHP: demo, source, wiki, used at the EU Open Data Portal.
● Currently being rewritten to JavaScript: demo (doesn't quite work), source
AKSW CubeViz
Polar Chart (EU Digital Agenda Scoreboard)
OpenCube Toolkit
OpenCube Toolkit developed by OpenCube project. Tools for:
Data Creation (conversion)
• TARQL extension: CSV/TSV files
• D2RQ extension for data cubes: relational databases
• JSON-stat2qb extension: JSON-stat
• R2RML extension: relational databases, following W3C standard
Data Expanding
• OpenCube Compatibility Explorer: (a) search LOD and find cubes compatible to expand initial cube, (b) establish typed links
• OpenCube Aggregator: (a) creates 2^n − 1 new cubes: all combinations of n dimensions, (b) new observations for all attributes of a hierarchical dimension
• OpenCube Expander: merge two compatible cubes
Data Exploring
• Data catalogue management: user interface (UI) templates for managing metadata on RDF data cubes and supporting search and discovery
• OpenCube Browser: table-based visualizations
• OpenCube OLAP Browser: OLAP operations: pivot, drill-down, and roll-up
• R statistical analysis: run R data analysis scripts
• Interactive chart visualization widgets: cube slices with charts
• OpenCube MapView: visualize geo-spatial dimension: choropleth, markers, bubbles
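The cube count quoted for the OpenCube Aggregator, one cube per combination of n dimensions, can be checked with a few lines of Python. This sketch reads "all combinations" as every non-empty subset of the dimensions; the toolkit itself may count differently (for example, including a grand-total cube and excluding the original one), which gives the same number. The dimension names are placeholders, not taken from any real cube.

```python
from itertools import combinations


def dimension_subsets(dimensions):
    """List every non-empty subset of the given dimensions, smallest first."""
    dims = list(dimensions)
    return [
        combo
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    ]


# Placeholder dimension names, chosen only for illustration.
dims = ["refArea", "refPeriod", "sex"]
subsets = dimension_subsets(dims)

# n dimensions give 2^n - 1 non-empty subsets.
assert len(subsets) == 2 ** len(dims) - 1
for subset in subsets:
    print(subset)
```

Because the count doubles with every added dimension, pre-computing all aggregate cubes is practical only for cubes with a handful of dimensions.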
CubesViewer
• CubesViewer: excellent OLAP visualization tool: demo, CubesViewer Studio demo, source, documentation.
• Based on DataBrewery Cubes framework: source, documentation.
• Unfortunately does not yet support W3C Cubes
− We'd love to develop such a feature for you (tracking issue)
CubesViewer
GDB WB Builtin Overview: Class Relations
GDB WB Builtin: Class Instances & Hierarchy
GDB WB Builtin: Domain/Range Graph
GDB WB Builtin Detail: Visual Graph
GDB WB Visual Graph: Relations of Google
GDB Graph Viz Dev: Company Relations
GDB Dev Hub: Visualizing GraphDB data with Ogma JS (library developed by Linkurious)
GDB Graph Viz Dev: Flight Routes
Visualization Toolkits
Numerous powerful and popular visualization tools, creating an amazing variety of graphs and charts, e.g.:
● d3.js, with addons (e.g. interactive selection of chart type)
● Tableau Public edition
● Microsoft PowerBI
● GoJS
● Google Charts
● Linkurious
Specialized tools, e.g.
● CrossFilter for "faceting" of multidimensional data
● Cubism for viewing time series
● CubeViz and OpenCube Toolkit for statistical data
● Histropedia for making advanced timelines
Example with GDB and Tableau
Public procurement spending through last 5 Bulgarian cabinets (2011-2016). Sofia Datathon, March 2017. Slides, Visualization
Example with GDB and PowerBI
Procurements by one contracting authority in time. Filtering by government cabinet, focusing by time interval. Sofia Hackathon, Apr 2017
RDF by Example
• ONTO tool for RDF instance visualization (rdfpuml) and R2RML generation (rdf2rml).
• E.g. mapping Dun & Bradstreet company data to Financial Industry Business Ontology (FIBO)
RDF by Example
• Dun & Bradstreet details (top-right): 3 "measures" (NetWorth, AnnualSales, ProfitLoss)
• Total of 152 fields grouped in 32 nodes: impossible to comprehend without such a diagram
R2RML Generation
• Model of Museum Exhibitions (for J. Paul Getty Museum)
• Includes RDB joins and field names (Gallery TMS)
R2RML Generated From Model
• R2RML is verbose: 3 nodes, 15 statements for every model statement
• 1 model node (representing an Exhibition at a Venue) is expanded to 15 R2RML nodes: huge savings in complexity and maintainability
• R2RML requires semantic experts, whereas model diagrams can be understood by subject-matter experts (museum curators, commodity trade analysts, etc)
• Details in SWIB'16 presentation
R2RML Generated From Model: Detail
Why JDBC/ODBC?
• Many viz tools (e.g. Pentaho, Centrifuge, QlikView, Tableau) have ODBC/JDBC interfaces
• To save the effort of constructing query URLs and saving results, we can provide a JDBC API to GraphDB
• The user feeds SPARQL queries (not SQL) through JDBC, and SPARQL tabular results are returned to the tool
• We can reuse Jena JDBC or another open source library
• If the tool supports ODBC not JDBC, we can use the JDBC-ODBC bridge (sun.jdbc.odbc.JdbcOdbcDriver; note that this driver was removed from the JDK in Java 8).
• E.g. connecting from Java to Excel using ODBC and the JDBC-ODBC bridge
• Contact: [email protected]
Lead, Innovation and Consulting Group, Ontotext Corp
• We'd be glad to deploy any 3rd party tools and integrate them with GraphDB for you!
Thanks for your attention.
Question time!