

Linked Data and Visualization: Two Sides of the Transparency Coin

Auriol Degbelo
Institute for Geoinformatics, University of Muenster
Heisenbergstrasse 2, 48149 Muenster, Germany

[email protected]

ABSTRACT

Transparency is an important element of smart cities, and ongoing work is exploring the use of available open data to maximize it. This position paper argues that Linked Data and visualization play similar roles, for different agents, in this context. Linked Data increases transparency for machines, while visualization increases transparency for humans. The work also proposes a quantitative approach to the evaluation of visualization insights which rests on two premises: (i) visualizations could be modelled as a set of statements made by authors at some point in time, and (ii) statements made by experts could be used as ground truth while evaluating how many insights are effectively conveyed by visualizations on the Web. Drawing on the Linked Data rating scheme of Tim Berners-Lee, the paper proposes a five-star rating scheme for visualizations on the Web. The ideas suggested are relevant to the development of techniques to automatically assess the transparency level of existing visualizations on the Web.

CCS CONCEPTS

• Information systems → Web searching and information discovery; • Human-centered computing → Visualization design and evaluation methods;

KEYWORDS

Visualization, Linked Data, Transparency, Rating Scheme, Evaluation, Smart City

ACM Reference format:
Auriol Degbelo. 2017. Linked Data and Visualization: Two Sides of the Transparency Coin. In Proceedings of UrbanGIS’17: 3rd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, Redondo Beach, CA, USA, November 7–10, 2017 (UrbanGIS’17), 8 pages.
DOI: 10.1145/3152178.3152191

1 INTRODUCTION

Smart cities are attracting growing interest from various stakeholders. As for research, Ojo et al. [32] recently reported an increase of 200% in publication volume for smart cities research since 2009.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
UrbanGIS’17, Redondo Beach, CA, USA
© 2017 ACM. 978-1-4503-5495-0/17/11...$15.00
DOI: 10.1145/3152178.3152191

As for local governments, a survey conducted in the US in 2016 by the International City/County Management Association (in cooperation with the Smart Cities Council) revealed that more than 50% of the respondents identified smart cities activities as being of medium to high priority to them¹. Several industry players (e.g., IBM², Microsoft³, Siemens⁴) have their own smart city solutions, while various toolkits are currently on the market to make cities smarter (see [10]). There are various possible ways of defining “smart cities”. In this work, the term is used to denote “a system integration of technological infrastructure that relies on advanced data processing with the goals of making city governance more efficient, citizens happier, businesses more prosperous and the environment more sustainable” [42].

Citizen participation has been acknowledged in the literature as a key component of smart cities (e.g., [7, 11, 21]). Inputs from citizens indeed have the potential of providing “information, knowledge, and experience, which contributes to the soundness of government solutions to public problems” [29]. Transparency is an important component of citizen participation. For instance, Johannessen and Berntzen stressed that “A key aspect of participation is that of transparency” [21]; Kim and Lee [24] reported on some positive correlations between citizen engagement and transparency; and Attard et al. [2] pointed out that transparency is one major motivation of open government data initiatives, and can help citizens establish a trusting relationship with the government.

Linked Data and visualization are two important technologies in the context of smart cities. As to the former, Dadzie and Pietriga [8] indicated that Linked Data provides a structured source of knowledge for both research and practical applications, and has been adopted as a practice in various open data initiatives. d’Aquin et al. [9] highlighted the relevance of Linked Data to deal with the diversity of smart city data. Regarding the latter, there is a growing number of visualizations on the Web targeting city use cases (for examples of collections, see https://goo.gl/PFGqua and https://goo.gl/QCHFZM). This confirms Dykes et al.’s [13] early hunch that visualizations have a key role to play as one strives to make sense of the city data that are being collected. An early discussion of the potential role of Geographic Information Systems for smart cities pointed out that “Geo-visualization is the most essential instrument of representing geographic data and analysis results and exploring potential interesting findings” [39].

This position paper argues that (as far as transparency and smart cities are concerned) Linked Data and visualization play a similar role for different audiences. Linked Data, due to its structured format, facilitates open data consumption for machines; visualization facilitates open data consumption for humans. This implies that (as far as transparency and smart cities are concerned) it is worthwhile to consider areas where research on Linked Data and research on visualization can learn from one another. This is what the article sets out to do. In particular, the paper discusses how assessing the effectiveness of visualizations with respect to insight generation (an open issue at the moment, see e.g., [6, 19, 34, 41]) may benefit from work already done in Linked Data.

¹ See https://goo.gl/SCYjUH (last accessed: September 06, 2017) for the full report.
² https://goo.gl/xe1niw (last accessed: September 06, 2017).
³ https://goo.gl/45vL6w (last accessed: September 06, 2017).
⁴ https://goo.gl/q86QrA (last accessed: September 06, 2017).

The arguments are presented in six sections. Section 2 presents the framework for transparency considered, and the contributions of Linked Data to transparency. Section 3 discusses visualizations as enablers of transparency. Section 4 briefly touches upon the challenges of assessing the effectiveness of Linked Data and visualizations regarding transparency enablement. Section 5 brings forth an approach to quantitatively assess the insightfulness of visualizations, and elaborates on its advantages as well as drawbacks. Section 6 proposes a rating scheme for visualizations on the Web (as a first step towards making them more findable), and Section 7 concludes the article.

2 LINKED DATA AND TRANSPARENCY

Transparency is important in the smart city context, but there are various kinds of it. For instance, Johannessen and Berntzen [21] suggested six categories of transparency: document transparency (i.e., access to government documents), meeting transparency (i.e., access to meetings of public bodies including their agenda and minutes), process transparency (i.e., explanation of processes leading to government decisions including when and how citizens may have their say), benchmarking transparency (i.e., access to data about the performance of public institutions), decision-maker transparency (i.e., access to information about who the decision-makers are and what conflicting interests they may have) and disclosure transparency (i.e., the right to ask written or oral questions related to information not in documents or meeting agendas). Heald [17] mentioned some additional types of transparency including transparency upwards (i.e., the right of rulers to observe activities of subordinate agents), transparency downwards (i.e., the right of the ruled to observe activities of their rulers), transparency outwards (i.e., an agent can observe what is going on outside the organization), transparency inwards (i.e., those outside can observe what is happening inside the organization), transparency in retrospect (i.e., an organization reports at periodic intervals about its activities) and transparency in real-time (i.e., continuous surveillance of the organization). Not present in the previous lists, but also relevant to the smart city context, is algorithmic transparency [3], i.e., the extent to which parameters of predictive algorithms shaping local government actions are made known to the public. Though all these aspects of transparency are important for cities, the remainder of this work focuses only on transparency in the specific context of open government data (which is akin to ‘benchmarking transparency’ mentioned above).

Michener and Bersch [28] presented transparency as a continuum and proposed two dimensions of transparency: visibility and inferability. Visibility means that the information is (i) reasonably complete and (ii) found with relative ease. Inferability refers to the degree to which the information at hand can be used to draw accurate conclusions. According to [28], properties of visibility are intrinsic to the information, whereas inferability is contingent on the receptive capacity of the target audience. Viewing transparency as a continuum implies that existing (legally or technically) open data might be associated with various levels of it (i.e., both visibility and inferability).

Transparency of open datasets can be increased through the explicit linking of these datasets to related datasets, as well as to applications which consume them. Creating these explicit links boosts transparency in at least two ways. First, linking expands the completeness of information, and thereby its visibility. Second, creating explicit links between open datasets and applications which re-use them is a way of unveiling their use, and making their value more apparent. As Janssen et al. [20] indicated, “open data has no value in itself; it only becomes valuable when used”.

Linked Data is machine-readable data, and therefore primarily suitable for consumption by machines. As a result, making typed links between resources explicit contributes to making the data more transparent for software agents. Since Linked Data is part of the Big Data landscape (see [18]), the arguments presented in this section hold also for Big Data in the context of smart cities. Some examples of the use of Linked Data to increase transparency are found in [14, 26, 30]. Futia et al. [14] report on using Linked Data principles to reduce fragmentation (i.e., increase visibility) of information in Italian procurement data. Martin et al. [26] present an RDF (Resource Description Framework) version of the Financial Transparency System data from the European Commission, and point out that Linked Data leads to an increased financial transparency of EU project funding. Mora-Rodriguez et al. [30] used a combination of XBRL (Extensible Business Reporting Language) and Linked Data to make corporate data more transparent.

3 VISUALIZATION AND TRANSPARENCY

Visualization also plays an important role in the context of open data re-use and smart cities. Segel and Heer [37] discussed the storytelling potential of data visualization. In addition, data visualization “makes it possible for researchers, analysts, engineers, and the lay audience to obtain insight in these data in an efficient and effective way” [40]. Visualization can be considered from three different viewpoints presented in [40]: technology, art, or science. In the current article, visualization is viewed as a technology to communicate a story.

Scheider et al. [35] suggested modelling the content of a map as the set of assertions which can be extracted by looking at it. That is, geovisualizations (and more generally visualizations) could be seen as a set of RDF statements made by authors with a certain reputation at some point in time (see [25]). The consequence of this view is that visualizations are also enablers of transparency. They make (some) statements visible to the consumer, namely those that the author of the visualization includes in her narrative. The number of statements that the consumer actually notices while using a visualization depends on many factors, including her own experience and the degree of interactivity provided. Visualizations stimulate visual thinking, and are therefore primarily suitable for human agents. To echo Shneiderman [38], “The attraction of visual displays … is that they make use of the remarkable human perceptual ability for visual information”. Therefore, making (selected) facts much more prominent via a visualization leads primarily to an increase in visibility and/or inferability for human agents.

4 ASSESSING EFFECTIVENESS REGARDING TRANSPARENCY

As the previous two sections showed, Linked Data and visualization increase transparency for machines and humans respectively. They are two sides of the same coin. The question now is how effective both are with respect to transparency. Why care? Because increasing transparency means making more insights visible and inferable. In particular, visualizations which increase transparency are key to informing users effectively. Informing users effectively, in turn, is necessary if visualizations are to be potent for problem-solving. As Griffin et al. [16] pointed out: “Because maps are used to solve problems that underlie the sustainability of life on Earth (e.g., climate change, water resource allocation, declines in biodiversity, etc.), understanding how maps are insightful is more important than ever” [original emphasis].

Measuring the effectiveness of two linked datasets with respect to transparency is a matter of comparing how many statements (i.e., triples) each dataset makes visible with respect to a given phenomenon. A reasoner could also be used to check the respective performance of the two datasets with respect to inferability (i.e., how many additional statements they enable). Given a dataset D and a topic T, producing algorithms to answer the question “How many triples of D are about T?” is currently a challenge.
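The visibility comparison described above can be sketched in a few lines, under a deliberately naive assumption: a triple counts as being "about" topic T when T appears as its subject or object. The function names and the two toy datasets are hypothetical (only the `lodcom:nord` triples echo Listing 1), and the sketch sidesteps the general problem of deciding topical relevance, which remains the open challenge noted above.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
Triple = tuple[str, str, str]


def triples_about(dataset: set[Triple], topic: str) -> set[Triple]:
    """Naive topicality test (an assumption): topic is the subject or the object."""
    return {t for t in dataset if topic in (t[0], t[2])}


def visibility(dataset: set[Triple], topic: str) -> int:
    """Number of statements a dataset makes visible about a topic."""
    return len(triples_about(dataset, topic))


# Hypothetical datasets; lodcom:mitte and lodcom:centrum are made up.
dataset_a = {
    ("lodcom:nord", "lodcom:hasFemaleUnemployment2010", "733"),
    ("lodcom:nord", "sparel:contains", "lodcom:coerde"),
    ("lodcom:mitte", "sparel:contains", "lodcom:centrum"),
}
dataset_b = {
    ("lodcom:nord", "sparel:contains", "lodcom:coerde"),
}

for name, data in (("A", dataset_a), ("B", dataset_b)):
    print(name, visibility(data, "lodcom:nord"))
```

Comparing inferability would follow the same pattern, counting the statements a reasoner adds to each dataset rather than the statements already present.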

Measuring the effectiveness of two visualizations with respect to transparency also faces a challenge, but of a different kind. The main issue here is that of specifying what ‘insight’ is. As van Wijk [41] pointed out, insight is “ill-defined and hard to measure”. Chang et al. [5] suggested a distinction between spontaneous insight (i.e., a moment of enlightenment) and knowledge-building insights (i.e., an advance in knowledge or a piece of information). As Chang et al. indicated, “cognitive scientists have successfully identified the neural patterns of the spontaneous insight phenomenon and can now observe and measure the insight process”. The ideas suggested in the following focus on measuring knowledge-building insights made visible to the user by a visualization. Discussing the inferability aspect of these knowledge-building insights is left for future work.

5 MEASURING KNOWLEDGE-BUILDING INSIGHTS OF VISUALIZATIONS

Visualizations could be approached based on the following premises:

• P1: a visualization is a set of statements made by authors with a certain reputation at some point in time.
• P2: finding an absolute ground truth for a visualization may be unattainable, but statements made by experts can be used as ground truth for evaluation purposes.

P1 was already discussed in Section 3. Regarding P2, an expert denotes the creator of the visualization, or a user who has accumulated knowledge about it through prolonged use. P2 adapts the idea, well known in GIScience (see e.g., [12]), of using the best available data as ‘truth’ against which the accuracy of other datasets can be assessed. The effectiveness of visualizations regarding the number of insights actually made visible can be assessed through the administration of questionnaires both pre- and post-interaction. Linked Data formulates statements as triples, and this happens to be the simplest form in which statements can be made in natural language (see [25]). As a result, one could take advantage of the simplicity of triples while formulating statements to assess the insightfulness of visualizations. An additional aspect of the triple syntax which makes triples attractive in this context is their inherent structuredness. As Johnson [23] indicated, information is easier for people to scan and understand when presented in a terse and structured way. Identifying which of the two questionnaires (i.e., triple-based or natural-language-based) is easier for participants to scan and understand is a research question in itself.

Illustrative example: Consider the visualization of the unemployment rate in Münster between 2010 and 2014 shown in Figure 1. A few statements can be listed to assess this visualization; see Figure 3 (the triple-based version is shown in Figure 4). Statement 1 is an identification statement: it is related to one characteristic of Münster. Statement 2 is a comparison statement: it expresses relationships pertaining to the unemployment phenomenon in Münster. Identification and comparison were identified in [1] as two elementary cognitive operations for geovisualizations, and apply to city visualizations at large. Statement 3 is a spatial statement and may be useful while assessing the spatial learning effects of the visualization. Statement 4 points at a low-level fact whereas statement 5 is about a higher-level fact (i.e., a general trend of the dataset). These examples show the significance of the approach proposed here, and its capacity to cope with various flavors of knowledge-building insights in the context of smart cities.

Pros & Cons: The approach introduced here to measure the effectiveness of visualizations with respect to insight communication has a few advantages. First (and as shown above), the approach is flexible and can account for various examples of insights produced by city visualizations. The fact that the questionnaire provides the “I don’t know” option helps to establish a user-dependent baseline, and accounts for the (likely) diverse background knowledge of users interacting with city visualizations. Second, the approach is quantitative. North [31] suggested a qualitative way of measuring visualization insight where users verbalize insights in a think-aloud protocol. The ideas brought forth here provide a useful complement. For example, one could map participants’ answers to the set {-1, 0, 1} (where -1 denotes an incorrect answer, 0 is given for “I don’t know”, and 1 is assigned to a correct answer) and sum the scores. Variants of this rating are conceivable (e.g., give weights to different questions, introduce and assign a score to deceptive questions). Developing ratings which are ecologically valid is an open research question, and would necessitate collaboration with other disciplines, in particular with researchers working in the field of psychology. Third, accounting for multiple truths is possible. Viewing ‘truth’ as what the visualization expert says makes it possible to account for the fact that different stakeholders may have different priorities/interests when assessing visualization effectiveness. For example, a researcher on spatial learning may focus only on spatial statements to see how much the user knows after interacting with the visualization. A data journalist may design a questionnaire to assess how many (and which) high-level facts the user retains after interacting with her visualization. Fourth, a semi-automatic generation of the questionnaires may be possible in some cases. The visualization from Figure 1 is based on unemployment data made open by the Münster City Council as PDF (Portable Document Format) files⁵. The data from the Münster City Council was converted into Linked Data and visualized during a seminar at the Institute for Geoinformatics. The map-based interface visualizes a total of 4399 triples. Listing 1 presents an excerpt of the triples. If RDF triples about the dataset visualized are available, one could randomly generate a fixed number of questions from the pool of RDF triples after an interaction session. This necessitates tools to automate the RdfToNaturalLanguage translation process (a less demanding task than the reverse operation of NaturalLanguageToRdf translation). Using triple-based questionnaires (such as the one shown in Figure 4) is also an option worth exploring when the visualization is built on top of RDF data (i.e., triples).

With respect to drawbacks, one objection that can be raised about the questionnaire-based approach is that it does not help to account for the temporally elusive aspect of insight: “insight triggered by a given visualization may occur hours, days, or even weeks after the actual interaction with the visualization” [4]. There are two possible answers to this. First, the questionnaire-based approach is suggested to assess knowledge-building insight, not spontaneous insight. Insight occurring long after interaction with the visualization is spontaneous insight. Second, the questionnaire-based approach is intended to primarily account for the tacit understandability of visualizations. This tacit understandability, in turn, is critical for visualizations that matter. As Robinson et al. [33] recently pointed out in the context of geovisualizations, “Maps that matter are those that pique interest, are tacitly understandable and are relevant to our society” [emphasis added].
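As a minimal sketch of the {-1, 0, 1} mapping suggested above, the snippet below sums per-statement scores and optionally weights them. Reporting the difference between the post- and pre-interaction totals is an assumption made here for illustration, as are the function names and the example answers; the paper only states that questionnaires are administered before and after interaction.

```python
# Score mapping taken from the text: incorrect = -1, "I don't know" = 0, correct = 1.
SCORES = {"incorrect": -1, "unknown": 0, "correct": 1}


def insight_score(answers, weights=None):
    """Sum the scores of a questionnaire; weights (optional) scale each statement."""
    if weights is None:
        weights = [1] * len(answers)
    if len(weights) != len(answers):
        raise ValueError("one weight per answer is required")
    return sum(w * SCORES[a] for a, w in zip(answers, weights))


def insight_gain(pre, post, weights=None):
    """Assumed summary measure: post-interaction score minus pre-interaction score."""
    return insight_score(post, weights) - insight_score(pre, weights)


# Hypothetical answers of one participant to five statements.
pre = ["unknown", "unknown", "incorrect", "unknown", "correct"]
post = ["correct", "correct", "correct", "unknown", "correct"]

print(insight_gain(pre, post))  # 4
```

The pre-interaction score plays the role of the user-dependent baseline mentioned above; weighting schemes and deceptive questions would need the ecological validation the text calls for.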

Listing 1: Example statements about unemployment in Münster as RDF.

@prefix dbpedia: <http://dbpedia.org/ontology/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix lodcom: <http://vocab.lodcom.de/> .
@prefix sparel: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/> .
@prefix xsd: <http://www.w3.org/TR/xmlschema-2/> .

# Unemployment in Muenster Nord (2010-2014)
lodcom:nord dc:description "Female Amount of Unemployed in Nord Borough From 2010 to 2014"@en ;
    dc:description "Female Anzahl der Arbeitslose in Nord Stadtbezirk Von 2010 bis 2014"@de ;
    lodcom:hasFemaleUnemployment2010 "733"^^xsd:integer ;
    lodcom:hasFemaleUnemployment2011 "748"^^xsd:integer ;
    lodcom:hasFemaleUnemployment2012 "810"^^xsd:integer ;
    lodcom:hasFemaleUnemployment2013 "774"^^xsd:integer ;
    lodcom:hasFemaleUnemployment2014 "750"^^xsd:integer ;
    lodcom:TypeofCityDivision dbpedia:borough ;
    sparel:contains lodcom:coerde ;
    sparel:contains lodcom:kinderhaus-ost ;
    sparel:contains lodcom:kinderaus-west ;
    sparel:contains lodcom:sprakel .
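The random generation of questions from a pool of RDF triples, mentioned in Section 5, could look roughly like the sketch below. It is only an illustration: the sentence templates, the function names and the way prefixes are stripped are all assumptions, and a real RdfToNaturalLanguage tool would need far richer verbalization rules. The triples are copied from Listing 1 as plain tuples.

```python
import random

# A few triples from Listing 1, as (subject, predicate, object) tuples.
TRIPLES = [
    ("lodcom:nord", "lodcom:hasFemaleUnemployment2010", "733"),
    ("lodcom:nord", "lodcom:hasFemaleUnemployment2014", "750"),
    ("lodcom:nord", "sparel:contains", "lodcom:coerde"),
    ("lodcom:nord", "sparel:contains", "lodcom:sprakel"),
]

# Hypothetical predicate-to-sentence templates.
TEMPLATES = {
    "lodcom:hasFemaleUnemployment2010": "In 2010, {s} had {o} unemployed women.",
    "lodcom:hasFemaleUnemployment2014": "In 2014, {s} had {o} unemployed women.",
    "sparel:contains": "{s} contains {o}.",
}


def local_name(term):
    """Drop the prefix of a prefixed name; literals are returned unchanged."""
    return term.split(":", 1)[-1]


def to_statement(triple):
    """Verbalize one triple, falling back to a terse triple-like sentence."""
    s, p, o = triple
    template = TEMPLATES.get(p, "{s} {p} {o}.")
    return template.format(s=local_name(s), p=local_name(p), o=local_name(o))


def sample_questions(triples, k, seed=None):
    """Draw k distinct triples at random and turn them into questionnaire items."""
    rng = random.Random(seed)
    return [to_statement(t) + " (true / false / I don't know)"
            for t in rng.sample(triples, k)]


for item in sample_questions(TRIPLES, k=2, seed=0):
    print(item)
```

Because every item is derived from a triple in the dataset, all generated statements are true; the deceptive questions mentioned in Section 5 would require perturbing some triples before verbalization.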

6 A RATING SCHEME FOR VISUALIZATIONS ON THE WEB

Since Linked Data and visualization share some commonalities with respect to transparency, and the former has a star rating scheme⁶, it may be worthwhile to explore the idea of a rating scheme for visualizations on the Web. Two reasons motivate this. First, the five-star rating scheme for Linked Data is arguably crude, but it has the advantage that background programs can use it to check existing datasets within minutes (for an example, see the portal of Data.gov.uk⁷). Second, while there is useful ongoing work about benchmarking RDF data (for a recent discussion, see [27]), there is relatively little discussion about how to proceed with the growing number of visualizations of datasets on the Web. If both a technique for ranking RDF datasets and a technique for ranking visualizations were to become available soon, the conjecture made in this article is that a layman would go for the latter first. Graves and Hendler have provided early insights into the wishes of laymen regarding visualizations of open data in [15]⁸. They pointed out that there is a “real interest” from the people surveyed to create, reuse and explore visualizations of open data.

⁵ See http://www.stadt-muenster.de/stadtentwicklung/zahlen-daten-fakten.html (last accessed: September 18, 2017).
⁶ https://www.w3.org/DesignIssues/LinkedData.html (last accessed: July 21, 2017).

The first and foremost question when it comes to ranking is what to reward. The five-star data rating scheme of Sir Tim Berners-Lee rewards at least four things: the use of an open license, the use of open standards, the use of a structured (i.e., machine-readable) format, and linking. There is a plethora of aspects to reward when it comes to visualization: ease of use, effectiveness, degree of interactivity (the more interactivity, the more possibility for the consumer to explore), to name but a few. Not all of these, however, would be amenable to machine processing. The next paragraph attempts to take advantage of the fact that many visualizations on the Web are, in essence, HTML (Hypertext Markup Language) pages. The goal is to encourage visualization authors to make their visualizations more open (through the supply of structured metadata about them), and to reward the efforts they would put into doing this.

Looking at the aspects rewarded by the five-star Linked Data rating scheme above, and adapting them for visualizations on the Web:

• The first star could reward the machine-readability aspect. For instance, a visualization for which the source code can be consulted as HTML would get a star (a mere .png or .jpeg image would not);

• The use of JSON-LD to provide an additional description of the visualization earns the visualization author a further star. JSON-LD⁹ has had the status of W3C Recommendation since 2014 and is an open standard for the annotation of content in Web-based programming environments. Recent statistics from the WebDataCommons suggest that JSON-LD is quickly gaining popularity. As of October 2016, JSON-LD ranked second behind Microdata as the most used method of embedding structured data in HTML¹⁰;

• The use of an open license (e.g., Creative Commons) could warrant a third star (note that the license is only required to be open, and need not be Creative Commons). Open license is used here in line with the Open Definition to denote a license “which grants permission to access, re-use and redistribute a work with few or no restrictions”¹¹;

• The use of open source libraries could be rewarded by a fourth star. Examples of open source libraries include Leaflet (Map), jQuery UI (User Interaction), and Intro.js (Documentation). The use of open source libraries should be particularly encouraged on the Web because the survey data from [15] suggests that users are keen on (a) modifying the available visualization a little bit, and (b) creating a similar visualization, but using their own data. Open source libraries are a key enabler of these two tasks;

• Finally, explicitly linking to the data source(s) visualized could earn a fifth star. This is in line with the wishes of stakeholders surveyed in [15] to have some information about the origin of the dataset visualized.

Figure 1: A visualization of unemployment rates in Münster.

(a) Rates for males (b) Rates for females

Figure 3: Evaluating the insightfulness of visualizations. Having questionnaires administered both pre-interaction and post-interaction can help assess the effectiveness of the visualization regarding the number of insights actually made visible.

Figure 4: Evaluating the insightfulness of visualizations. Questions could be formulated taking advantage of the triple syntax. If the visualizations are built on top of RDF datasets, the process of generating the questions might be automated (to some extent).

⁷ See http://guidance.data.gov.uk/five_stars_of_openness.html (last accessed: July 21, 2017).
⁸ [15] is one of the very few surveys which dealt with users’ perceptions about the role of visualization in the context of open data so far. Surveys about open data in general are available (see e.g., [22, 36]), but explorations of users’ perceptions about visualizations in the context of open data publication and consumption have been less common.
⁹ https://www.w3.org/TR/json-ld/; https://json-ld.org/ (last accessed: September 19, 2017).
¹⁰ See http://webdatacommons.org/structureddata/2016-10/stats/stats.html (last accessed: September 19, 2017).

¹¹ http://opendefinition.org/guide/ (last accessed: September 19, 2017).

Illustrative example: Figure 5 presents the annotation of the visualization from Figure 1 in JSON-LD. The annotation can be retrieved from the source code of the visualization available at http://giv-oct.uni-muenster.de/ijald/g1/. Concepts and relationships from Schema.org (see http://schema.org/docs/full.html) were used for the annotation. From the <script type="application/ld+json">..</script> statement (Lines 17 and 50 of Figure 5) one can infer both machine-readability and the use of an open standard to describe the visualization, granting it the first two stars. Line 20 of the figure states that the application is a WebApplication, and line 21 further specifies that it is a visualization. Line 33 mentions that the visualization is licensed under the terms of the Apache License (which is an open source license), earning the visualization a third star. The “isBasedOn” property from Schema.org helps to describe the libraries which were used when creating the visualization. Since both jQuery and Leaflet are open source, the visualization has four stars. Finally, the “supportingData” property from the Schema.org vocabulary enables the specification of the original data on top of which the visualization has been built, leading to a five-star visualization.
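The star assignment walked through above could be sketched in Python (standard library only) roughly as follows. This is an illustration under stated assumptions, not the procedure used in the paper: it assumes that stars are cumulative (each star presupposes the previous ones), that the annotation is a single top-level JSON-LD object, and that open licenses and open source libraries can be recognised through small allow-lists. The allow-lists, the names `JsonLdExtractor` and `rate_visualization`, and the example page (including its `applicationCategory` value and data URL) are hypothetical and are not taken from Figure 5.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # a malformed annotation is treated as absent


# Hypothetical allow-lists; a real checker would need curated, maintained lists.
OPEN_LICENSE_HINTS = ("apache.org/licenses", "creativecommons.org", "opensource.org/licenses")
OPEN_SOURCE_LIBRARIES = ("jquery", "leaflet", "intro.js")


def as_list(value):
    """Wraps a single JSON-LD value in a list; None becomes an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def as_text(value):
    """Flattens a JSON-LD value (string or nested object) into lowercase text."""
    return json.dumps(value).lower()


def rate_visualization(html):
    """Returns a star count from 1 to 5 for a visualization served as HTML."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    stars = 1  # star 1: the source can be consulted as HTML
    if not extractor.documents:
        return stars
    stars += 1  # star 2: described with JSON-LD
    doc = extractor.documents[0]
    if not isinstance(doc, dict):
        return stars
    license_text = as_text(doc.get("license", ""))
    if not any(hint in license_text for hint in OPEN_LICENSE_HINTS):
        return stars
    stars += 1  # star 3: open license
    libraries = as_list(doc.get("isBasedOn"))
    all_open = all(
        any(name in as_text(lib) for name in OPEN_SOURCE_LIBRARIES) for lib in libraries
    )
    if not libraries or not all_open:
        return stars
    stars += 1  # star 4: only open source libraries declared
    if as_list(doc.get("supportingData")):
        stars += 1  # star 5: explicit link to the data source(s)
    return stars


EXAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebApplication",
  "name": "Unemployment rates (hypothetical annotation)",
  "applicationCategory": "Visualization",
  "license": "http://www.apache.org/licenses/LICENSE-2.0",
  "isBasedOn": [
    {"@type": "SoftwareSourceCode", "name": "jQuery"},
    {"@type": "SoftwareSourceCode", "name": "Leaflet"}
  ],
  "supportingData": {"@type": "DataFeed", "url": "http://example.org/unemployment.csv"}
}
</script>
</head><body><div id="map"></div></body></html>
"""

print(rate_visualization(EXAMPLE_PAGE))  # 5 under the assumptions above
```

A page without any JSON-LD block would stop at one star in this sketch, and a page whose annotation lacks a recognisable open license would stop at two. The allow-list lookups are deliberately naive and mark the places where more robust heuristics would be needed.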

Pros & Cons: There are three main advantages of the approach suggested above. First (and as the example has illustrated), the resources to implement it are already available. The example has indeed shown that Schema.org (which is already used by applications from Google, Microsoft, Pinterest, Yandex) could be easily extended to document visualizations in a more telling way for their future search. Second, since Schema.org provides some properties to describe spatio-temporal aspects of an entity, spatio-temporal search of visualizations would become possible. The already existing GeoJSON-LD vocabulary¹² makes an additional case for the extensibility of the approach to cope with geographic visualizations. Third, JSON-LD (unlike other formats such as Microdata or RDFa) offers the advantage of a clean separation between HTML code and structured documentation of the code, making maintenance of these documents for large websites potentially easier.
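To make the second advantage more concrete, the following Python sketch filters a handful of annotations by place and period. It assumes, purely for illustration, that each annotation carries the Schema.org properties `spatialCoverage` (as a `Place` with a `name`) and `temporalCoverage` (as an ISO 8601 interval of the form "YYYY/YYYY"); whether visualization authors would supply these properties is an assumption, and the annotations, names and years below are made up.

```python
def parse_years(temporal_coverage):
    """Parses an ISO 8601 interval "YYYY/YYYY" (or a single "YYYY") into (start, end)."""
    parts = temporal_coverage.split("/")
    return int(parts[0][:4]), int(parts[-1][:4])


def matches(annotation, place, year_from, year_to):
    """True if the annotation covers the place and overlaps the requested period."""
    coverage = annotation.get("spatialCoverage", {})
    if coverage.get("name", "").lower() != place.lower():
        return False
    if "temporalCoverage" not in annotation:
        return False
    start, end = parse_years(annotation["temporalCoverage"])
    return start <= year_to and end >= year_from


# Hypothetical annotations, as a crawler might have collected them.
ANNOTATIONS = [
    {"name": "Unemployment rates",
     "spatialCoverage": {"@type": "Place", "name": "Münster"},
     "temporalCoverage": "2010/2014"},
    {"name": "Population density",
     "spatialCoverage": {"@type": "Place", "name": "Berlin"},
     "temporalCoverage": "2012/2016"},
    {"name": "Traffic counts",
     "spatialCoverage": {"@type": "Place", "name": "Münster"},
     "temporalCoverage": "2001/2005"},
]

hits = [a["name"] for a in ANNOTATIONS if matches(a, "Münster", 2012, 2013)]
print(hits)  # ['Unemployment rates']
```

Matching on place names is the simplest possible choice; geometries (for instance from GeoJSON-LD) would allow proper spatial queries but are left out of this sketch.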

As to the drawbacks of the approach, developing heuristics to reliably assign the third and fourth stars is a challenge. In particular, more work is needed to find out how best to proceed when the visualization builds on both closed-source and open-source materials. This is a worthy challenge though. From a user’s point of view, having software agents crawl the Web (or part of it) and perform a preliminary information aggregation task using the five stars presented above could be valuable while looking for visualizations for their open data. The second disadvantage of the rating scheme is that it rewards some aspects to the detriment of others. For example, the provision of spatio-temporal metadata is not rewarded at the moment. This drawback is acknowledged, but would be inherent to any rating scheme of the sort introduced here. The final decision of which aspects to reward, and which not, will need a concerted effort of the research community as a whole.

Figure 5: JSON-LD description of the visualization of unemployment rate in Münster.

¹² See http://geojson.org/geojson-ld/ (last accessed: September 19, 2017).

7 CONCLUSION

Transparency is an important component of citizen participation and smart cities. This paper has argued that both Linked Data and visualizations are enablers of transparency. They are two sides of the same coin: the former increases transparency primarily for machines, while the latter does so essentially for humans. The article then went on to point out current challenges in assessing the effectiveness of Linked Data and visualizations with respect to transparency enablement. In addition, the paper explored how assessing the effectiveness of visualizations with respect to insight generation could be approached, and how it could take advantage of Linked Data (whenever available). It suggested viewing insight as a set of statements made by an author, and using statements made by experts as ground truth while evaluating the insightfulness of visualizations. The work also proposed a rating scheme for Linked Data visualization which is derived from the well-known five-star open data scheme, and can be used to make visualizations more transparent with respect to their terms of use (i.e., license), tweaking options (i.e., use of open source libraries), and provenance (explicit linking to the source dataset). The ideas brought forward in this article are useful to advance research and practice of both Linked Data and visualization through the development of techniques to automatically assess the transparency level of visualizations, and their effectiveness in making insights visible to the end user.

ACKNOWLEDGMENT

The author gratefully acknowledges funding from the European Union through the GEO-C project (H2020-MSCA-ITN-2014, Grant Agreement Number 642332, http://www.geo-c.eu/). The visualization used as an illustration in the paper was created by (in alphabetical order) Ana Maria Bustamante, Lukas Lohoff, Jahangir Fahad and Matthias Mohr during the Seminar “An Introduction to JavaScript and Linked Data” at the Institute for Geoinformatics, University of Muenster in 2015.

REFERENCES

[1] Natalia Andrienko, Gennady Andrienko, and Peter Gatalsky. 2003. Exploratory spatio-temporal visualization: an analytical review. Journal of Visual Languages & Computing 14, 6 (dec 2003), 503–541. DOI: http://dx.doi.org/10.1016/S1045-926X(03)00046-6

[2] Judie Attard, Fabrizio Orlandi, Simon Scerri, and Soren Auer. 2015. A systematic review of open government data initiatives. Government Information Quarterly 32, 4 (2015), 399–418. DOI: http://dx.doi.org/10.1016/j.giq.2015.07.006

[3] Robert Brauneis and Ellen P. Goodman. 2017. Algorithmic transparency for the smart city. Yale Journal of Law & Technology (2017), Forthcoming.


[4] Sheelagh Carpendale. 2008. Evaluating information visualizations. In Information Visualization, Andreas Kerren, John T. Stasko, Jean-Daniel Fekete, and Chris North (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 19–45. DOI: http://dx.doi.org/10.1007/978-3-540-70956-5_2

[5] Remco Chang, Caroline Ziemkiewicz, Tera Marie Green, and William Ribarsky. 2009. Defining insight for visual analytics. IEEE Computer Graphics and Applications 29, 2 (mar 2009), 14–17. DOI: http://dx.doi.org/10.1109/MCG.2009.22

[6] Arzu Coltekin, Susanne Bleisch, Gennady Andrienko, and Jason Dykes. 2017. Persistent challenges in geovisualization – a community perspective. International Journal of Cartography (apr 2017), 1–25.

[7] Maria Cucciniello, Nicola Belle, Greta Nasi, and Marco Mena. 2016. Smart cities and transparency. Does smartness influence transparency? In 49th Hawaii International Conference on System Sciences (HICSS 2016), Tung X. Bui and Ralph H. Sprague Jr. (Eds.). IEEE Computer Society, Koloa, Hawaii, USA, 2944–2952. DOI: http://dx.doi.org/10.1109/HICSS.2016.369

[8] Aba-Sah Dadzie and Emmanuel Pietriga. 2017. Visualisation of Linked Data - Reprise. Semantic Web 8, 1 (2017), 1–21. DOI: http://dx.doi.org/10.3233/SW-160249

[9] Mathieu D’Aquin, John Davies, and Enrico Motta. 2015. Smart cities’ data: challenges and opportunities for semantic technologies. IEEE Internet Computing 19, 6 (nov 2015), 66–70. DOI: http://dx.doi.org/10.1109/MIC.2015.130

[10] Auriol Degbelo, Devanjan Bhattacharya, Carlos Granell, and Sergi Trilles. 2016. Toolkits for smarter cities: a brief assessment. In UCAmI 2016 - 10th International Conference on Ubiquitous Computing & Ambient Intelligence, R García, P Caballero-Gil, M Burmester, and A Quesada-Arencibia (Eds.). Springer International Publishing, Las Palmas, Gran Canaria, Spain, 431–436. DOI: http://dx.doi.org/10.1007/978-3-319-48799-1_47

[11] Auriol Degbelo, Carlos Granell, Sergio Trilles, Devanjan Bhattacharya, Sven Casteleyn, and Christian Kray. 2016. Opening up smart cities: citizen-centric challenges and opportunities from GIScience. ISPRS International Journal of Geo-Information 5, 2 (2016), 16. DOI: http://dx.doi.org/10.3390/ijgi5020016

[12] I J Dowman. 1999. Encoding and validating data from maps and images. In Geographical information systems: principles and technical issues (2nd ed.), P A Longley, D J Maguire, M F Goodchild, and D W Rhind (Eds.). John Wiley and Sons, New York, Chapter 31, 437–450.

[13] Jason Dykes, Gennady Andrienko, Natalia Andrienko, Volker Paelke, and Jochen Schiewe. 2010. Editorial – GeoVisualization and the Digital City. Computers, Environment and Urban Systems 34, 6 (nov 2010), 443–451. DOI: http://dx.doi.org/10.1016/j.compenvurbsys.2010.09.001

[14] Giuseppe Futia, Alessio Melandri, Antonio Vetro, Federico Morando, and Juan Carlos De Martin. 2017. Removing barriers to transparency: A case study on the use of semantic technologies to tackle procurement data inconsistency. In The Semantic Web: 14th International Conference, ESWC 2017, Eva Blomqvist, Diana Maynard, Aldo Gangemi, Rinke Hoekstra, Pascal Hitzler, and Olaf Hartig (Eds.). Portoroz, Slovenia, 623–637. DOI: http://dx.doi.org/10.1007/978-3-319-58068-5_38

[15] Alvaro Graves and James Hendler. 2013. Visualization tools for open government data. In Proceedings of the 14th Annual International Conference on Digital Government Research - dg.o ’13, Sehl Mellouli, Luis F. Luna-Reyes, and Jing Zhang (Eds.). 136. DOI: http://dx.doi.org/10.1145/2479724.2479746

[16] Amy L. Griffin, Anthony C. Robinson, and Robert E. Roth. 2017. Envisioning the future of cartographic research. International Journal of Cartography (may 2017), 1–8. DOI: http://dx.doi.org/10.1080/23729333.2017.1316466

[17] David Heald. 2006. Varieties of transparency. In Transparency: The Key to Better Governance?, Christopher Hood and David Heald (Eds.). British Academy, 24–43. DOI: http://dx.doi.org/10.5871/bacad/9780197263839.003.0002

[18] Pascal Hitzler and Krzysztof Janowicz. 2013. Linked Data, Big Data, and the 4th Paradigm. Semantic Web 4, 3 (2013), 233–235.

[19] Tobias Isenberg, Petra Isenberg, Jian Chen, Michael Sedlmair, and Torsten Moller. 2013. A systematic review on the practice of evaluating visualization. IEEE Transactions on Visualization and Computer Graphics 19, 12 (dec 2013), 2818–2827. DOI: http://dx.doi.org/10.1109/TVCG.2013.126

[20] M Janssen, Y Charalabidis, and A Zuiderwijk. 2012. Benefits, adoption barriers and myths of open data and open government. Information Systems Management 29, 4 (2012), 258–268.

[21] Marius Rohde Johannessen and Lasse Berntzen. 2018. The transparent smart city. In Smart Technologies for Smart Governments, Manuel Pedro Rodríguez Bolívar (Ed.). Springer International Publishing, 67–94. DOI: http://dx.doi.org/10.1007/978-3-319-58577-2_5 Forthcoming.

[22] John B. Horrigan and Lee Rainie. 2015. Americans’ views on open government data. Pew Research Center April (2015). http://www.pewinternet.org/2015/04/21/open-government-data/ (last accessed: October 17, 2017).

[23] Jeff Johnson. 2010. Designing with the mind in mind. Elsevier. DOI: http://dx.doi.org/10.1016/C2009-0-20318-7

[24] Soonhee Kim and Jooho Lee. 2017. Citizen participation and transparency in local government: Do participation channels and policy making phases matter? In 50th Hawaii International Conference on System Sciences - HICSS 2017. AIS Electronic Library (AISeL), Hilton Waikoloa Village, Hawaii, USA.

[25] W Kuhn, T Kauppinen, and K Janowicz. 2014. Linked Data - A paradigm shift for Geographic Information Science. In Geographic Information Science - Eighth International Conference, M Duckham, E Pebesma, K Stewart, and A U Frank (Eds.). Springer International Publishing, Vienna, Austria, 173–186.

[26] Michael Martin, Claus Stadler, Philipp Frischmuth, and Jens Lehmann. 2014. Increasing the financial transparency of European Commission project funding. Semantic Web 5, 2 (2014), 157–164. DOI: http://dx.doi.org/10.3233/SW-130116

[27] Edgard Marx, Amrapali Zaveri, Mofeed Mohammed, Sandro Rautenberg, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, and Gong Cheng. 2016. DBtrends: Publishing and benchmarking RDF ranking functions. In Proceedings of the 2nd International Workshop on Summarizing and Presenting Entities and Ontologies (SumPre 2016), Andreas Thalhammer, Gong Cheng, and Kalpa Gunaratna (Eds.). CEUR-WS.org, Anissaras, Greece.

[28] G Michener and K Bersch. 2013. Identifying transparency. Information Polity 18, 3 (2013), 233–242. DOI: http://dx.doi.org/10.3233/IP-130299

[29] Michael E. Milakovich. 2010. The internet and increased citizen participation in government. eJournal of Democracy (JeDEM) 2, 1 (2010), 1–9.

[30] Maria Mora-Rodriguez, Ghislain Auguste Atemezing, and Chris Preist. 2017. Adopting semantic technologies for effective corporate transparency. In The Semantic Web: 14th International Conference, ESWC 2017, Eva Blomqvist, Diana Maynard, Aldo Gangemi, Rinke Hoekstra, Pascal Hitzler, and Olaf Hartig (Eds.). Portoroz, Slovenia, 655–670. DOI: http://dx.doi.org/10.1007/978-3-319-58068-5_40

[31] Chris North. 2006. Toward measuring visualization insight. IEEE Computer Graphics and Applications 26, 3 (2006), 6–9. DOI: http://dx.doi.org/10.1109/MCG.2006.70

[32] Adegboyega Ojo, Zamira Dzhusupova, and Edward Curry. 2016. Exploring the nature of the smart cities research landscape. In Smarter as the New Urban Agenda: A Comprehensive View of the 21st Century City, Ramon J Gil-Garcia, A Theresa Pardo, and Taewoo Nam (Eds.). Springer International Publishing, 23–47. DOI: http://dx.doi.org/10.1007/978-3-319-17620-8_2

[33] Anthony C. Robinson, Urska Demsar, Antoni B. Moore, Aileen Buckley, Bin Jiang, Kenneth Field, Menno-Jan Kraak, Silvana P. Camboim, and Claudia R. Sluter. 2017. Geospatial big data and cartography: research challenges and opportunities for making maps that matter. International Journal of Cartography (mar 2017), 1–29. DOI: http://dx.doi.org/10.1080/23729333.2016.1278151

[34] Robert E. Roth, Arzu Coltekin, Luciene Delazari, Homero Fonseca Filho, Amy Griffin, Andreas Hall, Jari Korpi, Ismini Lokka, Andre Mendonca, Kristien Ooms, and Corne P.J.M. van Elzakker. 2017. User studies in cartography: opportunities for empirical research on interactive maps and visualizations. International Journal of Cartography (may 2017), 1–29. DOI: http://dx.doi.org/10.1080/23729333.2017.1288534

[35] S Scheider, J Jones, A Sanchez, and C Keßler. 2014. Encoding and querying historic map content. In The 17th AGILE International Conference on Geographic Information Science - Connecting a Digital Europe Through Location and Place, J Huerta, S Schade, and C Granell (Eds.). Castellon, Spain, 251–273.

[36] Birgit Schmidt, Birgit Gemeinholzer, and Andrew Treloar. 2016. Open data in global environmental research: The Belmont Forum’s open data survey. PLOS ONE 11, 1 (jan 2016), e0146695. DOI: http://dx.doi.org/10.1371/journal.pone.0146695

[37] Edward Segel and Jeffrey Heer. 2010. Narrative visualization: Telling stories with data. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1139–1148. DOI: http://dx.doi.org/10.1109/TVCG.2010.179

[38] B. Shneiderman. 1996. The eyes have it: a task by data type taxonomy for information visualizations. In Proceedings of the IEEE Symposium on Visual Languages. IEEE Computer Society Press, Boulder, Colorado, USA, 336–343. DOI: http://dx.doi.org/10.1109/VL.1996.545307

[39] Wang Tao. 2013. Interdisciplinary urban GIS for smart cities: advancements and opportunities. Geo-spatial Information Science 16, 1 (2013), 25–34. DOI: http://dx.doi.org/10.1080/10095020.2013.774108

[40] Jarke J van Wijk. 2005. The value of visualization. In IEEE Visualization 2005. IEEE, Minneapolis, Minnesota, USA, 79–86. DOI: http://dx.doi.org/10.1109/VISUAL.2005.1532781

[41] Jarke J. van Wijk. 2013. Evaluation: A challenge for visual analytics. Computer 46, 7 (2013), 56–60. DOI: http://dx.doi.org/10.1109/MC.2013.151

[42] C T Yin, Z Xiong, H Chen, J Wang, D Cooper, and B David. 2015. A literature survey on smart cities. Science China Information Sciences 58, 10 (2015).