Wikipedia Tools for Google Spreadsheets
Thomas Steiner
Google Germany GmbH
ABC Str. 19, 20354 Hamburg, Germany
tomac@google.com
ABSTRACT
In this paper, we introduce the Wikipedia Tools for Google Spreadsheets. Google Spreadsheets is part of a free, Web-based software office suite offered by Google within its Google Docs service. It allows users to create and edit spreadsheets online while collaborating with other users in real time. Wikipedia is a free-access, free-content Internet encyclopedia whose content and data are available, among other means, through an API. With the Wikipedia Tools for Google Spreadsheets, we have created a toolkit that facilitates working with Wikipedia data from within a spreadsheet context. We make these tools available as open source on GitHub,1 released under the permissive Apache 2.0 license.
Categories and Subject Descriptors
H.3.5 [Online Information Services]: Web-based services
Keywords
Wikipedia, Wikidata, Google Spreadsheets, Google Sheets
1. INTRODUCTION
In computer science, spreadsheet applications serve to organize, analyze, and store data in tabular form. Spreadsheets are the computerized simulation of paper accounting worksheets and operate on data represented as cells of an array, organized in rows and columns. Cells can contain numeric or textual data, or the results of formulas that automatically calculate and display a value based on the contents of other cells. With the Wikipedia Tools for Google Spreadsheets, we introduce a toolkit of such formulas, tailored to the universe of Wikipedia, that enables a wide range of potential use cases, ranging from marketing to search engine optimization to business analysis. The true power and ease of spreadsheet applications is unleashed especially when formulas are chained.
1 Wikipedia Tools for Google Spreadsheets: https://github.com/tomayac/wikipedia-tools-for-google-spreadsheets
Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author’s site if the Material is used in electronic media.
ACM 978-1-4503-4144-8/16/04. http://dx.doi.org/10.1145/2872518.2891112
1.1 Wikipedia and Wikidata
Wikipedia’s content and data are available through the Wikipedia API (https://{language}.wikipedia.org/w/api.php), where {language} represents one of the currently 291 supported Wikipedia languages,2 for example, en for English, de for German, or zu for Zulu. Wikidata is a collaboratively edited knowledge base intended to provide a common source of structured data that can be used by projects such as Wikipedia. Its content and data are available through the Wikidata API (https://www.wikidata.org/w/api.php). Both the Wikipedia and the Wikidata APIs return data as XML or JSON, among other formats. Wikipedia pageviews data, i.e., the number of times that a given Wikipedia article has been viewed within a given period of time, can be obtained using the Pageviews API (https://wikimedia.org/api/rest_v1/?doc). This data is available in JSON format.
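As an illustration of the endpoint pattern described above, the following minimal sketch assembles a request URL for the Wikipedia API. The helper names wikipediaApiUrl and toQueryString are hypothetical and not part of the toolkit; the query parameters (action=query, prop=langlinks, titles, lllimit, format) are standard MediaWiki API parameters, chosen here only as an example request.

```javascript
/**
 * Builds the Wikipedia API endpoint for one language edition.
 * Hypothetical helper, not part of the Wikipedia Tools.
 * @param {string} language Language code such as 'en', 'de', or 'zu'.
 * @return {string} The endpoint URL.
 */
function wikipediaApiUrl(language) {
  return 'https://' + language + '.wikipedia.org/w/api.php';
}

/**
 * Serializes a plain object into a URL query string.
 * Hypothetical helper, not part of the Wikipedia Tools.
 * @param {Object} params Parameter names mapped to values.
 * @return {string} The encoded query string (without leading '?').
 */
function toQueryString(params) {
  return Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
}

// Example request: language links of the English article "Berlin" as JSON.
var exampleUrl = wikipediaApiUrl('en') + '?' + toQueryString({
  action: 'query',
  prop: 'langlinks',
  titles: 'Berlin',
  lllimit: 'max',
  format: 'json'
});
```

The same pattern would apply to the Wikidata endpoint, which has no language subdomain.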
1.2 Google Spreadsheets and Apps Scripts
Google Spreadsheets can be extended with custom functions (or formulas) using Google Apps Script,3 which is written in standard JavaScript.4 To illustrate this, Listing 1 defines a trivial function that can then be used from within a spreadsheet as outlined in Listing 2. Custom functions can access external resources on the Web by fetching URLs with UrlFetchApp, one of the scripting services available in Google Apps Script. Fetched data can be in either XML or JSON format and is parsed with convenience functions.
function DOUBLE(input) {
  return input * 2;
}
Listing 1: Custom Google Sheets function called DOUBLE.
=DOUBLE(A1)
Listing 2: Usage of the custom DOUBLE function from Listing 1 in a cell with the value of cell A1 as a parameter.
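Listing 1 handles a single cell. When a custom function is called with a range such as A1:B2, Google Sheets passes the values as a two-dimensional array, so a range-aware variant needs to map over rows and cells. The following is a minimal sketch of that idea; the name DOUBLE_RANGE is hypothetical and does not appear in the toolkit.

```javascript
/**
 * Doubles a single value or every cell of a range.
 * Hypothetical variant of DOUBLE from Listing 1.
 * @param {number|Array<Array<number>>} input A cell value or a 2D range.
 * @return {number|Array<Array<number>>} The doubled value(s).
 */
function DOUBLE_RANGE(input) {
  if (Array.isArray(input)) {
    // A range arrives as an array of rows, each row an array of cells.
    return input.map(function (row) {
      return row.map(function (cell) {
        return cell * 2;
      });
    });
  }
  return input * 2;
}
```

In a sheet, this could be called as =DOUBLE_RANGE(A1:B2), with the result spilling over a block of cells of the same shape.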
2 List of Wikipedias: https://meta.wikimedia.org/wiki/List_of_Wikipedias
3 Google Apps Script: https://developers.google.com/apps-script/
4 Custom functions in Google Sheets: https://developers.google.com/apps-script/guides/sheets/functions
2. LIST OF DEVELOPED FUNCTIONS
In our Wikipedia Tools for Google Spreadsheets, we provide twelve functions that, in traditional spreadsheet style, follow an all-uppercase naming convention and start with a WIKI prefix. These functions are wrappers around the respective Wikipedia, Wikidata, or Pageviews API calls. Figure 1 shows example output for the English Wikipedia article https://en.wikipedia.org/wiki/Berlin and the English Wikipedia category https://en.wikipedia.org/wiki/Category:Berlin. The functions are listed below.
WIKITRANSLATE Returns Wikipedia translations (language links) for a Wikipedia article.
WIKISYNONYMS Returns Wikipedia synonyms (redirects) for a Wikipedia article.
WIKIEXPAND Returns Wikipedia translations (language links) and synonyms (redirects) for a Wikipedia article.
WIKICATEGORYMEMBERS Returns Wikipedia category members for a Wikipedia category.
WIKISUBCATEGORIES Returns Wikipedia subcategories for a Wikipedia category.
WIKIINBOUNDLINKS Returns Wikipedia inbound links for a Wikipedia article.
WIKIOUTBOUNDLINKS Returns Wikipedia outbound links for a Wikipedia article.
WIKIMUTUALLINKS Returns Wikipedia mutual links, i.e., the intersection of inbound and outbound links for a Wikipedia article.
WIKIGEOCOORDINATES Returns Wikipedia geocoordinates for a Wikipedia article.
WIKIDATAFACTS Returns Wikidata facts for a Wikipedia article.
WIKIPAGEVIEWS Returns Wikipedia pageviews statistics for a Wikipedia article.
WIKIPAGEEDITS Returns Wikipedia page edits statistics for a Wikipedia article.
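The mutual links in the list above are defined as the intersection of inbound and outbound links. A minimal sketch of such an intersection over two lists of article titles follows; the function name intersectTitles and the sample titles are assumptions for illustration, not the toolkit's actual implementation.

```javascript
/**
 * Returns the titles that occur in both lists, in the order of the
 * second list. Hypothetical helper, not the toolkit's implementation.
 * @param {Array<string>} inbound Titles of articles linking to the article.
 * @param {Array<string>} outbound Titles of articles the article links to.
 * @return {Array<string>} The titles present in both lists.
 */
function intersectTitles(inbound, outbound) {
  // Use a prototype-free object as a set for constant-time lookups.
  var seen = Object.create(null);
  inbound.forEach(function (title) {
    seen[title] = true;
  });
  return outbound.filter(function (title) {
    return seen[title] === true;
  });
}

// Made-up sample data for illustration only.
var mutual = intersectTitles(
  ['Albert Einstein', 'Ankara', 'Bonn'],
  ['Ankara', 'Potsdam', 'Albert Einstein']
);
```

Building the lookup set first keeps the work linear in the combined length of both lists, which matters for articles with many links.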
Most functions directly wrap native API calls, with three exceptions: (i) the functionality of the WIKISYNONYMS and the WIKITRANSLATE functions is combined in the WIKIEXPAND function, and both the WIKITRANSLATE and the WIKIEXPAND function accept an optional target languages parameter that limits the output to a subset of all available Wikipedia languages; (ii) the function WIKIMUTUALLINKS is the intersection of the two functions WIKIINBOUNDLINKS and WIKIOUTBOUNDLINKS; and (iii) the function WIKIDATAFACTS provides a list of claims [11] (or facts), enriched with entity and property labels for improved readability, limited to single-value objects, and simplified using an adapted version of Maxime Lathuiliere’s simplifyClaims function5 from his Wikidata SDK [6]. This allows us to return two columns (in RDF [2] terms, “predicate” and “object” pairs) with one unique object per predicate, for example, the predicate ISO 3166-2 code with the object DE-BE. We deliberately discard multi-value claims, for example, the predicate head of government with the objects Michael Müller and Klaus Wowereit, among many others. While the ordering is clear (temporal) in this concrete example, this is not true in the general case, for example, with the predicate instance of. As a result, in WIKIDATAFACTS, we prefer indisputability of claims over their completeness. As an example, Listing 3 shows the complete implementation of the WIKISYNONYMS function.
5 Wikidata SDK simplifyClaims function: https://github.com/maxlath/wikidata-sdk#simplify-claims-results
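The restriction of WIKIDATAFACTS to single-value claims can be pictured with a small sketch. It assumes, purely for illustration, that the simplified claims are available as an object mapping each property label to an array of values; this input shape and the name singleValueFacts are assumptions and may differ from what the toolkit uses internally.

```javascript
/**
 * Keeps only predicates with exactly one object and returns them as
 * two-column rows. Hypothetical helper; the input shape is an assumption.
 * @param {Object<string, Array>} claims Property labels mapped to values.
 * @return {Array<Array>} Rows of [predicate, object].
 */
function singleValueFacts(claims) {
  var rows = [];
  Object.keys(claims).forEach(function (predicate) {
    var objects = claims[predicate];
    // Multi-value claims are discarded because their ordering is
    // ambiguous in the general case.
    if (Array.isArray(objects) && objects.length === 1) {
      rows.push([predicate, objects[0]]);
    }
  });
  return rows;
}

// The two examples mentioned in the text above.
var facts = singleValueFacts({
  'ISO 3166-2 code': ['DE-BE'],
  'head of government': ['Michael Müller', 'Klaus Wowereit']
});
```

Returning an array of two-element rows matches how a custom function fills two adjacent spreadsheet columns.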
/**
 * Returns Wikipedia synonyms
 *
 * @param {string} article The Wikipedia article
 * @return {Array<string>} The list of synonyms
 */
function WIKISYNONYMS(article) {
  'use strict';
  if (!article) {
    return '';
  }
  var results = [];
  try {
    var language = article.split(/:(.+)?/)[0];
    var title = article.split(/:(.+)?/)[1];
    if (!title) {
      return '';
    }
    title = title.replace(/\s/g, '_');
    var url = 'https://' + language +
        '.wikipedia.org/w/api.php' +
        '?action=query' +
        '&blnamespace=0' +
        '&list=backlinks' +
        '&blfilterredir=redirects' +
        '&bllimit=max' +
        '&format=xml' +
        '&bltitle=' + encodeURIComponent(title);
    var xml = UrlFetchApp.fetch(url).getContentText();
    var document = XmlService.parse(xml);
    var entries = document.getRootElement()
        .getChild('query').getChild('backlinks')
        .getChildren('bl');
    for (var i = 0; i < entries.length; i++) {
      var text = entries[i].getAttribute('title').getValue();
      results[i] = text;
    }
  } catch (e) {
    // no-op
  }
  return results.length > 0 ? results : '';
}
Listing 3: Implementation of WIKISYNONYMS.
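Listing 3 expects its argument in the form language:title, for example en:Berlin, and splits it at the first colon so that titles containing further colons (such as Category:Berlin) stay intact. The following sketch isolates that parsing step. It is an alternative formulation using indexOf rather than the regular expression from Listing 3, and the name parseArticle is hypothetical.

```javascript
/**
 * Splits 'language:title' at the first colon and normalizes whitespace
 * in the title to underscores. Hypothetical helper, not part of Listing 3.
 * @param {string} article For example 'en:Berlin' or 'en:Category:Berlin'.
 * @return {?{language: string, title: string}} The parts, or null if the
 *     input lacks a language prefix or a title.
 */
function parseArticle(article) {
  var text = String(article || '');
  var separator = text.indexOf(':');
  // Reject inputs without a colon, with an empty prefix, or without a title.
  if (separator <= 0 || separator === text.length - 1) {
    return null;
  }
  return {
    language: text.slice(0, separator),
    title: text.slice(separator + 1).replace(/\s/g, '_')
  };
}
```

Unlike Listing 3, this sketch also rejects an empty language prefix (for example :Berlin) up front instead of relying on the surrounding try/catch.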
3. USAGE SCENARIOS
We have tested the Wikipedia Tools for Google Spreadsheets with different usage scenarios in mind. These include, but are not limited to, the ones listed in the following.
3.1 Usage Scenario I: Ordered Category Panel
Wikipedia holds an enormous number of categories, for example, visitor attractions in Montreal.6 Category members obtained through a call of WIKICATEGORYMEMBERS are listed in alphabetical order. However, if we additionally request pageviews data for each category member through a series of WIKIPAGEVIEWS calls and then sort by pageviews in descending order, we get a representative list of top-10 visitor attractions, enriched with photos retrieved through calls of WIKIDATAFACTS filtered on “image”, as shown in Figure 2. A similar feature (based on non-disclosed metrics) in the form of an image carousel can be seen in Google’s Knowledge Graph [10] Web search results pages when searching for “visitor attractions in montreal” (demo https://goo.gl/Ugt0je).
6 Visitor attractions in Montreal: https://en.wikipedia.org/wiki/Category:Visitor_attractions_in_Montreal
Figure 1: Example output for each function in the Wikipedia Tools for Google Spreadsheets (cropped). Function calls shown as column headers: WIKISYNONYMS("en:Berlin"), WIKISUBCATEGORIES("en:Category:Berlin"), WIKICATEGORYMEMBERS("en:Category:Berlin"), WIKIGEOCOORDINATES("en:Berlin"), WIKIINBOUNDLINKS("en:Berlin"), WIKIOUTBOUNDLINKS("en:Berlin"), WIKIMUTUALLINKS("en:Berlin"), WIKIPAGEVIEWS("en:Berlin", TODAY() - 7, TODAY()), WIKITRANSLATE("en:Berlin"), WIKIEXPAND("en:Berlin", {"de"; "fr"}), WIKIPAGEEDITS("en:Berlin", TODAY() - 7, TODAY()), WIKIDATAFACTS("en:Berlin"). Live spreadsheet: https://goo.gl/yvbmex.
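The ordering step of the category panel in Usage Scenario I (request pageviews per category member, sort in descending order, keep the top entries) can be sketched in plain JavaScript as follows. The function name rankByPageviews, the placeholder titles, and all numbers are made up for illustration; in the spreadsheet itself, the same effect is achieved with built-in sorting over the WIKICATEGORYMEMBERS and WIKIPAGEVIEWS results.

```javascript
/**
 * Ranks category members by accumulated pageviews, highest first.
 * Hypothetical helper for illustration only.
 * @param {Array<string>} members Category member titles.
 * @param {Object<string, number>} pageviewsByTitle Summed pageviews per title.
 * @param {number} limit Maximum number of rows to return.
 * @return {Array<Array>} Rows of [title, pageviews].
 */
function rankByPageviews(members, pageviewsByTitle, limit) {
  return members
    .map(function (title) {
      // Members without pageviews data count as zero views.
      return [title, pageviewsByTitle[title] || 0];
    })
    .sort(function (a, b) {
      return b[1] - a[1];
    })
    .slice(0, limit);
}

// Made-up placeholder data, not real pageview counts.
var topTwo = rankByPageviews(
  ['Attraction A', 'Attraction B', 'Attraction C'],
  { 'Attraction A': 120, 'Attraction B': 950, 'Attraction C': 430 },
  2
);
```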
3.2 Usage Scenario II: Search Ads
Search advertisers can greatly profit from the information contained in Wikipedia and Wikidata. For example, for a hotel booking site, it may be desirable to advertise based on points of interest (POIs) and to automatically create advertisements that feature known facts about such POIs. Figure 3 shows an example where skyscrapers listed in the category skyscrapers over 350 meters7 are first obtained via WIKICATEGORYMEMBERS and then checked for their “height” fact via WIKIDATAFACTS, which is then used in two templates to create ads. Search keywords are generated by calling WIKISYNONYMS and combining the results with terms like “hotel”.
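The keyword and ad generation in Usage Scenario II amounts to simple string operations on the function results. The sketch below mirrors that idea outside the spreadsheet; the helper names buildKeywords and fillTemplate, the template syntax with curly braces, and the sample values are all assumptions for illustration and are not taken from the toolkit or the live spreadsheet.

```javascript
/**
 * Combines each synonym with a term and lowercases the result.
 * Hypothetical helper for illustration only.
 * @param {Array<string>} synonyms Synonyms, e.g., from WIKISYNONYMS.
 * @param {string} term The term to append, e.g., 'hotel'.
 * @return {Array<string>} The generated search keywords.
 */
function buildKeywords(synonyms, term) {
  return synonyms.map(function (synonym) {
    return (synonym + ' ' + term).toLowerCase();
  });
}

/**
 * Replaces {placeholders} in a template with the given values.
 * Unknown placeholders are left untouched. Hypothetical helper.
 * @param {string} template Text containing {name}-style placeholders.
 * @param {Object} values Placeholder names mapped to values.
 * @return {string} The filled-in text.
 */
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(values, key) ?
        String(values[key]) : match;
  });
}

// Made-up sample values, not real Wikidata facts.
var keywords = buildKeywords(['Example Tower', 'The Example'], 'hotel');
var adText = fillTemplate('Stay near {name} ({height} m)',
    { name: 'Example Tower', height: 400 });
```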
3.3 Usage Scenario III: Marketing Campaigns
On January 13, 2016, Google Maps added Street View imagery for the model railway Miniatur Wunderland.8 Taking global Wikipedia pageviews as a popularity indicator, we can examine whether the marketing campaign has had any impact on the attraction, assuming that more pageviews translate to increased visitor interest. To this end, we first obtain the Miniatur Wunderland article in all available languages via WIKITRANSLATE and then retrieve pageviews via WIKIPAGEVIEWS. Figure 4 indeed shows an international increase in pageviews starting January 13, after an earlier linear curve progression (except for the German article, which had a peak on January 8, a long weekend after a public holiday).
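One simple way to quantify the effect described in Usage Scenario III is to compare mean daily pageviews before and after the campaign date. The sketch below does this for rows of [date, views]; it is not the analysis performed in the paper (which relies on the chart in Figure 4), the function names are hypothetical, and the sample numbers are made up. Dates are assumed to be ISO strings (YYYY-MM-DD), which compare correctly as plain strings.

```javascript
/**
 * Returns the arithmetic mean of a list of numbers, or 0 if it is empty.
 * @param {Array<number>} values The numbers to average.
 * @return {number} The mean.
 */
function mean(values) {
  if (values.length === 0) {
    return 0;
  }
  var sum = values.reduce(function (a, b) {
    return a + b;
  }, 0);
  return sum / values.length;
}

/**
 * Compares mean daily pageviews before and from a campaign date onward.
 * Hypothetical helper for illustration only.
 * @param {Array<Array>} rows Rows of [isoDate, views].
 * @param {string} campaignDate ISO date of the campaign start.
 * @return {{before: number, after: number, ratio: ?number}} The two means
 *     and their ratio (null if there is no traffic before the campaign).
 */
function campaignUplift(rows, campaignDate) {
  var before = [];
  var after = [];
  rows.forEach(function (row) {
    (row[0] < campaignDate ? before : after).push(row[1]);
  });
  var meanBefore = mean(before);
  var meanAfter = mean(after);
  return {
    before: meanBefore,
    after: meanAfter,
    ratio: meanBefore > 0 ? meanAfter / meanBefore : null
  };
}

// Made-up sample rows, not real pageview counts.
var uplift = campaignUplift([
  ['2016-01-11', 100],
  ['2016-01-12', 300],
  ['2016-01-13', 600],
  ['2016-01-14', 1000]
], '2016-01-13');
```

A ratio well above 1 would be consistent with increased interest, although, as the German article shows, unrelated events such as public holidays can also cause peaks.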
4. RELATED WORK
In his book Google Apps Script for Beginners [4], Gabet gives an introduction to extending Google Spreadsheets with custom functions. A similar introduction is given in Ferreira’s Google Apps Script: Web Application Development Essentials [3]. In [5], Han et al. describe their approach RDF123 to translate spreadsheet data to RDF, the inverse of what we do in WIKIDATAFACTS. Olsen and Moser show in [8] how Web APIs can be taught with spreadsheets. The process of calling Web APIs via spreadsheets is further described in [9]. In [1], Abramson et al. describe how they enabled spreadsheets to have “super-computing” powers through parallelized custom functions. An open-source toolkit for mining Wikipedia, not bound to spreadsheets but designed for general use with the Java programming language, is described by Milne et al. in [7].
7 Skyscrapers over 350 meters: https://en.wikipedia.org/wiki/Category:Skyscrapers_over_350_meters
8 Miniatur Wunderland on Google Street View: https://www.google.com/maps/about/behind-the-scenes/streetview/treks/miniatur-wunderland/
Figure 2: Usage scenario I: Wikipedia Tools for Google Spreadsheets used to create an ordered category panel based on Wikipedia category memberships and accumulated Wikipedia pageviews for popularity ranking (here: the top-10 visitor attractions in Montreal). Formula shown in the screenshot: =IFERROR(SUM(QUERY(WIKIPAGEVIEWS("en:"&A2, TODAY() - 30, TODAY()), "SELECT Col2")), ""). Live spreadsheet: https://goo.gl/Njvt1T.
Figure 3: Usage scenario II: Wikipedia Tools for Google Spreadsheets used to create textual search ads based on Wikidata facts (here: skyscraper heights) and Wikipedia synonyms as keywords combined with the term “hotel”. Formula shown in the screenshot: =ARRAYFORMULA(LOWER(WIKISYNONYMS("en:"&C$2)&" hotel")). Live spreadsheet: https://goo.gl/np1Is8.
Figure 4: Usage scenario III: Wikipedia Tools for Google Spreadsheets used to evaluate the impact of a marketing campaign (here: model railway Miniatur Wunderland being featured on Google Street View since January 13, 2016). Formula shown in the screenshot: =IF(ISBLANK(C$2), "", QUERY(WIKIPAGEVIEWS(C$2, TODAY() - $B$1, TODAY()), "SELECT Col2")). The chart plots pageviews (0 to 2000) over the date (Dec 29, 2015 to Jan 19, 2016) for several language versions of the article; the highlighted data point reads Jan 13, 2016, en:Miniatur Wunderland: 1487. Live spreadsheet: https://goo.gl/q1yhuV.
5. CONCLUSIONS AND FUTURE WORK
In this paper, we have introduced the Wikipedia Tools for Google Spreadsheets. First, we introduced the data sources Wikipedia and Wikidata and their different APIs. Second, we showed how Google Spreadsheets can be extended through custom functions that can then be used from within a cell context as if they were native functions. We then listed the implemented functions and explained where they extend the functionality of the underlying wrapped API functions. Finally, we focused on three usage scenarios that illustrate how to work with the Wikipedia Tools for Google Spreadsheets and provided an overview of related work in the area.
Future work will focus on adding more functions as needed and potentially making the functions more parameterizable. In the current iteration, we have favored simplicity and ease of use over customizability, essentially making the most common use case the only option. In upcoming releases, we may add an advanced mode that allows experienced users to fine-tune the functions’ results, for example, to include bot traffic in WIKIPAGEVIEWS, which we currently exclude on purpose.
In conclusion, we were positively surprised by the increased productivity and short turnaround time that the Wikipedia Tools for Google Spreadsheets enable for the rapid prototyping of ideas, especially in combination with the fill-down and fill-right features and the charting capabilities of spreadsheets. We look forward to making the tools even more powerful and hope to attract collaborators for the open-source project available on GitHub at https://github.com/tomayac/wikipedia-tools-for-google-spreadsheets. As a positive side effect, the tools can even help improve Wikipedia and Wikidata when authors add missing data. For example, we added an image to one of the visitor attractions of Montreal, as this fact was initially missing in Wikidata (and thus in Figure 2).
6. REFERENCES
[1] D. Abramson, L. Kotler, D. Mather, and P. Roe. ActiveSheets: Super-Computing with Spreadsheets. In U. Seattle, editor, Proceedings of the High Performance Computing Symposium – HPC 2001, pages 110–115, San Diego, USA, 2001.
[2] R. Cyganiak, D. Wood, and M. Lanthaler. RDF 1.1 Concepts and Abstract Syntax. Recommendation, W3C, Feb. 2014.
[3] J. Ferreira. Google Apps Script: Web Application Development Essentials. O’Reilly Media, 2014.
[4] S. Gabet. Google Apps Script for Beginners. Packt Publishing, 2014.
[5] L. Han, T. Finin, C. Parr, J. Sachs, and A. Joshi. RDF123: From Spreadsheets to RDF. In The Semantic Web – ISWC 2008, volume 5318 of LNCS, pages 451–466. Springer, 2008.
[6] M. Lathuiliere. Wikidata SDK, 2016. https://github.com/maxlath/wikidata-sdk (2016-02-08).
[7] D. Milne and I. H. Witten. An Open-Source Toolkit for Mining Wikipedia. Artificial Intelligence, 194:222–239, Jan. 2013.
[8] T. Olsen and K. Moser. Teaching Web APIs in Introductory and Programming Classes: Why and How. Paper 16, SIGED: IAIM Conference, Feb. 2013.
[9] K. Patel, S. Prish, S. Sadhu, L. Bizek, and X. Pan. Spreadsheet Functions to Call REST API Sources, May 15 2014. US Patent App. 13/672,704.
[10] A. Singhal. “Introducing the Knowledge Graph: things, not strings”, Official Google Blog, May 2012. http://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html.
[11] D. Vrandečić and M. Krötzsch. Wikidata: A Free Collaborative Knowledgebase. Commun. ACM, 57(10):78–85, Sept. 2014.