Visualizing Natural Language Resources
Kristina Kocijan
University of Zagreb,
Faculty of Humanities and Social Sciences,
Department of Information and Communication Sciences
Zagreb, Croatia
Is it about beautiful pictures?
Sooo, what is this presentation about?
“Beauty is in the eye of the beholder.”
3rd century BC, Greek saying
Baudelaire’s beauty:
data is beautiful if it is the result of reason and calculation.
Thoreau’s beauty:
data is beautiful by its very plainness.
About beautiful pictures!
Sooo, what is this presentation about?
New ways of presenting data?
Sooo, that’s it – only beautiful pictures?
“The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.”
J.C.R. Licklider, in ‘Man-Computer Symbiosis’, March, 1960.
Reading the same data
In different forms
Reading the same data
Slowly, slowly, very slowly
Faster, alas lacking info

Nouns            Common   Collective   Proper
Fem               8 344            1    3 177
Mas.              6 249            2    3 189
Neut.             5 520            3       66
No gender             0            0   36 362
Total per type   20 113            6   42 794
Total nouns      62 913
Reading the same data
Speedy, and empowering
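To illustrate the point of this slide, here is a minimal sketch that turns a few of the noun counts from the table into a text bar chart. The counts are copied from the table earlier in the deck; the function name `text_bars`, the bar width and the `#` mark are assumptions made for this example and are not the charts shown in the presentation.

```python
# Common-noun counts per gender, copied from the noun table earlier in the deck.
common_nouns = {"Fem": 8344, "Mas.": 6249, "Neut.": 5520}


def text_bars(data, width=40, mark="#"):
    """Scale each value so that the largest one fills `width` marks."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = mark * round(width * value / largest)
        lines.append(f"{label:<6} {bar} {value}")
    return lines


print("\n".join(text_bars(common_nouns)))
```

Even this crude picture shows the ordering of the three genders at a glance, which the table only gives up after reading each number.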
Presenting the same data
Statistics for the nouns in a dictionary
Statistics for the nouns in a corpus

Nouns            Common     Proper
Fem              39.84 %    3.61 %
Mas.             32.26 %    4.64 %
Neut.            15.36 %    0.12 %
No gender            0 %    4.17 %
Total per type   87.46 %   12.54 %
Total nouns      1 048 570

Nouns            Common     Proper
Fem              13.26 %    5.05 %
Mas.              9.93 %    5.07 %
Neut.             8.77 %    0.10 %
No gender            0 %   57.80 %
Total per type   31.97 %   68.03 %
Total nouns      62 907
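The percentages in the table with 62 907 nouns can be recomputed from the absolute counts given earlier in the deck. The sketch below does that arithmetic. It assumes that the six collective nouns are left out, since that is what reproduces the 62 907 total; the name `share_table` is hypothetical.

```python
# Counts copied from the noun table earlier in the deck: (Common, Proper).
# Assumption: the 6 collective nouns are excluded, which yields 62 907.
counts = {
    "Fem": (8344, 3177),
    "Mas.": (6249, 3189),
    "Neut.": (5520, 66),
    "No gender": (0, 36362),
}


def share_table(counts):
    """Return the grand total and each cell as a percentage of that total."""
    total = sum(common + proper for common, proper in counts.values())
    shares = {
        row: (round(100 * common / total, 2), round(100 * proper / total, 2))
        for row, (common, proper) in counts.items()
    }
    return total, shares


total, shares = share_table(counts)
for row, (common, proper) in shares.items():
    print(f"{row:<10} {common:6.2f} % {proper:6.2f} %")
print(f"Total nouns {total}")
```

Each cell is divided by the grand total rather than by its column total, so that all cells together add up to 100 %.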
Distribution of top 10 paradigms
In DIC:
ALAT
ASTRONOM
BLAGOST
BRATIĆ
CRTANJE
DAVOR
FABIANA
GUSJENICA
LEPTIR
MEDO
In Corp:
ALAT
BAT
BESKRAJ
BLAGO
BLAGOST
BRATIĆ
CRTANJE
GUSJENICA
MEDO
PROLAZNIK
Genitive+sg endings (in DIC and in Corpus)
Genitive+sg endings (in Corpus)
Genitive+sg endings, weighted (in Corpus)
Visual Story
As told by Data
“Often the most effective way to describe, explore and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.”
Edward R. Tufte in ‘The Visual Display of Quantitative Information’, 2001.
Story behind the NLR data
Instrumental
Genitive
Vocative
Dative
Accusative
Locative
Thank you!
Visualizing Natural Language Resources
Kristina Kocijan
University of Zagreb,
Faculty of Humanities and Social Sciences,
Department of Information and Communication Sciences
Zagreb, Croatia
Questions?