Presentation given at Dutch Information Visualisation Event 2014
StatMine – prototype
StatMine: Exploring official statistics
Martijn Tennekes, Edwin de Jonge, Jan van der Laan & Jessica Solcer
Statistics Netherlands (CBS)
Visweek 2013
StatMine, statistical goldmine
Edwin de Jonge (@edwindjonge), Jan van der Laan, Jessica Solcer
Statistics Netherlands / CBS
Dutch Information Visualisation Event 2014, June 19, 2014
StatMine 0.2 2
Statistics Netherlands / CBS
- Creates and publishes official statistics on economics, demographics, health care and other topics.
- Since 1899
- Website: www.cbs.nl
- Online DB: http://statline.cbs.nl (since 1997)
Why StatMine?
– Online StatLine contains more than one billion (10^9) facts
‐ Policy makers
‐ Journalists
‐ Citizens
‐ Enterprises
‐ Economists
‐ Social scientists
‐ Historians
‐ etc.
Problem 1: Numbers ≠ Information
1. Numbers ≠ Information
We know from a user study that:
1. Many interesting patterns in StatLine are not spotted by users
2. Many important topics in StatLine are scattered across multiple tables
H1: Data analysis = Data insight
H1. Data insight
The goal of StatMine 0.1 was to provide more insight into StatLine numbers by
• Presenting these facts visually and interactively
We tested this successfully on 4 “difficult” StatLine tables.
Bar chart
- compare
Line chart
- development
Bubble/scatter chart
- correlation
Mosaic chart
- structure
an exploration of dissemination data: StatMine 9
Chart type – bar chart
Small multiples?
Demo
StatMine 0.1 Results
Tested on 25 users:
Findings:
- Test persons think that visualizing data adds value (small multiples)
- Data owners look at their data differently
- They want this tool to check their data before publication.
Problem 2: Fragmented Information
2. Fragmented information
Most information in StatLine is fragmented:
‐ Energy consumption in relation to economic growth
‐ Perceived public safety in relation to registered crime
– Users currently need to look into multiple tables and combine the information by hand.
2. Merge data!
H2. Table joining
Goal of StatMine 0.2: create more insight by:
- Letting users combine tables
- Condition: the tables share at least one column/data dimension.
- Tested on a small set of tables.
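The joining condition above can be illustrated with a minimal sketch. It assumes each table is a list of records and that the shared dimensions are found by intersecting dimension names. The `shared_dimensions` and `join_tables` helpers, the field names and the numbers are hypothetical and are not taken from StatMine or StatLine.

```python
def shared_dimensions(dims_a, dims_b):
    """Return the dimension names both tables have, in dims_a order."""
    return [d for d in dims_a if d in dims_b]


def join_tables(rows_a, dims_a, rows_b, dims_b):
    """Inner-join two tables on every dimension they share.

    Raises ValueError when the tables share no dimension, mirroring the
    condition that joined tables need at least one common dimension.
    """
    keys = shared_dimensions(dims_a, dims_b)
    if not keys:
        raise ValueError("tables share no dimension and cannot be joined")

    # Index table B by its values on the shared dimensions.
    index = {}
    for row in rows_b:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    # For each row of table A, merge in every matching row of table B.
    joined = []
    for row in rows_a:
        for match in index.get(tuple(row[k] for k in keys), []):
            merged = dict(match)
            merged.update(row)
            joined.append(merged)
    return joined


# Hypothetical example data (made-up numbers, for illustration only).
energy = [
    {"year": 2010, "region": "North", "energy_use": 120},
    {"year": 2011, "region": "North", "energy_use": 115},
]
growth = [
    {"year": 2010, "region": "North", "growth_pct": 1.5},
    {"year": 2011, "region": "North", "growth_pct": 0.9},
]

combined = join_tables(energy, ["year", "region"], growth, ["year", "region"])
```

Each row of `combined` then carries both measures for the same year and region, which is the kind of combined view a user would otherwise have to assemble by hand.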
StatMine 0.2 Results
Test persons: 20 internal, 40 external (policy makers, journalists).
Findings:
- External users are enthusiastic about the visual possibilities of StatMine
- Joining of data fills a user need.
Problem 3: Statistical numbers are uncertain
H3. Confidence intervals
– All facts from Statistics Netherlands have a confidence interval
– European Statistics Code of Practice (12.2):
‐ “sampling and non sampling errors should be systematically documented”
Goal StatMine 0.3:
Investigate how uncertainty in numbers can be presented understandably to users.
Restricted to:
‐ How do users interpret CIs? And how does that affect the interpretation of facts?
‐ Do users need CIs?
Assumption:
‐ For the test data set, point estimates with CIs are available
StatMine 0.3
User test (100+ participants) with synthetic data shows that:
‐ CIs improve validity of user statements (they are more correct)
User test CIs
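One way to see why confidence intervals can make user statements more correct: a statement such as "A is higher than B" is only well supported when the two intervals do not overlap. The sketch below assumes each fact is a point estimate with a symmetric margin. The `Estimate` class, the `clearly_higher` helper and the numbers are hypothetical, and interval non-overlap is a conservative rule of thumb rather than a formal significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    value: float   # point estimate
    margin: float  # half-width of the confidence interval (assumed symmetric)

    @property
    def lower(self):
        return self.value - self.margin

    @property
    def upper(self):
        return self.value + self.margin


def clearly_higher(a, b):
    """True when a's whole interval lies above b's whole interval."""
    return a.lower > b.upper


# Made-up numbers, for illustration only.
region_a = Estimate(value=52.0, margin=1.5)
region_b = Estimate(value=50.0, margin=1.5)
region_c = Estimate(value=45.0, margin=1.5)

a_above_b = clearly_higher(region_a, region_b)  # intervals overlap
a_above_c = clearly_higher(region_a, region_c)  # intervals are separated
```

Without the margins, a user would read 52.0 versus 50.0 as a clear difference; with them, only the comparison between `region_a` and `region_c` supports a firm statement.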
StatMine 0.3
– Prototype StatMine 0.3:
‐ Show uncertainty in line charts
‐ Show uncertainty in bar charts
‐ Tested on 25 test persons.
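To draw uncertainty in a line chart, each point needs a lower and an upper value next to the estimate, so that a band can be drawn around the line. A minimal sketch, assuming each point comes with a standard error and that a normal approximation applies (z = 1.96 for a 95% interval). These assumptions, the `uncertainty_band` helper and the numbers are illustrative and do not describe how the StatMine prototype computes its intervals.

```python
def uncertainty_band(points, z=1.96):
    """Return (period, lower, estimate, upper) for each (period, estimate, se).

    Assumes a symmetric interval of z standard errors around the estimate.
    """
    band = []
    for period, estimate, standard_error in points:
        margin = z * standard_error
        band.append((period, estimate - margin, estimate, estimate + margin))
    return band


# Made-up series, for illustration only: (period, estimate, standard error).
series = [(2011, 100.0, 2.0), (2012, 104.0, 2.5)]
band = uncertainty_band(series)
```

The same tuples can feed a bar chart, where the lower and upper values become the ends of an error bar instead of the edges of a band.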
Line charts with uncertainty
Bar charts with uncertainty
StatMine 0.4
– Built on the CBS open data API
– Will be public
– Currently in beta test, ETA 2014 Q3
Questions?