1 SDMIV Data Visualization - A Very Rough Guide Ken Brodlie University of Leeds

Page 1

Data Visualization - A Very Rough Guide

Ken Brodlie, University of Leeds

Page 2

What is This Thing Called Visualization?

Visualization

– “Use of computer-supported, interactive, visual representations of data to amplify cognition” (Card, Mackinlay, Shneiderman)

– Born as a discipline in 1987 with publication of NSF Report

– Now widely used in computational science and engineering

[Image: Vis5D]

Page 3

Visualization – Twin Subjects

Scientific Visualization

– Visualization of physical data

Information Visualization

– Visualization of abstract data

[Images: ozone layer around the Earth; automobile web site, visualizing links]

Page 4

Scientific Visualization – Another Characterisation

Focus is on visualizing an entity measured in a multi-dimensional space

– 1D

– 2D

– 3D

– Occasionally nD

Underlying field is recreated from the sampled data

Relationship between variables well understood – some independent, some dependent

http://pacific.commerce.ubc.ca/xr/plot.html

Image from D. Bartz and M. Meissner

Page 5

Scientific Visualization Model

Visualization represented as pipeline:

– Read in data

– Build model of underlying entity

– Construct a visualization in terms of geometry

– Render geometry as image

Realised as modular visualization environment

– IRIS Explorer

– IBM Open Visualization Data Explorer (DX)

– AVS

[Diagram: data → model → visualize → render]
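The four pipeline stages above can be sketched as composable functions. This is only a minimal illustration: the function names (read_data, build_model, visualize, render), the toy 1D samples and the choice of linear interpolation as the "model" are all assumptions made for the example, not the API of IRIS Explorer, DX or AVS.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]      # (position, measured value)
Segment = Tuple[Sample, Sample]   # one line segment of geometry


def read_data() -> List[Sample]:
    """Read in data: here a hard-coded, made-up set of 1D samples."""
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]


def build_model(samples: List[Sample]) -> Callable[[float], float]:
    """Build a model of the underlying entity (assumed: linear interpolation)."""
    pts = sorted(samples)

    def field(x: float) -> float:
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x lies outside the sampled range")

    return field


def visualize(field: Callable[[float], float],
              lo: float, hi: float, n: int) -> List[Segment]:
    """Construct geometry: a polyline of n segments approximating the field."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    pts = [(x, field(x)) for x in xs]
    return list(zip(pts, pts[1:]))


def render(geometry: List[Segment]) -> str:
    """Render geometry as an 'image': here just a text listing of segments."""
    return "\n".join(
        f"({a[0]:.1f},{a[1]:.1f})-({b[0]:.1f},{b[1]:.1f})" for a, b in geometry
    )


# data -> model -> visualize -> render
image = render(visualize(build_model(read_data()), 0.0, 2.0, 4))
```

Each stage consumes only the output of the previous one, which is what lets a modular environment swap or rewire stages independently.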

Page 6

Extending the SciVis Model

The dataflow model has proved extremely flexible

Provides basis of collaborative visualization

– Implemented in IRIS Explorer as the COVISA toolkit

Extensible

– User code introduced as module in pipeline allows computational steering

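[Diagram: data → model → visualize → render, with a collaborative server linking the pipeline over the internet to a second render stage]

[Diagram: simulate → visualize → render, with control feeding back into the pipeline]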

Page 7

An e-Science Demonstrator

Emergency scenario: release of toxic chemical

– Simulation launched on Grid resource, steered from desktop using IRIS Explorer

– Collaborators linked in remotely using COVISA toolkit

[Image: Dispersion of pollutant studied under varying wind directions]

[Image: A collaborator links in over the network]

Page 8

Other Metaphors

Other user interface metaphors have been suggested

Spreadsheet interface becoming popular...

Allows audit trail of visualizations

Jankun-Kelly and Ma

Page 9

Information Visualization

Focus is on visualizing set of observations that are multi-variate

Example of iris data set

– 150 observations of 4 variables (length, width of petal and sepal)

– Techniques aim to display relationships between variables

Page 10

Dataflow for Information Visualization

Again we can express as a dataflow – but emphasis now is on data itself rather than underlying entity

First step is to form the data into a table of observations, each observation being a set of values of the variables

Then we apply a visualization technique as before

[Diagram: data → data table → visualize → render]

Data table (rows are observations, columns are variables):

     A   B   C

1    ..  ..  ..

2    ..  ..  ..
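As a minimal sketch of the first step, raw records can be formed into a table with one row per observation and one column per variable. The variable names A, B, C follow the table above; the numeric values and the helper name column are made up purely for illustration.

```python
from typing import Dict, List

# Illustrative raw records; the values are invented for this example.
raw: List[Dict[str, float]] = [
    {"A": 1.0, "B": 4.0, "C": 0.5},
    {"A": 2.0, "B": 3.0, "C": 0.7},
]

variables = ["A", "B", "C"]

# The data table: one row per observation, one column per variable.
table: List[List[float]] = [[rec[v] for v in variables] for rec in raw]


def column(table: List[List[float]], variables: List[str], name: str) -> List[float]:
    """Return every observed value of one variable (hypothetical helper)."""
    j = variables.index(name)
    return [row[j] for row in table]
```

The later visualization techniques then operate on rows (observations) or columns (variables) of this table.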

Page 11

Multivariate Visualization

Software:

– XmdvTool (Matthew Ward)

Techniques designed for any number of variables

– Glyph techniques

– Parallel co-ordinates

– Scatter plot matrices

– Pixel-based techniques

Acknowledgement: Many of the images in the following slides are taken from Ward’s work, and also from IRIS Explorer!

Page 12

Glyph Techniques

Star plots

– Each observation represented as a ‘star’

– Each spike represents a variable

– Length of spike indicates the value

Variety of possible glyphs

– Chernoff faces

[Image: Crime in Detroit]
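The star plot construction can be sketched as follows: one spike per variable, spaced at equal angles, with length proportional to the value. The function name star_glyph, the per-variable min/max normalisation and the equal-angle layout are assumptions for this illustration; real tools may scale or arrange spikes differently.

```python
import math
from typing import List, Sequence, Tuple


def star_glyph(values: Sequence[float],
               mins: Sequence[float],
               maxs: Sequence[float],
               radius: float = 1.0) -> List[Tuple[float, float]]:
    """Return the (x, y) endpoint of each spike for one observation.

    Spike k points at angle 2*pi*k/M; its length is the value of
    variable k scaled into [0, radius] using that variable's range.
    """
    m = len(values)
    spikes = []
    for k, (v, lo, hi) in enumerate(zip(values, mins, maxs)):
        # A variable with zero range carries no information: length 0.
        length = radius * (v - lo) / (hi - lo) if hi > lo else 0.0
        angle = 2.0 * math.pi * k / m
        spikes.append((length * math.cos(angle), length * math.sin(angle)))
    return spikes
```

Drawing a line from the glyph centre to each endpoint gives the 'star' for that observation.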

Page 13

Parallel Co-ordinates

Each variate represented as vertical axis

Axes laid out uniformly

Observation represented as a polyline traversing all M axes, crossing each axis at the observed value of the variate

[Image: Detroit homicide data (7 variables, 13 observations)]
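A minimal sketch of the construction described above: axis j is placed at x = j * spacing, and an observation becomes a polyline crossing each axis at its value scaled to that axis. The names axis_ranges and polyline and the min/max scaling of each axis are assumptions for the illustration.

```python
from typing import List, Sequence, Tuple


def axis_ranges(table: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-variate minimum and maximum over all observations."""
    m = len(table[0])
    mins = [min(row[j] for row in table) for j in range(m)]
    maxs = [max(row[j] for row in table) for j in range(m)]
    return mins, maxs


def polyline(observation: Sequence[float],
             mins: Sequence[float],
             maxs: Sequence[float],
             spacing: float = 1.0) -> List[Tuple[float, float]]:
    """Vertices of the polyline for one observation across all M axes."""
    vertices = []
    for j, (v, lo, hi) in enumerate(zip(observation, mins, maxs)):
        # Scale onto [0, 1]; a constant variate is drawn at mid-axis.
        y = (v - lo) / (hi - lo) if hi > lo else 0.5
        vertices.append((j * spacing, y))
    return vertices
```

For example, with ranges taken from a two-row table, the observation [2.0, 20.0, 5.0] crosses the first axis halfway up and the second at its top.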

Page 14

Scatter Plot Matrices

Matrix of 2D scatter plots

– Each plot shows projection of data onto a 2D subspace of the variates

– Order M² plots
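The count can be made concrete with a small sketch: an M × M matrix has M² cells, of which M(M-1)/2 show distinct pairs of variates (the rest are mirror images or the diagonal). The function names below are hypothetical and the layout convention (row variate on the y axis, column variate on the x axis) is an assumption.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def scatter_matrix_cells(table: Sequence[Sequence[float]]) -> Dict[Tuple[int, int], List[Point]]:
    """Map each cell (i, j) to the data projected onto variates (j, i).

    Cell (i, j) plots variate j on the x axis against variate i on the y axis.
    """
    m = len(table[0])
    return {
        (i, j): [(obs[j], obs[i]) for obs in table]
        for i, j in product(range(m), repeat=2)
    }


def distinct_pairs(m: int) -> int:
    """Number of distinct unordered pairs of variates: M(M-1)/2."""
    return m * (m - 1) // 2
```

With M = 7, as in the Detroit data, that is 49 cells and 21 distinct pairs, which shows how quickly the matrix grows.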

Page 15

The Screen Space Problem

All techniques, sooner or later, run out of screen space

Parallel co-ordinates

– Usable for up to 150 variates

– Unworkable for more than 250 variates

[Image: Remote sensing data (5 variates, 16,384 observations)]

Page 16

Brushing as a Solution

Brushing selects a restricted range of one or more variables

Selection then highlighted
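Brushing can be sketched as a filter over the data table: a brush is a range on one or more variables, and the selected observations are those falling inside every range. The function name brush and the dictionary representation of the ranges are assumptions for this illustration.

```python
from typing import Dict, List, Sequence, Tuple


def brush(table: Sequence[Sequence[float]],
          ranges: Dict[int, Tuple[float, float]]) -> List[int]:
    """Return the indices of observations inside every brushed range.

    `ranges` maps a variable's column index to an inclusive (low, high)
    interval; variables not mentioned are unrestricted.
    """
    return [
        i
        for i, obs in enumerate(table)
        if all(lo <= obs[j] <= hi for j, (lo, hi) in ranges.items())
    ]
```

A display would then draw the returned observations in a highlight colour and the remainder in a muted one.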

Page 17

Clustering as a Solution

Success has been achieved through clustering of observations

Hierarchical parallel co-ordinates

– Cluster by similarity

– Display using translucency and proximity-based colour
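To make "cluster by similarity" concrete, here is a minimal sketch of generic agglomerative clustering of observations. It uses single linkage with Euclidean distance, which is an assumption chosen for brevity; the slides do not say which similarity measure or linkage the hierarchical parallel co-ordinates work uses, and the function name agglomerate is hypothetical.

```python
import math
from typing import List, Sequence


def agglomerate(points: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Repeatedly merge the two closest clusters until k remain (k >= 1).

    Clusters are lists of observation indices. Distance between clusters
    is single linkage: the smallest distance between any two members.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a: List[int], b: List[int]) -> float:
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        # Find the closest pair of clusters (first one wins on ties).
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

Recording the merges at every level, rather than stopping at k, would give the full hierarchy from which a display can pick a level of detail.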

Page 18

Hierarchical Parallel Co-ordinates

Page 19

Reduction of Dimensionality of Variate Space

Reduce number of variables, preserve information

Principal Component Analysis

– Transform to new co-ordinate system

– Hard to interpret

Hierarchical reduction of variate space

– Cluster variables where distance between observations is typically small

– Choose representative for each cluster
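For the Principal Component Analysis option, the "new co-ordinate system" is given by the eigenvectors of the covariance matrix of the variables. The sketch below finds only the first axis, using power iteration in plain Python; the function name, the start vector and the fixed iteration count are assumptions for the illustration, and a numerical library would normally be used instead.

```python
import math
from typing import List, Sequence


def first_principal_component(table: Sequence[Sequence[float]],
                              iterations: int = 100) -> List[float]:
    """Unit vector along the direction of greatest variance.

    Assumes at least two observations. Power iteration can fail if the
    start vector happens to be orthogonal to the answer; the start
    vector used here is an arbitrary choice for this sketch.
    """
    n, m = len(table), len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in table]
    # Sample covariance matrix of the variables.
    cov = [
        [sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(m)]
        for a in range(m)
    ]
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data at all
        v = [x / norm for x in w]
    return v
```

The resulting axis is a weighted mix of all the original variables, which is one way to see why the transformed co-ordinates are hard to interpret.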

Page 20

Using a Dataflow System for Information Visualization

IRIS Explorer used to visualize data from BMW

– Five variables displayed using spatial arrangement for three, colour and object type for others

– Notice the clusters…

More later...

Kraus & Ertl

Page 21

Scientific Visualization – Information Visualization

Scientific Visualization

– Focus is on visualizing an entity measured in a multi-dimensional space

– Underlying field is recreated from the sampled data

– Relationship between variables well understood

Information Visualization

– Focus is on visualizing set of observations that are multi-variate

– There is no underlying field – it is the data itself we want to visualize

– The relationship between variables is not well understood