Multivariate Data Visualization Adapted from Slides by: Matthew O. Ward Computer Science Department Worcester Polytechnic Institute This work was supported under NSF Grant IIS-9732897


Multivariate Data Visualization

Adapted from Slides by:

Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute

This work was supported under NSF Grant IIS-9732897

What is Multivariate Data?

Each data point has N variables or observations

Each observation can be:

nominal or ordinal

discrete or continuous

scalar, vector, or tensor

Data may or may not have a spatial, temporal, or other connectivity attribute

Characteristics of a Variable

Order: grades have an order, brand names do not.

Distance metric: for income, distance equals difference. For rankings, difference is not a distance metric.

Absolute zero: weight has an absolute zero, bank account balances do not.

A variable can be classified by these three attributes, which together are called its scale.

Effective visualizations attempt to match the scale of the data dimension with the graphical attribute conveying it.
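The matching idea can be sketched in a few lines of Python. This is only an illustration: the `Scale` class, the `suggest_channels` function, and the channel lists it returns are assumptions made for this example (loosely following common visualization guidance), not part of the original slides. It uses only the order and distance-metric attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """Hypothetical description of a variable's scale."""
    ordered: bool        # do values have a meaningful order?
    has_distance: bool   # is the difference between two values a distance?


def suggest_channels(scale):
    """Return graphical attributes that roughly fit the given scale.

    The channel lists are illustrative assumptions, not a fixed rule.
    """
    if scale.ordered and scale.has_distance:
        return ["position", "length", "size"]
    if scale.ordered:
        return ["lightness", "ordered position"]
    return ["hue", "shape"]


# Examples taken from the text: income, rankings, brand names.
income = Scale(ordered=True, has_distance=True)
ranking = Scale(ordered=True, has_distance=False)
brand = Scale(ordered=False, has_distance=False)

for name, scale in [("income", income), ("ranking", ranking), ("brand", brand)]:
    print(name, "->", suggest_channels(scale))
```

The point of the sketch is that an unordered variable such as a brand name should not be mapped to an attribute that implies order or magnitude, such as length.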

Sources of Multivariate Data

Sensors (e.g., images, gauges)

Simulations

Census or other surveys

Commerce (e.g., stock market)

Communication systems

Spreadsheets and databases

Issues in Visualizing Multivariate Data

How many variables?

How many records?

Types of variables?

User task (exploration, confirmation, presentation)

Data feature of interest (clusters, anomalies, trends, patterns, …)

Background of user (domain expert, visualization specialist, decision-maker, …)

Methods for Visualizing Multivariate Data

Dimensional Subsetting

Dimensional Reorganization

Dimensional Reduction

Dimensional Subsetting

Scatterplot matrix displays all pairwise plots

Selection allows linkage between views

Clusters, trends, and correlations readily discerned between pairs of dimensions
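A minimal sketch of the data behind a scatterplot matrix with linked selection, assuming records stored as dictionaries. The function names `scatterplot_matrix` and `brush` and the three sample records are made up for illustration; drawing is left out so the example needs only the standard library.

```python
from itertools import permutations


def scatterplot_matrix(records, dims):
    """Return {(row_dim, col_dim): [(x, y), ...]} for every ordered pair of
    distinct dimensions; x comes from col_dim and y from row_dim."""
    panels = {}
    for row_dim, col_dim in permutations(dims, 2):
        panels[(row_dim, col_dim)] = [(r[col_dim], r[row_dim]) for r in records]
    return panels


def brush(records, dim_x, dim_y, x_range, y_range):
    """Indices of the records inside a rectangular brush drawn in one panel."""
    (x0, x1), (y0, y1) = x_range, y_range
    return {
        i for i, r in enumerate(records)
        if x0 <= r[dim_x] <= x1 and y0 <= r[dim_y] <= y1
    }


# Made-up sample records, for illustration only.
records = [
    {"mpg": 30.0, "weight": 2.1, "hp": 70},
    {"mpg": 18.0, "weight": 3.6, "hp": 150},
    {"mpg": 24.0, "weight": 2.8, "hp": 105},
]

panels = scatterplot_matrix(records, ["mpg", "weight", "hp"])
selected = brush(records, "weight", "mpg", (2.0, 3.0), (20.0, 35.0))

# Linking: the same record indices are highlighted in every panel.
for key, points in panels.items():
    highlighted = [points[i] for i in sorted(selected)]
    print(key, highlighted)
```

Because the selection is a set of record indices rather than screen coordinates, a brush drawn in one panel carries over to all the others without extra work.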

Dimensional Reorganization

Parallel Coordinates creates parallel, rather than orthogonal, dimensions.

Data point corresponds to polyline across axes

Clusters, trends, and anomalies discernible as groupings or outliers, based on intercepts and slopes
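The mapping from a data point to its polyline can be sketched as follows. This assumes evenly spaced vertical axes and min-max scaling of each dimension; the names `axis_ranges` and `polyline` and the sample records are hypothetical.

```python
def axis_ranges(records, dims):
    """Minimum and maximum of each dimension, used to scale its axis."""
    return {
        d: (min(r[d] for r in records), max(r[d] for r in records))
        for d in dims
    }


def polyline(record, dims, ranges, width=1.0, height=1.0):
    """Vertices (x, y) of one record's polyline.

    Axis i sits at x = i * width / (len(dims) - 1); y is the record's value
    scaled to [0, height] along that axis. Needs at least two dimensions.
    """
    step = width / (len(dims) - 1)
    points = []
    for i, d in enumerate(dims):
        lo, hi = ranges[d]
        # A constant dimension has no extent; place it mid-axis.
        t = 0.5 if hi == lo else (record[d] - lo) / (hi - lo)
        points.append((i * step, t * height))
    return points


# Made-up sample records, for illustration only.
dims = ["a", "b", "c"]
records = [
    {"a": 0, "b": 10, "c": 5},
    {"a": 4, "b": 20, "c": 5},
    {"a": 2, "b": 15, "c": 5},
]
ranges = axis_ranges(records, dims)
lines = [polyline(r, dims, ranges) for r in records]
```

Records with similar values produce polylines that cross each axis at nearby intercepts, which is why clusters appear as bundles of lines and outliers as stray ones.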

Dimensional Reorganization

Glyphs map data dimensions to graphical attributes

Size, color, shape, and orientation are commonly used

Similarities/differences in features give insights into relations
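As a rough sketch, a glyph can be described as a small record of graphical attributes computed from the data. The function `make_glyph`, the chosen attribute ranges (size 4 to 20, gray level 0 to 255, angle 0 to 180 degrees), and the dimension names are all assumptions made for this example.

```python
def normalize(value, lo, hi):
    """Scale value from [lo, hi] to [0, 1]; constant ranges map to 0.5."""
    return 0.5 if hi == lo else (value - lo) / (hi - lo)


def make_glyph(record, ranges, size_dim, color_dim, angle_dim):
    """Map three data dimensions to size, gray level, and orientation."""
    size = 4.0 + 16.0 * normalize(record[size_dim], *ranges[size_dim])
    gray = round(255 * normalize(record[color_dim], *ranges[color_dim]))
    angle = 180.0 * normalize(record[angle_dim], *ranges[angle_dim])
    return {"size": size, "gray": gray, "angle_deg": angle}


# Hypothetical value ranges for three dimensions.
ranges = {"a": (0, 10), "b": (0, 100), "c": (-1, 1)}

low = make_glyph({"a": 0, "b": 0, "c": -1}, ranges, "a", "b", "c")
high = make_glyph({"a": 10, "b": 100, "c": 1}, ranges, "a", "b", "c")
```

Two records with similar values yield glyphs that look alike, so differences in size, shade, or orientation across a display point to differences in the underlying dimensions.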

Dimensional Reduction

Map N-D locations to M-D display space while best preserving N-D relations

Approaches include multidimensional scaling (MDS), principal component analysis (PCA), and Kohonen self-organizing maps

Relationships conveyed by position, links, color, shape, size, etc.
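Of the approaches listed, PCA is the simplest to sketch without external libraries. The version below finds the leading eigenvectors of the covariance matrix by power iteration with deflation, which is one of several ways to compute PCA and is chosen here only for brevity. All function names and the sample data are assumptions; the sketch expects at least two records and distinct leading eigenvalues.

```python
import math
import random


def center(rows):
    """Subtract the per-dimension mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: multiply and renormalize a fixed-seed start vector."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pca_project(rows, m=2):
    """Project N-D rows onto their first m principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(m):
        v = top_eigenvector(cov)
        lam = sum(a * b for a, b in zip(v, mat_vec(cov, v)))  # Rayleigh quotient
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(a * b for a, b in zip(r, c)) for c in components]
            for r in centered]


# Made-up 3-D points that lie in a 2-D plane (third value equals the first).
rows = [[0, 0, 0], [4, 0, 4], [0, 1, 0], [4, 1, 4], [2, 0.5, 2]]
projected = pca_project(rows, m=2)
```

Because the sample points lie in a plane, the 2-D projection keeps their pairwise distances; for general data the projection keeps as much variance as two linear axes can, and some N-D relations are lost.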

The Role of Selection

User needs to interact with display, examine interesting patterns or anomalies, validate hypotheses

Selection allows isolation of subset of data for highlighting, deleting, focussed analysis

Direct (clicking on displayed items) vs. indirect (range sliders)

Screen space (2-D) vs. data space (N-D)
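The two distinctions above can be illustrated with a pair of small functions: an indirect selection made in data space with one range per dimension, and a direct selection made in screen space by clicking near drawn items. The function names, the pixel radius, and the sample values are assumptions for this sketch.

```python
def select_by_ranges(records, ranges):
    """Indirect, data-space selection: one (low, high) slider per dimension.

    A record is selected only if it lies inside every given range.
    """
    return [
        i for i, r in enumerate(records)
        if all(lo <= r[d] <= hi for d, (lo, hi) in ranges.items())
    ]


def select_by_click(screen_points, click, radius=5.0):
    """Direct, screen-space selection: items drawn within `radius` pixels."""
    cx, cy = click
    return [
        i for i, (x, y) in enumerate(screen_points)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    ]


# Made-up records and the pixel positions where they happen to be drawn.
records = [
    {"age": 25, "income": 40},
    {"age": 52, "income": 90},
    {"age": 37, "income": 65},
]
screen_points = [(10.0, 80.0), (120.0, 20.0), (60.0, 50.0)]

by_sliders = select_by_ranges(records, {"age": (30, 60), "income": (50, 100)})
by_click = select_by_click(screen_points, click=(58.0, 52.0))
```

A screen-space selection depends on the current projection, so two items that overlap on screen are selected together even if they are far apart in N-D; a data-space selection does not have that limitation.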

Auxiliary Tools

Extent scaling to reduce occlusion of bands

Dimensional zooming - fill display with selected subspace (N-D distortion)

Dynamic masking to fade out selected or unselected data

Saving selected subsets

Enabling/disabling dimensions

Univariate displays (Tukey box plots, tree maps)
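Two of these tools are easy to sketch: dynamic masking as a per-record opacity, and the summary statistics behind a Tukey box plot. The function names, the faded opacity of 0.15, and the sample values are assumptions; the quartiles use the standard library's "inclusive" method, and other quartile conventions give slightly different numbers.

```python
import statistics


def dynamic_mask(count, selected, fade_selected=False, faded_alpha=0.15):
    """Opacity per record: fade either the unselected or the selected data."""
    chosen = set(selected)
    return [
        faded_alpha if (i in chosen) == fade_selected else 1.0
        for i in range(count)
    ]


def tukey_summary(values):
    """Quartiles, whisker ends, and outliers using 1.5 * IQR fences.

    Expects at least two values.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }


# Made-up values with one obvious outlier.
summary = tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
alphas = dynamic_mask(4, selected=[1, 3])
```

For the sample values the upper fence falls below 30, so that value is reported as an outlier and the upper whisker ends at 9.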