Visualization of Large Multivariate Data Sets using Parallel Coordinates Ing. Ľuboš Takáč PhD student Faculty of Management Science and Informatics University of Žilina


DESCRIPTION

Presentation of my tool, which visualizes multivariate data sets using parallel coordinates.


Page 1: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Visualization of Large Multivariate Data Sets using Parallel Coordinates

Ing. Ľuboš Takáč
PhD student

Faculty of Management Science and Informatics
University of Žilina

Page 2: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Presentation overview
• Visualization
• Parallel Coordinates (PC)
• Large Multivariate Data Sets (LMDS)
• Problem of visualizing LMDS
• Solutions
• Developed Software Tool
• Further research and application

Page 3: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Visualization
• One of the best approaches for presenting data from a computer to a human
• Advantages
  • Global view of the data (all in one picture)
  • Significant features are highlighted (and vice versa)
  • Fast understanding of the data by a human
• Purpose
  • Understand raw data
  • See significant characteristics or anomalies, which can be examined further to gain additional information about the raw data

Page 4: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Parallel Coordinates (parallel axes)
• Invented by Maurice d’Ocagne, 1885
• Popularized by Alfred Inselberg, 1959
• Easy construction (like an ordinary graph)
• Data are represented by polylines
• Suitable for visualizing multivariate data (more than 3 dimensions)
• You can see dependencies between variables
• Distribution per variable

Page 5: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Parallel Coordinates - Construction

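The main difference between an ordinary graph of a function and parallel coordinates is the position of the axes: in parallel coordinates the axes are drawn parallel to each other instead of perpendicular.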
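The construction can be sketched in a few lines of Python. This is only an illustration, not the code of the presented tool: the function name `polyline_vertices`, the canvas size and the per-dimension `mins`/`maxs` ranges are assumptions made for the example. Each dimension gets its own vertical axis, and a record becomes a polyline with one vertex per axis.

```python
def polyline_vertices(record, mins, maxs, width=800.0, height=400.0):
    """Map one record (one value per dimension) to polyline vertices.

    Hypothetical helper for illustration; the canvas size is an assumption.
    """
    n = len(record)
    vertices = []
    for i, value in enumerate(record):
        # Axis i is a vertical line; axes are spread evenly across the width.
        x = i * width / (n - 1) if n > 1 else 0.0
        # Normalize the value to [0, 1] within the range of its dimension.
        span = maxs[i] - mins[i]
        t = (value - mins[i]) / span if span else 0.5
        # Screen y grows downward, so the maximum maps to the top (y = 0).
        y = (1.0 - t) * height
        vertices.append((x, y))
    return vertices


# A record with three dimensions becomes a polyline with three vertices.
vertices = polyline_vertices(
    [5.0, 0.0, 10.0], mins=[0.0, 0.0, 0.0], maxs=[10.0, 10.0, 10.0]
)
print(vertices)  # [(0.0, 200.0), (400.0, 400.0), (800.0, 0.0)]
```

Drawing the data set then amounts to connecting the vertices of every record with line segments.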

Page 6: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Parallel Coordinates – Examples

Examples of some 2D functions visualized in parallel coordinates with the developed software tool.

Page 7: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Large Multivariate Data Sets
• A collection of data, usually presented in tabular form

Multivariate data set of movies with 7 dimensions.

Page 8: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Problem of visualizing LMDS
• Overlapping records
  • when records are simply painted one over another, they overlap and you lose some information
• Overlapping identical records
  • with simple painting you cannot see the difference between two and hundreds of identical overlapping records
• Too many records to visualize => one big blur
  • imagine a resolution of 1024x768 and ten thousand records uniformly distributed over the axes (a height of 768 px means about 13 records per pixel)
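The "13 records per pixel" estimate is plain division. The short check below assumes, as the slide does, that an axis spans the full 768 px of screen height; the variable names are only illustrative.

```python
records = 10_000        # number of records in the data set
axis_height_px = 768    # assumed axis height: the full screen height

# With a uniform distribution, each pixel row of an axis receives on average:
records_per_pixel = records / axis_height_px
print(round(records_per_pixel, 1))  # 13.0
```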

Page 9: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Problem of visualizing LMDS

Problem of overlapping painted records.

Page 10: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Possible Solutions

• Preprocessing data before visualization

• Paint the data in a more sophisticated way, using alpha compositing

Page 11: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Alpha compositing
• A computer graphics painting method that uses an alpha channel to define each color (alpha channel: the transparency of the color)
• If you paint an object with a non-opaque color, the resulting color also depends on the background

http://en.wikipedia.org/wiki/Alpha_compositing
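To see why this helps with overlapping records, here is a minimal sketch of the standard "over" operator for a single color channel on an opaque background. It is not the tool's rendering code; the function names and the alpha value of 0.1 are assumptions chosen for the example. A pixel crossed by many semi-transparent lines gets gradually darker, so dense regions become distinguishable from sparse ones.

```python
def over(src, src_alpha, dst):
    """Composite a source intensity over an opaque background (one channel, 0..1)."""
    return src * src_alpha + dst * (1.0 - src_alpha)


def paint_overlapping(count, line=0.0, alpha=0.1, background=1.0):
    """Paint `count` identical lines over the same pixel and return its intensity.

    Defaults are illustrative: black lines (0.0) on a white background (1.0).
    """
    pixel = background
    for _ in range(count):
        pixel = over(line, alpha, pixel)
    return pixel


# The pixel darkens as more lines overlap it.
for count in (1, 2, 10, 100):
    print(count, round(paint_overlapping(count), 3))

# With an opaque color (alpha = 1.0) one line and a hundred lines look the same.
print(paint_overlapping(1, alpha=1.0) == paint_overlapping(100, alpha=1.0))  # True
```

With black lines on a white background the pixel intensity after n overlaps is (1 - alpha)^n, so the choice of alpha controls how quickly dense areas saturate.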

Page 12: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Implemented solution

The same randomly generated multivariate data sets visualized with an opaque color (upper image) and with the alpha compositing technique (right image).

Page 13: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Developed software tool

• Based on the principles mentioned above
• Interactive analysis of LMDS
• Interactive set operations (selection, difference, intersection …)
• High-quality, antialiased image
• Data import from a text file
• Record count is limited to hundreds of thousands at a resolution of 1920x1080
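The slides do not describe how the set operations are implemented. One plausible way to think about them, shown here purely as an assumption, is to treat each selection as a set of record indices and combine selections with ordinary set operations. The `select` helper and the sample rows (year, rating) are made up for the illustration.

```python
def select(records, axis, low, high):
    """Return the indices of records whose value on `axis` lies in [low, high]."""
    return {i for i, rec in enumerate(records) if low <= rec[axis] <= high}


# Made-up rows for illustration only: (year, rating)
records = [
    (1990, 7.5),
    (2005, 8.1),
    (2010, 5.2),
    (1985, 8.8),
]

recent = select(records, axis=0, low=2000, high=2020)    # selection on the year axis
high_rated = select(records, axis=1, low=8.0, high=10.0)  # selection on the rating axis

both = recent & high_rated         # intersection: recent and highly rated
recent_only = recent - high_rated  # difference: recent but not highly rated
print(sorted(both), sorted(recent_only))  # [1] [2]
```

Keeping selections as index sets means the original table is never modified, and any combination of selections can be highlighted by repainting only the chosen records.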

Page 14: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Developed software tool

Demonstration of the developed software tool. The visualized data sets come from IMDb (Internet Movie Database).

Page 15: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Further research and application
• The tool can help decision makers and data analysts gain additional information and make better decisions.
• Medical data
• Scholar data

Page 16: Visualization of Large Multivariate Data Sets using Parallel Coordinates

Thank you for your attention.

[email protected]