Upload
abel-campbell
View
214
Download
1
Embed Size (px)
Citation preview
Visualization
Blaz Zupan
Faculty of Computer & Info ScienceUniversity of Ljubljana, Slovenia
Visual Data Mining
• Basic idea– visual presentation of the data– gain insight & generate hypothesis– draw conclusions– directly interact with data
• Include human in the data exploration process– use her/his flexibility– creativity– general knowledge
Benefits of Visualization
• involvement of the user• results are intuitive
– no need for understanding complex mathematical or statistical algorithms or parameters
• provision of qualitative overview of the data– can isolate specific patterns for further
quantitative analysis
• can deal with non-homogenous, noisy data
Visual Exploration Paradigm
Overview first, zoom & filter, and then details on demand.
Overview first, zoom & filter, and then details on demand.
Visual Exploration Paradigm
Overview first, zoom & filter, and then details on demand.
Overview first, zoom & filter, and then details on demand.
Classification
one-dimensional
two-dimensional
multi-dimensional
text, web content
networks
other (e.g. algorithms/software, ...)
Data Type
Interaction & DistortionTechnique
VisualizationTechnique
Standard
Projection
Filtering
Zoom
Distortion
Link & Brush
Standard 2D/3D Display
Geometrically Transformed Display
Iconic Display
Dense Pixel Display
Stacked Display
from D Keim & M Ward: Visualization, in Intelligent Data Analysis, M Berthold & DJ Hand (eds), Springer, 2003.
Data: One-Dimensional
R Bellazzi: Mining Biomedical Time Series by Combining Structural Analysis and Temporal Abstractions,In Proc. of AMIA 1998.
Data: Two-Dimensional
MineSet’s Map Visualizer.
Data: Multi-Dimensional
Data: Text • Galaxies visualization
• Uses the “night sky” visualization to represent a set of documents
• One document – one star
• Stars clustered together represent related documents
• Includes analytical tools to investigate groups and time-based trends, query contents
From Inspire (TM) Software, seewww.pnl.gov/infoviz/technologies.html
Data: Text • ThemeView (TM)• Topics or themes
of text documents shown in relief map of a natural terrain
• The height of a peek relates to the strength of the topic
From Inspire (TM) Software, seewww.pnl.gov/infoviz/technologies.html
Data: Text • Theme River (TM)
• Identification of time related trends and patterns
• Themes represented as colored streams
• The width of the stream relates to the collective strength of a theme
From Inspire (TM) Software, seewww.pnl.gov/infoviz/technologies.html
Data: Networks
E. coli metabolic network (colors denote predominant biochemical class of metabolites)Ravasz et al., Science 297, 30 Aug 2002.
S. cerevisiae gene interaction networkTong et al., Science 303, 6 Feb 2004.
V Batagelj, A Mrvar: Pajek @ vlado.fmf.uni-lj.si/pub/networks/pajek/
Data: Tree Hierarchies
Unix home directory Selected detail
Kleiberg et al.: Botanic Visualization of Huge Hierarchies, In InfoVis, 2001.
Classification
one-dimensional
two-dimensional
multi-dimensional
text, web content
networks
other (e.g. algorithms/software, ...)
Data Type
Interaction & DistortionTechnique
VisualizationTechnique
Standard
Projection
Filtering
Zoom
Distortion
Link & Brush
Standard 2D/3D Display
Geometrically Transformed Display
Iconic Display
Dense Pixel Display
Stacked Display
Standard 2D/3D
• x-y (x-y-z) plots
• bar charts• line graphs• histograms• maps
Standard 2D/3D
• x-y (x-y-z) plots
• bar charts• line graphs• histograms• maps
Standard 2D/3D
• x-y (x-y-z) plots
• bar charts• line graphs• histograms• maps
Geom.-Transformed Displays
• includes several classes of visualizations
• projection pursuit, finding “interesting transformations” of multi-dim data set
• scatterplot matrix
• parallel coordinates
Iconic Displays
W Horn et al.: Metaphor graphics to visualize ICU data over time, In IDAMAP 1998.
Dense Pixel Displays
DA Keim et al.: Recursive Pattern: A technique for visualizing very large amounts of dataProc. Visualization 95, pages 279-286, 1995.
Dense Pixel Displays
Ankerst et al.: Circle Segments: A technique for visually exploring large multidimensional data sets.In Proc. Visualization 96, Hot Topic Session, 1996.
Stacked Displays
J LeBlanc et al.: Exploring n-dimensional databases. In Proc. Visualization 90, pages 230-239, 1990.
• an example is dimensional stacking
• embed one coordinate system within the other
• e.g. two attributes in one system, then another two when drilling down
Stacked Displays
Decision table visualization from SGI’s MineSet
Stacked Displays
Mosaic display in Orange.
Classification
one-dimensional
two-dimensional
multi-dimensional
text, web content
networks
other (e.g. algorithms/software, ...)
Data Type
Interaction & DistortionTechnique
VisualizationTechnique
Standard
Dynamic Projection
Filtering
Zoom
Distortion
Link & Brush
Standard 2D/3D Display
Geometrically Transformed Display
Iconic Display
Dense Pixel Display
Stacked Display
Interaction Techniques
• Dynamic projection– dynamically change the
projections to explore multi-dimensional data sets
– projection pursuit, which finds well-separated clusters in scatterplot
• Interactive Filtering– browsing, can be
difficult for big data sets– querying, need to
specify a subset
• Zooming• Distortion
– e.g., fisheye view
• Brushing and linking– requires well-
integrated system for visualization
– selection from one visualization is fed into another one, selected instances highlighted in some way
Distortion
GW Furnas: Generalized Fisheye Views, Human Factors in Computing Systems CHI ‘86 Conference Proceedings, 16-23. 1986.
Distortion
From M Grobelnik, P Krese, D Mladenic: Project Intelligence (http://pi.ijs.si)
Distortion
From M Grobelnik, P Krese, D Mladenic: Project Intelligence (http://pi.ijs.si)
Distortion
From M Grobelnik, P Krese, D Mladenic: Project Intelligence (http://pi.ijs.si)
Brushing & Linking
Integration ofVisualization & Data Mining
1. Visualization techniques can be applied before (or independently) of DM
2. DM can be used to find patterns (or data subsets) that are further visualized
3. DM is interactive, users use visualization to guide the pattern search
4. Visualization of data mining models
Regression Tree
Regression tree visualization in SGI’s MineSet.
Classification Tree
Classification tree visualization in Orange.
Brushing: Trees & Scatter Plots
Sieve Diagram (Classification)
Nomograms
Intelligent Data Visualization
• Use an established visualization technique, but search for– interesting subset of attributes– interesting subset of data instances– interesting projection (how to use selected
attributes in visualization)
• All these to find “interesting” visualization
• Removes the burden for the user to find such visualizations by hand
Arrangement for Circle Segments
M Ankerst: Visual data mining with pixel-oriented techniques, In Proc. KDD, 2001.
VizRank
Conclusion
• Clarity of presentation• Aesthetics• Navigation & Interaction
• In data with many dimensions, tools are needed to find only “interesting” visualizations