TECHNIQUES FOR VISUALIZING MASSIVE DATA SETS
Leilani Battle, Mike Stonebraker
Context
[Diagram: Visualization System and Database, exchanging a query and a result]
Problem
• Performance
  • Vis systems don’t scale well for big data
  • Or are turning into databases
• Over-plotting
  • Makes visualizations unreadable
  • Waste of time/resources
Solution: Resolution Reduction
[Diagram: Visualization System, Resolution Reduction Layer, and Database. Labels: query, queryplan query, queryplan result, modified query, reduced result]
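The diagram labels above (query, queryplan query, queryplan result, modified query, reduced result) suggest a layer that asks the database how large a result would be and rewrites the query when it is too large. Below is a minimal Python sketch of that flow. Everything in it is hypothetical: `FakeDatabase`, `estimate_rows`, `MAX_ROWS`, and the use of every-k-th-row sampling as the reduction are illustrative assumptions, not ScalaR's or SciDB's actual API.

```python
import math
from dataclasses import dataclass


@dataclass
class FakeDatabase:
    """Hypothetical stand-in for a database; holds one list of rows."""
    rows: list

    def estimate_rows(self, query: dict) -> int:
        # Stands in for the "queryplan query" / "queryplan result" exchange:
        # report how many rows the query would return without running it.
        lo, hi = query["range"]
        return max(0, min(hi, len(self.rows)) - max(lo, 0))

    def execute(self, query: dict) -> list:
        lo, hi = query["range"]
        step = query.get("step", 1)
        return self.rows[max(lo, 0):hi:step]


MAX_ROWS = 100  # assumed budget: the most points the front-end should draw


def reduce_resolution(db: FakeDatabase, query: dict, max_rows: int = MAX_ROWS) -> list:
    """Run the query, rewriting it first if the estimated result is too large."""
    expected = db.estimate_rows(query)
    if expected <= max_rows:
        return db.execute(query)  # small enough: pass through unchanged
    # Modified query: keep every k-th row so at most max_rows come back.
    step = math.ceil(expected / max_rows)
    modified = {**query, "step": step}
    return db.execute(modified)


db = FakeDatabase(rows=list(range(10_000)))
small = reduce_resolution(db, {"range": (0, 50)})      # passed through as-is
large = reduce_resolution(db, {"range": (0, 10_000)})  # sampled down to the budget
```

Sampling is only one possible reduction; aggregation or filtering could be swapped in at the same point without changing the rest of the flow.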
ScalaR
• Scalable vis system for data exploration
  • Web front-end
  • Uses SciDB (www.scidb.org)
• Visualizes query results
  • Performs Resolution Reduction
Demo of ScalaR
Array Browser
• Collaboration with:
  • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel
  • Stanford: Zhicheng Liu, Jeff Heer
• Google Maps-style exploration experience
  • Fetches subsets of the data (aka data tiles)
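To make the data-tile idea concrete, here is a small Python sketch that lists which tiles overlap the user's current viewport. The tile size, the `(zoom, tile_x, tile_y)` key layout, and the function name are assumptions made for illustration; the slides do not specify how the Array Browser addresses tiles.

```python
from typing import List, Tuple

TILE_SIZE = 256  # assumed tile width/height in cells at every zoom level

TileKey = Tuple[int, int, int]  # (zoom, tile_x, tile_y); a hypothetical key layout


def tiles_for_viewport(zoom: int, x0: int, y0: int, width: int, height: int) -> List[TileKey]:
    """List the tiles that overlap a viewport given in cell coordinates
    of the chosen zoom level (x0, y0 is the top-left corner)."""
    first_x, first_y = x0 // TILE_SIZE, y0 // TILE_SIZE
    last_x = (x0 + width - 1) // TILE_SIZE
    last_y = (y0 + height - 1) // TILE_SIZE
    return [
        (zoom, tx, ty)
        for ty in range(first_y, last_y + 1)
        for tx in range(first_x, last_x + 1)
    ]


# A 300x256 viewport starting at x=200 straddles two tiles horizontally.
keys = tiles_for_viewport(zoom=2, x0=200, y0=0, width=300, height=256)
```

Only the tiles in `keys` need to be fetched, which is what keeps each request small regardless of the size of the full array.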
Array Browser Example
Array Browser Architecture
Demo of Array Browser
Future Work: Prefetching
• Goal: Reduce user-wait time by prefetching tiles
  • Cache tiles in the tile buffer
  • Need algorithms to decide what to pre-fetch
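A tile buffer with a prefetch hook could look roughly like the Python sketch below. The class name, the `fetch` callback, and the least-recently-used eviction are all assumptions for illustration; LRU is only a placeholder policy here.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable


class TileBuffer:
    """Fixed-capacity tile cache. LRU eviction is a placeholder choice."""

    def __init__(self, capacity: int, fetch: Callable[[Hashable], object]):
        self.capacity = capacity
        self.fetch = fetch  # hypothetical function that loads a tile from the database
        self.tiles = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _store(self, key, tile):
        self.tiles[key] = tile
        self.tiles.move_to_end(key)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)  # evict the least recently used tile

    def get(self, key):
        """Return a tile for the user, fetching it on a miss."""
        if key in self.tiles:
            self.hits += 1
            self.tiles.move_to_end(key)
            return self.tiles[key]
        self.misses += 1
        tile = self.fetch(key)
        self._store(key, tile)
        return tile

    def prefetch(self, keys: Iterable[Hashable]):
        """Load predicted tiles ahead of time; not counted as hits or misses."""
        for key in keys:
            if key not in self.tiles:
                self._store(key, self.fetch(key))


buffer = TileBuffer(capacity=4, fetch=lambda key: f"tile{key}")
buffer.get((0, 0))                 # miss: the user waits for the fetch
buffer.prefetch([(0, 1), (1, 0)])  # tiles a predictor expects to be needed next
buffer.get((0, 1))                 # hit: served straight from the buffer
```

The ratio of `hits` to `misses` gives a simple way to compare prefetching algorithms against each other on the same user trace.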
User Behavior Predictor (Seer)
• Learn common query sequences from user traces
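One simple way to learn common sequences from traces is to count which move tends to follow which. The Python sketch below uses a first-order Markov model for that purpose. This is a swapped-in illustrative technique, not necessarily what Seer does, and the move names and traces are made up.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def learn_transitions(traces: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each move is followed by each other move."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: Dict[str, Counter], last_move: str) -> Optional[str]:
    """Return the most frequent follow-up to last_move, or None if unseen."""
    if last_move not in counts:
        return None
    return counts[last_move].most_common(1)[0][0]


# Made-up traces of user moves, for illustration only.
traces = [
    ["zoom_in", "pan_right", "pan_right", "zoom_in"],
    ["zoom_in", "pan_right", "pan_down"],
    ["pan_down", "pan_right", "pan_right"],
]
model = learn_transitions(traces)
guess = predict_next(model, "pan_right")
```

The predicted move can then be translated into the tile keys that move would bring into view, which become the candidates to prefetch.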
Statistical Analysis Predictor
• Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns
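As a rough illustration of "statistical similarities in tiles", the Python sketch below summarizes each tile by its mean and standard deviation and ranks candidate tiles by how close their summary is to a tile the user has already viewed. The choice of statistics, the distance measure, and the sample values are all assumptions; the slides do not say which statistics the predictor uses.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def signature(values: Sequence[float]) -> Tuple[float, float]:
    """Summarize a tile by its mean and population standard deviation."""
    return (mean(values), pstdev(values))


def rank_candidates(viewed: Sequence[float], candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Order candidate tiles from most to least similar to the viewed tile."""
    target = signature(viewed)
    return sorted(
        candidates,
        key=lambda name: math.dist(target, signature(candidates[name])),
    )


# Made-up tile contents, for illustration only.
viewed_tile = [10, 12, 11, 13]
neighbors = {
    "north": [0, 0, 1, 0],
    "east": [11, 12, 12, 14],
    "south": [50, 60, 55, 65],
}
ranking = rank_candidates(viewed_tile, neighbors)
```

Here the "east" tile ranks first because its values look most like what the user was just viewing, so it would be the first candidate to prefetch.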
Using Multiple Predictors
• Run multiple predictors (or experts) in parallel
  • Compare predictions to user’s actual behavior
  • Use predictions from best performing expert
• The best expert may change over time based on the user’s goals
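The scheme of scoring each expert against the user's actual behavior and following the current leader can be sketched in Python as follows. `ExpertPool` and the two toy experts are hypothetical stand-ins, and plain hit counting is an assumed scoring rule.

```python
from typing import Callable, Dict, List, Optional

Predictor = Callable[[List[str]], Optional[str]]  # history of moves -> predicted next move


class ExpertPool:
    """Tracks each expert's hit count and follows the current leader."""

    def __init__(self, experts: Dict[str, Predictor]):
        self.experts = experts
        self.scores = {name: 0 for name in experts}

    def best(self) -> str:
        # Ties go to the expert listed first.
        return max(self.experts, key=lambda name: self.scores[name])

    def predict(self, history: List[str]) -> Optional[str]:
        return self.experts[self.best()](history)

    def observe(self, history: List[str], actual: str) -> None:
        """Score every expert against what the user actually did."""
        for name, expert in self.experts.items():
            if expert(history) == actual:
                self.scores[name] += 1


# Two toy experts, hypothetical stand-ins for the real predictors.
pool = ExpertPool({
    "repeat_last": lambda h: h[-1] if h else None,
    "always_zoom": lambda h: "zoom_in",
})
moves = ["pan_right", "pan_right", "pan_right", "zoom_in"]
for i in range(1, len(moves)):
    pool.observe(moves[:i], moves[i])
leader = pool.best()
```

Cumulative counts favor whichever expert did well early on. Scoring over a sliding window or decaying old scores would let the leader change as the user's goals shift.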
Other Challenges
• Lots of interesting problems left to address
  • Best eviction policy for the tile buffer?
  • How to share data between multiple users?
  • More predictors?
Questions?
Prefetching Experts
• User behavior predictor (Seer)
  • Learn common query sequences from user traces
• Stats analysis predictor
  • Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns