TECHNIQUES FOR VISUALIZING MASSIVE DATA SETS Leilani Battle, Mike Stonebraker


Context

[Diagram: the visualization system sends a query to the database, which returns a result.]


Problem

• Performance
  • Vis systems don’t scale well for big data
  • Or are turning into databases

• Over-plotting
  • Makes visualizations unreadable
  • Waste of time/resources


Solution: Resolution Reduction

[Diagram: a Resolution Reduction Layer sits between the visualization system and the database. Flow labels: query, queryplan query, queryplan result, modified query, reduced result.]
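
A minimal sketch of the flow above (query, query plan, modified query, reduced result), assuming the reduction is block averaging over a 2D array. The slides do not say which reduction is used; `choose_block_size`, `aggregate_blocks`, and the `max_cells` budget are hypothetical names for illustration, not ScalaR's API.

```python
import math

def choose_block_size(est_rows, est_cols, max_cells):
    """Pick the smallest block side k so the aggregated result has at most max_cells cells.

    est_rows/est_cols stand in for the size estimate read from the query plan
    (assumption: the plan exposes the expected result dimensions).
    """
    if max_cells < 1:
        raise ValueError("max_cells must be at least 1")
    k = 1
    while math.ceil(est_rows / k) * math.ceil(est_cols / k) > max_cells:
        k += 1
    return k

def aggregate_blocks(grid, k):
    """Reduce a 2D list by averaging each k-by-k block (edge blocks may be smaller)."""
    rows, cols = len(grid), len(grid[0])
    reduced = []
    for r in range(0, rows, k):
        reduced_row = []
        for c in range(0, cols, k):
            block = [grid[i][j]
                     for i in range(r, min(r + k, rows))
                     for j in range(c, min(c + k, cols))]
            reduced_row.append(sum(block) / len(block))
        reduced.append(reduced_row)
    return reduced

grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
k = choose_block_size(4, 4, max_cells=4)   # the result must fit in 4 cells
reduced = aggregate_blocks(grid, k)
```

In a real deployment the averaging would presumably be pushed into the modified query so the database returns the reduced result directly; here it runs client-side only to keep the sketch self-contained.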


ScalaR

• Scalable vis system for data exploration
• Web front-end
• Uses SciDB (www.scidb.org)

• Visualizes query results
• Performs Resolution Reduction


Demo of ScalaR


Array Browser

• Collaboration with:
  • Brown: Justin DeBrabant, Stan Zdonik, Ugur Cetintemel
  • Stanford: Zhicheng Liu, Jeff Heer

• Google Maps-style exploration experience
• Fetches subsets of the data (aka data tiles)
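
One way to picture data tiles: split the array into fixed-size squares and fetch only those that overlap the current viewport. The sketch below assumes a simple grid of `tile_size`-by-`tile_size` tiles indexed by `(tx, ty)`; the tiling scheme and the function name are illustrative assumptions, not the Array Browser's actual design.

```python
def tiles_for_viewport(x0, y0, x1, y1, tile_size):
    """Return the (tx, ty) indices of tiles overlapping the half-open viewport [x0, x1) x [y0, y1)."""
    if x1 <= x0 or y1 <= y0:
        return []  # empty viewport, nothing to fetch
    first_tx, last_tx = x0 // tile_size, (x1 - 1) // tile_size
    first_ty, last_ty = y0 // tile_size, (y1 - 1) // tile_size
    return [(tx, ty)
            for ty in range(first_ty, last_ty + 1)
            for tx in range(first_tx, last_tx + 1)]

# A 10x10 viewport at the origin with 8x8 tiles touches four tiles.
needed = tiles_for_viewport(0, 0, 10, 10, tile_size=8)
```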


Array Browser Example


Array Browser Architecture


Demo of Array Browser


Future Work: Prefetching

• Goal: Reduce user-wait time by prefetching tiles
• Cache tiles in the tile buffer
• Need algorithms to decide what to pre-fetch
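
A rough sketch of a tile buffer with prefetching. The least-recently-used eviction here is purely an assumption chosen for brevity (the slides leave the eviction policy open), and `TileBuffer` and its `fetch` callback are hypothetical names.

```python
from collections import OrderedDict

class TileBuffer:
    """Fixed-capacity tile cache; evicts the least recently used tile (assumed policy)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # hypothetical callback: tile key -> tile data
        self.tiles = OrderedDict()    # oldest entry first
        self.hits = 0
        self.misses = 0

    def _store(self, key, tile):
        self.tiles[key] = tile
        self.tiles.move_to_end(key)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)   # drop the least recently used tile

    def prefetch(self, keys):
        """Load predicted tiles ahead of time so a later get() is a hit."""
        for key in keys:
            if key not in self.tiles:
                self._store(key, self.fetch(key))

    def get(self, key):
        """Return a tile, fetching it on a miss (this is where the user would wait)."""
        if key in self.tiles:
            self.hits += 1
            self.tiles.move_to_end(key)
            return self.tiles[key]
        self.misses += 1
        tile = self.fetch(key)
        self._store(key, tile)
        return tile

buf = TileBuffer(capacity=2, fetch=lambda key: f"tile{key}")
buf.prefetch([(0, 1)])        # a predictor guessed the user will want tile (0, 1)
tile = buf.get((0, 1))        # served from the buffer, no wait
```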


User Behavior Predictor (Seer)


• Learn common query sequences from user traces
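
One simple way to learn common query sequences is a first-order Markov model over user moves: count which move tends to follow which, then predict the most frequent follower. This model, the move names, and the `train`/`predict` helpers are assumptions for illustration; the slides do not describe how the predictor works internally.

```python
from collections import Counter, defaultdict

def train(traces):
    """Count, for each move, how often each other move follows it in the traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev, nxt in zip(trace, trace[1:]):
            counts[prev][nxt] += 1
    return counts

def predict(counts, last_move):
    """Return the most frequent follower of last_move, or None if it was never seen."""
    followers = counts.get(last_move)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical traces; each is the ordered list of moves from one session.
traces = [
    ["zoom_in", "pan_right", "pan_right"],
    ["zoom_in", "pan_right", "zoom_out"],
    ["pan_right", "pan_right"],
]
model = train(traces)
next_after_zoom = predict(model, "zoom_in")
```

The predicted move can then be translated into the tile keys to prefetch.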


Statistical Analysis Predictor


• Look for statistical similarities in tiles
• Try to guess what’s important based on patterns
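
As an illustration of statistical similarity, each tile can be summarized by a small signature and candidate tiles ranked by how close their signature is to a tile the user already looked at. The (mean, standard deviation) signature and Euclidean distance below are assumptions made for the sketch; the slides do not name the statistics used.

```python
import statistics

def signature(tile):
    """Summarize a 2D tile as (mean, population standard deviation) of its values."""
    values = [v for row in tile for v in row]
    return (statistics.mean(values), statistics.pstdev(values))

def rank_candidates(visited_tile, candidates):
    """Return candidate keys ordered from most to least similar to visited_tile."""
    target = signature(visited_tile)

    def distance(key):
        mean, spread = signature(candidates[key])
        return ((mean - target[0]) ** 2 + (spread - target[1]) ** 2) ** 0.5

    return sorted(candidates, key=distance)

visited = [[1, 1], [1, 1]]
candidates = {
    "a": [[1, 1], [1, 2]],        # statistically close to the visited tile
    "b": [[10, 10], [10, 10]],    # very different mean
}
ranking = rank_candidates(visited, candidates)
```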


Using Multiple Predictors

• Run multiple predictors (or experts) in parallel
• Compare predictions to user’s actual behavior
• Use predictions from best performing expert

• May change over time based on user’s goals
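
The scheme above could be sketched as follows: every expert predicts on each step, each prediction is scored against what the user actually did, and the expert with the best recent score is trusted. The sliding-window scoring and the `ExpertSelector` class are assumptions for illustration; the window is one possible way to let the preferred expert change as the user's goals change.

```python
from collections import deque

class ExpertSelector:
    """Track recent accuracy of several predictors and follow the current best one."""

    def __init__(self, experts, window=20):
        self.experts = experts   # name -> callable(history) -> predicted next move
        self.scores = {name: deque(maxlen=window) for name in experts}
        self.pending = {}        # latest prediction from each expert

    def predict(self, history):
        # All experts predict; a real system could run them in parallel.
        self.pending = {name: expert(history) for name, expert in self.experts.items()}
        # Ties go to the expert listed first.
        best = max(self.experts, key=lambda name: sum(self.scores[name]))
        return self.pending[best]

    def observe(self, actual):
        """Score every expert's last prediction against the user's actual move."""
        for name, guess in self.pending.items():
            self.scores[name].append(1 if guess == actual else 0)

# Two toy experts (hypothetical): one always guesses a right pan, one repeats the last move.
selector = ExpertSelector({
    "always_right": lambda history: "pan_right",
    "repeat_last": lambda history: history[-1] if history else None,
})

history, chosen = [], []
for move in ["zoom_in", "zoom_in", "zoom_in"]:
    chosen.append(selector.predict(history))
    selector.observe(move)
    history.append(move)
```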


Other Challenges

• Lots of interesting problems left to address
  • Best eviction policy for the tile buffer?
  • How to share data between multiple users?
  • More predictors?


Questions?


[Figure labels: Gemini, Sagittarius, Dogs, Cats]


Prefetching Experts

• User behavior predictor (Seer)
  • Learn common query sequences from user traces

• Stats analysis predictor
  • Look for statistical similarities in tiles
  • Try to guess what’s important based on patterns