Exploring ENRON Email with NetLens

Catherine Plaisant, Benjamin B. Bederson, Hyunmo Kang, Bongshin Lee

Human-Computer Interaction Laboratory

University of Maryland

Joint Institute for Knowledge Discovery

Our research focus: Alternative UIs to Graph Visualization

how to avoid this…

Node-link diagrams have many limitations: they are hard to read, may show clusters but not much else, and do not scale well.

NetLens: Iterative Exploration of Content-Actor Network Data

User interface for exploratory search

Generalizable to a variety of data

Provide consistent interface

Easy to learn and use

Kang et al., Proc. of Visual Analytics Science and Technology Conference (VAST 06)

Kang et al., Poster/Demo at Joint Conference on Digital Libraries, 2006

NetLens: Iterative Exploration of Content-Actor Network Data

Paired networks of Content and Actors, e.g.:

Paired networks of Papers and Authors
- Papers refer to other papers
- Authors have advisors

Paired networks of Emails and People
- Emails respond to or include other emails
- People have assistants who send email for them

Paired networks of Products and Companies
- Products replace or integrate other products
- Companies are bought or merge

Content-actor model: Entity E1 and Entity E2 are connected by a relationship, and each entity also has its own self-relationship.
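The content-actor model can be sketched as a small data structure. This is a minimal illustration only, not the NetLens implementation (which the slides describe as C# with an MS Access database); the class names, method names, and sample addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EntityNetwork:
    """One side of the pair (e.g. emails or people) plus its self-relationship."""
    name: str
    items: set = field(default_factory=set)
    self_links: set = field(default_factory=set)  # (item, item) pairs

    def add_self_link(self, a, b):
        # e.g. "email a responds to email b" or "person a is assistant of b"
        self.items.update((a, b))
        self.self_links.add((a, b))


@dataclass
class ContentActorModel:
    """Two entity networks joined by a content-actor relationship."""
    content: EntityNetwork
    actors: EntityNetwork
    links: set = field(default_factory=set)  # (content_item, actor) pairs

    def link(self, content_item, actor):
        self.content.items.add(content_item)
        self.actors.items.add(actor)
        self.links.add((content_item, actor))

    def actors_of(self, content_items):
        # Move from a selection of content to the actors attached to it.
        wanted = set(content_items)
        return {a for c, a in self.links if c in wanted}

    def content_of(self, actors):
        # Move from a selection of actors to the content attached to them.
        wanted = set(actors)
        return {c for c, a in self.links if a in wanted}


# Hypothetical sample data for the Emails/People pairing.
model = ContentActorModel(EntityNetwork("emails"), EntityNetwork("people"))
model.link("email-1", "alice@example.com")
model.link("email-1", "bob@example.com")
model.link("email-2", "bob@example.com")
model.content.add_self_link("email-2", "email-1")  # email-2 responds to email-1
```

Swapping the two selection helpers back and forth mirrors the iterative exploration idea: select some content, jump to its actors, refine, and jump back.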

Examples for scientific papers:

Toward SCALABILITY

Total Enron email (non-duplicate): 249,760 emails, 87,673 people

Email: overview by year

People (addresses): overview by domain

Alternative overviews: emails by day of the week, grouped by year

People by connectance magnitude (low, medium, high)
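The overviews above are, in essence, grouped counts. The following sketch shows one way such counts could be computed; the record layout, function names, and sample data are assumptions for illustration and do not come from NetLens.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical (email id, sent date) records and address list.
emails = [
    ("e1", date(2000, 5, 1)),
    ("e2", date(2001, 5, 1)),
    ("e3", date(2001, 5, 2)),
]
addresses = ["a@example.com", "B@Example.com", "c@example.org"]


def emails_by_year(emails):
    """Count emails per year."""
    return Counter(sent.year for _, sent in emails)


def emails_by_weekday_and_year(emails):
    """Count emails per (year, day of the week)."""
    return Counter((sent.year, WEEKDAYS[sent.weekday()]) for _, sent in emails)


def people_by_domain(addresses):
    """Count addresses per domain (the part after the last '@')."""
    return Counter(addr.rsplit("@", 1)[-1].lower() for addr in addresses)


print(emails_by_year(emails))
print(emails_by_weekday_and_year(emails))
print(people_by_domain(addresses))
```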

Multiple email search capabilities

1- Keyword Search
Here, a search on “California”

2- Similarity Search
Find emails similar to one or more selected emails

Result set loaded in “My list”

(with Doug Oard’s team)
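The slides do not say how either search is computed. As a rough sketch only, keyword search can be approximated by token matching and similarity search by cosine similarity over bag-of-words vectors; this technique is swapped in for illustration and is not claimed to be the method used in NetLens. All names and sample texts below are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_search(emails, keyword):
    """Ids of emails whose body contains the keyword as a whole token."""
    kw = keyword.lower()
    return [eid for eid, body in emails.items() if kw in tokens(body)]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similar_emails(emails, selected_ids, top_n=5):
    """Rank unselected emails by similarity to the combined selection."""
    vectors = {eid: Counter(tokens(body)) for eid, body in emails.items()}
    query = Counter()
    for eid in selected_ids:
        query.update(vectors[eid])
    scored = [(cosine(query, vec), eid)
              for eid, vec in vectors.items() if eid not in selected_ids]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [eid for score, eid in scored[:top_n] if score > 0]


# Hypothetical sample emails.
emails = {
    "e1": "Power prices in California are rising",
    "e2": "California power market update",
    "e3": "Lunch on Friday?",
}
my_list = similar_emails(emails, ["e1"])  # result set, as in "My list"
```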

Social network analysis:

- Number of neighbors
- Connectance
- Centrality
- Average Path Length

- Here selected people with high connectance

With Jen Golbeck
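Some of these attributes can be illustrated on a toy graph. The sketch below covers only number of neighbors, degree centrality (one common centrality measure; the slides do not say which one is used), and average path length from a given person. Connectance is left out because the slides do not define it. The function names and sample edges are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from (person, person) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def neighbor_counts(graph):
    """Number of neighbors per person."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def degree_centrality(graph):
    """Neighbors divided by the number of other people in the graph."""
    denom = max(len(graph) - 1, 1)
    return {node: len(nbrs) / denom for node, nbrs in graph.items()}


def average_path_length(graph, source):
    """Mean shortest-path length from source to every reachable person (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    others = [d for node, d in dist.items() if node != source]
    return sum(others) / len(others) if others else 0.0


# Hypothetical chain of correspondents: ann - bob - cat - dan.
graph = build_graph([("ann", "bob"), ("bob", "cat"), ("cat", "dan")])
```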

Explanations of the meaning of the attributes

People bios, using signatures and directory info

with Jen Golbeck

Integrated phone calls

Replay

Separate conversations

Direct access to mentions of: subject, names, keywords

(with Carol Espy’s team)

Thread Summaries

- List of emails in same thread
- Access to thread
- Access to thread summary

With Bonnie Dorr and Doug Oard’s teams
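Listing the emails in the same thread presupposes some grouping of emails into threads. One simple way to do this, assuming each reply records the email it responds to, is to follow reply links up to a root. This is a sketch under that assumption and not a description of how the teams named above built their threads; `reply_to` and the sample ids are hypothetical.

```python
def thread_root(email_id, reply_to):
    """Follow reply links upward until an email with no parent is reached."""
    seen = set()
    while email_id in reply_to and email_id not in seen:
        seen.add(email_id)  # guard against accidental cycles in the data
        email_id = reply_to[email_id]
    return email_id


def group_threads(email_ids, reply_to):
    """Map each thread root to the list of emails in that thread."""
    threads = {}
    for eid in email_ids:
        threads.setdefault(thread_root(eid, reply_to), []).append(eid)
    return threads


# Hypothetical reply links: e2 answers e1, e3 answers e2, e5 answers e4.
reply_to = {"e2": "e1", "e3": "e2", "e5": "e4"}
threads = group_threads(["e1", "e2", "e3", "e4", "e5"], reply_to)
```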

TreePlus to browse subset of network connections

TreePlus: Visualizing Graphs as Trees

Plant a seed and watch it grow

Faster, more accurate, and preferred over traditional graphs for tasks that involve reading and exploration of connections

To show hidden graph structure:
- Highlight and preview of adjacent nodes
- Animated change of tree structure
- Visual hints about graph structure
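The "plant a seed" idea can be illustrated with a breadth-first tree grown from a chosen node, plus a list of the graph links that the tree layout leaves out and that would need visual hints. This is only a sketch of the general idea; it is not the TreePlus layout algorithm, and the function names, the `max_depth` parameter, and the sample graph are hypothetical.

```python
from collections import deque


def grow_tree(graph, seed, max_depth=2):
    """Breadth-first tree from seed; returns a child -> parent mapping."""
    parent = {seed: None}
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for nbr in sorted(graph[node]):
            if nbr not in parent:
                parent[nbr] = node
                depth[nbr] = depth[node] + 1
                queue.append(nbr)
    return parent


def hidden_links(graph, parent):
    """Graph edges between placed nodes that are not tree edges."""
    hidden = set()
    for node in parent:
        for nbr in graph[node]:
            if nbr in parent and parent[nbr] != node and parent[node] != nbr:
                hidden.add(frozenset((node, nbr)))
    return hidden


# Hypothetical small graph with one cycle (a-b-c).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
tree = grow_tree(graph, "a")
extra = hidden_links(graph, tree)
```

Here the tree places `b` and `c` under `a` and `d` under `b`, while the `b`-`c` link is reported as hidden structure.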

B. Lee, C.S. Parr, C. Plaisant, B.B. Bederson, V.D. Veksler, W.D. Gray, C. Kotfila (2006). TreePlus: Interactive Exploration of Networks with Enhanced Tree Layouts. To appear in TVCG Special Issue on Visual Analytics.

B. Lee, C.S. Parr, C. Plaisant, B.B. Bederson (2005). Visualizing Graphs as Trees: Plant a seed and watch it grow. Proceedings of GD 2005 (poster), LNCS, pp. 516-518.

Generalization to other datasets

e.g. NetLens for Scientific Publications (Papers and Authors)

User evaluation

Heuristic review at NIST (5 people, self-trained with video)

Usability study: 9 people, training, debriefing

Other improvements
- Improved feedback
- +++ Improvement of flow management
- Addition of My List
- Adaptive explanations of views
- Video training
- Documentation of source / processing of variables

Implementation

C# (using the Piccolo toolkit)

MS Access database

NetLens component code available on request

Conclusions - Future Directions

Conclusions
- Simple content-actor model is helpful
- Powerful yet simple
- Training about flow behavior

Continue integration with other JIKD data, e.g. entity resolution

Evaluation (case studies of analysis)

Needs for Proto Tool
- Facilitate code customization for different applications
- Flexible entity switching (to handle any choice of pairs)
- Usability

Thank You

plaisant@cs.umd.edu (301) 405-2768
bederson@cs.umd.edu (301) 405-2764

NetLens: www.cs.umd.edu/hcil/netlens
TreePlus: www.cs.umd.edu/hcil/treeplus

Papers and video demonstrations available from website. Source code available on request.

OTHER relevant HCIL projects

Temporal Data (Categorical): PatternFinder for Patient History Search

Fails, Karlson, Shahamat & Shneiderman, VAST 2006

Systematic & Flexible Network Exploration with SocialAction

Clustering shows grouping

Abstraction reveals relationships

Perer & Shneiderman, InfoVis 2006

Network Visualization with Semantic Substrates

• Meaningful layout of nodes

• User-controlled visibility of links

Shneiderman & Aris, InfoVis 2006