Overview of the NodeXL project (Network Overview, Discovery and Exploration) that adds social network metrics and visualization features to Excel 2007. Contains updated images from version .84 of the NodeXL project.
NodeXL: Network overview, discovery and exploration for Microsoft Excel 2007
http://www.codeplex.com/nodexl
Dan Fay (Microsoft Research - Redmond)
Cody Dunne (U Maryland)
Marc Smith (Telligent)
Vladimir Barash (MSR Silicon Valley/Cornell)
Tony Capone (Microsoft Research - Redmond)
Natasa Milic-Frayling (Microsoft Research - Cambridge)
Eduarda Mendes Rodrigues (Microsoft Research - Cambridge)
Eric Gleave (U Washington)
Adam Perer (U Maryland)
Ben Shneiderman (U Maryland)
The NodeXL Team
Problem: No network chart in Excel
NodeXL: Network analysis and visualization tool
• Cyclic Graph data structures have limited support in existing Office tools
• Network analysis is of growing importance in academic, commercial, and Internet social media contexts
• Existing network analysis tools have command line interfaces or demand steep learning curves
• Many network data sets already live in Excel!
NodeXL: Goal: Make SNA easier
• Existing Social Network Tools are challenging for many novice users
• Tools like Excel are widely used
• Leveraging a spreadsheet as a host for SNA lowers barriers to network data analysis and display
Social Network Analysis Toolkit
Tools to support the study of the social network structure of social media and other directed graph structures
Sociologist
“What are the structures of communication in scientific discussions?”
User Experience / Information Visualization
“What are the best UI/UX workflows for network analysis tools?”
Computer Scientist / Algorithmicist for Social Network Measures
“What are the measures and algorithms needed for understanding networks?”
The Ties that Blind?
Reply-To Network
Network at distance 2 for the most prolific author of the microsoft.public.internetexplorer.general newsgroup
The Ties that Blind?
Darwin Bell
Pajek without modification can sometimes reveal structures of great interest.
The Ties that Blind?
Two “answer people” with an emerging 3rd.
Mapping Newsgroup Social Ties
microsoft.public.windowsxp.server.general
Distinguishing attributes:
• Answer person
  - Outward ties to local isolates
  - Relative absence of triangles
  - Few intense ties
• Reply Magnet
  - Ties from local isolates often inward only
  - Sparse, few triangles
  - Few intense ties
Distinguishing attributes:
• Answer person
  - Outward ties to local isolates
  - Relative absence of triangles
  - Few intense ties
• Discussion person
  - Ties from local isolates often inward only
  - Dense, many triangles
  - Numerous intense ties
Clear and consistent signatures of an “Answer Person”
• Light touch to numerous threads initiated by someone else
• Most ties are outward to local isolates
• Many more ties to small fish than big fish
Roles Project
• Using Netscan data to derive social roles in Usenet
• Next steps: quantify & explore in more depth
Answer Person, microsoft.public.windows.server.general
Discussion, rec.kites
Flame, alt.flame
Social Support, alt.support.divorce
Published in HICSS, JCMC, JoSS, IEEE Internet Communications (special issue on Social Networks)
NodeXL: Network Overview, Discovery and Exploration for Excel
• Leverage spreadsheet for storage of edge and vertex data
The NodeXL project is available via the CodePlex Open Source Project Hosting Site:
http://www.codeplex.com/nodexl
NodeXL workflow
data importation > processing > calculation > refinement > a network graph that tells a useful story
These steps include:
• Import data from several sources and file formats
• Scrub data: Merge duplicate edges
• Calculate network metrics
• Insert sub-graph images
• Auto-fill columns (and map data to display attributes):
  - Set shape, color, opacity, size, and label/tooltip
• Create clusters
• Show graph
• Read workbook
• Adjust layout
• Layout Again
• Dynamic Filters: selectively hide edges and nodes
NodeXL: Import data from multiple sources:
• Multiple network “spigots” provide edge lists from several common sources and data formats.
Social media platforms are a source of multiple social network data sets:
“Friends”, “Replies”, “Follows”, “Comments”, “Reads”, “Co-edits”, “Co-mentions”, “Hybrids”
Export data to alternate file formats:
Prepare data for analysis
NodeXL: Import edges from other spreadsheets
• Map data columns from existing spreadsheets
NodeXL: Merge Duplicate Edges (if any)
• Aggregate duplicate edges and add a “Tie Strength Column” to store the count of “duplicates” (edges could be from multiple time slices).
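The merge step can be sketched outside Excel as follows. This is a minimal illustration, not NodeXL's own code: the function name `merge_duplicate_edges` and the sample edges are hypothetical, and the edges are assumed to be directed, so (a, b) and (b, a) stay separate.

```python
from collections import Counter

def merge_duplicate_edges(edges):
    """Collapse repeated (source, target) pairs into one row plus a count.

    The count plays the role of the "Tie Strength" column described above.
    Direction is kept: ("ann", "bob") and ("bob", "ann") are different edges.
    """
    counts = Counter(edges)
    # Counter preserves first-seen order of keys (Python 3.7+).
    return [(src, dst, n) for (src, dst), n in counts.items()]

# Hypothetical sample data: three replies from ann to bob, one from bob to cat.
edges = [("ann", "bob"), ("ann", "bob"), ("bob", "cat"), ("ann", "bob")]
merged = merge_duplicate_edges(edges)
for row in merged:
    print(row)
```

Each output row is (source, target, tie strength), which is the shape of an edge list with one extra column.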
NodeXL: Calculate Network Analytics and Metrics
• Starter library of basic network measures
• Users may unselect resource intensive measures
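As an illustration of one of the simplest measures such a library would hold, the sketch below computes in-degree and out-degree per vertex from a directed edge list. The function name `degree_table` and the sample data are assumptions made for this example; the slides do not list which measures NodeXL ships or how it computes them.

```python
from collections import defaultdict

def degree_table(edges):
    """Return {vertex: (in_degree, out_degree)} for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1  # edge leaves src
        in_deg[dst] += 1   # edge arrives at dst
    vertices = sorted(set(in_deg) | set(out_deg))
    return {v: (in_deg[v], out_deg[v]) for v in vertices}

# Hypothetical sample data.
table = degree_table([("ann", "bob"), ("ann", "cat"), ("bob", "cat")])
for vertex, (indeg, outdeg) in table.items():
    print(vertex, indeg, outdeg)
```

Degree needs only one pass over the edge list, which is why it is cheap compared with measures that require path computations.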
NodeXL: Insert network sub-graph images
• Create “ego-centric” networks for each node in the network
• Select number of degrees out to include
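An ego-centric network at a chosen number of degrees can be extracted with a breadth-first search. The sketch below is an assumption-laden illustration rather than NodeXL's implementation: `ego_network` is a hypothetical name, and distance is measured ignoring edge direction.

```python
from collections import defaultdict, deque

def ego_network(edges, ego, max_distance):
    """Return the edges among vertices within max_distance steps of ego.

    Assumption: ties are treated as undirected when measuring distance.
    """
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        v = queue.popleft()
        if distance[v] == max_distance:
            continue  # do not expand past the requested radius
        for w in neighbors[v]:
            if w not in distance:
                distance[w] = distance[v] + 1
                queue.append(w)

    kept = set(distance)
    return [(s, d) for s, d in edges if s in kept and d in kept]

# Hypothetical chain a-b-c-d-e; the ego "a" at 2 degrees out reaches b and c.
chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
sub = ego_network(chain, "a", 2)
print(sub)
```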
NodeXL: Display nodes with subgraph images sorted by network attributes using Excel Data|Sort
NodeXL: Get reports of global network metrics
Metric                  Value
Graph Type              Directed
Unique Edges            7,852
Edges With Duplicates   0
Total Edges             7,852
Self-Loops              10
Vertices                174
Graph Density           0.260514259
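The reported density can be checked against the other rows. The slides do not state the formula, so the one below is an inference from the numbers: for a directed graph, density is taken as non-loop edges divided by the number of ordered vertex pairs, n(n-1). The function name `directed_density` is hypothetical.

```python
def directed_density(total_edges, self_loops, vertices):
    """Density of a directed graph, with self-loops excluded (an assumption)."""
    possible = vertices * (vertices - 1)  # ordered pairs of distinct vertices
    return (total_edges - self_loops) / possible

# Values taken from the metrics report above.
density = directed_density(total_edges=7852, self_loops=10, vertices=174)
print(round(density, 4))
```

With these inputs the result is (7852 - 10) / (174 × 173) = 7842 / 30102, which agrees with the reported 0.2605 to about seven decimal places. Leaving the 10 self-loops in would give a visibly larger value, which is why the exclusion is assumed here.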
NodeXL: Display whole graph
• Toggle display of whole graph display pane with Show/Hide Graph Pane
NodeXL: Create a new whole graph display
• Select “Read Workbook” to load the graph into the Display Pane.
• The title “Document Actions” is imposed by Excel
NodeXL: Viewing the whole graph
NodeXL: Using Dynamic Filters to simplify the graph
• Each data column (including dates) associated with an edge or vertex is exposed with a slider filter.
• Filtered nodes and edges turn gray or become invisible
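The effect of one slider can be sketched as a range test on a single column. This is an illustrative sketch only: `apply_range_filter`, the row dictionaries, and the `tie_strength` column are hypothetical names, and NodeXL's actual filter logic is not described in the slides.

```python
def apply_range_filter(rows, column, low, high):
    """Split rows into (visible, hidden) by whether row[column] is in [low, high].

    Works for any comparable values, such as numbers or datetime.date objects.
    """
    visible, hidden = [], []
    for row in rows:
        if low <= row[column] <= high:
            visible.append(row)
        else:
            hidden.append(row)  # these would be grayed out or made invisible
    return visible, hidden

# Hypothetical edge rows with a tie-strength column.
edge_rows = [
    {"source": "ann", "target": "bob", "tie_strength": 3},
    {"source": "bob", "target": "cat", "tie_strength": 1},
    {"source": "cat", "target": "ann", "tie_strength": 7},
]
visible, hidden = apply_range_filter(edge_rows, "tie_strength", 2, 5)
print(len(visible), len(hidden))
```

Returning the hidden rows as well as the visible ones mirrors the slide's point that filtered items are dimmed rather than deleted from the workbook.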
NodeXL: Apply dynamic filters to the data
NodeXL: Map data to display attributes
• Map Edge and Vertex attributes to size, width, color, opacity, and shape
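One common way to map a data column onto a display attribute such as vertex size is linear rescaling into a target range. The sketch below assumes that approach; the slides do not say which mapping NodeXL uses, and `scale_to_range` is a hypothetical helper.

```python
def scale_to_range(values, out_min, out_max):
    """Linearly rescale values so the smallest maps to out_min, the largest to out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: fall back to the midpoint of the output range.
        return [(out_min + out_max) / 2 for _ in values]
    span = out_max - out_min
    return [out_min + (v - lo) * span / (hi - lo) for v in values]

# Hypothetical in-degree values mapped to vertex sizes between 10 and 50.
sizes = scale_to_range([1, 5, 9], 10.0, 50.0)
print(sizes)
```

The same helper could feed width or opacity by changing the output range; color and shape would need a categorical or gradient mapping instead.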
NodeXL: Decorated Network Graphs
NodeXL: filtered, decorated, labeled networks
NodeXL: Clustered networks
NodeXL: Add URLs to Right-click menu of Nodes
NodeXL: Filtered clusters
Right-click the canvas to control attributes of selected nodes
NodeXL: Import social networks from email
Systematic Yet Flexible Network Analysis Tasks
1. Overall network metrics (e.g. number of nodes, number of edges, density, diameter)
2. Node rankings (e.g. degree, betweenness, closeness centrality)
3. Edge rankings (e.g. weight, betweenness centrality)
4. Node rankings in pairs (e.g. degree vs. betweenness, plotted on a scattergram)
5. Edge rankings in pairs
6. Cohesive subgroups (e.g. finding communities in networks)
7. Multiplexity (e.g. analyzing comparisons between different edge types, such as friends vs. enemies)
Shneiderman, Perer, Dunne
[Figure: horizontal axis Scale (Micro to Macro), vertical axis Time (Micro to Macro). Cells: Whole Graph / All Time; Sub Graph / Narrow Time Slice; Sub Graph / All Time; Whole Graph / Narrow Time Slice]
Filtering Network Diagrams
Random Layout
Fruchterman-Reingold Layout (Dense)
Fruchterman-Reingold Layout (Loose)
Random Layout (Decorated)
X = In-degree, Y = Out-degree
NodeXL Next Steps
• Enhanced layout controls
  - Smart selection of nodes
• Clustering and composite nodes
  - Add/remove a node to/from a cluster
  - Add/remove a node to/from a composite
• Add social network data sources:
  - Twitter, YouTube, Facebook, Outlook, Messenger, etc.
NodeXL Partnerships and community
• University of Maryland
• Northwestern University
• Ohio University
• Stanford University
• University of Pennsylvania
7,000+ downloads on CodePlex
NodeXL User tasks and goals