Transcript
Page 1: Networkx & Gephi Tutorial #Pydata NYC

Networkx & Gephi Tutorial #pydata

Gilad Lotan | @gilgul

Page 6: Networkx & Gephi Tutorial #Pydata NYC

#gayrights, #lgbt, #jesus, #flipflop, #jobs, #economy

#palestine, #OWS, #immigration,#abortion

#republican, #dems, #economics, #amnesty

Page 7: Networkx & Gephi Tutorial #Pydata NYC

#Debates / Ohio

Page 8: Networkx & Gephi Tutorial #Pydata NYC

#Debates / Ohio

Politicos

OSU Students

Ohio based Media

Page 9: Networkx & Gephi Tutorial #Pydata NYC

• Node network properties

  – from immediate connections

    • indegree: how many directed edges (arcs) point into a node

    • outdegree: how many directed edges (arcs) originate at a node

    • degree (in or out): number of edges incident on a node

  – from the entire graph

    • centrality (betweenness, closeness)

outdegree=2

indegree=3

degree=5

Source: Lada Adamic (SI508-F08)
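
The three degree values labeled on this slide can be reproduced with a minimal networkx sketch. The node names ("v", "a" through "e") are made up for illustration and are not taken from the slide's figure.

```python
import networkx as nx

# Hypothetical directed graph: node "v" matches the slide's labels.
G = nx.DiGraph()

# Three arcs point into "v" (indegree = 3).
G.add_edges_from([("a", "v"), ("b", "v"), ("c", "v")])

# Two arcs originate at "v" (outdegree = 2).
G.add_edges_from([("v", "d"), ("v", "e")])

print(G.in_degree("v"))   # 3
print(G.out_degree("v"))  # 2
print(G.degree("v"))      # 5 (in + out)
```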

Page 10: Networkx & Gephi Tutorial #Pydata NYC

Example Graph Types

• Complete Graph

• Bipartite Graph

  – Vertices can be divided into two disjoint sets, with every edge joining a vertex in one set to a vertex in the other

  – Ex: students & schools
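
Both graph types can be built in a few lines of networkx. This is only a sketch: the student and school names are invented, and the `bipartite` node attribute is a common networkx convention rather than something the slides specify.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Complete graph: every pair of the 5 nodes is connected.
K5 = nx.complete_graph(5)
print(K5.number_of_edges())  # 10

# Bipartite graph: students on one side, schools on the other.
# All names below are hypothetical.
students = ["ana", "ben", "cho"]
schools = ["school_x", "school_y"]

B = nx.Graph()
B.add_nodes_from(students, bipartite=0)
B.add_nodes_from(schools, bipartite=1)
# Edges only ever join a student to a school.
B.add_edges_from([
    ("ana", "school_x"),
    ("ben", "school_y"),
    ("cho", "school_x"),
    ("cho", "school_y"),
])

print(bipartite.is_bipartite(B))   # True
print(bipartite.is_bipartite(K5))  # False: K5 contains triangles
```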

Page 12: Networkx & Gephi Tutorial #Pydata NYC

Social Network Attributes

• Scale Free

  – Degree distribution follows a power law

  – Barabasi et al ('99): mapped the topology of a portion of the web

• Small World

  – Most nodes are not neighbors, but can be reached by a small number of hops

  – Watts & Strogatz ('98)

  – Properties: cliques, sub networks with high clustering coefficient, most pairs of nodes connected by at least one short path
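
To get a feel for these two attributes, networkx ships random-graph generators for both models. The sketch below uses arbitrary sizes and parameters chosen only for illustration; it is not a reconstruction of anything shown in the talk.

```python
import networkx as nx

# Scale-free: Barabasi-Albert preferential attachment.
# n and m are arbitrary illustration values.
ba = nx.barabasi_albert_graph(n=300, m=2, seed=42)
degrees = [d for _, d in ba.degree()]
# A few hubs have a far higher degree than the typical node.
print("BA max degree:", max(degrees))
print("BA median degree:", sorted(degrees)[len(degrees) // 2])

# Small-world: Watts-Strogatz ring lattice with random rewiring.
# The "connected_" variant retries until the graph is connected,
# so the average path length is well defined.
ws = nx.connected_watts_strogatz_graph(n=300, k=6, p=0.1, seed=42)
print("WS average clustering:", nx.average_clustering(ws))
print("WS average shortest path:", nx.average_shortest_path_length(ws))
```

High clustering together with a short average path is the combination the small-world bullet describes.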

Page 13: Networkx & Gephi Tutorial #Pydata NYC

(Zachary) Karate club graph

Social network of friendships between 34 members of a karate club at a US university in the 1970s.

Standard test network for clustering algorithms: during the observation period the club broke up into two separate clubs over a conflict.
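
networkx bundles this dataset, so it can be loaded directly. A small sketch: the `club` node attribute records which of the two groups each member joined after the split.

```python
from collections import Counter

import networkx as nx

G = nx.karate_club_graph()
print(G.number_of_nodes(), G.number_of_edges())  # 34 78

# Each node carries a "club" attribute naming its faction after the split.
factions = Counter(nx.get_node_attributes(G, "club").values())
print(factions)
```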

Page 14: Networkx & Gephi Tutorial #Pydata NYC

Graph Measures

• Centrality

  – Betweenness

  – Closeness

  – Eigenvector

  – Degree

• Clustering Coefficient (clique)

• Modularity
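
Each of these measures has a counterpart in networkx. The sketch below runs them on the karate club graph. The greedy modularity algorithm is one community-detection choice among several and is an assumption here; the slides do not say which method was used.

```python
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()

# Centrality measures: each returns a dict of node -> score.
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
closeness = nx.closeness_centrality(G)
eigenvector = nx.eigenvector_centrality(G)

top_degree = max(degree, key=degree.get)
top_betweenness = max(betweenness, key=betweenness.get)
print("highest degree centrality:", top_degree)
print("highest betweenness centrality:", top_betweenness)

# Clustering coefficient: how close each node's neighbors are to a clique.
print("average clustering:", nx.average_clustering(G))

# Modularity: score a partition of the graph into communities.
parts = community.greedy_modularity_communities(G)
print("communities found:", len(parts))
print("modularity:", community.modularity(G, parts))
```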

Page 15: Networkx & Gephi Tutorial #Pydata NYC

Graph Layout

• OpenOrd

  – Better distinguishes clusters

• Yifan Hu

• Force Atlas

• Fruchterman Reingold

  – Graph as a system of mass particles (nodes: particles, edges: springs)
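
The layouts listed above are applied inside Gephi, but the Fruchterman Reingold idea can also be tried from Python: networkx's `spring_layout` uses the Fruchterman-Reingold force-directed algorithm. A minimal sketch, with an arbitrary seed and iteration count:

```python
import networkx as nx

G = nx.karate_club_graph()

# spring_layout treats edges as springs pulling nodes together while
# nodes repel each other; it returns a dict of node -> (x, y) position.
pos = nx.spring_layout(G, iterations=50, seed=7)

for node in list(G.nodes())[:3]:
    x, y = pos[node]
    print(node, round(float(x), 3), round(float(y), 3))
```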

Page 16: Networkx & Gephi Tutorial #Pydata NYC

Networkx

Page 17: Networkx & Gephi Tutorial #Pydata NYC

Graph Generators
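
The original slide's content did not survive in this transcript. As a rough illustration of what networkx graph generators look like, here are three standard ones, with sizes chosen arbitrarily:

```python
import networkx as nx

# Deterministic generators.
path = nx.path_graph(5)   # 5 nodes in a line, 4 edges
star = nx.star_graph(4)   # 1 hub plus 4 leaves, 4 edges

# Random generator: each possible edge is included with probability p.
random_g = nx.erdos_renyi_graph(n=50, p=0.1, seed=1)

print(path.number_of_nodes(), path.number_of_edges())  # 5 4
print(star.number_of_nodes(), star.number_of_edges())  # 5 4
print(random_g.number_of_nodes(), random_g.number_of_edges())
```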

Page 18: Networkx & Gephi Tutorial #Pydata NYC

Generate Twitter Graph
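
The transcript does not preserve how the Twitter graph was generated. One plausible approach, sketched below under the assumption that edges represent mentions, is a weighted directed graph built from (author, mentioned user) pairs. The pairs and usernames are invented for illustration.

```python
import networkx as nx

# Hypothetical input: (author, mentioned_user) pairs extracted from tweets.
mentions = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
]

G = nx.DiGraph()
for src, dst in mentions:
    if G.has_edge(src, dst):
        # Repeated mentions increase the edge weight.
        G[src][dst]["weight"] += 1
    else:
        G.add_edge(src, dst, weight=1)

print(G["alice"]["bob"]["weight"])  # 2
print(G.in_degree("bob"))           # 2 distinct users mention bob
```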

Page 20: Networkx & Gephi Tutorial #Pydata NYC

graphml file

nodes

edges
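
A GraphML file stores node and edge records along with their attributes, and Gephi can open it directly. The sketch below writes a tiny graph with `nx.write_graphml` and reads it back. The node names, attributes, and file name are placeholders.

```python
import os
import tempfile

import networkx as nx

# Tiny hypothetical graph with one node attribute and one edge attribute.
G = nx.DiGraph()
G.add_node("alice", lang="en")
G.add_node("bob", lang="ja")
G.add_edge("alice", "bob", weight=2)

# Write to a temporary location; in practice this is the file opened in Gephi.
path = os.path.join(tempfile.mkdtemp(), "example.graphml")
nx.write_graphml(G, path)

# Reading the file back shows that attributes survive the round trip.
H = nx.read_graphml(path)
print(H.nodes["alice"]["lang"])     # en
print(H["alice"]["bob"]["weight"])  # 2
```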

Page 21: Networkx & Gephi Tutorial #Pydata NYC

Twitter Users with Python in their Bios

• 2 days of Twitter data (Oct 24th and 25th)

• Total: 4246 users (62k tweets)

• @mikanyan1 tweeted 795 times

Page 22: Networkx & Gephi Tutorial #Pydata NYC

Pythonistas on Twitter

Page 23: Networkx & Gephi Tutorial #Pydata NYC

Pythonistas on Twitter

English / European

Japanese

Python (the snake)

Chinese

Spanish Speakers

Musicians, Artists

Page 25: Networkx & Gephi Tutorial #Pydata NYC

Twitter User Community: Data Science

• Grepped from Twitter bios over 1 week: "data science|data scientist|machine learning|data strateg"

• 1053 Users

• 14k Tweets

• Most tweeting users:

  – @data_nerd (659)

  – @Chantel_Esworth (562)

  – @Da5_12 (253)
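
The bio filter above can be approximated with Python's `re` module. This is a sketch only: the sample bios and usernames are invented, and case-insensitive matching is an assumption, since the slide does not say how the grep was run.

```python
import re

# Pattern copied from the slide; IGNORECASE is an assumption.
PATTERN = re.compile(
    r"data science|data scientist|machine learning|data strateg",
    re.IGNORECASE,
)

# Hypothetical bios keyed by username.
bios = {
    "user_a": "Data Scientist and weekend runner",
    "user_b": "Python fan, coffee drinker",
    "user_c": "Head of data strategy",
}

matched = [user for user, bio in bios.items() if PATTERN.search(bio)]
print(matched)  # ['user_a', 'user_c']
```

The truncated alternative `data strateg` matches both "data strategy" and "data strategist".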

Page 26: Networkx & Gephi Tutorial #Pydata NYC

Dataists on Twitter

Page 27: Networkx & Gephi Tutorial #Pydata NYC

Thank You

Gilad Lotan

Twitter: @gilgul

Github: giladlotan