francisco-restivo
SSIIM, 2015/09/22 2
Topics
• Graphs and social networks
• Some metrics
• Communities
• Dynamics
• Fraud
• Software
• etc.
Zachary karate club
Networks
• Networks are everywhere
• Social, biological, financial, etc.
• Complex networks
• Communities reveal properties of networks
• Contagion
• Controversies
Euler 1707 - 1783
Social networks
• Internet changed everything
• Social interactions
• Sharing
• e-Commerce
• Payments
• Digital marketing
• Political marketing
• etc.
Network growth
How teens communicate
Basics of graphs and networks
• G = (V, E)
• O(G) = |V| order
• S(G) = |E| size
• A adjacency matrix
• k_i degree of vertex i
• Directed/undirected
Representation of networks
• Matrices, graphs, edge lists, etc.
Adjacency matrix (row = source, column = target):
  A B C D E
A 0 1 1 1 0
B 1 0 1 0 1
C 0 0 0 1 0
D 0 1 1 0 0
E 1 1 0 0 0

Edge list (source target):
A B
A C
A D
B A
B C
B E
C D
D B
D C
E A
E B
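The two representations above carry the same information. As a minimal sketch in plain Python (the helper names `adjacency_matrix` and `edge_list` are illustrative choices, not part of the slides), the conversion in both directions and the degree counts can be written as:

```python
# Directed edge list from the slide, one (source, target) pair per edge.
NODES = ["A", "B", "C", "D", "E"]
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "A"), ("B", "C"), ("B", "E"),
    ("C", "D"),
    ("D", "B"), ("D", "C"),
    ("E", "A"), ("E", "B"),
]


def adjacency_matrix(nodes, edges):
    """Return a list-of-lists matrix with entry [i][j] = 1 if i -> j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def edge_list(nodes, matrix):
    """Inverse conversion: read the edges back out of the matrix, row by row."""
    return [
        (nodes[i], nodes[j])
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value
    ]


A = adjacency_matrix(NODES, EDGES)

# Row sums give out-degrees, column sums give in-degrees.
out_degree = {name: sum(A[i]) for i, name in enumerate(NODES)}
in_degree = {name: sum(row[j] for row in A) for j, name in enumerate(NODES)}

for name, row in zip(NODES, A):
    print(name, *row)
print("out-degree:", out_degree)
print("in-degree:", in_degree)
```

The matrix costs |V|² cells whatever the number of edges, while the edge list grows with |E|, which is why edge lists are the usual exchange format for large sparse networks.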
Global metrics
• Number of vertices 5
• Number of edges 11
• Number of components 1
• Diameter 2
• Density 0.55
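These figures can be checked against the five-node example graph. The sketch below uses only the standard library; the function names are illustrative, and it assumes the graph is directed with no self-loops, so density is |E| / (|V| (|V| - 1)). One observation from running the numbers: the diameter of 2 is reproduced when edge direction is ignored, whereas following edge directions the longest shortest path appears to be 3 (C → D → B → A).

```python
from collections import deque

NODES = ["A", "B", "C", "D", "E"]
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "A"), ("B", "C"), ("B", "E"),
    ("C", "D"),
    ("D", "B"), ("D", "C"),
    ("E", "A"), ("E", "B"),
]


def neighbours(nodes, edges, directed):
    """Adjacency sets; when directed is False each edge is usable both ways."""
    adj = {n: set() for n in nodes}
    for source, target in edges:
        adj[source].add(target)
        if not directed:
            adj[target].add(source)
    return adj


def bfs_distances(adj, start):
    """Hop counts from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(adj):
    """Longest shortest path; assumes every node can reach every other."""
    return max(max(bfs_distances(adj, n).values()) for n in adj)


def component_count(adj):
    """Number of connected components of an undirected adjacency structure."""
    seen, count = set(), 0
    for node in adj:
        if node not in seen:
            seen |= set(bfs_distances(adj, node))
            count += 1
    return count


n, m = len(NODES), len(EDGES)
density = m / (n * (n - 1))  # directed graph without self-loops
undirected = neighbours(NODES, EDGES, directed=False)
directed = neighbours(NODES, EDGES, directed=True)

print("vertices:", n, "edges:", m, "density:", density)
print("components (ignoring direction):", component_count(undirected))
print("diameter (ignoring direction):", diameter(undirected))
print("diameter (following direction):", diameter(directed))
```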
Common Tasks
• Measuring “importance”
  – Centrality, prestige (incoming links)
• Link prediction
• Diffusion modeling
  – Epidemiological
• Clustering
  – Blockmodeling, Girvan-Newman, Chinese whispers
• Structure analysis
  – Motifs, isomorphisms, etc.
• Visualization/Privacy/etc.
(from Eytan Adar's slides)
Centrality Measures
• Degree centrality
  – Edges per node (the more, the more important the node)
• Closeness centrality
  – How close the node is to every other node
• Betweenness centrality
  – How many shortest paths go through the node or edge (communication metaphor)
• Information centrality
  – All paths to other nodes weighted by path length
• Bibliometric + Internet style
  – PageRank
(from Eytan Adar's slides)
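Three of these measures are short enough to sketch in plain Python on the five-node example graph used earlier in the deck. This is an illustration under stated assumptions rather than a reference implementation: in-degree stands in for prestige, closeness is computed as (n - 1) divided by the sum of distances with edge direction ignored, and PageRank is a basic power iteration with a damping factor of 0.85 (a conventional value, not one given in the slides). Betweenness and information centrality are left out to keep the example small.

```python
from collections import deque

NODES = ["A", "B", "C", "D", "E"]
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "A"), ("B", "C"), ("B", "E"),
    ("C", "D"),
    ("D", "B"), ("D", "C"),
    ("E", "A"), ("E", "B"),
]


def in_degree(nodes, edges):
    """Prestige as the number of incoming links."""
    counts = {n: 0 for n in nodes}
    for _, target in edges:
        counts[target] += 1
    return counts


def closeness(nodes, edges, node):
    """(n - 1) / sum of shortest-path distances, ignoring edge direction."""
    adj = {n: set() for n in nodes}
    for source, target in edges:
        adj[source].add(target)
        adj[target].add(source)
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for nxt in adj[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power iteration; a node without out-links spreads its rank evenly."""
    out = {n: [] for n in nodes}
    for source, target in edges:
        out[source].append(target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for source in nodes:
            targets = out[source] or nodes
            share = damping * rank[source] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank


prestige = in_degree(NODES, EDGES)
close = {n: closeness(NODES, EDGES, n) for n in NODES}
ranks = pagerank(NODES, EDGES)

for n in NODES:
    print(n, prestige[n], round(close[n], 3), round(ranks[n], 3))
```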
Champions League PageRank
Community detection
• Communities and clusters are different
• Network data is related to graph properties
• Real world data is big
Community detection algorithms
• Clauset-Newman-Moore
• Wakita-Tsurumi
• Girvan-Newman
• Chinese whispers
• Link communities
• etc.
Modularity
• Compares the number of edges inside communities with the number expected in a random network with the same degree sequence
• Maximizing Q is NP-hard
$$Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - P_{ij} \right) \delta(g_i, g_j)$$
$$P_{ij} = \frac{k_i k_j}{2m}$$
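To make the definition concrete, here is a small sketch that evaluates Q directly as the double sum over node pairs in the same community, with the random-network term taken as k_i k_j / (2m). It assumes a simple undirected graph (no self-loops, no repeated edges); the toy graph of two triangles joined by one edge and the function name `modularity` are illustrative choices, not taken from the slides.

```python
def modularity(edges, community):
    """Q = 1/(2m) * sum over i,j of (A_ij - k_i*k_j/(2m)) * delta(g_i, g_j).

    edges: list of undirected (u, v) pairs, each edge listed once.
    community: dict mapping every node to its community label g_i.
    """
    nodes = sorted(community)
    m = len(edges)
    adjacent = set()
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
        degree[u] += 1
        degree[v] += 1
    total = 0.0
    for i in nodes:
        for j in nodes:
            if community[i] != community[j]:
                continue  # delta(g_i, g_j) = 0
            a_ij = 1.0 if (i, j) in adjacent else 0.0
            total += a_ij - degree[i] * degree[j] / (2.0 * m)
    return total / (2.0 * m)


# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
TOY_EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

split = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
lumped = {n: "all" for n in range(6)}

q_split = modularity(TOY_EDGES, split)
q_lumped = modularity(TOY_EDGES, lumped)
print("two communities:", q_split)
print("one community:", q_lumped)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while putting every node in one community gives Q = 0, which is the sense in which a good partition scores above the random baseline. The double loop is quadratic in the number of nodes and is only meant to mirror the formula.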
Clauset-Newman-Moore
A hierarchical agglomeration algorithm for detecting community structure that is faster than many competing algorithms. Its running time on a network with n vertices and m edges is O(md log n), where d is the depth of the dendrogram describing the community structure.
Wakita-Tsurumi
The CNM algorithm does not scale well, and its use is practically limited to networks of up to 500,000 nodes. A simple heuristic that attempts to merge community structures in a balanced manner can dramatically improve community structure analysis.
Girvan-Newman
Many networks have the property of community structure: network nodes are joined together in tightly knit groups, between which there are only looser connections. Girvan and Newman propose a method for detecting such communities, built around the idea of using centrality indices to find community boundaries.
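One common way to realize this idea is to compute edge betweenness, remove the edge with the highest score, and repeat until the graph falls apart. The sketch below performs a single such split on an unweighted, undirected graph with no self-loops. It is an illustration only: the function names and the toy graph (two triangles joined by a bridge) are assumptions made here, and a full run would recompute betweenness after every removal and keep splitting to build a dendrogram.

```python
from collections import deque


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_betweenness(adj):
    """Number of shortest paths crossing each edge (Brandes-style counting)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search that counts shortest paths from source.
        dist, paths = {source: 0}, {source: 1.0}
        parents = {n: [] for n in adj}
        order, queue = [], deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    paths[nxt] = 0.0
                    queue.append(nxt)
                if dist[nxt] == dist[node] + 1:
                    paths[nxt] += paths[node]
                    parents[nxt].append(node)
        # Walk back from the farthest nodes, passing path credit to parents.
        credit = {n: 0.0 for n in order}
        for node in reversed(order):
            for parent in parents[node]:
                flow = paths[parent] / paths[node] * (1.0 + credit[node])
                score[frozenset((parent, node))] += flow
                credit[parent] += flow
    # Every undirected path was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(adj):
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in group:
                    group.add(nxt)
                    queue.append(nxt)
        seen |= group
        groups.append(group)
    return groups


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = build_adjacency(edges)
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


TOY_EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
groups = girvan_newman_split(TOY_EDGES)
print(sorted(sorted(g) for g in groups))
```

In the toy graph all nine shortest paths between the two triangles cross the bridge 2-3, so it has the highest betweenness and is removed first.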
Chinese whispers [Biemann]
A randomized graph-clustering algorithm that runs in time linear in the number of edges. It can be viewed as a simulation of an agent-based social network.
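A minimal sketch of the usual formulation follows: every node starts in its own class, and in each pass the nodes are visited in random order, each adopting the class with the largest total edge weight among its neighbours, with ties broken at random. The iteration count, the seed, the function name, and the toy graph are assumptions made for illustration. Because the procedure is randomized, different seeds can give different clusterings.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=20, seed=0):
    """Label nodes of a weighted undirected graph; edges are (u, v, weight)."""
    rng = random.Random(seed)
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    label = {node: node for node in adj}  # each node starts in its own class
    nodes = list(adj)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for node in nodes:
            # Sum edge weights per class over the node's neighbours.
            weight = defaultdict(float)
            for nbr, w in adj[node].items():
                weight[label[nbr]] += w
            best = max(weight.values())
            winners = [c for c, total in weight.items() if total == best]
            label[node] = rng.choice(winners)
    return label


def clique(names):
    """All unit-weight edges among the given node names."""
    return [(a, b, 1.0) for i, a in enumerate(names) for b in names[i + 1:]]


# Two separate groups of four fully connected nodes.
TOY_EDGES = clique(["a", "b", "c", "d"]) + clique(["w", "x", "y", "z"])
labels = chinese_whispers(TOY_EDGES)
print(labels)
```

Each pass touches every edge twice, which is where the linear-time behaviour comes from. A node can only ever adopt a label held by one of its neighbours, so labels never cross between disconnected parts of the graph.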
Link communities [Ahn et al]
Communities in networks often overlap such that nodes simultaneously belong to several groups. Meanwhile, many networks are known to possess hierarchical organization, where communities are recursively grouped into a hierarchical structure.
Dynamics
• Networks have a temporal dimension
• Interactions (follow, like, share, mention, retweet, hashtag, etc.) occur in sequence
• Network properties evolve in time
Impact of bots
• The use of bots is increasing
• On Twitter, one in 20 active accounts is fake
• On Facebook, one in 100 active accounts is estimated to be fake
• Better auditing algorithms are needed
Controversies
Software Tools
• NodeXL
• Gephi
• NetworkX
• Meerkat
• netvizz
• d3.js
• API
Datasets
I keep my collection here: https://sites.google.com/site/frestivo/networked-life/databases
There is another on Quora: “Where can I find large datasets open to the public?”
My collection of papers: http://tinyurl.com/qzjp6rg
Thank you!