31
Introduction to Networks and Business Intelligence Prof. Dr. Daning Hu Department of Informatics University of Zurich Sep 22th, 2016

Introduction to Networks and Business Intelligence - UZHda42db6a-9d82-41f0-b89d-fd01bfc6a717/Lecture 1.pdfIntroduction to Networks and Business Intelligence Prof. Dr. Daning Hu Department

Embed Size (px)

Citation preview

Introduction to Networks and

Business Intelligence

Prof. Dr. Daning Hu

Department of Informatics

University of Zurich

Sep 22th, 2016

2

Outline Network Science

A “Random” History

Network Analysis

Network Topological Analysis: Random, Scale-Free, and Small-world

Networks

Node level analysis

Link Analysis

Network Visualization

Network-based Business Intelligence Application

* Some of the contents are adapted from Prof. James Moddy’s slides at Duke University, and Prof.

Jure Leskovec and Lada Adamic from Standford University

3

Networks are Everywhere

* Some of the contents are adapted from Prof. James Moddy’s slides at Duke University, and Prof.

Jure Leskovec and Lada Adamic from Standford University

4

Networks are Everywhere

* Some of the contents are adapted from Prof. James Moddy’s slides at Duke University, and Prof.

Jure Leskovec and Lada Adamic from Standford University

5

Networks are Everywhere

* Some of the contents are adapted from Prof. James Moddy’s slides at Duke University, and Prof.

Jure Leskovec and Lada Adamic from Standford University

Networks and Complex Systems

Behind various complex systems there are intricate wiring

networks, which defines the interactions and exchanges among

the nodes/individuals/entities.

Society

Brain

Communities

Organizations

We need to understand the networks first in order to really

understand how these complex systems work.

7

Network Science

Network science is an interdisciplinary academic field which

studies complex networks such as information networks,

biological networks, cognitive and semantic networks, and

social networks. It draws on theories and methods including:

Graph theory from mathematics, e.g., Small-world

Statistical mechanics from physics, e.g., Rich get richer,

Data mining and information visualization from computer science,

Inferential modeling from statistics, e.g., Collaborative filtering

Social structure from sociology, e.g., weak tie, structural holes

network science can be defined as "the study of network

representations of physical, biological, and social phenomena leading

to predictive models of these phenomena.”

8

A “Random” History: Math, Psychology, Sociology…

The study of networks has emerged in diverse disciplines as a

means of analyzing complex relational data.

Network science has its root in Graph Theory.

Seven Bridges of Königsberg written by Leonhard Euler in 1736.

Vertices, Edges, Nodes, Links,

a branch of mathematics that studies the properties of pairwise relations in a

network structure

Social Network Analysis

Jacob Moreno, a psychologist, developed the Sociogram and to “precisely

describe the interpersonal structure of a group”.

Jacob’s experiment is the first to use Social Network Analysis and was

published in the New York Times (April 3, 1933, page 17).

Stanley Milgram (Small World Experiment: Six Degrees of Separation,

1960s). Facebook: 5.28 steps in 2008, 4.74 in 2011.

We Live in a Connected World

“To speak of social life is to speak of the association between

people – their associating in work and in play, in love and in war, to

trade or to worship, to help or to hinder. It is in the social relations

men establish that their interests find expression and their desires

become realized.”Peter M. Blau

Exchange and Power in Social Life, 1964

"If we ever get to the point of charting a whole city or a whole nation,

we would have … a picture of a vast solar system of intangible

structures, powerfully influencing conduct, as gravitation does in

space. Such an invisible structure underlies society and has its

influence in determining the conduct of society as a whole."

J.L. Moreno, New York Times, April 13, 1933

“For the last thirty years, empirical social research has been

dominated by the sample survey. But as usually practiced, …,

the survey is a sociological meat grinder, tearing the individual

from his social context and guaranteeing that nobody in the

study interacts with anyone else in it.”Allen Barton, 1968 (Quoted in Freeman 2004)

Moreover, the complexity of the relational world makes it

impossible to identify social connectivity using only our intuition.

Social Network Analysis (SNA) provides a set of tools to

empirically extend our theoretical intuition of the patterns that

compose social structure.

The Origin of Modern Network Science:

Social Network Analysis

Social network analysis (SNA) is a set of relational methods for

systematically understanding and identifying

connections/ties/relationships among actors.

Social network analysis (SNA)

is motivated by a structural intuition based on ties linking social actors

is grounded in systematic empirical data

draws heavily on graphic imagery

relies on the use of mathematical and/or computational models.

12

Jacob Moreno’s experiment on Friendship Network

Social Network analysis lets us answer questions about social interdependence. These

include:

“Networks as Variables” approaches

• Are kids with smoking peers more likely to smoke themselves?

• Do unpopular kids get in more trouble than popular kids?

• Do central actors control resources?

“Networks as Structures” approaches

• What generates hierarchy in social relations?

• What network patterns spread diseases most quickly?

• How do role sets evolve out of consistent relational activity?

We don’t want to draw this line too sharply: emergent role positions can affect

individual outcomes in a ‘variable’ way, and variable approaches constrain

relational activity.

What Does Social Network Analysis Study?

14

Now…

Nodes Links

Social network People Friendship, kinship, collaboration

Inter-organizational

network

Companies Strategic alliance, buyer-seller

relation, joint venture

Citation network Documents/authors Citations

Internet Routers/computers Wire, cable

WWW Web pages hyperlink

Biochemical network Genes/proteins Regulatory effect

… … …

Complex Networks in the Real World

15

Examples of Real-World Complex Networks

A collaboration network of

physicists (size < 1K)

Source: (Newman & Girvan,

2004)

The Internet

(size > 150K),

Source: Lumeta Corp.,

The Internet Mapping

Project

16

Now…

Universal modeling and analysis methods for complex network

data

Shared vocabulary between fields: Computer Science, Physics,

Sociology, Economics, Statistics, Biology

“Big” Data availability: Internet, mobile, bio, health, security…

Impact/usage: social networking, social media, marketing, etc.

Network Representations Social Network data consists of two linked classes of data:

Nodes: Information on the individuals (actors, nodes, points, vertices)

Network nodes are most often people, but can be any other unit capable of

being linked to another (schools, countries, organizations, etc.)

The information about nodes is what we usually collect in standard social

science research: demographics, attitudes, behaviors, etc.

Graph theory notation: G(V,E), E: Edges, Links, Ties, Relations,…

18

Network Representations

In general, a relation can be: (1) Binary or Valued (2) Directed or

Undirected

a

b

c e

d

Undirected, binaryDirected, binary

a

b

c e

d

a

b

c e

d

Undirected, ValuedDirected, Valued

a

b

c e

d

1 3

4

21

19

Network Analysis: Topology/Structural Analysis

Network Topology Analysis provides an analytic understanding

of social structures and supplement individualistic methods.

Network topological measures include:

Size,

Density,

Average Degree,

Average Path Length: on average, the number of steps it takes to get

from one member of the network to another.

Diameter

Clustering Coefficient: a measure of an "all-my-friends-know-each-

other" property; small-world feature

1

)(

)1(

2)(

i

ii

i

iCoeffClusteringCC

kk

EiCC

ki = Cd(i) = # of neighbors of node i

Ei = # of links actually exist between ki nodes

20

Topology Analysis: Three Topology Models

Random Network

Erdős–Rényi Random Graph model

used for generating random graphs in which edges are set between nodes

with equal probabilities

21

Topology Analysis: Three Topology Models

Small-World Network

Watts-Strogatz Small World Model

used for generating graphs with small-world properties

large clustering coefficients

22

Topology Analysis: Three Topology Models

Scale-Free Network

Barabási–Albert (BA) Preferential Attachment Model

A network model used to demonstrate a preferential attachment or a "rich-

get-richer" effect.

an edge is most likely to attach to

nodes with higher degrees.

Power-law degree distribution

23

Network Analysis: Topology Analysis

Topology Average Path Length

(L)

Clustering

Coefficient (CC)

Degree Distribution

(P(k))

Random Graph Poisson Dist.:

Small World

(Watts & Strogatz, 1998)

Lsw Lrand CCsw CCrandSimilar to random

graph

Scale-Free network LSF Lrand Power-law

Distribution:

P(k) ~ k-

k

NLrand

ln

ln~

N

kCCrand

!)(

k

kekP

kk

k : Average degree

24

Network Scientists

• Paul Erdős (Random graph model)

• Duncan Watts (Small-World model)

• A.-L. Barabási (Scale-Free model); “Linked”

• Mark Newman (SW and SF models)

25

Network Analysis: Node-level Analysis

Node Centrality can be viewed as a measure of influence or

importance of nodes in a network.

Degree

the number of links that a node possesses in a network. In a directed

network, one must differentiate between in-links and out-links by

calculating in-degree and out-degree.

Betweeness

the number of shortest paths in a network that traverse through that node.

Closeness

the average distance that each node is from all other nodes in the network

26

Example: Centrality Measures of Bin Laden in a

Global Terrorist Network

0

10

20

30

40

50

60

1989

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

Degree

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

1989

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

Betweenness

0

50

100

150

200

250

300

350

400

1989

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

Closeness

The changes in the degree,

betweenness and closeness of

the node bin Laden from 1989

to 2002

27

Findings and Possible Explanations

The changes described in the above figure show that From 1994 to 1996, bin Laden’s betweenness decreased a lot and then increased

until 2001

In 1994, The Saudi government revoked his citizenship and expelled him from

the country

In 1995, he then went to Khartoum, Sudan, but under U.S. pressure was

expelled Again

In 1996, bin Laden returned to Afghanistan established camps and refuge

there

From 1998 to 1999, there is another sharp decrease in betweenness

After 1998 bombings of the United States embassies around world, President

Bill Clinton ordered a freeze on assets linked to bin Laden

Since then, bin Laden was officially listed as one of the FBI Ten Most Wanted

Fugitives and FBI Most Wanted Terrorists

In August 1998, the U.S. military launched an assassination but failed to harm

bin Laden but killed 19 other people

In 1999, United States convinced the United Nations to impose sanctions

against Afghanistan in an attempt to force the Taliban to extradite him

28

Network Analysis: Link Analysis Link analysis focuses on the prediction of link formations

between a pair of nodes based on various network factors. Its

applications include:

Finance: Insurance fraud detections

E-commerce: recommendation systems, e.g., Amazon

Internet Search Engine: Google PageRank

Law Enforcement: Crime link predictions

29

Network Visualization: Expert Partition of the

Collaboration Network

Weapons of

massive

destruction

Terrorism in

Europe

Criminal

justice

An international

terrorism conf.

Rand Corp.

Historical and policy

perspective of

terrorism

Not well-defined

group

Legal

perspective of

terrorism

30

31

Network-based Business Applications

Facebook: People you may know