
MINFS544: Network-based Business Intelligence (BI)


Page 1: MINFS544: Network-based  Business Intelligence (BI)

MINFS544: Network-based Business Intelligence (BI)

Feb 19th, 2013

Daning Hu, Ph.D., Department of Informatics

University of Zurich

F Schweitzer et al. Science 2009

Page 2: MINFS544: Network-based  Business Intelligence (BI)

Stop Contagious Failures in Banking Systems

During the 2008 financial tsunami, which bank(s) should receive capital injections first to stop contagious failures in bank networks?

Page 3: MINFS544: Network-based  Business Intelligence (BI)

Utilize Peer Influence in Online Social Networks

Intelligent advertising, product recommendation

Who are the most influential people?

What are the patterns of information diffusion?

Page 4: MINFS544: Network-based  Business Intelligence (BI)

Develop Strategies to Attack Terrorist Networks

A Global Salafi Jihad Terrorist Network (Hu et al., JHSEM 2009)

How can a terrorist network be broken down effectively?

Page 5: MINFS544: Network-based  Business Intelligence (BI)

Network-based Business Intelligence

Network-based (Modeling and Analysis): modeling and analyzing various real-world social and organizational networks to understand:

• the cognitive and economic behaviors of the network actors; and

• the dynamic processes behind the network evolution

Based on the above…

Business Intelligence (BI): design network-based BI algorithms and information systems to provide decision support in various application domains

• Application domains: Financial Risk Management, Security Informatics, Knowledge Management, etc.

• Methods: Network Analysis, Simulation of Network Evolution, Data Mining, etc.

Page 6: MINFS544: Network-based  Business Intelligence (BI)

MINFS544: Network-based Business Intelligence

• Lecturer: Dr. Daning Hu; Teaching Assistant: Dr. Jiaqi Yan

• Email: [email protected] [email protected]

• Credits: 3 ECTS credits

• Class Meetings: Tue 14:00–15:45 or Thu 10:15–12:00 (please see the schedule)

• Language: English

• Audience: Master and doctoral students

• Office Hours: Tue 13:00–14:00, Room 2.A.12

• Grading: Course report (term paper) 70%, presentation 20%, participation 10%

Page 7: MINFS544: Network-based  Business Intelligence (BI)

Grading

• 1. A full research paper (70%). The format of this paper can be found at: http://icis2012.aisnet.org/index.php/submissions

  – * If possible, get it published in ICIS 2013 and get it cited.

• This paper should include answers to the following questions:

  – What is the problem?

  – Why is it interesting and important?

  – Why is it hard? Why have previous approaches failed?

  – What are the key components of your approach?

  – What 1) models, 2) data sets and 3) metrics will be used to validate the approach?

Page 8: MINFS544: Network-based  Business Intelligence (BI)

Grading

• 2. Oral presentation of the paper (using slides) + Q&A (20%)

• For presentations, please see the slides on how to give a good research talk at: http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/giving-a-talk.htm

• 3. Active participation and interaction (10%)

Page 9: MINFS544: Network-based  Business Intelligence (BI)

Course Schedule

Date | Event | Deliverable

19.02.13 | Course introduction, kick-off meeting |

20.02.13 – 04.03.13 | One-to-one meeting | 1-page summary due

05.03.13 | Research method tutorial |

06.03.13 – 18.03.13 | One-to-one meeting on research progress | 3-page literature review due

19.03.13 | Lecture, feedback and discussion |

20.03.13 – 10.04.13 | Writing Research-in-Progress (RIP) paper | 5–8 page RIP due

11.04.13 | Lecture, feedback and discussion |

12.04.13 – 02.05.13 | Writing full paper | 8–12 page full paper due (first deadline)

25.04.13, 16.05.13 | Presentation days, feedback and discussion |

29.05.13 | | Final paper due

Page 10: MINFS544: Network-based  Business Intelligence (BI)

A Brief History of Network Science

1736: Mathematical foundation – Graph Theory

1930: Social Network Analysis and Theories

• Sociogram: network visualization

• Six degrees of separation

• Structural hole: source of innovation

1990: (Physicists) Complex Network Topologies

• Small-world model (e.g., WWW)

• Scale-free model ("rich get richer")

2000: Network Science

• Economic networks (agent modeling & simulation)

• Dynamic network analysis

• BI applications: product diffusion in social media, recommendation systems

2012: ?

Page 11: MINFS544: Network-based  Business Intelligence (BI)

Outline


Introduction

Dynamic Analysis of Dark Networks

A Global Salafi Jihad (GSJ) Terrorist Network

A Narcotic Criminal Network

A Network Approach to Managing Bank Systemic Risk

Ongoing Work

Conclusion

Page 12: MINFS544: Network-based  Business Intelligence (BI)

Dynamic Network Analysis (DNA)

What / Why / How

Model the changes in network evolution

• Temporal changes in network topological measures

• Dynamic network recovery on longitudinal data

Studying dynamic link formation processes behind network evolution (nodes forming links → network evolution)

• Statistical analysis of determinants behind link formation: homophily, preferential attachment, shared affiliations

Simulate the evolution of networks

• Agent-based modeling and simulation

• Examine network robustness

Page 13: MINFS544: Network-based  Business Intelligence (BI)

Research Testbed: A Global Terrorist Network

The Global Salafi Jihad (GSJ) network data was compiled by Dr. Marc Sageman, a former CIA operations officer. It covers 366 terrorists and records:

• friendship, kinship, same religious leader, operational interactions, etc.

• geographical origins, socio-economic status, education, etc.

• when they joined and left the GSJ

The goals of the dynamic analysis are to:

• gain insights into the evolution of the GSJ network

• develop effective attack strategies to break down the GSJ network

Sample data of GSJ terrorists


Page 16: MINFS544: Network-based  Business Intelligence (BI)

Dynamic Network Analysis

What / Why / How

Model the changes in network evolution

• Temporal changes in network topological measures

• Dynamic network recovery on longitudinal data

Studying dynamic processes (i.e., link formation) behind network evolution (nodes' behaviors → network evolution)

• Statistical analysis of determinants behind link formation: homophily, preferential attachment, shared affiliations

Simulate the evolution of networks

• Agent-based modeling and simulation

• Examine network robustness

Page 17: MINFS544: Network-based  Business Intelligence (BI)

Temporal Changes in Network-level Measures

Fig. 1. Temporal changes in (a) the average degree <k> (1989 to 2003) and (b), (c) the degree distribution (probability of degree; plotted for 1990, 1991 and 1993 against a Poisson reference, and for 1995, 1997 and 1999)

Degree = number of links a node has
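As a rough illustration of the two network-level measures in Fig. 1, the sketch below computes the average degree <k> and the empirical degree distribution from an undirected edge list. The function name `degree_stats` and the toy edge list are hypothetical and are not taken from the GSJ data; the sketch also assumes no duplicate links and ignores isolated nodes.

```python
from collections import Counter

def degree_stats(edges):
    """Return (average degree, degree distribution) for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                       # nodes that have at least one link
    avg_k = sum(degree.values()) / n      # <k> = 2 * (number of links) / (number of nodes)
    counts = Counter(degree.values())     # how many nodes have each degree
    dist = {k: c / n for k, c in sorted(counts.items())}
    return avg_k, dist

# Hypothetical toy snapshot (not GSJ data).
edges_toy = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]
avg_k, dist = degree_stats(edges_toy)
print(avg_k)   # 2.0
print(dist)    # {1: 0.25, 2: 0.5, 3: 0.25}
```

Running the same function on one snapshot per year would give a curve like panel (a) and per-year distributions like panels (b) and (c).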

Page 18: MINFS544: Network-based  Business Intelligence (BI)

Findings

There are three stages in the evolution of the GSJ network:

1989–1993, the emerging stage:

• The network grows in size

• Accelerated growth: the number of edges increases faster than the number of nodes

• Random network topology (Poisson degree distribution)

1994–2000, the mature stage:

• The size of the network reached its peak in 2000

• Scale-free topology (power-law degree distribution)

2001–2003, the disintegration stage:

• The network fell into small disconnected components after 9/11
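For reference, the two degree distributions named in these stages have the following standard forms. The symbols are the usual textbook ones and are not taken from the slides: P(k) is the probability that a node has degree k, <k> is the average degree, and γ is the power-law exponent, whose value for the GSJ network is not given here.

```latex
% Random (Erdos-Renyi-type) network: Poisson degree distribution
P(k) = \frac{e^{-\langle k \rangle}\,\langle k \rangle^{k}}{k!}

% Scale-free network: power-law degree distribution
P(k) \propto k^{-\gamma}
```

The Poisson form decays quickly around <k>, so very high-degree nodes are essentially absent, whereas the power-law form has a heavy tail that allows a few hubs with very many links.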

Page 19: MINFS544: Network-based  Business Intelligence (BI)

Temporal Changes in Node Centrality Measures

Figure 2. Temporal changes in the degree and betweenness centrality of Osama Bin Laden (vertical axes: degree, 0 to 60; betweenness, 0 to 5,000)

Degree: number of links a node has

Betweenness of a node i: number of shortest paths from all nodes to all others that pass through node i

Measures i's influence on the traffic (information, resources) flowing through it
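The betweenness definition above can be sketched with Brandes' algorithm for unweighted, undirected graphs. This is a minimal illustration, not the implementation used in the study. One assumption to note: when several shortest paths connect a pair of nodes, each intermediate node is credited with its fraction of those paths, which is the common convention. The function name and the toy path graph are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalised betweenness for an undirected graph.

    adj maps each node to the set of its neighbours. Each unordered
    pair (s, t) is counted once.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember each node's predecessors on those paths.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both of its endpoints.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy graph: a path A - B - C - D.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
bc = betweenness(path)
print(bc)   # {'A': 0.0, 'B': 2.0, 'C': 2.0, 'D': 0.0}
```

In the toy path, B and C each sit on two shortest paths between other nodes, while the end nodes sit on none, which matches the idea of betweenness as control over traffic flowing through a node.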

Page 20: MINFS544: Network-based  Business Intelligence (BI)

Findings and Possible Explanations

1994–1996: a sharp decrease in Bin Laden's betweenness

• 1994: Saudi Arabia revoked his citizenship and expelled him

• 1995: He went to Sudan and was expelled again under U.S. pressure

• 1996: He went to Afghanistan and established camps there

1998–1999: another sharp decrease in his betweenness

• After the 1998 bombings of U.S. embassies, Bill Clinton ordered a freeze on assets linked to bin Laden (top 10 most wanted)

• August 1998: a failed U.S. assassination attempt on him

• 1999: the UN imposed sanctions against Afghanistan to force the Taliban to extradite him

Page 21: MINFS544: Network-based  Business Intelligence (BI)

Research Testbed: A Narcotic Criminal Network

The COPLINK dataset contains 3 million police incident reports from the Tucson Police Department (1990 to 2006).

• 3 million incident reports and 1.44 million individuals

• Their personal and sociological information (age, ethnicity, etc.)

• Time information: when two individuals co-offend

• AZ inmate affiliation data: when and where an inmate was housed

A Narcotic Criminal Network

• 19,608 individuals involved in organized narcotic crimes

• 29,704 co-offending pairs (links)

Table 1. Summary of the COPLINK dataset and the Arizona inmate dataset

 | COPLINK Narcotic Data | Arizona Inmate Data | Overlapped (identified by first name, last name and DOB)

Number of People | 36,548 | 165,540 | 19,608

Time Span | 1990–2006 | 1985–2006 | 17 years

Page 22: MINFS544: Network-based  Business Intelligence (BI)

Statistical Analysis of Determinants for Link Formation

Proportional hazards model (Cox regression analysis)

• Homophily in age (group) and race

• Shared affiliations:

  – Mutual acquaintances (through crimes)

  – Vehicle affiliation (same vehicle used by two individuals in different crimes)

Fig. 3. Results of multivariate survival (Cox regression) analysis of triadic closure (link formation).
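For readers unfamiliar with the method, the general form of the Cox proportional hazards model is shown below. The notation is the standard textbook one and is an assumption here, since the slides do not give the fitted specification: h_{ij}(t) is the hazard that a link forms between individuals i and j at time t, h_0(t) is an unspecified baseline hazard, x_{ij} is the vector of pair-level covariates (for example, same age group, same race, number of mutual acquaintances, shared vehicle), and β is the coefficient vector to be estimated.

```latex
h_{ij}(t \mid x_{ij}) = h_0(t)\,\exp\!\left(\beta^{\top} x_{ij}\right)

% Hazard ratio for a one-unit increase in covariate k, all else equal:
\mathrm{HR}_k = \exp(\beta_k)
```

A hazard ratio above 1 for a covariate would indicate that it speeds up link formation; a ratio below 1 would indicate that it slows it down.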

Page 23: MINFS544: Network-based  Business Intelligence (BI)

BI Application: Co-offending Prediction in COPLINK

IBM's COPLINK is an intelligent police information system that aims to help speed up the crime detection process.

COPLINK calculates a co-offending likelihood score based on the proportional hazards model.

It produces a ranked list of individuals based on their predicted likelihood of co-offending with the suspect under investigation.

Fig. 4. Screenshots of the COPLINK system
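One way such a ranked list could be produced from a proportional hazards model is to score each candidate by the relative hazard exp(β · x) and sort in descending order. The sketch below is only an illustration of that idea, not COPLINK's actual implementation: the coefficient values, covariate names, and candidate records are all hypothetical.

```python
import math

# Hypothetical coefficients; real values would come from a fitted Cox model.
BETA = {
    "same_age_group": 0.4,
    "same_race": 0.3,
    "mutual_acquaintances": 0.8,
    "shared_vehicle": 1.1,
}

def relative_hazard(covariates):
    """exp(beta . x): hazard of link formation relative to the baseline."""
    return math.exp(sum(BETA[name] * value for name, value in covariates.items()))

def rank_candidates(candidates):
    """Return candidate names sorted by descending relative hazard."""
    return sorted(candidates, key=lambda name: relative_hazard(candidates[name]),
                  reverse=True)

# Hypothetical candidates described relative to the suspect under investigation.
candidates = {
    "person_1": {"same_age_group": 1, "same_race": 0,
                 "mutual_acquaintances": 2, "shared_vehicle": 0},
    "person_2": {"same_age_group": 0, "same_race": 1,
                 "mutual_acquaintances": 0, "shared_vehicle": 1},
    "person_3": {"same_age_group": 1, "same_race": 1,
                 "mutual_acquaintances": 0, "shared_vehicle": 0},
}
ranking = rank_candidates(candidates)
print(ranking)   # ['person_1', 'person_2', 'person_3']
```

Because the baseline hazard is shared by all candidates, it cancels out of the comparison, so the ranking depends only on the covariates and coefficients.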

Page 24: MINFS544: Network-based  Business Intelligence (BI)

Simulate Attacks on Dark Networks

Three attack (i.e., node removal) strategies:

• Attack on hubs (highest degree)

• Attack on bridges (highest betweenness)

• Real-world attack (attack order based on real-world data)

Two types of attack are simulated to examine the robustness of the dark networks:

• Simultaneous attacks (the degree/betweenness of nodes is NOT updated after each removal): static

• Progressive attacks (the degree/betweenness of nodes is updated after each removal): dynamic
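The difference between simultaneous and progressive attacks can be sketched for the hub strategy as follows. This is a minimal illustration under stated assumptions: robustness is measured here as S, the fraction of the original nodes that remain in the largest connected component, ties in degree are broken alphabetically, and the toy network is hypothetical. A bridge attack would follow the same loop with betweenness in place of degree.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component_fraction(adj, n_total):
    """Size of the largest connected component divided by n_total."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:                      # depth-first traversal of one component
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / n_total

def remove_node(adj, node):
    """Return a copy of adj without the node and its links."""
    return {v: nbrs - {node} for v, nbrs in adj.items() if v != node}

def hub_attack(adj, n_remove, progressive):
    """Remove n_remove highest-degree nodes; return S after each removal."""
    n_total = len(adj)
    # Simultaneous (static): rank once, on the intact network.
    static_order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    trace = []
    for step in range(n_remove):
        if progressive:
            # Progressive (dynamic): re-rank on the damaged network.
            target = min(adj, key=lambda v: (-len(adj[v]), v))
        else:
            target = static_order[step]
        adj = remove_node(adj, target)
        trace.append(largest_component_fraction(adj, n_total))
    return trace

# Hypothetical toy network with 8 nodes.
toy = build_adj([("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
                 ("B", "C"), ("B", "D"), ("M", "N"), ("M", "O"), ("M", "E")])
print(hub_attack(toy, 2, progressive=False))   # [0.5, 0.5]
print(hub_attack(toy, 2, progressive=True))    # [0.5, 0.375]
```

In this toy case both variants remove A first. The static ranking then removes B, whose degree has already dropped, while the progressive ranking re-evaluates degrees and removes M, which fragments the network further.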

Page 25: MINFS544: Network-based  Business Intelligence (BI)

Hub Vs. Bridge Attacks

• Both hub and bridge attacks are far more effective than real-world arrests. Policy implications?

• Both dark networks are more vulnerable to bridge attacks than to hub attacks.

  – Bridge (highest betweenness): field lieutenants, operational leaders, etc.

  – Hub (highest degree): e.g., Bin Laden

[Figure: GSJ network. S and <s> versus the fraction of nodes removed, for hub attacks and bridge attacks.]

Page 26: MINFS544: Network-based  Business Intelligence (BI)

Summary and Contributions

We developed a set of Dynamic Network Analysis (DNA) methods that are effective in:

• Linking network topological changes to analytical insights

• Systematically capturing the link formation processes

• Examining the determinants of link formation

Dark networks are:

• robust against real-world attacks

• but vulnerable to targeted bridge attacks

COPLINK provides real-time decision support for fighting crime.

Page 27: MINFS544: Network-based  Business Intelligence (BI)

Research Readings and Resources

• 1. Networks Overview:

  – * Statistical mechanics of complex networks, Sections III and VI: http://rmp.aps.org/abstract/RMP/v74/i1/p47_1

  – * Networks, Crowds, and Markets: http://www.cs.cornell.edu/home/kleinber/networks-book/

• 2. Networks in Finance:

  – * Financial Networks blog and research databases:

  – WRDS database

  – http://www.financialnetworkanalysis.com/research-database/

  – http://www.stern.nyu.edu/networks/electron.html

  – * Company Board Social Networks

Page 28: MINFS544: Network-based  Business Intelligence (BI)

Research Readings and Resources (cont.)

• 3. Networks in Marketing:

  – * Sinan Aral's research in networks and marketing (peer influence): http://web.mit.edu/sinana/www/

  – * Social media based marketing: http://searchengineland.com/guide/what-is-social-media-marketing

• 4. Recommender Systems:

  – http://www-cs-students.stanford.edu/~adityagp/recom.html

• 5. Word-of-Mouth Effects in Social Networks:

  – http://papers.ssrn.com/sol3/papers.cfm?abstract_id=393042&