Visual Analysis of People’s Calling Network from CDR data

Yueqi Hu, Guizhen Wang, Ronghua Liang,

Shuang Ye, Guangyu Chen, Chaokai Wen and Wei Chen

[Figure 1 panels: (a) Radial tree view; (b) Radial tree view of selected hierarchy and groups; (c) Statistic view; (d) Spiral view. Labels in the figure: Sloan Business School, Media Lab Staff, Media Lab First Year, Media Lab Graduation Student.]

Fig. 1. Four visualizations for analyzing the MIT Reality Mining dataset. (a) depicts the entire calling network, in which each subject is represented as a leaf node with a unique identification. (b) displays the result of interactive hierarchy and group selection based on (a). (c) shows the calling pattern (in and out calling) of ID 8 with respect to (b). (d) shows the social connections of ID 11 (shown in yellow) with a spiral view.

Abstract—Call detail records (CDR) are a widely used data source in Social Network Analysis (SNA). This paper introduces a novel visual analysis technique for characterizing one person’s calling network and revealing the underlying social information. We represent the entire calling network with a hierarchical radial kd-tree view that can accommodate large-scale networks. To inspect the calling pattern of one specific subject, a statistic view with a radar chart representation is provided to compare the calling patterns of different persons, or of one subject over different time periods. In addition, a spiral layout is employed to intuitively reveal the closeness between one subject and his or her social groups. We demonstrate the effectiveness of our approach with the CDR data of the IEEE VAST 2008 mini challenge and the MIT Reality Mining dataset.

Index Terms—Social Network, Girvan-Newman Algorithm, Visual Design.

1 INTRODUCTION

A CDR dataset implies a communication network and provides meaningful information about the characteristics of human behavior. Many efforts have been put into studying the structure of mobile phone networks, revealing personal life routines and reactions to outlier events, and classifying different social networks [1]. Further studies have also been conducted on the dynamics of mobile phone networks, favoring the analysis of the evolution of relationships over time. Current research on mobile phone networks has greatly improved the understanding of mobility patterns and human behavior. Yet easily accessing the characteristics of a given subject (caller or callee) is still a challenging problem. Conventional statistical and visualization techniques can help address this problem, but tend to be inefficient for the following reasons. First, the mobile phone network is very large. Second, the network contains complex information. Third, the time-varying nature of the mobile phone network and the evolution of an individual’s communication network are difficult to capture.

The primary goal of our work is to help users intuitively analyze the characteristics of one subject’s communication network within a large-scale network, and to enable comparison of subjects from different communities. We combine three analytical views: the radial kd-tree view to describe the social network, the statistical view to dynamically reveal the underlying calling patterns, and a novel spiral view to explicitly characterize the closeness between one subject and the groups that have direct or indirect calling connections with him or her. With our method, each sub-group network and calling pattern can be well structured and simultaneously analyzed.

• Yueqi Hu, Guizhen Wang, Shuang Ye, Guangyu Chen, Chaokai Wen and Wei Chen are with the State Key Lab of CAD&CG, Zhejiang University, E-mail: {aaron.huyq, wguizhen}@gmail.com; [email protected].

• Ronghua Liang is with the College of Computer Science and Technology, Zhejiang University of Technology, E-mail: [email protected].

2 APPROACH

First, we use calling time to measure the social relationship. The workflow of our method is illustrated in Figure 2.
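As a minimal sketch of using calling time as the relationship measure, the snippet below sums call durations per pair of phone IDs, using a few rows from the sample CDR table in Figure 2. Treating the relationship as undirected and summing durations are assumptions made for illustration, as is the reading of the duration field as seconds; the function name `total_call_time` is hypothetical.

```python
from collections import defaultdict

# A few rows from the sample CDR table in Figure 2:
# (from, to, date, time, duration).
# The unit of the duration field is not stated; seconds is an assumption.
RECORDS = [
    (133, 21, "20060601", "1256", 778),
    (133, 178, "20060601", "1437", 1141),
    (133, 70, "20060601", "1444", 1215),
    (178, 133, "20060601", "2102", 1137),
    (133, 178, "20060601", "2150", 1183),
    (70, 133, "20060602", "1101", 1143),
]


def total_call_time(records):
    """Sum call durations per unordered pair of phone IDs (assumed measure)."""
    weights = defaultdict(int)
    for caller, callee, _date, _time, duration in records:
        pair = (min(caller, callee), max(caller, callee))
        weights[pair] += duration
    return dict(weights)


edge_weights = total_call_time(RECORDS)
```

The resulting dictionary can serve as the weighted edge list of the calling network that a clustering step would take as input.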

[Figure 2 panels: CDR Dataset; Hierarchical Clustering; Constructing the radial tree; the Hierarchical Radial kd-Tree View; Computing Closeness Values with respect to groups (axis values 0.0 to 0.6 over group1 to group7); the Spiral View (Phone ID: 133); the Statistic View (Phone ID: 133; calling time, listening time).]

Sample CDR dataset shown in the figure:

From  To   Date      Time  Duration
133   21   20060601  1256  778
133   178  20060601  1437  1141
133   70   20060601  1444  1215
178   133  20060601  2102  1137
133   178  20060601  2150  1183
70    133  20060602  1101  1143
178   133  20060603  0009  887
133   21   20060603  1228  1129
133   21   20060603  1400  1190
21    133  20060603  1623  803
133   178  20060603  2120  1597
133   21   20060604  1229  923
70    133  20060604  1557  828

Fig. 2. The pipeline of our approach.

Hierarchical Radial kd-tree View This view in Figure 2(d) reveals the social communities. The distribution of social communities can be quickly observed from the root node, and the deeper a group node lies in the tree, the more specific the social relationship it represents. Each node corresponds to a certain level in the hierarchy of social groups: leaf nodes are at the lowest level and represent individual subjects, and the root node is at the top level and represents the whole community. Different colors are used to depict different subjects and groups. Generally speaking, a person has a rough classification of his or her social groups in mind. This prior social knowledge can be applied to the CDR dataset by dividing or merging groups through adjusting the tree in the hierarchical tree view. Interactive manipulation of the radial tree view enables the user to access a desired group and adjust the displayed levels of the binary tree. The social groups in the hierarchical radial kd-tree view are detected using the Girvan-Newman network clustering algorithm [2], and are formed into a hierarchical radial kd-tree in Figure 2(b). Subsequently, the tree is recursively mapped onto a uniformly subdivided ring in three steps. First, all leaf nodes of the tree are represented as calling identifications, and are sequentially and uniformly distributed on a ring. Then, by recursively merging the child nodes, the tree is reformulated into a radial tree view in Figure 2(c). Finally, as the user interactively expands or collapses tree nodes, the corresponding interval on the ring encodes the size of the clustered groups.
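The ring mapping described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the hierarchy is assumed to be given as nested tuples with phone IDs as leaves, and the names `leaves`, `radial_intervals`, and `TREE` are hypothetical. Each leaf receives an equal slot on the ring, and each internal node covers the union of its children's slots, so its arc length is proportional to the size of the group.

```python
import math


def leaves(tree):
    """Return the leaf IDs of a nested-tuple tree in left-to-right order."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def radial_intervals(tree):
    """Map every node to an angular interval (start, end) in radians."""
    step = 2 * math.pi / len(leaves(tree))
    intervals = {}
    counter = [0]  # index of the next free leaf slot on the ring

    def visit(node):
        if not isinstance(node, tuple):
            # Leaves are placed sequentially and uniformly on the ring.
            start = counter[0] * step
            counter[0] += 1
            intervals[node] = (start, start + step)
        else:
            # An internal node spans from its first to its last child.
            spans = [visit(child) for child in node]
            intervals[node] = (spans[0][0], spans[-1][1])
        return intervals[node]

    visit(tree)
    return intervals


# Hypothetical hierarchy over four phone IDs that appear in Figure 2.
TREE = ((133, 178), (21, 70))
arcs = radial_intervals(TREE)
```

Collapsing a node in such a layout would amount to drawing only the interval stored for that node instead of the intervals of its descendants.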

Spiral View We introduce a spiral view, in which a subject is placed in the center, and directly or indirectly connected groups are lined up along a spiral according to their calling closeness in the radial kd-tree view. From the spiral view, users can observe not only one subject’s social closeness to other subjects, but also the social closeness among subjects who are equally close to the center subject. In Figure 2(d), the leaf nodes in the sibling branch of the center person are the closest, so these sibling branches are cut to form a community and placed first on the spiral. The next community, placed second on the spiral, consists of the sibling branches of the center person’s parent node, and so on.
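The ordering of communities along the spiral can be sketched as a walk from the center subject up to the root. The sketch below assumes the hierarchy is given as nested tuples with phone IDs as leaves; the function names and the example tree (including IDs 5 and 8) are hypothetical and only illustrate the sibling-branch rule described above.

```python
def leaves(tree):
    """Return the leaf IDs of a nested-tuple tree in left-to-right order."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def spiral_communities(tree, center):
    """List communities from closest to farthest from `center`.

    Descend from the root to the center leaf, collecting the sibling
    branches at every level, then reverse the list so that the sibling
    of the center comes first, the siblings of its parent second, etc.
    """
    path = []
    node = tree
    while isinstance(node, tuple):
        idx = next(
            (i for i, child in enumerate(node) if center in leaves(child)),
            None,
        )
        if idx is None:
            raise ValueError(f"{center} is not a leaf of the tree")
        siblings = [child for i, child in enumerate(node) if i != idx]
        path.append([leaf for s in siblings for leaf in leaves(s)])
        node = node[idx]
    return list(reversed(path))


# Hypothetical hierarchy; 133 is the center subject.
TREE = (((133, 178), (21, 70)), (5, 8))
order = spiral_communities(TREE, 133)
```

For this example tree, the first community on the spiral is `[178]`, followed by `[21, 70]` and then `[5, 8]`.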

Statistical View The statistic view in our approach supports effective analysis and visualization of the duration of calls for each group by using radar charts. From Figure 2(d), we can see that one person’s close contacts are always located close to this person, which makes the measured information easy to read in a radar chart without visual clutter, because all the information bars are close to each other. We use a fan glyph to represent the statistical information (Figure 2(h)). The length of the glyph measures the information in the corresponding group, and the set of groups yields several separate fans around the center. A calling network usually carries several kinds of information, e.g., incoming and outgoing calls. To distinguish them, we represent different kinds of information in different colors.
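The quantities a fan glyph would encode can be sketched as per-group totals of outgoing and incoming call time for one subject. In the sketch below, the records are rows from the sample table in Figure 2 with date and time dropped; the group assignment `GROUP_OF` is hypothetical, and mapping "calling" to outgoing calls and "listening" to incoming calls is an assumption based on the labels in Figure 2.

```python
from collections import defaultdict

# (from, to, duration) rows taken from the sample CDR table in Figure 2.
RECORDS = [
    (133, 21, 778),
    (133, 178, 1141),
    (133, 70, 1215),
    (178, 133, 1137),
    (133, 178, 1183),
    (70, 133, 1143),
]

# Hypothetical group assignment, e.g. as produced by a clustering step.
GROUP_OF = {21: "group1", 178: "group1", 70: "group2"}


def call_profile(records, subject, group_of):
    """Return {group: {"calling": total, "listening": total}} for a subject."""
    profile = defaultdict(lambda: {"calling": 0, "listening": 0})
    for caller, callee, duration in records:
        if caller == subject and callee in group_of:
            # Outgoing call from the subject to a member of a group.
            profile[group_of[callee]]["calling"] += duration
        elif callee == subject and caller in group_of:
            # Incoming call to the subject from a member of a group.
            profile[group_of[caller]]["listening"] += duration
    return dict(profile)


profile = call_profile(RECORDS, 133, GROUP_OF)
```

Each group's two totals could then be drawn as two differently colored fans whose lengths are proportional to the values.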

3 RESULTS ANALYSIS AND CONCLUSION

The interface and visual analysis system was implemented with the Processing language [3]. We conducted case studies on two CDR datasets. One is the MIT Reality Mining dataset, in which seventy-five subjects are either students or faculty in the MIT Media Laboratory, and twenty-five are incoming students at the MIT Sloan business school. The other is the IEEE VAST 2008 challenge dataset, recording the phone calls of about 400 persons over a ten-day period.

The overview radial tree is shown in Figure 1(a). A user can choose a group of interest by traversing the tree. In the biggest tree of Figure 1(a), there are two branches from the root. This implies that these people can be divided into two groups: one consists of all the people from the Sloan Business School at MIT, and the other consists mostly of Media Lab graduate students, among whom ID 78 and ID 40 are two of the four Media Lab first-year graduate students. This finding is confirmed by the user identification survey.

Moreover, the proposed spiral view helps the user easily compare a subject’s social relationships with multiple friends. Figure 1(d) shows the social groups of ID 11. People who have a relationship with ID 11 can be grouped into 14 communities according to their closeness to ID 11. The closest community includes ten people (ID 4, 5, 8, 12, 13, 31, 23, 60, 102, and 104), and the social relationships among them can be observed in the radial tree layout. In a similar manner, we can find the second closest community, which contains only ID 106.

The CDR dataset from the IEEE VAST Challenge involves up to 400 persons, and Figure 3(a) shows that our method can handle large-scale networks. Here, we focus on how to reveal the statistical information. Take ID 2 as an example. Figure 3(a) clearly reveals the closest social communities of ID 2 in the compact social network. From Figure 3(b), we can find two groups of people closely connected with ID 2. One has far more calls in the daytime than at night, while the other has almost the same calling time in the day and at night. It can be inferred that the former group is his or her work community, and the latter consists of his or her private friends.

[Figure 3: panels (a) and (b).]

Fig. 3. The radial tree view from the VAST Challenge dataset. (a) The overview of the social relationships in the whole dataset. (b) The radar tree chart of ID 2 in one year, where orange represents his (or her) ratio of call times with every person from 8:00 to 18:29, and blue represents his (or her) ratio of call times with every person from 18:30 to 7:29.
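The day and night ratios described in the caption can be sketched as below. This is an illustration under stated assumptions: calls are assigned to a window by their start time, ratios are weighted by call duration rather than call count, and the sample rows come from the CDR table in Figure 2 rather than from the VAST dataset. The function names are hypothetical.

```python
def period(hhmm):
    """Classify an 'HHMM' start time using the windows in the Fig. 3 caption."""
    minutes = int(hhmm[:2]) * 60 + int(hhmm[2:])
    if 8 * 60 <= minutes <= 18 * 60 + 29:
        return "day"
    if minutes >= 18 * 60 + 30 or minutes <= 7 * 60 + 29:
        return "night"
    return None  # 07:30 to 07:59 falls in neither window of the caption


def day_night_share(records, subject):
    """Return {contact: (day_share, night_share)} of call time with subject."""
    totals = {}
    for caller, callee, hhmm, duration in records:
        if subject not in (caller, callee):
            continue
        slot = period(hhmm)
        if slot is None:
            continue
        contact = callee if caller == subject else caller
        bucket = totals.setdefault(contact, {"day": 0, "night": 0})
        bucket[slot] += duration
    shares = {}
    for contact, bucket in totals.items():
        whole = bucket["day"] + bucket["night"]
        if whole > 0:
            shares[contact] = (bucket["day"] / whole, bucket["night"] / whole)
    return shares


# (from, to, time, duration) rows from the sample CDR table in Figure 2.
RECORDS = [
    (133, 21, "1256", 778),
    (133, 178, "1437", 1141),
    (178, 133, "2102", 1137),
    (133, 178, "2150", 1183),
    (178, 133, "0009", 887),
    (133, 21, "1228", 1129),
]
shares = day_night_share(RECORDS, 133)
```

In these sample rows, all call time with ID 21 falls in the daytime window, while most call time with ID 178 falls at night, which mirrors the kind of contrast used in the text to separate a work community from private friends.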

In this paper, we present a visual analytics system for CDR data with three linked views that depict the calling network of one subject, which can be regarded as a technique for analyzing the relationship between an individual and his or her social groups. With the proposed approach, we can easily classify the network data into different communities, identify persons with similar lifestyles, and compare the communication networks of different persons. Though our current visualization method can handle calling networks with up to thousands of people, it causes visual clutter when users analyze larger-scale networks on a common laptop screen. This problem can be addressed by constructing a multi-level hierarchical radial kd-tree view of the group datasets, and by preserving only the subjects who have strong connections with the central subject in the statistical view. In the future, we will work on visualizing more complex information in the calling network.

REFERENCES

[1] N. Eagle, A. S. Pentland, and D. Lazer. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, Aug. 2009.

[2] M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12):7821–7826, June 2002.

[3] C. Reas, B. Fry, and J. Maeda. Processing: A Programming Handbook for Visual Designers and Artists. The MIT Press, 2007.