Visual Analysis of People’s Calling Network from CDR data
Yueqi Hu, Guizhen Wang, Ronghua Liang,
Shuang Ye, Guangyu Chen, Chaokai Wen and Wei Chen
[Figure 1 panels: (a) Radial tree view; (b) Radial tree view of selected hierarchy and groups; (c) Statistic view; (d) Spiral view. Group labels in the figure: Sloan Business School; Media Lab Staff; Media Lab First Year; Media Lab Graduation Student.]
Fig. 1. Four visualizations for analyzing the MIT Reality Mining dataset. (a) depicts the entire calling network, in which each subject is represented as a leaf node with a unique identification. (b) displays the result of interactive hierarchy and group selection based on (a). (c) shows the calling pattern (in and out calling) of ID 8 with respect to (b). (d) shows the social connections of ID 11 (shown in yellow) with a spiral view.
Abstract—Call detail records (CDR) are a widely used data source in Social Network Analysis (SNA). This paper introduces a novel visual analysis technique for characterizing one person's calling network and revealing the underlying social information. We represent the entire calling network with a hierarchical radial kd-tree view that scales to large networks. To inspect the calling pattern of one specific subject, a statistic view with a radar chart representation is provided to compare the calling patterns of different persons, or of different time periods for one subject. In addition, a spiral layout is employed to intuitively reveal the closeness between one subject and his or her social groups. We demonstrate the effectiveness of our approach with the CDR data of the IEEE VAST 2008 mini challenge and the MIT Reality Mining dataset.
Index Terms—Social Network, Girvan-Newman Algorithm, Visual Design.
1 INTRODUCTION
A CDR dataset implies a communication network and provides meaningful information about the characteristics of human behavior. Much effort has been put into studying the structure of mobile phone networks, revealing personal life routines and reactions to outlier events, and classifying different social networks [1]. Further studies have examined the dynamics of mobile phone networks, focusing on how relationships evolve over time. Current research on mobile phone networks has greatly advanced the understanding of mobility patterns and human behavior. Yet easily accessing the characteristics of a given subject (caller or callee) is still a challenging problem. Conventional statistical and visualization techniques can help address this problem, but tend to be inefficient for the following reasons. First, the mobile phone network is very large. Second, the network contains complex information. Third, the time-varying nature of the mobile phone network and the evolution of a person's communication network are difficult to capture.
The primary goal of our work is to help users intuitively analyze the characteristics of one subject's communication network within a large scale network, and to enable comparison of subjects from different communities. We use three cooperating analytical views: the radial kd-tree view to describe the social network, the statistical view to dynamically reveal various underlying calling patterns, and a novel spiral view to explicitly characterize the closeness between one subject and the groups that have direct or indirect calling connections with him or her. With our method, each sub-group network and calling pattern can be well structured and simultaneously analyzed.
• Yueqi Hu, Guizhen Wang, Shuang Ye, Guangyu Chen, Chaokai Wen and Wei Chen are with the State Key Lab of CAD&CG, Zhejiang University. E-mail: {aaron.huyq, wguizhen}@gmail.com; [email protected].
• Ronghua Liang is with the College of Computer Science and Technology, Zhejiang University of Technology.
2 APPROACH
First, we use calling time to measure the strength of social relationships. The workflow of our method is illustrated in Figure 2.
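As a minimal sketch of this measure, the weights between pairs of phone IDs could be accumulated as shown here. It assumes records shaped like the sample rows in Figure 2 (From, To, Date, Time, Duration) and assumes that the strength of a tie is simply the summed call duration in both directions; the paper does not spell out the exact formula. The function name `pair_weights` is hypothetical.

```python
from collections import defaultdict

# Sample rows copied from the CDR excerpt in Figure 2:
# (from_id, to_id, date, time, duration)
RECORDS = [
    (133, 21, "20060601", "1256", 778),
    (133, 178, "20060601", "1437", 1141),
    (133, 70, "20060601", "1444", 1215),
    (178, 133, "20060601", "2102", 1137),
    (133, 178, "20060601", "2150", 1183),
    (70, 133, "20060602", "1101", 1143),
    (178, 133, "20060603", "0009", 887),
    (133, 21, "20060603", "1228", 1129),
    (133, 21, "20060603", "1400", 1190),
    (21, 133, "20060603", "1623", 803),
    (133, 178, "20060603", "2120", 1597),
    (133, 21, "20060604", "1229", 923),
    (70, 133, "20060604", "1557", 828),
]


def pair_weights(records):
    """Sum call durations per unordered pair of phone IDs.

    Assumption: direction is ignored, so A->B and B->A calls
    contribute to the same edge weight.
    """
    weights = defaultdict(int)
    for src, dst, _date, _time, duration in records:
        key = (min(src, dst), max(src, dst))
        weights[key] += duration
    return dict(weights)


for pair, total in sorted(pair_weights(RECORDS).items()):
    print(pair, total)
```

The resulting dictionary can serve as the weighted edge list of the calling network that a clustering step takes as input.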
[Figure 2 panels: CDR dataset; hierarchical clustering; constructing the radial tree; the hierarchical radial kd-tree view; computing closeness values with respect to groups (group1 to group7, axis from 0.0 to 0.6); the spiral view (Phone ID: 133); the statistic view (Phone ID: 133, calling time and listening time).]
CDR dataset excerpt shown in the figure:
From  To   Date      Time  Duration
133   21   20060601  1256  778
133   178  20060601  1437  1141
133   70   20060601  1444  1215
178   133  20060601  2102  1137
133   178  20060601  2150  1183
70    133  20060602  1101  1143
178   133  20060603  0009  887
133   21   20060603  1228  1129
133   21   20060603  1400  1190
21    133  20060603  1623  803
133   178  20060603  2120  1597
133   21   20060604  1229  923
70    133  20060604  1557  828
Fig. 2. The pipeline of our approach.
Hierarchical Radial kd-tree View. This view, shown in Figure 2(d), reveals the social communities. The distribution of social communities can be quickly observed from the root node, and the deeper a group node lies in the tree, the more specific the social relationship it represents. Each node corresponds to one level in the hierarchy of social groups: leaf nodes are at the lowest level and represent the individual subjects, and the root node is at the top level and represents the whole community. Different colors are used to depict different subjects and groups. Generally speaking, a person has a rough classification of his or her social groups. This prior social knowledge can be applied to the CDR dataset by dividing or merging groups through adjustments to the hierarchical tree view. Interactive manipulation of the radial tree view enables the user to access a desired group and adjust the displayed levels of the binary tree. The social groups in the hierarchical radial kd-tree view are detected using the Girvan-Newman network clustering algorithm [2], and are formed into a hierarchical radial kd-tree in Figure 2(b). Subsequently, the tree is recursively mapped onto a uniformly subdivided ring in three steps. First, all leaf nodes of the tree, which represent calling identifications, are sequentially and uniformly distributed on a ring. Then, by recursively merging the child nodes, the tree is reformulated into a radial tree view in Figure 2(c). Finally, as the user interactively expands or collapses tree nodes, the corresponding interval on the ring encodes the size of the clustered group.
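The mapping of the cluster tree onto the ring can be illustrated with a short sketch. It assumes the tree is given as nested tuples whose leaves are phone IDs, and that "uniformly distributed" means every leaf receives an equal angular slot; the grouping in `TREE` and the function names are hypothetical, not taken from the paper.

```python
def leaves(tree):
    """Return leaf IDs of a nested-tuple tree in left-to-right order."""
    if isinstance(tree, tuple):
        return [leaf for child in tree for leaf in leaves(child)]
    return [tree]


def ring_intervals(tree):
    """Return a list of (node, depth, start_deg, end_deg) entries.

    Leaves get equal angular slots on the ring. An internal node spans
    the slots of all leaves below it, so the length of its arc encodes
    the size of the clustered group.
    """
    slot = 360.0 / len(leaves(tree))
    out = []

    def visit(node, depth, first_leaf_index):
        size = len(leaves(node))
        start = first_leaf_index * slot
        out.append((node, depth, start, start + size * slot))
        if isinstance(node, tuple):
            offset = first_leaf_index
            for child in node:
                visit(child, depth + 1, offset)
                offset += len(leaves(child))

    visit(tree, 0, 0)
    return out


# Hypothetical grouping of four phone IDs into two clusters.
TREE = ((133, 178), (21, 70))

for node, depth, start, end in ring_intervals(TREE):
    print(depth, node, start, end)
```

Collapsing a node in an interactive view would then amount to drawing only its own arc and skipping the entries of its descendants.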
Spiral View. We introduce a spiral view in which a subject is placed at the center, and the directly or indirectly connected groups are arranged along a spiral line according to their calling closeness in the radial kd-tree view. From the spiral view, users can observe not only one subject's social closeness to other subjects, but also the social closeness among subjects who are equally close to the center subject. In Figure 2(d), the leaf nodes in the sibling branch of the center person are the closest, so this sibling branch is cut out to form a community and placed first on the spiral line. The next community, placed second on the spiral line, consists of the sibling branches of the center person's parent node, and so on.
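The ordering of communities along the spiral can be sketched as a walk over the cluster tree: the branches that hang off the path from the center subject up to the root are collected, nearest first. The nested-tuple tree, the IDs in `TREE`, and the function names here are assumptions made for illustration only.

```python
def leaves(tree):
    """Return leaf IDs of a nested-tuple tree in left-to-right order."""
    if isinstance(tree, tuple):
        return [leaf for child in tree for leaf in leaves(child)]
    return [tree]


def spiral_order(tree, center):
    """List communities from closest to farthest for `center`.

    Walking from the root down to the center leaf, the subtrees that
    branch off the path at each level hold subjects that are equally
    close to the center. Deeper branches are closer, so the collected
    list is reversed before it is returned.
    """
    if center not in leaves(tree):
        raise ValueError(f"{center!r} is not a leaf of the tree")
    communities = []
    node = tree
    while isinstance(node, tuple):
        off_path = []
        for child in node:
            if center in leaves(child):
                next_node = child
            else:
                off_path.extend(leaves(child))
        if off_path:
            communities.append(off_path)
        node = next_node
    communities.reverse()
    return communities


# Hypothetical cluster tree over six subject IDs.
TREE = (((1, 2), 3), (4, (5, 6)))

print(spiral_order(TREE, 1))
print(spiral_order(TREE, 5))
```

Each inner list corresponds to one community placed at successive positions on the spiral line.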
Statistical View. The statistic view supports effective analysis and visualization of call durations for each group by using radar charts. From Figure 2(d), we can see that a person's close contacts are always located close to this person in the tree, which makes the measured information easy to read in a radar chart without loss of visual detail, because all the information bars are close to each other. We use a fan glyph to represent the statistical information (Figure 2(h)). The length of the glyph measures the information for the corresponding group, and the set of groups yields several separate fans around the center. A calling network usually carries several kinds of information, e.g., incoming and outgoing calls. To distinguish them, we represent different kinds of information in different colors.
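The per-group quantities behind such glyphs could be computed as in the following sketch. It assumes that "calling time" is the total duration of calls the subject placed and "listening time" the total duration of calls the subject received; the group assignment in `GROUP_OF` and the function name are hypothetical. The records reuse the From, To and Duration columns of the Figure 2 excerpt.

```python
from collections import defaultdict

# (from_id, to_id, duration) taken from the CDR excerpt in Figure 2.
RECORDS = [
    (133, 21, 778), (133, 178, 1141), (133, 70, 1215),
    (178, 133, 1137), (133, 178, 1183), (70, 133, 1143),
    (178, 133, 887), (133, 21, 1129), (133, 21, 1190),
    (21, 133, 803), (133, 178, 1597), (133, 21, 923),
    (70, 133, 828),
]

# Hypothetical assignment of contacts to social groups.
GROUP_OF = {21: "group1", 178: "group1", 70: "group2"}


def group_call_times(records, subject, group_of):
    """Return {group: [calling_time, listening_time]} for one subject.

    calling_time sums durations of calls the subject placed to members
    of the group; listening_time sums durations of calls received from
    them. Contacts without a group assignment are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    for src, dst, duration in records:
        if src == subject and dst in group_of:
            totals[group_of[dst]][0] += duration
        elif dst == subject and src in group_of:
            totals[group_of[src]][1] += duration
    return dict(totals)


for group, (out_time, in_time) in sorted(
        group_call_times(RECORDS, 133, GROUP_OF).items()):
    print(group, out_time, in_time)
```

Each pair of totals would map to the lengths of two differently colored fans for that group.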
3 RESULTS ANALYSIS AND CONCLUSION
The interface and visual analysis system were implemented in the Processing language [3]. We conducted case studies on two CDR datasets. One is the MIT Reality Mining dataset, in which seventy-five subjects are either students or faculty in the MIT Media Laboratory, and twenty-five are incoming students at the MIT Sloan business school. The other is the IEEE VAST 2008 challenge dataset, which records the phone calls of about 400 persons over a ten-day period.
The overview radial tree is shown in Figure 1(a). A user can choose a group of interest by traversing the tree. In the biggest tree of Figure 1(a), there are two branches from the root. This implies that these people can be divided into two groups: one consists entirely of people from the Sloan Business School at MIT, and the other is mostly Media Lab graduate students, among whom ID 78 and ID 40 are two of the four Media Lab first-year graduate students. This finding is confirmed by the user identification survey.
Moreover, the proposed spiral view helps the user easily compare a subject's social relationships with multiple contacts. Figure 1(d) shows the social groups of ID 11. People who have a relationship with ID 11 can be grouped into 14 communities according to how close they are to ID 11. The closest community includes ten people (ID 4, 5, 8, 12, 13, 31, 23, 60, 102, and 104), and the social relationships among them can be observed in the radial tree layout. In a similar manner, we can find the second closest community, which contains only ID 106.
The CDR dataset from the IEEE VAST Challenge involves up to 400 persons, and Figure 3(a) shows that our method works for large scale networks. Here, we focus on how to reveal the statistical information. Take ID 2 as an example. Figure 3(a) clearly reveals the closest social communities of ID 2 in the compact social network. From Figure 3(b), we can find two groups of people that are closely connected with ID 2. One has far more calls in the daytime than at night, while the other has almost the same calling time in the day and at night. It can be inferred that the former group is his or her work community, and the latter is his or her private friends.
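The day and night comparison can be illustrated with a small sketch that uses the two time windows given in the Figure 3 caption. It assumes that a call is classified by its start time and that the ratio is taken over call duration rather than call count; the paper does not state either choice. Calls starting between 7:30 and 7:59 fall in neither window of the caption, so this sketch leaves them uncounted. The function name and the values in `CALLS` are hypothetical.

```python
def day_night_ratio(calls):
    """Return (day_share, night_share) of the classified call time.

    calls: iterable of (hhmm, duration), with hhmm such as "1256".
    Day window: 08:00 to 18:29. Night window: 18:30 to 07:29.
    """
    day = night = 0
    for hhmm, duration in calls:
        minutes = int(hhmm[:2]) * 60 + int(hhmm[2:])
        if 8 * 60 <= minutes <= 18 * 60 + 29:
            day += duration
        elif minutes >= 18 * 60 + 30 or minutes <= 7 * 60 + 29:
            night += duration
        # 07:30 to 07:59 is outside both windows and is ignored.
    total = day + night
    if total == 0:
        return 0.0, 0.0
    return day / total, night / total


# Hypothetical calls between one subject and one contact.
CALLS = [("0900", 300), ("1200", 100), ("2100", 100), ("0745", 50)]

print(day_night_ratio(CALLS))
```

A contact whose day share is far above the night share would match the work community pattern described above, while roughly equal shares would match the private friends pattern.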
Fig. 3. The radial tree view from the VAST Challenge dataset. (a) The overview of the social relationships in the whole dataset. (b) The radar tree chart of ID 2 in one year, where orange represents his or her ratio of call times with every person from 8:00 to 18:29, and blue represents his or her ratio of call times with every person from 18:30 to 7:29.
In this paper, we present a visual analytics system for CDR data with three linked views that depict the calling network of one subject; it can be regarded as an analysis technique for the relationship between an individual and his or her social groups. With the proposed approach, we can easily classify the network data into different communities, identify persons with similar lifestyles, and compare the communication networks of different persons. Although our current visualization method can handle calling networks with up to thousands of people, it loses visual detail when users analyze larger networks on a common laptop screen. This problem can be solved by constructing a multi-level hierarchical radial kd-tree view of the group datasets, and preserving subjects who have strong connections with the central subject in the statistical view. In the future, we will work on visualizing more complex information in the calling network.
REFERENCES
[1] N. Eagle, A. S. Pentland, and D. Lazer. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, Aug. 2009.
[2] M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12):7821–7826, June 2002.
[3] C. Reas, B. Fry, and J. Maeda. Processing: A Programming Handbook for Visual Designers and Artists. The MIT Press, 2007.