

CS230 Final Project Report

Hsu, Shu-Chi
UCI ID: #13924048

Cheng, Huang-Yao (Even)
UCI ID: #41123717

Simulator for understanding message propagation on social network

Abstract

People communicate through many different channels in daily life. Before the Industrial Age, the most common way to deliver a message was to have a person carry it. With the invention of the telegraph, the telephone, and the Internet, technology has drastically changed how people communicate and, in most cases, shortened delivery time. Today, online social network websites such as Facebook and Twitter are widely used because of the strong connections among their user groups, and because of their immediacy a piece of news can spread around the world in a short time while attracting a great deal of attention. The goal of this project is to develop a simulator for testing message propagation over social networks that have multiple communication channels, e.g., phone, email, SMS, and OSN. The simulator explores the performance of various kinds of message propagation (unicast, multicast, and broadcast) over social network channels with different characteristics.

1. Introduction

Online social networks (OSNs) are among the most popular websites on the World Wide Web today, and millions of users communicate and share messages with their friends through OSNs such as Twitter and Facebook. With the rapidly increasing membership of OSNs and other communication channels, relationships between people have become very complicated. To understand the role of each channel, it is worth constructing a model of the activity among a large number of users. With such a model, we can understand how to propagate a message efficiently.

We use an input data set that contains fifty thousand users and their neighbors, and we construct a corresponding map from it. We assign properties to these users, such as their location, their communication channels with other users, and their preferences, and we set a different delay time for each channel. We design a set of events and store them in a queue ordered by time. During the simulation, we collect propagation data such as the execution time and the number of times a message is forwarded. Finally, we analyze these data to find the most important factor affecting propagation and to minimize the execution time and the number of forwards needed in the simulation.


Figure 1: A simulated illustration of a local network. In the diagram, E receives messages from almost everyone, while H tends to serve as an active propagator that sends messages out through the network.

2. Related Works

In the social sciences, researchers have long investigated how human beings communicate. According to S. W. Littlejohn et al., communication can be defined along three dimensions: level of observation, intentionality, and normative judgement.[1] For example, the definition of communication as “The process that links discontinuous parts of the living world to one another” [2] belongs to the first dimension. Some definitions include intention: “Those situations in which a source transmits a message to a receiver with conscious intent to affect the latter’s behaviors.”[3] A definition that does not require intent follows: “It is a process that makes common to two or several what was the monopoly of one or some.”[4] Finally, a definition in the third dimension, which implies that the communication succeeds, is: “Communication is the verbal interchange of a thought or idea.”[5]

When the idea is carried over to computer science research, the questions that interest us most are of the form “How fast can we propagate a message?” Sakaki et al. built a system that detects earthquakes by continuously monitoring the Twitter feed and filtering for posts that contain keywords such as “earthquake” and “shaking”.[6] Ghosh et al. addressed spam prevention by investigating a user and its number of connections.[7] Ando et al. proposed an access control method that applies a social network to a P2P network in order to eliminate information leaks and illegal file sharing.[8] Daly and Haahr proposed a solution that provides better performance and low end-to-end delay in disconnected MANETs.[9]

In our work, we develop a simulator of message propagation over multiple communication channels, from human contact, phone, and SMS to OSN, and measure how fast a message reaches a given level of coverage. We also want to know how the system reacts when some significant nodes fail and how a message propagates in that scenario.


3. Node Properties and Policies

3.1 Node Properties

We assume every node represents a person. Every node therefore has its own ID and location, and nodes join and leave the network from time to time. Each person also has friends with whom they want to share messages, and some people have a preferred way to communicate with their friends: email, phone, text message, social network, or human contact. There can therefore be multiple channels between two nodes, and a node may choose only one channel to propagate a message.

Figure 2: Properties of a node (user) in our system.
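The node properties described above could be captured in a small record type. The following is only a minimal sketch, not the data structure used in our simulator: the names `Channel`, `Node`, `is_online`, and `pick_channel` are hypothetical, and the rule "use the first preferred channel that exists on the edge" is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Channel(Enum):
    # The five channel types named in the report.
    OSN = "osn"
    EMAIL = "email"
    SMS = "sms"
    PHONE = "phone"
    HUMAN_CONTACT = "human_contact"


@dataclass
class Node:
    node_id: int
    location: Tuple[float, float]          # (x, y) position on the map
    join_time: int = 0                     # tick at which the node joins
    leave_time: Optional[int] = None       # tick at which it leaves; None = never
    preferred_channels: List[Channel] = field(default_factory=list)
    # neighbor id -> channels available on the edge to that neighbor
    neighbors: Dict[int, List[Channel]] = field(default_factory=dict)

    def is_online(self, tick: int) -> bool:
        """True while the node is part of the network at the given tick."""
        if tick < self.join_time:
            return False
        return self.leave_time is None or tick < self.leave_time

    def pick_channel(self, neighbor_id: int) -> Optional[Channel]:
        """ASSUMPTION: take the first preferred channel available on the edge,
        otherwise fall back to any available channel."""
        available = self.neighbors.get(neighbor_id, [])
        for channel in self.preferred_channels:
            if channel in available:
                return channel
        return available[0] if available else None
```

A node then chooses exactly one channel per neighbor, which matches the restriction that only one channel is used to propagate a message.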

3.2 User Types

We assign a user type to each node to approximate real-life situations more closely. The user types are teenagers, Computer Science major students, housewives, grandparents, and businessmen. They cover a wide range of ages, degrees of socialization, and preferred channels (detailed in Section 3.4). Beyond these stereotypical definitions, we also allow individual differences in our system. In the current settings, this variance is reflected in the delay of sending, transmitting, and receiving a message, and in the node's channel preference.


Figure 3a: Left: User Types in the system: Teenagers, Computer Science major students, Housewife, Grandparents, and Businessmen.

Figure 3b: Right: Communication channels in the system: Online Social Network(OSN), Email, SMS, Phone, and Human Contact.

3.3 Channel Types

The channel types are online social networks (OSN), email, SMS, phone, and human contact. The first two propagate messages over the Internet, and the third and fourth rely on wireless telecommunication services. Finally, we include human contact because it is one of the most effective ways to communicate: people who are close enough to each other can be updated in a very short time. In this spirit, our system assumes that when two people are close enough, they choose human contact as their preferred channel.

3.4 Message Forwarding Policies

Different policies affect multiple properties of how a message propagates. Some policies tend to forward a message to a large group of recipients, while others forward to a smaller number of recipients. Some communication channels are expensive in terms of transfer time, such as human-to-human contact, while others are cheap, such as a phone call or SMS. In our system we assign each node a character, and below we list a few basic stereotypes of message forwarding policies:


3.4.1 Teenagers
● Preferred Channel: SMS, Online Social Network
● Delay Time: low
● Broadcast Rate: medium
● Message Filtering Rate: low

3.4.2 Computer Science major students
● Preferred Communication Channel: Email, Online Social Network
● Delay Time: low
● Broadcast Rate: low
● Message Filtering Rate: high

3.4.3 Housewife
● Preferred Channel: Phone
● Delay Time: medium
● Broadcast Rate: medium
● Message Filtering Rate: low

3.4.4 Grandparents
● Preferred Communication Channel: Person to Person
● Delay Time: high
● Broadcast Rate: low
● Message Filtering Rate: medium

3.4.5 Businessmen
● Preferred Communication Channel: Phone
● Delay Time: low
● Broadcast Rate: high
● Message Filtering Rate: low
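The five stereotypes above can be written down as a lookup table. The sketch below copies the qualitative levels from Sections 3.4.1 to 3.4.5; everything numeric is an assumption, since the report does not state the values behind "low", "medium", and "high". The names `PROFILES`, `LEVEL_TO_PROBABILITY`, `forward_probability`, and `should_forward` are hypothetical, and combining the two rates as broadcast rate times (1 - filtering rate) is one possible reading, not necessarily the rule used in our simulator.

```python
import random

# Qualitative settings copied from Sections 3.4.1-3.4.5.
PROFILES = {
    "teenager":    {"channels": ["SMS", "OSN"],      "delay": "low",
                    "broadcast": "medium", "filtering": "low"},
    "cs_student":  {"channels": ["Email", "OSN"],    "delay": "low",
                    "broadcast": "low",    "filtering": "high"},
    "housewife":   {"channels": ["Phone"],           "delay": "medium",
                    "broadcast": "medium", "filtering": "low"},
    "grandparent": {"channels": ["Human contact"],   "delay": "high",
                    "broadcast": "low",    "filtering": "medium"},
    "businessman": {"channels": ["Phone"],           "delay": "low",
                    "broadcast": "high",   "filtering": "low"},
}

# ASSUMPTION: illustrative probabilities for the three qualitative levels.
LEVEL_TO_PROBABILITY = {"low": 0.2, "medium": 0.5, "high": 0.8}


def forward_probability(user_type):
    """ASSUMPTION: a message is forwarded to a neighbor when it survives
    filtering and the node decides to broadcast it."""
    profile = PROFILES[user_type]
    broadcast = LEVEL_TO_PROBABILITY[profile["broadcast"]]
    filtering = LEVEL_TO_PROBABILITY[profile["filtering"]]
    return broadcast * (1.0 - filtering)


def should_forward(user_type, rng):
    """Draw one forwarding decision for one neighbor."""
    return rng.random() < forward_probability(user_type)


rng = random.Random(0)
decision = should_forward("businessman", rng)
```

Under these assumed numbers a businessman forwards far more often than a CS major student, which is the ordering the stereotypes are meant to express.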

4. Implementation

Our goal is to build a map, a message queue, and an event queue, so that we can mimic real-life human behavior with these components. First, we read node information from the “trusted_network_ff_50000_1” file, which contains fifty thousand nodes and their neighbors. “trusted_network_ff_50000_1” is a node data set released by Facebook. Data files with more nodes exist, but some of them are too large for us. Since our goal is to analyze the influence of different policies on message propagation, 50,000 nodes are enough. The data structure we create for a node includes its ID, join time, leave time, location, neighbor list, and reciprocation with its neighbors. We then connect the nodes with edges, which represent the different channels between each node and its neighbors. This is the first stage of our work.

1 204 222 2 105 189 1062 202 96 1 117 293 448 182 2 69 37 2474 366 287 3 122 77 2765 465 412 1 73 283 656 350 177 2 46 154 257 263 35 271 147 2 59 185 101 8 7 390 385 0 140 212 60 292

Figure 4: Example of the information of a node in the storage file format.

Second, we create a message queue through which nodes send and receive messages. Whenever a node has a message to send, it calls a send function, creates a package containing the destination, the chosen channel, and the information it wants to share, and sends the package to the message queue. Once there are messages in the queue, the queue notifies the target nodes so that they know they have an incoming package, such as a text message, an email, or a new call. They can then retrieve the incoming package from the message queue.
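The send, notify, and receive cycle described above can be sketched as follows. This is an illustration under assumptions rather than our actual implementation: the names `Package`, `MessageQueue`, `register`, `send`, and `receive` are hypothetical, and notification is modeled as a plain callback.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Package:
    sender: int
    destination: int
    channel: str      # e.g. "SMS", "Email", "Phone"
    payload: str      # the information the sender wants to share


class MessageQueue:
    def __init__(self):
        self._inbox = defaultdict(deque)   # destination id -> pending packages
        self._listeners = {}               # destination id -> notify callback

    def register(self, node_id, on_notify):
        """Let a node ask to be notified about incoming packages."""
        self._listeners[node_id] = on_notify

    def send(self, sender, destination, channel, payload):
        """Wrap the message in a package, enqueue it, notify the target."""
        package = Package(sender, destination, channel, payload)
        self._inbox[destination].append(package)
        listener = self._listeners.get(destination)
        if listener is not None:
            listener(destination)

    def receive(self, node_id):
        """Return the oldest pending package for the node, or None."""
        inbox = self._inbox.get(node_id)
        if inbox:
            return inbox.popleft()
        return None


notified = []
queue = MessageQueue()
queue.register(2, notified.append)
queue.send(sender=1, destination=2, channel="SMS", payload="hello")
incoming = queue.receive(2)
```

After the call to `send`, node 2 has been notified once and `receive(2)` hands back the package that node 1 created.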

Third, we give the nodes and edges their own attributes. For example, a node may represent a computer nerd who spends most of his time at his computer. As a result, the ways he most commonly communicates with others are email and online social networks, so most of the channels connected to this node are email or online social network channels. Compared with talking in person, these channels are less affected by distance, so their delay time is lower than that of human contact.

Figure 5: Users interact with the simulator through the send and receive functions. Inside the simulator, we construct a message propagation queue and a world event queue; the latter simulates scenarios such as earthquakes or other natural disasters, in which the behavior of some nodes changes.

Next, we make the map dynamic. People are sometimes in places where they cannot receive messages from others and return after a while, so nodes can go up and down at specific times. We design a sequence of events and store them in an event queue ordered by time. During the simulation, these events occur according to their schedule. In this way, our model is closer to real life.

Finally, with all the components above, we can study how a message propagates in the simulator. We create different kinds of messages and send them in various ways. We then collect the time and the number of forwards needed to deliver these messages to a given number of nodes. Ultimately, we can find the relationship between users’ policies and the efficiency of message dissemination in real life.


5. Framework

5.1 Maps

Each point in the image represents a user in the network. A red point has not received the message, a green point has received the message, a black point is a user who has left the network, and a yellow point is a user who is sending the message. A line between two points represents a channel between the two users. There are five kinds of channels in our simulation: email, OSN, SMS, phone, and human contact. White lines are channels that are currently in use.

5.1.1 Sparse map

This map illustrates a countryside scenario in which people are far apart, and the nodes are distributed uniformly over the map.

5.1.2 Urban map

In this map we design a crowded area at the center, representing a city, with some nodes distributed randomly outside the city. Since, as mentioned earlier, people choose human contact as their preferred channel when they are close enough to each other, we expect to see the effect of human contact in this map.


5.1.3 Two Cities map

After constructing the one-city map, we had a stronger incentive to see how messages propagate across crowded areas. In the center of the map we set up two cities, and the remaining area represents the rural area.


5.1.4 Three Cities map

The three cities map is the last scenario in our system. We construct three crowded areas, as shown in the following figure, and the center city serves as a communication channel between the city to the west and the city to the east. In this map we want to see how the west city and the east city communicate with each other via the city in the center.
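The four maps differ only in how node locations are drawn. The sketch below shows one way to generate them; it is not the generator used in our simulator. The function names, the map size, the share of nodes placed in cities (`city_share`), and the use of a Gaussian cluster with standard deviation `spread` around each city center are all assumptions for illustration.

```python
import random


def uniform_points(n, width, height, rng):
    """Nodes spread uniformly over the map (the sparse, rural pattern)."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def city_points(n, center, spread, width, height, rng):
    """ASSUMPTION: a city is a Gaussian cluster clipped to the map bounds."""
    points = []
    for _ in range(n):
        x = min(max(rng.gauss(center[0], spread), 0.0), width)
        y = min(max(rng.gauss(center[1], spread), 0.0), height)
        points.append((x, y))
    return points


def build_map(n, width, height, city_centers, city_share=0.6, spread=30.0, seed=0):
    """Place n nodes: a share inside the cities, the rest uniformly outside."""
    rng = random.Random(seed)
    if not city_centers:
        return uniform_points(n, width, height, rng)
    in_cities = int(n * city_share)
    points = uniform_points(n - in_cities, width, height, rng)
    base, extra = divmod(in_cities, len(city_centers))
    for i, center in enumerate(city_centers):
        count = base + (1 if i < extra else 0)   # split residents evenly
        points += city_points(count, center, spread, width, height, rng)
    return points


sparse_map = build_map(300, 600, 600, [])
urban_map = build_map(300, 600, 600, [(300, 300)])
two_cities_map = build_map(300, 600, 600, [(200, 300), (400, 300)])
three_cities_map = build_map(300, 600, 600, [(100, 300), (300, 300), (500, 300)])
```

With these assumed parameters, the urban map concentrates most nodes near the center, while the three cities map lines its clusters up from west to east so that the middle city sits between the other two.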

5.2 Scheduling Events

In our simulator, we set up a queue that stores a series of events ordered by their timestamps. As system time advances, the program pulls out the topmost event whose timestamp equals the current logical time and brings it to its execution stage.

There are four kinds of events: NODE_ON, NODE_OFF, MESSAGE_SENDING, and MESSAGE_RECEIVING. NODE_ON and NODE_OFF events are generated and inserted into the queue before the simulation starts, to model people going online and offline all the time.
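A time-ordered event queue with these four event kinds can be sketched with Python's `heapq` module. Note that this swaps in a binary heap for the sorted queue described in this section; the observable behavior (events come out in timestamp order) is the same. The class and method names (`EventType`, `EventQueue`, `schedule`, `pop_due`) are hypothetical, and the tie-breaking counter is an assumption that keeps events with equal timestamps in insertion order.

```python
import heapq
import itertools
from enum import Enum


class EventType(Enum):
    NODE_ON = 1
    NODE_OFF = 2
    MESSAGE_SENDING = 3
    MESSAGE_RECEIVING = 4


class EventQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker for equal timestamps

    def schedule(self, timestamp, kind, node_id, payload=None):
        """Insert an event; the heap keeps the earliest timestamp on top."""
        entry = (timestamp, next(self._counter), kind, node_id, payload)
        heapq.heappush(self._heap, entry)

    def pop_due(self, now):
        """Remove and return every event whose timestamp is <= now."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            timestamp, _, kind, node_id, payload = heapq.heappop(self._heap)
            due.append((timestamp, kind, node_id, payload))
        return due

    def __len__(self):
        return len(self._heap)


events = EventQueue()
# NODE_ON / NODE_OFF events are inserted before the simulation starts.
events.schedule(2, EventType.NODE_ON, node_id=3)
events.schedule(5, EventType.MESSAGE_SENDING, node_id=1)
events.schedule(5, EventType.NODE_OFF, node_id=2)
```

Calling `events.pop_due(now)` once per tick then yields exactly the events that should enter their execution stage at that logical time.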

The simulation starts with a seed node that carries the initial message and tells the system to schedule it into the event queue. In the message sending stage, the node goes over all of its neighbors and, depending on its own broadcast rate, decides for each neighbor whether to propagate the message and which channel to use for the transaction. Finally, a message receiving event is formed and added to the event queue, with a timestamp computed from the sum of the delays in sending and transmitting the message between the two nodes.

In the message receiving stage, the program dequeues the event from the event queue and parses the information it carries. Since our system models a world in which only one message propagates through the whole map, we only need to keep track of whether this message has been received and read. It would be easy to extend the system to a scenario with multiple messages propagating in the world. When a message is received and has not been read before, we allow the node to form a message sending event in order to propagate the message to its neighbors. The timestamp of the new message sending event takes into account the current time and the node's delay in receiving a message, and the event is inserted into the queue at the proper position based on its timestamp.
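The sending and receiving stages can be put together in a compact loop. The following is a simplified sketch, not our simulator: it ignores NODE_ON and NODE_OFF events and channel choice, uses a single transmission delay for every edge, and all names (`simulate`, `broadcast_rate`, `send_delay`, `transmit_delay`, `receive_delay`) are hypothetical. It follows the two timestamp rules stated above: a receiving event is scheduled at the current time plus the sending and transmission delays, and a follow-up sending event at the current time plus the receiver's receiving delay.

```python
import heapq
import itertools
import random


def simulate(neighbors, seed_node, broadcast_rate, send_delay,
             transmit_delay, receive_delay, rng):
    """Return {node: tick at which the node first received the message}."""
    counter = itertools.count()                       # tie-breaker for the heap
    events = [(0, next(counter), "MESSAGE_SENDING", seed_node)]
    received_at = {seed_node: 0}                      # the seed holds the message
    while events:
        now, _, kind, node = heapq.heappop(events)
        if kind == "MESSAGE_SENDING":
            for neighbor in neighbors[node]:
                # The broadcast rate decides, per neighbor, whether to forward.
                if rng.random() < broadcast_rate[node]:
                    arrival = now + send_delay[node] + transmit_delay
                    heapq.heappush(
                        events,
                        (arrival, next(counter), "MESSAGE_RECEIVING", neighbor))
        else:  # MESSAGE_RECEIVING
            if node not in received_at:               # only the first copy counts
                received_at[node] = now
                heapq.heappush(
                    events,
                    (now + receive_delay[node], next(counter),
                     "MESSAGE_SENDING", node))
    return received_at


# A three-node chain 0 - 1 - 2 in which everybody always forwards.
chain = {0: [1], 1: [0, 2], 2: [1]}
always = {0: 1.0, 1: 1.0, 2: 1.0}
one_tick = {0: 1, 1: 1, 2: 1}
three_ticks = {0: 3, 1: 3, 2: 3}
arrival_ticks = simulate(chain, 0, always, one_tick, 2, three_ticks,
                         random.Random(0))
```

In this chain, node 1 receives the message at tick 0 + 1 + 2 = 3, schedules its own sending event at tick 3 + 3 = 6, and node 2 therefore receives the message at tick 6 + 1 + 2 = 9.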

The following figure demonstrates how event nodes and the sorted event queue work in our simulation system.

6. User Interface

The system provides a visual display instead of a complex text log and distinguishes the status of nodes and active channels with colors. It also gives users a better idea of how the message propagates over time and through which nodes. The right-hand side starts, from the top, with the current logical time and the total number of messages received. In the center, legends are placed immediately beside the dynamic graph, which makes the display convenient and easy to understand. The bottom part shows the steps of message reception as percentages, together with the time at which each percentage is reached.
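The percentage steps shown at the bottom of the interface can be derived from the ticks at which nodes first received the message. The sketch below is illustrative only; the function name `coverage_milestones` and the choice of steps (25%, 50%, 75%, 90%) are assumptions, not the exact values displayed by our interface.

```python
def coverage_milestones(receive_ticks, total_nodes, steps=(25, 50, 75, 90)):
    """Map each coverage percentage to the tick at which it was reached.

    receive_ticks: ticks at which individual nodes first got the message.
    Percentages that were never reached are left out of the result.
    """
    ordered = sorted(receive_ticks)
    milestones = {}
    for percent in steps:
        # Smallest number of nodes that makes up at least `percent` percent.
        needed = (percent * total_nodes + 99) // 100
        if 0 < needed <= len(ordered):
            milestones[percent] = ordered[needed - 1]
    return milestones


ticks = [0, 3, 3, 5, 8, 9, 12, 20, 31, 40]
full = coverage_milestones(ticks, total_nodes=10)
partial = coverage_milestones(ticks, total_nodes=20)
```

For the ten receive ticks above and ten nodes in total, 50% coverage needs five nodes and is reached at tick 8; with twenty nodes in total, the same ten receptions never reach 75%.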


7. Results

The settings of this simulation are as follows:

● Number of nodes: 300
● Average number of friends for each node: 10
● Length of one tick: 10 milliseconds
● Distribution of number of friends of nodes: long tail distribution

7.1 Map type


The y axis represents the percentage of nodes that received the message. In all the maps in this chart, the user types of the nodes are uniformly distributed. We find that message propagation is faster in maps with a crowded area. Propagation is slower in the sparse map because of delay time: it takes people in a sparse area more time to send messages to their friends. Among the city maps, the two cities map is the fastest, because it is easier to connect the urban and sparse areas in a map with two cities than in the urban map, which has only one city. The three cities map spends too much time exchanging messages between cities, which are not as close together as those in the two cities map, and this slows the whole propagation.

7.2 Uniform Node Distribution


The y axis represents the percentage of nodes that received the message. From this chart, we can see how specialists influence message dissemination when the center of the map has more specialists than the rest of the map. The map with more businessmen is the fastest, and the map with more college students is the slowest. Businessmen like to communicate with their friends by cell phone, which has little transmission delay. Their broadcast rate is also high, which means business users are more likely to share a message with their friends. CS major college students, however, like to share their messages on the social network, which has larger sending and receiving times. As a result, it takes more time to propagate a message in this map.

7.3 Urban Map


The y axis represents the percentage of nodes that received the message. In the urban map, most people in the city are specialists, while most people in the countryside are housewives and grandparents. Although grandparents like to meet their friends in person, housewives and businessmen like to call their friends, so it is easier and faster for these two kinds of users to contact grandparent users. Therefore, the business city is the fastest one. College students, on the other hand, are of little help in informing grandparent users, since grandparents do not like to use social networks. As a result, the college city is the slowest one.

7.4 Business Area at the Center

The y axis represents the percentage of nodes that received the message.

This chart illustrates how helpful it is to gather businessmen in the city. Since the urban map has more population in the center area, the ratio of businessmen there is higher than in the sparse map, and the efficiency of message dissemination increases due to this population factor. In addition, propagation delay in the urban area is lower than in the countryside, so the urban map needs only about half the time of the sparse map to propagate the message to 90% of the nodes.

7.5 Student Area at the Center

The y axis represents the percentage of nodes that received the message. Although college students prefer the social network, which is the slowest medium for sending messages, having a crowded area at the center still greatly increases the rate of message exchange. From the two charts above, we find that gathering people at the center helps increase the rate of message propagation. Sending a message within a city is fast, so the number of nodes that have received the message grows easily in the early stage of dissemination, and more nodes holding the message means more nodes sending it to their friends. In short, it is best to send as many messages as possible at the beginning. Since the cell phone has the smallest delay time, it is the best way for users to contact their friends in our simulation system, which is why maps with more businessmen spend less time on propagation.

7.6 The impact of Human Contact


The y axis represents the percentage of nodes that received the message. At the end of our project, we implemented the human contact channel to find out how human contact affects the propagation rate. This channel has the smallest sending delay but the highest transmission delay. Compared with the original chart above, the urban map is now the fastest map, because human contact takes less time in the centralized map than in the other maps. Overall, this channel greatly slows down the propagation rate in the simulation. Although the two and three cities maps also have crowded areas, their nodes are not as close together as in the urban map, which has only one city. They are still faster than the sparse map, which has no city. Therefore, it is important to account for the distance between nodes if we take human contact into consideration.
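The role of distance can be made concrete with a small sketch. The report does not give the actual delay model, so everything numeric here is an assumption: the contact radius, the walking speed, and the fixed per-channel transmission delays are illustrative values, and the names `choose_channel` and `transmission_delay` are hypothetical. The only properties taken from the text are that nearby nodes prefer human contact, that the phone has the smallest delay, and that human contact has the highest transmission delay.

```python
import math

# ASSUMPTION: illustrative constants, not values from our simulator.
CONTACT_RADIUS = 50.0          # map units within which people meet in person
WALKING_SPEED = 5.0            # map units covered per tick
FIXED_TRANSMIT_DELAY = {"Phone": 1, "SMS": 2, "Email": 5, "OSN": 8}  # ticks


def distance(a, b):
    """Euclidean distance between two (x, y) locations."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def choose_channel(a, b, preferred):
    """Nodes that are close enough use human contact (Section 3.3)."""
    if distance(a, b) <= CONTACT_RADIUS:
        return "Human contact"
    return preferred


def transmission_delay(channel, a, b):
    """ASSUMPTION: human contact costs travel time proportional to distance;
    every other channel has a fixed, distance-independent delay."""
    if channel == "Human contact":
        return math.ceil(distance(a, b) / WALKING_SPEED)
    return FIXED_TRANSMIT_DELAY[channel]


near = choose_channel((0, 0), (30, 40), preferred="Phone")
far = choose_channel((0, 0), (300, 400), preferred="Phone")
near_delay = transmission_delay(near, (0, 0), (30, 40))
far_delay = transmission_delay(far, (0, 0), (300, 400))
```

Under these assumptions two nodes 50 units apart meet in person and pay 10 ticks of transmission delay, while two nodes 500 units apart fall back to the phone and pay 1 tick, which illustrates why adding human contact can slow overall propagation even though it is chosen only by nearby nodes.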

8. Project Schedule

Week | Schedule detail | Meeting with Kyungbaek
Week 3 | Discuss and Decide Project, Have a Meeting with Kyungbaek | v
Week 4 | Implement File Reader |
Week 5 | Survey Papers, Implement Socket Functions, Discussed Possible Channels in the Simulator | v
Week 6 | Survey Papers, Organize Paper Survey Report |
Week 7 | Implement Sender and Receiver Functions, Dynamic Node Events and the Structure of Simulator | v
Week 8 | Implement the Structure of Simulator, and Policies |
Week 9 | Testing and Tuning | v
Week 10 | Testing and Tuning, Organize Paper for Final Project |
Final Week | Prepare Slides for Presentation |

9. References

1. Hyunwoo Park. Social Networking Platform Competition Simulation. Term Paper: INFO 290-7 IT Economics, Strategy and Policy. 13 December 2008.

2. Littlejohn, S. W. (1989). Theories of Human Communication. Belmont, CA: Wadsworth.

3. Jürgen Ruesch, “Technology and Social Communication,” in Communication Theory and Research, ed. L. Thayer (Springfield, IL: Thomas, 1957), 462.

4. Gerald R. Miller, “On Defining Communication: Another Stab,” Journal of Communication 16 (1966): 92.

5. F. A. Cartier, “The President’s Letter,” Journal of Communication 9 (1959): 5.

6. Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. Earthquake Shakes Twitter Users: Real-time Event Detection by Social Sensors. In Proceedings of the Nineteenth International WWW Conference (WWW2010). ACM, 2010.

7. S. Ghosh, G. Korlam, and N. Ganguly. The Effects of Restrictions on Number of Connections in OSNs: A Case-Study on Twitter. In Proc. of WOSN (2010).

8. K. Ando, A. Fukagai, K. Ohshima, and M. Terada. DHT Network with Link Access Control Using a Social Network. In Proc. 2008 Int. Symposium on Applications and the Internet, pages 18–25, 2008.

9. E. Daly and M. Haahr. Social Network Analysis for Routing in Disconnected Delay-Tolerant MANETs. In Proceedings of ACM MobiHoc, 2007.

10. Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications.