CMPE 275
Project 1 Report
MOOC Communication Backbone
Team:
Bhushan Deo (009269403)
Deven Pawar (009275994)
Dhruv Gogna (005067284)
Gaurav Bhardwaj (009297431)
Vinay Bhore (009290931)
Introduction
The main goal of Project 1 is to simulate behavior close to that of MOOCs, without the web service calls (and UI). The project focuses on implementing the underlying functionality of a MOOC, relying on network coordination and communication between clusters and their respective nodes, and on problem solving for the end user. Discovery of clusters and problem solving remain transparent to the end user. Clusters can communicate among themselves to solve some jobs when needed. This addresses a major concern: a cluster may be unable to process a client's request by itself, since no two clusters can share a database.
Within a cluster, however, nodes may or may not share a database (depending on how the cluster is implemented). Each cluster follows a specific network topology based on its implementation. Along with this, every cluster needs to answer a few basic questions:
1. How and when is a leader election conducted?
2. What happens when the leader goes down? Is that a single point of failure for the whole topology?
3. How does the network handle a node that was down coming back up?
4. What happens when a node goes down in a topology like a line or ring, breaking the chain/ring?
In this report, we provide a detailed explanation of our cluster's topology, leader election strategy, functionality provided, database used, and its implementation.
High Level Architecture
The team’s cluster contains 5 nodes, any of which can be used to serve the client. Database configuration
has been described in later sections. Logically, the nodes are connected in a mesh topology (every node
can connect to every other node in the cluster).
Mesh Topology Advantages:
1. Easy communication across nodes, as each node directly knows every other node in the cluster.
2. Low probability of a network partition.
Disadvantages:
1. High message-passing overhead
2. Not scalable to a large number of nodes
Functionalities
Our project provides three functionalities to the user (client) independently. There is also an inter-MOOC functionality, which is described in a later section. Note that these functionalities are provided on the public port, while internal/management message passing happens on the management port.
1. List Courses: List the courses provided by the MOOC
2. Course Description: Show the details for a particular course that is input by the user, from the list
of courses returned in functionality 1.
3. Questions and Answers: Show a list of questions. The client can then choose to select any question
and view all the available answers.
The functionality is implemented so that the complexity of processing the request stays hidden from the client. Also, since Protobuf is used as an efficient data interchange format, the query response from the server to the client is very quick.
Actual processing of the request is done in several stages. First, we discuss how the request is constructed and its various parts, to understand what data flows from client to server and vice versa.
Request Structure
The request is divided into two main parts: Header and Payload.
Following are the key fields in Header and Payload:
1. Header:
header.routing_id: This is used by the server for instantiating appropriate Resource depending on
whether the request is a PING, JOB, etc.
2. Payload:
body.job_op.data.name_space: This is used to differentiate what kind of functionality is being
requested by the client.
Functionality                    Value of body.job_op.data.name_space
List Courses                     listcourses
Get Course Description           coursedescription
See Questions                    seeq
View Answers for a Question      seea
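As a rough illustration, server-side dispatch on this field can be sketched as follows. Note this is a sketch only: the actual server is written in Java, and the handler names below are hypothetical, not the project's real resource classes.

```python
# Hypothetical sketch of dispatching a client request by name_space.
# The handler names are invented for illustration.

HANDLERS = {
    "listcourses": "ListCoursesHandler",
    "coursedescription": "CourseDescriptionHandler",
    "seeq": "SeeQuestionsHandler",
    "seea": "SeeAnswersHandler",
}

def dispatch(name_space: str) -> str:
    """Return the handler responsible for the requested functionality."""
    try:
        return HANDLERS[name_space]
    except KeyError:
        raise ValueError("unknown name_space: " + name_space)

print(dispatch("listcourses"))  # ListCoursesHandler
```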
Interaction between Server and the Database:
A client request requires the server to hit the MongoDB database to gather the relevant information. We chose MongoDB because it is much more than a key-value store: it uses a dynamic schema, and its NoSQL document model makes CRUD operations quite easy. It is also very easy to create replica sets on the other nodes in the same cluster, with the primary handling reads and writes and the secondaries serving reads.
The resulting dataset is JSON that needs to be parsed into Java objects/primitives so that we can use them to build the response. That is where the Jackson parser comes into the picture.
Jackson Parser:
The Jackson parser takes JSON input and converts it into Java beans, which are then used to build the response. These beans are put into a LinkedHashMap. We prefer a LinkedHashMap over a HashMap because it is an ordered data structure: entries come back in the order they were inserted.
Finally, we create a response of type Request, setting the header and payload in the same manner as described above.
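The server performs this step with Jackson in Java; as a conceptually parallel sketch, Python's standard json module can parse into an order-preserving map, illustrating the same insertion-order property a LinkedHashMap provides. The course data below is made up for illustration.

```python
import json
from collections import OrderedDict

# Sketch of the parse-then-ordered-map step. The real server uses the
# Jackson ObjectMapper to turn JSON into Java beans held in a LinkedHashMap;
# here we parse into an OrderedDict to show the same ordering guarantee.
raw = '{"CS101": "Intro to Programming", "CS160": "Software Engineering", "CS275": "Enterprise Software"}'
courses = json.loads(raw, object_pairs_hook=OrderedDict)

# Iteration yields entries in the order they were parsed/inserted,
# which is what a LinkedHashMap guarantees (and a plain HashMap does not).
assert list(courses) == ["CS101", "CS160", "CS275"]
```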
Database
Distributed database in the cluster: MongoDB Replica Set (5 nodes in total, thus 1 primary + 4 secondaries; since clients only read, any client can access the MOOC from any node).
Usage of MongoDB - where
Our cluster uses MongoDB as persistent storage to serve client requests. Since, in our design, every node in the cluster can serve client requests independently, i.e. without any coordination with other members of the cluster, we keep a replicated set of the data on each node. The diagram makes the design clear:
Configuration
All replica-set settings are provided in a conf file in JSON format, which is used to initiate the replica-set configuration. These settings can be changed on the fly.
1. Replication frequency: secondary members start fetching data from the primary's oplog after exactly this interval. For testing purposes we set it to 5 seconds.
2. Port: each node starts its mongod instance on port 27017.
3. Read & write preference: each secondary member is capable of serving read requests from clients.
4. Any node can be set as an arbiter so that elections proceed properly when there is an even number of participants.
Commands to start the MongoDB instance on each node:
1. Start mongod on each node from the shell.
2. sudo mongod --port 27017 --dbpath /srv/mongodb/rs0-0 --replSet rs0 --smallfiles --oplogSize 128
3. sudo mongod --port 27017 --dbpath /srv/mongodb/rs0-1 --replSet rs0 --smallfiles --oplogSize 128
4. sudo mongod --port 27017 --dbpath /srv/mongodb/rs0-2 --replSet rs0 --smallfiles --oplogSize 128
5. Same for the remaining nodes.
6. Finally, create the conf file and then run rs.initiate().
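A minimal sketch of such a configuration document and the initiation call, run from the mongo shell on one node; the hostnames below are placeholders, not our actual machines:

```javascript
// Hypothetical replica-set configuration; hosts are placeholders.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "node0.example.com:27017" },
    { _id: 1, host: "node1.example.com:27017" },
    { _id: 2, host: "node2.example.com:27017" },
    { _id: 3, host: "node3.example.com:27017" },
    { _id: 4, host: "node4.example.com:27017" }
  ]
})
```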
Why MongoDB Replica Set
1. High availability: since each node holds the data to be served to any client, data is highly available throughout the cluster.
2. Fault tolerance: MongoDB provides an automatic leader election algorithm, so if the primary dies, the replica set automatically elects another node as primary.
3. Automatic failover: a MongoDB replica set can redirect read requests to a secondary in case the primary is down.
4. Ease of operation and configuration: MongoDB is easy to install, operate, and configure; with a simple rs.add("host:port") you can add a new replica node to the cluster.
Python Client
To provide an interface to the user of the MOOC, a Python client is used. It gives the user the ability to request any of the functionalities described earlier.
There are two main aspects to writing the Python client:
1. Use of Protobuf for the exchange of messages between client and server
2. Following the "protocol" defined by the server for sending and receiving messages.
The diagram describes the server-side pipeline and the corresponding steps followed on the client side. Because the server and client agree on a defined protocol and a common data interchange format (Protobuf), client-server communication works even though the two are written in different programming languages.
Leader Election Strategy (within team cluster)
As described in the sections above, our team's design is a bidirectional mesh overlay network. Even though any client can connect to any node to get the required data about the MOOC, a leader is necessary whenever coordination is required. Since we do not want a fixed, centralized leader, but a more flexible and fault-tolerant strategy, we hold elections.
Leader Election Algorithm: Floodmax
Parameter used for differentiating nodes:
We use the ballot ID field in the Management Protobuf message to differentiate nodes during election.
Example: Node 1 has management port 5671. Thus the ballot ID is set by this node to (5671%100) = 71.
Criteria for being leader: The node with the highest ballot ID always becomes the leader after every
election round.
Diameter of Graph: In our case, we have a mesh, so the diameter of the graph is 1. Hence, the leader election completes in 1 round.
Why Floodmax + Mesh: Even though this design results in a lot of message passing during elections, it suits our case because we have a small number of nodes (5). This design not only simplifies leader elections, but also supports and simplifies other aspects of our MOOC implementation.
Steps:
1. When nodes have established heartbeats with each other, they send out ELECTION messages. (The following is from the perspective of the node with ballot ID 71.)
2. Since this is an asynchronous network, each node has to store the ballot IDs it sees. Once it
has received ballot IDs from all nodes in the network, it can compare them and decide on
the winner.
3. After this round of ELECTION messages, all nodes in the cluster know who the leader is. The eventual leader (in this case 75, since our criterion is the highest ballot ID) waits for nominations from the other nodes. The remaining nodes, having identified the leader, send it their nomination.
4. When the probable leader receives nominations from all active nodes in the cluster, it understands that it has been approved as leader. It then sends out a DECLARE WINNER message to the other nodes so that they can officially persist information about who the leader of the cluster is.
5. On failure of the leader (75), steps 1-4 are repeated and we have a new leader in the cluster (74, the next-highest ballot ID). Thus, the cluster always has a leader for coordination, and we avoid a single point of failure in the system.
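The ballot computation and the winner criterion above can be sketched as follows. The real implementation exchanges ballots via protobuf Management messages between Java nodes; this Python sketch uses the example management ports from the text (5671-5675).

```python
# Sketch of the floodmax decision: each node derives a ballot ID from its
# management port, floods it to all peers, and every node picks the maximum.

def ballot_id(management_port: int) -> int:
    # Example from the text: port 5671 -> ballot ID 5671 % 100 = 71.
    return management_port % 100

ports = [5671, 5672, 5673, 5674, 5675]
ballots = [ballot_id(p) for p in ports]  # [71, 72, 73, 74, 75]

# Once a node has seen ballots from every peer (diameter 1, so one round),
# the winner is simply the maximum ballot ID.
leader = max(ballots)
assert leader == 75

# If the leader (75) fails, a re-election among the survivors yields 74.
survivors = [b for b in ballots if b != leader]
assert max(survivors) == 74
```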
Inter MOOC Voting Strategy
For the inter-MOOC scenario, the standards body decided to vote on which cluster gets to host a competition proposed by a client. The standards body also decided on the following message structure:
Message Structure:
Client----->Server: request.body.job_op.data.name_space = “competition”
Server---->Server:
After client request, the communication takes place between Servers on the management port:
- Request from one cluster’s leader to another:
job_propose.name_space = “competition”
job_propose.owner_id = “1234”
job_propose.job_id = time in millis
job_propose.weight = 1
- Response from one cluster’s leader to the original proposer cluster’s leader:
job_bid.name_space = “competition”
job_bid.owner_id = copy from request
job_bid.job_id = copy from request
job_bid.bid = Random: 0 (No) or 1 (Yes)
Keeping Track of Other Cluster Leaders: A Leaders.txt file is proposed to keep track of leaders by their host and port. It contains this information in the following format:
<<cluster number>>:<<host>>,<<management port>>
The server ingests this file when a competition message is received and needs to be forwarded to other clusters. The file therefore has to be updated manually on all nodes whenever a cluster's leader changes. This functionality has not yet been implemented: our cluster cannot send out competition messages, but it can respond and send JOB BIDs back to other clusters on request.
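Ingesting this file can be sketched as follows; the sample entries below are invented for illustration, and the actual server would do this in Java.

```python
# Sketch of parsing Leaders.txt lines of the form
#   <<cluster number>>:<<host>>,<<management port>>

def parse_leaders(lines):
    """Map cluster number -> (host, management port)."""
    leaders = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        cluster, rest = line.split(":", 1)
        host, port = rest.split(",", 1)
        leaders[int(cluster)] = (host, int(port))
    return leaders

# Hypothetical sample contents of Leaders.txt.
sample = ["1:192.168.0.10,5675", "2:192.168.0.20,4568"]
leaders = parse_leaders(sample)
assert leaders[1] == ("192.168.0.10", 5675)
```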
Steps:
1. The user invokes the "Competition" functionality on the client. Accordingly, the client sends a "Competition" message to the node it is connected to on a cluster. That node sends out a JOB PROPOSAL to the management ports of the other leaders, using information from the Leaders.txt file.
2. On receiving such a JOB PROPOSAL, the clusters discuss internally. This is fully functional in our cluster.
i. In our team's case, the leader sends out an internal JOB PROPOSAL to the other nodes.
ii. The nodes respond with a JOB BID, with the bid set to 1 (Yes) or 0 (No) randomly.
iii. If a majority (>N/2) of nodes respond with a Yes, the leader sends a Yes JOB BID back to the original proposer. Otherwise, the leader sends back a No JOB BID.
3. On receiving the first Yes bid from one of the clusters, the original proposer sends that cluster's details back to the client, because that cluster has won the competition by responding first.
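The internal majority decision in step 2 can be sketched as follows (a Python sketch of logic the server implements in Java; the random vote matches the randomized bids described above):

```python
import random

# Sketch of step 2: the leader collects random Yes(1)/No(0) bids from the
# nodes and answers Yes only if a strict majority (> N/2) voted Yes.

def cluster_bid(num_nodes: int, rng: random.Random) -> int:
    votes = [rng.choice([0, 1]) for _ in range(num_nodes)]
    yes = sum(votes)
    return 1 if yes > num_nodes / 2 else 0

rng = random.Random(0)  # seeded only to make the sketch reproducible
bid = cluster_bid(5, rng)
assert bid in (0, 1)  # the outgoing JOB BID is always Yes or No
```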
Conclusion
For our team, the following are the major learnings from this project:
1. Thinking Distributed
2. Leader Elections in Practice
3. Voting Strategies in Practice
4. Distributed Databases
5. Netty, Network Programming
6. Design of Web Servers and the patterns used
7. Collaboration in Small and Large Software Development Teams
We hope to explore this project further and extend it. Some exciting possibilities include implementing a "Stack Overflow"-style message board, where clients can upvote and downvote answers in a real-time voting round.
Appendix
Our understanding of the Server's Public Interface: