
2012 IEEE 13th International Conference on High Performance Switching and Routing (HPSR), Belgrade, Serbia, 2012.06.24-2012.06.27


RSTP-SP: Shortest Path Extensions to RSTP

Eduard Bonada, Dolors Sala Department of Information and Communication Technologies

Universitat Pompeu Fabra {eduard.bonada, dolors.sala}@upf.edu

Abstract—The spanning tree protocol is the component of the Ethernet architecture that establishes the network connectivity. Its plug-and-play property and ease of configuration have been some of the pillars of Ethernet’s success. However, the new provider applications require improving the protocol capabilities such as response time, path optimality and path control. Optimal paths can be achieved if we deploy one tree rooted at each node. Nevertheless, this introduces the challenge of maintaining the path symmetry requirement of Ethernet networks. In this paper we propose RSTP-SP as an extension to RSTP that meets the performance objectives and keeps the bridging requirements. We evaluate RSTP-SP by means of a simulation analysis and we compare it to Shortest Path Bridging (SPB). Simulation results show that RSTP-SP outperforms SPB in terms of recovery time and outage experienced. In contrast, the message overhead introduced by RSTP-SP is higher than in the SPB case.

Ethernet bridging; spanning tree; RSTP; shortest path bridging

I. INTRODUCTION

Ethernet's low cost, high data rates, low complexity and simple maintenance offer network providers a good opportunity to use Ethernet data networks at very large scale, replacing the existing ATM/SONET or IP networking [1]. Ethernet was originally designed for LANs without very strict requirements. Using it as a carrier-grade technology represents a new application that leads to new requirements: quick recovery, good resource utilization, use of optimal paths and control over path selection.

In Ethernet technology the spanning tree protocol is responsible for establishing the connectivity of the different Ethernet segments in a single interconnected network [2]. This connectivity service is driven by the bridging principles, which are based on deploying a plug-and-play architecture. A bridge starts with no configuration and sends every received data frame to all ports except the incoming one. From this reception the bridge learns the port that leads to the source address. Subsequent frames sent to this address are directed to this port. All frames with an unknown destination continue to be sent to all ports. This broadcast could result in continuous flooding if the network has loops. Two important aspects of this operation need to be highlighted. First, the broadcast bridging operation only works in networks without loops, hence the spanning tree protocol is used to build a logical tree topology (with no loops and with only one path between any two nodes) over any physical topology. Second, the learning operation learns from the incoming frame, assuming that the path coming from a node is the same as the path reaching that node; that is, the path has to be symmetric. While this property comes naturally in the original spanning tree protocols (STP [3] and RSTP [4]), it is more difficult to obtain in the shortest-path extensions, as seen in this paper. However, maintaining this property is a must for backwards compatibility with existing equipment.
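The learning and flooding behaviour described above can be illustrated with a small sketch. This is a toy model, not code from the paper: the class name `LearningBridge`, the port numbers and the addresses "A" and "B" are assumptions made for illustration only.

```python
from typing import Dict, List


class LearningBridge:
    """Toy transparent bridge: learns source addresses, floods unknown destinations."""

    def __init__(self, ports: List[int]) -> None:
        self.ports = ports
        self.table: Dict[str, int] = {}  # address -> port it was last seen on

    def receive(self, src: str, dst: str, in_port: int) -> List[int]:
        """Return the list of ports the frame is forwarded to."""
        # Learning: the source address is reachable through the incoming port.
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        # Known destination: forward to the learned port only
        # (and never back out of the port the frame arrived on).
        return [] if out_port == in_port else [out_port]


bridge = LearningBridge(ports=[1, 2, 3])
first = bridge.receive(src="A", dst="B", in_port=1)  # B is unknown, so the frame is flooded
reply = bridge.receive(src="B", dst="A", in_port=3)  # A was learned on port 1
```

The reply is only delivered correctly if the frame from A really arrived over the path that leads back to A, which is the symmetry assumption highlighted in the text.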

There are some implications of pruning the physical topology into an active tree. First, selecting a single path to connect two peers eliminates all potential redundancy that might be available through extra links. Second, the shortest-path branches only provide optimal paths between peers located in the same branch. In fact, only the Root observes optimal communication with all nodes. In addition, RSTP experiences recovery times of tens of seconds when the Root fails [5], and it only drives path selection based on link costs.

In this paper we propose RSTP-SP: the necessary extensions to RSTP in order to construct multiple trees that operate with shortest-path communication. The idea is to extend the active topology into one tree rooted at each node of the network. If each node uses its own tree to introduce its own data traffic, shortest-path communication is achieved between each Root and the rest of the nodes. In addition, it increases the utilization of network resources because chances are that a link is active in at least one of the trees. RSTP-SP is based on executing the single tree protocol as many times as there are trees to construct. That is, in a network with N nodes, N trees are configured by N instances of the protocol.

Similar approaches based on multiple trees have been proposed earlier. Most of them are based on extending MSTP [6] but focus on increasing resource utilization or QoS instead of deploying optimal paths [7][8]. AMSTP [9] uses MSTP and provides shortest path bridging, also creating one tree per node. However, unidirectional VLANs are configured and these might lead to asymmetrical paths. A different approach is proposed in [10], where ARP messages are used to construct unidirectional paths that substitute the bridge learning functionality. This removes the symmetry requirement but results in non-deterministic routes, as the selection depends on the path that the first ARP follows.

The fact that each bridge uses its own tree to transmit data results in a unidirectional structure of paths because a different tree is used in each direction. This introduces an important challenge because the protocol needs a careful selection of trees due to the symmetry requirement. The 802.1aq Shortest Path Bridging (SPB) task group is also currently defining an evolution of RSTP that operates with shortest paths [11]. SPB solves the symmetry problem by implementing a link-state protocol, moving away from the distance-vector approach of the spanning trees. This paper, instead, presents a solution based on the spanning tree protocols that supports shortest path bridging.

2012 IEEE 13th International Conference on High Performance Switching and Routing

978-1-61284-0833-6/12/$26.00 ©2012 IEEE 223


The rest of the paper is organized as follows. Section II describes the RSTP in a simplified form to understand its basic operation. The causes and consequences of the symmetry challenge as well as the adopted SPB solution are discussed in section III. Section IV provides the description of the protocol extensions that compose RSTP-SP. A simulation evaluation comparing the protocol performance in different scenarios is included in section V. Finally, section VI concludes the paper.

II. RSTP

This section describes the operation of RSTP in a simplified form to understand its basic operation. RSTP is a distributed distance-vector protocol whose principles come from the algorithm defined by Perlman [12]. It builds a shortest-path tree rooted at one of the nodes: the Root.

Every node is configured with a unique identifier, the BridgeID. The protocol eventually elects as Root the node with the lowest BridgeID (in the example of figure 1(a), the node B0 is elected as the Root). Nodes construct the tree by determining which ports point upward to the Root and which point downward to the leaves. The protocol manages this property by configuring a port variable called role. There are three types of port roles: the root role is assigned to the port that leads to the Root node with the smallest cost; the alternate role is assigned to the ports that also lead to the Root but at a higher cost; and the designated role is assigned to the ports that lead to the leaves of the tree. The immediate node in the path to the Root is referred to as the parent, and the immediate neighbor in the opposite direction is the child (B2 is the parent of B6, and B6 is a child of B2). Depending on the role selected, a port either belongs to the active topology or not: root and designated ports are in forwarding state (they forward data frames) while alternate ports stay in discarding state (they block the data traffic).
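The mapping from port roles to port states can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `assign_roles`, the input format (cost to the Root through each port, or `None` for a port that leads away from the Root) and the sample costs are hypothetical, and breaking equal-cost ties by port number is an assumption of this sketch rather than something stated in the paragraph above.

```python
from typing import Dict, Optional

FORWARDING_ROLES = {"root", "designated"}


def assign_roles(cost_via_port: Dict[int, Optional[int]]) -> Dict[int, str]:
    """Map each port to a role.

    cost_via_port maps a port number to the cost of reaching the Root through
    that port, or None when the port leads toward the leaves.
    """
    toward_root = {p: c for p, c in cost_via_port.items() if c is not None}
    root_port = None
    if toward_root:
        # Cheapest port toward the Root; the port number breaks ties (assumption).
        root_port = min(toward_root, key=lambda p: (toward_root[p], p))
    roles = {}
    for port, cost in cost_via_port.items():
        if cost is None:
            roles[port] = "designated"
        elif port == root_port:
            roles[port] = "root"
        else:
            roles[port] = "alternate"
    return roles


roles = assign_roles({1: 20, 2: 30, 3: None})
states = {
    port: "forwarding" if role in FORWARDING_ROLES else "discarding"
    for port, role in roles.items()
}
```

With these sample costs, port 1 becomes the root port, port 2 an alternate port in discarding state, and port 3 a designated port. A bridge with no port toward the Root (the Root itself) ends up with all ports designated.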

The protocol constructs the tree by the exchange and update of node state. The state information is stored at port level (defining the port state) and at bridge level (defining the bridge state). The port state describes the distance to reach the Root from the corresponding port, and the bridge state describes the distance to reach the Root from the bridge through the port with the shortest path. These states are stored in the form of Priority Vectors that include: the ID of the Root (Root), the cost to it (Cost), the ID of the bridge that owns the vector (Bridge), and the ID of the port that owns the vector (Port).

Nodes exchange their vectors by transmitting messages encoded as Bridge Protocol Data Units (BPDUs). The exchange of messages allows the states to evolve until all nodes agree on the Root and the tree configuration. Each received vector is first compared to the one stored locally. A vector is considered better if it conveys a lower Root; or the same Root and a lower Cost; or the same Root and Cost and a lower Bridge; or the same Root, Cost and Bridge and a lower Port. If the incoming vector is better than the locally stored one, the tree is reconfigured. Reconfiguring the tree means selecting the port roles and port states based on the information in the port vectors. After the recalculation, the node disseminates the new vectors to its neighbors.
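The comparison rule above is a lexicographic ordering over (Root, Cost, Bridge, Port). A minimal Python sketch, assuming every field is encoded as an integer (the class name `PriorityVector`, the helper `is_better` and the sample values are illustrative, not taken from the standard):

```python
from typing import NamedTuple


class PriorityVector(NamedTuple):
    root: int    # ID of the Root
    cost: int    # cost to the Root
    bridge: int  # ID of the bridge that owns the vector
    port: int    # ID of the port that owns the vector


def is_better(received: PriorityVector, stored: PriorityVector) -> bool:
    """Tuples compare field by field, which matches the rule in the text:
    lower Root first, then lower Cost, then lower Bridge, then lower Port."""
    return received < stored


stored = PriorityVector(root=3, cost=10, bridge=7, port=1)
lower_root = PriorityVector(root=0, cost=40, bridge=9, port=2)
same_root_lower_cost = PriorityVector(root=3, cost=5, bridge=9, port=2)
same_root_higher_cost = PriorityVector(root=3, cost=20, bridge=1, port=1)
```

Here `lower_root` wins despite its higher cost because the Root field is compared first, while `same_root_higher_cost` loses even though its Bridge and Port are lower.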

The processing of received messages and the corresponding dissemination result in a continuous propagation of the best vectors along the network. The BPDUs originally started at the Root B0 are those that always update because they carry a lower Root. Intuitively, this effect is like having wave-fronts of information that start propagating at each node [13]. When two wave-fronts meet, the one with the lower Root wins and removes the other. There is only one wave-front, the one starting at B0 (or at the lowest BridgeID alive), that always wins and hence spans the entire network. At this point all nodes agree on B0 as Root and the tree is configured.

Figure 1. Two asymmetrical trees (a,b) become symmetrical after using the path-array in the tie-breaking rule (c,d).

Nodes keep sending periodic BPDUs every 2 seconds on all designated ports in order to refresh the vector information. A failure is detected when these messages stop arriving. In this case, the information in the port is no longer considered valid and the vector is cleared. This leads to the reconfiguration of the node and hence to the recalculation of the active tree. Failure detection can be improved if bridge ports directly detect a failed link at the physical level, which results in an earlier reconfiguration, before the timer expires.

III. SYMMETRY CHALLENGE

In the example of figure 1(a), the traffic generated at B0 is forwarded to B1 using the tree rooted at B0 (T0). Similarly in 1(b), traffic from B1 to B0 uses the tree T1. Since the symmetry requirements need to be met, the branch in T0 from B0 to B1 must be the same as the branch in T1 from B1 to B0. In this example the trees T0 and T1 are not symmetrical (see grey arrows) and the bridge learning and forwarding functions would not work properly.

The problem arises because there are multiple shortest paths between the pair of nodes (0-3-5-1, 0-2-6-1 and 0-4-5-1) and the algorithm to construct the trees decides for the first path in T0 and for the second in T1. A common implementation of the algorithm might lead to non-symmetrical trees because the shortest path selected is the one whose immediate next hop has the lowest identifier. In the example of figure 1(a), B1 selects B5 instead of B6 because 5<6. Similarly in 1(b), B0 selects B2 because 2<3<4. The path between B0 and B1 is not symmetrical because the path selection depends on this local information (the immediate next hop), which results in different decisions at different points of the network.

SPB solves this by extending the decision elements and using the complete path from the Root: the path-array. Instead of looking at the parent identifier to select among multiple routes, this approach compares the entire array of each eligible shortest path. The path-array of each eligible shortest path is first sorted from lowest to highest, and then the elements are compared one by one. If the element is the same in both path-arrays, the next element is compared. Otherwise, the path-array with the lowest element is considered better and hence selected. In the example of figure 1(c), B1 decides between the path-array in p2, 0-3-5-1, and the path-array in p1, 0-2-6-1. Once sorted, the arrays become 0-1-3-5 and 0-1-2-6, respectively. The latter is considered better because it has 2<3 at the third position. Similarly in figure 1(d), B0 decides between 1-5-4-0 in p1, 1-5-3-0 in p2 and 1-6-2-0 in p3. Once sorted, the arrays are 0-1-4-5, 0-1-3-5 and 0-1-2-6; B0 then selects p3 because 2<3<4. Since both selected paths are the same but in opposite directions, the path between B0 and B1 is symmetric.
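The sorted path-array comparison can be reproduced with the numbers of the example above. The following is a minimal sketch, assuming BridgeIDs are plain integers; the function name `select_path` is hypothetical.

```python
from typing import List, Sequence


def select_path(candidates: Sequence[List[int]]) -> List[int]:
    """Pick the equal-cost path whose sorted path-array is lowest.

    Sorted lists compare element by element, so the first position that
    differs decides, exactly as in the tie-breaking rule described above.
    """
    return min(candidates, key=sorted)


# Decision taken at B1 in the tree rooted at B0 (figure 1(c)).
chosen_at_b1 = select_path([[0, 3, 5, 1], [0, 2, 6, 1]])

# Decision taken at B0 in the tree rooted at B1 (figure 1(d)).
chosen_at_b0 = select_path([[1, 5, 4, 0], [1, 5, 3, 0], [1, 6, 2, 0]])
```

Because sorting removes the direction of the path, both ends rank the same set of candidate paths identically, so `chosen_at_b1` is the reverse of `chosen_at_b0`.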

The calculation of the symmetrical trees using the path-array is straightforward in a framework where a centralized algorithm computes the paths. This is the case of the link-state protocol in SPB because the paths are computed locally by each node, and hence the path-array can be easily obtained. A distributed solution to construct the path-array for RSTP-SP is provided in the protocol description of the following section.

IV. RSTP-SP PROTOCOL DESCRIPTION

RSTP-SP runs a different instance of the single tree protocol for each node. The instances are completely independent and run at different levels. The use of parallel tree instances requires each tree to manage its own messages and, consequently, each node needs to store separate information per tree (a bridge stores one bridge vector per tree and one port vector per tree and port). Each event that triggers a protocol operation applies to one of the tree instances, and the operation executed uses the variables belonging to that particular tree. First, the initialization of a bridge results in the local bridge becoming Root only in its own tree (event InitializeBridge in block A of the pseudo-code in figure 2). Second, the instructions in BPDUReceived apply to the tree instance of the received BPDU (hence the vectors used belong to that tree). Third, the dissemination of periodic messages in PeriodicalBPDU is run for all active trees (all designated ports of all trees send periodic messages). Fourth, FailureDetection is executed for each tree (a port failure detection results in as many tree reconfigurations as there are trees). The auxiliary sub-routines in blocks E and F include the main changes that RSTP-SP introduces, and the following subsections provide a detailed description.

TABLE I. FIELDS OF THE RSTP-SP PRIORITY VECTOR

RSTP-SP Priority Vector (R:C[PA]:P)
  Root (R)         BridgeID of the Root
  Cost (C)         Cost to the Root
  PathArray (PA)   Array of BridgeIDs from the Root to the owner of the vector
  Port (P)         PortID of the port that owns the vector

Figure 2. Pseudo-code of the RSTP-SP operation (updates are underlined)

PROCESSING OF EVENTS

A) InitializeBridge b
     Become Root of tree b (all ports are designated)
     Send BPDUs to all designated ports

B) BPDUReceived in port p (for tree t)
     Call CompareVectors(received vector, vector in port)
     if received is BETTER OR transmitter is parent
       Store received vector in port p
       Call ConfigureTree
     else if received is WORSE
       Send BPDU to port p
     else if received is EQUAL
       Do nothing

C) PeriodicalBPDU (all trees)
     Send BPDUs to designated ports

D) FailureDetection in port p (all trees)
     Copy bridge vector to p's port vector
     Call ConfigureTree

SUB-ROUTINES

E) CompareVectors(A, B)
     if ( (A.Cost is lower than B.Cost)
          OR (same Cost AND A.PathArray is lower than B.PathArray)
          OR (same Cost and PathArray AND A.Port is lower than B.Port) )
       return BETTER
     else if ( all fields are equal )
       return EQUAL
     else
       return WORSE

F) ConfigureTree (for tree t)
     If lowest B in port vectors is equal to own BridgeId
       Deactivate tree t
       Set cost to infinity in all vectors
     Select port roles if node is not Root
     Update port vectors
     Transition port states
     Send BPDUs to all designated ports

A. Selection of Symmetrical Trees

One of the challenges for RSTP-SP is to derive the path-array in a distributed environment and find a way to apply the tie-breaking rule described in section III. In distance-vector protocols, the topology distribution and the path calculations are merged together in the same operation (i.e. the paths are selected step by step as the information is distributed). Therefore, the distribution of BPDUs must be used to obtain the path-array. The solution is straightforward because the BPDUs flow from the Root to the leaves following the tree branches. Hence the BPDUs can keep track of the visited nodes and construct the path-array step by step as they are propagated. By encoding the path-array in the BPDUs, each node is aware of the entire path to reach the Root and hence can make the right selection by sorting the arrays and comparing them item by item. Since the path-array is used to decide among paths, and hence to elect the port roles, it is stored in the priority vectors held by the nodes (as shown in the Priority Vector description in table I). The path-array is actually the array of BridgeIDs from the Root to the local bridge.
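The step-by-step construction of the path-array can be sketched with a small simulation in which every relayed message carries the sender's path-array extended with the receiver's BridgeID. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, all links have unit cost, messages are processed from a single queue instead of in parallel, and the topology contains only the links that appear in the three paths quoted in section III (the real figure may contain more).

```python
from typing import Dict, List, Tuple


def vector_key(path: List[int]) -> Tuple[int, List[int]]:
    """Comparison key: cost first (unit link costs), then the sorted path-array."""
    return (len(path) - 1, sorted(path))


def build_tree(root: int, links: List[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Return, for every node, the path-array it stores for the tree rooted at `root`."""
    neighbours: Dict[int, List[int]] = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    best: Dict[int, List[int]] = {root: [root]}
    pending = [root]  # nodes that still have to send their vector
    while pending:
        sender = pending.pop(0)
        for node in neighbours[sender]:
            if node in best[sender]:
                continue  # the path already visited this node
            # The message carries the sender's path-array plus the receiver.
            offered = best[sender] + [node]
            current = best.get(node)
            if current is None or vector_key(offered) < vector_key(current):
                best[node] = offered  # better vector: store it and propagate
                pending.append(node)
    return best


LINKS = [(0, 2), (0, 3), (0, 4), (2, 6), (3, 5), (4, 5), (5, 1), (6, 1)]
t0 = build_tree(0, LINKS)  # tree rooted at B0
t1 = build_tree(1, LINKS)  # tree rooted at B1
```

In this toy run, B1 stores the path-array 0-2-6-1 in T0 and B0 stores 1-6-2-0 in T1, so the two trees use the same branch between B0 and B1 in opposite directions.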

With the extension of the priority vector, the tie-breaking mechanism described for RSTP needs to be extended as well. CompareVectors in the pseudo-code shows the updated rule. First, the Root field is no longer used as the first tie-breaking check. Instead, it represents the tree identification (i.e. the tree is identified by the BridgeID of its Root). Second, if the costs are equal the path-arrays are compared. The rules to decide if a path-array is better than another are the same as those described in section III (sort the IDs and select the array with the lower elements). The last step compares the port IDs.
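A possible rendering of the updated rule in Python is shown below. It is a sketch under assumptions: the class name `SpVector` and the string return values are illustrative, fields are integers, and two path-arrays are treated as equal when their sorted forms are equal, a case the pseudo-code does not spell out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpVector:
    root: int                    # identifies the tree, not used for tie-breaking
    cost: int                    # cost to the Root
    path_array: Tuple[int, ...]  # BridgeIDs from the Root to the owner of the vector
    port: int                    # PortID of the port that owns the vector

    def key(self) -> Tuple[int, List[int], int]:
        # Cost first, then the sorted path-array, then the port ID.
        return (self.cost, sorted(self.path_array), self.port)


def compare_vectors(a: SpVector, b: SpVector) -> str:
    """Return 'BETTER', 'EQUAL' or 'WORSE' for vector a relative to vector b."""
    if a.root != b.root:
        raise ValueError("vectors of different trees are never compared")
    if a.key() < b.key():
        return "BETTER"
    if a.key() == b.key():
        return "EQUAL"
    return "WORSE"


via_b6 = SpVector(root=0, cost=3, path_array=(0, 2, 6, 1), port=1)
via_b5 = SpVector(root=0, cost=3, path_array=(0, 3, 5, 1), port=2)
cheaper = SpVector(root=0, cost=2, path_array=(0, 9, 1), port=3)
```

With equal costs the sorted path-array decides (`via_b6` is better than `via_b5`), while a lower cost wins regardless of the path-array (`cheaper` is better than both).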

B. Failure Detection

Using one tree rooted at each node introduces another aspect to consider. In RSTP, the node with the lowest BridgeID is elected as the unique Root of the tree. If this node fails, the protocol recovers by choosing the node with the second lowest BridgeID as the new Root. However, in RSTP-SP each node is the Root of its own tree and no other node can arise as Root of another node's tree. Therefore, when the Root dies, its tree must become inactive. A node detecting this situation announces it by sending BPDUs with infinite cost (see the underlined instructions in ConfigureTree). These BPDUs are processed normally by the receiving nodes and are seen as common BPDUs with a very large cost. Note that issuing the messages with infinite (or very large) cost is an effective way to disseminate the no-connectivity situation without changing the BPDU processing, while keeping the possibility to automatically reactivate the tree when the Root comes back.

C. Node Failure Consequences

A recovery from a node failure in RSTP-SP does not differ from RSTP in terms of protocol operation and behavior. If a non-Root node fails, the recovery is as quick as in the single link failure case. On the other hand, if the Root of a tree fails, a count-to-infinity behavior is experienced within the tree instance of the failed Root [5]. Since every node in a shortest-path bridging framework is the Root of its own tree, any node failure is also a Root failure in one of the trees. This means that any node failure leads to count-to-infinity in one of the trees and a quick recovery in the rest.

Nevertheless, the particularity of shortest-path bridging is that after a Root failure, the failed node no longer injects traffic into its own tree. If there is no data communication, there is no urgency to reconfigure the tree because communications from this node cannot be established until it recovers. However, the network is still affected by this behavior in terms of message overhead. The count-to-infinity within the dead tree generates BPDUs that loop around, consuming the processing power of the nodes and the available capacity of the links.

To solve this behavior we introduce in RSTP-SP a confirmation mechanism [14] in order to completely avoid the count-to-infinity behavior. This is based on verifying the Root availability before sending potentially false information that triggers the count-to-infinity. The implementation of this mechanism certainly avoids the count-to-infinity but it delays the recovery in the single link failure scenarios. Hence, there is a tradeoff between (1) using the confirmation mechanism and delaying the recovery from a single link failure and (2) accepting the count-to-infinity effect within the dead tree, which forwards no data traffic. We refer to the version with confirmation as RSTP-SP-Conf.

V. EVALUATION

This section includes the simulation analysis of the protocols under different scenarios. We have implemented SPB, RSTP-SP and RSTP-SP-Conf in the ns-3 network simulator. The modeled failure detection mechanism is the immediate physical failure detection. Only BPDU messages are simulated and no user traffic is modeled unless otherwise stated. We take as a reference for the BPDU processing and transmitting delay the study in [15], which assumes a delay of 1.33 ms per message.

The performance evaluation focuses on the time to construct the trees (Convergence Time, CT) and the amount of information exchange required for such action (Message Overhead, MO). CT is defined as the time between the failure and the last node reconfiguring the tree. We also measure CT in hop delays as a normalized time unit. That is, a CT of 5 hops means that the protocol takes 5 times the hop delay to converge. MO refers to the amount of messages that the nodes need to exchange in order to recover the tree. The number of messages observed is used to evaluate the overhead in terms of (1) link capacity used and (2) node processing power required.
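The normalization of CT into hop delays is a simple division. The sketch below assumes that one hop delay equals the 1.33 ms per-message processing and transmitting delay quoted earlier; that equivalence, the constant name and the function names are assumptions made for illustration.

```python
HOP_DELAY_MS = 1.33  # assumed: one hop costs one BPDU processing + transmission delay


def ct_in_ms(ct_hops: float, hop_delay_ms: float = HOP_DELAY_MS) -> float:
    """Convert a convergence time expressed in hop delays into milliseconds."""
    return ct_hops * hop_delay_ms


def ct_in_hops(ct_ms: float, hop_delay_ms: float = HOP_DELAY_MS) -> float:
    """Normalize a convergence time in milliseconds into hop delays."""
    return ct_ms / hop_delay_ms


five_hops_ms = ct_in_ms(5)  # roughly 6.65 ms under the assumption above
```

Expressing CT in hops makes results comparable across simulations that use a different per-message delay.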

A. Convergence Time

In the first tests we use a two-dimensional grid of degree 4 and 64 nodes (referred to as grid4). We evaluate the performance in three scenarios: cold-start (network start-up), link failure in the center of the grid, and node failure also in the center. For each scenario we execute 100 simulations with different BridgeIDs. Figure 3 shows the average CT, with 95% confidence intervals, for each protocol and scenario. In the cold-start case, all protocols perform equally because they are all based on flooding information that starts at each node and spans the entire network. This results in a CT that depends on the topology diameter, as it is the longest path the flooding follows.

In the event of a central link failure, RSTP-SP and SPB also provide a similar performance because the messages that they issue are propagated from the failure location to the furthest node, and hence it depends on the diameter as well. The small difference is because SPB needs to propagate the entire path, while RSTP-SP only needs to reconfigure the branches that are affected by the failure. The larger CT observed in RSTP-SP-Conf is because of the delay introduced by the confirmation mechanism.

The last set of columns in figure 3 shows the observed CT after the failure of a central node. First note that the values for RSTP-SP do not consider the dead tree without traffic (actually experiencing count-to-infinity). In this case the CT of both RSTP-SP protocols is reduced because the largest distance between the failure and the furthest nodes has decreased (an entire node has failed).

Figure 3. Average CT (with 95% conf. inter.) in cold-start, central link failure, and central node failure (100 executions with random BridgeIDs).

Figure 4. Average CT (with 25%-75% percentiles) failing all possible links in different topologies.

Figure 5. Data traffic received during a link failure recovery.

Figure 6. Average MO (with 95% conf. inter.) in cold-start, central link failure, and central node failure (100 executions with random BridgeIDs).

Figure 7. Average MO (with 25%-75% percentiles) failing all possible links in different topologies.

Figure 8. Percentage of nodes affected by a link failure recovery in RSTP-SP

We have done an additional analysis to observe the behavior of the protocols in other link failures and in other topologies. We use the grid4 topology, a two-dimensional grid of degree 8 and 64 nodes (grid8), and a structured topology of 50 nodes with a meshed core with dual-homed edges (mesh) [16].

For each topology we fail one link at each execution, and we do as many executions as there are links in order to test all possible link locations. This results in 112, 210 and 110 runs for the grid4, grid8 and mesh topologies, respectively. Figure 4 shows the average CT with the 25%-75% percentiles (top and bottom part of the vertical lines in each bar). As previously stated, RSTP-SP and SPB perform similarly and RSTP-SP-Conf introduces the confirmation delay. The values for all protocols are proportional to the topology diameter. We have also done experiments with the previous topologies while varying their size. They confirm that the CT in all protocols and in all networks is related to the network diameter.
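The run counts for the two grids can be checked by counting links. The sketch below assumes that both grids are 8x8 without wrap-around and that grid8 adds the two diagonal neighbours to the horizontal and vertical ones; these layout details and the function name are assumptions, and the 110 links of the mesh topology cannot be derived this way.

```python
from typing import Set, Tuple

Node = Tuple[int, int]


def grid_links(side: int, diagonals: bool) -> Set[Tuple[Node, Node]]:
    """Enumerate the links of a side x side grid, counting each link once."""
    # Only "forward" steps are used so that no link is generated twice.
    steps = [(0, 1), (1, 0)]
    if diagonals:
        steps += [(1, 1), (1, -1)]
    links = set()
    for row in range(side):
        for col in range(side):
            for d_row, d_col in steps:
                n_row, n_col = row + d_row, col + d_col
                if 0 <= n_row < side and 0 <= n_col < side:
                    links.add(((row, col), (n_row, n_col)))
    return links


grid4_runs = len(grid_links(8, diagonals=False))
grid8_runs = len(grid_links(8, diagonals=True))
```

Under these assumptions the counts come out as 112 and 210, matching the number of runs reported for grid4 and grid8.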

The convergence time can also be analyzed from the data traffic perspective by measuring the outage of received data packets during a failure recovery. In this test each node sends broadcast data packets, hence the percentage of packets received at each instant represents the level of instantaneous connectivity. The plot in Figure 5 shows the timeline of total received packets during a central link failure in the grid4 topology of 64 nodes. The horizontal axis is measured in hops and the vertical axis in percentage of total received packets. First note that the RSTP-SP protocols are practically unaffected by the failure, because the paths that remain the same in the new topology continue to provide connectivity. RSTP-SP-Conf, in particular, takes more time to restore full connectivity because of the confirmation delay. In contrast, SPB suffers a larger outage because the new information issued during the reconfiguration creates discrepancies between the topology databases of different nodes. Communication between nodes with different databases is temporarily stopped in order to avoid potential forwarding loops.
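As an illustration of this metric, the following sketch computes the connectivity level per time slot from counts of received and expected packets. It is not the simulator used in the paper: the function names and the sample counts are hypothetical, and the sketch assumes that every broadcast packet should reach the 63 other nodes, so 64 x 63 = 4032 receptions are expected per slot.

```python
def connectivity_timeline(received, expected):
    """Percentage of expected packets actually received in each time slot."""
    if len(received) != len(expected):
        raise ValueError("received and expected must have the same length")
    return [100.0 * r / e if e else 0.0 for r, e in zip(received, expected)]


def outage(timeline, full=100.0):
    """Connectivity lost over the whole timeline, in percent-slots."""
    return sum(full - level for level in timeline)


# Hypothetical counts for four consecutive slots around a failure.
expected = [64 * 63] * 4
received = [4032, 3024, 3528, 4032]
timeline = connectivity_timeline(received, expected)
print(timeline)          # [100.0, 75.0, 87.5, 100.0]
print(outage(timeline))  # 37.5
```

Summing the lost percentage over the slots gives a single outage figure that can be compared across protocols.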

B. Message Overhead

Figure 6 shows the average MO per node measured in the 64-node grid4 for the different scenarios. Observe the logarithmic scale on the vertical axis. MO in a cold-start is similar in all protocols because all of them are based on message flooding. The differences appear when we compare the protocols in the event of failures: SPB clearly outperforms the RSTP-SP protocols. The reason is that, for example, in the link failure scenario SPB only floods the link-state updates of the two nodes detecting the failure. In contrast, RSTP-SP reconfigures many trees, each of which issues BPDUs to update the paths. Also note that in the link failure scenarios the confirmation mechanism in RSTP-SP does not make a big difference because it only introduces delay (an effect already seen when evaluating the CT).

A similar behavior is observed in the event of a node failure. In this case the reason for the high MO in RSTP-SP is the count-to-infinity occurring in the tree of the failed Root. In addition, even though RSTP-SP-Conf removes this effect, the number of transmitted messages is not reduced to the level of SPB, because most of them are due to the reconfigurations in the trees where a non-Root failure occurs (same order as in the link failure).

We have also measured MO for different topologies and sizes. The results for different grid sizes of degree 4 show that the number of messages per node becomes almost constant for large networks. The reason is that the MO is proportional to the connectivity level (that is, the average node degree). To support this, Figure 7 shows the average MO over all possible link failures for topologies with different average node degree (3.5 for grid4, 6.2 for grid8 and 4.2 for mesh). First, these tests confirm that SPB needs fewer messages to recover. Second, the MO of all protocols is proportional to the average node degree. The difference between RSTP-SP and RSTP-SP-Conf arises because the confirmation mechanism is based on flooding and hence results in more messages in more connected topologies.

A different comparison is to evaluate the protocols in the steady state, once the cold-start has finished and no failures have occurred. In this situation, all protocols send refresh messages to keep the topology alive: SPB nodes apply a global refresh and flood all their link-states; RSTP-SP nodes do it more locally and just send BPDUs from parent to child on all ports of all trees. As a result, SPB transmits roughly twice as many messages as RSTP-SP, with or without confirmation (216 versus 112, respectively).

A particularity of link-state protocols is that any failure always leads to the flooding of a new physical topology, which results in a recalculation of all trees in all nodes. A distance-vector protocol, on the other hand, only reconfigures the affected nodes and trees. This can be observed in Figure 8, which shows the percentage of nodes that reconfigure a certain share of the trees after a central link failure. For example, black boxes indicate the percentage of nodes that are affected in more than 75% of the trees (e.g. in a network of 4 nodes, half of them are affected in more than 3 trees). Note that as the network size grows, the percentage of affected trees decreases and the majority of the nodes reconfigure fewer than 25% of the trees. Note that SPB in this plot would always show 100% of the trees affected in 100% of the nodes.
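The classification behind this kind of plot can be sketched as follows. This is a hedged illustration, not the authors' post-processing: the bin labels, the handling of the bin edges, the inclusion of nodes with no affected tree, and the sample data are all assumptions; only the "more than 75%" bin and the 4-node example come from the text.

```python
from collections import Counter

# Assumed bins; the text only names the ">75%" and the lowest category.
BINS = ("<=25%", "25-50%", "50-75%", ">75%")


def affected_bin(affected_trees, total_trees):
    """Bin a node by the fraction of trees it had to reconfigure."""
    fraction = affected_trees / total_trees
    if fraction > 0.75:
        return ">75%"
    if fraction > 0.50:
        return "50-75%"
    if fraction > 0.25:
        return "25-50%"
    return "<=25%"


def node_percentages(affected_per_node, total_trees):
    """Percentage of nodes falling in each bin."""
    counts = Counter(affected_bin(a, total_trees) for a in affected_per_node)
    nodes = len(affected_per_node)
    return {label: 100.0 * counts[label] / nodes for label in BINS}


# Hypothetical 4-node network with one tree per node: two nodes are
# affected in all 4 trees (more than 3), the other two in at most one.
print(node_percentages([4, 4, 1, 0], total_trees=4))
# {'<=25%': 50.0, '25-50%': 0.0, '50-75%': 0.0, '>75%': 50.0}
```

With these sample values half of the nodes land in the ">75%" bin, matching the 4-node example given above.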

VI. CONCLUSION

In this paper we have presented a protocol that extends RSTP in order to operate with optimal paths. The latest evolution of the standard, SPB, provides the same property but its protocol operation is completely different. We have compared them by means of simulation.

Results show that RSTP-SP and SPB perform similarly in terms of convergence time, but SPB experiences a much larger outage during recoveries. In contrast, the RSTP extension results in higher message overhead than SPB. This is because link-state protocols are based on computation, while distance-vector protocols rely on message dissemination.

However, the RSTP-SP operation can be optimized in several aspects in order to improve the MO performance. The idea is to share information between different tree instances and reuse the fields in the BPDUs, especially the path-array. Also note that the use of the path-array opens the door to designing more detailed path control in order to provide advanced path selection driven by different metrics.

An additional way to improve the protocol comparison is to use the MO measurements to compare the node processing power requirements. The number of messages that trigger a tree computation multiplied by the processing required per trigger provides an indication of the protocol complexity. One trigger in RSTP-SP results in a few vector comparisons, while SPB runs Dijkstra once per tree. Consequently, while SPB outperforms RSTP-SP in number of messages, the comparison could turn around if we consider the overall processing requirements. A direct outcome of this result could be to evaluate the feasibility of a hardware implementation of the protocol: the simpler a single execution is, the easier the implementation.
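The complexity indicator described above (triggers multiplied by processing per trigger) can be sketched as follows. Everything numeric here is an assumption made for illustration: the per-run operation count E + N log2 N is a textbook-style estimate for Dijkstra with a priority queue, the default of three comparisons per RSTP-SP trigger is a placeholder for "a few", and the function names are hypothetical. No measured trigger counts from the paper are used.

```python
import math


def processing_estimate(triggers, cost_per_trigger):
    """Complexity indicator: number of triggers times cost per trigger."""
    return triggers * cost_per_trigger


def spb_cost_per_trigger(nodes, links):
    """Assumed cost of one SPB trigger: one Dijkstra run per tree.

    One tree is rooted at each node; E + N*log2(N) is used as a rough
    operation count per run (an assumption, not a measured value).
    """
    per_run = links + nodes * math.log2(nodes)
    return nodes * per_run


def rstp_sp_cost_per_trigger(comparisons=3):
    """Assumed cost of one RSTP-SP trigger: a few vector comparisons."""
    return comparisons


# 64-node grid4 with 112 links; the trigger counts would come from the
# measured MO and are left as inputs here.
print(spb_cost_per_trigger(64, 112))  # 31744.0
print(rstp_sp_cost_per_trigger())     # 3
```

Feeding the measured number of triggering messages of each protocol into `processing_estimate` would give the two figures to compare.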

REFERENCES

[1] G. Chiruvolu et al. “Issues and approaches on extending Ethernet beyond LANs”. IEEE Communications Magazine, 42(3):80–86, 2004.

[2] R. Perlman. “Interconnections: Bridges, Routers, Switches, and Internetworking Protocols”. Addison-Wesley, 2000.

[3] IEEE standard for Information Technology - MAC bridges. ANSI/IEEE Std 802.1D, 1998 Edition, pages i–355, 1998.

[4] IEEE standard for Local and Metropolitan Area Networks Media Access Control MAC bridges. IEEE Std 802.1D-2004, pages i–269, 2004.

[5] E. Bonada and D. Sala. “Characterizing the convergence time of RSTP”. Proceedings of MIC-CSP 2012, Barcelona, April 2012.

[6] IEEE standard for local and metropolitan area networks virtual bridged local area networks. IEEE Std 802.1Q-2005, pages 1–285, 2006.

[7] M. Huynh and P. Mohapatra. “A scalable hybrid approach to switching in metro Ethernet networks”. Proceedings of LCN 2007, 2007.

[8] S. Sharma, K. Gopalan, S. Nanda and T. Chiueh. “Viking: a multi-spanning-tree Ethernet architecture for metropolitan area and cluster networks”. Proceedings of INFOCOM 2004.

[9] G. Ibañez, A. García and A. Azcorra. “Alternative Multiple Spanning Tree Protocol for optical Ethernet backbones”. Proceedings of LCN 2004.

[10] G. Ibañez et al. “Implementation of ARP-path low latency bridges in Linux and OpenFlow/NetFPGA”. Proceedings of HPSR 2012.

[11] IEEE. Shortest Path Bridging - draft 4.5. IEEE 802.1 documents, February 2012.

[12] R. Perlman. “An algorithm for distributed computation of a spanning tree in an extended LAN”. ACM SIGCOMM CCR, pages 44–53, 1985.

[13] E. Bonada and D. Sala. “On the theoretical bounds of the Spanning Tree Algorithm”. Proceedings of Jornadas Telecom I+D, Bilbao, 2008.

[14] E. Bonada and D. Sala. “RSTP-Conf: efficiently avoiding count-to-infinity in RSTP”. In preparation, April 2012.

[15] R. Pallos et al. “Performance of rapid spanning tree protocol in access and metro networks”. Proceedings of AccessNets 2007.

[16] J. Chiabaut. “All Pairs Shortest Paths Performance Measurements”. IEEE 802.1 documents (/docs2008/aq-chiabaut-all-pairs-shortest-path-0308-v01.pdf), March 2008.
