
Converged networks with Fibre Channel over Ethernet and Data Center Bridging

    Technology brief, 2nd edition

Introduction
Traditional data center topology
Early attempts at converged networks
Network convergence with FCoE
  10 Gigabit Ethernet
  HP Virtual Connect Flex-10
  HP Virtual Connect FlexFabric with FCoE
Emerging standards for network convergence
  FCoE standard
    FCoE protocol encapsulation
    Fibre Channel Forwarder
    ENode
  FCoE and Ethernet
  DCB standards
    Priority-based Flow Control
    Enhanced Transmission Selection
    Quantized Congestion Notification
    Data Center Bridging Exchange
Migrating to converged fabrics
HP strategy
For more information
Call to action


    Introduction

Using application-specific networks for data, management, and storage is complex and costly. Network convergence is a more economical solution that simplifies your data center management by partially or completely consolidating all block-based storage and Ethernet-based data communications networks onto a single fabric. Any network topology constructed with one or more switched network nodes is a fabric. Converged networks consolidate two or more network types onto a single fabric.

The promise of network convergence is that it will reduce the cost of qualifying, buying, powering, cooling, provisioning, maintaining, and managing network-related equipment. The challenge is determining the best adoption strategy for your business.

    This technology brief does the following for you:

• Defines converged networks
• Summarizes previous attempts to create them
• Explains Fibre Channel over Ethernet (FCoE) technology
• Describes how converged network topologies and converged network adapters (CNAs) work together to tie multiple networks into a single, converged infrastructure
• Introduces the networking standards required to support this new breed of converged networks
• Explains how the new standards will affect how you design and deploy your converged network infrastructure over the next several years

    Traditional data center topology

Traditional data centers typically have underused capacity, inflexible single-purpose resources, and high management costs. Typical designs of data center infrastructure include separate, heterogeneous network devices for different types of data. Each device adds to the complexity, cost, and management overhead. Many data centers support three or more types of networks that serve these purposes:

• Block storage data management
• Remote management
• Business-centric data communications

Multiple types of networks require unique switches, network adapters, and network management systems, as well as technology to unify these networks.

    Early attempts at converged networks

There have been many attempts to create converged networks over the past decade. Fibre Channel Protocol (FCP) is a lightweight mapping of SCSI to the Fibre Channel (FC) layers 1 and 2 transport protocol (Figure 1, left). FC carries not only FCP traffic, but also IP traffic, to create a converged network. The cost of FC and the acceptance of Ethernet as the de facto standard for LAN communications prevented widespread FC use except for data center SANs for enterprise businesses.

InfiniBand (IB) technology provides a converged network capability by transporting inter-processor communication, LAN, and storage protocols. The two most common storage protocols for IB are SCSI Remote Direct Memory Access Protocol (SRP) and iSCSI Extensions for RDMA (iSER). These protocols use the RDMA capabilities of IB. SRP builds a direct SCSI-to-RDMA mapping layer and protocol, and iSER copies data directly to the SCSI I/O buffers without intermediate data copies (Figure 1, left of center). These protocols are lightweight but not as streamlined as FC. Widespread deployment was impractical because of the perceived high cost of IB and the complex gateways and routers needed to translate from these IB-centric protocols and networks to the native FC storage devices in data centers. High Performance Computing (HPC) environments that have adopted IB as the standard transport network use the SRP and iSER protocols.

    Figure 1. Comparison of multiple protocol stacks for converged networks

(Figure 1 compares, left to right, the Fibre Channel, InfiniBand, iSCSI, FCIP/iFCP, and FCoE/DCB protocol stacks.)

Internet SCSI (iSCSI) was an attempt to bring a direct SCSI-to-TCP/IP mapping layer and protocol to the mass Ethernet market, to drive costs lower, and to allow deploying SANs over existing Ethernet LAN infrastructure. iSCSI technology (Figure 1, center) was very appealing to the small and medium business market because of the low-cost software initiators and the ability to use any existing Ethernet LAN. However, iSCSI typically requires new iSCSI storage devices that lack the features in devices using FC interfaces. Also, iSCSI-to-FC gateways and routers are very complex and expensive. They do not scale cost effectively for the enterprise. Most enterprise businesses have avoided iSCSI or have used it for lower tier storage applications or for departmental use.

FC over IP (FCIP) and Internet FC Protocol (iFCP) map FCP and FC characteristics to LANs, MANs, and WANs. Both of these protocols map FC framing on top of the TCP/IP protocol stack (Figure 1, right of center). FCIP is a SAN extension protocol to bridge FC SANs across large geographical areas. It is not for host/server or target/storage attachment. The iFCP protocol allows Ethernet-based hosts to attach to FC SANs through iFCP-to-FC SAN gateways. These gateways and protocols were never widely adopted except for SAN extension because of their complexity, lack of scalability, and cost.

    Network convergence with FCoE

FCoE is the next attempt to converge block storage protocols onto Ethernet. FCoE relies on an Ethernet infrastructure that uses a new set of Data Center Bridging (DCB) standards defined by the IEEE (Figure 1, right). Converged Enhanced Ethernet (CEE) is Ethernet infrastructure that implements DCB. Although the DCB standards can apply to any IEEE 802 network, most people use the term to refer to enhanced Ethernet, making DCB and CEE equivalent terms. We use the term DCB to refer to an Ethernet infrastructure that implements at least the minimum set of DCB standards to carry FCoE protocols.

A traffic class (TC) is a traffic management element. DCB enhances low-level Ethernet protocols to send different traffic classes to their appropriate destinations. It also supports lossless behavior for selected TCs, for example, those that carry block storage data. FCoE with DCB tries to mimic the lightweight nature of native FC protocols. It does not incorporate TCP or even IP protocols. This means that FCoE is a non-routable protocol meant for local deployment within a data center. The main advantage of FCoE is that switch vendors can easily implement the logic for converting FCoE/DCB to native FC in high performance switch silicon. FCoE solutions should cost less as they become widely used.

    10 Gigabit Ethernet

One obstacle to using Ethernet for converged networks has been its limited bandwidth. As 10 Gigabit Ethernet (10 GbE) technology becomes more widely used, 10 GbE network components will fulfill the combined data and storage communication needs of many applications. With 10 GbE, converged Ethernet switching fabrics handle multiple TCs for many data center applications. DCB-capable Ethernet gives you maximum flexibility in selecting network management tools. As Ethernet bandwidth increases, fewer physical links can carry more data (Figure 2).

    Figure 2. Multiple traffic types sharing the same link

    HP Virtual Connect Flex-10

Virtual Connect (VC) Flex-10 technology lets you partition the Ethernet bandwidth of each 10 Gb Ethernet port into up to four FlexNICs. The FlexNICs function and appear to the system as discrete physical NICs, each with its own PCI function and driver instance. The partitioning must be in increments of 100 Mb.

While FlexNICs share the same physical port, traffic flow for each is isolated with its own MAC address and VLAN tags between the FlexNIC and associated VC Flex-10 module. Using the VC Manager CLI or GUI, you can set and control the transmit bandwidth available to each FlexNIC according to server workload needs. With the VC Flex-10 modules now available, each dual-port Flex-10 enabled server or mezzanine card supports up to eight FlexNICs, four on each physical port. Each VC Flex-10 module can support up to 64 FlexNICs.

Flex-10 adds LAN convergence to VC's virtual I/O technology. It aggregates up to four separate traffic streams into a single 10 Gb pipe connecting to VC modules. VC then routes the frames to the appropriate external networks. This lets you consolidate and better manage physical connections, optimize bandwidth, and reduce cost.
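As a concrete illustration of these rules, the sketch below checks a hypothetical per-port FlexNIC bandwidth plan against the constraints just described: at most four FlexNICs per 10 Gb port, with allocations in 100 Mb increments that fit within the port. The function name and example values are ours, not part of the VC Manager interface.

def validate_flexnic_partition(allocations_mb):
    """Check a per-port FlexNIC bandwidth plan (values in Mb)."""
    PORT_BANDWIDTH_MB = 10_000                      # one 10 Gb Ethernet port
    if not 1 <= len(allocations_mb) <= 4:
        raise ValueError("a Flex-10 port supports one to four FlexNICs")
    if any(bw % 100 for bw in allocations_mb):
        raise ValueError("bandwidth is set in 100 Mb increments")
    if sum(allocations_mb) > PORT_BANDWIDTH_MB:
        raise ValueError("allocations exceed the 10 Gb physical port")
    return True

# Hypothetical plan: 4 Gb production LAN, 4 Gb storage traffic,
# 1.5 Gb VM migration, 0.5 Gb management.
validate_flexnic_partition([4000, 4000, 1500, 500])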


    HP Virtual Connect FlexFabric with FCoE

Now that we have achieved an acceptable level of LAN convergence with Flex-10 technology, the next logical step is to add LAN/SAN convergence technology. Virtual Connect FlexFabric broadens Virtual Connect Flex-10 technology to provide solutions for converging different network protocols. We plan to deliver the FlexFabric vision by converging technology, management tools, and partner product portfolios into a virtualized fabric for the data center.

Emerging standards for network convergence

Converged networks require new standards. The International Committee for Information Technology Standards (INCITS) T11 technical committee creates the standards that relate to storage and storage networking technologies. The IEEE 802.1 Work Group is responsible for developing two types of standards:

• Standards common to all IEEE 802 defined network types (for example, Ethernet and Token Ring)
• Standards necessary to support communication within and between these network types

FCoE standard

FCoE is an emerging technology under development by the INCITS T11 technical committee. INCITS/ANSI T11.3 FC-BB-5 is the official standard. It includes two protocol definitions: FCoE and FCoE Initialization Protocol (FIP). The FCoE protocol defines the encapsulation of FC frames into Ethernet frames. FIP defines a fabric discovery protocol, creates an Ethernet version of FC fabric login services, and defines the protocols for handling MAC address assignment and association with World Wide Names (WWNs). FCoE relies on the improved flow control, well-defined traffic shaping, and multiple TC support that the IEEE 802.1 DCB standards provide.

    FCoE protocol encapsulation

FCoE is different from previous attempts to move SCSI traffic over Ethernet. The FCoE protocol allows efficient, high performance conversion between FCoE links and FC links in layer 2 switches. DCB enhancements offer lossless operation for some TCs. This lets us place the FC protocol directly on top of layer 2 (link layer) Ethernet, so we don't have to rely on more complex transport protocols such as TCP to ensure lossless behavior. Implementing FCoE in this way lets us develop devices such as adapters and switches that use most of the existing FC logic on top of the new DCB/Ethernet physical interfaces.

The FCoE protocol encapsulation standard requires IEEE 802.1Q tags. Each FCoE frame contains explicit TC/priority tags for efficient processing in layer 2 DCB-capable Ethernet switches. Data centers deploy FCoE for intra-data center use with a span similar to that of a switched LAN subnet or SAN fabric, because FCoE is a layer 2 protocol and does not use the layer 3 IP protocol.

FCoE encapsulates FC frames, including FC frame delimiters, headers, payload, and frame check sequence, within Ethernet frames using the format illustrated in Figure 3.


    Figure 3. Illustration of an FCoE frame
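As a textual companion to Figure 3, the sketch below lists the main fields of an FCoE frame in order: the Ethernet header (with the 802.1Q tag and the FCoE Ethertype 0x8906), the encapsulation header carrying the encoded start-of-frame delimiter, the complete FC frame, the encoded end-of-frame delimiter, and the Ethernet frame check sequence. The byte counts are a simplified reading of the encapsulation, not the normative FC-BB-5 layout.

# Simplified, field-by-field view of an FCoE frame (sizes in bytes).
FCOE_ETHERTYPE = 0x8906

FCOE_FRAME_LAYOUT = [
    ("destination_mac", 6,     "FCF MAC or FPMA-assigned CNA MAC"),
    ("source_mac",      6,     "sender's MAC address"),
    ("ieee_802_1q_tag", 4,     "VLAN ID plus the priority bits used by PFC/ETS"),
    ("ethertype",       2,     "0x8906 marks the frame as FCoE"),
    ("fcoe_header",     14,    "version, reserved bytes, encoded SOF delimiter"),
    ("fc_frame",        "var", "FC header, payload, and CRC, carried unchanged"),
    ("fcoe_trailer",    4,     "encoded EOF delimiter plus reserved bytes"),
    ("ethernet_fcs",    4,     "Ethernet frame check sequence"),
]

for name, size, purpose in FCOE_FRAME_LAYOUT:
    print(f"{name:<16} {str(size):>4}  {purpose}")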

Layer 2 encapsulation provides several advantages to FCoE over previous converged network implementations:

• Because devices use existing FC logic, FCoE devices use existing FC driver models for the new converged network adapters.
• We can easily implement FCoE in switches because the logic necessary to convert between FCoE and FC is simple.
• Existing FC security and management operations, procedures, and applications do not change when using an FCoE/DCB infrastructure for a partial or completely converged network.
• FCoE takes advantage of a lossless 10 GbE fabric with significantly higher bandwidth than 8 Gb FC fabrics (which actually deliver about 6.4 Gb of data once the encoding overhead in the FC protocol is accounted for).
• Future protocols can use the enhanced DCB Ethernet features that support FCoE.

Fibre Channel Forwarder

Fibre Channel Forwarder (FCF) is a function within a switch that acts as a translation point, converting FCoE traffic between DCB-enabled Ethernet ports and native FC ports. There is one FCF function in a switch for each upstream FC fabric connected to the FC ports of that switch. In other words, there can be more than one FCF function in a switch. An FCF also provides the portal where converged network adapters access the traditional SAN fabric services, for example fabric login, name services, and zoning services. When first initialized, converged network adapters discover the available FCFs in a DCB network. Through management direction, they attach themselves to at least one FCF to begin communication with a SAN fabric. During fabric login, FCFs provide the mechanism that negotiates MAC address provisioning to the FCoE portion of a converged network adapter. The most commonly used mechanism is Fabric Provided MAC Addresses, or FPMA. It works like addressing in a native FC network, where the address used in the frames is allocated at fabric login time. This is different from normal Ethernet NIC functions, which typically have a static address burned in at the factory.
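A minimal sketch of how a Fabric Provided MAC Address can be built: FC-BB-5 forms an FPMA by prefixing the 24-bit FC_ID assigned at fabric login with a 24-bit FC-MAP value. The FC-MAP and FC_ID values used below are only examples.

def fabric_provided_mac(fc_map: int, fc_id: int) -> str:
    """Concatenate a 24-bit FC-MAP prefix with a 24-bit FC_ID into a MAC."""
    if fc_map >> 24 or fc_id >> 24:
        raise ValueError("FC-MAP and FC_ID are each 24-bit values")
    mac = (fc_map << 24) | fc_id
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))

# Commonly used default FC-MAP with a hypothetical FC_ID granted by the FCF.
print(fabric_provided_mac(0x0EFC00, 0x7A1501))   # -> 0e:fc:00:7a:15:01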

    ENode

An ENode is a device that takes the place of the traditional LAN NIC and the FC HBA in a host or server. It is commonly called a converged network adapter (CNA). It provides both data communications and block storage communications through a converged network implemented with DCB-capable Ethernet. An ENode merges the traffic from the NIC and from the SCSI/FC functions into a stream of Ethernet frames to the DCB-enabled Ethernet network. Within the DCB network, a DCB/FCoE/FC switch disaggregates the converged traffic streams and sends the different TCs to their appropriate destinations: legacy LANs, legacy FC nodes, or DCB network nodes.


    The ENode (Figure 4) consists of these components:

• FCoE Controller: uses the FCoE Initialization Protocol (FIP) to discover the SAN fabrics through the FCFs and provisions the virtual N_Ports (VN_Ports) and FCoE Link End Points (LEPs).
• FCoE LEPs: convert FC frames to FCoE frames on the transmit side, and convert FCoE frames to FC frames on the receive side. There is one LEP for each VN_Port established in the ENode.
• VN_Ports: instantiate virtual N_Ports with N_Port ID Virtualization (NPIV) capability similar to a traditional FC HBA. The VN_Ports in an ENode include information about the MAC address to WWN translations required for proper communications with FCFs in a converged network.
• FC Function: the traditional logic implemented in an FC HBA. It handles storage discovery, storage connection management, error recovery, and host bus (PCIe) interface interoperation with the upper layer driver/SCSI drivers. Again, this function behaves so much like an FC HBA function that CNAs and HBAs from the same vendors typically use the same storage drivers in the host operating systems to control them. This makes deploying both converged and non-converged systems in a data center very easy during the transition to a converged infrastructure.

    Figure 4. FCoE architecture components
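To summarize the relationships shown in Figure 4, the sketch below models an ENode as an FCoE controller's view of its discovered FCFs plus one VN_Port/LEP pair per fabric login. The class and field names are illustrative, not taken from any standard or product API.

from dataclasses import dataclass, field
from typing import List

@dataclass
class FcoeLinkEndPoint:
    """Encapsulates/decapsulates FC frames for exactly one VN_Port."""
    vn_port_mac: str                     # FPMA assigned by the FCF at login

@dataclass
class VNPort:
    """Virtual N_Port created per fabric login (NPIV-like)."""
    wwpn: str                            # World Wide Port Name shown to the SAN
    fc_id: int                           # 24-bit address assigned by the fabric
    lep: FcoeLinkEndPoint                # its dedicated link end point

@dataclass
class ENode:
    """CNA-side FCoE endpoint: FIP discovery state plus VN_Port/LEP pairs."""
    nic_mac: str                                        # MAC used for FIP discovery
    discovered_fcfs: List[str] = field(default_factory=list)
    vn_ports: List[VNPort] = field(default_factory=list)

    def fabric_login(self, wwpn: str, fc_id: int, fpma: str) -> VNPort:
        """Record a successful fabric login: one new VN_Port with its own LEP."""
        port = VNPort(wwpn, fc_id, FcoeLinkEndPoint(fpma))
        self.vn_ports.append(port)
        return port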

    FCoE and Ethernet

FCoE requires DCB-enabled Ethernet. The IEEE is working to enhance the IEEE 802 network standards to allow FC, or any TC requiring lossless behavior, to run efficiently over many types of IEEE 802 compliant, MAC layer protocols, including Ethernet. We expect FCoE standard ratification in late 2010. It is important to understand that FCoE will not work on legacy Ethernet networks because it requires a lossless form of Ethernet. FC cannot tolerate dropped frames the way Ethernet allows today. It is possible to create a lossless Ethernet network using existing IEEE 802.3x flow control mechanisms. If the network carries multiple TCs, however, the existing mechanisms can cause Quality of Service (QoS) issues, limit the ability to scale a network, and affect performance.

    DCB standards

DCB is not just the name for a set of new standards the IEEE is developing. It is a term often used for Ethernet designed to carry multiple TCs, some with lossless behavior. You can think of DCB-enabled Ethernet as applying the DCB standards to the IEEE 802.3 Ethernet standards to create a new set of products that implement this improved version of Ethernet. The change from legacy Ethernet to DCB-enabled Ethernet requires hardware and software changes, so you can't upgrade legacy Ethernet NICs and switches to add the DCB support needed to carry FCoE traffic. Fortunately, you only have to update the data paths in a data center that carry FCoE with DCB-enabled Ethernet devices.

For full end-to-end data center use, all equipment manufacturers must agree to adopt four new IEEE protocols. The proposed standards are still under development, and full ratification of the complete set may take until late 2010 or 2011. One result of these ongoing standardization efforts is that DCB/FCoE products offered on the market today will likely need frequent software upgrades or even new hardware by the time DCB/FCoE technology is fully mature.

The DCB Task Group within the IEEE 802.1 Higher Layer LAN Protocols Work Group is defining DCB for protocols and technologies that apply to data center-oriented LAN communications. The standards they develop apply to all IEEE 802 network types, but they implicitly target Ethernet for primary implementation. Table 1 lists four new technologies defined in three DCB draft standards.

Table 1. DCB draft standards for IEEE 802 networks

• IEEE 802.1Qbb: Priority-based Flow Control (PFC)
• IEEE 802.1Qaz: Enhanced Transmission Selection (ETS) and DCB Capability Exchange Protocol (DCBX)
• IEEE 802.1Qau: Quantized Congestion Notification (QCN)

    These standards serve three general purposes:

• Allow IEEE 802 LANs to carry multiple traffic classes
• Support lossless behavior on a subset of these traffic classes
• Formally define standard frame transmission scheduling mechanisms to support multiple traffic classes

You don't have to use all four of these protocols to implement a DCB network, and you don't need to use all options available in each protocol. However, if vendors do not implement the entire set, products may limit the possible scale or features. Because the standards are evolving, current DCB/FCoE products do not implement all of these protocols or all their supported options. Therefore, we must discuss their deployment limitations within a data center.

    Priority-based Flow Control

Legacy FC networks support a link-level flow control mechanism known as buffer-to-buffer or credit-based flow control. This lightweight, high performance mechanism lets FC work in a lossless manner. Credit-based flow control provides the reliable layer 2 network required for block storage traffic, for example SCSI. To transport FC and SCSI protocols over Ethernet and maintain a lightweight implementation, we recommend providing a similar mechanism for Ethernet networks.

Legacy Ethernet uses a simple flow control mechanism. It uses pause frames to let a congested network device port on an Ethernet NIC or switch tell its link partner to pause all traffic for a specified time. This approach can limit performance when a network device port has multiple queues for receiving incoming frames of varying priority or TCs: if one queue becomes full, the device must send a pause frame to the other side of the link. This pauses all traffic, regardless of TC/priority.

Supporting lossless behavior of block storage protocols on legacy Ethernet networks requires using legacy pause frames. However, this forces all traffic to be lossless on that link. The most bursty or bandwidth-driven TCs dictate the behaviors of all TCs. Many types of traffic flows, for example real-time audio/video data streams, don't require lossless transmission and don't perform well on a lossless link. Even traditional TCP-based traffic flows optimized for lossy communications environments often don't perform well in lossless environments that transport different classes of traffic with vastly different characteristics simultaneously.

In Figure 5, low-bandwidth, latency-sensitive traffic for voice/video/financial transactions (green) and higher bandwidth bulk traffic for storage (red) are sent on a link. The receiving device has two sets of queues for receiving and storing data, one for green traffic and the other for red traffic. In this example, the high bandwidth bulk traffic will fill the red receive queue. Although the green traffic has plenty of queue space available, the receiving device sends a pause frame because the red queue is full. The transmitting device receives this pause frame and stops all traffic on the link. Long delays interrupt the low latency traffic.

    Figure 5. Legacy pause-based flow control

Priority-based Flow Control (PFC) uses a modified version of the pause frame called a Per Priority Pause (PPP) frame. PPP allows the pause frame to specify which priorities, and thus which TCs, to pause. PFC uses the priority levels in the class of service fields of the 802.1Qbb PPP frame header. When a network device has one or more receive queues that are nearly full, it constructs a PPP frame to send to the remote link partner. The remote device examines the class of service fields to determine which priorities/TCs to pause. The port's transmit function will stop sending the priorities/TCs going to the full ingress queues on the congested device without affecting priorities/TCs going to unfilled queues on the congested device.


Figure 6 illustrates the same scenario up to the point where the receiving node needs to send a pause frame. A PPP frame dictates pausing the red TC. The pause takes advantage of the class of service fields to restrict the pause to only the classes of traffic that have nearly full queues. The transmitting station stops sending red traffic; the latency-sensitive green traffic continues to flow properly.

    Figure 6. PFC-based flow control

Receive queues in a DCB Ethernet device will have high and low watermarks. If a queue fills up to the high watermark, the device generates a PPP frame. If the level of the queue drops below the low watermark, the device will send a PPP frame specifying a zero time to indicate that the link partner may send traffic for the affected TCs immediately. This allows an XON/XOFF-type operation on a per priority/TC basis. PPP frames allow a single frame to specify XON/XOFF behavior independently for any of up to eight priorities/TCs. This reduces the control frame overhead if devices support PFC on multiple TCs.
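A minimal sketch of that watermark behavior, assuming made-up queue sizes and thresholds: crossing the high watermark emits an XOFF for just that priority, and dropping below the low watermark emits an XON (a PPP entry with a zero pause time).

HIGH_WATERMARK = 80    # percent full: ask the link partner to pause this priority
LOW_WATERMARK = 20     # percent full: tell the link partner it may resume

class PriorityQueue:
    def __init__(self, priority):
        self.priority = priority
        self.paused = False

    def update(self, fill_percent):
        """Return 'XOFF', 'XON', or None for this priority."""
        if not self.paused and fill_percent >= HIGH_WATERMARK:
            self.paused = True
            return "XOFF"              # PPP entry with a non-zero pause time
        if self.paused and fill_percent <= LOW_WATERMARK:
            self.paused = False
            return "XON"               # PPP entry with a zero pause time
        return None

def build_ppp_frame(actions):
    """One PPP frame can carry XON/XOFF for any of the eight priorities."""
    return {prio: act for prio, act in actions.items() if act is not None}

queues = {p: PriorityQueue(p) for p in range(8)}
print(build_ppp_frame({3: queues[3].update(85), 5: queues[5].update(40)}))
# -> {3: 'XOFF'}: only priority 3 is paused; priority 5 keeps flowing.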

The FCoE protocol requires DCB-enabled Ethernet devices to support only one PFC-enabled priority/TC. Not all eight priorities/TCs have to support PFC. Many devices on the market today support only one PFC-enabled priority/TC. In the future, devices should support a greater number of PFC-enabled priorities/TCs, but that is not required for basic FCoE transport over DCB-enabled Ethernet links.

    Enhanced Transmission Selection

Legacy Ethernet supports multiple traffic management elements called traffic classes (TCs). IEEE 802.1Q (VLAN) tags with a class of service (CoS) field assign a transmission priority to each TC. You can implement up to eight TCs (TC0 through TC7) in an Ethernet device. Current standards and product implementations focus on transmitting the traffic classes in strict priority order. For applications operating completely at layer 2, the MAC layer, strict priority does not allow the fair, deterministic bandwidth control typically preferred for all but the very highest priority traffic classes. This includes converged networks that handle block storage traffic using a layer 2 encapsulation protocol, like FCoE.

One common misunderstanding about many modern Ethernet devices, particularly Ethernet switches, is that they already have bandwidth control and traffic shaping capabilities that support layer 2 protocols like FCoE. But these devices typically define traffic classes based on layer 3 (IP) or layer 4 information in frames, not by the priority field of the IEEE 802.1Q tag or the Ethertype (protocol) field in the Ethernet frame header.

The Enhanced Transmission Selection (ETS) standard formally defines how the port transmit logic of an Ethernet device selects the next frame to send from one or more priority/traffic class queues for layer 2, or MAC based, protocols. This lets the device allocate bandwidth between layer 2 defined traffic classes and support strict priority scheduling for traffic classes requiring it. ETS refines the existing TCs. ETS adds a bandwidth-sharing algorithm that you can assign to each of the supported TCs. When you configure a TC to use the ETS bandwidth-sharing algorithm, you must provide a bandwidth percentage.

Traffic class queues that are part of TCs assigned a strict priority-scheduling algorithm (typically the default algorithm) are processed in strict priority order. They have three typical uses:

• Extremely high priority network control or management traffic
• Low-bandwidth/low-latency traffic
• Jitter (variable latency) sensitive or intolerant traffic

The ETS standard specifies that once all the strict priority TC queues are empty, the device sends frames from the TCs assigned an ETS scheduling algorithm. A single ETS TC can have more than one priority queue.

There is a common misconception about the ETS bandwidth-sharing algorithm. Some people think that the bandwidth percentage assigned to an ETS traffic class is a percentage of the link bandwidth for the port. That is not true. ETS bandwidth percentages represent the percentage of available bandwidth after satisfying all of the strict priority TCs. That is, if the strict priority TCs take up 4 Gb/s of the link bandwidth of a 10 Gb/s link, an ETS queue assigned 50 percent bandwidth is asking for 50 percent of the remaining 6 Gb/s of link bandwidth, or 3 Gb/s.
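Written out as a small helper, the arithmetic from that example (the function and the 10 Gb/s figures are purely illustrative):

def ets_bandwidth_gbps(link_gbps, strict_priority_gbps, ets_percent):
    """ETS percentages apply to the bandwidth left after strict priority TCs."""
    return (link_gbps - strict_priority_gbps) * ets_percent / 100

# Strict priority TCs consume 4 Gb/s of a 10 Gb/s link, so a TC assigned
# 50 percent under ETS receives half of the remaining 6 Gb/s, or 3 Gb/s.
print(ets_bandwidth_gbps(10, 4, 50))   # -> 3.0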

The ETS standard does not specify the bandwidth allocation algorithm that DCB-enabled Ethernet devices must use to select frames from the TCs. Device vendors get to decide the best algorithms for their products. The standard does suggest that deficit weighted round robin (DWRR) and a handful of other algorithms would suffice. The ETS standard also does not specify the algorithm for selecting frames for transmit from multiple priority queues assigned to the same TC. The standard suggests that using a strict priority algorithm between these queues is one possibility.
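Since the standard only suggests suitable algorithms, here is a minimal deficit weighted round robin (DWRR) sketch over ETS traffic classes; the queue contents, weights, and quantum are hypothetical and ignore strict priority TCs entirely.

from collections import deque

def dwrr_schedule(queues, weights, quantum=1500, rounds=4):
    """Minimal DWRR: each round a TC earns credit in proportion to its ETS
    share and may send frames until its credit (deficit counter) runs out."""
    deficit = {tc: 0 for tc in queues}
    sent = []
    for _ in range(rounds):
        for tc, frames in queues.items():
            if not frames:
                continue
            deficit[tc] += quantum * weights[tc] / 100
            while frames and frames[0] <= deficit[tc]:
                frame = frames.popleft()
                deficit[tc] -= frame
                sent.append((tc, frame))
            if not frames:
                deficit[tc] = 0        # unused credit is not carried forward
    return sent

queues = {"TC0": deque([1500, 1500, 1500]),
          "TC2": deque([1500, 1500]),
          "TC3": deque([500])}
weights = {"TC0": 50, "TC2": 40, "TC3": 10}     # ETS bandwidth percentages
print(dwrr_schedule(queues, weights))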

As Ethernet frames of varying priority queue up for transmission on a port, the device maps them into priority queues and traffic classes. The device then places the frames into independent priority or traffic class queues. Network administrators responsible for managing the port on the network device are responsible for configuring these assignments. The ETS standard specifies that these administrators are also responsible for assigning the scheduling algorithm for each traffic class.

In Figure 7, priority 5 frames are in TC4, and priority 1 frames are in TC1. Strict priority was the scheduling algorithm for both TCs, so the device sends their frames before any frames of TCs assigned the ETS scheduling algorithm. In this case, the device sends frames for TC4 before any frames from TC1. If there are no frames in the queue for TC4, then the device sends frames in TC1 before any frames in any of the other TCs. TCs assigned with ETS scheduling (TC0, TC2, and TC3) have been allocated 50, 40, and 10 percent of the available bandwidth, respectively. These allocations are the percentage of bandwidth available after the transmit requirements of TC4 and TC1 are satisfied.


    Figure 7. Example of an Enhanced Transmission Selection (ETS) configuration

Also in Figure 7, note that TC2 has priority queues 2 and 3. The ETS standard suggests that frames transmit from the TC2 queues in strict priority order. In this example, the device sends any frames in the queue for priority 3 before any frames in the queue for priority 2. Again, the standard leaves the implementation of scheduling for these intra-TC queues to device vendors. Vendors might schedule the two queues in strict priority order or in round robin. Some implementations may be configurable to allow either mode.

The FCoE protocol requires DCB-enabled Ethernet devices to support at least two TCs that support ETS scheduling algorithms: one to support traditional data communication traffic and one to support FCoE traffic. Many devices on the market only support two TCs with ETS capability. In future generations of hardware, devices should support more TCs capable of ETS bandwidth scheduling, but this is not required for basic FCoE transport over DCB-enabled Ethernet links.

Those who adopt this technology must clearly understand another important aspect of ETS performance. ETS bandwidth allocation is merely a best-effort specification of a minimum bandwidth guarantee. Many factors can limit a device's ability to meet these bandwidth requirements consistently. The bandwidth consumed by the strict priority queues can directly affect the amount of bandwidth available for ETS traffic classes. When a port receives a per priority pause (PPP) frame from its link partner, all transmission from that traffic class, or from the priority queue within the traffic class, stops for the duration of the pause. This could dramatically reduce the effective throughput of that traffic class. Finally, implementing congestion notification can also affect the amount of data transmitted from a traffic class, but not as severely as PFC's effect on ETS.

Quantized Congestion Notification

The IEEE 802.1Qau standard specifies a protocol called Quantized Congestion Notification (QCN). The QCN protocol supports end-to-end flow control in large, multi-hop, DCB-enabled, switched Ethernet infrastructures. It is one of the most significant standards for enabling converged network deployments in moderate to large data centers. PFC protects against occasional bursty congestion on a single link between DCB-enabled devices. QCN protects larger multi-hop or end-to-end converged networks from persistent or chronic congestion. These multi-hop networks are susceptible to congestion because typical tree-like network architectures tend to have choke points where multiple sources of data compete for network resources and bandwidth to reach a smaller number of destinations. Typical shared storage traffic patterns especially compound this issue. QCN does not guarantee a lossless environment in the DCB-enabled LAN. You must use QCN in conjunction with PFC to provide lossless operation with smooth congestion management across large DCB-enabled networks.

QCN uses a special new tag that allows sources of traffic, for example CNAs, to identify traffic flows to all interconnect devices in a QCN-enabled DCB network. QCN defines two specific points in a network that implement the QCN protocol: congestion points and reaction points. The QCN protocol has these basic procedural elements:

• Reaction points initiate traffic into the network. They can include CNAs, target nodes, or DCB-enabled switches that bridge between native FC networks and the DCB-enabled Ethernet network. Reaction points tag their frames with traffic flow information identifying the source and destination of the traffic flow.
• When transmit queues fill up due to congestion from oversubscription, congestion points (typically switches) statistically sample the frames in the congested transmit queues to identify the traffic flows contributing most to the congestion.
• The congestion point device calculates congestion feedback quanta for each traffic source sampled. The device uses information from the sampled traffic flow tags to send congestion notifications back to the traffic sources.
• Upon receiving the congestion notification, a reaction point will use the feedback quanta to reduce the transmission rate for that traffic flow to that specific destination. QCN does not affect traffic sent on unrelated flows to unrelated destinations.
• If a reaction point receives no further congestion notification messages, it slowly increases its transmit rates until they reach normal levels.

Most DCB-enabled Ethernet switches will implement congestion points.

We can roughly equate QCN operation to the TCP window algorithms that restrict traffic flow when the device detects lost frames. In the case of QCN, however, the protocol operates at layer 2 in the network. It uses high-performance, low-level hardware to improve the network's ability to react to congestion. Figure 8 illustrates a multi-hop network that implements QCN.

    Figure 8. QCN congestion notification

(Figure 8 shows data flowing from reaction points in the servers toward storage, with congestion points in the intermediate switches returning congestion notification messages to the reaction points.)

In this example, multiple CNAs in servers are sending write data to a common storage device through a multi-hop network. As a switch queue fills and surpasses a high watermark, the device sends congestion notification messages to the server CNAs. The switch selects the server CNAs by statistically sampling the congested queue. The congestion notification occurs dynamically, sending higher feedback quanta to CNAs producing the most traffic and lower feedback quanta to sources producing less traffic. As a result, the CNAs throttle down their transmit rates on congested traffic flows. The decrease in traffic flow rates reduces the number of frames in the congested queue in the switch to achieve a more sustainable, balanced level of performance. As the congestion eases, the switch reduces or stops sending notifications and the CNAs start to accelerate the throughput rate. This active feedback protocol continuously balances traffic flow.
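A rough sketch of that feedback loop, with simplified formulas and made-up constants rather than the 802.1Qau equations: the congestion point derives a feedback quantum from how far its queue sits above a target level, and the sampled reaction point cuts its rate in proportion to that quantum, then recovers gradually while no further notifications arrive.

def congestion_feedback(queue_len, target_len=20, weight=2):
    """Quantized feedback that grows as the queue exceeds its target level."""
    offset = queue_len - target_len
    return max(0, offset + weight * offset)       # 0 means no congestion

class ReactionPoint:
    """Rate limiter for one traffic flow at a CNA (illustrative only)."""
    def __init__(self, line_rate_gbps=10.0):
        self.rate = line_rate_gbps
        self.target_rate = line_rate_gbps

    def on_congestion_notification(self, feedback, gain=0.002):
        """Multiplicative decrease proportional to the feedback quantum."""
        self.target_rate = self.rate
        self.rate *= max(0.5, 1.0 - gain * feedback)

    def on_quiet_interval(self):
        """No notifications received: recover halfway toward the old rate."""
        self.rate = (self.rate + self.target_rate) / 2

cna = ReactionPoint()
cna.on_congestion_notification(congestion_feedback(queue_len=60))
print(round(cna.rate, 2))    # throttled below 10 Gb/s for this flow
cna.on_quiet_interval()
print(round(cna.rate, 2))    # recovering toward the previous rate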

It is possible to construct simple converged networks of one or two switch hops without QCN. In fact, the FCoE protocol does not require the use of QCN in DCB-enabled Ethernet equipment. However, the general understanding is that building relatively complex multi-hop or end-to-end, data-center-wide converged networks based on DCB-enabled Ethernet equipment requires enabling QCN in this infrastructure. Networks that use the QCN protocol face several challenges:

• QCN protocol complexity: Implementing the flow tagging, statistical sampling, and congestion messaging is relatively complex. Identifying the proper timing and quanta of notification feedback to satisfy a wide variety of operating conditions is also difficult.
• Difficult interoperability process: Perfecting multi-vendor interoperability could take several years because of protocol complexity.
• No QCN support in current generation products: No DCB/FCoE products shipping today support the QCN protocol. Furthermore, most, if not all, products will require a hardware upgrade to support QCN. Products claiming to support QCN have unproven, untested hardware implementations. Vendors haven't performed any rigorous interoperability tests with production level QCN software.
• Complete end-to-end support requirement: To enable QCN in a network, the entire data path must support the QCN protocol. All hardware across the DCB-enabled network must support QCN. This poses a significant problem because upgrading existing first-generation, DCB-based converged networks requires replacing or upgrading all DCB components.

Because of these challenges, only one-hop and two-hop networks will be reliable until next generation hardware becomes available to support QCN. Most currently shipping hardware cannot support QCN and cannot be software upgraded to add this support. Therefore, larger DCB-based network deployments will require hardware upgrades.

    Data Center Bridging Exchange

    Data Center Bridging Exchange (DCBX) protocol provides two primary functions:

• Lets DCB-enabled Ethernet devices/ports advertise their DCB capabilities to their link partners
• Lets DCB-enabled Ethernet devices push preferred parameters to their link partners

DCBX supports discovery and exchange of network configuration information between DCB-compliant peer devices. DCBX enhances the Link Layer Discovery Protocol (LLDP) with more network status information and more parameters than LLDP. The specification separates DCBX exchange parameters into administered and operational groups. The administered parameters contain network device configurations. The operational parameters describe the operational status of network device configurations. Devices can also specify a willingness to accept DCBX parameters from the attached link partner. This is most commonly supported in CNAs that allow the attached DCB-enabled switch to set up their parameters.


    NOTE

Link Layer Discovery Protocol (LLDP), IEEE 802.1AB, defines a protocol and a set of managed objects that can be used for discovering the physical topology and connection end-point information from adjacent devices in 802 LANs and MANs. The protocol is not restricted from running on non-802 media. (http://www.ieee802.org/1/pages/802.1ab.html)

Table 2. DCBX supported parameters

PFC parameters advertised:
• Indication of which priorities have PFC enabled
• Willingness to accept PFC recommendations (CNA)
• Number of priorities that can support PFC
• MACsec bypass capability

ETS parameters advertised:
• Number of traffic classes supported on the port
• Priority to traffic class mapping
• Willingness to accept ETS recommendations (CNA)
• Traffic class bandwidth allocations (for ETS TCs)
• Bandwidth allocation algorithms for each TC

QCN parameters: not currently in the standard

Other parameters: how applications, for example FCoE, map to priorities

Figure 9 illustrates DCBX parameter negotiation between a CNA and the attached switch port where neither device is willing to accept DCBX parameter recommendations. In this case, the CNA and switch advertise DCB capabilities to each other. The adapter chooses a storage traffic priority that is not compatible with the switch. The CNA and switch cannot properly exchange storage traffic with one another, so communication on that link does not happen. Typically, this generates an error that prompts you to reconfigure either the CNA or the switch parameters to make them compatible. The same situation can occur on links between switches.


    Figure 9. DCBX static parameter exchange


The DCBX protocol's strength lies in its ability to perform dynamic negotiation using attributes called recommended and willingness. CNAs and switches using DCBX can advertise their willingness to adopt parameter settings from their link partner. In the example shown in Figure 10, a CNA communicates the initial exchange of ETS and PFC information and its willingness to consider parameters from the switch. The switch acknowledges this willingness and sends the CNA the recommended values for the ETS and PFC parameters. If the CNA can successfully adopt the recommended parameters, the CNA will re-advertise its DCBX parameters using the recommended values. The two devices will then be able to communicate on the established link.
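A compact sketch of the willing/recommended exchange, reduced to the parameters discussed in this brief. The dictionaries and the negotiation function are illustrative; real DCBX runs as LLDP TLV exchanges between the devices, not as function calls.

def dcbx_negotiate(cna, switch):
    """Resolve the operational DCB parameters for one link (illustrative)."""
    if cna["willing"] and not switch["willing"]:
        # The CNA adopts the switch's recommended values and re-advertises them.
        return dict(switch["params"]), "link up: CNA adopted switch recommendation"
    if cna["params"] == switch["params"]:
        return dict(cna["params"]), "link up: statically matched configuration"
    # Neither side yields and the configurations differ (the Figure 9 case).
    return None, "mismatch: storage traffic cannot flow until reconfigured"

switch = {"willing": False,
          "params": {"pfc_priorities": [3], "ets_percent": {"lan": 50, "fcoe": 50}}}
cna = {"willing": True,
       "params": {"pfc_priorities": [4], "ets_percent": {"lan": 60, "fcoe": 40}}}

operational, status = dcbx_negotiate(cna, switch)
print(status)        # link up: CNA adopted switch recommendation
print(operational)   # the switch's recommended PFC and ETS settings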


    Figure 10. DCBX dynamic negotiation

(Figure 10 sequence: the CNA advertises its willingness, the switch responds with recommended parameters, and the CNA re-advertises the new parameter values.)

    Migrating to converged fabrics

In a one-hop architecture, converged traffic goes from a server to a switch that splits it into Ethernet and Fibre Channel. In a two-hop architecture, converged traffic goes to a second switch before the split. The more switch hops in a DCB-enabled network, the more difficult it is to keep the network operating at peak efficiency while minimizing congestion. Figure 11 shows the expected industry path to convergence.

    Figure 11. Industry path to convergence


This is the first phase of migration to converged fabrics. CNAs will connect to converged fabric access switches that support DCB-enabled Ethernet, legacy Ethernet, and legacy FC. The CNAs will provide converged connectivity between servers and the first hop switch before disaggregating the traffic to the legacy LAN and SAN infrastructure. Figure 12 compares traditional deployment to the first phase of converged network deployment.

    Figure 12. Comparison of traditional deployment and converged network, phase 1

Figure 13 shows how the next phases of deployment may occur as you update existing data centers or build new ones. Eventually a server will require only a single pair of redundant CNAs. Converged network switches will replace separate FC, 10 GbE, and IB switches.


    Figure 13. Converged network deployment, phases 2 and 3

    HP strategy

We believe that the transition to DCB/FCoE can be graceful. It need not disrupt existing network infrastructures if you first deploy at the server-to-network edge and then migrate farther into the network.

With this approach, you will gain the immediate benefit of reduced cable and adapter hardware with the least amount of disruption to the overall network architecture.

As you deploy new servers, you can deploy DCB/FCoE with new CNAs and DCB/FCoE/FC enabled edge/access switches. Doing this will optimize, simplify, and reduce the cost of the server-to-network edge infrastructure, and you won't have to replace the entire data center communications infrastructure. You should start by implementing DCB/FCoE technology only with those servers requiring access to FC SAN storage targets. Many data centers average about 60 to 80 percent LAN-only network attachment, so only the remaining 20 to 40 percent would need both LAN and SAN.

Not all servers need access to FC SANs. Looking forward, many IT organizations are re-evaluating the network storage connectivity of their server infrastructure. Besides DCB/FCoE technology, other methods of converging traffic include iSCSI protocols with 10 Gb storage devices, and file-oriented network storage protocols such as NFS or CIFS. Neither of these technologies requires a DCB-enabled Ethernet network. Both can operate on traditional 1/10 Gb Ethernet infrastructure.

Transitioning the server-to-network edge first to accommodate FCoE/CEE will maintain the existing architecture structure and management roles, keeping the existing SAN and LAN topologies. Updating the server-to-network edge offers the greatest benefit and simplification without disrupting the data center architecture.


Copyright 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

    For more information

HP Multifunction Networking Products
http://h18004.www1.hp.com/products/servers/proliant-advantage/networking.html

HP ProLiant networking: Ethernet network adapters
http://h18004.www1.hp.com/products/servers/networking/index-nic.html

Server-to-network edge technologies: converged networks and virtual I/O technology brief
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02044591/c02044591.pdf

Ethernet technology for industry-standard servers technology brief
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02475134/c02475134.pdf

HP FlexFabric and Flex-10 technology technology brief
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01608922/c01608922.pdf

Server virtualization technologies for x86-based HP BladeSystem and HP ProLiant servers technology brief
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01067846/c01067846.pdf

HP Virtual Connect Technology web page
http://isscontent.americas.hpqcorp.net/products/blades/virtualconnect/

    Call to action

Send comments about this paper to [email protected]
