Stretched Active-Active Application Centric Infrastructure (ACI) Fabric
May 12, 2015

Abstract
This white paper illustrates how the Cisco Application Centric Infrastructure (ACI) can be implemented as a single fabric stretched between two data centers, with the ability to optimally route traffic into and out of the data center where the resource is located. In today's data centers, uptime and availability are paramount. IT organizations and the applications they provide are becoming more vital to the overall business they serve. Application owners are demanding a service level agreement of 'five nines' or better availability. Another key requirement is the ability to move compute workloads between virtualized hosts within and between data centers while still providing service to customers.
World Wide Technology Advanced Technology Center
WWT designed, architected and implemented this ACI Active/Active Stretched Fabric use case in its Advanced Technology Center (ATC).
The ATC represents a significant investment in technology infrastructure with hundreds of racks of networking, compute and storage products used to demonstrate and deploy integrated architectural solutions for WWT customers, partners and employees.
Powered by a multi-tenant private cloud infrastructure, the ATC is organized into four groups of labs for research and development, testing, training and integration. Each lab addresses different phases for the introduction, evolution and lifecycle of technology products.
The ATC ecosystem is defined by the combined experience of WWT Consulting Systems Engineers, IT Operations and Professional Services Engineers, along with the knowledge of peers from manufacturing partners and customers. This ecosystem of organizations provides thought leadership from a multi-disciplined technology perspective aligned by the common goal of integrating the right technology solutions to address and resolve real-world technical and business challenges.
For this use case, WWT architects, engineers and programmers used the Next Generation Data Center (NGDC) environment. The NGDC environment takes a holistic approach to helping customers realize distributed data center, data center automation/orchestration, and hybrid cloud designs.
Cisco Partnership
This white paper is the result of the partnership between WWT and Cisco Systems in developing solutions for the next generation of data centers. The Cisco Application Centric Infrastructure (ACI) is designed to manage a system of network switches and compute resources through redundant Application Policy Infrastructure Controllers (APICs). The network fabric is managed to support specific application requirements.
Uptime and Availability Challenge
Active/Active data centers are implemented in several architectures. The most common design involves distributing the application or service across two different data centers. This approach uses a Global Site Load Balancer (GSLB) to direct the client to the data center that contains the application host, based on DNS load balancing policy.
There are a few challenges with this approach:
1. The DNS time to live (TTL) value must expire before the user is redirected to the new location (data center) for the application.
2. Layer 2 extension is required between the data centers, along with a solution for Source Network Address Translation (SNAT).
3. Layer 2 extension has its own challenges, including traffic hair-pinning and asymmetrical traffic patterns.
Further detail about these challenges is covered in the appendix on using GSLB for data center traffic redirection.
The Active/Active ACI stretched fabric architecture addresses these challenges by using policy that spans both data centers.
Networking Overlays
In the last five years, a number of overlay protocols have been implemented to address sub-optimal traffic routing in active/active data center environments. A network overlay typically provides either a Layer 2 or Layer 3 service. Some of the common data center Layer 2 network overlays are FabricPath (TRILL), OTV, and VXLAN. Layer 3 overlays include GRE, BGP MPLS VPNs, and LISP. An overlay provides the basic service of encapsulating a frame or packet and transmitting it over the underlay network to the remote overlay tunnel endpoint, where it is decapsulated and forwarded. The overall goal is to provide a Layer 2 or Layer 3 service that is not native to the underlying Ethernet/IP network, while hiding the underlay from the two endpoints communicating over the overlay. OTV is a commonly deployed overlay that connects two data centers at Layer 2 and allows in-service workload mobility. LISP is a Layer 3 overlay that addresses some of the inbound routing correction challenges described previously.
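To make the encapsulation idea concrete, the following minimal Python sketch prepends an 8-byte VXLAN header (per RFC 7348) to an inner frame; in a real deployment the result would then be carried across the underlay inside an outer UDP/IP header. The VNI value and frame contents are illustrative placeholders.

import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned UDP destination port for VXLAN

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend an 8-byte VXLAN header (RFC 7348) to an inner Ethernet frame.

    Only the VXLAN header is shown here; the outer UDP/IP/Ethernet header
    would be supplied by the underlay network.
    """
    flags = 0x08  # 'I' bit set: the VNI field is valid
    header = struct.pack("!B3x", flags) + struct.pack("!I", vni << 8)
    return header + inner_frame

def vxlan_decapsulate(payload: bytes):
    """At the remote tunnel endpoint, strip the header and recover the frame."""
    vni = struct.unpack("!I", payload[4:8])[0] >> 8
    return vni, payload[8:]

# Illustrative use: encapsulate a placeholder frame for VNI 10100 and recover it.
encapped = vxlan_encapsulate(bytes(14), 10100)
print(vxlan_decapsulate(encapped))  # -> (10100, 14 zero bytes)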
Cisco Application Centric Infrastructure Overview
The Cisco Application Centric Infrastructure (ACI) fabric consists of three components: a controller, policy, and network infrastructure. The central controller, the Application Policy Infrastructure Controller (APIC), implements network policy for forwarding packets on switches in a spine-and-leaf architecture. The APIC abstracts the network infrastructure and provides a central policy engine. Configuration of the fabric and implementation of policy are performed through the northbound REST API of the APIC. Multiple controllers are attached to separate leaf switches for availability. Configuration changes made on one controller are communicated to and stored across all controllers in the fabric.
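As an illustration of the northbound API, the following Python sketch authenticates to an APIC and posts a simple policy object (a tenant). The hostname, credentials, and tenant name are placeholders; a production script would handle certificate validation and errors more carefully.

import requests

APIC = "https://apic.example.com"  # placeholder APIC address

session = requests.Session()

# Authenticate; the APIC returns a session cookie that is reused on later calls.
login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
session.post(f"{APIC}/api/aaaLogin.json", json=login, verify=False).raise_for_status()

# Push a simple policy object (a tenant) into the fabric-wide policy universe 'uni'.
tenant = {"fvTenant": {"attributes": {"name": "ExampleTenant"}}}
resp = session.post(f"{APIC}/api/mo/uni.json", json=tenant, verify=False)
resp.raise_for_status()
print(resp.status_code)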
Switches serve either a spine or leaf role. Leaf switches can also have additional sub-roles within the ACI fabric: border or transit leaf. A border leaf switch has a Layer 3 connection to external networks. Recent releases of ACI software support disjointed leaf switches, that is, leaf switches that do not have connections to every spine within the fabric. A disjointed leaf can be a transit leaf, connecting two spines located in different physical locations. By connecting the two spines together with the transit leafs, the two locations are controlled with a single policy by a cluster of APICs distributed across both locations. In addition to support for transit leaf switches, 40 Gigabit long-range QSFP optics provide connectivity of up to 30 kilometers between the sites.
This topology is illustrated in the following figure.
Figure 4 – Transit Leaf Topology
Design Overview
This design demonstrates how a single ACI fabric can be implemented across separate data center environments with a single administrative network policy domain. Bare-metal hosts and hosts running hypervisors for virtualization (Microsoft Hyper-V and VMware ESXi) are defined and managed by the APICs regardless of their physical connectivity.
The IP address ranges for the Bridge Domains and EPGs are also available anywhere within the fabric. Normal ACI forwarding policy can be applied, and the cluster of APICs provides a single point of management for both physical sites.
Figure 5 – ACI Logical View
Network Architecture
The network architecture is composed of two data center fabrics connected via transit leaf switches. The ACI fabric provides the access and aggregation LAN segments of the data center, while the border leafs connect to the core/edge of the data center.
Figure 6 – ACI Topology View
External Connectivity
External fabric connectivity for each physical data center is provided through the common tenant in the ACI fabric. Using the common tenant is not a requirement, but rather the preferred configuration.
Each application tenant accesses the WAN through the common tenant by creating an Endpoint Group (EPG) for connectivity purposes (e.g., Web). This EPG references a bridge domain (e.g., Production BD) in the common tenant, which has external connectivity. A contract permits traffic to flow from the common tenant to the application tenant. See Figure 5 – ACI Logical View.
By using the common tenant for external connectivity, the network and security administrator can assign the appropriate network configuration policy, security contracts and policy, as well as firewall and load balancing services for the fabrics in each data center. The network policy is similar for each data center, but the IP addressing, Bridge Domain, and External Routed Network are specific to each site.
The application (DevOps) teams will reference the common tenant configuration and configure application connectivity for intra- and inter-tenant communication through the Application Network Profile (ANP).
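A hedged sketch of what such a configuration could look like through the REST API follows: a hypothetical ANP payload whose EPG binds to a bridge domain named Production_BD in the common tenant and consumes a hypothetical contract Permit_Web. The names ExampleANP, Production_BD and Permit_Web are illustrative, not taken from the validated design.

# Hypothetical Application Network Profile (ANP) payload. The EPG 'Web' binds to a
# bridge domain named 'Production_BD' that lives in the common tenant (objects in
# 'common' are resolvable from application tenants) and consumes a hypothetical
# contract 'Permit_Web' governing traffic toward the external network.
anp_payload = {
    "fvAp": {
        "attributes": {"name": "ExampleANP"},
        "children": [
            {
                "fvAEPg": {
                    "attributes": {"name": "Web"},
                    "children": [
                        {"fvRsBd": {"attributes": {"tnFvBDName": "Production_BD"}}},
                        {"fvRsCons": {"attributes": {"tnVzBrCPName": "Permit_Web"}}},
                    ],
                }
            }
        ],
    }
}

# Posted under the application tenant with the session from the earlier sketch, e.g.:
# session.post(f"{APIC}/api/mo/uni/tn-ExampleTenant.json", json=anp_payload, verify=False)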
The border leaf switches connect to a Nexus 7000 switch for external Layer 3 connectivity. The Nexus 7000 serves two purposes. It provides connectivity between the ACI fabric/endpoints and external devices/endpoints, and it provides inbound routing correction for the ACI endpoints via Locator/ID Separation Protocol Multi-Hop Across Subnet Mode (LISP MH ASM) along with Intelligent Traffic Director (ITD). Outbound routing correction is handled by the ACI fabric using standard ACI forwarding policy: traffic is sent to the closest border leaf(s), as determined by the MP-BGP metric.
ITD allows the Nexus 7000 to load balance inbound traffic across the border leafs and to probe them with IP SLA for reachability and availability.
Solution Components
The design incorporates components typically found in data center environments. The role of each component and the specific product used are shown in the following table.
Role / Product
Network Infrastructure Fabric: Cisco ACI - [6] Nexus 9396, [2] Nexus 9336, version 11.0(3f); [2] APIC, version 1.0(3f)
Data Center Core/Edge Switches: Nexus 7010 switches (LISP MH ASM, ITD), version 6.2(10)
Compute: Cisco UCS C22 M3 Rack Servers
Synchronized Storage Solution: NetApp MetroCluster / EMC VPLEX
Virtualization Hypervisor: VMware ESXi and vCenter 5.5
DNS and GSLB: F5 Global Traffic Manager (GTM), version 11.2
Demonstrated Applications: VDI (VMware View) and 2-tier Web Service (Microsoft SharePoint)
F5
F5 Global Traffic Manager (GTM) allows holistic management of multi-data-center application delivery via intelligent DNS. GTM actively monitors application health at each data center and responds to DNS requests based on availability, performance and custom traffic engineering.
GTM uses a Wide-IP that maps a DNS entry to a pool of application instances spread across multiple data centers. In the event of a data center failure, the application instance in that data center becomes unavailable and is no longer returned in responses to DNS queries. Custom traffic engineering can be implemented to direct traffic based on variables such as application performance metrics, geo-location, round robin, etc.
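Conceptually, Wide-IP resolution behaves like the following Python sketch, in which a DNS answer is drawn only from the healthy pool members. The FQDN, VIP addresses and the random selection method are illustrative assumptions; GTM itself supports far richer policies.

import random

# Hypothetical wide-IP: one application VIP per data center.
WIDE_IP = {
    "app.example.com": [
        {"vip": "198.51.100.10", "site": "west", "healthy": True},
        {"vip": "203.0.113.10", "site": "east", "healthy": True},
    ]
}

def resolve(fqdn):
    """Answer a DNS query with one healthy VIP, skipping failed data centers.

    Random selection stands in for the load-balancing method; GTM supports
    richer policies (geo-location, performance metrics, ratios, etc.).
    """
    healthy = [m for m in WIDE_IP.get(fqdn, []) if m["healthy"]]
    return random.choice(healthy)["vip"] if healthy else None

# If the west data center fails its health monitor, only the east VIP is returned.
WIDE_IP["app.example.com"][0]["healthy"] = False
print(resolve("app.example.com"))  # -> 203.0.113.10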
The Active-Active ACI design utilizes GTM as the DNS server; GTM could also serve as the GSLB solution if a disaster recovery (DR) data center is available. For more information on implementing ACI in a disaster recovery configuration, refer to the WWT white paper, Federated Application Centric Infrastructure (ACI) Fabrics for Dual Data Center Deployments: https://goo.gl/GQAvTR
Locator/ID Separation Protocol (LISP)
LISP (RFC 6830) is an overlay protocol that encapsulates IP packets and uses a mapping database to deliver the encapsulated packets from an Ingress Tunnel Router (ITR) to an Egress Tunnel Router (ETR). A detailed explanation of all of the components that make up LISP is outside the scope of this white paper. LISP Multi-Hop allows the Endpoint ID (EID) to be discovered at one router while the LISP encapsulation is performed at a different router, one or more Layer 3 hops away from the discovery router.
Within the ACI fabric design, LISP discovery happens using the data-plane discovery mechanism. When the LISP First Hop Router (FHR – discovery router) receives a frame/packet from an EID, it adds the EID to the LISP dynamic EID local table and notifies the LISP Site Gateway (encapsulation router).
Figure 7 – LISP Integration View
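The discovery and registration flow can be pictured with the following conceptual Python sketch. This is not the protocol implementation; the router objects, the EID 10.1.1.20 and the RLOC 192.0.2.2 are illustrative assumptions.

# Conceptual sketch only; router names, the EID and the RLOC are illustrative.
mapping_system = {}  # EID -> RLOC of the site gateway that can reach it

class SiteGateway:
    """LISP site gateway (encapsulation router): registers discovered EIDs."""
    def __init__(self, rloc):
        self.rloc = rloc

    def register_eid(self, eid):
        mapping_system[eid] = self.rloc

class FirstHopRouter:
    """Discovers EIDs from data-plane traffic and notifies the site gateway."""
    def __init__(self, site_gateway):
        self.dynamic_eids = set()  # local dynamic-EID table
        self.site_gateway = site_gateway

    def receive_frame(self, src_ip):
        if src_ip not in self.dynamic_eids:
            self.dynamic_eids.add(src_ip)
            self.site_gateway.register_eid(src_ip)

# A workload with EID 10.1.1.20 sends traffic after moving to the east data center.
east_gateway = SiteGateway(rloc="192.0.2.2")
fhr = FirstHopRouter(site_gateway=east_gateway)
fhr.receive_frame("10.1.1.20")
print(mapping_system)  # ITRs can now encapsulate traffic for 10.1.1.20 toward 192.0.2.2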
Intelligent Traffic Director (ITD)
ITD provides scalable load distribution of traffic to a group of servers and/or appliances. It includes the following main features related to the Active/Active ACI design:
• Redirection and load balancing of line-rate traffic to ACI border leafs, up to 256 in a group
• IP stickiness with weighted load balancing
• Health monitoring of border leafs using IP Service Level Agreement (SLA) probes (ICMP)
• Automatic failure detection and traffic redistribution in the event of a border leaf failure, with no manual intervention required, plus node-level standby support
• ITD statistics collection with traffic distribution details
• VRF support for ITD services and probes
Within the Active-Active ACI fabric, ITD runs on the Nexus 7000 that is directly connected to the ACI border leafs. The purpose of ITD within this architecture is to load balance ingress traffic amongst the border leafs. ITD also uses IP SLA probes to verify that the border leafs are reachable.
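The following conceptual Python sketch illustrates the behavior described above: hash-based next-hop selection for IP stickiness, plus redistribution when a probe marks a leaf as failed. The addresses are placeholders and this is a conceptual illustration, not the ITD implementation.

import hashlib

# Hypothetical border leaf next-hop addresses; an ITD group supports up to 256 nodes.
border_leafs = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
failed = set()  # populated by IP SLA (ICMP) probe results in the real feature

def select_border_leaf(src_ip):
    """Pick a next hop for a flow by hashing the source IP.

    Hashing the source address gives IP stickiness: a client keeps the same
    border leaf until the set of healthy leafs changes, at which point traffic
    is redistributed across the remaining leafs.
    """
    healthy = [leaf for leaf in border_leafs if leaf not in failed]
    if not healthy:
        raise RuntimeError("no reachable border leafs")
    digest = int(hashlib.md5(src_ip.encode()).hexdigest(), 16)
    return healthy[digest % len(healthy)]

print(select_border_leaf("172.16.5.20"))  # sticky choice for this client
failed.add("10.0.0.2")                    # a probe failure triggers redistribution
print(select_border_leaf("172.16.5.20"))  # traffic re-hashed across remaining leafs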
Storage Synchronization Solution
In order to provide the capability for virtualized workload mobility, a single storage device must be available in both data centers. This can be accomplished by connecting the SAN in each data center over dark fiber or Fibre Channel over IP (FCIP), but a more effective solution is to have the data reside synchronously in both data centers. There are several technologies that can achieve this level of storage synchronization. EMC VPLEX is a reliable, simple architecture that supports many different storage vendors, with an option to synchronize the storage using the IP network for transport. Another leading solution is NetApp MetroCluster. Both VPLEX and MetroCluster have been validated in this design.
VPLEX provides distributed storage volumes using cache coherence and simultaneous access to storage devices through the creation of VPLEX clusters (one in each data center). VPLEX distributed devices are available from either VPLEX cluster and have the same Logical Unit Number (LUN) and storage identifier when presented to the host, enabling true concurrent read/write across data centers. VPLEX clusters use synchronous replication to keep data in sync on both sides of the cluster.
NetApp MetroCluster also addresses the challenge of providing continuous data availability across two data centers while retaining the built-in storage efficiency of the Data ONTAP operating system. MetroCluster consists of two Data ONTAP clusters that synchronously replicate to each other. Each cluster is an active-active HA pair, so all nodes serve clients at all times. Data is written to the primary copy and synchronously replicated to the secondary copy at the remote site. The cluster peering interconnect mirrors cluster configurations to provide a single point of policy.
Video Demonstration
For a video demonstration of the concepts presented in this white paper, visit https://goo.gl/5IKSoS
Conclusion
The Cisco Application Centric Infrastructure (ACI) is an innovative architecture in which applications use the data center as a dynamic, shared resource pool. This pool of resources is managed through a central controller that exposes all configuration and management components through a northbound REST API.
WWT provides value by helping customers design Active-Active data centers that meet their reliability and resiliency requirements and deliver quality business outcomes.
Appendix: Traditional Distributed Data Center Traffic Patterns
Figure 1 – Traditional Highly Available Data Center
In Figure 1, the DNS entry resolves to Application Delivery Controller 1 (ADC1) in the west data center. Traffic is sent to the Virtual IP (VIP) of ADC1 via normal routing. ADC1 uses network address translation (NAT) to redirect the user traffic to the appropriate web server in this example. ADC1 also translates the source IP address using what is referred to as Source NAT (S-NAT). S-NAT ensures the return traffic from the web server always returns to the correct ADC for translation back to the original IP addresses of the user session. The data center interconnect (DCI) cloud in Figure 1 represents a Layer 2 extension technology. See the "Networking Overlays" section for information on the different types of Layer 2 overlays.
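The S-NAT behavior can be pictured with a small conceptual sketch in Python: the ADC records a translation for each session and uses it to restore the client address on the return path. The addresses and ports shown are illustrative assumptions, not values from the design.

import itertools

_snat_ports = itertools.count(20000)  # next available translated source port
snat_table = {}  # (snat_ip, snat_port) -> (client_ip, client_port)

def snat_outbound(client_ip, client_port, snat_ip):
    """Rewrite the client source address to the ADC's S-NAT address.

    The web server now sees the ADC as the source of the connection, so its
    replies return to that same ADC regardless of which data center the
    server is physically running in.
    """
    translated = (snat_ip, next(_snat_ports))
    snat_table[translated] = (client_ip, client_port)
    return translated

def snat_return(snat_ip, snat_port):
    """On the return path, restore the original client address for the session."""
    return snat_table[(snat_ip, snat_port)]

t_ip, t_port = snat_outbound("192.0.2.50", 51000, "10.1.1.1")
print(snat_return(t_ip, t_port))  # -> ('192.0.2.50', 51000)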
Figure 2 – Hair-pinning Traffic with SNAT (after workload migration)
Figure 2 depicts the hair-pinning effect on the traffic pattern when the workload has migrated to the east data center. Since both data centers are up and operational, the GSLB view is that the primary VIP-1 is still online, and it therefore retains the west data center as the destination for the service. ADC1 also has no knowledge of where the web server is physically located because of the Layer 2 network extension.
Figure 3 – Asymmetric Traffic without SNAT (after workload migration)
Figure 3 shows what happens if the S-NAT function is removed from the environment and the workload is moved from one active data center to the other. Traffic enters the west data center and, using default routing, may exit through the east data center, causing an asymmetrical traffic pattern. Most stateful devices, such as firewalls, will deny this traffic, taking the service offline. Even if stateful devices are not present in the traffic path, troubleshooting becomes more difficult.