The Impact of Optimized Packet Processing Software on Multicore Platforms for DPI and Network Security
Agenda
Optimizing the Hardware
Paul Stevens, Advantech
www.advantech.com/nc
Optimizing the Software
Eric Carmes, 6WIND
www.6wind.com
Multicore Network Platform Design Goals
Meeting OEM Requirements
Need a clear path to sustainable business growth through differentiated products and services
Preserve existing investments while meeting new performance requirements
Reduce time to revenue to beat competition
Need to deploy a flexible architecture and a scalable technology
Develop a range of products with a limited number of technologies
Ensure hardware independence
Need to meet dynamic market requirements
Manage performance growth
Reduce cost and power consumption
Must ship a working product on time
Integrate and validate new and complex technologies faster
Anatomy of a Network Appliance (today)
[Diagram: three classes of network appliance]
SMB (1-10 Gbps): IA chipset (e.g., Intel® Core™ i7 processor or Intel® Atom) with GbE ports, PCIe x1
Enterprise (10-80 Gbps): two Intel® Xeon® processor 5600 series + I/O Hub, 10 GbE ports on PCIe x8
Data Center (>80 Gbps): two Intel® Xeon® processor 5600 series + I/O Hub, PCIe x8, switch, two NPUs with 10 GbE ports on XAUI
Variety of security and encryption coprocessor options
Diagram labels: Control Plane Processing, Data Plane Processing
Translating to a Scalable Blade Topology (today)
[Diagram: two blade topologies, each with IA Node blades connected at 2x10G over a dual star 10G backplane, with 40G links at the hub]
Switch connect: IA packet processing and load balancing to the IA Node payloads
NPU + Switch connect: NPU does front end packet processing and load balancing to the IA Node payloads
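The load-balancing step in both topologies can be pictured as hashing each flow to one IA Node, so that every packet of a flow reaches the same payload blade. The sketch below is a minimal illustration under that assumption; `FiveTuple`, `flow_key`, `pick_node` and the node names are hypothetical and are not taken from any Advantech or 6WIND interface.

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


def flow_key(t: FiveTuple) -> bytes:
    # Order the two endpoints so both directions of a flow give the same key.
    lo, hi = sorted([(t.src_ip, t.src_port), (t.dst_ip, t.dst_port)])
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{t.proto}".encode()


def pick_node(t: FiveTuple, nodes: list[str]) -> str:
    # CRC32 is used here only as a cheap, deterministic hash (an assumption).
    return nodes[zlib.crc32(flow_key(t)) % len(nodes)]


nodes = ["ia-node-1", "ia-node-2", "ia-node-3"]  # hypothetical blade names
fwd = FiveTuple("10.0.0.1", "192.0.2.7", 40000, 80, 6)
rev = FiveTuple("192.0.2.7", "10.0.0.1", 80, 40000, 6)
same_node = pick_node(fwd, nodes) == pick_node(rev, nodes)
```

Keeping the key symmetric means request and response traffic land on the same blade, which matters when the payload application keeps per-flow state.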
Performance Scaling to full 40G Interconnects
[Diagram: IA Node blades behind a switch or NPU pair with 100G+ uplinks. Node links scale from 2x 20G (Dual Star 20G) to 2x 40G (Dual Star 40G).]
Fast path packet processing and load balancing to the IA Node payloads
[Diagram: blade system interconnect. Blades shown: Hub blade (prim.), Hub blade (sec.), general purpose CPU blade, NPU blade, ShMC. Each hub blade carries a GbE switch, an xGE switch and switch management (LMP).]
GbE: used as base interface for management and control plane. Dual star topology.
Primary 40GE: used as fabric interface for data and user plane. Dual star topology.
Secondary 40GE: 40GE fabric used as fabric interface for data and user plane. Dual star topology.
IPMB: low-level management interface based on 2 redundant IPMB buses. Bussed or radial (star) topology.
High-end DPI Example
[Diagram: high-end DPI system built from hub, NPU and Xeon blades]
THUB2 40GE Hub Blade (x2): 40GE switching, rule-based load balancing
THUB2 40GE Hub Blade (x2): additional switching capacity using dual dual star
Next Gen 40GE Dual Xeon Blade: high-level flow processing and DPI
ATCA-7410 Dual NPU Blade: low-level flow processing
6WINDGate: slow / fast path partitioning across IA/NPU
Creating a Virtuous Cycle with Multicore for cost-optimized DPI
[Diagram: cycle linking New Technology Introduction and More Cores, Higher Throughput & Capacity]
Packetarium is a cost-optimized, modular system architecture for multicore packet processing.
Scalable and upgradable to meet bandwidth demand, it’s also a cost-effective alternative to ATCA.
Trade-off on availability (system level)
The all-IP design simplifies customization and the identical system management design preserves ATCA S/W investment.
The mainboard’s topology is similar to ATCA backplane + switch with transition modules + chassis management modules.
Each network processing board connects to the mainboard’s switch via 2 or 4 x 10GE (XAUI).
80G Packetarium™ – “ATCA rewrapped”
Shrink & cost-down for non-HA DPI
1, 2 or 4 x 10GbE (XAUI) per board
8 boards per system
• Processor-independent
• Main architectures supported today
• More to follow
QorIQ up to 128 cores
MIPS64 up to 256 cores
TI DSP up to 480 cores
x86 in design
Scaling by board count: x1 = 32 cores, x2 = 64 cores, x8 = 256 cores, >256 cores
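The board-count figures follow from simple multiplication. The sketch below assumes 32 cores per board (the x1 figure) and, for the 80G name, one 10GbE link per board across 8 boards; both readings are assumptions drawn from the numbers on the slide, and the function names are illustrative only.

```python
def system_cores(boards: int, cores_per_board: int = 32) -> int:
    # Total cores when every slot carries the same board type.
    return boards * cores_per_board


def system_uplink_gbps(boards: int, links_per_board: int, gbps_per_link: int = 10) -> int:
    # Aggregate board-to-switch bandwidth over the XAUI links.
    return boards * links_per_board * gbps_per_link


cores_by_boards = {n: system_cores(n) for n in (1, 2, 8)}
uplink_one_link = system_uplink_gbps(boards=8, links_per_board=1)
uplink_four_links = system_uplink_gbps(boards=8, links_per_board=4)
```

With the assumed defaults, a fully populated system with one link per board gives 80 Gbps between boards and switch, and four links per board would give four times that.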
Scalable Hardware Platforms for DPI
Challenges for DPI Software
Unprecedented performance stress on network equipment (cloud and mobile infrastructure)
40G throughput now with 100G on the horizon
Complex networking protocols
Accurate user packet identification and QoS classification
Efficient packet steering decisions for optimized application-level processing
Advanced content inspection functions
Application-aware firewall, video compression
Introducing the 6WIND Solution
High-performance packet processing engine.
Optimized for DPI acceleration and protocol termination.
Includes comprehensive set of networking protocols with High Availability support.
Fast path architecture maximizes system throughput.
Used by tier-1 OEMs worldwide.
[Diagram labels: Multicore Processor, DPI Application Processing, Linux, Advantech Platform]
Packet Detection Challenges
Wire-speed performance.
Packets may be fragmented and need reassembly.
Packets are always hidden behind some combination of encapsulation techniques: VLAN, GTP, IP in IP, GRE, L2TP, MPLS…
Packets are often encrypted (IPsec).
Integrated firewall required.
Latency for each packet has to be minimized.
The solution requires high-performance packet processing for packet identification, classification, steering and termination.
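To make the encapsulation problem concrete, the sketch below peels stacked VLAN tags off an Ethernet frame to reach the inner EtherType and payload. It covers only the VLAN case from the list above and is a minimal illustration, not a description of how 6WINDGate parses packets; the function name `strip_vlan` is hypothetical.

```python
import struct

ETH_P_8021Q = 0x8100   # customer VLAN tag (IEEE 802.1Q)
ETH_P_8021AD = 0x88A8  # service VLAN tag (IEEE 802.1ad, "QinQ")
ETH_P_IPV4 = 0x0800


def strip_vlan(frame: bytes) -> tuple[int, list[int], bytes]:
    """Return (inner EtherType, VLAN IDs outermost first, payload)."""
    if len(frame) < 14:
        raise ValueError("truncated Ethernet header")
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    vlans = []
    while ethertype in (ETH_P_8021Q, ETH_P_8021AD):
        if len(frame) < offset + 4:
            raise ValueError("truncated VLAN tag")
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        vlans.append(tci & 0x0FFF)  # low 12 bits of the TCI carry the VLAN ID
        offset += 4
    return ethertype, vlans, frame[offset:]


# Double-tagged frame: outer VLAN 100, inner VLAN 200, then IPv4.
frame = (b"\x00" * 12
         + struct.pack("!HHHHH", ETH_P_8021AD, 100, ETH_P_8021Q, 200, ETH_P_IPV4)
         + b"payload")
inner_type, vlan_ids, payload = strip_vlan(frame)
```

The same loop-until-inner-header pattern extends to the other encapsulations, each with its own header layout, which is why the parsing cost adds up at wire speed.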
Flexible Mapping to Cores, Processors and Blades
[Diagram labels: Packet Processing, Data Plane, Control Plane, Networking Stack, Fast Path, DPI Application Processing, Fast Path Cores, Linux Cores]
Dynamically allocate functions across processor cores.
Transparent scaling across homogeneous or heterogeneous blades.
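One way to picture the split between fast path cores and Linux cores is a function that divides the available cores according to a target share. This is only a sketch of the idea: `partition_cores`, its parameters and the choice to give Linux the lowest-numbered cores are assumptions, not part of any 6WIND configuration interface.

```python
def partition_cores(total_cores: int, fast_path_share: float) -> dict[str, list[int]]:
    """Split core IDs between Linux and the fast path (hypothetical helper)."""
    if total_cores < 2:
        raise ValueError("need at least one core for each role")
    if not 0.0 < fast_path_share < 1.0:
        raise ValueError("fast_path_share must be between 0 and 1")
    n_fast = round(total_cores * fast_path_share)
    # Always leave at least one core to each side.
    n_fast = min(max(n_fast, 1), total_cores - 1)
    n_linux = total_cores - n_fast
    return {
        "linux": list(range(n_linux)),
        "fast_path": list(range(n_linux, total_cores)),
    }


layout = partition_cores(total_cores=8, fast_path_share=0.75)
```

Re-running the function with a different share models moving cores from application processing to packet processing as traffic grows.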
Includes a Full Set of Networking Protocols
Fast Path Modules
High availability: monitoring system, synchronization daemons for ARP-NDP, routing and IPsec
IPv4-v6 forwarding
IPsec, IPsec SVTI
Layer 2 VLAN, GRE, link aggregation
SCTP
IPv4-v6 filtering, NAT
IPv6 tunneling and transition
QoS
IPv4-v6 reassembly
RSTP
ROHC
Flow inspection
Multicast
GTP-u encapsulation
TCP termination
MPLS encapsulation
PPP / L2TP
Networking Stack
Optimized stack for multicore including:
• All Linux networking features (TCP/IP, filtering, NAT, IPsec…)
• Optimized SMP, 2K VR for forwarding, firewalling, NAT and IPsec
• Integrated crypto engine management for IPsec and SSL
• VNB framework for fast Layer 2 through Layer 4 protocol integration
• Network system calls optimization (UDP, SCTP, RAW)
• Graceful Restart extensions for High Availability
High Availability Extended Fast Path
Control Plane Modules
Routing Protocols: Static RIP (IPv4, IPv6), RIPng, OSPFv2, OSPFv3, BGP-4, BGP-4+, ECMP (IPv4, IPv6), VRRP, PIMv4-SM, PIMv6-SM, IGMP/MLD snooping & proxy, static route monitoring & BFD
Security: IKE, IKEv2, EAP, VPN monitoring
Connectivity: PPP, Multi-link PPP, PPPoE, CHDLC, VLAN, GRE, 6in6, 4in4, L2TP, DHCPv4/v6, DNS proxy, RADIUS client
Mobility: Home agent, FMIP, corresponding node, mobile node, IPsec integration, NEMO, proxy MIP
Virtual Routing (VRF): Routing protocols, IKE
Switching: LACP
6WINDGate in DPI
[Diagram: DPI Application Processing above Packet Processing, connected through 6WINDGate APIs, with 40G / 100G traffic in and out]
Packet Processing stages shown: Decryption, Flow identification, Apply policy (QoS), Protocol termination (TCP, HTTP etc.), Encryption
Flow table outcomes:
• No application processing
• Flow to be processed by application: policy enforcement, video compression, security etc.
• Unknown flow or flow to be monitored: application flow identification and analysis, then update flow and flow event
Architecture optimized for managing very large flow tables (millions of flows)
Efficient APIs maximize system throughput (packet cloning, zero-copy architecture etc.)
Scalable architecture for simultaneous support of multiple application instances.
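The three flow-table outcomes can be sketched as a lookup that returns an action per flow and lets the DPI stage update the entry once it has classified the flow. The class and callback below are hypothetical and use an in-memory dictionary; they illustrate the dispatch logic only and do not reflect the 6WINDGate APIs.

```python
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    BYPASS = "no application processing"
    TO_APPLICATION = "flow to be processed by application"
    INSPECT = "unknown flow or flow to be monitored"


Classifier = Callable[[tuple, bytes], Optional[Action]]


class FlowTable:
    def __init__(self, classify: Classifier):
        self._flows: dict[tuple, Action] = {}
        self._classify = classify  # stands in for the DPI engine

    def dispatch(self, key: tuple, payload: bytes) -> Action:
        # Flows not yet in the table are treated as unknown.
        action = self._flows.get(key, Action.INSPECT)
        if action is Action.INSPECT:
            verdict = self._classify(key, payload)
            # "Update flow": record the verdict for later packets, if any.
            self._flows[key] = verdict if verdict is not None else Action.INSPECT
        return action


def toy_classifier(key: tuple, payload: bytes) -> Optional[Action]:
    # Assumption for the demo: anything that looks like an HTTP GET
    # is handed to the application; everything else stays unknown.
    return Action.TO_APPLICATION if payload.startswith(b"GET ") else None


table = FlowTable(toy_classifier)
flow = ("10.0.0.1", "192.0.2.7", 40000, 80, 6)
first = table.dispatch(flow, b"GET / HTTP/1.1\r\n")
second = table.dispatch(flow, b"Host: example.com\r\n")
```

Only the first packets of a flow pay for inspection; once the entry is updated, later packets are steered by a single table lookup.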
Example: Mobile Video Compression
[Diagram: DPI above Packet Processing, connected through 6WINDGate APIs, with 40G / 100G traffic in and out]
Packet Processing stages shown: Decryption, GTP flow identification, Apply policy (QoS), HTTP termination, Encryption
Flow table outcomes:
• Flow without video
• Flow with video: video compression, returning compressed video
• Unknown flow or flow event: DPI, then update flow and flow event
DPI tasks:
• Detection of flows that could include video.
• Detection of events to locate video in flow.
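As a simplified picture of detecting a flow that could include video, the sketch below checks the Content-Type header of an HTTP response. It assumes the traffic has already been decrypted and de-encapsulated, and it uses a single header as the only signal; the function name is hypothetical and a production DPI engine would rely on more than this.

```python
def is_video_response(data: bytes) -> bool:
    """Return True if an HTTP response declares a video/* Content-Type."""
    head, _, _ = data.partition(b"\r\n\r\n")  # keep only the header section
    if not head.startswith(b"HTTP/"):
        return False  # not an HTTP response status line
    for line in head.split(b"\r\n")[1:]:
        name, colon, value = line.partition(b":")
        if colon and name.strip().lower() == b"content-type":
            return value.strip().lower().startswith(b"video/")
    return False


video = (b"HTTP/1.1 200 OK\r\n"
         b"Content-Type: video/mp4\r\n"
         b"Content-Length: 4\r\n\r\n"
         b"\x00\x00\x00\x18")
page = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
results = (is_video_response(video), is_video_response(page))
```

A positive result would mark the flow as "flow with video" in the flow table so that later packets are steered to the compression application.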
Example: Application-Aware Firewall + UTM
[Diagram: DPI above Packet Processing, connected through 6WINDGate APIs, with 40G / 100G traffic in and out]
Packet Processing stages shown: Decryption, L2 / L3 flow identification, Firewall, Apply policy (QoS), Transparent proxy (TCP, UDP etc.), Encryption
Flow table outcomes:
• Flow to be scanned: UTM / anti-virus, returning the scanned flow
• Unknown flow or flow to be monitored: DPI, then update flow and flow event
DPI task: detection of flows that could contain viruses.
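The firewall decision that feeds the "flow to be scanned" branch can be sketched as first-match rule evaluation. The rule fields, the verdict strings and the default of dropping unmatched traffic are all assumptions chosen for illustration; they do not describe the 6WINDGate firewall.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    src: str                 # source network in CIDR notation
    dst_port: Optional[int]  # None matches any destination port
    verdict: str             # "allow", "drop" or "scan" (hypothetical labels)


def first_match(rules: list[Rule], src_ip: str, dst_port: int,
                default: str = "drop") -> str:
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_net = addr in ipaddress.ip_network(rule.src)
        if in_net and rule.dst_port in (None, dst_port):
            return rule.verdict  # first matching rule wins
    return default


rules = [
    Rule("10.0.0.0/8", 25, "scan"),     # mail from inside goes to anti-virus
    Rule("10.0.0.0/8", None, "allow"),  # other inside traffic passes
]
verdicts = [
    first_match(rules, "10.1.2.3", 25),
    first_match(rules, "10.1.2.3", 80),
    first_match(rules, "192.0.2.1", 80),
]
```

A "scan" verdict corresponds to handing the flow to the UTM application, while "allow" flows stay entirely in packet processing.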
Summary
6WIND-Advantech solution addresses critical requirements for DPI equipment:
Wire-speed DPI
Comprehensive protocol support for advanced services
Fast path environment optimized for acceleration of DPI and application processing.
Zero downtime reliability via integrated High Availability support
Portable solution available on industry-leading processor platforms
Deployed today in cloud infrastructure and mobile networks.