25
Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Embed Size (px)

Citation preview

Page 1: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Virtual Network Diagnosis as a

ServiceWenfei Wu (UW-Madison)Guohui Wang (Facebook)

Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Page 2: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Virtual Networks in the Cloud

Application Traffic

GW GW

Middlebox

Subnet 1VM

VM

VS

MB

R RSubnet 2

VM

VS

VM

• The cloud provider maintains the data center infrastructure• The tenant request/configure virtual networks for their apps• The virtual network is mapped to the physical infrastructure• Network services are provided as virtual middleboxes• Multiple tenants share the infrastructure

Tenant 2VM

VM

VS

VM

Page 3: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Virtual Network Problems

• Multiple layers may have various problemsoVirtual network connectivity issueso Tenants’ application performance may be degraded

• Isolation and abstraction prevent tenants from diagnosing their virtual networks

Page 4: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Existing Solutions

• Infrastructure-layer tools may expose the physical network to tenants• sFlow, NetFlow• OFRewind, NDB, etc.

• Tracing tools in the virtual network are difficult to deploy in some virtual components (e.g., middleboxes)• Tcpdump, Xtrace, SNAP

Page 5: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

VND Proposal

• The cloud provider should offer a virtual network diagnosis service (VND) to the tenants

1. Diagnostic request2. Trace collection3. Trace parse4. Data query

Path: n1… nk

Pattern: IP, port, …n1: Table Server1

n2: Table Server2

Diagnosis Policy

Table Server

Table Server

Tenant allocated resources & Physical Topology

Path: n1… nk

Pattern: IP, portn1: collector1

n2: collector2

DiagnosisPolicy

DiagnosisRequest

Tenant

Control Server

Page 6: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

VND Challenges

• Preserve isolation and abstractions• Low overhead• Scalability

Page 7: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Contents

• Motivation• VND Design & Implementation• Evaluation• Conclusions

Page 8: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

VND Architecture

Cloud Controller

Collection Policyts id … ts id …

Raw Data

Trace Collector

Raw Data

Trace Collector

Flow Trace

CollectConfig Collect Config

& Topology

PolicyManager

Tenant

Query Execution

AnalysisManager

Table ServerTable Server

Control ServerQuery Executor

Trace Table

Query ExecutorTrace Table

Trace Parser Trace Parser

Parse Config

ParseConfig

Page 9: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Collection (1)

• The tenant submits a Data Collection Configuration

LoadBalancer

FrontEnd

Server 1

Server 2

Server 3

Cap: Input srcIP=10.0.0.6 dstIP=10.0.0.8 tcp dstPort=80dualDirection

Cap: output…

Virtual Appliance Link : l1 Capture Point node1 Flow Pattern field = value, ... Capture Point node2 ...Virtual Appliance Node: n1 Capture Point input, [output] ...

Page 10: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Collection (2)

• Policy Manager generates a Data Collection Policy

srcIP=10.0.0.6,…, etc.

Collector (c1)tr_id1

Trace ID tr_id1Physical Cap Point vs1Collector c1 at h1Pattern field = value, ...Path vs1, ..., h1, c1

Hypervisor

vSwitch (vs1)

LoadBalancer

NIC

Page 11: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Parse

• The tenant submits a Data Parse ConfigurationTable ID tab_id1Filter expFields field_list

exp = exp and exp | exp or exp | not exp | (exp) | primprim = field in value_set

field_list = field (as name) (, field (as name))*

Trace ID allFilter: ip.proto = tcp or ip.proto = udpFields: timestamp as ts, ip.src as src_ip, ip.dst as dst_ip, ip.proto as proto, tcp.src as src_port, tcp.dst as dst_port, udp.src as src_port, udp.dst as dst_port

ts src_IP dst_IP proto src_port dst_port

… … … … … …

Page 12: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Analysis

ts id …

Trace_ID Vnet_App Cap_Loc Pattern Trace_ID …

AnalysisManager

Collection View Parse View

ts id … ts id …Trace Table

Query Executor

Trace Table

Query Executor

Trace Table

Query Executor

SQL Interface

Diagnostic Schema

• Trace Tables form a distributed database

Operations Diagnostic ApplicationsTenants

Page 13: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Analysis Examples

• RTT1. create view T1_f as select * from T1 where srcIP=IP1

and dstIP=IP22. create view T1_b as select * from T1 where dstIP=IP1

and srcIP=IP23. create view RTT as select f.ts as t1, b.ts as t2 from T1_f

as f, T1_b as b where f.seq + f.payload_length = b.ack4. select avg(t2-t1) from RTT

• throughput, loss, delay, statistics, etc.

ts src_IP dst_IP seq ack Payload_length… … … … … …

Page 14: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Optimizations

• Local Table Server placement• Place the collector locally with the capture points• Avoid the trace collection traffic traversing the network• Move data only when queries need it

• Avoid interference with existing rules using the multi-table feature on OVS

Page 15: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Contents

• Motivation• VND Design & Implementation• Evaluation• Conclusions

Page 16: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Trace Collection Overhead (1)

• We transfer data between VMs in 2 physical serverso10Gbps NICo8 pairs of VMs, each pair transfer 1Gbps traffic

• Each minute, we capture one VM pair’s traffic• The duplicated traffic increases

as we capture more traffic• The original application traffic

is not impacted

Page 17: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Trace Collection Overhead (2)

• VMs perform memory copy and data transfer simultaneously• We measure memory and network throughput

• Each 1Gbps network traffic duplication causes 59 MB/smemory overhead

Page 18: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Data Query Overhead

• We use RTT monitoring as an exampleo200 Mbps trafficoCalculate average RTT periodically

• Network overhead is negligible• Execution time scales linearly with the data size• VND can process 2-3 Gbps traffic in real time

RTT

Page 19: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Scalability (1)

• Control Server is a simple web server can be scaled up easily• Data collection distributed locally to avoid being the

bottleneck

Page 20: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Scalability (2)

• Data Query simulation• A data center with 10,000 servers, each has a 10Gbps

NIC• Virtual network size [2, 20] • Query executors can process 3 Gbps traffic in real time• Total link utilization [0.1, 0.9]

• Results• 30% of total link capacity can be queried in real time

Page 21: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Conclusions

• The cloud provider should offer a virtual network diagnosis service to the tenants• We design VND• Architecture, interfaces and operations• Address several important challenges

• Our experiments and simulation demonstrate the feasibility of VND

Page 22: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

END

Q&A

Please contact with WISDOMhttp://wisdom.cs.wisc.edu

Page 23: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Functional Validation

• Middlebox Bottleneck Detection

• VND use throughput and RTT time to find abnormity

RE IDS

Page 24: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

Functional Validation(2)

• Middlebox scaling

RE IDS

Page 25: Virtual Network Diagnosis as a Service Wenfei Wu (UW-Madison) Guohui Wang (Facebook) Aditya Akella (UW-Madison) Anees Shaikh (IBM System Networking)

VND Architecture

Cloud Controller

Trace Parser

Trace Table

Table Server PolicyManager

AnalysisManager

CollectConfig

ParseConfig Collect Config

& Topology

Collection Policy

Flow Trace

Parse Config

Query Execution

Tenant

ts id …

Raw Data

Query Executor

Trace Collector

Trace Parser

Trace Table

Table Server

ts id …

Raw Data

Query Executor

Trace Collector

Control Server