Migration of real world failures to virtual reality

Radek Krzywania, radek.krzywania@man.poznan.pl
Piotr Turowicz, piotrek@man.poznan.pl

Network Virtualization

• Research initiatives on network virtualization can be found around the world, e.g.:
– FEDERICA
– 4WARD
– GENI
– etc.

• The objective is NOT to provide virtual machines for computation purposes, but to create transparent virtual infrastructures that end users see as a real physical network

Network Virtualization

• As users see the virtualized resources as physical ones, they want some of the following features:
– An ideal network environment (no failures, well tuned), or
– A real environment with failure occurrences
– Access to the various layers of the network (L1, L2, L3) as if it were a physical network
– The possibility to debug failures and bottlenecks

Physical network

• The physical infrastructure influences the virtual one:
– Failures in the physical environment will cause failures in the virtual infrastructure at a different scale

[Figure: Physical and Virtual layers]

Virtual Network debugging

• Bottlenecks may be difficult to debug in virtual infrastructures

[Figure: Physical and Virtual layers]

Virtual Network debugging

• Issues with virtual network debugging:
– Virtual machines at end points can be tuned (e.g. TCP buffers), but users cannot see the performance of the virtualization software (Xen, OpenVZ, VMware, etc.) unless administration privileges are assigned
– No access to the physical network driver for tuning
– No possibility to enable additional hardware support on the physical NIC

Virtual Network debugging

• Other issues, such as high delay, jitter, packet loss, and packet reordering, are hard to debug as well:
– No access to the physical equipment in the infrastructure (e.g. routers, switches)
– Delays and packet drops can be effects of virtualization software processing (e.g. a highly utilized host is unable to process all packets/frames), especially in paravirtualization cases

Example of real world issue

Packets are not received, which may suggest a link issue.

The virtual router is not aware of the physical queues: packets are sent, and it sees no packet drop.

In fact, the queues on the physical router are full due to other physical traffic.

Packet loss


Is utopia an objective?


Is utopia expected?

• Some users may want to have reliable network connectivity at both physical and thus virtual level.

• But, there are some users who want to see the network as a source of failures and issues for testing purposes.

Failures virtualization

Layer   | Issue
Layer 7 | Transferred data is incomplete, to test how an application may behave (e.g. live audio streaming, UDP-based streaming)
Layer 4 | Issues with reconstruction of sent data after packet drops, reordering, or jitter; buffer tuning (e.g. testing TCP protocol enhancements)
Layer 3 | IP packets are not delivered, limited transfer, traffic balancing experiments, L3 network reorganization in case of failure (BGP) (e.g. testing routing protocols or router software)
Layer 2 | Ethernet frame drops, reachability issues related to Spanning Tree, VLAN misconfiguration (e.g. validating a network configuration before deployment, protocol experiments)
Layer 1 | Degradation of optical information, influence of optical effects in fibers on data transfer (e.g. mismatch of laser power to link attenuation)

How to virtualize issues at L7/L4/L3

• Some applications rely fully on the quality of the physical connection

• When performance is key (real-time audio/video), some level of transfer failure is acceptable (e.g. UDP instead of TCP)

How to virtualize issues at L7/L4/L3

• Users may want to examine how an application behaves under limited transfer, high jitter, high delay, packet loss, and packet reordering

How to virtualize issues at L7/L4/L3

• Failures can be virtualized by:
– Adding a software or hardware proxy at the receiver, which modifies incoming packets at L3 (or the received data). Software that serves as such a router proxy already exists, e.g. dummynet, a FreeBSD software package able to introduce delays, packet loss, bandwidth shaping, etc.

How to virtualize issues at L7/L4/L3

dummynet can modify forwarded traffic, introducing delay, packet loss, etc.

The receiver is influenced at L3 (packet loss), L4 (reordered information), and L7 (data lost or modified).
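The kind of impairment such a proxy applies can be illustrated with a minimal Python sketch. This is not dummynet and does not use its interface; the `Impairment` class, the `impair` function, and all parameter names are hypothetical, introduced only to show how delay, jitter, and loss at L3 turn into missing and reordered data at the receiver.

```python
import random
from dataclasses import dataclass


@dataclass
class Impairment:
    """Hypothetical impairment profile (names are assumptions of this sketch)."""
    delay_ms: float = 0.0    # fixed one-way delay added to every packet
    jitter_ms: float = 0.0   # extra uniform random delay in [0, jitter_ms]
    loss_rate: float = 0.0   # probability that a packet is dropped


def impair(send_times_ms, imp, seed=0):
    """Return (sequence number, arrival time) pairs in arrival order."""
    rng = random.Random(seed)
    arrivals = []
    for seq, sent_at in enumerate(send_times_ms):
        if rng.random() < imp.loss_rate:
            continue  # packet lost at L3, never reaches the receiver
        extra = rng.uniform(0.0, imp.jitter_ms)
        arrivals.append((seq, sent_at + imp.delay_ms + extra))
    # Jitter larger than the packet spacing shows up as reordering at L4.
    arrivals.sort(key=lambda item: item[1])
    return arrivals
```

Calling `impair([0.0, 10.0, 20.0], Impairment(delay_ms=50.0, jitter_ms=30.0, loss_rate=0.1))` returns the surviving packets in the order the receiver would see them, so gaps in the sequence numbers are losses and out-of-order sequence numbers are reordering.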

How to virtualize issues at L2

• L2 networks are commonly used for network virtualization (e.g. VLANs)

• It is difficult to simulate a virtual L2 on top of a physical L2

[Figure: VLAN 1234]

How to virtualize issues at L2

• Virtual switches can be emulated to introduce L2 into the virtual network

• 802.1ad/QinQ features allow the creation of user VLANs inside virtual links (which are VLANs in the physical network)

How to virtualize issues at L2

• Ethernet misconfiguration (e.g. spanning tree issues) could be simulated with Linux Bridge (http://www.linuxfoundation.org/en/Net:Bridge)

• It can be set up on an existing Linux box, making it behave as a bridge

• It includes many configuration features that can be manipulated

How to virtualize issues at L2

• Bridging software may be used to manipulate or misconfigure the virtual L2 infrastructure by:
– Introducing loops and failures in STP
– Modifying ARP tables in order to simulate switch misbehavior
– Randomly changing interface up/down status

How to virtualize issues at L2

• Other ways of manipulating L2 data are:
– Random frame drop
– Frame CRC modification to invalidate the frame

How to virtualize issues at L2

• This requires NIC driver modification:
– The NIC driver at the sender or the receiver can drop frames at random or according to some scheme
– An e1000 driver that allows sending/accepting frames with an incorrect CRC can be found on the Internet
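The effect of CRC modification can be shown in user space with a short Python sketch. It does not touch any NIC driver; it only models a frame as bytes. `zlib.crc32` uses the same CRC-32 polynomial as the Ethernet frame check sequence, but the byte order chosen here and the helper names (`add_fcs`, `fcs_ok`, `corrupt_fcs`, `send`) are assumptions of the sketch.

```python
import random
import struct
import zlib


def add_fcs(frame: bytes) -> bytes:
    """Append a CRC-32 checksum to the frame (byte order is an assumption)."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Receiver-side check: recompute the CRC and compare with the trailer."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


def corrupt_fcs(frame_with_fcs: bytes) -> bytes:
    """Invert the checksum bytes so the frame is guaranteed to fail the check."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return body + bytes(b ^ 0xFF for b in fcs)


def send(frames, corrupt_prob, seed=0):
    """Yield frames as put on the wire, corrupting each with probability corrupt_prob."""
    rng = random.Random(seed)
    for frame in frames:
        wire = add_fcs(frame)
        yield corrupt_fcs(wire) if rng.random() < corrupt_prob else wire
```

A receiver that drops every frame for which `fcs_ok` returns `False` then behaves like a link with a frame loss rate close to `corrupt_prob`.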

How to virtualize issues at L2

• Frame drop can be used to:
– Cause IP packet drops and thus loss of transmitted data
– Simulate congestion on a switch (queue buffers that are too short)
– Simulate failures of lower layers
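The congestion case (queue buffers that are too short) can be sketched as a tail-drop queue. This is a simplified model under stated assumptions: time advances in discrete ticks, arrivals in a tick are enqueued before the port transmits, and the function name and parameters are hypothetical.

```python
from collections import deque


def tail_drop(arrivals_per_tick, service_per_tick, queue_len):
    """Simulate a switch port queue; return (forwarded, dropped, still_queued)."""
    queue = deque()
    forwarded = 0
    dropped = 0
    frame_id = 0
    for arriving in arrivals_per_tick:
        for _ in range(arriving):
            if len(queue) < queue_len:
                queue.append(frame_id)
            else:
                dropped += 1  # queue is full, the frame is discarded
            frame_id += 1
        # The port transmits up to service_per_tick frames per tick.
        for _ in range(min(service_per_tick, len(queue))):
            queue.popleft()
            forwarded += 1
    return forwarded, dropped, len(queue)
```

With three ticks of five arriving frames each, a service rate of two frames per tick, and room for three frames, `tail_drop([5, 5, 5], 2, 3)` forwards 6 frames, drops 8, and leaves 1 in the queue.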

How to virtualize issues at L1

• Virtualization consumes more computing resources than bare physical machines

• Virtual machines running on a single physical host compete for physical resources

[Figure: several OS instances sharing one CPU, HD, and NIC]

How to virtualize issues at L1

• It follows that, e.g., 250 Mbps can be given to each of four virtual machines when the physical machine is equipped with a 1 Gbps interface (if resources are shared fairly)

[Figure: OS 1, OS 2, OS 3, and OS 4 sharing a single 1 Gbps NIC]

How to virtualize issues at L1

• Given such resource utilization, virtualizing the effects of L1 on higher layers may be too time consuming and would drastically decrease the efficiency of both physical and virtual machines.

• Most optical effects and bit modifications alter the frames of higher layers, and because of the CRC check the expected effect is a frame drop

How to virtualize issues at L1

• Probability that a single bit change at the optical level affects a particular part of the frame:
– The data field (with the IP information included) is the most likely to be altered

[Figure: frame layout: Preamble, Frame Header, Data, CRC]
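A rough estimate of these probabilities follows from the field lengths, assuming the flipped bit is equally likely to land anywhere in the frame. The sizes below are those of a maximum-size untagged Ethernet frame (8 bytes of preamble including the start delimiter, 14 bytes of header, 1500 bytes of data, 4 bytes of CRC); the uniform-position assumption and the names are choices made for this sketch.

```python
# Field sizes in bytes for a maximum-size untagged Ethernet frame.
FIELD_BYTES = {"preamble": 8, "header": 14, "data": 1500, "crc": 4}


def hit_probabilities(field_bytes):
    """Probability that a single, uniformly placed bit error lands in each field."""
    total = sum(field_bytes.values())
    return {name: size / total for name, size in field_bytes.items()}


if __name__ == "__main__":
    for name, prob in hit_probabilities(FIELD_BYTES).items():
        print(f"{name}: {prob:.4f}")
```

Under these assumptions more than 98% of single-bit errors fall into the data field, which is why the data field dominates the picture for large frames; for minimum-size frames the share of the other fields grows.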

How to virtualize issues at L1

• The CRC is calculated over the whole frame, so any change will be detected, causing a frame drop (and thus an IP packet drop, L4 data inconsistency, etc.)

[Figure: frame layout: Preamble, Frame Header, Data, CRC]

How to virtualize issues at L1

• Debugging issues on fibers requires:
– Advanced optical measurement equipment
– Direct access to both ends of the fiber, to its parts, or to whole end-to-end circuits (which may include amplifiers or other optical equipment in the middle)

How to virtualize issues at L1

• Neither of the above is available in a virtualized network environment

• The physical network may not be an optical one, so the corresponding optical effects will not occur (while the user may want to have them simulated)

How to virtualize issues at L1

• Solution:
– Virtual optical measurement equipment
– Modified NIC drivers on virtual machines, emulating the results of optical effects in fibers

[Figure: VLAN 1234 presented as an optical link with virtual attributes]


How to virtualize issues at L1

[Figure: Chromatic Dispersion. Labels: fiber length, light pulse, pulse width]

How to virtualize issues at L1

[Figure: Polarization Mode Dispersion example. Labels: fiber section, light pulse, pulse width]

How to virtualize issues at L1

[Figure: Polarization Mode Dispersion example]

Polarization Mode Dispersion: the causes

Asymmetries introduced during fiber manufacturing, and/or the stress distribution during cabling, installation, and/or servicing, create local birefringence in the fiber.

A "real" long fiber is a randomly distributed addition of these local birefringent portions.

How to virtualize issues at L1

Chromatic Dispersion
• Is deterministic
• Is linear
• Is not affected by the environment
• Can be compensated
• Predictable: a mathematical model can be built to calculate signal modification in fibers

Polarization Mode Dispersion
• Is stochastic
• Is not linear
• Is affected by the environment
• Cannot be easily compensated
• Stochastic and random: building a mathematical model is a challenge
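The contrast between the two effects shows up in the usual first-order estimates, sketched here in Python. Chromatic dispersion broadening scales linearly with fiber length (dispersion coefficient × length × source spectral width), while the mean differential group delay caused by PMD scales with the square root of the length. These are standard textbook approximations, not a model taken from the slides, and the function and parameter names are hypothetical.

```python
import math


def cd_broadening_ps(d_ps_per_nm_km, length_km, spectral_width_nm):
    """Pulse broadening from chromatic dispersion: |D| * L * delta_lambda, in ps."""
    return abs(d_ps_per_nm_km) * length_km * spectral_width_nm


def pmd_mean_dgd_ps(pmd_coeff_ps_per_sqrt_km, length_km):
    """Mean differential group delay from PMD: coefficient * sqrt(L), in ps."""
    return pmd_coeff_ps_per_sqrt_km * math.sqrt(length_km)


if __name__ == "__main__":
    # Example values are assumptions: roughly 17 ps/(nm*km) is typical for
    # standard single-mode fiber near 1550 nm; the PMD coefficient is illustrative.
    print(cd_broadening_ps(17.0, 100.0, 0.1))
    print(pmd_mean_dgd_ps(0.1, 100.0))
```

The PMD function returns only a mean value; the instantaneous delay fluctuates around it with the environment, which is the part that makes a full mathematical model hard.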

Network debugging use case

• A virtual 1 Gbps link is provided between two end-points

• Only 100 Mbps of capacity is achieved

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

Network debugging use case

• Ping reports 10% packet loss

• Traceroute shows packet loss towards R3

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

Network debugging use case

• At L3, the issue is related to the link between R2 and R3

• Further investigation requires operations at lower layers

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

Network debugging use case

• Frame counters on SW1 show that not all traffic from R2 is received

• The frame drop counter is increasing on the incoming interface of SW1

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

How to virtualize issues at L1

• The NIC driver of the virtual machine at R2 is sending frames with a wrong CRC value, according to a predefined pattern (the probability of a bit change at L1)

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]
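One simple way to derive such a pattern is to turn a bit error rate (BER) for the virtual fiber into a per-frame corruption probability. The sketch below assumes independent bit errors, so a frame of n bits is damaged with probability 1 − (1 − BER)^n; the function names and the idea of driving the driver from a BER value are assumptions of this sketch, not the authors' stated design.

```python
import random


def frame_error_probability(ber: float, frame_bytes: int) -> float:
    """Probability that at least one bit of the frame is flipped (independent errors)."""
    return 1.0 - (1.0 - ber) ** (8 * frame_bytes)


def corruption_pattern(ber, frame_sizes, seed=0):
    """Decide per frame whether the emulated driver should send a wrong CRC."""
    rng = random.Random(seed)
    return [rng.random() < frame_error_probability(ber, size) for size in frame_sizes]
```

For a 1518-byte frame and a BER of 1e-6 this gives a corruption probability of about 1.2%; the virtual BER would have to be roughly ten times higher to get close to the 10% loss seen in the use case.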

How to virtualize issues at L1

• An investigation of the optical link between R2 and SW1 is required. Virtual measurement equipment can be implemented, based on the virtual fiber information

The same information that drives the CRC modification pattern is used to generate the virtual measurement of the virtual link.

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

How to virtualize issues at L1

• A virtual analyzer shows that the PMD is higher than its initial value (measured after link installation; in fact, predefined when the test bed was created)

Current PMD > initial PMD

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

How to virtualize issues at L1

• The physical conditions of the link were modified, e.g. the fiber was bent, the temperature changed, etc.

[Figure: 1 Gbps circuit with R1, R2, R3, and SW1]

Fiber virtual attributes

• Physical attributes need to be reflected in virtual fibers, e.g.:
– CD value
– PMD value
– Fiber attenuation
– Laser power and receiver sensitivity (according to attenuation)
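These attributes map naturally onto a small record attached to each virtual link. The following Python sketch shows one possible shape, together with a simple power budget check (laser power minus total attenuation compared with receiver sensitivity). The class name, field names, units, and the check itself are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualFiber:
    """Hypothetical attribute set for a virtual optical link."""
    length_km: float
    cd_ps_per_nm_km: float           # chromatic dispersion coefficient
    pmd_ps_per_sqrt_km: float        # PMD coefficient
    attenuation_db_per_km: float     # fiber attenuation
    laser_power_dbm: float           # transmitter output power
    receiver_sensitivity_dbm: float  # minimum power the receiver can decode

    def received_power_dbm(self) -> float:
        """Power arriving at the receiver after fiber attenuation."""
        return self.laser_power_dbm - self.attenuation_db_per_km * self.length_km

    def power_budget_ok(self) -> bool:
        """True if the received power meets the receiver sensitivity."""
        return self.received_power_dbm() >= self.receiver_sensitivity_dbm
```

A virtual measurement device could read the same record to report values, while the emulated NIC driver uses it to decide how often frames should be damaged.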

Purpose of failure virtualization

• The mathematical model can be used to:
– Validate a network before deployment
– Define the physical attributes of the network equipment to be used (e.g. laser power according to attenuation)
– Provide training in debugging network issues
– Serve educational purposes for students


Q&A

Thank you