0.5 mln packets per second with Erlang Nov 22, 2014 Maxim Kharchenko CTO/Cloudozer LLP


LINCX is an OpenFlow switch written in Erlang and running on LING (Erlang on Xen). It shows some remarkable performance. The presentation discusses various speed-related optimizations.


The road map

• Erlang on Xen intro

• LINCX project overview

• Speed-related notes

– Arguments are registers

– ETS tables are (mostly) ok

– Do not overuse records

– GC is key to speed

– gen_server vs. barebone process

– NIFs: more pain than gain

– Fast counters

– Static compiler?

• Q&A


Erlang on Xen a.k.a. LING

• A new Erlang platform that runs without an OS

• Conceived in 2009

• Highly compatible with Erlang/OTP

• Built from scratch, not a “port”

• Optimized for low startup latency

• Open sourced in 2014 (github.com/cloudozer/ling)

• Local and remote builds

Go to erlangonxen.org


Zerg demo: zerg.erlangonxen.org



LINCX: project overview

• Started in December, 2013

• Initial scope = porting LINC-Switch to LING

• High degree of compatibility demonstrated for LING

• Extended scope = fix LINC-Switch fast path

• Beta version of LINCX open sourced on March 3, 2014

• LINCX runs 100x faster than the old code

LINCX repository: github.com/FlowForwarding/lincx

Raw network interfaces in Erlang

• LING adds raw network interfaces:

    Port = net_vif:open("eth1", []),
    port_command(Port, <<1,2,3>>),
    receive
        {Port,{data,Frame}} ->
            ...

• Raw interface receives whole Ethernet frames

• LINCX uses standard gen_tcp for the control connection and net_vif for data ports

• Raw interfaces support the mailbox_limit option: packets get dropped if the mailbox of the receiving process overflows:

    Port = net_vif:open("eth1", [{mailbox_limit,16384}]),
    ...
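Since a raw interface delivers whole Ethernet frames, the receiving process has to take the header apart itself. A minimal sketch of that step using the Erlang bit syntax follows; the module name `eth_frame` and the return shape are illustrative assumptions, not LINCX code, and 802.1Q VLAN tags are deliberately not handled.

```erlang
-module(eth_frame).
-export([parse/1]).

%% Hypothetical helper: split a raw Ethernet II frame into destination
%% MAC, source MAC, EtherType and payload. Anything shorter than the
%% 14-byte header is reported as malformed.
parse(<<Dst:6/binary, Src:6/binary, EtherType:16, Payload/binary>>) ->
    {ok, Dst, Src, EtherType, Payload};
parse(_) ->
    {error, malformed}.
```

A frame arriving as `{Port,{data,Frame}}` could be passed straight to `eth_frame:parse(Frame)`; the MAC addresses and the payload come back as sub-binaries of the frame.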


Testbed configuration

* Test traffic goes between vm1 and vm2

* LINCX runs as a separate Xen domain

* Virtual interfaces are bridged in Dom0

IXIA confirms 460kpps peak rate

• 1GbE hw NICs/128 byte packets

• IXIA packet generator/analyzer
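To put the 460kpps figure in context, the sketch below computes the theoretical packet rate of a link and the time budget per packet. It assumes that "128 byte packets" means 128-byte Ethernet frames including the FCS, and it adds the standard 20 bytes of per-frame wire overhead (preamble, start delimiter, inter-frame gap). The module and function names are made up for this illustration.

```erlang
-module(linerate).
-export([max_pps/2, budget_us/1]).

%% Wire overhead per frame: 7-byte preamble + 1-byte start-of-frame
%% delimiter + 12-byte inter-frame gap.
-define(OVERHEAD_BYTES, 20).

%% Theoretical maximum frames per second for a link speed given in
%% bits per second and a frame size in bytes (assumed to include FCS).
max_pps(LinkBps, FrameBytes) ->
    LinkBps / ((FrameBytes + ?OVERHEAD_BYTES) * 8).

%% Time available to process one packet, in microseconds.
budget_us(Pps) ->
    1.0e6 / Pps.
```

Under these assumptions `linerate:max_pps(1.0e9, 128)` comes to roughly 844.6 kpps, so 460kpps is a bit over half of line rate, and `linerate:budget_us(460000)` leaves about 2.2 microseconds per packet.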


Processing delay and low-level stats

• LING can measure a processing delay for a packet:

    1> ling:experimental(processing_delay, []).
    Processing delay statistics:
    Packets: 2000 Delay: 1.342us +- 0.143 (95%)

• LING can collect low-level stats for a network interface:

    1> ling:experimental(llstat, 1). %% stop/display
    Duration: 4868.6ms
    RX: interrupts: 69170 (0 kicks 0.0%) (freq 14207.4/s period 70.4us)
    RX: reqs per int: 0/0.0/0
    RX: tx buf freed per int: 0/8.5/234
    TX: outputs: 1479707 (112263 kicks 7.6) (freq 303928.8/s period 3.3us)
    TX: tx buf freed per int: 0/0.6/113
    TX: rates: 303.9kpps 3622.66Mbps avg pkt size 1489.9B
    TX: drops: 12392 (freq 2545.3/s period 392.9us)
    TX: drop rates: 2.5kpps 30.26Mbps avg pkt size 1486.0B


Arguments are registers

• Many arguments do not make a function any slower:

    animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
        feed(Cat, Dog, Horse, Pig, Cow, State);
    animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
        pet(Cat, Dog, Horse, Pig, Cow, State);
    ...

• But do not reshuffle arguments:

    %% SLOW
    animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
        feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
    ...


ETS tables are (mostly) ok

• A lookup in a small ETS table costs about as much as 10 function activations

• Do not use ets:tab2list() inside tight loops

• Treat ETS as a database; not a pool of global variables

• 1-2 ETS lookups on the fast path are ok

• Beware that ets:lookup(), etc. create a copy of the data on the heap of the caller, similarly to message passing
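As an illustration of keeping the fast path to a single ETS lookup, here is a minimal sketch of a flow cache. The module name, key shape and stored tuple are assumptions made for this example and do not reflect the LINCX flow table. Keeping the stored tuple small keeps the copy made by `ets:lookup/2` cheap.

```erlang
-module(flow_cache).
-export([new/0, add/3, out_port/2]).

%% Create an unnamed table; the caller keeps the table id.
new() ->
    ets:new(flow_cache, [set, public]).

%% Remember which output port a given key maps to.
add(Tab, Key, OutPort) ->
    ets:insert(Tab, {Key, OutPort}).

%% One lookup per packet. The matching {Key, OutPort} tuple is copied
%% onto the caller's heap, so it is kept deliberately small.
out_port(Tab, Key) ->
    case ets:lookup(Tab, Key) of
        [{_, OutPort}] -> {ok, OutPort};
        []             -> miss
    end.
```

A caller would create the table once at startup and call `flow_cache:out_port/2` for each packet, falling back to a slower path on `miss`.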


Do not overuse records

• setelement() creates a copy of the tuple

• State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple

• Use tuples explicitly in performance-critical sections to control the heap footprint of the code:

    %% from 9p.erl
    mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
    mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
    mixer({rwrite,_,_}, _, initial) -> start_attaching;
    mixer({rerror,_,_}, _, initial) -> auth_failed;
    mixer({rlerror,_,_}, _, initial) -> auth_failed;
    mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) ->
        {attach_more,Fid,AName,qid_type(Qid)};
    mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
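The same idea can be applied to process state. The sketch below keeps per-port statistics in a plain tuple and rebuilds it once per update, so the number of tuples allocated is visible in the source. The module name and the field layout are hypothetical.

```erlang
-module(port_stats).
-export([new/0, update/3]).

%% State is a plain tuple {RxPackets, RxBytes, Drops}, not a record.
new() ->
    {0, 0, 0}.

%% One pattern match takes the tuple apart and one constructor builds
%% the result, so each update allocates a single new 3-tuple.
update({Pkts, Bytes, Drops}, rx, Size) ->
    {Pkts + 1, Bytes + Size, Drops};
update({Pkts, Bytes, Drops}, drop, _Size) ->
    {Pkts, Bytes, Drops + 1}.
```

The cost is readability: positions replace field names, which is why this style is best confined to the fast path.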


Garbage collection is key to speed

• Heap is a list of chunks

• 'new heap' is close to its head, 'old heap' - to its tail

• A GC run takes 10 μs on average

• GC may run 1000s of times per second

[Diagram: process heap chunks, with labels proc_t and HTOP]


How to tackle GC-related issues

• (Priority 1) Call erlang:garbage_collect() at strategic points

• (Priority 2) For the fastest code, avoid GC completely; restart the fast process regularly:

    spawn(F, [{suppress_gc,true}]), %% LING only

• (Priority 3) Use fullsweep_after option
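Priorities 1 and 3 can be tried on stock Erlang/OTP as well. The sketch below spawns a worker with the `fullsweep_after` option via `spawn_opt/2` and calls `erlang:garbage_collect/0` after every batch of messages, that is, at a point where little live data remains. The module name, the message format, the batch size and the `fullsweep_after` value of 16 are arbitrary assumptions for illustration; `suppress_gc` is not used because the slide marks it as LING only.

```erlang
-module(gc_points).
-export([start/1]).

%% Start a worker that collects garbage once per Batch messages.
%% The fullsweep_after value is an arbitrary example.
start(Batch) when is_integer(Batch), Batch > 0 ->
    spawn_opt(fun() -> loop(Batch, Batch) end, [{fullsweep_after, 16}]).

%% When the batch is exhausted, collect explicitly and start over.
loop(Batch, 0) ->
    erlang:garbage_collect(),
    loop(Batch, Batch);
loop(Batch, Left) ->
    receive
        {frame, From, Frame} ->
            From ! {done, byte_size(Frame)},
            loop(Batch, Left - 1);
        stop ->
            ok
    end.
```

Whether an explicit collection at batch boundaries helps depends on the workload, so it is worth measuring before and after.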


gen_server vs barebone process

• Message passing using gen_server:call() is 2x slower than Pid ! Msg

• For speedy code prefer barebone processes to gen_servers

• Design Principles are about high availability, not high performance
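For comparison, a barebone process needs only `spawn`, `!` and `receive`. The sketch below is a hypothetical counter process with an asynchronous update and a synchronous read; it gives up what gen_server provides (supervision integration, monitors on calls, code change hooks) in exchange for a shorter message path.

```erlang
-module(bare_counter).
-export([start/0, incr/1, read/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% Fire-and-forget update: a single message send, no reply.
incr(Pid) ->
    Pid ! incr,
    ok.

%% Synchronous read: tag the request with a fresh reference so that
%% only the matching reply is picked out of the mailbox.
read(Pid) ->
    Ref = make_ref(),
    Pid ! {read, self(), Ref},
    receive
        {Ref, N} -> N
    after 1000 ->
        timeout
    end.

loop(N) ->
    receive
        incr ->
            loop(N + 1);
        {read, From, Ref} ->
            From ! {Ref, N},
            loop(N)
    end.
```

Messages from one sender to one receiver arrive in order, so a `read/1` issued after two `incr/1` calls from the same process sees both increments.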


NIFs: more pain than gain

• A new principle of Erlang development: do not use NIFs

• For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees

• Most of the time Erlang code can be made as fast as C

• Most performance problems of Erlang are traceable to NIFs or to external C libraries, which are similar

• Erlang on Xen does not have NIFs and we do not plan to add them

Fast counters

• 32-bit or 64-bit unsigned integer counters with overflow: trivial in C, not easy in Erlang

• FIXNUMs are signed 29-bit integers; BIGNUMs consume heap and are 10-100x slower

• Use two variables for a counter?

    foo(C1, 16#ffffff, ...) -> foo(C1+1, 0, ...);
    foo(C1, C2, ...) -> foo(C1, C2+1, ...);
    ...

• LING has a new experimental feature – fast counters:

    erlang:new_counter(Bits) -> Ref
    erlang:increment_counter(Ref, Incr)
    erlang:read_counter(Ref)
    erlang:release_counter(Ref)
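The two-variable idea can be written out in portable Erlang as follows. This is a sketch under the assumption of a 24-bit low half, matching the 16#ffffff constant on the slide; the module and function names are invented for this example, and it does not use the experimental LING counter functions. Both halves travel as separate function arguments, so each stays a small integer.

```erlang
-module(split_counter).
-export([count/3, value/2]).

%% Largest value of the low half (24 bits).
-define(LO_MAX, 16#ffffff).

%% Increment the counter N times, starting from the halves Hi and Lo.
%% When the low half is full, it wraps to 0 and carries into Hi.
count(0, Hi, Lo) ->
    {Hi, Lo};
count(N, Hi, ?LO_MAX) ->
    count(N - 1, Hi + 1, 0);
count(N, Hi, Lo) ->
    count(N - 1, Hi, Lo + 1).

%% Combine the halves. The result may be a bignum, so this belongs
%% outside the fast path, for example when reporting statistics.
value(Hi, Lo) ->
    (Hi bsl 24) bor Lo.
```

For example, `split_counter:count(2, 0, 16#fffffe)` crosses the wrap point and returns `{1, 0}`.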


Future: static compiler for Erlang

• Scalars and algebraic types

• Structural types only – no nominal types

• Target: compiler efficiency, not static type checking

• A middle ground between:

• “Type is a first class citizen” (Haskell)

• “A single type is good enough” (Python, Erlang)


Future: static compiler for Erlang - 2

• Challenges:

• Pattern matching compilation

• Type inference for recursive types

y = nil | {x, y}

y = {(unit | y), x, (unit | y)}

• Work started in 2013

• Currently the compiler is at the proof-of-concept stage
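The two recursive types on this slide can be approximated in today's Erlang with `-type` declarations, which helps to see what the inference has to discover. The sketch below uses the standard type specification syntax checked by Dialyzer, not the notation of the proposed static compiler; the names `chain` and `tree` and the helper functions are assumptions for illustration.

```erlang
-module(rec_types).
-export([len/1, tree_size/1]).
-export_type([chain/1, tree/1]).

%% y = nil | {x, y}: a hand-rolled list.
-type chain(X) :: nil | {X, chain(X)}.

%% y = {(unit | y), x, (unit | y)}: a binary tree whose missing
%% children are marked with the atom unit.
-type tree(X) :: {unit | tree(X), X, unit | tree(X)}.

%% Number of elements in a chain.
-spec len(chain(term())) -> non_neg_integer().
len(nil) -> 0;
len({_, Rest}) -> 1 + len(Rest).

%% Number of nodes in a tree; unit stands for an absent subtree.
-spec tree_size(unit | tree(term())) -> non_neg_integer().
tree_size(unit) -> 0;
tree_size({Left, _, Right}) -> tree_size(Left) + 1 + tree_size(Right).
```

Both types are structural: any tuple of the right shape is a member, which matches the "structural types only" goal stated for the compiler.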


Questions

???

e-mail: [email protected]