
Sluice: Network-Wide Data Plane Programming
Vikas Natesh¹  Pravein Govindan Kannan²  Anirudh Sivaraman¹  Ravi Netravali³

¹New York University  ²National University of Singapore  ³University of California, Los Angeles

CCS CONCEPTS
• Networks → Programming interfaces; Programmable networks; Network management.

KEYWORDS
Network-wide programming; data plane

1 INTRODUCTION
The last several years have seen the emergence of programmable network devices, including both programmable switching chips and programmable network interface cards (NICs). Along with the rise of x86-based packet processing for middleboxes and virtual switches, these trends point towards a future where the entire network will be programmable. The benefits of network programmability range from commercial use cases such as network virtualization [9] implemented on the Open vSwitch platform [11] to recent research works that implement packet scheduling [12], measurement [10], and application offload of niche applications on programmable switches [7, 8].

While the benefits of programmability are clear, they are difficult to reap because programming the network as a whole remains a challenge. Current programming languages target individual network devices, e.g., P4 for the Tofino [1] programmable switching chip and the Netronome SmartNIC [3]. However, at present, there is no unified programming model to express and implement general data plane functionality at the level of an entire network, without having to individually program each network device.

Prior work has looked at programming an entire network. In particular, Maple [13] was an early example of a network-wide programming model designed for OpenFlow switches. Maple automatically divides functionality between a stateless component running on switches and a stateful component running on the network's controller. However, this creates overhead as packets requiring stateful processing must be forwarded to the controller. SNAP [6] is a more recent example of network-wide programming; unlike Maple, it offloads stateful functionality to switches by leveraging stateful processing available in programmable switches while providing the operator with a one-big-switch (OBS) view of persistent arrays. This abstraction is good at expressing network-wide policies that do not require explicit placement of packet processing code on particular devices, e.g., DNS tunnel detection in a LAN. However, to develop applications that do require such specific placement, a more fine-grained programming model is necessary.

Figure 1: Sluice Workflow (a Sluice program with device-annotated snippets is translated by the Sluice compiler into one program per device, which is then compiled by a device-specific compiler, e.g., a P4 compiler, and deployed on the network)

For instance, an operator may wish to run active queue management (AQM), ECN, and DCTCP [5] on specific switches, but not all. This is hard to achieve using SNAP. Further, SNAP does not provide abstractions to express queue-based measurement, e.g., tracking an EWMA of queueing latencies on a particular switch. To summarize, Maple and SNAP cannot express programmable switch functionality where the network operator requires specific placement of code on specific devices, e.g., packet scheduling, congestion control, and load balancing.

This demo presents Sluice, a programming model that takes a network-wide specification of the data plane and compiles it into runnable code that can be launched directly on the programmable devices of a network. In contrast to prior network-wide programming models like SNAP and Maple that were focused on specific tasks (e.g., routing and security policies), Sluice aims to be more generic, but potentially at the cost of operator effort in specifying code placement. Sluice endows network operators with the ability to design and deploy large network programs for various functions such as scheduling, measurement, and in-network applications. The benefits of Sluice can be summarized as follows: (1) Sluice provides the same functionality as a per-device language like P4 but makes it easier to program the data plane of an entire network by abstracting device-specific architectural details like stateful ALUs, pipelines, etc., and (2) Sluice automatically reduces the amount of boilerplate code needed to write data plane functionality. For instance, the 8-line traffic matrix Sluice program we demonstrate translates into over 40 lines of P4 (excluding header/metadata/parser definitions and IPv4 forwarding P4 code). We demonstrate Sluice's functionality and ease of use via two examples: traffic matrix generation for network analysis and a streaming join-filter operation. Sluice is open-source and available at https://github.com/sluice-project/sluice.

2 SLUICE DESIGN
In the Sluice model, a network-wide program consists of high-level code snippets annotated by the operator to run on particular devices in a network. The code in each snippet is to be executed on packets arriving at its corresponding device. Snippets support a variety of operations: read-from/write-to packets; arithmetic using packet/metadata, local variables, or stateful register arrays; and control flow statements. To handle custom packet headers not supported by default (Ethernet/IP/UDP/TCP), users may define packet


header declarations similar to C structs. An optional annotation in the packet declaration defines the parser condition for these user-defined headers (for example, see packet p in Figure 4). Sluice programs may also import device-specific variables/attributes for use in code snippets. Sluice also lets the programmer restrict snippets to operate on specific flows or IP address ranges.

Figure 2: Topology For Traffic Matrix Demo (hosts h1, h2, h3 and switches s1, s2, s3)

Figure 1 describes the Sluice workflow. The compiler translates each snippet of a Sluice program into a device-specific program. After initial parsing, lines of code in the snippet are decomposed into a directed acyclic graph (DAG) that maps dependencies between variables in each snippet. This graph is then passed to the backend of the compiler that generates the corresponding P4 program for that device, e.g., the P4 Behavioral Model [4] or Tofino [1].¹
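To make the dependency analysis concrete, the following minimal Python sketch (an illustration, not the actual Sluice compiler) extracts the variables read and written by each simple assignment in a snippet and adds an edge from a statement to any later statement that reads a variable it wrote:

    import re
    from typing import List, Set, Tuple

    # Identifiers may contain dots (e.g., psa.ingress_port, p.nhops).
    IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9.]*")

    def dependency_edges(statements: List[str]) -> Set[Tuple[int, int]]:
        """Return flow-dependency edges (writer_index, reader_index) between
        statements of one snippet. Handles plain 'lhs = rhs' assignments only."""
        last_writer = {}   # variable name -> index of the statement that last wrote it
        edges = set()
        for i, stmt in enumerate(statements):
            lhs, rhs = stmt.split("=", 1)
            written = IDENT.match(lhs.strip()).group(0)   # stops at '[' of an array index
            for m in IDENT.finditer(rhs):
                if m.group(0) in last_writer:
                    edges.add((last_writer[m.group(0)], i))
            last_writer[written] = i
        return edges

    # The two statements of traffic_example are independent, so the DAG has no edges:
    print(dependency_edges([
        "cnt[psa.ingress_port] = cnt[psa.ingress_port] + 1",
        "p.nhops = p.nhops + 1",
    ]))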

3 DEMONSTRATIONS
3.1 Traffic Matrix
Figure 2 displays the Mininet [2] network topology used for our traffic matrix demo. Packets are sent over UDP from each host to all other hosts according to a Poisson traffic model with mean inter-arrival time of 0.5 seconds. The code below is our Sluice program with a single snippet traffic_example that is launched on all switches of the network. To run the Mininet emulation, the user passes the Sluice program and network topology to the compiler. The compiler generates P4 code to run on each switch as well as control plane table entries for routing packets through the topology.

    import device psa;
    packet p : udp(srcPort : 1234)
        nhops : bit<32>;

    @ bmv2 : ;
    snippet traffic_example()
        persistent cnt : bit<32>[10];
        cnt[psa.ingress_port] = cnt[psa.ingress_port] + 1;
        p.nhops = p.nhops + 1;

This demo shows how a simple Sluice program can be used to measure link usage for a specific UDP flow (srcPort 1234) across the network. Each packet p contains a custom header nhops that is incremented each time the packet enters a switch, to inform the receiving host of the number of hops the packet took. Each switch maintains a stateful register counter cnt, indexed by the switch ingress port, that tracks how many packets have entered through that ingress port. Aggregated over all switches, these counters represent a matrix measuring each link's usage in the network at a given time. This matrix (residing on the whole network) is then queried once every second from the control plane to generate time-series plots of packet rate for each link. Figure 3 displays the cumulative histogram of packet rates on link s1-s3 after collecting data for 15 minutes. The expected CDF of packet rates, Poisson(µ = 2 packets/sec), is also plotted to validate the Sluice translation.
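The per-second control-plane query described above can be sketched in Python as follows; read_registers is a hypothetical helper that returns the current cnt[] values for a switch (for example via the bmv2 runtime CLI's register_read command) and is not part of Sluice. The poisson_cdf helper reproduces the expected curve plotted in Figure 3.

    import time
    from math import exp, factorial

    def sample_link_rates(switch, read_registers, interval=1.0, duration=15 * 60):
        """Poll the per-ingress-port packet counters once per `interval` seconds and
        return, for every sample, the packet rate (packets/sec) seen on each port."""
        samples = []
        prev = read_registers(switch)       # hypothetical: current cnt[] array on `switch`
        for _ in range(int(duration / interval)):
            time.sleep(interval)
            cur = read_registers(switch)
            samples.append([(c - p) / interval for p, c in zip(prev, cur)])
            prev = cur
        return samples

    def poisson_cdf(k, mu=2.0):
        """Expected CDF of per-link packet rate under the Poisson(mu = 2 packets/sec) model."""
        return sum(mu ** i * exp(-mu) / factorial(i) for i in range(int(k) + 1))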

¹Currently we only support the P4 Behavioral Model.

Figure 3: CDF of Packet Rate on link s1-s3 (expected vs. observed; x-axis: packets/sec, y-axis: CDF)

    @ bmv2 : s1;
    Snippet join()
        persistent i_id : bit<1>[100];
        persistent c_id : bit<1>[100];
        persistent i_t : bit<32>[100];
        persistent c_t : bit<32>[100];
        transient z : bit<1>;
        if(udp.srcPort == 11111)
            i_id[p.ad_id] = 1;
            i_t[p.ad_id] = p.i_t;
            z = c_id[p.ad_id] == 1;
            p.c_t = z ? c_t[p.ad_id] : 0;
        if(udp.srcPort == 22222)
            c_id[p.ad_id] = 1;
            c_t[p.ad_id] = p.c_t;
            z = i_id[p.ad_id] == 1;
            p.i_t = z ? i_t[p.ad_id] : 0;

    @ bmv2 : s2;
    Snippet filter()
        transient c_null : bit<1>;
        transient i_null : bit<1>;
        c_null = p.c_t == 0;
        i_null = p.i_t == 0;
        if(c_null or i_null)
            drop();
        if(p.ad_id != 7)
            drop();

    Packet p : udp(srcPort : 11111, 22222)
        ad_id : bit<32>;
        i_t : bit<32>;
        c_t : bit<32>;

Figure 4: Streaming example topology, data flow, and code placement on switches (h1's impression tuples (4,6,0), (3,1,0), (7,2,0) and h2's click tuples (9,0,3), (3,0,4), (7,0,5) are inner-joined at s1 into (3,1,4) and (7,2,5); the filter at s2 passes only (7,2,5) on to h3)

3.2 Stream processing
This example demonstrates a simple join-filter operation between two streams of tuples. A stream is an unbounded table in which each packet represents a tuple of data (ad_id, impression_time, click_time) enclosed in a custom header. The topology in Figure 4 describes the data flow and shows how an operator query runs on the switches of the network. Host 1 sends a stream of ad impressions while Host 2 sends a stream of ad clicks. The two streams are joined on the ad_id field at s1, filtered at s2, and the result is sent to h3.

4 FUTURE WORK
An optimizing Sluice compiler. We envision using the dependency DAG (§2) to provide several automatic optimizations and code transformations. For example, it is possible that certain lines of code in a snippet cannot be run on the device annotated by the operator, e.g., programmable switching chips have limited support for floating point or complex string operations. Code containing such features must be moved to the control plane or an end host while at the same time preserving the original program semantics intended by the operator. Doing this automatically would free the Sluice programmer from having to reason about such placement constraints themselves.
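As a rough illustration of the feasibility check this would involve, the Python sketch below partitions a snippet's statements using an ad hoc list of unsupported constructs; the list is an assumption for illustration only, and a real compiler pass would consult the target's actual capabilities and the dependency DAG:

    # Illustrative sketch only: split a snippet into a switch-resident part and a
    # part that must move to the control plane or an end host. A real pass would
    # also migrate any later statements that depend on the moved ones.
    UNSUPPORTED_TOKENS = ("float", "/", "sqrt", "substr")   # assumed, target-dependent

    def partition_snippet(statements):
        switch_part, offloaded = [], []
        for stmt in statements:
            if any(tok in stmt for tok in UNSUPPORTED_TOKENS):
                offloaded.append(stmt)
            else:
                switch_part.append(stmt)
        return switch_part, offloaded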

Supporting multi-tenancy. Another area of future work is allowing Sluice to support multiple tenants with their own Sluice programs running on their own virtual networks overlaid on the same physical topology. If each tenant wants to run their own network-wide program on their virtual topology, the network operator will need to merge all these programs into one data plane implementation that runs on the entire physical network. Extending Sluice to support this multi-tenancy use case would allow us to provide the same benefits to the data plane that multi-tenant network virtualization [9] provided for the control plane.


REFERENCES
[1] Barefoot Tofino. https://www.barefootnetworks.com/products/brief-tofino. Accessed: 2019-07-02.
[2] Mininet. http://mininet.org/. Accessed: 2019-07-02.
[3] Netronome SmartNIC. https://www.netronome.com/products/smartnic/overview. Accessed: 2019-07-02.
[4] P4 Behavioral Model. https://github.com/p4lang/behavioral-model. Accessed: 2019-07-02.
[5] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. Data Center TCP (DCTCP). In Proceedings of the ACM SIGCOMM 2010 Conference, SIGCOMM '10, pages 63–74, New York, NY, USA, 2010. ACM.
[6] M. T. Arashloo, Y. Koral, M. Greenberg, J. Rexford, and D. Walker. SNAP: Stateful Network-Wide Abstractions for Packet Processing. In Proceedings of the 2016 ACM SIGCOMM Conference, SIGCOMM '16, pages 29–43, New York, NY, USA, 2016. ACM.
[7] X. Jin, X. Li, H. Zhang, N. Foster, J. Lee, R. Soulé, C. Kim, and I. Stoica. NetChain: Scale-Free Sub-RTT Coordination. In Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation, NSDI '18, pages 35–49, Berkeley, CA, USA, 2018. USENIX Association.
[8] X. Jin, X. Li, H. Zhang, R. Soulé, J. Lee, N. Foster, C. Kim, and I. Stoica. NetCache: Balancing Key-Value Stores with Fast In-Network Caching. In Proceedings of the 26th Symposium on Operating Systems Principles, SOSP '17, pages 121–136, New York, NY, USA, 2017. ACM.
[9] T. Koponen et al. Network Virtualization in Multi-tenant Datacenters. In Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation, NSDI '14, pages 203–216, Berkeley, CA, USA, 2014. USENIX Association.
[10] S. Narayana, A. Sivaraman, V. Nathan, P. Goyal, V. Arun, M. Alizadeh, V. Jeyakumar, and C. Kim. Language-Directed Hardware Design for Network Performance Monitoring. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, SIGCOMM '17, pages 85–98, New York, NY, USA, 2017. ACM.
[11] B. Pfaff, J. Pettit, T. Koponen, E. J. Jackson, A. Zhou, J. Rajahalme, J. Gross, A. Wang, J. Stringer, P. Shelar, K. Amidon, and M. Casado. The Design and Implementation of Open vSwitch. In Proceedings of the 12th USENIX Conference on Networked Systems Design and Implementation, NSDI '15, pages 117–130, Berkeley, CA, USA, 2015. USENIX Association.
[12] A. Sivaraman, S. Subramanian, M. Alizadeh, S. Chole, S.-T. Chuang, A. Agrawal, H. Balakrishnan, T. Edsall, S. Katti, and N. McKeown. Programmable Packet Scheduling at Line Rate. In Proceedings of the 2016 ACM SIGCOMM Conference, SIGCOMM '16, pages 44–57, New York, NY, USA, 2016. ACM.
[13] A. Voellmy et al. Maple: Simplifying SDN Programming Using Algorithmic Policies. In Proceedings of SIGCOMM, 2013.