15-744: Computer Networking15744/lectures/08-evolution.pdfOF 1.0 Dec 2009 12 OF 1.1 Feb 2011 15 OF 1.2 Dec 2011 36 OF 1.3 Jun2012 40 OF1.4 Oct 2013 41 Proliferation of header fields

15-744: Computer Networking

L-10 Evolving the Data Plane

Adding New Functionality to the Internet

• Adding new functionality to the data plane• Future internet architectures• Overlay networks• Active networks• OpenFlow’s next generation à P4• Reading

• P4: Programming Protocol-IndependentPacket Processors• Active network vision and reality:lessons from a capsule-

based system• Optional

• XIA: Efficient Support for Evolvable Internetworking• A Case for End System Multicast

2

A History of Internet Evolution

• Many success stories

• 1983 à Flag day switch from NCP to IP • 1988 à /etc/hosts to DNS• 1996 à TCP SACK• 1989-1994 à EGP to BGP [1..4]

3


• But also many failures

• 1986 à IP Multicast

• 1997 à IntServ

• 1998 à DiffServ

• 1995 à IPv6

• Internet capabilities, traceback schemes, explicit

congestion control, content-oriented processing,

…

4


• Hard to change IP• …especially after 1990

5

IP

Applications

Technology

Innovation bothabove and below IP

Clean-Slate vs. Evolutionary

• Successes of the 80s followed by failures of the 90�s• IP Multicast• QoS• RED (and other AQMs)• ECN• …

• Concern that Internet research was dead• Difficult to deploy new ideas• What did catch on was limited by the backward

compatibility required

6

7

Outline

• Active Networks

• Architectures for Incremental Deployment

• P4

• Overlay Routing

8

Why Active Networks?

• Traditional networks route packets looking only at destination• Also, maybe source fields (e.g. multicast)• Problem: Rate of deployment of new protocols and

applications is too slow• Applications that do more than IP forwarding

• Firewalls• Web proxies and caches• Transcoding services• Nomadic routers (mobile IP)• Transport gateways (snoop)• Reliable multicast (lightweight multicast, PGM)

9

Active Networks

• Nodes (routers) receive packets:• Perform computation based on their internal state and

control information carried in packet• Forward zero or more packets to end points depending

on result of the computation• Users and apps can control behavior of the

routers• End result: network services richer than those by

the simple IP service model

10

Variations on Active Networks

• Programmable routers (e.g. OpenFlow)• More flexible than current configuration mechanism• For use by administrators or privileged users

• Active control (e.g. SDN)• Forwarding code remains the same• Useful for management/signaling/measurement of

traffic• �Active networks�

• Computation occurring at the network (IP) layer of the protocol stack à capsule based approach

• Programming can be done by any user• Source of most active debate

11

Case Study: MIT ANTS System

• Conventional Networks: • All routers perform same computation

• Active Networks: • Routers have same runtime system

• Tradeoffs between functionality, performance and security

12

System Components

• Capsules• Active Nodes:

• Execute capsules of protocol and maintain protocol state

• Provide capsule execution API and safety using OS/language techniques

• Code Distribution Mechanism• Ensure capsule processing routines

automatically/dynamically transfer to node as needed

13

Capsules

• Each user/flow programs router to handle its own packets• Code sent along with packets• Code sent by reference

• Protocol: • Capsules that share the same processing code

• May share state in the network• Capsule ID (i.e. name) is MD5 of code

14

Capsules

Active Node

IP Router

Active Node

Capsule Capsule

IP Header Version DataType Previous Address

Type Dependent Header Files

ANTS-specific header

• Capsules are forwarded past normal IP routers

15

Capsules

Active Node 1

IP Router

Active Node 2

Capsule

Request for code

Capsule

• When node receives capsule uses �type� to determine code to run

• What if no such code at node?• Requests code from �previous address� node• Likely to have code since it was recently used

16

Capsules

Active Node 1

IP Router

Active Node 2

CapsuleCapsule

Code Sent

• Code is transferred from previous node • Size limited to 16KB• Code is signed by trusted authority (e.g. IETF)

to guarantee reasonable global resource use

17

Research Questions

• Execution environments• What can capsule code access/do?

• Safety, security & resource sharing• How isolate capsules from other flows, resources?

• Performance• Will active code slow the network?

• Applications• What type of applications/protocols does this enable?

18

Functions Provided to Capsule

• Environment Access• Querying node address, time, routing tables

• Capsule Manipulation• Access header and payload

• Control Operations• Create, forward and suppress capsules• How to control creation of new capsules?

• Storage• Soft-state cache of app-defined objects

19

Safety, Resource Mgt, Support

• Safety:• Provided by mobile code technology (e.g. Java)

• Resource Management:• Node OS monitors capsule resource consumption

• Support:• If node doesn�t have capsule code, retrieve from

somewhere on path

20

Applications/Protocols

• Limitations• Expressible à limited by execution environment• Compact à less than 16KB• Fast à aborted if slower than forwarding rate• Incremental à not all nodes will be active

• Proof by example• Host mobility, multicast, path MTU, Web cache routing,

etc.

21

Discussion

• Active nodes present lots of applications with a desirable architecture

• Key questions• Is all this necessary at the forwarding level of the

network?• Is ease of deploying new apps/services and protocols a

reality?

22

Outline

• Active Networks


• P4

• Overlay Routing

Proposed -Centric Networking

• Content: Named Data Networking• Mobility: MobilityFirst• Cloud: Nebula

Can we support heterogeneous communication types on a single Internet architecture?

Problem: Focusing on one communication type may hinder using other communication types, as occurred to IP

23

Future -Centric Networking

• Service, content, mobility, and clouddid not receive much attention before

• Yet more networking styles may be usefulin the future• E.g., DTN, wide-area multicast, …?

Can we support future communication typeswithout redesigning the Internet architecture?

Problem: Introducing additional communication types tothe existing network can be very challenging

24

Legacy Router May Prevent Innovation

Can we allow use of a new communication type even when the network is yet to natively support it?

Problem: Using a new communication type may requireevery legacy router in the network to be upgraded

“I got a computer withAwesome-Networking

announced at Sigcomm 2022!Can I use it right now?”

Internet

“Ouch, we just replaced all of our routers built in 2012.

Can you wait for another10 years for new routers?”

25

XIA’s Goals and Design Pillars

Support multiple communication

types concurrently(heterogeneity)

Support future communication

types(flexibility)

Allow using new communication

types at any point (incremental deployment)

“Principal types” “Fallbacks”

26

Principal Types

Define your own communication model

27

Principals

128.2.10.162

Current InternetXIA

IP address

Host 0xF63C7A4…

Principal type

Type-specific identifier

Service 0x8A37037…

Content 0x47BF217…

Future …

Hash of host’s public key

Hash of content

Hash of service’s public key

28

Principal Definition 1:

Address Allocation and Intrinsic Security

• XIA uses self-certifying identifiers that guarantee

security properties for communication operation

• Host ID is a hash of its public key – accountability (AIP)

• Content ID is a hash of the content – correctness

• Does not rely on external configuration

• Intrinsic security is specific to the principal type

• Example: retrieve content using …

• Content XID: content is correct

• Service XID: the right service provided content

• Host XID: content was delivered from right host

29

Principal Definition 2: Type-Specific Semantics

30

Contact a host

Use a service

Retrieve content

Host 0xF63C7A4…

Service 0x8A37037…

Content 0x47BF217…

Principal Definition 3: Type-Specific Processing

31

XIA router

Host-specific processing

Commonprocessing

Service-specific processing

Content-specific processing

…

Input Output

• Type-specific processing examples• Service: load balancing or service migration• Content: content caching

Routers with Different Capabilities

• Routers are not required to supportevery principal type• The only requirement: Host-based communication

Host

Common

Host-only router

Host

Common Service

Service-enabled router

Host

Common

Content-enabled router

Content

32

Using Principal Types that areNot Understood by Legacy Routers?

Legacy routerwithout

content support

Want to communicate using content principals



33

Fallbacks

Tomorrow’s communication types… today!

34

Fallbacks: Alternative Ways forRouters to Fulfill Intent of Packet

Content

Intent: Retrieve

Fallback: Contact ,

who understands request

What the network does:

• With content-enabled routers, use for routing

• Otherwise, use for routing (always succeeds)

Content

Host

Host

Content

35

DAG-BasedAddress

Your address is more than a number

36

DAG (Direct Acyclic Graph)-Based Addressing Enables Fallbacks

IntentPacket sender Routing choice

Another routing choice(with lower priority)

This host knows how to handle content request

Fallback

Content

Host

37

38

Outline

• Active Networks


• P4

• Overlay Routing

In the Beginning…

• OpenFlow was simple

• A single rule table• Priority, pattern, actions, counters, timeouts

• Matching on any of 12 fields, e.g.,• MAC addresses• IP addresses• Transport protocol • Transport port numbers

39

Over the Past Five Years…

40

Version Date # HeadersOF 1.0 Dec 2009 12OF 1.1 Feb 2011 15OF 1.2 Dec 2011 36OF 1.3 Jun 2012 40OF 1.4 Oct 2013 41

Proliferation of header fields

Multiple stages of heterogeneous tables

Still not enough (e.g., VXLAN, NVGRE, STT, …)

“Classic” OpenFlow (1.x)

41

Target Switch

SDN Control Plane

Installing and querying rules

“OpenFlow 2.0”

42

Target Switch

SDN Control Plane

Populating:Installing and querying rules

Compiler

Configuring:Parser, tables, and

control flow

Parser & Table Configuration

RuleTranslator

general interface between the controller and the switches.

That is, we believe that future generations of OpenFlow

should allow the controller to tell the switch how to operate,

rather than be constrained by a fixed switch design. The key

challenge is to find a “sweet spot” that balances the need

for expressiveness with the ease of implementation across a

wide range of hardware and software switches. In designing

P4, we have three main goals:

• Reconfigurability. The controller should be able to re-

define the packet parsing and processing in the field.

• Protocol independence. The switch should not be tied

to specific packet formats. Instead, the controller should

be able to specify (i) a packet parser for extracting header

fields with particular names and types and (ii) a collection

of typed match+action tables that process these headers.

• Target independence. Just as a C programmer does

not need to know the specifics of the underlying CPU, the

controller programmer should not need to know the de-

tails of the underlying switch. Instead, a compiler should

take the switch’s capabilities into account when turning

a target-independent description (written in P4) into a

target-dependent program (used to configure the switch).

The outline of the paper is as follows. We begin by in-

troducing an abstract switch forwarding model. Next, we

explain the need for a new language to describe protocol-

independent packet processing. We then present a simple

motivating example where a network operator wants to sup-

port a new packet-header field and process packets in mul-

tiple stages. We use this to explore how the P4 program

specifies headers, the packet parser, the multiple match+

action tables, and the control flow through these tables. Fi-

nally, we discuss how a compiler can map P4 programs to

target switches.

Related work. In 2011, Yadav et al. [4] proposed an ab-

stract forwarding model for OpenFlow, but with less empha-

sis on a compiler. Kangaroo [1] introduced the notion of pro-

grammable parsing. Recently, Song [5] proposed protocol-

oblivious forwarding which shares our goal of protocol in-

dependence, but is targeted more towards network proces-

sors. The ONF introduced table typing patterns to express

the matching capabilities of switches [6]. Recent work on

NOSIX [7] shares our goal of flexible specification of match+

action tables, but does not consider protocol-independence

or propose a language for specifying the parser, tables, and

control flow. Other recent work proposes a programmatic in-

terface to the data plane for monitoring, congestion control,

and queue management [8, 9]. The Click modular router [10]

supports flexible packet processing in software, but does not

map programs to a variety of target hardware switches.

2. ABSTRACT FORWARDING MODELIn our abstract model (Fig. 2), switches forward packets

via a programmable parser followed by multiple stages of

match+action, arranged in series, parallel, or a combination

of both. Derived from OpenFlow, our model makes three

generalizations. First, OpenFlow assumes a fixed parser,

whereas our model supports a programmable parser to allow

new headers to be defined. Second, OpenFlow assumes the

match+action stages are in series, whereas in our model they

can be in parallel or in series. Third, our model assumes that

actions are composed from protocol-independent primitives

supported by the switch.

Our abstract model generalizes how packets are processed

in di↵erent forwarding devices (e.g., Ethernet switches, load-

balancers, routers) and by di↵erent technologies (e.g., fixed-

function switch ASICs, NPUs, reconfigurable switches, soft-

ware switches, FPGAs). This allows us to devise a com-

mon language (P4) to represent how packets are processed

in terms of our common abstract model. Hence, program-

mers can create target-independent programs that a com-

piler can map to a variety of di↵erent forwarding devices,

ranging from relatively slow software switches to the fastest

ASIC-based switches.

Figure 2: The abstract forwarding model.

The forwarding model is controlled by two types of oper-

ations: Configure and Populate. Configure operations pro-

gram the parser, set the order of match+action stages, and

specify the header fields processed by each stage. Config-

uration determines which protocols are supported and how

the switch may process packets. Populate operations add

(and remove) entries to the match+action tables that were

specified during configuration. Population determines the

policy applied to packets at any given time.

For the purposes of this paper, we assume that configura-

tion and population are two distinct phases. In particular,

the switch need not process packets during configuration.

However, we expect implementations will allow packet pro-

cessing during partial or full reconfiguration enabling up-

grades with no downtime. Our model deliberately allows

for, and encourages, reconfiguration that does not interrupt

forwarding.

Clearly, the configuration phase has little meaning in fixed-

function ASIC switches; for this type of switch, the com-

ACM SIGCOMM Computer Communication Review 89 Volume 44, Number 3, July 2014

Abstract Model

• Reconfigurable parser + Non-sequential match-action + Protocol independent processing

Simple Motivating Example

• Data-center routing• Top-of-rack switches• Two tiers of core switches• Source routing by ToR

• Hierarchical tag (mTag)• Pushed by the ToR• Four one-byte fields• Two hops up, two down

44

up1

up2 down1

down2

ToR ToR

Header Formats

• Header• Ordered list of fields• A field has a name and width

header ethernet {fields {dst_addr : 48;src_addr : 48;ethertype : 16;

}}

header mTag {fields {up1 : 8;up2 : 8;down1 : 8;down2 : 8;ethertype : 16;

}}

header vlan {fields {pcp : 3;cfi : 1;vid : 12;ethertype : 16;

}}

Parser

• State machine traversing the packet• Extracting field values as it goes

46

parser start { parser vlan {ethernet; switch(ethertype) {

} case 0xaaaa : mTag;case 0x800 : ipv4;

parser ethernet { . . .switch(ethertype) { }

case 0x8100 : vlan;case 0x9100 : vlan; parser mTag {case 0x800 : ipv4; switch(ethertype) {. . . case 0x800 : ipv4;

} . . .} }

}

Typed Tables

• Describe each packet-processing stage • What fields are matched, and in what way• What action functions are performed• (Optionally) a hint about max number of rules

47

table mTag_table {reads {ethernet.dst_addr : exact;vlan.vid : exact;

}actions {add_mTag;

}max_size : 20000;

}

Action Functions

• Custom actions built from primitives• Add, remove, copy, set, increment, checksum

48

action add_mTag(up1, up2, down1, down2, outport) {add_header(mTag);

copy_field(mTag.ethertype, vlan.ethertype);set_field(vlan.ethertype, 0xaaaa);

set_field(mTag.up1, up1);set_field(mTag.up2, up2);set_field(mTag.down1, down1);set_field(mTag.down2, down2);

set_field(metadata.outport, outport);}

Control Flow

• Flow of control from one table to the next• Collection of functions, conditionals, and tables

• For a ToR switch:

49

ToR

From core(with mTag)

From local hosts(with no mTag)

SourceCheckTable

LocalSwitchingTable

EgressCheck

mTagTable

Miss: Not Local

Control Flow

• Flow of control from one table to the next• Collection of functions, conditionals, and tables

• Simple imperative representation

50

control main() {table(source_check);

if (!defined(metadata.ingress_error)) {table(local_switching);

if (!defined(metadata.outport)) {table(mTag_table);

}

table(egress_check);}

}

51

Outline

• Active Networks


• P4

• Overlay Routing

52

Overlay Routing

• Basic idea:• Treat multiple hops through IP network as one hop in �virtual� overlay network

• Run routing protocol on overlay nodes• Basic technique: tunnel packets

• Tunnels• IP-in-IP encapsulation• Poor interaction with firewalls, multi-path routers, etc.

• Why?• For performance – can run more clever protocol on overlay• For functionality – can provide new features such as

multicast, active processing, IPv6

53

Overlay for Performance [S+99]

• Why would IP routing not give good performance?• Policy routing – limits selection/advertisement of routes• Early exit/hot-potato routing – local not global

incentives• Lack of performance based metrics – AS hop count is

the wide area metric• How bad is it really?

• Look at performance gain an overlay provides

54

Quantifying Performance Loss

• Measure round trip time (RTT) and loss rate between pairs of hosts• ICMP rate limiting

• Alternate path characteristics• 30-55% of hosts had lower latency

• 10% of alternate routes have 50% lower latency• 75-85% have lower loss rates

55

Overlay for Features

• How do we add new features to the network?• Does every router need to support new feature?• Choices

• Reprogram all routers à active networks• Support new feature within an overlay

• Examples:• IP V6 & IP Multicast

• Tunnels between routers supporting feature

• Mobile IP• Home agent tunnels packets to mobile host�s location

59

Overlay Example: Multicast

• Unicast: one source to one destination• Multicast: one source to many destinations• Two main functions:

• Efficient data distribution • Logical naming of a group

60

Example Applications

• Broadcast audio/video• Push-based systems• Software distribution• Web-cache updates • Teleconferencing (audio, video, shared

whiteboard, text editor)• Multi-player games• Server/service location• Other distributed applications

61

Multicast – Efficient Data Distribution

Src Src

62

Multicast Router Responsibilities

• Learn of the existence of multicast groups (through advertisement)

• Identify links with group members• Establish state to route packets

• Replicate packets on appropriate interfaces• Routing entry:

Src, incoming interface List of outgoing interfaces

63

Supporting Multicast on the Internet

IP

Application

Internet architecture

Network

?

?

At which layer should multicast be implemented?

64

IP Multicast

CMU

BerkeleyMIT

UCSD

routersend systemsmulticast flow

• Highly efficient• Good delay

65

End System MulticastMIT1

MIT2

CMU1

CMU2

UCSD

MIT1

MIT2

CMU2

Overlay TreeBerkeley

CMU1

CMU

BerkeleyMIT

UCSD

66

• Quick deployment• All multicast state in end systems• Computation at forwarding points simplifies

support for higher level functionality

Potential Benefits Over IP Multicast

MIT1

MIT2

CMU1

CMU2

CMU

BerkeleyMIT

UCSD

67

Concerns with End System Multicast

• Self-organize recipients into multicast delivery overlay tree• Must be closely matched to real network topology to be

efficient• Performance concerns compared to IP Multicast

• Increase in delay• Bandwidth waste (packet duplication)

MIT2

Berkeley MIT1

UCSD

CMU2

CMU1

IP Multicast

MIT2

Berkeley MIT1

CMU1

CMU2

UCSD

End System Multicast

Next Lecture

• Middleboxes• NFV• Readings:

• Network Functions Virtualisation (skim)• Design and Implementation of a Consolidated

Middlebox Architecture (read whole paper)

76

Documents

15-744: Computer Networking15744/lectures/08-evolution.pdfOF 1.0 Dec 2009 12 OF 1.1 Feb 2011 15 OF 1.2 Dec 2011 36 OF 1.3 Jun2012 40 OF1.4 Oct 2013 41 Proliferation of header fields