University of Colorado at Boulder

Core Research Lab

Frame Shared Memory: Line-Rate Networking on Commodity Hardware

John K. Bennett, Douglas C. Sicker, and Manish Vachharajani

University of Colorado at Boulder

Alexander L. Wolf - Imperial College London

Antonio Carzaniga - University of Lugano

2007.12.03

John Giacomoni

Problem Description

Link       Mbps        fps           ns/frame
T-1        1.5         2,941         340,000
T-3        45.0        90,909        11,000
OC-3       155.0       333,333       3,000
OC-12      622.0       1,219,512     820
GigE       1,000.0     1,488,095     672
OC-48      2,500.0     5,000,000     200
10 GigE    10,000.0    14,925,373    67
OC-192     9,500.0     19,697,843    51

How do we route?

How do we protect?

How do we correlate?
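The ns/frame column is the inverse of the fps column: it is the time available to handle one frame at line rate. The sketch below reproduces the GigE row. It assumes a minimum-size 64-byte Ethernet frame plus 8 bytes of preamble and 12 bytes of inter-frame gap (672 bits on the wire); the slides do not state the framing assumed for the other link types, and the function names are illustrative only.

```c
#include <stdio.h>

/* Assumption: minimum-size Ethernet frame (64 B) plus 8 B preamble and
 * 12 B inter-frame gap, i.e. 84 B = 672 bits on the wire per frame. */
#define MIN_ETH_WIRE_BITS 672.0

/* Frames per second a link can carry at the given per-frame wire size. */
static double frames_per_second(double link_bps, double wire_bits)
{
    return link_bps / wire_bits;
}

/* Time available to handle one frame, in nanoseconds. */
static double ns_per_frame(double fps)
{
    return 1e9 / fps;
}

/* Hypothetical helper: print one row in the style of the table. */
static void print_budget(const char *link, double link_bps)
{
    double fps = frames_per_second(link_bps, MIN_ETH_WIRE_BITS);
    printf("%-8s %14.0f fps %8.1f ns/frame\n", link, fps, ns_per_frame(fps));
}
```

Calling `print_budget("GigE", 1e9)` gives roughly 1,488,095 fps and 672 ns per frame, matching the table. Under the same framing assumption 10 GigE works out to about 67.2 ns per frame; the table's 14,925,373 fps is what inverting the rounded 67 ns yields.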

ASIC Solutions

Link       Mbps        fps           ns/frame
T-1        1.5         2,941         340,000
T-3        45.0        90,909        11,000
OC-3       155.0       333,333       3,000
OC-12      622.0       1,219,512     820
GigE       1,000.0     1,488,095     672
OC-48      2,500.0     5,000,000     200
10 GigE    10,000.0    14,925,373    67
OC-192     9,500.0     19,697,843    51

How do we route?

How do we protect?

How do we correlate?

Programmable Network Processors

Intel® IXP2855

:(

Multicore Systems

• General-purpose processor (GPP) multicore systems

– Individual cores less powerful than a uniprocessor (UP)

– 10s-100s-1000s of cores

– Full OS & Library Support

– Asymmetric (Alpha)

– Heterogeneous (AMD, Intel)

Intel (2x2-core), MIT RAW (16-core), 100-core, 400-core

Moore’s Corollary vs. Moore’s Law

• SPEC Benchmark Suite Performance

– Predicted vs. actual

[Figure: CPU Performance (SPEC). Y axis: SPEC Number, 0 to 7000. X axis: Year/Quarter, 1999 Q4 through 2006 Q4. Series: Integer, Floating Point, Integer Ideal. Graph courtesy Tipp Moseley.]

Soft Network Processing (Soft-NP)

Soft-NP Technique: Frame Generation

Generate 1 (Gen), Application (App), Output (OP)

AMD Opteron System Overview

Data Flow: Frame Generation

OS, OP, Gen, App

Communication Overhead

Communication Overhead

GigE

Locks 200ns

Communication Overhead

Hardware 10ns

GigE

Locks 200ns

Communication Overhead

Lamport 160ns

Hardware 10ns

GigE

Locks 200ns

Communication Overhead

Lamport 160ns

Hardware 10ns

FastForward 28ns

GigE

Locks 200ns
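To put these figures in context, the sketch below expresses each one as a share of the per-frame time budget from the link table (672 ns for GigE, 200 ns for OC-48). It assumes each figure is the cost of a single inter-core handoff, which the slides do not spell out; the struct and function names are illustrative.

```c
#include <stdio.h>

/* Per-operation costs quoted on the slides, in nanoseconds. */
struct overhead {
    const char *name;
    double ns;
};

static const struct overhead overheads[] = {
    { "Locks",       200.0 },
    { "Lamport",     160.0 },
    { "FastForward",  28.0 },
    { "Hardware",     10.0 },
};

/* Fraction of a per-frame budget consumed by one handoff. */
static double budget_fraction(double overhead_ns, double frame_budget_ns)
{
    return overhead_ns / frame_budget_ns;
}

/* Print each mechanism's share of the given per-frame budget. */
static void print_shares(const char *link, double frame_budget_ns)
{
    size_t n = sizeof overheads / sizeof overheads[0];
    for (size_t i = 0; i < n; i++) {
        printf("%-6s %-12s %5.1f%%\n", link, overheads[i].name,
               100.0 * budget_fraction(overheads[i].ns, frame_budget_ns));
    }
}
```

Under that assumption one lock-based handoff uses about 30% of the GigE budget and the entire OC-48 budget, while a FastForward handoff uses about 4% and 14% respectively.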

FastForward

enqueue(data) {
    lock(queue);
    if (NEXT(head) == tail) {       /* queue full */
        unlock(queue);
        return EWOULDBLOCK;
    }
    buffer[head] = data;
    head = NEXT(head);
    unlock(queue);
    return 0;
}

enqueue_fastforward(data) {
    if (NULL != buffer[head]) {     /* slot not yet consumed: queue full */
        return EWOULDBLOCK;
    }
    buffer[head] = data;
    head = NEXT(head);
    return 0;
}

• Cache-optimized concurrent lock-free (CLF) queues

• Works with strong to weak consistency models

• Hides die-to-die communication

• Giacomoni, Moseley, and Vachharajani. “FastForward for Efficient Pipeline Parallelism: A Cache-Optimized Concurrent Lock-Free Queue.” To appear: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), February 2008.
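The slide shows only the producer side. A consumer-side counterpart can be sketched in the same spirit: the slot contents, not a shared head/tail comparison, tell each side whether it may proceed, so the producer never reads `tail` and the consumer never reads `head`. The version below is a minimal single-producer/single-consumer sketch, not the authors' implementation. The C11 atomics, the `ff_` names and `QUEUE_SIZE` are assumptions added here, and the cache-placement tuning that the "cache-optimized" bullet refers to is omitted.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define QUEUE_SIZE 64                       /* hypothetical capacity */
#define NEXT(i) (((i) + 1) % QUEUE_SIZE)

/* Single-producer/single-consumer ring. NULL marks an empty slot, so
 * stored pointers must be non-NULL. head is touched only by the
 * producer and tail only by the consumer. */
struct ff_queue {
    _Atomic(void *) buffer[QUEUE_SIZE];
    size_t head;                            /* producer-private index */
    size_t tail;                            /* consumer-private index */
};

static void ff_init(struct ff_queue *q)
{
    for (size_t i = 0; i < QUEUE_SIZE; i++)
        atomic_init(&q->buffer[i], NULL);
    q->head = 0;
    q->tail = 0;
}

/* Producer side: mirrors enqueue_fastforward from the slide. */
static int ff_enqueue(struct ff_queue *q, void *data)
{
    if (atomic_load_explicit(&q->buffer[q->head], memory_order_acquire) != NULL)
        return EWOULDBLOCK;                 /* slot still occupied: full */
    atomic_store_explicit(&q->buffer[q->head], data, memory_order_release);
    q->head = NEXT(q->head);
    return 0;
}

/* Consumer side: the slot itself says whether data is present. */
static int ff_dequeue(struct ff_queue *q, void **data)
{
    void *item = atomic_load_explicit(&q->buffer[q->tail], memory_order_acquire);
    if (item == NULL)
        return EWOULDBLOCK;                 /* slot empty: nothing to take */
    atomic_store_explicit(&q->buffer[q->tail], NULL, memory_order_release);
    q->tail = NEXT(q->tail);
    *data = item;
    return 0;
}
```

Because fullness is detected per slot, all `QUEUE_SIZE` slots are usable, whereas the head/tail test in the lock-based `enqueue` always leaves one slot empty.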

Frame Shared Memory (FShm)

• Pure software stack communicating via shared memory

– Abstracted at the driver/NIC boundary

– Cross-Domain modules (Kernel/Process, T/T, P/P, K/K)

– Compatible with existing OS/library/language services

– Can communicate with any device on the memory interconnect

FShm Driver API

struct ifdirect {
    void         (*if_direct_tick)          (void *softc);

    void         (*if_direct_attach)        (struct ifnet *, void *);
    void         (*if_direct_detach)        (struct ifnet *, void *);

    int          (*if_direct_tx)            (void *softc, struct mbuf *txbuf);
    void         (*if_direct_tx_post)       (void *softc);

    void         (*if_direct_tx_clean_pre)  (void *softc);
    struct mbuf* (*if_direct_tx_clean)      (void *softc);
    void         (*if_direct_tx_clean_post) (void *softc);

    void         (*if_direct_rx_pre)        (void *softc);
    struct mbuf* (*if_direct_rx)            (void *, struct mbuf *new_rxbuf);
    void         (*if_direct_rx_post)       (void *softc);
};
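The pre/op/post grouping suggests a bracketed polling pattern: call the `_pre` hook once, invoke the operation repeatedly, then call the `_post` hook once. The sketch below shows how a receive burst might be driven that way. The slides do not document the hook semantics, so this assumes that `if_direct_rx` returns one received frame and takes `new_rxbuf` as its replacement, or returns NULL when nothing is pending. `fshm_rx_burst`, `struct ifdirect_rx_ops`, the stand-in `struct mbuf` and the mock driver are all hypothetical and exist only to make the example self-contained.

```c
#include <stddef.h>

/* Stand-in for the OS packet buffer type, defined here only so the
 * sketch compiles on its own. */
struct mbuf { int len; };

/* Receive-side subset of the slide's struct ifdirect. */
struct ifdirect_rx_ops {
    void         (*if_direct_rx_pre)(void *softc);
    struct mbuf *(*if_direct_rx)(void *softc, struct mbuf *new_rxbuf);
    void         (*if_direct_rx_post)(void *softc);
};

/* Hypothetical polling helper (not part of the slide's API).
 * Assumed semantics: rx() hands back one received frame and takes
 * new_rxbuf as its replacement, or returns NULL when no frame is
 * pending. Returns the number of frames collected into out[]. */
static size_t fshm_rx_burst(const struct ifdirect_rx_ops *ops, void *softc,
                            struct mbuf **spare, size_t nspare,
                            struct mbuf **out)
{
    size_t n = 0;

    ops->if_direct_rx_pre(softc);
    while (n < nspare) {
        struct mbuf *m = ops->if_direct_rx(softc, spare[n]);
        if (m == NULL)
            break;                      /* no frame pending */
        out[n++] = m;
    }
    ops->if_direct_rx_post(softc);
    return n;
}

/* Mock driver used only to exercise the helper. */
struct mock_softc {
    struct mbuf **pending;              /* frames "waiting" in the ring */
    size_t count, next;
    int pre_calls, post_calls;
};

static void mock_rx_pre(void *softc)
{
    ((struct mock_softc *)softc)->pre_calls++;
}

static void mock_rx_post(void *softc)
{
    ((struct mock_softc *)softc)->post_calls++;
}

static struct mbuf *mock_rx(void *softc, struct mbuf *new_rxbuf)
{
    struct mock_softc *sc = softc;
    (void)new_rxbuf;    /* a real driver would post this buffer to the NIC */
    return sc->next < sc->count ? sc->pending[sc->next++] : NULL;
}
```

Bracketing the burst this way lets a driver amortize per-batch work across many frames; whether the actual FShm drivers use the hooks for that purpose is not stated on the slide.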

FShm Evaluation Methodology

• AMD Opteron 2.0 GHz

– Dual-Processor & Dual-Core

• Compute average time per call

– TSC (time stamp counter)
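A minimal sketch of that kind of measurement is shown below: read the counter before and after a run of calls, divide by the number of calls, and convert ticks to nanoseconds. It is not the authors' harness. It assumes the TSC ticks at the 2.0 GHz core clock (0.5 ns per tick), and the non-x86 fallback, the function names and the `bump_counter` workload are additions made only so the example stands alone.

```c
#include <stdint.h>
#include <time.h>

/* Read a tick counter. On x86 this is the TSC; elsewhere fall back to
 * wall-clock nanoseconds so the sketch still compiles (the fallback is
 * an assumption of this example, not something from the slides). */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Convert a tick count to nanoseconds at the given tick frequency. */
static double ticks_to_ns(double ticks, double ticks_per_second)
{
    return ticks * 1e9 / ticks_per_second;
}

/* Average ticks per call of fn(arg) over back-to-back calls. */
static double avg_ticks_per_call(void (*fn)(void *), void *arg,
                                 uint64_t iterations)
{
    uint64_t start = read_ticks();
    for (uint64_t i = 0; i < iterations; i++)
        fn(arg);
    uint64_t end = read_ticks();
    return (double)(end - start) / (double)iterations;
}

/* Hypothetical workload: increments the counter that arg points to. */
static void bump_counter(void *arg)
{
    (*(uint64_t *)arg)++;
}
```

For example, `ticks_to_ns(avg_ticks_per_call(bump_counter, &counter, 1000000), 2.0e9)` would report the average cost of one call in nanoseconds on a 2.0 GHz part. Averaging over many calls amortizes the cost of reading the counter itself.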

Frame Generation Data Flow

OS, App, OP, Gen

FShm Generate (Linux pktgen)

64B* 1.36 Mfps

FShm Capture (IDS)

64B* 1.36 Mfps

FShm Forward (Bridge)

64B* 1.36 Mfps

FShm’s Future

Lamport 160ns

Hardware 10ns

FastForward 28ns

GigE

OC-48

Locks 200ns

Questions?
