Page 1: What’s needed to receive?

What’s needed to receive?

A look at the minimum steps required for programming our anchor NICs to receive packets

Page 2: What’s needed to receive?

A disappointment

• Our former ‘nicwatch.cpp’ application does not seem to work reliably to show packets being received by the 82573L controller

• It was based on the ‘raw sockets’ protocol implemented within the Linux kernel’s vast networking subsystem, thus offering us the prospect of a ‘hardware-independent’ tool -- if only it would show us all the packets!

Page 3: What’s needed to receive?

Two purposes…

• So let’s discard ‘nicwatch.cpp’ in favor of writing our own hardware-specific module that WILL be able to show us all the nic’s received packets, independently of Linux’s various layers of networking protocol code

• And let’s keep it as simple as possible, so we can see which programming steps are the truly essential ones for the 82573L nic

Page 4: What’s needed to receive?

Accessing 82573L registers

• Device registers are hardware mapped to a range of addresses in physical memory

• We can get the location and extent of this memory-range from a BAR register in the 82573L device’s PCI Configuration Space

• We then ask the Linux kernel to set up an I/O ‘remapping’ of this memory-range to ‘virtual’ addresses within kernel-space
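The first of these steps, interpreting the value found in a memory BAR, can be sketched in ordinary user-space C. This is only an illustration: the function names are hypothetical, and it assumes a 32-bit memory-space BAR. Inside a kernel module one would more typically rely on the kernel's own helpers (such as pci_resource_start(), pci_resource_len() and ioremap()) rather than decoding the register by hand.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit 0 of a PCI Base Address Register tells which kind of BAR it is:
   0 = memory-space, 1 = I/O-port space. */
bool bar_is_memory(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* In a 32-bit memory BAR the low four bits are flag bits (space
   indicator, type, prefetchable); the remaining bits are the base. */
uint32_t bar_memory_base(uint32_t bar)
{
    return bar & ~(uint32_t)0xF;
}

/* Size probing: after software writes 0xFFFFFFFF to the BAR and reads
   it back, the address bits that stayed zero encode the region size. */
uint32_t bar_memory_size(uint32_t readback)
{
    return ~(readback & ~(uint32_t)0xF) + 1;
}
```

For instance, a read-back value of 0xFFFE0000 would describe a 128-KB register window.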

Page 5: What’s needed to receive?

i/o-memory remapping

[Figure: the physical address-space (dynamic ram, nic registers, vram, IO-APIC, Local-APIC) alongside the ‘virtual’ address-space (3-GB of userspace, plus 1-GB of kernel-space holding kernel code/data, APIC registers, nic registers and vram)]

Page 6: What’s needed to receive?

Kernel memory allocation

• The NIC requires some host memory for packet-buffers and receive descriptors

• The kernel provides a ‘helper function’ for reserving a suitable region of memory in kernel-space which is both ‘non-pageable’ and ‘physically contiguous’ (i.e., kzalloc())

• It’s our job to decide how much memory our network controller hardware will need

Page 7: What’s needed to receive?

Ethernet packet layout

• Total size normally can vary from 64 bytes up to 1522 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled)

• The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum

Offset 0: destination MAC address (6 bytes)
Offset 6: source MAC address (6 bytes)
Offset 12: Type/length (2 bytes)
Offset 14: the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
After the payload: Cyclic Redundancy Checksum (4 bytes)
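To make the layout concrete, here is a small sketch of how software might pick the header fields out of a raw frame. The helper names and macro are hypothetical, chosen only for this illustration; the offsets are the ones shown in the layout.

```c
#include <stdint.h>
#include <stddef.h>

#define ETH_HDR_LEN 14   /* destination (6) + source (6) + type/length (2) */

/* Returns the 16-bit type/length field, which is stored big-endian at
   offset 12, or -1 if the frame is too short to hold a full header. */
long eth_type_or_length(const uint8_t *frame, size_t len)
{
    if (len < ETH_HDR_LEN)
        return -1;
    return ((long)frame[12] << 8) | frame[13];
}

/* Number of payload bytes in a frame of 'len' bytes; 'has_crc' is
   nonzero when the 4-byte CRC is still attached to the data. */
long eth_payload_length(size_t len, int has_crc)
{
    size_t overhead = ETH_HDR_LEN + (has_crc ? 4u : 0u);
    if (len < overhead)
        return -1;
    return (long)(len - overhead);
}
```

A minimum-size 64-byte frame with its CRC attached therefore carries 64 - 14 - 4 = 46 payload bytes.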

Page 8: What’s needed to receive?

Rx-Descriptor Ring-Buffer

Circular buffer (128 bytes minimum, and must be a multiple of 128 bytes)

[Figure: a ring of 16-byte descriptors at offsets 0x00, 0x10, 0x20, ... 0x80 from the RDBA base-address, with its total size given by RDLEN (in bytes); RDH (head) and RDT (tail) mark which descriptors are owned by hardware (nic) and which are owned by software (cpu)]
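The index arithmetic implied by the ring can be sketched as follows. The function names are hypothetical, and the ownership rule used here (descriptors from head up to, but not including, tail are available to the hardware) is stated as an assumption about how the head and tail registers are usually interpreted.

```c
#include <stdint.h>

#define DESC_SIZE   16u   /* bytes per legacy Rx descriptor        */
#define RING_ALIGN 128u   /* RDLEN must be a multiple of this size */

/* A ring length is acceptable if it is a nonzero multiple of 128 bytes. */
int rdlen_is_valid(uint32_t rdlen)
{
    return rdlen != 0 && (rdlen % RING_ALIGN) == 0;
}

/* Number of descriptors that fit in a ring of 'rdlen' bytes. */
uint32_t ring_descriptor_count(uint32_t rdlen)
{
    return rdlen / DESC_SIZE;
}

/* Advance a descriptor index by one, wrapping at the end of the ring. */
uint32_t ring_next(uint32_t index, uint32_t n_desc)
{
    return (index + 1) % n_desc;
}

/* ASSUMPTION: descriptors in [head, tail) belong to the hardware.
   This counts them, allowing for wrap-around. */
uint32_t ring_hw_owned(uint32_t head, uint32_t tail, uint32_t n_desc)
{
    return (tail + n_desc - head) % n_desc;
}
```

Since each descriptor is 16 bytes, the 128-byte granularity means descriptors come in groups of eight.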

Page 9: What’s needed to receive?

Our ‘nicspy.c’ module

• It will be a ‘character-mode’ device-driver

• It will only implement ‘read()’ and ‘ioctl()’

• The ‘read()’ function will cause a task to sleep until a network packet has arrived

• An interrupt-handler will wake up the task

• A ‘get_info’ function will be provided as a debugging aid, so the NIC’s Rx descriptor-queue can be conveniently inspected

Page 10: What’s needed to receive?

Sixteen packet-buffers

• Our ‘nicspy.c’ driver allocates 16 buffers of size 1536 bytes (i.e., for normal ethernet)

32-KB allocated (16 packet-buffers, plus Rx-Descriptor Queue)

#define KMEM_SIZE 0x8000 // 32KB = size of kernel memory allocation

void *kmem = kzalloc( KMEM_SIZE, GFP_KERNEL );
if ( !kmem ) return -ENOMEM;

[Figure: the 32-KB allocation, holding the sixteen packet-buffers and the Rx Descriptor Queue (256 bytes); the remainder is unused]
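The arithmetic behind these sizes can be checked with a short sketch. The placement used below (packet-buffers first, descriptor queue immediately after them) is an assumption made for illustration, since the figure's exact offsets are not recoverable here; the helper names are hypothetical too.

```c
#include <stddef.h>

#define KMEM_SIZE     0x8000   /* 32KB allocation, as on the slide */
#define N_RX_BUFS     16       /* sixteen packet-buffers           */
#define RX_BUFSIZ     1536     /* bytes per packet-buffer          */
#define RX_DESC_SIZE  16       /* bytes per Rx descriptor          */

/* ASSUMPTION: buffer i starts at offset i * 1536 within the region. */
size_t rx_buffer_offset(unsigned i)
{
    return (size_t)i * RX_BUFSIZ;
}

/* ASSUMPTION: the descriptor queue begins right after the last buffer. */
size_t rx_queue_offset(void)
{
    return (size_t)N_RX_BUFS * RX_BUFSIZ;
}

/* Sixteen 16-byte descriptors occupy 256 bytes. */
size_t rx_queue_bytes(void)
{
    return (size_t)N_RX_BUFS * RX_DESC_SIZE;
}

/* Whatever the buffers and the queue do not use is left over. */
size_t kmem_unused_bytes(void)
{
    return (size_t)KMEM_SIZE - rx_queue_offset() - rx_queue_bytes();
}
```

Under this layout the queue would start at offset 0x6000, which also satisfies a 16-byte alignment for the descriptors.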

Page 11: What’s needed to receive?

Format for an Rx Descriptor

[Figure: the 16-byte descriptor layout: Base-address (64 bits), Packet-length, Packet-checksum, status, errors, VLAN tag]

The device-driver initializes this ‘base-address’ field with the physical address of a packet-buffer

The network controller will ‘write-back’ the values for these fields when it has transferred a received packet’s data into this packet-buffer

Page 12: What’s needed to receive?

Suggested C syntax

typedef struct {
    unsigned long long base_address;
    unsigned short packet_length;
    unsigned short packet_cksum;
    unsigned char desc_status;
    unsigned char desc_errors;
    unsigned short VLAN_tag;
} RX_DESCRIPTOR;

‘Legacy Format’ for the Intel Pro1000 network controller’s Receive Descriptors
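Because the hardware dictates this layout, it can be worth confirming that the compiler produces exactly 16 bytes with the fields at the expected offsets. The sketch below swaps in fixed-width types from <stdint.h> for that purpose; the type and function names are hypothetical and are not taken from the original driver.

```c
#include <stdint.h>
#include <stddef.h>

/* Same field order as the legacy-format descriptor shown above. */
typedef struct {
    uint64_t base_address;   /* physical address of the packet-buffer */
    uint16_t packet_length;  /* written back by the controller        */
    uint16_t packet_cksum;
    uint8_t  desc_status;
    uint8_t  desc_errors;
    uint16_t vlan_tag;
} rx_desc_t;

/* Returns nonzero when the struct matches the 16-byte hardware layout. */
int rx_desc_layout_ok(void)
{
    return sizeof(rx_desc_t) == 16
        && offsetof(rx_desc_t, packet_length) == 8
        && offsetof(rx_desc_t, desc_status) == 12
        && offsetof(rx_desc_t, vlan_tag) == 14;
}

/* Driver-side initialization: point the descriptor at a buffer and
   zero the fields that the controller will later write back. */
void rx_desc_init(rx_desc_t *d, uint64_t buffer_phys)
{
    d->base_address  = buffer_phys;
    d->packet_length = 0;
    d->packet_cksum  = 0;
    d->desc_status   = 0;
    d->desc_errors   = 0;
    d->vlan_tag      = 0;
}
```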

Page 13: What’s needed to receive?

RxDesc Status-field

Bit:   7    6     5      4      3   2     1    0
Field: PIF  IPCS  TCPCS  UDPCS  VP  IXSM  EOP  DD

DD = Descriptor Done (1=yes, 0=no) shows if nic is finished with descriptor
EOP = End Of Packet (1=yes, 0=no) shows if this packet is logically last
IXSM = Ignore Checksum Indications (1=yes, 0=no)
VP = VLAN Packet match (1=yes, 0=no)
UDPCS = UDP Checksum calculated in packet (1=yes, 0=no)
TCPCS = TCP Checksum calculated in packet (1=yes, 0=no)
IPCS = IPv4 Checksum calculated on packet (1=yes, 0=no)
PIF = Passed In-exact Filter (1=yes, 0=no) shows if software must check
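As an illustration of how a driver might test these status bits, here is a brief sketch. The constant and function names are hypothetical; the bit positions are the ones listed for the status field above.

```c
#include <stdint.h>

enum {
    RXD_STAT_DD    = 1 << 0,   /* Descriptor Done              */
    RXD_STAT_EOP   = 1 << 1,   /* End Of Packet                */
    RXD_STAT_IXSM  = 1 << 2,   /* Ignore Checksum Indications  */
    RXD_STAT_VP    = 1 << 3,   /* VLAN Packet match            */
    RXD_STAT_UDPCS = 1 << 4,   /* UDP Checksum calculated      */
    RXD_STAT_TCPCS = 1 << 5,   /* TCP Checksum calculated      */
    RXD_STAT_IPCS  = 1 << 6,   /* IPv4 Checksum calculated     */
    RXD_STAT_PIF   = 1 << 7    /* Passed In-exact Filter       */
};

/* A buffer holds a whole packet once the nic is done with the
   descriptor and this descriptor is the packet's last one. */
int rx_packet_complete(uint8_t status)
{
    uint8_t need = RXD_STAT_DD | RXD_STAT_EOP;
    return (status & need) == need;
}

/* The checksum indications only mean something when IXSM is clear. */
int rx_ip_checksum_was_computed(uint8_t status)
{
    return !(status & RXD_STAT_IXSM) && (status & RXD_STAT_IPCS);
}
```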

Page 14: What’s needed to receive?

RxDesc Error-field

Bit:   7    6    5     4           3           2    1   0
Field: RXE  IPE  TCPE  reserved=0  reserved=0  SEQ  SE  CE

RXE = Received-data Error (1=yes, 0=no)
IPE = IPv4-checksum error (1=yes, 0=no)
TCPE = TCP/UDP checksum error (1=yes, 0=no)
SEQ = Sequence error (1=yes, 0=no)
SE = Symbol Error (1=yes, 0=no)
CE = CRC Error or alignment error (1=yes, 0=no)
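The error bits can be grouped in a similar way. The split used in this sketch, between errors that suggest a damaged frame and errors that only concern checksums, is an illustrative choice rather than anything prescribed by the slides, and the names are hypothetical.

```c
#include <stdint.h>

enum {
    RXD_ERR_CE   = 1 << 0,   /* CRC Error or alignment error */
    RXD_ERR_SE   = 1 << 1,   /* Symbol Error                 */
    RXD_ERR_SEQ  = 1 << 2,   /* Sequence error               */
    RXD_ERR_TCPE = 1 << 5,   /* TCP/UDP checksum error       */
    RXD_ERR_IPE  = 1 << 6,   /* IPv4-checksum error          */
    RXD_ERR_RXE  = 1 << 7    /* Received-data Error          */
};

/* Nonzero when any link-level error bit is set for this packet. */
int rx_frame_is_damaged(uint8_t errors)
{
    return (errors & (RXD_ERR_CE | RXD_ERR_SE |
                      RXD_ERR_SEQ | RXD_ERR_RXE)) != 0;
}

/* Nonzero when the controller flagged an IPv4 or TCP/UDP checksum. */
int rx_checksum_failed(uint8_t errors)
{
    return (errors & (RXD_ERR_IPE | RXD_ERR_TCPE)) != 0;
}
```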

Page 15: What’s needed to receive?

Essential ‘receive’ registers

enum {
    E1000_CTRL   = 0x0000, // Device Control
    E1000_STATUS = 0x0008, // Device Status
    E1000_ICR    = 0x00C0, // Interrupt Cause Read
    E1000_IMS    = 0x00D0, // Interrupt Mask Set
    E1000_IMC    = 0x00D8, // Interrupt Mask Clear
    E1000_RCTL   = 0x0100, // Receive Control
    E1000_RDBAL  = 0x2800, // Rx Descriptor Base Address Low
    E1000_RDBAH  = 0x2804, // Rx Descriptor Base Address High
    E1000_RDLEN  = 0x2808, // Rx Descriptor Length
    E1000_RDH    = 0x2810, // Rx Descriptor Head
    E1000_RDT    = 0x2818, // Rx Descriptor Tail
    E1000_RXDCTL = 0x2828, // Rx Descriptor Control
    E1000_RA     = 0x5400, // Receive address-filter Array
};
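To show how these offsets get used, the sketch below programs the descriptor-ring registers through a pair of small accessor functions. It is a user-space simulation under stated assumptions: an ordinary array stands in for the remapped register window, the helper names are hypothetical, and the choice of initial tail value (one less than the descriptor count) is a common convention rather than something taken from the slides.

```c
#include <stdint.h>

enum {
    E1000_RDBAL = 0x2800,   /* Rx Descriptor Base Address Low  */
    E1000_RDBAH = 0x2804,   /* Rx Descriptor Base Address High */
    E1000_RDLEN = 0x2808,   /* Rx Descriptor Length            */
    E1000_RDH   = 0x2810,   /* Rx Descriptor Head              */
    E1000_RDT   = 0x2818    /* Rx Descriptor Tail              */
};

/* Stand-in for the remapped register window; in a driver the pointer
   would come from the kernel's i/o-remapping instead. */
static uint32_t fake_regs[0x8000 / 4];

void reg_write(volatile uint32_t *io, unsigned offset, uint32_t value)
{
    io[offset / 4] = value;
}

uint32_t reg_read(volatile uint32_t *io, unsigned offset)
{
    return io[offset / 4];
}

/* Tell the controller where the ring lives and how long it is. */
void rx_ring_setup(volatile uint32_t *io, uint64_t ring_phys, uint32_t n_desc)
{
    reg_write(io, E1000_RDBAL, (uint32_t)(ring_phys & 0xFFFFFFFFu));
    reg_write(io, E1000_RDBAH, (uint32_t)(ring_phys >> 32));
    reg_write(io, E1000_RDLEN, n_desc * 16);   /* 16 bytes per descriptor */
    reg_write(io, E1000_RDH, 0);
    reg_write(io, E1000_RDT, n_desc - 1);      /* ASSUMPTION: see lead-in */
}
```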

Page 16: What’s needed to receive?

Receive Control (0x0100)

[Figure: bit-layout of the 32-bit Receive Control register, bits 31 down to 0]

EN = Receive Enable
SBP = Store Bad Packets
UPE = Unicast Promiscuous Enable
MPE = Multicast Promiscuous Enable
LPE = Long Packet reception Enable
LBM = Loopback Mode
RDMTS = Rx-Descriptor Minimum Threshold Size
DTYP = Descriptor Type
MO = Multicast Offset
BAM = Broadcast Accept Mode
BSIZE = Receive Buffer Size
VFE = VLAN Filter Enable
CFIEN = Canonical Form Indicator Enable
CFI = Canonical Form Indicator bit-value
DPF = Discard Pause Frames
PMCF = Pass MAC Control Frames
BSEX = Buffer Size Extension
SECRC = Strip Ethernet CRC
FLXBUF = Flexible Buffer size

We used 0x0000801C in RCTL to prepare the ‘receive engine’ prior to enabling it
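The value 0x0000801C can be broken down into individual bits. The sketch below assumes the usual bit positions for this register (EN at bit 1, SBP at bit 2, UPE at bit 3, MPE at bit 4, BAM at bit 15); the constant and function names are hypothetical.

```c
#include <stdint.h>

enum {
    RCTL_EN  = 1u << 1,    /* Receive Enable               */
    RCTL_SBP = 1u << 2,    /* Store Bad Packets            */
    RCTL_UPE = 1u << 3,    /* Unicast Promiscuous Enable   */
    RCTL_MPE = 1u << 4,    /* Multicast Promiscuous Enable */
    RCTL_BAM = 1u << 15    /* Broadcast Accept Mode        */
};

/* The settings written while the receive engine is still disabled. */
uint32_t rctl_prepare_value(void)
{
    return RCTL_SBP | RCTL_UPE | RCTL_MPE | RCTL_BAM;
}

/* The same settings with the enable bit switched on. */
uint32_t rctl_enable(uint32_t rctl)
{
    return rctl | RCTL_EN;
}
```

Read this way, the value asks the controller to accept every unicast, multicast and broadcast packet, including bad ones, which suits a tool meant to show all received traffic.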

Page 17: What’s needed to receive?

Device Control (0x0000)

[Figure: bit-layout of the 32-bit Device Control register on the 82573L, bits 31 down to 0; bits shown as R=0 or R=1 are reserved]

FD = Full-Duplex
GIOMD = GIO Master Disable
SLU = Set Link Up
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
FRCSPD = Force Speed
FRCDPLX = Force Duplex
D/UD = Dock/Undock status
ADVD3WUC = Advertise Cold Wake Up Capability
RST = Device Reset
RFCE = Rx Flow-Control Enable
TFCE = Tx Flow-Control Enable
VME = VLAN Mode Enable
PHYRST = Phy Reset

We used 0x040C0241 to initiate a ‘device reset’ operation
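A few of the fields in 0x040C0241 can be picked apart as a check. This sketch assumes the customary positions for these Device Control bits (FD at bit 0, SLU at bit 6, SPEED in bits 9:8, RST at bit 26) and decodes only those; the names are hypothetical.

```c
#include <stdint.h>

enum {
    CTRL_FD  = 1u << 0,    /* Full-Duplex  */
    CTRL_SLU = 1u << 6,    /* Set Link Up  */
    CTRL_RST = 1u << 26    /* Device Reset */
};

/* Two-bit SPEED field: 0 = 10Mbps, 1 = 100Mbps, 2 = 1000Mbps. */
unsigned ctrl_speed_field(uint32_t ctrl)
{
    return (ctrl >> 8) & 0x3u;
}

/* Nonzero when the value asks the controller to reset itself. */
int ctrl_requests_reset(uint32_t ctrl)
{
    return (ctrl & CTRL_RST) != 0;
}
```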

Page 18: What’s needed to receive?

Device Status (0x0008)

[Figure: bit-layout of the 32-bit Device Status register on the 82573L, bits 31 down to 0, including the GIO Master EN and Function ID fields; one bit is marked ‘?’ (some undocumented functionality?)]

FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
PHYRA = PHY Reset Asserted

Page 19: What’s needed to receive?

PCI Bus Master DMA

[Figure: the 82573L i/o-memory holds the RX and TX FIFOs (32-KB total) together with the on-chip RX descriptors and on-chip TX descriptors; DMA transfers connect these to the Rx Descriptor Queue and the packet-buffers in the Host’s Dynamic Random Access Memory]

Page 20: What’s needed to receive?

Our ‘read()’ algorithm

unsigned int rx_curr;

ssize_t my_read( struct file *file, char *buf, size_t len, loff_t *pos )
{
    // our global variable ‘rx_curr’ is the descriptor-array index
    // for the next receive-buffer descriptor to be processed

    if ( this descriptor’s status is zero ) put calling task to sleep;

    // wakeup the task when a fresh packet has been received

    copy received data from the packet-buffer to user’s buffer
    clear this descriptor’s status
    advance our global variable ‘rx_curr’ to the next descriptor
    return the number of data-bytes transferred
}
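The bookkeeping in this algorithm can be tried out in user-space with a small simulation. Everything here is a stand-in: the ring and buffers are plain arrays, sim_receive() plays the role of the controller's write-back, and sim_read() returns -1 at the point where the real driver would put the task to sleep (a kernel driver would typically use facilities such as wait_event_interruptible() and copy_to_user() instead). All names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define N_DESC   16
#define BUFSIZE  1536

struct sim_desc { uint16_t length; uint8_t status; };

static struct sim_desc ring[N_DESC];
static uint8_t buffers[N_DESC][BUFSIZE];
static unsigned rx_curr;   /* index of the next descriptor to process */

/* Simulates the nic filling a buffer and writing back its descriptor. */
int sim_receive(unsigned slot, const void *data, uint16_t len)
{
    if (slot >= N_DESC || len > BUFSIZE)
        return -1;
    memcpy(buffers[slot], data, len);
    ring[slot].length = len;
    ring[slot].status = 0x03;          /* DD and EOP both set */
    return 0;
}

/* Returns the number of bytes copied, or -1 if no packet is ready
   (the point at which the real read() would sleep). */
long sim_read(void *buf, size_t len)
{
    struct sim_desc *d = &ring[rx_curr];
    size_t count;

    if (d->status == 0)
        return -1;
    count = d->length < len ? d->length : len;
    memcpy(buf, buffers[rx_curr], count);
    d->status = 0;                      /* clear this descriptor's status */
    rx_curr = (rx_curr + 1) % N_DESC;   /* advance, wrapping at the end   */
    return (long)count;
}
```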

Page 21: What’s needed to receive?

‘nicspy.cpp’

• This application calls our device-driver’s ‘read()’ function repeatedly, and displays the ‘raw’ ethernet packet-data each time

• It requires our ‘nicspy.c’ device-driver to be installed in the kernel, obviously

• There’s no ‘clash’ of filenames here – and their similarity helps keep them together:

nicspy.c and nicspy.ko (the kernel-side)
nicspy.cpp and nicspy (the user-side)

Page 22: What’s needed to receive?

in-class demo

• We can install ‘nicspy.ko’ on one of our anchor machines – making sure ‘eth1’ is ‘down’ before we do our module-install – and then we run ‘nicspy’ on that machine

• Next we install our ‘nicping.ko’ module on some other anchor machine – be sure its ‘eth1’ interface is ‘down’ beforehand – and then use ‘cat /proc/nicping’ for a transmit

Page 23: What’s needed to receive?