What's Under Your Hood? Implementing a Network Monitoring System


04/22/23 1

What's Under Your Hood? Implementing a Network Monitoring System

jonschipp@gmail.com

Who am I?

Jon Schipp

Unix Admin

Linux & Unix User Group

Southern Indiana Computer Klub

and...

I like computers a lot

What's Network Monitoring?

Monitoring your network

Collecting data i.e. network traffic

Interpreting the data

Why?

Network issues

Attack detection

Record keeping

Fun

Focus

Small/Medium size business

Basement endeavors

Cheap goods

Working with what you have

where the magic happens

gimme the data

hubs

monitor/SPAN ports, port mirroring

taps

ip forwarding/relaying/tunneling, whatev

Forwarding/Relaying

Wireshark remote capture feature

NetworkMiner Professional: Pcap-over-IP

tcpdump -nni eth0 -s0 -w - | nc 192.168.1.254 33246

SSL/Encryption: ssh, socat, ncat, cryptcat, stunnel

Netfilter's iptables:

iptables -t mangle -A PREROUTING -p tcp -m multiport --dports 80,443,22,20,21 -i eth0 -j TEE --gateway 192.168.1.254

iptables -t mangle -A POSTROUTING -p tcp -m multiport --dports 80,443,22,20,21 -o eth0 -j TEE --gateway 192.168.1.254

OpenBSD's PF:

pass out on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }

pass in on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }
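The tcpdump-into-nc relay above needs something listening on the monitoring box. A minimal sketch of such a receiver, assuming plain unencrypted TCP on port 33246 as in the example; the function names and the output file name are made up for illustration:

```python
import socket


def make_listener(host: str = "0.0.0.0", port: int = 33246) -> socket.socket:
    """Create a TCP listening socket for the sender to connect to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def receive_stream(listener: socket.socket, out_path: str) -> int:
    """Accept one connection and write every byte it sends to out_path.

    Returns the number of bytes written. The data is stored untouched, so
    the file is a valid pcap only if the sender streamed one.
    """
    conn, _addr = listener.accept()
    total = 0
    with conn, open(out_path, "wb") as out:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # empty read: the sender closed the connection
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

Usage would be something like receive_stream(make_listener(), "remote.pcap") on 192.168.1.254, then starting the tcpdump pipeline on the sensor. For anything crossing an untrusted network, wrap it in one of the encrypted tunnels listed above instead.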

Architecture

High Speed Packet Capture

High-end equipment is expensive

DIY: tuning and compiling

Hardware is pretty fast nowadays but...

We are using software that isn't designed for efficient packet capture

NICs

Get a quality card

NAPI is good

DMA is good

Intel PRO/1000 MT Gigabit models are generally good, $30 on eBay

PCI buses

(bus speed in MHz) * (bus width in bits) / 8 = speed in Megabytes/second

PCI 66 MHz * 32 bit / 8 = 264 MB/s
PCI-X 66 MHz * 64 bit / 8 = 420 MB/s (minus 20% overhead)
PCI-X 133 MHz * 64 bit / 8 = 850 MB/s (minus 20% overhead)
PCI-X 266 MHz * 64 bit / 8 = 1700 MB/s (minus 20% overhead)
PCI-X 533 MHz * 64 bit / 8 = 3400 MB/s (minus 20% overhead)
PCIe v1 x1 2500 MHz * 1 1-bit lane / 8 = 250 MB/s (minus 20% overhead)
PCIe v2 x1 5000 MHz * 1 1-bit lane / 8 = 500 MB/s (minus 20% overhead)
PCIe v2 x2 5000 MHz * 2 1-bit lanes / 8 = 1000 MB/s (minus 20% overhead)
PCIe v2 x4 5000 MHz * 4 1-bit lanes / 8 = 2000 MB/s (minus 20% overhead)
PCIe v2 x8 5000 MHz * 8 1-bit lanes / 8 = 4000 MB/s (minus 20% overhead)
PCIe v2 x16 5000 MHz * 16 1-bit lanes / 8 = 8000 MB/s (minus 20% overhead)
PCIe v2 x32 5000 MHz * 32 1-bit lanes / 8 = 16000 MB/s (minus 20% overhead)
PCIe v3 x32 8000 MHz * 32 1-bit lanes / 8 = 31500 MB/s (minus 1.5% overhead)

1000/8 = 125 Megabytes/second

10000/8 = 1250 Megabytes/second
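The bus formula and the line-rate figures can be put side by side to see how many capture ports a slot can feed. A rough sketch, assuming the overhead is a flat fraction taken off the raw figure; the function names are illustrative only:

```python
def bus_mb_per_s(clock_mhz: float, width_bits: int, overhead: float = 0.0) -> float:
    """(bus speed in MHz) * (bus width in bits) / 8, minus a flat overhead fraction."""
    return clock_mhz * width_bits / 8 * (1 - overhead)


def line_rate_mb_per_s(link_mbit_per_s: float) -> float:
    """Ethernet line rate in megabytes per second (one direction)."""
    return link_mbit_per_s / 8


def ports_at_line_rate(bus_mb: float, link_mbit_per_s: float) -> int:
    """How many receive-only ports the bus could carry at full line rate."""
    return int(bus_mb // line_rate_mb_per_s(link_mbit_per_s))


pci = bus_mb_per_s(66, 32)              # plain 32-bit/66 MHz PCI
pcie_v2_x4 = bus_mb_per_s(5000, 4, 0.20)

print(ports_at_line_rate(pci, 1000))          # gigabit ports on plain PCI
print(ports_at_line_rate(pcie_v2_x4, 10000))  # 10GigE ports on PCIe v2 x4
```

Real throughput also depends on what else shares the bus, so treat these as upper bounds.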

Other things

Decent commodity CPU, e.g. Opteron whoops Xeon in capture

SMP is good

If you plan on storing the data, writing to disk will be a bottleneck

RAID Striping, SATA? for sure SSD (maybe ?) nah

Typical Frame Processing

Frame reaches NIC

Ethernet preamble is removed

FCS is calculated; if bad, the frame is dropped

If interface is set in promiscuous mode, capture all

Else, only process when dst MAC is me (unicast), or broadcast, or multicast (if on)

FIFO to kernel ring buffer, CPU or DMA

NIC generates an interrupt, interrupt handler is called

Passed to host stack → ip_input module → tcp/udp module → userspace
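The accept/ignore decision in the middle of that list can be sketched in a few lines. This is only an illustration of the logic, not driver code: it assumes the FCS check already passed, and it models "multicast (if on)" as a set of subscribed group addresses, which is an assumption of this sketch.

```python
BROADCAST = bytes.fromhex("ffffffffffff")


def nic_accepts(dst_mac: bytes, my_mac: bytes, promiscuous: bool = False,
                multicast_groups: frozenset = frozenset()) -> bool:
    """Decide whether a frame with this destination MAC is passed up."""
    if promiscuous:
        return True                      # capture everything on the wire
    if dst_mac == my_mac:
        return True                      # unicast addressed to this NIC
    if dst_mac == BROADCAST:
        return True                      # broadcast
    # Group (multicast) addresses have the lowest bit of the first octet set.
    is_multicast = bool(dst_mac[0] & 0x01)
    return is_multicast and dst_mac in multicast_groups
```

This is why a sniffer on a non-promiscuous interface only ever sees its own traffic plus broadcast and subscribed multicast.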

Frame Processing

Specimen

FreeBSD 8.2-RELEASE

Ubuntu Server 10.04

mbuf kernel structure

FreeBSD - data and headers are stored in mbufs and mbuf clusters

$ netstat -m | head -n 3
82/653/735 mbufs in use (current/cache/total)
0/648/648/25600 mbuf clusters in use (current/cache/total/max)
0/256 mbuf+clusters out of packet secondary zone in use (current/cache)

sysctl kern.ipc.nmbclusters=25600 (default)

man mbuf: The total size of an mbuf, MSIZE, is a constant defined in <sys/param.h>.

$ grep -H -n MSIZE /sys/sys/param.h
sys/sys/param.h:145:#define MSIZE 256 /* size of an mbuf */

$ vmstat -z | grep mbuf_cluster
mbuf_cluster: 2048, 25600
              ^size^ ^limit^

sk_buff kernel structure

Linux - data and headers are stored in sk_buffs

/usr/include/linux/skbuff.h

Problems

Each packet generates an interrupt; this can lead to receive livelock/interrupt storm

Context switches

System Calls

Solutions

Device Polling

NAPI

Shared memory, mmap(), and Zero Copy

Bypassing host stack

Solutions, less so

Checksum offloading

Large Receive Offload (LRO)

Larger on-board memory size

More data descriptors

Capture Mechanisms/Subsystems

Berkeley Packet Filter (BPF): filters packets before they get to user space

Linux Socket Filter (LSF): extended BPF (kinda)

and PF_RING (Linux)

Others: CSPF, NDIS, xPF, MPF, DPF, Swift and so on...

libpcap

C library for packet capture

Runs on almost all the modern Unices

WinPcap for Windows

When data reaches user space, it's stored in the libpcap buffer, applications read from it

Provides link layer access to data available on the network through interfaces attached to the system.

FreeBSD Frame Processing

FreeBSD Processing cont.

3 copies due to double buffer

Deals with smaller buffers compared to Linux

Half of the double buffer is copied to user space

Packet is passed to each BPF device, /dev/bpf[0-9] (where application via libpcap binds to)

App reads from HOLD buffer, data is copied from the STORE buffer into the HOLD buffer
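The STORE/HOLD hand-off is easier to follow as a toy model. The sketch below is not the kernel's code: it models the hand-off as a buffer swap rather than a byte copy, counts buffer space in bytes with no per-packet headers, and the class and method names are invented for illustration.

```python
from typing import List, Optional


class DoubleBuffer:
    """Toy model of a BPF device's STORE and HOLD buffers."""

    def __init__(self, bufsize: int):
        self.bufsize = bufsize
        self.store: List[bytes] = []               # kernel side, being filled
        self.store_used = 0
        self.hold: Optional[List[bytes]] = None    # reader side, None if empty
        self.dropped = 0

    def _rotate(self) -> None:
        # Hand the filled STORE buffer to the reader and start a fresh one.
        self.hold, self.store, self.store_used = self.store, [], 0

    def deliver(self, packet: bytes) -> bool:
        """Kernel side: queue a packet, or drop it if there is no room."""
        if len(packet) > self.bufsize:
            self.dropped += 1
            return False
        if self.store_used + len(packet) > self.bufsize:
            if self.hold is not None:   # reader has not drained HOLD yet
                self.dropped += 1
                return False
            self._rotate()
        self.store.append(packet)
        self.store_used += len(packet)
        return True

    def read(self) -> List[bytes]:
        """Application side: take a whole buffer of packets at once."""
        if self.hold is None:
            if not self.store:
                return []
            self._rotate()
        packets, self.hold = self.hold, None
        return packets
```

The model shows where kernel drops come from: both halves are full because the application is not reading fast enough, which is what a larger buffer buys time against.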

Linux Frame Processing

Linux Processing cont.

2 copies

Deals with larger buffers compared to FreeBSD

Smart queue, pointers

Packets copied individually, not whole buffers full of packets

If packets are available, wake up user space (libpcap) to grab data from LSF

Tuning: Interrupt Livelock

Interrupt usage high?

Most modern Linux kernels are compiled with device polling

FreeBSD does not have it on by default:

options DEVICE_POLLING
options HZ=1000
make buildkernel KERNCONF=NEWKERN
make installkernel KERNCONF=NEWKERN
ifconfig em0 polling

Get a New API (NAPI) card

Tuning: Buffers

Kernel dropping lots of packets?

Increase the size of your kernel buffers

FreeBSD:
sysctl net.bpf.bufsize=4096
sysctl net.bpf.maxbufsize=524288

Linux:
sysctl net.core.rmem_default=114688
sysctl net.core.rmem_max=131071
sysctl net.core.netdev_max_backlog=1000

Increase kernel virtual memory size
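To get a feel for what a buffer size buys, it helps to convert it into time at line rate. A back-of-the-envelope sketch, assuming the buffer holds raw packet bytes with no per-packet bookkeeping (real buffers hold somewhat less); the function name is illustrative:

```python
def buffer_fill_time_ms(buffer_bytes: int, rate_mbit_per_s: float) -> float:
    """Milliseconds of traffic at the given rate that fit in the buffer."""
    bytes_per_ms = rate_mbit_per_s * 1_000_000 / 8 / 1000
    return buffer_bytes / bytes_per_ms


for size in (4096, 524288):
    print(size, "bytes at 1 Gbit/s:", round(buffer_fill_time_ms(size, 1000), 3), "ms")
```

If the reading application stalls for longer than that, the kernel has nowhere to put new packets and starts dropping.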

Tuning: Drivers

Bad NIC performance?

FreeBSD: man driver, e.g. man em:

hw.em.rxd: Number of receive descriptors allocated by the driver. The default value is 256. The 82542 and 82543-based adapters can handle up to 256 descriptors, while others can have up to 4096.

echo hw.em.rxd=4096 >> /boot/loader.conf

Linux: ethtool, find driver README file (/usr/src/linux/)

ethtool -g eth0
ethtool -G eth0 rx 4096

tcpdump tests, average

6,000,000 packets in 60 seconds using iperf; packet loss with OS defaults

Hardware: Dell PowerEdge 2850, Xeon (Quad), 4GB RAM

tcpdump -nni em0 -w test96.pcap | FreeBSD: 0%, Linux: 8%
tcpdump -nni em0 -w /dev/null | FreeBSD: 0%, Linux: 0%
tcpdump -nni em0 -s0 -w test65535.pcap | FreeBSD: 1.6%, Linux: 22%
tcpdump -nni em0 -s0 -w /dev/null | FreeBSD: 0%, Linux: .02%
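For scale, the percentages translate into packet rates and absolute losses as follows. This only restates the numbers above; the helper names are illustrative.

```python
def packets_per_second(total_packets: int, seconds: float) -> float:
    """Average packet rate over the test run."""
    return total_packets / seconds


def packets_lost(total_packets: int, loss_percent: float) -> int:
    """Absolute number of packets behind a loss percentage."""
    return round(total_packets * loss_percent / 100)


TOTAL = 6_000_000
print(packets_per_second(TOTAL, 60))   # average rate during the test
print(packets_lost(TOTAL, 22))         # Linux, full snaplen, writing to disk
print(packets_lost(TOTAL, 1.6))        # FreeBSD, full snaplen, writing to disk
```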

libpcap buffers

libpcap initializes its buffer to 32 KB if the BPF value is less than 32 KB:

if ((ioctl(fd, BIOCGBLEN, (caddr_t)&v) < 0) || v < 32768)
    v = 32768;

Linux initializes its buffer size at 512 KB

Increase BPF buffer size globally, all apps, remember? net.bpf.bufsize, net.bpf.maxbufsize

Libpcap will initialize its buffer to size in net.bpf.bufsize

Set buffer for tcpdump only: use -B 512 (512 KB; -B takes its size in KiB)

FreeBSD, interface drop counts

$ netstat -dI em0
Name Mtu  Network  Address           Ipkts   Ierrs Idrop Opkts   Oerrs Coll Drop
em0  1500 <Link#2> 00:02:b3:9a:c2:03 2083316 0     0     1043607 0     0    0

$ netstat -B
Pid   Netif Flags   Recv    Drop Match   Sblen Hblen Command
90460 em0   p--s--- 103     0    103     632   0     tcpdump
43960 em0   p--s--- 3803363 0    3803363 712   0     ntop

$ sysctl dev.em.0.dropped
dev.em.0.dropped: 0

$ grep -R -H -n if_iqdrops /usr/src/
sys/dev/e1000/if_lem.c:3470:    ifp->if_iqdrops++;
usr.bin/netstat/if.c:289:       idrops = ifnet.if_iqdrops    (netstat)

Linux, interface drop counts

$ ifconfig -a | egrep -e "(^eth|drop)"

$ ethtool -S eth0

$ awk '{ print $1, $5 }' /proc/net/dev

Inter-|
face drop
lo: 0
br0: 3354
eth0: 0
eth1: 0
eth2: 0
eth3: 14
eth4: 0
eth5: 103395

From the ifconfig source:

static int get_dev_fields(char *bp, struct interface *ife)
{
    switch (procnetdev_vsn) {
    case 3:
        sscanf(bp, "%llu %llu %lu %lu %lu %lu %lu",
               &ife->stats.rx_bytes,
               &ife->stats.rx_packets,
               &ife->stats.rx_errors,
               &ife->stats.rx_dropped,
               ...
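The same receive-drop column can be pulled out of /proc/net/dev with a short script. A minimal sketch, assuming the usual two header lines followed by one "name: counters" line per interface; splitting on the colon first avoids the field shift the awk one-liner suffers when a counter is glued to the interface name. Function names are illustrative.

```python
def rx_drops(proc_net_dev_text: str) -> dict:
    """Map interface name -> receive drop counter from /proc/net/dev text."""
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:   # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 4:
            continue
        # Receive columns are: bytes packets errs drop ...
        drops[name.strip()] = int(fields[3])
    return drops


def read_rx_drops(path: str = "/proc/net/dev") -> dict:
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return rx_drops(f.read())
```

Sampling read_rx_drops() before and after a capture run shows whether the interface itself lost packets, separately from what tcpdump reports.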

tcpdump/libpcap drops

“Packets captured” – Packets processed by tcpdump

“Received by filter” – Passed the filter (LSF, BPF)

“Dropped by kernel” – Not enough space in kernel buffer

FreeBSD (kernel drops):

libpcap gets its drop count from the kernel (BPF)

ps_drop from pcap_stats() is bs_drop from BIOCGSTATS

Linux (kernel drops):

libpcap gets its drop count from PF_PACKET’s PACKET_STATISTICS

ps_drop from pcap_stats()

ps_ifdrop – Ubuntu addendum/patch (Linux, Tru64 Unix only) from /proc/net/dev
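When running many test captures it is handy to scrape the three counters tcpdump prints on exit. A small sketch, assuming the summary lines read "N packets captured", "N packets received by filter" and "N packets dropped by kernel"; computing the percentage against "received by filter" is a choice made here, and what that counter includes differs between operating systems.

```python
import re

_PATTERNS = {
    "captured": re.compile(r"(\d+) packets? captured"),
    "received": re.compile(r"(\d+) packets? received by filter"),
    "dropped": re.compile(r"(\d+) packets? dropped by kernel"),
}


def parse_summary(text: str) -> dict:
    """Extract the exit counters from tcpdump's stderr output."""
    stats = {}
    for key, pattern in _PATTERNS.items():
        match = pattern.search(text)
        if match:
            stats[key] = int(match.group(1))
    return stats


def kernel_drop_percent(stats: dict) -> float:
    """Kernel drops as a percentage of packets received by the filter."""
    received = stats.get("received", 0)
    if received == 0:
        return 0.0
    return stats.get("dropped", 0) / received * 100
```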

PF_RING for Linux

Creates new socket called PF_RING

Works with existing PF_PACKET apps

Shared memory

Can bypass host stack, sniffing only

PF_RING aware drivers for faster capture: e1000, igb, ixgbe

PF_RING for Linux

Compile PF_RING

Compile PF_RING aware libpcap and tcpdump

Load PF_RING kernel module:

modprobe pf_ring transparent_mode=2 enable_debug=0 enable_tx_capture=0 enable_ip_defrag=0 quick_mode=0

Recompile all apps to use new shared libraries, libpcap and PF_RING

./configure CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib -lpfring -lpcap" && make && make install

PF_RING DNA

Direct NIC Access, pure speed

Map NIC memory and registers to user land

Packet copy from the NIC to the DMA ring is done by the NIC's NPU

One application at a time can use the DMA ring

Requires DNA driver

PF_RING TNAPI

Threaded NAPI

vPF_RING

Virtual PF_RING

Hypervisor bypass

Zero-Copy

netmap FreeBSD

mmap() shared memory

Uses fewer system calls

Creates new device, /dev/netmap

A 1 GHz CPU can generate the 14.88 Mpps needed to saturate a 10GigE interface

supports ixgbe, e1000, re
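The packets-per-second figure for a saturated 10GigE link follows from minimum-size Ethernet frames: 64 bytes of frame plus 8 bytes of preamble/SFD and a 12-byte inter-frame gap on the wire. A short worked sketch; the function names are illustrative:

```python
PREAMBLE_SFD_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def max_frames_per_second(link_bits_per_s: int, frame_bytes: int = 64) -> float:
    """Upper bound on frames per second for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bits_per_s / wire_bits


def cycles_per_packet(cpu_hz: float, link_bits_per_s: int) -> float:
    """CPU cycles available per minimum-size packet at line rate."""
    return cpu_hz / max_frames_per_second(link_bits_per_s)


print(int(max_frames_per_second(10_000_000_000)))             # frames/s on 10GigE
print(round(cycles_per_packet(1e9, 10_000_000_000), 1))       # budget at 1 GHz
```

A budget of a few dozen cycles per packet is why per-packet system calls and copies have to go.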

others to check out

Ringmap – FreeBSD – code.google.com/p/ringmap/

Zero-copy sockets – FreeBSD: man zero_copy

Requires specific NICs

Recompile kernel with “options ZERO_COPY_SOCKETS”

The zero copy send and zero copy receive code can be individually turned off via the kern.ipc.zero_copy.send and kern.ipc.zero_copy.receive sysctl variables respectively.

MMAP() libpcap – Linux - http://public.lanl.gov/cpw/

Interface Configuration

Linux: /etc/network/interfaces

auto eth0
iface eth0 inet manual
    up ifconfig eth0 0.0.0.0 -arp up
    up ip link set eth0 promisc on
    up ip link set eth0 multicast on
    up ip link set eth0 mtu 1514
    down ip link set eth0 promisc off
    down ifconfig eth0 down

auto eth1
iface eth1 inet manual
    up ifconfig eth1 0.0.0.0 -arp up
    up ip link set eth1 promisc on
    up ip link set eth1 multicast on
    up ip link set eth1 mtu 1514
    down ip link set eth1 promisc off
    down ifconfig eth1 down

FreeBSD: /etc/rc.conf

ifconfig_em0="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"
ifconfig_em1="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"

Bridging two interfaces (Linux)

brctl addbr br0
brctl addif br0 eth0 eth1
ifconfig br0 up

Useful Applications

snort, ntop, tcpdump, iftop, trafshow, wireshark, tshark, tcpick, tcpflow, etherape, ngrep, tcptrack, suricata, bro-ids, ttt, xplico, ifstat, iptraf, bmon, bwm-ng, slurm, dsniff, p0f, tcptrace, tcpreplay, ipsumdump, speedometer

ntop

ntop -d -L -u ntop --access-log-file=/var/log/ntop/access.log -b -C --output-packet-path=/var/log/ntop-suspicious.log --local-subnets 192.168.1.0/24,192.168.2.0/24,192.168.3.0/24 -o -M -p /etc/ntop/protocol.list -i br0,eth0,eth1,eth2,eth3,eth4,eth5 -o /var/log/ntop

netsniff-ng

Linux, libpcap independent, zero-copy mechanism

Kernel compiled with CONFIG_PACKET_MMAP

Daemonlogger

Packet Logger & Soft Tap

This is a libpcap-based program.  It has two runtime modes:

1) It sniffs packets and spools them straight to the disk and can daemonize itself for background packet logging.

2) It sniffs packets and rewrites them to a second interface, essentially acting as a soft tap. It can also do this in daemon mode.

etherape

iftop

IPTraf

Trafshow

tcpick

tcpstat

speedometer

bmon

Contact

Questions, comments, criticism: jonschipp@gmail.com

More info:

sickbits.networklabs.org/other/packetcapt

dclinux.org
