Slides used for presentation during MOSC Q4 Meetup 2015
$ whoami
Engku Ahmad Faisal
⇛ github.com/efaisal
⇛ twitter.com/efaisal
⇛ facebook.com/eafaisal
⇛ plus.google.com/u/0/+EAFaisal
Linux user since 1996/1997
Attempted to contribute to open source projects: few accepted, most rejected ;-P
$ whoami
Worked with Nexo Prima Sdn Bhd
● Open Source Cloud Infrastructure
  ○ Virtualisation: oVirt/OpenStack
  ○ Storage: Gluster/Ceph
● High Availability & Scalability Infrastructure
  ○ Linux-based solutions
● System Performance Tuning & Profiling
  ○ Focusing on web-based applications on the Linux platform
TCP STATE MACHINE
TCP :: ACTIVE CLOSE
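[State diagram: active close]
ESTABLISHED (after 3-way handshake) → FIN_WAIT_1 on close()/fin
FIN_WAIT_1 → FIN_WAIT_2 on ack/-
FIN_WAIT_1 → CLOSING on fin/ack
FIN_WAIT_1 → TIME_WAIT on fin+ack/ack
FIN_WAIT_2 → TIME_WAIT on fin/ack
CLOSING → TIME_WAIT on ack/-
TIME_WAIT → CLOSED on 2MSL timeout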
TCP :: ACTIVE CLOSE
● By the initiator of close()
● TIME-WAIT & 2MSL are there for good reasons:
  ○ due to the nature of the Internet - packets get lost, are retransmitted, or arrive late
  ○ to ensure the other end has properly closed
● RFC 793 sets MSL to 2 minutes, so 2MSL is 4 minutes
● 2MSL:
  ○ MS Windows - 4 minutes
  ○ Linux - 1 minute (hard coded)
TIME-WAIT is good for TCP communication over the Internet
TCP :: PASSIVE CLOSE
[State diagram: passive close]
ESTABLISHED (after 3-way handshake) → CLOSE_WAIT on fin/ack
CLOSE_WAIT → LAST_ACK on close()/fin
LAST_ACK → CLOSED on ack/-
TCP :: PASSIVE CLOSE
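● By the side that receives the FIN, i.e. the peer of the close() initiator
● CLOSE-WAIT
  ○ lasts until the local application calls close() - no kernel timeout applies
● tcp_fin_timeout
  ○ defaults to 60 seconds in Linux
  ○ limits how long an orphaned connection stays in FIN-WAIT-2 - it does not affect CLOSE-WAIT or TIME-WAIT
● WARNING! Some resources on the Web wrongly tell their readers to tweak tcp_fin_timeout to tune TIME-WAIT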
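The two close paths can be written down as a small transition table. The following is a minimal sketch in Python, not kernel code: the table only covers the closing states shown on these slides, and the names `TRANSITIONS` and `walk` are made up for illustration.

```python
# (current state, event) -> (next state, segment sent in response or None)
# Events and replies use the same "event/reply" labels as the state diagrams.
TRANSITIONS = {
    # Active close: this side calls close() first.
    ("ESTABLISHED", "close()"): ("FIN_WAIT_1", "fin"),
    ("FIN_WAIT_1", "ack"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "fin"): ("CLOSING", "ack"),        # simultaneous close
    ("FIN_WAIT_1", "fin+ack"): ("TIME_WAIT", "ack"),
    ("FIN_WAIT_2", "fin"): ("TIME_WAIT", "ack"),
    ("CLOSING", "ack"): ("TIME_WAIT", None),
    ("TIME_WAIT", "2MSL timeout"): ("CLOSED", None),
    # Passive close: the peer's FIN arrives first.
    ("ESTABLISHED", "fin"): ("CLOSE_WAIT", "ack"),
    ("CLOSE_WAIT", "close()"): ("LAST_ACK", "fin"),
    ("LAST_ACK", "ack"): ("CLOSED", None),
}


def walk(state, events):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state, _sent = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    print(walk("ESTABLISHED", ["close()", "ack", "fin", "2MSL timeout"]))
    print(walk("ESTABLISHED", ["fin", "close()", "ack"]))
```

Only the active closer passes through TIME_WAIT; the passive closer goes straight from LAST_ACK to CLOSED.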
WEB APPLICATION OF TODAY
SIMPLIFIED WEB APP STACK
Client
Load Balancer
Web App
Database | MQ | Cache | REST API
WEB APP STACK
● Supporting services for the Web App layer typically use TCP as the transport protocol
● Web App layer is both:
  ○ TCP server listening for connections from the client
  ○ TCP client connecting to various supporting services
● Consider a LAMP stack + memcached server
  ○ Each HTTP request creates/opens a TCP connection to memcached
  ○ At the end of the request, the connection is closed
  ○ OMG! Ephemeral connection!
  ○ If we have more supporting services (MQ, REST API, etc), there might be more open/close operations for each request
  ○ HTTP is considered ephemeral by “nature”
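The open/close-per-request pattern can be imitated on the loopback interface. This is only a sketch under assumptions: the "backend" is a throwaway counting server standing in for memcached, the payload is arbitrary, and `handle_request` and `simulate` are hypothetical names.

```python
import socket
import threading


def run_counting_server(listener, expected, accepted):
    """Accept `expected` connections, drain each one, and record it."""
    for _ in range(expected):
        conn, _addr = listener.accept()
        with conn:
            while conn.recv(1024):  # read until the client sends its FIN
                pass
        accepted.append(1)


def handle_request(backend_addr, payload):
    """One simulated HTTP request: open, use, and close a backend connection."""
    with socket.create_connection(backend_addr, timeout=5) as sock:
        sock.sendall(payload)
    # Leaving the with-block closes the socket. This side closes first,
    # so this side is the active closer and ends up in TIME_WAIT.


def simulate(requests):
    """Return how many backend connections `requests` requests created."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(16)
    accepted = []
    server = threading.Thread(
        target=run_counting_server,
        args=(listener, requests, accepted),
        daemon=True,
    )
    server.start()
    for _ in range(requests):
        handle_request(listener.getsockname(), b"get some_key\r\n")
    server.join(timeout=5)
    listener.close()
    return len(accepted)


if __name__ == "__main__":
    print("backend connections opened:", simulate(10))
```

Every simulated request costs one full connection setup and teardown, and each teardown leaves a TIME_WAIT entry on the web app side.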
IMPACT AND PROBLEMS
BUSY SERVER WITH EPHEMERAL CONNECTIONS
● Busy server, e.g. 1,000 HTTP requests/second
● Web App layer also opens TCP connections to backend services at that rate or more
● Within 1 minute, tens of thousands of connections linger in TIME_WAIT (1,000/second × 60 seconds = 60,000)
● You can check using the netstat or ss command
$ ss -nt state time-wait
$ netstat -tn | grep TIME_WAIT
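The same count can be taken programmatically. The sketch below assumes a Linux host and parses the text format of /proc/net/tcp (IPv4 only; /proc/net/tcp6 uses the same layout), where the fourth column is the hexadecimal state code and 06 means TIME_WAIT. `count_states` and the `SAMPLE` rows are illustrative, not taken from a real server.

```python
from collections import Counter
from pathlib import Path

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT_1", "05": "FIN_WAIT_2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text):
    """Count connections per TCP state in /proc/net/tcp-formatted text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as "0:"; skip the header.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        counts[TCP_STATES.get(fields[3].upper(), fields[3])] += 1
    return counts


# Made-up rows in the /proc/net/tcp layout (remote port 0x2BCB = 11211).
SAMPLE = "\n".join([
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode",
    "   0: 0100007F:2BCB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345",
    "   1: 0100007F:9C40 0100007F:2BCB 06 00000000:00000000 03:00000A5B 00000000     0        0 0",
    "   2: 0100007F:9C42 0100007F:2BCB 06 00000000:00000000 03:00000A60 00000000     0        0 0",
    "   3: 0100007F:9C44 0100007F:2BCB 01 00000000:00000000 00:00000000 00000000  1000        0 23456",
])

if __name__ == "__main__":
    proc_file = Path("/proc/net/tcp")
    text = proc_file.read_text() if proc_file.is_file() else SAMPLE
    print("TIME_WAIT connections:", count_states(text)["TIME_WAIT"])
```

On a non-Linux machine the script falls back to the sample rows, so it only demonstrates the parsing.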
PROBLEMS: CONNECTION TABLE SLOT
A connection in TIME-WAIT state holds its local port for 1 minute
Local port range is finite - a port number is a 16-bit integer
In many distros, the default range covers around 30,000 ports
Can be changed: net.ipv4.ip_local_port_range
If local port range is exhausted, any connect() results in EADDRNOTAVAIL
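A back-of-the-envelope calculation shows how quickly the range runs out. The sketch assumes the worst case where every connection goes to the same backend address and port, so each one needs its own local port, and it uses 32768 to 60999 purely as an example range; check net.ipv4.ip_local_port_range on the actual host. The function names are hypothetical.

```python
def time_wait_backlog(closes_per_second, time_wait_seconds=60):
    """Connections sitting in TIME-WAIT once the close rate is steady."""
    return closes_per_second * time_wait_seconds


def ports_in_range(low, high):
    """Number of local ports in an inclusive range."""
    return high - low + 1


def max_sustainable_rate(low, high, time_wait_seconds=60):
    """Highest close rate (per second) the range can absorb."""
    return ports_in_range(low, high) / time_wait_seconds


if __name__ == "__main__":
    low, high = 32768, 60999  # example range, an assumption
    backlog = time_wait_backlog(1000)
    available = ports_in_range(low, high)
    print("TIME-WAIT backlog at 1,000 closes/second:", backlog)
    print("local ports available:", available)
    print("range exhausted:", backlog > available)
    print("sustainable closes/second:", max_sustainable_rate(low, high))
```

Under these assumptions, 1,000 closes per second needs about twice as many ports as the example range provides, which is the point where connect() starts failing with EADDRNOTAVAIL.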
PROBLEMS: ADDITIONAL MEMORY & CPU USAGE
● Memory usage to hold socket structures
  ○ Not really significant, but annoying enough
● Additional CPU usage
  ○ Searching for a free port uses CPU
  ○ CPU cycles are wasted iteratively purging tons of TIME_WAIT connections
EXISTING & POTENTIAL SOLUTIONS
SOLUTION 1: tcp_tw_reuse
From Linux doc:
“Allow to reuse TIME-WAIT sockets for new connections when it is safe from protocol viewpoint. Default value is 0. It should not be changed without advice/request of technical experts.”
Commonly recommended to be enabled
$ echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
Depends on another kernel param being enabled: net.ipv4.tcp_timestamps
Does it really work?
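Before testing whether it works, it helps to confirm that both settings are actually on. The sketch below only inspects the two parameters named above; it assumes a Linux host, treats the value 1 of tcp_tw_reuse as "enabled" and any non-zero tcp_timestamps as "on", and says nothing about whether reuse then behaves as hoped. `tw_reuse_can_apply` and `read_value` are hypothetical helpers.

```python
def tw_reuse_can_apply(tcp_tw_reuse, tcp_timestamps):
    """True when tcp_tw_reuse is 1 and TCP timestamps are not disabled.

    Both arguments are the raw strings read from /proc/sys.
    """
    return tcp_tw_reuse.strip() == "1" and tcp_timestamps.strip() != "0"


def read_value(path):
    """Return the stripped content of a /proc/sys file, or None if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    reuse = read_value("/proc/sys/net/ipv4/tcp_tw_reuse")
    stamps = read_value("/proc/sys/net/ipv4/tcp_timestamps")
    if reuse is None or stamps is None:
        print("settings not readable on this system")
    else:
        print("tcp_tw_reuse =", reuse, "tcp_timestamps =", stamps)
        print("reuse can apply:", tw_reuse_can_apply(reuse, stamps))
```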
SOLUTION 2: TIME-WAIT NEGOTIATION
Proposed by Theodore Faber, Joe Touch & Wei Yue from the University of Southern California in 1999
No code available; the authors claim to have experimental code written for SunOS 4.1.3
Involves modifying TCP by adding a new TCP option called TW-Negotiate, negotiated during the three-way handshake
Not a viable solution, simply a theoretical one
INTRODUCING LINUXTCPTW
LINUXTCPTW
Implementation of an old idea
● Once discussed on the kernel core dev mailing list to make TIME-WAIT tunable
● Rejected by kernel core devs - TIME-WAIT is there for good reasons
● Easily abused to make TCP non-compliant with the standard
● Open source project to create a patch set to the kernel for configurable TIME-WAIT
● Introduces a new kernel param - tcp_timewait_len
● A new entry in proc fs - /proc/sys/net/ipv4/tcp_timewait_len
● Able to use sysctl for configuration - net.ipv4.tcp_timewait_len
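The sysctl name and the proc fs entry are two spellings of the same parameter, which makes it easy to detect whether a running kernel carries the patch. The following is a sketch with hypothetical helper names; it assumes the usual mapping of dots to path separators under /proc/sys and makes no assumption about the value's unit or default.

```python
from pathlib import Path


def sysctl_to_proc_path(name, root="/proc/sys"):
    """Map a sysctl name such as net.ipv4.tcp_timewait_len to its proc fs path."""
    return Path(root).joinpath(*name.split("."))


def read_param(name, root="/proc/sys"):
    """Return the parameter's raw value, or None if the kernel lacks it."""
    path = sysctl_to_proc_path(name, root)
    if not path.is_file():
        return None
    return path.read_text().strip()


if __name__ == "__main__":
    value = read_param("net.ipv4.tcp_timewait_len")
    if value is None:
        print("tcp_timewait_len not present - kernel is probably unpatched")
    else:
        print("net.ipv4.tcp_timewait_len =", value)
```

On a stock kernel the script reports the parameter as missing, since tcp_timewait_len only exists once the patch set is applied.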
THE PROJECT
Project lives at https://github.com/efaisal/linuxtcptw/
Binary release available for CentOS 6 and 7 at https://github.com/efaisal/linuxtcptw/releases
Unfortunately not battle-tested in a production environment yet - any volunteers?
Currently working on Ubuntu 14.04 LTS kernel
THANK YOU