1. 1 Andrew Vagin, Developer, Linux Kernel team. OpenVZ and Linux
Kernel Testing
2. 2 Agenda: Linux containers and OpenVZ; Ideal test lab; Testing
techniques; Performance testing; Anecdotes
3. 3 Andrew Morton: "I'm curious. For the past few months,
[email protected] have discovered (and fixed) an ongoing stream of
obscure but serious and quite long-standing bugs. How are you
discovering these bugs?" Andrew added later: "hm, OK, I was
visualizing some mysterious Russian bugfinding machine or
something. Don't stop ;)" David Miller: "This issue has existed
since the very creation of the netlink code :-)"
4. 4 Linux Containers (LXC): Many isolated environments on top
of a single kernel; Namespaces; Resource accounting. OpenVZ
Containers: Better resource accounting; Checkpointing and live
migration; Extra features: CPU limits, NFS inside CTs, etc.
5. 5 What makes a good test lab? Fully automated system with
deployment service; A web interface for test scheduling; Standard
test sets (combo #3, make it large); A web interface for test
results (comparisons, graphs, logs); Integration with a bug tracking
system; Net or serial console to collect kernel oopses; KVM, power
switch, other goodies
6. 6 How do we find bugs in the mainstream kernel? Containers
help us find more bugs: independent life cycles; precise resource
accounting. Containers allow us to: test initialization/finalization
of kernel subsystems; test error paths; catch more leaks than
regular testing does; catch more race conditions by means of stress
testing
7. 7 Start/stop test: Massive parallel start/stop and
suspend/resume; Random resource parameters. Helps to catch race
conditions and memory leaks, and to test error paths
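A minimal sketch of such a start/stop loop, assuming the containers are managed with `vzctl` and that the given CT IDs already exist. The `kmemsize` range, the worker count, and the `run` callback are hypothetical choices made for illustration; this is not the harness used by the OpenVZ team. By default the commands are only printed, not executed.

```python
import random
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dry_run(cmd):
    """Default runner: print the command instead of executing it."""
    print(" ".join(cmd))
    return 0


def real_run(cmd):
    """Runner that actually executes the command (needs root and vzctl)."""
    return subprocess.call(cmd)


def cycle(ctid, rng, run=dry_run, iterations=3):
    """Repeatedly reconfigure, start and stop one container."""
    failures = 0
    for _ in range(iterations):
        # Random barrier/limit pair; the range is an arbitrary assumption.
        barrier = rng.randint(2_000_000, 20_000_000)
        limit = barrier + rng.randint(0, 1_000_000)
        steps = [
            ["vzctl", "set", str(ctid), "--kmemsize", f"{barrier}:{limit}", "--save"],
            ["vzctl", "start", str(ctid)],
            ["vzctl", "stop", str(ctid)],
        ]
        for cmd in steps:
            if run(cmd) != 0:
                failures += 1  # a tight limit may legitimately make a step fail
    return failures


def stress(ctids, seed=0, run=dry_run, workers=8, iterations=3):
    """Run start/stop cycles for all containers in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(cycle, ctid, random.Random(seed + ctid), run, iterations)
            for ctid in ctids
        ]
        return sum(f.result() for f in futures)
```

Calling `stress(range(101, 111))` prints the commands for ten containers; passing `run=real_run` executes them. The interesting output is not the failure count itself but what the kernel logs (oopses, leaked objects) while the cycles run.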
8. 8 What makes a good performance test? Effective load: atomic
(UnixBench) or complex (LAMP, SPEC-JBB, vConsolidate); Sane test
environment (no random cron jobs etc.); Automation (minimize human
interaction); Reproducible results, minimize variability; Understand
test results, even good ones
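One way to make "reproducible results" concrete is to repeat each benchmark several times and reject the series if the spread is too large. A minimal sketch, assuming the scores are plain numbers and that a 5% coefficient of variation is an acceptable threshold (the function name and the threshold are illustrative assumptions, not taken from the slides):

```python
import statistics


def is_reproducible(scores, max_cv=0.05):
    """Return (ok, cv), where cv is the coefficient of variation of the runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variability")
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("mean score is zero; CV is undefined")
    # Sample standard deviation relative to the mean.
    cv = statistics.stdev(scores) / abs(mean)
    return cv <= max_cv, cv
```

A series that fails this check is a reason to look at the test environment before drawing any conclusion from the numbers.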
9. 12 Density testing: High density is an important feature of
OpenVZ (vs VMs). The test measures response time on a number of CTs,
increasing the number of CTs until the response time becomes
unacceptable. It's not a stress test. It produces a big resource
overcommit
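The density loop can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `measure(n)` function (hypothetical) that brings up `n` containers, runs the workload, and returns the response time; the step size and the cap are arbitrary.

```python
def find_max_density(measure, max_response, start=1, step=1, hard_cap=10_000):
    """Increase the number of containers until response time exceeds the limit.

    Returns the largest tested n whose response time was still acceptable,
    or 0 if even the first measurement was too slow.
    """
    best = 0
    n = start
    while n <= hard_cap:
        if measure(n) > max_response:
            break
        best = n
        n += step
    return best
```

The returned number is only meaningful relative to the chosen workload and response-time limit, so both should be recorded with the result.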
10. 13 Other useful tests: Week load test replays real httpd
logs in real containers; Feature tests: isolation, CPU scheduler,
checkpointing, network virtualization, second level quota, etc.;
Third-party tests: LTP, Connectathon, vSpecJBB, vConsolidate,
UnixBench, sysbench, DVD-store, Netperf
11. 14 Real life stories
12. 15 (1) How a Russian bug finding machine works: QA found a
leak of 78 bytes of kernel memory; The developer was unable to
reproduce the bug; He found that it was a leak of a 'struct user'
object; He audited the kernel code that references this object;
Found one suspicious place; Wrote demo code to trigger the bug, and
a fix; ... PROFIT!
13. 16 (2) How resource controls prevented a DoS attack

uid / resource   held     maxheld   barrier   limit     failcnt
numothersocks    9        360       360       360       1

uid / resource   held     maxheld   barrier   limit     failcnt
kmemsize         1237973  14372344  14372700  14790164  80
numothersocks    9        360       360       360       1

A simple kernel attack using socketpair(), a.k.a. CVE-2010-4249
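The nonzero `failcnt` values in the counters above are what show that a limit was hit. A minimal sketch of scanning such rows for failures, assuming whitespace-separated rows in the column order shown on the slide (resource, held, maxheld, barrier, limit, failcnt). The function name is illustrative, and the parser does not handle the full `/proc/user_beancounters` layout (version line, uid prefix).

```python
def failed_resources(rows):
    """Return {resource: failcnt} for rows whose fail counter is nonzero."""
    failed = {}
    for line in rows.splitlines():
        fields = line.split()
        # Expect exactly: resource held maxheld barrier limit failcnt
        if len(fields) != 6 or not fields[-1].isdigit():
            continue  # skip headers and anything that is not a data row
        resource, failcnt = fields[0], int(fields[5])
        if failcnt > 0:
            failed[resource] = failcnt
    return failed


sample = """\
resource       held  maxheld  barrier    limit failcnt
kmemsize    1237973 14372344 14372700 14790164      80
numothersocks     9      360      360      360       1
"""
print(failed_resources(sample))
```

With the slide's data this reports both `kmemsize` and `numothersocks`, the two limits that stopped the attack from consuming more kernel memory.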
14. 18 (3) How a guy measured netns performance: It was a nice
sunny day...; 5 different configurations to test; Unpredictable,
random results; CPU throttling caused by overheating; adding a case
fan helped!
15. 20 Conclusion: Containers are good for kernel testing;
Resource limits (cgroups) are also helpful; [Most] performance tests
are a hoax