Teuthology

Presented 2011-07-01
tommi.virtanen@dreamhost.com

image credit: http://www.flickr.com/photos/peterblapps/3250800528/


Ceph as in

Cephalopoda
Mollusca
Invertebrata

Teuthology
Malacology


Not your grandmother's software stack


We tried Autotest

... and quickly discovered its limitations

Currently at 15 independent patches, 24 files changed, 575 insertions(+), 19 deletions(-)

Realized Autotest's architecture is working against us.

We still use it for its packaged "client side" tests, but not its multi-machine features.


Multi-machine control

Python
+ Paramiko (SSH)
+ gevent
= orchestra

Real-time
Interactive
Central controller
Full SSH protocol (channels!)
Not Chef
Not Fabric

cluster = Cluster(...)
cluster.run(...)
cluster.only('x86').run(...)
cluster.exclude('x86').run(...)

http://github.com/tv42/orchestra
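
A minimal sketch of the filtering idea behind only() and exclude(), assuming a cluster is just a mapping from a remote's name to the set of roles it carries. ToyCluster and its host names are hypothetical and are not orchestra's actual classes; run() here only records what would be executed instead of opening SSH connections.

```python
class ToyCluster:
    """Hypothetical stand-in for a cluster: remote name -> set of roles."""

    def __init__(self, remotes):
        self.remotes = {name: set(roles) for name, roles in remotes.items()}

    def only(self, role):
        """Return a new cluster with just the remotes that carry `role`."""
        return ToyCluster({n: r for n, r in self.remotes.items() if role in r})

    def exclude(self, role):
        """Return a new cluster without the remotes that carry `role`."""
        return ToyCluster({n: r for n, r in self.remotes.items() if role not in r})

    def run(self, args):
        """Pretend to run; return one (remote, args) pair per remote, by name."""
        return [(name, list(args)) for name in sorted(self.remotes)]


cluster = ToyCluster({
    'host-a': ['x86', 'osd.0'],
    'host-b': ['x86'],
    'host-c': ['arm'],
})
x86_runs = cluster.only('x86').run(['uptime'])
other_runs = cluster.exclude('x86').run(['uptime'])
```

Because only() and exclude() return a new cluster rather than modifying the original, the calls can be chained in the style shown on the slide.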


Teuthology is a test runner

Run tasks on targets as told to by roles.

Automatically
Setup
Monitor health
Run test(s)
Archive results
Archive logs, core dumps, etc.
Clean up

http://github.com/tv42/teuthology

Read the README


Run tasks on targets as told to by roles.

targets:
- [email protected]
- [email protected]
- [email protected]

You need to have SSH working, without passphrases.

You need passphraseless sudo on the remote host.

YAML format: lists, dicts, strings, numbers.


Run tasks on targets as told to by roles.

roles:
- [mon.0, mds.0, osd.0]
- [mon.1, osd.1]
- [mon.2, client.0]


Run tasks on targets as told to by roles.

roles:
- [mon.0, mds.0, osd.0]
- [mon.1, osd.1]
- [mon.2, client.0]

targets:
- [email protected]
- [email protected]
- ubuntu@sepiaZZ...
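
The side-by-side lists suggest that roles and targets are paired by position: the first list of roles runs on the first target, and so on. A minimal sketch of that pairing, assuming the config has already been parsed into Python lists. The host names below are hypothetical placeholders, and map_roles is an illustrative helper, not a teuthology function.

```python
def map_roles(roles, targets):
    """Pair each target with its role list by position; return role -> target."""
    if len(roles) != len(targets):
        raise ValueError('need exactly one target per role list')
    role_to_target = {}
    for role_list, target in zip(roles, targets):
        for role in role_list:
            role_to_target[role] = target
    return role_to_target


config = {
    'roles': [
        ['mon.0', 'mds.0', 'osd.0'],
        ['mon.1', 'osd.1'],
        ['mon.2', 'client.0'],
    ],
    # Hypothetical host names, for illustration only.
    'targets': [
        'ubuntu@host1.example.com',
        'ubuntu@host2.example.com',
        'ubuntu@host3.example.com',
    ],
}
placement = map_roles(config['roles'], config['targets'])
```

With this mapping, a task configured for client.0 knows it has to act on the third target.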


Run tasks on targets as told to by roles.

tasks:
- ceph:
- kclient: [client.0]
- autotest:
    client.0: [dbench]


Interactive mode

tasks:
- interactive:

INFO:teuthology.run_tasks:Running task interactive...
Ceph test interactive mode, use ctx to interact with the cluster, press control-D to exit...
>>> 1+1
2
>>>


Interactive mode

>>> ctx.cluster.only('osd.0').run(args=['uptime'])
INFO:orchestra.run.out: 13:05:38 up 42 days, 23:17,  0 users,  load average: 0.12, 0.09, 0.07
[<orchestra.run.RemoteProcess object at 0x28bd110>]

One RemoteProcess per command run.


Using just one Remote first

>>> (remote,) = ctx.cluster.only('osd.0').remotes.keys()
>>> proc = remote.run(args=['echo', '*'])
INFO:orchestra.run.out:*
>>> proc
<orchestra.run.RemoteProcess ...>
>>> proc.command
"echo '*'"

Shell quoting done for you.

Works like ctx.cluster.run.

Just one RemoteProcess, not a list.
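
One way to get the quoting shown in proc.command is to quote each argument separately and join the results with spaces. This sketch uses shlex.quote from the Python 3 standard library; whether orchestra quotes in exactly this way is an assumption, and quote_args is a hypothetical helper name.

```python
import shlex


def quote_args(args):
    """Join an argument list into one shell-safe command string."""
    return ' '.join(shlex.quote(arg) for arg in args)


command = quote_args(['echo', '*'])
print(command)  # echo '*'
```

The asterisk ends up inside single quotes, so the remote shell passes it to echo literally instead of expanding it to file names.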


Failing processes

>>> remote.run(args=['bork'])
INFO:orchestra.run.err:bash: bork: command not found
...
CommandFailedError: Command failed with status 127: 'bork'

>>> proc = remote.run(args=['bork'],
...     check_status=False)
INFO:orchestra.run.err:bash: bork: command not found
>>> proc.exitstatus
127
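
The check_status behaviour can be imitated locally with subprocess. This is only an analogy under stated assumptions: run_local and this CommandFailedError are hypothetical names defined here, the command runs on the local machine through sh rather than over SSH, and a POSIX shell is assumed.

```python
import shlex
import subprocess


class CommandFailedError(Exception):
    """Raised when a command exits non-zero and check_status is True."""


def run_local(args, check_status=True):
    """Run args through a local sh; optionally raise on a non-zero exit."""
    command = ' '.join(shlex.quote(a) for a in args)
    proc = subprocess.run(['sh', '-c', command],
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if check_status and proc.returncode != 0:
        raise CommandFailedError(
            'Command failed with status %d: %r' % (proc.returncode, command))
    return proc


# With check_status=False the failure is not raised; inspect it afterwards.
proc = run_local(['bork'], check_status=False)

# With the default, the same failure surfaces as an exception.
try:
    run_local(['bork'])
    raised = False
except CommandFailedError:
    raised = True
```

Turning the check off is useful when a non-zero exit is an expected outcome that the task wants to examine itself.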


Concurrency

>>> proc = remote.run(args=['uptime'], wait=False)
>>> proc
<orchestra.run.RemoteProcess object at 0x28bd1d0>
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>


Concurrency

>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>
>>> import time; time.sleep(0)
INFO:orchestra.run.out: 13:16:48 up 42 days, 23:28,  0 users,  load average: 0.35, 0.15, 0.08
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>
>>> proc.exitstatus.get()
0
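
The AsyncResult in this session behaves much like a future: it exists immediately and get() blocks until the exit status is known. As an analogy only, the sketch below swaps in concurrent.futures from the Python 3 standard library (threads, not gevent greenlets) and a local subprocess instead of SSH; start_command is a hypothetical helper.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)


def start_command(args):
    """Start args in the background; return a future for its exit status."""
    return executor.submit(subprocess.call, args)


# Use the running Python interpreter as a portable stand-in for a command.
exitstatus = start_command([sys.executable, '-c', 'pass'])

# The caller is free to do other work here; result() blocks until exit.
status = exitstatus.result()
executor.shutdown()
```

One difference worth keeping in mind: gevent is cooperative, which is why the session needs time.sleep(0) to let the greenlet run, whereas threads make progress on their own.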


Capturing stdout/stderr

>>> from orchestra import run
>>> proc = remote.run(args=['uname', '-m'],
...     wait=False, stdout=run.PIPE)
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2dd0>
>>> proc.exitstatus.ready()    # just for debug
False
>>> proc.stdout.read()
'x86_64\n'
>>> proc.exitstatus.get()
0


Deadlocks you must avoid:
stdout vs stderr
stdout/err vs stdin
stdout/err vs exit
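
The stdout vs stderr case is the classic pipe problem: reading one pipe to the end while the process is blocked writing to the other, whose buffer is full. A minimal local sketch of the usual remedy with the Python 3 standard library, where Popen.communicate() drains both streams together and only then waits for exit. This illustrates the general problem with pipes and is not orchestra code.

```python
import subprocess
import sys

# The child writes far more to each stream than a pipe buffer usually holds.
child_code = (
    "import sys\n"
    "for _ in range(20000):\n"
    "    sys.stdout.write('o' * 10 + '\\n')\n"
    "    sys.stderr.write('e' * 10 + '\\n')\n"
)

proc = subprocess.Popen([sys.executable, '-c', child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() reads both pipes concurrently and then waits for exit, so the
# child never stalls on a full pipe while the parent waits on the other one.
out, err = proc.communicate()
```

Calling proc.stdout.read() first and proc.stderr.read() afterwards, or waiting for exit before reading anything, risks hanging on the same child.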


Using Cluster

>>> processes = ctx.cluster.run(
...     args=['uname', '-m'],
...     wait=False,
...     stdout=run.PIPE)
>>> processes
[<orchestra.run.RemoteProcess object at 0x28bdbf0>, <orchestra.run.RemoteProcess object at 0x28bdb90>, <orchestra.run.RemoteProcess object at 0x28bdad0>]
>>> [p.stdout.read() for p in processes]
['x86_64\n', 'x86_64\n', 'x86_64\n']
>>> run.wait(processes)
>>>


Controlling stdout/stderr logging

>>> import logging
>>> log = logging.getLogger(__name__)
>>> log.info('foo')
INFO:__builtin__:foo
>>> ctx.cluster.only('osd.0').run(
...     args=['uptime'],
...     logger=log.getChild('uptime'))
INFO:__builtin__.uptime.out: 13:52:49 up 43 days, 4 min,  0 users,  load average: 0.00, 0.01, 0.05
[<orchestra.run.RemoteProcess object at 0x28bdb90>]
>>>

Inside a task, the logger name usually looks like teuthology.task.foo
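
Logger names are dotted paths, and getChild appends one more component, which is where the .uptime part of the log line in the session comes from. A small standard-library sketch; foo is the placeholder task name from the slide, and the .out suffix is presumably added by orchestra for the stdout stream (an inference from the log line, not something checked against the code).

```python
import logging

log = logging.getLogger('teuthology.task.foo')
child = log.getChild('uptime')

# Presumably what the run machinery does for the stdout stream (assumption).
out_log = child.getChild('out')
print(out_log.name)
```

Giving each command its own child logger keeps the output of concurrent commands distinguishable in a shared log file.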


Tasks can be context managers

tasks:
- ceph:
- kclient: ...
- autotest: ...
- interactive:
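
One way to picture this: each task in the list is entered in order, later tasks run inside the earlier ones, and on the way out the cleanups run in reverse order. The sketch below uses contextlib from the Python 3 standard library; the task functions and the events list are hypothetical, and this is not teuthology's actual runner.

```python
import contextlib

events = []


@contextlib.contextmanager
def ceph(ctx, config):
    events.append('ceph: setup')
    try:
        yield
    finally:
        events.append('ceph: cleanup')


@contextlib.contextmanager
def kclient(ctx, config):
    events.append('kclient: mount')
    try:
        yield
    finally:
        events.append('kclient: unmount')


def autotest(ctx, config):
    """A plain task: does its work and returns, nothing to unwind."""
    events.append('autotest: run')


def run_tasks(tasks, ctx=None):
    """Enter each task in order; ExitStack unwinds them in reverse."""
    with contextlib.ExitStack() as stack:
        for func, config in tasks:
            result = func(ctx, config)
            if hasattr(result, '__enter__'):
                stack.enter_context(result)


run_tasks([
    (ceph, None),
    (kclient, ['client.0']),
    (autotest, {'client.0': ['dbench']}),
])
```

After the call, events reads setup, mount, run, unmount, cleanup, so the cluster is still up while the later tasks are working.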


/tmp/cephtest

Must not exist already, or target is dirty  (see teuthology-nuke, later)

Used by tasks to store things in

Tasks are responsible for cleaning up after themselves  (no toplevel rm -rf, to flush out the bugs)

Anything in /tmp/cephtest/archive gets archived

Please bzip2 -9 any big files your task leaves in archive
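
For tasks written in Python, the same compression can be done without shelling out. A sketch with the standard bz2 module (Python 3), where compresslevel=9 corresponds to bzip2 -9. The helper name and the throwaway file are hypothetical.

```python
import bz2
import os
import shutil
import tempfile


def bzip2_in_place(path):
    """Compress path to path + '.bz2' at level 9 and remove the original."""
    compressed_path = path + '.bz2'
    with open(path, 'rb') as src:
        with bz2.open(compressed_path, 'wb', compresslevel=9) as dst:
            shutil.copyfileobj(src, dst)
    os.remove(path)
    return compressed_path


# Demonstration on a throwaway file rather than a real archive directory.
tmpdir = tempfile.mkdtemp()
log_path = os.path.join(tmpdir, 'osd.0.log')
with open(log_path, 'wb') as f:
    f.write(b'log line\n' * 10000)
compressed = bzip2_in_place(log_path)
```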


Cleanups & failures

A cleanup can fail; further cleanups are still attempted  -> always study the first error, not the last one.

If a task fails to clean up, the targets are left "dirty".

teuthology-nuke is a Big Hammer.
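
A sketch of why the first error is the one to read: when every cleanup is attempted even after an earlier one has failed, later failures are often just consequences of the first. run_cleanups and the cleanup functions below are hypothetical and only illustrate the idea.

```python
def run_cleanups(cleanups):
    """Run every cleanup, even after failures; return the errors in order."""
    errors = []
    for cleanup in cleanups:
        try:
            cleanup()
        except Exception as exc:
            errors.append(exc)
    return errors


def unmount():
    raise RuntimeError('unmount failed: device busy')


def remove_test_dir():
    # Fails only because the unmount above did not succeed.
    raise OSError('rmdir failed: directory not empty')


def stop_daemons():
    pass


errors = run_cleanups([unmount, remove_test_dir, stop_daemons])
first_error = errors[0] if errors else None
```

Here the rmdir error is noise; fixing the unmount problem would make it disappear as well.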


Archived results

2011-06-21T10-00-44/
├── ceph-sha1
├── config.yaml
├── remote
│   ├── [email protected]
│   │   ├── log
│   │   │   ├── client.admin.log.bz2
│   │   │   ├── mds.0.log.bz2
│   │   │   ├── mon.0.log.bz2
│   │   │   └── osd.0.log.bz2
│   │   └── syslog
│   │       ├── kern.log.bz2
│   │       └── misc.log.bz2
│   ├── [email protected] ...
│   └── [email protected]
│       ├── autotest
│       │   └── ...
│       ├── log ...
│       └── syslog ...
├── summary.yaml
└── teuthology.log


gitbuilder

A low-key low-hype continuous integration tool

Builds tags and heads of branches

On a bad build, tries older commits until it finds a green one

We have it building ceph and our kernel fork

http://ceph.newdream.net/gitbuilder/
http://ceph.newdream.net/gitbuilder-i386/
http://ceph.newdream.net/gitbuilder-gcov-amd64/
http://ceph.newdream.net/gitbuilder-deb-amd64/
http://ceph.newdream.net/gitbuilder-kernel-amd64/


We made gitbuilder create tarballs

http://ceph.newdream.net/gitbuilder/output/ref/origin_master/

Index of /output/ref/origin_master/

mode  links  bytes      last-changed  name
dr-x  2      4096       Jun 29 13:58  ./
dr-x  28     12288      Jun 29 15:16  ../
-r--  1      149323650  Jun 29 13:58  ceph.x86_64.tgz
-r--  1      41         Jun 29 13:57  sha1

Don't trust the links, ProxyPass confuses the web server

Fetch .../output/ref/origin_master/sha1, then fetch .../output/sha1/SHA1_HERE/ceph.x86_64.tgz
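
The two-step fetch can be sketched as plain URL construction. The base URL and file names come from the directory listing on this slide; tarball_url is a hypothetical helper, the sha1 value is a made-up placeholder, and no network request is made here.

```python
from urllib.parse import urljoin

BASE = 'http://ceph.newdream.net/gitbuilder/output/'

# Step 1: the branch directory holds a small file with the commit's sha1.
sha1_url = urljoin(BASE, 'ref/origin_master/sha1')


def tarball_url(sha1):
    """Step 2: build the tarball URL from the sha1 fetched in step 1."""
    # The file is 41 bytes in the listing: 40 hex digits plus a newline.
    return urljoin(BASE, 'sha1/{sha1}/ceph.x86_64.tgz'.format(sha1=sha1.strip()))


# Placeholder value; a real one would be the contents fetched from sha1_url.
example = tarball_url('0123456789abcdef0123456789abcdef01234567\n')
```

Building both URLs from the base, instead of following the links in the index page, sidesteps the ProxyPass problem mentioned above.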


Future and topics not covered

teuthology-suite
nightly runs
machine allocation
gcov
flavors
custom ceph builds
installing custom kernels
failure testing
monitor health


Thank You
Questions?

tommi.virtanen@dreamhost.com