
Infrastructure as code might be literally impossible part 2


Page 1: Infrastructure as code might be literally impossible part 2

infrastructure as code might be literally impossible

part 2

joe damato packagecloud.io

Page 2: Infrastructure as code might be literally impossible part 2

part 1

bit.ly/impossible-infra

Page 3: Infrastructure as code might be literally impossible part 2

hi, i’m joe

i like computers

i once had a blog called timetobleed.com

@joedamato

Page 4: Infrastructure as code might be literally impossible part 2

packagecloud.io

@packagecloudio

Page 5: Infrastructure as code might be literally impossible part 2

follow along

blog.packagecloud.io

Page 6: Infrastructure as code might be literally impossible part 2

hi

Page 7: Infrastructure as code might be literally impossible part 2

disclaimer

Page 8: Infrastructure as code might be literally impossible part 2
Page 9: Infrastructure as code might be literally impossible part 2
Page 10: Infrastructure as code might be literally impossible part 2
Page 11: Infrastructure as code might be literally impossible part 2
Page 12: Infrastructure as code might be literally impossible part 2

infrastructure as code might be impossible because nothing works.

Page 13: Infrastructure as code might be literally impossible part 2
Page 14: Infrastructure as code might be literally impossible part 2
Page 15: Infrastructure as code might be literally impossible part 2

cognitive load

Page 16: Infrastructure as code might be literally impossible part 2

too much stuff

Page 17: Infrastructure as code might be literally impossible part 2

coping strategies

Page 18: Infrastructure as code might be literally impossible part 2

coping w cognitive load

copy & paste configs

stackoverflow

Page 19: Infrastructure as code might be literally impossible part 2

BTW

This is actually part of another talk I’m working on called

Programmers should get paid more & work less

Page 20: Infrastructure as code might be literally impossible part 2

anw

Page 21: Infrastructure as code might be literally impossible part 2

the problem is so pronounced, that in some cases it’s impossible to do seemingly simple tasks

Page 22: Infrastructure as code might be literally impossible part 2

some examples then some thoughts.

Page 23: Infrastructure as code might be literally impossible part 2

Today’s cool stories

1. SSL
2. APT
3. Linux Networking
4. Linux Threading (maybe)
5. Python packaging (maybe)

Page 24: Infrastructure as code might be literally impossible part 2

SSL

Page 25: Infrastructure as code might be literally impossible part 2

SSL is important

Page 26: Infrastructure as code might be literally impossible part 2

agreed?

Page 27: Infrastructure as code might be literally impossible part 2

Ubuntu & Debian

don’t agree

Page 28: Infrastructure as code might be literally impossible part 2

SSL doesn't work on Debian

/ Ubuntu

Page 29: Infrastructure as code might be literally impossible part 2
Page 30: Infrastructure as code might be literally impossible part 2
Page 31: Infrastructure as code might be literally impossible part 2

anw

Page 32: Infrastructure as code might be literally impossible part 2
Page 33: Infrastructure as code might be literally impossible part 2

LOL gnutls, who cares?

Page 34: Infrastructure as code might be literally impossible part 2

apt-get

git

curl

ngIRCd

Page 35: Infrastructure as code might be literally impossible part 2

well, actually you should use

OpenSSL

Page 36: Infrastructure as code might be literally impossible part 2
Page 37: Infrastructure as code might be literally impossible part 2
Page 38: Infrastructure as code might be literally impossible part 2

I like rabbits.

Page 39: Infrastructure as code might be literally impossible part 2

OpenSSL says…

* 3. All advertising materials mentioning features or use of this
* software must display the following…

* 6. Redistributions of any form whatsoever must retain the following…

Page 40: Infrastructure as code might be literally impossible part 2

GPL says

6. … You may not impose any further restrictions on the recipients' exercise of the rights granted herein.

Page 41: Infrastructure as code might be literally impossible part 2

These two licenses are not compatible.

Page 42: Infrastructure as code might be literally impossible part 2

in other words

Page 43: Infrastructure as code might be literally impossible part 2

software licenses force you to use a particular SSL library with a very painful bug.

Page 44: Infrastructure as code might be literally impossible part 2

greetings

Page 45: Infrastructure as code might be literally impossible part 2

(not sayin that OpenSSL is

bug free)

Page 46: Infrastructure as code might be literally impossible part 2

(but, am sayin NSS and gnutls have less mindshare)

Page 47: Infrastructure as code might be literally impossible part 2

btw

Page 48: Infrastructure as code might be literally impossible part 2
Page 49: Infrastructure as code might be literally impossible part 2
Page 50: Infrastructure as code might be literally impossible part 2
Page 51: Infrastructure as code might be literally impossible part 2
Page 52: Infrastructure as code might be literally impossible part 2

(hi)

Page 53: Infrastructure as code might be literally impossible part 2

OK but I don’t care about SSL,

I use GPG.

Page 54: Infrastructure as code might be literally impossible part 2
Page 55: Infrastructure as code might be literally impossible part 2

NO.

plz stop.

Page 56: Infrastructure as code might be literally impossible part 2

anw

Page 57: Infrastructure as code might be literally impossible part 2

APT

Page 58: Infrastructure as code might be literally impossible part 2

file compression is important

Page 59: Infrastructure as code might be literally impossible part 2

agreed?

Page 60: Infrastructure as code might be literally impossible part 2

Ubuntu & Debian

don’t agree

Page 61: Infrastructure as code might be literally impossible part 2
Page 62: Infrastructure as code might be literally impossible part 2

(more about hash sum mismatch

later)

Page 63: Infrastructure as code might be literally impossible part 2
Page 64: Infrastructure as code might be literally impossible part 2

in other words

Page 65: Infrastructure as code might be literally impossible part 2

APT bug when decompressing XZ files makes it impossible to install software reliably

Page 66: Infrastructure as code might be literally impossible part 2

this is unfortunate due to the slow release cycle of Debian/Ubuntu updates

Page 67: Infrastructure as code might be literally impossible part 2

“SO easy, that type of work can be done over the weekend”

Page 68: Infrastructure as code might be literally impossible part 2

-o Acquire::CompressionTypes::Order::=gz

Page 69: Infrastructure as code might be literally impossible part 2

… OK … hopefully that repo has gzip’d metadata or it’s gonna be a real short trip
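
Whether forcing gz helps depends on the repo actually publishing gzip'd indexes. A minimal sketch of that check, assuming a Release file in the usual layout of one "hash size path" entry per line. The sample text and the helper name `has_gzip_packages` are made up for illustration and do not come from a real repository.

```python
def has_gzip_packages(release_text):
    """Return True if any index listed in a Release file is a Packages.gz."""
    for line in release_text.splitlines():
        parts = line.split()
        # Checksum entries look like: "<hash> <size> <path>"
        if len(parts) == 3 and parts[2].endswith("Packages.gz"):
            return True
    return False


# Hypothetical Release excerpts (placeholder hashes and sizes).
XZ_ONLY = """\
Suite: stable
SHA256:
 aaaa 1234 main/binary-amd64/Packages.xz
"""

WITH_GZ = XZ_ONLY + " bbbb 2345 main/binary-amd64/Packages.gz\n"

print(has_gzip_packages(XZ_ONLY))  # False
print(has_gzip_packages(WITH_GZ))  # True
```

If the first case is what your repo looks like, the gz override has nothing to fall back to.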

Page 70: Infrastructure as code might be literally impossible part 2

anw

Page 71: Infrastructure as code might be literally impossible part 2

hash sum mismatch

Page 72: Infrastructure as code might be literally impossible part 2

have you seen it?

Page 73: Infrastructure as code might be literally impossible part 2

do you know what it

means?

Page 74: Infrastructure as code might be literally impossible part 2

do you know why it

happens?

Page 75: Infrastructure as code might be literally impossible part 2

what it means

Page 76: Infrastructure as code might be literally impossible part 2
Page 77: Infrastructure as code might be literally impossible part 2

happens all the time…

Page 78: Infrastructure as code might be literally impossible part 2
Page 79: Infrastructure as code might be literally impossible part 2
Page 80: Infrastructure as code might be literally impossible part 2

how could that happen?

Page 81: Infrastructure as code might be literally impossible part 2

one of (at least) 3 ways

1. stale cache between client/server
2. XZ decompression bug
3. apt race condition

Page 82: Infrastructure as code might be literally impossible part 2

how to avoid each

1. better HTTP headers… or use SSL… but like gnutls ?? lol
2. don’t generate XZ archives
3. ?????? race condition ??????

Page 83: Infrastructure as code might be literally impossible part 2

APT race

Page 84: Infrastructure as code might be literally impossible part 2

how it happens

1. Download + cache Release file
2. repo owner updates repo
3. Download Packages files
4. Compare checksums from the (stale) Release file against Packages file
5. hash sum mismatch
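
The five steps can be replayed with a toy in-memory repo. This is a sketch, not how APT is implemented: `Repo`, `client_update`, and the sample package text are hypothetical names and data chosen for illustration.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).hexdigest()


class Repo:
    """Toy APT repo: one Packages file plus a Release that lists its hash."""

    def __init__(self, packages):
        self.publish(packages)

    def publish(self, packages):
        # The owner rewrites both files in place; nothing old is kept.
        self.packages = packages
        self.release = {"Packages": sha256(packages)}


def client_update(repo, owner_publishes=None):
    release = dict(repo.release)        # 1. download + cache Release
    if owner_publishes is not None:
        repo.publish(owner_publishes)   # 2. repo owner updates repo
    packages = repo.packages            # 3. download Packages
    # 4. compare against the checksum from the (possibly stale) Release
    if sha256(packages) != release["Packages"]:
        return "hash sum mismatch"      # 5.
    return "ok"


repo = Repo(b"Package: foo\nVersion: 1.0\n")
print(client_update(repo))  # ok
print(client_update(repo, owner_publishes=b"Package: foo\nVersion: 1.1\n"))  # hash sum mismatch
```

Nothing in the client is wrong here. The mismatch comes purely from the repo changing between two downloads.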

Page 85: Infrastructure as code might be literally impossible part 2

this means…

Page 86: Infrastructure as code might be literally impossible part 2

it is impossible to

1. update your repository without breaking clients

2. generate consistent mirrors of other repositories

Page 87: Infrastructure as code might be literally impossible part 2

!!!!!!this is bad!

!!!!!

Page 88: Infrastructure as code might be literally impossible part 2

but i’ve done all of these before and never had a

problem?

Page 89: Infrastructure as code might be literally impossible part 2

congrats you got lucky!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Page 90: Infrastructure as code might be literally impossible part 2

so, wait, joe, are you saying that APT metadata is inherently racy?

Page 91: Infrastructure as code might be literally impossible part 2

yes!

Page 92: Infrastructure as code might be literally impossible part 2

and ubuntu agrees

Page 93: Infrastructure as code might be literally impossible part 2
Page 94: Infrastructure as code might be literally impossible part 2

OK so APT repos and the tools you use to generate them are fundamentally racy

Page 95: Infrastructure as code might be literally impossible part 2
Page 96: Infrastructure as code might be literally impossible part 2

so now what?

Page 97: Infrastructure as code might be literally impossible part 2

Acquire-by-hash

Page 98: Infrastructure as code might be literally impossible part 2

Acquire-by-hash

• Mechanism for downloading metadata by its hash sum
• Server should keep “a few” older copies of metadata around
• Prevents the race condition from happening
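
A sketch of why fetching by hash sidesteps the race, using a toy in-memory repo. `ByHashRepo`, `client_update`, and `keep=3` are assumptions for illustration. The `by-hash/SHA256/<digest>` key mimics the path convention but this is not a real repository layout or APT's code.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).hexdigest()


class ByHashRepo:
    """Toy repo that also stores each Packages file under its hash."""

    def __init__(self, packages, keep=3):
        self.keep = keep      # "a few" older copies; 3 is an arbitrary choice
        self.by_hash = {}     # path -> content, in insertion order
        self.publish(packages)

    def publish(self, packages):
        digest = sha256(packages)
        self.by_hash[f"by-hash/SHA256/{digest}"] = packages
        while len(self.by_hash) > self.keep:
            oldest = next(iter(self.by_hash))
            del self.by_hash[oldest]
        self.release = {"Packages": digest}


def client_update(repo, owner_publishes=None):
    release = dict(repo.release)            # download + cache Release
    if owner_publishes is not None:
        repo.publish(owner_publishes)       # repo changes mid-update
    # Fetch by the hash listed in the cached Release, not by file name.
    packages = repo.by_hash[f"by-hash/SHA256/{release['Packages']}"]
    return "ok" if sha256(packages) == release["Packages"] else "hash sum mismatch"


repo = ByHashRepo(b"Package: foo\nVersion: 1.0\n")
print(client_update(repo, owner_publishes=b"Package: foo\nVersion: 1.1\n"))  # ok
```

The client ends up with a consistent (if slightly old) view, and picks up the new Release on its next update. It only breaks if the server has already pruned the copy the stale Release points at.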

Page 99: Infrastructure as code might be literally impossible part 2

Acquire-by-hash

• Added in APT 1.2.0
• Ubuntu Xenial and newer
• Debian Stretch and newer
• not supported by reprepro
• not supported by aptly

Page 100: Infrastructure as code might be literally impossible part 2

only one way to get working, consistent, not

racy APT metadata

Page 101: Infrastructure as code might be literally impossible part 2

use packagecloud.io

Page 102: Infrastructure as code might be literally impossible part 2

Linux Networking

Page 103: Infrastructure as code might be literally impossible part 2

Full networking writeup

literally 90 pages

literally everything about linux networking

literally available here: http://bit.ly/linux-networking

Page 104: Infrastructure as code might be literally impossible part 2

summary

Page 105: Infrastructure as code might be literally impossible part 2
Page 106: Infrastructure as code might be literally impossible part 2

[random os] has a better/faster/leaner/whatever networking stack

than linux

Page 107: Infrastructure as code might be literally impossible part 2
Page 108: Infrastructure as code might be literally impossible part 2

lots and lots and lots of copy

paste coping

Page 109: Infrastructure as code might be literally impossible part 2

question

Page 110: Infrastructure as code might be literally impossible part 2

an answer

Page 111: Infrastructure as code might be literally impossible part 2

an other answer

Page 112: Infrastructure as code might be literally impossible part 2

yet an other answer

Page 113: Infrastructure as code might be literally impossible part 2

and on and on and on and on…

Page 114: Infrastructure as code might be literally impossible part 2

no one even knows what these

values mean

Page 115: Infrastructure as code might be literally impossible part 2

(p. much no one knows what these

values mean)

Page 116: Infrastructure as code might be literally impossible part 2

example

Page 117: Infrastructure as code might be literally impossible part 2

netdev_max_backlog

Page 118: Infrastructure as code might be literally impossible part 2
Page 119: Infrastructure as code might be literally impossible part 2
Page 120: Infrastructure as code might be literally impossible part 2
Page 121: Infrastructure as code might be literally impossible part 2
Page 122: Infrastructure as code might be literally impossible part 2

similarish explanations

Page 123: Infrastructure as code might be literally impossible part 2

what does it actually mean

tho?

Page 124: Infrastructure as code might be literally impossible part 2

If

Page 125: Infrastructure as code might be literally impossible part 2

if

• driver calls netif_receive_skb (likely)
• and RPS is disabled (default)

Page 126: Infrastructure as code might be literally impossible part 2

Then

Page 127: Infrastructure as code might be literally impossible part 2

it doesn't do anything.

Page 128: Infrastructure as code might be literally impossible part 2

literally nothing. it’s not even checked.

Page 129: Infrastructure as code might be literally impossible part 2
Page 130: Infrastructure as code might be literally impossible part 2

if

• driver calls netif_rx (unlikely)
• or RPS is enabled (rare for most ppl)

Page 131: Infrastructure as code might be literally impossible part 2

then data is queued to a backlog length

limited by netdev_max_backlog
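
The two cases can be written down as a toy model. This is a sketch of the decision described on the slides, not kernel code: `receive` and its parameters are hypothetical, the backlog is never drained, and the default of 1000 is an assumption (a commonly seen default value).

```python
from collections import deque


def receive(packets, uses_netif_rx=False, rps_enabled=False,
            netdev_max_backlog=1000):  # assumption: commonly seen default
    """Return (accepted, dropped) for a burst of packets.

    Toy model only: it shows where the limit is consulted, nothing more.
    """
    if not uses_netif_rx and not rps_enabled:
        # netif_receive_skb without RPS: no backlog queue is involved,
        # so netdev_max_backlog is never looked at.
        return len(packets), 0

    backlog = deque()
    dropped = 0
    for pkt in packets:
        if len(backlog) < netdev_max_backlog:
            backlog.append(pkt)
        else:
            dropped += 1  # queue is full: packet is dropped
    return len(backlog), dropped


burst = list(range(5000))
print(receive(burst))                    # (5000, 0)
print(receive(burst, rps_enabled=True))  # (1000, 4000)
print(receive(burst, rps_enabled=True, netdev_max_backlog=10**6))  # (5000, 0)
```

So raising the value only matters on the second path. On the first path you can paste any number you like and nothing changes.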

Page 132: Infrastructure as code might be literally impossible part 2

coping strategies abound

Page 133: Infrastructure as code might be literally impossible part 2

here’s a coping strategy i think

is fine

Page 134: Infrastructure as code might be literally impossible part 2
Page 135: Infrastructure as code might be literally impossible part 2

curl | sudo bash

Page 136: Infrastructure as code might be literally impossible part 2

you aren’t reading all of the chef/puppet

source so what’s the difference?

Page 137: Infrastructure as code might be literally impossible part 2

(hi, be mad)

Page 138: Infrastructure as code might be literally impossible part 2

too damn hard to understand how a computer works

Page 139: Infrastructure as code might be literally impossible part 2

on that note…

Page 140: Infrastructure as code might be literally impossible part 2

Linux Threading

Page 141: Infrastructure as code might be literally impossible part 2

“threads are slow”

Page 142: Infrastructure as code might be literally impossible part 2

“context switches are expensive/

slow/…”

Page 143: Infrastructure as code might be literally impossible part 2
Page 144: Infrastructure as code might be literally impossible part 2
Page 145: Infrastructure as code might be literally impossible part 2

a 7 year old bugfix for XFree86 broke threads on Linux

Page 146: Infrastructure as code might be literally impossible part 2
Page 147: Infrastructure as code might be literally impossible part 2

Story Time

TLS segment selectors XFree86 Modules

Page 148: Infrastructure as code might be literally impossible part 2

Story Time

mmap MAP_32BIT

Page 149: Infrastructure as code might be literally impossible part 2

June 29, 2001

Page 150: Infrastructure as code might be literally impossible part 2

“This adds a new mmap flag to force mappings into the low 32bit address space. Useful e.g. for XFree86’s ELF loader or linuxthreads’ thread local data structures.”

Page 151: Infrastructure as code might be literally impossible part 2

Nov 11, 2002

Page 152: Infrastructure as code might be literally impossible part 2

“532. Fixed module loader to map memory in the low 32bit address space on x86-64 (Egbert Eich).”

Page 153: Infrastructure as code might be literally impossible part 2

Story Time

ELF small code model 31bit mapping

Page 154: Infrastructure as code might be literally impossible part 2

Jan 4, 2003

Page 155: Infrastructure as code might be literally impossible part 2

“Make MAP_32BIT for 64bit processes only map in the first 31bit, because it is usually used to map small model code. This fixes the X server crashes. Some cleanups in this area.”

Page 156: Infrastructure as code might be literally impossible part 2

So: MAP_32BIT is actually MAP_31BIT

Page 157: Infrastructure as code might be literally impossible part 2
Page 158: Infrastructure as code might be literally impossible part 2
Page 159: Infrastructure as code might be literally impossible part 2

Mar 4, 2003

Page 160: Infrastructure as code might be literally impossible part 2

/* For Linux/x86-64 we have one extra requirement: the stack must be in the first 4GB. Otherwise the segment register base address is not wide enough. */

glibc

Page 161: Infrastructure as code might be literally impossible part 2

May 9, 2003

Page 162: Infrastructure as code might be literally impossible part 2

/* We prefer to have the stack allocated in the low 4GB since this allows faster context switches. */

glibc

Page 163: Infrastructure as code might be literally impossible part 2

justification for MAP_32BIT in glibc changed

Page 164: Infrastructure as code might be literally impossible part 2

Aug 13, 2008

Page 165: Infrastructure as code might be literally impossible part 2

“Pardo” report

https://lkml.org/lkml/2008/8/12/423

Page 166: Infrastructure as code might be literally impossible part 2

“Pardo” report

Page 167: Infrastructure as code might be literally impossible part 2

“Pardo” report

pardo filled the 31bit 1GB space with thread stacks.

subsequent allocations were doing a linear search for a free address on the kernel side.
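
A back-of-the-envelope sketch of how quickly that window fills and what a linear search then costs. The 1GB window is from the slide. The 8 MiB stack size is an assumption (a common default), and the fixed-slot `first_fit` allocator is a deliberately crude stand-in for the kernel's real search.

```python
GIB = 1 << 30
MIB = 1 << 20

WINDOW = 1 * GIB       # the "31bit 1GB space" from the slide
STACK_SIZE = 8 * MIB   # assumption: a common default thread stack size


def first_fit(used_slots, total_slots):
    """Linear first-fit search; returns (slot or None, slots probed)."""
    probes = 0
    for slot in range(total_slots):
        probes += 1
        if slot not in used_slots:
            return slot, probes
    return None, probes


total = WINDOW // STACK_SIZE
used = set()
probe_counts = []
for _ in range(total + 1):  # one more request than the window can hold
    slot, probes = first_fit(used, total)
    probe_counts.append(probes)
    if slot is not None:
        used.add(slot)

print(total)                    # 128
print(probe_counts[0])          # 1
print(probe_counts[total - 1])  # 128
print(slot)                     # None
```

Under these assumptions the window holds 128 stacks. Once it is full, every further request scans the whole window before failing, and that cost is paid on each thread creation.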

Page 168: Infrastructure as code might be literally impossible part 2

MAP_STACK is added.

Page 169: Infrastructure as code might be literally impossible part 2

(it does nothing)

Page 170: Infrastructure as code might be literally impossible part 2
Page 171: Infrastructure as code might be literally impossible part 2

time or w/e

June 29, 2001: MAP_32BIT added to kernel
Nov 11, 2002: XFree86 updated to use MAP_32BIT
Jan 4, 2003: MAP_32BIT updated for ELF small code
Feb 12, 2003: wrmsr slowness reported
Mar 4, 2003: MAP_32BIT added to glibc
May 9, 2003: MAP_32BIT retry added to glibc
Aug 13, 2008: “Pardo” report
Aug 13, 2008: MAP_STACK
Aug 15, 2008: glibc updated

Page 172: Infrastructure as code might be literally impossible part 2
Page 173: Infrastructure as code might be literally impossible part 2

a few questions

Page 174: Infrastructure as code might be literally impossible part 2

how did we get here?

question

Page 175: Infrastructure as code might be literally impossible part 2

legacy code backward compat

an thought

Page 176: Infrastructure as code might be literally impossible part 2

free open source doesn’t exist

an thought

Page 177: Infrastructure as code might be literally impossible part 2

why so much copy-paste

coping?

question

Page 178: Infrastructure as code might be literally impossible part 2

necessary complexity

an thought

Page 179: Infrastructure as code might be literally impossible part 2

lack of time

an thought

Page 180: Infrastructure as code might be literally impossible part 2

an aside:

Page 181: Infrastructure as code might be literally impossible part 2

but, why is there no time?

Page 182: Infrastructure as code might be literally impossible part 2

i don’t know, but could it be that

Page 183: Infrastructure as code might be literally impossible part 2

efficiency gains are captured by

management instead of engineering?

Page 184: Infrastructure as code might be literally impossible part 2

or could it be that…

Page 185: Infrastructure as code might be literally impossible part 2

working software systems aren’t

economically viable for 99% of companies?

Page 186: Infrastructure as code might be literally impossible part 2

hence why no one found that threading

bug for 5 years?

Page 187: Infrastructure as code might be literally impossible part 2

working software given complex requirements is

expensive

Page 188: Infrastructure as code might be literally impossible part 2

how much did you pay for your

an Linux?

Page 189: Infrastructure as code might be literally impossible part 2

?

packagecloud.io

@packagecloudio

Page 190: Infrastructure as code might be literally impossible part 2

Python Packaging

Page 191: Infrastructure as code might be literally impossible part 2

3 types of python packages

1. source distributions (sdists)
2. eggs
3. wheels

Page 192: Infrastructure as code might be literally impossible part 2

some …interesting… behavior with [-_.]

Page 193: Infrastructure as code might be literally impossible part 2

setup(name='hi_automacon', …

setup(name='hi-automacon', …

setup(name='hi.automacon', …

Page 194: Infrastructure as code might be literally impossible part 2

what do you think happens?

Page 195: Infrastructure as code might be literally impossible part 2

“There are only two hard things in Computer

Science: cache invalidation and naming

things.”

Page 196: Infrastructure as code might be literally impossible part 2

(literally unknown)

Page 197: Infrastructure as code might be literally impossible part 2

hi_automacon

Page 198: Infrastructure as code might be literally impossible part 2

setup.py: hi_automacon
metadata: hi-automacon
sdist: hi_automacon-1.0.tar.gz
egg: hi_automacon-1.0-py2.7.egg
wheel: hi_automacon-1.0-py2-none-any.whl

Page 199: Infrastructure as code might be literally impossible part 2

OK SO: wheels and eggs leave “_” in the filename but translate it in the metadata to “-”

…. but not sdists

Page 200: Infrastructure as code might be literally impossible part 2

OK OK OK OK OK OK OK OK

Page 201: Infrastructure as code might be literally impossible part 2

thats fine not a big deal

Page 202: Infrastructure as code might be literally impossible part 2

weekend work and all that

Page 203: Infrastructure as code might be literally impossible part 2

hi-automacon

Page 204: Infrastructure as code might be literally impossible part 2

setup.py: hi-automacon
metadata: hi-automacon
sdist: hi-automacon-1.0.tar.gz
egg: hi_automacon-1.0-py2.7.egg
wheel: hi_automacon-1.0-py2-none-any.whl

Page 205: Infrastructure as code might be literally impossible part 2

OK SO: wheels and eggs translate “-” to “_” in the filename but leave it in the metadata

…. but not sdists

Page 206: Infrastructure as code might be literally impossible part 2
Page 207: Infrastructure as code might be literally impossible part 2

package name   file name    metadata
dash           underscore   dash
underscore     underscore   dash

Page 208: Infrastructure as code might be literally impossible part 2
Page 209: Infrastructure as code might be literally impossible part 2

wheels and eggs only

Page 210: Infrastructure as code might be literally impossible part 2

sdists are WYSIWYG affff

Page 211: Infrastructure as code might be literally impossible part 2

hi.automacon

Page 212: Infrastructure as code might be literally impossible part 2

weird

• everything has ‘.’ in it
• file names and metadata for all package types
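
The behavior seen for all three names can be reproduced with two small rules. This is a sketch that matches the outputs shown on the slides. The function names are hypothetical, and the regex is an assumption modeled on how the packaging tools of that era appeared to escape names, not a copy of their code.

```python
import re


def metadata_name(name):
    # Runs of anything other than letters, digits and "." become "-".
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)


def built_filename_name(name):
    # Eggs and wheels: take the metadata form, then swap "-" for "_".
    return metadata_name(name).replace("-", "_")


def sdist_filename_name(name):
    # sdists on the slides keep whatever was passed to setup().
    return name


for name in ("hi_automacon", "hi-automacon", "hi.automacon"):
    print(name,
          metadata_name(name),
          built_filename_name(name),
          sdist_filename_name(name))
```

Under these rules ‘.’ passes through untouched everywhere, while ‘-’ and ‘_’ swap places depending on whether you are looking at the metadata or a built file name.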

Page 213: Infrastructure as code might be literally impossible part 2
Page 214: Infrastructure as code might be literally impossible part 2

let’s curl against PyPI….

Page 215: Infrastructure as code might be literally impossible part 2

django-allauth

Page 216: Infrastructure as code might be literally impossible part 2

curl https://pypi.python.org/simple/django-allauth/

Page 217: Infrastructure as code might be literally impossible part 2

HTTP 200

Page 218: Infrastructure as code might be literally impossible part 2
Page 219: Infrastructure as code might be literally impossible part 2

OK OK OK OK OK OK OK OK

Page 220: Infrastructure as code might be literally impossible part 2

curl https://pypi.python.org/simple/django_allauth/

Page 221: Infrastructure as code might be literally impossible part 2

HTTP 302

Page 222: Infrastructure as code might be literally impossible part 2

< Location: /simple/django-allauth

Page 223: Infrastructure as code might be literally impossible part 2
Page 224: Infrastructure as code might be literally impossible part 2

OK OK OK OK OK OK OK OK

Page 225: Infrastructure as code might be literally impossible part 2

curl https://pypi.python.org/simple/django.allauth/

Page 226: Infrastructure as code might be literally impossible part 2

HTTP 200
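
For reference, PEP 503 specifies a name normalization under which all three spellings are the same project. The sketch below shows only that rule. It says nothing about why the index answered 200, 302, and 200 for the three requests above, and the mixed-case spelling is just an extra example.

```python
import re


def normalize(name):
    # Name normalization as specified in PEP 503.
    return re.sub(r"[-_.]+", "-", name).lower()


for spelling in ("django-allauth", "django_allauth",
                 "django.allauth", "Django_AllAuth"):
    print(spelling, "->", normalize(spelling))
```

Every line prints `django-allauth` on the right-hand side.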

Page 227: Infrastructure as code might be literally impossible part 2
Page 228: Infrastructure as code might be literally impossible part 2

(hi)

Page 229: Infrastructure as code might be literally impossible part 2

and now what happens if we try mixing the case?

Page 230: Infrastructure as code might be literally impossible part 2

lol maybe next time.