Knowledge is Imperfect ACTING ON STALE, INCONSISTENT OR MISSING DATA ULF WIGER, FEUERLABS, INC. GOTO Aarhus 2013 Wednesday, 2 October 13


Knowledge is Imperfect

ACTING ON STALE, INCONSISTENT OR MISSING DATA

ULF WIGER, FEUERLABS, INC.

GOTO Aarhus 2013

Wednesday, 2 October 13


Outline

War stories
    No code, no algorithms, no Hadoop

Thoughts
    The code may well be broken before it’s even written

Crimson Tide (1995) 1:21:26


Experience


Alaskan Adventure

Worked on military Command & Control and Emergency Response in Alaska 1989–1995

The core of Command & Control is control of information
    “Where are my assets, and what is their status?” (Col Shepherd)

Near real-time

World-wide

No single point of failure

Pull information from any source


Ericsson Adventure

13 years building telephony systems at Ericsson

World’s first carrier-grade voice-over-packet systems


[1]


Feuerlabs Adventure—ongoing

“Connecting the Internet of Things™”

Building modern Connected-Device Management services


Traits


C2: Distinctive Challenges

Assume enemy...
    actively tries to destroy your infrastructure
    actively feeds you misleading information

Deploy anywhere, anytime

Fallback: fully manual

Mess up and people die!


US Marines CAC2 System


Solutions (then)

No single point of failure
    Full asynchronous replication (40 sites)

Synchronization
    Control access; strict ownership
    Rely on model for manual operation

Split brain
    Site-specific data cached at remote sites

Limited connection speed (down to 19.2 KBps)
    Priority-based replication
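Priority-based replication over a slow link can be pictured as a send queue ordered by urgency. The sketch below is only an illustration of that idea, not the system described in the talk: the class name, the numeric priority levels, and the per-round byte budget are all assumptions.

```python
import heapq
import itertools


class PriorityReplicationQueue:
    """Outbound replication queue; a lower priority number is sent first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def enqueue(self, priority, update):
        """Queue an update (bytes) for replication at the given priority."""
        heapq.heappush(self._heap, (priority, next(self._seq), update))

    def drain(self, byte_budget):
        """Pop updates in priority order until the next one exceeds the budget."""
        sent = []
        while self._heap:
            _, _, update = self._heap[0]
            if len(update) > byte_budget:
                break  # link budget for this round is used up
            heapq.heappop(self._heap)
            byte_budget -= len(update)
            sent.append(update)
        return sent


queue = PriorityReplicationQueue()
queue.enqueue(5, b"bulk-map-tiles")
queue.enqueue(0, b"unit-status")
queue.enqueue(5, b"log")
first_round = queue.drain(byte_budget=15)
second_round = queue.drain(byte_budget=100)
```

With a small budget only the urgent update goes out; the bulk data waits for a later round.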


Telecom: Special Challenges

Ubiquitous service
    People expect it to always work

Emergency calls
    Should be serviced even during extreme overload

“User-friendly” failure modes
    Few seconds setup time
    Echo cancellation, speech quality, tolerable delays

Legacy
    Generations of hardware, software, protocols


Device Management Challenges

Information access & quality
    RPC validation
    Config data consistency
    SW status (OTA upgrades)

User requirements unclear

Connection quality/cost
    Remote probes
    Sandboxing/security
    Fail/retry/timeout
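The fail/retry/timeout item can be sketched as a bounded retry loop that gives up explicitly instead of retrying forever. This is a generic pattern, not Feuerlabs code: `call_with_retry`, `GaveUp`, and the backoff values are hypothetical names and parameters chosen for the example.

```python
import time


class GaveUp(Exception):
    """Raised when every attempt failed; the caller must handle it explicitly."""


def call_with_retry(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `operation` up to `attempts` times, doubling the delay between tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError) as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise GaveUp(f"{attempts} attempts failed") from last_error


# Usage: a device call that times out twice before answering.
calls = {"n": 0}


def flaky_device_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("no answer from device")
    return "ok"


delays = []
result = call_with_retry(flaky_device_call, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example testable without waiting on a real clock.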


Decision Support


Decision Support Basics

The Four Ws:
    Who reported?
    What happened?
    When did it happen?
    Where did it happen?
    (The Why is saved for post-mortem)
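One way to make the Four Ws concrete is a report record in which every W except the What may be explicitly missing. The field names and the `missing_fields` helper below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Report:
    who: Optional[str]        # reporting source; None if not known
    what: str                 # what happened, as reported
    when: Optional[datetime]  # time of the event, not time of receipt
    where: Optional[str]      # location; None if not known

    def missing_fields(self):
        """List which of who/when/where were not supplied."""
        return [name for name in ("who", "when", "where")
                if getattr(self, name) is None]


report = Report(who="patrol-7", what="bridge closed", when=None, where="mile 42")
gaps = report.missing_fields()
```

Keeping the gaps visible lets a decision maker see that the event time was never reported, rather than guessing it from the arrival time.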


The Who

Affects our level of trust
    Sometimes, deliberate misinformation
    Other times, you take what you can get


The What

Surprisingly hard to report sufficient information

Missing data

Conflicting data

Incorrect data


Abstractions

Different views for different roles

Aggregation / Drill-down


Ulf’s Law of Information Management

The key information flow in any organization is bottom-up
    Not managers telling workers what they should know

Keep low-level information, aggregate up
    Allow digging into details as needed

Many bad decisions are based on missing or misleading data
    The ability to shape data for reporting is a power factor
    Automation can mitigate this
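"Keep low-level information, aggregate up" can be sketched as a summary that is always computed from the retained raw events, so any number in the summary can be traced back to its details. The event fields and function names here are hypothetical.

```python
from collections import defaultdict


def aggregate(events, key):
    """Group raw events by `key` while keeping every original event."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event)
    return dict(groups)


def summary(groups):
    """The aggregated view: number of events per group."""
    return {name: len(members) for name, members in groups.items()}


events = [
    {"site": "A", "status": "down"},
    {"site": "A", "status": "up"},
    {"site": "B", "status": "down"},
]
by_site = aggregate(events, "site")
overview = summary(by_site)   # what a manager sees
details_a = by_site["A"]      # drill-down into site A
```

Because the summary is derived automatically from the raw data, nobody gets to reshape the figures on the way up.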


The ‘What’ for Developers

What are we going to build?
    Often surprisingly vague

An organization loses its intuition when the person who has the answer isn’t talking to the person who has the question
(Tim Berners-Lee, “Weaving the Web”, from memory)


Dealing with requirements

Agile methods great for bottom-up development

Software development is a top-down / bottom-up activity

Tony Hoare’s Turing Award Speech:
    One man/group whose purpose is to understand what is being done, and why


Specifications

If you have specs, make the most of them
    Generate code, test input, spec-driven validation

Often, you’ll find that the spec is broken


    STR ::= < Diameter Header: 275, REQ, PXY >
            < Session-Id >
            { Origin-Host }
            { Origin-Realm }
            { Destination-Realm }
            { Auth-Application-Id }
            { Termination-Cause }
            [ User-Name ]
            [ Destination-Host ]
          * [ Class ]
            [ Origin-AAA-Protocol ]
            [ Origin-State-Id ]
          * [ Proxy-Info ]
          * [ Route-Record ]
          * [ AVP ]

(From rfc4005_nas.dia, Erlang/OTP’s Diameter application)
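To illustrate spec-driven validation, the STR grammar can be transcribed into a data table and used to check incoming messages. This Python sketch is not how Erlang/OTP's Diameter application works internally; `STR_SPEC`, `validate`, and the dict-of-lists message shape are assumptions for the example. Unlisted AVPs are accepted because the grammar ends with `* [ AVP ]`.

```python
# Each entry: (AVP name, required?, may repeat?), transcribed from the STR grammar.
# < > and { } mark required AVPs, [ ] optional ones, and "*" allows repetition.
STR_SPEC = [
    ("Session-Id", True, False),
    ("Origin-Host", True, False),
    ("Origin-Realm", True, False),
    ("Destination-Realm", True, False),
    ("Auth-Application-Id", True, False),
    ("Termination-Cause", True, False),
    ("User-Name", False, False),
    ("Destination-Host", False, False),
    ("Class", False, True),
    ("Origin-AAA-Protocol", False, False),
    ("Origin-State-Id", False, False),
    ("Proxy-Info", False, True),
    ("Route-Record", False, True),
]


def validate(message, spec):
    """Check a message (dict: AVP name -> list of values) against the spec table."""
    problems = []
    for name, required, repeatable in spec:
        values = message.get(name, [])
        if required and not values:
            problems.append(f"missing required AVP: {name}")
        if not repeatable and len(values) > 1:
            problems.append(f"AVP may occur only once: {name}")
    return problems


message = {
    "Session-Id": ["s1"],
    "Origin-Host": ["nas.example.com"],
    "Origin-Realm": ["example.com"],
    "Destination-Realm": ["example.net"],
    "Auth-Application-Id": [1],
    "User-Name": ["alice", "bob"],   # optional, but not repeatable
    "Class": ["x", "y"],             # repeatable, so this is fine
}
problems = validate(message, STR_SPEC)
```

The validator never mentions a specific AVP in code; changing the table changes the checks.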


Trust/verify

Trust  (assert)  data  from  internal  users

Check data from external users (specification-driven)


Verify

Trust
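The trust/verify split might look like the following: input from outside is checked and rejected with a clear error, while internal code only asserts its contract. The report-interval domain, the limits, and all names are invented for the sketch.

```python
class InvalidInput(ValueError):
    """External data failed verification."""


def parse_report_interval(raw):
    """Verify: external input is checked at the boundary."""
    try:
        seconds = int(raw)
    except (TypeError, ValueError):
        raise InvalidInput(f"not an integer: {raw!r}") from None
    if not 1 <= seconds <= 86400:
        raise InvalidInput(f"interval out of range: {seconds}")
    return seconds


def reports_per_day(interval_seconds):
    """Trust: internal callers pass verified data; the assert documents that."""
    assert interval_seconds > 0, "internal caller violated the contract"
    return 86400 // interval_seconds


interval = parse_report_interval("60")
count = reports_per_day(interval)
```

A failed assertion on the trusted side signals a bug in our own code, not bad user input, so it is allowed to crash.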


The When

Information grows stale

Lifetime indicators?

Persistency
    How long should data live?

“Unknown” is a useful indicator
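A lifetime indicator can be as simple as storing when a value was observed and how long it may be believed; once it expires, the reader gets an explicit "unknown" instead of the old value. The names `Observation`, `UNKNOWN`, and `max_age` are assumptions for this sketch.

```python
from dataclasses import dataclass

UNKNOWN = object()  # sentinel: distinct from every real value


@dataclass
class Observation:
    value: object
    observed_at: float  # seconds on the caller's clock
    max_age: float      # how long the value may be trusted, in seconds

    def read(self, now):
        """Return the value while it is fresh, UNKNOWN once it has gone stale."""
        if now - self.observed_at > self.max_age:
            return UNKNOWN
        return self.value


status = Observation("online", observed_at=100.0, max_age=30.0)
fresh = status.read(now=120.0)
stale = status.read(now=131.0)
```

The caller passes the current time explicitly, which keeps the staleness rule easy to test.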


[2]


Modeling data lifetimes

Don’t mix persistent and transient data

Persistency levels
    replicated disk
    replicated RAM
    replication factor

Erlang-style
    lightweight processes
    automatic GC
    single-assignment
    messaging

Transient request processes


The Where

In Emergency Response: obviously important

In tech, the Where can sometimes be inferred
    But absence of signal is hard to interpret


[3]

[4]


Diagnosing absence of signal

“Virtual Device”

Information back-door

[Diagram: VDP Control, Data, Backplane, VDP, Status; links labeled (TCP/IP), (UDP), Distributed Erlang]
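One reading of the "information back-door" is a second, independent status channel that helps interpret silence on the main data channel. The sketch below shows that reading only; the function, the timeout, and the three outcomes are assumptions, not the design shown in the diagram.

```python
def diagnose(now, last_data, last_status, timeout=10.0):
    """Classify a device from the last time each channel was heard from.

    last_data / last_status are timestamps in seconds, or None if never heard.
    """
    data_alive = last_data is not None and now - last_data <= timeout
    status_alive = last_status is not None and now - last_status <= timeout
    if data_alive:
        return "ok"
    if status_alive:
        # The back-door still answers, so the device itself is up.
        return "data path silent, device reachable"
    # Both channels silent: silence alone cannot tell a crash from a cut link.
    return "unknown: device down or unreachable"


healthy = diagnose(now=100.0, last_data=95.0, last_status=99.0)
data_lost = diagnose(now=100.0, last_data=60.0, last_status=99.0)
silent = diagnose(now=100.0, last_data=60.0, last_status=None)
```

The last case deliberately reports "unknown" rather than guessing at a cause.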


Knock-out Units

= The amount of service that can be lost in a crash

You will lose service: plan for it!

Better to fail distinctly than to pretend to function

Invariants: If they fail, all bets are off

Connection fan-out

Replication messages in-flight


Let it Crash.... or Try for a Result?

Tempting to always deliver a pretty result

A result that looks right, while erroneous, is often worse than no result at all
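The difference between a "pretty" result and a distinct failure can be shown with a small averaging function. Both functions are hypothetical examples written for this point.

```python
class NoData(Exception):
    """Raised when there is nothing to compute a result from."""


def average_pretty(samples):
    """Anti-pattern: always returns a plausible number, hiding missing data."""
    return sum(samples) / len(samples) if samples else 0.0


def average_strict(samples):
    """Fails distinctly when the input cannot support a meaningful answer."""
    if not samples:
        raise NoData("no samples received")
    return sum(samples) / len(samples)


looks_fine = average_pretty([])       # 0.0: reads like a real measurement
real_value = average_strict([2.0, 4.0])
```

A dashboard showing `0.0` suggests a quiet system; an explicit `NoData` error says the sensor never reported.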


[5]


Conclusion

As programmers, we sometimes forget to model failure

Key is to think of information quality
    Data lifetime
    Data loss potential
    What data do I need for recovery?
    What failures can we discern?
    What interruptions are acceptable?
    What do our users expect?
    Invariants


Questions?


[1] http://evaluation.nbu.bg/pub/NGN_MP_e_book_CD/DL_NGN_2004%20Module%205/Module%205/1.7%20softswithces.htm
[2] http://docs.nimsoft.com/prodhelp/en_US/Probes/Catalog/nas/3.6/index.htm?toc.htm?1942450.html
[3] http://labs.vmware.com/vmtj/an-anomaly-event-correlation-engine-identifying-root-causes-bottlenecks-and-black-swans-in-it-environments
[4] http://news.techeye.net/software/bespoke-os-blip-caused-chaos-in-the-air
[5] http://www.theregister.co.uk/2013/08/06/
