Mystery Machine Overview

Nov, 2015

Review of Mystery Machine

Ivan (@gliush)

Why

❖ Need to debug and optimize applications

❖ Complex, heterogeneous systems

❖ Different parts written in different languages

❖ Different communication channels

❖ Different execution environments

❖ Even if individual components are optimized, the whole system might not perform optimally

What

❖ The authors develop performance analysis tools

❖ They apply these tools to their own pipeline

❖ They measure end-to-end performance:

❖ from the point of initiating a page load

❖ to the point when the browser finishes rendering

Why not

❖ Existing approaches assume you instrument your code, specify the relationships between components, etc.

❖ Usually you don’t have the time or ability to do that

❖ Large systems are developed by large teams

❖ Adding instrumentation retroactively is a Herculean task

Overview

❖ They generate a causal model via large-scale reasoning over logs

❖ They can confirm relationships

❖ They need only (requestId, hostId, hostTS, eventId) in each log message

❖ UberTrace gathers all the logs in one place

❖ MysteryMachine constructs a causal model from those traces

❖ MysteryMachine performs analyses: identifying critical paths, slack analysis, outlier detection

UberTrace: why

❖ No existing tools analyze performance across process boundaries

❖ They need to have a single end-to-end performance tracing tool for all logs

UberTrace: requirements

❖ Each log message should contain

❖ Unique request id

❖ Computer id (server node / client laptop)

❖ Timestamp (local clock)

❖ Event name (e.g. “start DOM rendering”)

❖ Task name (<Event,Task> should be unique)

❖ Propagate the decision about whether to log a particular request
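The required fields can be sketched as a small record type. This is only an illustration: the class name `LogEvent`, its field names, the helper `group_by_request`, and the sample values are all hypothetical and are not taken from UberTrace itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    request_id: str  # unique request id, shared by every host serving the request
    host_id: str     # server node or client machine
    host_ts: float   # timestamp from the host's local clock
    event: str       # event name, e.g. "start DOM rendering"
    task: str        # task name; (event, task) is unique within a request


def group_by_request(events):
    """Collect events per request, ordered by host and then by local time."""
    traces = defaultdict(list)
    for e in events:
        traces[e.request_id].append(e)
    for trace in traces.values():
        # Local timestamps are only comparable on the same host.
        trace.sort(key=lambda e: (e.host_id, e.host_ts))
    return dict(traces)


events = [
    LogEvent("r1", "server-7", 105.0, "end", "render_html"),
    LogEvent("r1", "client-3", 20.0, "start", "page_load"),
    LogEvent("r1", "server-7", 100.0, "start", "render_html"),
    LogEvent("r2", "client-9", 5.0, "start", "page_load"),
]
traces = group_by_request(events)
```

Timestamps from different hosts are deliberately not compared here; that only makes sense after clock-skew correction.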

UberTrace

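❖ Timestamps (TS) come from local clocks and are translated to a global clock

❖ Server execution time = latest server TS - earliest server TS within an RPC

❖ RTT = client-observed time for the RPC - server execution time

❖ Clock skew is set so that the first server TS falls 1/2 RTT after the preceding client TS (assumes symmetric delays)

❖ With multiple observations, use the one with the minimal RTT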
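A minimal sketch of the skew estimate, assuming RTT is the client-observed time for an RPC minus the server execution time, and that request and response delays are symmetric. The names `RpcObservation` and `estimate_skew` and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpcObservation:
    client_send: float   # client TS just before the RPC (client clock)
    client_recv: float   # client TS just after the RPC (client clock)
    server_first: float  # earliest server TS inside the RPC (server clock)
    server_last: float   # latest server TS inside the RPC (server clock)

    @property
    def rtt(self):
        client_time = self.client_recv - self.client_send
        server_time = self.server_last - self.server_first
        return client_time - server_time


def estimate_skew(observations):
    """Offset to add to server timestamps to express them on the client clock."""
    # The observation with the smallest RTT has the least queueing noise.
    best = min(observations, key=lambda o: o.rtt)
    # After adjustment, server_first should equal client_send + rtt / 2.
    return best.client_send + best.rtt / 2 - best.server_first


obs = [
    RpcObservation(client_send=0.0, client_recv=100.0,
                   server_first=1030.0, server_last=1070.0),
    RpcObservation(client_send=200.0, client_recv=260.0,
                   server_first=1210.0, server_last=1250.0),
]
skew = estimate_skew(obs)
```

In this made-up data the server clock runs 1000 units ahead of the client, so the estimated offset is -1000.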

Mystery Machine: causal model

❖ Split all logs into segments (two consecutive events for the same task)

❖ Create a causal model

❖ They validated this model for client-side js library (42 and 84 segments -> 2583 and 10458 causal relationships)
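The slide does not say how the model is created. A minimal sketch, assuming it is built by hypothesis rejection as described in the paper listed under Links: start by assuming every ordered pair of segments has a happens-before relationship, then drop each pair that some trace contradicts. Only happens-before is covered here; the function name and the sample traces are hypothetical.

```python
from itertools import permutations


def infer_happens_before(traces):
    """traces: list of dicts mapping segment name -> (start, end).

    Hypothesize "a finishes before b starts" for every ordered pair (a, b),
    then reject any pair for which a trace provides a counterexample.
    """
    segments = set().union(*(t.keys() for t in traces))
    hypotheses = set(permutations(sorted(segments), 2))
    for trace in traces:
        for a, b in list(hypotheses):
            if a in trace and b in trace and trace[a][1] > trace[b][0]:
                hypotheses.discard((a, b))  # b started before a ended
    return hypotheses


traces = [
    {"net": (0, 10), "parse": (10, 30), "css": (12, 20)},
    {"net": (0, 8), "parse": (15, 25), "css": (8, 14)},
]
model = infer_happens_before(traces)
```

With few traces the surviving set contains spurious relationships; more traces mean more chances to find counterexamples, which is why the approach depends on large-scale logs.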




Mystery Machine: critical path

Critical path: the set of segments for which a differential increase in a segment's execution time would result in the same differential increase in the end-to-end latency
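One common way to find such a set, assuming the causal model is a DAG of segments with known durations, is to take the longest dependency chain. The sketch below is an illustration under that assumption, not the tool's actual code; `critical_path`, `durations`, and `deps` are hypothetical names.

```python
from functools import lru_cache


def critical_path(durations, deps):
    """durations: segment -> time; deps: segment -> segments it must wait for.

    Returns (end_to_end_latency, segments on the longest dependency chain).
    """
    @lru_cache(maxsize=None)
    def finish(seg):
        # Earliest finish time of seg and the chain that determines it.
        best_time, best_chain = 0.0, ()
        for d in deps.get(seg, ()):
            t, chain = finish(d)
            if t > best_time:
                best_time, best_chain = t, chain
        return best_time + durations[seg], best_chain + (seg,)

    return max((finish(s) for s in durations), key=lambda r: r[0])


durations = {"net": 10, "css": 5, "parse": 20, "render": 8}
deps = {"css": ["net"], "parse": ["net"], "render": ["css", "parse"]}
latency, path = critical_path(durations, deps)
```

Here `css` is off the critical path: making it slightly slower does not change the end-to-end latency, while slowing `net`, `parse`, or `render` does.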


Mystery Machine: slack

Slack - the amount by which the duration of a segment may increase without increasing the end-to-end latency of the request
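Slack can be computed with a forward and a backward pass over the dependency graph. This is a sketch under the assumption that the causal model is a DAG with known segment durations; the function and variable names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def slack(durations, deps):
    """Return segment -> slack; deps maps a segment to its prerequisites."""
    graph = {s: deps.get(s, ()) for s in durations}
    order = list(TopologicalSorter(graph).static_order())

    earliest_finish = {}
    for s in order:  # forward pass: prerequisites come first
        start = max((earliest_finish[d] for d in deps.get(s, ())), default=0)
        earliest_finish[s] = start + durations[s]
    total = max(earliest_finish.values())  # end-to-end latency

    latest_finish = {s: total for s in durations}
    for s in reversed(order):  # backward pass: successors come first
        for d in deps.get(s, ()):
            latest_start = latest_finish[s] - durations[s]
            latest_finish[d] = min(latest_finish[d], latest_start)

    return {s: latest_finish[s] - earliest_finish[s] for s in durations}


durations = {"net": 10, "css": 5, "parse": 20, "render": 8}
deps = {"css": ["net"], "parse": ["net"], "render": ["css", "parse"]}
result = slack(durations, deps)
```

Segments with zero slack form the critical path; in this example `css` could take 15 units longer without delaying the request.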

Mystery Machine: slack validation

Mystery Machine: slack analyses usage

Links

❖ Video: https://www.usenix.org/node/186168

❖ Slides: https://www.usenix.org/sites/default/files/conference/protected-files/osdi14_slides_chow.pdf

❖ Paper: https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-chow.pdf

