ETW, EventSource, SLAB, & Friends for Logging, Instrumentation, and … Telemetry

NYC Code Camp, 14-September-2013

Bill Wilder
http://blog.codingoutloud.com
@codingoutloud

Boston Azure User Group
http://www.bostonazure.org
@bostonazure

HELLO, my name is Bill Wilder

Abstract: If my application runs on cloud infrastructure, am I done? Not if you wish to truly take advantage of the cloud. The architecture of a cloud-native application is different from the architecture of a traditional application, and this talk will explain why. How do I scale? How do I overcome failure? How do I build a system that I can manage? And how can I do all this without a huge monthly bill from my cloud vendor? We will examine key architectural patterns that truly unlock cloud benefits. By the end of the talk you should appreciate how cloud architecture differs from what most of us have become accustomed to with traditional applications. You should also understand how to approach building self-healing distributed applications that automatically overcome hardware failures without downtime (really!), scale like crazy, and allow for flexible cost-optimization.

Bill Wilder
[email protected]
@codingoutloud

www.devpartners.com

CONSIDERATIONS
  • Perf Counters
  • Activity ID (Correlation)
  • WCF, …

TOOLS
  • PerfView (xperf, tracerpt.exe, …), SLAB, .NET 4.5, LogMan, …

SERVICES
  • Azure Tables

Resources
  • Improve Debugging And Performance Tuning With ETW: http://msdn.microsoft.com/en-us/magazine/cc163437.aspx

Who is Bill Wilder?

www.devpartners.com
www.bostonazure.org
www.cloudarchitecturepatterns.com

Distributed Systems
  • Cloud Services are Distributed Systems
  • Gathering and Aggregating information on Distributed Systems is HARD
  • Insight via telemetry more critical than ever to debug, monitor, diagnose, track QoS (SLA), …

What's in Store
  • Status: State of Logging today
  • From Logging to Telemetry
  • ETW & SLAB
  • TDD (Telemetry Style)
  • Beyond ETW & SLAB

  • More a Journey than Final Solution
  • Inspired by CAT member Mark Simms
  • Practical Azure

The term cloud is nebulous

Logging Today

Most Common Logging Today

    int x = foo.DoSomething();

    // what could go wrong?

2nd Most Common Logging Today

    try
    {
        int x = foo.DoSomething();
    }
    catch (Exception ex)
    {
        // Let's hope this never happens
    }

3rd Most Common Logging Today

    try
    {
        int x = foo.DoSomething();
    }
    catch (Exception ex)
    {
        // Handle the exception
        Logger.Error(ex.ToString());
    }

Logging Challenge:

  • Reactive: something unexpected happened
  • Not solution-oriented: why am I logging this and what do I hope to learn from it? Who is the audience?

Proactive Instrumentation (Telemetry?)

    var stopwatch = Stopwatch.StartNew();
    // call FooApi
    stopwatch.Stop();
    var duration = (int)stopwatch.ElapsedMilliseconds;

    Logger.Info(
        String.Format(
            "User {0} accessed method {1} (took {2} ms)",
            Thread.CurrentPrincipal.Identity.Name,
            "FooApi",
            duration));

Can you spot any problems?

Some Challenges from Prior Slide
  • Formatting done at logging site
  • Unstructured
  • Performance hit
  • Not centralized / coordinated
  • Severity Level decided at logging site
  • Who is the customer of this logging statement?
  • Who is using this code? (Distributed System)

Event Tracing for Windows
ETW

ETW Background
  • Integrated into Windows Desktop and Server
  • Used by Microsoft (.NET, ASP.NET, IIS, …)
  • Your data side-by-side (by time, activity id)
  • Wicked fast (kernel-level buffers)
  • Semantically rich (time, stack, custom)
  • Standardized tooling support (more coming)

But
  • Hard to use for .NET developers (…

… 1 EventSource 1:N Event Trace
  • Use Table vs. File vs. SQL
  • Consider RX (in-proc only!)
  • Focus first on seams in architecture
  • Use Activity Id (when avail) and think about correlation across tiers
  • Continually improve telemetry (see TDD later)
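As a rough illustration of how the .NET 4.5 EventSource class answers the earlier challenges (formatting at the logging site, unstructured text, severity chosen by each caller), here is a minimal sketch. The source name `Sample-WebApi`, the class `WebApiEventSource`, and the event `ApiCallCompleted` are hypothetical names invented for this example; they are not from the talk.

```csharp
using System;
using System.Diagnostics.Tracing;

// Hypothetical event source; the name and event ID are illustrative only.
[EventSource(Name = "Sample-WebApi")]
public sealed class WebApiEventSource : EventSource
{
    public static readonly WebApiEventSource Log = new WebApiEventSource();

    // Severity and payload shape are declared once here, not at every call site.
    [Event(1, Level = EventLevel.Informational,
           Message = "User {0} accessed method {1} (took {2} ms)")]
    public void ApiCallCompleted(string userName, string methodName, int durationMs)
    {
        // Skip the write entirely when no listener or ETW session has enabled this source.
        if (IsEnabled())
        {
            WriteEvent(1, userName, methodName, durationMs);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // The call site passes typed values; no String.Format is needed here.
        WebApiEventSource.Log.ApiCallCompleted("alice", "FooApi", 42);
        Console.WriteLine(WebApiEventSource.Log.Name);
    }
}
```

The payload stays structured (a string, a string, an int), so a consumer such as PerfView can filter or aggregate on `durationMs` without parsing text.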

Semantic Logging Application Block
SLAB

Augments ETW with:
  • Easy wire-up
  • Listeners to move events somewhere interesting
      • Windows Azure NoSQL Table
      • Windows Azure or SQL Database
      • File (JSON)
  • Unit testing support

Note "Finicky!" bullet on prior slide

When does Logging become Telemetry
  • Warm up Storage Emulator
  • Prepare PerfView in its own directory (for files)
  • Fire up Cerebrata with local Dev Storage
  • Build Web API sample and log explicitly from Get method
  • Add hook to log all web methods
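The listener idea behind SLAB (events flow from an EventSource to a sink, and can be asserted on in a unit test) can be sketched with the in-process `EventListener` class that ships with .NET. This is a stand-in chosen so the example runs without extra packages; it is not SLAB's own listener API. The names `OrderEventSource`, `OrderPlaced`, and `InMemoryListener` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Hypothetical event source used only for this sketch.
[EventSource(Name = "Sample-Orders")]
public sealed class OrderEventSource : EventSource
{
    public static readonly OrderEventSource Log = new OrderEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void OrderPlaced(string orderId, int itemCount)
    {
        if (IsEnabled())
        {
            WriteEvent(1, orderId, itemCount);
        }
    }
}

// Collects events in memory; a real sink would forward them to a table, database, or file.
public sealed class InMemoryListener : EventListener
{
    public List<EventWrittenEventArgs> Events { get; } = new List<EventWrittenEventArgs>();

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Events.Add(eventData);
    }
}

public static class Program
{
    public static void Main()
    {
        using (var listener = new InMemoryListener())
        {
            // Subscribe to the one source we care about, at every level.
            listener.EnableEvents(OrderEventSource.Log, EventLevel.Verbose);

            OrderEventSource.Log.OrderPlaced("A-100", 3);

            foreach (var e in listener.Events)
            {
                if (e.EventId == 1)
                {
                    // Payload values arrive typed and in declaration order.
                    Console.WriteLine("order={0} items={1}", e.Payload[0], e.Payload[1]);
                }
            }
        }
    }
}
```

Because the payload arrives as typed values instead of a formatted string, a test can check the exact fields an event carried, which is the kind of check the "Unit testing support" bullet points at.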

It is a capital mistake to theorize before one has data.

Sherlock Holmes, DevOps Team Leader

Telemetry
Automatic transmission and measurement of data from remote sources.

Data
Facts and statistics collected for reference or analysis.

SOURCE: The Internet

TDD

Test-Driven Dev
  • Need new feature or change in behavior
  • Bug was reported
So we
  • Write a test for it
  • See the test fail
Then proceed to
  • Write code to implement new feature or fix bug

Telemetry-Driven Dev
  • Need to know how long a Web API call is taking
  • Need to diagnose error
So we
  • Instrument the code
  • Observe the data
Then proceed to
  • Answer questions & explain issues using data

Semantic Logging is a Mindset
  • Planning: dev, ops, business are all potential customers
  • Move effort to earlier in development process: better-thought-out logging (instrumentation), rather than more effort in log parsing
  • Think about what your application requires:
  • Pattern: FooStart, FooEnd, FooException
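The FooStart / FooEnd / FooException pattern can be sketched as three events on one EventSource plus a small wrapper that times the call. Everything here beyond the three event names is an assumption made for illustration: the source name `Sample-Foo`, the payload fields, and the helper `Instrumented.Run` are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Tracing;

// Hypothetical event source following the FooStart / FooEnd / FooException pattern.
[EventSource(Name = "Sample-Foo")]
public sealed class FooEventSource : EventSource
{
    public static readonly FooEventSource Log = new FooEventSource();

    [Event(1, Level = EventLevel.Informational)]
    public void FooStart(string userName)
    {
        if (IsEnabled()) WriteEvent(1, userName);
    }

    [Event(2, Level = EventLevel.Informational)]
    public void FooEnd(string userName, long durationMs)
    {
        if (IsEnabled()) WriteEvent(2, userName, durationMs);
    }

    [Event(3, Level = EventLevel.Error)]
    public void FooException(string userName, string exceptionType, string message)
    {
        if (IsEnabled()) WriteEvent(3, userName, exceptionType, message);
    }
}

public static class Instrumented
{
    // Wraps an operation so every call emits Start, then either End or Exception.
    public static T Run<T>(string userName, Func<T> operation)
    {
        FooEventSource.Log.FooStart(userName);
        var stopwatch = Stopwatch.StartNew();
        try
        {
            T result = operation();
            FooEventSource.Log.FooEnd(userName, stopwatch.ElapsedMilliseconds);
            return result;
        }
        catch (Exception ex)
        {
            FooEventSource.Log.FooException(userName, ex.GetType().FullName, ex.Message);
            throw; // rethrow so callers still see the original failure
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int answer = Instrumented.Run("alice", () => 21 * 2);
        Console.WriteLine(answer);
    }
}
```

Keeping the timing and the three events in one wrapper means the question "how long is this Web API call taking?" is answered by data that already exists when someone asks it.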

Questions Telemetry Can Answer
  • How long, on average, do my APIs take?
  • Are my APIs meeting SLA?
  • Is my site responding?
  • How many users are currently on my site?
  • Is everything going well? Code exceptions
  • Is my current capacity optimal?

Cloud Services: Better-Defined, Automatable
  • Some questions have answers that can be automated
      • SLA performance compliance
      • Up or Not
  • Do X if Y; example, SLA:
      • SLA violations > 5% in past hour, alert human
      • At end of month, create report and apply credit
  • MUST HAVE STRUCTURED DATA to be possible
  • Processing the data: exercise for reader

Tools for Answering Questions
  • ETW, SLAB, PerfView
  • Windows Azure Diagnostics (WAD) (quick demo if there's time)
  • Log4net, nlog, Enterprise Library Logging AB
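The "SLA violations > 5% in past hour, alert human" rule becomes a few lines of code once the data is structured. The sketch below assumes a hypothetical record shape (`ApiCallRecord`) and a hypothetical 500 ms SLA threshold; neither comes from the talk, and how the records are loaded (Azure Table, SQL, file) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one structured telemetry record.
public sealed class ApiCallRecord
{
    public DateTime TimestampUtc { get; set; }
    public string Method { get; set; }
    public int DurationMs { get; set; }
}

public static class SlaMonitor
{
    // Fraction of calls inside the window whose duration exceeded the SLA threshold.
    public static double ViolationRate(
        IEnumerable<ApiCallRecord> calls, DateTime nowUtc, TimeSpan window, int slaMs)
    {
        var recent = calls
            .Where(c => c.TimestampUtc > nowUtc - window && c.TimestampUtc <= nowUtc)
            .ToList();
        if (recent.Count == 0)
        {
            return 0.0; // no traffic in the window means no violations
        }
        return (double)recent.Count(c => c.DurationMs > slaMs) / recent.Count;
    }

    // "Do X if Y": alert a human when more than 5% of calls violated the SLA.
    public static bool ShouldAlert(double violationRate)
    {
        return violationRate > 0.05;
    }
}

public static class Program
{
    public static void Main()
    {
        var now = new DateTime(2013, 9, 14, 12, 0, 0, DateTimeKind.Utc);
        var calls = new List<ApiCallRecord>();
        for (int i = 0; i < 20; i++)
        {
            calls.Add(new ApiCallRecord
            {
                TimestampUtc = now.AddMinutes(-i),
                Method = "FooApi",
                DurationMs = i < 2 ? 900 : 120 // two slow calls out of twenty
            });
        }

        double rate = SlaMonitor.ViolationRate(calls, now, TimeSpan.FromHours(1), 500);
        Console.WriteLine("Violation rate: {0}, alert: {1}", rate, SlaMonitor.ShouldAlert(rate));
    }
}
```

With free-text log lines the same rule would need a parser first, which is the point of the "MUST HAVE STRUCTURED DATA" bullet.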

But wait, there's more!

The Right Tool for the Job
  • Windows Azure Portal
  • Windows Azure Diagnostics
  • ELMAH
  • Glimpse
  • Google Analytics Real Time
  • (some for money, like) AppDynamics, New Relic, Azure Watch, …

ELMAH email

From:
Date: Wed, Sep 11, 2013 at 2:09 PM
Subject: ELMAH-PageOfPhotos-Error
To: [email protected]: The controller for path '/create-error' was not found or does not implement IController.
Generated: Wed, 21 Nov 2012 19:08:59 GMT

System.Web.HttpException (0x80004005): The controller for path '/create-error' was not found or does not implement IController.
   at System.Web.Mvc.DefaultControllerFactory.GetControllerInstance(RequestContext requestContext, Type controllerType)
   at System.Web.Mvc.DefaultControllerFactory.CreateController(RequestContext requestContext, String controllerName)
   at System.Web.Mvc.MvcHandler.ProcessRequestInit(HttpContextBase httpContext, IController& controller, IControllerFactory& factory)
   at System.Web.Mvc.MvcHandler.BeginProcessRequest(HttpContextBase httpContext, AsyncCallback callback, Object state)
   at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

Server Variables

Name                      Value
ALL_HTTP                  HTTP_CONNECTION:keep-alive
                          HTTP_ACCEPT:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                          HTTP_ACCEPT_ENCODING:gzip, deflate
                          HTTP_ACCEPT_LANGUAGE:en-US,en;q=0.5
                          HTTP_COOKIE:ASP.NET_SessionId=ishz5hhymltvtzwvz54gvble
                          HTTP_HOST:pageofphotos.cloudapp.net
                          HTTP_USER_AGENT:Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0
                          HTTP_DNT:1
                          HTTP_X_CLICKONCESUPPORT:( .NET CLR 3.5.30729; .NET4.0E)
ALL_RAW                   Connection: keep-alive
                          Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                          Accept-Encoding: gzip, deflate
                          Accept-Language: en-US,en;q=0.5
                          Cookie: ASP.NET_SessionId=ishz5hhymltvtzwvz54gvble
                          Host: pageofphotos.cloudapp.net
                          User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0
                          DNT: 1
                          X-ClickOnceSupport: ( .NET CLR 3.5.30729; .NET4.0E)
APPL_MD_PATH              /LM/W3SVC/1273337584/ROOT
APPL_PHYSICAL_PATH        F:\sitesroot\0\
AUTH_TYPE
AUTH_USER
AUTH_PASSWORD             *****
LOGON_USER
INSTANCE_META_PATH        /LM/W3SVC/1273337584
LOCAL_ADDR                10.207.192.38
PATH_INFO                 /create-error
PATH_TRANSLATED           F:\sitesroot\0\create-error
QUERY_STRING
REMOTE_ADDR               108.49.97.48
REMOTE_HOST               108.49.97.48
REMOTE_PORT               7102
REQUEST_METHOD            GET
SCRIPT_NAME               /create-error
SERVER_NAME               pageofphotos.cloudapp.net
SERVER_PORT               80
SERVER_PORT_SECURE        0
SERVER_PROTOCOL           HTTP/1.1
SERVER_SOFTWARE           Microsoft-IIS/8.0
URL                       /create-error
HTTP_CONNECTION           keep-alive
HTTP_ACCEPT               text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_ENCODING      gzip, deflate
HTTP_ACCEPT_LANGUAGE      en-US,en;q=0.5
HTTP_COOKIE               ASP.NET_SessionId=ishz5hhymltvtzwvz54gvble
HTTP_HOST                 pageofphotos.cloudapp.net
HTTP_USER_AGENT           Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0
HTTP_DNT                  1
HTTP_X_CLICKONCESUPPORT   ( .NET CLR 3.5.30729; .NET4.0E)

Glimpse
www.getglimpse.com

Google Analytics Real Time
http://analytics.blogspot.com/2011/09/whats-happening-on-your-site-right-now.html

Bill's Logging & Telemetry Stack

OLD: still used/useful
  • Log4net, nlog, entlib logging block
  • IIS logs
  • Windows Events
  • Event Viewer
  • Existing logging from existing services

NEWER: distributed apps
  • Event Tracing for Windows (ETW)
  • Semantic Logging mindset
  • TDD (Telemetry-Driven Dev)
  • Continual incremental improvements
  • SLAB
  • Platform Services: Windows Azure Portal, Windows Azure Diagnostics
  • Third-Party Services: ELMAH, Glimpse, Google Analytics Real Time, New Relic, AppDynamics, +

So Now What?
  • Realize old-school logging will be here for a loooong time
  • Realize ETW has rough edges, but is still the best we have for holistic analysis, kernel-mode performance, and standardized approach
  • Embrace Semantic Logging: move the effort to where it has most leverage
  • Embrace TDD and continually elevate your logging to telemetry
  • Don't be a snob: use multiple tools if you can

Questions? Comments? More information?

Resources
  • EventSource Class (in .NET 4.5): http://msdn.microsoft.com/en-us/library/system.diagnostics.tracing.eventsource.aspx
  • SLAB (part of EntLib 6): http://msdn.microsoft.com/en-us/library/dn169621.aspx
  • PerfView: http://www.microsoft.com/en-us/download/details.aspx?id=28567
  • Telemetry defined: http://en.wikipedia.org/wiki/Telemetry
  • Telemetry Basics from CAT team: http://social.technet.microsoft.com/wiki/contents/articles/17987.cloud-service-fundamentals.aspx#Telemetry_Basics_and_Troubleshooting
  • http://msdn.microsoft.com/en-us/library/windowsazure/jj853352.aspx

More Resources
  • Activity Id in .NET 4.5.1
  • https://github.com/jonwagner/EventSourceProxy/wiki/Implementing-an-EventSource
  • TOOL Tutorial: https://github.com/jonwagner/EventSourceProxy/wiki/Using-LogMan-for-ETW-Tracing

Business Card

BostonAzure.org
  • Boston Azure cloud user group
  • Focused on Microsoft's Public Cloud Platform
  • Monthly, 6:00-8:30 PM in Boston area
  • Food; wifi; free; great topics; growing community
  • Follow on Twitter: @bostonazure
  • More info or to join our Meetup.com group: http://www.bostonazure.org

Looking for
  • consulting help with Windows Azure Platform?
  • someone to bounce Azure or cloud questions off?
  • a speaker for your user group or company technology event?
Just Ask!

Bill Wilder
@codingoutloud
http://blog.codingoutloud.com
community inquiries: [email protected] inquiries: www.devpartners.com
book: www.cloudarchitecturepatterns.com

Contact Me

Find this slide deck here