Visual Studio Team Services: Learning the Seven Habits of Effective DevOps Thoughts to share from Microsoft Developer Division’s transformation to DevOps Sam Guckenheimer Product Owner, Visual Studio Cloud Services Microsoft Corporation http://visualstudio.com/devops

Seven habits of effective devops - DevOps Day - 02/02/2017


About Me: Sam Guckenheimer, Product Owner, Visual Studio Cloud Services

>30 years software industry
>13 years Microsoft
2005-2010 led Agile Transformation
Agile Conference 2014 keynote
Since 2010: from Agile to DevOps
Rugged DevOps Connect
DevOps Enterprise Summit (#DOES15, #DOES16) keynotes
@samguckenheimer
http://aka.ms/devops

The Journey

[Timeline: 2010 to 2017]
Sprint 1: August 2010
Sprint 29: June 2012
Sprint 113: January 2017
Visual Studio Team Services Public Preview

And We Support One Engineering System

If you want to go fast, go alone. If you want to go far, go together.

Our learnings as a SaaS provider

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.


Team rooms

Planning horizons

What we accomplished
3-week sprints

Progressive Deployment
[Figure: three sprints (Sprint 97, Sprint 98, Sprint 99), each spanning Week 1, Week 2, Week 3]

The sprint plan


Sprint mails

Linked to Video, Backlog, Board, User Voice


Code: Cloud first, then move on-premises
One code base with multiple delivery streams
Single master branch, multiple release branches
Shared abstraction layer

[Figure labels: Visual Studio Team Services; Team Foundation Server; Update 1, Update 2, Update N]

Maintaining enterprise rigor
Everyone is on ONE main master branch
Git helps with lightweight topic branching
Tiny, continuous merging
Code is fresh in your mind


Testing: Shift Left from Integration to Unit
Principles
Tests should be written at the lowest level possible
Write once, run anywhere - includes production system
Product is designed for testability
Test code is product code, only reliable tests survive
Testing infrastructure is a shared service

L0: Run against raw drop. Only access binaries. Run in the build. Must be fast and reliable.
L1: Run against raw drop, and can hit resources on machine (e.g. SQL test)
L2: Run against special deployed product. We analyzed sources of timing issues and eliminated them.
L3: Run against production
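One way to picture the L0 to L3 levels is as a tag on each test that decides where in the pipeline the test may run. The sketch below is illustrative only: the `Level` enum, the stage names in `STAGE_MAX_LEVEL`, the sample test names, and the idea of tracking the L0/L1 share as a shift-left measure are all assumptions, not the team's actual tooling.

```python
from enum import IntEnum


class Level(IntEnum):
    L0 = 0  # raw drop, binaries only, runs inside the build
    L1 = 1  # raw drop, may use resources on the machine (e.g. SQL)
    L2 = 2  # needs a specially deployed product
    L3 = 3  # runs against production


# Hypothetical stages and the highest test level each one can support.
STAGE_MAX_LEVEL = {
    "build": Level.L0,
    "build_agent_with_sql": Level.L1,
    "test_deployment": Level.L2,
    "production": Level.L3,
}


def select_tests(tests, stage):
    """Return the names of tests whose level is allowed at this stage."""
    limit = STAGE_MAX_LEVEL[stage]
    return [name for name, level in tests if level <= limit]


def low_level_share(tests):
    """Fraction of tests at L0 or L1 (assumed here to be the unit-style levels)."""
    low = sum(1 for _, level in tests if level <= Level.L1)
    return low / len(tests)


# Hypothetical test inventory as (name, level) pairs.
TESTS = [
    ("parse_work_item_id", Level.L0),
    ("sql_roundtrip", Level.L1),
    ("sign_in_flow", Level.L2),
    ("production_probe", Level.L3),
]
```

Under these assumptions, `select_tests(TESTS, "build")` keeps only the L0 test, and `low_level_share` gives a single number that could be charted sprint by sprint.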

Test Strategy: Shift Left Progress By Sprint

[Chart series: Integration Tests, Unit Tests]


We measure availability by account and proactively reach out to customers with low availability
Focus on the outliers (Embrace the Red)

Found one of the top customers with low availability. Proactively reached out and resolved their issue.
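Measuring availability per account rather than as one service-wide number can be sketched as follows. This is a minimal illustration, assuming availability is the share of successful requests; the function names, the account names, and the 0.99 threshold are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def availability_by_account(requests):
    """requests: iterable of (account, succeeded) pairs.

    Returns {account: fraction of that account's requests that succeeded}.
    """
    counts = defaultdict(lambda: [0, 0])  # account -> [succeeded, total]
    for account, succeeded in requests:
        counts[account][1] += 1
        if succeeded:
            counts[account][0] += 1
    return {account: ok / total for account, (ok, total) in counts.items()}


def outliers(availability, threshold=0.99):
    """Accounts below the threshold, worst first."""
    low = [account for account, value in availability.items() if value < threshold]
    return sorted(low, key=availability.get)


# Hypothetical traffic: one large healthy account, one small unhealthy one.
REQUESTS = (
    [("contoso", True)] * 100
    + [("fabrikam", True)] * 5
    + [("fabrikam", False)] * 5
)
```

With this sample data the service-wide success rate is 105 of 110 requests, which hides the fact that one account sees only half of its requests succeed; the per-account view surfaces it.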


Talk to customers

UserVoice
Top Customers


Build-Measure-Learn

We believe {customer segment} wants {product/feature} because {value prop}.
To prove or disprove the above, the team will conduct the following experiment(s):
The above experiment(s) prove(s) the hypothesis by impacting the following metric(s):

Hypothesis

Experiment

Learning

Experimentation

Existing experience baseline: 36% conversion to project

Hypothesis: Can simplify path to "magic moment"

After 3 experiments, with different treatments:
50% to 100% customers
conversion to project (+18%)


Feature flags
All code is deployed, but feature flags control exposure
Flags provide runtime control down to individual user
Users can be added or removed with no redeployment
Enables dark launch
Mechanism for progressive experimentation & refinement
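The behavior described in these points can be sketched with a small in-memory flag store. This is only an illustration under stated assumptions: the `FeatureFlags` class, its method names, and the flag and user names are hypothetical, and a real service would keep flag state in shared storage rather than in process memory.

```python
class FeatureFlags:
    """Runtime flag lookup with a per-flag default and per-user overrides."""

    def __init__(self):
        self._defaults = {}   # flag name -> bool, applies to everyone
        self._overrides = {}  # flag name -> {user: bool}

    def set_default(self, flag, enabled):
        self._defaults[flag] = enabled

    def set_for_user(self, flag, user, enabled):
        self._overrides.setdefault(flag, {})[user] = enabled

    def clear_for_user(self, flag, user):
        self._overrides.get(flag, {}).pop(user, None)

    def is_enabled(self, flag, user):
        """A per-user override wins; otherwise use the default (off if unset)."""
        overrides = self._overrides.get(flag, {})
        if user in overrides:
            return overrides[user]
        return self._defaults.get(flag, False)


flags = FeatureFlags()
# Dark launch: the code path ships to everyone but stays off by default.
flags.set_default("new_navigation", False)
# Expose it to one user at runtime, with no redeployment.
flags.set_for_user("new_navigation", "alice", True)
```

Application code would then branch on `flags.is_enabled("new_navigation", user)`, so widening exposure is a data change rather than a deployment.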


So we decided that this wasn't about the process. It was about the mechanics of our code base, and ensuring that our code base supported the way we wanted to work.

[Figure labels: FeatureFlag, ON, OFF]


We collect everything!
Application Insights Analytics (Project Kusto) for:
text search and queries over structured and semi-structured data
high volume ingestion
fast queries over very large data sets


Tracking Deployments to Production (5 Rings)

Ring 1: Canary (internal users)
Ring 2: Smallest external data center
Ring 3: Largest external data center
Ring 4: International data centers
Ring 5: All the rest
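A ring-by-ring rollout over these five rings might look like the sketch below. The ring names come from the slide; everything else is an assumption for illustration, including the `roll_out` function, the `deploy` and `is_healthy` callbacks, and the rule of stopping at the first unhealthy ring.

```python
RINGS = [
    "Canary (internal users)",
    "Smallest external data center",
    "Largest external data center",
    "International data centers",
    "All the rest",
]


def roll_out(build, deploy, is_healthy):
    """Deploy a build ring by ring, stopping at the first unhealthy ring.

    deploy(build, ring) performs the deployment (hypothetical callback).
    is_healthy(ring) reports whether the ring looks good afterwards
    (hypothetical callback).
    Returns the rings that were deployed and found healthy.
    """
    completed = []
    for ring in RINGS:
        deploy(build, ring)
        if not is_healthy(ring):
            break  # later, larger rings never receive the bad build
        completed.append(ring)
    return completed
```

The ordering is what limits exposure: a problem caught in an early ring never reaches the rings that come after it.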

Service status visible

RCA (Root Cause Analysis) transparency

Example of Learning: Reducing alerting
Precise alerting is necessary, but too much creates noise
Redundant alerts for the same issue
Needed to set the right thresholds and tune often
Stateless alerts contributed to further noise
We set alerting goals:
Every alert must be actionable and represent a real issue with the system
Alerts should create a sense of urgency; false alerts dilute that
Solution:
Consolidate alerts so that only actionable alerts are sent to the team
Auto-route according to a health model based on suspected root cause

Health model in action

Found 3 errors for memory and performance

All 3 errors are related to the same code defect

Eliminated alert noise: ~928 alerts per week to ~22
Reduced DRI escalations by ~56%

APM component mapped to feature team
Auto-dialer engaged Global DRI
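The consolidation and routing idea can be sketched as grouping raw alerts by component and suspected root cause, then sending one alert per group to the owning team. This is a simplified illustration, not the actual health model: the `OWNERS` mapping, the alert fields, the team names, and the fallback to a global DRI queue are assumptions.

```python
from collections import defaultdict

# Hypothetical health model: component -> owning feature team.
OWNERS = {"apm": "apm-feature-team", "build": "build-feature-team"}


def consolidate(alerts):
    """Collapse raw alerts into one routed alert per (component, cause).

    Each raw alert is a dict with "component", "cause" and "message" keys.
    Components without a known owner fall back to a global DRI queue.
    """
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["component"], alert["cause"])].append(alert["message"])
    return [
        {
            "team": OWNERS.get(component, "global-dri"),
            "component": component,
            "cause": cause,
            "count": len(messages),
        }
        for (component, cause), messages in groups.items()
    ]


# Hypothetical raw alerts: three symptoms of the same suspected defect.
RAW_ALERTS = [
    {"component": "apm", "cause": "defect-A", "message": "memory high"},
    {"component": "apm", "cause": "defect-A", "message": "latency high"},
    {"component": "apm", "cause": "defect-A", "message": "cpu high"},
]
```

Here three raw alerts become a single routed alert with a count of three, which mirrors the slide's case of three errors traced to one code defect.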


Security Mindset is Assume Breach
Use War Games to learn the attacks and practice response

Double blind test
Full disclosure at or near end

vs.

Share tactics & lessons learned
Continued evolution
https://www.youtube.com/watch?v=i9qf3VdfcjE


Demo: How we use Visual Studio Team Services for Microsoft's One Engineering System


Microsoft uses Visual Studio Team Services
4x active user growth in 2 years to 64,000


It's About Reducing Cycle Time

Ronny Kohavi, Distinguished Engineer, Analysis and Experimentation

Live Site Culture and Engineering
Live Site Health: Time to Detect; Time to Communicate; Time To Mitigate; Customer Impact; Incident prevention items; Aging live site problems; Customer support metrics; SLA per customer account (SLA, MPI, top drivers)
Engineering: Bug cap per engineer; Aging bugs in important categories; Pass rate & coverage by test level
Velocity: Time to build; Time to self test; Time to deploy; Time to learn (Telemetry pipe)
Usage: Acquisition; Engagement; Dedication; Churn; Feature usage

Segue: Takes us back to Team Dashboard (example of team autonomy and enterprise alignment).
Kanban Board: Expedite Lanes let you handle live site issues.
One thing live site culture requires is that we all be on the same telemetry pipeline.
Azure and services built on Azure.
Where is the problem?


Appendix: Our Dashboards, Backlog, Pipeline


Home Page @ Product Level (VSTS)

All of these dashboard pages use customizable widgets. The account admin can configure them however desired.

Home page @ Epic Level (Agile)

Kanban Board for Epic Area (here, Agile)

All of these boards are customizable (swimlanes, columns, filters, etc.)

Kanban Board for Single Feature Crew within

My Dashboard

My activity

Appendix: Our Telemetry Examples


Home Page for Telemetry

Sample Availability for one Microservice (last 6 hours)

Sample performance by user scenario with percentiles compared

Tracking feature usage