Agile Operations - Xpdays France 2009


Explaining why developers and operations need to work together.


Agile and Operations

for people with good taste

Gildas Le Nadan - Patrick Debois - Xpdays France 2009

Gildas Le Nadan

Gildas comes from France

He has been working on servers for several years

Patrick Debois

Patrick comes from Belgium

He works as a freelancer, always looking for good opportunities

We both come from sysadmin land

But we have been following the agile developer community with great interest

Finally we decided to give our own version of agile for operations

Today we have two stories for you

The first one is about the impact of agile development on operations

The second one is what it means to do agile in Operations

So letʼs start with the first great story. It starts off like a beautiful project

Once upon a time there was a team that had a lot of Agile Developers with good intentions

They had prepared all their frameworks and IDEs

They had all the tools they needed

They even had experts on Usability

They worked in close cooperation with the customer

They worked hard

Using their Post-its in abundance

They dutifully continued holding their standup meetings

They monitored their progress using a backlog

And this is the result after their first sprint. A fully working product!

And the second sprint again, but still room for a lot of improvement

They worked and they worked, but after a while they realized ...

That they were administering their systems in an ad hoc manner and that they were not super sysadmins

So they called in a sysadmin to fix it

But it turned out their development environment ...

was quite different from their test environment

and nowhere near production environment!

They were still using manual installations

The sysadmins also cleaned up the development and test environments, providing a good base for further development

While doing so, Operations followed their official ITIL guidelines.

While they cleaned things up, a few testers found some bugs

But overall most clients were happy with the quality

But still it felt that something was missing. It required more tasting, er, testing by a senior person

There was a real nasty bug that came up once in a while

But the problem was hard to catch

Eventually they nailed it

Confident now, they decided to make their first public release

There were some minor usability problems.

But they were easily solved by the project team with a temporary fix

They created some workarounds

Delivering a great result

Inspired by the success, Marketing wanted to put out lots of new features

But then the Operations team shouted NOOOOOO!!!!!

because in reality, things started to get ugly in production

At first they had dealt with it by applying some emergency patches

But now things got REALLY ugly

Customers were experiencing bad response times

When they finally activated the logging, carefully checking the impact, the logs were full of useless debug messages

Eventually it turned out that the usage mix was different: real users used it synchronously, but the requests made through the APIs were asynchronous

So operations decided to put this in the FAQ

And they set up a ticket system

They installed larger servers

But the platform stayed fragile and required frequent restarts

One of the classic problems is that projects think they are the only one

But this project was not the only one operations had to support

They also had to document things, so as not to lose any information

because, when the project was finished, some developers were assigned to maintenance fixes. At some point, nobody from the original team was still there, and junior staff were trained to step in

The development team had the following view on the subject

Some people, probably the most senior, had a broader view of the platform that operations was running.

In reality it really looked this way

Ops giving specifications

To avoid this kind of surprise in the future, they invited operations people to the design phase. This way they could pass on their knowledge of the production environment, and it was written down in the request for proposals.

Ops wrote down every requirement they could think of

But this Big Design Up Front resulted in overcomplicated, overdesigned and over-engineered solutions.

The solution seems to be to integrate operations IN the project phase, both at the beginning and during the project. So both in good and bad times...

Because these people will constantly think about ... logs

They will check that sizing is done correctly

They will think of emergency procedures

Make sure Parallel Processing works

That your applications are packaged nicely

that your data can be archived and that the backup AND restore works

take the necessary security measures

think of good deployment tools

They will think about reporting. They will find relations with other systems. They will think of the reports management will request for SLA reporting

In the end everybody will be proud of what they prepared

And that includes the serving staff as well!

If you are now thinking: yeah, but Iʼm in another business

You will always require some kind of log files

You will always need infrastructure

More good tools

Someone who needs to deal with angry customers

Good End User Manuals

The need for archiving

Cleanup Routines

Dealing with capacity peaks

Monitoring the health of your systems

Someone who takes care of supplies to keep your systems going

Hopefully you will see the light in the end

But of course, disasters can still happen!

Agile Manifesto

OK, the operations team needs to be agile, and it needs to be integrated in the project. How would the Agile Manifesto apply to YOUR work as a member of operations?

http://agilemanifesto.org

We value the items on the left more than the items on the right

Individuals and interactions over processes and tools

ITIL vs. Agile

ITIL has a lot of practices for keeping things running. It used to act as a moderator of change, but as development becomes more agile we need to adapt. ITIL v3 has introduced the notion of continuous improvement too.

Operations as a cost centre

Increase in Maturity can bring Value (Gartner scale)

0 Ad hoc, 1 Reactive, 2 Proactive, 3 Service, 4 Value

There’s no magic tool

There’s no magic tool that can save you from bad organization. It still requires you to think!

Working software over comprehensive Documentation

Working means working in operation (scope problem, dev) / working service

Customer Collaboration over Contract Negotiation

Who’s the customer? (Internal, external / different for an ASP, a normal company, internal support)

Responding to change over following a plan

Operations has been doing this for years. Every incident / issue requires us to react/adapt things

Avoid the “Big Design UpFront”.

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software

Our highest priority is to satisfy the customers: end users but also developers

What is early for the customer? 4 days for a server, 2 minutes for a new account? What is value for the customer?

Risk Mgt

Dev / Project = creating value. Ops = preventing loss of value (protecting value).

Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage

Ops are often very resistant to change. The business might require constant adaptation.

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference for the shorter timescale

Do things often so you get better at it.

Avoid Big bang migrations. Go in small steps.

Business people and developers must work together daily throughout the project

Have operations in your project and afterwards. In good and bad times...

Build projects around motivated individuals. Give them the environment and the support they need and trust them to get the job done.

Different environments: dev, test, preprod, training, trial, prod, QA. Do you trust your developers to do deployments? Do you have any secrets or superpowers they don’t have?

The most efficient method of conveying information to and within a development team is face-to-face conversation

Don’t lock yourself in a small room with only email communication!

Working software is the primary measure of progress

If things work, and people are satisfied, you’re doing a good job!

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely

Shared projects, specialists, on call + daily job. Extend deployment power beyond the ops team to spread the load.

Continuous attention to technical excellence and good design enhances agility

Keep your skills sharp! You never know who is looking at you.

Scalability

Think of scalability

Manageability

Manageability (start, stop subparts/ monitor progress)

Maintainability

Maintainability = being able to change the text as the environment changes

Securability


Reliability


Flexibility

Why is it important

Ops has limited control over the elements they need to integrate or take care of

Loose Coupling

[Diagram of components A, A’, B, C, D, E, F, G]

Noodle Soup

Loose coupling

Butterfly Effect


KPI and Monitoring


Simplicity -- the art of maximizing the amount of work not done

http://farm1.static.flickr.com/78/168397680_01673102c2.jpg?v=0

Don’t over-engineer. Be pragmatic.

Simplicity

Design issues: Keep It Simple, Stupid (KISS)

Donʼt over-cluster, loop networks, ...

The best architectures, requirements and designs emerge from self-organizing teams.

Use the tools you can adapt to your needs as you require them, not because they have good marketing.

Closed Software Closed Hardware

Avoid Closed Source Software or Closed Appliances

Multiple Projects

• One Product Owner?

• = Program Manager

Be clear on who your customers are: your boss, project manager(s), tickets?

Incidents vs Projects

Avoid being a shared resource who picks up the phone, takes complaints and handles new projects, so you can commit to your work better.

Pair System Administration

Operations decided to go for pair system administration

For projects but also for incidents. Learning, spreading the knowledge (vs. specialist / hero culture)

http://www.flickr.com/photos/mitikusa/2504868526/

Continuous Improvement

• Burndown charts vs QoS

• Targets, not absolutes / estimation

Always try to improve yourself

Virtualized Hardware

Go virtual on your hardware. Stop your emotions ;-)

Automated Deployment

Automate the things that you don’t want to do over and over again. DRY: don’t repeat yourself
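As a minimal sketch of the DRY idea, the same deployment steps can be generated for every environment from one definition instead of being typed by hand each time. The environment names, host names, package name and step wording below are all made up for illustration; a real setup would hand these steps to whatever deployment tool the team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """A named group of hosts (hypothetical names, for illustration only)."""
    name: str
    hosts: tuple


def deployment_plan(env, package, version):
    """Return the ordered deployment steps for one package in one environment.

    The steps are written once here, so test and production cannot drift apart.
    """
    steps = []
    for host in env.hosts:
        steps.append(f"{host}: stop {package}")
        steps.append(f"{host}: install {package}-{version}")
        steps.append(f"{host}: start {package}")
        steps.append(f"{host}: smoke-test {package}")
    return steps


# Hypothetical environments: one test host, two production hosts.
ENVIRONMENTS = [
    Environment("test", ("test01",)),
    Environment("prod", ("prod01", "prod02")),
]

if __name__ == "__main__":
    for env in ENVIRONMENTS:
        for step in deployment_plan(env, "webshop", "1.4.2"):
            print(f"[{env.name}] {step}")
```

Because the plan is data, it can be reviewed, version controlled and replayed, which is harder to do with a manual installation.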

Config Mgt

Version control your stuff. Use tools like Puppet or Chef instead of custom scripts
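One property that configuration management tools offer over ad hoc scripts is idempotence: you describe the desired state, and applying it twice changes nothing the second time. The sketch below illustrates only that idea in plain Python; it is not how Puppet or Chef are implemented, and the function name `ensure_line` and the example setting are assumptions chosen for illustration.

```python
from pathlib import Path


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was already
    in the desired state. Running it repeatedly is safe.
    """
    path = Path(path)
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # desired state already reached, nothing to do
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical file and setting, for illustration only.
    changed = ensure_line("example.conf", "LogLevel warn")
    print("changed" if changed else "already up to date")
```

The returned flag tells you whether a run actually changed something, which helps with the traceability of configuration changes.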

Doing Incremental Steps

Work in small steps. Changes in configurations: better traceability

Refactoring

You need to correct mistakes.

Test Driven Administration

Be sure that you can test and monitor what you need in order to have things working. Otherwise you are blind when changes happen.
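One way to read "test driven administration" is to write the check for a service before (or together with) the change, and to rerun all checks after every change. The following is a rough sketch under that reading; the helper names `port_is_open` and `run_checks` and the host names are hypothetical, and a real team would more likely plug such checks into its existing monitoring system.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks):
    """Run named checks (callables returning bool); return the failed names."""
    return [name for name, check in checks.items() if not check()]


if __name__ == "__main__":
    # Hypothetical hosts and ports, for illustration only.
    failed = run_checks({
        "web server answers on 80": lambda: port_is_open("www.example.com", 80),
        "database answers on 5432": lambda: port_is_open("db.example.com", 5432),
    })
    print("all checks passed" if not failed else f"failed: {failed}")
```

Running the same checks before and after a patch or hardware change shows quickly whether the change broke something.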

Analyze trends to better predict when things will fail

Even if the project finishes, the environment will change (patches, new hardware). So you need to be able to test

Sometimes cleaning is easy. But it is hard if there are legacy systems with lots of dependencies or no clear owner

Work together on your team’s Continuous Integration system. You will learn a lot
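A simple form of trend analysis is fitting a straight line through past measurements and extrapolating to the point where a limit is reached. The sketch below does this for disk usage with an ordinary least-squares fit; the measurements and the capacity are made-up numbers, and real usage is rarely this linear, so treat the result as a rough warning and not as a forecast.

```python
def linear_fit(xs, ys):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_gb, capacity_gb):
    """Estimate how many days after the last measurement the disk fills up.

    Returns None when usage is flat or shrinking, since no fill date exists.
    """
    slope, intercept = linear_fit(days, used_gb)
    if slope <= 0:
        return None
    full_day = (capacity_gb - intercept) / slope
    return full_day - days[-1]


if __name__ == "__main__":
    # Made-up measurements: usage grows 2 GB per day from 50 GB.
    days = [0, 1, 2, 3, 4]
    used = [50, 52, 54, 56, 58]
    print(days_until_full(days, used, capacity_gb=100))  # prints 21.0
```

With the sample data the fitted line is 50 + 2 * day, so 100 GB is reached on day 25, which is 21 days after the last measurement.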

So the next time youʼre celebrating a new project release

Maybe youʼll remember us

Make Operations Fun Again

So that your operations team will be happy

Thank you for listening!
