Stop the Line + Stop Feature Development, Lean practices for Software Product Development - F-Secure's experience report at LESS2011
Protecting the irreplaceable | f-secure.com
Stop the Line + Stop Feature Development Lean practices for Software Product Development
Gabor Gunyho, Improvement Coach | Juan Gutierrez Plaza, Senior Manager, Agile Practices | Régis Déau, Manager, Testing Practices
2011-11-01
F-Secure – the company
• Founded in 1988, listed on NASDAQ OMX Helsinki
• Market cap ca. 350 m€, annual revenue ca. 130 m€ (2010)
• Headquartered in Helsinki, 18 country offices, presence in more than 100 countries
• 812 people, 300+ in R&D, 5 R&D offices in 4 countries (2010)
© F-Secure Public 2011-11-01 2
Products and Services
Win, Mac, Linux, Android, iOS, RIM, Symbian, 20+ language versions
Customers: 200+ operator partners globally
[Chart: Operator revenue (mEur/quarter)]
About the Authors
Gabor Gunyho: Improvement Coach with the “R&D Global Methods” team at F-Secure; experienced Agile and Lean product development expert; contributor to and reviewer of books on scaling Agile and Lean SW development
Juan Gutierrez Plaza: currently “Agile Practices Manager” at F-Secure's SDC unit, focusing on the R&D transformation of the site. Experienced coach who has helped different teams to improve their engineering and process practices
Régis Déau: Testing Practices Manager at F-Secure's SDC unit, focusing on developing an agile testing culture and improving quality engineering practices in order to continuously raise the R&D standards
What is this presentation all about?
• No “recipe”
• Just to share
how we did it
Text source: http://easteuropeanfood.about.com/od/hungariansoups/r/gulyasleves.htm Image source: http://www.clker.com/clipart-9889.html
The Project
Project setup
• Between 10 and 12 teams (about 100 people)
• Mostly in Helsinki, some in Kuala Lumpur, later also one in Poland
• Mostly feature teams
• Fairly mature in basic Scrum[1] and Agile engineering practices
• Some experience in multi-team projects[2][3] but not on this scale
• Major new product, significant changes in
• Business model
• Architecture
• Longer-Term Planning[4][5], including new backlog tooling
Project timeline
• Started: Dec 2009
• This presentation counts data from March 2010
• Project Split and Spin-off: March 2010
• Intermediate Public Release: Sept 2010
• Limited scope
• Stop the Line practice: since Sept 2010 (1st draft in June)
• Simplification of the practice: Oct 2010
• StL enforcer added on: March 2011
• Stop Feature Development practice: since Sept 2010
• Two-week sprints: 46 so far
• Most resulted in a public Technology Preview release
• Public Release Oct 2011
Stop the Line
What is it?
A Lean practice that originated in the
Toyota Production System (TPS) [6]
Stop-the-Line
Work is stopped if an abnormality is found.
Work continues only when the problem is fixed.
What is it? – The Line
“Line” refers to production/assembly lines in the automobile industry, where
one station takes the output of the previous station as input
Image source: http://www.fourwheeler.com/techarticles/body/129_0703_toyota_assembly_factory/photo_02.html
What is it? - Stopping
• If a problem is found, anyone can “pull the cord”, which:
• Stops the line from moving ahead
• Signals the problem to everyone on the line, pointing
to the station in trouble
Image source: http://www.resourcesystemsconsulting.com/blog/archives/78
Image source: http://www.flickr.com/photos/9516941@N08/3334795306/
Fixing once and for all: Why did it happen? How to avoid it?
• The problem is fixed immediately
In addition:
• To get all the benefits of the Stop the Line practice, a root
cause analysis is done to find what caused the problem
• To prevent recurrence of the same problem, the root
cause is fixed too
Why use it?
• Focus on quality at all times
• Avoid burying problems deep in the product, where they are
more difficult to fix and where more problems may pile up on
top of the identified ones
• Everybody is aware of the problem, so anyone who can
help can contribute to fixing it
• Identify recurrent (systemic) problems so they are solved
once and for all
… and for us in SW development? (1/3)
• Detection
• A Stop-the-Line is raised when
• A build is failing (e.g. it doesn't compile or pass unit testing)
• An automated smoke test fails more than 2 consecutive times
• A problem prevents manual testing from being performed
• Signals and automated actions
• The Stop-the-Line radiator raises the Stop-the-Line flag for the “line”, i.e., the
product area
• The Stop-the-Line commit hook prevents commits to the source
repository for the affected line, except for fixing the StL case
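The three detection rules above can be sketched as a single check. This is only an illustration under stated assumptions, not F-Secure's actual tooling: the `LineStatus` fields and the `should_stop_the_line` function are hypothetical names, and only the "more than 2 consecutive smoke test failures" threshold comes from the slide.

```python
from dataclasses import dataclass


# Hypothetical snapshot of one "line" (product area). The field names are
# assumptions made for this sketch, not part of the original tooling.
@dataclass
class LineStatus:
    build_ok: bool                   # build compiles and unit tests pass
    consecutive_smoke_failures: int  # current run of failed smoke tests
    manual_testing_blocked: bool     # a problem prevents manual testing


MAX_SMOKE_FAILURES = 2  # StL is raised on MORE than 2 consecutive failures


def should_stop_the_line(status: LineStatus) -> bool:
    """Return True if any of the three detection rules is triggered."""
    return (
        not status.build_ok
        or status.consecutive_smoke_failures > MAX_SMOKE_FAILURES
        or status.manual_testing_blocked
    )
```

Tolerating up to two consecutive smoke test failures presumably filters out one-off test flakiness, while a broken build or blocked manual testing stops the line at once.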
… and for us in SW development? (2/3)
• Notification
• E-mail (first approach, issued manually)
• Stop-the-Line Radiator (since March 2010, automatically, from the
build system, with automated scripts)
… and for us in SW development? (3/3)
• Fixing
• A team or person claims the issue using the claim functionality in the radiator
and then starts investigating it
• The same or another team or person starts fixing the problem
• Issues not claimed before the next day are handled in the daily Scrum of
Scrums and picked up by some team
• The team works on the Stop-the-Line case as a high priority item until it is handled
• When the radiator no longer declares Stop-the-Line, the team is freed from this
responsibility
• Other teams not affected by the StL case can continue working on their
area
• Prevention
• The team that worked on the Stop-the-Line case conducts a root cause analysis for
selected cases, records the findings and sends a note to the project mailing list
In short…
Problem is found -> Stop the Line -> Fix the problem immediately -> Root cause analysis -> Fix the root cause
Detection & visualization
• Detected in Test Automation
• Visualization by the radiator
Reaction
• Team claims the StL case
• Team investigates and fixes issue
Prevention
• Root cause analysis done by the team (or multiple teams, if needed)
• Root causes are documented and records made available
• Fixing of root causes is initiated (fixing root causes may take significant effort and time, ROI analysis and planning takes place for bigger initiatives)
An Implementation Detail
• Rule #1: when the StL is on, do not commit new feature
development code to the module that has the StL; only commit bug
fixes
• Unfortunately, not everyone was careful enough to follow this rule
systematically, so some commits not related to fixing the StL problem
were made while the StL was still on
• To prevent such human errors, an automated tool was introduced to
enforce rule #1: the “StL enforcer”
• A hook was added to the repository that checks whether a commit is made
during a StL event; if so, the commit is rejected, except for
those aimed at fixing the StL case
• Introduced in the middle of the project (March 2011)
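The decision made by such a hook can be sketched as follows. The slides do not say how the real StL enforcer recognised a commit as a fix for the StL case, so the commit-message marker `[StL-fix]` is purely an assumption for illustration, as are the function and parameter names.

```python
from typing import Set

# Assumed convention for this sketch: a commit that fixes the StL case
# carries this marker in its message. The slides do not describe how the
# real hook identified such commits.
STL_FIX_MARKER = "[StL-fix]"


def accept_commit(module: str, message: str, stopped_modules: Set[str]) -> bool:
    """Decide whether a commit to `module` may enter the repository.

    `stopped_modules` is the set of modules (lines) currently under
    Stop-the-Line, e.g. as reported by the radiator.
    """
    if module not in stopped_modules:
        return True  # no StL on this line: commit as usual
    # StL is on: only commits aimed at fixing the StL case are allowed.
    return STL_FIX_MARKER in message
```

With `stopped_modules = {"ui"}`, a feature commit to `ui` is rejected, a commit to `ui` whose message contains the marker is accepted, and commits to any other module pass through. This matches the rule that teams not affected by the StL case can continue working on their area.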
Bug Handling & Stop Feature Development
Bug Handling - the Old Model
• High level concept:
• A list of bugs
(and a long one,
i.e., “bug warehouse”)
• Decision making order:
1. Release Quality Engineer or
Project level bug review
2. Team bug review
3. Team member
• Using the bug count metric:
• A way to measure quality
• Release Quality Engineer follows, reports and escalates (no real process to react)
• Bug life cycle:
• Store all, prioritize continuously
• Only high priority bugs get fixed
• The rest remain on the list (>95% of all)
• Maintenance gets all bugs that the development project did not have time to fix before the release date
• Maintenance never fixes these
Redefining bug handling: Our Goal
• Very fast track in closing new cases
• Get all new cases closed in less than 4 weeks (2 sprints)
• Make decisions quickly, as close as possible to the actual place of work
• Avoid building a big inventory (warehouse) of bugs by all
means
• To reduce the recurring effort of prioritizing a long list
Bug Handling - The New Model
• Reversing the old decision-making order,
the new order:
1. Team members
2. Team bug review
3. Team bug review with Product Owner (+other stakeholders if needed)
4. Project bug review
Bug Handling - The New Model
• Using bug count limits – Stop Feature Development
• Work guidance:
• X bugs / team -> STOP new development in the team
• Y bugs / project -> STOP new development in the whole project
• Bug life cycle:
• Extremely fast handling cycle:
• Fix in this sprint
• Fix in next sprint
• Trash (with “reason” category)
• For maintenance:
• Yes, we fix
• Trash (with “reason” category)
Stop Feature Development (SFD)
• What is it, then?
• An enhancement for StL
• Line is stopped not only when tests are not passing but also when the number of non-critical bugs goes over a threshold:
• Per team
• Per project
Later:
• Per Product Area
• Why?
• To control another dimension of the system dynamics
Image sources:
http://johnastor.files.wordpress.com/2011/02/obstacle1.jpg
http://messageboards.aol.com/aol/en_us/articles.php?boardId=89965&articleId=72064&func=5
Stop Feature Development (SFD)
• When/how to invoke it? (examples)
• 10+ cases / team -> Stop Feature Development for the team
• 100+ cases / project -> Stop Feature Development for the whole
project
Later another dimension was added:
• X+ cases / product area -> Stop Feature Development for the
product area
• Product Area A limit: 60 bugs
• Product Area B limit: 30 bugs
• Product Area C1 limit: 20 bugs
• Product Area C2 limit: 30 bugs
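The example limits above can be checked with a few lines of code. This is a sketch under assumptions: “10+” and “100+” are read as “10 or more” and “100 or more”, a product area counts as stopped once its bug count exceeds its limit, and the project count is taken as the sum of the team counts. The function name and data layout are hypothetical.

```python
from typing import Dict, List

# Example limits taken from the slide; the data layout is an assumption.
TEAM_LIMIT = 10      # "10+ cases / team", read here as 10 or more
PROJECT_LIMIT = 100  # "100+ cases / project", read here as 100 or more
AREA_LIMITS = {"A": 60, "B": 30, "C1": 20, "C2": 30}


def scopes_under_sfd(team_bugs: Dict[str, int],
                     area_bugs: Dict[str, int]) -> List[str]:
    """List every scope whose open-bug count triggers SFD."""
    stopped = []
    for team, count in sorted(team_bugs.items()):
        if count >= TEAM_LIMIT:
            stopped.append(f"team:{team}")
    for area, count in sorted(area_bugs.items()):
        # Assumption: an area is stopped once the count exceeds its limit.
        if count > AREA_LIMITS[area]:
            stopped.append(f"area:{area}")
    # Assumption: every bug belongs to exactly one team, so the project
    # count is the sum of the team counts.
    if sum(team_bugs.values()) >= PROJECT_LIMIT:
        stopped.append("project")
    return stopped
```

For example, `scopes_under_sfd({"alpha": 12, "beta": 3}, {"A": 10, "C1": 21})` reports the hypothetical team `alpha` and product area `C1` as stopped, while the project as a whole keeps going.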
Stop Feature Development (SFD)
• When/how to “resume the line”?
• Hysteresis was added to the system to avoid unwanted rapid
switching of state, e.g.,
• Product Area C2 limit for SFD: bug count > 30
• Product Area C2 limit for Resume-the-Line: bug count < 27
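The hysteresis rule amounts to a switch with two thresholds. The sketch below uses the Product Area C2 numbers from the slide. The class and method names are hypothetical, and the sequence of bug counts is invented to show the behaviour.

```python
class SfdSwitch:
    """Two-threshold (hysteresis) switch for Stop Feature Development."""

    def __init__(self, stop_above: int, resume_below: int) -> None:
        if resume_below > stop_above:
            raise ValueError("resume threshold must not exceed stop threshold")
        self.stop_above = stop_above
        self.resume_below = resume_below
        self.stopped = False

    def update(self, bug_count: int) -> bool:
        """Feed the current open-bug count; return True while SFD is on."""
        if not self.stopped and bug_count > self.stop_above:
            self.stopped = True
        elif self.stopped and bug_count < self.resume_below:
            self.stopped = False
        return self.stopped


# Product Area C2 from the slide: stop above 30, resume below 27.
c2 = SfdSwitch(stop_above=30, resume_below=27)
states = [c2.update(n) for n in (29, 31, 30, 28, 27, 26, 29)]
# states == [False, True, True, True, True, False, False]
```

Once the count passes 30, the area stays stopped through 30, 28 and 27 and resumes only at 26. With a single threshold at 30, the state would flip every time the count moved between 30 and 31.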
The new bug handling process - overview
Some valid bugs will
get trashed, but that
is OK in this process!
The new bug handling process - summary
• Bug is created: Choose the correct product area and prioritize the bug with
your best guess.
• Decision: A team decides whether the bug is fixed in this sprint, next sprint
or trashed.
• Fix and test: A team fixes and tests their fix.
• Closing: All new bugs are closed in no more than 2 sprints (4 weeks).
• Inside the teams:
• Bugs not tracked if…
• The bug is fixed and tested by the team within the sprint
• The bug does not cross sprint or team boundaries
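The closing rule (no more than 2 sprints, i.e. 4 weeks) can be monitored with a simple age check. This is a sketch only: the bug IDs, the dictionary layout and the function name are hypothetical, and the slides do not describe how the real process tracked bug age.

```python
from datetime import date, timedelta
from typing import Dict, List

MAX_AGE = timedelta(weeks=4)  # two 2-week sprints


def overdue_bugs(opened: Dict[str, date], today: date) -> List[str]:
    """Return IDs of open bugs older than the 4-week closing target."""
    return sorted(bug_id for bug_id, day in opened.items()
                  if today - day > MAX_AGE)
```

Any bug this check reports has missed the target and needs a decision right away: fix it, or trash it with a “reason” category.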
Statistics
StL+SFD Events vs. Releases
[Timeline chart: StL + SFD events vs. releases, from April 2010 to now, with markers for “StL + SFD” and “StL Enforcer”]
Note: StL event data is not available for sprints 31-43
Sprint 29 – StL Root Cause Analysis Findings
• 11 StL cases were tracked for Sprint 29
• Frequent root cause categories:
1. Blind commits (or insufficiently tested commits)
2. Code/environment changes broke Test Automation
3. Large commits
• Actions that can prevent similar cases in the future:
1. Ensure sufficient testing before commit, commit to a branch if needed
2. Monitor the radiator for smoke test results after commit (delay the commit to
the next day if you plan to leave the office soon)
3. Developers should test final builds manually more often
4. Make smaller and incremental commits
Detailed case breakdown
Root Cause Category (number of cases):
• Code/environment changes broke TA: 3
• Large commits: 2
• Half implemented features: 1
• Blind commits: 4
• Test Environment Changes (IT): 1
How can this be prevented in future? (number of cases):
• Hard / Not worth the prevention effort: 3
• Monitor Radiator: 3
• Smaller & incremental commits: 2
• Faster TA Hardware to shorten TA cycle: 1
• Complete feature implementation: 1
• Developers should test Red builds more often: 3
• Ensure sufficient testing before commit: 5
• Educate developer on what to monitor: 1
Marks per case (root causes and prevention actions combined): Case 1: 2, Case 2: 5, Case 3: 2, Case 4: 6, Case 5: 2, Case 6: 2, Case 7: 3, Case 8: 4, Case 9: 2, Case 10: 2
Conclusions
• Overall quality of the product improved
• Number of StL events decreased over time
• StL enforcer helped to avoid making mistakes
• Not releasing every two weeks BECAME THE EXCEPTION, not the rule
• New bug handling process helped to focus on important bugs
• SFD keeps the number of open bugs at a manageable level
• After a settle-down period, these practices change the mindset of the people to be more quality focused
• Next step:
• StL and SFD are “brakes” to avoid accidents; now we are learning how to drive at high speed safely (i.e., avoid making so many bugs in the first place)
Questions?
Acknowledgements
The authors would like to thank the whole project team and the whole R&D
organization of F-Secure and its management for making this presentation
possible and for supporting the data collection and publishing
We'd like to thank especially
• Petri Kuikka
• Risto Kumpulainen
• Pekka Kiviniemi
• Ferrix Hovi
for their contributions to the bug handling process, the Continuous Integration and
Test Automation system, the radiator design and implementation, and the data
visualization
References
[1] Schwaber, K., Beedle, M.: “Agile Software Development with Scrum”, Prentice Hall (2001)
[2] Larman, C., Vodde, B.: “Scaling Lean & Agile Development: Thinking and Organizational Tools for
Large-Scale Scrum”, Addison-Wesley Professional (2008)
[3] Larman, C., Vodde, B.: “Practices for Scaling Lean & Agile Development: Large, Multisite, and
Offshore Product Development with Large-Scale Scrum”, Addison-Wesley Professional (2010)
[4] Leffingwell, D.: “Scaling Software Agility: Best Practices for Large Enterprises”, Addison-Wesley
Professional (2007)
[5] Leffingwell, D.: “Agile Software Requirements: Lean Requirements Practices for Teams, Programs,
and the Enterprise”, Addison-Wesley Professional (2011)
[6] Womack, J.P., Jones, D.T., Roos, D.: “The Machine That Changed the World” (1990, 2007)
[7] Poppendieck, M., Poppendieck, T.: “Implementing Lean Software Development: from Concept to
Cash”, Addison-Wesley (2007)
Contact Information http://www.slideshare.net/Gunyho [email protected] [email protected] [email protected]
The authors did their best to credit the authors of texts and images and to recognize any copyrights; see more
details of copyrights, license terms and conditions for each source under the reference link provided. If you think
that anything in this material should be changed, added or removed, please contact the authors at the addresses
above.
http://creativecommons.org/licenses/by-nd/3.0/