Test Driven Development Andrew Rendell Valtech

Effective and Pragmatic Test Driven Development

Andrew Rendell Valtech

[email protected]

Abstract

Test Driven Development has long been a key tool in the agile toolbox. Often it is suggested that the technique has moved into the mainstream and that not applying a test-first approach is exceptional. Recent coverage in the community has even started to describe a post-TDD approach. Having worked with TDD for the last five years, with varying degrees of rigor and success, I have observed that, far from being ubiquitous, effective application of TDD is uncommon.

This paper takes a pragmatic approach to evaluating the implementation of, impediments to and measurable benefits of TDD on a large, commercially successful project. Analysis of this experience will show how and why TDD is being used incorrectly and how this situation can be corrected. The analysis will show how project delivery improved when a more effective approach was applied.

1. Introduction

This experience report details the application of Test Driven Development to the construction of the Web’n’walk-3 mobile internet portal at T-Mobile International in the UK.

T-Mobile International is one of the largest mobile companies in the world, a major branch of Deutsche Telekom AG whose subsidiaries and affiliated companies serve over 86 million mobile customers worldwide.

Web’n’walk-3 was the mobile internet portal for T-Mobile customers. When a customer pressed the internet button on their phone, the Web’n’walk-3 home page was to be displayed. This mandated that the page be delivered as quickly as possible and the application had to support a large number of concurrent requests. The Web’n’walk-3 portal was required to run under significant load twenty four hours a day, every day of the year.

Unlike previous incarnations of the mobile internet portal at T-Mobile, the Web’n’walk-3 home page could be personalized. Customers could add widgets which displayed their favorite content. A variety of widgets providing everything from news to email clients existed.

From October 2007 the Web’n’walk-3 application was live in a number of European countries.

T-Mobile is very customer focused. As an organization it prides itself on the stability and scalability of the systems providing customer services. This encourages its development organizations to be conservative and employ very careful management of risk. At the same time the mobile telephony market is one of the most competitive in the world of telecommunications. This puts development groups under pressure to respond quickly to stay on the cutting edge.

These conflicting requirements of risk aversion and rapid adaptability create an interesting software development dynamic. Valtech believes that agile, and in particular Test Driven Development, can help these companies deliver quickly and safely.

In the summer of 2007 the marketing organization within T-Mobile set their development teams a challenge: to build a new mobile internet portal which incorporated the latest Web2.0 experience where possible but still delivered carrier grade quality. The marketing team set an exact date and time for launch several months in the future. The timescales were much tighter than for previous projects and initially some doubted that the goal was achievable. This experience report details the small part that a fairly successful application of TDD played in delivering that goal on time.

2. Introduction of TDD

When a project is initiated how does it become ‘test driven’? In common with many other projects the Web’n’walk-3 application was not immediately identified as one that would be using TDD. At the point TDD was first considered, the project had been running for several weeks and had created a sizable body of prototype code. There were some JUnit tests but these lacked structure and there was no common approach (or even clear intent) to achieve high coverage. As the project transitioned from elaboration into a construction phase various new team members were brought on board who identified that a TDD approach should be employed. Immediately upon making this decision, the technical architect used the Cobertura code coverage tool to identify that test coverage was very low and, in several modules, non-existent.

2.1 Communicating the intent

It was communicated to the developers that a test first approach was to be employed. As has been the case on previous projects in my experience there was universal approval from the programmers for this decision. Programmers still consider TDD a cool technique and one worth adding to their repertoire.

The team held a number of white board sessions where the mechanics of TDD were discussed. The technical architect felt that although several team members were inexperienced in this area and would make mistakes they did on the whole understand the objective of using TDD and the implementation details.

2.2 Enforce TDD through application structure

The architecture of the Web’n’walk-3 application was essentially Model View Controller. The complexities of rendering the portal on mobile browsers were encapsulated within the view. All other functions such as customer identification, integration with downstream partners and enabling systems etc. were encapsulated in a set of server side modules. The interface between the views and the server side modules was the set of model objects.

The realization of a widget presented to a customer included a set of views and a set of server side modules. The first activity in development of that widget was the creation of a contract class intended to capture the expected output of the server side modules. This class implemented a number of methods. Each method returned a model object in a state that corresponded to a scenario in the use cases.

For example, development of the email widget involved the view and the server side developers sitting together and creating a class with the following signature. This figure has been slightly revised for commercial and readability reasons.

[Class diagram]

XXXEmailContract
+customerNotAuthenticatedWithPartner() : EmailWidgetModel
+oneReadOneUnreadEmail() : EmailWidgetModel
+noUnreadEmailFiveReadEmail() : EmailWidgetModel
+noEmail() : EmailWidgetModel
+manyEmails() : EmailWidgetModel
+errorGettingEmail() : EmailWidgetModel

Other classes shown: EmailWidgetModel, XXXEmailAcceptanceTest (JUnit4 test), XXXEmailWidgetService

Figure 1: Contracts and Acceptance tests.

Part of the infrastructure developed by the team allowed the view developer to execute the methods in the contract class. This injected the model object returned into the view under development. The view developer could then progress with the complex task of building the UI for all the use case scenarios with no further dependency on the server side developer.
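A minimal Java sketch of how such a contract class and a view consuming it might look is given here. It is illustrative only: the fields of EmailWidgetModel, the values returned for each scenario and the EmailView renderer are assumptions, since the real classes were revised for commercial reasons and are not reproduced in this report.

```java
import java.util.Objects;

// Hypothetical model object; the real EmailWidgetModel fields are not given in the report.
final class EmailWidgetModel {
    final boolean authenticated;
    final int unreadCount;
    final int readCount;
    final boolean error;

    EmailWidgetModel(boolean authenticated, int unreadCount, int readCount, boolean error) {
        this.authenticated = authenticated;
        this.unreadCount = unreadCount;
        this.readCount = readCount;
        this.error = error;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof EmailWidgetModel)) return false;
        EmailWidgetModel m = (EmailWidgetModel) o;
        return authenticated == m.authenticated && unreadCount == m.unreadCount
            && readCount == m.readCount && error == m.error;
    }

    @Override public int hashCode() {
        return Objects.hash(authenticated, unreadCount, readCount, error);
    }
}

// Contract class: one method per use case scenario, each returning the model
// in the state the server side is expected to produce (values are assumed).
final class EmailContract {
    EmailWidgetModel customerNotAuthenticatedWithPartner() { return new EmailWidgetModel(false, 0, 0, false); }
    EmailWidgetModel oneReadOneUnreadEmail()               { return new EmailWidgetModel(true, 1, 1, false); }
    EmailWidgetModel noUnreadEmailFiveReadEmail()          { return new EmailWidgetModel(true, 0, 5, false); }
    EmailWidgetModel noEmail()                             { return new EmailWidgetModel(true, 0, 0, false); }
    EmailWidgetModel manyEmails()                          { return new EmailWidgetModel(true, 12, 30, false); }
    EmailWidgetModel errorGettingEmail()                   { return new EmailWidgetModel(true, 0, 0, true); }
}

// Hypothetical stand-in for a view: it depends only on the model, so it can be
// built against the contract before any server side code exists.
final class EmailView {
    static String render(EmailWidgetModel m) {
        if (m.error) return "Email is unavailable";
        if (!m.authenticated) return "Please sign in";
        if (m.unreadCount + m.readCount == 0) return "No email";
        return m.unreadCount + " unread, " + m.readCount + " read";
    }
}
```

Calling EmailView.render(new EmailContract().oneReadOneUnreadEmail()) exercises the view for that scenario without any server side code being in place.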

The server side developer then created one JUnit test for every method in the contract class.

For example, the figure above details a contract class with the method 'customerNotAuthenticatedWithPartner'. The 'XXXEmailAcceptanceTest' class contains a test 'testCustomerNotAuthenticatedWithPartner'. The xml response from the email provider indicating non-authentication would be added to the behavior of a mock HTTP client. All other configuration data, such as customer database rows, would also be created.

The system is then in a state where the model object returned from a call to the application should match the model object returned by the above contract class method. The JUnit test executes this call and asserts that the resulting model objects are equal. At this point the acceptance test for this particular scenario has passed. These tests run against a full application stack inside a Spring container. Only calls to external systems are stubbed.
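The shape of one such acceptance test can be sketched in plain Java as follows. This is a simplified illustration under stated assumptions, not the project's code: the real tests were JUnit4 tests running inside a Spring container, whereas here the PartnerHttpClient interface, the XML format, the cut-down model and the service logic are all hypothetical.

```java
import java.util.Objects;

// Cut-down, hypothetical model; equality is what the acceptance test asserts on.
final class EmailWidgetModel {
    final boolean authenticated;
    final int unreadCount;

    EmailWidgetModel(boolean authenticated, int unreadCount) {
        this.authenticated = authenticated;
        this.unreadCount = unreadCount;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof EmailWidgetModel)) return false;
        EmailWidgetModel m = (EmailWidgetModel) o;
        return authenticated == m.authenticated && unreadCount == m.unreadCount;
    }

    @Override public int hashCode() { return Objects.hash(authenticated, unreadCount); }
}

// Hypothetical seam for the external email provider; only this is stubbed.
interface PartnerHttpClient {
    String get(String path);
}

// Hypothetical service under test: turns the provider's XML into a model object.
final class EmailWidgetService {
    private final PartnerHttpClient client;

    EmailWidgetService(PartnerHttpClient client) { this.client = client; }

    EmailWidgetModel loadFor(String customerId) {
        String xml = client.get("/mailbox/" + customerId);
        if (xml.contains("<notAuthenticated/>")) return new EmailWidgetModel(false, 0);
        String key = "unread=\"";
        int start = xml.indexOf(key) + key.length();
        int end = xml.indexOf('"', start);
        return new EmailWidgetModel(true, Integer.parseInt(xml.substring(start, end)));
    }
}

// Contract methods for two scenarios (expected values are assumed).
final class EmailContract {
    EmailWidgetModel customerNotAuthenticatedWithPartner() { return new EmailWidgetModel(false, 0); }
    EmailWidgetModel oneUnreadEmail() { return new EmailWidgetModel(true, 1); }
}

// Plain-Java stand-in for the acceptance test: one check per contract method.
final class EmailAcceptanceCheck {
    static boolean customerNotAuthenticatedWithPartner() {
        // Stub the provider response that indicates non-authentication.
        PartnerHttpClient stub = path -> "<response><notAuthenticated/></response>";
        EmailWidgetModel actual = new EmailWidgetService(stub).loadFor("customer-1");
        return actual.equals(new EmailContract().customerNotAuthenticatedWithPartner());
    }

    static boolean oneUnreadEmail() {
        PartnerHttpClient stub = path -> "<mailbox unread=\"1\"/>";
        EmailWidgetModel actual = new EmailWidgetService(stub).loadFor("customer-1");
        return actual.equals(new EmailContract().oneUnreadEmail());
    }
}
```

In a JUnit4 version each static check would become a test method and the final comparison an assertEquals on the two model objects.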

The contract class above has become the test first definition of what is to be delivered. It is unambiguous and easy to understand. The sprint backlog was prioritized so that creation of this contract was the first activity addressed. The last item on the backlog for the use case was the successful execution of the entire acceptance test. This was the measure by which the use case was deemed ‘completed’ by the developer.

In summary, the use of TDD was encouraged by:

• Making a clear statement that TDD was to be used;

• Encouraging practical team discussions about the benefits and implementation;

• Using the Cobertura code coverage tool to regularly measure overall coverage and identify ‘hot spots’ of non-compliance;

• Mandating a development approach which at the very least guaranteed that development would always begin (and hopefully continue) with a test.

2.3 Degree of freedom in application of the technique

One of the most refreshing aspects of working in an agile team is the focus on practicality and empowerment of the programmers. Whilst TDD had obvious benefits, the architect did not want to impose a regime that developers would either apply dogmatically or feel coerced into using at the expense of their own creativity. Therefore TDD was the preferred development approach but it was not mandated. Developers were discouraged from writing code before writing tests (other than the acceptance tests above), but doing so was not seen as a ‘serious’ offence.

Empowering developers did occasionally result in a reduction of quality (described in the next section).

Upon reflection, high test coverage, whether it was achieved through unit or integration tests, should have been non-negotiable. A developer could be brilliant enough to produce low defect, well structured code without coding a test first. This does not help subsequent, possibly less able, developers maintain that same code without the support of high test coverage.

3. Pitfalls in implementing TDD

3.1 Disillusionment

After initial enthusiasm, developers can come to resent TDD as they perceive it as being a technique which reduces their efficiency. This is especially true when burndown charts and measurement of velocity are introduced at the same time as TDD.

The majority of developers involved with the Web’n’walk-3 project were new to TDD. Most started with enthusiasm and applied the approach with vigor.

Often they made mistakes (see below) but then identified and learned from those mistakes and went on to refine their technique. There were a limited number of developers who did not respond positively to the experience. The sequence of events many developers experienced when adopting TDD is listed here:

1. Start using TDD and writing tests at the same time or before writing code.

2. Possibly spend too long writing tests and get consumed in applying the approach dogmatically.

3. Start to slip on the burndown and experience the pressure of having a velocity markedly lower than that of their peers.

4. Find that refactoring work is taking longer than they expected because poor code structure has led to a large volume of fragile test code which also needs refactoring.

5. Start to comment out tests or mark them with @Ignore.

6. Stop writing tests at all or vastly reduce the number of tests in place before code is checked in. Attempt to write the tests last because the code can be considered ‘complete’ in their SCRUM report without the test (i.e. it can be used and nobody will immediately notice the poor coverage).

7. Fail to catch up with retrofitting tests.

8. End up with large areas of code without any testing which would take far too long to retrospectively write tests for.

Most practitioners on our project made some or all of these mistakes. Even our best developers experienced events 1 through 4 in the first few iterations of development. Efforts were made to identify developers who were falling into these traps as early as possible and provide them with help. Often this was as simple as insisting that they revisit their estimates for work.

Developers who came away from the project with a negative view of TDD were those who went through all the above stages and still failed to correct their behavior.

In many cases those developers had been seen as very successful on other, non-TDD, projects. They had been held in high regard by their peers and managers. Their perception was that they had a strong record of delivery until they were forced to use TDD.

Burndown charts and the use of acceptance tests as the definition of completeness gave a clear measure of progress on Web’n’walk-3. On these developers’ previous non-agile projects, successful delivery may have been more subjective. These developers now perceived themselves to be under far greater scrutiny through the burndown charts. They also noted that instead of just cutting code, as they would have previously, they had to spend development time writing tests. Their (incorrect) conclusion was that it was the application of TDD which was responsible for their lack of success and acclaim on the new project. In my opinion, this was a major factor in those developers’ disillusionment with TDD.

3.2 Dogmatic application

Dogmatic application, especially common among new practitioners, leads to increased costs without a corresponding return on investment.

During the initial development of Web’n’walk-3 the Cobertura tool revealed an unexpectedly uneven distribution of test coverage. Some modules had very high levels of coverage in domain objects and less in their services. Further investigation revealed that tests had been written with the sole purpose of increasing code coverage without providing any additional value. The most obvious of these were tests on the getters and setters of POJOs. Where tests would have been most valuable, in the complex business logic of the services, there was less coverage.

Poor test coverage was more common in those classes whose construction did not enable straightforward test creation. More experienced developers simply redesigned the code to make it more testable when writing the test became onerous. In the majority of cases this led to better code.
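One common form of such a redesign is to inject a collaborator that would otherwise be hard to control in a test. The sketch below is a hypothetical illustration and is not taken from the Web’n’walk-3 code base: the CachedHeadline class and its clock parameter are invented to show the idea.

```java
import java.util.function.LongSupplier;

// Hypothetical example: a cached piece of content that goes stale after a while.
// Reading System.currentTimeMillis() directly would make staleness awkward to test,
// so the clock is passed in and a test can supply its own.
final class CachedHeadline {
    private final LongSupplier clockMillis;
    private final long maxAgeMillis;
    private long fetchedAt;
    private String value;

    CachedHeadline(LongSupplier clockMillis, long maxAgeMillis) {
        this.clockMillis = clockMillis;
        this.maxAgeMillis = maxAgeMillis;
    }

    void store(String headline) {
        value = headline;
        fetchedAt = clockMillis.getAsLong();
    }

    // Stale when nothing has been stored or the stored value is older than maxAgeMillis.
    boolean isStale() {
        return value == null || clockMillis.getAsLong() - fetchedAt > maxAgeMillis;
    }
}
```

Production code would pass System::currentTimeMillis as the clock, while a test passes a supplier whose value it advances by hand. The extra constructor argument is the kind of slightly less elegant code that buys a much simpler test.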

Occasionally this approach produced code that was less elegant in some way than that of the developer’s initial implementation. In some cases this inelegance took the form of slightly more lines of code or possibly the removal of a novel language feature. Our general approach was that it was better to have less elegant code if it could be tested with greater ease and the purpose of those tests was more obvious.

4. Benefits

4.1 Better cohesion

As practitioners gain experience of TDD their code quality increases. Classes become more cohesive and less tightly coupled.

The example below describes how one particular module was initially implemented with a dogmatic and less effective approach to TDD. It was then refactored using a more refined approach which resulted in a much better code structure.

Greg, a developer, was new to TDD. He was tasked with the implementation of a module which would fetch customer account data from an auction site and deliver it to the Web’n’walk-3 view layer. The module had a particular area of complexity, the transformation of the data from the auction site API into a form for the Web’n’walk-3 view. It was this area of complexity that required particularly rigorous testing and is the subject of this example.

As development progressed, requirements were refined and defects were reported. This led to a number of updates to the code. As the class diagram (amended for commercial and readability reasons) shows, the structure of the module degraded. The logic which implemented the transformation of data was implemented in three different locations. (This should have been spotted by Greg regardless of whether TDD was being used.)

Greg applied tests in a dogmatic fashion. It is likely that he did not always write the test first but built them to test code he had just created. This meant he failed to reap the structural benefits that writing the tests first might have brought to his code. It is worth noting that the code did have high test coverage, which was useful in later refactoring.

[Class diagram showing WidgetService, Worker, XmlService, HttpClient and AuctionWidgetModel, with three test classes: WorkerTest (+testCombineWatchingAndBuyingTotals()), XmlServiceTest (+testWatchingData()) and AuctionWidgetModelTest (+testSelling()). Annotations: “Logic for implementation of transformation implemented piecemeal in several classes”; “Logic for testing data transformation implemented in several different classes”.]

Figure 2: Low cohesion

When other developers came to correct defects caused by incorrect data transformation, they applied the TDD approach. First they started with the data from the auction site that reproduced the defect. Then they looked for the test class in which to add the new test. It quickly became obvious there was an issue as there was no single place to add the test. Instead two or sometimes three test classes had to be amended. This was an obvious ‘bad smell’.

In the next release a backlog item was executed which called for the refactoring of this particular module. Using the acceptance tests a second developer, Jo, started from the entry point to the system (the widget service in the above diagram) and worked down. Jo was very careful to ensure that the test classes were exercising a cohesive set of functionality. Where a single test class performed very different types of test this was a cue for the class under test to delegate functionality. This is classic O-O design which could have been achieved using many other techniques but TDD was found to be particularly methodical and rigorous.

The construction of simple tests became a litmus test for cohesive code. The end result was a much cleaner class structure where all the data transformations were in a single, simple class. The tests which defined how the proprietary auction XML was interpreted were also encapsulated in a single class.

The diagram below shows the revised class structure after refactoring.

[Class diagram showing WidgetService, Worker, HttpClient, XmlAdaptor and AuctionWidgetModel, with a single test class, XmlAdaptorTest. Annotation: “All data transformation logic encapsulated in XmlAdaptor. All tests and test data in XmlAdaptorTest.”]

Figure 3: High cohesion

The developers identified many other examples where the top down approach from the acceptance test led to simpler classes with better cohesion.
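The refactored structure, with a single class owning the data transformation, might look something like the Java sketch below. Only the class names XmlAdaptor and AuctionWidgetModel come from the diagrams. The XML element names, the model fields and the method names are assumptions made for illustration, since the proprietary auction format is not reproduced in this report.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical model: counts of items the customer is watching, buying and selling.
final class AuctionWidgetModel {
    final int watching;
    final int buying;
    final int selling;

    AuctionWidgetModel(int watching, int buying, int selling) {
        this.watching = watching;
        this.buying = buying;
        this.selling = selling;
    }

    int watchingAndBuyingTotal() { return watching + buying; }
}

// All knowledge of the (assumed) auction XML format lives in this one class,
// so every transformation defect has exactly one test class to go to.
final class XmlAdaptor {
    static AuctionWidgetModel toModel(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return new AuctionWidgetModel(
                count(root, "watching"), count(root, "buying"), count(root, "selling"));
        } catch (Exception e) {
            throw new IllegalArgumentException("Unparseable auction XML", e);
        }
    }

    // A missing element is treated as a count of zero (an assumed rule).
    private static int count(Element root, String tag) {
        NodeList nodes = root.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? 0 : Integer.parseInt(nodes.item(0).getTextContent().trim());
    }
}
```

With this shape, reproducing a transformation defect means adding one sample XML document and one assertion to XmlAdaptorTest, rather than amending two or three test classes.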

4.2 TDD encourages YAGNI

YouAintGoingToNeedIt, as described in Ron Jeffries’ wiki posting with the same title [1], is a powerful tool when building software. If there is no use for a particular feature, then it should not be built. This appears so obvious that it hardly needs stating but far too often functionality is added speculatively, especially when building infrastructure type libraries for use by other parts of the system.

Driving development top down from the acceptance tests encouraged developers to apply the YAGNI principle. They were naturally much less likely to build that extra piece of cool code which turns out to be of limited benefit in meeting the user goal.

Infrastructure components that might be used by several different modules should be built in response to a requirement identified by writing a class which requires that service. Building infrastructure components before building any users of those components tends to lead to the YAGNI principle being violated.

It was obvious from the early stages of the project that some concurrency infrastructure was required. Aggregation of content for a customer’s home page would result in requests to different downstream systems. Those requests had to be executed in parallel. Failure to do so would have resulted in a page rendering time equal to the sum of all downstream request latencies. The aim was only to wait as long as the slowest request before completing a page. A series of whiteboard sessions was held and a design by sketch was created. This parallelization infrastructure was built at the same time as the classes that would eventually use it. The final implementation of this infrastructure component did achieve what was intended, but at a cost. It was difficult for client classes to use and contained many features that seemed 'useful' to the developer but were never used in anger. A later refactoring exercise started development by building tests which were driven by the client classes. The resulting solution was far smaller, simpler to use and had fewer defects. Partly this was because the second time you do something you do it better. A great part of it was due to focusing on the real use rather than what was predicted.
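The core requirement, waiting only as long as the slowest downstream request, can be sketched in a few lines of Java. This is not the project's parallelization component, whose design is not reproduced here; the ParallelFetcher class and its fetchAll method are hypothetical and use only the standard java.util.concurrent library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run every downstream request on its own thread so the
// total wait is roughly that of the slowest request, not the sum of all of them.
final class ParallelFetcher {
    // Returns the results in the same order as the requests were given.
    static List<String> fetchAll(List<Callable<String>> requests) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, requests.size()));
        try {
            // invokeAll blocks until every request has completed.
            List<Future<String>> futures = pool.invokeAll(requests);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for downstream systems", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A downstream request failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A client class passes one Callable per widget and receives the content back in the same order. A production version would also need timeouts and per-widget failure handling, which are deliberately left out here in the spirit of building only what a client actually requires.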

4.3 Virtuous circle of agility

Solutions architects and business stakeholders quickly grasp that tests increase confidence in the functional correctness of a given area. This makes them more confident in their support of other agile techniques such as refactoring. This introduces a virtuous circle where code can be cut faster because developers know that they will be able to go back and optimize or add features later with management approval. It also means that the code base stays dynamic and is less likely to be the subject of arbitrary and all encompassing code freezes dictated by senior managers attempting to reduce risk of instability.

The following is a powerful example of this virtuous circle:

A new requirement emerged from the marketing team in the area of customer identification. Detailed analysis of the user population showed that a particular mobile handset known to be in wide use but not thought to be popular for mobile internet was actually responsible for a significant percentage of traffic on the network. The issue was that this particular handset had some very specific requirements regarding the identification of the customer using it.

The code module responsible for identification was the most complex of any in the system. It contained a myriad of business rules. It had dependencies on several downstream enabling platforms. Customer identification is very important as it is obviously key to the operation of an application whose market differentiator was a personalized experience.

Customer identification obviously has some critical security requirements, especially when presenting possibly sensitive personal data such as emails. Protection of customers’ data is very important to T-Mobile. To mitigate the risk of an error in this part of the system detailed analysis had been performed by the development team and solution architects. State machines had been formulated and captured in UML. These were then used to construct a large suite of manual tests that were executed using a variety of handsets and radio networks by a very skilled tester.

Unfortunately at the point in the project where the new requirement was identified the human test exercise had already been completed. A meeting was called and attended by representatives of testing, solution architecture, program management and development. The mood was somber. Given the team’s commitment to quality and the client’s aversion to risk in this area in particular it looked likely that incorporating the requirement would be impossible without a slippage. This all changed when the solution architect made the following, critical, statement:

“We must not forget this is a very different system to those we have gone live with in the past. We have an extensive array of automated tests that the developer built as part of the design and analysis of the identification component. Peter [the human tester] ran a very comprehensive test suite on this code and proved it was correct. The bugs that Peter found were fed back into Matt’s [the developer] JUnit tests. They all pass now. We can make this new change with a high degree of confidence that it won’t break the existing functionality. We have not had a server side system built like this in the past. It is a big step forward. We can do this.”

The confidence that the TDD approach had given the solution architect enabled him to make this statement in an environment where carrier grade quality is jealously guarded.

The change was made, the tests passed and this part of the system went live and continues to operate with a remarkable level of quality. The identification component had seventeen discrete states in its state machine. Its dependency graph included fourteen different Spring beans. During testing only five defects were identified. After go live no defects were reported in what is possibly the most complex single part of the system. Four months after go live there was an update in the enabling platforms which changed the way the devices in question here were identified. This introduced complex new states that were previously un-testable outside of a mocked environment. No defects were reported.

5. Conclusions

Test Driven Development appears to be a technique that many developers are very aware of but lack the experience needed to use most effectively.

When developers do gain the necessary experience in Test Driven Development and apply it pragmatically it does enhance the structure of code.

Test Driven Development was not responsible for the success of Web’n’walk-3. It was, however, one of many important techniques applied by the team which allowed T-Mobile to achieve a higher degree of adaptability than had been exhibited by similar projects without sacrificing their commitment to carrier grade quality.

6. References

[1] Jeffries, R., You’re NOT gonna need it! http://www.xprogramming.com/Practices/PracNotNeed.html
