Test-Driven Development Principles
TDD consists of writing test cases that cover a desired feature, then writing the feature itself. In other words, the usage examples are written before the code even exists.
For example, a developer who is asked to write a function that computes the average of a sequence of numbers will first write a few examples of how to use it, along with the expected results:
assert average(1, 2, 3) == 2
assert average(1, -3) == -1
These examples can be provided by another person as well. From there, the function can be implemented until the two examples work:
>>> def average(*numbers):
...     return sum(numbers) / len(numbers)
...
>>> assert average(1, 2, 3) == 2
>>> assert average(1, -3) == -1
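A bug or an unexpected result becomes a new usage example that the function should be able to deal with. Here, dividing two integers truncates the result (the behavior of / in Python 2), so the average of 0 and 1 comes out as 0: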
>>> assert average(0, 1) == 0.5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
The code can be changed accordingly, until the new test passes:
>>> def average(*numbers):
...     # makes sure all numbers can be used as floats
...     numbers = [float(number) for number in numbers]
...     return sum(numbers) / float(len(numbers))
...
>>> assert average(0, 1) == 0.5
And more cases will make the code evolve:
>>> try:
...     average()
... except TypeError:
...     # we want an empty sequence to throw a type error
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 3, in average
ZeroDivisionError: integer division or modulo by zero
>>>
>>> def average(*numbers):
...     if numbers == ():
...         raise TypeError(('You need to provide at '
...                          'least one number'))
...     numbers = [float(number) for number in numbers]
...     return sum(numbers) / len(numbers)
...
>>> try:
...     average()
... except TypeError:
...     pass
...
From there, all tests can be gathered in a test function, which is run every time the code evolves:
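>>> def test_average():
...     assert average(1, 2, 3) == 2
...     assert average(1, -3) == -1
...     assert average(0, 1) == 0.5
...     try:
...         average()
...     except TypeError:
...         pass
...     else:
...         # the empty call must not succeed silently
...         raise AssertionError('TypeError not raised')
...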
>>> test_average()
Every time a change is made, test_average is updated together with average, then run again to make sure all cases still work. The usual practice is to gather all tests in the tests folder of the current package, where each module can have a corresponding test module.
This approach provides a lot of benefits by:
• Preventing software regression
• Improving code quality
• Providing the best developer documentation
• Producing robust code faster
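As an illustration of the "one test module per module" layout, the examples gathered so far could be written as a unittest test case. This is only a sketch: the file name tests/test_average.py is hypothetical, the code assumes a recent Python 3, and in a real package average would be imported from its own module rather than defined next to the tests. The run_tests helper is also an assumption, added so the sketch can be executed on its own.

```python
import unittest


def average(*numbers):
    """Return the arithmetic mean of the given numbers."""
    if not numbers:
        raise TypeError('You need to provide at least one number')
    numbers = [float(number) for number in numbers]
    return sum(numbers) / len(numbers)


class TestAverage(unittest.TestCase):
    """The examples from the text, one test method per use case."""

    def test_integers(self):
        self.assertEqual(average(1, 2, 3), 2)
        self.assertEqual(average(1, -3), -1)

    def test_fractional_result(self):
        self.assertEqual(average(0, 1), 0.5)

    def test_no_arguments(self):
        # an empty call is expected to raise TypeError
        self.assertRaises(TypeError, average)


def run_tests():
    """Run TestAverage and return True when every test passes."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Splitting the examples into separate methods means a failure report names the use case that broke, instead of stopping at the first failed assert.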
Preventing Software Regression
We all face software regression issues in our developer lives. A software regression is a new bug introduced by a change. Regressions happen because, past a certain point, it is impossible to predict everything a single change in a codebase might affect. Changing some code might break other features, and sometimes cause nasty side effects, such as silently corrupting data.
To avoid regression, the whole set of features a piece of software provides should be tested every time a change occurs.
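To make the idea concrete, here is a minimal sketch of a regression and of how the gathered examples expose it. The function average_after_change is a hypothetical later edit, not something from the original code: it drops the float conversion and uses floor division (written // so the effect is the same under Python 3). The passes_examples helper is likewise an assumption for illustration.

```python
def average(*numbers):
    """Implementation developed earlier in the text."""
    if not numbers:
        raise TypeError('You need to provide at least one number')
    numbers = [float(number) for number in numbers]
    return sum(numbers) / len(numbers)


def average_after_change(*numbers):
    """Hypothetical later edit: float conversion removed, floor
    division used, so fractional results are silently truncated."""
    if not numbers:
        raise TypeError('You need to provide at least one number')
    return sum(numbers) // len(numbers)


def passes_examples(func):
    """Return True if func satisfies every example gathered so far."""
    results = [
        func(1, 2, 3) == 2,
        func(1, -3) == -1,
        func(0, 1) == 0.5,
    ]
    try:
        func()
    except TypeError:
        results.append(True)
    else:
        results.append(False)
    return all(results)
```

The edited version still handles the first two examples, so a quick manual check could miss the problem. Only the full set of examples, rerun after the change, shows that average(0, 1) no longer returns 0.5.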
Opening a codebase to several developers amplifies the problem, since each person will not be fully aware of all development activities. While having a version control system prevents conflicts, it does not prevent all unwanted interactions.
TDD helps reduce software regression. The whole software can be automatically tested after each change. This works as long as each feature has a proper set of tests. When TDD is properly done, the test base grows together with the codebase.
Since a full test campaign can take quite a long time, it is good practice to delegate it to a buildbot, which can do the work in the background (this is described in Chapter 8). Developers should still re-run the tests locally themselves, at least for the modules concerned.
Improving Code Quality
When a new module, class, or function is written, the developer focuses on how to write it and how to produce the best possible piece of code. But while concentrating on algorithms, the developer might lose sight of the user's point of view: How and when will the function be used? Are the arguments easy and logical to use? Is the name of the API right?
Applying the tips described in the previous chapters, such as Choosing Good Names, helps answer these questions. But the only way to do it efficiently is to write usage examples. This is when the developer finds out whether the code is logical and easy to use. Often, the first refactoring occurs right after the module, class, or function is finished.
Writing tests, which are use cases for the code, helps in keeping this user point of view. Developers will therefore often produce better code when they use TDD. It is difficult to test gigantic functions that both calculate things and have side effects. Code that is written with testing in mind tends to be architected more cleanly and modularly.
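One way testing pushes toward a cleaner design is by separating the calculation from the side effect. The sketch below is an assumption for illustration, not code from the original: format_report and write_report are hypothetical names, and the side effect is simply writing to a stream.

```python
import io


def average(*numbers):
    """Return the arithmetic mean of the given numbers."""
    if not numbers:
        raise TypeError('You need to provide at least one number')
    numbers = [float(number) for number in numbers]
    return sum(numbers) / len(numbers)


def format_report(values):
    """Pure calculation: builds the report text, touches nothing else."""
    return 'average of %d values: %.2f' % (len(values), average(*values))


def write_report(values, stream):
    """Thin wrapper that holds the side effect (writing to a stream)."""
    stream.write(format_report(values) + '\n')


# the pure part is tested with a plain comparison
assert format_report([1, 2, 3]) == 'average of 3 values: 2.00'

# the side effect is tested against an in-memory stream
buffer = io.StringIO()
write_report([1, 2, 3], buffer)
assert buffer.getvalue() == 'average of 3 values: 2.00\n'
```

Had both steps been written as one function that opens a file and writes to it, every test would need a real file on disk.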
Providing the Best Developer Documentation
Tests are the best place for a developer to learn how software works. They are the use cases the code was primarily created for. Reading them provides quick and deep insight into how the code works. Sometimes, an example is worth a thousand words. The fact that these tests are always up to date with the codebase makes them the best developer documentation a piece of software can have. Tests don't go stale the way documentation does; if they did, they would fail.
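The standard doctest module takes this idea literally: usage examples written in a docstring are executed as tests, so the documentation cannot drift from the code without a failure. The following is a minimal sketch assuming a recent Python 3. The check_docstring_examples helper is a hypothetical name added so the examples can be run programmatically.

```python
import doctest


def average(*numbers):
    """Return the arithmetic mean of the given numbers.

    >>> average(1, 2, 3)
    2.0
    >>> average(0, 1)
    0.5
    >>> average()
    Traceback (most recent call last):
        ...
    TypeError: You need to provide at least one number
    """
    if not numbers:
        raise TypeError('You need to provide at least one number')
    numbers = [float(number) for number in numbers]
    return sum(numbers) / len(numbers)


def check_docstring_examples():
    """Run the examples in average's docstring.

    Returns the number of failed examples (0 means the
    documentation still matches the code).
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        average.__doc__, {'average': average}, 'average', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test).failed
```

If the error message or the return type of average changed, the docstring examples would fail and point at the outdated documentation.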
Producing Robust Code Faster
Writing code without tests leads to extensive debugging sessions. A bug in one part of the software might show up in a distant part of that software. Since you don't know what to blame, you spend an inordinate amount of time debugging. It's better to fight small bugs one at a time when a test fails, because you'll have a better clue as to where the real problem is. And testing is often more fun than debugging because it is coding.
If you measure the time taken to fix the code together with the time taken to write it, it will usually be longer than the time a TDD approach would take. This is not obvious when you start a new piece of code, because the time taken to set up a test environment and write the first few tests is extremely long compared to the time taken just to write the first pieces of code.
But there are some test environments that are really hard to set up. For instance, when your code interacts with an LDAP or an SQL server, writing tests is not obvious at all. This is covered in the Fakes and Mocks section in this chapter.
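As a rough preview of the idea, code that talks to a server can be tested against a hand-written stand-in. Everything in this sketch is hypothetical: FakeDirectory, its search method, and display_name are illustrative names and do not correspond to any real LDAP or SQL library API.

```python
class FakeDirectory(object):
    """Hand-written stand-in for a directory server connection.

    Hypothetical interface: a single search(login) method.
    """

    def __init__(self, entries):
        self._entries = entries

    def search(self, login):
        # mimics a lookup without any network access
        return self._entries.get(login)


def display_name(directory, login):
    """Code under test: it only relies on directory.search()."""
    entry = directory.search(login)
    if entry is None:
        return 'unknown user'
    return entry['name']


# the test needs no server, only the fake
fake = FakeDirectory({'bob': {'name': 'Bob Smith'}})
assert display_name(fake, 'bob') == 'Bob Smith'
assert display_name(fake, 'alice') == 'unknown user'
```

Because display_name receives the connection as an argument, the test can hand it the fake while production code passes the real object.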