Writing high-quality software

How do we know these various statistical functions work? This is potentially very tricky programming, with lots of opportunities to have things go wrong.

The best tool to make sure that software works is unit testing. The idea behind unit testing is to break a module down into separate units—usually functions or classes—and test each unit in isolation. Python gives us two ways to perform unit testing:

  • Putting examples into docstrings for modules, functions, and classes
  • Writing separate unittest.TestCase classes

Most secret agents will be very happy with docstring test cases. They're easy to write. We put them in the docstring right in front of the rest of the code. They're visible when we use the help() function.

We create these docstring test cases by copying and pasting known correct results from interactive Python. The copy and paste will include the >>> prompt to make it easy to find the examples. Of course, we also include the output that's expected. Once we include this in the docstring, the doctest module will find and use the example.

In some cases, we need to fake the expected results. It's actually common to have worked out what the answer is supposed to be before having written any working Python code. If we're sure the docstring example has the expected right answer, we can leverage this and use it to help debug the code.

Let's look at a simple function we wrote earlier:

def mean(values):
    """Mean of a sequence (doesn't work with an iterable)

    >>> from ch_5_ex_1 import mean
    >>> mean( [2, 4, 4, 4, 5, 5, 7, 9])
    5.0
    """
    return sum(values)/len(values)

We added the example interaction to the function's docstring. We included what looks like a copy and paste of the sequence of interactions that will exercise this function. In some cases, we make the sequence up based on what we plan to write, not what we've written.

We can exercise the several different ways. The easiest is this:

python3 -m doctest ch_5_ex_1.py

We run the doctest module as a top-level main application. The single argument to this application is the name of a Python application that has doctest examples pasted into docstrings.

There's no output if everything works. If we're curious, we can ask for more verbose output:

python3 -m doctest -v ch_5_ex_1.py

This will produce voluminous output that shows each test that was found in the docstrings in the module.

The other techniques include building a self-testing module and writing a separate script that just runs tests.

Building a self-testing module and a test module

One of the techniques that works out nicely is using the __name__ == "__main__" technique to add a test script to a library module. We'll evaluate the doctest.testmod() function to test the functions and classes defined in a module.

It looks like this:

if __name__ == "__main__":
    import doctest
    doctest.testmod()

If this module is being run from the command line, it's the main module, and global __name__ will be set to "__main__". When this is true, we can import the doctest module and evaluate doctest.testmod() to confirm that everything else in the module works.

We can also write a separate test script. We might call it "test.py"; it might be as short as this:

import doctest
import ch_5_ex_1
doctest.testmod( ch_5_ex_1 )

This short script imported the doctest module. It also imported the module we're going to test.

We used the doctest.testmod() function to locate doctest examples in the given module. The output looks like this:

TestResults(failed=0, attempted=2)

This is a confirmation that there were two lines of >>> examples, and everything worked perfectly.

Creating more sophisticated tests

There are times when we have to be a little cautious of the doctest example output. These are situations where Python's behavior is not specified to the level of detail where we can copy and paste interactive results without thinking about what we're doing.

When working with dict and set collections, the order of the items is not guaranteed.

  • For a dict, a doctest string needs to include sorted() to force a specific order. It's essential to use sorted(some_dict.items()) instead of simply using some_dict.
  • The same consideration applies to sets. We must use something like sorted(some_set) instead of some_set.

Some internal functions such as id() and repr() can display a physical memory address that's unlikely to be the same each time we run the tests. There's a special comment we can include that will alert doctest to skip the details. We'll include #doctest: +ELLIPSIS and replace the ID or address with ... (three dots).

Another place we might use ellipsis is to shorten up a very long bit of output.

For example, we might have a module docstring like this:

"""Chapter 5, example 1

Some simple statistical functions.

>>> from ch_5_ex_1 import mean, median
>>> data = [2, 4, 4, 4, 5, 5, 7, 9]
>>> data # doctest: +ELLIPSIS
[2, 4..., 9]
>>> mean( data )
5.0
>>> median( data )
4.5

"""

A module docstring must be (almost) the first lines in a module file. The only line that might come before the module docstring is a one-line #! comment. A #! comment line, if present, is aimed at the OS shell and identifies the rest of the file as being a Python script, not a shell script.

We used the # doctest: +ELLIPSIS directive on one of our tests. The result wasn't complete, it had "..." in the expected results to show the parts doctest should ignore.

Floating-point values may not be identical for different processors and OSes. We have to be careful to show floating-point numbers with formatting or rounding. We might use "{:.4f}".format(value) or round(value,4) to assure that the insignificant digits are ignored.

Adding doctest cases to a class definition

We looked at doctests in modules and functions. We can put doctests in several places in a class definition. This is because we have several places to put docstrings.

The class as a whole can have a docstring right at the top. It's the first line after the class statement. Also, each individual method within a class can have its own private docstring.

We might, for example, include a comprehensive docstring at the beginning of our class definition:

class AnnualStats:
    """Collect (year, measurement) data for statistical analysis.

    >>> from ch_5_ex_4 import AnnualStats
    >>> test = AnnualStats( [(2000, 2),
    ...    (2001, 4),
    ...    (2002, 4),
    ...    (2003, 4),
    ...    (2004, 5),
    ...    (2005, 5),
    ...    (2006, 7),
    ...    (2007, 9),] )
    ...
    >>> test.min_year()
    2000
    >>> test.max_year()
    2007
    >>> test.mean()
    5.0
    >>> test.median()
    4.5
    >>> test.mode()
    4
    >>> test.stddev()
    2.0
    >>> list(test.stdscore())
    [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
    """

This provides a complete rundown of all of the features of this class in one tidy summary.

Tip

Our sample data leads to a standard deviation of exactly 2.0. This trick shows that with clever test data, we can circumvent some of the doctest float-point output limitations.