Metaclasses at work - Generating well-formed tests in Python

2016-09-08

I was fortunate to be given an opportunity to give a lightning talk at PyCon 2016. I spoke about how we can generate well-formed tests using Python's metaclass facility.

Problem Statement:

In many instances, we end up with tests that check for a common set of invariants for a large set of inputs. For example, say we need to test a function such as


    def foo(x, y, z):
        '''
        Args: x, y, z are bools
        '''
        return (x != y and x != z or x and z)

To test this function comprehensively, we will need 8 tests, all of which look the same except for the inputs.


    class TestUnit(unittest.TestCase):
        def check_common_invariants(self, actual_value, expected_value):
            self.assertEqual(actual_value, expected_value)

        def test_scenario_1(self):
            x, y, z = True, True, True

            result = foo(x, y, z)
            self.assertEqual(result, True)

        def test_scenario_2(self):
            x, y, z = True, True, False

            result = foo(x, y, z)
            self.assertEqual(result, False)

        def test_scenario_3(self):
            .
            .
            .

Our objective here is to avoid such boilerplate code. In the real world, the 8 scenarios expressed here could translate into hundreds of inputs from production data that can be used to regression-test new changes to the code.

The alternative approach of looping over all the inputs and putting the assert statements inside the loop is not desirable, since we lose the descriptive error reporting provided by the test runtime. For example, if a test fails, you would not know which input it failed for. It would be an anti-pattern to take such an approach. Here is an example of the anti-pattern.


    def test_all_scenarios(self):
        '''
        This test is an anti-pattern. Looping over all inputs
        does not provide a good error reporting mechanism when a test fails.
        '''
        from itertools import product

        values = [True, False]
        expected_results = [
                True,
                False,
                True,
                True,
                True,
                False,
                False,
                False]

        for i, (x, y, z) in enumerate(product(values, values, values)):
            r = foo(x, y, z)
            self.assertEqual(expected_results[i], r)

In the real world, this would apply to any complex application that deals with a lot of numeric data and produces a large set of outputs. It would be beneficial to use past production data (which we assume has been validated through its use) as a good indicator of expected output. Coupling that with the inputs that produced it, we could create valuable tests that serve as a regression suite and can also be used effectively while refactoring our code.
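
To make that concrete, here is a minimal sketch of how recorded production cases might be loaded and turned into test arguments. The file name production_cases.json and its record layout are assumptions for illustration, not something from the talk:

    import json

    def load_recorded_cases(path="production_cases.json"):
        '''
        Hypothetical helper: reads previously captured (input, output) pairs,
        where each record is assumed to look like
        {"args": [true, true, false], "expected": false}.
        '''
        with open(path) as f:
            records = json.load(f)
        for i, record in enumerate(records, start=1):
            # yields (test_name, input_args, expected_value) - the same shape
            # that get_test_args produces later in this post
            yield (str(i), tuple(record["args"]), record["expected"])

A generator like this could be swapped in wherever get_test_args (defined below) is used.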

So, how do we avoid such boilerplate code? One option is to use Python's metaclass facility to auto-generate our tests.

Let's do this in steps:

  1. We will need a factory function that creates our test methods, which we will attach to our test class.

    def create_func(test_name, input_args, expected_args):
        def func(self):
            result = foo(*input_args)   # call the Function under test
            self.check_common_invariants(result, expected_args)
        func.__name__ = "test_" + str(test_name)
        return func

Here, create_func returns a func object. This func will be attached to the test class we will create in the next few steps, where it will become a test_* method. We will create as many func objects as we have test inputs. Each func object will, in turn, invoke foo and then call check_common_invariants, which verifies all the invariants that the result of the function has to satisfy.
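
As a quick, illustrative sanity check, we can call the factory by hand and confirm that it hands back a function whose name follows the test_* convention that unittest discovers:

    # illustrative only: build one test method for the input (True, True, True)
    f = create_func("1", (True, True, True), True)
    print(f.__name__)   # prints "test_1"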

  2. Now, let's look at how we can loop over all the test inputs and attach the func objects to the test class. We will create a class that will act as the metaclass for our test class.


    class TestFunctionFooMeta(type):
        def __new__(cls, name, bases, attrs):
            def create_func(test_name, input_args, expected_args):
                def func(self):
                    result = foo(*input_args)   # call the Function under test
                    self.check_common_invariants(result, expected_args)
                func.__name__ = "test_" + str(test_name)
                return func

            # get_test_args yields (test_name, input_args, expected_args) tuples
            for test_name, input_args, expected_args in get_test_args():
                func = create_func(test_name, input_args, expected_args)
                attrs[func.__name__] = func
            return type.__new__(cls, name, bases, attrs)

We see that we plugged create_func into the __new__ method of the metaclass that creates our test class. In the for loop, we iterate over all the inputs and expected outputs, get a new func for each, and attach the newly minted func to the test class as it is being created.

  3. And here is the definition of get_test_args.

    def get_test_args():
        from itertools import product

        x = y = z = [True, False]
        input_args = list(product(x, y, z))
        expected_args = [
            True,
            False,
            True,
            True,
            True,
            False,
            False,
            False]

        test_names = list(range(1, len(expected_args) + 1))
        for test_name, i, e in zip(test_names, input_args, expected_args):
            yield (str(test_name), i, e)

All this function does is generate the input/output mapping. In a real-world example, this method would essentially assemble the inputs and grab the expected results from production data. That way, effective regression tests can be built using real data.
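
Putting the pieces together (with foo, TestFunctionFooMeta, and get_test_args defined as above), the test class itself shrinks to the common invariant check plus the metaclass hook. The __main__ guard here is just an assumption about how you would run it:

    import unittest

    class TestUnit(unittest.TestCase, metaclass=TestFunctionFooMeta):
        def check_common_invariants(self, actual_value, expected_value):
            self.assertEqual(actual_value, expected_value)

    if __name__ == '__main__':
        # The metaclass has already attached test_1 ... test_8, so unittest
        # discovers and reports them like hand-written tests.
        unittest.main()

Running this yields eight individually named tests, so a failing case points straight at the input that caused it, which is exactly what the looping anti-pattern above loses.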

Here is the gist.