Better test output with AST rewriting and a patched standard library

Quite often, I prototype code in notebooks. To make it easier to transfer the code outside, I also write tests using ipytest, a small package I wrote a while back. However, I never could get pytest's nice assertions to work properly. Prompted by this GH issue, I decided to have another look.

To cut a long story short: the current version of ipytest offers the full pytest experience from inside the notebook. Just run pip install ipytest and give it a try.

In the following paragraphs I would like to describe how I implemented the assertion support. But first: have you ever wondered how Jupyter shows stack traces? Bear with me, it is less obvious than it sounds. As a motivating example, consider running the following code inside a notebook:

def foo():
    raise ValueError()

foo()

If you do, you get a nice stack trace similar to

-------------------------------------------------------------
ValueError                  Traceback (most recent call last)
~/Code/misc-exp/BuildingBlocks/TimeSeries.ipynb in <module>()
      2     raise ValueError()
      3
----> 4 foo()

~/Code/misc-exp/BuildingBlocks/TimeSeries.ipynb in foo()
      1 def foo():
----> 2     raise ValueError()
      3
      4 foo()

ValueError:

So far so good. But when you now try to use the builtin exec function, things start to break down. Just execute

exec('''
def foo():
    raise ValueError()

foo()
''')

and the stack trace becomes a disappointing

-------------------------------------------------------------
ValueError                  Traceback (most recent call last)
~/Code/misc-exp/BuildingBlocks/TimeSeries.ipynb in <module>()
      4
      5 foo()
----> 6 ''')

~/Code/misc-exp/BuildingBlocks/TimeSeries.ipynb in <module>()

~/Code/misc-exp/BuildingBlocks/TimeSeries.ipynb in foo()

ValueError:

It shows neither source code nor line numbers, which makes debugging quite painful (in the plain Python interpreter you get line numbers, but still no source). However, surely there has to be a way to fix this issue: in Jupyter most code is run via exec, yet exceptions reliably give nice stack traces.

Naively, I hoped to find a documented API somewhere in the Python standard library. After an hour of fruitless searching, I decided to have a look at how IPython does it (that this is possible is one of the reasons I love open source). As it turns out, there is no API after all, and IPython relies on Python internals to show stack traces.

The relevant code starts in InteractiveShell.run_cell with the skeleton

cell_name = self.compile.cache(cell, self.execution_count)
code_ast = compiler.ast_parse(cell, filename=cell_name)
code_ast = self.transform_ast(code_ast)
self.run_ast_nodes(code_ast.body, cell_name, ...)

For the most part, the code reads the way I would have written it. Only the line where the cell code is cached sticks out. When you dig deeper, you will find in essence the following code:

# Stdlib imports
...
import linecache
...
# Now, we must monkeypatch the linecache directly so that parts of the
# stdlib that call it outside our control go through our codepath
# (otherwise we'd lose our tracebacks).
linecache.checkcache = check_linecache_ipython
...

def cache(self, code, number=0):
    """...
    Returns
    -------
    The name of the cached code (as a string). Pass this as the filename
    argument to compilation, so that tracebacks are correctly hooked up.
    """
    ...

To summarize: IPython gets nice stack traces by monkeypatching the stdlib linecache module and registering fictitious files. When a traceback is formatted, the patched cache is queried and the content of the fictitious file is returned. I have to admit this process seems a bit brittle to me.
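To make the mechanism concrete, here is a minimal sketch of the same idea using only the standard library. It is not IPython's actual implementation: instead of patching linecache.checkcache, it relies on the assumption that linecache keeps entries whose mtime is None, which holds for current CPython versions as far as I can tell. The helper name exec_with_source and the file name <demo-cell-1> are made up for illustration.

```python
import linecache
import traceback


def exec_with_source(source, filename="<demo-cell-1>"):
    """Execute `source` so that tracebacks can display its lines.

    Hypothetical helper, not part of IPython or ipytest.
    """
    lines = source.splitlines(keepends=True)
    # A linecache entry is (size, mtime, lines, fullname). With mtime=None,
    # linecache.checkcache leaves the entry alone instead of trying to
    # stat a file that does not exist.
    linecache.cache[filename] = (len(source), None, lines, filename)
    # Compile under the same fictitious name, so the frames point at it.
    code = compile(source, filename, "exec")
    exec(code, {"__name__": "__demo__"})


source = "def foo():\n    raise ValueError()\n\nfoo()\n"

report = ""
try:
    exec_with_source(source)
except ValueError:
    report = traceback.format_exc()

print(report)
```

With the linecache entry in place, the printed traceback includes the text of the offending lines. Without it, only file names and line numbers would appear.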

But back to the issue at hand: how to get nicely formatted asserts when running pytest inside a notebook? When run on the command line, pytest modifies your code to execute additional steps that record the inputs of each assert. All the code modification magic lives inside the internal package _pytest.assertion.rewrite. This module includes a function rewrite_asserts that modifies the AST passed to it.
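The following toy example illustrates what rewriting an assert at the AST level can look like. It is far simpler than what pytest does and does not call any pytest internals: it only handles a bare == comparison and evaluates the operands a second time when building the message, which real rewriting avoids by storing intermediate values. The class name ShowCompareOperands is made up for this sketch.

```python
import ast


class ShowCompareOperands(ast.NodeTransformer):
    """Toy rewriter: give `assert left == right` a message with both values."""

    def visit_Assert(self, node):
        test = node.test
        # Only handle the simplest case: one == comparison, no custom message.
        if (
            node.msg is None
            and isinstance(test, ast.Compare)
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
        ):
            # Build the expression: "assert %r == %r" % (left, right)
            node.msg = ast.BinOp(
                left=ast.Constant(value="assert %r == %r"),
                op=ast.Mod(),
                right=ast.Tuple(
                    elts=[test.left, test.comparators[0]],
                    ctx=ast.Load(),
                ),
            )
        return node


source = "actual = [1, 2, 3]\nassert actual == [2, 3, 4]\n"

tree = ShowCompareOperands().visit(ast.parse(source))
# The newly created nodes have no line numbers yet; compile() requires them.
ast.fix_missing_locations(tree)

message = None
try:
    exec(compile(tree, "<rewritten>", "exec"), {})
except AssertionError as exc:
    message = str(exc)

print(message)  # assert [1, 2, 3] == [2, 3, 4]
```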

So how did I end up implementing the assertion rewriting in ipytest? As you do, I ignored package boundaries and simply hooked into the pytest internals (the hypocrisy is not lost on me). Even ignoring issues of encapsulation, you run into the problem of missing stack traces with exec as soon as you try to execute the modified AST. Thankfully, we can just piggyback on IPython's solution to this problem.
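The two steps combine because a rewritten AST keeps the line numbers of the original source. As a rough sketch under my own assumptions (this is not the ipytest code), the snippet below registers the source in linecache by hand rather than through IPython's cache, and uses a trivial placeholder, add_default_message, where pytest's rewrite step would go. All names here are hypothetical.

```python
import ast
import linecache
import traceback


def add_default_message(tree):
    """Placeholder rewrite step: attach a fixed message to bare asserts."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Assert) and node.msg is None:
            node.msg = ast.Constant(value="rewritten assert failed")
    # Fill in line numbers for the new nodes and return the tree.
    return ast.fix_missing_locations(tree)


def run_rewritten(source, filename, rewrite, namespace):
    """Register the source, rewrite its AST, and execute the result."""
    lines = source.splitlines(keepends=True)
    # mtime=None keeps linecache.checkcache from dropping the entry.
    linecache.cache[filename] = (len(source), None, lines, filename)
    tree = rewrite(ast.parse(source, filename=filename))
    exec(compile(tree, filename, "exec"), namespace)


source = (
    "def test_foo():\n"
    "    actual = [1, 2, 3]\n"
    "    assert actual == [2, 3, 4]\n"
)

namespace = {}
run_rewritten(source, "<demo-test-cell>", add_default_message, namespace)

report = ""
try:
    namespace["test_foo"]()
except AssertionError:
    report = traceback.format_exc()

print(report)
```

The traceback points at line 3 of the fictitious file and shows the original assert statement, even though the code that actually ran was compiled from the modified tree.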

After putting everything together, this code

# cell 1
import ipytest.magics
__file__ = 'Untitled.ipynb'

# cell 2
%%run_pytest -v

def test_foo():
    actual = [1, 2, 3]
    assert actual == [2, 3, 4]

gives nicely formatted error messages (including stack traces):

[...]
============================ FAILURES ===========================
---------------------------- test_foo ---------------------------
    def test_foo():
        actual = [1, 2, 3]
>       assert actual == [2, 3, 4]
E       assert [1, 2, 3] == [2, 3, 4]
E         At index 0 diff: 1 != 2
E         Full diff:
E         - [1, 2, 3]
E         + [2, 3, 4]

<ipython-input-3-8b817ee3c044>:4: AssertionError
==================== 1 failed in 0.56 seconds ====================

If you feel ipytest could be useful to you, I would love for you to give it a try. Just install it via pip install ipytest. If you have any issues or feedback, please feel free to contact me via @c_prohm on twitter or github.