Prototyping to tested code

An introduction to ipytest

Christopher Prohm (@c_prohm)
PyCon.DE 2018, Karlsruhe


Disclaimer

The views and opinions expressed in this talk are mine and do not necessarily reflect those of my employer. The content and materials are my own.

About Me


Physicist by training, turned data scientist.
Working at Volkswagen Data:Lab in Munich.


Avid user of Jupyter notebooks. Also, conflicted about Jupyter notebooks.

Jupyter notebooks

My take on it

Notebooks are hard:

  • global state is confusing to me
  • git and notebooks do not mesh well in my view
  • my notebooks seem to become messier over time

But overall, notebooks increase my productivity enormously:

  • rapid feedback and exploration
  • documentation (incl. math)

Notebook vs. Modules?


Use what is most efficient.


Combine the best of both worlds,
move code progressively out of notebooks.


Use same libraries & tooling inside and outside notebooks.


The notebook-module continuum

[Figure: a spectrum from notebook to modules, annotated with libraries: matplotlib, bokeh, altair, ...; dask, pyspark; mlflow; panel (*); ...]

What does such a workflow look like in practice?
How does testing fit into this picture?
How does ipytest support pytest inside notebooks?

Getting Started

 
! pip install pytest       <------- Hugely popular testing framework
! pip install ipytest      <------- Integration of pytest and notebooks
                                    Full disclosure: I am the author.
In [2]:
import ipytest.magics                  # <--- enable IPython magics


import ipytest                         # <--- enable pytest's assert rewriting
ipytest.config.rewrite_asserts = True  # 


__file__ = "IPyTestIntro.ipynb"        # <--- make the notebook filename available 
                                       #      to ipytest
In [3]:
%%run_pytest[clean] -qq

def test_example():
    assert [1, 2, 3] == [1, 2, 3]
.                                                                [100%]

The main ipytest API


%%run_pytest[clean] -qq
     ^         ^     ^
     +---------|-----|---- execute tests with pytest
               |     |
               +-----|---- delete any previously defined tests
                     |
                     +---- arbitrary pytest arguments
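
Why [clean] matters: test functions defined in earlier cells stay in the notebook's global namespace and would otherwise be collected again on the next run. The cleanup step can be sketched roughly as follows. This is a minimal illustration, assuming tests are recognized by a `test*` name pattern; the helper name `drop_test_names` is hypothetical and not taken from ipytest.

```python
import fnmatch


def drop_test_names(namespace, pattern="test*"):
    """Delete every name matching `pattern` from `namespace`.

    Hypothetical helper for illustration; returns the removed names, sorted.
    """
    removed = [name for name in list(namespace) if fnmatch.fnmatchcase(name, pattern)]
    for name in removed:
        del namespace[name]
    return sorted(removed)


# Simulate a notebook namespace that still holds tests from earlier cells.
notebook_globals = {"test_old": object(), "keep_odds": object(), "test_stale": object()}
removed_names = drop_test_names(notebook_globals)
```

After the call, only non-test names such as `keep_odds` remain, so a renamed or deleted test can no longer be picked up by accident.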



pytest support

 

pytest is doing all the heavy lifting 😀. Most (all?) pytest features work out of the box.

  • @pytest.mark.*
  • @pytest.fixture
  • --pdb
  • -l
  • ...
  • Assertion rewriting
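
As a sketch of what the first two bullets look like in practice, the cell below combines a fixture with a parametrized test. The function `keep_odds` mirrors the example used in this talk; the fixture name `numbers` and the test names are made up for illustration. Inside a notebook, the cell would start with `%%run_pytest[clean] -qq`; the body is ordinary pytest code.

```python
import pytest


def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]


@pytest.fixture
def numbers():
    # Shared test input, injected by pytest into tests that ask for `numbers`.
    return [1, 2, 3, 4, 5, 6]


def test_keep_odds_with_fixture(numbers):
    assert keep_odds(numbers) == [1, 3, 5]


@pytest.mark.parametrize(
    "values, expected",
    [
        ([], []),
        ([2, 4], []),
        ([1, 2, 3], [1, 3]),
    ],
)
def test_keep_odds_parametrized(values, expected):
    # pytest runs this test once per (values, expected) pair.
    assert keep_odds(values) == expected
```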

Assertion rewriting

In [4]:
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 0]
    #                           error at ^^^^^^^^^^^^^
In [5]:
%%run_pytest[clean] -qq

def test_keep_odds():
    assert keep_odds([1, 2, 3, 4]) == [1, 3]
F                                                                [100%]
=============================== FAILURES ===============================
____________________________ test_keep_odds ____________________________

    def test_keep_odds():
>       assert keep_odds([1, 2, 3, 4]) == [1, 3]
E       assert [2, 4] == [1, 3]
E         At index 0 diff: 2 != 1
E         Full diff:
E         - [2, 4]
E         + [1, 3]

<ipython-input-5-757929023375>:3: AssertionError

Debugger

In [10]:
%%run_pytest[clean] -qq -x --pdb 

def test_pdb():
    l = [1, 2, 3]
    assert l == []
F
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    def test_pdb():
        l = [1, 2, 3]
>       assert l == []
E       assert [1, 2, 3] == []
E         Left contains more items, first extra item: 1
E         Full diff:
E         - [1, 2, 3]
E         + []

<ipython-input-10-26043487c447>:4: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> <ipython-input-10-26043487c447>(4)test_pdb()
-> assert l == []
(Pdb) q

Exit: Quitting debugger
!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Quitting debugger !!!!!!!!!!!!!!!

How does ipytest work?

Small package.
Creative use of the extension APIs of pytest and Jupyter (IPython).

# pytest plugin to make notebooks look like modules
class ModuleCollectorPlugin(object):
    def pytest_collect_file(self, parent, path):
        ...

# ipython plugin to rewrite asserts
shell = get_ipython()
shell.ast_transformers.append(...)
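
To make the second hook more concrete: an AST transformer receives the parsed cell and may rewrite it before execution. The sketch below is a toy stand-in, not pytest's or ipytest's actual rewriting. It only attaches the source text of the condition as the assertion message, and it needs Python 3.9+ for `ast.unparse`. The names `AssertMessageAdder` and `run_rewritten` are hypothetical.

```python
import ast


class AssertMessageAdder(ast.NodeTransformer):
    """Toy transformer: give every bare `assert` a message with its source text."""

    def visit_Assert(self, node):
        self.generic_visit(node)
        if node.msg is None:
            node.msg = ast.Constant(value="assertion failed: " + ast.unparse(node.test))
        return node


def run_rewritten(source, namespace=None):
    """Parse `source`, apply the transformer, and execute the result."""
    namespace = {} if namespace is None else namespace
    tree = AssertMessageAdder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line/column information
    exec(compile(tree, "<cell>", "exec"), namespace)
    return namespace


# In IPython, an instance would be registered on the shell instead, e.g.
#   get_ipython().ast_transformers.append(AssertMessageAdder())
# so that every subsequently executed cell passes through `visit`.
```

pytest's real rewriting goes much further and reports the intermediate values of the failing expression, which is what produces the detailed diffs shown in the earlier slides.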

Prototyping to Production

Navigating the notebook / module continuum

Directory layout

notebooks/
notebooks/IPyTestIntro.ipynb

Requirements

Pipfile          # <---- abstract
Pipfile.lock     # <---- concrete

Packaging

setup.py
src/             # <---- source code

Tests

tests/

Pipfile & pipenv


[packages]
ipytest = "*"
pytest = "*"
ipytest-demo = {editable = true, path = "."}
#              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#               make local module editable
# ...

[scripts]
test = "pytest tests"

setup.py


from setuptools import setup, PEP420PackageFinder

setup(
    name='ipytest-demo',
    version='0.0.0',
    py_modules=["keep_odds"],
    # ^^^ when using modules (credit: @tmuxbee)
    #
    # alternative for packages:
    # packages=PEP420PackageFinder.find('src'),
    package_dir={'': 'src'},
)


From notebooks to modules (1/4)

Write the code and explore it inside notebooks

In [11]:
# Write your functionality
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]
In [12]:
# Interactive Exploration
keep_odds([1, 2, 3, 4, 5, 6])
Out[12]:
[1, 3, 5]

From notebooks to modules (2/4)

Write tests

In [13]:
# Write your functionality
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]
In [14]:
# Interactive Exploration
keep_odds([1, 2, 3, 4, 5, 6])
Out[14]:
[1, 3, 5]
In [15]:
%%run_pytest[clean] -qq

def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]
.                                                                [100%]

From notebooks to modules (3/4)

Move the code to a module, continue experimenting with tests inside notebook

In [16]:
!cat ../src/keep_odds.py
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]
In [17]:
%%run_pytest[clean] -qq

# reload the module
ipytest.reload('keep_odds')
from keep_odds import keep_odds


def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]
.                                                                [100%]
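
The reload step is needed because Python caches imported modules: a plain `import` in a later cell would keep returning the old code after the file was edited. A rough sketch of such a helper using only the standard library follows. The name `reload_modules` is hypothetical, and this is an assumption about the general technique rather than a description of how `ipytest.reload` is implemented.

```python
import importlib
import sys


def reload_modules(*names):
    """Import each named module, or reload it if it was imported before."""
    importlib.invalidate_caches()  # make sure newly created files are found
    for name in names:
        if name in sys.modules:
            importlib.reload(sys.modules[name])
        else:
            importlib.import_module(name)
```

Note that names bound with `from keep_odds import keep_odds` before the reload still point at the old function, which is why the cell above repeats the import after reloading.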

From notebooks to modules (4/4)

Move everything outside the notebook

In [18]:
!cat ../src/keep_odds.py
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]
In [19]:
!cat ../tests/test_keep_odds.py
from keep_odds import keep_odds


def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]
In [20]:
!pytest -qq ../tests
.                                                                [100%]

How well does it work?

Moving code out of notebooks

  • Development packages & reloading
  • More and more libraries with support for both environments

- Development inside notebook

  • More support to reason about global state
  • Integration into notebooks of more libraries
  • Better tooling (type checking, completion, refactoring, ...)

X Keeping notebooks & modules in sync

  • Moving code into notebook
  • Regression checking for notebooks (papermill?)
  • Notebook & package aware tools
  • ...

Conclusion

Notebooks offer a very effective environment for rapid iteration
Testing code interactively lets you create test input/output pairs quickly

Notebooks can become cumbersome for large code bases
  Move code out of notebooks progressively
  Use same libraries & tooling.
  For testing: ipytest & pytest

Caveat: Hidden state requires some care (%%run_pytest[clean], reload)


Install: pip install pytest ipytest
Twitter @c_prohm