Create step-wise / incremental tests in pytest.
New: a 'generator' style is available, check it out!
New: pytest-harvest compatibility fixtures are available, check them out!
Did you ever want to organize your tests in incremental steps, for example to improve readability in case of failure? Or to have some optional steps that execute only depending on the results of previous steps?
pytest-steps leverages pytest and its great @pytest.mark.parametrize and @pytest.fixture decorators, so that you can create incremental tests with steps without having to think about the pytest fixture/parametrize pattern that has to be implemented for your particular case.
This is particularly helpful if:
- you wish to share a state / intermediate results across test steps
- your tests already rely on other fixtures and/or parameters, such as with pytest-cases. In that case, finding the correct pytest design, one that gives you a brand new state object at the beginning of each test suite while ensuring that this object is actually shared across all steps, might be very tricky.
With pytest-steps you don't have to care about the internals: it just works as expected.
pytest-steps has not yet been tested with pytest-xdist. See #7
> pip install pytest_steps
1. Usage - "generator" mode¶
This new mode may seem more natural and readable to non-pytest experts. However it may be harder to debug when used in combination with other pytest tricks. In that case, do not hesitate to switch to the good old "explicit" mode.
Start with your favorite test function. There are two things to do to break it down into steps:
- decorate it with @test_steps to declare the steps that will be performed, as strings.
- insert as many yield statements in your function body as there are steps. The function should end with a yield.
Note: code written after the last yield will not be executed.
For example we define three steps:
    from pytest_steps import test_steps

    @test_steps('step_a', 'step_b', 'step_c')
    def test_suite():
        # Step A
        print("step a")
        assert not False  # replace with your logic
        intermediate_a = 'hello'
        yield

        # Step B
        print("step b")
        assert not False  # replace with your logic
        yield

        # Step C
        print("step c")
        new_text = intermediate_a + " ... augmented"
        print(new_text)
        assert len(new_text) == 19  # 'hello ... augmented' is 19 characters long
        yield
That's it! If you run pytest you will now see 3 tests instead of one:

    ============================= test session starts =============================
    (...)
    collected 3 items

    (...)/test_example.py::test_suite[step_a] <- <decorator-gen-3> PASSED    [ 33%]
    (...)/test_example.py::test_suite[step_b] <- <decorator-gen-3> PASSED    [ 66%]
    (...)/test_example.py::test_suite[step_c] <- <decorator-gen-3> PASSED    [100%]

    ========================== 3 passed in 0.06 seconds ===========================
You might wish to use yield <step_name> instead of yield at the end of each step when debugging, if you suspect an issue with the execution order. This activates a built-in checker that verifies that each step name in the declared sequence corresponds to what you actually yield at the end of that step.
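To make the idea of such a checker concrete, here is a minimal stdlib-only sketch. It is not the pytest-steps implementation: `run_steps`, `suite_ok` and `suite_swapped` are hypothetical names, and the sketch simply assumes that a bare `yield` (which produces `None`) means "nothing to check".

```python
def run_steps(gen_func, declared_steps):
    """Advance a generator function once per declared step and check yielded names.

    Hypothetical helper for illustration only; not part of pytest-steps.
    """
    gen = gen_func()
    completed = []
    for expected in declared_steps:
        yielded = next(gen)  # runs the body up to the next `yield`
        # A bare `yield` produces None: in this sketch that disables the check
        if yielded is not None and yielded != expected:
            raise ValueError(
                "Step %r was expected but the function yielded %r" % (expected, yielded)
            )
        completed.append(expected)
    return completed


def suite_ok():
    yield 'step_a'
    yield            # bare yield: no name to check
    yield 'step_c'


def suite_swapped():
    yield 'step_b'   # wrong: the first declared step is 'step_a'
    yield 'step_a'
```

Calling `run_steps(suite_ok, ['step_a', 'step_b', 'step_c'])` completes all three steps, while `run_steps(suite_swapped, ['step_a', 'step_b'])` raises a `ValueError` at the first step.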
b- Shared data¶
By design, all intermediate results created during function execution are shared between steps, since they are part of the same python function call. You therefore have nothing to do: this is what is shown above in step c, where we reuse intermediate_a from step a.
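The reason this works is plain Python generator behaviour: local variables survive across `yield` statements. A small sketch with no pytest involved (the function name `suite` is only illustrative):

```python
def suite():
    intermediate_a = 'hello'                        # created during "step a"
    yield
    yield                                           # "step b" does nothing here
    new_text = intermediate_a + " ... augmented"    # "step c" reuses the local variable
    yield new_text


gen = suite()
next(gen)            # run step a
next(gen)            # run step b
result = next(gen)   # run step c, which still sees intermediate_a
```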
c- Optional steps and dependencies¶
In this generator mode, all steps depend on all previous steps by default: if a step fails, all subsequent steps will be skipped. To circumvent this behaviour you can declare a step as optional. This means that subsequent steps will not depend on it unless explicitly stated. For this you should:
- wrap the step into the special optional_step context manager,
- yield the corresponding context object at the end of the step, instead of None or the step name. This is very important, otherwise the step will be considered as successful by pytest!

    # Step B
    with optional_step('step_b') as step_b:
        print("step b")
        assert False
    yield step_b
If steps depend on an optional step in order to execute, you should make them optional too, and state it explicitly:
- declare the dependency using the depends_on argument of optional_step,
- use should_run() to test if the code block should be executed.
The example below shows 4 steps, where steps a and d are mandatory and b and c are optional with c dependent on b:
    from pytest_steps import test_steps, optional_step

    @test_steps('step_a', 'step_b', 'step_c', 'step_d')
    def test_suite_opt():
        # Step A
        assert not False
        yield

        # Step B
        with optional_step('step_b') as step_b:
            assert False
        yield step_b

        # Step C depends on step B
        with optional_step('step_c', depends_on=step_b) as step_c:
            if step_c.should_run():
                assert True
        yield step_c

        # Step D
        assert not False
        yield
Running it with pytest shows the desired behaviour: step b fails but does not prevent step d from executing correctly. Step c is marked as skipped because its dependency (step b) failed.

    ============================= test session starts =============================
    (...)
    collected 4 items

    (...)/test_example.py::test_suite_opt[step_a] <- <decorator-gen-3> PASSED  [ 25%]
    (...)/test_example.py::test_suite_opt[step_b] <- <decorator-gen-3> FAILED  [ 50%]
    (...)/test_example.py::test_suite_opt[step_c] <- <decorator-gen-3> SKIPPED [ 75%]
    (...)/test_example.py::test_suite_opt[step_d] <- <decorator-gen-3> PASSED  [100%]

    ================================== FAILURES ===================================
    ___________________________ test_suite_opt[step_b] ____________________________
    (...)
    ================ 1 failed, 2 passed, 1 skipped in 0.16 seconds ================
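One way to picture how an optional step can fail without stopping the rest of the function is a context manager that records the error instead of propagating it. The sketch below is an assumption-laden illustration, not the library's code: `OptionalStepSketch` and its `outcome` property are hypothetical, and only the names `depends_on` and `should_run()` mirror the documented API.

```python
class OptionalStepSketch:
    """Records the outcome of a code block instead of letting errors propagate.

    Hypothetical stand-in for illustration; not pytest-steps' optional_step.
    """

    def __init__(self, name, depends_on=None):
        self.name = name
        self.depends_on = depends_on
        self.error = None

    def should_run(self):
        # Run only if there is no dependency, or if the dependency did not fail
        return self.depends_on is None or self.depends_on.error is None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None and issubclass(exc_type, Exception):
            self.error = exc_value
            return True    # swallow the error so that later steps still execute
        return False

    @property
    def outcome(self):
        if self.error is not None:
            return 'failed'
        if not self.should_run():
            return 'skipped'
        return 'passed'


with OptionalStepSketch('step_b') as step_b:
    raise AssertionError("step b fails")

with OptionalStepSketch('step_c', depends_on=step_b) as step_c:
    if step_c.should_run():
        pass  # never reached here, because step_b failed

outcomes = {step.name: step.outcome for step in (step_b, step_c)}
```

In this sketch `outcomes` ends up reporting step_b as failed and step_c as skipped, and execution continues after both blocks. This also suggests why the context object has to be yielded: it is the only place where the recorded failure lives.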
d- Calling decorated functions manually¶
In some cases you might wish to call your test functions manually before the tests actually run. This can be very useful when you do not wish the package import times to be counted in test execution durations - typically in a "benchmarking" use case such as shown here.
It is now possible to call a test function decorated with @test_steps manually. The best way to understand what you have to provide is to inspect it:

    from pytest_steps import test_steps

    @test_steps('first', 'second')
    def test_dummy():
        print('hello')
        yield
        print('world')
        yield

    # help() prints the documentation itself, no print() needed
    help(test_dummy)
So we have to provide two arguments, one of which is ________step_name_. Note: the same information can be obtained in a more formal way using signature from the inspect (recent Python) or funcsigs (older) packages.
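As a quick illustration of the `inspect.signature` route, the sketch below lists the parameters of a plain two-argument function. `stand_in` and its parameter names are hypothetical placeholders, not the real wrapper generated by `@test_steps`:

```python
import inspect


def stand_in(first_arg, step_name):
    """Hypothetical two-argument function standing in for a decorated test."""


signature = inspect.signature(stand_in)
param_names = list(signature.parameters)   # parameter names, in declaration order
```

Applying the same two lines to a decorated test function should list the arguments you have to provide for a manual call.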
Once you know what arguments you have to provide, there are two rules to follow in order to execute the function manually:
- replace the first argument with None, to indicate that you are executing outside of any pytest context.
- replace the step_name argument with the steps you wish to execute: None to execute all steps in order, or a list of steps to execute only some of them. Note that in generator mode, by design (it is a generator function), it is only possible to call the steps in the correct order and starting from the first one, but you can provide a partial list:

    > test_dummy(None, None)
    hello
    world
    > test_dummy(None, 'first')
    hello
    > test_dummy(None, 'second')
    ValueError: Incorrect sequence of steps provided for manual execution. Step #1 should be named 'first', found 'second'
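The "in order, starting from the first one, possibly partial" rule amounts to checking that the requested steps form a prefix of the declared ones. A minimal sketch of such a check, assuming nothing about the library internals (`resolve_manual_steps` is a hypothetical helper and its error messages are illustrative):

```python
def resolve_manual_steps(declared, requested):
    """Return the list of steps to run manually (hypothetical helper).

    declared:  all step names, in order.
    requested: None (run everything), a single name, or a list of names.
    """
    if requested is None:
        return list(declared)
    if isinstance(requested, str):
        requested = [requested]
    if len(requested) > len(declared):
        raise ValueError("More steps requested than declared")
    # The requested steps must match the declared ones position by position
    for i, (expected, found) in enumerate(zip(declared, requested), start=1):
        if expected != found:
            raise ValueError(
                "Step #%d should be named %r, found %r" % (i, expected, found)
            )
    return list(requested)
```

With `declared = ['first', 'second']`, requesting `None` or `'first'` is accepted, while requesting `'second'` alone raises a `ValueError`, mirroring the session shown above.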
e- Compliance with pytest¶
Under the hood, the @test_steps decorator simply generates a wrapper function around your function and marks it with @pytest.mark.parametrize. The function wrapper is created using the excellent decorator library, so all marks that exist on it are kept in the process, as well as its name and signature.
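To illustrate what "keeping the name, signature and attributes" of a wrapped function means, here is a sketch that uses the standard library's `functools.wraps` as a stand-in for the decorator library. The names `add_wrapper` and `sample` are hypothetical, and the `pytestmark` attribute is only used here as an example of metadata attached to a function:

```python
import functools
import inspect


def add_wrapper(func):
    """Wrap func while keeping its metadata (stdlib sketch, hypothetical name)."""
    @functools.wraps(func)   # copies __name__, __doc__ and the function's __dict__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def sample(x, y=2):
    return x + y


sample.pytestmark = ['a_mark']   # stand-in for marks attached to the function

wrapped = add_wrapper(sample)
wrapped_name = wrapped.__name__                     # name of the original function
wrapped_signature = str(inspect.signature(wrapped)) # signature of the original function
```

Because the wrapper looks like the original function to introspection tools, other decorators and test collection can keep working on it as if nothing had changed.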
@test_steps should be compliant with all native pytest mechanisms. For example you can use decorators such as @pytest.mark.parametrize before or after it in the function decoration order (depending on your desired resulting test order):
    import pytest
    from pytest_steps import test_steps

    @test_steps('step_a', 'step_b')
    @pytest.mark.parametrize('i', range(2), ids=lambda i: "i=%i" % i)
    def test_suite_p(i):
        # Step A
        print("step a, i=%i" % i)
        assert not False  # replace with your logic
        yield

        # Step B
        print("step b, i=%i" % i)
        assert not False  # replace with your logic
        yield
If you execute it, it correctly executes all the steps for each parameter value:

    ============================= test session starts =============================
    (...)
    collected 4 items

    (...)/test_example.py::test_suite_p[i=0-step_a] <- <decorator-gen-3> PASSED [ 25%]
    (...)/test_example.py::test_suite_p[i=0-step_b] <- <decorator-gen-3> PASSED [ 50%]
    (...)/test_example.py::test_suite_p[i=1-step_a] <- <decorator-gen-3> PASSED [ 75%]
    (...)/test_example.py::test_suite_p[i=1-step_b] <- <decorator-gen-3> PASSED [100%]

    ========================== 4 passed in 0.07 seconds ===========================
You can also use fixtures as usual, but special care has to be taken about function-scope fixtures. Let's consider the following example:
    import pytest
    from pytest_steps import test_steps

    usage_counter = 0

    @pytest.fixture
    def my_fixture():
        """Simple function-scoped fixture that returns a new instance each time"""
        global usage_counter
        usage_counter += 1
        print("created my_fixture %s" % usage_counter)
        return usage_counter

    @test_steps('step_a', 'step_b')
    def test_suite_one_fixture_per_step(my_fixture):
        # Step A
        print("step a")
        assert my_fixture == 1
        yield

        # Step B
        print("step b")
        assert my_fixture == 2  # >> raises an AssertionError because my_fixture = 1 !
        yield
Here, and that can be a bit misleading,
- pytest will call my_fixture() twice, because there are two pytest function executions, one for each step. So we think that everything is good...
- ...however the second fixture instance is never passed to our test code: instead, the my_fixture instance that was passed as argument in the first step will be used by all steps. Therefore we end up with a failure in the test during step b.
It is possible to circumvent this behaviour by declaring explicitly what you expect:
- if you would like to share fixture instances across steps, decorate your fixture with @cross_steps_fixture.
- if you would like each step to have its own fixture instance, decorate your fixture with @one_fixture_per_step.

    import pytest
    from pytest_steps import one_fixture_per_step

    usage_counter = 0

    @pytest.fixture
    @one_fixture_per_step
    def my_fixture():
        """Simple function-scoped fixture that returns a new instance each time"""
        global usage_counter
        usage_counter += 1
        return usage_counter
Each step will now use its own fixture instance and the test will succeed (instance 2 will be available at step b).
When a fixture is decorated with @one_fixture_per_step, the object that is injected in your test function is a transparent proxy of the fixture, so it behaves exactly like the fixture. If for some reason you need the "true" inner wrapped object, it can also be retrieved.
2. Usage - "explicit" mode¶
In "explicit" mode, things are a bit more complex to write but can be easier to understand because it does not use generators, just simple function calls.
As in the other mode, simply decorate your test function with @test_steps and declare the steps that will be performed. In addition, add a test_step parameter to your function: it will receive the current step.
The easiest way to use it is to declare each step as a function:

    from pytest_steps import test_steps

    def step_a():
        # perform this step ...
        print("step a")
        assert not False  # replace with your logic

    def step_b():
        # perform this step
        print("step b")
        assert not False  # replace with your logic

    @test_steps(step_a, step_b)
    def test_suite_1(test_step):
        # Optional: perform step-specific reasoning, for example to select arguments
        if test_step.__name__ == "step_a":
            print("calling step a")

        # Execute the step by calling the test step function
        test_step()
Note: as shown above, you can perform some reasoning about the step at hand in test_suite_1, by looking at the test_step object.
Custom parameter name
You might want another name than test_step to receive the current step. The test_step_argname argument can be used to change that name.
Variants: other types¶
This mechanism is actually nothing more than a pytest parameter, so it has no requirement on the test_step type. It is therefore possible to use other types, for example to declare the test steps as strings instead of functions:

    @test_steps('step_a', 'step_b')
    def test_suite_2(test_step):
        # Execute the step according to name
        if test_step == 'step_a':
            step_a()
        elif test_step == 'step_b':
            step_b()
        ...
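When steps are declared as strings, the if/elif chain can also be replaced by a lookup table. This is only a design suggestion sketched with plain Python; `STEP_FUNCS` and `run_named_step` are hypothetical names, not part of pytest-steps:

```python
def step_a():
    return "ran step a"


def step_b():
    return "ran step b"


# Maps each declared step name to the function implementing it
STEP_FUNCS = {'step_a': step_a, 'step_b': step_b}


def run_named_step(test_step):
    """Execute the function registered under the given step name."""
    try:
        func = STEP_FUNCS[test_step]
    except KeyError:
        raise ValueError("Unknown step name: %r" % test_step) from None
    return func()
```

Inside a test suite function, the body would then reduce to a single `run_named_step(test_step)` call, and a misspelled step name fails loudly instead of silently doing nothing.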
This has pros and cons:
- (+) you can declare the test suite before the step functions in the python file (better readability!)
- (-) you cannot use @depends_on to decorate your step functions: you can only rely on the shared data container to create dependencies (as explained below)
In this explicit mode all steps are optional/independent by default: each of them will be run, whatever the execution result of the others. If you wish to change this, you can use the @depends_on decorator to mark a step so that it is automatically skipped or failed if some other steps did not run successfully.

    from pytest_steps import depends_on

    def step_a():
        ...

    @depends_on(step_a)
    def step_b():
        ...
step_b will now be skipped if step_a does not run successfully.
Note that if you use shared data (see below), you can perform similar, and also more advanced, dependency checks by checking the contents of the shared data and calling pytest.fail() according to what is present. See step_b in the example below for an illustration.
Warning: the @depends_on decorator is only effective if the decorated step function is used "as is" as an argument in @test_steps(). If a non-direct relation is used, such as using the test step name as argument, you should use a shared data container (see below) to manually create the dependency.
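A plausible way to think about this restriction is that the dependency information lives on the function object itself, so it is lost if only a name is passed around. The sketch below illustrates that idea with plain Python; `depends_on_sketch` and `run_all` are hypothetical and do not reproduce the library's implementation:

```python
def depends_on_sketch(*dependencies):
    """Attach the given step functions as dependencies (hypothetical decorator)."""
    def decorate(step_func):
        step_func.dependencies = dependencies
        return step_func
    return decorate


def run_all(steps):
    """Run steps in order; skip a step when one of its dependencies did not pass."""
    outcomes = {}
    for step in steps:
        deps = getattr(step, 'dependencies', ())
        if any(outcomes.get(dep.__name__) != 'passed' for dep in deps):
            outcomes[step.__name__] = 'skipped'
            continue
        try:
            step()
            outcomes[step.__name__] = 'passed'
        except Exception:
            outcomes[step.__name__] = 'failed'
    return outcomes


def step_a():
    raise AssertionError("step a fails")


@depends_on_sketch(step_a)
def step_b():
    pass


def step_c():
    pass


outcomes = run_all([step_a, step_b, step_c])
```

Here step_b is skipped because step_a failed, while the independent step_c still runs.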
c- Shared data¶
In this explicit mode, by default all steps are independent, therefore they do not have access to each other's execution results. To solve this problem, you can add a steps_data argument to your test function. If you do so, a StepsDataHolder object will be injected in this variable, that you can use to store and retrieve results. Simply create fields on it and store whatever you like:

    import pytest
    from pytest_steps import test_steps

    @test_steps('step_a', 'step_b')
    def test_suite_with_shared_results(test_step, steps_data):
        # Execute the step with access to the steps_data holder
        if test_step == 'step_a':
            step_a(steps_data)
        elif test_step == 'step_b':
            step_b(steps_data)

    def step_a(steps_data):
        # perform this step ...
        print("step a")
        assert not False  # replace with your logic

        # intermediate results can be stored in steps_data
        steps_data.intermediate_a = 'hello'

    def step_b(steps_data):
        # perform this step, leveraging the previous step's results
        print("step b")

        # you can leverage the results from previous steps...
        # ... or pytest.skip if not relevant
        if len(steps_data.intermediate_a) < 5:
            pytest.skip("Step b should only be executed if the text is long enough")

        new_text = steps_data.intermediate_a + " ... augmented"
        print(new_text)
        assert len(new_text) == 19  # 'hello ... augmented' is 19 characters long
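Conceptually, such a holder is just an object that accepts arbitrary attributes and that is looked up again at each step of the same suite. A stdlib-only sketch of that idea, where `get_steps_data` and the `_holders` cache are hypothetical and `types.SimpleNamespace` stands in for the real StepsDataHolder:

```python
from types import SimpleNamespace

_holders = {}   # one holder per test suite / parameter combination


def get_steps_data(suite_id):
    """Return the holder shared by all steps of the same suite (hypothetical helper)."""
    if suite_id not in _holders:
        _holders[suite_id] = SimpleNamespace()
    return _holders[suite_id]


get_steps_data('i=0').intermediate_a = 'hello'             # written during step a
text_seen_in_step_b = get_steps_data('i=0').intermediate_a # read during step b
other_suite_has_it = hasattr(get_steps_data('i=1'), 'intermediate_a')
```

The second suite id gets its own fresh holder, so results never leak from one suite to another.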
d- Calling decorated functions manually¶
In "explicit" mode it is possible to call your test functions outside of pytest runners, exactly the same way we saw in generator mode.
An example can be found here.
e- Compliance with pytest¶
You can add as many @pytest.mark.parametrize decorators and pytest fixtures as you like to your test suite function, and it should work as expected: a new steps_data object is created every time a new parameter/fixture combination is created, and that object is shared across steps with the same parameters and fixtures.
- by default all function-scoped fixtures will be "one per step" in this mode (you do not even need to use the @one_fixture_per_step decorator, although it does not hurt).
- if you wish a fixture to be shared across several steps, decorate it with @cross_steps_fixture:

    import pytest
    from pytest_steps import test_steps, cross_steps_fixture

    usage_counter = 0

    @pytest.fixture
    @cross_steps_fixture
    def my_cool_fixture():
        """A fixture that returns a new integer every time it is used."""
        global usage_counter
        usage_counter += 1
        print("created my_fixture %s" % usage_counter)
        return usage_counter

    def step_a():
        print('hello')

    def step_b():
        print('world')

    @test_steps(step_a, step_b)
    def test_params_mode(test_step, my_cool_fixture):
        # assert that whatever the step, the fixture is the same (shared across steps)
        assert my_cool_fixture == 1
        test_step()
3. Usage with pytest-harvest¶
a- Enhancing the results df¶
You might already use pytest-harvest to turn your tests into functional benchmarks. When you combine it with pytest_steps you end up with one row in the synthesis table per step. For example:
| test_my_app_bench[A-1-train] | passed | 2.00009  | train | 1 | my dataset #A | 0.832642  |
| test_my_app_bench[A-1-score] | passed | 0        | score | 1 | my dataset #A | nan       |
| test_my_app_bench[A-2-train] | passed | 1.00017  | train | 2 | my dataset #A | 0.0638134 |
| test_my_app_bench[A-2-score] | passed | 0.999928 | score | 2 | my dataset #A | nan       |
| test_my_app_bench[B-1-train] | passed | 0        | train | 1 | my dataset #B | 0.870705  |
| test_my_app_bench[B-1-score] | passed | 0        | score | 1 | my dataset #B | nan       |
| test_my_app_bench[B-2-train] | passed | 0        | train | 2 | my dataset #B | 0.764746  |
| test_my_app_bench[B-2-score] | passed | 1.0004   | score | 2 | my dataset #B | nan       |
You might wish to use the provided handle_steps_in_results_df utility method to replace the index with a 2-level multiindex (test id without step, step id).
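To show what the two index levels contain, here is a small string-based sketch that splits a test id into (test id without step, step id). It assumes the step id is the last element inside the brackets, as in the table above; `split_step_id` is a hypothetical helper and not the library's utility:

```python
def split_step_id(test_id, step_ids):
    """Split 'name[params-step]' into ('name[params]', 'step') (hypothetical helper)."""
    for step in step_ids:
        suffix = '-%s]' % step
        if test_id.endswith(suffix):
            # Other parameters remain: keep them inside the brackets
            return test_id[:-len(suffix)] + ']', step
        only = '[%s]' % step
        if test_id.endswith(only):
            # The step was the only parameter: drop the brackets entirely
            return test_id[:-len(only)], step
    raise ValueError("No known step id found in %r" % test_id)


index_entry = split_step_id('test_my_app_bench[A-1-train]', ['train', 'score'])
```

Each row of the table would then be indexed by a pair such as the one stored in `index_entry`.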
b- Pivoting the results df¶
If you prefer to see one row per test and the step details in columns, this package also provides new default [module/session]_results_df_steps_pivoted fixtures to directly get the pivoted version, and a pivot_steps_on_df utility method to perform the pivot transform easily.
For example, you will obtain this kind of pivoted table:
| test_my_app_bench[A-1] | 1 | my dataset #A | passed | 2.00009 | 0.832642  | passed | 0        |
| test_my_app_bench[A-2] | 2 | my dataset #A | passed | 1.00017 | 0.0638134 | passed | 0.999928 |
| test_my_app_bench[B-1] | 1 | my dataset #B | passed | 0       | 0.870705  | passed | 0        |
| test_my_app_bench[B-2] | 2 | my dataset #B | passed | 0       | 0.764746  | passed | 1.0004   |
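The pivot itself can be pictured as regrouping per-step rows under their test id and prefixing each step-specific column with the step id. The dictionary-based sketch below illustrates the transform only; `pivot_steps`, the column names `status` and `accuracy`, and the `step/column` naming scheme are assumptions made for this example:

```python
def pivot_steps(rows):
    """Group per-step rows into one dict per test, prefixing columns by step id.

    rows maps (test id without step, step id) to a dict of column values.
    Hypothetical helper for illustration only.
    """
    pivoted = {}
    for (test_id, step_id), values in rows.items():
        target = pivoted.setdefault(test_id, {})
        for column, value in values.items():
            target['%s/%s' % (step_id, column)] = value
    return pivoted


rows = {
    ('test_my_app_bench[A-1]', 'train'): {'status': 'passed', 'accuracy': 0.832642},
    ('test_my_app_bench[A-1]', 'score'): {'status': 'passed'},
}
pivoted = pivot_steps(rows)
```

The two input rows collapse into a single entry for `test_my_app_bench[A-1]`, with one group of columns per step.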
Two examples are available that should be quite straightforward for those familiar with pytest-harvest:
- here an example relying on default fixtures, to show how simple it is to satisfy the most common use cases.
- here an advanced example where the custom synthesis is created manually from the dictionary provided by pytest-harvest, thanks to helper methods.
Main features / benefits¶
- Split tests into steps. Although testing best practices strongly favor keeping each test completely independent of the others (for example for distributed execution), breaking tests down into chained sub-tests (steps) definitely improves the readability of results. The @test_steps decorator provides an intuitive way to do that without forcing any data model (steps can be functions, objects, etc.).
- Multi-style: an explicit mode and a generator mode are supported; developers may wish to use one or the other depending on their coding style or readability target.
- Steps can share data. In generator mode this works out of the box. In explicit mode all steps in the same test suite can share data through the injected steps_data container (name is configurable).
- Step dependencies can be defined: a @depends_on decorator (explicit mode) or an optional_step context manager (generator mode) lets you specify that a given test step should be skipped or failed if its dependencies did not complete.
- pytest documentation on parametrize
- pytest documentation on fixtures
- pytest-cases, to go further and separate test data from test functions
Do you like this library? You might also like my other python libraries.
Want to contribute ?¶
Details on the github page: https://github.com/smarie/python-pytest-steps