How do I make custom comparisons in pytest?

For example, I'd like to assert that two PySpark DataFrames have the same data; however, just using == checks that they are the same object. Ideally I'd also like to be able to specify whether order matters or not.

I've tried writing a function that raises an AssertionError but that adds a lot of noise to the pytest output as it shows the traceback from that function.

The other thought I had was to mock the __eq__ method of the DataFrames but I'm not confident that's the right way to go.

Edit:

I considered just using a function that returns true or false instead of an operator, however that doesn't seem to work with pytest_assertrepr_compare. I'm not familiar enough with how that hook works so it's possible there is a way to use it with a function instead of an operator.

My current solution is to use a patch to override the DataFrame's __eq__ method. Here's an example with pandas, as it's faster to test with; the idea should apply to any object.

import pandas as pd
# On Python 2, install the mock package and use: from mock import patch
from unittest.mock import patch


def custom_df_compare(self, other):
    # Put logic for comparing df's here
    # Returning True for demonstration
    return True


@patch("pandas.DataFrame.__eq__", custom_df_compare)
def test_df_equal():
    df1 = pd.DataFrame(
        {"id": [1, 2, 3], "name": ["a", "b", "c"]}, columns=["id", "name"]
    )
    df2 = pd.DataFrame(
        {"id": [2, 3, 4], "name": ["b", "c", "d"]}, columns=["id", "name"]
    )

    assert df1 == df2

I haven't tried it yet, but I'm planning to add the patch as a fixture with autouse so that it applies to all tests automatically.

To handle the "order matters" indicator elegantly, I'm playing with an approach similar to pytest.approx, which returns a new class with its own __eq__, for example:

class SortedDF(object):
    "Indicates that the order of data matters when comparing to another df"

    def __init__(self, df):
        self.df = df

    def __eq__(self, other):
        # Put logic for comparing df's including order of data here
        # Returning True for demonstration purposes
        return True


def test_sorted_df():
    df1 = pd.DataFrame(
        {"id": [1, 2, 3], "name": ["a", "b", "c"]}, columns=["id", "name"]
    )
    df2 = pd.DataFrame(
        {"id": [2, 3, 4], "name": ["b", "c", "d"]}, columns=["id", "name"]
    )

    # Passes because SortedDF.__eq__ is used
    assert SortedDF(df1) == df2
    # Fails because df2's __eq__ method is used
    assert df2 == SortedDF(df2)

The minor issue I haven't been able to resolve is the failure of the second assert, assert df2 == SortedDF(df2). This order works fine with pytest.approx but doesn't here. I've tried reading up on the == operator but haven't been able to figure out how to fix the second case.
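
One possible explanation for the second assert, sketched with plain Python classes rather than pandas: Python evaluates a == b by calling the left operand's __eq__ first, and only falls back to the right operand's __eq__ when the left one returns NotImplemented (or when the right operand's type is a subclass of the left's). The names EagerFrame, PoliteFrame and Sorted are hypothetical stand-ins. That DataFrame.__eq__ behaves like EagerFrame, that is, never returns NotImplemented for an unknown wrapper, is an assumption inferred from the failure described above.

```python
class EagerFrame:
    """Stand-in for a type whose __eq__ always produces a result itself."""

    def __init__(self, rows):
        self.rows = rows

    def __eq__(self, other):
        # Never defers to the right-hand operand.
        return getattr(other, "rows", None) == self.rows


class PoliteFrame:
    """Stand-in for a type that defers when it does not recognise `other`."""

    def __init__(self, rows):
        self.rows = rows

    def __eq__(self, other):
        if not isinstance(other, PoliteFrame):
            return NotImplemented  # lets Python try other.__eq__(self)
        return self.rows == other.rows


class Sorted:
    """Hypothetical wrapper in the spirit of SortedDF."""

    def __init__(self, frame):
        self.frame = frame

    def __eq__(self, other):
        # Demonstration only: always report equality.
        return True


# Wrapper on the left: Sorted.__eq__ runs first.
left_first = Sorted(EagerFrame([1])) == EagerFrame([2])
# Wrapper on the right of a deferring type: Python falls back to Sorted.__eq__.
deferred = PoliteFrame([2]) == Sorted(PoliteFrame([1]))
# Wrapper on the right of a non-deferring type: EagerFrame.__eq__ decides.
not_deferred = EagerFrame([2]) == Sorted(EagerFrame([1]))
```

Keeping the wrapper on the left-hand side, as in the first assert, avoids depending on how the other type's __eq__ treats operands it does not know.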

To do a raw comparison between the values of the DataFrames (the rows must be in exactly the same order), you can do something like this:

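import pandas as pd
from pyspark.sql import Row, SparkSession

# Assumes a local Spark installation; reuses an existing session if one is active
spark = SparkSession.builder.getOrCreate()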

df1 = spark.createDataFrame([Row(a=1, b=2, c=3), Row(a=1, b=3, c=3)])
df2 = spark.createDataFrame([Row(a=1, b=2, c=3), Row(a=1, b=3, c=3)])

pd.testing.assert_frame_equal(df1.toPandas(), df2.toPandas())

If row order should not matter, you can sort both pandas DataFrames by one or more key columns before comparing them, using the following function:

def assert_frame_equal_with_sort(results, expected, keycolumns):
    # Put the columns of both frames in the same (alphabetical) order
    results = results.reindex(sorted(results.columns), axis=1)
    expected = expected.reindex(sorted(expected.columns), axis=1)

    # Sort rows by the key columns and reset the index so row order is ignored
    results_sorted = results.sort_values(by=keycolumns).reset_index(drop=True)
    expected_sorted = expected.sort_values(by=keycolumns).reset_index(drop=True)

    pd.testing.assert_frame_equal(results_sorted, expected_sorted)


df1 = spark.createDataFrame([Row(a=1, b=2, c=3), Row(a=1, b=3, c=3)])
df2 = spark.createDataFrame([Row(a=1, b=3, c=3), Row(a=1, b=2, c=3)])

assert_frame_equal_with_sort(df1.toPandas(), df2.toPandas(), ['b'])
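
If converting to pandas is not desirable, the "order matters or not" switch can also be sketched on plain row sequences. This is only an illustration: rows_equal is a hypothetical helper, and it assumes each row is a hashable tuple (for example the rows returned by collecting a small DataFrame), which is not something the answer above states.

```python
from collections import Counter


def rows_equal(left_rows, right_rows, ordered=True):
    """Compare two sequences of hashable rows, optionally ignoring row order."""
    if ordered:
        return list(left_rows) == list(right_rows)
    # Counter keeps duplicate counts, so [r, r] and [r] still compare unequal.
    return Counter(left_rows) == Counter(right_rows)


a = [(1, 2, 3), (1, 3, 3)]
b = [(1, 3, 3), (1, 2, 3)]

same_ignoring_order = rows_equal(a, b, ordered=False)
same_with_order = rows_equal(a, b, ordered=True)
```

Because the function returns a boolean, it can be used directly as assert rows_equal(a, b, ordered=False).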

From the pytest documentation ("The writing and reporting of assertions in tests", section "Defining your own assertion comparison"): it is possible to add your own detailed explanations by implementing the pytest_assertrepr_compare hook. pytest_assertrepr_compare(config, op, left, right) returns an explanation for comparisons in failing assert expressions. Return None for no custom explanation, otherwise return a list of strings.

Just use the pandas.DataFrame.equals method: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.equals.html

For example

assert df1.equals(df2)

assert can be used with anything that returns a boolean, so yes, you can write any custom comparison function to compare two objects, as long as it returns a boolean. In this case, however, there is no need for a custom function because pandas already provides one.
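
If the objection to a helper that raises AssertionError is the extra traceback it adds (as mentioned in the question), pytest documents a __tracebackhide__ local variable that hides the helper's frame from the failure output. A minimal sketch follows; the helper name assert_rows_equal and the list-of-sortable-tuples row format are assumptions made for illustration.

```python
def assert_rows_equal(left_rows, right_rows, ordered=True):
    """Raise AssertionError with a short message when the rows differ."""
    __tracebackhide__ = True  # asks pytest to hide this frame in tracebacks
    left, right = list(left_rows), list(right_rows)
    if not ordered:
        # Assumes the rows can be sorted (e.g. tuples of comparable values).
        left, right = sorted(left), sorted(right)
    if left != right:
        raise AssertionError(f"rows differ: {left!r} != {right!r}")


def test_rows_match_ignoring_order():
    assert_rows_equal([(1, "a"), (2, "b")], [(2, "b"), (1, "a")], ordered=False)
```

With this in place, a failing call should be reported at the line in the test that called the helper rather than inside the helper itself.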

You can use one of the pytest hooks, in particular pytest_assertrepr_compare. In it you can define what you want to compare and how. The docs are also pretty good and include examples. Best of luck. :)
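
A rough sketch of such a hook, placed in conftest.py. The OrderedRows wrapper is a hypothetical class invented for this example, not part of pytest or pandas; only the hook name and its (config, op, left, right) signature come from the pytest documentation.

```python
# conftest.py


class OrderedRows:
    """Hypothetical wrapper marking that row order matters."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __eq__(self, other):
        if not isinstance(other, OrderedRows):
            return NotImplemented
        return self.rows == other.rows


def pytest_assertrepr_compare(config, op, left, right):
    """Return custom failure lines for `OrderedRows == OrderedRows`."""
    if op == "==" and isinstance(left, OrderedRows) and isinstance(right, OrderedRows):
        lines = ["OrderedRows differ (order-sensitive comparison):"]
        for index, (l_row, r_row) in enumerate(zip(left.rows, right.rows)):
            if l_row != r_row:
                lines.append(f"  row {index}: {l_row!r} != {r_row!r}")
        if len(left.rows) != len(right.rows):
            lines.append(f"  lengths: {len(left.rows)} != {len(right.rows)}")
        return lines
    return None  # fall back to pytest's default explanation
```

The hook only shapes the message shown after an assert has already failed; whether the assert passes is still decided by the objects' __eq__.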

Comments
  • It seems it is not as simple as it sounds. link
  • I'm fine with comparing the two DataFrames, my question is how do I use that comparison logic within a pytest assertion
  • If you want to do that specific test, you could write a function that returns True or False for the comparison you wish to make and then make a simple IsTrue assertion. Something like that. I know unittest and haven't tried pytest, but it must be the same logic, right?
  • I'm looking more for how to use something like this with pytest; I already have the code to compare the DataFrames.
  • The issue with this approach is it doesn't seem to work with pytest_assertrepr_compare which I'd also like to take advantage of. That acts as a hook that receives the operator, left, and right elements and lets you define how the failure should show in the log. I'll add that detail to my question.
  • I originally thought that was the answer as well, but my understanding is that it is simply used to drive how the data is represented in the log once the assert call fails; it has no control over how the assert is handled.