0 votes
2 views
in Data Science by (50.2k points)

How do I unit test Python DataFrames?

I have functions that take DataFrames as input and return DataFrames as output. Almost every function I have does this. If I want to unit test them, what is the best way to do it? Isn't it a lot of effort to create a new DataFrame (with values populated) for every function?

Are there any materials you can refer me to? Is it worth writing unit tests for these functions at all?

How do you Unit Test Python DataFrames
Intellipaat-community
by (110 points)
If you use pytest for unit testing, you can provide common input dataframes with Pytest fixtures. Since each function performs a different transformation, you will probably need to declare expected outputs inside each unit test. You can compare the actual outputs to expected outputs by extracting the `values` from the dataframes, and then comparing them with one of the numpy test functions. (https://numpy.org/doc/stable/reference/routines.testing.html)

1 Answer

–1 vote
by (108k points)

While pandas' own test functions are primarily intended for internal testing, NumPy includes a very useful set of testing functions, documented at the following link:

https://docs.scipy.org/doc/numpy/reference/routines.testing.html

These functions compare NumPy arrays, but you can get the array that underlies a pandas DataFrame using the `values` property. You can define a simple DataFrame and compare what your function returns with what you expect.
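As a rough illustration of that approach, here is a minimal sketch. The function `add_total` and its column names are hypothetical, invented only to have something to test; the comparison itself uses `numpy.testing.assert_allclose` on the `values` arrays.

```python
import numpy as np
import pandas as pd


def add_total(df):
    # Hypothetical function under test: returns a copy with a "total" column.
    out = df.copy()
    out["total"] = out["price"] * out["qty"]
    return out


def test_add_total():
    # Small, hand-written input and the output we expect from it.
    df = pd.DataFrame({"price": [1.5, 2.0], "qty": [2, 3]})
    expected = pd.DataFrame(
        {"price": [1.5, 2.0], "qty": [2, 3], "total": [3.0, 6.0]}
    )

    result = add_total(df)

    # `values` drops the labels, so check the column names separately.
    assert list(result.columns) == list(expected.columns)
    # Compare the underlying arrays with a floating-point tolerance.
    np.testing.assert_allclose(result.values, expected.values)
```

Because `values` discards the index and column labels, the sketch checks the column names explicitly. If label and dtype checks matter, the public `pandas.testing.assert_frame_equal` function may be a better fit than comparing raw arrays.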

One approach is to define a single set of test data that serves a number of functions. You can then use a pytest fixture to define that DataFrame once and reuse it in multiple tests.
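A sketch of the fixture idea might look like the following. The names `make_orders`, `orders`, `drop_empty_orders`, and `order_totals` are all assumptions made up for this example; only `pytest.fixture` and the NumPy testing functions come from the libraries themselves.

```python
import numpy as np
import pandas as pd
import pytest


def make_orders():
    # Hypothetical shared test data, built in one place.
    return pd.DataFrame({"price": [1.5, 2.0, 4.0], "qty": [2, 3, 0]})


@pytest.fixture
def orders():
    # pytest passes a fresh DataFrame to every test that names `orders`.
    return make_orders()


def drop_empty_orders(df):
    # Hypothetical function under test: keep rows with a positive quantity.
    return df[df["qty"] > 0].reset_index(drop=True)


def order_totals(df):
    # Hypothetical function under test: price times quantity per row.
    return df["price"] * df["qty"]


def test_drop_empty_orders(orders):
    result = drop_empty_orders(orders)
    np.testing.assert_array_equal(result["qty"].values, [2, 3])


def test_order_totals(orders):
    np.testing.assert_allclose(order_totals(orders).values, [3.0, 6.0, 0.0])
```

The fixture is function-scoped by default, so each test receives its own copy of the data and a test that modifies the DataFrame in place should not affect the others. The expected outputs still live inside each test, since every function performs a different transformation.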

by (110 points)
This answer is plagiarized, with only a couple of insignificant word changes, from a Stack Overflow thread posted 2 years earlier. (https://stackoverflow.com/questions/41852686/how-do-you-unit-test-python-dataframes)
