3 Reasons Why We Shouldn’t Initialize Our Test Data With Production Code

Automated testing is a crucial part of modern software development because it's an efficient way to make sure that our application is working as expected. However, there is a tempting shortcut that developers sometimes take: initializing test data with production code. This approach seems convenient because it looks like a quick and easy way to set up different test scenarios. In reality, it's a risk that can undermine the very purpose of our testing efforts.

In this blog post, I will identify three reasons why we shouldn't initialize our test data with production code and describe how we can initialize our test data in a way which allows us to write robust and useful automated tests.

Let's begin.

This blog post assumes that our application uses a relational database.

The Problem

Let's assume that we have to write automated tests for the TodoItemRepository class which provides CRUD operations for todo items. The source code of the TodoItemRepository class looks as follows:

import java.util.List;

class TodoItemRepository {
    
    TodoItem create(CreateTodoItem input) {
        //Implementation left blank on purpose
    }

    TodoItem delete(Long id) {
        //Implementation left blank on purpose
    }

    List<TodoItem> findAll() {
        //Implementation left blank on purpose
    }

    TodoItem findById(Long id) {
        //Implementation left blank on purpose
    }

    TodoItem update(UpdateTodoItem input) {
        //Implementation left blank on purpose
    }
}
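For reference, the examples in this post assume builder-style input classes along these lines. This is a minimal, hypothetical sketch: the field names and the builder API are my assumptions, not the actual production code.

```java
// Minimal sketch of the input class assumed by the examples.
// The fields (title, description) and the builder API are illustrative.
final class CreateTodoItem {

    private final String title;
    private final String description;

    private CreateTodoItem(Builder builder) {
        this.title = builder.title;
        this.description = builder.description;
    }

    static Builder builder() {
        return new Builder();
    }

    String getTitle() { return title; }
    String getDescription() { return description; }

    static final class Builder {

        private String title;
        private String description;

        Builder withTitle(String title) {
            this.title = title;
            return this;
        }

        Builder withDescription(String description) {
            this.description = description;
            return this;
        }

        CreateTodoItem build() {
            return new CreateTodoItem(this);
        }
    }
}
```

A similar builder would exist for UpdateTodoItem, and TodoItem itself would be a simple value class.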

Now, if we want to write tests for the findAll() method with JUnit 5, the "pseudocode" of our test class could look as follows:

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Nested;
import org.junit.jupiter.api.Test;

@DisplayName("Find all todo items")
class FindAllTodoItemsTest {

    //Obviously, this field must be initialized somehow
    private TodoItemRepository repository;

    @BeforeEach
    void cleanUpTodoItems() {
        //The implementation of this method must delete all
        //todo items from the todo_item table.
    }

    @Nested
    @DisplayName("When no todo items are found from the database")
    class WhenNoTodoItemsAreFound {

        @Test
        @DisplayName("Should return an empty list")
        void shouldReturnAnEmptyList() {
            var todoItems = repository.findAll();
            //Verify that todoItems list is empty
        }
    }
    
    @Nested
    @DisplayName("When two todo items are found from the database")
    class WhenTwoTodoItemsAreFound {

        @BeforeEach
        void insertTwoTodoItemsIntoDatabase() {
            repository.create(CreateTodoItem.builder()
                    .withDescription("First todo item")
                    .withTitle("Todo item 1")
                    .build()
            );

            repository.create(CreateTodoItem.builder()
                    .withDescription("Second todo item")
                    .withTitle("Todo item 2")
                    .build()
            );
        }

        @Test
        @DisplayName("Should return two todo items")
        void shouldReturnTwoTodoItems() {
            var todoItems = repository.findAll();
            //Verify that the todoItems list contains two items
        }

        @Test
        @DisplayName("Should return the expected first todo item")
        void shouldReturnExpectedFirstTodoItem() {
            var first = repository.findAll().get(0);
            //Verify that the first todo item contains the expected data
        }

        @Test
        @DisplayName("Should return the expected second todo item")
        void shouldReturnExpectedSecondTodoItem() {
            var second = repository.findAll().get(1);
            //Verify that the second todo item contains the expected data
        }
    }
}

Naturally, the required setup code depends on the scenarios which are needed to verify that the system under test is working as expected.

Next, I will identify three reasons why we shouldn't use this approach for inserting test data into the database.

1. A Good Test Can Fail for Only One Reason

It's often said that a good test can fail for only one reason because it asserts only one logical concept. These tests are huge time savers because we don't have to waste time figuring out what's wrong. Instead, we can concentrate our efforts on fixing the problem.

In the context of this blog post, we can rephrase the definition of a good test and state that a good test can fail only if we change the behavior of the system under test. In other words, a good test must not fail if we change the behavior of methods which aren't used by the system under test. This makes our life a lot easier if a test fails because we know that the problem is in the system under test.

If we write tests which initialize the required test data with production code, our tests won't fulfill the definition of a good test. For example, if we change the implementation of the create() method:

  • Some (or all) tests which ensure that the create() method is working as expected fail if the new implementation isn't compatible with the old one.
  • Some (or all) tests which initialize the required test data by invoking the create() method fail if the test data that's inserted into the database by the new implementation isn't the same as the test data that's inserted into the database by the old implementation.

It's a good thing that our tests fail if the new implementation of the create() method isn't compatible with the old one. That being said, the only tests which should fail in this situation must test the create() method. If this is the case, we know immediately that we have to either fix our implementation or make changes to our tests.

On the other hand, if the failing tests are testing different methods (including the create() method), we cannot be sure why these tests failed. This will slow us down because we have to first figure out why our tests failed and invest extra time for fixing tests which shouldn't fail in this situation (tests which won't test the create() method).

The problem is that tests which fail "randomly" aren't useful because they aren't actionable and developers don't trust them. Also, the constant debugging caused by these tests is extremely frustrating and decreases our motivation. This increases the probability that someone will either ignore or remove these tests.

2. A Good Test Won't Slow Us Down

Most applications are in a constant state of evolution where the source code undergoes frequent updates and modifications. That's why it's crucial that its test suite can adapt to these changes without requiring an expensive rewrite. This increases the probability that these tests will be kept up-to-date instead of ignored or deleted.

A maintained and up-to-date test suite is extremely valuable for developers who need to understand the behavior of the application. If the tests assert only one logical concept and have good names, the test suite will act as an executable specification which helps developers to understand the requirements of the application.

In other words, if we write tests which won't slow us down, we will save time in the long run. That's why we shouldn't initialize our test data with production code because:

1. Our tests will slow us down because they can fail for multiple reasons.

2. Our tests will slow us down because they won't compile if we change the return type or signature of the method that inserts test data into the database. Let's assume that we will do one of the following changes to the create() method:

  • Add a new method parameter.
  • Remove a method parameter.
  • Change the type of a method parameter.
  • Change the return type.
  • Change the type of a method parameter's field.
  • Change the type of the return type's field.

After we have changed the create() method:

  • The tests which ensure that the create() method is working as expected won't compile.
  • The test code which initializes the required test data by invoking the create() method won't compile.

It's natural that the tests which test the create() method won't compile, but we shouldn't put ourselves in a situation where we have to fix compilation errors from unrelated test code. This is boring and repetitive work which destroys both our productivity and motivation.

3. Our tests will slow us down if our test data contains relationships between different entities because managing these relationships requires extra work. The problem is that we cannot insert a new row into a database table that has a foreign key column if we don't know the value of the "parent" table's primary key column. This means that we must:

  • Insert our test data into the database in the "correct" order.
  • Store the values of the primary key columns which are (hopefully) returned by our production code which inserts new rows into the database.
  • Use the stored ids when we insert new rows into the database tables which have foreign key columns.

In my experience, when we add new tables and relationships to our database, the code that stores the required ids and passes them around becomes spaghetti code that's frustrating and slow to maintain.
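To make this concrete, here is a hedged sketch of the id-threading problem. The repository classes below are hypothetical in-memory stand-ins for production code, but the shape of the setup code is what matters: the parent row must be inserted first, and its generated id must be stored and passed to every child insert.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the id-threading problem. Both repositories are
// in-memory stand-ins for production code that inserts rows into the database.
final class IdThreadingSketch {

    // Simulates a repository whose create() returns the generated primary key.
    static final class UserRepository {
        private final AtomicLong sequence = new AtomicLong();

        Long create(String name) {
            return sequence.incrementAndGet();
        }
    }

    // The "child" table has a foreign key column, so this repository
    // cannot insert a row without knowing the parent's primary key.
    static final class TodoListRepository {
        private final AtomicLong sequence = new AtomicLong();
        final Map<Long, Long> ownerByListId = new HashMap<>();

        Long create(Long ownerId, String title) {
            Long id = sequence.incrementAndGet();
            ownerByListId.put(id, ownerId);
            return id;
        }
    }

    static Long insertTestData(UserRepository users, TodoListRepository lists) {
        // 1. Insert the parent row first and store its id.
        Long ownerId = users.create("Test user");
        // 2. Pass the stored id to every insert that needs the foreign key.
        return lists.create(ownerId, "Groceries");
    }
}
```

With two tables this is tolerable; with ten tables and several foreign keys, the stored ids have to be threaded through every setup method, which is where the spaghetti comes from.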

I think that these three examples demonstrate that if we write tests which initialize the required test data with production code, our tests will slow us down when we have to make changes to our application. The problem is that if we modify a method which is used to initialize the required test data, we have to check every test which uses or tests this method and make sure that our changes won't cause any problems. This costs both time and money, and has a huge negative effect on the developer experience.

3. Test Data Must Be Easy to Read

If a test fails and it's not immediately clear why it failed, it's extremely useful to see both the test data that's passed to the system under test as input and the test data that's in the database when the failed test is run. If we initialize the required test data with production code, reading the test data can be a bit complicated because we have to either:

1. Read the production code that's used to initialize our test data and try to figure out what kind of data is inserted into the database. This isn't the end of the world if we initialize our test data with repository classes which don't contain any business logic because the repository code is often quite straightforward.

On the other hand, if we initialize our test data with a service class that contains business logic, it takes a lot more work to figure out what kind of test data is found from the database when the failed test is run because we have to determine what kind of test data is produced by our business logic. The amount of work that's required to do this depends on the complexity of our business logic and there are situations when this approach isn't a valid option (or at least the fastest option).

2. Debug the failing test, stop the execution after the test data is inserted into the database, and take a look at the rows which are found from the relevant database tables. Again, this isn't the end of the world and in some (tricky) situations we have to do this anyway, but this requires extra work that isn't always required if we have easy access to our test data.

It's true that sometimes a test failure is so arcane that fixing it requires that we take a (very) good look at the source code and we might have to do a long debug session as well. However, most test failures aren't mythical beasts that can be slain only by a very special hero. In my experience, most test failures are caused by simple mistakes which are easy to fix if our test data is easy to read.

In other words, if we want to be as productive as possible, we must do our best to ensure that these simple mistakes are as easy to fix as possible. This means that we shouldn't initialize the required test data with production code.

I have now identified three reasons why we shouldn't insert our test data into the database by using production code. Let's move on and find out the best way to insert test data into our database.

What Should We Do Then?

It's not easy to create a test data set that's useful, easy to read, easy to maintain, and doesn't make us feel sick in the stomach every time we think about it. That's why we should keep our test data generation logic as simple as possible and use a technique which doesn't tie our hands when we make changes to our application. I think that we should use one of these three options (in this order):

1. Use SQL. This approach has the following benefits:

  • Because our tests don't depend on the system under test (they only invoke the tested method or API endpoint), our tests can fail only if we change the behavior of the system under test (the invoked method or API endpoint). This means that our tests won't tie our hands when we want to make changes to the production code. In other words, our application is easy to maintain.
  • It's easier to write an SQL script (in my opinion) than to write the code that inserts the same data into the database. This means that our tests are easier (and faster) to write.
  • Because all developers are familiar (hopefully) with SQL and the SQL syntax ensures that column names and values are specified next to each other, our test data is easy to read. This means that we can fix simple mistakes by taking a quick look at our test data.
  • Because many popular testing frameworks have support for executing SQL scripts, we don't have to write any plumbing code. All we have to do is to write the required SQL scripts and ensure that the testing framework executes these scripts. Thus, if we use the right testing framework, our tests are easy (and fast) to write.
  • If our testing framework doesn't have support for executing SQL scripts, we can always write the required plumbing code ourself. It's not complicated and if we give it some thought, we can write a reusable component that can be used by all test classes. This means that we have to write this plumbing code only once.
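As a sketch of such plumbing code, the core of a reusable SQL script runner could start from something like this. The class name is illustrative, and the naive split on semicolons deliberately ignores complications like semicolons inside string literals and comments; a real version would execute each parsed statement through a java.sql.Connection, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of reusable plumbing that splits an SQL script into
// individual statements. A production version would run each statement
// through java.sql.Connection and handle semicolons inside string
// literals; this sketch only demonstrates the reusable-component idea.
final class SqlScriptRunner {

    static List<String> parseStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String statement = part.trim();
            if (!statement.isEmpty()) {
                statements.add(statement);
            }
        }
        return statements;
    }
}
```

Because the runner takes the script as input and knows nothing about our tables, every test class can reuse it as-is.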

The only downside of this approach is that we have to store the test data in our SQL scripts and in the constant classes which define the test data that's inserted into the database. That being said, in my experience, this doesn't have any practical impact on our productivity (as long as we use constant classes) and I think that the benefits of this approach outweigh the drawbacks.

2. Generate the required test data programmatically. If we cannot use SQL, we should write test data generator classes which insert the required test data into the database. This approach has the following benefits:

  • Because our tests don't depend on the system under test (they only invoke the tested method or API endpoint), our tests can fail only if we change the behavior of the system under test (the invoked method or API endpoint). This means that our tests won't tie our hands when we want to make changes to the production code. In other words, our application is easy to maintain.
  • We can store our test data in only one place. This means that our test data is as easy (and fast) to change as possible.

On the other hand, this approach has the following drawbacks:

  • Even though the code of our test data generators is quite straightforward (if we write it in the right way), we still have to invest time in writing the required code. Also, we have to write a new test data generator every time we add a new table to the database, and modify the existing test data generators when we make changes to the existing database tables. Because writing code is somewhat slower (in my experience) than writing SQL scripts, this approach will slow us down a bit.
  • Our test data isn't as easy to read as possible because we have to read the source code of our test data generator classes before we know what kind of test data is inserted into the database. That being said, as long as our test data generators simply insert the configured test data into the database, our test data is still a lot easier to read than the test data that's generated with production code.
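To illustrate, a simple test data generator for the todo_item table could look like this hedged sketch. Instead of executing inserts (which would require a database connection), this version only exposes the rows it would insert; the class and method names are my assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of a test data generator. A real generator would run one
// INSERT per configured row (for example with JDBC); this version only
// collects the rows so that the configuration style is visible.
final class TodoItemTestDataGenerator {

    // Represents one row the generator would insert into the todo_item table.
    record TodoItemRow(String title, String description) {}

    private final List<TodoItemRow> rows = new ArrayList<>();

    TodoItemTestDataGenerator withTodoItem(String title, String description) {
        rows.add(new TodoItemRow(title, description));
        return this;
    }

    // A real implementation would execute the inserts here.
    List<TodoItemRow> configuredRows() {
        return List.copyOf(rows);
    }
}
```

Because the generator simply inserts the configured values, a reader can see the test data at the call site without digging through business logic.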

3. Insert the required test data into our database by using repositories. Even though I think that using production code for this purpose creates an unhealthy dependency between our tests and production code, I must also admit that sometimes this is probably the best possible solution. For example, if we have a large test suite which generates the required test data with production code, it makes sense to use this approach in new tests as well.

If we must generate our test data with production code, we should try to minimize the drawbacks of this decision. We can do this by following these rules:

  • Insert the test data into the database by using repository methods. Because the repository classes don't have any business logic (hopefully), it's relatively easy to determine what kind of data is found from the database when our tests are run. Also, because our repository methods won't typically change as often as service methods, we will minimize the amount of maintenance that's required by our test suite when we make changes to the production code.
  • Write test data generator classes. We should put our test data generation logic to one place and make sure that our test classes use our test data generators instead of invoking repository methods. This ensures that if we make breaking changes to our repository methods, we don't have to go through our test suite and fix every test class which invokes the changed repository methods. Instead, we can simply fix our test data generators. In other words, if we use test data generators, we will write tests which are as easy to maintain as possible (in this situation).
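The second rule can be sketched as follows. The repository below is a hypothetical in-memory stand-in; the point is that the generator is the only test code that knows the repository's create() signature, so a breaking change to the repository is fixed in one place instead of in every test class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the test data generator wraps the repository so
// that test classes never invoke repository methods directly.
final class RepositoryBackedSketch {

    // In-memory stand-in for the production repository.
    static final class TodoItemRepository {
        final List<String> titles = new ArrayList<>();

        void create(String title) {
            titles.add(title);
        }
    }

    static final class TodoItemTestDataGenerator {
        private final TodoItemRepository repository;

        TodoItemTestDataGenerator(TodoItemRepository repository) {
            this.repository = repository;
        }

        // Test classes call this method; only the generator knows the
        // repository's create() signature. If that signature changes,
        // this is the only place we have to fix.
        void createTodoItem(String title) {
            repository.create(title);
        }
    }
}
```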

We should now understand why we shouldn't generate our test data with production code. Also, we should be able to identify the different options which allow us to insert test data into our database and select the option that makes the most sense. Let's summarize what we learned from this blog post.

Summary

This blog post has taught us six things:

  • If we initialize our test data with production code, we write tests which can fail for more than one reason.
  • If we initialize our test data with production code, we write tests which slow us down when we make changes to the production code.
  • If we initialize our test data with production code, our test data isn't as easy to read as possible.
  • We should initialize our test data with SQL scripts.
  • If we cannot (or don't want to) use SQL scripts, we should write test data generators which insert the required test data into the database.
  • If we must initialize our test data with production code, we should write test data generators which invoke repository methods and make sure that our tests use our new test data generators.