Unit Test Your Database.

published at 14 Nov, 2011 by Szymon Lipiński tags: orm postgresql testing

I observe the endless war about TDD. On one side there are those who claim that TDD is the best thing ever. On the other side there are those who claim that TDD is just a waste of time and is too slow.

But what about databases? Nobody writes about them too seriously.

In some books there are some wise words like:

  • we should use TDD for DAO
  • we should use TDD for ORM objects
  • we should use TDD for queries

This is very good for testing the unique, foreign and primary keys. Very good for all checks. What is missing here?

Should we use a little bit more TDD here?

Two simple examples of what should be done:

TDD Your Indices

Imagine you have a database filled by an ETL process, which loads huge amount of data. This means you can have many different databases with different versions of the data and structure. For one database you could notice that it’s is slow. It is too slow. Simple index could help. You check that and it really helps. OK, good.

You just fix the ETL to create this index after loading the data. You are happy – you’ve already fixed the problem. Did you fix the real problem?

What with the rest of the databases? Well… they need that index too. So you just create the index on all the databases/schemas you remember about.

  • Have you created the index on all?
  • What is the probability of forgetting about some of schemas and databases?
  • What if suddenly someone used the old ETL version instead of the new one?
  • What if the old version is still used because you will release the new version in a week?

The answer is simple: TDD.

TDD you database.

If you are using PostgreSQL, just write a simple test using pgTap. Then it would be as simple as:

    SELECT has_index(
      'public',
      'tab',
      'my_index',
      ARRAY['a', 'b']
    );

Next time when the database is slow, just run this simple test. It will tell you if the index exists or maybe you should create it.

Automate This

Automate the tests. You can use Jenkins for that with the plugin for TAP. I will write about that some time later, but Jenkins is a really great tool. With the TAP plugin you can easily combine testing your Java (or Python, or C++) project and test the database as well.

Version Your Database

Versioning can be done in many ways. This entry is about Java, so I’d use liquibase, which integrates database patches with java in a very nice way (yes, I know, there is XML, but nothing is perfect).

OK, this is not fully about the TDD. You shouldn’t use the TDD for pure versioning, but you should version the database, so you can check the version when your application starts.

Imagine an application which checks the database version during the start up. It can alert it the version is wrong. The it can just exit with an error. Some say that it could be frustrating a little bit. I think that even more frustrating is when you notice the problem after many hours or days of the application running. Suddenly it turns out that the application was using wrong database all the time. You tested the application on a different database and later run it on quite a different schema. What if it caused some unmaintainable inconsistency in the data?

Data is, in most cases, the most valuable asset you have.

TDD Sucks

Short answer to all those who say that such testing is stupid. That the time is too precious to spend it for writing tests.

Just calculate how much time those programmers and admins could save if the potential inconsistency bugs would be caught by the tests earlier.

Just calculate how much time would programmers save if they wouldn’t be just managing the bug tickets. And this could be caused only because they couldn’t ensure everything works before deploying an application to production.