Should DbFit be used for running performance or production data tests?

This question has been discussed on the DbFit forums a few times (and very similar discussions have taken place over at the FitNesse forums); here is my 2c on the subject.

Unit tests and executable specifications

DbFit was originally designed as a unit-level testing framework for database code. I (and many others) have also used DbFit (and FitNesse) for writing executable specifications - I believe that the tool is a really good fit for those types of tests.

Now, to maintain quality in large and complex software, those types of tests probably aren’t enough - most teams will want to also regularly run performance and/or production data tests as well. Now, can DbFit be used to run those tests? Sure. Should DbFit be used to run those tests? I’m not so sure.

Technical reasons against

Problems start when you start considering input data. You could choose to insert the data using DbFit (and quickly find that a multi-thousand Insert table is a maintenance nightmare), or find yourself writing custom fixtures that import data from a flat file. Both approaches are very clunky.

If, say, you wish to compare the results of two 100,000 row resultsets with DbFit - not uncommon for production data tests. You might try to do this with the compare stored queries table, but this will probably be horribly slow, if it works at all. The reason for this is the computationally inefficient algorithm that compare stored queries uses for comparison - the algorithm works fine for 10, 100 (and even 1000) rows, but doesn’t scale well beyond that.

Even if the comparison worked, in the case that you get failures, there’s currently no way to filter out matching from non-matching rows, meaning that the user basically has to scroll up and down the results to find the failures - not ideal.

These certainly aren’t insurmountable technical hurdles, and could be fixed with moderate effort within DbFit.

Philosophical reasons against

A tool like DbFit brings a significant amount of complexity (and consequently cost). You have to spend time:

  • setting DbFit up as part of the build system
  • learning the syntax
  • debugging when it doesn’t like the latest environment change

(the list goes on) and that can all eat up a large amount of time.

A “one tool fits all” approach might seem tempting, but here a “best tool for the job” tends to cause less issues in the long-term. If your prod data tests looks like this:

  1. Identify some production data
  2. Run the production data against the old version of the system
  3. Run the production data against the new version of the system
  4. Manually compare output of the two versions

you are likely to see a better cost/benefit ratio by using a simpler approach to triggering the test runs, comparing data and seeing the differences (eg running steps 2, 3 from a shell script, and using a simple comparison query, again on the command line) than using DbFit.