It’s one thing to see someone drive a car and have a theoretical understanding of what the pedals do and how to change gears. It’s a completely different thing to be able to drive a car safely on the street. Some activities involve many small details and take hands-on experience before you can do them well.
The good news is that unit testing is a lot simpler than driving a standard shift, but there are a lot of small details you need to get right to do it successfully. Even after reading about unit testing and being convinced of its benefits, programmers are often not sure how to get started. This month’s column is not going to try to convince you of the many benefits of unit testing (I hope you are already convinced), but rather, describe some very concrete tips to help you get started right away.
Goals of Unit Testing
There are many different reasons to write unit tests:
- Correctness testing. Checking that the code behaves as designed.
- Boundary testing. Checking that the code behaves correctly in odd or boundary situations.
- Regression testing. Checking that the behavior of the code doesn’t change unintentionally over time.
- Performance testing. Checking that the program meets certain minimum performance or memory constraints.
- Platform testing. Checking that the code behaves the same across multiple platforms.
- Design. Tests provide a way to advance the code design and architecture. This is usually referred to as Test-Driven Development (TDD).
- Full game or tool testing. Technically this is a functional test, not a unit test anymore because it involves the whole program instead of a small subset of the code, but a lot of the same techniques apply.
Some developers use unit tests only for one of the reasons listed above, while others use many kinds of tests for a variety of different reasons. It’s important to recognize that because there are so many different uses for unit tests, no single solution is going to fit everybody. The ideal setup for some of those situations is going to be slightly different than for others. The basics are the same for all of them though.
When working with unit tests, these are our main goals:
- Spend as little time as possible writing a new test.
- Be notified of failing tests, and see at a glance which ones failed and why.
- Trust our tests. Have them be consistent from run to run and robust in the face of bad code.
Most of us have created one-off programs in the past to test some particularly complicated code. It’s usually a quick command line program that runs through a bunch of cases and asserts after each one that the results were correct. That’s the most bare-bones way of creating unit tests.
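As a rough illustration, a one-off test program of that kind might look like the sketch below. Clamp is a made-up function standing in for the complicated code under test, and RunClampChecks is a hypothetical name, not something from a real library.

```cpp
#include <cassert>

// Hypothetical function under test: restricts value to the range [low, high].
int Clamp(int value, int low, int high)
{
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

// Runs through a handful of cases, asserting after each one.
// Returns the number of cases checked so the caller can print a summary.
int RunClampChecks()
{
    int checked = 0;
    assert(Clamp(5, 0, 10) == 5);   ++checked;  // value already in range
    assert(Clamp(-3, 0, 10) == 0);  ++checked;  // below the range
    assert(Clamp(42, 0, 10) == 10); ++checked;  // above the range
    assert(Clamp(0, 0, 10) == 0);   ++checked;  // exactly on the lower boundary
    return checked;
}
```

Calling RunClampChecks() from main is the whole program. The first wrong result aborts with an assertion, so nothing after it runs and there is no summary of what else might be broken.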
Unfortunately, it’s also a pain, and it misses most of the unit-testing goals described in the previous section: creating a new program just for that is tedious, we have to go out of our way to run the tests, and it usually gets out of date faster than the latest Internet meme. That is in part why a lot of programmers have an initial aversion to writing unit tests.
If you’re considering writing even a small unit test, you should use a unit-testing framework. A unit-testing framework removes all the busy work from writing unit tests and lets you spend your time on the logic of what to test. This doesn’t mean that the framework writes the tests for you. Be very wary of any tool that claims to do that! No, a unit-testing framework is simply a small library that provides all the glue for running unit tests and reporting the results. Sorry, you still need to use your brain and do (some of) the typing.
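To make "glue" a bit more concrete, here is a deliberately tiny sketch of the kind of plumbing a framework provides: tests register themselves, and one function runs them all and reports failures. Every name here (TinyTest, TINY_TEST, RunAllTinyTests) is invented for illustration. This is not how UnitTest++ is implemented, just one plausible shape for the idea.

```cpp
#include <cstdio>
#include <vector>

// A hypothetical test record: a name plus a function that returns true on success.
struct TinyTest
{
    const char* name;
    bool (*run)();
};

// Function-local static so the list exists before any test registers itself.
std::vector<TinyTest>& TinyRegistry()
{
    static std::vector<TinyTest> tests;
    return tests;
}

// Constructing one of these at global scope adds a test to the registry.
struct TinyRegistrar
{
    TinyRegistrar(const char* name, bool (*run)())
    {
        TinyRegistry().push_back(TinyTest{name, run});
    }
};

// Declares a test function and registers it in one step.
#define TINY_TEST(Name)                                   \
    static bool Name();                                   \
    static TinyRegistrar registrar_##Name(#Name, &Name);  \
    static bool Name()

// Runs every registered test, prints each failure, and returns the failure count.
int RunAllTinyTests()
{
    int failed = 0;
    for (const TinyTest& test : TinyRegistry())
    {
        if (!test.run())
        {
            std::printf("Failure in %s\n", test.name);
            ++failed;
        }
    }
    std::printf("%d out of %d tests failed\n",
                failed, static_cast<int>(TinyRegistry().size()));
    return failed;
}

// Two example tests: one that passes and one that fails on purpose.
TINY_TEST(AdditionWorks)      { return 2 + 2 == 4; }
TINY_TEST(DeliberatelyBroken) { return 2 + 2 == 5; }
```

Adding a new test is one macro and a body, with no list to maintain by hand. That is the busy work a real framework takes off your plate, along with much better failure messages.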
A quick search will reveal plenty of unit-testing frameworks to choose from for your language of choice, and most of them are free and open source so you can rely on them and modify them to suit your needs. For C/C++ and game development, I strongly recommend starting with UnitTest++. Charles Nicholson and I wrote that framework a few years ago specifically with games and consoles in mind. Many game teams have adopted it for their games and tools, and it has been used on lots of different game platforms including current and last generation consoles, Windows, Linux, and Mac PCs, and even on the iPhone. In most situations, it should be a straight drop-in to your project and you’re up and running.
If you end up using a different testing framework, or even if you roll your own, the techniques described here still apply, even if the syntax is slightly different.
Writing your first test is easy as pie:
TEST(MyFirstTest)   // the test name is arbitrary; any valid identifier works
{
    int a = 4;
    CHECK(a == 4);
}
To run it you need to add the following line to your executable somewhere. We’ll talk more about the physical organization of tests in a moment.
int failedCount = UnitTest::RunAllTests();
Done! Easy, wasn’t it? When you compile and run the program you should see the following output:
Success: 1 test passed.
Test time: 0.00 seconds.
Let’s add a failing test:
TEST(MyFailingTest)
{
    int a = 5;
    CHECK(a == 4);
}
Now we get:
/fullpath/filename.cpp:17: error: Failure in MyFailingTest: a == 4
FAILURE: 1 out of 2 tests failed (1 failures).
Test time: 0.00 seconds.
That’s great, but if we’re going to diagnose the problem, we probably need to know the value of the variable a, and all the test is telling us is that it’s not 4. So instead, we can change the CHECK statement to the following:
int a = 5;
CHECK_EQUAL(4, a);
And now the output will be:
/fullpath/filename.cpp:17: error: Failure in MyFailingTest: Expected 4 but was 5
FAILURE: 1 out of 2 tests failed (1 failures).
Test time: 0.00 seconds.
Much better. Now we get both the error information and the value of the variable under test. Virtually all unit-testing frameworks include different types of CHECK statements to get more information when testing floats, arrays, or other data types. You can even make your own CHECK statement for your own common data types such as colors or lists.
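For your own data types, a custom check can be as simple as a helper function that compares two values and prints both when they differ. The sketch below assumes a made-up Color struct and a hypothetical CheckColorClose helper. Neither is part of any framework.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical color type; substitute your engine's own.
struct Color
{
    float r, g, b;
};

// Returns true if every channel of actual is within tolerance of expected.
// On a mismatch it prints both values, so the report shows what went wrong.
bool CheckColorClose(const Color& expected, const Color& actual, float tolerance)
{
    const bool close =
        std::fabs(expected.r - actual.r) <= tolerance &&
        std::fabs(expected.g - actual.g) <= tolerance &&
        std::fabs(expected.b - actual.b) <= tolerance;

    if (!close)
    {
        std::printf("Expected (%g, %g, %g) but was (%g, %g, %g)\n",
                    expected.r, expected.g, expected.b,
                    actual.r, actual.g, actual.b);
    }
    return close;
}
```

Inside a test, the result can go straight into the ordinary macro, as in CHECK(CheckColorClose(expected, actual, 0.01f)). The framework then reports the file and line, and the helper supplies the values.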
As a bonus, if you’re using an IDE, double-clicking on the test failure message should bring you automatically to the failing test statement.
When To Run
When to run unit tests will depend on what is being tested and how long it takes to do so. In general, the more frequently you run the tests, the better. The sooner you get feedback that something went wrong, the easier it will be to fix, ideally before the change is checked in and spreads to the rest of the team. On the flip side, realistically, building and running a set of unit tests takes a certain amount of time, so it’s important to find the right balance between feedback frequency and time spent waiting for tests.
At the very least, all tests should run once a day, during the nightly build process on your build server. (You have a build server, don’t you? If not, run over and read this column in the August 2008 issue.) It doesn’t matter how long they take or how many different projects you need to run. Just add them to the build script, and hook their output into the build results.
On the other extreme, you can build your tests every time you build the project and execute them as a postbuild step. That way, any time you make a change to a project, all tests will execute and you’ll see if anything went wrong. This is a great approach, but I wouldn’t recommend it if tests add more than a couple of seconds to the incremental build time; otherwise, they’ll be slowing you down more than they help.
For most developers, some approach in the middle will make most sense. For example, take a small, fast subset of tests that are more likely to break, and run those with every build. Whenever any code is checked in to version control, the build server can run those tests plus a few more, slower ones. And finally, at night, you can bring out the big guns and run those really long, really thorough ones that take a few hours to complete. You can separate those tests into different projects, or, if your framework supports them, into different test suites, which allow you to decide which sets of tests to execute at runtime.
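The tiered schedule above can be sketched as a filter over tagged tests. Everything in this example is hypothetical: the Tier enum, the TieredTest record, and the placeholder test functions are stand-ins, and in a real framework the suite mechanism would play this role.

```cpp
#include <vector>

// Hypothetical tiers, ordered from fastest to slowest.
enum class Tier { EveryBuild = 0, CheckIn = 1, Nightly = 2 };

struct TieredTest
{
    const char* name;
    Tier tier;
    bool (*run)();   // returns true if the test passed
};

// Runs every test whose tier is at or below maxTier.
// Returns the failure count and, if ranCount is non-null, stores how many tests ran.
int RunTestsUpTo(const std::vector<TieredTest>& tests, Tier maxTier, int* ranCount)
{
    int failed = 0;
    int ran = 0;
    for (const TieredTest& test : tests)
    {
        if (test.tier > maxTier)
            continue;            // too slow for this kind of run
        ++ran;
        if (!test.run())
            ++failed;
    }
    if (ranCount)
        *ranCount = ran;
    return failed;
}

// Placeholder test bodies standing in for real tests.
bool QuickMathTest()         { return 1 + 1 == 2; }
bool SaveGameRoundTripTest() { return true; }
bool FullLevelSoakTest()     { return true; }

std::vector<TieredTest> ExampleTests()
{
    return {
        {"QuickMath",         Tier::EveryBuild, &QuickMathTest},
        {"SaveGameRoundTrip", Tier::CheckIn,    &SaveGameRoundTripTest},
        {"FullLevelSoak",     Tier::Nightly,    &FullLevelSoakTest},
    };
}
```

A postbuild step would call RunTestsUpTo with Tier::EveryBuild, the build server with Tier::CheckIn on each commit, and the nightly build with Tier::Nightly, so each slower run also covers everything the faster ones do.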
If a unit test fails and nobody notices, is it really an error? Just running the tests isn’t good enough. We have to make sure that someone sees the failure and fixes the problem.
Most unit-testing frameworks will let you customize how you want the failure errors to be reported. By default they will probably be sent to stdout, but you can easily customize the framework to send them to debug log streams, save to a file, or upload them to a server.
Even more important than the actual error messages is detecting whether there were any failures. After running all the tests, there is usually some way to detect how many tests failed. The program that was running the tests can detect any failures, print an error message, and exit with an error code. That error code will propagate to the build server and trigger a build failure. Hopefully by now alarm sounds are going off across the office and someone is on his way to fix it.
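One small detail is worth a sketch here: how the failure count becomes an exit code. The helper below is a hypothetical name of my own. It collapses the count to plain success or failure, because on many systems (POSIX ones in particular) the parent process only sees the low 8 bits of the exit status, so a raw count of exactly 256 failures would read as zero.

```cpp
#include <cstdlib>

// Maps a failure count to a process exit code.
// Any nonzero count becomes EXIT_FAILURE, so the value can never wrap
// around to look like success once the OS truncates the exit status.
int ExitCodeFromFailures(int failedCount)
{
    return failedCount == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

In the test program’s main, this would wrap the framework call, for example: return ExitCodeFromFailures(UnitTest::RunAllTests());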
When people start down the unit test path, they often struggle to figure out how to physically lay out the unit tests. In the end, it really doesn’t matter too much as long as it makes sense to you, the final build doesn’t contain any tests, and they’re still easy to build and run.
My personal preference is to keep unit tests separate from the rest of the code. Usually I end up creating one file of tests for every cpp file. So FirstPersonCameraController.h and .cpp have a corresponding TestFirstPersonCameraController.cpp. Since I use this convention regularly throughout all of my code, I have a custom IDE macro to toggle between a file and its corresponding test file. I also put all the tests in a separate subdirectory to keep them as physically separate as possible.
I prefer to break up my code into several static libraries, one for each major subsystem: graphics, networking, physics, animations, etc. Each of those libraries has a set of unit tests, but instead of compiling them into the library, I create a separate project that creates a simple executable program. That project contains all the unit tests and links against the library itself, and in its main entry point it just calls the function that runs all unit tests and returns the number of failures. This keeps the tests separate from the library, but still very easy to build.
If all your code is organized into libraries, and your game is just a collection of libraries linked together, that’s all you need. Most games and tools, however, have a fair amount of code that you might want to test in the project itself. Since the game is an executable, you can’t easily link against it from a different project like we did before. In this case, I build the unit tests into the game itself, and I optionally call them whenever a particular command-line parameter like -runtests is present. Just make sure to #ifdef out all the tests in the final build.
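A minimal sketch of that command-line hook follows. The names are assumptions: WITH_UNIT_TESTS is an invented build define, HasCommandLineFlag and MaybeRunTests are hypothetical helpers, and the runTests parameter stands in for whatever entry point your framework provides.

```cpp
#include <cstring>

// Normally set by the build system and left undefined in the final build.
// Defined here only so the sketch compiles on its own.
#define WITH_UNIT_TESTS 1

// Returns true if flag appears among the command-line arguments.
bool HasCommandLineFlag(int argc, const char* const argv[], const char* flag)
{
    for (int i = 1; i < argc; ++i)   // argv[0] is the program name
    {
        if (std::strcmp(argv[i], flag) == 0)
            return true;
    }
    return false;
}

// Runs the tests only when -runtests was passed and tests are compiled in.
// Returns the failure count, or -1 if the tests did not run.
int MaybeRunTests(int argc, const char* const argv[], int (*runTests)())
{
#ifdef WITH_UNIT_TESTS
    if (HasCommandLineFlag(argc, argv, "-runtests"))
        return runTests();
#else
    (void)argc; (void)argv; (void)runTests;   // tests stripped from this build
#endif
    return -1;
}
```

The game’s main would call MaybeRunTests first and exit with the result if it is not -1. Otherwise it carries on with normal startup.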
Running the tests on the PC where you build the code is very straightforward. But unless you’re only creating games and tools for that platform, you will definitely want to run your tests on different platforms as well. Unit tests are an invaluable tool for catching slight platform inconsistencies caused by different compilers, architecture idiosyncrasies, or varying floating point rules.
Unfortunately, running unit tests on a different platform from your build machine is usually a bit more involved and not nearly as fast as doing it locally. You need to start by compiling the tests for the target platform. This is usually not a problem since you’re already building all your code for that platform, and hopefully your unit testing framework already supports it. Then you need to upload your executable with the tests and any data required to the target platform and run it there. Finally, you need to get the return code back to detect if there were any failures. This is surprisingly the trickiest part of the process with a lot of console development kits. If getting the exit code is not a possibility, you’ll need to get creative by parsing the output channel, or even waiting for a notification on a particular network port.
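If you do end up parsing the output channel, the check can be quite simple. This sketch assumes the captured text contains a summary line in the style shown earlier ("Success: ..." or "FAILURE: ..."). RemoteResult and ParseTestOutput are hypothetical names, and a different framework or reporter would need different markers.

```cpp
#include <string>

enum class RemoteResult { Passed, Failed, Unknown };

// Scans output captured from the target for the test summary line.
// Unknown means no summary was found, which usually indicates a crash or
// a hang, so the caller should treat it as a failure too.
RemoteResult ParseTestOutput(const std::string& output)
{
    if (output.find("FAILURE:") != std::string::npos)
        return RemoteResult::Failed;
    if (output.find("Success:") != std::string::npos)
        return RemoteResult::Passed;
    return RemoteResult::Unknown;
}
```

Checking for the failure marker first means a log that somehow contains both lines is still reported as failed, which is the safer mistake to make.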
Some target platforms are more limited than others in both resources and C++ support. One of the features that makes UnitTest++ a good choice for games is that it requires minimal C++ features (no STL) and it can be trimmed down even further (no exceptions or streams).
For example, running unit tests on the PS3 SPUs was extremely useful, but it required stripping the framework down to the minimum amount of features. It was also tricky being able to fit the library code plus all the tests in the small amount of memory available. To get around that, we ended up changing the build rules for the SPUs so each test file created its own SPU executable (or module). We then wrote a simple main SPU program that would load each module separately, run its tests, keep track of all the stats, and finally report them.
Running a set of unit tests on the local machine can be an almost instant process, but running them on a remote machine is usually much slower, and can take up to 10 or 20 seconds just with the overhead of copying them and launching the program remotely. For this reason, you’ll want to run tests on other platforms less frequently.
No Leaking Allowed
Finally, if you’re going to have all this unit testing code running on a regular basis, you might as well get as much information out of it as possible. I have found it invaluable to keep track of memory leaks around the unit test code.
You’ll have to hook into your own memory manager, or use the platform-specific memory tracking functions. The basic idea is to get a memory status before running the tests, and another one after all the tests execute. If there are any extra memory allocations, that’s probably a leak. In that case, you can report it as a failed build by returning the correct error code.
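The before-and-after comparison can be sketched like this. TrackedHeap is a made-up, minimal stand-in for your memory manager or the platform’s tracking functions, and RunTestsWithLeakCheck, CleanRun, and LeakyRun are hypothetical names used only for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical allocation tracker: counts blocks that are currently live.
struct TrackedHeap
{
    int liveAllocations = 0;

    void* Allocate(std::size_t bytes)
    {
        void* block = std::malloc(bytes);
        if (block)
            ++liveAllocations;
        return block;
    }

    void Free(void* block)
    {
        if (block)
        {
            std::free(block);
            --liveAllocations;
        }
    }
};

// Takes a snapshot before the tests, runs them, and compares afterwards.
// Leftover allocations are reported and counted as one extra failure.
int RunTestsWithLeakCheck(TrackedHeap& heap, int (*runTests)(TrackedHeap&))
{
    const int before = heap.liveAllocations;
    int failed = runTests(heap);
    const int leaked = heap.liveAllocations - before;
    if (leaked != 0)
    {
        std::printf("Memory leak: %d allocation(s) still live after tests\n", leaked);
        ++failed;
    }
    return failed;
}

// Placeholder test runs: one cleans up after itself, one does not.
int CleanRun(TrackedHeap& heap)
{
    void* block = heap.Allocate(64);
    heap.Free(block);
    return 0;
}

int LeakyRun(TrackedHeap& heap)
{
    heap.Allocate(64);   // deliberately never freed
    return 0;
}
```

Because the leak is folded into the failure count, it reaches the build server through the same exit code as any other failing test.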
Watch out for static variables or singletons that allocate memory the first time they’re used. They might be reported as memory leaks even though they aren’t what you were hoping to catch. In that case, you can explicitly initialize and destroy all singletons, or, even better, not use them at all, and keep your memory leak report clean.
You’re now armed with all you need to know to set up unit tests in your project and build pipeline. Grab a testing framework and get started today.
This article was originally printed in the May 2009 issue of Game Developer.