Stepping Through the Looking Glass: Test-Driven Game Development (Part 3)

After reading the first two parts of this article, you should have enough information to strike boldly and apply test-driven development to your projects. At first you’re likely to find that the road is somewhat rough and bumpy, though. How do you break up this module into something testable? How can you prevent your tests from becoming a burden to maintain? What exactly should you test?

This third and last part will cover some tips that should help smooth out the learning curve and make you more productive from the start. We’ll also look at some of the consequences of applying TDD to a project and what kind of things you can expect on the other side of the looking glass.

Tips from the trenches

When I started doing test-driven development, I was going through the motions, following the advice from Kent Beck’s book, and there was no doubt I was getting a lot of benefit in what I was doing. However, I recently looked back at the tests I was writing back then and I was a bit shocked. They’re quite different from what I would write today. In particular, they were not very clean, maintainable tests. These are some tips that I wish I had known when I started out.

Use fixtures extensively

Using fixtures (objects that automatically set up and tear down some state for you for every test) is a great way to keep tests small and to the point. All the setup and shutdown code gets refactored away into the fixture and the test can deal just with the things we’re testing. They will also make the tests much easier to refactor later on (which, if you follow TDD, you will be doing extensively).

Make sure your unit test framework allows you to have fixtures and that you can create them as easily as possible.

The rule I follow for using fixtures is the following (very similar to other XP rules about refactoring): the first time I write a test that has some setup and teardown, I just put it in the test itself because it’s the fastest way to get the test up and running. The second time I write a test that requires a similar setup and teardown, I also put it in the test, make the test pass, and then I consider whether to fold it into a fixture or not (depending on how extensive the setup code is). The third time I write a test that needs the same code, I also put it in the test first, but then I make a fixture right away.

I try to never refactor tests into a fixture while I have failing tests, though. First make the tests pass, then move common code into the fixture (which is part of the test refactoring step if you remember the first article).
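To make the mechanics concrete, here is a minimal sketch of how a fixture can work without any particular framework. Everything in it is an assumption made for illustration: `World`, `WorldFixture`, `RunTest`, and the `CHECK_EQUALS` stand-in are hypothetical names, and a real framework's fixture macro may well be implemented differently.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the framework's check macro (an assumption for this sketch).
#define CHECK_EQUALS(expected, actual) assert((expected) == (actual))

// Hypothetical class under test: a world that hands out object ids.
class World
{
public:
    World() : m_nextId(1) {}
    int AddObject() { m_objects.push_back(m_nextId); return m_nextId++; }
    int ObjectCount() const { return static_cast<int>(m_objects.size()); }
private:
    std::vector<int> m_objects;
    int m_nextId;
};

// The fixture: the constructor is the setup, the destructor is the teardown.
struct WorldFixture
{
    WorldFixture() : playerId(world.AddObject()) { ++LiveCount(); }
    ~WorldFixture() { --LiveCount(); }

    // Counts fixtures currently alive, only so the teardown is observable.
    static int& LiveCount() { static int count = 0; return count; }

    World world;   // declared first, so it exists before playerId is set
    int playerId;
};

// Roughly what a fixture macro could generate: a class derived from the
// fixture whose member function is the test body, so fixture members
// are directly in scope.
struct NewWorldContainsOnlyThePlayer : WorldFixture
{
    void Run() { CHECK_EQUALS(1, world.ObjectCount()); }
};

struct AddingObjectIncreasesCount : WorldFixture
{
    void Run()
    {
        world.AddObject();
        CHECK_EQUALS(2, world.ObjectCount());
    }
};

// A fresh fixture is built and destroyed around every single test.
template <typename Test>
void RunTest()
{
    Test test;
    test.Run();
}
```

Calling `RunTest<AddingObjectIncreasesCount>()` constructs the fixture, runs the body, and destroys the fixture again, so no test can leak state into the next one.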

Keep your tests short and simple

This one is really important. Probably the most important of all the advice in this article. So if you’re only going to take one thing away, make sure it’s this one.

When developing using TDD, you will create as much test code as production code. Sometimes as much as one and a half times as much! That code isn’t set in stone either. Quite the contrary, it’s supposed to be your safety net, and you’ll be changing it and moving it around as you create new code and refactor existing classes.

That means that test code should be very easy to change and refactor itself. It’s like the scaffolding around a building. It’s not enough to make it reach places and be secure. You have to make sure you can take it down and move it around as needed.

I’m a big fan of simple code, but it’s extremely important that your tests should be as simple as possible. The idea is that when a test fails, you want to know right away what failed. If the test itself is 100 lines long and is full of check statements and loops, you’ll have to resort to the debugger to find out what went wrong.

How do we keep the tests as simple as possible?

Keep tests extremely short. I try to keep my tests under about 10 lines. Preferably less. I want something I can look at and know at a glance what it does. My favorite tests are the ones that are one or two lines long.
Label your tests correctly. The name of your test should tell you what it does. Don’t call your test TestHealth (what exactly does that do?). Instead, call it DamageLowersHealth.
Aim to have one CHECK statement per test. The idea here is that you want to be testing only one thing at a time. If there’s something else to test, even if it’s with the same inputs, write a new test. Not everybody buys into this, but it has made a huge difference in the tests I write. When I started out doing TDD I was not shy about making more complex tests with multiple CHECKs. Then, when a test failed, it wasn’t always immediately obvious what had gone wrong. Clearly, to do this effectively, you need to use fixtures as described in the previous item.

Here’s an example of what I consider a bad unit test:

// BAD UNIT TEST! 
TEST (TestArmor) 
{ 
    ArmorComponent armor(100, 10); 
    CHECK_EQUALS (100, armor.Amount()); 
    CHECK_EQUALS (10, armor.RechargeRate()); 
    armor.ApplyDamage(30); 
    CHECK_EQUALS (70, armor.Amount()); 
    armor.ApplyDamage(80); 
    CHECK_EQUALS (0, armor.Amount()); 
    armor.Update(1.0f); 
    CHECK_EQUALS (10, armor.Amount()); 
    armor.Update(10.0f); 
    CHECK_EQUALS (100, armor.Amount()); 
}

It actually doesn’t look too bad, right? It’s not too long a test, and it doesn’t have any nasty loops. Still, it’s a really bad unit test. Imagine you just made some changes to some other part of the system and the fifth CHECK statement fails. Can you look at it for a second and tell me what went wrong? Probably not. Heck, I don’t even know what the test is doing by the test name! The test is far too complex. Here’s how it should look instead:

class ArmorFixture 
{ 
public: 
    ArmorFixture() : armor(100, 10) {} 
    ArmorComponent armor; 
};  

TEST_F (ArmorFixture, ArmorHasCorrectInitialValue) 
{ 
    CHECK_EQUALS (100, armor.Amount()); 
} 

TEST_F (ArmorFixture, ArmorHasCorrectRechargeRate) 
{ 
    CHECK_EQUALS (10, armor.RechargeRate()); 
} 

TEST_F (ArmorFixture, DamageLowersArmorByCorrectAmount) 
{ 
    armor.ApplyDamage(30); 
    CHECK_EQUALS (70, armor.Amount()); 
} 

TEST_F (ArmorFixture, DamageDoesNotGoBelowZero) 
{ 
    armor.ApplyDamage(300); 
    CHECK_EQUALS (0, armor.Amount()); 
} 

TEST_F (ArmorFixture, UpdateRechargesArmorByCorrectAmount) 
{ 
    armor.ApplyDamage(50); 
    armor.Update(1.0f); 
    CHECK_EQUALS (60, armor.Amount()); 
} 

TEST_F (ArmorFixture, UpdateDoesNotRechargePastMax) 
{ 
    armor.ApplyDamage(50); 
    armor.Update(20.0f); 
    CHECK_EQUALS (100, armor.Amount()); 
}

If a test fails right now, it should be immediately obvious what went wrong. Also notice how the largest test is only three lines long. It’s trivial enough to see what we’re testing at a glance.

Keep your tests fast

“Unit tests run fast. If they don’t run fast, they aren’t unit tests” – Michael Feathers, Working Effectively with Legacy Code.

We’re going to have lots of unit tests. Lots of them! Thousands upon thousands, so they’d better run fast. Think about it. If your average test takes 100 ms and we run 5000 of them, it’s going to take over 8 minutes to run them! A unit test should take less than 1 ms, and ideally closer to 1 microsecond. At 1 ms each, 5000 tests still take 5 seconds; get the average under 0.1 ms and the whole run takes less than half a second, which is a bit more acceptable.

What does that mean?

Forget about doing file operations in your unit tests. Or at least minimize them. You probably have a stream class, so anything that requires serialization should use memory instead. Besides, accessing the disk probably means keeping data files along with the unit tests, which makes them harder to keep up to date. Still, if you have to do it for a few tests, that’s fine. Just don’t make a habit out of it.
Don’t initialize/shutdown expensive systems in your fixtures. If you absolutely need to have a system that’s expensive to initialize, do it once per test project (and make sure you reset it to a default state after every test).
Work with one library at a time. Ideally, you should only be working with the code you’re actively changing. Pull in the minimum number of libraries you need to develop those tests and keep your build times down to a minimum to have the fastest possible iteration times. You’ll also only need to constantly run the tests for the library you’re working on. Once you’ve implemented a feature you can build the whole project and run all the unit tests for all the libraries to make sure everything else is still A-OK.
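As an illustration of the first point, here is a small sketch of a serialization test that never touches the disk. It assumes the save and load code is written against the standard stream base classes, so a test can hand it a `std::stringstream` where the game would pass a file stream. `PlayerState`, `Save`, `Load`, and the `CHECK_EQUALS` stand-in are hypothetical names; your own stream class would play the same role.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Stand-in for the framework's check macro (an assumption for this sketch).
#define CHECK_EQUALS(expected, actual) assert((expected) == (actual))

// Hypothetical data to serialize.
struct PlayerState
{
    int health;
    int ammo;
};

// Written against the stream base classes, so the game can pass a file
// stream and the tests can pass a string stream.
void Save(std::ostream& out, const PlayerState& state)
{
    out << state.health << ' ' << state.ammo;
}

bool Load(std::istream& in, PlayerState& state)
{
    in >> state.health >> state.ammo;
    return !in.fail();
}

// Fixture: saves to memory and loads back. No disk, no data files.
struct RoundTripFixture
{
    RoundTripFixture() : loadSucceeded(false)
    {
        const PlayerState original = { 75, 12 };
        loaded.health = 0;
        loaded.ammo = 0;
        std::stringstream memory;
        Save(memory, original);
        loadSucceeded = Load(memory, loaded);
    }
    PlayerState loaded;
    bool loadSucceeded;
};

void LoadingSavedStateSucceeds()
{
    RoundTripFixture fixture;
    CHECK_EQUALS(true, fixture.loadSucceeded);
}

void SavedHealthLoadsBackUnchanged()
{
    RoundTripFixture fixture;
    CHECK_EQUALS(75, fixture.loaded.health);
}

void SavedAmmoLoadsBackUnchanged()
{
    RoundTripFixture fixture;
    CHECK_EQUALS(12, fixture.loaded.ammo);
}

void LoadingFromEmptyStreamFails()
{
    std::istringstream empty("");
    PlayerState state = { 0, 0 };
    CHECK_EQUALS(false, Load(empty, state));
}
```

The tests are plain functions here only to keep the sketch independent of any framework; with a real one they would be ordinary fixture tests.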

If the number of unit tests for a single library gets overwhelming, start thinking of breaking them into suites, so you can run a subset of them while developing. Personally, I haven’t gotten there yet, but I like to keep my libraries small and simple also, so running a few thousand tests still keeps my incremental build times to about a second or less.

Refactoring should be just that: refactoring

Resist the temptation to add new functionality when you’re supposed to be refactoring. It’s very easy. I admit it. It’s also too easy to sneak in a few changes here and there when nobody is looking (yet another reason why pair programming is great). But resist it.

If you decide that you should check that the index is within range, save that for the next test. What are you going to do if a pointer is NULL, return false? Write a test. So what if it’s a trivial test? It’s only going to take you 10 seconds to write it, so why not? That way you keep maximum code coverage and it’s harder for things to slip between the cracks.

Refactoring should be just that: changing the format and structure of the program, but not its functionality.

Bug – Test – Fix

Pop quiz: A producer walks through the door and tells you there’s a bug in the inventory code. Apparently if you add two similar objects, when you remove one from the inventory, the other one disappears. You immediately realize that you were searching based on object name, not on unique ID, and you can fix it in 30 seconds. What do you do?

Let me answer that by asking another question: Is fixing that bug a refactoring of code, or are you changing existing behavior? What does TDD say about changing behavior/adding features? Right. You got it. Write a test first.

Clearly, the current tests you have are not catching that situation, so go ahead and create a test that fails because it shows that situation:

TEST (RemovingOneOfTwoSimilarObjectsRemovesOnlyOne) 
{ 
    Inventory inv; 
    GameObject potion1("Health potion"); 
    GameObject potion2("Health potion"); 
    inv.Add(potion1); 
    inv.Add(potion2); 
    inv.Remove(potion1); 
    CHECK_EQUALS (1, inv.ItemCount());
}

Once you see that test fail, then you can go and implement the fix. Now the test should be passing and everybody is happy. Check in the code and the test and you’ll make sure that bug never comes back. This is one of the best ways of keeping your codebase in good health. Let’s call it preventative medicine.
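For completeness, here is one way the fix itself could look. This is only a sketch under assumptions: the article never shows `Inventory` or `GameObject`, so the unique `Id()`, the id counter, and the vector storage are all invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical game object: a display name plus a unique id.
class GameObject
{
public:
    explicit GameObject(const std::string& name)
        : m_name(name), m_id(NextId()) {}

    const std::string& Name() const { return m_name; }
    int Id() const { return m_id; }

private:
    // Hands out a new id for every constructed object. Copies keep the id
    // of the object they were copied from.
    static int NextId() { static int next = 0; return ++next; }

    std::string m_name;
    int m_id;
};

// Hypothetical inventory that stores copies of the objects added to it.
class Inventory
{
public:
    void Add(const GameObject& object) { m_items.push_back(object); }

    void Remove(const GameObject& object)
    {
        // The fix: match on the unique id. Matching on Name() instead would
        // also erase every other object with the same name, which is the
        // bug the producer reported.
        m_items.erase(
            std::remove_if(m_items.begin(), m_items.end(),
                [&object](const GameObject& item)
                {
                    return item.Id() == object.Id();
                }),
            m_items.end());
    }

    int ItemCount() const { return static_cast<int>(m_items.size()); }

private:
    std::vector<GameObject> m_items;
};
```

With this version the failing test above goes green: removing `potion1` leaves `potion2` in place even though both are named "Health potion".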

Classes and tests

When writing tests, don’t obsess about keeping a one-to-one correlation between classes and test files (which contain many tests, of course). That’s the way tests will start out, but I find that sometimes I have too many tests checking fairly different things, so I end up putting them in different files, even though they’re testing the same class.

Could that be a “smell” that the class needs some refactoring and needs to be split? Perhaps. But sometimes it just feels right to separate the tests into logical groups that way without affecting the class. Don’t worry about it. It’s normal.

Private stuff

Should you test private methods and member variables? Not everybody agrees, but I would say no. Test the public interface. That’s what you care about, after all.

However, I find it very useful to have access to non-public library classes. Fortunately, I keep my tests as a subdirectory of the library, so in a way they’re almost part of the library itself. I find that having access to classes that are not exposed outside of the library is very helpful to use as mock objects or to help with other parts of the test.

Introducing TDD

It’s great if you’re psyched about TDD and want to use it in your project. What about the rest of your team, though? How should you introduce TDD in your current project? You’re not going to convert everybody overnight, so plan on doing it incrementally.

First, you can start doing it quietly by yourself. Nobody is stopping you from using TDD to develop the functionality you were going to do. Check in your test code as well, but make it so it’s not built and executed every time your code is modified. Probably a few programmers are going to notice your tests being checked in and will be curious about it. This is your chance to start seeding the idea.

Next, you need to get the lead programmer involved and get him on board. You’ll need to agree that those tests should always pass and checking in code that breaks them is not acceptable. At this point, make sure the tests are executed with every automated build, and even whenever somebody modifies the code under test. It’s important that people see the tests as a good thing and understand what they’re doing as opposed to being things that are getting in the way. If people start commenting out tests to prevent them from failing, you know they’re not having the desired effect.

Finally, assuming that everything was successful and people recognize the tests as being useful, you might want to propose using TDD at the start of a new project, or for some specific tool or library. Make sure everybody is on board with the idea, get a few copies of Kent Beck’s book, and help educate them with your experience.

When all you have is a hammer…

Don’t let the novelty of TDD blind you. Even though I believe that TDD is a great technique that is widely applicable to most of the code we write, there are some cases when I would not recommend TDD. It’s important to recognize that and know when not to use it. After all, TDD is supposed to help us. Whenever it gets in the way, or the benefits we reap don’t outweigh the drawbacks, out with it!

Experimental code. If you’re just experimenting, putting together a prototype, or doing anything that is going to be thrown away, then there’s probably little benefit to using TDD. This is what XP calls “spikes”: short tasks with the intent of experimenting with something to learn more about how to proceed. There’s very little reason to use TDD in this situation.
High-level scripting code. Chances are your game has a scripting language. How about telling your designers to write tests for their code before they write the scripts? Yeah, right! Not only would that be an exercise in futility, but it wouldn’t be that useful (unless your scripts extend to a fairly low level). Designer scripts, by definition, are probably changing all the time and no other parts of the code directly depend on them. They should probably be relatively simple, so writing unit tests for them is not all that important.
Don’t test hardwired data. This goes without saying, but don’t test specific data values in your tests. If you’re writing a test for a grenade weapon that deals damage within a certain radius, don’t hardwire the specific radius in your tests, because that’s going to be constantly tweaked until the game ships. Query the radius from the weapon in your test and check against that.
Shaders. A couple of years ago that wasn’t a big deal because shaders were so limited. Now they’re starting to be quite complicated and general, so maybe starting to think about TDD wouldn’t be a bad idea. If somebody out there is doing TDD with graphics shaders, please drop me an email. I’d love to know more about how you’re doing it and whether it’s useful.
Multithreaded code. TDD uses unit tests, which means it tests very small bits of functionality in isolation. The current emphasis on multithreaded code makes it so there will be things we can’t easily test through TDD, especially dealing with thread interactions. I haven’t had to deal with this yet, but I’m sure it’s going to come up. For now, I’ll just ignore threading issues when I’m writing unit tests and deal with it later.
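The point about hardwired data can be sketched like this. The `Grenade` class, its `Radius()` and `DamageAt()` functions, `MakeTunedGrenade()`, and the check macros are all hypothetical stand-ins; the only idea taken from the list above is that the tests ask the weapon for its radius instead of repeating the number.

```cpp
#include <cassert>
#include <cmath>

// Stand-ins for the framework's check macros (assumptions for this sketch).
#define CHECK(condition) assert(condition)
#define CHECK_EQUALS(expected, actual) assert((expected) == (actual))

struct Vec2
{
    float x;
    float y;
};

// Hypothetical weapon: full damage inside the radius, none outside.
class Grenade
{
public:
    Grenade(float damage, float radius) : m_damage(damage), m_radius(radius) {}

    float Radius() const { return m_radius; }

    float DamageAt(const Vec2& center, const Vec2& target) const
    {
        const float dx = target.x - center.x;
        const float dy = target.y - center.y;
        const float distance = std::sqrt(dx * dx + dy * dy);
        return distance <= m_radius ? m_damage : 0.0f;
    }

private:
    float m_damage;
    float m_radius;
};

// Stand-in for loading the designer-tuned weapon. These numbers are the
// ones that will keep changing until the game ships.
Grenade MakeTunedGrenade()
{
    return Grenade(50.0f, 7.5f);
}

void TargetInsideRadiusTakesDamage()
{
    const Grenade grenade = MakeTunedGrenade();
    const Vec2 center = { 0.0f, 0.0f };
    // Derived from the weapon, not hardwired: halfway to the edge.
    const Vec2 target = { grenade.Radius() * 0.5f, 0.0f };
    CHECK(grenade.DamageAt(center, target) > 0.0f);
}

void TargetOutsideRadiusTakesNoDamage()
{
    const Grenade grenade = MakeTunedGrenade();
    const Vec2 center = { 0.0f, 0.0f };
    // Twice the radius away, whatever the radius currently is.
    const Vec2 target = { grenade.Radius() * 2.0f, 0.0f };
    CHECK_EQUALS(0.0f, grenade.DamageAt(center, target));
}
```

If a designer changes the radius from 7.5 to 12, both tests keep passing because neither one mentions the number.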

GUI tools

Can we apply TDD to GUI tools? And the more important question is, even if we can, should we? Absolutely!

A lot of the bulk of game development goes into tools and program plug-ins. Especially now, for this next generation of consoles coming up, I’m convinced that the asset pipeline is going to play a key role in differentiating games from each other and making the ones with a really solid pipeline stand out from the rest. Not doing TDD on tools is going to exclude a good percentage of all code developed for a game, and one of the most important parts at that.

There’s enough material here for a whole other article (or even a whole book). You can also refer to the Test First User Interfaces mailing list.

The main idea is to separate the GUI from the logic. Unfortunately a lot of GUI development tools encourage you to write your logic right in the function that handles the button pressing. Resist that temptation. Put all the logic for your tool in a separate library, and hook it up to the GUI with the minimum amount of code. This is a bit like the “humble dialog” technique described by Michael Feathers.

Keeping that separation also makes it a lot easier to provide different interfaces to your code. Maybe you want to also have a command-line only version of the tool, or maybe you want to make it into a Windows service. All that is trivial if you’ve completely separated your GUI from your logic.
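Here is a rough sketch of that separation for a made-up export tool. None of these names come from a real GUI toolkit: `ExportView`, `ExportLogic`, and `RecordingView` are hypothetical, and the actual export step is left as a comment. The real dialog would implement `ExportView` and its button handler would do nothing more than call `logic.Export()`.

```cpp
#include <cassert>
#include <string>

// Stand-in for the framework's check macro (an assumption for this sketch).
#define CHECK_EQUALS(expected, actual) assert((expected) == (actual))

// What the logic needs from any front end: dialog, command line, or service.
class ExportView
{
public:
    virtual ~ExportView() {}
    virtual void ShowStatus(const std::string& message) = 0;
    virtual void EnableExportButton(bool enabled) = 0;
};

// All the decisions live here, in a library with no GUI dependencies.
class ExportLogic
{
public:
    explicit ExportLogic(ExportView& view) : m_view(view) {}

    void SetSourcePath(const std::string& path)
    {
        m_path = path;
        m_view.EnableExportButton(!m_path.empty());
    }

    void Export()
    {
        if (m_path.empty())
        {
            m_view.ShowStatus("No source file selected");
            return;
        }
        // The real conversion work would happen here.
        m_view.ShowStatus("Exported " + m_path);
    }

private:
    ExportView& m_view;
    std::string m_path;
};

// Test double standing in for the dialog: it only records what it was told.
class RecordingView : public ExportView
{
public:
    RecordingView() : exportEnabled(false) {}
    void ShowStatus(const std::string& message) override { lastStatus = message; }
    void EnableExportButton(bool enabled) override { exportEnabled = enabled; }

    std::string lastStatus;
    bool exportEnabled;
};

void SettingPathEnablesExportButton()
{
    RecordingView view;
    ExportLogic logic(view);
    logic.SetSourcePath("level1.map");
    CHECK_EQUALS(true, view.exportEnabled);
}

void ExportWithoutPathReportsError()
{
    RecordingView view;
    ExportLogic logic(view);
    logic.Export();
    CHECK_EQUALS(std::string("No source file selected"), view.lastStatus);
}
```

A command-line front end would be one more small `ExportView` implementation that prints the status, with `ExportLogic` left untouched.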

Consequences of doing TDD

Amount of code

You should expect the overall amount of code you write to increase, maybe by as much as 100% to 150%. That’s a lot of code. It’s all right, though. That’s no ordinary code. It’s code that is looking out for the other half of your code. It’s not code that you have to worry about architecting and dealing with complex dependencies, or anything. It’s extremely simple, and it’s driven by the features you implement. There’s just quite a bit of it.

Development speed

What does writing all that code do to your development speed? Doesn’t it slow things down? Yes, it will. At first.

Whenever you start learning a new programming language, technique, or tool, it’s normal for your productivity to drop temporarily. You’ve spent years honing your skills on a different technique, so switching takes a while and you need some practice.

In my own experience with doing TDD in a not very TDD-friendly environment and with C++ and a game engine, I will admit that it took one or two months to get to the point where I felt I was as productive as before. That’s a fair amount of time, so I don’t recommend switching to doing TDD towards the end of a project. Wait until the beginning of the next project, or do it if you’re not in the critical path and you think you can afford a small productivity hit.

Even before you reach the point where you’re comfortable enough with TDD to develop at the same speed as before, you’ll already be reaping all of the benefits that TDD provides (safety net, better design, documentation, etc, etc).

Here’s the interesting part, though: I claim that once you’re comfortable with TDD and you’ve applied it to a large amount of code, your development speed will actually be faster than it would be without TDD. So not only do you get all the other benefits, but things will go faster and more smoothly. How’s that possible? You have to write all that extra code and that takes time!

Think of it as a snowball (or as a Katamari). Doing development without TDD allows you to get going pretty quickly. You take a little snowball and run with it. But as you roll it on the snow, it keeps growing and growing. At first it’s not a big deal, but eventually it’ll start to slow you down. Eventually it’ll become a huge ball that takes several people just to move it a small amount. At that point, you’d better be ready to ship the game because it’s not going much further.

Doing TDD, on the other hand, is like starting out with a medium size ball (say, a foot or two in diameter), but it’ll never get significantly larger than that. You can happily keep rolling it around, adding features and changing things, and you’ll still be as fast and flexible as you were the first day.

Of course, I have no hard data to back any of this other than my own personal experience. If anybody has seen some studies on the subject, please let me know so I can see if they agree with my observations.

More interfaces

In C++, the only (or at least, the easiest) way to use mock objects is to use virtual functions through an interface class. The more you use that technique, the more virtual functions you’ll end up with, even if you didn’t really need them for the production code itself.

A good example could be a graphics rendering library, which is just a light wrapper around the graphics hardware abstraction. If you know you’re only going to use one type of graphics hardware, you don’t really need an interface class and a lot of virtual functions. But if you want to insert a mock object there (which you probably will, to avoid using a real renderer in your tests), it’ll force you to create unnecessary virtual functions.

Of course, many times, when you need a mock object, you also need to use polymorphism at runtime, so you needed those virtual functions anyway.

How much of a big deal is it? I’d say for the most part it’s not worth sweating it. Those are still fairly coarse functions, so having a virtual call isn’t going to affect your frame rate very much at all. They certainly won’t for higher-level code that is not called as frequently. If the extra virtual function calls are an issue, you can change those tests to check the state of the objects directly instead of using mock objects.
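As a minimal sketch of the technique discussed in this section, here is an interface class with a mock standing in for the renderer. `Renderer`, `MockRenderer`, `Scene`, and the mesh ids are hypothetical; a real rendering wrapper would have a much wider interface. The virtual `DrawMesh` exists here only so that a test can substitute the mock.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the framework's check macro (an assumption for this sketch).
#define CHECK_EQUALS(expected, actual) assert((expected) == (actual))

// Interface class. The virtual function is the cost of being mockable.
class Renderer
{
public:
    virtual ~Renderer() {}
    virtual void DrawMesh(int meshId) = 0;
};

// Mock: records the calls instead of talking to graphics hardware.
class MockRenderer : public Renderer
{
public:
    void DrawMesh(int meshId) override { drawnMeshes.push_back(meshId); }
    std::vector<int> drawnMeshes;
};

// Hypothetical code under test: draws only the visible meshes.
class Scene
{
public:
    void AddMesh(int meshId, bool visible)
    {
        const Entry entry = { meshId, visible };
        m_entries.push_back(entry);
    }

    void Render(Renderer& renderer) const
    {
        for (std::size_t i = 0; i < m_entries.size(); ++i)
        {
            if (m_entries[i].visible)
                renderer.DrawMesh(m_entries[i].meshId);
        }
    }

private:
    struct Entry
    {
        int meshId;
        bool visible;
    };
    std::vector<Entry> m_entries;
};

void HiddenMeshesAreNotDrawn()
{
    Scene scene;
    scene.AddMesh(1, true);
    scene.AddMesh(2, false);
    MockRenderer renderer;
    scene.Render(renderer);
    CHECK_EQUALS(1u, renderer.drawnMeshes.size());
}
```

The call through `Renderer&` is one virtual dispatch per mesh, which is the kind of coarse call the paragraph above argues is rarely worth worrying about.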

I hope you found this peek behind the looking glass useful. Hopefully it convinced you to give test-driven development an honest try. And at the very least, I hope that it made you think about how you develop software and question your assumptions. But don’t worry, this is not the end: this is a topic I plan on revisiting in future articles. By now you should know your way into the looking glass. Feel free to step through any time you want.

Noel Llopis

March 6, 2005

Still not convinced? Post here with questions about TDD or specific examples that you’re having trouble applying TDD to and I’ll try to lend a hand.
Kim Gräsman

March 7, 2005

As for testing private methods — I have this hunch that I haven’t checked up what other people think about: If you find you want to test it, it probably shouldn’t be private.

My thinking is that if a private method is so complex it needs testing, it’s probably not primitive (in the Lakos sense), and should be extracted into a utility class, or similar.

Does that resonate with the feelings of anyone else?

– Kim
Jamie Fristrom

March 7, 2005

So have you managed to spread TDD at the studio formerly known as Sammy? Is *Darkwatch* a TDD project?

“They should probably be relatively simple, so writing unit tests for them is not all that important.” Deep sigh…
Benoit Miller

March 8, 2005

Very interesting series of articles, congrats!

“But if you want to insert a mock object there (which you probably will, to avoid using a real renderer in your tests) […]”

…but what if the code being tested *is* the renderer?

I’ve been thinking about TDD for a while but it’s *hard* to apply this to 3D graphics. Tests will be long and complicated, and there is no clear way to validate the results (comparing images doesn’t always work, for a variety of reasons).

How are you dealing with this? Are the 3D guys skipping the TDD? 🙂
Robert 'Groby' Blum

March 8, 2005

Benoit:

Can’t speak for Noel here, obviously – but in my experience, 3D code is tested after the fact. It’s hard math, and that can’t easily be *driven* by tests. I’m not certain it’s entirely true, but I’ve got the feeling that the softer the rules, the easier it is to drive your development by tests.

That doesn’t mean you *don’t* test it – it’s just that you write the code, manually verify correctness, and then fix the unit test. Still gives you coverage – it just doesn’t help you as much in the design department.

I’ve done it for a cross-platform particle engine, and it was *immensely* useful. No, it’s not a full 3D engine – but the same ideas would apply.
Chad Austin

April 27, 2005

Test Driven Development: First Impressions
Astronerd

July 4, 2005

Avoiding File I/O In Unit Tests

Noel wrote a series of articles about TDD in game development here: Games from Within: Stepping Through the Looking Glass: Test-Driven Game Development (Part 3). He mentions that the tests should run fast and that means no file I/O. However, sometimes…
Bernd Salewski

February 13, 2014

Your articles on TDD rock, even after 9 years….
Powerslave

December 20, 2014

Noel, I have two words to say: THANK YOU!!!
Thank you for giving me back the hope. The hope that even in the mainstream game development community there are people like you, who seriously do care about code quality and neat development techniques like TDD.
Thank you for your hard work putting this series together and thank you for promoting best practices among those less familiar with them.