Remote Game Editing

I’ve long been a fan of minimal game runtimes. Anything that can be done offline or in a separate tool, should be out of the runtime. That leaves the game architecture and code very lean and simple.

One of the things you potentially give up by keeping the game runtime to a minimum is an editor built in the game itself. But that’s one of those things that sounds a lot better than it really is. From a technical point of view, having an editor in the game usually complicates the code a huge amount. All of a sudden you need to deal with objects being created and destroyed randomly (instead of through clearly defined events in the game), and you have to deal with all sorts of crazy inputs and configurations.

The worse part though, is having to implement some sort of GUI editing system in every platform. Creating the GUI to run on top of the game is not easy, requiring that you create custom GUI code or try to use some of the OpenGL/DirectX libraries available. And even then, a complex in-game GUI might not be a big deal on a PC, but wait and try to use that interface on a PS3 or iPhone. After all, making games is already complicated and time-consuming enough to waste more time reinventing the widget wheel.

Remote game editing manages to keep a minimal runtime and allow you to quickly create native GUIs that run on a PC. It’s the best of both worlds, and although it’s not quite a perfect solution, it’s the best approach I know.

Debug Server

Miguel Ángel Friginal already covered the basics of the debug server, so I’m not going to get in details there.

The idea is that you run a very simple socket server on the game, listening in a particular port. This server implements the basic telnet protocol, which pretty much means that it’s a line-based, plain-text communication.

The main difference between my debug server and Miguel’s (other than mine is written in cross-platform C/C++ instead of ObjC), is that I’m not using Lua to execute commands. Using Lua for that purpose is a pretty great idea, but goes against the philosophy of keeping the runtime as lean and mean as possible.

Instead, I register variables with the server by hand. For each variable, I specify its memory address, it’s type, any restrictions (such as minimum and maximum values), and a “pretty name” to display in the client. Sounds like a lot of work, but it’s just one line with the help of a template:

registry.Add(Tweak(&plantOptions.renderGloss, "render/gloss", "Render gloss"));
registry.Add(Tweak(&BouquetParams::FovY, "bouquet/fov", "FOV", Pi/32, Pi/3))

And yes, if I were to implement this today, I would probably get rid of the templates and make it all explicit instead (ah, the follies of youth 🙂

TweakUtils::AddBool(registry, &plantOptions.renderGloss, "render/gloss", "Render gloss");
TweakUtils::AddFloat(registry, &BouquetParams::FovY, "bouquet/fov", "FOV", Pi/32, Pi/3);

The debug server itself responds to three simple commands:

  • list. Lists all the variables registered in the server.
  • set varname value. Sets a value.
  • print varname. Gets the value for that variable.

For example, whenever the server receives a set command, it parses the value, verifies that it’s within the acceptable range, and applies it to the variable at the given memory location.

Telnet Clients

telnet.pngBecause we used the standard telnet protocol, we can start playing with it right away. Launch the game, telnet into the right port, and you can start typing away.

However, most telnet clients leave much to be desired for this. They all rely on the history and cursor manipulation being handled by the shell they assume you’re connected to. Here we aren’t connected to much of anything, but I’d like to be able to push up arrow and get my last command, and be able to move to the beginning of the line or the previous word like I would do in any text editor. The easiest solution I found for that was to use a telnet client prepared for that kind of thing: A MUD client! Just about any will do, but one that works well for me is Atlantis.

So far, we’ve implemented the equivalent of a FPS console, but working remotely. And because the code is fully portable, our game can be in just about any platform and we can always access it from our PC. Not just that, but we can even open multiple simultaneous connections to various development devices if you need to run them all at once.

Custom Clients

Game parameter tweaking is something that is OK through a text-based console, but really comes into its own when you add a GUI. That’s exactly what we did at Power of Two Games. We created a generic GUI tool (based on WinForms since we were on Windows at the time), that would connect to the server, ask for a list of variables, and generate a GUI on the fly to represent those variables. Since we knew type and name of each variable, it was really easy to construct the GUI elements on the fly: A slider with a text field for floats and ints, a checkbox for bools, four text fields for vectors, and even a color picker for variables of the type color.

It worked beautifully, and adjusting different values by moving sliders around was fantastic. We quickly ran into two problems through.

The first one is that we added so many different tweaks to the game, that it quickly became unmanageable to find each one we wanted to tweak. So, in the spirit of keeping things as simple as possible (and pushing the complexity onto the client), we decided that the / symbol in a name would separate group name and variable name. That way we could group all related variables together and make everything usable again.

The second problem was realizing that some variables were changing on the runtime without us knowing it on the client. That created weird situations when moving sliders around. We decided that any time a registed variable changes on the server, it should notify any connected clients. That worked fine, but, as you can imagine, it became prohibitively expensive very quickly. To get around that, we added a fourth command: monitor varname. This way clients need to explicitly register themselves to receive notifications whenever a variable changes, and the GUI client only did it for the variables currently displayed on the screen.

tweaker.png

During this process, it was extremely useful to be able to display a log console to see what kind of traffic there was going back and forth. It helped me track down a few instances of bugs where changing a variable in the client would update it in the server, sending an update back to the client, which would send it again back to the server, getting stuck in an infinite loop.

You don’t need to stop at a totally generic tool like this either. You could create a more custom tool, like a level editor, that still communicates with the runtime through this channel.

Flower Garden Example

For Flower Garden, I knew I was going to need a lot of knobs to tweak all those plant DNA parameters, so I initially looked into more traditional GUI libraries that worked on OpenGL. The sad truth is that they all fell way short, even for development purposes. So I decided to grab what I had at hand: My trusty tweaking system from Power of Two Games.

I’m glad I did. It saved a lot of time and scaled pretty well to deal with the hundreds of parameters in an individual flower, as well as the miscellaneous tweaks for the game itself (rendering settings, infinite fertilizer, fast-forwarding time, etc).

Unfortunately, there was one very annoying thing: The tweaker GUI was written in .Net. Sure, it would take me a couple of days to re-write it in Cocoa (faster if I actually knew any Cocoa), but as an indie, I never feel I can take two days to do something tangential like that. So instead, I just launched it from VMWare Fusion running Windows XP and… it worked. Amazingly enough, I’m able to connect from the tweaker running in VMWare Fusion to the iPhone running in the simulator. Kind of mind boggling when you stop and think about it. It also connects directly to the iPhone hardware without a problem.

VMWare Fusion uses up a lot of memory, so I briefly looked into running the tweaker client in Mono. Unfortunately Mono for the Mac didn’t seem mature enough to handle it, and not only was the rendering of the GUI not refreshing correctly, but events were triggered in a slightly different order than in Windows, causing even more chaos with the variable updates.

Here’s a time-lapse video of the creation of a Flower Garden seed from the tweaker:

Drawbacks

As I mentioned earlier, I love this system and it’s better than anything else I’ve tried, but it’s not without its share of problems.

Tweaking data is great, but once you find that set of values that balances the level to perfection… then what? You write those numbers down and enter them in code or in the level data file? That gets old fast. Ideally you want a way to automatically write those values back. That’s easy if the tool itself is the editor, but if it’s just a generic tweaker, it’s a bit more difficult.

One thing that helped was adding a Save/Load feature to the tweaker GUI. It would simply write out a large text-based file with all the variables and their current values. Whenever you load one of those, it would attempt to apply those same values to the current registered variables. In the end, I ended up making the Flower Garden offline seed file format match with what the tweaker saved out, so that process went pretty smoothly.

Another problem is if you want lots of real-time (or close to real time) updates from the server. For example, you might want to monitor a bunch of data points and plot them on the client (fps, memory usage, number of collisions per frame, etc). Since those values change every frame, it can quickly overwhelm the simple text channel. For those cases, I created side binary socket channels that can simply send real-time data without any overhead.

Finally, the last drawback is that this tweaking system makes editing variables very easy, but calling functions is not quite as simple. For the most part, I’ve learned to live without function calls, but sometimes you really want to do it. You can extend the server to register function pointers and map those to buttons in the client GUI, but that will only work for functions without any parameters. What if you wanted to call any arbitrary function? At that point you might be better off integrating Lua in your server.

Future Directions

This is a topic I’ve been interested in for a long time, but the current implementation of the system I’m using was written 3-4 years ago. As you all know by now, my coding style and programming philosophy changes quite a bit over time. If I were to implement a system like this today, I would do it quite differently.

For all I said about keeping the server lean and minimal, it could be even more minimal. Right now the server is receiving text commands, parsing them, validating them, and interpreting them. Instead, I would push all that work on the client, and all the server would receive would be a memory address, and some data to blast at that location. All of that information would be sent in binary (not text) format over a socket channel, so it would be much more efficient too. The only drawback is that we would lose the ability to connect with a simple telnet client, but it would probably be worth it in the long run.

This post is part of iDevBlogADay, a group of indie iPhone development blogs featuring two posts per day. You can keep up with iDevBlogADay through the web site, RSS feed, or Twitter.

Nitty Gritty Unit Testing

It’s one thing to see someone drive a car and have a theoretical understanding of what the pedals do and how to change gears. It’s is a completely different thing to be able to drive a car safely on the street. There are some activities that require many small details and some hands-on experience to be able to execute them successfully.

The good news is that unit testing is a lot simpler than driving a standard shift, but there are a lot of small details you need to get right to do it successfully. Even after reading about unit testing and being convinced of its benefits, programmers are often not sure how to get started. This month’s column is not going to try to convince you of the many benefits of unit testing (I hope you are already convinced), but rather, describe some very concrete tips to help you get started right away.

Goals of Unit Testing

There are many different reasons to write unit tests:

  • Correctness testing. Checking that the code behaves as designed.
  • Boundary testing. Checking that the code behaves correctly in odd or boundary situations.
  • Regression testing. Checking that the behavior of the code doesn’t change unintentionally over time.
  • Performance testing. Checking that the program meets certain minimum performance or memory constraints.
  • Platform testing. Checking that the code behaves the same across multiple platforms.
  • Design. Tests provide a way to advance the code design and architecture. This is usually referred as Test-Driven Development (TDD).
  • Full game or tool testing. Technically this is a functional test, not a unit test anymore because it involves the whole program instead of a small subset of the code, but a lot of the same techniques apply.

Some developers use unit tests only for one of the reasons listed above, while others use many kinds of tests for a variety of different reasons. It’s important to recognize that because there are so many different uses for unit tests, no single solution is going to fit everybody. The ideal setup for some of those situations is going to be slightly different than for others. The basics are the same for all of them though.

When working with unit tests, these are our main goals:

  • Spend as little time as possible writing a new test.
  • Be notified of failing tests, and see at a glance which ones failed and why.
  • Trust our tests. Have them be consistent from run to run and robust in the face of bad code.

Testing Framework

lack_of_tests.jpgMost of us have created one-off programs in the past to test some particularly complicated code. It’s usually a quick command line program that runs through a bunch of cases and asserts after each one that the results were correct. That’s the most bare-bones way of creating unit tests.

Unfortunately, it’s also a pain and it misses on most of the unit-testing goals described in the previous section: creating a new program just for that is a pain, we have to go out of our way to run the tests, and it usually gets out of date faster than the latest Internet meme. That is in part why a lot of programmers have an initial aversion to writing unit tests.

If you’re considering writing even a small unit test, you should use a unit-testing framework. A unit-testing framework removes all the busy work from writing unit tests and lets you spend your time on the logic of what to test. This doesn’t mean that the framework writes the tests for you. Be very wary of any tool that claims to do that! No, a unit-testing framework is simply a small library that provides all the glue for running unit tests and reporting the results. Sorry, you still need to use your brain and do (some of) the typing.

A quick search will reveal plenty of unit-testing frameworks to choose from for your language of choice, and most of them are free and open source so you can rely on them and modify them to suit your needs. For C/C++ and game development, I strongly recommend starting with UnitTest++. Charles Nicholson and I wrote that framework a few years ago specifically with games and consoles in mind. Many game teams have adopted it for their games and tools, and it has been used on lots of different game platforms including current and last generation consoles, Windows, Linux, and Mac PCs, and even on the iPhone. In most situations, it should be a straight drop-in to your project and you’re up and running.

If you end up using a different testing framework, or even if you roll your own, the techniques described here still apply, even if the syntax is slightly different.

Hello Tests

Writing your first test is easy as pie:

#include <UnitTest++.h>

TEST(MyFirstTest)
{
    int a = 4;
    CHECK(a == 4);
}

To run it you need to add the following line to your executable somewhere. We’ll talk more about the physical organization of tests in a moment.

int failedCount = UnitTest::RunAllTests();

Done! Easy, wasn’t it? When you compile and run the program you should see the following output:

Success: 1 test passed.
Test time: 0.00 seconds.

Let’s add a failing test:

TEST(MyFailingTest)
{
    int a = 5;
    CHECK(a == 4);
}

Now we get:

/fullpath/filename.cpp:17: error: Failure in MyFailingTest: a == 4
FAILURE: 1 out of 2 tests failed (1 failures).
Test time: 0.00 seconds.

That’s great, but if we’re going to diagnose the problem, we probably need to know the value of the variable a, and all the test is telling us is that it’s not 4. So instead, we can change the CHECK statement to the following:

TEST(MyFailingTest)
{
    int a = 5;
    CHECK_EQUAL(4, a);
}

And now the output will be

/fullpath/filename.cpp:17: error: Failure in MyFailingTest: Expected 4 but was 5
FAILURE: 1 out of 2 tests failed (1 failures).
Test time: 0.00 seconds.

Much better. Now we get both the error information and the value of the variable under test. Virtually all unit-testing frameworks include different types of CHECK statements to get more information when testing floats, arrays, or other data types. You can even make your own CHECK statement for your own common data types such as colors or lists.

As a bonus, if you’re using an IDE, double-clicking on the test failure message should bring you automatically to the failing test statement.

When To Run

When to run unit tests will depend on what is being tested and how long it takes to do so. In general, the more frequently you run the tests, the better. The sooner you get feedback that something went wrong, the easier it will be to fix. Maybe even before it was checked in and it spread to the rest of the team. On the flip side, realistically, building and running a set of unit tests takes a certain amount of time, so it’s important to find the right balance between feedback frequency and time spent waiting for tests.

At the very least, all tests should run once a day, during the nightly build process in your build server (You have a build server, don’t you? If not, run over and read this column in the August 2008 issue). It doesn’t matter how long they take or how how many different projects you need to run. Just add them to the build script, and hook their output into the build results.

On the other extreme, you can build your tests every time you build the project and execute them as a postbuild step. That way, any time you make a change to a project, all tests will execute and you’ll see if anything went wrong. This is a great approach, but I wouldn’t recommend it if tests add more than a couple of seconds to the incremental build time, otherwise, they’ll be slowing you down more than they help.

For most developers, some approach in the middle will make most sense. For example, take a small, fast subset of tests that are more likely to break, and run those with every build. Whenever any code is checked in to version control, the build server can run those tests plus a few more, slower ones. And finally, at night, you can bring out the big guns and run those really long, really thorough ones that take a few hours to complete. You can separate those tests into different projects, or, if your framework supports them, into different test suites, which allow you to decide which sets of tests to execute at runtime.
Reporting Results

If a unit test fails and nobody notices, is it really an error? Just running the tests isn’t good enough. We have to make sure that someone sees the failure it and fixes the problem.

Most unit-testing frameworks will let you customize how you want the failure errors to be reported. By default they will probably be sent to stdout, but you can easily customize the framework to send them to debug log streams, save to a file, or upload them to a server.

Even more important than the actual error messages is detecting whether there were any failures. After running all the tests, there is usually some way to detect how many tests failed. The program that was running the tests can detect any failures, print an error message, and exit with an error code. That error code will propagate to the build server and trigger a build failure. Hopefully by now alarm sounds are going off across the office and someone is on his way to fix it.

Project Organization

When people start down the unit test path, they often struggle to figure out how to physically lay out the unit tests. In the end, it really doesn’t matter too much as long as it makes sense to you, the final build doesn’t contain any tests, and they’re still easy to build and run.

My personal preference is to keep unit tests separate from the rest of the code. Usually I end up creating one file of tests for every cpp file. So FirstPersonCameraController.h and .cpp have a corresponding TestFirstPersonCameraController.cpp. Since I use this convention regularly throughout all of my code, I have a custom IDE macro to toggle between a file and its corresponding test file. I also put all the tests in a separate subdirectory to keep them as physically separate as possible.

I prefer to break up my code into several static libraries for each major subsystem: graphics, networking, physics, animations, etc. Each of those libraries has a set of unit tests, but instead of compiling them into the library, I create a separate project that creates a simple executable program. That project contains all the unit tests and links against the library itself, and in its main entry point it just calls the function to runs all unit tests and returns the number of failures. This keeps the tests separate from the library, but still very easy to build.

If all your code is organized into libraries, and your game is just a collection of libraries linked together, that’s all you need. Most games and tools, however, have a fair amount of code that you might want to test in the project itself. Since the game is an executable, you can’t easily link against it from a different project like we did before. In this case, I build the unit tests into the game itself, and I optionally call them whenever a particular command-line parameter like -runtests is present. Just make sure to #ifdef out all the tests in the final build.

Multiplatform Testing

Running the tests on the PC where you build the code is very straightforward. But unless you’re only creating games and tools for that platform, you will definitely want to run your tests on different platforms as well. Unit tests are an invaluable tool for catching slight platform inconsistencies caused by different compilers, architecture idiosyncrasies, or varying floating point rules.

Unfortunately, running unit tests on a different platform from your build machine is usually a bit more involved and not nearly as fast as doing it locally. You need to start by compiling the tests for the target platform. This is usually not a problem since you’re already building all your code for that platform, and hopefully your unit testing framework already supports it. Then you need to upload your executable with the tests and any data required to the target platform and run it there. Finally, you need to get the return code back to detect if there were any failures. This is surprisingly the trickiest part of the process with a lot of console development kits. If getting the exit code is not a possibility, you’ll need to get creative by parsing the output channel, or even waiting for a notification on a particular network port.

Some target platforms are more limited than others in both resources and C++ support. One of the features that makes UnitTest++ a good choice for games is that it requires minimal C++ features (no STL) and it can be trimmed down even further (no exceptions or streams).

For example, running unit tests on the PS3 SPUs was extremely useful, but it required stripping the framework down to the minimum amount of features. It was also tricky being able to fit the library code plus all the tests in the small amount of memory available. To get around that, we ended up changing the build rules for the SPUs so each test file created its own SPU executable (or module). We then wrote a simple main SPU program that would load each module separately, run its tests, keep track of all the stats, and finally report them.

Running a set of unit tests on the local machine can be an almost instant process, but running them on a remote machine is usually much slower, and can take up to 10 or 20 seconds just with the overhead of copying them and launching the program remotely. For this reason, you’ll want to run tests on other platforms less frequently.

No Leaking Allowed

Finally, if you’re going to have this all this unit testing code running on a regular basis, you might as well get as much information out of it as possible. I have found it invaluable to keep track of memory leaks around the unit test code.

You’ll have to hook into your own memory manager, or use the platform-specific memory tracking functions. The basic idea is to get a memory status before running the tests, and another one after all the tests execute. If there are any extra memory allocations, that’s probably a leak. In that case, you can report it as a failed build by returning the correct error code.

Watch out for static variables or singletons that allocate memory the first time they’re used. They might be reported as memory leaks even though it wasn’t what you were hoping to catch. In that case, you can explicitly initialize and destroy all singletons, or, even better, not use them at all, and keep your memory leak report clean.

You’re now armed with all you need to know to set up unit tests into your project and build pipeline. Grab a testing framework and get started today.

This article was originally printed in the May 2009 issue of Game Developer.

The Const Nazi

Anybody who worked with me or saw any of my code, would know right away why they call me the Const Nazi. That’s because in my coding style, I make use of the keyword const everywhere. But instead of going on about how const is so great, I’m going to let Hitler tell us how he really feels about it.


No Flash? Try the QuickTime video version.

Let me get one thing out of the way to stop all the trigger-happy, const-bashing, would-be-commenters: const doesn’t make any guarantees that values don’t change.

You can change a const variable by casting the constness away, or referencing it through a pointer, but you really had to go out of your way to do that. If it helped with that, const would solve (or improve) the memory aliasing problem like Hitler pointed out. It doesn’t, so const is pretty weak as far as promises go. It just says “I, the programmer, promise not to change this value on purpose (unless I’m truly desperate)”. Still, even a promise like that goes a long way helping with readability and maintenance.

With that out of the way, what exactly do I mean by using const everywhere?

Const non-value function parameters

Any reference or pointer function parameters that are pointing to data that will not be modified by the function should be declared as const. If you’re going to use const just for one thing, this is the one to use. It’s invaluable glancing at a function signature and seeing which parameters are inputs and which ones are outputs.

void Detach(PhysicsObject& physObj, int attachmentIndex, const HandleManager& mgr);

Marking those parameters as const also serves as a warning sign in case a programmer in the future tries to modify one of them. Imagine the disaster if the calling code assumes data never changes, but the function suddenly starts modifying that data! const won’t prevent that from happening, but will remind the programmer that he’s changing the “contract” and needs to revisit all calling code and check assumptions.

Const local variables

This is a very important use of const and one of the ones hardly anyone follows. If I declare a local (stack) variable and its value never changes after initialization, I always declare it const. That way, whenever I see that variable used later in the code, I know that its value hasn’t changed.

const Vec2 newPos = AttachmentUtils::ApplySnap(physObj, unsnappedPos);
const Vec2 deltaPos = newPos - physObj.center;
physObj.center = newPos;

This is one of the reasons why I did a 180 on the ternary C operator (?). I used to hate it and find it cryptic and unreadable, but now I find it compact and elegant and it fulfills my const fetish very well.

Imagine you have a function that is going to work in one of two objects and you need to compute the index to the object to work on. You could do it this way:

int index;
if (some condition)
	index = 0;
else
	index = 2;

DoSomethingWithIndex();

Not only does that take several lines not to do much, but index isn’t const (argh!). So every time I see index anywhere later on in that function, I’m going to have to spend the extra mental power to make sure nothing has changed (and, with my current coding style, I would assume it has changed).

Instead, we can simply do this:

const int index = (some condition) ? 0 : 2;
DoSomethingWithIndex();

Ahhhh… So much better!

Const member variables

This one doesn’t really apply to me anymore because I don’t use classes and member variables. But if you do, I strongly encourage you do mark every possible member function as const whenever you can.

The only downside is that sometimes you’ll have some internal bit of data that is really not changing the “logical” state of an object, but it’s still modifying a variable (usually some caching or logging data). In that case, you’ll have to resort to the mutable keyword.

Const value function parameters

const_nazi.jpgApparently I’m not a total Const Nazi because this is one possible use of const that I choose to skip (even though I tried it for a while because of Charles).

Marking a value function parameter as const doesn’t make any difference from the calling code point of view, but it serves the same purpose as marking local stack variables as const in the implementation of the function. You’re just saying “I’m not going to modify that parameter in this function” so it makes the code easier to understand.

I’m actually all for this, but the only reason I’m not doing it is because C/C++ makes it a pain. Marking parameters as const in the function declaration adds extra verbosity and doesn’t help the person browsing the functions at all. You could actually put the const only in the function definition and it will work, but at that point the declaration and the definition are different, so you can’t copy and paste them or use other automated tools or scripts.

The concept of const is one of the things I miss the most when programming other languages like C#. I don’t understand why they didn’t add it to the language. On something like Python or Perl I can understand because they’re supposed to be so free form, but C#? (Edit: How about that? Apparently C# has const. It was either added in the last few years or I completely missed it before). It also really bugs me that Objective C or the Apple API doesn’t make any use of const.

Frankly, if it were up to me, I would change the C/C++ language to make every variable const by default and adding the nonconst or changeable (or take over mutable) keyword for the ones you want to modify. It would make life much more pleasant.

But then again, that’s why the call me the Const Nazi.

This post is part of iDevBlogADay, a group of indie iPhone development blogs featuring two posts per day. You can keep up with iDevBlogADay through the web site, RSS feed, or Twitter.

The Power of Free (aka The Numbers Post #3)

It has been two months since the last “numbers post”. It covered the Valentine’s Day promotion, spike in sales, and subsequent settling out at a very nice level. Here’s a recap of what things looked like at the beginning of May (revenue was about $1500 per week):

fg_before.png

The Plan

Mother’s Day happens at the beginning of May (in the US and Canada anyway, I’m afraid I was too busy in April and I missed Mother’s Day in a lot of European countries). I figured it would be the perfect time to do another push.

If there’s something I learned from past experience, is that the more you manage to concentrate any kind of promotion, the more effective it will be. So in preparation for Mother’s Day, I created a new update (with iPad support), a couple of new in-app purchase items (new set of seeds and some fertilizer bundles), and sent out the usual announcement on the Flower Garden mailing list, Facebook page, and Twitter.

But in addition to all of that, I tried a new strategy: I gave Flower Garden away for free. Yes, completely for free.

The idea sounded really scary at first. After all, I would be giving away my baby for free. Would I lose a lot of money doing that? Would it depreciate the perceived value of Flower Garden? Would it annoy loyal users seeing an app they paid for given away? Fortunately it appears that the answer to those questions was no.

The reason I decided to give Flower Garden away for free was mostly to get it into more people’s hands. I was thinking I would lose some money initially, but then more than make up for it when I turned it back to paid because of all the extra users and word of mouth. There is already a free version with a limited number of pots and seeds, but people are hungry to download paid apps for free.

To add extra impact to this price change, I had Flower Garden featured as the free app for Mother’s Day weekend in Free App Calendar. Unlike other free app web sites, the folks at Free App Calendar are very developer friendly and are not out to take a cut of your profits or charge outrageous fees. It was an absolute pleasure dealing with them.

Mother’s Day

On Saturday May 8th, a few minutes past midnight the day before Mother’s Day, I switched Flower Garden over to free. Now I was committed!

Right away there was a lot of positive reaction around the announcement. Everybody on Twitter and Facebook were responding really well and spreading the word. Major sites like Touch Arcade and 148Apps covered the Mother’s Day promotion and got lots of extra eyeballs on the sale.

After the first day, the data was in: Flower Garden had been downloaded 12,500 times. That was great! As a reference, Flower Garden Free was usually downloaded between 800 and 1,000 times per day, so that was a 10x improvement.

On Sunday, Mother’s Day, things got even better. News had time to propagate more, and people were sending bouquets like crazy, so by the end of the day there had been an additional 26,000 downloads. That’s exactly what I was hoping for!

As a matter of fact, it was doing so great, that I decided to leave it for free as long as the number of downloads was significantly higher than what the free version was normally getting. After Mother’s Day downloads started going down, but they were still pretty strong the following Sunday. Here’s what the download numbers looked like for those 9 days:

fg_week_downloads.png

The important question is how much revenue was there during that time? I was giving the app away for free but it had in-app purchases. Would they make up for it? The answer was a resounding yes!

fg_week_revenue.png

Mother’s Day went on to become the biggest day in terms of revenue since Flower Garden was launched. Bigger even than Christmas or Valentine’s Day! Things started going down after that, but still at a very high level. The little bump towards the end of the week is a combination of the weekend (which always results in more sales), and the feature of Flower Garden on the App Store across most of Europe.

These numbers took a bit to sink in. It really shows that in-app purchases are definitely tied to the number of downloads. If you manage to give away twice as many copies, you’ll probably get close to twice as many in-app purchases. That effect is amplified if you have multiple in-app purchase items available.

It’s also interesting to notice that revenue didn’t follow the same drop-off curve as downloads. It wasn’t nearly as sharp. I suspect two things are going on in there:

  • Some users downloaded Flower Garden during the sale weekend and weren’t interested in it at all. Downloading it was a knee-jerk reaction to any app that goes free, so that never translated into an in-app purchase. Users later in the week however, probably downloaded it because they received a bouquet or were interested in it, so they had a much higher likelihood of buying something through the Flower Shop.
  • Fertilizer. Fertilizer is the only consumable item available for purchase in Flower Garden. Unlike a non-consumable item, the number of sales is not tied to the number of new users, but to the number of current, daily users. The more users launch your app every day, the higher the sales of consumable items. Some of the new users of Flower Garden went on to buy fertilizer later in the week, making revenue higher than you would expect from the download curve.

Flipping The Switch

The number of downloads on Sunday May 16th was slightly over 2,000. At that point I decided that it was close enough to the number of downloads Flower Garden Free was normally getting, so I flipped the switch back to paid. Things were going great, so messing with it was a pretty scary thing to do. Even scarier than it had been setting it free in the first place.

During that week, Flower Garden rose up on the charts. It reached #73 in the top games in the US and was charting very high in all the subcategories and on the iPad charts. As soon as I flipped the switch back to paid, it dropped out of sight from the charts. Fortunately, within a couple of days it came back to its position before that crazy week.

Most importantly, Flower Garden Free, which had dropped quite a bit during that promotion, immediately went back up to the top 100 in the Kids and Family subcategories like before.

As you can expect, as a result of giving it away for free, the ratings on the App Store went down quite a bit. While it was a paid application, the ratings were around 4 stars, but they dropped down to 2.5 stars after that week. It seems people love a free app, but are very quick to criticize it and give it a low rating (especially if it has in-app purchases).

Fortunately bad ratings can be easily fixed with a new update, and some encouragement to users to leave positive rating on the App Store. Now it’s back up to over 4.5 stars.

Aftermath

Now it’s two months later and the dust has had a chance to settle down. Apart from the very nice sales spike during the sale, was it worth it? Again, the answer is a definite yes.

Here’s the revenue since the start of the promotion:

fg_two_months.png

As you can see, it quickly went down, but it settled at a reasonably high level. In fact, compare this two-month period (highlighted in blue) with the previous sales:

fg_after2.png

Before the promotion revenue was hovering around $1,500 per week, now it settled down to about $2,400 per week. An average day today is bigger than Christmas day! Very nice change!

I think the reason why it settled at a higher revenue level than before is because it got more exposure during that week. Lots of people sent bouquets, which introduced new users to Flower Garden. It’s the viral effect I was hoping for from the start, and although it never reached epidemic proportions, it has been enough to keep Flower Garden alive and well.

Adding iPad support as part of the latest update probably helped too. The iPad market is smaller than the iPhone one, but a lot of early adopters are eager to find good apps for their new toys. The smaller market size also allowed Flower Garden to appear in the iPad charts more easily, increasing exposure that way.

Here’s a breakdown of where revenue came from in the last month (I’m excluding the period where Flower Garden was free to get a more accurate view):

sales_breakdown.png

As you can see, consumable items (fertilizer) account for almost half the revenue. Consumable items are a factor of your current userbase, so getting a large influx of new users can result in a permanent revenue increase instead of just a sales spike. It also shows what a small percentage actual app sales are, which explains why even while Flower Garden was free, revenue was still up.

Conclusion

This was a wild ride again! It was definitely worth doing the promotion and it definitely brought home how powerful free can be. However, I’m trying to decide the pricing scheme for my next game, and even though free plus in-app purchases is very tempting, I’m not sure it’s the way to go.

What do you think? Are new games better off being free with in-app purchases, or can indie games be successful being paid (and still having in-app purchases)?

This post is part of iDevBlogADay, a group of indie iPhone development blogs featuring two posts per day. You can keep up with iDevBlogADay through the web site, RSS feed, or Twitter.

When Is It OK Not To TDD?

The basic principles of Test-Driven Development (TDD) are very simple and easy to understand. Every programmer quickly grasps those and is able to apply them to simple cases and low level libraries (math libraries seem to be everybody’s favorite TDD proving ground [1]).

What becomes significantly more difficult is learning to effectively apply TDD to code with more dependencies. A question that I’m often asked from people trying to use TDD for the first time is “How can you possibly use TDD for high-level game code? It’s impossible!”.

When The Going Gets Tough

At that point, the temptation to give up on using TDD for high-level code becomes very strong. It seems that all the time spent writing mocks and hooking objects together is a waste of time and that you could be writing game code much faster without it. “Maybe it would be best to skip TDD, just for that part”.

Bad idea!

tdd_cycle.jpgBy giving up on TDD for high-level code, you’ll be missing out on the main benefit of TDD: designing better code. I’ve said this many times, but it bears repeating: TDD is not a testing technique. It’s a design technique. You get lots of benefits from applying TDD [2], but one of the main one is much more modular and dependency-free code.

That’s because TDD turns our weaknesses into advantages. We’re humans and we’re lazy. We don’t want to repeat ourselves constantly or write complex code. If we’re writing a test for some code we’ll write in the future, we’re going to create a very simple test. We’re probably going to want to put some object or data on the stack and make a function call and nothing else. We certainly don’t want to initialize the graphics system, create a new world, add some level data, start the physics simulation, and then call the function! So by being lazy, we end up writing very simple code with the least amount of dependencies.

What does that mean for our high-level code? The AI logic seems to need access to every single game system, at least the way we implemented it in the past. Using TDD to implement AI logic doesn’t mean writing tests to write the same code you would have written before. It means writing tests and then writing some code that makes those tests pass and nothing else. For example, we might realize that the AI doesn’t need to access the physics system, cast a ray in the world, find out what entities is interesting, and return that information. Instead, all it might need to do is output some data saying that it wants a ray query done at a later time (which in turn is the key to batching those ray casts and achieving high performance).

You soon realize that the code you write is very different (and better!) to what you would have written before. Once you get over that hump, TDDing high-level code is not about twisting your code with mocks, but becomes simpler code that takes simple input data and creates simple output data. By pushing through and forcing yourself to apply TDD, you made a breakthrough and reaped all the benefits.

That’s the main reason why TDD books push the idea that no code can be written without a test first. Otherwise, inexperienced TDD programmers would be too quick to quit and would miss out on the technique completely.

Not only that, but code that is not created with TDD tends to have the bad habit of spreading. When code was not designed through TDD, it means that other code that uses it will probably be difficult to TDD itself. So it’s a bit like const-correctness: You’re either in our out, and there’s very little room for half measures.

On the other hand, I would argue that using TDD on a math library is a bad idea. It’s essential to write good unit tests for a math library, but probably not to design it through TDD. Are you really going to implement a cross product differently just because you wrote tests before? The emphasis there has to be on correctness and performance, not on creating the interface or implementation through tests.

It’s A Tool To Help Development

A few days ago, I received an email from Caleb Gingles with a great question:

I’ve started a new game project and have been making an effort to strictly follow TDD from the beginning. Which has been a challenge, as much of the coding at the beginning of the project has consisted of operating system requirements and framework stuff. […] However, there are other situations that seem much harder to test. Like the main loop in a game. According to Kent Beck’s book, everything begins with testing, and no code is written unless a test requires it. But how can a test require the main loop framework in a game? The loop just… is. It exists to allow you to do certain other things, it doesn’t have much purpose in and of itself.

Applying what I just talked about, we could use TDD to create the main loop of a game. It definitely requires thinking about it differently and breaking preconceptions. It’s about writing code to pass the tests, not writing tests to fit the code we want to write.

So we could do something like this:

TEST(MainLoopExecutesCorrectNumberOfTimes)
{
    MainLoopState loopState;
    MainLoopUtils::Execute(loopState, 4);
    CHECK_EQUAL(4, loopState.frameCount);
}

All of a sudden we see a way to test something that before it seemed impossible to test. We can add tests to make sure certain functions are getting called (maybe we add those function pointers to the loopState), that the main loop exits under certain conditions, etc.

For some games with complex GUIs and different game modes, it might make a lot of sense to have a more flexible main loop and develop it completely through TDD. On the other hand, for a simple iPhone game that has a single main loop that never changes, there might be no point at all in using TDD. The loop code is going to end up looking more or less the same as it would without TDD, just more complex and generic. TDD didn’t help any in that case.

How about code that calls into OpenGL, or some other non-TDD-friendly library? Do we have to TDD that? Frankly, I wouldn’t bother. I did give it a good try at one point several years ago and I can say it is possible, but it’s a pain and you gain almost nothing from it. The main benefits is trusting that the library you’re using really does what it’s supposed to do, but it won’t affect much the design of your code. So what you’re really doing is adding tests to that library (which might or might not be of value).

My advice in a situation like that: Call the library from as few places as possible (I didn’t want to say make a thin wrapper because that implies isolation which is not the goal here), and just test that your code is called at the right time with the right data. The actual code that calls the library should be small and simple enough that you should feel OK not having any tests for it.

Even though it seems to go against what I said in the previous section, TDD is not all or nothing. It’s not a religion.

When you’re starting out, I encourage you to push yourself to think how you could apply TDD to situations you don’t think are possible. Once you’re experienced enough, you’ll know when to say enough is enough, and use TDD only when it actually benefits you. At that point you’ll also know how to write code in a way that plays well with TDD, even if the code itself was not developed through TDD.

In the end, TDD is a tool to help you develop better and faster. Don’t ever let it get in the way of that.

[1] That’s actually a horrible place to apply TDD. More on that in a second.

[2] Usable API, simple code, unit tests, documentation, safety net for refactoring among others.