Back to The Future (Part 1)

Insanity: doing the same thing over and over again and expecting different results. – (attributed) Albert Einstein

How would you like to be able to reproduce every crash report that QA adds to the bug database quickly and reliably? How useful would it be to be able to put a breakpoint on the frame before a crash bug happens?

You can do all that and more if your game is deterministic and you feed it the same inputs as an earlier run. Sounds easy? It is, if you implement it early on and you keep it that way during development. If you choose not to make your game deterministic, your team will go insane by Einstein’s definition, and maybe by a few other definitions as well by the time the project ends.

Determinism

A system is said to be deterministic if, given the same set of inputs, it produces the same set of outputs. Think of your game as a big, black box, with some inputs and outputs.

The two main input types for most games are:

  • Player input devices. The state of the gamepad, keyboard, mouse, or any other input device is used to update the simulation.
  • System clock. Most games query the system clock once per frame to retrieve a current time and delta time. The game then advances the simulation forward by that amount of time. Even console games that expect to run at a rock-solid 60 Hz are often implemented this way to be able to deal with different TV refresh rates or with slower builds during development.

The main outputs of a game system are the pixels on the screen and the generated sounds. Optionally there are other outputs such as network packets or force feedback.

To make the game deterministic we need to make sure that, for a given set of inputs, it always produces the same outputs. In other words, if we play the game twice and enter the same button presses at exactly the same times, the game will end up in the same state both times (same location, same health, same number of lives, same NPC positions, etc.).

What about random numbers, which we rely on so much in games? Aren’t random numbers inputs as well? Yes and no. In games we use pseudo-random number generators. That means that the sequence of numbers from a particular seed is always the same. For a given seed, all runs of the game are fully deterministic, so the random numbers themselves are not really an input into the system. The seed to the random number generator is an input, though, since it will greatly affect the state of the simulation.
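To make the point about seeds concrete, here is a minimal sketch. It uses std::mt19937 from the standard library purely as an example engine, and DrawSequence is a hypothetical helper, not something from our game. Any generator with an explicit seed should behave the same way.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: returns the first `count` raw values produced by a
// generator seeded with `seed`.
std::vector<std::uint32_t> DrawSequence(std::uint32_t seed, std::size_t count)
{
    std::mt19937 rng(seed);                 // the seed fully determines the sequence
    std::vector<std::uint32_t> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        values.push_back(static_cast<std::uint32_t>(rng()));
    return values;
}
```

Two calls with the same seed return identical sequences, which is why recording the seed is enough. One caveat: the sketch reads raw engine output on purpose, because the standard distribution classes (such as std::uniform_int_distribution) are allowed to differ between standard library implementations.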

Record and Playback

Recording all player inputs and the system clock is as straightforward as it sounds: At the beginning of the simulation loop, sample the clock and the player input devices. Then save those values to a file and continue the game simulation as usual. You want to make sure the file is flushed after writing the values for each frame. That way, if the game crashes, you’ll have all the input leading up to that frame and you’ll be able to play it back right up to the point where it crashed.

Recording the input has a negligible performance impact and the amount of data saved is very small. As an example, in our current game, every frame we save the frame number, the clock time, frame delta time, and a 20-byte structure for each gamepad (see Listing 1). Without any compression, that’s 52 bytes per frame for a two-player game. At 60 Hz, a 20-minute game would be a tiny 3.6 MB, which is small enough to attach to a bug tracking system, email to the team, or archive with the build itself.

Listing 1. Game input structures.

struct FrameInput
{
    uint32 frameNumber;
    float time;
    float dt;
};

struct RawControllerState
{
    uint32 buttons;
    float leftStickX;
    float leftStickY;
    float rightStickX;
    float rightStickY;
};

Using the recorded input for playback is almost as easy. Every frame, before the simulation starts, we read the input data from the file and feed it to the game as if it came from the system clock and the input devices. A clean way to do that is to separate the reading of the data from where it comes from. In C++ we can use abstract base classes that define an interface describing how to retrieve the data. For example, a class IGameInput can define an interface, and the class GamepadGameInput reads the data from the hardware, while the class FileGameInput reads it from the recorded file (see Listing 2).

Listing 2. IGameInput, GamepadGameInput and FileGameInput

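class IGameInput
{
public:
    virtual ~IGameInput() {}    // lets implementations be destroyed through the interface
    virtual void GetInputData(RawControllerState& state) = 0;
};

// Reads the state from the hardware.
class GamepadGameInput : public IGameInput
{
public:
    virtual void GetInputData(RawControllerState& state);
};

// Reads the state from a recorded file.
class FileGameInput : public IGameInput
{
public:
    virtual void GetInputData(RawControllerState& state);
};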

In addition to recording and playing back those inputs every frame, we also need to record the seed to the random number generator before the game starts, and read it and apply it during playback. That will ensure that all the random numbers are the same in both runs of the game.

Verification

If you’re going to rely on this recording system, you need to make sure the playback produces the exact same results as the original session. If they aren’t the same, the whole system is worthless. Also, because we’re just recording input, if the state of the game during playback ever starts to diverge, it will continue getting more and more out of sync from there. We need a way to make sure the playback produces the same state as the initial recording.

Earlier we identified some of the outputs of the game as the pixels on the screen or the sound produced by the speakers. We could compare pixels with a previous run of the game, but the storage requirements would be enormous, and the performance less than ideal. Besides, subtle changes in shaders, lighting, or a simple texture change would throw the comparison off. An easier solution is to check the game state itself. If an enemy is in a different location in two runs of the game, of course it’s going to render to a different set of pixels. It will be a lot faster and easier to compare game states than raw output.

During recording, in addition to the game input, we can save some of the state of the game. There’s no need to record it all, just some of the most important and representative state. Good candidates include player and enemy positions, prop transforms, score, etc. Unless it’s crucial to your game, there’s no need to record things like player animations or enemy AI state. Chances are that if any of those diverge, the positions for those entities will also diverge right away.

Once we have all of this state recorded, we can verify the state does not change during playback. Every frame, we compare the current game state with the recorded game state. If they’re different, even by the smallest amount, we know something has gone wrong and we flag it right away. The game isn’t quite deterministic and something needs to be fixed. We can choose to record and verify the game state anywhere in the simulation loop (as long as they’re both done in the same spot), but if we do it after the simulation step instead of before, we’ll be able to see what the inputs were that caused the divergence this frame, which will help when debugging. The main loop is shown in Listing 3.

Listing 3. Main loop structure

while (!done)
{
    SampleClockAndInputs();   // from different sources through the same interface
    RecordInput();
    UpdateSimulation();
    if (recordGameState) RecordGameState();
    if (verifyGameState) VerifyGameState();
}
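The verification step can be as simple as an exact comparison between the snapshot saved during recording and the one taken during playback. The sketch below assumes a hypothetical GameStateSnapshot holding entity positions and the score; a real game would pick whatever state is most representative.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical snapshot of "important and representative" state.
struct EntityState
{
    float x;
    float y;
    float z;
};

struct GameStateSnapshot
{
    std::vector<EntityState> entities;
    int score = 0;
};

// Returns true only if the two snapshots are identical. Positions are
// compared byte for byte on purpose: any difference, however small,
// means the playback has diverged.
bool StatesMatch(const GameStateSnapshot& recorded, const GameStateSnapshot& current)
{
    if (recorded.score != current.score)
        return false;
    if (recorded.entities.size() != current.entities.size())
        return false;
    if (recorded.entities.empty())
        return true;
    return std::memcmp(recorded.entities.data(),
                       current.entities.data(),
                       recorded.entities.size() * sizeof(EntityState)) == 0;
}
```

There is deliberately no tolerance in the comparison. A difference in the last bit of a float this frame tends to become a visibly different game a few hundred frames later, so it is better to flag it on the frame where it first appears.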

Unlike recording game input, recording the game state can be a much more expensive operation. Traversing the game structures can be a significant performance hit due to cache misses, and the data saved to disk often results in large files. Because of this, in our current game player input is recorded all the time, but recording game state is optional and can be controlled through a command-line parameter. Game state verification is also optional, since sometimes we want to play back a recorded set of inputs even knowing that the state of the game is going to diverge due to changes we’ve made.

For a few of you, checking the game state won’t be enough. If you’re writing a middleware graphics layer, you probably want to verify that the values you’re generating in the back buffer are the same ones from a previous pass. Or maybe you want to confirm that improving the performance of a rendering algorithm still generates the same image. In that case, you might want to consider something like Perceptual Image Difference, which will be much less error-prone than comparing exact values for pixels.

Monkeying Around

Once the game is fully deterministic and the playbacks are rock solid, you want to make sure it stays that way. It’s all too easy to introduce bugs that will cause playbacks to diverge. For example, a single, innocent-looking, uninitialized variable can change the simulation depending on the value it happens to have for this run.

The best method I’ve found to stress test the playback system is to use a recorded session of monkey input. The monkey input method consists of feeding the game pseudo-random inputs (as if a monkey were playing the game). Recording both the input and game state of a monkey input play session, and then playing it back while verifying the game state, kills two birds with one stone: You get some nice automated testing of your game, and you verify that the game is fully deterministic. It sounds too simple to be useful, but you’ll be amazed at how many bugs your first session of monkey input will uncover.

A clean way to implement the monkey input is to write a new class that implements the IGameInput interface. This new class will generate game input for all the buttons and axes of a game controller, but the input won’t come from the hardware game controller or from a recorded session; it will come from randomly generated values. Apart from being a very clean way to insert input into the game, this approach has the advantage that the monkey input can be recorded just as if it were regular input coming from the controller. That means we can later replay a session and verify its game state, which makes for a very useful functional test.

It turns out that totally random input values are not ideal, as becomes apparent as soon as you implement the naive random monkey input class. If the state of the jump button is truly random, it will be pressed and released almost every frame, which means the player will hardly ever jump high enough off the floor. A better implementation holds each button state for a random duration between a minimum and a maximum press time, which gives much better results. You might also want to avoid pressing specific buttons (like the pause button, or the restart button combination). Resist the temptation to make the monkey input too smart and make it behave more like a real player, for example, by limiting the number of buttons it can have pressed simultaneously. Part of the benefit of the monkey input is that it will do very unexpected things that no sane player would ever do on purpose, and by doing that, it will unearth many more problems and issues with the game.

One day, shortly after I implemented the monkey input system, the playback verification started failing. Some game entities were ending up in a different state than expected. After doing some digging, I realized that I was using the same random number generator for the game simulation and for the monkey input. Since the playback was just reading input values from the file and not generating them on the fly, the random number sequence was getting out of sync right away, causing entities with some randomness in them to behave differently. Lesson learned: Make sure that nothing in the monkey input affects the game systems. In this case, I solved it by using two instances of the random number generator, one for the monkey and one for the game.
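Putting those lessons together, a monkey input class might look something like the sketch below. Everything here is an assumption for illustration: the MonkeyGameInput name, the constructor parameters, and the choice of std::mt19937. The two points it tries to capture are that each button holds its state for a random number of frames within a range, and that the monkey owns its own random number generator so it never disturbs the one used by the simulation.

```cpp
#include <cstdint>
#include <random>

struct RawControllerState
{
    std::uint32_t buttons;
    float leftStickX;
    float leftStickY;
    float rightStickX;
    float rightStickY;
};

class IGameInput
{
public:
    virtual ~IGameInput() {}
    virtual void GetInputData(RawControllerState& state) = 0;
};

class MonkeyGameInput : public IGameInput
{
public:
    // allowedButtons: mask of buttons the monkey may press (leave out pause, restart, ...).
    // minHoldFrames/maxHoldFrames: how long a pressed or released state lasts;
    // requires 1 <= minHoldFrames <= maxHoldFrames.
    MonkeyGameInput(std::uint32_t seed, std::uint32_t allowedButtons,
                    int minHoldFrames, int maxHoldFrames)
        : m_rng(seed), m_allowed(allowedButtons),
          m_minHold(minHoldFrames), m_maxHold(maxHoldFrames), m_buttons(0)
    {
        for (int bit = 0; bit < 32; ++bit)
            m_framesLeft[bit] = 0;
    }

    virtual void GetInputData(RawControllerState& state)
    {
        for (int bit = 0; bit < 32; ++bit)
        {
            const std::uint32_t mask = 1u << bit;
            if ((m_allowed & mask) == 0)
                continue;                          // never touch excluded buttons
            if (m_framesLeft[bit] > 0)
            {
                --m_framesLeft[bit];               // keep the current state a bit longer
                continue;
            }
            m_buttons ^= mask;                     // toggle pressed/released
            m_framesLeft[bit] = RandomInRange(m_minHold, m_maxHold) - 1;
        }
        state.buttons = m_buttons;
        state.leftStickX = RandomAxis();
        state.leftStickY = RandomAxis();
        state.rightStickX = RandomAxis();
        state.rightStickY = RandomAxis();
    }

private:
    int RandomInRange(int lo, int hi)
    {
        return lo + static_cast<int>(m_rng() % static_cast<std::uint32_t>(hi - lo + 1));
    }

    float RandomAxis()                             // value in [-1, 1]
    {
        return static_cast<float>(m_rng() % 2001u) / 1000.0f - 1.0f;
    }

    std::mt19937 m_rng;                            // private to the monkey, separate from the game's
    std::uint32_t m_allowed;
    int m_minHold;
    int m_maxHold;
    std::uint32_t m_buttons;
    int m_framesLeft[32];
};
```

The sticks are left fully random every frame here, which keeps the monkey suitably erratic. Because the class implements IGameInput, its output can be recorded like any other input and later replayed from the file, at which point the monkey's generator is not involved at all.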

Playback in The Real World

At this point we can easily reproduce bugs. We can re-run the game and put a breakpoint right before the game crashes—except that the crash happens 20 minutes into the playback and nobody wants to wait that long.

Fortunately, we can make time go by faster. During playback we don’t care about tearing artifacts, so we can turn off the vertical sync, which will speed things up a bit. Most importantly, we often don’t even care whether we render anything or not, so if we turn off rendering completely, we can make the playback run significantly faster, particularly if the game was graphics bound. One word of caution: If the rendering part of the game does significant work, and especially if you suspect it might be interacting with the rest of the game to cause a bug or some other source of non-determinism, you might want to keep doing the same work in the rendering system, constructing the push buffer the way you normally would, but never sending it off to the graphics hardware.

Turning off the vertical sync and disabling the graphics rendering only reduces the amount of time we spend in each frame. It still takes 10 minutes to get 10 minutes into the game; we just go through many more frames to get there. The last piece of the puzzle to make time go faster is to be able to set a fixed timestep. Now we can go through the simulation at a rate much faster than real time (and the beefier your computer, the faster it will go). In my current game, we can go through 10 minutes of game time in about 1 minute of real time.
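A fixed timestep can be sketched as a clock that ignores the wall clock entirely and hands the simulation the same delta every frame. FixedStepClock and FrameTime are hypothetical names, and the sketch assumes the simulation only ever sees time through a call like Tick().

```cpp
#include <cstdint>

struct FrameTime
{
    float time;   // game time in seconds since the start of the session
    float dt;     // time step for this frame
};

// Clock source that advances game time by a constant step per frame,
// no matter how long the frame actually took to run.
class FixedStepClock
{
public:
    explicit FixedStepClock(float stepSeconds) : m_step(stepSeconds), m_frame(0) {}

    FrameTime Tick()
    {
        ++m_frame;
        FrameTime t;
        t.dt = m_step;
        // Derive the time from the frame count rather than summing dt,
        // so rounding error does not pile up over a long session.
        t.time = static_cast<float>(m_frame) * m_step;
        return t;
    }

private:
    float m_step;
    std::uint32_t m_frame;
};
```

With a step of 1/60 of a second, 36,000 calls to Tick() cover 10 minutes of game time, and how long that takes in the real world depends only on how fast the machine can run the simulation step.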

One use for recording and playback that we haven’t mentioned is performance comparisons. When you’re optimizing the game, you can record a gameplay sequence and gather some performance statistics: average fps, longest frames, etc. Then you can apply your optimizations, play back the same input sequence, and compare the performance statistics to see how effective the optimizations really were. Just make sure you use real clock time and not the recorded clock time; otherwise you won’t see any differences, even with all the hard work you put into the optimizations.
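For the performance comparison, the statistics need to come from a real clock, not from the recorded one. A minimal sketch, assuming hypothetical FrameStats and TimeSeconds helpers built on std::chrono::steady_clock:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Accumulates real (wall-clock) frame durations during a playback run.
class FrameStats
{
public:
    void AddFrame(double seconds)
    {
        m_total += seconds;
        m_longest = std::max(m_longest, seconds);
        ++m_count;
    }

    double AverageFps() const { return m_total > 0.0 ? m_count / m_total : 0.0; }
    double LongestFrame() const { return m_longest; }

private:
    double m_total = 0.0;
    double m_longest = 0.0;
    std::size_t m_count = 0;
};

// Measures how long a call really takes, independent of the recorded clock
// that is being fed to the simulation.
template <typename Fn>
double TimeSeconds(Fn fn)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    fn();
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

In the main loop you would wrap the whole frame in TimeSeconds and pass the result to AddFrame, then compare AverageFps() and LongestFrame() between the run before the optimization and the run after it.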

There are a few details we glossed over that might prevent some games from being fully deterministic: network traffic, asynchronous file I/O, and threading issues. We’ll cover those in detail next month.

You’ll soon find that input recording and playback becomes an essential tool in the development process and you’ll wonder how you ever lived without it. Spend a few hours implementing it early in the development cycle and reap the benefits many times over during production. You’ll be the hero of the day when that dreaded crash bug from QA arrives minutes before a deadline.

Thanks to Jim Tilander for being a great idea bouncing-board and proofreading the article.

This article was originally printed in the May 2008 issue of Game Developer.

  • Ofek Shilon

    while trying to reproduce release-build bugs in debug build.

    At least on the same processor, that *is* achievable by using either-
    (1) compiler switches (eg, fp:fast on an otherwise debug build),
    (2) pragmas (#pragma float_control), and occasionally -
    (3) controlling rounding directly via the intrinsic _controlfp.
    Lots more info at -
    http://msdn.microsoft.com/en-us/library/aa289157(vs.71).aspx#floapoint_topic7

    A specific gotcha for game developers is that calling Direct3D device methods can impact floating-point precision (which is weird at first, but kinda makes sense). You can bypass this behaviour by using the D3DCREATE_FPU_PRESERVE flag while creating your device:
    http://msdn.microsoft.com/en-us/library/bb172527(VS.85).aspx

  • Andrew Wiley

    You’d think there’d be some way to disable the extra accuracy that is added, but I can’t see any conceivable way that something that low-level could be changed. Maybe I should email Intel about it… Not sure what would happen there.
    Anyway, yea, it’s hard to get rid of floats altogether. I’ve seen cases where people went from floating-point to fixed-point, but the general approach is just to try and use ints. The main problem there is that AFAIK you have to write your own physics because all the engines I’ve seen use floats. Supposedly PhysX software physics (that don’t take advantage of a PhysX card) are deterministic, but the Ageia developers don’t seem very sure. I suspect they would have the same float issue. Hardware accelerated PhysX is completely nondeterministic because of thread race conditions that regularly occur. The physics stay realistic, but the order in which events are processed isn’t consistent.
    It’s ironic because most of the articles I’ve read about determinism extol how easy network programming is when your game is deterministic, and only a few of those articles mention the float problem.

    Maybe one day Intel and AMD will release SDK’s that allow you to set the float accuracy, but until then… oh well.

  • Andrew Wiley

    I’ve read a lot about determinism in the past, and you cover it quite thoroughly here, but one problem I’ve heard of that you seem to not include is float processing.
    It may not apply to you… but then again, if you’re replaying the same input on different systems, it does.
    I’ve read that Intel and AMD have different accuracies in their FPU’s (IIRC Intel calculates out the float several bits farther than requested and uses the extra data to round off). This leads to small differences in floating point values that build up over time and thwart determinism.
    So I’ve read, anyway. Is there some way you’re getting around this? Do you just never hit it because you don’t network your games?
    Is all this I’ve written complete fiction?

  • noel

    You’re right about different hardware treating floats differently. Heck, the same hardware in debug and release will cause different results because of rounding differences and what goes to memory and what stays in registers.

    It hasn’t been a problem for us because we always replay inputs on the same hardware we recorded them on. Actually, I think I might have replayed some inputs from our functional test server (AMD) on my home PC (Intel Core 2 Duo), and that worked fine. But for us the most important thing is to replay it on the same hardware configuration (same console), so that’s not a problem.

    It would be a big deal if you tried to use that as part of network play though, so you’ll probably want to do the standard syncing of transforms every so often to correct any deviations or some other scheme like that.

    Or, if you can get away with it, don’t use floats at all. But that’s a lot harder if you’re having lots of physics integrated in your game though.

     

  • http://www.claushoefele.com/ Claus

    Wouldn’t it be possible to combine game state and input state captures: capture the game state in regular, but large, intervals (say every 300-600 frames) and record the input in-between those game state updates? (Similar to the concept of I-frames and P-frames in video.)

    That way, you could fast-forward to the last recorded game state before the bug happened and replay the input from this point. This also minimises errors that accumulate during playback if your game engine is not 100% deterministic.

    -Claus

  • noel

    That’s a good idea, especially because you can scrub forwards or backwards to each saved game state, and get to a crash situation much faster.

    I don’t think it’s a substitute for having the engine fully deterministic, because if it isn’t, it will be very hard to reproduce the problem, even after a nearby checkpoint. Also, a game state save often won’t preserve everything the way it was (individual particles, memory heaps, etc). So it might work for some things, but not for others. But yeah, I really like the idea of combining full game states and input playback.