Data-Oriented Design Now And In The Future

There has been a lot of recent discussion (and criticism) on Data Oriented Design recently. I want to address some of the issues that have been raised, but before that, I’ll start with this reprint from my most recent Game Developer Magazine. If you have any questions you’d like addressed, add write a comment and I’ll try to answer everything I can.

 

Last year I wrote about the basics of Data-Oriented Design (see the September 2009 issue of Game Developer). In the time since that article, Data-Oriented Design has gained a lot of traction in game development and many of teams are thinking in terms of data for some of the more performance-critical systems. Continue reading

Start Pre-allocating And Stop Worrying

One of the more frequent questions I receive is what kind of memory allocation strategy I use in my games. The quick answer is none (at least frame to frame, I do some allocation at the beginning of each level on a stack-based allocator). This reprint of one of my Inner Product column covers quite well how I feel about memory allocation.


We’ve all had things nagging us in the back of our minds. They’re nothing we have to worry about this very instant, just something we need to do sometime in the future. Maybe that’s changing those worn tires in the car, or making an appointment with the dentist about that tooth that has been bugging you on and off.

Dynamic memory allocation is something that falls in that category for most programmers. We all know we can’t just go on allocating memory willy-nilly whenever we need it, yet we put off dealing with it until the end of the project. By that time, deadlines are piling on the pressure and it’s usually too late to make significant changes. With a little bit of forethought and pre-planning, we can avoid those problems and be confident our game is not going to run out of memory in the most inopportune moment.

On-Demand Dynamic Memory Allocation

memory-all-ranks.jpgThe easiest way to get started with memory management is to allocate memory dynamically whenever you need it. This is the approach many software engineering books consider as ideal and it’s often encouraged in Computer Science classes.

It’s certainly an easy approach to use. Need a new animation instance when the player is landing on a ledge? Allocate it. Need a new sound when we reach the goal? Just allocate another one!

On-demand dynamic memory allocation can help to keep memory usage to a minimum, because, you only allocate the memory that you need and no more. In practice it’s not quite as neat and tidy because there can be a surprisingly large amount of overhead per allocation, which adds up if programmers become really allocation-happy.

It’s also a good way to shoot yourself in the foot.

Games don’t live inside a Computer Science textbook, so we have to deal with real world limitations, which make this approach cumbersome, clunky, and potentially disastrous. What can go wrong with on-demand dynamic memory allocation? Almost everything!

Limited Memory

Games, or any software for that matter, run on machines with limited amounts of memory. As long as you know what that limit is, and you keep extremely careful track of your memory usage, you can remain under the limit. However, since the game is allocating memory any time it needs it, there will most likely come a time when the game tries to allocate a new block but there is no memory left. What can you do then? Nothing easy I’m afraid. You can try to free an older memory block and allocate the new one there, or you can try to make your game tolerant to running out of memory. Both those solutions are very complex and difficult to implement correctly.

Even setting memory budgets and sticking to them can be very difficult. How can a designer know that a given particle system isn’t going to run out of memory? Are these AI units going to create too many pathfinding objects and crash the game? Hard to say until we run the game in all possible combinations. And even then, how do you know it isn’t going to crash five minutes later? Or ten? It’s almost impossible to know for certain.

If you insist in using this approach, at the very least, you should tag all memory allocations, so you have an idea of how memory is being used. You can either tag each allocation based on what system initiated it (physics, textures, animation, sound, AI, etc) or even on the filename where it originated, which has the advantage that it can be automated and should still give you a good picture of the overall memory usage.

Memory Fragmentation

Even if you take lots of pain not to go over your the available memory, you might still run into trouble because of memory fragmentation. You might have enough memory for a new allocation, but in the form of many small memory blocks instead of a large contiguous one. Unless you provide your own memory allocation mechanism, fragmentation is something that is very hard to track on your own, so you can’t even be ready for it until the allocation fails.

Virtual Memory

Virtual memory could solve all those problems. In theory, if you run out of real memory, the operating system swaps out some older, unused pages to disk and makes room for the new memory you requested. In practice, it’s just a bad caching scheme because it can be triggered at the worst possible moment, and it doesn’t know about what data it’s swapping out or how your game uses it.

Games, unlike most other software, have a “soft realtime” requirement: The game needs to keep updating at an acceptable interactive rate, which is somewhere around 15 or more frames per second. That means that gamers are going to make a trip to the store to return your game if it pauses for a couple of seconds every few minutes to “make some room” for new memory. So relying on virtual memory isn’t a particularly attractive solution.

Additionally, lots of games run in platforms with fixed amounts of RAM and no virtual memory. So when memory runs out, things won’t get slow and chuggy, they’ll crash hard. When the memory is gone, it’s really gone.

Performance Problems

There are some performance issues that are relatively easy to track down and fix. Usually ones that occur every frame and are happening in a single spot: some expensive operation, a O(n3) algorithm, etc. Then there are performance problems introduced by dynamic memory allocations, which can be really hard to track down.

Standard malloc returns memory pretty quickly, and usually doesn’t ever register on the the profiler. Every so often though, whenever the game has been running for a while and memory is pretty fragmented, it can spike up and cause a significant delay for just a frame. Trying to track down those spikes has caused more than one programmer to age prematurely. You can avoid some of those problems by using your own memory manager, but don’t attempt to write a generic one yourself from scratch. Instead start with some of the ones listed in the references.

Malloc spikes are not the only source of performance problems. Allocating many small blocks of memory can lead to bad cache coherence when the game code access them sequentially. This problem usually manifests itself as a general slowdown that can’t be narrowed down in the profiler. With today’s hardware of slow memory systems and deep caches, good memory access patterns are more important than ever.

Keeping Track Of Memory

Another source of problems with dynamic memory allocation are bugs in the logic that keeps track of the allocated memory blocks. If we forget to free some of them, our program will have memory leaks and has the potential to run out of memory.

The flip side of memory leaks are invalid memory access. If we free a memory block and later we access it as if it were allocated, we’ll either get a memory access exception, or we’ll manage to corrupt our own game.

Some techniques, such as reference counting and garbage collection can help keep track of memory allocations, but introduce their own complexities and overhead.

Introducing Pre-allocation

On the opposite corner of the boxing ring is the purely pre-allocated game. It excels at everything that the dynamically-allocated game is weak at, but it has a few weaknesses of its own. All in all, it’s probably a much safer approach for most games though.

The idea behind a pre-allocation memory strategy is to allocate everything once and never have to do any dynamic allocations in the middle of the game. Usually you grab as big a block of memory as you can, and then you carve it out to suit your game’s needs.

Some advantages are very clear: no performance penalties, knowing exactly how your memory is used, never running out of memory, and no memory fragmentation to worry about. There are some other more subtle advantages, such as being able to put data in contiguous areas of memory to get best cache coherency, or having blazingly-fast load times by loading a baked image of a level directly into memory.

The main drawback of pre-allocation is that is more complex to implement than the dynamic allocation approach and it takes some planning ahead.

Know Your Data

For preallocation to work, you need to know ahead of time how much of every type of data you will need in the game. That can be a daunting proposition, especially to those used to a more dynamic approach. However, with a good data baking system (see last month’s Inner Product column), you can get a global view of each level and figure out how big things need to be.

There is one important design philosophy that needs to be adopted for preallocation to work: Everything in the game has to be bounded. That shouldn’t feel too restrictive; after all, the memory in your target platform is bounded, as well as every single resource. That means that everything that can create new objects, including high-level game constructs, should operate on a fixed number of them. This might seem like an implementation detail, but it often bubbles up to what’s exposed to game designers. A common example is an enemy spawner. Instead of designing a spawner with an emission rate, it should have a fixed number of enemies it can spawn (and potentially reuse them after they’re dead).

Potentially Wasted Space

If you allocate enough data for the worst case in your game, that can lead to a lot of unused data most of the time. That’s only an issue if that unused data is preventing you from adding more content to the game. We might initially balk at the idea of having 2000 preallocated enemies when we’re only going to see 10 of them at once. But when you realize that each of those enemies is only taking 256 bytes and the total overhead is 500 KB, which can be easily accommodated in most modern platforms today.

Preallocation doesn’t have to be as draconian as it sounds though. You could relax this approach and commit to having each level preallocated and never having dynamic memory allocations while the game is running. That still allows you to dynamically allocate the memory needed for each level and keep wasted space to a minimum. Or you can take it even further and preallocate the contents of memory blocks that are streamed in memory. That way each block can be divided in the best way for that content and wasted space is kept to a minimum.

Reuse, RecycleM

If you don’t want to preallocate every single object you’ll ever use, then you can create a smaller set, and reuse them as needed. This can be a bit tricky though. First of all, it needs to be very much specific to the type of object that is reused. So particles are easy to reuse (just drop the oldest one, or the ones not in view), but might be harder with enemy units or active projectiles. It’s going to take some game knowledge of those objects to decide which ones to reuse and how to do it.

It also means that systems need to be prepared to either fail an allocation (if your current set of objects is full and you don’t want to reuse an existing one), or they need to cope with an object disappearing from one frame to another. That’s a relatively easy problem to solve by using handles or other weak references instead of direct pointers.

Then there’s the issue that reusing an object isn’t as simple as constructing a new one. You really need to make sure that when you reuse it, there’s nothing left from the object it replaced. This is easy when your objects are just plain data in a table, but can be more complicated when they’re complex C++ classes tied together with pointers. In any case, you can’t apply the Resource Acquisition Is Initialization (RAII) pattern, but it doesn’t seem to be a pattern very well suited for games, and it’s a small price to pay for the simplicity that preallocation provides.

Specialized Heaps

Truth be told, a pure pre-allocated approach can be hard to pull off, especially with highly dynamic environments or games with user-created content. Specialized heaps is a combination of dynamic memory allocation and pre-allocation that takes the best of both worlds.

The idea behind specialized heaps is that the heaps themselves are pre-allocated, but they allow some form of specialized dynamic allocation within them. That way you avoid the problems of running out of memory, or memory fragmentation globally, but you still can perform some sort of dynamic allocation when needed.

One type of specialized heaps is based on the object type. If you can guarantee that all objects allocated in that heap are going to be of the same size, or at least a multiple of a certain size, memory management becomes much easier and less error prone, and removes a lot of the complexity of a general memory manager.

My favorite approach for games is to create specialized heaps based on the lifetime of the objects allocated in them. These heaps use sequential allocators, always allocating memory from the beginning of a memory block. When the lifetime of the objects is up, the heap is reset and allocations can start from the beginning again. The use of a simple sequential allocator bypasses all the insidious problems of general memory management: fragmentation, compaction, leaks, etc. See the code in http://gdmag.com/resources/code.htm for an implementation of a SequentialAllocator class.

The heap types most often used in games are:

  • Level heap. Here you allocate all the assets and data for the level at load time. When the level is unloaded, all objects are destroyed at once. If your game makes heavy use of streaming, this can be a streaming block instead of a full level.
  • Frame heap. Any temporary objects that only need to last a frame or less get allocated here, and destroyed at the end of the frame.
  • Stack heap. This one is a bit different from the others. Like the other heaps, it uses a sequential allocator and objects are allocated from the beginning, but instead of destroying all objects at once, it only destroys objects up to the marker that is popped fro the stack.

What About Tools?

You can take everything I’ve written here, and (almost) completely ignore it for tools. I fall in the camp of the programmers who consider the runtime as a totally separate beast from the tools. That means that the runtime can be lean and mean and minimalistic, but I can relax and use whatever technique makes me more productive when writing tools. That means you can allocate memory any time you want, you can use complex libraries like STL and Boost, etc. Most tools are going to run on a beefy PC and a few extra allocations here and there won’t make any difference.

Be careful with performance-sensitive tools though. Tools that build assets or compute complex lighting calculations might be a bottleneck in the build process. In that case, performance becomes crucial again and you might want to be a bit more careful about memory layout and cache coherency.

On the other hand, if the tool you’re writing is not performance sensitive, you should ask yourself if it really needs to be written in C++. Maybe C# or Python are better languages if all you’re doing is transforming XML files or verifying that a file format is correct. Trading performance for ease of development is almost always a win with development tools.


Next time you reach out for a malloc inside your main loop, think about how it can fail. Maybe you can pre-allocate that memory and stop worrying about what’s going to happen the day you prepare the release candidate.


This article was originally printed in the February 2009 issue of Game Developer.

The Const Nazi

Anybody who worked with me or saw any of my code, would know right away why they call me the Const Nazi. That’s because in my coding style, I make use of the keyword const everywhere. But instead of going on about how const is so great, I’m going to let Hitler tell us how he really feels about it.



No Flash? Try the QuickTime video version.

Let me get one thing out of the way to stop all the trigger-happy, const-bashing, would-be-commenters: const doesn’t make any guarantees that values don’t change.

You can change a const variable by casting the constness away, or referencing it through a pointer, but you really had to go out of your way to do that. If it helped with that, const would solve (or improve) the memory aliasing problem like Hitler pointed out. It doesn’t, so const is pretty weak as far as promises go. It just says “I, the programmer, promise not to change this value on purpose (unless I’m truly desperate)”. Still, even a promise like that goes a long way helping with readability and maintenance.

With that out of the way, what exactly do I mean by using const everywhere?

Const non-value function parameters

Any reference or pointer function parameters that are pointing to data that will not be modified by the function should be declared as const. If you’re going to use const just for one thing, this is the one to use. It’s invaluable glancing at a function signature and seeing which parameters are inputs and which ones are outputs.

void Detach(PhysicsObject& physObj, int attachmentIndex, const HandleManager& mgr);

Marking those parameters as const also serves as a warning sign in case a programmer in the future tries to modify one of them. Imagine the disaster if the calling code assumes data never changes, but the function suddenly starts modifying that data! const won’t prevent that from happening, but will remind the programmer that he’s changing the “contract” and needs to revisit all calling code and check assumptions.

Const local variables

This is a very important use of const and one of the ones hardly anyone follows. If I declare a local (stack) variable and its value never changes after initialization, I always declare it const. That way, whenever I see that variable used later in the code, I know that its value hasn’t changed.

const Vec2 newPos = AttachmentUtils::ApplySnap(physObj, unsnappedPos);
const Vec2 deltaPos = newPos - physObj.center;
physObj.center = newPos;

This is one of the reasons why I did a 180 on the ternary C operator (?). I used to hate it and find it cryptic and unreadable, but now I find it compact and elegant and it fulfills my const fetish very well.

Imagine you have a function that is going to work in one of two objects and you need to compute the index to the object to work on. You could do it this way:

int index;
if (some condition)
	index = 0;
else
	index = 2;

DoSomethingWithIndex();

Not only does that take several lines not to do much, but index isn’t const (argh!). So every time I see index anywhere later on in that function, I’m going to have to spend the extra mental power to make sure nothing has changed (and, with my current coding style, I would assume it has changed).

Instead, we can simply do this:

const int index = (some condition) ? 0 : 2;
DoSomethingWithIndex();

Ahhhh… So much better!

Const member variables

This one doesn’t really apply to me anymore because I don’t use classes and member variables. But if you do, I strongly encourage you do mark every possible member function as const whenever you can.

The only downside is that sometimes you’ll have some internal bit of data that is really not changing the “logical” state of an object, but it’s still modifying a variable (usually some caching or logging data). In that case, you’ll have to resort to the mutable keyword.

Const value function parameters

const_nazi.jpgApparently I’m not a total Const Nazi because this is one possible use of const that I choose to skip (even though I tried it for a while because of Charles).

Marking a value function parameter as const doesn’t make any difference from the calling code point of view, but it serves the same purpose as marking local stack variables as const in the implementation of the function. You’re just saying “I’m not going to modify that parameter in this function” so it makes the code easier to understand.

I’m actually all for this, but the only reason I’m not doing it is because C/C++ makes it a pain. Marking parameters as const in the function declaration adds extra verbosity and doesn’t help the person browsing the functions at all. You could actually put the const only in the function definition and it will work, but at that point the declaration and the definition are different, so you can’t copy and paste them or use other automated tools or scripts.


The concept of const is one of the things I miss the most when programming other languages like C#. I don’t understand why they didn’t add it to the language. On something like Python or Perl I can understand because they’re supposed to be so free form, but C#? (Edit: How about that? Apparently C# has const. It was either added in the last few years or I completely missed it before). It also really bugs me that Objective C or the Apple API doesn’t make any use of const.

Frankly, if it were up to me, I would change the C/C++ language to make every variable const by default and adding the nonconst or changeable (or take over mutable) keyword for the ones you want to modify. It would make life much more pleasant.

But then again, that’s why the call me the Const Nazi.


This post is part of iDevBlogADay, a group of indie iPhone development blogs featuring two posts per day. You can keep up with iDevBlogADay through the web site, RSS feed, or Twitter.

The Always-Evolving Coding Style

This is my first entry into #iDevBlogADay. It all started very innocently with a suggestion from Miguel, but the ball got rolling pretty quickly. The idea is to have one independent iPhone game developer write a blog entry each day of the week. At first we thought we would be hard-pressed to get 7 developers, but it’s starting to seem we might have multiples per day!

Check out the new sidebar with all the #iDevBlogADay blogs. We’re also putting together a common RSS feed if you want to subscribe to that instead.

Writing is addictive, so don’t be surprised if this once-a-week minimum turns into multiple-times-a-week.

 

matrix.jpgEvery developer who’s been working on a team for a while is able to tell the author of a piece of code just by looking at it. Sometimes it’s even fun to do a forensic investigation and figure out not just the original author, but who else modified the source code afterwards.

What I find interesting is that I can do the same thing with my own code… as it changes over time. Every new language I learn, every book I read, every bit of code I see, every open-source project I browse, every pair-programming session, every conversation with a fellow developer leaves a mark behind. It slightly changes how I think of things, and realigns my values and priorities as a programmer. And those new values translate into different ways to write code, different architectures, and different coding styles.

It never happens overnight. I can’t recall a single case where I changed my values in a short period of time, causing dramatic changes to my coding style. Instead, it’s the accumulation of lots of little changes here and there that slowly shifts things around. It’s like the movement of the Earth’s magnetic pole: very slow, but changes radically over time (although maybe just a tad bit faster).

Why Talk About Coding Styles

Coding style in itself is purely a personal thing, and therefore, very uninteresting to talk about. However, in its current form, my coding style goes against the grain of most general modern “good practices”. A few weeks ago I released some sample source code and it caused a bit of a stir because it was so unconventional. That’s when I realized it might be worth talking about it after all (along with George bugging me about it), and especially the reasons why it is the way it is.

Before I even start, I want to stress that I’m not advocating this approach for everybody, and I’m certainly not saying it’s the perfect way to go. I know that in a couple of years from now, I’ll look back at the code I’m writing today and it will feel quaint and obsolete, just like the code I wrote during Power of Two Games looks today. All I’m saying is that this is the style that fits me best today.

Motivation

This is my current situation which shapes my thinking and coding style:

  • All my code is written in C and C++ (except for a bit of ObjC and assembly).
  • It’s all for real-time games on iPhone, PCs, or modern consoles, so performance and resource management are very important.
  • I always try to write important code through Test-Driven Development.
  • I’m the only programmer (and only designer).
  • Build times in my codebase are very fast.

And above all, I love simplicity. I try to achieve simplicity by considering every bit of code and thinking whether it’s absolutely necessary. I get rid of anything that’s not essential, or that’s not benefitting the project by at least two or three times as much as it’s complicating it.

How I Write Today

So, what does my code look like these days? Something like this (this is taken from a prototype I wrote with Miguel of Mystery Coconut fame):

namespace DiverMode
{
    enum Enum
    {
        Normal,
        Shocked,
        Inmune,
    };
}

struct DiverState
{
    DiverState()
        : mode(DiverMode::Normal)
        , pos(0,0)
        , dir(0)
        , o2(1)
        , boostTime(0)
        , timeLeftInShock(0)
        , timeLeftImmune(0)
    {}

    DiverMode::Enum mode;
    Vec2 pos;
    float dir;
    float o2;

    float boostTime;
    float timeLeftInShock;
    float timeLeftImmune;
};

namespace DiverUtils
{
    void Update(float dt, const Vec2& tiltInput, GameState& state);
    void Shock(DiverState& diver);
    void StartSprint(DiverState& diver);
    void StopSprint(DiverState& diver);
}

The first thing that stands out is that I’m using a struct and putting related functions in a namespace. It may seem that’s just a convoluted way of writing a class with member functions, but there’s more to it than that.

By keeping the data in a struct instead of a class, I’m gaining several advantages:

  • I’m showing all the data there is and how big it is. Nothing is hidden.
  • I’m making it clear that it’s free of pointers and temporary variables.
  • I’m allowing this data to be placed anywhere in memory.

The fact that the functions are part of a namespace is not really defensible; it’s pure personal preference. It would have been no different than if I had prefixed them with DriverUtils_ or anything else, I just think it looks clearner. I do prefer the functions to be separate and not member functions though. It makes it easier to organize functions that work on multiple bits of data at once. Otherwise you’re stuck deciding whether to make them members of one structure or another. It also makes it easier to break up data structures into separate structures later on and minimize the amount of changes to the code.

Probably one of the biggest influences on me starting down this path was the famous article by Scott Meyers How Non Member Functions Improve Encapsulation. I remember being shocked the first time I read it (after having read religiously Effective C++ and More Effective C++). That reasoning combined with all the other changes over the years, eventually led to my current approach.

Since everything is in a structure and everything is public, there’s very little built-in defenses against misuse and screw-ups. That’s fine because that’s not a priority for me. Right now I’m the only programmer, and if I work with someone else, I expect them to have a similar level of experience than me. Some codebases written with a defensive programming approach have an amazing amount of code (and therefore complexity) dedicated to babysitting programmers. No thanks. I do make extensive use of asserts and unit tests to allow me to quickly make large refactorings though.

Another thing to note that might not be immediately obvious from the example above is that all functions are very simple and shallow. They take a set of input parameters, and maybe an output parameter or just a return value. They simply transform the input data into the output data, without making extensive calls to other functions in turn. That’s one of the basic approaches of data-oriented design.

Because everything is laid out in memory in a very simple and clear way, it means that serialization is a piece of cake. I can fwrite and fread data and have instant, free serialization (you only need to do some extra work if you change formats and try to support older ones). Not only that, but it’s great for saving the game state in memory and restoring it later (which I’m using heavily in my current project). All it takes is this line of code:

oldGameState = currentGameState

This style is a dream come true for Test-Driven Development (TDD). No more worrying about mocks, and test injections, or anything like that. Give the function some input data, and see what the output is. Done! That simple.

One final aspect of this code that might be surprising to some is how concrete it is. This is not some generic game entity that hold some generic components, with connections defined in XML and bound together through templates. It’s a dumb, POD Diver structure. Diver as in the guy going swimming underwater. This prototype had fish as well, and there was a Fish structure, and a large array of sequential, homogeneous Fish data. The main loop wasn’t generic at all either: It was a sequence of UpdateDivers(), UpdateFish(), etc. Rendering was done in the same, explicit way, making it extra simple to minimize render calls and state changes. When you work with a system like this, you never, ever want to go back to a generic one where you have very little idea about the order in which things get updated or rendered.

Beyond The Sample

To be fair, this sample code is very, very simple. The update function for a reasonable game element is probably larger than a few lines of code and will need to do a significant amount of work (check path nodes, cast rays, respond to collisions, etc). In that case, if it makes sense, the data contained in the structure can be split up. Or maybe the first update function generates some different output data that gets fed into later functions. For example, we can update all the different game entities, and as an output, get a list of ray cast operations they want to perform, do them all in a later step, and then feed the results back to the entities either later this frame or next frame if we don’t mind the added latency.

There’s also the question of code reuse. It’s very easy to reuse some low level functions, but what happens when you want to apply the same operation to a Diver and to a Fish? Since they’re not using inheritance, you can’t use polymorphism. I’ll cover that in a later blog post, but the quick preview is that you extract any common data that both structs have and work on that data in a homogeneous way.

 

What do you think of this approach? In which ways do you think it falls short, and in which ways do you like it better than your current style?

Great Presentation on Data-Oriented Design

Memory CPU gapA few days ago, Tony Albrecht posted the slides of his presentation titled “Pitfalls of Object-Oriented Design” [1]. Even though the title is really broad and could easily be misinterpreted, it’s not just a general bash on OOD. Instead, it’s very much focused on how object-oriented design is not a good match for high-performance apps (games) on modern hardware architectures with slow memory access and deep memory hierarchies. His proposed solution: Data-oriented design.┬áSpot on!

If you haven’t seen the presentation, go download it right now. It’s really well put together and he has some great detailed examples on how caches are affected with traditional object-oriented design vs. a more data-oriented approach.


[1] Thanks to Christer Ericson for pointing that out.