Bad News for Scons Fans

We have been talking a lot about Scons recently at the Power of Two Games World Headquarters. MSBuild has proven to be quite a pain to work with for our asset builds and eventually left us dissatisfied (that’s material for a whole other entry). So we kept looking over to Scons as a possible solution.

On paper it looks great:

  • Built from the ground up for parallel builds
  • Uses a real programming language (Python) instead of overly-complicated XML files
  • It allows for discovery of assets to build as other assets are parsed

The big question mark hanging over it was whether it was fast enough. I had done some experiments with Scons and other build systems a few years ago, and Scons totally failed by having ridiculously long incremental build times just checking if anything had changed. That was simply not acceptable. If no data has changed, I want the build system to detect it and come back in a second or less.

Since then, I’ve been told that Scons had finally fixed its dependency checking and it was much faster now. Music to my ears!

So I resurrected the old script I wrote back then to generate a nightmare stress test, downloaded the latest stable version of Scons (1.0.1) and fired it up.

I’m sorry to report that Scons is still as unusable as it was back then. I generated a codebase like the one I used to measure all the build systems (50 libraries, 100 classes each, 15 internal includes, 5 external includes). I built everything once, then ran it again and it took 49 seconds. I have some pretty nice hardware, including a fast SATA hard drive and a 2.8GHz Core2Duo with plenty of RAM, and 49 seconds to say that nothing needs to be changed just doesn’t cut it.

Now to be fair, right now we’re looking at Scons to build our assets, not our code. Maybe the nested project configuration is not particularly realistic, so I tried something simpler: 5000 files in a single library without any dependencies among each other at all. This is a very conservative example for an asset build of a large commercial game.

  • It’s only 5000 files. A real game can easily have 10 times as many asset files.
  • All the files are tiny (cpp files in this case). Assets can get very large when dealing with textures, sounds, and animations.
  • There are zero dependencies among assets in this example. You’ll often have models depending on shaders and textures, levels on game entity definitions, etc.
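For reference, a flat test tree like this one can be produced with a few lines of Python. This is a minimal sketch, not the script used for the measurements in this post; the function name `generate_flat_library`, the file naming scheme, and the file contents are all assumptions made for illustration.

```python
import os
import tempfile


def generate_flat_library(root, count):
    """Write `count` tiny, mutually independent .cpp files into `root`.

    Hypothetical helper: the names and contents are illustrative only.
    """
    os.makedirs(root, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(root, f"file_{i:05d}.cpp")
        with open(path, "w") as f:
            # No #include lines, so no file depends on any other file.
            f.write(f"int function_{i}() {{ return {i}; }}\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = generate_flat_library(os.path.join(tmp, "lib"), 5000)
        print(len(files), "files written")
```

Pointing a build system at the generated directory, building once, and then timing a second run gives the kind of no-change measurement discussed here.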

Building assets doesn’t have to be the instant, no-delay kind of build that I require when building code. I expect most developers will be building assets locally, tweaking things, rebuilding, and trying them in the game. The faster the build is, the better, but a delay of a few seconds can be acceptable. If Scons can’t handle this build in less than 10 seconds, it’s certainly doomed in a full-scale game.

The result: 37 seconds!

For purely masochistic reasons, I decided to see how it scales, so I threw 20,000 files at it (again, without any dependencies). Incremental build time without any changes: 3 minutes and 46 seconds. That’s over 6 times longer for processing 4 times the number of files, so it doesn’t even scale linearly.
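The arithmetic behind that comparison can be spelled out as below. The exponent estimate assumes the time grows as a power of the file count, which is only an assumption: two data points are enough for a rough feel, not for a real scaling law.

```python
import math

# Timings reported above, converted to seconds.
small_files, small_secs = 5000, 37
large_files, large_secs = 20000, 3 * 60 + 46  # 226 seconds

file_ratio = large_files / small_files  # 4.0
time_ratio = large_secs / small_secs    # a little over 6

# If time grew as files**k, two measurements give k = log(time) / log(files).
exponent = math.log(time_ratio) / math.log(file_ratio)

print(f"files x{file_ratio:.0f}, time x{time_ratio:.2f}, exponent ~{exponent:.2f}")
```

Linear scaling would give an exponent of 1; anything above that means each extra file costs more than the one before it.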

Scons is definitely too slow, and MSBuild has its share of problems. What’s left out there? Any recommendations for good parallel asset building systems? Hopefully something lightweight and easily configurable through a real language. Anything?

29 Comments

  1. I haven’t got around to playing with these tools yet, but perhaps you might find some time if you get desperate enough?!

    http://sham.sourceforge.net/
    http://www.cs.berkeley.edu/~billm/memoize.html

    They basically work off of the same idea: run the commands to do the build, monitoring the inputs and outputs with kernel hooks to generate dependencies information. No need to specify or calculate the dependencies explicitly via some heuristic hack.

    I imagine it would be pretty easy to implement a parallel build system on top of these. You could simply write your build instructions sequentially in a list with certain barrier points. All commands between barrier N and N+1 could be run in parallel. e.g.

    compile blah/foo.cpp blah/out/foo.obj
    compile blah/bar.cpp blah/out/bar.obj
    compile blah/baz.cpp blah/out/baz.obj
    compile guff/foo.cpp guff/out/foo.obj
    compile guff/bar.cpp guff/out/bar.obj
    compile guff/baz.cpp guff/out/baz.obj
    -------------------------------------------- # barrier
    make-library guff/out/*.obj libs/guff.lib
    make-library blah/out/*.obj libs/blah.lib
    -------------------------------------------- # barrier
    # etc …
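
The barrier idea in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the function `run_stages` is hypothetical, and plain Python callables stand in for the `compile` and `make-library` commands, which are not real tools here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stages(stages, max_workers=4):
    """Run each stage's jobs in parallel; a stage boundary acts as a barrier.

    `stages` is a list of lists of zero-argument callables (hypothetical
    stand-ins for the build commands in the listing above).
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stages:
            futures = [pool.submit(job) for job in stage]
            # Collecting every result before moving on is the barrier:
            # no job from the next stage starts until this stage is done.
            # result() also re-raises any exception from a failed job.
            results.append([f.result() for f in futures])
    return results


if __name__ == "__main__":
    log = []
    compile_jobs = [lambda n=n: log.append(f"compile {n}") or n
                    for n in ("foo", "bar", "baz")]
    link_jobs = [lambda: log.append("make-library") or "lib"]
    print(run_stages([compile_jobs, link_jobs]))
    print(log[-1])
```

A real version would launch external processes instead of callables, and would still need the traced dependency information to decide which commands can be skipped.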

  2. Could you make the generator for your test code available, or email it to me? I’m guessing it’s C or C++, so the number of preprocessor defines is important too.

    “(50 libraries, 100 classes each, 15 internal includes, 5 external includes). I built everything once, then ran it again and it took 49 seconds.”

    I’ve done a fair amount of work recently analyzing no-change (aka null, do-nothing, etc.) build times with SCons.

  3. I *HATE* (capital H) scons.

    My current project and my last project used it. The #1 issue I have with it is that no one seems to actually understand it. Because it’s just a Python library, and you have all of Python with which to hang yourself, every sub-project ends up being some personal hack of the guy who set up the build for that sub-project. Back in the make days, editing or adding something to a makefile was always relatively easy. Writing my own build system in Perl or Python has been relatively easy, but every time I see someone try to add something to scons it’s 4-8 hours of pain.

    One thing that many build systems seem to have problems with is dependencies that can’t be known before the build starts. For example, you don’t know what textures need to be processed until you build the Maya/Max file. So what you really want is

    .mb->.middleformat->.texturelist->[texture1, texture2, …]

    When .texturelist already exists and is up to date (i.e., newer than both .middleformat and .mb), it’s easy for a build system to follow the rules and build the textures. But most build systems I’ve used don’t like it when that list doesn’t exist yet and therefore has to be added after the build has started. Or to put it another way, most build systems start with only some programming language in mind and make the assumption that they can scan the text quickly to figure out implicit dependencies. That assumption is not true for assets.
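
One way to picture a build that tolerates late-discovered dependencies is a work queue that can grow while the build is running. The sketch below is a simplified illustration, not how any particular build system works: `build_with_discovery` and the `build_step` callback are hypothetical names, the asset graph is made up, and up-to-date checks are left out entirely.

```python
from collections import deque


def build_with_discovery(initial_targets, build_step):
    """Build targets from a work queue that can grow while the build runs.

    `build_step(target)` is a hypothetical callback that builds one target
    and returns any further targets it discovered along the way.
    """
    queue = deque(initial_targets)
    seen = set(initial_targets)
    order = []
    while queue:
        target = queue.popleft()
        discovered = build_step(target)
        order.append(target)
        for new_target in discovered:
            if new_target not in seen:  # build shared assets only once
                seen.add(new_target)
                queue.append(new_target)
    return order


if __name__ == "__main__":
    # Made-up asset graph mirroring the chain in the comment above.
    discovered_by = {
        "level.mb": ["level.middleformat"],
        "level.middleformat": ["level.texturelist"],
        "level.texturelist": ["texture1", "texture2"],
    }
    print(build_with_discovery(["level.mb"],
                               lambda t: discovered_by.get(t, [])))
```

The point of the sketch is that the texture targets are never declared up front; they only enter the queue once the step that produces the texture list has run.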

  4. My team recently migrated from the Autotools suite (*that’s* pain) to SCons. Our codebase isn’t that large (~175 code files, plus more for other stuff such as Doxygen), but we found SCons was, while slower than ‘make’, easily fast enough given the difference in ease of use.

    I just had a quick look at your test script, and I notice it’s not using any of SCons’ speed features. Reading your problem domain, one in particular might be useful. SCons uses objects called ‘Deciders’ to figure out if a file/dependency has changed. The default Decider uses MD5 sums of the file contents, which adds a *lot* of time overhead. The reason this is the default behaviour is explained here: http://www.scons.org/doc/production/HTML/scons-user.html#AEN815, but the important point is that it can be easily changed (using one of several built-ins or writing your own). There’s one that uses later-than-mtime instead, like make (flaw: it doesn’t catch the introduction of older files into the file tree), another that uses mtime-exact (any different mtime causes a rebuild), and several more. There are trade-offs involved with these, but we found that mtime-exact took 66% of the time of MD5 sums, which is not as fast as the others but takes a lot of the sting out.
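
The trade-offs between these strategies can be illustrated in plain Python. This is not SCons code and not how SCons implements its Deciders; the three function names below are hypothetical. In an SConstruct the built-in strategies are, to my knowledge, selected with calls such as `Decider('timestamp-newer')`, `Decider('timestamp-match')`, or `Decider('MD5')`.

```python
import hashlib
import os
import tempfile


def changed_mtime_newer(source, target):
    """make-style check: rebuild only if the source is newer than the target."""
    return os.path.getmtime(source) > os.path.getmtime(target)


def changed_mtime_exact(source, recorded_mtime):
    """Rebuild if the mtime differs at all from the one recorded last build."""
    return os.path.getmtime(source) != recorded_mtime


def changed_md5(source, recorded_digest):
    """Rebuild if the content hash differs; this must read the whole file."""
    with open(source, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() != recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.cpp")
        obj = os.path.join(tmp, "a.obj")
        with open(src, "wb") as f:
            f.write(b"int a;\n")
        with open(obj, "wb") as f:
            f.write(b"object code")
        recorded_mtime = os.path.getmtime(src)
        recorded_digest = hashlib.md5(b"int a;\n").hexdigest()

        # Replace the source with an *older* copy that has different content.
        with open(src, "wb") as f:
            f.write(b"int b;\n")
        older = os.path.getmtime(obj) - 100
        os.utime(src, (older, older))

        print("mtime-newer:", changed_mtime_newer(src, obj))
        print("mtime-exact:", changed_mtime_exact(src, recorded_mtime))
        print("md5:        ", changed_md5(src, recorded_digest))
```

The demo reproduces the flaw mentioned above: the make-style check misses an older file dropped into the tree, while the exact-mtime and content checks both notice it. Only the content check has to read every file, which is where the extra time goes.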

  5. Is kjam still alive? I have sent mail to him but got no response.

  6. Haha, this is quite old, but I figured you might like to try out Fbuild. It’s vaguely similar to SCons on the outside, but the internals are much different, and it’s quite fast.
