The great majority of the literature on warfare concentrates on topics such as formations, maneuvers, equipment, and training. What it often leaves out is the importance of supply lines. The most cunningly devised plan will be worthless in the long run if your supply lines fail.
The same can be said for large-scale C++ development. Most of the books and articles out there deal with class hierarchies, object-oriented design, and mind-bending template tricks. However, when it comes down to it, a solid physical structure and good code layout will go a long way towards making all programmers productive. When the milestones near and the pressure piles on, a badly structured C++ codebase is likely to be as fatal as the cold, Russian winter was to Napoleon’s army.
I’ve already written before about how important the physical layout of a codebase can be ([1] and [2]), so I’m not going to go over it again. This is going to be about how to use one of the many tools at our disposal to make working with a large codebase more bearable: pre-compiled headers.
Benefits and drawbacks
The only benefit of pre-compiled headers is build speed. Nothing more, nothing less. But we’re not talking about a measly 10-15% build speed improvement. Pre-compiled headers can easily improve build times by an order of magnitude or more depending on your code structure. Clearly, it’s a technique worth exploring.
The major benefit is when doing full rebuilds, but it can also help when building individual files. So it will help with the bane of the C++ programmer: iteration time (it won’t do anything to help with link times, though).
What’s not to like about pre-compiled headers? Several things, it turns out:
- Using a single set of pre-compiled headers exposes more symbols than necessary for many modules. That can lead to increased physical and logical dependencies.
- Modifying a header that is part of the pre-compiled headers set will trigger a full rebuild.
- Pre-compiled headers are not supported in every environment (although these days most compilers seem to support them to some extent).
- Bloated pre-compiled headers can slow things down. I only know this drawback from hearsay and minor anecdotal personal evidence. It sort of makes sense (the pre-compiled header file can get huge), but I would like to see some hard data. Has anybody measured this?
What’s the best way to deal with the drawbacks in order to be able to take full advantage of the super-speedy compile times?
How do pre-compiled headers work?
A C++ compiler operates on one compilation unit (cpp file) at a time. For each file, it runs the preprocessor (which takes care of resolving all the includes and “baking” them into the cpp file itself), and then it compiles the resulting module. Move on to the next cpp file, rinse and repeat. Clearly, if several files include the same set of expensive header files (files that are large and/or include many other header files in turn), the compiler will be doing a lot of duplicated work.
The simplest way to think of pre-compiled headers is as a cache for header files. The compiler can analyze a set of headers once, compile them, and then have the results ready for any module that needs them.
I won’t get into the specifics on how to set up pre-compiled headers; that’s very environment-specific. You might want to start with Bruce Dawson’s great article on pre-compiled headers on Visual C++, or the GNU documentation on pre-compiled headers with gcc.
In very general terms, you need to specify one file (here we’ll call it PreCompiled.h) as containing the pre-compiled headers. Anything included in that file will then become part of the pre-compiled headers cache. Then you just need to make sure to include PreCompiled.h in each module that should use the pre-compiled headers. One word of warning: Visual C++ ignores all lines in the cpp file before #include "PreCompiled.h", so it’s a good habit to make sure that’s the first line of code in each file.
Organizing your code
Unless you’re dealing with very small code bases, you should really split up your code into multiple libraries. Not only will you get the most benefit from pre-compiled headers that way, but you’ll reap a variety of other benefits. Then set up each library with one set of pre-compiled headers.
The big question now is: what should go in the pre-compiled headers? You can put anything you want, really. But by following a few simple guidelines, you can minimize your build times, which is what this is all about.
- Add “expensive” includes to the pre-compiled header file. “Expensive” headers are the ones that cause a cascade of other includes. Including these every time you compile a file can be quite time consuming. Some of the usual suspects are windows.h, STL headers, single includes for whole APIs, etc.
- Add headers that are included from many different files, even if they’re not very expensive. A header file that is included from 50 different files is almost as bad as a header file that is included once but causes 50 includes of its own. If anything, the simple file included from many cpp files is the better candidate: recompiling any one of those cpp files is going to be fast even if the header is not in the pre-compiled set, so iteration remains relatively fast.
- Don’t put any headers from the library itself in the pre-compiled headers. The only exception to this is if you have a header in your library that happens to be included everywhere (which is probably a sign that something is wrong anyway). Otherwise, every time you modify a header that was included in the pre-compiled headers file, you’ll cause a full rebuild.
So clearly, the best candidates to add to the pre-compiled header list are expensive includes that happen many times. Those are the ones that are going to get us the big speedup in build times.
At the same time, we don’t want to blindly add every header used by our library. That would make regenerating the pre-compiled headers slightly slower but, more importantly, would expose all the symbols in those headers to the whole library, which is undesirable if we’re trying to keep dependencies to a minimum. There’s also the potential issue of pre-compiled header bloat, but I still need to confirm that.
Since those guidelines can be applied automatically, I decided to write a script that reports the headers that would most benefit from being in the pre-compiled headers for a particular project. Instead of trying to parse the C++ code and find the chains of includes (which is not that hard, but requires dealing with include paths and #define directives, almost a full C preprocessor), the script uses the output of building the project with the option to show all includes (-H with gcc, /showIncludes with Visual C++). The includes are conveniently formatted with indentation corresponding to the level at which they were included, so it’s very easy for the script to determine which includes are expensive.
The script also accepts a string and ignores any include whose path contains that string. That way you can easily prevent headers from the library itself from being recommended as part of the pre-compiled headers. Unfortunately, gcc outputs the relative path that was used to reach each include, which makes it harder to filter out specific libraries and can result in the same header being counted under different names. To prevent that, you might want to process the output first and automatically change all the paths from relative to absolute (the script can’t do it because there are no guarantees you’re running it on the same platform as the build, let alone in the same directory on the same machine).
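That normalization step can be sketched in a few lines. This is only an illustration, not part of the actual script: it assumes gcc’s -H format (one leading dot per include depth, then the path) and that it runs from the directory where the build ran.

```python
import os
import re

# gcc -H lines look like ".. ../common/foo.h": dots for depth, then the path.
INCLUDE_LINE = re.compile(r"^(\.+) (.+)$")

def absolutize_line(line):
    # Rewrite the path as absolute so two relative spellings of the same
    # header (e.g. "src/../common/foo.h" and "common/foo.h") collapse into
    # a single entry when the includes are counted later.
    m = INCLUDE_LINE.match(line)
    if not m:
        return line  # pass through anything that isn't an include line
    dots, path = m.groups()
    return "%s %s" % (dots, os.path.abspath(path))
```

Running every line of the captured build output through a filter like this before counting makes the per-library filtering string match reliably.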
Right now the script can parse the output created from gcc and Visual C++, but it should be really easy to extend to any other compiler just by changing the regular expressions it uses to parse the include outputs.
As an example, I ran the script on the output of building one of the libraries in our game engine at High Moon. These are the results:
Counting includes in includes.txt
Cpp files: 38
(Header file, score, times included, includes caused by the header)
('file1.h', 11590, 190, 60)
('file2.h', 532, 532, 0)
('file3.h', 401, 401, 0)
('file4.h', 384, 4, 95)
('file5.h', 330, 33, 9)
('file6.h', 319, 11, 28)
('file7.h', 228, 228, 0)
('file8.h', 155, 155, 0)
('file9.h', 151, 151, 0)
('file10.h', 105, 35, 2)
('file11.h', 105, 105, 0)
('file12.h', 101, 101, 0)
Each include file is reported with a score, the number of times it is included in the program, and the number of other includes it causes. The score, the most important value for each file, is simply the product of the number of times a file is included and the number of includes it causes (plus one, to account for the file itself). So the higher the score, the more expensive an include is.
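A rough sketch of how such a scoring pass might work follows. This is not the actual list_precomp.py; the parsing and the sample data are illustrative, assuming gcc-style -H output where each include line is prefixed with one dot per nesting depth:

```python
import re
from collections import defaultdict

# Assumed input format: ". a.h" is a depth-1 include, ".. b.h" depth 2, etc.
INCLUDE_LINE = re.compile(r"^(\.+) (.+)$")

def score_includes(lines):
    times = defaultdict(int)   # how many times each header was included
    caused = defaultdict(int)  # most includes one inclusion of it pulled in
    open_headers = []          # stack of [header, sub-include count]

    def close_down_to(depth):
        # Pop headers whose include subtree has ended, recording their cost.
        while len(open_headers) >= depth:
            header, count = open_headers.pop()
            caused[header] = max(caused[header], count)

    for line in lines:
        m = INCLUDE_LINE.match(line)
        if not m:
            close_down_to(1)   # non-include line: a new compilation starts
            continue
        depth, header = len(m.group(1)), m.group(2)
        close_down_to(depth)
        times[header] += 1
        for entry in open_headers:  # every open ancestor caused this include
            entry[1] += 1
        open_headers.append([header, 0])
    close_down_to(1)

    # score = times included * (includes caused + 1), most expensive first
    rows = [(h, times[h] * (caused[h] + 1), times[h], caused[h]) for h in times]
    return sorted(rows, key=lambda row: -row[1])
```

Feeding it the -H lines from a build produces tuples in the same (header, score, times included, includes caused) shape as the reports above.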
Interestingly, at the top of the list we have an extremely expensive include that should absolutely be added to the pre-compiled header list. That alone should make a significant positive impact in build times. The rest of the top files listed could be added, although their impact would be much lower.
As another example, I ran the script on the results of building my current mp3 player, Amarok.
Counting includes in /usr/local/src/amarok-1.2.3/amarok/src/out.txt
Cpp files: 764
(Header file, score, times included, includes caused by the header)
('/usr/lib/qt3//include/qglobal.h', 46620, 630, 73)
('/usr/lib/qt3//include/qmap.h', 3740, 110, 33)
('/usr/lib/gcc/i586-mandrake-linux-gnu/3.4.1/../../../../include/c++/3.4.1/vector', 3182, 37, 85)
('/usr/lib/qt3//include/qstring.h', 2900, 116, 24)
('/usr/include/kconfigskeleton.h', 2280, 30, 75)
('/usr/include/klineedit.h', 1955, 17, 114)
('/usr/lib/qt3//include/qobject.h', 1605, 107, 14)
('/usr/include/kapplication.h', 850, 34, 24)
('/usr/lib/gcc/i586-mandrake-linux-gnu/3.4.1/include/stddef.h', 808, 808, 0)
('/usr/lib/qt3//include/qwinexport.h', 784, 784, 0)
('/usr/lib/qt3//include/qptrlist.h', 560, 112, 4)
('/usr/include/sys/types.h', 546, 42, 12)
('/usr/local/include/taglib/taglib.h', 432, 6, 71)
('/usr/lib/qt3//include/qnamespace.h', 420, 84, 4)
('/usr/include/kaction.h', 396, 22, 17)
('/usr/include/kguiitem.h', 216, 27, 7)
('/usr/include/kdirlister.h', 189, 9, 20)
('/usr/lib/gcc/i586-mandrake-linux-gnu/3.4.1/include/limits.h', 184, 184, 0)
('/usr/include/kfiledialog.h', 182, 7, 25)
('/usr/lib/gcc/i586-mandrake-linux-gnu/3.4.1/../../../../include/c++/3.4.1/fstream', 176, 4, 43)
('/usr/include/klistview.h', 161, 23, 6)
('/usr/include/kactioncollection.h', 156, 13, 11)
('/usr/include/kdiroperator.h', 144, 3, 47)
('/usr/include/time.h', 143, 143, 0)
('/usr/include/kpopupmenu.h', 143, 13, 10)
('/usr/include/bits/wordsize.h', 142, 142, 0)
('/usr/lib/qt3//include/qdir.h', 140, 28, 4)
('/usr/lib/qt3//include/private/qucomextra_p.h', 129, 43, 2)
('/usr/lib/qt3//include/qmetaobject.h', 129, 43, 2)
('/usr/lib/gcc/i586-mandrake-linux-gnu/3.4.1/../../../../include/c++/3.4.1/iostream', 126, 3, 41)
This is a larger project than the previous example (764 cpp files), so the potential gains in build time are also much higher. The top 10 headers are all included in many files as well as quite expensive, so they would all be great candidates to add to a pre-compiled header list. Considering that it took about 10 minutes to build on my 3GHz machine, I’d say it would definitely benefit from some judicious use of pre-compiled headers.
Multiplatform development
A lot of C++ compilers support pre-compiled headers nowadays (gcc, Visual C++, and CodeWarrior for sure). Unfortunately, there are still compilers and platforms without pre-compiled header support that we need to deal with.
Usually, the code will build the same on a platform without pre-compiled header support, but it will be much, much slower: not only do we lose the caching effect of pre-compiled headers, but we’re also including more headers in every compilation unit through the PreCompiled.h file. So if we’re not careful, all the build speedups we gained with pre-compiled headers will come back to slow down builds on unsupported platforms by a huge factor.
The best way to deal with this is to be able to turn pre-compiled headers on and off. For example, your pre-compiled header file might look like this:
#ifdef USE_PRECOMPILED_HEADERS
#include <string>
#include <sstream>
#include <iostream>
#endif
For platforms without pre-compiled header support, just don’t define USE_PRECOMPILED_HEADERS. Clearly, for this to work, every cpp file needs to include all the header files it needs, independently of whether they’re also included in PreCompiled.h. This is not a bad habit to get into anyway, because it makes dependencies more explicit, and it doesn’t slow the build down at all (those header files are already in the pre-compiled headers, so they’re free).
Having the ability to turn it on and off also allows us to do builds sometimes without relying on pre-compiled headers. This might be useful if we want to verify that files can compile on their own or if we want to generate lists of includes to feed to the script described earlier to find good candidates for adding to the pre-compiled header list.
Maintenance
Pre-compiled headers are mostly fire and forget. You set them up once with the big offenders and leave them be. However, the more a library changes, the more it might benefit from an update of the pre-compiled headers. Whenever a library feels like it’s building slowly, you should see if there are any obvious headers that should be added to the pre-compiled header list.
That’s one reason I like to display the build times for each library (in my case, I like to include unit-test times as well). It’s too easy to let a library build more and more slowly over time without ever realizing it, like the proverbial frog in slowly boiling water. But if you have hard data, you can see that the library now takes 10 seconds to build when last week it only took 6 (it doesn’t seem like much, but when you have 50+ libraries it quickly adds up!). For bonus points, I want my automated build system to keep historical data of build times and display a plot over time, which will make zeroing in on build slowdowns much easier.
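A minimal sketch of that kind of bookkeeping might look like the following. The file name, CSV layout, and 25% threshold are all made up for illustration; it assumes your build script can time each library’s build:

```python
import csv
import os
import time

HISTORY_FILE = "build_times.csv"  # hypothetical log, one row per build

def record_build_time(library, seconds, history_file=HISTORY_FILE):
    """Append this build's time and report whether it grew noticeably."""
    previous = None
    if os.path.exists(history_file):
        with open(history_file) as f:
            for row in csv.reader(f):
                if row[0] == library:
                    previous = float(row[2])  # last recorded time wins
    with open(history_file, "a", newline="") as f:
        csv.writer(f).writerow([library, int(time.time()), seconds])
    # Flag anything more than 25% slower than the previous recorded build.
    if previous is not None and seconds > previous * 1.25:
        print("%s slowed down: %.1fs -> %.1fs" % (library, previous, seconds))
        return True
    return False
```

Even something this crude catches the 6-second library that quietly became a 10-second library, and the CSV gives you the historical data for plotting later.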
If you’re interested in maintaining compatibility with other platforms, I’d recommend doing a build with pre-compiled header support turned off every so often (once a night or even once a week), and make sure everything works fine. While you’re at it, take a pass at the code with PCLint, and you’ll be horrified at how many dangerous things you missed.
There really are almost no excuses not to use pre-compiled headers in a large C++ project. If you’re not using them, you’re doing yourself a disservice. And if you’re already using them, you might be able to tweak them a bit and squeeze an extra 10-20% speedup out of your builds. Whatever you do, make sure to keep those build times low to have as fast an edit-build-run cycle as possible.
The Care and Feeding of Pre-Compiled Headers
The in-article link to ‘list_precomp.py’ is broken. 🙂
Fixed. Thanks 🙂
This was the kick in the ass I needed to finally learn to use precompiled headers – thanks 🙂
Quote: “Bloated pre-compiled headers can slow things down.”
They certainly can. And with Visual C++ 7.1, which we’re using at work, you can quite easily get large precompiled headers when using a lot of templates and especially template metaprogramming in the header files used in the precompiled header.
For one of our projects that uses a lot of template metaprogramming with Boost.MPL, we had a precompiled header file of around 200 MB. Not only did build times get quite a bit longer, but compiler memory usage also increased dramatically. Large precompiled header files can also trigger the error “C1204 compiler limit: internal structure overflow” in VC 7.1. However, Microsoft assures us that this is fixed in VC 8.0. After reducing the number of precompiled headers used, build times went back to normal and the compiler errors disappeared.
Of course, templates generally cause longer build times by themselves – especially with a lot of dependent template instantiations. But in the case of this particular project, precompiled headers didn’t reduce but rather increased build times.
/Y3 where have you been all my life
‘list_precomp.py’ actual location is http://www.gamesfromwithin.com/wp-content/uploads/bin/list_precomp_py.txt
I referenced this post in my post titled ‘Speeding up build times in Visual Studio 2008’, found here: http://aprogrammers.blogspot.co.il/2009/05/speeding-up-build-times-in-visual.html
Hello. What’s your comment on ccache vs. pre-compiled headers?
Personally, I think I can drop pre-compiled headers when I have ccache.
On badly structured and very large projects, it seems to me that ccache is easier to set up than pre-compiled headers, especially when the project cannot be built with an IDE like Visual C++ that has built-in support for pre-compiled headers.
The use case of ccache is completely different, I believe, as it reuses a compilation result only if the whole cpp file is identical after preprocessing (as far as I know). I don’t think that’s what typically happens when developing. Typically you change the cpp file, but you don’t touch those huge template libraries that you use.
You’re right that ccache solves a different problem. However you actually get huge wins in your day-to-day development (not just when doing clean builds). A couple of examples:
1) When you switch between branches in Git, you will find that ccache kicks in when you switch back to a branch that you have built before.
2) When you make experimental changes in commonly used headers (e.g. change a config/constant/define or change the type of a function argument, etc), and then when you revert your changes – Boom! ccache!
The really nice thing with ccache is that, unlike PCH, it’s non-intrusive.
To auto-generate stdafx.h finding the chain of includes: https://github.com/g-h-c/pct