Building Software is Hard (for good reasons) (evanjones.ca)

[ 2012-January-14 12:46 ]

An important but often overlooked part of engineering quality software is building the software. By "building" I mean the act of taking source code and converting it into something executable. The process varies depending on the programming language, but in my experience it works like this: when a project is started, the build doesn't matter. You use the "standard" tools for the language you are using, builds are fast since there isn't much code, and because there are few people collaborating it doesn't matter if it takes multiple steps to get the build to work on a new system. However, as the project grows and adds dependencies, the setup process gets longer. It starts taking forever to rebuild when a single file changes. There are mysterious errors that go away after a clean build from scratch. It turns out that getting a complex build system to work well is hard. I used to think the reason was that the tools were bad. I've since learned more about this, and I've changed my mind. Building software is hard because there are many policy decisions that need to be made, and there are no "correct" answers. As a result, the tools are very configurable and complicated, and each project works slightly differently.

This complexity is the reason I like the Ninja build system. It takes the opposite stance: it is explicitly not very configurable. Evan Martin created it because SCons and GNU Make were both too slow to compile Google Chrome, taking 10 seconds before even starting a compile. Ninja, on the other hand, takes less than a second because it is engineered from the ground up for fast incremental builds. To achieve this it has a fairly minimal design: it only traverses a dependency graph. If an output file is missing, or any of its input files are newer than it, then Ninja executes the corresponding build rule. Unlike other systems, it has no built-in knowledge about how any language is compiled, so every step must be specified explicitly.
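To make the core check concrete, here is a minimal sketch in Python of a timestamp-based "is this target out of date?" test. This is only an illustration of the general idea, not Ninja's actual implementation; the function name needs_rebuild and its arguments are made up for this example.

```python
import os


def needs_rebuild(inputs, outputs):
    """Return True if the build rule producing `outputs` should run.

    Hypothetical helper for illustration: a rule is considered stale when
    an output is missing or when any input is newer than the oldest output.
    """
    # A missing output always forces the rule to run.
    if not all(os.path.exists(path) for path in outputs):
        return True
    # With no inputs there is nothing that could have changed.
    if not inputs:
        return False
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return newest_input > oldest_output
```

A build tool built on this idea would walk the dependency graph from the requested targets down to the source files and run each rule for which a check like this returns True.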

As a result, Ninja is intended to be used with another tool that generates the build rules (a "meta build system"). After having worked with it, I think this is actually a fairly elegant way to divide the problem space. Every build tool needs to compare inputs and outputs to determine if they are up to date, which is all that Ninja does. This means the higher-level policy decisions about how code should be built are left to another tool. As a result, I suspect Ninja could be used as the backend for most other build tools.
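As a rough illustration of that division of labor, here is a toy "meta build system" in Python that makes the policy decisions (which compiler, which flags, how source files map to object files) and writes them out as explicit Ninja rules. The function generate_ninja, the default compiler and flags, and the file names are all assumptions for this sketch; it also ignores details a real generator must handle, such as header dependencies and escaping paths that contain spaces.

```python
import os


def generate_ninja(sources, compiler="gcc", flags="-O2 -Wall", binary="app"):
    """Return the text of a build.ninja file for a list of C source files.

    Hypothetical generator for illustration only.
    """
    lines = [
        f"cflags = {flags}",
        "",
        "rule cc",
        f"  command = {compiler} $cflags -c $in -o $out",
        "  description = CC $out",
        "",
        "rule link",
        f"  command = {compiler} $in -o $out",
        "  description = LINK $out",
        "",
    ]
    objects = []
    for src in sources:
        # Policy decision made here, not by Ninja: foo.c compiles to foo.o.
        obj = os.path.splitext(src)[0] + ".o"
        objects.append(obj)
        lines.append(f"build {obj}: cc {src}")
    lines.append(f"build {binary}: link {' '.join(objects)}")
    lines.append(f"default {binary}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_ninja(["main.c", "util.c"]), end="")
```

Writing the result to a file named build.ninja and running ninja in the same directory would then build the binary. All of the knowledge about C lives in the generator; the output is just a list of commands with their inputs and outputs.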

To explore this hypothesis, I've started using Ninja for a medium-sized project that involves multiple programming languages and a small number of third-party dependencies. Already I have discovered that the programmer's view of how code should be built does not always translate directly to the underlying build commands. In particular, the build models for different languages are very different. Two especially hard problems are how to deal with third-party dependencies and how to make the build portable. I hope to write about these issues in detail soon, as I get more experience with the build model for different languages.