I use bazel at work heavily in a very large project. I’ve always worked with scripting languages so I’m much less familiar with build systems.
I’m not sure I get what makes Bazel so good. It seems pretty simple to me. You have a bunch of directories with BUILD files that are each sort of like Makefiles.
Am I missing something? It kind of just seems like a hodgepodge of scripts. I don’t dislike it, but I’m also not seeing anything amazing.
I’ll take a stab at it, since I’ve done some migrations to Bazel (and also away from Bazel). The comparison to Make and makefiles is good because Make, unlike some other build systems, is mostly declarative. Most of your makefile is going to declare what the inputs and outputs are.
If you use Make long enough, there are some obvious improvements you want. Multiple outputs, rebuild when options change, and easier cross-compiling are the top ones. Various build systems attempt to add these features. In my mind, Ninja is the only build system that added these features well, and it worked because Ninja removed all the other features to focus on just the build process (as opposed to specification / configuration).
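To illustrate that minimalism, here is a sketch of what a Ninja build file looks like (file names and flags hypothetical). Note that there is no conditional logic, no globbing, no configuration — just rules and a dependency graph, which is exactly why a generator like CMake or Meson is expected to write this file for you:

```ninja
# build.ninja -- hypothetical minimal example
cflags = -O2 -Wall

rule cc
  command = gcc $cflags -MD -MF $out.d -c $in -o $out
  depfile = $out.d
  description = CC $out

rule link
  command = gcc $in -o $out

build pear.o: cc pear.c
build apple.o: cc apple.c
build app: link pear.o apple.o
```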
If you think about these problems with Make, you realize that it kind of boils down to one big thing: you want your build system to always rebuild when necessary, and you want it to almost never rebuild when unnecessary. (Plus the bit about cross-compiling.)
Other build systems rely on the developer writing the build scripts to just “get it right”. Bazel is different because it sandboxes the rules to enforce hermeticity. In Make, I can include "pear.h" which includes "orange.h", but let’s suppose that "orange.h" is actually a generated source file… now, try writing this out in a Makefile (if you’re a masochist, say you’re cross-compiling). Yes, "orange.h" should be declared as an input to anything that includes "pear.h", but in practice, developers are going to screw it up. At that point you can end up with a build that uses two different versions of "orange.h".
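To make the trap concrete, here is a hypothetical Makefile sketch of the scenario above (pear/orange names taken from the example). It builds fine until someone regenerates orange.h:

```make
# orange.h is a generated header, produced from orange.def.
orange.h: orange.def gen-header
	./gen-header orange.def > orange.h

# pear.c includes "pear.h", which in turn includes "orange.h".
# Unless orange.h is listed here explicitly, make will not rebuild
# pear.o when orange.h changes:
pear.o: pear.c pear.h        # <- orange.h missing; stale builds possible
	$(CC) -c pear.c -o pear.o
```

Nothing in Make will ever complain about the missing prerequisite — the build quietly keeps using the old object file.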
Bazel sandboxes the commands so that any rule not declared to depend on "orange.h" will not be able to open "orange.h" at all. The process won’t see the file at all.
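A hypothetical BUILD-file sketch of the same setup shows why the failure mode changes. A compile action can only see files reachable through its declared deps, so forgetting the dependency breaks the build loudly instead of producing a stale binary:

```python
# BUILD (hypothetical names) -- Starlark
genrule(
    name = "gen_orange_h",
    srcs = ["orange.def"],
    outs = ["orange.h"],
    cmd = "$(location //tools:gen-header) $< > $@",
    tools = ["//tools:gen-header"],
)

cc_library(
    name = "orange",
    hdrs = ["orange.h"],
)

cc_library(
    name = "pear",
    hdrs = ["pear.h"],
    deps = [":orange"],  # drop this line and the sandbox hides orange.h:
                         # the #include fails at compile time instead of
                         # silently picking up a stale copy
)
```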
This opens the door for all sorts of optimizations and query features that are simply unreliable if you have to trust that human developers are writing the rules correctly. These optimizations, for large projects, result in radical build time improvements. For most build systems, a shared build cache would come with a risk of bad cache entries, but with Bazel, the risk is substantially lower. It’s also easier to get reproducible builds, which make it substantially easier to do certain types of auditing.
Bazel (and Bazel derivatives/alikes) is not unique in fixing all of these problems. For example, there is build2, which is arguably a lighter-weight solution that is closer to Make in spirit (not in syntax): no Java dependency, etc. Here is an intro, if anyone is interested: https://build2.org/build2/doc/build2-build-system-manual.xht...
It doesn't do Bazel-style sandboxing where your entire compiler toolchain is part of your build system. It does what we call high-fidelity builds: it tracks changes not only to inputs but also to compile options, the compiler itself (checksum), environment, etc., and if any of these auxiliary dependencies change, it triggers a rebuild. There are advantages and disadvantages to both approaches, with build2's being lighter weight.
The OP noted that hermeticity is required for fast builds, and that humans can't be relied upon to provide hermeticity as a property. How does build2 guarantee hermeticity or otherwise support fast builds without a clean-room solution a la Bazel?
The OP was vague on why exactly hermetic builds are required to achieve fast builds. The only concrete thing they mentioned is caching which doesn't require hermetic builds (in the strict sense, as in preventing any outside changes) provided you can detect changes accurately.
To give a specific example, Bazel may prevent you from accidentally using a different version of the compiler while build2 will detect that you are attempting to use a different version.
Caching depends on a correct model of the dependency tree. If a dependency isn't included in the model, then a non-hermetic tool will silently succeed with an incorrect result, while a hermetic tool will do the correct thing and fail the build. Keeping the model accurate is too hard for humans, so in practice, non-hermetic build tools can't cache correctly.
It would be interesting to see how build2 achieves these things (hermeticity, cross-building, shared caches). I can’t see anything in the linked documentation that mentions these problems at all.
It currently doesn't do fully hermetic configurations (as in, where it is impossible for any inputs to come from outside the project). Instead build2 does high-fidelity builds, where it makes sure that if any of the inputs change (including options, environment, the compiler itself, etc.), the target gets rebuilt. See my reply to a sibling comment for some details. We do plan to add support for hermetic builds, though it won't be exactly like Bazel's -- the idea is to prevent and, where not possible, detect external changes and fail rather than rebuilding.
Regarding caching (and distributed compilation), this is currently on the TODO list though a lot of the infrastructure is already there. For example, the same change detection that is used for high-fidelity builds will be used to decide if what's in the cache is usable in any particular build.
While I agree we should mention these points in the documentation (things are still WIP on that front), I don't think cross-compilation deserves mentioning: for any modern build system it should just work. In build2 we simply implement things in the "cross-compile first" way, with native compilation being a special case (host == target).
and then, after a successful build, you create a new file named "foo.h" in a directory earlier in the search path than the foo.h that was used in the previous compile?
It does, but not in the way you probably expect it to be handled, i.e., with some filesystem mechanism that makes the update command see only what has been declared as the target's dependencies -- I must admit I don't know how Bazel does this in a cross-platform manner (Windows, Mac OS); copying seems way too heavy-handed.
In any case, in build2 this is "handled" by not including headers as "foo.h" but as <libfoo/foo.h>, that is, with the project prefix. You can read more on this here: https://build2.org/build2-toolchain/doc/build2-toolchain-int... And the proper fix will hopefully come with C++20 modules.
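Concretely, the convention looks like this (library and header names hypothetical):

```cpp
// A bare include resolves via the header search path, so whichever
// foo.h happens to be found first wins:
//
//   #include "foo.h"
//
// The project-prefixed form is unambiguous: it can only match the
// foo.h that ships with libfoo:
#include <libfoo/foo.h>
```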
You get a lot of features for very cheap compared to other systems. That's the best way I can describe it.
Bazel is basically cross-platform out of the box if one is careful with it: that includes a consistent build organization across platforms, cross builds if configured, and so on. It can build a single library declaration for mobile, desktop, embedded, and web in one workspace; try that with CMake.
Bazel has a universal package system (e.g. download an archive or git repo) that allows custom ecosystems (yes, some of which are not great yet) to exist regardless of what is normal in that ecosystem. This is especially notable for C++, where using CMake as a package system is a nightmare; with Bazel I can just download any C++ repo off the internet and ignore its CMake file in favor of my own BUILD file. Also notably, it's the first build system for C++ that hasn't required me to build or configure Boost myself: someone can run bazel build against a repo of mine with Boost in it without even knowing what Boost is, and it just builds.
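As a sketch of that workflow (names, URL, and checksum placeholder are all hypothetical), pulling in an arbitrary C++ repo with your own BUILD file looks roughly like this:

```python
# WORKSPACE (hypothetical example)
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "somelib",
    urls = ["https://example.com/somelib-1.2.3.tar.gz"],
    sha256 = "...",                      # pin the exact archive contents
    strip_prefix = "somelib-1.2.3",
    build_file = "//third_party:somelib.BUILD",  # your BUILD file, so the
                                                 # repo's own CMake setup
                                                 # is never consulted
)
```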
Which brings me to hermeticity and reproducibility. If one is careful, running Bazel always uses the same code for a platform (I've never had or seen weird "on my machine" issues with it, which is impressive; reverting/stashing changes has always gotten people a working build again). All of the sources are version-pinned, and so on. Getting to fully hermetic takes some work, but it's possible, which is nice. A side effect of all this is that my instructions for a Bazel project are usually: install Bazel, run build; and it just works! Yes, parts of the ecosystem suck and break this, but that's a work in progress.
There are other features, like the query system, the test runner, the macro system, the local override idiom, the parallel build. The point is it really is a build tool for whatever needs to be built, however it needs to be built, and not just a scripting language useful for building things.
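A taste of the query system and test runner mentioned above (target labels hypothetical; the query functions and flags are standard Bazel):

```shell
# Everything //app:server depends on, transitively:
bazel query 'deps(//app:server)'

# Everything in the workspace that depends on //lib:orange
# (useful for "what breaks if I change this?"):
bazel query 'rdeps(//..., //lib:orange)'

# Run every test in the workspace except those tagged "flaky":
bazel test //... --test_tag_filters=-flaky
```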
My experience is that I realized the value when I switched to a project that didn’t use it. Especially when I wanted to use other Google technologies like protos, or when I wanted to incrementally rebuild dependencies.
Disclosure: I work at Google, with Blaze, but not on it, or on Bazel. All opinions mine.