
I weep for this period of time where we don't have sticky disks readily available for builds. Uploading the layer cache each time is such a coarse and time-consuming way to cache things.

Maybe building from scratch all the time is a good correctness decision? Maybe stale values on disks are a tricky enough issue to want to avoid entirely?

If you keep a stack of disks around and grab a free one when a job starts, you'd end up with a good speedup a lot of the time. If cost is an issue you can expire them quickly. I regularly see CI jobs spending >50% of their time downloading the same things, or compiling the same things, over and over. How many times have I triggered an action that compiled the exact same sqlite source code? Tens of thousands?
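
A rough sketch of the pool idea, with hypothetical names and in-memory bookkeeping (not any real provider's API):

```python
import time
from dataclasses import dataclass, field

# Hypothetical sketch of a pool of pre-warmed cache disks that CI jobs can
# borrow; names and the in-memory bookkeeping are illustrative, not a real API.

@dataclass
class CacheDisk:
    disk_id: str
    last_used: float = field(default_factory=time.time)
    in_use: bool = False

class DiskPool:
    def __init__(self, max_idle_seconds: float = 6 * 3600):
        self.disks: dict[str, CacheDisk] = {}
        self.max_idle_seconds = max_idle_seconds  # expire idle disks quickly if cost matters

    def add(self, disk_id: str) -> None:
        """Register a freshly provisioned (or reclaimed) disk."""
        self.disks[disk_id] = CacheDisk(disk_id)

    def acquire(self) -> CacheDisk | None:
        """Grab any free, non-expired disk; None means the job starts with a cold cache."""
        now = time.time()
        for disk in self.disks.values():
            if disk.in_use or now - disk.last_used > self.max_idle_seconds:
                continue  # busy or stale; a reaper can delete stale ones to save cost
            disk.in_use = True
            return disk
        return None

    def release(self, disk: CacheDisk) -> None:
        """Return the disk, now holding a warm layer/build cache, to the pool."""
        disk.in_use = False
        disk.last_used = time.time()
```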

Maybe this is fine, I dunno.




Yes, please do what Depot does and put fast persistent disks close to builds to cache Docker layers. GitHub Actions runners, CircleCI, and all the others adding expensive network calls to manually cache layers have always been such a time sink, and I think it pushes lots of people to remove caching entirely.

I've spent days trying all of these solutions at my company. All of these solutions suck: they are slow, and only successful builds get their layers cached. This is a dead end. The only workable solution is to have a self-hosted runner with a big disk.

> Building Docker images in CI today is slow. CI runners are ephemeral, so they must save and load the cache for every build.

>...persistent disks significantly lowers build time

Does this mean your solution places specific caches, like bazel, node_modules, .yarn, and other intermediate artifacts, onto a shared volume and reuses them across jobs?
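
For reference, one common way that kind of reuse gets wired up (a hedged sketch of the general pattern, not necessarily what this product does) is to point each tool's cache location at a volume that outlives the job:

```python
import os
from pathlib import Path

# Illustrative only: point well-known per-tool cache locations at a persistent
# volume (assumed mounted at /cache) so successive jobs can reuse them.
# Bazel would take the equivalent via a flag, e.g. --disk_cache=/cache/bazel.
CACHE_MOUNT = Path(os.environ.get("CACHE_MOUNT", "/cache"))

TOOL_CACHES = {
    "YARN_CACHE_FOLDER": CACHE_MOUNT / "yarn",
    "npm_config_cache": CACHE_MOUNT / "npm",
    "PIP_CACHE_DIR": CACHE_MOUNT / "pip",
}

def export_cache_env() -> dict[str, str]:
    """Create the cache dirs on the shared volume and return env vars for the build step."""
    env = {}
    for var, path in TOOL_CACHES.items():
        path.mkdir(parents=True, exist_ok=True)
        env[var] = str(path)
    return env
```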


All the CI/CD build agents with no cache, and so on. This is a general problem across tech. For the web, caching is cheap, but as far as I know there is no equally cheap way to cache builds.

I think there needs to be a redesign in how dependencies work in most programming languages. Deterministic builds have been such a game changer and I think that CPU vs bandwidth may be the next big area to explore when it comes to compiling code.


And CI with many-GB Docker images is very painful... Turning on layer caching usually makes the process even slower, since it needs to pull and unpack the previous image before starting, and if you turn it off you're downloading a ton of deps on every build.

If you separate the heavy stuff into a base image, you still have to load it at the beginning of CI, which, without beefy machines with local SSD caching, can take a loooong time.


There are a lot of things that waste cycles on a build machine, but they're worth the reproducibility. I think that problem would be better solved via caching, ideally.

I would hazard a guess that there are far fewer people these days who download a tarball and `./configure` `make` `make install` than there are distros (who often need to patch) and developers (who will be working from git anyway).


I want those. Why would devs not want fast build times and incremental compilation through caching?

I'm reminded of the waste many CI setups generate, where they download the same set of dependencies at the same versions for each run, with no effort put into caching them.

There is so much bandwidth used because of that (I've seen the numbers for some projects and it's HUGE).


Or even have a dedicated build server/farm with object caching. At some point building on your own machine is not a great solution.

Just the build caching is worth the price of entry.

Gradle especially does a great job at this.


this is pretty neat—it’s been a while since i’ve tried caching layers with gha. it used to be quite frustrating.

my previous experience was that in nearly all situations the time spent sending and retrieving cache layers over the network wound up making a shorter build step moot. ultimately we said “fuck it” and focused on making builds faster without (docker layer) caching.


Don't forget ccache - storing the cache on a fast disk, it can easily speed up a build by 10x.
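
For anyone who hasn't looked under the hood, the core idea is roughly this (a simplified sketch, not ccache's actual implementation): hash the compiler, flags, and preprocessed source, and reuse the stored object file on a hit.

```python
import hashlib
import shutil
import subprocess
from pathlib import Path

# Simplified illustration of the ccache idea, not its real implementation:
# key = hash(compiler + flags + preprocessed source), value = the object file.
CACHE_DIR = Path("/fast-disk/compile-cache")  # point this at a fast local disk

def cached_compile(compiler: str, flags: list[str], source: Path, output: Path) -> None:
    # Preprocess first so the key reflects headers and macros, as ccache does.
    preprocessed = subprocess.run(
        [compiler, "-E", *flags, str(source)],
        check=True, capture_output=True,
    ).stdout
    key = hashlib.sha256("\0".join([compiler, *flags]).encode() + preprocessed).hexdigest()
    cached = CACHE_DIR / key

    if cached.exists():
        shutil.copyfile(cached, output)  # hit: skip the compiler entirely
        return
    subprocess.run([compiler, "-c", *flags, "-o", str(output), str(source)], check=True)
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(output, cached)      # miss: compile, then populate the cache
```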

All great points, but in practice tools like Bazel and sccache are incredibly conservative about hashes matching, down to the file path on disk and even env var state.

One goal of these tools is to guarantee that such misconfiguration results in a cache key mismatch, rather than a hit and a bug.

There are tons of challenges designing a remote build cache product, like anything, but that one has turned out to be a reliable truth.
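
A toy illustration of that conservatism (not Bazel's or sccache's real keying scheme): fold everything that could affect the output, including the absolute path and env var state, into the key, so any discrepancy becomes a miss instead of a wrong hit.

```python
import hashlib
import os
from pathlib import Path

# Toy illustration (not Bazel's or sccache's actual scheme): anything that could
# change the output goes into the key, so a misconfiguration yields a miss, not a bad hit.
RELEVANT_ENV_VARS = ("CC", "CFLAGS", "PATH")  # illustrative subset

def action_cache_key(source: Path, command: list[str]) -> str:
    h = hashlib.sha256()
    h.update(source.read_bytes())                              # file contents
    h.update(str(source.resolve()).encode())                   # absolute path on disk
    h.update("\0".join(command).encode())                      # full command line
    for var in RELEVANT_ENV_VARS:
        h.update(f"{var}={os.environ.get(var, '')}".encode())  # env var state
    return h.hexdigest()
```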

Some other interesting insights:

- transmitting large objects is often not profitable, so we found that setting reasonable caps on what’s shared with the cache can be really effective for keeping transmissions small and hits fast

- deferring uploads is important because you can’t penalize individual devs for contributing to the cache, and not everybody has a fast upload link. making this part smooth is important so that everyone can benefit from every compile.

- build caching is ancient; Make does its own simple form of it, but the protocols for it vary greatly in robustness, from WebDAV in ccache to Bazel’s gRPC interface

- most GitHub Actions builds occur in a small physical area, so accelerating build artifacts is an easier problem than, say, full blown CDN serving

The assumptions that definitely help:

- it’s a cache, not a database; things can be missing, it doesn’t need strong consistency

- replication lag is okay because a build cache entry is typically not requested multiple times in a short window of time; the client that created it has it locally

- it’s much better to give a fast miss than a slow hit, since the compiler is quite fast

- it’s much better to give a fast miss than an error. You can NEVER break a build; at worst it should just not be accelerated.

It’s an interesting problem to work on for sure.
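
To make those assumptions concrete, here is a minimal client sketch against a hypothetical HTTP cache endpoint (not any particular product's protocol): cap uploads, keep misses fast, and turn every error into a miss so the build never breaks.

```python
import requests  # assumes a plain HTTP cache endpoint; real protocols vary (WebDAV, gRPC, ...)

MAX_OBJECT_BYTES = 8 * 1024 * 1024   # cap large objects to keep transmissions small
FAST_TIMEOUT_SECONDS = 0.5           # a fast miss beats a slow hit

class RemoteBuildCache:
    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def get(self, key: str) -> bytes | None:
        """Return the cached artifact or None; errors are misses, never build failures."""
        try:
            resp = requests.get(f"{self.base_url}/{key}", timeout=FAST_TIMEOUT_SECONDS)
            return resp.content if resp.status_code == 200 else None
        except requests.RequestException:
            return None  # never break the build; at worst it just isn't accelerated

    def put(self, key: str, blob: bytes) -> None:
        """Best-effort upload; ideally deferred so individual devs aren't penalized."""
        if len(blob) > MAX_OBJECT_BYTES:
            return  # transmitting large objects is often not profitable
        try:
            requests.put(f"{self.base_url}/{key}", data=blob, timeout=FAST_TIMEOUT_SECONDS)
        except requests.RequestException:
            pass  # uploads are optional; failures must be invisible to the caller
```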


Probably everybody knows that each test run and build downloads GBs of data, but they do it quickly, they cost little or no money, and it's easier to keep doing it than to set up and use a local cache (on CI, local dev machines, etc.). The only reason I ever saw some optimization at that level was because building the base image took too long, so we saved one and rebuilt it only when dependencies changed. I can't remember the details.

It's not that you have to, it's that you have many different builds that are going to stomp on each other's caches, plus your build services are often ephemeral, especially since I was at a small startup where we wanted to shut systems down overnight to save money.

Unfortunately, my team has some builds that take ~25 min without caching and maybe 2 min with caching.

I'm still not entirely sure why it's the case, but the connection to the package registry is incredibly slow, so downloading all dependencies takes forever.


I recently moved a project to a Docker build pipeline, and it redownloads all deps on each source file change, unlike efficient on-disk incremental compilation, because of how Docker layer caching works, so my usage skyrocketed (and my build times went from seconds to minutes).
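
For what it's worth, the mechanism is roughly that a layer's cache key folds in its parent layer and the hashes of whatever the instruction copies, so if sources get copied before (or alongside) the dependency-install step, any edit invalidates that layer and everything after it. A simplified model of the keying (not Docker's actual implementation):

```python
import hashlib

# Simplified model of layer-cache keying (illustrative, not Docker's actual code):
# a layer's key depends on its parent layer and, for COPY/ADD, the hashes of the
# files being copied. Once one layer's key changes, every later layer is rebuilt.

def layer_key(parent_key: str, instruction: str, copied_file_hashes: tuple[str, ...] = ()) -> str:
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    for file_hash in copied_file_hashes:
        h.update(file_hash.encode())
    return h.hexdigest()

# So if "COPY . ." comes before the dependency-install RUN, any edited source file
# changes the COPY layer's key, the install layer's parent changes, and the deps get
# fetched again. Copying only manifests/lockfiles before the install step keeps that
# layer stable across source edits.
```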

Something like this might be exactly what we need, but we don't make money building distributed build artifact caching systems. If we don't make money doing it, we aren't doing it :(

"Is this good for the company?"


This is fine if you treat your CI provider as a "dumb shell runner". But good CI platforms have actually useful features and APIs (e.g. caching), and if you want to use them, a simple Makefile isn't going to work. For projects where the difference between a cold- and warm-cache build is tens of minutes, those features bring meaningful quality-of-life improvements.

This may be a tradeoff you're ok with, but for a lot of people, it's not.

