Reducing garbage-collection pause time in a Haskell program

You’re actually doing pretty well to have a 51ms pause time with over 200MB of live data. The system I work on has a larger max pause time with half that amount of live data.

Your assumption is correct: the major GC pause time is directly proportional to the amount of live data, and unfortunately there’s no way around that with GHC as it stands. We experimented with incremental GC in the past, but it was a research project and didn’t reach the level of maturity needed to fold it into the released GHC.

One thing that we’re hoping will help with this in the future is compact regions: https://phabricator.haskell.org/D1264. It’s a kind of manual memory management where you compact a structure in the heap, and the GC doesn’t have to traverse it. It works best for long-lived data, but perhaps it will be good enough to use for individual messages in your setting. We’re aiming to have it in GHC 8.2.0.
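To make the idea concrete, here is a minimal sketch of compacting a long-lived structure. It assumes the API as it later shipped in the ghc-compact package (the GHC.Compact module with compact, getCompact and compactSize), which may differ from the patch linked above. The buildTable function and its contents are hypothetical stand-ins for real live data.

```haskell
import GHC.Compact (compact, compactSize, getCompact)
import qualified Data.Map.Strict as Map

-- Hypothetical long-lived lookup table, standing in for the real live data.
buildTable :: Int -> Map.Map Int String
buildTable n = Map.fromList [(i, show i) | i <- [1 .. n]]

main :: IO ()
main = do
  -- compact fully evaluates the structure and copies it into its own region.
  -- The GC then treats the region as one object rather than traversing it.
  region <- compact (buildTable 100000)
  size <- compactSize region
  putStrLn ("compact region size in bytes: " ++ show size)
  -- getCompact returns an ordinary reference to the data inside the region.
  let table = getCompact region
  print (Map.lookup 42 table)
```

Note that compact copies the data, so it costs time and memory up front, and it fails at runtime on structures containing functions or mutable objects. That trade-off is why it suits data that is built once and then read many times.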

If you’re in a distributed setting and have a load-balancer of some kind, there are tricks you can play to avoid taking the pause hit. You basically make sure that the load-balancer doesn’t send requests to machines that are about to do a major GC, and of course make sure that the machine still completes the GC even though it isn’t getting requests.
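One way this could look on the server side is sketched below. It is a variation on the trick rather than a description of any particular system: instead of predicting when a major GC is about to happen, the process takes itself out of rotation and forces one with performMajorGC from System.Mem. The Health type, the function names and the drain delay are all hypothetical, and how the flag reaches the load-balancer (for example through a health-check endpoint) is left out.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (finally)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Mem (performMajorGC)

-- Hypothetical flag that a health-check endpoint would report to the balancer.
newtype Health = Health (IORef Bool)

newHealth :: IO Health
newHealth = Health <$> newIORef True

isAccepting :: Health -> IO Bool
isAccepting (Health ref) = readIORef ref

-- Report unhealthy, wait for the balancer to notice and for in-flight
-- requests to drain, force the major GC, then report healthy again.
-- The flag is restored even if the wait or the GC is interrupted.
collectOutOfRotation :: Int -> Health -> IO ()
collectOutOfRotation drainMicros (Health ref) = do
  writeIORef ref False
  (threadDelay drainMicros >> performMajorGC) `finally` writeIORef ref True

main :: IO ()
main = do
  health <- newHealth
  collectOutOfRotation 100000 health
  isAccepting health >>= print
```

When to call collectOutOfRotation is a separate decision, for instance on a timer or once allocation since the last major GC passes some threshold. The drain delay has to be longer than the balancer's health-check interval for the request traffic to actually stop before the pause.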
