How to gain control of a 5GB heap in Haskell?

Large memory usage with occasional CPU spikes is almost certainly the GC kicking in. You can check whether this is the case by passing RTS options (between +RTS and -RTS on the command line): -B makes the runtime beep whenever there is a major collection; -t prints summary statistics when the program exits (in particular, see if the GC times are really long); and -Dg turns on debugging output for GC calls (though you need to link with -debug).
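If you would rather read the same numbers from inside the program than from the -t summary, base exposes them through GHC.Stats. Here is a minimal sketch, assuming the binary is built with -rtsopts and started with +RTS -T so that statistics are collected. The file name, the helper gcShare, and the stand-in workload are made up for illustration.

```haskell
-- Build and run (the file name is hypothetical):
--   ghc -O2 -rtsopts GcCheck.hs
--   ./GcCheck +RTS -T -RTS
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Fraction of wall-clock time spent in the collector (0 if nothing elapsed).
gcShare :: Int64 -> Int64 -> Double
gcShare gcNs totalNs
  | totalNs <= 0 = 0
  | otherwise    = fromIntegral gcNs / fromIntegral totalNs

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; rerun with +RTS -T -RTS"
    else do
      -- Hypothetical stand-in for the real workload.
      print (length (show (product [1 .. 3000 :: Integer])))
      s <- getRTSStats
      putStrLn ("collections:       " ++ show (gcs s))
      putStrLn ("major collections: " ++ show (major_gcs s))
      putStrLn ("max live bytes:    " ++ show (max_live_bytes s))
      putStrLn ("GC share of time:  " ++ show (gcShare (gc_elapsed_ns s) (elapsed_ns s)))
```

If the GC share is small even during a spike, the collector is probably not the culprit and the options below are unlikely to help.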

There are several things you can do to alleviate this problem:

  • On the initial import of the data, GHC is probably wasting a lot of time growing the heap. You can tell it to grab all of the memory you need at once by specifying a large -H (the suggested heap size).

  • Large, stable data will get promoted to an old generation. If you increase the number of generations with -G, you may be able to keep the stable data in the oldest, very rarely collected generation, with the more traditional young and old generations in front of it.

  • Depending on the memory usage of the rest of the application, you can use -F to tweak how much GHC will let the old generation grow before collecting it again. You may be able to tune this parameter so that the stable data is effectively never collected.

  • If there are no writes, and you have a well-defined interface, it may be worthwhile to take this memory out of GHC's hands entirely (use the C FFI) so that there is no chance of a super-GC ever.
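To make the last suggestion concrete, here is a minimal sketch of keeping a read-only table outside the GHC heap, using the Foreign modules that ship with base. The Table type and the functions newTable, lookupTable, and freeTable are hypothetical names for illustration. The buffer comes from C's malloc, so the collector never scans or copies it, but you become responsible for freeing it.

```haskell
import Control.Monad (forM_)
import Data.Word (Word64)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff, sizeOf)

-- A fixed-size table of Word64 values stored in malloc'd memory.
data Table = Table
  { tableLen :: Int
  , tablePtr :: Ptr Word64
  }

-- Copy the values into unmanaged memory once, at load time.
newTable :: [Word64] -> IO Table
newTable xs = do
  let n = length xs
  p <- mallocBytes (n * sizeOf (undefined :: Word64))
  forM_ (zip [0 ..] xs) $ \(i, x) -> pokeElemOff p i x
  pure (Table n p)

-- Bounds-checked read; Nothing for an index outside the table.
lookupTable :: Table -> Int -> IO (Maybe Word64)
lookupTable (Table n p) i
  | i < 0 || i >= n = pure Nothing
  | otherwise       = Just <$> peekElemOff p i

-- Release the buffer; the table must not be used afterwards.
freeTable :: Table -> IO ()
freeTable = free . tablePtr

main :: IO ()
main = do
  t <- newTable [10, 20, 30]
  lookupTable t 1 >>= print  -- prints "Just 20"
  lookupTable t 7 >>= print  -- prints "Nothing"
  freeTable t
```

A real 5GB structure would need a flat, pointer-free encoding of your data for this to work, which is where the "well-defined interface" requirement comes in.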

These are all speculation, so please test with your particular application.
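When experimenting, it helps to confirm that the flags you passed actually reached the runtime. A small sketch, assuming your base version exports GHC.RTS.Flags (check the documentation for your GHC release, since this module's status has changed over time). The describeGC helper and the file name are made up for illustration, and the heap size suggestion is printed in whatever units the RTS stores it in.

```haskell
-- Build and run (file name and flag values are only examples):
--   ghc -O2 -rtsopts ShowFlags.hs
--   ./ShowFlags +RTS -H1G -G3 -F4 -RTS
import GHC.RTS.Flags (GCFlags (..), getGCFlags)

-- Render the GC settings discussed above, one per line.
describeGC :: GCFlags -> [String]
describeGC f =
  [ "generations (-G):               " ++ show (generations f)
  , "old generation factor (-F):     " ++ show (oldGenFactor f)
  , "heap size suggestion (-H), raw: " ++ show (heapSizeSuggestion f)
  , "beep on major GC (-B):          " ++ show (ringBell f)
  ]

main :: IO ()
main = getGCFlags >>= mapM_ putStrLn . describeGC
```

Run it once without any RTS options to see the defaults, then compare after each change.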
