RuntimeError: CUDA out of memory. How can I set max_split_size_mb?

The max_split_size_mb configuration value can be set as an environment variable.

The exact syntax is documented in PyTorch's notes on CUDA memory management, but in short:

The behavior of the caching allocator can be controlled via the environment variable PYTORCH_CUDA_ALLOC_CONF. The format is PYTORCH_CUDA_ALLOC_CONF=<option>:<value>,<option2>:<value2>
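
For example, to set two options at once (garbage_collection_threshold is another allocator option documented in newer PyTorch releases, used here only to illustrate the comma-separated format):

PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512,garbage_collection_threshold:0.8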

Available options:

  • max_split_size_mb prevents the allocator from splitting blocks larger than this size (in MB). This can help prevent fragmentation and may allow some borderline workloads to complete without running out of memory. Performance cost can range from ‘zero’ to ‘substantial’ depending on allocation patterns. Default value is unlimited, i.e. all blocks can be split. The memory_stats() and memory_summary() methods are useful for tuning. This option should be used as a last resort for a workload that is aborting due to ‘out of memory’ and showing a large amount of inactive split blocks.

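Since the quoted bullet points at memory_stats() and memory_summary() for tuning, here is a minimal sketch of using them to check whether inactive split blocks are actually your problem (the tensor sizes are arbitrary, and this obviously needs a CUDA device):

```python
import torch

# Allocate and free a tensor so the caching allocator has cached blocks.
x = torch.empty(1024, 1024, device="cuda")
del x

# Human-readable report; look at the "inactive_split" rows.
print(torch.cuda.memory_summary())

# Programmatic access to the same counters.
stats = torch.cuda.memory_stats()
print("inactive split blocks:", stats["inactive_split.all.current"])
print("inactive split bytes: ", stats["inactive_split_bytes.all.current"])
```

If the inactive-split numbers are large right before the out-of-memory error, capping max_split_size_mb is worth trying.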
So you should be able to set the environment variable in a manner similar to the following:

Windows (cmd): set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

Linux/macOS: export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

(On Windows cmd, don't quote the assignment: the quotes would become part of the variable name and value. In a POSIX shell, quotes are harmless but unnecessary here since the value contains no spaces.)

Exactly how you set it depends on your OS and environment; since you're running on Google Colab, you might find this question helpful.
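
If you can't easily control the shell environment (as in Colab), a minimal sketch of setting the variable from Python itself; the allocator reads PYTORCH_CUDA_ALLOC_CONF when CUDA is first used, so to be safe set it before importing torch:

```python
import os

# Must happen before the first CUDA allocation; setting it before
# importing torch is the safest option.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch

x = torch.empty(1024, 1024, device="cuda")  # allocator now uses the setting
```

In a Jupyter or Colab notebook, the %env magic (%env PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512) achieves the same thing, again provided it runs before the first CUDA call.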
