Strategy for partitioning dask dataframes efficiently

As of Dask 2.0.0 you may call .repartition(partition_size="100MB").

This method performs an object-considerate (.memory_usage(deep=True)) breakdown of partition size. It will join smaller partitions, or split partitions that have grown too large.

Dask’s Documentation also outlines the usage.

Leave a Comment

Hata!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)