It is simple:

- `partitionColumn` is the column used to determine the partitions.
- `lowerBound` and `upperBound` determine the range of values over which the partition boundaries are computed. They are not used to filter rows: the complete dataset still corresponds to `SELECT * FROM table`, and rows with values outside the range end up in the first or last partition.
- `numPartitions` determines the number of partitions to create. The range between `lowerBound` and `upperBound` is divided into `numPartitions` parts, each with a stride equal to: `upperBound / numPartitions - lowerBound / numPartitions`
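
The stride formula can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated above: the function name `stride` is hypothetical, and it assumes the bounds are integers and that each division truncates toward zero before the subtraction.

```python
def stride(lower_bound, upper_bound, num_partitions):
    """Stride per the formula above (hypothetical helper, integer bounds assumed)."""

    def div(a, b):
        # Assumption: integer division truncating toward zero (b is positive).
        return a // b if a >= 0 else -(-a // b)

    return div(upper_bound, num_partitions) - div(lower_bound, num_partitions)


print(stride(0, 1000, 10))   # 100
print(stride(5, 14, 10))     # 1, although (14 - 5) // 10 would give 0
```

Because each bound is divided separately, the result can differ from `(upperBound - lowerBound) / numPartitions` when the bounds are not multiples of `numPartitions`.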
For example, if:

- `lowerBound`: 0
- `upperBound`: 1000
- `numPartitions`: 10

The stride is equal to 100 and the partitions correspond to the following queries:
SELECT * FROM table WHERE partitionColumn < 100 OR partitionColumn IS NULL
SELECT * FROM table WHERE partitionColumn >= 100 AND partitionColumn < 200
...
SELECT * FROM table WHERE partitionColumn >= 900
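
To see every per-partition query rather than just the first and last, the boundaries can be generated programmatically. The sketch below is not the actual implementation. It assumes half-open ranges (so that no row is read twice), open-ended first and last partitions, and non-negative integer bounds. The helper name `partition_predicates` is hypothetical.

```python
def partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Return one WHERE-clause string per partition (hypothetical helper)."""
    if num_partitions <= 1:
        # A single partition needs no predicate: it reads the whole table.
        return [None]

    # Assumes non-negative integer bounds, so // matches the stride formula.
    stride = upper_bound // num_partitions - lower_bound // num_partitions

    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        # The first partition has no lower limit, the last has no upper limit.
        lower = f"{column} >= {current}" if i != 0 else None
        current += stride
        upper = f"{column} < {current}" if i != num_partitions - 1 else None

        if upper is None:
            predicates.append(lower)
        elif lower is None:
            # Assumption: NULL values are assigned to the first partition.
            predicates.append(f"{upper} OR {column} IS NULL")
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates


for where in partition_predicates("partitionColumn", 0, 1000, 10):
    print(f"SELECT * FROM table WHERE {where}")
```

With the values from the example this prints ten queries. The union of their predicates covers every row exactly once, including rows below 0, above 1000, or with a NULL in `partitionColumn`.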