- You can use an anonymous function either directly in a `flatMap`:

      json_data_rdd.flatMap(lambda j: processDataLine(j, arg1, arg2))

  or to curry `processDataLine`:

      f = lambda j: processDataLine(j, arg1, arg2)
      json_data_rdd.flatMap(f)

- You can generate `processDataLine` like this:

      def processDataLine(arg1, arg2):
          def _processDataLine(dataline):
              return ...  # Do something with dataline, arg1, arg2
          return _processDataLine

      json_data_rdd.flatMap(processDataLine(arg1, arg2))

- The `toolz` library provides a useful `curry` decorator:

      from toolz.functoolz import curry

      @curry
      def processDataLine(arg1, arg2, dataline):
          return ...  # Do something with dataline, arg1, arg2

      json_data_rdd.flatMap(processDataLine(arg1, arg2))

  Note that I’ve pushed the `dataline` argument to the last position. It is not required, but this way we don’t have to use keyword args.

- Finally, there is `functools.partial`, already mentioned by Avihoo Mamka in the comments.
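
A minimal sketch of the `functools.partial` variant, assuming `dataline` stays the first parameter of `processDataLine`, so the extra arguments are bound by keyword. The function body, the argument values, and the plain list standing in for the RDD are placeholders for illustration only; with Spark you would pass `process` to `json_data_rdd.flatMap`.

```python
from functools import partial
from itertools import chain


def processDataLine(dataline, arg1, arg2):
    # Hypothetical body: emit one (arg1, token, arg2) tuple per token.
    return [(arg1, token, arg2) for token in dataline.split()]


# Bind the extra arguments by keyword; the result takes only `dataline`.
process = partial(processDataLine, arg1="a", arg2="b")

# With Spark this would be: json_data_rdd.flatMap(process)
# Here a plain list and itertools.chain stand in for the RDD and flatMap.
lines = ["x y", "z"]
result = list(chain.from_iterable(map(process, lines)))
```

If `dataline` is the last parameter instead, as in the `toolz` example, the arguments can be bound positionally with `partial(processDataLine, arg1, arg2)`.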