PySpark serialization EOFError

The error appears to be raised in PySpark's read_int function. Its code, as given on the Spark site, is as follows:

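import struct

def read_int(stream):
    # Ask the stream for the 4 bytes that encode the integer.
    length = stream.read(4)
    # An empty result means the stream has already ended.
    if not length:
        raise EOFError
    # "!i" is a big-endian (network order) signed 4-byte int.
    return struct.unpack("!i", length)[0]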

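This means that when 4 bytes are requested from the stream and 0 bytes come back, EOFError is raised. A read returns an empty result only when the stream has reached end of file, so the stream ended before the expected integer arrived. The Python docs are here.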
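To see this behaviour in isolation, here is a minimal sketch that runs the read_int quoted above against in-memory streams instead of a real Spark connection. The try_read helper and the sample payloads are made up for illustration and are not part of PySpark; io.BytesIO only stands in for whatever stream PySpark actually reads from.

```python
import io
import struct


def read_int(stream):
    """Read a 4-byte big-endian signed integer, as in the function quoted above."""
    length = stream.read(4)
    if not length:
        raise EOFError
    return struct.unpack("!i", length)[0]


def try_read(payload):
    """Hypothetical helper: run read_int on an in-memory stream and report the outcome."""
    try:
        return read_int(io.BytesIO(payload))
    except EOFError:
        return "EOFError"
    except struct.error:
        return "struct.error"


full = try_read(struct.pack("!i", 42))  # four bytes available
empty = try_read(b"")                   # stream is already at end of file
short = try_read(b"\x00\x01")           # only two bytes available

print(full, empty, short)
```

Only the completely empty stream produces EOFError. A stream that ends partway through the integer gets past the "if not length" check and fails inside struct.unpack instead, so an EOFError from read_int suggests the stream ended exactly where a new integer was expected.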
