There are two parts to the answer.
I. NPY vs. NPZ
As the documentation says, the .npy format is:
the standard binary file format in NumPy for persisting a single arbitrary NumPy array on disk. … The format is designed to be as simple as possible while achieving its limited goals. (sources)
And .npz is only a
simple way to combine multiple arrays into a single file, one can use ZipFile to contain multiple “.npy” files. We recommend using the file extension “.npz” for these archives. (sources)
So, .npz is just a ZipFile containing multiple “.npy” files. And this ZipFile can be either compressed (by using np.savez_compressed) or uncompressed (by using np.savez).
It’s similar to a tarball on Unix-like systems: a tarball can be a plain uncompressed archive that just contains other files, or it can be compressed by combining it with one of various compression programs (gzip, bzip2, etc.).
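To make the "zip of .npy files" point concrete, here is a minimal sketch that writes an .npz into memory and then opens it with the standard zipfile module. The array names a and b are arbitrary choices for illustration.

```python
import io
import zipfile

import numpy as np

# Write two small arrays into an in-memory .npz archive.
buf = io.BytesIO()
np.savez(buf, a=np.arange(3), b=np.ones((2, 2)))
buf.seek(0)

# Open the very same bytes as an ordinary zip archive.
with zipfile.ZipFile(buf) as zf:
    member_names = sorted(zf.namelist())
    # Each member is a complete .npy file, so np.load can read it on its own.
    with zf.open("a.npy") as member:
        a_restored = np.load(io.BytesIO(member.read()))

print(member_names)
print(a_restored)
```

Each keyword passed to np.savez becomes one member named after the keyword with ".npy" appended.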
II. Different APIs for binary serialization
NumPy also provides several APIs for writing and reading these binary files:
- np.save --> Save an array to a binary file in NumPy .npy format
- np.savez --> Save several arrays into a single file in uncompressed .npz format
- np.savez_compressed --> Save several arrays into a single file in compressed .npz format
- np.load --> Load arrays or pickled objects from .npy, .npz or pickled files
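A short round trip through all four functions might look like the sketch below. The file names and the arrays x and y are made up for the example; everything is written to a temporary directory.

```python
import os
import tempfile

import numpy as np

x = np.arange(6).reshape(2, 3)
y = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "x.npy")
    npz_path = os.path.join(tmp, "xy.npz")
    npz_compressed_path = os.path.join(tmp, "xy_compressed.npz")

    np.save(npy_path, x)                                # one array, .npy
    np.savez(npz_path, x=x, y=y)                        # several arrays, uncompressed .npz
    np.savez_compressed(npz_compressed_path, x=x, y=y)  # several arrays, compressed .npz

    # For a .npy file, np.load returns the array itself.
    x_loaded = np.load(npy_path)

    # For a .npz file, np.load returns a dict-like NpzFile that reads
    # each member only when it is indexed.
    with np.load(npz_path) as archive:
        keys = sorted(archive.files)
        y_loaded = archive["y"]
    with np.load(npz_compressed_path) as archive:
        x_from_compressed = archive["x"]

print(keys)
```

Note that np.load handles compressed and uncompressed .npz files the same way, so the caller does not need to know which one was written.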
If we skim the NumPy source code, this is what happens under the hood:
    def _savez(file, args, kwds, compress, allow_pickle=True, pickle_kwargs=None):
        ...
        if compress:
            compression = zipfile.ZIP_DEFLATED
        else:
            compression = zipfile.ZIP_STORED
        ...

    def savez(file, *args, **kwds):
        _savez(file, args, kwds, False)

    def savez_compressed(file, *args, **kwds):
        _savez(file, args, kwds, True)
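The effect of that flag can be observed from the outside by inspecting the zip metadata of the written archive. This is only a sketch: member_compression is a hypothetical helper written for this answer, not part of NumPy.

```python
import io
import zipfile

import numpy as np


def member_compression(save_func):
    """Return {member name: zip compression method} for an archive written by save_func."""
    buf = io.BytesIO()
    save_func(buf, data=np.zeros(1000))
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: info.compress_type for info in zf.infolist()}


stored = member_compression(np.savez)
deflated = member_compression(np.savez_compressed)

print(stored["data.npy"] == zipfile.ZIP_STORED)
print(deflated["data.npy"] == zipfile.ZIP_DEFLATED)
```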
Then back to the question:
- If you only use np.save or np.savez, there is no compression on top of the .npy format; np.savez just gives you a single archive file for the convenience of managing multiple related arrays.
- If you use np.savez_compressed, the file takes less space on disk, at the cost of extra CPU time for the compression (i.e. it is a bit slower).
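As a rough illustration of the disk-space side of this trade-off, the sketch below compares the number of bytes each function writes for the same array. It assumes an all-zeros array, which is close to the best case for compression; size_of is a hypothetical helper, and real savings depend heavily on the data (random floats, for instance, barely compress).

```python
import io

import numpy as np

# An array of zeros compresses extremely well; real data usually does not.
data = np.zeros((200, 200))


def size_of(write):
    """Run write(file_like) against an in-memory buffer and return the bytes written."""
    buf = io.BytesIO()
    write(buf)
    return len(buf.getvalue())


size_npy = size_of(lambda f: np.save(f, data))
size_npz = size_of(lambda f: np.savez(f, data=data))
size_npz_compressed = size_of(lambda f: np.savez_compressed(f, data=data))

print("raw array bytes:    ", data.nbytes)
print(".npy:               ", size_npy)
print(".npz (uncompressed):", size_npz)
print(".npz (compressed):  ", size_npz_compressed)
```

To measure the CPU-time side, the same three calls can be wrapped with timeit on data representative of your own workload.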