UnsatisfiedLinkError: /tmp/snappy-1.1.4-libsnappyjava.so Error loading shared library ld-linux-x86-64.so.2: No such file or directory

In my case, installing the missing libc6-compat package didn't work; the application still threw java.lang.UnsatisfiedLinkError. I then found that inside the Docker container, /lib64/ld-linux-x86-64.so.2 exists and is a link to /lib/libc.musl-x86_64.so.1, but /lib contains only ld-musl-x86_64.so.1, not ld-linux-x86-64.so.2. So I added a file named ld-linux-x86-64.so.2 in /lib, linked to ld-musl-x86_64.so.1, and that solved the problem. The Dockerfile I use: FROM …
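The linking step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `ensure_loader_symlink` and its parameters are hypothetical, the default paths are the ones named in the answer, and in a real image the same step is more commonly done with a single `ln -s` command run as root.

```python
import os


def ensure_loader_symlink(lib_dir="/lib",
                          target="ld-musl-x86_64.so.1",
                          link_name="ld-linux-x86-64.so.2"):
    """Create lib_dir/link_name as a symlink to target, if it is missing.

    Hypothetical helper: the names and defaults mirror the paths in the
    answer, but this is not the author's actual Dockerfile step.
    Returns True if a link was created, False if something already exists
    at the link path.
    """
    link_path = os.path.join(lib_dir, link_name)
    target_path = os.path.join(lib_dir, target)

    # Leave any existing file or link alone (lexists also sees broken links).
    if os.path.lexists(link_path):
        return False

    # Refuse to create a dangling link if the musl loader is not there.
    if not os.path.exists(target_path):
        raise FileNotFoundError(target_path)

    # Relative target, so the link resolves inside lib_dir itself.
    os.symlink(target, link_path)
    return True
```

Calling `ensure_loader_symlink()` with the defaults would need root privileges inside the container; passing a different `lib_dir` makes it possible to try the logic in a scratch directory first.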

Spark SQL – difference between gzip vs snappy vs lzo compression formats

Compression ratio: GZIP uses more CPU resources than Snappy or LZO, but provides a higher compression ratio. General usage: GZIP is often a good choice for cold data, which is accessed infrequently. Snappy or LZO are a better choice for hot data, which is accessed frequently. Snappy often performs better than LZO. …
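As a rough sketch, the rule of thumb above could be encoded like this. The helper names `choose_parquet_codec` and `write_parquet` and the "hot"/"cold" labels are assumptions made for illustration; the only Spark API relied on is the `compression` argument of `DataFrameWriter.parquet`, and `df` is assumed to be a PySpark DataFrame supplied by the caller.

```python
def choose_parquet_codec(access_pattern):
    """Map an access pattern to a codec name, following the rule of thumb.

    "cold" (rarely read) -> gzip: higher compression ratio, more CPU.
    "hot" (frequently read) -> snappy: cheaper to compress and decompress.
    The labels are hypothetical; this is a heuristic, not a benchmark.
    """
    codecs = {"cold": "gzip", "hot": "snappy"}
    try:
        return codecs[access_pattern]
    except KeyError:
        raise ValueError("access_pattern must be 'hot' or 'cold'")


def write_parquet(df, path, access_pattern):
    """Write a DataFrame as Parquet with the codec chosen above.

    Assumes df is a PySpark DataFrame; nothing is imported here because
    the caller passes the DataFrame in.
    """
    codec = choose_parquet_codec(access_pattern)
    df.write.parquet(path, compression=codec)
    return codec
```

The choice is kept in one small function so it can be swapped for measurements on your own data, which are a better guide than any general rule.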

Methods for writing Parquet files using Python?

Update (March 2017): There are currently two libraries capable of writing Parquet files: fastparquet and pyarrow. Both still seem to be under heavy development and come with a number of disclaimers (e.g. no support for nested data), so you will have to check whether they support everything you need. OLD ANSWER: As of …
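For a sense of what writing a file looks like with one of these libraries, here is a minimal sketch using pyarrow. It assumes pyarrow is installed; the helper names, the column-dictionary input, and the choice of Snappy compression are illustrative assumptions, not something taken from the answer.

```python
def write_columns_to_parquet(columns, path, compression="snappy"):
    """Write a dict of equal-length column lists to a Parquet file.

    Hypothetical helper; pyarrow is imported lazily so that merely
    defining the function does not require the library.
    """
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table(columns)  # build an in-memory Arrow table
    pq.write_table(table, path, compression=compression)


def read_parquet_columns(path):
    """Read a Parquet file back into a dict of column lists."""
    import pyarrow.parquet as pq

    return pq.read_table(path).to_pydict()
```

A call such as `write_columns_to_parquet({"id": [1, 2, 3], "name": ["a", "b", "c"]}, "example.parquet")` writes flat columns only, which sidesteps the nested-data caveat mentioned above.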