How to optimize shuffle spill in an Apache Spark application

Learning to performance-tune Spark takes a fair amount of investigation. There are a few good resources, including this video. Spark 1.4 has better diagnostics and visualisation in the interface, which can help. In summary, you spill when the size of the RDD partitions at the end of the stage exceeds the …

Difference in Used, Committed and Max Heap Memory

From the Javadoc of MemoryUsage: getUsed() returns the amount of used memory in bytes. getCommitted() returns the amount of memory in bytes that is committed for the Java virtual machine to use; this amount of memory is guaranteed for the Java virtual machine to use. getMax() returns the maximum amount of memory in bytes …
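
To see the three values side by side, here is a minimal sketch that reads the heap's MemoryUsage from the standard MemoryMXBean. The class name HeapUsageDemo and the helper describe are hypothetical names chosen for illustration; the getters are the ones quoted above from the Javadoc.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapUsageDemo {

    // Formats one MemoryUsage snapshot. getMax() returns -1 when the
    // maximum is undefined, so that case is printed as "undefined".
    static String describe(MemoryUsage usage) {
        long max = usage.getMax();
        String maxText = (max < 0) ? "undefined" : Long.toString(max);
        return "used=" + usage.getUsed()
                + " committed=" + usage.getCommitted()
                + " max=" + maxText;
    }

    public static void main(String[] args) {
        // Snapshot of the heap as reported by the running JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe(heap));
    }
}
```

Within a single snapshot, used is never larger than committed, and committed is never larger than max when max is defined. The actual numbers depend on the JVM and its heap settings.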
