How to change memory per node for Apache Spark worker

When using Spark 1.0.0+ with spark-shell or spark-submit, use the --executor-memory option, e.g. spark-shell --executor-memory 8G … For 0.9.0 and under: change the memory when you start a job or start the shell. We had to modify the spark-shell script so that it would carry command-line arguments through as arguments for the underlying Java application. …
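As a rough sketch of the 1.0.0+ case, the same option can be assembled when launching spark-submit from a Python script. The helper name `build_submit_command` and the script name `my_job.py` are hypothetical; only the `--executor-memory` option comes from the answer above.

```python
import shlex


def build_submit_command(app_path, executor_memory="8G"):
    """Return a spark-submit argument list with per-executor memory set.

    build_submit_command is a hypothetical helper, not part of Spark.
    """
    return ["spark-submit", "--executor-memory", executor_memory, app_path]


if __name__ == "__main__":
    cmd = build_submit_command("my_job.py", "8G")
    # Only print the command here; on a machine with spark-submit on the
    # PATH it could be run with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if the application path contains spaces.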

Error in SLURM cluster – Detected 1 oom-kill event(s): how to improve running jobs

The approved answer is correct but, to be more precise, the error slurmstepd: error: Detected 1 oom-kill event(s) in step 1090990.batch cgroup. indicates that the job ran out of host RAM (the Linux system memory used by the CPU), not GPU memory. If you were, for instance, running some computation on a GPU, requesting more GPU memory than is available would result in an error like …
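One way to act on this is to measure how much host RAM the job actually peaks at and then request somewhat more from SLURM (for example with the `--mem` option of sbatch). The sketch below is an assumption about how one might measure that from inside a Python job; `peak_rss_mib` is a hypothetical helper name and the list allocation stands in for the real workload.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process, in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    data = [0] * 5_000_000  # stand-in for the real workload
    print(f"peak host RAM: {peak_rss_mib():.0f} MiB")
```

This only covers the current process; memory used by child processes would need `resource.RUSAGE_CHILDREN` as well.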

What’s the difference between operating system “swap” and “page”? [closed]

Despite the historical interchanging of these two terms, they indicate different things. Both are methods for moving data in memory to another storage device, called a backing store (often a hard drive), but they do so differently. Swapping involves moving a process's entire collection of data in …

Software memory bit-flip detection for platforms without ECC

The thing is, ECC is dirt cheap compared to "software ECC countermeasures". You can easily detect whether users have ECC modules and complain (or print a warning) when they don't. http://www.cyberciti.biz/faq/ecc-memory-modules/ For example, can we see all writes to memory (both from user and kernel space), to distinguish intended memory changes from in-memory bit …
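To make the cost of a software countermeasure concrete, here is a minimal sketch that guards one buffer with a CRC32 checksum. It is not the approach the answer recommends, and it only works for data that is not supposed to change between checks: every intended write has to be followed by recomputing the checksum, which is exactly the bookkeeping the question about distinguishing intended writes points at. The helper names `checksum` and `is_intact` are hypothetical.

```python
import zlib


def checksum(buf):
    """CRC32 over the current contents of the buffer."""
    return zlib.crc32(bytes(buf))


def is_intact(buf, expected):
    """True if the buffer still matches the checksum taken earlier."""
    return checksum(buf) == expected


if __name__ == "__main__":
    buf = bytearray(b"important in-memory state")
    expected = checksum(buf)   # taken right after the intended write
    buf[3] ^= 0x10             # simulate a single flipped bit
    print(is_intact(buf, expected))
```

The stored checksum lives in the same unprotected memory, so this detects corruption of the buffer but gives no guarantee about the checksum itself.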

Keras uses way too much GPU memory when calling train_on_batch, fit, etc

It is a very common mistake to forget that the activations, gradients, and optimizer moment-tracking variables also take VRAM, not just the parameters, which increases memory usage quite a bit. The backprop calculations themselves make the training phase take almost double the VRAM of forward / inference use of the neural net, and …
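The accounting described above can be sketched as a back-of-the-envelope estimate. The function below is an illustration, not anything Keras provides: it assumes float32 values (4 bytes each), an optimizer that keeps two extra values per parameter (as Adam does with its first and second moments), and a caller-supplied count of activation values per sample. It ignores framework overhead and temporary workspace, so real usage will be higher.

```python
def training_memory_bytes(n_params, activation_values_per_sample, batch_size,
                          optimizer_slots=2, bytes_per_value=4):
    """Rough lower bound on training memory, in bytes."""
    # One copy each for weights and gradients, plus the optimizer slots.
    per_param_copies = 1 + 1 + optimizer_slots
    param_bytes = n_params * per_param_copies * bytes_per_value
    # Activations are kept for the backward pass and scale with batch size.
    activation_bytes = (activation_values_per_sample * batch_size
                        * bytes_per_value)
    return param_bytes + activation_bytes


if __name__ == "__main__":
    total = training_memory_bytes(n_params=1_000_000,
                                  activation_values_per_sample=500_000,
                                  batch_size=32)
    print(f"{total / 1e6:.0f} MB")  # 16 MB parameter-related + 64 MB activations
```

In this made-up example the activations dominate, which is why reducing the batch size is usually the first thing to try when training runs out of VRAM.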

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)