How to add a new node to my Elasticsearch cluster

TIPS TO ADD ANOTHER NODE: 1) VERSIONS: It is good advice to check the status of all of your nodes: http://elastic-node1:9200/ Keep in mind that in most cases: VERSIONS NEED TO BE THE SAME, EVEN THE MINOR VERSION { "name" : "node2", "cluster_name" : "xxxxxxxxxxx", "cluster_uuid" : "n-xxxxxxxxxxxxxxx", "version" : { "number" : "5.2.2", "build_hash" : … Read more
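
A minimal sketch of that version check, assuming the node URLs above are placeholders for your own cluster: query each node's root endpoint and compare the reported version strings before adding a new node.

    import json
    from urllib.request import urlopen

    # Hypothetical node URLs; substitute your own cluster members.
    nodes = ["http://elastic-node1:9200/", "http://elastic-node2:9200/"]

    versions = {}
    for url in nodes:
        with urlopen(url) as resp:
            info = json.load(resp)          # root endpoint returns cluster/version JSON
        versions[info["name"]] = info["version"]["number"]

    print(versions)
    if len(set(versions.values())) > 1:
        raise SystemExit(f"Version mismatch across nodes: {versions}")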

How to submit a job to any [subset] of nodes from nodelist in SLURM?

You can work the other way around: rather than specifying which nodes to use (with the effect that each job is allocated all 7 nodes), specify which nodes not to use: sbatch --exclude=myCluster[01-09] myScript.sh and Slurm will never allocate more than 7 nodes to your jobs. Make sure though that the cluster configuration allows … Read more
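
A hedged sketch of submitting with that exclusion from Python, mirroring the sbatch command above; the node range and script name are placeholders from the excerpt, not real values.

    import subprocess

    excluded = "myCluster[01-09]"   # nodes Slurm must NOT allocate
    result = subprocess.run(
        ["sbatch", f"--exclude={excluded}", "myScript.sh"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())    # e.g. "Submitted batch job <id>"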

How to fix symbol lookup error: undefined symbol errors in a cluster environment

After two dozen comments to understand the situation, it was found that libhdf5.so.7 was actually a symlink (with several levels of indirection) to a file that was not shared between the queued processes and the interactive processes. This means that even though the symlink itself lies on a shared filesystem, the contents of the … Read more
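
A small diagnostic sketch for that situation: resolve a library's symlink chain and fingerprint the real file, so the result can be compared between an interactive shell and a queued job. The library path is a hypothetical example, not taken from the answer.

    import hashlib
    import os

    lib = "/usr/lib/libhdf5.so.7"   # placeholder path to the suspect symlink

    real = os.path.realpath(lib)    # follows every level of indirection
    with open(real, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    print(f"{lib} -> {real} (md5 {digest})")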

How to set the number of Spark executors?

In Spark 2.0+, use the spark session variable to set the number of executors dynamically (from within the program): spark.conf.set("spark.executor.instances", 4) spark.conf.set("spark.executor.cores", 4) In the above case, a maximum of 16 tasks will be executed at any given time. The other option is dynamic allocation of executors, as below: spark.conf.set("spark.dynamicAllocation.enabled", "true") spark.conf.set("spark.executor.cores", 4) spark.conf.set("spark.dynamicAllocation.minExecutors", "1") spark.conf.set("spark.dynamicAllocation.maxExecutors", "5") This way you can let … Read more
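
A minimal PySpark sketch of the static variant, with the same settings applied when the session is built; static executor counts are generally read at startup, so setting them at session-creation time is the safer pattern. The app name and values are illustrative.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("executor-sizing-demo")      # placeholder app name
        .config("spark.executor.instances", "4")
        .config("spark.executor.cores", "4")
        .getOrCreate()
    )

    # 4 executors x 4 cores => up to 16 concurrent tasks
    print(spark.conf.get("spark.executor.instances"))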

Easy way to use parallel options of scikit-learn functions on HPC

Scikit-learn manages its parallelism with Joblib. Joblib can swap out the multiprocessing backend for other distributed systems like dask.distributed or IPython Parallel. See this issue on the scikit-learn GitHub page for details. Example using Joblib with Dask.distributed (code taken from the issue page linked above): from sklearn.externals.joblib import parallel_backend search = RandomizedSearchCV(model, param_space, cv=10, n_iter=1000, … Read more
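
A hedged sketch of the same pattern with current package layouts (sklearn.externals.joblib was removed from newer scikit-learn releases; plain joblib is used instead). The model, parameter space, and data are placeholders standing in for the truncated code above.

    from joblib import parallel_backend
    from dask.distributed import Client
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV

    client = Client()               # connect to (or start) a Dask cluster

    X, y = make_classification(n_samples=500, n_features=20)
    param_space = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]}
    search = RandomizedSearchCV(RandomForestClassifier(), param_space,
                                cv=3, n_iter=5)

    with parallel_backend("dask"):  # route joblib work to the Dask workers
        search.fit(X, y)

    print(search.best_params_)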

What are the differences between a node, a cluster and a datacenter in a cassandra nosql database?

The hierarchy of elements in Cassandra is: Cluster → Data center(s) → Rack(s) → Server(s) → Node (more accurately, a vnode). A Cluster is a collection of Data Centers. A Data Center is a collection of Racks. A Rack is a collection of Servers. A Server contains 256 virtual nodes (or vnodes) by default. A vnode is the data … Read more
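
An illustrative sketch of that hierarchy in practice using the DataStax Python driver (an assumption; the driver is not mentioned in the answer above): list each node alongside the datacenter and rack it belongs to. The contact point is a placeholder.

    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])    # placeholder contact point
    session = cluster.connect()         # connecting populates cluster metadata

    for host in cluster.metadata.all_hosts():
        print(f"dc={host.datacenter} rack={host.rack} node={host.address}")

    cluster.shutdown()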