Understanding Spark: Cluster Manager, Master and Driver nodes

1. The Cluster Manager is a long-running service; on which node does it run?

In Spark standalone mode, the Cluster Manager is the Master process. It can be started on any node with ./sbin/start-master.sh; on YARN, the equivalent of the Cluster Manager is the ResourceManager.
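As a minimal sketch of how this looks on a standalone cluster (the hostname master-host is a placeholder, and a recent Spark release is assumed), the Master is started on whichever node you choose and the Workers register against its URL:

    # On the node you pick as the Master (any machine in the cluster):
    ./sbin/start-master.sh
    # By default the Master listens on spark://master-host:7077
    # and serves a web UI on port 8080.

    # On each Worker node, point the worker at that Master URL
    # (the script was called start-slave.sh in older Spark releases):
    ./sbin/start-worker.sh spark://master-host:7077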

2. Is it possible for the Master and the Driver to run on the same machine? I presume there is a rule somewhere stating that these two nodes should be different?

The Master is per cluster, while the Driver is per application. For standalone and YARN clusters, Spark currently supports two deploy modes.

  1. In client mode, the driver is launched in the same process as the client that submits the application.
  2. In cluster mode, the driver is launched on one of the Worker nodes (standalone) or inside the YARN ApplicationMaster (YARN), and the client process exits as soon as it has submitted the application, without waiting for the app to finish.

So if an application is submitted with --deploy-mode client from the Master node, the Master and the Driver end up on the same machine; there is no rule against it. Check the deployment of a Spark application over YARN for more details.
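As a rough illustration of the two modes (the master URL, class name and jar below are placeholders, not taken from the question):

    # Client mode: the driver runs inside this spark-submit process,
    # so launching it from the Master node puts Master and Driver together.
    ./bin/spark-submit --master spark://master-host:7077 \
        --deploy-mode client --class com.example.MyApp my-app.jar

    # Cluster mode: the driver is launched on one of the Workers
    # (or inside the YARN ApplicationMaster with --master yarn),
    # and this client process returns right after submission.
    ./bin/spark-submit --master spark://master-host:7077 \
        --deploy-mode cluster --class com.example.MyApp my-app.jar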

3. In the case where the Driver node fails, who is responsible for re-launching the application? And what will happen exactly, i.e. how will the Master node, Cluster Manager and Worker nodes get involved (if they do), and in which order?

If the driver fails, all executors and their running tasks for that submitted Spark application are killed, and the application as a whole fails unless it is relaunched.
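For automatic relaunching, standalone cluster mode offers the --supervise flag of spark-submit, which tells the Master to restart the driver if it exits abnormally, while on YARN the ResourceManager can retry the ApplicationMaster (and with it a cluster-mode driver) up to spark.yarn.maxAppAttempts times. A minimal sketch, with placeholder hostnames, class and jar:

    # Standalone cluster mode with driver supervision: if the driver
    # process dies, the Master relaunches it on a Worker.
    ./bin/spark-submit --master spark://master-host:7077 \
        --deploy-mode cluster --supervise \
        --class com.example.MyApp my-app.jar

    # YARN cluster mode: let the ResourceManager retry the
    # ApplicationMaster (and hence the driver) a couple of times.
    ./bin/spark-submit --master yarn --deploy-mode cluster \
        --conf spark.yarn.maxAppAttempts=2 \
        --class com.example.MyApp my-app.jar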

4. In the case where the Master node fails, what will happen exactly and who is responsible for recovering from the failure?

Master node failures are handled in two ways.

  1. Standby Masters with ZooKeeper:

    Utilizing ZooKeeper to provide leader election and some state storage,
    you can launch multiple Masters in your cluster connected to the same
    ZooKeeper instance. One will be elected “leader” and the others will
    remain in standby mode. If the current leader dies, another Master
    will be elected, recover the old Master’s state, and then resume
    scheduling. The entire recovery process (from the time the first
    leader goes down) should take between 1 and 2 minutes. Note that this
    delay only affects scheduling new applications – applications that
    were already running during Master failover are unaffected. See
    here for the configuration options.

  2. Single-Node Recovery with Local File System:

    ZooKeeper is the best way to go for production-level high
    availability, but if you want to be able to restart the Master if
    it goes down, FILESYSTEM mode can take care of it. When applications
    and Workers register, they have enough state written to the provided
    directory so that they can be recovered upon a restart of the Master
    process. See here for the configuration and more details; a
    configuration sketch for both recovery modes follows this list.
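Both recovery modes are enabled through properties passed to the Master, typically via SPARK_DAEMON_JAVA_OPTS in conf/spark-env.sh. The ZooKeeper quorum address and the recovery directory below are placeholders, so treat this as a sketch rather than a drop-in configuration:

    # conf/spark-env.sh on each Master node (hosts and paths are placeholders)

    # Option 1: standby Masters coordinated through ZooKeeper
    SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER"
    SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181"
    SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS -Dspark.deploy.zookeeper.dir=/spark"
    export SPARK_DAEMON_JAVA_OPTS

    # Option 2: single-node recovery from the local file system
    # (use instead of option 1; the directory must survive Master restarts)
    # export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/var/lib/spark/recovery"

With the ZooKeeper option, you would start additional Masters on other nodes with the same settings so that one of them can take over as the standby.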
